This paper presents an ongoing research project, started in March 2010 and sponsored by the Swiss National Science Foundation, which aims to improve the textual coherence of machine translation output. Coherence in text is mainly due to inter-sentential dependencies. Statistical Machine Translation (SMT) systems, which currently translate one sentence at a time, often fail to translate these dependencies correctly. Within the COMTIS project, state-of-the-art linguistic research and Natural Language Processing (NLP) techniques are combined to identify and label inter-sentential dependencies that SMT systems can learn in the training phase.
B. Cartoni, University of Geneva