Machine Translation @ the NCLT & the CNGL

Dublin City University, Ireland

Title: Confident MT: Estimating Translation Quality for Improved Statistical Machine Translation
Duration: November 2011 -- November 2014
Funded by: IRCSET (Irish Research Council for Science, Engineering and Technology)
People: Rasoul Samad Zadeh Kaljahi, Raphael Rubino, Jennifer Foster, Johann Roturier, Fred Hollowood
Description:

There is strong commercial demand for high-quality Machine Translation (MT). For localization purposes, a software company such as Symantec needs to deliver helpful content to its customers in their native languages. However, evaluating MT output with automatic metrics is only possible when a reference translation is available. In the more realistic setting where no reference is available, reliable techniques for estimating the quality of translation system output are needed.

As more and more customers move away from traditional call centres and corporate websites in favour of self-service via dedicated discussion forums, there is a growing need for machine translation of User-Generated Content (UGC). Because UGC is an unedited mix of writing styles containing spelling mistakes, abbreviations and non-standard punctuation, it poses a particular challenge for Natural Language Processing (NLP) tools that have been trained on well-formed text.

The aim of the Confident MT project is to develop Confidence Estimation (CE) methods, also known as Quality Estimation (QE) methods, to measure the reliability of MT output in the context of UGC about Symantec products. The CE methods will be applied across a range of MT systems (such as Rule-Based, Example-Based, Phrase-Based SMT and Syntax-Enhanced SMT), and the results will be used to inform the optimal combination of MT systems.

Last update: April 19 2012