John Tinsley, B.Sc., Ph.D.
PLuTO Project Co-ordinator
Centre for Next Generation Localisation
Dublin City University


I'm currently working on the PLuTO project, which aims to develop a commercial web-based solution for patent search and translation. The main challenges relate to the domain adaptation of MT technology to suit the particular language found in patent documents, e.g. legalese and specific terminology related to the technology at the core of the patent.

My main area of interest within NLP is machine translation, particularly linguistic/syntactic approaches to the problem. I am also interested in the construction and use of parallel treebanks. I believe they are very useful resources, and I am particularly interested in exploiting them for MT. Below is a description of my general Ph.D. topic.

(from NCLT page)

Title: Attempt: "All Trees" Efficient Models of Parsing and Translation

Duration: October 1st 2006 -- September 30th 2009

Funded by: Science Foundation Ireland's Research Frontiers Programme

People: Ventsislav Zhechev, Mary Hearne, Andy Way, and myself

Collaboration: Khalil Sima'an, University of Amsterdam

Description: Current statistical approaches to Machine Translation often produce 'word salad'. Despite the fact that knowledge of syntax has been shown to be useful in other MT paradigms, no-one has successfully incorporated such models into today's leading SMT systems. Example-based models currently achieve state-of-the-art performance in both parsing and translation, but computational efficiency can be a problem. This project proposes a number of efficient approaches to the problem of Translation by Parsing using all training examples ("all trees"), focusing in the first instance on the underlying monolingual parsing models and scaling them in subsequent phases to the bilingual case.

Support: ICHEC: Irish Centre for High-End Computing