National Centre for Language Technology
Research Areas: CALL (Computer Assisted Language Learning)
Within NCLT, we have looked at the use of authoring tools in the foreign language classroom as well as for self-access purposes. We are conducting research into developing tools and an adequate framework for the use and integration of Computer-Mediated Communication (CMC), mainly within a task- or project-based approach. Some of the areas we are focusing on are learner autonomy, tandem language learning, multidisciplinary and multilingual collaborative learning at a distance (e.g. TECHNE Project), the effects of syntax priming in bilingual asynchronous communication, and motivation, among others. We are working on the development of applications and tools which support communication over the web for language learning purposes and which also allow the collection of data for learner corpora. We are also interested in the development and use of tools that facilitate learning in multilingual virtual learning environments. Such tools involve collaboration with Machine Translation and Speech Technology colleagues.
The CALL group is involved in teacher training activities. In 2001, two workshops were organised for DCU staff and facilitated by world experts in CALL. We are contributing to the OILTE Project (a CALL training-for-trainers project for primary and secondary teachers in Ireland) in collaboration with the Linguistics Institute of Ireland (ITE), EUROCALL and the NCTE.
In particular, researchers in this area develop and test computational models of how people use and combine their mental representations of concepts and categories. For example, one of us (Costello) has developed a model of how people classify items in simple categories (how they recognise and identify members of categories like "animal", "pet", or "carnivore") and how they manipulate those categories to classify items in combinations of categories (how they identify "a carnivorous animal which is also a pet"). This model accounts for a number of empirically observed patterns in people's classification of items in simple and combined categories, and can make accurate predictions about people's classifications (see papers). Related research aims to apply this work to medical diagnosis and the identification of multiple disorders in patients, and to classification in subsective and privative adjective-noun combinations such as "skilful violinist" and "fake surgeon" (again see papers). Other related research involves developing an information retrieval (IR) system based on this model. IR systems aim to find documents matching a user's query, where the query consists of a combination of terms (e.g. a query for documents describing "carnivorous pet animals"). Models of classification in combined categories should provide a useful framework for such systems.
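To make the retrieval task concrete, the following minimal sketch ranks documents against a query made of several combined terms. It is a plain term-overlap baseline and not the classification model described above; the function names `tokenize` and `rank` and the sample documents are assumptions introduced only for illustration.

```python
def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(text.lower().split())


def rank(query, documents):
    """Score each document by the fraction of query terms it contains.

    Documents sharing no term with the query are dropped; the rest are
    returned as (document, score) pairs, best match first.
    """
    terms = tokenize(query)
    scored = []
    for doc in documents:
        score = len(terms & tokenize(doc)) / len(terms)
        if score > 0:
            scored.append((doc, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical documents, invented for this example.
DOCS = [
    "the ferret is a carnivorous pet",
    "a rabbit is a popular pet",
    "wolves are carnivorous animals",
]

results = rank("carnivorous pet", DOCS)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

A baseline like this treats every query term independently. A model of combined categories would instead have to judge how well a document fits the combination as a whole, which is the gap the research described above is aimed at.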
Papers of interest:
Krauss, M. (1992). The world's languages in crisis. Language, Vol. 68, No. 1.
UNESCO (1996). World Conference on Linguistic Rights: Barcelona Declaration. Barcelona, 1996.
Ward, M. (2001), "A template for a CALL program for Endangered Languages". Linguistic Perspectives on Endangered Languages, Helsinki, Finland 2001 (to appear).
Languages evolve historically to be optimal communication systems, and human language learning mechanisms have evolved in order to learn these systems more efficiently. Machines learning natural language have to start at a place that humans mastered thousands of years ago: uttering previously unheard signals and collectively establishing meaning. The question that this research addresses is how a communication system that uses evolutionary computation and reinforcement learning can evolve if initially none of the participants has mastered the system.
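One standard way to illustrate this kind of bootstrapping is a Lewis signalling game, in which a sender and a receiver start with no shared code and reinforce whatever choices happen to lead to successful communication. The sketch below is an assumption-laden illustration rather than the system studied here: it uses simple Roth-Erev style reinforcement only (no evolutionary computation), and the function name `play` and its parameters are hypothetical.

```python
import random


def play(rounds=5000, n=2, seed=0):
    """Run a Lewis signalling game with Roth-Erev style reinforcement.

    Returns the number of successful rounds and both agents' weight tables.
    """
    rng = random.Random(seed)
    # sender[m][s]: propensity to emit signal s when the meaning is m.
    # receiver[s][m]: propensity to interpret signal s as meaning m.
    # All propensities start equal, so neither agent knows any code.
    sender = [[1.0] * n for _ in range(n)]
    receiver = [[1.0] * n for _ in range(n)]
    successes = 0
    for _ in range(rounds):
        meaning = rng.randrange(n)
        signal = rng.choices(range(n), weights=sender[meaning])[0]
        guess = rng.choices(range(n), weights=receiver[signal])[0]
        if guess == meaning:
            # Reward: both agents strengthen the choice they just made.
            sender[meaning][signal] += 1.0
            receiver[signal][meaning] += 1.0
            successes += 1
    return successes, sender, receiver


ROUNDS = 5000
successes, sender, receiver = play(rounds=ROUNDS)
success_rate = successes / ROUNDS
print(f"overall success rate: {success_rate:.2f}")
print("sender propensities:", sender)
```

Inspecting the final `sender` and `receiver` tables shows which meaning-signal pairings, if any, the two agents have settled on.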
The term MT is associated with standalone translation programs. Nowadays a number of Computer-aided translation (CAT) tools exist, such as Translation Memory, dictionary lookup programs and Terminology management tools. These translation aids are of particular use when translating highly repetitive texts, such as technical documentation. The main growth in the use of MT is via the Internet, with hundreds of millions of pages of text available for training statistical systems. On-line translation systems such as Babelfish are being used to connect an increasingly global -- and linguistically diverse -- public.
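The reason Translation Memory helps with repetitive texts can be shown with a small fuzzy-lookup sketch: a new segment is compared against previously translated segments, and the stored translation of the closest one is offered to the translator if it is similar enough. This is a minimal sketch using Python's standard `difflib`, not how any particular CAT product works; the memory contents, the `lookup` function and the 0.7 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
MEMORY = {
    "Click the Save button.": "Cliquez sur le bouton Enregistrer.",
    "Restart the computer.": "Redémarrez l'ordinateur.",
}


def lookup(segment, memory, threshold=0.7):
    """Return (source, translation, score) for the best fuzzy match.

    Returns None when no stored segment reaches the similarity threshold.
    """
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


match = lookup("Click the Cancel button.", MEMORY)
if match is not None:
    source, translation, score = match
    print(f"closest stored segment: {source!r} (similarity {score:.2f})")
    print(f"suggested starting point: {translation!r}")
```

The suggestion is only a starting point: the translator still has to edit the part of the segment that differs.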
The field of MT is currently as healthy as it has been for years. More CAT (and even MT) tools are being used in the software localisation industry, conferences abound, and more and more students are being exposed to MT and translation tools in their programmes of study. This has all come about primarily because of a widespread recognition -- long known by many MT researchers -- of the limitations of MT, so that people's expectations are much more realistic than before, with the result that their aims are much more likely to be met by MT than was previously the case.
Crouch, R. and van Genabith, J., (1999), "Context Change, Underspecification and the Structure of Glue Language Derivations", in (ed.) Mary Dalrymple, Semantics and Syntax in Lexical Functional Grammar: The Resource Logic Approach pp. 117-189, The MIT Press, Cambridge, Massachusetts, ISBN 0-262-04171-5.
van Genabith, J. and Crouch, R., (1999), "Dynamic and Underspecified Semantics for LFG", in (ed.) Mary Dalrymple, Semantics and Syntax in Lexical Functional Grammar: The Resource Logic Approach pp. 209-260, The MIT Press, Cambridge, Massachusetts, ISBN 0-262-04171-5.
van Genabith, J. and Crouch, R., (1999), "How to Glue a Donkey to an f-Structure or Porting a Dynamic Meaning Representation Language into LFG's Linear Logic Based Glue Language Semantics", in (eds.) Harry Bunt and Reinhard Muskens, Computing Meaning, volume 1, Studies in Linguistics and Philosophy, volume 73, pp. 129-148, Kluwer Academic Press, Dordrecht, Boston and London, ISBN 0-7923-6108-3.
It is now widely accepted within the computer industry that Ireland is a world centre of excellence in software localisation with most major software firms having a significant presence in the field in this country. It is estimated that Ireland exports up to 60% of PC-based software sold in Europe, and is the world's second-largest exporter of software after the USA [LOCA97]. Those companies that have chosen Ireland for their product localisation centres include software publishers such as Microsoft World Product Group Ireland, Lotus Development Ireland, Corel Corporation, Symantec, Visio International, Novell, Oracle Corporation and Claris; hardware manufacturers such as Gateway 2000 and Sun Microsystems; Service Providers such as Berlitz International; and tools developers such as Trados.
Speech technology encompasses many subdisciplines and incorporates much technology from other areas of NLP. The most immediate applications that spring to mind are automatic speech recognition (ASR), which in simple terms means talking to machines, and speech generation (synthesis), which means getting machines to talk.
Related to speech recognition is speaker recognition, which has two main subdivisions: speaker identification and speaker verification, the latter having implications for security.
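The difference between the two tasks can be sketched in a few lines. Identification asks which enrolled speaker a sample most resembles, while verification asks whether a sample is close enough to one claimed identity. The sketch below assumes each speaker is already represented by a fixed-length feature vector; the vectors, the names and the 0.9 threshold are invented for illustration, and real systems derive such representations from acoustic analysis.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical enrolled voiceprints (toy three-dimensional vectors).
ENROLLED = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.3],
}


def identify(sample, enrolled):
    """Identification: return the enrolled speaker closest to the sample."""
    return max(enrolled, key=lambda name: cosine(sample, enrolled[name]))


def verify(sample, claimed, enrolled, threshold=0.9):
    """Verification: accept only if the sample matches the claimed speaker."""
    return cosine(sample, enrolled[claimed]) >= threshold


sample = [0.85, 0.15, 0.25]
print("identified as:", identify(sample, ENROLLED))
print("claim 'alice' accepted:", verify(sample, "alice", ENROLLED))
print("claim 'bob' accepted:", verify(sample, "bob", ENROLLED))
```

The threshold in `verify` is where the security trade-off lies: raising it rejects more impostors but also more genuine speakers.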
Language identification plays an important role in multilingual spoken language systems, which are important in dealing with speakers from different language groups within a country, or visitors from abroad.
Multimodality involves combinations of speech, text and image processing. Some examples are integrating recognition of facial movements with speech recognition and generating facial movements for talking heads.
Some other areas of huge importance to these applications and to speech communication in general are speech analysis, speech coding and speech enhancement. Speech analysis is particularly important in helping us better understand the production of human speech, which in turn helps us improve speech technology in general.
An excellent overview of speech technology (and human language technology in general) can be found at http://cslu.cse.ogi.edu/HLTsurvey/HLTsurvey.html
Last Updated: 12th July 2002 by email@example.com