Day Two: 25 November 2005 (09.00-17.00)
09.00
Coffee and Registration (for Day Two delegates
only)
09.25
Introduction by Chair: Professor Ruslan
Mitkov
Prof. Dr. Ruslan Mitkov has been working
in Computational Linguistics, Machine Translation
and related areas since the early 1980s.
His extensively cited research output and
interests cover areas such as anaphora resolution,
machine translation, translation memory
and translation aids, automatic abstracting,
centering, term extraction, question answering
and computer-aided learning/testing. He
is the author of the monograph Anaphora Resolution
(Longman) and sole editor of the Oxford
Handbook of Computational Linguistics (Oxford
University Press). Current projects include
his role as Editor-in-Chief of the Natural
Language Processing book series of John
Benjamins publishers and Editor-Consultant
of Oxford University Press' publications
in Computational Linguistics. He was recently
Guest Editor of the journals Machine Translation
and Computational Linguistics. He has been
invited to give keynote speeches at a number
of international conferences and has acted
as Programme Chair for several important
conferences on Natural Language Processing
and Machine Translation. At the University
of Wolverhampton he developed a successful
group in Natural Language Processing which
is already known for its innovative research
in various areas of the field and for its
NLP tools and resources.
09.25
Automatic Detection of Translation Errors:
The TransCheck System
Graham Russell, Université de
Montréal, Canada
This paper will discuss the application of
state-of-the-art language technology to
the practical problem of automatic detection
of translation errors. Specifically, it
describes the current version of TransCheck,
a translation tool under development. The
work is motivated with reference to features
of the contemporary translation market,
including the acceptance of computer-implemented
support for different parts of the translation process.
It is situated with respect to other types
of textual error detection and to the overall
translation problem; the nature of translation
errors is discussed and the difficulty of
general-case translation error detection
is shown.
10.05
Discussion
10.10
EXTER: A Breakthrough Solution for Efficient
Terminology Extraction
Cyril Chantrier, TEMIS SA, France
Temis decided to collaborate with EDF and build
a new generation of terminology extraction
tools, built on the Lexter prototype and
the Temis extraction technology, namely Insight
Discovery Extractor (IDE). The name of the
extraction solution is EXTER. In a project
for a car manufacturer on a corpus of
3 million words in French, the EXTER solution
provided, according to the users of the output,
an extremely good level of quality and relevance
of proposed terms, with 50 000 term candidates
instead of the expected 300 000. As a result,
not only is the quality of the proposed terms
higher, but the validation and cleaning effort
has been divided by a factor of 6, improving
the ROI and the TTM of such a project. The
current version supports French and
English, with extensions in German, Spanish
and Italian.
10.50
Discussion
10.55
Coffee
11.15
MyCaTEx, A Language Independent Term
Extractor
José Vega, my-xML, Luxembourg
This paper presents the my-xML Candidate Term
Extractor (MyCaTEx), which works without any
language-specific resources. It is currently being
developed by my-xML, a language engineering
company specialising in multilingual content
management. The MyCaTEx term extraction algorithm
is based on the current research of Jacques
Vergne, University of Caen, France. MyCaTEx
can be used for term extraction, semi-automatic
and automatic generation of multilingual
thesauri and document tagging and classification.
11.50
Discussion
11.55
Research Meets Practice: t-survey 2005:
An Online Survey on Terminology Extraction
and Terminology Management
Daniel Zielinski, Saarland University,
Germany
This
paper reports the results of an ongoing
online survey on terminology management
and terminology extraction conducted by
the Linguistic Data Processing section of
the Applied Linguistics and Translating/Interpreting
Department at the University. The survey
has been available on the Internet in English,
French, German and Spanish since mid-May
2005 and continues to be accessible to the
public. It has been promoted on many major
CAT mailing lists and by translator and
interpreter associations. To date, almost
400 professional translators, terminologists
and interpreters all over the world have
responded to the questionnaire. With this
survey, we want to investigate how research
and practice are related in the area of
terminology extraction and to evaluate
whether there is any need to reconcile the two.
Aimed at translators, terminologists, interpreters
and project managers, the main goals of
the survey are to investigate the dissemination
and application of terminology management
tools (with a focus on terminology extraction
tools) and to assess the demands on today's
terminology extraction tools.
12.30
Discussion
12.35
Embedding free online machine translation
into monolingual websites for multilingual
dissemination: a case study of implementation
Federico Gaspari, University of Manchester,
UK
A
growing number of websites that are only
available in one language rely
on free online machine translation (MT)
services to disseminate their
contents in a variety of other languages,
in order to make themselves
accessible to Internet users with different
linguistic backgrounds. This
approach to the management and delivery
of digital information that
bypasses professional localisation and translation
raises a number of
thorny issues, but clearly shows that free
online machine translation
services are regarded as valuable tools
to overcome language barriers in
the online environment. However, the vast
majority of websites that adopt this strategy
fail to take full advantage of the potential
offered by free web-based MT, mainly due
to poor consideration of crucial issues
in human-computer interaction and web usability
that are vital to ensure that Internet users
have a
positive and successful online experience.
This paper presents the key
stages and challenges involved in implementing
this approach to the
multilingual dissemination of online content,
whereby free online machine
translation is embedded into the architecture
of a monolingual website.
The main technical and practical issues
are illustrated by means of an
implementation case study based on the website
supporting London's
successful bid to host the 2012 Olympic
Games.
13.10
Discussion
13.15
Lunch
14.15
Panel Discussion: The Current State of Localisation
with Michael Anobile (LISA) and Reinhard
Schaler.
15.15
Tea
15.40
Does Using Controlled Language Improve Machine
Translation Results?
Nathalie de Preux, University of Geneva,
Switzerland
A major concern for those companies that use
MT is to improve the quality of
the system-produced raw translation as much
as possible. One promising
approach seems to be to influence the input
text by constraining its lexical
items and grammatical constructions - in
other words, restricting input text to
a controlled language (CL). The extent of
the improvement gained by applying CL rules
can be evaluated by comparing the machine
translations of CL and non-CL texts.
In the quantitative part of our evaluation,
the results are relatively
satisfactory. It proves to be the case that
once the number of errors is
counted in the translation of each version
(taking into account the gravity of
the errors), there is an improvement of
about 25% in the translation of texts
to which CL has been applied. However, the
results of the qualitative
evaluation are not so positive, with improvement
being assessed at around 8%.
Of course, the subjective nature of qualitative
criteria undermines the
reliability of these results. On the
whole, our results show that texts
produced with the aid of a CL lead to better
translations (by MT systems) than
do free texts.
16.15
Discussion
16.20
Controlled Language and the Implementation
of Machine Translation for Technical Documentation
Laura Ramirez Polo, Saarland University,
Germany
This paper will present a study examining whether
texts written in controlled language are
more translatable than texts which are not
compliant with the CL rule set. For this
study, the system CLAT, a sophisticated
language checker developed by IAI, will
be used. The study is divided into two phases.
The goals of the two phases are different and
can be summarised as follows:
1) selection of resources and 2) evaluation.
Using the FEMTI-Framework as a base, this
study works to establish a standardised
methodology, to explore new metrics of
evaluation for contexts where MT is being
considered as a technology, and to make the
evaluation design reusable for potential future
evaluations.
16.55
Discussion and Close of Conference