Translating and the Computer 28 Conference
Day One: 16 November 2006
08.45 Registration
09.15
Introduction by Chair, Daniel Grasmick,
SAP AG, Germany
09.20
Very Large Translation Memories: is the
free model viable? Yves Champollion, www.champollion.net,
France
Building a set of very large translation
memories supporting all language combinations
is a challenging project. The VLTM project
aims at offering translators a repository
of TM in all languages, made available for
free, with a search-engine approach. The
author will explore the technical, entrepreneurial,
and deontological aspects of this idealistic
project, discuss its feasibility, and present
the current state reached by the project.
10.00
Developing Effective Localisation Tools
Strategies, Robert Martin, Alchemy Software
Development Ltd., and Beverley White and
Tim Swales, Canon Localization Services
This will be a presentation and discussion-based
session exploring the decision-making
processes used by companies while developing
their own respective Software Localization
Tools strategies. Alchemy Software and Canon
will explore the various factors from the
client and tool provider perspectives:
1) ISV Perspective - How do ISVs develop
their strategies and perform needs analysis
to determine whether to build or buy their
own tools? Why do some companies completely
outsource their tools strategy and related
selection/developmental decisions to companies
such as Alchemy or SDL/Trados? Why do others
internalize or select a vendor-supplied
solution for internal use?
2) Tool Vendor Perspective - What are the
major influences upon Localization Tool
Vendors in their developmental strategies?
How have these changed in our industry?
How are tool providers responding to the
needs of the clients? What conflicts or
gaps exist between the need and the solutions
provided? How can clients and vendors work
together to innovate and further develop
existing tools solutions? Where do LSPs
fit into the tools equation? What drives
Innovation and new opportunities in the
market?
Attendees will be expected to participate
in discussions and thus work together with
the moderators as a group to develop an
understanding of the drivers behind Software
Localization Tools strategies for a variety
of organization types within the industry.
10.35
Discussion
10.45
Coffee
11.15
Translation Memories Survey 2006: Users'
perceptions around TM usage, Elina Lagoudaki,
Imperial College London
The Translation Memories Survey 2006 reported
in this paper was initiated to act as a channel
of information from users (or potential users)
of TM systems, shedding light on their work
habits and on the practices that need
technology solutions.
The main interest behind this survey is
to present the users' perspective about
TM systems. It aims to supply data
on the application domain, that is, information
on the procedural aspects of the translation
activity, on frequent work practices and
on the environment in which the translation
activity is performed, as well as an evaluation
of different TM tools by their users. It
also offers a glimpse into the future of
TM technology as translation professionals
envisage it. The survey's purpose is to
complement the research conducted by TM
system engineers towards the optimisation
of translation support tools.
11.50
Discussion
11.55
Going Global with the TextBase Translation
Memory, François Tardif, Customer
Service - MultiCorpora Europe, Belgium,
and Anne Laugesen, Global
Denmark, Denmark
Driven by fast-paced global competition
where the time-to-market of new products,
services and communications into multiple
languages and cultures is mission-critical,
organizations are increasingly demanding
translation services that provide faster
turnaround while maintaining the highest
level of quality. A key driver behind the
need for speed and quality is the ongoing
explosion of web-based content and the related
expectations of content freshness and quality.
Most of these organizations, which operate
at a global level, have several thousand
previously translated documents scattered
throughout the company, containing valuable
terminology that remains locked away. Unlocking
100% of these translations can empower all
the collaborators - the authors, translators,
revisers and external service providers
- to develop a greater synergy, saving research
time and improving consistency. Unlike traditional
translation memory tools, the TextBase Translation
Memory can leverage more repetitions since
it is designed to operate at the full-text
level rather than from a laborious database
of pre-aligned sentences out of context.
12.15
Discussion
12.30
Termprofile.com - Supporting a conference
interpreter's workflow,
Anja Ruetten, Germany
A tool that searches for terms on the
internet, differentiates between the
countries of origin of the hits, and presents
relative frequencies and comparative numbers
for alternative terms in one or several
languages at a glance can significantly
enhance the workflow of interpreters (and,
of course, translators). TermProfile shows whether terms or
expressions in a certain language are used
in the respective countries. It offers an
interface to do three queries in parallel
and then shows the search result numbers
on one screen. For each query it is possible
to enter a second term ("control term")
in order to obtain a relative frequency
of a term. TermProfile offers a web-statistical
profile of terms and expressions with dimensions
and relations that go far beyond a simple
Google search.
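The control-term idea can be illustrated with a little arithmetic. The following is a minimal sketch, not TermProfile's actual implementation: the hit counts are invented sample numbers, the function name relative_frequency is hypothetical, and it assumes that "relative frequency" means the hit count of a term divided by the hit count of its control term.

```python
def relative_frequency(term_hits: int, control_hits: int) -> float:
    """Hits for a term per hit of its control term (assumed definition)."""
    if control_hits <= 0:
        raise ValueError("control term must have at least one hit")
    return term_hits / control_hits


# Invented sample counts: (term hits, control-term hits) per country.
sample_counts = {
    "UK": (1200, 48000),
    "US": (300, 60000),
}

# One relative frequency per country, so raw counts from web spaces of
# different sizes become comparable.
profile = {
    country: relative_frequency(term, control)
    for country, (term, control) in sample_counts.items()
}
```

Dividing by a control term normalises for the size of each country's web space, which is why a raw hit count alone can be misleading.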
13.00 Discussion
13.05
Lunch and Exhibition
14.00
TILP Ask the Expert Session: TechLink:
Training and Education for Localisation
Are you a freelancer wondering how you could
possibly pay hundreds of euros for professional
training to develop your career? Are you
an employer asking yourself how to attract
properly educated and independently certified
professional localisers? Are you an educator
of future localisers who does not have proper
access to the appropriate resources? Then
you should attend this special session as
it will discuss these and similar questions
with international experts, addressing the
opinions and views of delegates.
Career development, courses on offer, and professional
certification will be centre stage at this
session. It will be run by TILP and supported
by the EU-funded TechLink project, which
aims at developing localisation training
courses for Asia.
14.00
Introduction by Chair: Reinhard Schäler
14.20 Training needs in localisation - the
client's view, Liam Cronin (Microsoft,
Ireland)
14.40 Training initiatives in localisation
- the vendor's view, Charles Campbell
(spanishbackoffice, Argentina)
15.00 Career development: staying
up to date - the freelancers' view, Miriam
Lee (self-employed and VP, Fédération
Internationale des Traducteurs, FIT)
15.20 Access to resources and accreditation
- the training providers' view,
Dr. Tim Altanero (Austin, Texas) and Debbie
Folaron (Concordia University, Canada)
15.40 Break
16.00 Panel Discussion
16.50 End of Day One
17.00
TILP 2006 AGM
17.30
Reception for
all ASLIB delegates on the occasion of the
official
announcement of the 2006 TILP Fellowship.
=============================================================
Day Two: 17 November 2006
08.45
Coffee and Registration (for Day Two delegates
only)
09.15
Introduction by Chair: Professor Ruslan
Mitkov, University of Wolverhampton, UK
09.20
W3C Internationalization Tag Set: A Gentle
Introduction, Christian Lieske, SAP,
Germany
XML has many built-in capabilities to support
the worldwide use of content. Proper use
of these capabilities for the purpose of
internationalization (i18n) and localization
(l10n), however, sometimes requires considerable
expertise. This holds especially for developers
of XML schemas, and producers of XML instances
(such as authors or translators). The ITS
Working Group (ITS WG) has created a standard
which makes it easier to create XML which
is internationalized and can be localized
effectively. On the one hand, the ITS WG
identifies concepts (such as "directionality")
which are important for i18n and l10n. On
the other hand, the ITS WG defines implementations
of these concepts (termed "ITS data
categories") as a set of elements and
attributes called the Internationalization
Tag Set (ITS). ITS can be used with new
as well as with existing XML-based content.
Furthermore, ITS works with popular schema
languages such as XML DTD, XML Schema and
RELAX NG, and widely adopted XML schemas
such as DocBook and DITA. It is expected
that ITS can be used in a wide variety of
processing contexts.
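As an illustration of one ITS data category, the sketch below reads the local its:translate attribute to decide which text should go to translation. It is a simplified assumption-laden example, not part of the talk: the sample document and the function name translatable_text are invented, only local markup is handled (global its:rules are ignored), and attribute values are not considered.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = f"{{{ITS_NS}}}translate"

# Invented sample document using local ITS markup.
SAMPLE = """\
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
  <title>User guide</title>
  <code its:translate="no">print_report()</code>
  <note its:translate="no">Internal <b>only</b></note>
  <p>Press the <key its:translate="no">Enter</key> key.</p>
</doc>
"""


def translatable_text(element, inherited=True):
    """Yield (tag, text) pairs for text that should be translated.

    Element content is translatable by default; a local its:translate
    value is inherited by child elements until overridden.
    """
    flag = element.get(TRANSLATE)
    translate = inherited if flag is None else (flag == "yes")
    text = (element.text or "").strip()
    if translate and text:
        yield element.tag, text
    for child in element:
        yield from translatable_text(child, translate)
        # Text after a child belongs to the parent element.
        tail = (child.tail or "").strip()
        if translate and tail:
            yield element.tag, tail


segments = list(translatable_text(ET.fromstring(SAMPLE)))
```

Here the content of code, note (including its nested b element) and key is excluded, while the surrounding text of title and p is kept.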
10.00
Improving the Quality of Automated DVD
Subtitles via Example-Based Machine Translation
(EBMT), Stephen Armstrong, Dublin City University
Motivated by the varying quality of DVD
subtitles due to declining budgets and time-frames,
our project set out to examine the scope
of a technology-based solution. This paper
aims to investigate whether a correlation
exists between the quality of automated
DVD subtitles and the corpus used to train
and test
the system. We want to investigate whether
or not the quality varies greatly depending
on the corpus used for training purposes.
The method used to produce automated DVD
subtitles is Example-Based Machine Translation
(EBMT) and the suggested training data are
a language-specific corpus containing only
existing DVD subtitles and corpora containing
a mix of specific and general language data.
The Example-Based Machine Translation (EBMT)
system we are using was designed and implemented
by the NCLT MT group at Dublin City University
(Stroppa et al., 2006).
10.35
Discussion
10.45
Coffee
11.15
Automatic Multilingual Subtitling in
the eTitle Project, Antoni Oliver Gonzalez,
GliCom, Spain
European media companies and archives produce
and hold large quantities of high-quality
programme material that is not exploited
because it is in the wrong language, or
contains elements that are too local in
appeal. While these programmes could be
localised by sensitive editing and sub-titling,
the costs of doing so are prohibitive for
most classes of material. eTitle is a two-year
project that ended in February 2006 and
was designed to create web-based solutions
that allow media content owners to exploit
it internationally, through multilingual
and cross-platform localisation. eTitle
builds on a spectrum of newly available
technologies for Digital Asset Management,
Automated speech-to-text, Machine Translation,
Sentence Compression, Subtitling Automation
and Metadata Automation to provide a much
more cost-effective digital workflow.
11.50
Discussion
11.55
Integrated Bilingual Specialist Dictionaries
- LexTerm initiative, Marie-Jeanne Derouin,
Langenscheidt, Germany and André
Le Meur, Université de Rennes 2,
France
Dictionary publishers dealing with millions
of bilingual and multilingual data items have
to find answers to the following two main
issues:
- accommodating the needs of a still large
community of dictionary users for traditional
printed or electronic dictionaries and
- meeting the need of the professional users
(translators, technical writers etc.) beyond
Machine Readable Dictionaries on CD-ROM
or Online
From now on, dictionaries also have to be
developed as one component of multifunctional
tools for Computer Assisted Translation
(CAT). For this purpose the German specialist
dictionary publisher, Langenscheidt Fachverlag
(LFG) in Munich proposes a global solution
together with experts from the University
of Rennes 2 in France and well-known Translation
Memory providers. The aim is to produce
two versions out of a single source for
every bilingual dictionary: a lemma-oriented
one for paper and electronic dictionaries
and a concept-oriented one for integration
in other language tools. LexTerm is a methodology
for reusing lexicographical data and building
a bridge between dictionaries and terminology.
It relies on ISO standards (ISO 16642 for
terminology and ISO 1951 for lexicography)
and an XSL library, which is publicly available.
12.30
Discussion
12.40
Lunch and Exhibition
14.10
Introduction by Chair: Chris Pyne, SAP
AG, Germany
14.15
Opentrad Apertium open-source machine translation
system: an opportunity for business and
research, Gema Ramírez-Sánchez,
Transducens (DLSI, Universitat d'Alacant),
Spain
Most successful machine translation systems
(MTS) built until now use proprietary software
and data, and are distributed as commercial
products or are accessible through the net
with some restrictions. MTS of this kind
are regarded by professional translators
and researchers as closed products only
suitable for use or reverse engineering.
We present Opentrad Apertium, an open-source
shallow-transfer machine translation engine
for related-language pairs, developed in
a large, government-funded open-source development
project involving 4 universities and 3 enterprises
all located in Spain. The system uses standard
formats for linguistic data (based on XML)
in order to ease interoperability. The translation
technologies used in Opentrad Apertium are
modular, which makes them naturally adaptable
to a wide variety of purposes in addition
to machine translation.
14.45 Discussion
14.50
Business Process Outsourcing in Document
Management, Luc Huygh, euroscript, Luxembourg
This presentation aims to give an overview
of how Documentum - one of the leading enterprise
content management systems on the market
- was chosen to support and steer production
processes for the European Institutions.
The ultimate aim is to include all of the
company's production processes under the
EMC Documentum hood (not only institutional).
At the Aslib conference, Luc will present
a pilot, which has been running for over
two years for the European Parliament. The
pilot processes multilingual documents that
come with very tight turnaround times and
feature a highly specialized language. The
Documentum solution under scrutiny here
is called escæpe (short for euroscript
advanced production environment for document
processing).
15.20
Discussion
15.25
Tea
15.45
The use of multi-level annotation and
alignment for the translator, Mihaela Vela
& Silvia Hansen-Schirra, Saarland University
Up to now the annotation of translation
corpora, i.e. their linguistic enrichment,
has been carried out in order to empirically
investigate the properties of translated
text. On the other hand, practising translators
also work with large amounts of translated
texts, the enrichment of these parallel
texts, however, being mostly limited to
sentence alignment. The use of these aligned
texts in translation memories is again limited
to string-based queries. The aim of this
paper is to show how a multiply annotated
and aligned corpus can be used as a translation
memory, exploiting the linguistic enrichment
of the corpus. The research described here
is part of a pilot project called KOALA
for which we use the CroCo Corpus (cf. Hansen-Schirra
et al. 2006) which consists of English originals,
their German translations as well as German
originals and their English translations.
Both translation directions are represented
in eight registers. Altogether the corpus
comprises one million words.
16.15
Discussion
16.20
Fully Automatic High Quality Machine
Translation of Restricted Text - A Case
Study, Uwe Muegge, Medtronic, Inc., USA.
Medtronic is currently in the process of
consolidating multiple distributed legacy
product databases into one centrally managed
SAP database. With a nine-figure budget,
the Centerpiece effort is the largest and
most visible IT project Medtronic has ever
undertaken. One crucial part of this project
is the translation - into eight languages
- of existing descriptions for 50,000 products,
as well as approx. 200 new descriptions
that are being added to the database every
week. With both normalized source text and
comprehensive, authoritative terminology
in place, Medtronic is in an excellent position
to use machine translation to produce translations
of these product descriptions at the push
of a button. This presentation illustrates
the processes that allow Medtronic to produce
translations in-house, instantly, at higher
quality than previous human translations,
and at a fraction of the cost of human translation.
16.50
Discussion
16.55
Close of Conference