On Saturday the 17th of September 2016 I had the pleasure of attending the Threlford Memorial Lecture, where Professor Dorothy Kenny of Dublin City University gave an insightful talk on machine translation and why translation is at something of a crossroads at the moment.
She started her lecture by dismissing the idea that translators nowadays find themselves in a state of malaise because of new technological developments in the translation industry (machine translation in particular). As an educator, she feels that the onus is on her to adopt a practical and optimistic approach and to show students the way forward by designing an appropriate, up-to-date curriculum. In fact, based on her research, she concluded that translators and interpreters are generally well disposed towards technology. She also raised the issues of the ethics of machine translation and its economic implications for the translation industry.
First of all, she briefly discussed the architecture of statistical machine translation (SMT) systems and why we need to view them as a process rather than a tool. As you may well know, there are a lot of SMT systems on the market (Google Translate, Microsoft Translator, Language Weaver), and it is important to understand that they work similarly to a recycling process: the system is trained to translate from source text (ST) to target text (TT) using a probabilistic model built from corpora of already translated texts. They are present on the market in the form of numerous translation memory products which are designed with post-editing in mind. However, just when everyone thought that SMT was here to stay, a new technology is emerging: Neural Machine Translation, which is based on a simpler architecture. Nevertheless, given the amount of research still required, it will be a good few years before we actually see it as a commercial product on the market.
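To make the "recycling" idea a little more concrete, here is a minimal sketch of my own (not something shown in the lecture) of how a probabilistic model can be estimated from already translated text. It assumes the phrase pairs have already been aligned, which is the hard part in a real statistical machine translation system, and it simply counts how often each target phrase was used for a given source phrase. The toy data and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: (source phrase, target phrase) pairs that are
# assumed to be already aligned. Real systems learn alignments automatically.
aligned_pairs = [
    ("la maison", "the house"),
    ("la maison", "the house"),
    ("la maison", "the home"),
    ("bleue", "blue"),
]


def estimate_phrase_table(pairs):
    """Estimate p(target | source) by relative frequency of past translations."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    table = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        table[source] = {t: c / total for t, c in target_counts.items()}
    return table


def best_translation(table, source):
    """Return the most probable recorded target phrase, or None if unseen."""
    candidates = table.get(source)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


phrase_table = estimate_phrase_table(aligned_pairs)
print(best_translation(phrase_table, "la maison"))
```

The point of the sketch is only that the system reuses what human translators have already produced: a phrase it has never seen (for example "chien" in this toy corpus) simply has no translation.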
Also, interpreters will find it useful to know that the same technology is used to translate spoken language, for example in the EU-bridge project. A person speaks naturally; his or her speech is transcribed by a speech recognition system and the resulting text is then fed into an SMT system. Of course, the automatic speech recognition system will introduce some errors and others will arise during the SMT process; for example, if the speaker pauses, the system might assume a full stop and the meaning might be altered. These are exactly the types of mistake that would be corrected during the post-editing stage. A similar speech translation system is Skype Translator and, although the reviews have been mixed since its launch, the general feedback seems to be: “It is automatic translation, so what do you expect?”
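The pause problem can be illustrated with a small sketch of my own. It assumes a much simplified pipeline in which the speech recogniser returns each word together with the silence that preceded it, and a hypothetical threshold decides when a silence counts as a full stop. Real systems are far more sophisticated; the names and the threshold value here are invented for illustration.

```python
# Hypothetical threshold (in seconds) above which a silence is treated
# as a sentence boundary.
PAUSE_THRESHOLD = 0.6


def segment_on_pauses(timed_words, threshold=PAUSE_THRESHOLD):
    """Split recognised words into sentences wherever the pause before a
    word is longer than the threshold."""
    sentences = []
    current = []
    for word, gap_before in timed_words:
        if current and gap_before > threshold:
            sentences.append(" ".join(current) + ".")
            current = []
        current.append(word)
    if current:
        sentences.append(" ".join(current) + ".")
    return sentences


# The speaker hesitates before "that", so one sentence becomes two.
recognised = [
    ("I", 0.0), ("did", 0.1), ("not", 0.1), ("say", 0.1),
    ("that", 0.9), ("he", 0.1), ("lied", 0.1),
]
print(segment_on_pauses(recognised))
```

Each fragment would then be translated on its own, so the translation system never sees the sentence as a whole, and this is the kind of error a post-editor would have to repair.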
Interestingly, she then asked how Translation Studies has reacted to SMT. Some say that all translators will become post-editors and that it is a matter not of if but of when. Please bear in mind that these are the informed opinions of practitioners. She then pondered whether there are in fact other opportunities for translators and post-editors. Or perhaps quite the opposite is true: if translating is such a complex skill, requiring a lot of effort and talent, wouldn’t this be a huge loss? And is this true for all languages? After all, some highly inflected languages such as Turkish and Finnish score quite badly in statistical machine translation.
But perhaps for translators the big question is: why would anyone become a post-editor? And why should they be called only post-editors, since terminology remains an important issue which requires additional effort? An equally important issue is how much the market is worth. After all, post-editing is only a small fraction of the market. She noted that in 2014 Common Sense Advisory estimated the translation industry to be worth 37 billion dollars, of which only 1.1 is attributed to post-editing, although this segment is growing.
To my surprise, research indicates that post-editing makes a translator more productive but also more tired and frustrated. This immediately made me think of how remote interpreting is regarded as more stressful for interpreters, and yet it is almost certainly here to stay because it is more accessible and cost-effective. The irony is that post-editing is often such a mechanical process (the translator often corrects the same basic mistakes over and over again) that, by automating the translation process, SMT has in fact given humans a mechanical, repetitive task that should really be left to machines.
And of course remuneration remains an issue. How does one compensate translators for post-editing so that it is worthwhile for them to undertake this frustrating task? Professor Dorothy Kenny cited two experiments carried out in 2012 showing that most translators (Italian, French) would charge 75 per cent of their normal rate for post-editing, except for German translators, who requested 110 per cent of their standard rate. Apart from decent remuneration, post-editors would also want a way of eliminating some of the repetitive tasks (automatic completion and predictions help with that), a quality confidence score, and information on where the data comes from (provenance data).
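As a back-of-the-envelope check of what those percentages imply, here is a short calculation of my own. It assumes that translators are paid per word and that hourly income is simply the rate multiplied by the words processed per hour. Neither assumption comes from the lecture, and the function name is hypothetical.

```python
def break_even_speedup(rate_fraction):
    """Return how many times faster post-editing must be than translating
    from scratch for hourly income to stay the same, given that it is paid
    at rate_fraction of the normal per-word rate."""
    if rate_fraction <= 0:
        raise ValueError("rate_fraction must be positive")
    return 1.0 / rate_fraction


for label, fraction in [("75 per cent of the normal rate", 0.75),
                        ("110 per cent of the normal rate", 1.10)]:
    print(f"{label}: {break_even_speedup(fraction):.2f}x")
```

Under these assumptions, a translator paid 75 per cent of the normal rate has to work about a third faster just to earn the same per hour, whereas at 110 per cent the post-editor earns the same even when working slightly more slowly.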
But perhaps her most innovative idea is that we need to discuss the ethics of machine translation and to factor in the decision-making agents. The purpose of using SMT is to produce texts that are good enough to be used as target texts, and for this very reason it is generally recommended for technical, less creative types of text. But if post-editing is often associated with mere ‘good enough’ quality, while a professional translator should be continuously striving for excellence (Andrew Chesterman, in ‘The Return to Ethics’, special issue of The Translator, 7(2), 2001), how can one reconcile this ethical principle with just being good enough? Doesn’t that mean that the overall quality standards of translation will be lowered? And what are the implications for translators and for the industry? Learning to translate at a professional level is a hard-earned accomplishment, and she expressed her concern that we risk losing this ability and replacing it with another (post-editing).
Sadly, human translators seem to have been left out of the equation completely, although statistically minded computer scientists rely heavily on human translators and their skills. Basically, people donate data and other people collect it and make money from it. Let’s not forget that SMT has benefited from readily available aligned parallel corpora created by bilingual government sources or multilingual organisations such as the UN or the EU. This leads on to legal issues, namely copyright law. Personally, I was surprised to find out that a report issued in 2014 by the European Commission clearly states that short segments and sub-segments are protected by copyright law, and that it is not clear whether what translation agencies are doing with this data (corpora) in fact constitutes an infringement of copyright.
And last but not least, this shift from translation to post-editing seems to have enabled a new type of linguist, the monolingual translator, to compete with bilingual translators when using today’s machine translation systems. Philipp Koehn (University of Edinburgh) carried out a study of monolingual translators who had no knowledge of the source language and were aided only by a machine translation system; those who were skilled (with good target-language skills and an understanding of the domain) came close to professional bilingual performance on some of the documents.
She concluded that although perhaps it is still not clear how we need to address all these issues, it is time we agreed on an ethical basis that will surely help us rise to the challenge.
References:
Chesterman, A. (2001). Proposal for a Hieronymic Oath. In A. Pym (ed.), The Return to Ethics, special issue of The Translator, 7(2), pages 139–154.
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C. J., Bojar, O., Constantin, A., and Herbst, E. (2007). Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Companion Volume: Proceedings of the Demo and Poster Sessions, pages 177–180, Prague.