Beyond text: the importance of external research to the quality of translation outputs in Japanese
Google first introduced neural translation to its MT platform in November 2016. Since then, machine translation has been considerably developed and refined. The fluency of neural machine translation has certainly improved a great deal.
Translation companies and language service providers (LSPs) have embraced machine translation as a means of reducing their costs. However, given the necessity for Machine Translation Post Editing (MTPE), is machine translation now effective in reducing production times for Japanese text translations?
I have conducted a study that explored the quality of MT outputs in Japanese utilising source text that is typical of a real-world translation project.
Has neural machine translation (NMT) achieved human parity?
Microsoft has published research that suggests NMT has achieved human parity. But such assertions are questionable given the limited inter-sentential context available. In addition, the proficiency of evaluators and the evaluation methods used to draw this conclusion require investigation. It is clear that machine translation can deliver poor results in certain contexts such as when numbers are involved. It is also problematic that external references cannot be utilised.
I was keen to identify the errors that appear in MT-generated texts and to explore whether customisation would enhance and accelerate the post-editing process. I aimed to establish the proportion of critical errors that could occur due to insufficient research into relevant external information.
The importance of research
Research is often the most time-consuming aspect of the translation process and therefore may appear to be the least cost-effective. Whether MTPE ultimately saves time or lengthens the translation process is a matter of debate. Studies have indicated that experienced professional translators tend to spend more time conducting research than novice translators. The conflicting opinions regarding how long post-editing can take could be due to researchers having looked at work conducted by translators with varying levels of expertise.
I felt it was crucial to establish the cost-effectiveness of MTPE. This work is often shunned by translators as they feel that their efforts in this regard are undervalued and that they are underpaid as a result.
What is neural machine translation?
Neural Machine Translation utilises a neural network modelled on the activity of nerve cells in the human brain. The technology is described as Artificial Intelligence (AI) as it simulates the workings of human memory, and it is enhanced by deep learning. An NMT system can learn the various features of input data, including text, images and voice data. The underlying computational model is the perceptron, a model that simulates the human brain's ability to recognise and discriminate.
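To make the perceptron idea concrete, here is a minimal sketch of the classic perceptron learning rule in Python. It is purely illustrative and is not a model of any MT system: modern NMT engines chain together vast numbers of such units, and the toy task below (learning the logical AND function) is an invented example.

# A minimal sketch of the classic perceptron learning rule (illustrative only).
import numpy as np

def train_perceptron(inputs, labels, epochs=10, lr=0.1):
    """Learn weights that separate two classes of input vectors."""
    weights = np.zeros(inputs.shape[1])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(inputs, labels):
            prediction = 1 if np.dot(weights, x) + bias > 0 else 0
            error = target - prediction
            weights += lr * error * x  # nudge the weights towards the correct output
            bias += lr * error
    return weights, bias

# Toy example: learn the logical AND function from four labelled examples.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
print([1 if np.dot(w, x) + b > 0 else 0 for x in X])  # expected: [0, 0, 0, 1]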
Neural networks have been researched for more than 50 years, but to obtain highly accurate results, it is necessary to accumulate a large amount of data. This wasn’t possible until the arrival of the internet and the development of computers that possessed greater processing capabilities.
The development of neural network-type technology for machine translation was underway as early as the 1980s. The Janus Translation system appeared in 1991; it was a speech-to-speech translation system using technology similar to that of neural networks and could translate spoken English utterances into Japanese and German. Further technologies resembling neural models appeared in 1997.
The Google Brain project was established in 2011. Google and Microsoft released Neural Machine Translation systems in 2016. This development dramatically changed the quality of translation outputs.
Neural networks are heavily dependent on data, and this clearly leaves Google in an advantageous position. Since NMT was incorporated into Google Translate in 2016, the platform has become the best-known and most used in the world. It currently supports 109 languages. There are other platforms that can accommodate some level of customisation. These include DeepL, Amazon Translate and Microsoft Translator.
Is the hype surrounding NMT justified?
Expectations for NMT have always been high and this has led to a considerable amount of hype surrounding the technology. Scholars have suggested that NMT could mean the end of foreign language teaching and even professional translation.
In truth, it is impossible to assess the full implications of NMT. The advances made in both NMT and Statistical Machine Translation (SMT) have already significantly changed the way that professional translators work. It is now common to use electronic dictionaries for translation, and both translators and post-editors are using Computer Assisted Translation tools (CAT tools). Google Translate is known to have produced questionable translation outputs in the past, but the neural MT system DeepL has been endorsed by many academics.
Since the arrival of neural capability, many aspects of life, from signage to academic papers, have been translated using MT. But the technology still produces erroneous outputs and makes errors that humans never would. Such errors can be merely amusing, but they may also have more serious consequences.
How do translation professionals feel about NMT?
A recent survey of translators revealed that, despite the numerous translation tools available to them, most felt that no single MT engine could fulfil all their requirements. Concerns were expressed regarding the level of technical expertise needed to use the systems. Translators also felt that the limitations imposed by the technology, including text length restrictions, were problematic and that difficulties in evaluating MT performance remained.
The primary issue for translation professionals now appears to be that while the general quality of NMT is increasing, the amount they can charge for post-editing is rapidly decreasing as it is perceived to be less important. Many have begun to resist or even refuse post-editing work for this reason. Studies have shown that pausing (thinking) and referencing have the greatest impact on the time taken when undertaking post-editing, yet post-editors are generally paid according to volume and not time.
NMT tool providers are now offering customisation options and additional functionality. While it may appear that almost anyone can create a valid custom MT engine, such an endeavour requires both technical skills and considerable linguistic knowledge. The inevitable consequence of unskilled translators using MT will be embarrassing errors.
What are the challenges of machine translation?
NMT continues to present many challenges to its users, and these can cause confusion:
System ambiguity
The reasons why trained neural systems choose specific words or sentences during decoding aren't properly understood. The insertion of a single space or punctuation mark can entirely change the output, and so systems do not yield predictable outcomes (a brief sketch illustrating this sensitivity appears after this list of challenges).
Domain mismatch (Domain adaptation)
The performance of an NMT system deteriorates when translating out-of-domain sentences. NMT systems tend to prioritise fluency over adequacy; when facing an unknown domain, they therefore produce output that reads smoothly but is of lower quality.
Volume of training data
An NMT system relies upon a large corpus of training data. When the scope of the training data is reduced, the accuracy of outputs drops. NMT also demonstrates weakness in translating low-frequency words.
Long sentences
NMT's accuracy can only be maintained when sentences feature 60 characters or fewer. Long sentences result in dramatically reduced output quality.
Compromise
It is not possible to be both faithful to the source language and natural in the target language. A compromise is required between meaning and naturalness to gain acceptable translations overall. Technology cannot rival humans in making such compromises.
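As an illustration of the system ambiguity described above, a quick experiment of the following kind can be run against an MT engine. This is a hypothetical sketch (not part of my study) using the official deepl Python package; the DEEPL_AUTH_KEY environment variable and the example sentences are assumptions for illustration only.

# Compare outputs for two inputs that differ only by a stray space.
import os
import deepl

translator = deepl.Translator(os.environ["DEEPL_AUTH_KEY"])

original = "Update the driver, then restart the service."
variant = "Update the driver , then restart the service."  # one stray space added

for text in (original, variant):
    result = translator.translate_text(text, source_lang="EN", target_lang="JA")
    print(repr(text), "->", result.text)

# The two Japanese outputs may differ noticeably, even though the English
# meaning is identical; this is the unpredictability described above.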
What issues arise with machine translation?
Machine translation is a form of what is known as interlingual translation (translation from one language to another). It is the practice of producing text in the target language that fulfils the role of the source text. The structural components of the text must be considered alongside the literal meaning of the words. Any translation involves at least two languages and two cultural traditions. Language, culture and life experience should be taken into account when translating. The role of the document and the purpose of the translation must be considered. Style is also an important consideration. In other words, translating is a complex undertaking that requires finesse.
Machine translation can lack the required finesse to deal with the following issues:
Lexical ambiguity: words having multiple meanings.
Many words have two or more meanings. For example, the word note can refer to a written record, a warning or a musical sound. Such lexical ambiguities can result in translation errors.
Lexical mismatches: differing conceptual structures between language communities.
Lexical mismatches can arise from cultural differences. For instance, the Japanese language features many ways to express the concept of rain. There are no equivalent expressions in English for many of these.
Lexical holes: unlexicalized concepts across languages.
When there is no appropriate word in the target language to use, this is called a lexical hole. A translator must coin an appropriate paraphrase.
Multiword lexemes: idioms, phrasal verbs, and collocations.
Such figures of speech are challenging for translators, as are euphemisms and hyperbole, yet all are common features of marketing material. Attention must be paid to the underlying meaning of these phrases; otherwise, the resulting translations could be confusing or may not carry the same impact.
Specialized terminology and proper names: words used by certain discourse communities together with the names of people, places and organizations that often do not appear in dictionaries.
It is hard for translators to settle upon the correct translation of these aspects of any language. It is highly beneficial for clients to provide their LSPs with the correct terminology and precise definitions at the outset of a project.
False cognates: words that appear to be the same in more than one language but are not.
False cognates could be described as false friends. They usually occur when words have been borrowed from another language. For example, in Japanese, a handle is a vehicle's steering wheel, whereas in English it is a handgrip or knob.
All the above-mentioned features of content are difficult for any machine translation system to process and thus require considerable human intervention. When the first forms of NMT appeared, translation agencies and Language Service Providers were excited by the possibilities and started to provide NMT services. However, recent surveys have suggested that while the use of NMT is increasing, LSPs still have issues operating the systems and generating profits from them. This is largely because MTPE is both challenging and time-consuming. In addition, skilled post-editors are difficult to source.
I conducted my own survey which revealed that most Japanese translators don't accept MTPE work and that those who do accept it are poorly paid for their efforts.
Translations of superior quality cannot be produced via a linguistic process alone. External referencing, consulting related documentation and gaining an understanding of non-linguistic information including cultural norms are also crucial aspects of the work. Research suggests that readers favour human translations over machine translations, even when MT outputs have been post-edited. In addition, it has been found that the quality of post-edited outputs will depend heavily on the quality of the initial outputs and that NMT does not reduce the level of skill required for post-editing.
Looking at the importance of customisation and external research
The objective of my study was to discover to what extent translations can be improved by customising MT and by conducting additional research into the subject of the source material.
In my study, text was machine translated, and the MT outputs with and without customisation were analysed to identify errors by type. The source text was extracted from the Microsoft documentation site. The source language was English, and the target language was Japanese. I compared MT output produced using DeepL with output produced using Google AutoML customised with a large corpus.
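For readers who wish to try something similar, the sketch below shows how a customised model can be invoked through the Google Cloud Translation v3 API. The project ID, model ID and example sentence are placeholders, and this is not the exact pipeline used in my study.

# Calling a customised (AutoML) model via the Cloud Translation v3 API.
from google.cloud import translate_v3 as translate

PROJECT_ID = "my-project"        # assumed Google Cloud project ID
MODEL_ID = "my-custom-model"     # assumed AutoML Translation model ID
LOCATION = "us-central1"         # custom models are served from this region

client = translate.TranslationServiceClient()
parent = f"projects/{PROJECT_ID}/locations/{LOCATION}"

response = client.translate_text(
    request={
        "parent": parent,
        "contents": ["Select the Advanced tab, and then clear the check box."],
        "mime_type": "text/plain",
        "source_language_code": "en",
        "target_language_code": "ja",
        # Omit the "model" field to fall back to the general (base) NMT model.
        "model": f"{parent}/models/{MODEL_ID}",
    }
)
for translation in response.translations:
    print(translation.translated_text)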
I found that the DeepL outputs contained three times as many errors as the Google customised MT outputs. I also discovered that the longer the sentences were in the DeepL outputs, the higher the incidence of errors. Where document-level textual information requiring external research was concerned, the DeepL outputs featured more than twice as many errors as the Google customised outputs. Google customised MT was contextually superior to DeepL, which indicates that customisation could help to improve contextual referencing within a corpus.
DeepL produced more punctuation errors than Google. Both systems generated a huge number of terminology errors. This demonstrates that there are many issues that must be addressed by terminology integration or human post-editing. This is an area that requires further exploration. But it is obvious that the choice of MT and the quality of the corpus can critically affect the quality of the final output.
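Terminology integration of the kind mentioned above can, for example, be approached by attaching a glossary to each translation request. The sketch below assumes a pre-built EN-JA glossary resource in the Cloud Translation v3 API; all IDs are hypothetical.

# Terminology integration via a Translation v3 glossary (hypothetical IDs).
from google.cloud import translate_v3 as translate

PROJECT_ID = "my-project"       # assumed Google Cloud project ID
GLOSSARY_ID = "ms-docs-terms"   # assumed glossary built from an EN-JA term list
LOCATION = "us-central1"

client = translate.TranslationServiceClient()
parent = f"projects/{PROJECT_ID}/locations/{LOCATION}"
glossary_name = client.glossary_path(PROJECT_ID, LOCATION, GLOSSARY_ID)

response = client.translate_text(
    request={
        "parent": parent,
        "contents": ["Open Device Manager and update the driver."],
        "mime_type": "text/plain",
        "source_language_code": "en",
        "target_language_code": "ja",
        "glossary_config": {"glossary": glossary_name},
    }
)
# Glossary-aware results are returned separately from the default translations.
for t in response.glossary_translations:
    print(t.translated_text)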
My analysis of the outputs revealed several further issues. I discovered that inconsistent translations of terminology frequently occurred in MT outputs and that the technology struggled to cope with the absence of plurals in Japanese. There were stylistic inconsistencies in the MT outputs together with inconsistent translations of phrases that appeared numerous times in the source text. Punctuation and spacing errors also appeared in the outputs, but there were fewer of these errors in the Google outputs. I found that the outputs from both systems featured word or sentence-level omissions together with content distortion that could be avoided by conducting external research.
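Some of these consistency problems can at least be surfaced automatically before post-editing begins. The following is a simple illustrative check, with an invented term list and invented segments, that flags segments where an approved Japanese term does not appear even though its English counterpart does.

# Flag MT segments that do not use the approved Japanese terminology.
APPROVED_TERMS = {
    "check box": "チェック ボックス",
    "device manager": "デバイス マネージャー",
}

def flag_terminology_issues(segments):
    """Yield (segment_index, english_term) where the expected Japanese term is absent."""
    for i, (source, target) in enumerate(segments):
        for en_term, ja_term in APPROVED_TERMS.items():
            if en_term in source.lower() and ja_term not in target:
                yield i, en_term

segments = [
    ("Clear the check box.", "チェック ボックスをオフにします。"),
    ("Clear the check box.", "チェックボックスをオフにします。"),  # inconsistent rendering
]

for index, term in flag_terminology_issues(segments):
    print(f"Segment {index}: approved translation of '{term}' not found")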
My study clearly demonstrated that many issues remain with machine translation and that both customisation and external research are vital if superior translations are to be produced. MT cannot perform research and the burden of research falls on post-editors. Many of the errors that I found would have been easily identified by human translators and so would not have required correction by editors.
What is external research?
External references are an important aspect of translation. Such references include visual material, user interfaces and other outside information such as web sources. Human translators often obtain such references by sending queries to their employer or to the client, or simply by undertaking web searches. MT systems cannot identify what terminology was used in which segment of a text. In addition, MT cannot decide whether any vocabulary or terminology is appropriate for the context. For instance, MT struggles to differentiate between American English and British English.
What are my conclusions?
Human translators use research and their specialist knowledge to choose the right context and apply the best possible choice of translation. Machine translation simply cannot do this. Research on external references is the most time-consuming aspect of translation work. The time required to conduct research should not be underestimated, and neither should its impact on translation quality.
There are many obstacles to using and developing the MT environment so that it can produce improved outputs. These include platforms that cannot be customised and the fact that it is very difficult to establish a suitable level of customisation simply by using CAT tools. Terminology integration and corpus data training aren't possible at present on DeepL. This is particularly problematic when translating specialized content.
Customised Google MT delivers fewer errors than DeepL. However, to use any MT system satisfactorily, it is necessary to spend time learning how to work with the platform and how to integrate terminology. It is also necessary to organise human post-editing. If the volume of text to be translated isn't sufficiently large, is it worth spending considerable time and money on customising MT? Is it really possible to obtain satisfactory results? The answer to both questions at present is probably no.
New technology has inspired high expectations, but those expectations are unrealistic at present. Serious issues with MT remain. To make matters worse, I don't feel that adequate ways of assessing MT outputs have been established. It could even be said that the increasing fluency of machine translation conceals hidden dangers. A study of reliability judgements found that readers tend to trust fluent sentences more than disfluent ones. What if a seemingly fluent sentence is also wholly inaccurate? Humans are on a mission to explore the possibilities of MT but should be aware of the inherent risks.
Machine Translation Post Editing (MTPE) remains a necessity and so machine translation is often ineffective in reducing production times for translation companies and therefore their costs. MT could even lengthen the process due to the sheer number of errors and inconsistencies that must be identified and corrected.
NMT can prove useful and highly effective in many settings and its efficacy will continue to improve. But at this particular juncture, it does not deliver time and cost savings for professional translators who are dedicated to producing superior translations.