ULG's Language Services Blog

When Machine Translation Works, and When It Is Likely To Need Some Help

This article was originally published in April 2018 and has been updated.

Machine Translation (MT) and its applications continue to gain traction as one of the hottest topics in the language industry, and under the guise of “natural language processing” (NLP), it is also one of the hottest topics in the much broader field of Artificial Intelligence (AI). With so much visibility and investment, it's hard to stay on top of where MT can drive real value within your organization. We hope this article will help!

So, what does the future of MT really hold? For now, it’s hard to say definitively what the technology will look like in 10 years. But we do know that, for the time being, it’s impossible to get by using only MT for all of your translation needs. This is not so much a limitation of the translation itself as of the system’s ability to understand the context and evolution of language within an industry or a specific set of content. As you develop new products and solutions, or as the language you operate in evolves to include new slang or terms for new innovations, the AI still struggles to map what it knows about language in one field onto another field. Within the AI industry, that kind of transfer is referred to as “general artificial intelligence,” as distinct from the current state of “artificial intelligence.”

However, that does not mean that MT as it stands does not represent tremendous opportunity for language solutions.

Drives efficiency and speed

Knowing where MT can act as a standalone solution versus where MT will need help from human translation or human training typically comes down to a few factors:

  1. Cultural sensitivity of the messaging
  2. Regulatory or liability demand of the content
  3. Demand for speed of delivery

We will touch on each of these areas and address how to help MT meet your needs.

Cultural sensitivity

Cultural sensitivity matters where the target culture of your audience makes a big difference to your outcome. The clearest example is marketing content. MT will easily translate “get your big TV for the big game on Thanksgiving,” but it won't realize that in a number of cultures this message alone will not help you sell “big TVs.” More subtly, the way to speak to a youth market in Germany or a control valves market in Japan is not captured by off-the-shelf MT systems like Google Translate. Where your message is highly culturally sensitive, you will typically need both training of the MT engine, to align it to your key terms and phrases, and post-editing, where a linguist reviews the MT output and revises it to suit your target market.

Regulatory or liability demands

Regulatory settings, including security regulations like HIPAA, are an important consideration when applying MT. If you use Google Translate, the content submitted for translation ultimately ends up in the public domain. If someone in your office runs an HR complaint about your business through it, for example, that complaint could go public. Likewise, if you carry high liability for the content (such as clinical trial results or feedback, or contractual issues), you will still be directly liable for an error and its impact, even if an MT engine made that error. Remember, you cannot take an AI translation engine to court to testify about why it chose a given translation term or why a specific error occurred. Professional translation agencies like ULG carry significant and appropriate professional insurance for just such challenges. An added challenge in these situations is that AI-driven MT is often extremely variable; it can be near-perfect for 99% of the document, but then be entirely wrong in the last 1%. If there is a high demand for regulatory compliance or liability protection, then that 1% will outweigh any potential savings.

Recently we have started to prove out, test, and deploy AI-driven MT in these areas. Just a few years ago we would have said it was not possible, but as AI and the ability to train and customize an engine improve, new doors have opened. For this type of content (and for compliance with ISO Quality and Information Security Management System Standards) we always use a two-stage human translation authentication. The process starts with MT, followed by a human translation review and then a human consistency check and second review. This multi-step approach still yields up to 15% cost and time savings with zero risk of error.

Demand for speed

When speed is of the essence (a good example being legal court productions where millions of words are needed within a few weeks), MT is literally the only answer. The key here is to understand that the timeframe simply does not allow for human support; being clear on that point up front sets the right expectations for everyone involved. Under an extreme speed demand, even a post-edit-style review is unlikely to be feasible: an average reviewer can read 10,000 words a day of human translation, which equates to five hundred days of work for 5 million words. If you accept that human review is out of the question and that MT must stand alone to produce the required volume within the required timeline, then you can also accept the “available quality” output, or what is termed “raw MT” output.
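For readers who want to run the same arithmetic on their own volumes, here is a minimal Python sketch. The 10,000 words per reviewer per day comes from the figure above; the function names and the 21-day deadline (one reading of “a few weeks”) are our own illustrative assumptions, and the math assumes the work splits perfectly across reviewers with no coordination overhead, which real projects rarely achieve.

```python
def review_days(total_words: int, words_per_day: int = 10_000, reviewers: int = 1) -> int:
    """Days needed to review total_words, rounded up to whole days.

    words_per_day is the per-reviewer throughput quoted in the article;
    reviewers is a hypothetical team size (assumes perfect parallelism).
    """
    if total_words < 0 or words_per_day <= 0 or reviewers <= 0:
        raise ValueError("total_words must be >= 0; throughput and team size must be > 0")
    daily_capacity = words_per_day * reviewers
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_words // daily_capacity)


def reviewers_needed(total_words: int, deadline_days: int, words_per_day: int = 10_000) -> int:
    """Smallest team that could review total_words within deadline_days."""
    if total_words < 0 or words_per_day <= 0 or deadline_days <= 0:
        raise ValueError("total_words must be >= 0; throughput and deadline must be > 0")
    capacity_per_reviewer = words_per_day * deadline_days
    return -(-total_words // capacity_per_reviewer)


if __name__ == "__main__":
    volume = 5_000_000  # the 5 million words from the example above
    print(review_days(volume))           # 500 days for a single reviewer
    print(reviewers_needed(volume, 21))  # 24 reviewers for an assumed 21-day deadline
```

Even under these generous assumptions, a full review inside three weeks would take two dozen reviewers working in parallel, which is why raw MT is usually the realistic option at this scale.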

If this is the focus, then four to five days of engine training up front might well improve the “raw” output in a way that even a hundred days of post-editing would not. Focusing the human effort on engine training will yield much better returns without sacrificing your timelines.

By accepting raw output in this regard, you can avoid creating disconnects about quality with your target audience. Where enterprises say “we can deliver these 10,000 FAQs through MT, but there will be quality issues; please help us with those,” they get very enthusiastic receptions from their user base. However, where enterprises use MT hoping to get human quality in a tight timeframe and process where it is not even possible, they get significant pushback to the tune of “this looks like it was machine translated.” It is the same output delivered with different expectations, resulting in very different outcomes.

Knowing when and where to apply MT, and how best to integrate and use human linguistic resources to augment, train, and support the desired outcome, has saved our customers millions of dollars each year. If you have any interest in MT, we would love to connect you to one of our world-leading computational linguistics experts, who can share what we have learned from working with hundreds of enterprise-class MT clients.