
Google Translate is notoriously unreliable for Latin. However, the translations do make some amount of sense. Is there some kind of translation task involving Latin that Google Translate is relatively reliable for?

I tried it for simple phrases to and from English. Here are the results of my little test¹:

  1. Puer canem amat. → The boy dog loves. (The boy loves the dog.)
  2. Canem puer amat. → The boy loves the dog. (Correct!)
  3. Puerum canis amat. → The dog loves the child. (Correct!)
  4. Canis puerum amat. → The dog loves the child. (Correct!)
  5. Puer canis amat. → The boy dog loves. (The boy of the dog loves.)
  6. Canis puer amat. → The dog loves the child. (The boy of the dog loves.)
  7. Canis pueri amat. → The dog loves children. (The dog of the boy loves.)
  8. Pueri canis amat. → The dog loves children. (The dog of the boy loves.)
  9. The boy loves the dog. → Puer canem diligit. (Correct!)
  10. The dog loves the boy. → Canis puerum amat. (Correct!)
  11. The boy of the dog loves. → Puer canem amat. (Puer canis amat.)
  12. The dog of the boy loves. → Canis pueri amat. (Correct!)
  13. The boy and the dog walk together because it does not rain. → Puer et canis ambulare quia non pluet simul. (Puer et canis una ambulant quia non pluit.)

These might not be the perfect example sentences, but they demonstrate some basic syntax. (Whether or not the content makes sense should not affect translation in simple cases like this, but it might be that Google Translate is not wired that way.)

It seems that translation from Latin to English is very difficult even with simple structures. English to Latin fares much better, but it fails for the slightly more complicated sentence. Google does offer alternatives, and some of them greatly improve the last sentence, but someone with no knowledge of Latin will not be able to pick the right ones.

The tool might work better for translating individual words² or with some language other than English. And perhaps it does translate simple SVO clauses consistently well from English to Latin; I did not do an extensive test, and I have no prior experience. Does anyone have more experience with Google Translate? It would be good to know if there is something it is useful and reliable for, even if the scope is very limited.


¹ Format: Original → Google translation (better translation if Google fails)

² For translating individual words it's better to use any online Latin dictionary. But my question is not whether there are better tools than Google Translate; the question is whether Google Translate can be trusted for anything at all regarding Latin.

Rafael
Joonas Ilmavirta
  • I actually find it is a fun study/practice tool, as there is a part of Google Translate called Google Translate Community that lets you translate phrases they give you to or from Latin, and it lets you check other people's translations. In addition, it helps improve Google Translate, so that's a plus! – Sam K May 11 '17 at 01:21
  • @SamK Hmm... You could actually add that as an answer. If Google Translate comes with a tool for practicing and it (slowly!) helps improve the system, it is indeed interesting. – Joonas Ilmavirta May 11 '17 at 06:58
  • I think that Latin, being inflected and syncretic and having no fixed word order, is one of those languages where anything corpus-related is doomed to fail. In order to translate from Latin, a system will actually have to "know" Latin. Especially texts emulating so-called Ciceronian diction will need more than just a Markov chain and/or crowd-sourced user input. – blagae May 11 '17 at 10:20
  • @blagae I fully agree. Google Translate does not seem to have a good (any?) structural understanding of Latin, and without that it's pretty hopeless. – Joonas Ilmavirta May 11 '17 at 17:28
  • Amusingly, I was just playing around with Google Translate for Latin and found this gem: dolor: pain - sit amet: a lot - dolor sit amet: carrots. (All three also have the "community checked" shield as well.) – R.M. May 11 '17 at 17:53
  • I've been using Google Translate recently because of working on software that was originally written in France. As I don't read French, Google Translate has been useful for getting some sense out of the comments. – CWallach May 11 '17 at 23:36
  • The simple answer is: It's good for translating movie and tv subtitles to English. And, at times, newspaper articles. Don't try to use it for translating a non-English language to a non-English language, because that is pretty useless. – GwenKillerby May 13 '17 at 13:41
  • My experience with non-English languages is that Google Translate does a pretty good job that just needs a revision, provided that the corpus is big enough. I often use it to translate Catalan to Spanish and Spanish to Catalan and it helps me save a lot of work. I suppose the Latin-English corpus is just too small to provide a useful translation beyond grasping the meaning. – Pere Jun 30 '17 at 18:05
  • I just tried all 8 of the Latin→English examples in the question. Google Translate gives correct results for 1–6 without the trailing . and for 7 and 8 with the trailing . so I was able to get successful translations of all 8 sentences out of it. – ShreevatsaR Jul 03 '17 at 17:51
  • @ShreevatsaR Interesting! I'm not sure if it's learning over time or depends on the user's translation history. Dependence on trailing punctuation is weird, but I have seen that before. Perhaps you can indeed squeeze the right translation out of it, but not consistently, and not easily if you don't know Latin. – Joonas Ilmavirta Jul 03 '17 at 18:02
  • Btw, Google translator translates “google” to “google“, haha. – mykhal Jul 22 '20 at 06:13

6 Answers


A classmate of mine who got his Ph.D. in natural-language processing and now works at Google told me the following. It might be out of date and I might be remembering it wrong. But I just did a little, er, googling, and this seems to be passably well corroborated by other sources.

How it works

Google Translate is completely statistical. It has no model of grammar, syntax, or meaning. It works by correlating sequences of up to five consecutive words found in texts from both languages.

Here's the conceit. Ignore all the complexity, structure, and meaning of language and pretend that people speak just by randomly choosing one word after another. The only question now is how to calculate the probabilities.

A simple way is to say that the probability of each word is determined by the previous word spoken. For example, if the last word you said was "two", there is a certain probability that your next word will be "or". If you just said "or", there is a certain probability that your next word will be "butane". You could calculate these word-to-next-word probabilities from their frequencies in real text. If you generate new text according to these probabilities, you'll get random but just slightly coherent gibberish: TWO OR BUTANE GAS AND OF THE SAME. That's called a Markov model.

If you use a window of more words, say five, the resulting gibberish will look more likely to have been written by a schizophrenic than by an aphasic. A variation called a hidden Markov model introduces "states", where each state has its own set of probabilities for "emitting" words as well as a set of "transition" probabilities for what will be the next state. This can simulate a little more of the influence of context on each word choice.
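To make the idea concrete, here's a toy bigram Markov model in Python (my own illustrative sketch, not anything resembling Google's actual code; the tiny "corpus" is invented):

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length=8):
    """Emit words by repeatedly sampling a likely next word."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # dead end: no word ever followed this one
        nxts, weights = zip(*followers.items())
        out.append(random.choices(nxts, weights=weights)[0])
    return " ".join(out)

corpus = "two or butane gas and of the same two or more of the same gas"
model = train_bigrams(corpus)
print(generate(model, "two"))  # locally plausible, globally incoherent
```

Trained on a real corpus instead of one sentence, this produces exactly the sort of "slightly coherent gibberish" described above: each adjacent word pair is plausible, but nothing ties the sentence together.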

Google Translate's algorithm is proprietary, and I think it's a little more sophisticated than hidden Markov models, but the principle is the same. They let computers run on lots of text in each language, assigning probabilities to word sequences according to the principle "Assuming this text was generated by a random gibberish-generator, what probabilities maximize the chance that this exact text would have been generated?" Manually translated texts provide data to line up word sequences in one language with word sequences in another. Translation, then, is finding the highest-probability sequence from one language's gibberish-generator that corresponds to whatever makes the other language's gibberish-generator produce the input text.
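In symbols, that is the classic noisy-channel formulation: choose the English sentence e maximizing P(e) × P(latin | e). A toy sketch of that scoring, with probabilities invented purely for illustration:

```python
import math

# Toy noisy-channel scoring: pick the English candidate maximizing
# P(english) * P(latin | english). All numbers here are made up.
lm = {  # "fluency": how plausible each English sentence is on its own
    "the boy loves the dog": 0.4,
    "the boy dog loves": 0.01,
}
tm = {  # "faithfulness": how well each candidate lines up with the Latin
    ("puer canem amat", "the boy loves the dog"): 0.3,
    ("puer canem amat", "the boy dog loves"): 0.5,
}

def best_translation(latin, candidates):
    def score(english):
        # Sum of logs is the product of probabilities
        return math.log(lm[english]) + math.log(tm[(latin, english)])
    return max(candidates, key=score)

print(best_translation("puer canem amat", list(lm)))
# → "the boy loves the dog": fluency outweighs the word-order-literal match
```

The point of the toy: even though the word-for-word candidate aligns more literally with the Latin, the fluency model pulls the result toward something a native English speaker would actually say.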

What it's reliable for

Consequently, you won't learn much about what Google Translate is reliable for by trying out different grammatical structures. If you're lucky, all you'll get from that is an ELIZA effect. What Google Translate is most reliable for is translating documents produced by the United Nations between the languages in use there. This is because UN documents have provided a disproportionately large share of the manually translated texts from which Google Translate draws its five-word sequences.

Witness what happens when I type this in:

À l'exception de ce qui peut être convenu dans les accords particuliers de tutelle conclus conformément aux Articles 77, 79 et 81 et plaçant chaque territoire sous le régime de tutelle, et jusqu'à ce que ces accords aient été conclus, aucune disposition du présent Chapitre ne sera interprétée comme modifiant directement ou indirectement en aucune manière les droits quelconques d'aucun État ou d'aucun peuple ou les dispositions d'actes internationaux en vigueur auxquels des Membres de l'Organisation peuvent être parties.

It gives me:

Except as may be agreed upon in the special guardianship agreements concluded in accordance with Articles 77, 79 and 81 and placing each territory under the trusteeship system, and until such agreements have been concluded, This Chapter shall not be construed as directly or indirectly modifying in any way the rights of any State or any people or the provisions of international instruments in force to which Members of the Organization may be parties.

Perfect! (Almost.)

This is why its Latin translations tend to be so poor: it has a very thin corpus of human-made translations of Latin on which to base its probability estimates—and, of course, it's using an approach that's based on probabilities of word sequences, disregarding grammar and meaning.

So, until the United Nations starts doing its business in Latin, Google Translate is not going to do a very good job. And even then, don't expect much unless you're translating text pasted from UN documents.

The five-word window

Here's an illustration of the five-word window. I enter:

Pants, as you expected, were worn.

Pants were worn.

Pants, as you expected, are worn.

The Latin translations (with my manual translations back to English):

Anhelat quemadmodum speravimus confecta. (He is panting just as we hoped accomplished.)

Braccas sunt attriti. (The trousers have been worn away [like "attrition"].)

Anhelat, ut spe teris. (He is panting, just as, by hope, you are wearing [something] out.)

Notice that the first and third sentences border on ungrammatical nonsense. The second sentence makes sense but it's ungrammatical; it should be Braccae sunt attritae. There aren't any five-word sequences in Google Translate's English database that line up well with "pants as you expected were/are," so it's flailing. Notice that in the third sentence, by the time it got to "worn", it had forgotten which sense of "pants" it chose at the start of the sentence. Or rather, it didn't forget, because it never tracked it. It only tracked five-word sequences.

So, whether the sentence makes sense sort of affects the translation, but it's worse than that. What matters is exact, word-for-word matching with texts in the database.
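For concreteness, here is what a sliding five-word window over the "pants" sentence looks like (a trivial sketch of the windowing itself, not of Google's matching):

```python
def windows(sentence, n=5):
    """Slice a sentence into all overlapping n-word sequences."""
    words = sentence.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

for w in windows("pants as you expected were worn"):
    print(w)
# prints:
# pants as you expected were
# as you expected were worn
```

Neither of those two five-word sequences lines up with anything common in the training data, which is why the translation flails.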

Entering Latin into Google Translate (each later sentence changes a word or two from the first):

Abraham vero aliam duxit uxorem nomine Cetthuram.

Quintilianus vero aliam duxit uxorem nomine Cetthuram.

Abraham vero aliam duxit uxorem nomine Iuliam.

Abraham vero canem duxit uxorem nomine Fido.

English output:

And Abraham took another wife, and her name was Keturah.

Quintilian, now the wife of another wife, and her name was Keturah.

And Abraham took another wife, and the name of his wife, a daughter named Julia.

And Abraham took a wife, and brought him to a dog by the name of Fido.

The Vulgate and the ASV translation (or similar) would appear to be among Google Translate's source texts. Notice what happens when the input is off by as little as one word.


The above explains just enough so that a layperson can understand what Google Translate is good at, what it's bad at, and why—and so they won't be misled by the results of experimenting with different grammatical structures. If you're interested in more rigorous and thorough information about the full complexities of this approach, google for "statistical machine translation". Some further info is here, including Google's rollout, now in progress, of an entirely new translation algorithm (which hasn't reached Latin yet).

Ben Kovitz
  • Thank you! This was insightful, and the translations were hilarious. This really helps me understand how and why Google Translate fails, and that's what I was after. – Joonas Ilmavirta May 11 '17 at 05:41
  • Or, as Google Translate helps me put it: Gratias tibi! Hoc est prudentissimus, qua translationes sint praeditae sunt, et hilares. Realiter intelligere quomodo et cur hoc adjuvat me: Google Translate Caelata lex, ut 'quid esset post. (Quintilianus exponentia es certus vos volo ut Sic latine?) – Joonas Ilmavirta May 11 '17 at 05:41
  • I joined just to upvote this. I think such an explanation would have its place on non-specific sites too, like linguistics.se or even cs.se – vsz May 11 '17 at 06:09
  • This is the single best explanation of why and how Google Translate does what it does. The example sentences show exactly where the pitfalls are and how it's possible that's where it's going wrong. Great answer! – Mast May 11 '17 at 06:20
  • I think this description is out of date. In November 2016 Google changed how their algorithm works to use neural nets rather than statistical methods. It is now apparently significantly more accurate. https://blog.google/products/translate/found-translation-more-accurate-fluent-sentences-google-translate/ – Peter Collingridge May 11 '17 at 11:13
  • @PeterCollingridge Thanks—I was not aware of that. From that blog post, a blog post from March, and the look of the translations, it appears that Latin is still on the old system. We'll see what happens when they switch it to neural machine translation. – Ben Kovitz May 11 '17 at 11:44
  • For some languages Google uses a machine learning approach: https://www.cnet.com/news/google-translate-uses-machine-learning-for-its-cool-new-trick/ (also there exists an official blog post. List of languages outdated) – Florian Reisinger May 11 '17 at 12:36
  • @Peter Even before that date it’s unfortunately not an accurate description of how Google Translate worked. HMMs are powerful but — on their own — completely inadequate for translation. It’s true that Google used a statistical model of language, but it’s rather more complex. Why is this relevant? Because it gives a wrong impression of what Google Translate would be in principle capable of. – Konrad Rudolph May 11 '17 at 15:55
  • Neural nets are also "statistical". The difference is in the mechanics of calculation and in how the statistics is collected. As for translation quality, the new algorithm produces more grammatical output but seems prone to dropping words or whole clauses, not infrequently completely changing the meaning (and since the output is grammatical the poor user won't notice unless the user knows both languages). Some examples from Google: https://drive.google.com/file/d/0B4-Ig7UAZe3BSUYweVo3eVhNY3c/view – Anton Tykhyy May 11 '17 at 16:02
  • @PeterCollingridge A more up-to-date description of Google's translation algorithm would make a nice answer. It would be highly relevant and interesting, so I would gladly see it here even if it does not precisely answer my question as stated. Do you want to write up a description? – Joonas Ilmavirta May 11 '17 at 17:47
  • In fact, if anyone at all has recent and accurate information about how Google Translate works, I invite you to write it up as an answer. – Joonas Ilmavirta May 11 '17 at 17:49
  • @PeterCollingridge From that article I get the impression that the new method is not implemented for Latin yet. If you can find the current status of Latin translation methodology somewhere, that would make a nice answer. – Joonas Ilmavirta May 11 '17 at 18:26
  • This is a great answer. I joined just to vote it up. And now I am learning more about HMMs because it will be useful for one of my projects. – Omar and Lorraine May 12 '17 at 06:45
  • Even if it's not 100%, I still give you props for a "close enough" description. –  May 13 '17 at 05:54
  • Congratulations for the first answer on the site to score a hundred votes! – Joonas Ilmavirta Jan 31 '18 at 21:17
  • @AntonTykhyy I've noticed this in Chinese translations lately. I would paste a paragraph, and only the initial sentence would be translated. It's very strange! – cmw May 15 '21 at 02:32

To answer the question ("What is Latin Google Translate good for?") as stated:

Absolutely nothing.

At least reliably.

brianpck

While Google Translate may be useless to an advanced Latin speaker, a beginner like me finds it helpful for gaining quick insights.

I use it in combination with little tricks, such as splitting the sentence into separate clauses, adding punctuation, and converting the syntax to Subject–Verb–Object word order, which makes Latin more “palpable” to Anglo-Saxon Google Translate.

Regardless of initial accuracy (or lack thereof), this workflow is ultimately more efficient for me than the tedious alternative: using dictionaries and trying to match the meaning of each word manually from a multitude of confusing possibilities. They are confusing because ancient notions don't map onto modern notions one-to-one. The average Latin word appears to have a dozen equally plausible meanings that seem unrelated to each other to the modern European mind. Or at least that's how I see Latin as a beginner.

Example (borrowed from @BenKovitz above):

  1. Quintilianus vero aliam duxit uxorem nomine Cetthuram. (raw Latin)
  2. Vero; Quintilianus duxit aliam uxorem; nomine Cetthuram. (manually pre-processed)
  3. Vero, Quintilianus duxit aliam uxorem, nomine Cetthuram. (manually pre-processed)

Result:

  1. Quintilian, now the wife of another wife, and her name was Keturah. (nonsense)
  2. however; Quintilian [married another woman] for his wife; her name was Keturah. (wow!)
  3. However, Quintilian [married another] wife, whose name was Keturah. (wow!)

Square brackets indicate user-selected alternative.

Joonas Ilmavirta
user1580
  • The examples from Ben Kovitz aren't exactly relevant to this use case: they are contrived to show context sensitivity--in this case they are one-word variations from something we know is in the corpus. The question is: have you come across a semi-useful translation "in the wild"? I would argue that you'd get a better result by using the first dictionary entry for every word in a sentence and putting it into a blender. – brianpck May 12 '17 at 00:15
  • @brianpck Actually, “using the first dictionary entry for every word in a sentence and putting it into a blender” might be useful as well, if it had a convenient “just type your full text here” interface like Google Translate does. – ShreevatsaR Mar 01 '18 at 03:07

I think the examples given in the question and in the answers are grammatically non-trivial (two substantives, two verbs, etc.) and thus prone to translation mistakes, even for a human learning the language. That Google makes mistakes might not be so surprising, and we could even argue that the tests were too stringent.

Perhaps a much more revealing test is to try Google Translate with unequivocally simple examples. So, to see if Google understands us, I will try "I do not understand". From a grammatical point of view, it's trivial: one person/pronoun, one verb. No complex lexical twists possible. So, what do we get? Brace yourselves:

[screenshot: Google Translate output for "I do not understand"]

So far so good! But to be honest, many people just use "don't" rather than "do not". So...

[screenshot: Google Translate output for "I don't understand"]

What??? Did Google just change the person? How on earth could it mistake an "I" for a plural "you"?

Just for the sake of it, I also try it in Spanish:

[screenshot: the same test translated from Spanish]

Same basic mistake.

What about "I understand"?

[screenshot: Google Translate output for "I understand"]

Better. What about "I do understand"?

[screenshot: Google Translate output for "I do understand"]

Totally odd. If we reverse the direction of translation, we get:

[screenshot: the reverse translation back into English]

From which we get the amazing conclusion that:

I do understand = I do not understand

I think we can now safely conclude that Google Translate is terrible even at the simplest tasks.

luchonacho

The point of Google Translate (and this is not just for Latin) is that it only serves a purpose if you are able to judge the quality of its output, and understand that it's not a faithful account of the source text in any case.

This means, of course, that it must only be used to translate a text in a language you don't know into a language you do.

But if you're content with just getting the gist of what is being said, then it's great for an approximate understanding of what a web page is talking about in general terms. If it's a choice between understanding part of it and understanding nothing at all, then there may be value in using it.

Finally, whenever you do use it, you must always be wary of what you get presented with. Could this have possibly been misconstrued? Does the surrounding text generally agree with the sense of this sentence? Some rudimentary text criticism may give you some confidence of whether a given translation is sound or not.

Wtrmute
  • Excellent point about translating only to a language that you already know! It's easy to ridicule Google Translate because the results come out sounding so laughable, but it is useful as a quick way to get the gist of an article—when supplemented with some common sense on the part of the user. It's probably much less helpful for learning a language, which is what most of us do with Latin; but I'm going to keep this in mind the next time I simply want to understand what a medieval document says. – Ben Kovitz May 12 '17 at 04:38

I found a very specific use for it this summer. For a couple weeks I was at one of the living-Latin summer programs run by SALVI, and the rule there—as at other such programs I've attended—is that you're not allowed to speak any language but Latin except in emergencies. Since my husband, who speaks no Latin, gets lonely when it's just him and the dog and was annoyed that we'd be out of touch for so long, I wanted to be able to communicate with him a little bit, and Google Translate turned out to be just the thing. We sent each other very short, simple sentences ("Quōmodo tē habēs? Tē dēsīderō," etc.); the Latin he sent wasn't always correct but I always understood what he was trying to say, and I have to assume the same was true of the English that resulted from my Latin. It made him feel much better about my absence.

So the answer is: Google Translate is good for preserving marital bliss!

Joel Derfner