1

When I looked at words in Proto-Indo-European and how they evolved, I found that the proto-language doesn't have many words and that they appear to be somewhat shorter than those in English. I am wondering whether this is due to the limits of comparative recovery of PIE vocabulary, unsophisticated word choice, or a combination of both. The words also seem to be somewhat limited in variety.

Number File
  • 1
    Could you give some examples? That would help us better understand what you mean—there are some definite gaps in the PIE vocabulary, but they exist for different reasons, and I'm not sure which types you're asking about. – Draconis Nov 22 '19 at 23:15
  • Example: -o in Spanish (masc.) comes from -Hos which means nouns of authority. I'm not sure if this is clear. – Number File Nov 22 '19 at 23:25
  • 2
    Masculine -o in Spanish generally comes from *-o-m, the thematic animate accusative singular ending (> Old Latin -om > Latin -ŭm > Romance -o). But that has nothing to do with the vocabulary. It's just an ending. – Draconis Nov 22 '19 at 23:37
  • Most "native" Germanic words, or at least morphemes, are monosyllables anyway, so does English really have "longer" vocabulary than PIE? Well, sure, it has a ton of non-Germanic words in it, so the actual answer is probably yes, but had the Norman conquest not happened, would we still be discussing this in these terms? – LjL Nov 22 '19 at 23:45
  • @LjL that's a red herring, German has Zusammensetzung all the same. I still think it would be useful to adapt computer terminology and deem a word anything that fits into a fixed size of memory, with word size depending on the processor (there are Very Long Instruction Word architectures, VLIW, not very successful, and Complex Instruction Sets, and, I guess, variable length architectures; but most of them run arrays of Reduced Instruction Sets under the hood; RAM Bus is still fixed size, anyhow, memory is chunked in pages, bars, disks and we are again writing on tablets) – vectory Nov 24 '19 at 17:15

3 Answers

9

The vocabulary of PIE must have been larger than meets the eye.

By rough estimates, a language could meet its semantic needs with as few as 3,000 independent roots and their derivatives. Most languages have twice that many roots, but many roots are borrowed and many are rare. (For example, Hans Wehr’s Dictionary of Modern Written Arabic lists about 3,000 roots, but searches of old lexicons find over 6,000. Chinese students must learn 5,000 characters to read proficiently. Panini’s Dhatupatha lists 2,000 verb roots alone in Sanskrit, but only half are actually found in texts.)

Pokorny catalogued about 2,000 reconstructed roots, but many items in his lexicon seem shaky, either for lack of solid attestation in multiple branches of the IE family, or for too-loose semantic connection with alleged cognates, and some seeming cognates may actually be loanwords. By conservative estimate, only one-third of Pokorny’s material is beyond question.

We might try to work backwards from 750 well-attested roots. Glottochronology has estimated the rate of vocabulary replacement at 14-19% per thousand years. Using the latter figure, we would expect about half the PIE roots to be preserved in any given major branch after 3,500 years. With five major branches (Indo-Iranian, Greek, Italo-Celtic, Balto-Slavic, and Germanic), the probability that a given root would have been lost in all of them is very low. But any figure between 750 and 1,500 roots is hardly enough.
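
As a rough check on the arithmetic in this paragraph, here is a minimal Python sketch. It assumes a constant replacement rate per millennium and treats the five branches as losing roots independently of one another; both are simplifying assumptions, and the function names are made up for illustration.

```python
# Illustrative sketch only. The 14% and 19% loss rates and the 3,500-year span
# are the figures quoted in the text; constant rates and independence between
# branches are assumptions made here for the sake of the calculation.

def retained_fraction(loss_per_millennium: float, years: float) -> float:
    """Fraction of roots expected to survive in a single branch after `years`."""
    return (1.0 - loss_per_millennium) ** (years / 1000.0)


def lost_everywhere(loss_per_millennium: float, years: float, branches: int) -> float:
    """Probability that a root is lost in every branch (assumes independence)."""
    lost_in_one = 1.0 - retained_fraction(loss_per_millennium, years)
    return lost_in_one ** branches


for loss in (0.14, 0.19):
    kept = retained_fraction(loss, 3500)
    gone = lost_everywhere(loss, 3500, 5)
    print(f"loss {loss:.0%}/millennium: kept per branch {kept:.2f}, "
          f"lost in all five {gone:.3f}")
```

At the 19% rate this gives a per-branch retention of roughly 48%, and a chance of roughly 4% that a root disappears from all five branches, which matches the "half" and "very low" in the paragraph above.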

Where might glottochronology have gone wrong? Swadesh’s widely cited estimate is based on a list of only 200 widely distributed words, but rare roots may have higher rates of replacement.

And where might lexicography have gone wrong? An unknown number of roots may actually have survived in one branch or another but still be lost to history for lack of identifiable cognates in other branches.

Bert Barrois
5

It is pretty clear that the size of the Proto-Indogermanic vocabulary is limited by the method of reconstruction. Since the most frequent words of a language tend to be short and frequent words have a better chance of preservation, there is a bias towards short words in reconstruction.

Another factor is that Proto-Indogermanic is known to have had productive derivational morphology, but dictionaries concentrate on roots and derivational morphemes and don't list the derived words. Reducing the language to roots and morphemes may give a wrong impression of what the language actually looked like. For a better illustration, look at some made-up Proto-Indogermanic texts like Schleicher's fable.

Sir Cornflakes
  • 1
    Please keep the word Indogermanic, it is my preferred term over Indoeuropean. The main reason for my preference is that there are non-Indogermanic Ancient European languages like Iberian (the non-Celtic one) and Basque. – Sir Cornflakes Nov 23 '19 at 16:15
  • 2
    The word "Indo-Germanic" belongs to the vocabulary of Nazis. It should not be allowed on the site. –  Nov 23 '19 at 16:21
  • 8
    @ArnaudFournet please, that's nonsense. I don't really like the term and prefer "Indo-European" (if that's flawed because there are European languages that aren't Indo-European, then isn't Indo-Germanic flawed by not including languages that do belong, considering that Germanic is its own family? Honestly, I've refrained from upvoting some of jknappen's posts for this very reason), but the term has been used countless times in legitimate linguistics works, and the fact many of those works were written before the end of WW2 really doesn't imply that they are all somehow "Nazi". – LjL Nov 23 '19 at 16:31
  • 4
    Additionally, I believe the rules request that we only edit posts without changing the author's likely intentions, so I wouldn't try to wrestle this over edits. – LjL Nov 23 '19 at 16:35
  • 2
    @LjL You may want to take a look at Norman 1929, for a historical background, https://www.jstor.org/stable/3715962 In professional linguistic research the term "Indo-Germanic" is no longer used (with the exception of some research written in German, and even there things are changing). – Alex B. Nov 23 '19 at 17:17
  • 3
    Basque and Iberian are language isolates, they're not IE languages. I personally do not edit anything written by other users here. I'm just trying to understand why you insist on using "Indogermanic" in English. – Alex B. Nov 23 '19 at 19:26
  • @AlexB. But Basque and Iberian are undoubtedly European languages and this is my point. – Sir Cornflakes Nov 23 '19 at 19:30
  • 4
    IE is a technical term with a very specific meaning and use. Unfortunately, Indogermanic has negative connotations in English and that is why it was abandoned (like "Gothick" or "Aryan" to describe the IE language family). Additionally, IE is more inclusive than IG. Since you read in German, take a look at Wachter 1997 https://www.academia.edu/40829765/Indogermanisch_oder_Indoeurop%C3%A4isch – Alex B. Nov 23 '19 at 19:35
  • 3
    @AlexB. Contrary to your argument, Wachter pleads for tolerance of the term indogermanisch. This discussion is now going on a little too long here; we should cast it into some answerable questions, I think. – Sir Cornflakes Nov 23 '19 at 20:20
  • 2
    Yes, I read Wachter, who wrote about the use of the word Indogermanisch in German, not in English. I also read it to the very last paragraph and I encourage you to read it again too. A lot has been written on this, e.g. Ruth Roemer, Sprachwissenschaft und Rassenideologie in Deutschland (Wilhelm Fink, 1985). I agree, I don't see much point in discussing this further. – Alex B. Nov 23 '19 at 22:06
  • 1
    @AlexB. a no-registration-necessary link https://hvs.philhist.unibas.ch/fileadmin/user_upload/hvs/Archiv_Lehre/Idg-oder-ie.pdf – Vladimir F Героям слава Nov 23 '19 at 23:42
  • Now on meta: https://linguistics.meta.stackexchange.com/questions/1876/banning-the-nazi-ugly-word-indo-germanic – Sir Cornflakes Nov 24 '19 at 10:24
  • You are implying the PIE reconstruction index is small. However, with almost 2000 stems and a few synthesized words, it's almost too big. There was the related question whether ancient languages had fewer words, and you answered well, yes we have quite a bunch now, more things more names, which is not enough to estimate how many roots one could reasonably expect. – vectory Nov 24 '19 at 18:58
  • @jk-ReinstateMonica What about ‘Indo-’ then? There are also plenty of non-IE Indian languages. ‘Indo-European’ may be flawed (arguably ‘Tarim-European’ would make more sense), but it is more logical and less flawed than ‘Indo-Germanic’, which is just completely illogical and makes no sense at all. The fact that there are non-IE languages in Europe is irrelevant to the fact that Europe (including Iceland) is the westernmost boundary of the family in pre-Modern times. – Janus Bahs Jacquet Aug 05 '20 at 12:58
  • At least, Indo-Iranian is a well established branch of the language family. There is no "European" branch, but I have seen trees drawn with a binary split between Indo-Iranian and European in popular language tree illustrations, see the picture linked in this answer for an example https://linguistics.stackexchange.com/a/34339/9781 – Sir Cornflakes Aug 05 '20 at 14:49
1

My impression is that the vocabulary of PIE was quite large. We already have enough reconstructed roots to conclude that it had no fewer roots than Latin or Greek.

In addition it had a lot of suffixes that could modify the meaning as well as compound words, similar to German.

And the words definitely were not shorter than in English, they were much longer (you should not confuse roots with words).

Anixx