
Preemptive note: This question is about sound-based writing systems, excluding logographic systems like Chinese. Transitional systems like Egyptian hieroglyphs, Maya script or Man’yōgana are also excluded, as are heterograms, etc. I’m only asking about glyphs that are purely phonetic in value/intention: essentially alphabets, syllabaries, abugidas and abjads (for abjads, assume optional vowel marking to always be present).


 

The individual glyphs in many writing systems derive from originally logographic depictions but are now used exclusively for their phonetic (or phonemic) value, determined by the language the system is being used to express. In some writing systems, like the Latin alphabet, each glyph has been simplified enormously, to the point that few consist of more than two or three strokes when written. In others, the simplification has been less dramatic, and though rarely recognisable as any erstwhile logogram, many glyphs remain more complex than just a couple of strokes.1

On the other hand, in alphabetic writing, each glyph roughly (very roughly) represents a single phonetic entity, whereas in abugidas, abjads and various other systems, each glyph represents multiple phonetic entities. As a result, if you compare an alphabetic writing system with a high average number of strokes per glyph to an abugida with a low average number of strokes per glyph, the same phonetic sequence – assuming it’s expressible in both – would require far fewer strokes in the abugida than in the alphabet.

As an example, this is the random sequence ⟨kanita⟩ (or close equivalent) expressed in a selection of scripts:

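| Script (Language)     | String       | Strokes |
|-----------------------|--------------|---------|
| Zhuyin/Bopomofo       | ㄎㄚㄋㄧㄊㄚ | 19      |
| Latin (English)       | kanita       | 13      |
| Ge’ez (Amharic)       | ካኒታ          | 11      |
| Devanagari (Sanskrit) | कनित         | 11      |
| Cherokee Syllabary    | ᎧᏂᏔ          | 8       |
| Inuktitut Syllabics   | ᑲᓂᑕ          | 3       |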

Even from this very cursory sample (and even with the ambiguity of deciding exactly when something is one or two strokes), it’s clear that there’s considerable variation in how much graphite a pen will have to deposit on the paper in order to write the same phonetic sequence.
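To make the comparison concrete, the stroke counts in the table above can be turned into rough strokes-per-phoneme figures. This is only a sketch: the counts are copied from the table, while the phoneme count of 6 (k-a-n-i-t-a) and the names `SAMPLES`, `PHONEMES` and `strokes_per_phoneme` are assumptions introduced for illustration.

```python
# Stroke counts copied from the table above.
# ASSUMPTION: the sequence <kanita> counts as six phonemes in every script.
SAMPLES = {
    "Zhuyin/Bopomofo": 19,
    "Latin (English)": 13,
    "Ge'ez (Amharic)": 11,
    "Devanagari (Sanskrit)": 11,
    "Cherokee Syllabary": 8,
    "Inuktitut Syllabics": 3,
}
PHONEMES = 6


def strokes_per_phoneme(strokes: int, phonemes: int = PHONEMES) -> float:
    """Return the average number of strokes needed per phoneme."""
    return strokes / phonemes


ratios = {name: strokes_per_phoneme(count) for name, count in SAMPLES.items()}

# Print the scripts from most to least economical for this one sample.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:<24}{ratio:.2f}")
```

A single six-phoneme string says little about a whole writing system, so these figures only illustrate the kind of ratio being asked about.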

Some writing systems have inordinately complex glyphs to write very simple sounds, like simple vowels in some Brahmic scripts: Malayalam ആ ā (5 strokes), Javanese ꦈꦴ ū (6), Tibetan ཨོ o (7). But this may not necessarily give the system as a whole a high stroke-to-sound ratio – for instance, those three examples all have identical or even lower stroke counts if you add a preceding consonant: കാ kā (4), ꦏꦹ kū (6), ཀོ ko (6).

Conversely, some systems may have only quite simple glyphs, but encode very little information in each glyph, requiring more glyphs in total, like Zhuyin/Bopomofo marking each consonant, vowel and tone as a separate glyph – or perhaps a script that marks things like place and manner of articulation, vowel height, phonation, etc., with separate glyphs or markers (if such a script even exists).2

This sort of general state of affairs made me wonder – which writing systems have the highest and lowest average stroke counts overall when used to write representatively in a language it’s regularly used to write? Or more broadly, which language/writing system pairs are most/least economical in writing strings of comparable phonetic length when it comes to how far the pen will have to travel across the paper?

I realise, of course, that precise numbers are likely impossible here, but rough approximations will do fine as well. If they can be backed up by some sort of data, all the better, though I don’t expect there really is any hard data available.

 


1 ‘Strokes’ are well-defined in CJK writing systems, but not elsewhere. I don’t want to get too bogged down in brushstroke technicalities here, so I’ll use a simplistic definition: a stroke is any continuous, non-intersecting pen movement that does not include sharp corners; so the letter S is one stroke, while ɣ is two (intersection) and Z is three (sharp corners). It can still often be a coin toss whether something should count as one or two strokes, of course.
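
The simplistic definition above can be sketched as a small counting rule. This is a hypothetical model, not an established method: it assumes a glyph has already been described by hand as a number of pen-down movements, sharp corners and self-intersections, and that each corner or intersection splits a movement into exactly one extra stroke. The names `GlyphTrace` and `count_strokes` are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class GlyphTrace:
    """Hand-made description of how a glyph is drawn (hypothetical model)."""
    pen_downs: int           # separate continuous pen movements
    sharp_corners: int       # sharp corners within those movements
    self_intersections: int  # points where a movement crosses itself


def count_strokes(glyph: GlyphTrace) -> int:
    # ASSUMPTION: every sharp corner and every self-intersection ends one
    # stroke and begins another, so each adds exactly one to the total.
    return glyph.pen_downs + glyph.sharp_corners + glyph.self_intersections


# The three examples from the footnote.
letter_s = GlyphTrace(pen_downs=1, sharp_corners=0, self_intersections=0)
letter_gamma = GlyphTrace(pen_downs=1, sharp_corners=0, self_intersections=1)
letter_z = GlyphTrace(pen_downs=1, sharp_corners=2, self_intersections=0)

for label, glyph in [("S", letter_s), ("ɣ", letter_gamma), ("Z", letter_z)]:
    print(label, count_strokes(glyph))
```

The coin-toss cases mentioned in the footnote remain: deciding whether a bend is a sharp corner still has to be done by eye before any counting happens.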

2 As John Lawler points out in a comment, there are also many languages that use sound-based scripts, but have complex relationships between sound and script, such as English, French, Tibetan or Thai. In such cases, the shortest realistic form should be assumed. For example, the abstract sequence ⟨kanita⟩ used above would likely correspond to /ˈkanɪtə/ phonemically in English, which could ostensibly be written in various more or less abstruse ways, such as khannyttah, but the shortest way that would be understandable would be something like canita, which should then be preferred.

Janus Bahs Jacquet
  • I'd be wary about including systems like bopomofo, cuneiform, kana, etc which can be used purely phonetically but usually aren't—because that fact changes the evolutionary pressures on them. – Draconis Jul 23 '20 at 18:56
  • @Draconis I’m only including them in phonetic uses (where are Bopomofo and kana used non-phonetically?). Cuneiform syllabics would be fine, but obviously not Sumerograms or other heterograms. – Janus Bahs Jacquet Jul 23 '20 at 19:00
  • I mean that, in general usage, most people don't write in purely phonetic bopomofo/cuneiform/kana—logograms are prevalent, which means there's less evolutionary pressure to make the phonetic parts efficient. I'm guessing that's likely to skew the data. – Draconis Jul 23 '20 at 19:04
  • @Draconis Oh, I see what you mean now. Yes, that’s true – they aren’t really used as the normal writing system for any language, so it’ll probably be hard to find anything useful on them. Then again, since they’re not usually used as a primary script on their own, they can also just be ignored for present purposes. – Janus Bahs Jacquet Jul 23 '20 at 19:08
  • You're going to have a hard time dealing with English, where there is simply no consistent match between spelling (to say nothing of stroke number) and pronunciation. – jlawler Feb 12 '23 at 18:00
  • @jlawler True, if we include the vagaries of spelling in individual languages, then things will get out of hand pretty quickly (Irish would probably be even worse, with something like [ˈɰiːʊ] being equally representable by ghuíomh and dh’fhaoithigheadh). I don’t think I really considered that when I first asked the question. It probably makes most sense in general to limit the question to uses where each glyph has a direct, regular correspondence to a sound (or sounds) in the spoken form. – Janus Bahs Jacquet Feb 12 '23 at 18:29
  • @JanusBahsJacquet Yeah, like Devanagari (which I have heard has the best data/ink ratio of any phonetically-based system) and Hangul (which has other delights). – jlawler Feb 12 '23 at 19:52
  • "how much graphite a pen will have to deposit on the paper in order to write the same phonetic sequence"?? Do you mean translation of the name of a language? Aren't you comparing apples and oranges? Why not just count the number of strokes needed in a single language for each of the letters and add those up? Then, just compare them. It also depends on how you define stroke... – Lambie Feb 12 '23 at 21:14
  • @Lambie Translation of the name of a language? What? Did you even read the question..? – Janus Bahs Jacquet Feb 12 '23 at 21:25
  • @JanusBahsJacquet I have read it several times and do not GET IT. Stroke to sound? Last time I looked in any language, there are alphabets of some kind. Then, there are words. If anything, you could measure how long it takes to write the strokes involved for individual letters or/words, right? Obviously, the language with the most strokes in all its letters taken together would take the most time to write. Also, is it stroke to sound OR sound to stroke? Not clear. Also, in English, printed letter and cursive would be different. And if they have the "same phonetic sequence"? – Lambie Feb 12 '23 at 21:43
  • It might make sense to count how long it takes to write letters but two different languages?? What is the linguistic point of that? ta in English and the same sound in Japanese? I mean.... – Lambie Feb 12 '23 at 21:46
  • "which writing systems have the highest and lowest average stroke counts overall when used to write representatively in a language it’s regularly used to write?" The ones with the highest number and lowest number of strokes per letter for all the letters in that language. – Lambie Feb 12 '23 at 21:48
  • Some linguistic Tufte ought to write a book about The Visual Display of Phonological Information. – jlawler Feb 13 '23 at 16:34

2 Answers


You will hardly get a better ratio than one stroke per two phonemes, since you exclude logograms. [s (ʃ?)] is 7 to 1, as is that ཨུ[i], but [t] is one to one, so Tibetan probably wins. Also, Bopomofo uses fewer strokes than it seems: https://simple.wikipedia.org/wiki/Zhuyin shows them in different colours.

aeiou.nu
  • Those are strokes as traditionally counted in Bopomofo (seemingly the same principles as apply in CJK stroke counting in general). That only really works in a CJK context, though, which is why I use a different definition in the question, one that can apply to any script. – Janus Bahs Jacquet Mar 10 '24 at 22:40

For a pathological case, Arabic rasm has far fewer than one stroke per phoneme, since most letters are designed to be written without lifting the pen. Entire words can thus be (and are) written in a single stroke.

A few letters do require ending one stroke and beginning another (as do word breaks), which means the strokes-to-phonemes ratio doesn't actually approach zero as texts get longer. But there may be a cursive script out there somewhere that never requires starting a new stroke, meaning that you can get the strokes-to-phonemes ratio arbitrarily small by looking at a sufficiently large text.
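
The limiting behaviour described here can be sketched numerically. This is a toy model under stated assumptions: a stroke is counted per pen lift (the sense used in this answer, not the question's footnote definition), a text needs one initial stroke plus one new stroke for every forced break, and `break_rate` is a made-up parameter for the fraction of phonemes followed by such a break.

```python
def stroke_ratio(n_phonemes: int, break_rate: float) -> float:
    """Strokes per phoneme for a cursive text of the given length.

    ASSUMPTION: one initial stroke, plus one new stroke for each forced
    break; break_rate is a hypothetical fraction, not measured data.
    """
    strokes = 1 + round(break_rate * n_phonemes)
    return strokes / n_phonemes


# With forced breaks, the ratio levels off near break_rate as texts grow.
for length in (8, 1000, 100000):
    print(length, stroke_ratio(length, 0.25))

# With no forced breaks at all, the ratio keeps shrinking towards zero.
for length in (8, 1000, 100000):
    print(length, stroke_ratio(length, 0.0))
```

Under this model the ratio is bounded below by the break rate, which is why only a script with no forced breaks at all could push it arbitrarily close to zero.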

Draconis
  • Note footnote 1: for the context of this question, a stroke is “any continuous, non-intersecting pen movement that does not include sharp corners”. Several individual Arabic glyphs consist of multiple strokes by this definition. – Janus Bahs Jacquet Mar 10 '24 at 22:30
  • @JanusBahsJacquet Ah, I overlooked the "non-intersecting". Oops. – Draconis Mar 10 '24 at 23:53