5

I found that many (most?) Chinese characters look like they're composed of more primitive ones, like:

做 = 亻十口夂

你 = 亻尔

吃 = 卩乙

and so on.

Is this a real phenomenon or just my European imagination?

If it is not real then why? If it is real, then what is the name of this phenomenon? I know there are "radicals", but I'm not sure all components I see belong to the set of radicals.

Dims
  • A comprehensive title which deals with this is Qiu Xigui’s Chinese Writing. It takes a lengthy book to explain. – dROOOze Jun 18 '18 at 17:22
  • @droooze just "yes" or "no" and if "no", then "why" – Dims Jun 18 '18 at 17:24
  • It’s not a generic yes or no, it depends on the character you’re looking at. So it’s sometimes a yes and sometimes a no and sometimes an illusion due to millennia of script evolution. – dROOOze Jun 18 '18 at 17:27
  • https://en.wikipedia.org/wiki/Chinese_character_classification – Stumpy Joe Pete Jun 18 '18 at 17:35
  • Moreover, they all do, and here's a huge repo of decompositions https://github.com/amake/cjk-decomp – Vitaly Osipov Jun 19 '18 at 04:37
  • @drooze What do you want to say? The OP asked a clear question and came up with some clear examples that look simple and correct. Are these examples valid or not? Does 做 decompose into 亻十口夂 or not? – John Frazer Jun 19 '18 at 21:33

4 Answers

4

Recent studies call them components (部件). There are 1,300+ basic components:

http://chardb.iis.sinica.edu.tw/system_intro.jsp

水巷孑蠻
4

Chinese characters were originally unique pictographic symbols depicting more or less the concept of the words they stood for. Over time as the written language evolved, many of these symbols were simplified and as they were simplified, different styles of writing and shorthand for parts of symbols became standardised.

For example, while 有 is written as a combination of 𠂇 + 月, the actual origin of the symbol is the combination of 𠂇 + ⺼.

As such, the components that make up a modern character may be completely unrelated to the original characters combined to create it.

E.g. while 犬 looks like 大 plus one stroke, they actually derive from independent symbols which (when simplified) have a similar appearance:

[Image table: original character vs. modern character]

This is somewhat analogous to how e.g. woodchuck in English looks like it comes from wood + chuck, but is actually from the Algonquian word wuchak. To make a word easier to write, it is written using a common set of standardised subcomponents* shared across most words.


Notes

* Of course, as many subcomponents are themselves composed of other components, one can go further:

Decomposed into the different... Example
strokes 有 ← (一 + 丿) + (一 + 一 + 丨 + ) etc

But unless you are learning to write the characters stroke by stroke, this is about as useful as decomposing English words into the lines, circles, half circles, etc. that make up each letter.

  • Characters are not made up of radicals, in any sense of the word. You cannot decompose even the most basic of characters, like 可 or 年, into radicals. – dROOOze Jan 21 '21 at 11:10
  • @dROOOze is 可 not made of the following radicals, though? 口⼀⼅ – 大胳膊雅各布 Jan 21 '21 at 13:12
  • I mean, if you want to look at it that way, then 年 is made up of the radicals 丿一一丨一丨. “Radical” decomposition or “orthographic decomposition” is overly complicated and offers very little. – dROOOze Jan 21 '21 at 17:20
3

It is true. Here is an example: 泪

This character is composed of 氵 and 目, which represent water and eye respectively, and the character means tear!

In more traditional writing, this becomes more obvious. In fact, if you look at ancient writing, you will find that Chinese characters originated as actual drawings of the things around us.

zyy
2

Naïve visual decomposition does not work a lot of the time, so I’ll just demonstrate with some examples.

Firstly, your own:

  • 做 is composed of 作 and 攵. 乍 was graphically corrupted into 古, but that’s not its original form. 作 is in turn decomposable into 亻 and 乍.

  • 你 is decomposable into 亻 and 尔.

  • 吃 is decomposable into 口 and 乞. 乞 is not decomposable; the way you did it is equivalent to “decomposing” the Roman letter “d” into “c” and “l”.

Next, some non-decomposable characters that look like they’re decomposable:

  • 龍, an entire picture of a dragon

  • 能, an entire picture of a bear

  • 魚, an entire picture of a fish

  • 它, an entire picture of a snake

  • 气, a picture of streaks of clouds in the sky

Finally, some decomposable characters which look like they’re non-decomposable, or have unexpected decompositions:

  • 之, decomposable into 止 and 一

  • 戍, decomposable into 人 and 戈

  • 喪, decomposable into 桑, 2x口, and 亾
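
The decompositions listed in this answer can be treated as a small lookup table and expanded recursively. The following is only a minimal sketch: the `DECOMP` table and the `decompose` function are hypothetical names introduced here, and the table contains nothing beyond the examples given above. Anything missing from the table (including the pictographic characters such as 龍) is treated as non-decomposable.

```python
# Hypothetical table, filled only with the decompositions stated in this answer.
DECOMP = {
    "做": ["作", "攵"],
    "作": ["亻", "乍"],
    "你": ["亻", "尔"],
    "吃": ["口", "乞"],
    "之": ["止", "一"],
    "戍": ["人", "戈"],
    "喪": ["桑", "口", "口", "亾"],
}


def decompose(char):
    """Recursively expand a character into components that have no table entry."""
    parts = DECOMP.get(char)
    if parts is None:
        # No entry: treat the character as a non-decomposable unit.
        return [char]
    leaves = []
    for part in parts:
        leaves.extend(decompose(part))
    return leaves


if __name__ == "__main__":
    for ch in "做你吃龍":
        print(ch, "=", " + ".join(decompose(ch)))
```

The stopping point is whatever the table says it is, which is the whole point: a table built on glyph origins stops at 乞 or 龍, while a purely graphical table could keep going down to strokes.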

dROOOze
  • Why do you say that 能 is not decomposable, when it obviously consists of ㄙ月匕匕, which are known and frequent parts? This is the question! How can "non-decomposability" be justified in such situations? Character genesis is not the same as their shape! Even if 作 historically transformed to 古 it doesn't make them the same character! I agree that decomposition of d into c and l is strange, but decomposition of i into stick and dot is not, and even decomposition of w into v v is not strange; it is even called "double v". – Dims Jun 19 '18 at 06:02
  • @Dims well, anyone's free to decompose characters in any way they like. Why stop at the decomposition that you suggested? We can even go down further and say that everything including ㄙ月匕古 is decomposable into one of 20 or so strokes. I don't see the usefulness of this kind of decomposition though, but you're free to use whatever you feel is most helpful for your purposes. – dROOOze Jun 19 '18 at 06:15
  • The question is not about my freedom, but about problem domain. Of course you can decompose into strokes or even pixels, but parts I mention are better because they are complex constructions repeating in different characters many times. – Dims Jun 19 '18 at 06:21
  • @Dims If you've already defined the problem domain (which seems like graphical decomposition) then there's no further question to be asked. You can write a simple algorithm to just detect graphically connecting regions of a character and process thousands of character images to retrieve a rapid decomposition that way. To understand how the characters work as part of Chinese language usage, however, does not overlap with such a decomposition method, and that would be my preferred problem domain. – dROOOze Jun 19 '18 at 06:26
  • @Dims I think there was someone on here who sells books that decompose characters in the way you're suggesting (see the OP of this "question"). – dROOOze Jun 19 '18 at 06:31
  • @drooze you're answering a different question: "is the geometric decomposition of a Chinese character always historically true to the earliest known form of that character?"; the answer to that question is of course no. But on the surface you totally can decompose 能 into 厶⺝匕匕, even though it is derived from a single pictorial sign. Yes, it is not objectively clear where exactly to stop; that's a dilemma that afflicts all sciences. Water is composed of H2O; ultimately that's protons, neutrons, electrons, which are in turn quarks, gluons &c. It still makes sense to think of water as H and O. – John Frazer Jun 19 '18 at 21:50
  • @JohnFrazer I’m afraid I don’t know how to reply. I’ve already taken the position that (1) anyone is free to decompose characters in any way they like and (2) decomposing it purely based on graphics objectively does not provide insight into the character’s function in language (unlike your H2O example). Nobody’s an ultimate arbitrator on what’s valid or not. I believe that a problem domain (graphics) that has very little to do with the language itself is genuinely unhelpful for understanding the language. This decomp method can be helpful for rare character lookup and font generation. – dROOOze Jun 20 '18 at 03:06
  • @JohnFrazer also, unlike the graphical decomposition method, which you aptly described with "it is not objectively clear where exactly to stop", a decomposition based on glyph origins does have an objectively clear stopping point; you stop when further breakdown no longer provides insight into the character’s semantic or phonetic function. Such a breakdown also matches exactly how the character is thought to have been constructed when it was first created. – dROOOze Jun 20 '18 at 03:19
  • I'm not saying your method is wrong, I'm saying the graphical analysis is one thing, historical reconstruction another. The OP is, IMHO, concerned with the former, not the latter; the link between the two is, as you show, a varied and not always obvious one. I'd like to add that the historical reconstruction process is much less objective and much more contested than you make it look per your latest comment. Let's face it, when a new OBI inscription is found and another treatise written, all our 'facts' may become moot (ex. 東 is not sun+tree, but a picture of a bundle, according to some). – John Frazer Jun 20 '18 at 09:32
  • @JohnFrazer If the OP is concerned with the former, then there is no real question here to do with the Chinese language. The question might as well be asked at an image processing forum. As to 東, nobody says it’s a sun and a tree anymore, because there’s no archaeological or linguistic evidence to suggest it. You seem to be under the impression that deciphering OBI is a bit random; I assure you that that’s not the case. – dROOOze Jun 20 '18 at 09:36
  • I think you're putting a bit too much faith in the capabilities of automated image processing here, and a bit too much faith in the definiteness of conclusions drawn in the latest and greatest papers on OBI. Inquiry into the nature and structure of Chinese writing & into practical Chinese lexicography is, as a field of study, at least 2000 years old, and we're still turning out new stuff each year. There's no evidence to strongly suggest we're done at this point. Also AFAIK no software exists that does sensible automated graphical analysis of CJK characters, that's purely hypothetical. – John Frazer Jun 20 '18 at 12:49
  • @JohnFrazer I'm not sure what you're trying to say here, most academic disciplines try to build/determine successively better models to determine reality; that does not mean that since we don't have the real answer, we substitute it with one that doesn't match the discipline at all with the only redeeming feature being that it's internally consistent. The core issue still remains: Graphical decomposition of characters has very little to do with the Chinese language. Splitting 能 into ㄙ月匕匕 is like splitting the word endearment into enc and learment - it's completely nonsensical. – dROOOze Jun 20 '18 at 13:03
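
For reference, the "simple algorithm to just detect graphically connecting regions" mentioned in the comments above could look roughly like the sketch below. It is only an illustration under loud assumptions: the glyphs are tiny hand-drawn bitmaps invented here (not rendered from any font), `connected_regions` is a hypothetical name, and pixels are considered connected only when they touch horizontally or vertically.

```python
from collections import deque


def connected_regions(bitmap):
    """Count 4-connected regions of '#' cells in a list of equal-length strings."""
    rows = len(bitmap)
    cols = len(bitmap[0]) if rows else 0
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            # Found an unvisited ink pixel: flood-fill its whole region.
            regions += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and bitmap[ny][nx] == "#" and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return regions


# Hand-drawn stand-ins (assumptions, not real glyph data).
TWO = [".###.",   # crude 二: two separate horizontal strokes
       ".....",
       "#####"]
TEN = ["..#..",   # crude 十: two crossing strokes forming one region
       "#####",
       "..#.."]

if __name__ == "__main__":
    print(connected_regions(TWO), connected_regions(TEN))
```

Even on these toy shapes the limits are visible: touching strokes merge into one region and detached strokes split apart, so connected regions do not line up with components in either the graphical or the historical sense.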