GB2312 is a strict subset of GBK, which in turn is a strict subset of GB18030.
Windows CP936 originally covered only GB2312, but was expanded to cover most of GBK starting with Windows 95.
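The nesting can be checked with Python's built-in `gb2312`, `gbk`, and `gb18030` codecs. This is only a rough sketch: the helper `can_encode` and the three sample characters are my own choices for illustration, and Python's codecs may differ in small details from other implementations.

```python
def can_encode(text: str, codec: str) -> bool:
    """Return True if every character in `text` is representable in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Sample characters chosen for illustration (an assumption, not from any spec).
samples = {
    "中": "simplified character, present in GB2312",
    "國": "traditional character, absent from GB2312 but present in GBK",
    "😀": "emoji outside the BMP, only representable in GB18030",
}

for ch, note in samples.items():
    support = {c: can_encode(ch, c) for c in ("gb2312", "gbk", "gb18030")}
    print(f"U+{ord(ch):04X} {support}  # {note}")
```

A character that exists in the smaller set keeps the same bytes in the larger ones, which is what makes the subset relation useful in practice.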
GB2312 has 6763 Chinese characters.
GBK has 21003 Chinese characters. ref
GB18030 (originally) has 27484 Chinese characters. (same ref as above)
Also, GB18030 contains a mapping into Unicode, which means all Unicode characters can also be represented in GB18030.
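As a small illustration of that mapping, the sketch below uses Python's `gb18030` codec to encode a few characters and decode them again. GB18030 is a variable-width encoding that uses one, two, or four bytes per character. The helper names `gb18030_length` and `round_trips` are hypothetical, and the sample string is just a handful of code points, not an exhaustive check.

```python
def gb18030_length(ch: str) -> int:
    """Number of bytes GB18030 uses for a single character."""
    return len(ch.encode("gb18030"))


def round_trips(text: str) -> bool:
    """True if encoding to GB18030 and decoding back returns the same text."""
    return text.encode("gb18030").decode("gb18030") == text


# ASCII letter, a GB2312 character, and an emoji from a supplementary plane.
for ch in ("A", "中", "😀"):
    print(f"U+{ord(ch):04X} -> {gb18030_length(ch)} byte(s)")

# Sample includes U+10FFFF, the last Unicode code point.
print(round_trips("A中ß😀" + chr(0x10FFFF)))
```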
Unicode 14.0 has 92865 Chinese characters. Unicode calls them CJK Unified Ideographs because they are used not only in Chinese, but also in Japanese, Korean, and other languages that use (or used to use) Chinese characters. Some of them are never used in Chinese.
However, the characters in GB2312 are the most frequently used ones. They already cover most characters in everyday use, and the use of other characters is rare.
[ 0x0 .. 0x10ffff ]) to sequences of one to four bytes. Strictly speaking, 'Unicode' is not a character set either; 'Unicode v14' is one, 'Unicode v15' is another. – John Frazer Apr 22 '22 at 04:48