Since a Unicode character is written U+XXXX (in hex), it looks like it only needs two bytes. So why do we have various encoding schemes like UTF-8, which takes from one to four bytes? Can't we just map each Unicode character to two bytes of binary data? Why would we ever need four bytes to encode one?
The entire Unicode code point range no longer fits in two bytes. That's what many did early on (UCS-2), and it eventually became obsolete. – Alejandro May 27 '20 at 15:03
Unicode is a 21-bit (actually slightly less) character set, so it definitely can't fit in 2 bytes. For example, 😲 is U+1F632. – phuclv May 27 '20 at 15:11
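A small Python sketch (an addition, not part of the original page) illustrating both comments: code points above U+FFFF cannot be represented in two bytes, while UTF-8 spends only as many bytes as each code point actually needs.

```python
# Sketch: check whether each code point fits in two bytes, and show how
# many bytes its UTF-8 encoding takes (1-4 depending on the code point).
for ch in ["A", "é", "€", "\U0001F632"]:  # U+0041, U+00E9, U+20AC, U+1F632
    cp = ord(ch)                # numeric code point
    utf8 = ch.encode("utf-8")   # UTF-8 byte sequence
    print(f"U+{cp:04X}: fits in 2 bytes = {cp <= 0xFFFF}, "
          f"UTF-8 = {len(utf8)} byte(s): {utf8.hex(' ')}")
```

Running this prints 1, 2, 3, and 4 UTF-8 bytes respectively; only the first three code points fit in two bytes, while U+1F632 lies beyond U+FFFF and needs four.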