4

What character encoding does libc expect? For example, gethostname(char *name, size_t namelen) takes a char * argument. Is the name parameter expected to be encoded in UTF-8 (which keeps ASCII intact), plain ASCII, or some other format?

Also, does C mandate any character encoding scheme?

chappar

4 Answers

3

All string functions (except the wide-character ones) support only the native character set, e.g. ASCII on Unix/Linux/Windows or EBCDIC on IBM mainframe/midrange computers.
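To illustrate what this means in practice, here is a minimal sketch that assumes the input happens to be valid UTF-8. The char functions work on bytes, so strlen() reports bytes rather than characters. The helper names utf8_codepoints and utf8_extra_bytes are hypothetical and not part of libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts UTF-8 code points by skipping
 * continuation bytes (bit pattern 10xxxxxx).
 * Assumption: s is valid, NUL-terminated UTF-8. */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Hypothetical helper: how many more bytes strlen() sees than
 * there are code points in the string. */
size_t utf8_extra_bytes(const char *s)
{
    return strlen(s) - utf8_codepoints(s);
}
```

For a pure ASCII string both counts agree, so utf8_extra_bytes returns 0. Any non-ASCII character makes strlen() larger than the character count.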

qrdl
  • How do you use these functions in a non-English environment? – chappar May 28 '09 at 06:50
  • Also, I think libc doesn't have wchar_t * equivalents of all the char * functions. – chappar May 28 '09 at 06:55
  • You have to convert yourself or get some lib to do the job - see more here: http://stackoverflow.com/questions/313555/light-c-unicode-library. Anyway, you cannot name your host in UTF-8, can you? – qrdl May 28 '09 at 06:57

1

  • char uses ASCII
  • wchar_t is the standard C data type for Unicode

Use the standard wide-character headers in order to deal with wide characters.
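As a rough sketch of what working with wide characters looks like: the header names were lost from the text above, so this example assumes the standard C headers <wchar.h> and <wctype.h>. The helper name wide_upper is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: upper-cases a wide string in place and
 * returns its length in wide characters (not bytes).
 * towupper() follows the current locale; in the default "C"
 * locale it maps the ASCII letters a-z to A-Z. */
size_t wide_upper(wchar_t *s)
{
    assert(s != NULL);
    for (wchar_t *p = s; *p != L'\0'; ++p)
        *p = (wchar_t)towupper((wint_t)*p);
    return wcslen(s);
}
```

Calling wide_upper on a writable buffer initialised from L"host" leaves L"HOST" in the buffer and returns 4.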

dfa
0

char is expected to hold a 7-bit-compatible ASCII encoding (though I can't find a definitive reference for this). The definition of wchar_t is left to the implementation, but the C standard requires that characters from the C portable character set have the same value in both types. If I understand this correctly, then the following comparison holds:

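#include <stddef.h>  /* wchar_t */

int main(void) {
    char a = 'a';
    wchar_t aw = L'a';
    if (a == (char)aw) {
        /* should be true: 'a' is in the portable character set */
    }
    return 0;
}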

The standard does not say anything about UTF-8.
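Since the standard leaves the encoding open, one option is to check at run time whether a given string, such as a host name, stays within 7-bit ASCII. This is only a sketch of that check. The helper name is_7bit_ascii is hypothetical and not something libc provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if every byte of the
 * NUL-terminated string s is in the 7-bit range 0x00-0x7F. */
bool is_7bit_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s > 0x7F)
            return false;
    }
    return true;
}
```

A string that fails this check contains bytes whose meaning depends on an encoding that the C standard does not specify.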

JesperE
0

You will probably have to use a third-party library, such as GLib. It is portable and very useful; it also provides regular expressions, data structures and more.

Bastien Léonard