5

I'm working with Unicode/wide characters and I'm trying to create a toString method (the equivalent of Java's toString()). Will ostream handle wide characters? If so, is there a way to warn the consumer of the stream that Unicode is coming out of it?

Cœur
monksy

2 Answers

3

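Neither ostream nor the rest of C++ knows anything about Unicode. std::ostream carries char and std::wostream carries wchar_t, but neither attaches an encoding to the characters passing through it, so there is no built-in way to tell a consumer that the output is Unicode. Usually you write a string conversion in C++ as follows: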

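#include <ostream>

// Works for both narrow (std::ostream) and wide (std::wostream) streams.
template<typename Char, typename Traits>
std::basic_ostream<Char, Traits>&
operator<<(std::basic_ostream<Char, Traits>& stream, const YourType& object) {
    return stream << object.a << object.b;  // or whatever members YourType has
}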

Whether you get something Unicode-like is up to the implementation. Streams in C++ are never text streams in the sense of Java, and C++'s strings are not strings in the sense of Java. If you want a real Unicode string, you might want to have a look at the ICU library.
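To make the pattern above concrete, here is a minimal sketch of a toString-style helper built on a templated inserter. The type Point, its members x and y, and the helper to_basic_string are hypothetical names chosen for illustration; they are not part of the standard library or of the answer above.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical example type; Point, x and y are illustrative names only.
struct Point {
    int x;
    int y;
};

// One templated inserter serves both narrow (char) and wide (wchar_t) streams.
template <typename Char, typename Traits>
std::basic_ostream<Char, Traits>&
operator<<(std::basic_ostream<Char, Traits>& stream, const Point& p) {
    // Narrow literals are widened by the stream when Char is wchar_t.
    return stream << '(' << p.x << ", " << p.y << ')';
}

// toString-style helper (hypothetical name): format into a string stream
// of the requested character type and return its contents.
template <typename Char>
std::basic_string<Char> to_basic_string(const Point& p) {
    std::basic_ostringstream<Char> out;
    out << p;
    return out.str();
}
```

Calling to_basic_string<char>(Point{1, 2}) yields a std::string and to_basic_string<wchar_t>(Point{1, 2}) yields a std::wstring. Neither result carries any encoding information; what the characters mean is still up to the platform and the consumer.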

Philipp
0

Wide strings are non-portable and are best avoided in C++. Depending on the platform, wchar_t strings can be UTF-16 (as on Windows), UTF-32 (as on most Unix-like systems), or even non-Unicode.

The common approach in C++ is to use UTF-8 encoded char strings internally and, if necessary, transcode at system boundaries. Most modern systems, such as Linux and macOS, work well with UTF-8, so passing UTF-8 strings to an output stream works there. Windows is the most problematic because of legacy code pages. You can use Boost Nowide or the {fmt} library for portable UTF-8 output.

Disclaimer: I'm the author of {fmt}.
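As a rough illustration of "transcoding at system boundaries", the sketch below converts a UTF-8 std::string to UTF-16, the form that wide-character system interfaces on Windows expect. The function name utf8_to_utf16 is hypothetical and is not taken from Boost Nowide or {fmt}. This is a simplified decoder: malformed input is replaced with U+FFFD, and overlong encodings and encoded surrogates are not rejected, so a real program should prefer a tested library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: transcode UTF-8 to UTF-16.
// Simplified: malformed sequences become U+FFFD; overlong forms and
// encoded surrogates are not detected.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); ++i; continue; }  // stray byte
        ++i;

        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            // Each continuation byte must look like 10xxxxxx.
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                ok = false;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF) {
            out.push_back(0xFFFD);
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The idea is that the rest of the program keeps passing plain UTF-8 strings around, and only the thin layer that talks to a UTF-16 interface calls a conversion like this one.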

vitaut