Question:

I query a Quake 3 master server via UDP and read the response with the code below. As the commented-out attempts show, I had trouble figuring out the encoding of what the server sent. Is there any way to detect or set the receive encoding?

    baBuffer = new byte[1024*100]; // 100 KB should be enough
    int recv = sctServerConnection.ReceiveFrom(baBuffer, ref tmpRemote);

    Console.WriteLine("Message received from {0}:", tmpRemote.ToString());

    System.Text.Encoding encResponseEncoding = System.Text.Encoding.Default; // Wrong...
    //encResponseEncoding = System.Text.Encoding.ASCII;
    //encResponseEncoding = System.Text.Encoding.UTF8;
    //encResponseEncoding = System.Text.Encoding.GetEncoding(437); // ANSI-DOS
    //encResponseEncoding = System.Text.Encoding.GetEncoding(1252); // ANSI-WestEurope
    //encResponseEncoding = System.Text.Encoding.GetEncoding(1250); // ANSI-CentralEurope
    //encResponseEncoding = System.Text.Encoding.GetEncoding("ISO-8859-1");
    //encResponseEncoding = System.Text.Encoding.GetEncoding("ISO-8859-9");
    //encResponseEncoding = System.Text.Encoding.UTF32;
    encResponseEncoding = System.Text.Encoding.UTF7; // Bingo !
soulmerge
Stefan Steiger

3 Answers

The encoding (if it is actually text) is determined by the protocol. If you don't have a protocol spec and you don't have the source code then, yes, you'll have to guess.
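
Before guessing an encoding, it can help to check whether the datagram is text at all. The following is a minimal sketch, not part of the original answer: `PacketInspector`, `ToHex`, and `LooksLikeAsciiText` are hypothetical names, and the "printable ASCII plus tab/CR/LF" rule is only an assumed heuristic. It relies on the received byte count (such as `recv` from `ReceiveFrom`) so that the unused tail of the buffer is ignored.

```csharp
using System;
using System.Linq;

static class PacketInspector
{
    // Hex dump of the first `count` bytes, e.g. "FF-FF-41".
    public static string ToHex(byte[] buffer, int count)
    {
        return BitConverter.ToString(buffer, 0, count);
    }

    // Assumed heuristic: treat the payload as text only if every received
    // byte is printable ASCII or one of tab, line feed, carriage return.
    public static bool LooksLikeAsciiText(byte[] buffer, int count)
    {
        return buffer.Take(count).All(b =>
            b == 0x09 || b == 0x0A || b == 0x0D || (b >= 0x20 && b <= 0x7E));
    }
}
```

With the question's variables this would be called as `PacketInspector.ToHex(baBuffer, recv)`. If the dump shows many bytes outside the printable range, the payload is probably binary data that should be parsed according to the protocol rather than decoded as a string.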

Hans Passant

There is no way to reliably detect an encoding; you can only guess it. See also "How can I detect the encoding/codepage of a text file".
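
One common way to guess is to try a strict decoder first and fall back when it fails. The sketch below is an illustration under assumptions, not a reliable detector: `EncodingGuess` and `DecodeBestEffort` are hypothetical names, and the choice of UTF-8 followed by ISO-8859-1 is arbitrary. It uses a `UTF8Encoding` constructed with `throwOnInvalidBytes: true`, which raises `DecoderFallbackException` on malformed input instead of silently substituting replacement characters.

```csharp
using System;
using System.Text;

static class EncodingGuess
{
    // Strict UTF-8: throws on invalid byte sequences.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Decodes only the first `count` bytes (the number actually received).
    public static string DecodeBestEffort(byte[] buffer, int count)
    {
        try
        {
            return StrictUtf8.GetString(buffer, 0, count);
        }
        catch (DecoderFallbackException)
        {
            // Assumed fallback: ISO-8859-1 maps every byte 0x00-0xFF to the
            // code point with the same value, so decoding cannot fail.
            return Encoding.GetEncoding("ISO-8859-1").GetString(buffer, 0, count);
        }
    }
}
```

A successful decode only shows that the bytes are valid in that encoding, not that the sender actually used it, so this remains a guess.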

Community
svick

You can look for the byte order mark (BOM) at the start of the data. Here's some VB.NET code that I use:

Private Shared Function GetStringFromBytes(ByVal bytes() As Byte) As String
    Dim byteLength As Integer = bytes.Length
    ' Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM (FF FE 00 00)
    ' begins with the UTF-16 LE BOM (FF FE) and would otherwise never match.
    ' Each branch skips the BOM so it does not end up in the returned string.
    If (byteLength >= 4) AndAlso (bytes(0) = &H0) AndAlso (bytes(1) = &H0) AndAlso (bytes(2) = &HFE) AndAlso (bytes(3) = &HFF) Then
        ' UTF-32, big-endian
        Return New System.Text.UTF32Encoding(True, True).GetString(bytes, 4, byteLength - 4)
    ElseIf (byteLength >= 4) AndAlso (bytes(0) = &HFF) AndAlso (bytes(1) = &HFE) AndAlso (bytes(2) = &H0) AndAlso (bytes(3) = &H0) Then
        ' UTF-32, little-endian
        Return System.Text.Encoding.UTF32.GetString(bytes, 4, byteLength - 4)
    ElseIf (byteLength >= 3) AndAlso (bytes(0) = &HEF) AndAlso (bytes(1) = &HBB) AndAlso (bytes(2) = &HBF) Then
        Return System.Text.Encoding.UTF8.GetString(bytes, 3, byteLength - 3)
    ElseIf (byteLength >= 2) AndAlso (bytes(0) = &HFE) AndAlso (bytes(1) = &HFF) Then
        ' UTF-16, big-endian
        Return System.Text.Encoding.BigEndianUnicode.GetString(bytes, 2, byteLength - 2)
    ElseIf (byteLength >= 2) AndAlso (bytes(0) = &HFF) AndAlso (bytes(1) = &HFE) Then
        ' UTF-16, little-endian
        Return System.Text.Encoding.Unicode.GetString(bytes, 2, byteLength - 2)
    Else
        'No BOM, assume ASCII
        Return System.Text.Encoding.ASCII.GetString(bytes)
    End If
End Function
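
For the C# code in the question, a similar BOM check can be sketched with `StreamReader`, which can detect a byte order mark when its `detectEncodingFromByteOrderMarks` argument is `true`. This is a minimal illustration rather than a translation of the function above: `BomDecoder` and `Decode` are hypothetical names, and ASCII as the no-BOM fallback is an assumption carried over from the VB.NET version.

```csharp
using System.IO;
using System.Text;

static class BomDecoder
{
    // Decodes the first `count` bytes, using a BOM if one is present
    // and falling back to ASCII (an assumed default) otherwise.
    public static string Decode(byte[] buffer, int count)
    {
        using (var stream = new MemoryStream(buffer, 0, count))
        using (var reader = new StreamReader(stream, Encoding.ASCII, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the question's variables this would be called as `BomDecoder.Decode(baBuffer, recv)`, so that only the bytes actually received are decoded. A BOM is optional, so its absence says nothing about the encoding.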
Chris Haas