21

I've just read an article about the differences between HTTP/1 and HTTP/2, and my main question is about the claim that HTTP/2 is a binary protocol while HTTP/1 is a textual protocol.

Maybe I'm wrong, but as far as I know any data, whether text or any other format, has a binary representation in memory. Even when it is transferred over a TCP/IP network, the data is split up into the format required by each layer (in the OSI or TCP/IP model), which means that technically a textual format doesn't exist in the context of network data transfer.

So I cannot really understand this difference between HTTP/2 and HTTP/1. Can you please help me with a better explanation?

liuliang
Christian Lisangola

4 Answers

45

Binary is probably a confusing term - everything is ultimately binary at some point in computers!

HTTP/2 has a highly structured format where HTTP messages are formatted into packets (called frames) and where each frame is assigned to a stream. HTTP/2 frames have a specific format, including a length which is declared at the beginning of each frame and various other fields in the frame header. In many ways it’s like a TCP packet. Reading an HTTP/2 frame can follow a defined process (the first 24 bits are the length of this packet, followed by 8 bits which define the frame type... etc.). After the frame header comes the payload (e.g. HTTP Headers, or the Body payload) and these will also be in a specific format that is known in advance. An HTTP/2 message can be sent in one or more frames.
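To make the "defined process" concrete, here is a minimal Python sketch of reading the fixed 9-byte frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream id). The names `FrameHeader` and `parse_frame_header` are made up for illustration, and the sample frame is hand-built rather than captured from a real connection.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # 24-bit payload length
    frame_type: int  # 8-bit frame type (0x0 = DATA, 0x1 = HEADERS, ...)
    flags: int       # 8-bit flags
    stream_id: int   # 31-bit stream identifier


def parse_frame_header(data: bytes) -> FrameHeader:
    """Decode the fixed 9-byte header that starts every HTTP/2 frame."""
    if len(data) < 9:
        raise ValueError("need at least 9 bytes for a frame header")
    # ">BHBBI" = 1 + 2 bytes of length, 1 byte type, 1 byte flags, 4 bytes stream id
    len_hi, len_lo, frame_type, flags, raw_stream = struct.unpack(">BHBBI", data[:9])
    length = (len_hi << 16) | len_lo
    stream_id = raw_stream & 0x7FFFFFFF  # the top bit is reserved
    return FrameHeader(length, frame_type, flags, stream_id)


# Hand-built DATA frame: 5-byte payload, END_STREAM flag (0x1), stream 1.
frame = bytes([0x00, 0x00, 0x05, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01]) + b"hello"
header = parse_frame_header(frame)
payload = frame[9:9 + header.length]
```

Because the length is known after reading the first three bytes, the reader can slice out the payload directly instead of scanning for a delimiter.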

By contrast HTTP/1.1 is an unstructured format made up of lines of text in ASCII encoding - so yes this is transmitted as binary ultimately, but it’s basically a stream of characters rather than being specifically broken into separate pieces/frames (other than lines). HTTP/1.1 messages (or at least the first HTTP Request/Response line and HTTP Headers) are parsed by reading in characters one at a time, until a new line character is reached. This is kind of messy as you don’t know in advance how long each line is, so you must process it character by character. In HTTP/1.1 the HTTP Body’s length is handled slightly differently, as it is typically known in advance because a Content-Length HTTP header defines it. An HTTP/1.1 message must be sent in its entirety as one continuous stream of data, and the connection cannot be used for anything else but transmitting that message until it is completed.
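As a rough illustration of that line-by-line scanning, here is a small Python sketch. `read_line` and `parse_request` are hypothetical helper names, the request bytes are invented, and the sketch ignores details a real parser needs (chunked bodies, repeated headers, size limits).

```python
from typing import Dict, Tuple


def read_line(buf: bytes, start: int) -> Tuple[str, int]:
    """Scan forward from `start` until CRLF; return the line and the next offset."""
    end = start
    while buf[end:end + 2] != b"\r\n":
        if end >= len(buf):
            raise ValueError("incomplete line")
        end += 1  # the line length is unknown, so advance one byte at a time
    return buf[start:end].decode("ascii"), end + 2


def parse_request(buf: bytes):
    """Parse the request line and headers, then use Content-Length for the body."""
    request_line, pos = read_line(buf, 0)
    method, target, version = request_line.split(" ")
    headers: Dict[str, str] = {}
    while True:
        line, pos = read_line(buf, pos)
        if line == "":  # an empty line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.lower()] = value.strip()
    length = int(headers.get("content-length", "0"))
    body = buf[pos:pos + length]
    return method, target, version, headers, body


raw = b"POST /foo HTTP/1.1\r\nHost: example.com\r\nContent-Length: 5\r\n\r\nhello"
method, target, version, headers, body = parse_request(raw)
```

Only the body can be sliced out in one step, and only because the Content-Length header was found by scanning the text first.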

The advantage that HTTP/2 brings is that, by packaging messages into specific frames we can intermingle the messages: here’s a bit of request 1, here’s a bit of request 2, here’s some more of request 1... etc. In HTTP/1.1 this is not possible as the HTTP message is not wrapped into packets/frames tagged with an id as to which request this belongs to.
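The intermingling can be sketched in Python as follows. This is a simplified model, not a real HTTP/2 implementation: it only handles DATA frames, skips HEADERS, flow control and settings, and the helper names `make_data_frame` and `demultiplex` are made up for this example.

```python
import struct
from collections import defaultdict
from typing import Dict


def make_data_frame(stream_id: int, payload: bytes, end_stream: bool = False) -> bytes:
    """Build a DATA frame: 9-byte header (length, type, flags, stream id) + payload."""
    length = len(payload)
    flags = 0x1 if end_stream else 0x0  # 0x1 is END_STREAM
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF, 0x0, flags, stream_id)
    return header + payload


def demultiplex(wire: bytes) -> Dict[int, bytes]:
    """Walk the byte stream frame by frame and regroup payloads by stream id."""
    streams = defaultdict(bytes)
    pos = 0
    while pos < len(wire):
        len_hi, len_lo, _type, _flags, raw_stream = struct.unpack(
            ">BHBBI", wire[pos:pos + 9]
        )
        length = (len_hi << 16) | len_lo
        stream_id = raw_stream & 0x7FFFFFFF  # drop the reserved top bit
        streams[stream_id] += wire[pos + 9:pos + 9 + length]
        pos += 9 + length  # jump straight to the next frame header
    return dict(streams)


# Two messages interleaved on one connection: stream 1, then stream 3, then stream 1 again.
wire = (
    make_data_frame(1, b"first half of A, ")
    + make_data_frame(3, b"all of B", end_stream=True)
    + make_data_frame(1, b"second half of A", end_stream=True)
)
bodies = demultiplex(wire)
```

The stream id in every frame header is what lets the receiver put the pieces back together, which is exactly what a plain HTTP/1.1 character stream lacks.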

I’ve a diagram here and an animated version here that help conceptualise this better.

Barry Pollard
  • Then how about the JSON body payload? Is it also binary? In addition, for HTTP/1.1 with content-type: image/gif (octet-stream), is the body encoded as text or binary? – Stan Peng Apr 12 '22 at 12:39
  • 1
    Bodies are stored in a DATA frame, which includes a header giving details like frame type, stream… etc. The payload within that can be text for text formats, but not for binary formats like GIF. However, text resources are typically compressed with gzip or Brotli, so they will be binary data. – Barry Pollard Apr 12 '22 at 13:52
  • So basically text resources will be text, which has to be translated to bytes char by char. Integers are still not supported in the text payload. The good thing is that we can use gzip (Huffman encoding and LZ77) to shorten the binary length – Stan Peng Apr 13 '22 at 01:19
4

HTTP/1.x basically encodes all relevant instructions as ASCII code points, e.g.:

GET /foo HTTP/1.1

Yes, this is represented as bytes on the actual transport layer, but the commands are based on ASCII bytes, and are hence readable as text.
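A quick way to see this for yourself, as a small Python sketch (the variable names are arbitrary):

```python
request_line = b"GET /foo HTTP/1.1\r\n"

# The bytes that go over the wire, shown as hex...
wire_hex = request_line.hex(" ")
print(wire_hex)  # begins with 47 45 54, the ASCII codes for 'G', 'E', 'T'

# ...and the same bytes decoded as ASCII, which gives back readable text.
as_text = request_line.decode("ascii")
print(as_text)
```

Every byte of the command maps to a printable ASCII character (plus CR and LF), which is why you can read an HTTP/1.1 exchange in a plain terminal.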

HTTP/2 uses actual binary commands, i.e. individual bits and bytes which have no representation other than the bits and bytes that they are, and hence have no readable representation. (Note that HTTP/2 essentially wraps HTTP/1 in such a binary protocol, there's still "GET /foo" to be found somewhere in there.)

deceze
0

For example:

Connection: keep-alive

In HTTP/1.1

In HTTP/1.1 it is sent as plain text (header fields are essentially ASCII):

Connection: keep-alive

Just the text.

In HTTP/2

Beforehand the client and the server have agreed on some value collections like:

headerField: ['user-agent','cookie', 'connection',...]

connection-values: ['keep-alive', 'close'...]

Then Connection: keep-alive will be encoded as:

2:0

end
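The tables and the `2:0` above are only illustrative. In real HTTP/2 the header compression scheme is HPACK (RFC 7541): the shared list is a static table fixed by the specification rather than negotiated, plus a dynamic table built up during the connection, and connection-specific headers such as Connection are not actually sent in HTTP/2. As a hedged sketch of the same idea with a real entry, `:method: GET` is index 2 of the static table and is sent as the single byte `0x82`. The names `STATIC_TABLE`, `encode_indexed` and `decode_indexed` are made up here, and only the first 8 of the 61 static entries are listed.

```python
# First entries of the HPACK static table (RFC 7541, Appendix A).
STATIC_TABLE = {
    1: (":authority", ""),
    2: (":method", "GET"),
    3: (":method", "POST"),
    4: (":path", "/"),
    5: (":path", "/index.html"),
    6: (":scheme", "http"),
    7: (":scheme", "https"),
    8: (":status", "200"),
}


def encode_indexed(index: int) -> bytes:
    """Indexed Header Field: a leading 1 bit followed by the index in 7 bits."""
    if not 1 <= index <= 126:
        raise ValueError("this sketch only handles indexes that fit in one byte")
    return bytes([0x80 | index])


def decode_indexed(octet: int):
    """Look up the (name, value) pair for a one-byte indexed header field."""
    if not octet & 0x80:
        raise ValueError("not an indexed header field")
    return STATIC_TABLE[octet & 0x7F]


encoded = encode_indexed(2)           # one byte instead of the text ":method: GET"
decoded = decode_indexed(encoded[0])
```

The byte `0x82` has no meaning as text; it only makes sense to a peer that knows the same table, which is the point of calling the protocol binary.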

Here is a protocol similar to the HTTP/2 binary protocol: the Thrift binary protocol

zizhen zhan
-3

I believe the primary reason HTTP/2 uses binary encoding is to pack the payload into fixed-size frames. Plain text cannot fit exactly into a frame, so binary-encoding the data and splitting it into multiple frames makes a lot more sense.

Prasanna