40

I am working with Python 3.2. I need to take a hex stream as input and parse it at the bit level. So I used

bytes.fromhex(input_str)

to convert the string to actual bytes. Now how do I convert these bytes to bits?

user904832
  • 1
    Bytes are bits, just 8 at a time ;) - The answer depends on what you want to do, so please be more specific. Also, bit manipulation is mostly done at the byte level... – Martin Thurau Jan 11 '12 at 07:27
  • I want to represent the bytes in the form of a bit string so that I can do something like: `field1 = bit_string[0:1]`, `field2 = bit_string[1:16]`, and so on – user904832 Jan 11 '12 at 07:31
  • Confusing title. Hexadecimals have nothing to do with bytes. The title should be: "Convert hexadecimals to bits in python" – illuminato Feb 13 '22 at 16:30

11 Answers

45

Another way to do this is by using the bitstring module:

>>> from bitstring import BitArray
>>> input_str = '0xff'
>>> c = BitArray(hex=input_str)
>>> c.bin
'0b11111111'

And if you need to strip the leading 0b:

>>> c.bin[2:]
'11111111'

The bitstring module isn't a requirement, as jcollado's answer shows, but it has lots of performant methods for turning input into bits and manipulating them. You might find this handy (or not), for example:

>>> c.uint
255
>>> c.invert()
>>> c.bin[2:]
'00000000'

etc.

Alex Reynolds
29

Operations are much faster when you work at the integer level. In particular, converting to a string as suggested here is really slow.

If you want bits 7 and 8 only, use e.g.

val = (byte >> 6) & 3

(That is: shift the byte 6 bits to the right, which drops the lowest 6 bits, then keep only the last two bits. 3 is the number with the lowest two bits set, binary 11.)

These can easily be translated into simple CPU operations that are super fast.
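
As a rough sketch of the integer-level approach above, the shift-and-mask step can be wrapped in a small helper. The name `extract_bits` and its parameters are hypothetical, chosen only for illustration; `shift` counts from the least significant bit, starting at 0.

```python
def extract_bits(value, shift, width):
    """Return `width` bits of `value`, starting `shift` bits above the lowest bit."""
    mask = (1 << width) - 1          # e.g. width=2 gives 0b11 == 3
    return (value >> shift) & mask   # drop the low bits, then keep `width` bits


byte = 0b11010110
top_two = extract_bits(byte, 6, 2)      # same as (byte >> 6) & 3
low_nibble = extract_bits(byte, 0, 4)   # the four lowest bits

# For multi-byte input, turn the whole hex stream into one integer first.
# int.from_bytes is available from Python 3.2 onward.
stream = int.from_bytes(bytes.fromhex("a1b2"), "big")
first_bit = extract_bits(stream, 15, 1)  # most significant of the 16 bits
```

This keeps everything as integers, so no intermediate bit string is ever built.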

Has QUIT--Anony-Mousse
28

What about something like this?

>>> bin(int('ff', base=16))
'0b11111111'

This will convert the hexadecimal string you have to an integer, and that integer to a string in which each character is 0 or 1 depending on the corresponding bit of the integer.

As pointed out by a comment, if you need to get rid of the 0b prefix, you can do it this way:

>>> bin(int('ff', base=16)).lstrip('0b')
'11111111'

or this way:

>>> bin(int('ff', base=16))[2:]
'11111111'
wjandrea
jcollado
  • lstrip('-0b') # remove leading zeros and minus sign – ahoffer Jan 11 '12 at 07:35
  • @ahoffer Thanks for your comment. I've updated my answer to let the OP know how to remove the `0b` prefix. – jcollado Jan 11 '12 at 07:39
  • 10
    Note that `lstrip('0b')` will also remove, say, `00bb` since the argument to `lstrip` is a *set* of characters to remove. It'll work fine in this case, but I prefer the `[2:]` solution since it's more explicit. – Martin Geisler Jan 11 '12 at 07:45
  • @MartinGeisler Yes, leading zeros are already removed when converting to an integer, so `bin` never emits them, but it's worth noting that `lstrip` removes a set of characters, not a prefix. – jcollado Jan 11 '12 at 07:50
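
To make the point from the comments above concrete, here is a small sketch. It relies only on the documented behaviour of `str.lstrip` and `bin`; the sample values and variable names are arbitrary.

```python
# str.lstrip takes a set of characters, not a prefix, so it keeps
# stripping for as long as the next character is '0' or 'b'.
stripped = "0b00bb1".lstrip("0b")        # only "1" is left

# With bin() output this only bites for zero, where nothing is left at all.
zero_lstrip = bin(0).lstrip("0b")        # empty string
zero_slice = bin(0)[2:]                  # "0"

# Slicing off the first two characters is explicit and safe
# for non-negative integers.
bits = bin(int("ff", base=16))[2:]

# Leading zeros of the hex input are lost in the integer conversion;
# pad back to 4 bits per hex digit if the width matters.
hex_str = "0f"
padded = bin(int(hex_str, base=16))[2:].zfill(4 * len(hex_str))
```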
10

Using Python format string syntax:

>>> mybyte = bytes.fromhex("0F") # create my byte using a hex string
>>> binary_string = "{:08b}".format(int(mybyte.hex(),16))
>>> print(binary_string)
00001111

The second line is where the magic happens. All bytes objects have a .hex() method (Python 3.5 and later), which returns a hex string. We convert this hex string to an integer, telling the int() function that it's a base 16 string (because hex is base 16). Then we apply formatting to that integer so it displays as a binary string. The {:08b} is the key part. It uses the Format Specification Mini-Language format_spec, specifically the width and the type parts of the format_spec syntax. The 8 sets the width to 8 and the 0 in front of it requests zero padding, which is how we get the nice 0000 padding, and the b sets the type to binary.

I prefer this method over the bin() method because using a format string gives a lot more flexibility.

ZenCodr
  • but this method doesn't let you take a variable number of bytes as input, right? you need to hard-code how long the final binary string needs to be. – Nathan Wailes Aug 26 '20 at 11:49
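
On the question raised in the comment above: the width does not have to be hard-coded. One possible sketch, assuming the input is a `bytes` object of any length, either formats each byte separately or derives the total width from the length. The helper names `bits_per_byte` and `bits_whole` are made up for this example. `int.from_bytes` exists from Python 3.2, so the second variant also avoids `bytes.hex()`, which needs Python 3.5 or later.

```python
def bits_per_byte(data):
    """Format each byte as 8 zero-padded binary digits and join them."""
    return "".join("{:08b}".format(b) for b in data)


def bits_whole(data):
    """Convert the whole input to one integer, padded to 8 bits per byte."""
    if not data:
        return ""                       # avoid formatting 0 as "0" for empty input
    width = 8 * len(data)               # total number of bits expected
    return "{:0{width}b}".format(int.from_bytes(data, "big"), width=width)


data = bytes.fromhex("0fa001")
bit_string = bits_per_byte(data)
```

Both helpers return the same string for the same input; the per-byte version is probably the easier one to read.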
9

I think the simplest would be to use numpy here. For example, you can read a file as bytes and then expand it to bits easily like this:

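import numpy

Bytes = numpy.fromfile(filename, dtype="uint8")  # read the file as an array of bytes
Bits = numpy.unpackbits(Bytes)  # expand every byte into 8 bits, most significant bit first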
Mikhail V
4

Use ord when reading bytes:

byte_binary = bin(ord(f.read(1))) # Add [2:] to remove the "0b" prefix

Or, using str.format():

'{:08b}'.format(ord(f.read(1)))
Jacob Valenta
4

To binary:

bin(byte)[2:].zfill(8)
Ferguzz
3

Here is how to do it using format():

print("bin_signedDate : ", ''.join(format(x, '08b') for x in bytevector))

The 08b is important. It zero-pads each value to a width of 8 digits, so every byte is written out as a full 8 bits. If you don't specify it, each converted byte will have a variable bit length.
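
With fixed 8-bit padding, positions in the joined string map directly to bit offsets in the stream, which is what the slicing mentioned in the question's comments needs. A minimal sketch, assuming `bytevector` is a `bytes` object; the sample value and the field boundaries are only illustrative.

```python
bytevector = bytes.fromhex("8001")

# Each byte contributes exactly 8 characters, so index i is bit i of the stream.
bit_string = "".join(format(x, "08b") for x in bytevector)

field1 = bit_string[0:1]    # first bit
field2 = bit_string[1:16]   # next 15 bits

# Convert a field back to an integer when a numeric value is needed.
field2_value = int(field2, 2)

# Without the padding the per-byte widths vary and offsets no longer line up.
unpadded = "".join(format(x, "b") for x in bytevector)
```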

Joniale
3
input_str = "ABC"
[bin(byte) for byte in bytes(input_str, "utf-8")]

Will give:

['0b1000001', '0b1000010', '0b1000011']
AJP
1

The other answers here provide the bits in big-endian order ('\x01' becomes '00000001').

In case you're interested in little-endian bit order, which is useful in many cases, such as common representations of bignums, here's a snippet for that:

def bits_little_endian_from_bytes(s):
    # s is a bytes object; iterating over it in Python 3 yields ints
    return ''.join(format(x, '08b')[::-1] for x in s)

And for the other direction:

def bytes_from_bits_little_endian(s):
    # s is a string of '0'/'1' characters whose length is a multiple of 8
    return bytes(int(s[i:i+8][::-1], 2) for i in range(0, len(s), 8))
yairchu
0

A one-line function to convert bytes (not a string) to a list of bits. There is no endianness issue when the data goes from one byte reader/writer to another; it only arises if the source and target are bit readers and bit writers.

def byte2bin(b):
    # format each byte as 8 zero-padded binary digits, then split into a list of ints
    return [int(bit) for bit in "".join("{:08b}".format(byte) for byte in b)]
user6830669