31

I'm trying to read a file byte by byte, but I'm not sure how to do that. I'm trying to do it like this:

file = open(filename, 'rb')
while 1:
   byte = file.read(8)
   # Do something...

So does that make the variable byte contain the next 8 bits at the beginning of every loop iteration? It doesn't matter what those bytes really are. The only thing that matters is that I need to read the file in 8-bit chunks.

EDIT:

Also, I collect those bytes in a list and would like to print them not as ASCII characters but as raw bits, i.e. when I print that byte list, it gives a result such as

['10010101', '00011100', .... ]
skaffman
zaplec

5 Answers

40

To read one byte:

file.read(1)

The argument to read() is a count of bytes, not bits, and 8 bits is one byte.
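
A minimal sketch of the difference between read(1) and read(8), assuming Python 3 and using an in-memory io.BytesIO stream as a stand-in for a real file opened in 'rb' mode (the sample data is made up for illustration):

```python
import io

# Stand-in for open(filename, 'rb'); the ten sample bytes are arbitrary.
stream = io.BytesIO(b"0123456789")

one = stream.read(1)    # the next single byte (8 bits)
eight = stream.read(8)  # the next eight bytes (64 bits)
rest = stream.read(8)   # fewer than eight bytes remain, so only those come back
end = stream.read(1)    # at end of file, read() returns an empty bytes object

print(one, eight, rest, end)
```

The empty result at the end is what a read loop can test for to know when to stop.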

Mark Byers
19

To answer the second part of your question, to convert to binary you can use a format string and the ord function:

>>> byte = 'a'
>>> '{0:08b}'.format(ord(byte))
'01100001'

Note that the format pads with the right number of leading zeros, which seems to be your requirement. This method needs Python 2.6 or later.
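
For Python 3, where a file opened in 'rb' mode yields bytes objects and iterating over them gives integers directly, ord is not needed. A rough sketch under that assumption (to_bit_strings is a hypothetical helper name, and the sample bytes are chosen to match the output shown in the question):

```python
def to_bit_strings(data):
    """Format each byte of a bytes object as an 8-character binary string."""
    # Iterating over bytes in Python 3 yields ints in the range 0-255.
    return ['{0:08b}'.format(b) for b in data]

# In real use, data would come from something like f.read() on a file
# opened with open(filename, 'rb'); here it is a literal for illustration.
bit_strings = to_bit_strings(b'a\x95\x1c')
print(bit_strings)  # ['01100001', '10010101', '00011100']
```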

Scott Griffiths
17

The code you've shown will read 8 bytes. You could use

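with open(filename, 'rb') as f:
    while True:
        byte_s = f.read(1)   # read exactly one byte (8 bits)
        if not byte_s:       # an empty result means end of file
            break
        byte = byte_s[0]     # an int in Python 3, a 1-character str in Python 2
        ...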
kennytm
2

Python has a module made specifically for reading and writing binary-encoded data, called 'struct'. Since versions of Python before 2.6 don't support str.format, a custom function is needed to create binary-formatted strings.

import struct

# Format an integer in the range 0-255 as an 8-character binary string.
def bstr(n):
    return ''.join([str(n >> x & 1) for x in (7,6,5,4,3,2,1,0)])

# Read a file into a list of binary-formatted strings.
def read_binary(path):
    f = open(path, 'rb')
    binlist = []
    try:
        while True:
            chunk = f.read(1)
            if not chunk:  # an empty read means end of file
                break
            # B stands for unsigned char (8 bits)
            value = struct.unpack('B', chunk)[0]
            binlist.append(bstr(value))
    finally:
        f.close()
    return binlist
  • 1
    If you're just using it for a single character, surely you'd do better to just use `ord(f.read(1))` instead of `struct.unpack('B', f.read(1))[0]`? (You'd need to make it something like `c = f.read(1); if not c: break; binlist.append(bstr(ord(c)))`.) – Chris Morgan Dec 13 '11 at 09:52
  • I've got this error: ---> 12 bin = struct.unpack('B',f.read(1))[0] # B stands for unsigned char (8 bits) error: unpack requires a buffer of 1 bytes – MGM Oct 31 '18 at 12:14
0

Late to the party, but this may help anyone looking for a quick solution:

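You can use bin(ord('b')).replace('b', ''). bin() returns the binary representation with a '0b' prefix, so the 'b' has to be removed. Note that the result is not padded to a fixed width: it comes out as exactly 8 digits only for values from 64 to 127. Also, ord() gives the integer code of a one-character string (the ASCII number for an ASCII character).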
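
A small comparison of this approach with a zero-padded format, as a sketch only (bits_via_bin and bits_via_format are hypothetical helper names; format with the '08b' spec needs Python 2.6 or later):

```python
def bits_via_bin(ch):
    # bin() returns e.g. '0b1100010'; dropping the 'b' leaves the leading '0'.
    return bin(ord(ch)).replace('b', '')

def bits_via_format(ch):
    # '08b' pads with leading zeros to a fixed width of 8 digits.
    return format(ord(ch), '08b')

for ch in ('b', '\n'):
    print(repr(ch), bits_via_bin(ch), bits_via_format(ch))
```

For 'b' (98) both give '01100010', but for '\n' (10) the bin() version gives '01010' while the padded format gives '00001010'.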

Cheers

e-nouri