15

Say I have a binary file that contains positive integers, each written as a 32-bit little-endian value.

How do I read this file? This is what I have right now:

#include <stdio.h>

int main() {
    FILE * fp;
    char buffer[4];
    int num = 0;
    fp=fopen("file.txt","rb");
    while ( fread(&buffer, 1, 4,fp) != 0) {

        // I think buffer now holds the 32-bit integer I just read;
        // how can I set num to that 32-bit little-endian integer?
    }
    // Say I just want the sum of all these little-endian integers:
    // is there another way to read and sum them faster? Since it's all
    // binary, shouldn't it be faster if I just add in binary? Not sure..
    return 0;
}
Jonathan Leffler
user1713700
  • 1
    possible duplicate of [Byte swap during copy](http://stackoverflow.com/questions/7342527/byte-swap-during-copy) – OmnipotentEntity Oct 21 '12 at 19:14
  • @OmnipotentEntity: the question covers the same class of problem, but is different, I think. A beginner will find the linked question & answers difficult to follow. – slashmais Oct 21 '12 at 19:25
  • 1
    If you are using an 80x86 machine (all of them use little-endian), you won't need to make any adjustments to the numbers. – slashmais Oct 21 '12 at 19:40
  • 1
    Several answers are assuming the *reader* is NOT little endian as well. The OP made no mention of that; only that the *writer* used LE-output format. The subject code should be portable to deal with either an LE or BE reader (which some are, thankfully). – WhozCraig Oct 21 '12 at 19:51

3 Answers

20

This is one way to do it that works on either big-endian or little-endian architectures:

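#include <stdio.h>
#include <stdint.h>

int main(void) {
    unsigned char bytes[4];
    uint32_t sum = 0;   /* unsigned, so the total wraps modulo 2^32 instead of overflowing */
    FILE *fp = fopen("file.txt", "rb");
    if (fp == NULL) return 1;
    while (fread(bytes, 4, 1, fp) == 1) {
        /* Widen each byte before shifting so that bytes[3] << 24 cannot overflow a signed int. */
        sum += (uint32_t)bytes[0] | ((uint32_t)bytes[1] << 8) |
               ((uint32_t)bytes[2] << 16) | ((uint32_t)bytes[3] << 24);
    }
    fclose(fp);
    printf("%lu\n", (unsigned long)sum);
    return 0;
}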
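
The same byte assembly can be packaged as a helper so it is written only once. This is a minimal sketch, not part of the original answer: `load_le32` and `sum_le32_stream` are hypothetical names, and the 64-bit accumulator is an added assumption so that a long file cannot overflow the total.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: assemble a 32-bit value from 4 little-endian bytes.
   Each byte is widened to uint32_t before shifting, so b[3] << 24 is well defined. */
static uint32_t load_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: sum every complete 4-byte record readable from fp.
   A trailing fragment shorter than 4 bytes is ignored. */
static uint64_t sum_le32_stream(FILE *fp)
{
    unsigned char bytes[4];
    uint64_t sum = 0;

    while (fread(bytes, sizeof bytes, 1, fp) == 1)
        sum += load_le32(bytes);
    return sum;
}
```

Calling `sum_le32_stream(fp)` on a stream opened with `fopen("file.txt", "rb")` gives the same total on a big-endian or little-endian host. A `FILE *` opened on a regular file is normally buffered by the C library, so reading four bytes per `fread` call does not usually mean one system call per integer.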
skyking
Vaughn Cato
  • 2
    This is technically undefined behavior if the number you're reading in is negative; it can be avoided by casting: `(unsigned) bytes[3] << 24`. – Dietrich Epp Oct 21 '12 at 19:37
  • +1: for not assuming the reader is BE (though I would have gone 4,1 on the `fread()`) – WhozCraig Oct 21 '12 at 19:52
  • @WhozCraig: Good point on the size vs. count arguments to fread -- changed. – Vaughn Cato Oct 21 '12 at 20:02
  • Do I need `& 0xFF` after `bytes[0]`, like `(buffer[0] & 0xFF) | (buffer[1] & 0xFF) << 8`? Also, just curious: what will `bytes[1]` be if I have the number 123456? Thanks! – user1713700 Oct 21 '12 at 22:15
  • @user1713700: each element of bytes just holds one byte, and since it is unsigned, there's no need to mask off the upper bits, because they'll all be zero. – Vaughn Cato Oct 22 '12 at 00:22
  • @VaughnCato I'm glad someone answered with this. There are several questions on SO about swapping endian-ness but your answer addresses the problem of "reading in a binary number as byte sequence of known endianness" in a cross platform and simple way. – cheshirekow Aug 29 '13 at 15:30
  • There are a few fine points that make this not fully portable. First, `int` doesn't have to be big enough to hold a 32-bit value (there doesn't need to be a type that holds exactly 32 bits at all). Second, the `unsigned char` type doesn't have to be exactly 8 bits either. – skyking Dec 07 '15 at 09:28
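
To make the byte layout asked about in the comments concrete: 123456 is 0x0001E240, so in little-endian order the file holds the bytes 0x40, 0xE2, 0x01, 0x00, and `bytes[1]` is 0xE2. The sketch below is illustrative only; `store_le32` and `load_le32_masked` are hypothetical names. It uses `uint_least32_t` and an explicit `& 0xFF` to address the two portability points in the last comment, on the assumption that each array element carries one 8-bit octet of the file.

```c
#include <stdint.h>

/* Hypothetical helper: write v as 4 octets, least significant first. */
static void store_le32(uint_least32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Hypothetical helper: rebuild the value from 4 octets.
   uint_least32_t is required by C99, whereas uint32_t is optional.
   The & 0xFF is redundant when CHAR_BIT == 8 and only matters if
   unsigned char is wider than 8 bits. */
static uint_least32_t load_le32_masked(const unsigned char in[4])
{
    return ((uint_least32_t)(in[0] & 0xFF))
         | ((uint_least32_t)(in[1] & 0xFF) << 8)
         | ((uint_least32_t)(in[2] & 0xFF) << 16)
         | ((uint_least32_t)(in[3] & 0xFF) << 24);
}
```

On a platform with 8-bit bytes the mask changes nothing, which matches the earlier reply that no masking is needed for `unsigned char`.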
12

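If you are using Linux, you should look at the byte-order functions declared in `<endian.h>` ;-)

It provides useful functions such as `le32toh`, which converts a 32-bit little-endian value to the host's byte order.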
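
As a rough sketch of how `le32toh` might be applied to this question: read each 4-byte group straight into a `uint32_t` and let the function convert it to host order. This assumes glibc on Linux, where `le32toh` is declared in `<endian.h>`; other systems may declare it elsewhere or not provide it. The name `sum_le32toh_stream` is hypothetical.

```c
#define _DEFAULT_SOURCE   /* asks glibc to expose le32toh under strict -std= modes */
#include <endian.h>       /* le32toh: glibc/Linux, not ISO C */
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: sum every complete little-endian 32-bit integer
   readable from fp. A trailing fragment shorter than 4 bytes is ignored. */
static uint64_t sum_le32toh_stream(FILE *fp)
{
    uint32_t raw;
    uint64_t sum = 0;

    while (fread(&raw, sizeof raw, 1, fp) == 1)
        sum += le32toh(raw);   /* no-op on little-endian hosts, byte swap on big-endian */
    return sum;
}
```

Because the conversion is chosen by the library for the host it was built for, the same source works whether or not the reader is little-endian.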

Kylo
5

From CodeGuru:

inline void endian_swap(unsigned int& x)
{
    x = (x>>24) | 
        ((x<<8) & 0x00FF0000) |
        ((x>>8) & 0x0000FF00) |
        (x<<24);
}

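So, you can read directly into an unsigned int and then call this. The swap is only needed when the machine reading the file is big-endian; on a little-endian machine the bytes are already in the right order.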

unsigned int num;   // assumed to be 32 bits wide
while ( fread(&num, 4, 1, fp) == 1) {
    endian_swap(num);
    // conversion done; then use num
}
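
A minimal C sketch of making that swap conditional on the host's byte order, so the same code also works when the reader is already little-endian. `host_is_little_endian`, `swap32` and `le32_to_host` are hypothetical names, not taken from CodeGuru, and the detection assumes the host is either plain little-endian or plain big-endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: 1 if the least significant byte is stored first. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Same byte reversal as endian_swap above, written as a plain C function. */
static uint32_t swap32(uint32_t x)
{
    return (x >> 24)
         | ((x << 8) & 0x00FF0000u)
         | ((x >> 8) & 0x0000FF00u)
         | (x << 24);
}

/* Hypothetical helper: convert a value whose bytes were read verbatim from a
   little-endian file into host order. */
static uint32_t le32_to_host(uint32_t raw)
{
    return host_is_little_endian() ? raw : swap32(raw);
}
```

With this, the loop body becomes `num = le32_to_host(num);` for a `uint32_t num` filled by `fread`.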
Reunanen