83

I've just been doing some exercises with gzip in Python.

import gzip
f=gzip.open('Onlyfinnaly.log.gz','rb')
file_content=f.read()
print file_content

And I get no output on the screen. As a Python beginner, I'm wondering what I should do to read the content of the file inside the gzip file. Thank you.

Jon Clements
Michael
  • 6
    Try `print open('Onlyfinnaly.log.gz', 'rb').read().decode('zlib')`. If that doesn't work, can you confirm that the file contains something? – Blender Oct 15 '12 at 19:23
  • Yeah, I'm totally sure there is a file whose name is 'Onlyfinally.log'. What I'm trying to do is read the content and select some of it to store in another file. But it only prints a blank line on the screen. – Michael Oct 15 '12 at 19:48
  • 1
    Your code looks correct, but be aware that you are reading the entire file into a string. A more efficient way is usually to read the gzip stream in chunks and process them one at a time. – Krumelur Oct 16 '12 at 10:50
  • 1
    One of these has a typo. Your q has Onlyfinnaly and your comment has Onlyfinally. The code is otherwise right. – Himanshu Oct 16 '12 at 11:30
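To illustrate the chunked reading suggested in the comments, here is a minimal sketch. The helper name `count_decompressed_bytes`, the chunk size, and the throwaway demo file are assumptions made for illustration, not part of the original question.

```python
import gzip
import os
import tempfile


def count_decompressed_bytes(path, chunk_size=64 * 1024):
    """Decompress `path` one chunk at a time and return the total size."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of stream
                break
            total += len(chunk)  # replace with real per-chunk processing
    return total


# Demo on a throwaway file (hypothetical name) so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.log.gz")
with gzip.open(demo_path, "wb") as f:
    f.write(b"line of log text\n" * 1000)

total = count_decompressed_bytes(demo_path, chunk_size=4096)
print(total)
```

Only one chunk is held in memory at a time, which matters for large logs. A result of 0 for a real file would suggest that the archive itself holds no data.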

4 Answers

89

Try gzipping some data through the gzip library like this...

import gzip
content = b"Lots of content here"  # bytes, because the file is opened in binary mode ('wb')
f = gzip.open('Onlyfinnaly.log.gz', 'wb')
f.write(content)
f.close()

... then run your code as posted ...

import gzip
f=gzip.open('Onlyfinnaly.log.gz','rb')
file_content=f.read()
print file_content

This approach worked for me; for some reason, the gzip library fails to read some files.

Matt Olan
  • 10
    It's slightly preferable to use `with` like in @Arunava's answer, because the file will be closed even if an error occurs while reading (or you forget about it). As a bonus it's also shorter. – Mark Jan 21 '17 at 20:46
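A minimal sketch of the same write-then-read round trip using `with`, as the comment above suggests. The temporary path is a hypothetical stand-in chosen so the example does not overwrite a real log file.

```python
import gzip
import os
import tempfile

# Hypothetical throwaway path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.log.gz")

# The file is closed when the block exits, even if write() raises.
with gzip.open(path, "wb") as f:
    f.write(b"Lots of content here")

# Same guarantee on the reading side.
with gzip.open(path, "rb") as f:
    file_content = f.read()

print(file_content)
```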
58

python: read lines from compressed text files

Using gzip.open, which returns a gzip.GzipFile object in binary mode:

import gzip

# 'r' means binary mode here, so each line is a bytes object
with gzip.open('input.gz','r') as fin:
    for line in fin:
        print('got line', line)
vinzee
Arunava Ghosh
  • TIL: The mode argument gzip.open can be any of 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x' or 'xb' for binary mode, or 'rt', 'at', 'wt', or 'xt' for text mode. The default is 'rb'. https://docs.python.org/3/library/gzip.html – Trutane Feb 10 '22 at 20:58
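To make the difference between binary and text modes concrete, here is a small sketch under Python 3. The file name and the UTF-8 encoding are assumptions for illustration.

```python
import gzip
import os
import tempfile

# Hypothetical throwaway path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes.txt.gz")

# Text mode for writing: accepts str and encodes it before compressing.
with gzip.open(path, "wt", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

# Binary mode (the default 'rb'): read() returns bytes.
with gzip.open(path, "rb") as f:
    raw = f.read()

# Text mode: read() returns str, decoded with the given encoding.
with gzip.open(path, "rt", encoding="utf-8") as f:
    text = f.read()

print(type(raw).__name__, type(text).__name__)
```

In binary mode you would call `raw.decode("utf-8")` yourself to get a string; text mode does that step for you.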
4

If you want to read the contents into a string, then open the file in text mode (mode="rt"):

import gzip

with gzip.open("Onlyfinnaly.log.gz", mode="rt") as f:
    file_content = f.read()
    print(file_content)
Michael Hall
1

For a Parquet file, use pandas to read it:

import pandas as pd

data = pd.read_parquet("file.parquet.gzip")
data.head()
Fan Yang