79

I have a problem saving an image from a URL in Python, using either a urllib2 request or urllib.urlretrieve. The image URL is valid, and I can download the image manually through the browser. However, when I download it with Python, the resulting file cannot be opened. I use Preview on Mac OS to view the image. Thank you!

UPDATE:

The code is as follows:

def downloadImage(self):
    request = urllib2.Request(self.url)
    pic = urllib2.urlopen(request)
    print "downloading: " + self.url
    print self.fileName
    filePath = localSaveRoot + self.catalog  + self.fileName + Picture.postfix
    # urllib.urlretrieve(self.url, filePath)
    with open(filePath, 'wb') as localFile:
        localFile.write(pic.read())

The image URL that I want to download is http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg

This URL is valid and I can save the image through the browser, but the Python code downloads a file that cannot be opened. Preview says "It may be damaged or use a file format that Preview doesn't recognize." I compared the file downloaded by Python with the one downloaded manually through the browser: the former is several bytes smaller. So the file seems to be incomplete, but I don't know why Python cannot download it completely.
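
One way to test the suspicion that the saved file is incomplete is to look at its first and last bytes: a complete JPEG normally starts with the bytes FF D8 and ends with FF D9. The following is a minimal sketch under that assumption. The helper names `looks_like_complete_jpeg` and `check_file` are hypothetical, and some valid files carry extra bytes after the end marker, so a negative result is a hint rather than proof.

```python
def looks_like_complete_jpeg(data):
    """Return True if data starts with the JPEG start marker (FF D8)
    and ends with the JPEG end marker (FF D9)."""
    return (
        len(data) >= 4
        and data[:2] == b"\xff\xd8"
        and data[-2:] == b"\xff\xd9"
    )


def check_file(path):
    """Read a saved file from disk and run the marker check on its bytes."""
    with open(path, "rb") as f:
        return looks_like_complete_jpeg(f.read())
```

Calling `check_file` on both the Python download and the browser download would show whether only the former is missing its end marker, or whether it is not a JPEG at all (for example an HTML error page).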

Shaoxiang Su
  • Why can't it be opened? What error do you get? What does ``file `` tell you? Did the file download correctly or were you blocked by ``User-Agent`` or ``Cookie`` restrictions or similar? – James Mills May 14 '15 at 04:31
  • 1
    Include the python code you are trying in the question please – Tom McClure May 14 '15 at 04:32
  • Sorry for the confusion. I have provided more details. Thanks a lot. I wonder if it is because the HTTP request made by Python differs from the one made by a browser, so Python cannot give me a complete image file. – Shaoxiang Su May 14 '15 at 06:50
  • It seems that requests is a much better module than urllib and urllib2 – Shaoxiang Su May 14 '15 at 08:15

8 Answers

147
import requests

img_data = requests.get(image_url).content
with open('image_name.jpg', 'wb') as handler:
    handler.write(img_data)
Vlad Bezden
75

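Sample code that works for me on Windows:

import requests

# pic_url is assumed to be defined elsewhere, as in the original snippet.
response = requests.get(pic_url, stream=True)

if not response.ok:
    # Report the failure and skip writing, so an error page is not saved as an image.
    print(response)
else:
    with open('pic1.jpg', 'wb') as handle:
        # Stream the body in 1 KiB blocks instead of loading it all into memory.
        for block in response.iter_content(1024):
            if not block:
                break

            handle.write(block)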
DeepSpace
  • That's perfect! Thank you so much! I don't know why requests module could complete that while urllib and urllib2 cannot do that, but anyways. – Shaoxiang Su May 14 '15 at 07:24
  • It does not work for the following URL; any idea how to fix it? genome.jp/pathway/ko02024+K07173 – Cleb Oct 17 '21 at 20:03
  • @Cleb That's not an image – Shidouuu Dec 04 '21 at 22:49
  • This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Anyone who knows why? – Parseval Apr 13 '22 at 09:38
18

This is the simplest way to download and save an image from the internet, using the urllib.request package.

Simply pass the image URL (where the image is downloaded from) and the local path (where the image should be saved, including a file name ending in .jpg or .png). The example uses "local-filename.jpg"; replace it with your own name.

Python 3

import urllib.request
imgURL = "http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg"

urllib.request.urlretrieve(imgURL, "D:/abc/image/local-filename.jpg")

You can download multiple images as well if you have all the image URLs. Just iterate over those URLs in a for loop and call urlretrieve for each one.
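
The loop described above could look roughly like the following sketch. The helper names `filename_from_url` and `download_all`, the fallback name `image.jpg`, and the idea of naming each file after the last segment of its URL are assumptions for illustration, not part of the original answer. Two URLs that end in the same file name would overwrite each other.

```python
import os
import urllib.request
from urllib.parse import urlsplit


def filename_from_url(url, default="image.jpg"):
    """Use the last path segment of the URL as the local file name.

    Falls back to `default` (a hypothetical placeholder name) when the
    URL path does not end in a file name.
    """
    name = os.path.basename(urlsplit(url).path)
    return name or default


def download_all(urls, target_dir):
    """Download every URL in `urls` into `target_dir`; return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(target_dir, filename_from_url(url))
        urllib.request.urlretrieve(url, path)
        saved.append(path)
    return saved
```

For example, `download_all([imgURL], "D:/abc/image")` would save the image from the answer under its original file name in that folder.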

Ankit Lad
  • I tried this but I get an error: HTTPError: Forbidden. Do you know why this is? I'm using this URL: http://assets.ellosgroup.com/i/ellos/ell_1682670-01_Fs. – Parseval Apr 13 '22 at 09:46
12

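Python code snippet to download a file from a URL and save it under its own name:

import requests

url = 'http://google.com/favicon.ico'
# Use the last path segment of the URL as the local file name.
filename = url.split('/')[-1]
r = requests.get(url, allow_redirects=True)
# The with block closes the file once the content is written.
with open(filename, 'wb') as f:
    f.write(r.content)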
Basil Jose
4
import random
import urllib.request

def download_image(url):
    name = random.randrange(1,100)
    fullname = str(name)+".jpg"
    urllib.request.urlretrieve(url,fullname)     
download_image("http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg")
mdaniel
learner
  • 2
    Welcome to Stackoverflow and thanks for your contribution! Could you add an explanation to your answer what the code does and why it works? Thanks! – Max Vollmer Sep 09 '18 at 14:40
  • How do I add the headers for url in urlretrieve? I had a problem with images opening in the browser but not through code using urlretrive. I have tried urlopen but I don't know how to download the image using urlopen. – Eswar Mar 27 '19 at 14:38
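
Regarding the headers question above: urlretrieve has no headers parameter, but urllib.request.Request accepts one, and the response returned by urlopen can be copied to a file. The sketch below is one possible approach. The helper names `build_request` and `download_with_headers` and the example User-Agent value are assumptions, and whether a browser-like header resolves a particular site's blocking is not guaranteed.

```python
import shutil
import urllib.request

# Example header value only; any browser-like string could be used here.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_request(url, headers=None):
    """Create a Request object that carries custom HTTP headers."""
    return urllib.request.Request(url, headers=headers or DEFAULT_HEADERS)


def download_with_headers(url, path, headers=None):
    """Open the URL with the given headers and copy the body to `path`."""
    request = build_request(url, headers)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        # copyfileobj streams the response to disk in chunks.
        shutil.copyfileobj(response, out)
```

A call such as `download_with_headers(url, "picture.jpg")` would then take the place of `urllib.request.urlretrieve(url, "picture.jpg")`.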
1

If you are wondering how to get the image extension, you can split the image URL on dots and take the last piece:

import requests

# img_url and img_name are assumed to be defined elsewhere.
# Take the text after the last dot, e.g. "jpg" in www.bigbasket.com/patanjali-atta.jpg,
# rather than relying on a fixed dot position.
img_ext = '.' + str(img_url).split('.')[-1]
img_data = requests.get(img_url).content
with open(img_name + img_ext, 'wb') as handler:
    handler.write(img_data)
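
Splitting on dots can misfire when the URL has a query string or no extension at all. A slightly more defensive sketch parses the URL first and reads the extension from its path only. The helper name `extension_from_url` and the `.jpg` fallback are assumptions for illustration.

```python
import os
from urllib.parse import urlsplit


def extension_from_url(url, default=".jpg"):
    """Return the lower-cased file extension of the URL path, including the dot.

    Falls back to `default` (a hypothetical choice) when the path has no
    extension.
    """
    ext = os.path.splitext(urlsplit(url).path)[1]
    return ext.lower() or default
```

With this helper, `img_name + extension_from_url(img_url)` gives the local file name used in the snippet above.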
Ssubrat Rrudra
1

Download and save an image to a directory:

import requests

headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
           "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
           "Accept-Language": "en-US,en;q=0.9"
           }

img_data = requests.get(url=image_url, headers=headers).content
with open(create_dir() + "/" + 'image_name' + '.png', 'wb') as handler:
    handler.write(img_data)

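For creating the directory:

import os

def create_dir():
    # Directory name
    dir_ = "CountryFlags"
    # Parent directory: the folder that contains this script
    parent_dir = os.path.dirname(os.path.realpath(__file__))
    # Full path of the directory to create
    path = os.path.join(parent_dir, dir_)
    # exist_ok avoids an error when the directory already exists
    os.makedirs(path, exist_ok=True)
    return path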
zaheer
0

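On Linux, you can also call the wget command:

import subprocess

url1 = 'YOUR_URL_WHATEVER'
# Passing the arguments as a list avoids shell quoting problems with special characters in the URL.
subprocess.run(['wget', url1], check=True)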
Vicrobot
  • That gives me an empty image for the following URL: https://www.genome.jp/pathway/ko02024+K07173 Any idea how to fix this? – Cleb Oct 17 '21 at 19:51
  • @Cleb That's because the url you provided doesn't belong to an image. Try it with ```url1 = 'https://www.genome.jp/tmp/mark_pathway1641220140108369/ko02024.png'``` in this case – RAZ0229 Jan 03 '22 at 14:32