64

I'm trying to GET a URL of the following format using requests.get() in Python:

http://api.example.com/export/?format=json&key=site:dummy+type:example+group:wheel

#!/usr/local/bin/python

import requests

print(requests.__version__)
url = 'http://api.example.com/export/'
payload = {'format': 'json', 'key': 'site:dummy+type:example+group:wheel'}
r = requests.get(url, params=payload)
print(r.url)

However, the URL gets percent-encoded and I don't get the expected response:

2.2.1
http://api.example.com/export/?key=site%3Adummy%2Btype%3Aexample%2Bgroup%3Awheel&format=json

This works if I pass the URL directly:

url = 'http://api.example.com/export/?format=json&key=site:dummy+type:example+group:wheel'
r = requests.get(url)

Is there some way to pass the parameters in their original form, without percent-encoding?

Thanks!

Satyen Rai
  • 1
    It is a [standard](http://en.wikipedia.org/wiki/Percent-encoding). What is wrong with it? – alecxe May 06 '14 at 13:58
  • 5
    @alecxe: The site I'm querying doesn't seem to work with percent encoded URLs and I get unexpected response. – Satyen Rai May 06 '14 at 14:18
  • 3
    I had this problem with the Google Maps API and the comma in `location=43.585278,39.720278`, and I didn't find a solution. – furas May 06 '14 at 14:28

6 Answers

74

It is not a good solution, but you can pass the query string directly:

r = requests.get(url, params='format=json&key=site:dummy+type:example+group:wheel')

BTW:

Code that converts the payload to this string:

payload = {
    'format': 'json', 
    'key': 'site:dummy+type:example+group:wheel'
}

payload_str = "&".join("%s=%s" % (k,v) for k,v in payload.items())
# 'format=json&key=site:dummy+type:example+group:wheel'

r = requests.get(url, params=payload_str)

EDIT (2020):

You can also use urllib.parse.urlencode(...) with the parameter safe=':+' to build the query string without percent-encoding the characters : and +.

As far as I know, requests also uses urllib.parse.urlencode(...) for this, but without safe=.

import requests
import urllib.parse

payload = {
    'format': 'json', 
    'key': 'site:dummy+type:example+group:wheel'
}

payload_str = urllib.parse.urlencode(payload, safe=':+')
# 'format=json&key=site:dummy+type:example+group:wheel'

url = 'https://httpbin.org/get'

r = requests.get(url, params=payload_str)

print(r.text)

I used the page https://httpbin.org/get to test it.
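
To see what safe= changes without sending a request, the two encodings can be compared offline. This is a minimal sketch using only the standard library; it assumes Python 3.7+ so that the dictionary keeps its insertion order, and it only illustrates urlencode itself, not what requests does internally.

```python
from urllib.parse import urlencode

payload = {
    'format': 'json',
    'key': 'site:dummy+type:example+group:wheel',
}

# Default behaviour: ':' and '+' are percent-encoded.
encoded = urlencode(payload)

# With safe=':+', those two characters are left untouched.
raw = urlencode(payload, safe=':+')

print(encoded)  # format=json&key=site%3Adummy%2Btype%3Aexample%2Bgroup%3Awheel
print(raw)      # format=json&key=site:dummy+type:example+group:wheel
```

Note that with '+' marked as safe, a literal plus sign and an encoded space both appear as + in the output, so this only suits values that contain no spaces.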

furas
  • Thanks, That's what I'm currently doing to make it work. I'm looking for a solution similar to the (obsolete) one described [here](https://groups.google.com/forum/#!topic/scraperwiki/spEFwwzxrQA). Thanks anyway! – Satyen Rai May 06 '14 at 15:18
  • I was looking for a better solution (similar to the obsolete one) in the requests source code, but I didn't find one. – furas May 06 '15 at 15:28
  • 1
    worked for me. seemingly not great, but gets the job done. i thought there might be some easier solution by adjusting the encoding within the `requests` object. – ryantuck Feb 18 '15 at 22:30
  • I use "%XX" where XX are hex digits. Sending strings for params works until I try to send something larger than 2F, at which point I get an "Invalid control character" error – retsigam Aug 15 '18 at 22:19
  • `urllib.parse.urlencode` is not ignoring curly braces during parsing. `self.response = requests.get(SteamQuery.queries[self.query_type], params=urllib.parse.urlencode(self.query_params,safe=":{}[]"))` `input_json=%7Bappids_filter:[892970]%7D` – user1023102 Feb 26 '21 at 03:10
13

The solution, as designed, is to pass the URL directly.

Kenneth Reitz
  • 1
    The idea behind using the payload dictionary is to keep the actual code somewhat cleaner, as suggested [here](http://docs.python-requests.org/en/latest/user/quickstart/#passing-parameters-in-urls). – Satyen Rai May 06 '14 at 15:12
  • 8
    I found this old comment by @Darkstar kind of funny as the answer he's responding to is by the author of `requests`. – Dustin Wyatt Jul 14 '16 at 16:53
  • @DustinWyatt Wow! I don't know how I missed that! – Satyen Rai Jul 14 '16 at 16:57
  • 1
    This is the most straightforward and verified working solution. Ditch the payload dictionary and slap all those parameters right into the url. – Rakaim Oct 22 '20 at 03:12
  • 2
    No, this will not work; the latest version of `requests` will encode the characters even if you pass the URL directly. – oeter Nov 25 '21 at 12:36
13

In case someone else comes across this in the future, you can subclass requests.Session, override the send method, and alter the raw url, to fix percent encodings and the like. Corrections to the below are welcome.

import requests
import urllib.parse  # "import urllib" alone does not guarantee urllib.parse is loaded

class NoQuotedCommasSession(requests.Session):
    def send(self, *a, **kw):
        # a[0] is prepared request
        a[0].url = a[0].url.replace(urllib.parse.quote(","), ",")
        return requests.Session.send(self, *a, **kw)

s = NoQuotedCommasSession()
s.get("http://somesite.com/an,url,with,commas,that,won't,be,encoded.")
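
The replace-based idea used in send() can be tried on a plain string first, without a session. The helper below is a hypothetical illustration (restore_chars is not part of requests or urllib). It only matches uppercase escapes such as %2C, which is the form urllib.parse.quote produces, and it would also rewrite escapes that were meant to stay encoded.

```python
from urllib.parse import quote

def restore_chars(url, chars=",:+"):
    """Replace the percent-encoded form of each character in `chars`
    with the literal character (hypothetical helper)."""
    for ch in chars:
        # quote(ch, safe="") yields the escape, e.g. "%2C" for ","
        url = url.replace(quote(ch, safe=""), ch)
    return url

encoded_url = ("http://api.example.com/export/"
               "?key=site%3Adummy%2Btype%3Aexample%2Bgroup%3Awheel&format=json")
restored_url = restore_chars(encoded_url)
print(restored_url)
```

The same function could be called on the prepared request's url inside an overridden send(), in place of the single replace() for commas.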
Tim Ludwinski
  • I know this wasn't in the OP's question but this doesn't work for the path portion of the URL (at the time of this comment). – Tim Ludwinski Apr 12 '21 at 20:20
  • 1
    In modern versions of requests, you actually also are gonna have to patch `urllib3`; it performs its own encoding. `requests.urllib3.util.url.PATH_CHARS.add(',')`. This starts to get into "more hacky than it's probably worth" territory, but if you _REALLY_ need it... here it is – ollien Nov 05 '21 at 15:31
11

The answers above didn't work for me.

I was trying to make a GET request where the parameter contained a pipe, but Python requests would also percent-encode the pipe. So instead I used urlopen:

# python3
from urllib.request import urlopen

base_url = 'http://www.example.com/search?'
query = 'date_range=2017-01-01|2017-03-01'
url = base_url + query

response = urlopen(url)
data = response.read()
# response data valid

print(response.url)
# output: 'http://www.example.com/search?date_range=2017-01-01|2017-03-01'
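
If the query is easier to keep as a dictionary, the URL for urlopen can still be assembled without encoding the pipe. This is a minimal sketch using the same example.com address as above; it only builds and inspects the URL and does not send a request, and whether a given server accepts the raw | is not guaranteed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = 'http://www.example.com/search'
params = {'date_range': '2017-01-01|2017-03-01'}

# safe='|' tells urlencode to leave the pipe character as-is.
query = urlencode(params, safe='|')
url = f'{base_url}?{query}'
print(url)

# Parsing the URL back recovers the original value.
parsed = parse_qs(urlsplit(url).query)
print(parsed)
```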
kujosHeist
1

None of the solutions above seem to work anymore as of requests version 2.26. The solution suggested in the GitHub repo seems to be a workaround using a PreparedRequest.

The following worked for me. Make sure the URL is resolvable, so don't use 'this-is-not-a-domain.com'.

import requests

base_url = 'https://www.example.com/search'
query = '?format=json&key=site:dummy+type:example+group:wheel'

s = requests.Session()
req = requests.Request('GET', base_url)
p = req.prepare()
p.url += query
resp = s.send(p)
print(resp.request.url)

Source: https://github.com/psf/requests/issues/5964#issuecomment-949013046

LGG
0

Please have a look at the first option in this GitHub link. You can ignore the urllib part, which means using prep.url = url instead of prep.url = url + qry.

Sandeep Kanabar