I am trying to execute the following code, but seloger certainly blocks the execution of the script.
https://github.com/edouardmulliez/scraper-seloger/blob/master/seloger_scraper.py
I am looking to integrate headers to get around the issue, as described here
How to use Python requests to fake a browser visit?
however I did not succeed in getting it working. Here is my code below for the while loop.
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
import re
import os
import matplotlib.pyplot as plt
import seaborn as sns
locality = []
price = []
room_nb = []
bedroom_nb = []
surface = []
acc_type = []
while (page <= max_pages):
url = ("http://www.seloger.com/list.htm?org=advanced_search&idtt=2&idtypebien=2,1&cp=75&tri=initial&LISTING-LISTpg=" +
str(page) + "&naturebien=1,2,4")
page += 1
try:
r = s.get(url, headers=headers)
except:
# Stop visiting pages
break