
I'm trying to write a script that scrapes data from IPFS by file number. I have a local folder containing 2222 files, numbered 1 to 2222. For each one, I want to fetch the HTML from the correspondingly numbered file in an IPFS-hosted directory and write that content into my local file. I'm new to Python, so any help would be appreciated. Here is the script so far, which requires me to change the number by hand each time:

import urllib.request

def extractHTML(url):
    # page.read() returns bytes, so the file is opened in binary mode;
    # str(page.read()) would write the "b'...'" repr, not the real content.
    with open('3', 'wb') as f, urllib.request.urlopen(url) as page:
        f.write(page.read())

extractHTML('https://gateway.pinata.cloud/ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/3')

For example, I would like the script to replace the contents of local file 4 with the contents of /ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/4, file 5 with the contents of /ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW/5, and so on, preferably in one loop that increments the number.
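
For reference, the loop described above might look roughly like the sketch below. It is only one possible shape, not a tested solution against the real gateway. The names `BASE_URL`, `fetch_bytes`, and `mirror_numbered_files` are made up for illustration, and it assumes the local files sit in a single folder and are named exactly `1` to `2222` with no extension. A failed download is skipped and reported rather than stopping the run, which is also an assumption about the desired behaviour.

```python
import urllib.request
from pathlib import Path

# Assumed base URL: the gateway directory from the question, without the number.
BASE_URL = "https://gateway.pinata.cloud/ipfs/QmQFkLSQysj94s5GvTHPyzTxrawwtjgiiYS2TBLgrvw8CW"


def fetch_bytes(url):
    """Download a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url) as page:
        return page.read()


def mirror_numbered_files(base_url, folder, first, last, fetch=fetch_bytes):
    """Overwrite folder/<n> with the body of base_url/<n> for n in first..last.

    Returns the list of numbers whose download failed.
    """
    folder = Path(folder)
    failed = []
    for n in range(first, last + 1):
        try:
            data = fetch(f"{base_url}/{n}")
        except OSError as err:  # urllib.error.URLError is a subclass of OSError
            print(f"skipping {n}: {err}")
            failed.append(n)
            continue
        # Write the bytes unchanged so the local file matches the remote one.
        (folder / str(n)).write_bytes(data)
    return failed
```

Running `mirror_numbered_files(BASE_URL, ".", 1, 2222)` from inside the folder would then walk through every file number. The `fetch` parameter exists only so the loop can be tried with a stand-in function before touching the network. Public gateways may throttle 2222 back-to-back requests, so a short `time.sleep` between iterations could be worth adding.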

Robert
