I'm using Python 3.10.2 and Firefox 96.0.3, and I'm logged into Goodreads (GR), which had public APIs until Dec 2020. I want to save the pages of my Group bookshelf, preferably naming each file with a fragment of the file name the browser would generate for it. GR doesn't have a bulk download option for Group bookshelves.
So I used the second major response in the thread "How to scrape a website which requires login using python and beautifulsoup?" to scrape the pages, and it works great! https://curlconverter.com/ seems really useful!
Now I'm stuck on getting the same file name I would get if I did a "Save File As..." on the page. The file name format is: shelf for GROUPNAME Showing 1-30 of 3,245 Goodreads.html. The key information in the generated file name is the range (and order) of books on the page out of the total number of books on the shelf.
All the posts I've read are about saving an attachment or using urlparse. I checked the url and headers attributes of the response returned by requests.get, and I didn't recognize anything in the curl info that seems to help either:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:96.0) Gecko/20100101 Firefox/96.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Cache-Control': 'max-age=0',
    'If-None-Match': 'W/f6fb6e4c8c29c186b031d887e0f7db44',
}
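For context, the "saving an attachment" posts usually rely on a Content-Disposition header in the response (not in the request headers above). Here is a minimal sketch of that check, assuming the response headers are available as a dict-like object such as the headers attribute of a requests response. The helper name server_suggested_filename is hypothetical. For an ordinary HTML page the header is typically absent, so the function returns None.

```python
from email.message import Message


def server_suggested_filename(response_headers):
    """Return the file name from a Content-Disposition header, or None.

    response_headers is any mapping of header names to values
    (hypothetical helper, not part of requests).
    """
    value = None
    # Header names are case-insensitive, so compare in lowercase.
    for key, header_value in response_headers.items():
        if key.lower() == "content-disposition":
            value = header_value
            break
    if not value:
        return None
    # Let the stdlib parse the header parameters for us.
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()
```

If this returns None for the shelf page, the server is not supplying a file name, which is consistent with the name being built on the browser side.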
So it seems like the file name is being generated at run time, but how do I get it? With requests, with the Firefox web developer tools, or some other way?
Thanks!
UPDATE: I found a fragment of what ends up being the file name in a metadata tag, so for my program I'm good to go. But I'd still like to know how to get a file name that is generated at run time. I've modified my question accordingly.
FINAL UPDATE: Anon Coward gave about as much of an answer as possible. I eventually found a fragment (seed) of the file name in one of the many metadata tags, so my own program is resolved. But I'd still like to know whether there's a way to get a file name that is generated at run time.
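For anyone with the same problem, here is a minimal sketch of the metadata approach. It assumes the seed lives in the page's title element (the exact tag isn't named above) and that the browser's default name is roughly the title with characters that are illegal in file names removed. The names TitleParser and filename_from_title and the sample HTML are hypothetical, and only the standard library is used.

```python
import re
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def filename_from_title(html, extension=".html"):
    """Build a file name from the page title (an approximation of 'Save As')."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = "".join(parser.parts)
    # Replace characters that are not allowed in Windows file names.
    safe = re.sub(r'[\\/:*?"<>|]', " ", title)
    # Collapse runs of whitespace left behind by the replacement.
    safe = " ".join(safe.split())
    return (safe or "untitled") + extension


# Made-up page source; a real page would come from the scraped response text.
sample_html = (
    "<html><head><title>shelf for GROUPNAME Showing 1-30 of 3,245 | Goodreads"
    "</title></head><body></body></html>"
)
suggested_name = filename_from_title(sample_html)
```

The exact sanitizing rules a browser applies may differ, so treat the result as a close match rather than a guaranteed copy of the "Save File As..." name.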