
Some of a website's pages have source code that includes <img src="url_to_image">, but the page never requests the image. Is there a way to crawl the website and programmatically determine which images are in the code but not requested?

I'm reasonably familiar with Screaming Frog and Python+Selenium but I don't know if I can make either do that job.

[edit]

The images are in lazy-load code that is in turn inside a div set to display:none. Am I correct in thinking that in those circumstances the browser will not request the images?

Lag

1 Answer


Since you want to do this programmatically rather than as a one-off, one idea would be to load the page in Selenium and collect a list of all image files that were requested while the page was loading. Then parse the source HTML of the same page for img tags with a src attribute, and compare the two lists.

The images you're looking for would be part of the second list, but not part of the first.
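As a rough sketch of that comparison in Python: the helper names below (ImgSrcCollector, img_urls_in_source, requested_urls, unrequested_images) are made up for illustration. It assumes the browser's Resource Timing API (performance.getEntriesByType) is an acceptable way to list what was requested, which may miss entries on very heavy pages because the browser caps that buffer. It also only looks at plain src attributes, so lazy-load variants such as data-src would need their own handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in the markup."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def img_urls_in_source(html, base_url):
    """Second list: absolute URLs of all <img src> values found in the HTML."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return {urljoin(base_url, src) for src in collector.sources}


def requested_urls(driver):
    """First list: URLs the browser fetched, read via the Resource Timing API.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the page with driver.get(...).
    """
    script = "return performance.getEntriesByType('resource').map(e => e.name);"
    return set(driver.execute_script(script))


def unrequested_images(html, base_url, requested):
    """Images present in the markup but absent from the requested set."""
    return img_urls_in_source(html, base_url) - set(requested)
```

With Selenium this could be used roughly as: call driver.get(url), then unrequested_images(driver.page_source, url, requested_urls(driver)). Note that driver.page_source is the browser's serialised DOM rather than the raw server response, so if the two differ on your site you may prefer to fetch the raw HTML separately.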

Maximillian Laumeister