I'm trying to scrape a very long Spotify playlist (10,000 songs). When I tried Beautiful Soup, I could find only 100 links. I learned that some websites load their content dynamically, so I switched to Selenium.

The problem is that I still don't understand how to capture the changing HTML: the page source changes as I scroll, so that it only ever contains the part of the playlist currently shown. How do I get all 10,000 song links?

from selenium import webdriver
from selenium.webdriver.common.by import By

BIG_PLAYLIST_URL = 'https://open.spotify.com/playlist/2owshwciXFj8pnZElRlJqG?si=cf026ba736f64271'

driver = webdriver.Chrome()
driver.get(BIG_PLAYLIST_URL)
# Absolute XPath to the element I am inspecting
xpath = '//*[@id="main"]/div/div[2]/div[3]/div[1]/div[2]/div[2]/div/div/div[2]/main/div/section/div[2]/div[3]/div/div[2]/div[2]/div[12]'

# Count the <div> descendants that are in the DOM right now
mainDiv = driver.find_element(by=By.XPATH, value=xpath)
divs = mainDiv.find_elements(by=By.CSS_SELECTOR, value="div")
print(len(divs))
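
One direction that might work, sketched under assumptions rather than tested against Spotify: since only the rows currently on screen exist in the DOM, read the visible links, scroll a little, read again, and accumulate everything in a de-duplicated collection until several rounds in a row bring nothing new. The helper names `collect_while_scrolling` and `scrape_track_links`, the selector `a[href*="/track/"]`, and the pause and idle-round values are all hypothetical choices for illustration; the real page structure may differ.

```python
import time
from typing import Callable, Iterable, List


def collect_while_scrolling(
    read_visible: Callable[[], Iterable[str]],
    scroll_step: Callable[[], None],
    max_idle_rounds: int = 5,
    pause: float = 0.0,
) -> List[str]:
    """Accumulate items from a list that only renders its visible rows.

    Stops after `max_idle_rounds` consecutive rounds with no new item.
    Returns the items in first-seen order, without duplicates.
    """
    seen = {}  # dict preserves insertion order and removes duplicates
    idle = 0
    while idle < max_idle_rounds:
        before = len(seen)
        for item in read_visible():
            if item:  # skip None or empty values
                seen.setdefault(item, None)
        idle = idle + 1 if len(seen) == before else 0
        scroll_step()
        time.sleep(pause)  # give the page time to render the next rows
    return list(seen)


def scrape_track_links(url: str) -> List[str]:
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # ASSUMPTION: each track row contains a link whose href includes "/track/".
    selector = 'a[href*="/track/"]'
    driver = webdriver.Chrome()
    try:
        driver.get(url)

        def read_visible():
            links = driver.find_elements(By.CSS_SELECTOR, selector)
            return [a.get_attribute("href") for a in links]

        def scroll_step():
            links = driver.find_elements(By.CSS_SELECTOR, selector)
            if links:
                # Scroll the last rendered row into view to trigger the next batch.
                driver.execute_script("arguments[0].scrollIntoView();", links[-1])

        return collect_while_scrolling(read_visible, scroll_step, pause=1.0)
    finally:
        driver.quit()
```

Calling `scrape_track_links(BIG_PLAYLIST_URL)` would then return the accumulated links. Two caveats: a track that appears twice in the playlist is counted once, and reading attributes right after a scroll can raise a stale-element error if the page re-renders in between, so a longer pause or a retry may be needed.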

Shy Cohen
  • Isn't it possible to get this list using the [Spotify API](https://developer.spotify.com/documentation/web-api/)? – alex May 23 '22 at 09:03
  • Does this answer your question? [How can I scroll a web page using selenium webdriver in python?](https://stackoverflow.com/questions/20986631/how-can-i-scroll-a-web-page-using-selenium-webdriver-in-python) – alex May 23 '22 at 09:05
  • I'm doing a school project and I'm trying to learn scraping, so I can't use the API – Shy Cohen May 23 '22 at 09:06
  • Did the link given by alex help you solve the problem? – sound wave May 23 '22 at 20:01
