Hey guys, does BeautifulSoup strip CSS and JavaScript content? After using

content3 = ''.join(BeautifulSoup(content).findAll(text=True))

I still have them lingering around.

goh

1 Answer

What exactly do you want to strip, all script and style elements? It should be something like:

''.join(BeautifulSoup(content).findAll(
    text=lambda text: text.parent.name not in ("script", "style")))
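
If you would rather see the same filtering idea without BeautifulSoup, here is a rough sketch using only Python 3's standard-library `html.parser`. This is a swapped-in technique, not BeautifulSoup's API, and the names `VisibleTextExtractor` and `visible_text` are made up for illustration. It assumes reasonably well-formed markup where every `<script>` and `<style>` element is closed.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def visible_text(html):
    """Return the concatenated text of `html` without script/style content."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

You would call it as `content3 = visible_text(content)`. Like the one-liner above, it just joins the remaining text nodes, so whitespace between elements is kept as-is.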
Matthew Flaschen
  • That's right. A regex replace could probably do that, but I was wondering whether BeautifulSoup handles that. Or could the "simple version of webstemmer" do that too? – goh Jun 09 '10 at 01:42