Puppeteer: get full HTML content of a webpage, like innerHTML, but including any shadow roots?

Asked Jan 21 '21 at 11:05

When browsing a page in Puppeteer, I can usually get the full HTML content as text like this:

const content = await page.evaluate(
  () => document.querySelector('body').innerHTML);

However, I'm currently dealing with a page that has multiple nested shadow roots. So I assume I'll have to traverse the entire DOM, check each node for a `.shadowRoot`, and traverse those DOMs separately.
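A minimal sketch of that traversal, assuming every shadow root is open. The name `serializeWithShadow` is hypothetical, and wrapping shadow content in `<template shadowrootmode="open">` is just one illustrative way to mark it in the output. `<template>` element contents are not handled.

```javascript
// Hypothetical helper (not part of Puppeteer): serializes a node and its
// descendants to an HTML string, descending into open shadow roots.
// It is self-contained so that it can be passed to page.evaluate().
function serializeWithShadow(root) {
  const ELEMENT = 1, TEXT = 3, COMMENT = 8;
  const VOID = new Set(['area', 'base', 'br', 'col', 'embed', 'hr', 'img',
    'input', 'link', 'meta', 'source', 'track', 'wbr']);
  const RAW_TEXT = new Set(['script', 'style']);
  const escapeText = (s) =>
    s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
  const escapeAttr = (s) => escapeText(s).replace(/"/g, '&quot;');

  function walk(node, rawText) {
    if (node.nodeType === TEXT) {
      // Text inside <script>/<style> is emitted unescaped, like innerHTML does.
      return rawText ? node.nodeValue : escapeText(node.nodeValue);
    }
    if (node.nodeType === COMMENT) return `<!--${node.nodeValue}-->`;

    const children = Array.from(node.childNodes || [])
      .map((child) => walk(child, RAW_TEXT.has(node.localName)))
      .join('');
    // Documents, fragments and shadow roots have no tag of their own.
    if (node.nodeType !== ELEMENT) return children;

    const tag = node.localName;
    const attrs = Array.from(node.attributes || [])
      .map((a) => ` ${a.name}="${escapeAttr(a.value)}"`)
      .join('');
    if (VOID.has(tag)) return `<${tag}${attrs}>`;

    // Assumption: closed roots are invisible here (.shadowRoot is null).
    const shadow = node.shadowRoot
      ? `<template shadowrootmode="open">${walk(node.shadowRoot, false)}</template>`
      : '';
    return `<${tag}${attrs}>${shadow}${children}</${tag}>`;
  }

  return walk(root, false);
}
```

With Puppeteer this could be invoked as `const html = await page.evaluate(serializeWithShadow, await page.$('body'));`. Unlike `innerHTML`, the result includes the `<body>` tag itself, and closed shadow roots are skipped because `.shadowRoot` returns `null` for them.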

Is there a shortcut or simpler way to do this? Something like an innerHTML variant that also includes any shadow root DOMs?

asked Jan 21 '21 at 11:05

RocketNuts


just FYI: if `attachShadow({mode: 'closed'})` was used, `.shadowRoot` won't work either – Andrea Giammarchi Jan 21 '21 at 11:14
@AndreaGiammarchi Thanks, don't know yet if that's the case in my particular situation. Actually this whole shadowRoot business is fairly new to me. But in case `mode:'closed'` is used, is there another way to get the HTML content? When I create a screenshot from Puppeteer, all content is there. So one way or another it must have the corresponding DOM objects in there. – RocketNuts Jan 21 '21 at 11:23
devtools *has* that privilege, but I don't know if it's exported/available in Puppeteer – Andrea Giammarchi Jan 21 '21 at 11:30
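
One avenue that may be worth exploring, following the DevTools remark above: Puppeteer can open a Chrome DevTools Protocol session, and the protocol's `DOM.getDocument` command accepts a `pierce` flag that asks for iframes and shadow roots to be traversed. As far as I know, the returned nodes carry a `shadowRoots` array whose entries have a `shadowRootType` of `open`, `closed` or `user-agent`, but whether this covers a particular page is an assumption to verify. The helper names `listShadowRoots` and `shadowRootsOfPage` are hypothetical, and iframes are ignored in this sketch.

```javascript
// Hypothetical helper: walks a node tree shaped like the result of the
// DevTools Protocol command DOM.getDocument and lists every shadow root,
// together with a slash-separated path to its host element.
function listShadowRoots(node, path = node.nodeName, found = []) {
  for (const shadow of node.shadowRoots || []) {
    found.push({ host: path, type: shadow.shadowRootType });
    listShadowRoots(shadow, `${path}/#shadow-root`, found);
  }
  for (const child of node.children || []) {
    listShadowRoots(child, `${path}/${child.nodeName}`, found);
  }
  return found;
}

// Sketch of how the tree might be fetched from a Puppeteer page
// (defined only, not invoked here).
async function shadowRootsOfPage(page) {
  const client = await page.target().createCDPSession();
  // depth: -1 requests the whole subtree; pierce: true asks the protocol
  // to descend into iframes and shadow roots as well.
  const { root } = await client.send('DOM.getDocument', {
    depth: -1,
    pierce: true,
  });
  return listShadowRoots(root);
}
```

The same tree also contains attributes and text nodes, so a serializer could in principle be built on top of it instead of on `.shadowRoot`.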
