
I'm trying to scrape some information from a website (https://www.bauhaus.info/) and am stuck at the cookie popup.

This is my code so far:

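const fs = require("fs");
const puppeteer = require("puppeteer");

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://www.bauhaus.info');
    await sleep(5000); // fixed delay so the cookie banner has time to render
    const html = await page.content();
    fs.writeFileSync("./page.html", html, "UTF-8");
    // Wait for the PDF to be written before closing the browser
    await page.pdf({
        path: './bauhaus.pdf',
        format: 'a4'
    });
    await browser.close();
})();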

function sleep(ms) {
    return new Promise((resolve) => {
        setTimeout(resolve, ms);
    });
}

Up to this point everything works fine. But I can't accept the cookie banner, because the banner's HTML does not show up in the HTML I get from Puppeteer, even though the banner is visible in the PDF.

[Screenshot: my browser]

[Screenshot: Puppeteer]

Why can't I see this popup in the HTML code? Bonus question: is there any way to replace the sleep call with some kind of page wait call, without knowing which JS function triggers the cookie banner to appear?

Raphael
  • Sleep: await page.waitForTimeout(4000) – Konrad Linkowski May 08 '22 at 21:16
  • Why no popup in HTML? This popup is loaded through js and you are saving initial HTML – Konrad Linkowski May 08 '22 at 21:16
  • How do you try to close the banner? – Konrad Linkowski May 08 '22 at 21:17
  • It's in the shadow DOM. See something like [Puppeteer not giving accurate HTML code for page with shadow roots](https://stackoverflow.com/questions/68525115/puppeteer-not-giving-accurate-html-code-for-page-with-shadow-roots/68540701#68540701) which has an explanation and a ton of resources. Also, try to avoid sleeping if you can possibly help it -- it's slow and unreliable. – ggorlen May 08 '22 at 21:39
  • Also, please only ask one question per post. That said, I don't know what you mean by the "bonus quest". – ggorlen May 08 '22 at 21:56

1 Answer

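This element is in a shadow root, and page.content() does not serialize the contents of shadow roots, which is why the banner is missing from the saved HTML. See my answer in [Puppeteer not giving accurate HTML code for page with shadow roots](https://stackoverflow.com/questions/68525115/puppeteer-not-giving-accurate-html-code-for-page-with-shadow-roots/68540701#68540701) for additional information about the shadow DOM.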
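
If the goal is to actually see the banner's markup, one option is to read the shadow roots explicitly. The following is a minimal sketch rather than a definitive implementation: `collectShadowHTML` is a hypothetical helper name, it only uses standard DOM properties (`querySelectorAll`, `shadowRoot`, `innerHTML`), and it can only reach open shadow roots, since closed ones expose `shadowRoot` as `null`.

```javascript
// Sketch: gather the markup of every open shadow root under `root`.
// `collectShadowHTML` is a hypothetical helper, not part of Puppeteer.
function collectShadowHTML(root) {
  const results = [];
  for (const el of root.querySelectorAll("*")) {
    if (el.shadowRoot) {
      results.push({
        // Label the host by id when it has one, otherwise by tag name.
        host: el.id ? `#${el.id}` : el.tagName.toLowerCase(),
        html: el.shadowRoot.innerHTML,
      });
      // Shadow roots can nest, so descend into this one as well.
      results.push(...collectShadowHTML(el.shadowRoot));
    }
  }
  return results;
}

// Possible use with Puppeteer (not run here). The function source is
// embedded in a string so that it exists inside the page context:
//   const shadows = await page.evaluate(`(${collectShadowHTML})(document)`);
//   console.log(shadows);
```

The result is an array of plain `{host, html}` objects, so it can be returned from `page.evaluate` and written to disk next to the output of `page.content()`.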

This code dips into the shadow root, waits for the button to appear, then clicks it:

const puppeteer = require("puppeteer"); // ^13.5.1

let browser;
(async () => {
  browser = await puppeteer.launch({headless: false});
  const [page] = await browser.pages();
  const url = "https://www.bauhaus.info/";
  await page.goto(url, {waitUntil: "domcontentloaded"});
  const el = await page.waitForSelector("#usercentrics-root");
  await page.waitForFunction(el =>
    el.shadowRoot.querySelector(".sc-gsDKAQ.dejeIh"), {}, el
  );
  await el.evaluate(el =>
    el.shadowRoot.querySelector(".sc-gsDKAQ.dejeIh").click()
  );
  await page.waitForTimeout(100000); // pause to show that it worked
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;
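
The code above needs to know the shadow host (`#usercentrics-root`) in advance. When the host is not known, a recursive search through open shadow roots is one alternative. This is only a sketch under assumptions: `queryDeep` and `deepQueryExpression` are hypothetical helper names, and the approach relies on `page.waitForFunction` accepting a string expression to evaluate in the page.

```javascript
// Sketch: return the first element matching `selector`, searching the
// given root first and then any open shadow roots nested beneath it.
function queryDeep(root, selector) {
  const direct = root.querySelector(selector);
  if (direct) return direct;
  for (const el of root.querySelectorAll("*")) {
    const found = el.shadowRoot && queryDeep(el.shadowRoot, selector);
    if (found) return found;
  }
  return null;
}

// Build a string that can be evaluated inside the page. Embedding the
// function source as a named function expression keeps the recursion
// working in the browser context, where `queryDeep` is otherwise undefined.
function deepQueryExpression(selector) {
  return `(${queryDeep})(document, ${JSON.stringify(selector)})`;
}

// Possible use with Puppeteer (not run here):
//   const handle = await page.waitForFunction(
//     deepQueryExpression(".sc-gsDKAQ.dejeIh")
//   );
//   await handle.evaluate(el => el.click());
```

Because `waitForFunction` keeps re-evaluating the expression until it returns something truthy, this also covers the wait without a fixed sleep. The trade-off is that scanning every element on each poll is slower than querying a known host directly.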
ggorlen