3

I'm writing a Chrome extension to manipulate PDF files, so I want to get the selected text in the PDF. How can I do that?

Something like this (the screenshot was not preserved):

dinosaur

2 Answers

0

There is no single generic solution for all PDF extensions; every extension has its own API. If you are working with a Google Chrome extension, I believe it is impossible.

How to get the selected text from an embedded pdf in a web page?

https://html.developreference.com/article/23259983/How+extension+get+the+text+selected+in+chrome+pdf+viewer%EF%BC%9F

0

You can use the internal undocumented commands of the built-in PDF viewer.

Here's an example of a content script:

function getPdfSelectedText() {
  return new Promise(resolve => {
    window.addEventListener('message', function onMessage(e) {
      if (e.origin === 'chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai' &&
          e.data && e.data.type === 'getSelectedTextReply') {
        window.removeEventListener('message', onMessage);
        resolve(e.data.selectedText);
      }
    });
    // runs code in page context to access postMessage of the embedded plugin
    const script = document.createElement('script');
    if (chrome.runtime.getManifest().manifest_version > 2) {
      script.src = chrome.runtime.getURL('query-pdf.js');
    } else {
      script.textContent = `(${() => {
        document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*');
      }})()`;
    }
    document.documentElement.appendChild(script);
    script.remove();
  });
}

chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
  if (msg === 'getPdfSelection') {
    getPdfSelectedText().then(sendResponse);
    return true;
  }
});

This example assumes you send a message from the popup or background script:

chrome.tabs.query({active: true, currentWindow: true}, ([tab]) => {
  chrome.tabs.sendMessage(tab.id, 'getPdfSelection', sel => {
    // do something
  });
});
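
If the content script is not running in the active tab, the callback above is still invoked, but with `chrome.runtime.lastError` set and no selection. A minimal sketch of a Promise wrapper that surfaces this case, and that also gives up after a timeout when the viewer never replies, could look like the following. The names `requestPdfSelection` and `timeoutMs` are hypothetical and only used for illustration.

```javascript
// Sketch only: wraps the popup/background messaging in a Promise.
// `requestPdfSelection` and `timeoutMs` are illustrative names, not a Chrome API.
function requestPdfSelection(timeoutMs = 2000) {
  return new Promise((resolve, reject) => {
    // Reject if nothing answers in time (e.g. the viewer never sends a reply).
    const timer = setTimeout(
      () => reject(new Error('No reply from the PDF viewer')), timeoutMs);

    chrome.tabs.query({active: true, currentWindow: true}, tabs => {
      const tab = tabs && tabs[0];
      if (!tab) {
        clearTimeout(timer);
        reject(new Error('No active tab'));
        return;
      }
      chrome.tabs.sendMessage(tab.id, 'getPdfSelection', sel => {
        clearTimeout(timer);
        // lastError is set when no listener exists in the target tab.
        if (chrome.runtime.lastError) {
          reject(new Error(chrome.runtime.lastError.message));
        } else {
          resolve(sel);
        }
      });
    });
  });
}
```

It could then be used as `requestPdfSelection().then(text => console.log(text)).catch(console.warn)`.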

See also How to open the correct devtools console to see output from an extension script?

ManifestV3 extensions also need this:

  • manifest.json should expose query-pdf.js

      "web_accessible_resources": [{
        "resources": ["query-pdf.js"],
        "matches": ["<all_urls>"],
        "use_dynamic_url": true
      }]
    
  • query-pdf.js

    document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*')
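
    A possible hardening of query-pdf.js, as a sketch: `document.querySelector('embed')` returns null on pages that have no `embed` element, so the one-liner above would throw there. The variant below guards that case. The function name `requestSelectedText` is made up for illustration; only the `getSelectedText` message type comes from the code above.

```javascript
// query-pdf.js, guarded variant (a sketch, not the original file).
// `requestSelectedText` is a hypothetical helper name.
function requestSelectedText(doc) {
  const viewer = doc.querySelector('embed');
  // No embedded viewer on this page, or it does not expose postMessage.
  if (!viewer || typeof viewer.postMessage !== 'function') {
    return false;
  }
  viewer.postMessage({type: 'getSelectedText'}, '*');
  return true;
}

// Run immediately when loaded as a page script.
if (typeof document !== 'undefined') {
  requestSelectedText(document);
}
```

    When this returns false no reply will arrive, so the Promise in the content script stays pending unless the caller adds its own timeout.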
    
wOxxOm
  • This did not work for me. The message listener did not intercept any events from the pdf viewer, unfortunately. – Alex Zhong Oct 17 '21 at 00:16
  • @AlexZhong, this is still working so if you can post a new question with an [MCVE](/help/mcve) that describes all the specifics of your case someone (or I) might be able to help. Note that this answer only works with the built-in viewer and only in the main page, so for an iframe you would need to make a couple of changes. – wOxxOm Oct 17 '21 at 05:28
  • Hey, I tried it with the built-in viewer. What I did was I copied your code in my CRX, tried in both the background and content script separately -> the message listener is registered -> I cannot observe any messages received from the listener when I select the text in the pdf viewer. – Alex Zhong Oct 17 '21 at 07:03
  • Also, I could not find any "getPdfSelection" message being sent in the source code that you linked – Alex Zhong Oct 17 '21 at 07:04
  • You are supposed to send that message yourself, of course. – wOxxOm Oct 17 '21 at 10:01