1

I am making a currency-converting script that scrapes all the text from any webpage, finds any foreign currency amounts, converts them using an API, and then replaces each foreign amount with the converted one. My question is: is there any way I can get the text of an element at its lowest level? For example:

<body>
 <div>
  <div>
   <h1>
     "Hello, world"
   </h1>
  </div>
  <p>
   "How are you today?"
  </p>
 </div>
</body>

How could I get the h1 and p elements but not the divs? My array would then be [h1, p]. (Keep in mind I'm trying to do this on a much larger scale, with hundreds of elements.)

I_love_vegetables
Loop through all the elements. For each element, check if its `childElementCount` is `0`. If so, it's a terminal node rather than a container. – Barmar Jul 27 '21 at 17:04
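The check suggested in this comment could be sketched roughly as follows. This is only a minimal sketch: `leafElements` is a hypothetical name, and it relies only on the standard `children` and `childElementCount` properties. One caveat is that an element mixing text with child elements (for example `<p>Costs <b>€5</b></p>`) is treated as a container, so only its `<b>` would be returned; childless elements such as `<br>` or `<img>` are returned as well.

```javascript
// Hypothetical helper: collect every element under `root` that has no child elements.
function leafElements(root) {
  const leaves = [];
  for (const child of Array.from(root.children)) {
    if (child.childElementCount === 0) {
      leaves.push(child); // terminal element, e.g. <h1> or <p>
    } else {
      leaves.push(...leafElements(child)); // container: descend into it
    }
  }
  return leaves;
}

// Only runs where a DOM is available (i.e. in a browser).
if (typeof document !== 'undefined') {
  console.log(leafElements(document.body).map(el => el.tagName));
}
```

For the markup in the question this should yield the `h1` and `p` elements, in document order.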

3 Answers

0

What you want to do is find the (non-empty) text nodes, then return their parents. The recursive implementation of this is:

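// Pushes the parent of every non-empty text node under elem onto parentList.
function parentTagsOfText (elem, parentList) {
  if (elem.nodeType === 3) { // 3 = Node.TEXT_NODE (4 would be a CDATA section)
    if (elem.nodeValue.trim() !== '') { // skip whitespace-only text
      parentList.push(elem.parentNode);
    }
    return;
  }
  if (elem.childNodes) {
    for (let i = 0; i < elem.childNodes.length; i++) {
      parentTagsOfText(elem.childNodes[i], parentList);
    }
  }
}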
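
If one element contains several text nodes (for example, text split by a `<br>`), the same parent can be collected more than once. Below is a minimal sketch of a variant that skips whitespace-only text nodes and deduplicates parents with a `Set`. `uniqueTextParents` is a hypothetical name, not part of the function above, and the traversal is iterative only to avoid deep recursion on large pages.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers

// Hypothetical helper: unique parents of all non-empty text nodes under `root`.
function uniqueTextParents(root) {
  const parents = new Set(); // a Set keeps insertion order and drops duplicates
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      if (node.nodeValue.trim() !== '') parents.add(node.parentNode);
      continue;
    }
    // Push children in reverse so they are visited in document order.
    const kids = Array.from(node.childNodes || []);
    for (let i = kids.length - 1; i >= 0; i--) stack.push(kids[i]);
  }
  return Array.from(parents);
}
```

In a browser this could be called as `uniqueTextParents(document.body)`. Note that text inside `<script>` and `<style>` elements is also reached this way, so those parents may need to be filtered out.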
ControlAltDel
0

Based on this answer, the following function returns all the text nodes under the element you pass in.

function textNodesUnder(node){
  var all = [];
  for (node=node.firstChild;node;node=node.nextSibling){
    if (node.nodeType==3) all.push(node);
    else all = all.concat(textNodesUnder(node));
  }
  return all;
}

Then filter out the empty ones and get their parents.

const textNodesParents = textNodesUnder(document.body)
    .filter(x => x.nodeValue.trim() !== '')
    .map(x => x.parentNode);
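
Since the goal in the question is to rewrite currency amounts, one option is to work on the text nodes themselves rather than on their parents, because assigning to `nodeValue` changes only that piece of text and leaves sibling elements untouched. A rough sketch under stated assumptions: `convertEuroText` is a hypothetical helper, the regular expression only matches simple amounts such as `€12.50`, and the exchange rate is passed in as a plain number rather than fetched from an API.

```javascript
// Hypothetical helper: rewrite simple euro amounts in the given text nodes as dollars.
// `usdPerEur` is an assumed, caller-supplied rate; a real script would fetch it.
function convertEuroText(textNodes, usdPerEur) {
  const euroAmount = /€\s?(\d+(?:\.\d+)?)/g;
  for (const node of textNodes) {
    const converted = node.nodeValue.replace(euroAmount, (match, amount) =>
      `$${(parseFloat(amount) * usdPerEur).toFixed(2)}`);
    // Only write back when something changed, to avoid needless DOM updates.
    if (converted !== node.nodeValue) node.nodeValue = converted;
  }
}
```

In a browser it might be called as `convertEuroText(textNodesUnder(document.body), rate)`, where `rate` comes from whichever conversion API you use.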
Kerap
-1

Actually, you can simply get the elements by tag name, for example if you only want the h1 and p tags and not the div tags.

But if you have, say, an onclick event on the p tag, you can pass this to your handler function and get the tags you need with this.parentElement.children[index].

Hope this was helpful.

Narek