I have a large XML structure and am interested in certain fragments like the ones below. I need to extract only the img tags, and the value of their src attribute, when they are inside a coral-card. I tried using a regex to get the enclosing coral-card tags, and then a second regex on that result to get to the img tag and its content.

var regex = /<coral\-card ((.|[\r\n])*?)<\/coral\-card>/g;

Once I have the markup containing the coral-card tags, like below, is there a way to extract what I need without another regex? I think it should be possible to get the img tag and its src attribute value using jQuery or a JavaScript function.

<coral-card variant="condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/lightbox.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>

<coral-card variant="semi-condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/small.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>
Geek

2 Answers

DOMParser and XPath make parsing XML straightforward. You can do something like:

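const { DOMParser } = require('xmldom');
const xpath = require('xpath');

// xmlString is a placeholder for the string holding your markup
const parser = new DOMParser();
const doc = parser.parseFromString(xmlString, 'text/xml');
// '//coral-card' matches coral-card elements at any depth;
// use a fuller path instead if you know where they sit
const coralCards = xpath.select('//coral-card', doc);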

See the XPath docs for all the ways you can extract nodes from an XML document.

Jim B.

This is exactly why the core DOM specification was created:

// Find all the <coral-card> elements:
var elements = document.getElementsByTagName("coral-card");

// Loop through them:
for(var i = 0; i < elements.length; ++i){
  // Extract whatever you need:
  console.log(elements[i].getAttribute("variant"));
  console.log(elements[i].querySelector("img").src);
}
<coral-card variant="condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/lightbox.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>

<coral-card variant="semi-condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/small.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>
Scott Marcus
  • Thanks. The coral-card markup is part of a very large HTML response string, say htmlResponse. How would document.getElementsByTagName work in that case? Should I first convert the HTML response string to a DOM using parseHTML? – Geek Nov 18 '16 at 18:07
  • @Geek Yes. Once it's parsed from a string, you can use the DOM API to traverse it and extract whatever you want. – Scott Marcus Nov 18 '16 at 19:37
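
A minimal sketch of what the comments above describe, assuming the code runs in a browser where the built-in DOMParser is available. The function names getCardImageSources and getCardImageSourcesFromString are made up for illustration; htmlResponse is the string name used in the comment. Parsing as 'text/html' rather than XML is a choice made here because the sample markup has a valueless stacked attribute and unclosed img tags, which a strict XML parser may reject.

```javascript
// Collect the raw src value of every <img> nested inside a <coral-card>.
// `root` is any Document or Element that supports querySelectorAll.
function getCardImageSources(root) {
  const images = root.querySelectorAll('coral-card img');
  // getAttribute returns the value as written in the markup;
  // the .src property would instead resolve it to an absolute URL.
  return Array.from(images, function (img) {
    return img.getAttribute('src');
  });
}

// Browser-only helper (hypothetical name): parse the HTML string into
// a Document first, then query it with the function above.
function getCardImageSourcesFromString(htmlResponse) {
  const doc = new DOMParser().parseFromString(htmlResponse, 'text/html');
  return getCardImageSources(doc);
}
```

Calling getCardImageSourcesFromString(htmlResponse) should return an array with one src string per matching image, with no regex involved. If jQuery is already loaded, $.parseHTML followed by a .find('coral-card img') lookup would be a comparable route.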