5

I am trying to extract the src attribute of an img tag from a long HTML string.

I know there are a lot of questions about how to do this, but what I have tried gives the wrong result. My question is specifically about why I am getting contradictory results.

I am using:

var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var regexp = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;
var src = url.match(regexp);

But this does not extract the src properly. I keep getting src = <img height="100" src="data:image/png;base64,testurlhere" width="200"></img> instead of data:image/png;base64,testurlhere

However, when I try this in the regex tester at regex101, it extracts the src correctly. What am I doing wrong? Is match() the wrong function to use?

llams48

3 Answers

16

If you need to get the whole img tags for some reason:

const imgTags = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm);

Then you can extract the source link from every img tag in the array like this:

const sources = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm)
                          .map(x => x.replace(/.*src="([^"]*)".*/, '$1'));
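A possible variation on the above, sketched under the assumption that `String.prototype.matchAll` is available (ES2020 and later): it reads the capture group directly instead of running a second `replace`, and it returns an empty array rather than throwing when the string contains no img tags. The helper name `extractImgSources` and the sample HTML are made up for illustration; the pattern is the one from the question.

```javascript
// Hypothetical helper, not part of the original answer.
function extractImgSources(html) {
  // Pattern taken from the question; capture group 1 is the src value.
  const pattern = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;
  // matchAll requires the g flag and yields one match object per img tag,
  // each with its capture groups intact.
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

// Sample input invented for this sketch.
const sampleHtml = '<p><img src="a.png"><img alt="x" src=\'b.jpg\' width="2"></p>';
const sources = extractImgSources(sampleHtml);
console.log(sources);
```

Like any regex over HTML, this only handles quoted src values and can be fooled by unusual markup, so the DOM-based approach is still the safer choice when a browser is available.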
Vi0nik
3

Not a big fan of using regex to parse HTML content, so here goes the longer way:

var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var tmp = document.createElement('div');
tmp.innerHTML = url;
var src = tmp.querySelector('img').getAttribute('src');
console.log(src); // data:image/png;base64,testurlhere
Arun P Johny
  • OP, I gave you the literal answer to your question; but this here is what you would be advised to be doing instead. – Amadan Jul 22 '15 at 04:27
1

Try this:

var match = regexp.exec(url);
var src = match[1];
Amadan
  • Thanks, this works too. Just wondering, why does match[0] return the original string and match[1] return the substring that we are actually looking for? Is it always the case that the 2nd element in the resulting array will be the desired result? – llams48 Jul 23 '15 at 03:35
  • @llams48: `match[1]` is the 1st capture group, `match[2]` is the second... and `match[0]` is the full match. – Amadan Jul 23 '15 at 03:44
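To make the difference discussed in this thread concrete, here is a minimal sketch using the string and pattern from the question. It shows why `match()` with the `g` flag appears to "fail": with `g`, `match()` returns an array of full matches only and drops the capture groups, whereas `exec()` (or `match()` without `g`) returns the full match at index 0 followed by the groups. The variable names are chosen for illustration.

```javascript
const html = '<img height="100" src="data:image/png;base64,testurlhere" width="200"></img>';

// With the g flag, match() returns only the full matches; capture groups are dropped.
const fullMatches = html.match(/<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g);

// Without the g flag, match() returns index 0 = full match, index 1 = first capture group.
const firstMatch = html.match(/<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/);
const srcFromMatch = firstMatch ? firstMatch[1] : null;

// exec() always returns the capture groups, with or without the g flag.
const execResult = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g.exec(html);
const srcFromExec = execResult ? execResult[1] : null;

console.log(fullMatches);  // array holding the whole opening img tag
console.log(srcFromMatch); // data:image/png;base64,testurlhere
console.log(srcFromExec);  // data:image/png;base64,testurlhere
```

Both `match()` and `exec()` return `null` when nothing matches, hence the guards before indexing. This would also explain the regex101 result: the tester displays capture groups separately for each match.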