
I have this regex:

str.replaceAll("(?!<img\ssrc=".*?">)([a-z])", "");

...which should remove all letters except those inside the <img> tag from this string:

 qwerty <img src="image.jpg"> zxc

But I get < ="."> instead of <img src="image.jpg">.

How can I fix this?

Cœur
WildDev

2 Answers


Option 1: Only One Tag

If you have only one image tag, just match it: the match is your new string.

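import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Match the first complete <img ...> tag; the match itself is the result.
Pattern regex = Pattern.compile("<img[^>]+>");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
    String replacedString = regexMatcher.group();
}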

Option 2: Multiple Tags

Use this regex:

<img[^>]+>|(.)

This problem is a classic case of the technique explained in this question on how to "regex-match a pattern, excluding..."

The left side of the alternation | matches complete <img ...> tags. We ignore these matches. The right side matches single characters and captures them in Group 1, and we know they are the right ones because they were not matched by the expression on the left.

This program shows how to use the regex:

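import java.util.regex.Matcher;
import java.util.regex.Pattern;

String subject = "qwerty <img src=\"image.jpg\"> zxc";
Pattern regex = Pattern.compile("<img[^>]+>|(.)");
Matcher m = regex.matcher(subject);
StringBuffer b = new StringBuffer();
while (m.find()) {
    if (m.group(1) != null) {
        // Group 1 captured a character outside any <img> tag: drop it.
        m.appendReplacement(b, "");
    } else {
        // The left branch matched a whole <img> tag: keep it verbatim.
        // quoteReplacement keeps '$' and '\' in the tag from being read as group references.
        m.appendReplacement(b, Matcher.quoteReplacement(m.group(0)));
    }
}
m.appendTail(b);
String replaced = b.toString();
System.out.println(replaced); // prints <img src="image.jpg">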
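
On Java 9 or later, the same alternation trick can probably be written more compactly with the Matcher.replaceAll overload that takes a function. The following is only a sketch: the class name KeepImgTags and the method keepImgTags are made up for illustration, and the DOTALL flag is an addition (not part of the regex above) so that line breaks outside the tags are removed as well.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepImgTags {
    // Left branch: a whole <img ...> tag. Right branch: any other single
    // character, captured in group 1. DOTALL (an assumption added here)
    // lets '.' match line terminators too.
    private static final Pattern IMG_OR_CHAR =
            Pattern.compile("<img[^>]+>|(.)", Pattern.DOTALL);

    // Hypothetical helper: returns only the <img ...> tags of the input.
    static String keepImgTags(String subject) {
        return IMG_OR_CHAR.matcher(subject).replaceAll(match ->
                match.group(1) != null
                        ? ""                                        // stray character: drop it
                        : Matcher.quoteReplacement(match.group())); // whole tag: keep it literally
    }

    public static void main(String[] args) {
        System.out.println(keepImgTags("qwerty <img src=\"image.jpg\"> zxc"));
    }
}
```

The function's return value is still treated as a replacement string, which is why the tag is passed through Matcher.quoteReplacement.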


Community
zx81

The problem is in the regex. The first thing I see is that the string literal is not escaped properly.

It should be:

(?!<img\\ssrc=\".*?\">)([\\s\\S])

Note that there is a whitespace between both groups.
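
Keep in mind that escaping only makes the literal compile. A negative lookahead is tested at the position where a match would start, so it can veto a match beginning at the "<" of the tag but not one beginning at a letter inside the tag. A small sketch, using the question's [a-z] class and sample string (the class name LookaheadCheck and the method stripLetters are made up for illustration), seems to reproduce the output reported in the question:

```java
public class LookaheadCheck {
    // Hypothetical helper: the question's pattern with the quotes and \s escaped.
    static String stripLetters(String s) {
        // The lookahead is evaluated once per candidate start position, so it
        // only blocks a match that would begin exactly at "<img src=...>".
        // Letters inside the tag start at later positions and still match [a-z].
        return s.replaceAll("(?!<img\\ssrc=\".*?\">)([a-z])", "");
    }

    public static void main(String[] args) {
        // Brackets make the leading and trailing spaces visible.
        System.out.println("[" + stripLetters("qwerty <img src=\"image.jpg\"> zxc") + "]");
        // prints [ < ="."> ]
    }
}
```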

Anyway, I would use:

[^<]*([^>]*>)[\s\S]*
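
The answer does not say how this pattern is meant to be applied. One plausible reading, and this is an assumption, is to replace the whole match with the first capture group via replaceAll and "$1". A minimal sketch under that assumption (the class name FirstTagOnly and the method firstTag are hypothetical):

```java
public class FirstTagOnly {
    // Hypothetical helper: keeps only what group 1 captures.
    static String firstTag(String s) {
        // [^<]*     consumes the text before the first '<'
        // ([^>]*>)  captures everything up to and including the next '>'
        // [\s\S]*   swallows the rest of the input, line breaks included
        return s.replaceAll("[^<]*([^>]*>)[\\s\\S]*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(firstTag("qwerty <img src=\"image.jpg\"> zxc"));
        // prints <img src="image.jpg">
    }
}
```

Because the trailing [\s\S]* consumes everything after the first tag, this form keeps a single tag only. If the input contains neither "<" nor ">", nothing matches and the string comes back unchanged.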
inigoD