
I'm trying to match every ' character that appears inside a <> tag in a string using RegEx. The string may also contain ' characters outside of the <> tags, and I don't want to match those.

The string is e.g.:

"<img src = '/path/to/the/file' title = 'My Image1'/>

<img src = '/path/to/the/file2' title = 'My Image2'/>

Don't need to get quotes from this line.";

Expected: What's expected

The <> tags can contain more than just the attributes shown in the example.

Yury Fedorov
  • While this is possible for some subset of cases, it can never work in the general case because as soon as you get into "How do I get all the `characters` inside `some delimiters`" you've left the world of regular languages and therefore the world where regular expressions have much power. – Adam Smith Jul 24 '19 at 17:29
  • Unfortunately I don't have access to the functionality that generates the string, and I have to get rid of the single quotes inside (change them to double quotes) of the tags while not touching the text outside of the tags. – Oleg Saienko Jul 24 '19 at 17:45
  • All I can advise you on is that Regex is not the tool for this job. – Adam Smith Jul 24 '19 at 17:53
  • [ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) ... (I'll take any excuse to link to that answer - basically, you want an HTML parser, not RegEx) – CD001 Jul 25 '19 at 14:57
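For the "subset of cases" mentioned in the comments, a minimal Python sketch might look like the following. It assumes that no attribute value contains a literal `<`, `>`, or `"`, and that the input has no comments or script blocks; the names `TAG` and `requote_tags` are made up for illustration. It matches each whole tag first and then replaces the single quotes inside that match only, so text outside the tags is never touched.

```python
import re

# Hypothetical pattern: one "<...>" run with no nested angle brackets.
# Assumption: attribute values never contain '<' or '>'.
TAG = re.compile(r"<[^<>]*>")


def requote_tags(text: str) -> str:
    """Replace ' with " inside each <...> match, leaving other text alone."""
    return TAG.sub(lambda m: m.group(0).replace("'", '"'), text)


sample = (
    "<img src = '/path/to/the/file' title = 'My Image1'/>\n"
    "<img src = '/path/to/the/file2' title = 'My Image2'/>\n"
    "Don't need to get quotes from this line."
)

print(requote_tags(sample))
```

This is not a general HTML solution: as soon as a value contains `>` or a double quote, the output will be wrong, which is why the comments recommend a parser.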

1 Answer


I am assuming you want to extract the image paths and titles of all the images embedded in the HTML, and not process any other strings or text in it. If that is correct, then with Python's Beautiful Soup (https://www.crummy.com/software/BeautifulSoup/bs4/doc/) you can get every image src very easily. If that is not your intention, then I think regex has little to offer here (as Adam Smith already mentioned). All the best.
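As a rough illustration of the parser-based approach, here is a sketch that uses Python's standard-library `html.parser` as a stand-in for Beautiful Soup, so it runs without installing anything. The class name `ImgCollector` and the helper `extract_images` are hypothetical, and it assumes only `img` tags with `src` and `title` attributes are of interest.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects (src, title) pairs from every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img .../>, because the
        # default handle_startendtag delegates to handle_starttag.
        if tag == "img":
            attributes = dict(attrs)
            self.images.append((attributes.get("src"), attributes.get("title")))


def extract_images(html: str) -> list:
    """Hypothetical helper: parse the string and return the collected pairs."""
    collector = ImgCollector()
    collector.feed(html)
    collector.close()
    return collector.images


sample = (
    "<img src = '/path/to/the/file' title = 'My Image1'/>\n"
    "<img src = '/path/to/the/file2' title = 'My Image2'/>\n"
    "Don't need to get quotes from this line."
)

for src, title in extract_images(sample):
    print(src, title)
```

The parser strips the surrounding quotes from attribute values itself, so the apostrophe in the plain text line never enters the result.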

Amit