2

I'm just trying to do a simple deletion of an element in C#. If my HTML element contains the text [Store Logo], I want to remove it. Example:

<img src="http://src.sencha.io/300/80/http://images.company.com/[Store Logo]" />

Since it contains [Store Logo], I'd like to delete the whole image tag. I was trying to use a regex to do it, but it's hard to understand how to use all the symbols together, and I've read that I'm not supposed to use regex to parse HTML. What is the best way to remove it?

proseidon

2 Answers

3

You can use the Html Agility Pack.

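Here's an example adapted from their examples page (which shows how to find all the links in a page), changed to find every image whose src contains [Store Logo] and remove it:

 HtmlWeb hw = new HtmlWeb();
 HtmlDocument doc = hw.Load(/* url */);
 // SelectNodes returns null when nothing matches, so guard against that.
 var images = doc.DocumentNode.SelectNodes("//img[@src]");
 if (images != null)
 {
     foreach (HtmlNode img in images)
     {
         // Remove the whole <img> tag when its src contains the placeholder.
         if (img.Attributes["src"].Value.Contains("[Store Logo]"))
             img.ParentNode.RemoveChild(img);
     }
 }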
Roar
0

Use HtmlAgilityPack. It's a library for parsing HTML that allows you to access the DOM and modify it.
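
If the HTML is already in a string rather than at a URL, something along these lines should work. This is only a minimal sketch: the class name `StoreLogoCleaner` and the method `RemoveImagesContaining` are made up for illustration, and it assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class StoreLogoCleaner
{
    // Hypothetical helper: returns the HTML with every <img> whose src
    // attribute contains the marker text removed.
    public static string RemoveImagesContaining(string html, string marker)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty list) when nothing matches.
        var images = doc.DocumentNode.SelectNodes("//img[@src]");
        if (images == null)
            return html;

        // Iterate over a copy so removing nodes cannot disturb the loop.
        foreach (var img in images.ToList())
        {
            if (img.GetAttributeValue("src", "").Contains(marker))
                img.ParentNode.RemoveChild(img);
        }

        // Serialize the modified DOM back to a string.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling `StoreLogoCleaner.RemoveImagesContaining(html, "[Store Logo]")` would drop the image tag from the question and leave other images alone. The returned markup may not match the input character for character, since the library re-serializes the document.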

System Down