How can I collapse contiguous repetitions of a substring within a string in C#? For example, the string

"<p>The&nbsp;&nbsp;&nbsp;quick&nbsp;&nbsp;&nbsp;fox</p>"

will be converted to

"<p>The&nbsp;quick&nbsp;fox</p>"

rajeemcariazo

3 Answers

Use the regex below:

    @"(.+)\1+"

(.+) captures a group of one or more characters, and \1+ then matches one or more immediate repetitions of that same captured text.

Then replace the whole match with $1, the captured group.

DEMO

    string result = Regex.Replace(str, @"(.+)\1+", "$1");
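One caveat worth checking before relying on this pattern: because (.+) is greedy, a run of four repetitions can be captured as two repetitions of a doubled unit, which leaves two copies behind. The sketch below compares the greedy pattern with a lazy variant, (.+?), which is my own assumption and not part of the answer. The input strings and variable names are hypothetical, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical inputs with runs of three and four "&nbsp;" entities.
string three = "<p>The&nbsp;&nbsp;&nbsp;quick</p>";
string four = "<p>The&nbsp;&nbsp;&nbsp;&nbsp;quick</p>";

// Greedy group, as in the answer: (.+) grabs the longest unit that repeats.
string greedy3 = Regex.Replace(three, @"(.+)\1+", "$1");
string greedy4 = Regex.Replace(four, @"(.+)\1+", "$1");

// Lazy group (an assumption, not from the answer): (.+?) grabs the shortest unit.
string lazy4 = Regex.Replace(four, @"(.+?)\1+", "$1");

Console.WriteLine(greedy3); // <p>The&nbsp;quick</p>
Console.WriteLine(greedy4); // <p>The&nbsp;&nbsp;quick</p>
Console.WriteLine(lazy4);   // <p>The&nbsp;quick</p>
```

Note that both variants collapse any repeated text, not just &nbsp;. For instance, "fuzzy" becomes "fuzy". A pattern that names the substring explicitly is probably safer when only one substring should be collapsed.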
Avinash Raj

Maybe this simple one is enough:

    (&nbsp;){2,}

and replace with $1 (the &nbsp; captured by the first parenthesized group).

See test at regex101
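In C#, this idea might be wrapped up for an arbitrary substring roughly as follows. This is a minimal sketch: CollapseRuns is a hypothetical helper name, and it uses a non-capturing group plus a MatchEvaluator rather than $1, so that a substring containing regex metacharacters or a $ is still treated literally. Top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// CollapseRuns is a hypothetical helper name, not from the answer.
static string CollapseRuns(string input, string subString)
{
    // Regex.Escape keeps metacharacters in subString (such as '.' or '+') literal.
    string pattern = "(?:" + Regex.Escape(subString) + "){2,}";
    // A MatchEvaluator avoids '$' in subString being read as a substitution.
    return Regex.Replace(input, pattern, _ => subString);
}

string s = "<p>The&nbsp;&nbsp;&nbsp;quick&nbsp;&nbsp;&nbsp;fox</p>";
Console.WriteLine(CollapseRuns(s, "&nbsp;")); // <p>The&nbsp;quick&nbsp;fox</p>
Console.WriteLine(CollapseRuns("a...b", "."));  // a.b
```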


To check whether a substring is followed by itself, you can also use a lookahead:

    (?:(&nbsp;)(?=\1))+

and replace with an empty string. See test at regex101.com
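A short C# sketch of the lookahead variant, using the input from the question. The variable names are my own, and top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

string s = "<p>The&nbsp;&nbsp;&nbsp;quick&nbsp;&nbsp;&nbsp;fox</p>";

// Every "&nbsp;" that is immediately followed by another one is matched and
// deleted; the last one in a run fails the lookahead and is therefore kept.
string result = Regex.Replace(s, @"(?:(&nbsp;)(?=\1))+", "");

Console.WriteLine(result); // <p>The&nbsp;quick&nbsp;fox</p>
```

Since the match never includes the final &nbsp; of a run, nothing needs to be put back in the replacement.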

Jonny 5

Let's call the original string s and the substring subString:

    var s = "<p>The&nbsp;&nbsp;&nbsp;quick&nbsp;&nbsp;&nbsp;fox</p>";
    var subString = "&nbsp;";

I'd prefer this to a regex, as it is much more readable:

    var subStringTwice = subString + subString;

    while (s.Contains(subStringTwice))
    {
        s = s.Replace(subStringTwice, subString);
    }

Another possible solution with better performance:

    var elements = s.Split(new[] { subString }, StringSplitOptions.RemoveEmptyEntries);
    // Keep s unchanged so the start/end checks below see the original input
    var result = string.Join(subString, elements);
    // This part is only needed when subString can appear at the start or the end of s
    if (result != "")
    {
        if (s.StartsWith(subString)) result = subString + result;
        if (s.EndsWith(subString)) result = result + subString;
    }
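For anyone who wants to try the split approach on edge cases, here is a minimal sketch that wraps it in a function. CollapseBySplit is a hypothetical name. The handling of an input made up only of the substring is my own addition, and the sketch assumes subString is non-empty and cannot overlap with itself (true for "&nbsp;", not for something like "aa"). Top-level statements assume C# 9 or later.

```csharp
using System;

// CollapseBySplit is a hypothetical helper name, not from the answer.
static string CollapseBySplit(string s, string subString)
{
    string[] elements = s.Split(new[] { subString }, StringSplitOptions.RemoveEmptyEntries);
    string result = string.Join(subString, elements);

    if (result == "")
    {
        // Either s was empty, or it consisted only of repetitions of subString.
        return s.Length == 0 ? "" : subString;
    }

    // Split drops leading and trailing separators, so restore one copy of each.
    if (s.StartsWith(subString, StringComparison.Ordinal)) result = subString + result;
    if (s.EndsWith(subString, StringComparison.Ordinal)) result += subString;
    return result;
}

Console.WriteLine(CollapseBySplit("<p>The&nbsp;&nbsp;&nbsp;quick&nbsp;&nbsp;&nbsp;fox</p>", "&nbsp;"));
// <p>The&nbsp;quick&nbsp;fox</p>
Console.WriteLine(CollapseBySplit("&nbsp;&nbsp;a&nbsp;&nbsp;", "&nbsp;"));
// &nbsp;a&nbsp;
```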
schnaader
  • To my regex loving eyes, this is only more readable if you are **not familiar with regex**. Algorithmically speaking, this could also be much more expensive but that is barely worth mentioning. – Gusdor Feb 05 '15 at 13:26
  • Yup... repeated string replace is expensive, and the split method fails if the substring appears at the start or end of the input. – Rawling Feb 05 '15 at 15:40
  • @Rawling: Thanks for catching that, fixed. – schnaader Feb 06 '15 at 07:31