12

I want to replace XML tags, with a sequence of repeated characters that has the same number of characters of the tag.

For example:

<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>

I want to replace it with:

#############2013-01-21T21:15:00Z##############

How can we use RegEx for this?

Marko
  • 69,939
  • 26
  • 120
  • 154
hmghaly
  • 1,437
  • 3
  • 23
  • 45

1 Answers1

24

re.sub accepts a function as replacement:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

Here's an example:

In [1]: import re

In [2]: def repl(m):
   ...:     return '#' * len(m.group())
   ...: 

In [3]: re.sub(r'<[^<>]*?>', repl,
   ...:     '<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>')
Out[3]: '#############2013-01-21T21:15:00Z##############'

The pattern I used may need some polishing, I'm not sure what's the canonical solution to matching XML tags is. But you get the idea.

Lev Levitsky
  • 59,844
  • 20
  • 139
  • 166
  • the canonical solution to matching XML tags is in [this answer](https://stackoverflow.com/a/1732454/786559) :) – Ciprian Tomoiagă Nov 27 '17 at 14:00
  • quick question what does .group() do in the above? – Hao S Jul 17 '21 at 18:48
  • @HaoS the method basically returns the matched string. Had there been any "groups" in the regex, this method would return them. See the reference here: https://docs.python.org/3/library/re.html#re.Match.group – Lev Levitsky Jul 17 '21 at 19:36