23

Regex question: how do I match a character that is NOT between two other characters, in this case curly brackets?

I have a string such as: word1 | {word2 | word3 } | word 4

I only want to match the first and last pipe, not the second one, which is between the brackets. I have tried many variations with negated carets and negative groupings, but can't get it to work.

Basically I am using this regex in a JavaScript split function to split this into an array containing: "word1", "{word2 | word3}", "word4".

Any assistance would be greatly appreciated!

BenMorel
shaun stewart

2 Answers

33

On refiddle.com set to JavaScript, try using this pattern

/\|(?![^{]*})/g

with this text

word1 | {word2 | word3 } | word 4 | word 4 | {word2 | word3 }

This should match all of the pipe symbols that are not inside {}, provided the braces are balanced and not nested.
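To tie this back to the split in the question, here is a minimal sketch using the asker's original string. The variable names are only illustrative, and the `trim()` step is an addition (not part of the pattern) to drop the spaces around each piece. It assumes the braces are balanced and not nested.

```javascript
// Split on pipes that are not inside {...}.
// Assumption: braces are balanced and never nested.
const input = "word1 | {word2 | word3 } | word 4";
const pipeOutsideBraces = /\|(?![^{]*})/g;

// split() leaves the surrounding spaces in place, so trim each piece.
const parts = input.split(pipeOutsideBraces).map((part) => part.trim());

console.log(parts);
// → [ 'word1', '{word2 | word3 }', 'word 4' ]
```

Note that whitespace inside each piece is kept as-is, so the results are "{word2 | word3 }" and "word 4" rather than the compacted forms in the question.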

Diver
    Very helpful, especially for HTML strings that need some replacement in their nodes, but not in their attributes ;) – axel Nov 20 '18 at 13:34
21

Depends on the language/implementation you're using, but...

\|(?![^{]*})

This matches a pipe unless it is followed by a } with no { in between.


The (?! ... ) is known as a negative lookahead assertion. This is easier to understand if we start with a positive lookahead assertion:

\|(?=[^{]*})

The above only matches a pipe that is followed by a } without encountering a { first. When you negate that by replacing the = with a !, the match is now only successful if there's no way for the positive case to be true (also known as the complement).
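The contrast between the two assertions can be checked directly. The following is a small sketch in JavaScript (the language from the question); the helper `indexesOf` and the variable names are made up for illustration, and `String.prototype.matchAll` needs an ES2020-capable runtime.

```javascript
// Compare the positive and negative lookahead on the same text.
const sample = "word1 | {word2 | word3 } | word 4";
const positiveLookahead = /\|(?=[^{]*})/g; // pipe followed by } with no { in between
const negativeLookahead = /\|(?![^{]*})/g; // every other pipe

// Hypothetical helper: collect the index of each match.
const indexesOf = (pattern) =>
  Array.from(sample.matchAll(pattern), (match) => match.index);

console.log(indexesOf(positiveLookahead)); // → [ 15 ]    (the pipe inside the braces)
console.log(indexesOf(negativeLookahead)); // → [ 6, 25 ] (the two pipes outside)
```

Between them, the two patterns account for every pipe in the string exactly once, which is the complement relationship described above.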

slackwing