15

Since I'm fairly new to RE2, I'm trying to figure out how to use a positive lookahead, (?=regex), in Go the way I would in JS, C++, or any PCRE-style engine.

Here are some examples of what I'm looking for.

JS:

'foo bar baz'.match(/^[\s\S]+?(?=baz|$)/);

Python:

re.match(r'^[\s\S]+?(?=baz|$)', 'foo bar baz')
  • Note: both examples match 'foo bar '

Thanks a lot.

a8m
  • 6
    Looking at https://github.com/google/re2/wiki/Syntax - there is a line saying "`(?=re)` before text matching `re` (NOT SUPPORTED)". This doesn't look good. Also, it says "alternative to backtracking regular expression engines" - suggesting they'd drop some features. – Kobi May 18 '15 at 14:19
  • I guess that's a sort of an answer, so I've added one. – Kobi May 18 '15 at 14:32
  • 1
    @Kobi there is now [dlclark/regexp2](https://github.com/dlclark/regexp2) available – Andy Jul 24 '17 at 23:21
  • 3
    @Andy - Thanks! So Go has `regexp` (which is re2), and `regexp2` (which isn't re2). That is a poor choice of library names - I think this is even more confusing than Python's `re` and `regex` libraries `:P`. Looks like it was ported from .Net with [balancing groups](https://github.com/dlclark/regexp2/blob/487489b64fb796de2e55f4e8a4ad1e145f80e957/regexp_mono_test.go#L998,L1002), which are [my favorite regex feature](https://kobikobi.wordpress.com/tag/regex/) - I'll have a look. – Kobi Jul 25 '17 at 02:16

2 Answers

17

According to the Syntax Documentation, this feature isn't supported:

(?=re) before text matching re (NOT SUPPORTED)

Also, from WhyRE2:

As a matter of principle, RE2 does not support constructs for which only backtracking solutions are known to exist. Thus, backreferences and look-around assertions are not supported.
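
To see this in practice, here is a minimal sketch that feeds the pattern from the question to the standard regexp package. The helper name compileError is made up for this example; the only library call involved is regexp.Compile, which returns an error instead of panicking when the syntax is rejected.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileError returns the error (if any) produced by compiling pattern
// with Go's standard regexp package. Hypothetical helper for illustration.
func compileError(pattern string) error {
	_, err := regexp.Compile(pattern)
	return err
}

func main() {
	// The lookahead pattern from the question: compilation fails,
	// so this prints a non-nil parse error.
	fmt.Println(compileError(`^[\s\S]+?(?=baz|$)`))

	// The same pattern with the lookahead removed compiles without error.
	fmt.Println(compileError(`^[\s\S]+?`))
}
```

Using regexp.Compile rather than regexp.MustCompile here keeps the failure visible as a value, which is handy if patterns come from user input.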

Kobi
9

You can achieve this with a simpler regexp:

re := regexp.MustCompile(`^(.+?)(?:baz)?$`)
sm := re.FindStringSubmatch("foo bar baz")
fmt.Printf("%q\n", sm)

sm[1] will be your match. Playground: http://play.golang.org/p/Vyah7cfBlH
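
One caveat with the pattern above: the optional baz is tied to the end of the input, so a baz in the middle of the string stays inside the capture, whereas the lookahead versions in the question stop before the first one. If you want something closer to the original, here is a sketch of a variant. The pattern and the names beforeBazRe and beforeBaz are my own assumptions, not part of the answer: (?s) makes . match newlines like [\s\S], and the trailing group consumes baz (or hits end of text) instead of looking ahead.

```go
package main

import (
	"fmt"
	"regexp"
)

// beforeBazRe is an assumed variant of the answer's pattern.
// (?s)      lets . match newlines, mirroring [\s\S] in the question.
// (.+?)     lazily captures at least one character.
// (?:baz|$) consumes "baz" or matches at end of text, standing in for (?=baz|$).
var beforeBazRe = regexp.MustCompile(`(?s)^(.+?)(?:baz|$)`)

// beforeBaz returns the captured prefix: everything before the first "baz"
// that follows at least one character, or the whole input if there is none.
// ok is false when nothing matches (for example, an empty input).
func beforeBaz(s string) (prefix string, ok bool) {
	sm := beforeBazRe.FindStringSubmatch(s)
	if sm == nil {
		return "", false
	}
	return sm[1], true
}

func main() {
	got, _ := beforeBaz("foo bar baz")
	fmt.Printf("%q\n", got) // "foo bar "

	got, _ = beforeBaz("foo baz bar baz")
	fmt.Printf("%q\n", got) // "foo "
}
```

Unlike a real lookahead, the overall match (sm[0]) includes the consumed baz, so this only works when you read the capture group and don't need to continue matching from just before baz.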

Ainar-G
  • 1
    Yes, a capturing group is the only means to achieve that... at least, until look-aheads are implemented in Go. – Wiktor Stribiżew May 18 '15 at 14:32
  • 5
    @stribizhev (wrt your "until look-aheads are implemented in Go" comment), I doubt such features will ever be added to Go or that Go will switch from using RE2. (Although you could probably use a third party PCRE package, I wouldn't recommend that). Most/all of these "features" are not supported due to the basic design which is a deliberate choice made between "advanced" (but slow/dangerous) features and speed and safety (in terms of run-time/memory). See https://swtch.com/~rsc/regexp/regexp1.html for details (or just look at the graphs). – Dave C May 18 '15 at 17:04
  • 4
    FWIW, recent research on handling lookaheads in linear time for PCRE-like engines: https://medium.com/@davisjam/using-selective-memoization-to-defeat-regular-expression-denial-of-service-f7acbbd34792 – James Davis Sep 21 '20 at 14:59