
I have the following regex:

(\+|-|)?[a-z\d]+(\^\d+)?

I am trying to match any sequence of alphanumeric characters that may or may not be preceded by a + or -, and may or may not be followed by a ^ and a series of digits. However, this does not produce the results that I want.

For example, the following code:

import re
r = re.findall(r'(\+|-|)?[a-z\d]+(\^\d+)?', '4x+5x-2445y^56')

This returns [('', ''), ('+', ''), ('-', '^56')], but I would expect it to return ['4x', '+5x', '-2445y^56'].

What am I doing wrong?

Perplexityy

1 Answer


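Your pattern contains two capturing groups, (\+|-|)? and (\^\d+)?. When a pattern has capturing groups, re.findall returns the contents of those groups instead of the whole match, which is why you get a list of tuples. You can make a group non-capturing with (?:...) and still group part of the pattern so that ? applies to it. The optional sign can also be written as the character class [+-]: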

import re
r = re.findall(r'[+-]?[a-z\d]+(?:\^\d+)?', '4x+5x-2445y^56')
print(r)
# ['4x', '+5x', '-2445y^56']
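To make the difference concrete, here is a small sketch that runs the same input through findall with capturing and with non-capturing groups. The variable names are mine, purely for illustration. It also shows re.finditer as an alternative if you want to keep the capturing groups, since group(0) on a match object is always the whole match:

```python
import re

text = '4x+5x-2445y^56'

# Two capturing groups: findall returns one tuple of group contents per match.
# A group that did not take part in the match shows up as an empty string.
capturing = re.findall(r'([+-])?[a-z\d]+(\^\d+)?', text)

# Same pattern with non-capturing groups: findall returns the full matches.
non_capturing = re.findall(r'(?:[+-])?[a-z\d]+(?:\^\d+)?', text)

# Alternative: keep the capturing groups and read group(0) from each match.
via_finditer = [m.group(0) for m in re.finditer(r'([+-])?[a-z\d]+(\^\d+)?', text)]

print(capturing)      # [('', ''), ('+', ''), ('-', '^56')]
print(non_capturing)  # ['4x', '+5x', '-2445y^56']
print(via_finditer)   # ['4x', '+5x', '-2445y^56']
```

The finditer version is handy when you need both the full term and its parts (sign, exponent) from the same pass.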
Psidom