I am still learning regex and cannot find a solution to my problem. I want to match all single quotes in a string, except those that appear inside a double-quoted substring.

For example, I have a string:

'['hello', "that's great", "Good to see you O'shay"]'

I would like to be able to match the single quotes around 'hello', but not the single quotes nested inside the double quotes.

I hope that makes sense! Thanks in advance!

edit

Instead of matching the single quotes, would it be possible to obtain the strings themselves, i.e. to go from

'['hello', "that's great", "Good to see you O'shay"]'

to

'[hello, that's great, Good to see you O'shay]'

Not sure if that makes the problem easier or harder? Would welcome being enlightened!

Cellan
  • This looks like a variant of the impossible-to-implement-in-regex problem of [matching nested patterns](https://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns). Some variants of regex support balanced groups but I don't think that is part of Python's core regex library. You'll want to implement some sort of simple parser. – sphennings Sep 06 '21 at 22:16
  • Ah, okay. I was hoping there was an "easy" way that I was overlooking, so thank you for clarifying! – Cellan Sep 06 '21 at 22:22
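
For what it's worth, the "simple parser" idea from the comment above could look roughly like the sketch below. The function name `single_quotes_outside_double` is made up for illustration, and the sketch assumes double quotes are always balanced and never escaped; it only tracks whether the scan is currently inside a double-quoted span.

```python
def single_quotes_outside_double(text):
    """Return the indices of single quotes that are not inside double quotes.

    Assumption: double quotes are balanced and unescaped in `text`.
    """
    positions = []
    in_double = False  # True while scanning inside a "..." span
    for i, ch in enumerate(text):
        if ch == '"':
            # Every double quote opens or closes a double-quoted span.
            in_double = not in_double
        elif ch == "'" and not in_double:
            positions.append(i)
    return positions


test = '''['hello', "that's great", "Good to see you O'shay"]'''
print(single_quotes_outside_double(test))
```

Only the two quotes around hello are reported; the apostrophes in "that's" and "O'shay" are skipped because they fall inside double quotes.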

1 Answer

This is not a pure regular expression solution. The string is split into quoted and unquoted parts, and the unwanted parts are filtered out afterwards:

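import re

test = '''['hello', "that's great", "Good to see you O'shay"]'''

# Split on quoted substrings; the capture group keeps them in the result.
parts = re.split(r"(\"[^\"]*\"|'[^']*')", test)

# Keep only the single-quoted parts.
parts = [p for p in parts if p.startswith("'")]

print(parts)  # ["'hello'"]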
Michael Butscher
  • Thank you for your answer! I was hoping to match just the single quotes so I could eventually remove them from the string. However, I will probably go down the route of finding all the words within the single quotes and then the double quotes. I will change my question to avoid confusion. Sorry! – Cellan Sep 06 '21 at 22:43
  • @Cellan In that case you can just examine each part, remove the quotes if present, and finally join the parts together into a new string. – Michael Butscher Sep 06 '21 at 23:03
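
A minimal sketch of what that last comment describes, reusing the split pattern from the answer. The helper name `strip_outer_quotes` is hypothetical, and the sketch assumes quoted substrings contain no escaped quotes of their own kind. It relies on `re.split` with a single capture group returning the captured (quoted) pieces at the odd indices of the result.

```python
import re

# Same pattern as in the answer: a double-quoted or a single-quoted substring.
QUOTED = re.compile(r"(\"[^\"]*\"|'[^']*')")


def strip_outer_quotes(text):
    """Remove the enclosing quotes from every quoted substring in `text`."""
    parts = QUOTED.split(text)
    cleaned = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            # Odd indices hold the captured quoted pieces; drop their
            # first and last character (the enclosing quotes).
            part = part[1:-1]
        cleaned.append(part)
    return "".join(cleaned)


test = '''['hello', "that's great", "Good to see you O'shay"]'''
print(strip_outer_quotes(test))  # [hello, that's great, Good to see you O'shay]
```

The apostrophes inside the double-quoted strings survive because each double-quoted substring is consumed as a whole before the single-quote alternative gets a chance to match inside it.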