How can I get a string constant from source code in a string?

For example, here is the source code I am trying to process:

var v = "this is string constant + some numbers and \" is also included "

I am unable to get everything inside the quotation marks by using this regular expression: "(.*?)".

I don't want var, v, = or anything else in the result, only the characters of the string.

Jeremy
Zeeshan Anjum

3 Answers


You need to match an opening quote, then anything that's either an escaped character or a normal character (except quotes and backslashes), and then a closing quote:

"(?:\\.|[^"\\])*"
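
As a minimal sketch of how this pattern could be applied in Python (the names `STRING_LITERAL`, `source` and `literals` are only illustrative, and the sample line is the one from the question, written as a raw string so that the backslash is really present in the data):

```python
import re

# Opening quote, then any run of escaped characters (\\.) or ordinary
# characters (anything except a quote or a backslash), then the closing quote.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')

# Raw string literal: the backslash before the inner quote is part of the data.
source = r'var v = "this is string constant + some numbers and \" is also included "'

# With no capturing group, findall returns each whole match, quotes included.
literals = STRING_LITERAL.findall(source)
print(literals)
```

Because the escaped quote is consumed by the `\\.` branch, it cannot be mistaken for the closing quote. To get only the contents, one option is to wrap the middle part in a capturing group, or to strip the first and last character of each match.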
Tim Pietzcker
  • Yes, was about to write that. There's a difference between _parsing_ a text completely and _extracting_ just some bits from it. – georg Apr 14 '13 at 17:02

Use a lookbehind to make sure the closing " is not preceded by a \:

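import re

# Backslashes are doubled so the sample really contains \" sequences;
# in a non-raw literal, a bare \" would collapse to a plain quote.
data = 'var v = "this is string constant + some numbers and \\" is also included "\r\nvar v = "and another \\"line\\" "'
# The greedy .* runs to the last quote on each line; the lookbehind rejects
# a closing quote that directly follows a backslash.
matches = re.findall(r'= "(.*(?<!\\))"', data)
print(matches)

Output:

['this is string constant + some numbers and \\" is also included ', 'and another \\"line\\" ']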
ChaseTheSun

To get everything inside quotation marks, you can try this: "\".+?\"" with re.findall().
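
A rough sketch of how that might look, assuming the pattern is written as the equivalent raw string `r'".+?"'` (the names `pattern`, `simple` and `escaped` are made up for illustration):

```python
import re

# Lazy match: a quote, at least one character, then the nearest following quote.
pattern = re.compile(r'".+?"')

# Two plain literals on one line are found separately.
simple = 'var a = "first" + "second"'
print(pattern.findall(simple))

# Raw string literal, so the backslash is really present in the data.
escaped = r'var v = "numbers and \" is also included "'
print(pattern.findall(escaped))
```

On the first input this finds both literals. On the second, the lazy match stops at the escaped quote, so this form seems best suited to strings that contain no \" sequences.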

Navand