I am using regexes to extract height, width, and depth from the 'dimensions' column.

How can I get the regex to return an empty string if there is no depth? And can I use a single regex for all of the rows?

import re

# One pattern per row so far; df is the DataFrame shown in the table below
extract0 = re.findall(r'\d{2}', df.iat[0,0])
extract1 = re.findall(r'\d{2},?\d*', df.iat[1,0])
extract2 = re.findall(r'\d+\.\d', df.iat[2,0])
extract3 = re.findall(r'(?<=\()(\d+\.\d)(?:[\s\W])*(\d+\.\d)(?=\scm\)$)', df.iat[3,0])[0]
extract4 = re.findall(r'\d', df.iat[4,0])

EDIT:

Here is the table

| | dimensions | height | width | depth |
|---|---|---|---|---|
| 0 | 19×52cm | 19.0 | 52.0 | NaN |
| 1 | 50 x 66,4 cm | 50.0 | 66.4 | NaN |
| 2 | 168.9 x 274.3 x 3.8 cm (66 1/2 x 108 x 1 1/2 in.) | 168.9 | 274.3 | 3.8 |
| 3 | Sheet: 16 1/4 × 12 1/4 in. (41.3 × 31.1 cm) Image: 14 × 9 7/8 in. (35.6 × 25.1 cm) | 35.6 | 25.1 | NaN |
| 4 | 5 by 5in | 12.7 | 12.7 | NaN |
MR MDOX
  • You don't need to rely on regexes to do everything. If you try to combine lots of them into one, you will have horrible problems if anything needs to be adjusted. Keep things simple and let the compiler take care of optimisations. It looks like you can simply check the "depth" value in the table to find out if the depth is given: [How do I parse a string to a float or int?](https://stackoverflow.com/questions/379906/how-do-i-parse-a-string-to-a-float-or-int) – Andrew Morton Jun 21 '21 at 19:59
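In the spirit of keeping things simple, here is a minimal sketch of one possible approach, not a definitive answer. It uses a single pattern that matches two or three numbers separated by `x`, `×`, or `by` and followed by a unit, then does the remaining work in plain Python. The names `parse_dimensions`, `DIMS`, `NUM`, `SEP`, and `CM_PER_INCH` are hypothetical. The sketch assumes that a centimetre group is preferred over an inch group, that the last matching group is the one wanted (as in row 3, where the "Image" values are used), and that inch-only rows are converted to centimetres (as in row 4). Fractional inch values such as `16 1/4` are deliberately not parsed.

```python
import re

# A number with an optional decimal part written with "." or ","
NUM = r"\d+(?:[.,]\d+)?"
# Separator between numbers: "x", "×" or "by", with optional spaces
SEP = r"\s*(?:x|×|by)\s*"
# Two numbers, an optional third (depth), then the unit
DIMS = re.compile(rf"({NUM}){SEP}({NUM})(?:{SEP}({NUM}))?\s*(cm|in)", re.IGNORECASE)

CM_PER_INCH = 2.54


def parse_dimensions(text):
    """Return (height, width, depth) in cm; missing values become ''."""
    matches = list(DIMS.finditer(text))
    if not matches:
        return ("", "", "")

    # Assumption: prefer groups given in cm, and take the last one found.
    cm_matches = [m for m in matches if m.group(4).lower() == "cm"]
    chosen = (cm_matches or matches)[-1]
    scale = 1.0 if chosen.group(4).lower() == "cm" else CM_PER_INCH

    values = []
    for raw in chosen.group(1, 2, 3):
        if raw is None:
            # The optional depth group did not take part in the match.
            values.append("")
        else:
            values.append(round(float(raw.replace(",", ".")) * scale, 2))
    return tuple(values)


print(parse_dimensions("19×52cm"))
print(parse_dimensions("168.9 x 274.3 x 3.8 cm (66 1/2 x 108 x 1 1/2 in.)"))
```

With a DataFrame like the one in the question, `df["dimensions"].apply(parse_dimensions)` should give one such tuple per row. Returning `None` or `float("nan")` instead of `""` may be more convenient if the columns are meant to stay numeric.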

0 Answers