I import a csv data file into a pandas DataFrame df with pd.read_csv. The text file contains a column with strings like these:

y
0.001
0.0003
0.0001
3e-05
1e-05
1e-06

If I print the DataFrame, pandas outputs the decimal representation of these values with 6 digits after the decimal point, and everything looks good.

When I try to select rows by value, like here:

df[df['y'] == value],

by typing the decimal representation of value, pandas correctly matches some values (for example, rows 0, 2, 4) but does not match others (rows 1, 3, 5). This is of course because those rows' values do not have an exact representation in base two.

I was able to work around this problem in this way:

df[abs(df['y']/value-1) <= 0.0001]

but it seems somewhat awkward. What I'm wondering is: numpy already has a method, .isclose, that is specifically for this purpose.

Is there a way to use .isclose in a case like this? Or a more direct solution in pandas?

germ

1 Answer

Yes, you can use NumPy's isclose:

df[np.isclose(df['y'], value)]
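For completeness, here is a small self-contained sketch of the same idea. The inline CSV text is a hypothetical stand-in for the file in the question, and the choice of rtol=1e-4 with atol=0 is an assumption made to mirror the relative-error workaround above, not a recommended setting. Note that np.isclose tests |a - b| <= atol + rtol * |b|, and its default atol of 1e-08 is not negligible next to values as small as 1e-06, so it may be worth setting explicitly.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV file described in the question.
csv_text = "y\n0.001\n0.0003\n0.0001\n3e-05\n1e-05\n1e-06\n"
df = pd.read_csv(io.StringIO(csv_text))

value = 0.0003

# Tolerant comparison: |y - value| <= atol + rtol * |value|.
# atol=0.0 makes the test purely relative, like the manual workaround;
# both tolerances here are illustrative assumptions.
mask = np.isclose(df["y"], value, rtol=1e-4, atol=0.0)

# np.isclose returns a boolean NumPy array, which can index the DataFrame.
selected = df[mask]
print(selected)
```

Because the mask is an ordinary boolean array, it can be combined with other conditions using & and | in the usual way.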
Mike Graham