Selecting rows by value in a floating point column in pandas

Question

I import a csv data file into a pandas DataFrame df with pd.read_csv. The text file contains a column with strings like these:

y
0.001
0.0003
0.0001
3e-05
1e-05
1e-06

If I print the DataFrame, pandas outputs the decimal representation of these values with six digits after the decimal point, and everything looks good.

When I try to select rows by value, like here:

df[df['y'] == value],

by typing the corresponding decimal representation of value, pandas correctly matches certain values (for example, rows 0, 2, 4) but does not match others (rows 1, 3, 5). This is presumably because these decimals have no exact representation in base two, so the parsed value in those rows is not bit-for-bit equal to the float I type.

I was able to work around this problem in this way:

df[abs(df['y']/value-1) <= 0.0001]

but it seems somewhat awkward. numpy already has a function, isclose, that exists specifically for this purpose.

Is there a way to use .isclose in a case like this? Or a more direct solution in pandas?

score 5 · Answer 1 · answered Feb 13 '16 at 06:15

Yes, you can use numpy's isclose:

df[np.isclose(df['y'], value)]
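
For a fuller picture, here is a minimal self-contained sketch of the same idea. The inline CSV text and the names csv_text, mask and selected are stand-ins made up for illustration, and the tolerances are only one reasonable choice, not a recommendation from the original post. np.isclose(a, b) tests |a - b| <= atol + rtol * |b|, with defaults rtol=1e-05 and atol=1e-08, so for a column holding values as small as 1e-06 it is probably worth setting atol explicitly.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV file described in the question.
csv_text = "y\n0.001\n0.0003\n0.0001\n3e-05\n1e-05\n1e-06\n"
df = pd.read_csv(io.StringIO(csv_text))

value = 0.0003

# Boolean mask: True where y is within a relative tolerance of value.
# atol=0.0 switches off the absolute term, which would otherwise
# dominate the comparison for very small numbers.
mask = np.isclose(df["y"], value, rtol=1e-9, atol=0.0)
selected = df[mask]
print(selected)

# Why atol matters here: with the default atol=1e-08, two values
# near 1e-06 that differ by 0.5% are still reported as close.
loose = np.isclose(1.005e-06, 1e-06)
strict = np.isclose(1.005e-06, 1e-06, atol=0.0)
print(loose, strict)
```

With atol=0.0 the test is purely relative, which is essentially what the abs(df['y']/value-1) workaround in the question does, only with a tighter tolerance.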

Mike Graham
