
I'm trying to replace values in a column to NaN. I normally use

imputed_data_x = imputed_data_x.replace(0, np.nan)

But my problem is that my values are not exactly 0; some are 0.01111, etc. How can I replace all values in a data frame that are less than 1?

I tried imputed_data_x = imputed_data_x.replace(>1, np.nan)

But it didn't work. I'm curious to see if I can use replace to do this or do I need a different command for conditions?

Lostsoul
  • np.where() may help here – Chris Jul 08 '20 at 18:07
  • `imputed_data_x.mask(imputed_data_x.lt(1))` ? – anky Jul 08 '20 at 18:08
  • Does this answer your question? [How to select rows from a DataFrame based on column values?](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) – Dan Jul 09 '20 at 09:58
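The two comment suggestions can be sketched like this (the data frame here is made up for illustration; the question never shows the real columns):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for imputed_data_x
imputed_data_x = pd.DataFrame({"a": [0.01111, 2.5, 0.0], "b": [3.0, 0.7, 4.2]})

# anky's suggestion: mask() replaces values with NaN wherever the condition is True
masked = imputed_data_x.mask(imputed_data_x.lt(1))

# Chris's suggestion: np.where() builds a plain ndarray from the same condition
arr = np.where(imputed_data_x < 1, np.nan, imputed_data_x)
```

Note that `mask` returns a new DataFrame (the original is untouched), while `np.where` returns a NumPy array, so you lose the index and column labels.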

2 Answers


Use standard boolean indexing:

imputed_data_x[imputed_data_x < 1] = np.nan
Dan

DataFrame.replace is only for replacing fixed values. In your case you want to replace a value when it is "close" to 0, which you can express as a predicate function. The API call that replaces a value where the predicate returns False (and keeps it where it returns True) is

imputed_data_x = imputed_data_x.where(lambda x: x >= 1, np.nan)
maow
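A runnable sketch of this answer, with a made-up frame standing in for the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for imputed_data_x
imputed_data_x = pd.DataFrame({"a": [0.01111, 2.5], "b": [0.9, 3.0]})

# where() keeps values for which the predicate is True and
# substitutes np.nan (the `other` argument) everywhere else
imputed_data_x = imputed_data_x.where(lambda x: x >= 1, np.nan)
```

Since NaN is the default for `other`, `imputed_data_x.where(lambda x: x >= 1)` is equivalent.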