Delete rows that contain any string, in a certain column

Question

I have a huge data frame, and I want to calculate the correlation between some columns. The problem is that those columns contain strings here and there, which prevents the calculation. How can I delete the rows that contain strings in one particular column? Note: I don't know what the strings look like, there are many of them, and I just need a line of code that deletes them all at once.

This is what I tried; it didn't work for some reason:

df = df[df['column_name'].apply(lambda x: str(x).isdigit())]

Does [this post](https://stackoverflow.com/a/28680078/7375347) answer your question? — tax evader, May 22 '22 at 18:05
Do you know how to check an object's type? See [What's the canonical way to check for type in Python?](/q/152580/4518341) — wjandrea, May 22 '22 at 18:07
It'd help if you provided a [reproducible pandas example](/q/20109391/4518341). See also [mre]. — wjandrea, May 22 '22 at 18:09
@taxevader No, because that post talks about one particular string. I have several, and I can't start looking for them because the data is too large — L0987, May 22 '22 at 18:10
@L0987 No, not the Pandas dtype, the Python type. That is, a column of type `object` might have objects of any Python type inside it, like `str`, `int`, `list`, etc. — wjandrea, May 22 '22 at 18:11
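To illustrate the comment above, here is a minimal sketch of filtering on the Python type of each value rather than on the column's dtype. The sample data and the `other` column are made up for illustration; only `column_name` comes from the question.

```python
import pandas as pd

# Hypothetical sample data: an object column mixing numbers and strings.
df = pd.DataFrame({
    "column_name": [1, 2.5, "abc", 4, "n/a"],
    "other": [10, 20, 30, 40, 50],
})

# True for every row whose value is NOT a Python str.
mask = df["column_name"].apply(lambda x: not isinstance(x, str))

# Keep only the non-string rows.
df_clean = df[mask]
print(df_clean)
```

Two caveats with this sketch: a number stored as text (for example `"3"`) is a `str` and would be dropped too, and the column keeps its `object` dtype, so it would likely need something like `.astype(float)` before a correlation is computed.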
@L0987 OK, I've closed your question as a duplicate then. LMK if anything's unclear. — wjandrea, May 22 '22 at 18:16
@L0987 Actually, I should probably clarify up front: you already know how to use all the Pandas stuff properly, you just need to change what the lambda does. — wjandrea, May 22 '22 at 18:19
@L0987 Oh I see. I think you can convert the column to a numeric type using [pd.to_numeric](https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.to_numeric.html): `df['column_name'] = pd.to_numeric(df['column_name'], errors='coerce')`. This turns any value that can't be converted to a numeric type into `NaN`, and then all you need to do is filter out the rows with NaN in that column with `df = df[df['column_name'].notna()]` — tax evader, May 22 '22 at 18:40
@taxevader Any idea how to do the opposite, i.e. turn numeric values into NaN? Just out of curiosity. — L0987, May 23 '22 at 08:11
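The `pd.to_numeric` suggestion can be sketched end to end as follows. The sample values and the `other` column are assumptions made up for the example; the two pandas calls are the ones named in the comment.

```python
import pandas as pd

# Hypothetical sample data: numbers, numeric text, and arbitrary strings.
df = pd.DataFrame({
    "column_name": [1, "2.5", "abc", -4, "n/a"],
    "other": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Anything that cannot be parsed as a number becomes NaN.
df["column_name"] = pd.to_numeric(df["column_name"], errors="coerce")

# Drop the rows where the conversion produced NaN.
df = df[df["column_name"].notna()]

# The column is now numeric, so a correlation can be computed.
print(df)
print(df["column_name"].corr(df["other"]))
```

Unlike the `str(x).isdigit()` attempt in the question, this keeps decimals and negative numbers: `'2.5'.isdigit()` and `'-4'.isdigit()` are both `False`, which may be why that attempt removed more rows than intended. Note that rows that were already `NaN` in that column are dropped as well.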
@L0987 Hey, I just realized I might have misunderstood what you're trying to accomplish, but it looks like Tax Evader already got it :) Sorry, I'm still learning Pandas myself. There's another existing question about that, so I tacked it on :) — wjandrea, May 23 '22 at 18:03
@L0987 If you want to filter out the rows with a numeric value in the column and retain the rows with a non-numeric value, you can use the `pd.to_numeric` function, but assign the result to a different column so it doesn't overwrite the existing one: `df['column_numeric'] = pd.to_numeric(df['column_name'], errors='coerce')`. Then keep only the rows that have NaN in the `column_numeric` column by using `notna` with the inverse `~` operator: `df[~df['column_numeric'].notna()]` — tax evader, May 24 '22 at 06:22
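A rough sketch of the inverse filter described above, again on made-up sample data. The last step, which masks the numeric values with `NaN` via `Series.where`, is an addition not mentioned in the comments and is only one possible reading of "turn numeric values to NaN".

```python
import pandas as pd

# Hypothetical sample data: numbers, numeric text, and arbitrary strings.
df = pd.DataFrame({"column_name": [1, "2.5", "abc", -4, "n/a"]})

# Parse into a helper column so the original values are preserved.
df["column_numeric"] = pd.to_numeric(df["column_name"], errors="coerce")

# Rows whose value could NOT be parsed as a number.
# `~notna()` selects the same rows as `isna()`.
non_numeric = df[~df["column_numeric"].notna()]
print(non_numeric)

# Alternative: keep every row, but replace the numeric values with NaN.
masked = df["column_name"].where(df["column_numeric"].isna())
print(masked)
```

One caveat: a value that was already `NaN` in `column_name` also ends up as `NaN` in `column_numeric`, so it would be counted as non-numeric here.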

