I'm curious whether I need to change anything in my code for stability.

In this scenario, adding a column with the text "insert" for each row raises a SettingWithCopyWarning:

input

import pandas as pd

prev_df = pd.DataFrame(data = {'id':[0],'feature':['y']})
curr_df = pd.DataFrame(data = {'id':[0,1],'feature':['y','n']})

df = curr_df[~curr_df['id'].isin(prev_df['id'])]
df['change_type'] = 'insert'

output

<input>:7: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.

Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

In contrast, this scenario does not raise a SettingWithCopyWarning:

import pandas as pd

df = pd.DataFrame(data = {'id':[0,1],'feature':['y','n']})
df['change_type'] = 'insert'

The first scenario is a reduced example from a larger script I use at work, which flags different kinds of changes to the same datasets over time (an insertion is shown here as one kind of change that might happen).
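For reference, here is a minimal sketch of the kind of change I have in mind. It assumes the filtered rows are meant to be an independent frame and that curr_df should stay unchanged. The name df_alt is only introduced here for illustration, and I am not certain this is the recommended approach:

```python
import pandas as pd

prev_df = pd.DataFrame(data={'id': [0], 'feature': ['y']})
curr_df = pd.DataFrame(data={'id': [0, 1], 'feature': ['y', 'n']})

# Assumption: the filtered result should be its own DataFrame, so take an
# explicit copy before adding a column to it.
df = curr_df[~curr_df['id'].isin(prev_df['id'])].copy()
df['change_type'] = 'insert'

# Alternative sketch: assign() returns a new DataFrame with the extra column
# and leaves curr_df untouched.
df_alt = curr_df.loc[~curr_df['id'].isin(prev_df['id'])].assign(change_type='insert')
```

Both variants leave curr_df without a change_type column, which is the behavior I want in the larger script.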

  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – Laurent May 13 '22 at 17:05