Just curious whether I need to change anything in my code for stability.
In this scenario, adding a column with the text "insert" for each row raises a SettingWithCopyWarning:
input
import pandas as pd
prev_df = pd.DataFrame(data = {'id':[0],'feature':['y']})
curr_df = pd.DataFrame(data = {'id':[0,1],'feature':['y','n']})
df = curr_df[~curr_df['id'].isin(prev_df['id'])]
df['change_type'] = 'insert'
output
<input>:7: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
In contrast, this scenario does not raise a SettingWithCopyWarning:
import pandas as pd
df = pd.DataFrame(data = {'id':[0,1],'feature':['y','n']})
df['change_type'] = 'insert'
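For comparison, here is a minimal sketch of the first scenario with an explicit .copy() after the filter. The difference between the two scenarios seems to be that the first df is derived from curr_df by boolean indexing, while the second is built directly. The name new_rows and the assign variant are only illustrative, and this assumes the filtered frame is meant to be independent of curr_df.

```python
import pandas as pd

prev_df = pd.DataFrame(data={'id': [0], 'feature': ['y']})
curr_df = pd.DataFrame(data={'id': [0, 1], 'feature': ['y', 'n']})

# Rows of curr_df whose id does not appear in prev_df
new_rows = ~curr_df['id'].isin(prev_df['id'])

# Taking an explicit copy makes df an independent DataFrame,
# so adding a column to it is unambiguous
df = curr_df[new_rows].copy()
df['change_type'] = 'insert'

# Alternative: assign returns a new DataFrame with the extra column
df_alt = curr_df[new_rows].assign(change_type='insert')
```

In both variants curr_df itself is left without a change_type column.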
The first scenario is a simplified example from a larger script used at work, which flags different kinds of changes to the same datasets over time (an insertion is shown here as one kind of change that might occur).
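As a rough sketch of that kind of flagging, assuming rows are matched on id only, a left merge with indicator=True can mark rows that exist only in the current data. The names merged and inserts are hypothetical and not taken from the actual script.

```python
import pandas as pd

prev_df = pd.DataFrame(data={'id': [0], 'feature': ['y']})
curr_df = pd.DataFrame(data={'id': [0, 1], 'feature': ['y', 'n']})

# Left merge on id; the _merge column is 'left_only' for ids
# that are present in curr_df but not in prev_df
merged = curr_df.merge(prev_df[['id']], on='id', how='left', indicator=True)

# Keep only the new rows and label them; assign returns a new DataFrame
inserts = (
    merged.loc[merged['_merge'] == 'left_only', ['id', 'feature']]
    .assign(change_type='insert')
)
```

Because the result comes from .loc followed by assign, no column is ever set on a filtered slice.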