
I am trying to write a method within a class that takes a DataFrame as input, performs some manipulation, and returns the result as a new DataFrame. However, for some reason the calculation changes the existing DataFrame as well, and I can't understand why. Below is an example (not my actual function):

class format_data():
    def __init__(self):
        pass
    def custom_func(self,df):
        new_df=df
        new_df.iloc[1,1]='1000'
        return new_df

For example, my original df is

import pandas as pd

data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}
df = pd.DataFrame(data)

If I run

format_data().custom_func(df)

the output is

     Name   Age
0     Tom    20
1  Joseph  1000
2   Krish    19
3    John    18

But if I check the original df, it is now exactly the same as the new df.
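
A note on what is probably happening, offered as a sketch and not as a diagnosis of the real function: in Python, `new_df=df` does not create a new DataFrame. It only binds a second name to the same object, so the `iloc` assignment is visible through both names. Assuming an independent copy is what is wanted, calling `df.copy()` before modifying should leave the original untouched. The example reuses the class and data from the question; the only changes are the `copy()` call and writing the integer 1000 instead of the string '1000', which is an assumption made here to keep the Age column numeric.

```python
import pandas as pd


class format_data():
    def __init__(self):
        pass

    def custom_func(self, df):
        # Assumption: the caller's DataFrame should stay unchanged,
        # so work on an independent copy instead of a second name for df.
        new_df = df.copy()
        # Same position as in the question (row 1, column 1), but with an
        # integer so the Age column keeps its numeric dtype (assumption).
        new_df.iloc[1, 1] = 1000
        return new_df


data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}
df = pd.DataFrame(data)

result = format_data().custom_func(df)

print(result is df)       # False: two separate objects
print(df.iloc[1, 1])      # 21: the original is unchanged
print(result.iloc[1, 1])  # 1000: only the copy was modified
```

`DataFrame.copy()` defaults to `deep=True`, so the returned frame has its own data and index. Whether copying on every call is acceptable depends on the size of the real data, which the question does not state.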

eyllanesc
Felton Wang
