2

My dataset is in this form:

import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

I want to update the dataframe in the following way:

df = pd.DataFrame({'ID': [1,2,3,4],
                   'Type': ['A', 'B1', 'B2', 'B3'],
                   'Value': [100, 200, 201, 120]}) 

The code I tried was:

df[df['Type'] == 'B', df['Value'] == 200] = 'B1'

But I'm getting this error:

ValueError: Cannot reindex from a duplicate axis

Can someone please help me solve the problem?

Thanks!

Luke
Beta

3 Answers

1

Try this instead:

df.loc[df['Type'].eq('B') & df['Value'].eq(200), 'Type'] = 'B1'
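For context, the line in the question passes a tuple of two boolean Series as a single key to `df[...]`, which is not how pandas combines conditions. A minimal, self-contained sketch of the `.loc` approach, using the sample data from the question (the variable name `mask` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

# Each comparison yields a boolean Series; & combines them element-wise.
mask = df['Type'].eq('B') & df['Value'].eq(200)

# .loc[row_mask, column] assigns only to the selected rows of that one column.
df.loc[mask, 'Type'] = 'B1'

print(df['Type'].tolist())  # ['A', 'B1', 'B', 'B']
```

Note that this relabels only the row whose Value is 200. The remaining B rows stay 'B', so reaching B2 and B3 this way would need one assignment per row.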
U12-Forward
1

You can use:

df['Type'] = df['Type'].mask(df['Type'].eq('B'),
                             df['Type'] + df.groupby('Type').cumcount().add(1).astype(str)
                            )
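To see what each piece does, here is a rough step-by-step sketch of the same idea on the question's sample data. The intermediate names `counter` and `is_b` are only for illustration and are not part of the original one-liner:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4],
                   'Type': ['A', 'B', 'B', 'B'],
                   'Value': [100, 200, 201, 120]})

# Position of each row within its 'Type' group (0-based), shifted to start at 1.
counter = df.groupby('Type').cumcount().add(1).astype(str)
print(counter.tolist())  # ['1', '1', '2', '3']

# mask() replaces values where the condition is True and keeps the rest as-is,
# so 'A' is left alone even though it also has a counter value.
is_b = df['Type'].eq('B')
df['Type'] = df['Type'].mask(is_b, df['Type'] + counter)

print(df['Type'].tolist())  # ['A', 'B1', 'B2', 'B3']
```

Because the counter is computed per group, the same pattern should extend to numbering several types at once by widening the condition, for example with `isin`.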
mozway
  • @mozway: Thanks a lot for your answer. It worked perfectly. But U12-Forward's answer is also correct, and he shared it first, so I will accept his answer. – Beta Sep 07 '21 at 11:10
  • @Beta sure, the most important thing is that you got your answer ;) – mozway Sep 07 '21 at 11:12
1

If you need to convert all B values by appending a counter starting at 1, use np.arange with the number of True values in the mask (counted with sum), and join the two with +:

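import numpy as np
m = df['Type'] == 'B'  # boolean mask of the rows to rename
# m.sum() counts the True values, so arange yields 1..n, converted to strings
df.loc[m, 'Type'] += np.arange(1, m.sum() + 1).astype(str)
print(df)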

   ID Type  Value
0   1    A    100
1   2   B1    200
2   3   B2    201
3   4   B3    120
jezrael