In Excel I would have used INDEX/MATCH for this. I have a data frame:

df = pd.DataFrame(np.random.randn(200,5), columns=['apple','pear','orange','mango','banana'])

    apple      pear    orange     mango    banana
0   -1.162567  0.488261  1.716845 -1.375144 -0.510948
1   -0.344498 -1.096802 -0.544039 -0.106573 -0.316679
2    0.097983 -0.313277  0.572100 -0.176696 -0.574828
3   -1.300936 -2.749289 -0.065648  1.072607  2.099388
4    0.956781 -1.036766  0.794087  1.962683 -2.087505
5   -2.619787  1.024262  1.025925 -0.763013  0.942017
...

I also have a list of 200 items: ['apple','orange','mango','mango','pear', ...]. How do I iterate over the rows of df and, for each row, get the value from the column named at the same position in the list? Desired output:

      values
0  -1.162567
1  -0.544039
2  -0.176696
3   1.072607
4  -1.036766
...
Candice

1 Answer

Use lookup. The list must have the same length as df, and every value in the list has to be one of the column names:

L = ['apple','orange','mango','mango','pear', 'banana']

df['values'] = df.lookup(df.index, L)
print (df)
      apple      pear    orange     mango    banana    values
0 -1.162567  0.488261  1.716845 -1.375144 -0.510948 -1.162567
1 -0.344498 -1.096802 -0.544039 -0.106573 -0.316679 -0.544039
2  0.097983 -0.313277  0.572100 -0.176696 -0.574828 -0.176696
3 -1.300936 -2.749289 -0.065648  1.072607  2.099388  1.072607
4  0.956781 -1.036766  0.794087  1.962683 -2.087505 -1.036766
5 -2.619787  1.024262  1.025925 -0.763013  0.942017  0.942017
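
DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the call above fails on current versions. A minimal sketch of one possible replacement follows, using NumPy integer indexing. It is not part of the original answer; the small stand-in frame and the name col_pos are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in frame with predictable values (illustrative only, not the asker's data).
df = pd.DataFrame(
    np.arange(30, dtype=float).reshape(6, 5),
    columns=['apple', 'pear', 'orange', 'mango', 'banana'],
)
L = ['apple', 'orange', 'mango', 'mango', 'pear', 'banana']

# Translate each column name in L into its integer position in df.columns.
col_pos = df.columns.get_indexer(L)

# get_indexer returns -1 for names that are not columns, so fail loudly.
if (col_pos == -1).any():
    raise KeyError("some names in L are not columns of df")

# Pair row i with column col_pos[i] and pick one value per row.
df['values'] = df.to_numpy()[np.arange(len(df)), col_pos]
print(df)
```

The same requirements apply as with lookup: L needs one entry per row, and each entry must be an existing column name.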
jezrael
Great, thanks! I just realised there is a "u" in front of each item in my list, but I should be able to get rid of it. After that it should work. – Candice Jul 12 '18 at 12:14