0

Hello i am trying to create new columns in a dataframe, existing columns one of the column values are matching with column name. and wanted check with other column values which is having any values or zero i tried below solution did not get my output

Python: pandas: match row value to column name/ key's value

my dataframe which is looking like

loc     Var A   T   C   G
3985    T   0   0   1   0
4491    G   10  0   0   25
4492    G   0   1   0   0
5265    A   35  10  0   1
7328    G   0   0   20  1
6927    A   1   0   0   0

output what i wanted

loc     Var A   T   C   G  REF ALT
3985    T   0   0   1   0    0   1
4491    G   10  0   0   25   25  10
4492    G   0   1   0   0    0   1
5265    A   35  10  0   0    35  10
7328    G   0   0   20  1    1   20
6927    A   1   0   0   0    1   0

column "REF" columns match the values of "Var" and the ALT column which will search with the otherthan Var value, rest of the columns any one have value, i just wanted to match between four column(A,T, G, C) to get the other two column in dataframe

any suggestions? help

Erfan
  • 36,997
  • 6
  • 53
  • 68

1 Answers1

2

You can use DataFrame.lookup first to get your REF column. Then we create ALT column by getting the max value out of all the other values in the same row:

l1 = df.lookup(df.index, df['Var'])
l2 = df.iloc[:, 2:].apply(list, axis=1)

df['REF'] = l1
df['ALT'] = [max(set([x]) ^ set(y)) for x, y in zip(l1, l2)]

    loc Var   A   T   C   G  REF  ALT
0  3985   T   0   0   1   0    0    1
1  4491   G  10   0   0  25   25   10
2  4492   G   0   1   0   0    0    1
3  5265   A  35  10   0   1   35   10
4  7328   G   0   0  20   1    1   20
5  6927   A   1   0   0   0    1    0
Erfan
  • 36,997
  • 6
  • 53
  • 68