This seems simple, but it's throwing me for a loop. Coding pun intended.
I have a dataframe with the following format:
import pandas as pd

df = pd.DataFrame({"chrom":[12,12],
"Pos":[112233,112234],
"ref_base":["A","G"],
"alt_base":["T","C"],
"A":[12,22],
"T":[3,34],
"G":[23,23],
"C":[22,21]},
index=[0,1])
chrom  Pos     ref_base  alt_base  A   T   G   C
12     112233  A         T         12  3   23  22
12     112234  G         C         22  34  23  21
I need to create a new column that holds, for each row, the value from whichever of the A, T, G, or C columns is named in that row's ref_base column:
chrom  Pos     ref_base  alt_base  A   T   G   C   ref_val
12     112233  A         T         12  3   23  22  12
12     112234  G         C         22  34  23  21  23
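For reference, a minimal sketch of one way this per-row lookup might be done. It is only one possible approach, not necessarily the best one. It assumes every value in ref_base is one of the four column names A, T, G, C; the names `bases`, `counts`, and `col_idx` are made up for illustration, and the sample values are taken from the printed table above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the printed table above.
df = pd.DataFrame({
    "chrom": [12, 12],
    "Pos": [112233, 112234],
    "ref_base": ["A", "G"],
    "alt_base": ["T", "C"],
    "A": [12, 22],
    "T": [3, 34],
    "G": [23, 23],
    "C": [22, 21],
})

bases = ["A", "T", "G", "C"]  # hypothetical helper list of the count columns

# 2-D array of the four count columns, in the order given by `bases`.
counts = df[bases].to_numpy()

# For each row, the position (0..3) of the column named in ref_base.
# Assumption: every ref_base value is in `bases`; get_indexer returns -1
# for anything else, which would silently select the last column.
col_idx = pd.Index(bases).get_indexer(df["ref_base"])

# Pick one element per row: row i, column col_idx[i].
df["ref_val"] = counts[np.arange(len(df)), col_idx]

print(df[["ref_base", "ref_val"]])
```

This avoids a Python-level loop over rows, which may matter on large frames; `df.apply(lambda row: row[row["ref_base"]], axis=1)` is a simpler but typically slower alternative.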
What I'm ultimately trying to do is create a column containing a tuple of (ref_val, alt_val), where alt_val is the same kind of lookup keyed on alt_base. If there's a better way to do that than creating the individual columns first and joining them, I'd be grateful to learn it.
chrom  Pos     ref_base  alt_base  A   T   G   C   AD
12     112233  A         T         12  3   23  22  (12,3)
12     112234  G         C         22  34  23  21  (23,21)
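A sketch of how the AD tuple column might be built directly, without storing intermediate columns. The helper `lookup` is a hypothetical name introduced here, not a pandas function, and the same assumption applies: every ref_base and alt_base value must be one of A, T, G, C.

```python
import numpy as np
import pandas as pd

# Sample data copied from the printed table above.
df = pd.DataFrame({
    "chrom": [12, 12],
    "Pos": [112233, 112234],
    "ref_base": ["A", "G"],
    "alt_base": ["T", "C"],
    "A": [12, 22],
    "T": [3, 34],
    "G": [23, 23],
    "C": [22, 21],
})

bases = ["A", "T", "G", "C"]


def lookup(frame, key_col, value_cols):
    """Hypothetical helper: for each row, return the value from the
    column in `value_cols` whose name is stored in `key_col`."""
    values = frame[value_cols].to_numpy()
    idx = pd.Index(value_cols).get_indexer(frame[key_col])
    if (idx < 0).any():
        # Fail loudly instead of silently indexing the last column.
        raise ValueError(f"{key_col} contains values outside {value_cols}")
    return values[np.arange(len(frame)), idx]


ref_vals = lookup(df, "ref_base", bases)
alt_vals = lookup(df, "alt_base", bases)

# Pair the two lookups row by row; .tolist() yields plain Python ints.
df["AD"] = list(zip(ref_vals.tolist(), alt_vals.tolist()))

print(df[["ref_base", "alt_base", "AD"]])
```

Storing tuples gives the column object dtype, so if the two numbers are needed for later arithmetic, keeping them as two separate integer columns may be more practical.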