3

Guys,

For some reason, I have to store a np.array in each cell of a single DataFrame column. It looks like this:

A           B        C
1       [1,2]        0
2         [4]        0
3   [1,2,5,6]        0
7     [2,5,6]        0
4         [8]        0

Is there a way to set column C based on the length of each array in column B without iterating over the rows? E.g. if len(B) == 2 or len(B) == 4, then C = 1, else C = -1. I would then expect:

A           B        C
1       [1,2]        1
2         [4]       -1
3   [1,2,5,6]        1
7     [2,5,6]       -1
4         [8]       -1

Thanks so much.

WangYang.Cao

2 Answers

4

Use numpy.where with a condition built from Series.str.len (which returns the length of each element) and Series.isin:

df['C'] = np.where(df['B'].str.len().isin([2, 4]), 1, -1)

print(df)
   A             B  C
0  1        [1, 2]  1
1  2           [4] -1
2  3  [1, 2, 5, 6]  1
3  7     [2, 5, 6] -1
4  4           [8] -1
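
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes column B holds NumPy arrays, as described in the question; the way the frame is built is only an illustration, not taken from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; each cell of B holds a NumPy array (assumed setup).
df = pd.DataFrame({
    "A": [1, 2, 3, 7, 4],
    "B": [
        np.array([1, 2]),
        np.array([4]),
        np.array([1, 2, 5, 6]),
        np.array([2, 5, 6]),
        np.array([8]),
    ],
    "C": 0,
})

# .str.len() calls len() on every element, so it also works for lists and arrays.
lengths = df["B"].str.len()

# 1 where the length is 2 or 4, otherwise -1, with no explicit Python loop.
df["C"] = np.where(lengths.isin([2, 4]), 1, -1)

print(df)
```

The intermediate `lengths` Series is kept separate only to make the two steps visible; it can be inlined as in the one-liner above.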
Anton vBR
jezrael
0

Use .apply on column B (len works directly on both lists and NumPy arrays, so no .tolist() is needed):

df['C'] = df['B'].apply(lambda b: 1 if len(b) in (2, 4) else -1)
print(df)

Output:

   A          B  C
0  1      [1,2]  1
1  2        [4] -1
2  3  [1,2,5,6]  1
3  7    [2,5,6] -1
4  4        [8] -1

(If the DataFrame elements are strings such as '[1,2]', convert them first with ast.literal_eval.)
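
A rough sketch of that string case, assuming column B was read in as text such as '[1,2]' (for example from a CSV file); the sample frame here is made up for illustration:

```python
import ast

import pandas as pd

# Hypothetical frame where B holds string representations of lists.
df = pd.DataFrame({
    "A": [1, 2, 3, 7, 4],
    "B": ["[1,2]", "[4]", "[1,2,5,6]", "[2,5,6]", "[8]"],
})

# Safely parse each string into a real Python list.
df["B"] = df["B"].apply(ast.literal_eval)

# Now len() reflects the number of items, not the number of characters.
df["C"] = df["B"].apply(lambda b: 1 if len(b) in (2, 4) else -1)

print(df)
```

Without the conversion, len would count characters in the string, so '[4]' would have length 3 rather than 1.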

U12-Forward