I have two pandas DataFrames in Python.
The first DataFrame, tf_words, has shape (1, 2235) and looks like this:
0 1 2 3 4 5 6 ...... 2234
0 aa, aaa, aaaa, aaan, aaanu, aada, aadhyam, ... zindabad
The second DataFrame, tf1_bigram, has shape (4000, 34319) and contains bigrams with their occurrence counts in the dataset. It looks like this:
(a, en) (a, ha) (a, padam) (aa, aala) (aa, accountinte) (aa, adhamanaya) ...
1 0 0 1 0 0 ...
0 1 0 0 1 0 ...
0 0 1 0 0 1 ...
I need to compare tf_words with tf1_bigram, and the comparison should work as follows.
For example, the word 'aa' from tf_words matches one of the two words in each of the columns (aa, aala), (aa, accountinte) and (aa, adhamanaya) of tf1_bigram, so the values in those matching columns should be multiplied by 0.5.
Then check for 'aaa' and, if it is found, multiply the matching columns by 0.5;
then check for 'aaaa' and, if it is found, multiply the matching columns by 0.5;
then check for 'aaan' and, if it is found, multiply the matching columns by 0.5;
and so on, up to the last word 'zindabad' (column no. 2234).
The resulting tf1_bigram should look like this:
(a, en) (a, ha) (a, padam) (aa, aala) (aa, accountinte) (aa, adhamanaya) ...
1 0 0 0.5 0 0 ...
0 1 0 0 0.5 0 ...
0 0 1 0 0 0.5 ...
I have tried tf1_bigram.apply(lambda x: np.multiply(x * 0.5) if x.name in tf_words else x), but the output is not what I expected.
Any help would be appreciated.
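
To make the intended operation concrete, here is a minimal sketch of it on a tiny made-up sample. It rests on several assumptions that are not confirmed above: the bigram column labels are tuples of two words, a column counts as a match when at least one of its two words appears in tf_words, and a matching column is scaled by 0.5 only once even if both words match. The names words, matches, factors and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-ins for the two frames described above.
tf_words = pd.DataFrame([["aa", "aaa", "aaaa", "aaan", "zindabad"]])
bigrams = [("a", "en"), ("a", "ha"), ("a", "padam"),
           ("aa", "aala"), ("aa", "accountinte"), ("aa", "adhamanaya")]
tf1_bigram = pd.DataFrame(
    [[1, 0, 0, 1, 0, 0],
     [0, 1, 0, 0, 1, 0],
     [0, 0, 1, 0, 0, 1]],
    columns=pd.MultiIndex.from_tuples(bigrams),
)

# Collect the single row of words into a set for fast membership tests.
words = set(tf_words.iloc[0])

# Assumption: a column matches when at least one word of its bigram
# appears in the word list. Iterating the columns yields tuples.
matches = np.array([any(w in words for w in col) for col in tf1_bigram.columns])

# One factor per column: 0.5 for matching columns, 1.0 for the rest.
factors = np.where(matches, 0.5, 1.0)

# Multiply every row by the per-column factors (positional, column-wise).
result = tf1_bigram.mul(factors, axis="columns")
print(result)
```

Building one factor per column and multiplying once avoids looping over all 2235 words and rescanning the 34319 columns for each of them.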