0

I have a dataset with two columns, "Gene ID" and "GO term". Each gene ID can be linked to multiple GO terms and each GO term can be found in multiple gene IDs of the "Gene ID" column. i.e., there are multiple gene IDs and multiple GO terms in both columns

     Gene ID                  GO term
1    ENSRNA049995838                  
2  VIT_00s0131g00010        GO:0005544
3  VIT_00s0131g00010        GO:0005509
4  VIT_00s0131g00010        GO:0005737
5  VIT_00s0131g00010        GO:0005509
6  VIT_00s0181g00030                  
7  VIT_00s0181g00180        GO:0016021
8  VIT_00s0181g00180        GO:0009941
9  VIT_00s0181g00180        GO:0009522
10 VIT_00s0181g00180        GO:0016168
11 VIT_00s0181g00180        GO:0009523
12 VIT_00s0181g00180        GO:0018298
13 VIT_00s0181g00180        GO:0009768
14 VIT_00s0181g00180        GO:0009416
15 VIT_00s0181g00180        GO:0009535
16 VIT_00s0181g00180        GO:0010287
17 VIT_00s0181g00180        GO:0016020
18 VIT_00s0181g00180        GO:0009507
19 VIT_00s0181g00180        GO:0009536
20 VIT_00s0181g00180        GO:0015979
21 VIT_00s0181g00180        GO:0009765
22 VIT_00s0181g00180        GO:0009535
23 VIT_00s0181g00180        GO:0009579
24 VIT_00s0207g00010        GO:0005634
25 VIT_00s0207g00010        GO:0016747
26 VIT_00s0207g00010        GO:0009567
27 VIT_00s0207g00010        GO:0005737
28 VIT_00s0207g00010        GO:0016747
29 VIT_00s0371g00100        GO:0008270
30 VIT_00s0371g00100        GO:0016491

I want to transform the data to wide format, with each GO term as one variable, and all gene ID linked to this GO term will be listed in the second column of the same cell, and the third column will be count of the number of gene ID linked to this GO term, like below:

GO term       Gene ID                                   Number of Gene ID
GO:0005544   VIT_00s0131g00010                               1
GO:0016747   VIT_00s0207g00010, VIT_00s0207g00010            2 
GO:0005509   VIT_00s0131g00010, VIT_00s0131g00010            2

I tried tidyr's function spread(), and pivot.wider() and tried to convert the table from long to wide. However they did not produce results as I wanted like above.

Can somebody suggest what I should do? Thanks!

Alex Song
  • 21
  • 2

0 Answers0