I got the following data:

    S. lanosoniveum Simplicillium sp.  T. atroviride     E. weberi
0            GH79(36-358)      GH71(22-402)   GH18(36-328)   AA7(52-488)
1            GH76(38-215)       GH7(19-451)  PL7_4(30-249)  GH18(69-306)
2             GH7(19-451)      AA12(23-410)   GH55(28-741)  GH95(27-771)
3           GH5_5(68-345)     GH125(87-522)   GH79(23-433)  GH18(32-287)
4            AA12(22-411)      GH24(26-168)  GH18(179-528)  GH36(36-718)

I want to remove all the parentheses and the numbers inside them to get output like this:

    S. lanosoniveum Simplicillium sp.  T. atroviride     E. weberi
0              GH79              GH71           GH18           AA7
1              GH76               GH7          PL7_4          GH18
2               GH7              AA12           GH55          GH95
3             GH5_5             GH125           GH79          GH18
4              AA12              GH24           GH18          GH36

I tried the code:

df2 = df.replace("[^(]", "", regex=True)
print(df2)

and got that:

 S. lanosoniveum – UFV Simplicillium sp. T. atroviride E. weberi
0                       (                 (             (         (
1                       (                 (             (         (
2                       (                 (             (         (
3                       (                 (             (         (
4                       (                 (             (         (

What am I doing wrong?

Could anyone help me? A solution in Python or sed would be fine.

    Your code replaces every character except `(` with nothing. Try something like this instead: `df2 = df.replace(r"\([^)]+\)", "", regex=True)` – Michael Butscher May 20 '22 at 18:12
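
To see the difference between the two patterns without needing the full DataFrame, here is a minimal sketch using only the standard `re` module on a few cell values copied from the question. The variable names (`cells`, `attempted`, `suggested`) are made up for illustration. It assumes every cell has the form `NAME(start-end)`. The same pattern string is what the comment above passes to `df.replace(..., regex=True)`.

```python
import re

# A few sample cell values taken from the question's table.
cells = ["GH79(36-358)", "PL7_4(30-249)", "GH18(179-528)"]

# Attempted pattern: [^(] is a negated character class, so it matches
# every character that is NOT "(" and deletes it. Only "(" survives.
attempted = [re.sub(r"[^(]", "", c) for c in cells]

# Suggested pattern: a literal "(", then one or more characters that
# are not ")", then a literal ")". Only the parenthesised part is removed.
suggested = [re.sub(r"\([^)]+\)", "", c) for c in cells]

print(attempted)  # ['(', '(', '(']
print(suggested)  # ['GH79', 'PL7_4', 'GH18']
```

The key point is that `[^(]` inside square brackets means "any character except `(`", while `\(` outside brackets means "a literal opening parenthesis".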
