I have the following data frame:
| ID | VAR | CATEGORY |
|---|---|---|
| 1 | A | ANE |
| 1 | A | ANE |
| 1 | A | ANA |
| 1 | A | ANB |
| 1 | B | ANE |
| 2 | C | BOO |
| 2 | C | BOA |
| 2 | D | BOO |
| 3 | E | CAT |
| 3 | E | CAT |
| 4 | F | DOG |
| 4 | A | ANE |
| 4 | B | ANE |
| 4 | F | DOG |
| 4 | C | FUT |
| 4 | F | DOG |
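For reproducibility, the sample above can be built as follows. This is just the table transcribed by hand; the variable name `df` is an arbitrary choice.

```python
import pandas as pd

# Sample data, copied row by row from the table above.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4],
    "VAR": ["A", "A", "A", "A", "B", "C", "C", "D",
            "E", "E", "F", "A", "B", "F", "C", "F"],
    "CATEGORY": ["ANE", "ANE", "ANA", "ANB", "ANE", "BOO", "BOA", "BOO",
                 "CAT", "CAT", "DOG", "ANE", "ANE", "DOG", "FUT", "DOG"],
})
```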
The desired output for the above data frame is:
| ID | VAR | CATEGORY |
|---|---|---|
| 1 | A | ANE |
| 2 | C | BOO |
| 3 | E | CAT |
| 4 | F | DOG |
More specifically: for each ID (say 1), I want to find the most common value in the VAR column (here A), and then, among the rows with that value, the most common value in the CATEGORY column (here ANE), and so on for every ID.
How can I do this with pandas in Python? The table above is only a small sample; my real data frame contains 850,000 rows and 11,000 unique IDs.
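One possible way to express the two-step logic, as a minimal sketch rather than a definitive answer: count each (ID, VAR) pair, keep the most frequent VAR per ID, filter the rows to those pairs, and repeat the count for CATEGORY. The function name `most_common_var_and_category` is hypothetical. The tie-breaking rule is also an assumption: ties go to the value that appears first in the data, which is what yields BOO rather than BOA for ID 2 in the sample.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4],
    "VAR": ["A", "A", "A", "A", "B", "C", "C", "D",
            "E", "E", "F", "A", "B", "F", "C", "F"],
    "CATEGORY": ["ANE", "ANE", "ANA", "ANB", "ANE", "BOO", "BOA", "BOO",
                 "CAT", "CAT", "DOG", "ANE", "ANE", "DOG", "FUT", "DOG"],
})


def most_common_var_and_category(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: top VAR per ID, then top CATEGORY within that VAR."""
    # Step 1: count every (ID, VAR) pair. sort=False keeps the groups in
    # order of first appearance, so ties resolve to the earlier value.
    var_counts = (
        df.groupby(["ID", "VAR"], sort=False).size().reset_index(name="n")
    )
    # idxmax picks the first row holding the highest count within each ID.
    top_var = var_counts.loc[
        var_counts.groupby("ID")["n"].idxmax(), ["ID", "VAR"]
    ]

    # Step 2: keep only the rows whose VAR won for their ID.
    kept = df.merge(top_var, on=["ID", "VAR"], how="inner")

    # Step 3: apply the same counting to CATEGORY among the kept rows.
    cat_counts = (
        kept.groupby(["ID", "VAR", "CATEGORY"], sort=False)
        .size()
        .reset_index(name="n")
    )
    top = cat_counts.loc[
        cat_counts.groupby("ID")["n"].idxmax(), ["ID", "VAR", "CATEGORY"]
    ]
    return top.reset_index(drop=True)


result = most_common_var_and_category(df)
print(result)
```

The sketch uses only grouped counts and one merge, so it avoids calling a Python function once per ID, which is usually the slow part with thousands of groups. It has not been timed on the real 850,000-row data.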