
I have a table of country codes and their frequencies, and I would like to keep only the three rows with the highest Freq, together with their Var1 value. For the table below, the result should contain only AZ 520, then AE 488, then AU 399.

   Var1 Freq
1    AE  488
2    AR   12
3    AU  399
4    AW   56
5    AZ  520
6    BA    2
7    BB   84
8    BG  246
9    BH   85
10   BI    6




The table was created with:

as.data.frame(table(training.data.raw$destinationcountry))

1 Answer


I recreated your data as follows, assuming the column names are name and value:

library(dplyr)  # load first: tibble() is re-exported by dplyr

# tibble() replaces the deprecated data_frame()
training.data.raw <- tibble(name  = c("IN", "IS", "IT", "JO", "JP", "KZ", "MA", "MZ", "NG", "NO", "NZ", "PE", "PH", "PR", "RO", "RU", "SA", "SE", "SY", "TM", "TN", "TR", "UK", "US", "WS"),
                            value = c(999, 1, 1885, 1098, 2, 584, 858, 11, 10, 522, 193, 29, 2, 1, 1603, 353, 6, 2, 4, 33, 228, 3201, 852, 1363, 1))

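You can use the top_n() function from the dplyr package to keep the rows with the largest values (see the help file ?top_n for details). In dplyr 1.0.0 and later, top_n() is superseded by slice_max(), but it still works: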

library(dplyr)
# With no wt argument, top_n() ranks by the last column (here, value)
top_3 <- top_n(x = training.data.raw, n = 3)
top_3
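
Applied to the frequency table in the question, this might look like the sketch below. It assumes the columns are named Var1 and Freq, as in the as.data.frame(table(...)) output shown, and it adds arrange(desc(Freq)) because top_n() keeps the selected rows in their original order rather than sorting them.

```r
library(dplyr)

# Frequency table copied from the question
# (assumed to be the output of as.data.frame(table(...)))
freq <- data.frame(
  Var1 = c("AE", "AR", "AU", "AW", "AZ", "BA", "BB", "BG", "BH", "BI"),
  Freq = c(488, 12, 399, 56, 520, 2, 84, 246, 85, 6)
)

# Keep the 3 rows with the largest Freq, then sort them from largest to smallest
top_3 <- freq %>%
  top_n(n = 3, wt = Freq) %>%
  arrange(desc(Freq))
top_3
```

The rows should come out in the order asked for: AZ, then AE, then AU.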

EDIT BASED ON COMMENT: If name is a factor rather than a regular character vector, you can convert it to character first with mutate():

training.data.characters <- mutate(training.data.raw, name = as.character(name))

# Now top_n() will take it
# You can also explicitly set the wt argument to tell it to rank by value
top_3 <- top_n(x = training.data.characters, n = 3, wt = value)
top_3
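
If you would rather not depend on dplyr, a base R sketch along the following lines should give the same kind of result. It sorts on the numeric Freq column only, so it does not need the factor conversion. The destinationcountry vector here is a made-up stand-in for training.data.raw$destinationcountry, not your real data.

```r
# Hypothetical stand-in for training.data.raw$destinationcountry
destinationcountry <- c(rep("AZ", 5), rep("AE", 4), rep("AU", 3),
                        rep("AW", 2), rep("AR", 1))

# Count occurrences; the result has one label column plus a Freq column
freq <- as.data.frame(table(destinationcountry))

# Sort rows by Freq, largest first, and keep the first 3
top_3_base <- head(freq[order(freq$Freq, decreasing = TRUE), ], 3)
top_3_base
```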
Mekki MacAulay