
I am working my way through https://www.tidytextmining.com and experimenting with the Austen books dataset. I am trying to plot the most frequent words in each of the six books, with the bars in each plot sorted in descending order. I adapted the tf-idf plotting code from section 3.3, but I cannot get the same result for word frequency: the bars do not go in descending order within each plot. Reproducible code and output are shown below.

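# plyr is loaded before dplyr so that dplyr's verbs (count, summarize, mutate) take precedence
needed <- c("plyr", "dplyr", "tidytext", "ggplot2", "janeaustenr")
install.packages(needed, dependencies = TRUE)
# store the result under a name that does not shadow base::list
loaded <- lapply(needed, require, character.only = TRUE)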

data(stop_words)

book_words = austen_books() %>%
    unnest_tokens(word, text) %>%
    anti_join(stop_words) %>%
    count(book, word, sort = TRUE) %>%
    ungroup()

total_words = book_words %>% 
    group_by(book) %>% 
    summarize(total = sum(n))

book_words = left_join(book_words, total_words)

book_words <- book_words %>%
    bind_tf_idf(word, book, n)

book_words %>%
    arrange(desc(n)) %>%
    mutate(word = factor(word, levels = rev(unique(word)))) %>% 
    group_by(book) %>% 
    top_n(15, wt = n) %>% 
    ungroup %>%
    ggplot(aes(word, n, fill = book)) +
    geom_col(show.legend = FALSE) +
    labs(x = NULL, y = "n") +
    facet_wrap(~book, ncol = 2, scales = "free") +
    coord_flip()

Resulting Rplot

  • In your last statement you have `arrange(desc(n))`, the example in the book has `arrange(desc(tf_idf))` – phiver Mar 06 '18 at 19:54
  • Nicely presented code for a first post :) You can also check out this question: https://stackoverflow.com/questions/43176546/ggplot2-reorder-bars-from-highest-to-lowest-in-each-facet – Michael Harper Mar 06 '18 at 20:01
  • @MikeyHarper, thanks very much. I was able to solve it using the answer in your second link. – balkat Mar 06 '18 at 20:22
  • What I use a lot for this now is a function that Dave wrapped up in his personal R package `drlib::reorder_within()`: https://github.com/dgrtwo/drlib/blob/master/R/reorder_within.R – Julia Silge Mar 07 '18 at 04:04
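
For anyone landing here later, the per-facet reordering idea from the comments can be sketched roughly as follows. This is only a sketch under an assumption: it uses `reorder_within()` and `scale_x_reordered()` from tidytext itself (recent tidytext releases export them) rather than the `drlib` version linked above. The name `top_words` is made up for illustration. The idea is that a single factor has one global level order, so each word is tagged with its book before ordering, and the scale strips the tag from the axis labels again.

```r
library(dplyr)
library(tidytext)
library(ggplot2)
library(janeaustenr)

top_words <- austen_books() %>%
    unnest_tokens(word, text) %>%
    anti_join(stop_words, by = "word") %>%
    count(book, word, sort = TRUE) %>%
    group_by(book) %>%
    top_n(15, wt = n) %>%
    ungroup() %>%
    # order words by n separately within each book (assumes tidytext exports reorder_within)
    mutate(word = reorder_within(word, n, book))

p <- ggplot(top_words, aes(word, n, fill = book)) +
    geom_col(show.legend = FALSE) +
    # remove the book suffix that reorder_within() appended to each level
    scale_x_reordered() +
    labs(x = NULL, y = "n") +
    facet_wrap(~book, ncol = 2, scales = "free") +
    coord_flip()

print(p)
```

With `scales = "free"` each facet keeps only its own words, so after `coord_flip()` the longest bar should sit at the top of every panel.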
