
I have a data.frame with gene expression data and I want to create a graph in ggplot2. Here's an example of my data frame:

Gene.Name    cell.type    expression
ABC          heart        12
AZF          heart        13  
ABC          kidney       1
AZF          kidney       2

and so forth. In reality there are 160 genes and 5 tissue types.
I drew a dot plot with the following code:

a <- ggplot(data, aes(x = expression, y = Gene.Name))
a + geom_point() + facet_grid(. ~ cell.type)

Here's a snapshot of the plot

http://i55.tinypic.com/2rgonjp.jpg

What I want to do, but can't seem to manage, is to order the genes alphabetically. I tried:

a <- ggplot(data, aes(x = expression, reorder(Gene.Name, Gene.Name)))

But this didn't work (the Gene.Name column is already sorted alphabetically, so I thought this might change the order, but it didn't).

Any suggestions as to how I might change the gene name order?

Thanks

AhmetZ

1 Answer


I changed the name to "dat" because "data" is a bad dog: it is also the name of a built-in R function. Use rev to reverse the order of the levels of the factor variable. Your code was missing a closing paren in the first line and misspelled geom_point() in the second:

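# Rebuild the factor with its levels reversed; values are matched by label, so the data are unchanged
dat$Gene.Name <- factor(dat$Gene.Name, levels = rev(levels(factor(dat$Gene.Name))))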
a <- ggplot(dat, aes(x = expression, y = Gene.Name))
a + geom_point() + facet_grid(. ~ cell.type)
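
For anyone who wants to try this without the original data, here is a minimal sketch built from the four example rows in the question. It assumes Gene.Name may be either a character or a factor column, and it only builds the plot if ggplot2 is installed. The object name p is just for illustration.

```r
# Toy data copied from the question; the real data has 160 genes and 5 tissues.
dat <- data.frame(
  Gene.Name  = c("ABC", "AZF", "ABC", "AZF"),
  cell.type  = c("heart", "heart", "kidney", "kidney"),
  expression = c(12, 13, 1, 2)
)

# factor() sorts its levels alphabetically; rev() flips them so that the
# alphabetically first gene becomes the last level. ggplot2 draws the first
# level of a discrete y axis at the bottom, so the last level ends up on top.
dat$Gene.Name <- factor(dat$Gene.Name, levels = rev(levels(factor(dat$Gene.Name))))
levels(dat$Gene.Name)

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = expression, y = Gene.Name)) +
    geom_point() +
    facet_grid(. ~ cell.type)
  # print(p) draws the plot
}
```

Because factor() matches values by label, each row keeps its own gene name; only the order in which the levels are laid out on the axis changes.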
IRTFM
  • Thanks a lot, DWin. It worked! So it's my understanding now that ggplot starts laying out the first data point (the gene that starts with A in my case) as close as possible to [0,0]. Is this a true generalization? – AhmetZ Sep 14 '11 at 18:17
  • Possibly true of your original code, with caveats. It could depend on what you mean by [0,0]. In my code, I would have said that the positioning of the Gene.Names had the "A" nearest [min(expression), max(re-ordered-Gene-Names)], since I think of [0,0] as being at the lower left-hand corner. And ggplot probably won't include [0,0] generally unless you set axis limits. – IRTFM Sep 14 '11 at 18:23
  • Yes, I meant [0,0] as the lower left corner of the plot area. If I flip the coordinates on my original data set, the gene that starts with A is closest to [0,0]. I'm still trying to learn ggplot2, so it's good to keep this in mind. Thanks a lot again! – AhmetZ Sep 14 '11 at 18:41
  • Most of the plotting paradigms (base, lattice and ggplot) will use the factor order by default and this is most easily changed in the data rather than in the plotting routines. – IRTFM Sep 14 '11 at 19:40
  • The answer only changes the order of the axis labels but not the data. Generate another factor or ggplot2 option. http://stackoverflow.com/questions/8713462/ggplot2-change-order-of-display-of-a-factor-variable-on-an-axis – microbe Jul 03 '13 at 14:59
  • Hmmm. This may have been written before I fully understood that issue. Let me put in a revision. – IRTFM Jul 03 '13 at 17:25
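
The distinction raised in the last comments (changing the axis labels versus changing the order of the data) can be illustrated with a small sketch. It is not taken from the thread: the vector genes and the objects reordered, relabelled and p are made up for the example. The last part uses scale_y_discrete(limits = ...) as one plot-side option and is skipped if ggplot2 is not installed.

```r
genes <- factor(c("ABC", "AZF", "ABC", "AZF"))

# Reordering: the values stay the same, only the level order changes.
# This is the order a discrete axis follows by default.
reordered <- factor(genes, levels = rev(levels(genes)))
as.character(reordered)
levels(reordered)

# Relabelling: assigning to levels() renames the existing levels in place,
# so every "ABC" value silently becomes "AZF" and vice versa.
relabelled <- genes
levels(relabelled) <- rev(levels(relabelled))
as.character(relabelled)

# Plot-side alternative: leave the data alone and set the axis order
# through the limits of the discrete scale.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  dat <- data.frame(Gene.Name = genes, expression = c(12, 13, 1, 2))
  p <- ggplot(dat, aes(x = expression, y = Gene.Name)) +
    geom_point() +
    scale_y_discrete(limits = rev(levels(genes)))
}
```

Rebuilding the factor with factor(x, levels = ...) is the safe way to change the order in the data; assigning to levels() is the version that swaps labels without moving the points.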