I need to draw a complex graphic for visual data analysis. I have two variables and a large number of cases (>1000). For example (using only 100 cases here, to make the dispersion less "normal"):
x <- rnorm(100,mean=95,sd=50)
y <- rnorm(100,mean=35,sd=20)
d <- data.frame(x=x,y=y)
1) I need to plot the raw data with point size corresponding to the relative frequency of coincident points, so plot(x,y) is not an option - I need point sizes. What should be done to achieve this?
2) On the same plot I need to draw a 95% confidence ellipse and a line representing the change of correlation (I do not know the correct name for it) - something like this:
library(corrgram)
corrgram(d, order=TRUE, lower.panel=panel.ellipse, upper.panel=panel.pts)

but with both graphs on one plot.
3) Finally, I need to draw the resulting linear regression model on top of all this:
r<-lm(y~x, data=d)
abline(r,col=2,lwd=2)
but with an error range... something like on a QQ-plot:

but for the fitting errors, if that is possible.
So the question is:
How can I achieve all of this on one graph?
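
For reference, here is a rough base R sketch of how the three pieces could be combined on one plot. It is only one possible approach, not a definitive one, and several things in it are assumptions rather than part of the question: the seed, the bin width of 10 used to create coincident points (continuous values almost never coincide exactly), the size scaling, the use of a lowess curve for the "change of correlation" line, and the ellipse being the 95% region of a bivariate normal with the sample mean and covariance.

```r
# Sketch only: base R, no extra packages.
set.seed(1)                                # assumption: seed added for reproducibility
x <- rnorm(100, mean = 95, sd = 50)
y <- rnorm(100, mean = 35, sd = 20)
d <- data.frame(x = x, y = y)

# 1) Count coincident points after rounding to the nearest 10 (assumed bin width)
binned <- data.frame(x = round(d$x, -1), y = round(d$y, -1), n = 1)
counts <- aggregate(n ~ x + y, data = binned, FUN = sum)
rel <- counts$n / sum(counts$n)            # relative frequency per location

# 2) 95% ellipse under bivariate normality: centre at the means,
#    shape from the sample covariance, radius from a chi-square quantile
mu <- colMeans(d)
S <- cov(d)
theta <- seq(0, 2 * pi, length.out = 200)
circle <- cbind(cos(theta), sin(theta)) * sqrt(qchisq(0.95, df = 2))
ell <- sweep(circle %*% chol(S), 2, mu, "+")

# 3) Linear fit with a pointwise 95% confidence band for the fitted line
r <- lm(y ~ x, data = d)
df.new <- data.frame(x = seq(min(d$x), max(d$x), length.out = 200))
band <- predict(r, newdata = df.new, interval = "confidence", level = 0.95)

# Draw everything on one plot; cex scales the radius, so sqrt() makes
# the point area proportional to the relative frequency
plot(counts$x, counts$y, pch = 16, col = "grey40",
     cex = 3 * sqrt(rel / max(rel)),
     xlim = range(c(counts$x, ell[, 1])),
     ylim = range(c(counts$y, ell[, 2])),
     xlab = "x", ylab = "y")
lines(ell, lty = 2)                        # 95% ellipse
lines(lowess(d$x, d$y), col = 4)           # smooth trend (assumed stand-in)
abline(r, col = 2, lwd = 2)                # regression line
lines(df.new$x, band[, "lwr"], col = 2, lty = 3)
lines(df.new$x, band[, "upr"], col = 2, lty = 3)
```

Using interval = "prediction" instead of "confidence" in predict() would give the wider band for new observations rather than for the fitted line.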


[…] `df.new <- data.frame(x = seq(min(x), max(x), 0.1))` is better.
[…]s size is also strange (too small). Also tried `library(car) dataEllipse(df$x, df$y, levels=0.95:1, lty=2)` but it drops all.
[…] `library(car) cr.plots(m0)` but the range of data is incorrect. Use the first 2 lines from my code instead of yours to reproduce. – Yuriy Petrovskiy Mar 05 '11 at 14:36
[…] `car::dataEllipse` does provide the same facilities as the `ellipse` package, but it is probably less easy to customize. I guess the superimposed curve is just a loess, so it is not difficult to add. – chl Mar 05 '11 at 14:56
[…] `x` and `table(x)` will differ, hence the `cex` param was recycled; (2) I forgot to update the location and scale of the ellipse (not clearly apparent when using standard gaussian variates). I hope there are no other ones. – chl Mar 05 '11 at 16:06
[…] `Error in ellipse(cor(df$x, df$y), scale = c(sd(df$x), sd(df$y)), centre = c(mean(df$x), : center must be a vector of length 2` and no ellipse :( Everything else is excellent. Thank you very much. – Yuriy Petrovskiy Mar 05 '11 at 16:28
[…] `corrgram` package: it shows a 95% pairwise confidence region assuming a bivariate normal distribution centered on the mean and scaled by SD(x) and SD(y). I'm not a big fan of this when used in a scatterplot, though. But see Murdoch & Chow, A graphical display of large correlation matrices, Am Stat (1996) 50:178, or Friendly, Corrgrams: Exploratory displays for correlation matrices, Am Stat (2002) 56:316. – chl Mar 06 '11 at 16:01