I am trying to interpret this fairly basic scatter plot, but I am new to this and don't want to get it wrong. Is it fair to say that there appears to be a moderate positive relationship between the two variables, although the data are clustered around the lower values of the y variable?

mlee
  • From the look of the plot, there may be overplotting at the lower values of the x-axis, i.e., a single visible point may hide more than one observation because the values appear to be discrete. In principle this could mislead you into thinking there is a positive relationship when in fact there isn't (there may well be such a relationship, but we can't see everything that matters). You may want to improve the plot by jittering parallel to the y-axis. https://stats.stackexchange.com/questions/379006/when-adding-jitter-a-scatterplot-for-conveying-information-is-appropriate – Christian Hennig Oct 18 '23 at 09:30
  • Why not calculate a correlation coefficient - perhaps a non-parametric one like Spearman's rank correlation? – PBulls Oct 18 '23 at 09:31
  • Also in general this depends on what you are interested in. You can see many more things in principle than just whether there is a positive (or negative or none or non-monotonic) relationship, such as skewness, outliers... – Christian Hennig Oct 18 '23 at 09:31
  • I wonder whether it would be more helpful to compute the mean of visits for each y-axis value and plot that. – tatami Oct 18 '23 at 11:53
  • Thank you all for your help. I think I am going to create a boxplot to see whether it sheds some more light. – mlee Oct 18 '23 at 12:32
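
The suggestions in the comments (jittering, Spearman's rank correlation, and a mean per y-axis value) could be tried along the following lines. This is only a rough sketch using the Python standard library. The data are made up for illustration, since the original data are not shown, and the function names (`average_ranks`, `spearman`, `jitter`, `mean_x_by_y`) are hypothetical helpers rather than part of any library. It also assumes that the y variable is the discrete one and that x holds the visit counts.

```python
import random
from collections import defaultdict
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def jitter(values, width, rng):
    """Add uniform noise in [-width, width] to separate overplotted points."""
    return [v + rng.uniform(-width, width) for v in values]


def mean_x_by_y(x, y):
    """Mean of x within each distinct y value (one summary point per y level)."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[yi].append(xi)
    return {yi: mean(xs) for yi, xs in sorted(groups.items())}


# Made-up illustrative data, NOT the data from the question.
x = [1, 1, 1, 2, 2, 3, 5, 8, 8, 12]   # assumed: number of visits
y = [0, 0, 0, 0, 1, 1, 1, 2, 3, 3]    # assumed: discrete outcome

rng = random.Random(0)
y_jittered = jitter(y, 0.2, rng)      # plot x against y_jittered instead of y

print("Spearman rho:", spearman(x, y))
print("Mean x per y value:", mean_x_by_y(x, y))
```

Plotting `x` against `y_jittered` with any plotting tool should make stacked points visible, and the rank correlation gives a single number to set beside the visual impression. Neither replaces looking at the distribution within each y level, which is what the boxplot mentioned above would show.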
