
I need to do some exploratory data analysis, and I picked MDS (multidimensional scaling) to figure out whether there are trends in the data. The structure of my data looks like this:

 $ Generation: int  2 2 2 2 2 2 2 2 2 2 ...
 $ Panel     : chr  "A" "A" "A" "A" ...
 $ Line      : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Rep       : int  2 2 2 2 2 2 2 2 6 6 ...
 $ Sex       : chr  "F" "F" "F" "F" ...
 $ Size      : num  1662 1720 1721 1778 1565 ...
 $ ILD12     : num  1930 1954 1947 1932 1915 ...
 $ ILD15     : num  1524 1567 1575 1539 1528 ...
 $ ILD18     : num  427 414 420 389 418 ...
 $ ILD23     : num  732 706 702 733 749 ...
 $ ILD25     : num  1380 1386 1383 1393 1391 ...
 $ ILD29     : num  1544 1584 1554 1568 1531 ...
 $ ILD37     : num  1586 1546 1575 1568 1611 ...
 $ ILD39     : num  2070 2060 2046 2061 2060 ...
 $ ILD46     : num  1515 1481 1498 1493 1532 ...
 $ ILD49     : num  1970 1973 1953 1971 1962 ...
 $ ILD57     : num  673 695 705 691 697 ...
 $ ILD58     : num  1117 1166 1172 1164 1127 ...
 $ ILD67     : num  192 194 188 196 178 ...
 $ ILD69     : num  611 644 623 642 585 ...
 $ ILD78     : num  522 552 531 545 497 ...
 $ ILD89     : num  97.5 99.2 97.9 99.9 96.9 ...

How should I deal with the categorical variables in my dataset if I am using R for the analysis? I am also using ggplot2: would I fit the model first with cmdscale and then plot the resulting x and y coordinates?

So something like this:

ggplot(df, aes(x = x, y = y, color = Panel)) +
  geom_point() +
  ggtitle("Metric MDS Results") +
  labs(x = "Coordinate 1", y = "Coordinate 2") +
  theme_bw()
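
For context, here is a minimal sketch of the fitting step I have in mind before that plot. It is only my current guess, not something I know to be right: it assumes the categorical columns are left out of the distance matrix and used only for colouring, that the measurements are scaled first, and that Euclidean distance is appropriate. The toy data frame `dat` and its values are made up so the sketch runs on its own; only the column names come from my data.

```r
library(ggplot2)

# Toy stand-in for my data: made-up values, real column names (assumption)
set.seed(1)
dat <- data.frame(
  Panel = rep(c("A", "B"), each = 10),
  Sex   = rep(c("F", "M"), times = 10),
  Size  = rnorm(20, mean = 1700, sd = 60),
  ILD12 = rnorm(20, mean = 1940, sd = 15),
  ILD15 = rnorm(20, mean = 1550, sd = 20)
)

# Pick the measurement columns by name so ID-like integer columns
# (Generation, Line, Rep in the real data) are not treated as measurements
meas <- grep("^(Size|ILD)", names(dat), value = TRUE)

# Scale so that no single measurement dominates the Euclidean distances
d <- dist(scale(dat[, meas]))

# Classical (metric) MDS in two dimensions; returns an n x 2 matrix
fit <- cmdscale(d, k = 2)

# Attach the coordinates to the categorical columns for plotting
df <- data.frame(x = fit[, 1], y = fit[, 2],
                 Panel = dat$Panel, Sex = dat$Sex)

p <- ggplot(df, aes(x = x, y = y, color = Panel, shape = Sex)) +
  geom_point() +
  labs(x = "Coordinate 1", y = "Coordinate 2") +
  theme_bw()
print(p)
```

In this version Panel and Sex play no part in computing the coordinates; they are only mapped to colour and shape afterwards.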

Am I correct to assume that the color parameter in ggplot shows the similarity of the categorical variable Panel?
