I am working with a model that simulates water flows in a certain area. These flows can be influenced by taking certain measures, resulting in multiple water management scenarios. I would like to compare the flows and residence times per scenario using box plots.
However, the box plots based on the daily output have a large number of outliers, which makes them hard to interpret. A box plot of the monthly averaged flows looks cleaner and smoother. (For clarity, I don't mean one box per month. I mean one box plot per scenario covering the full dataset, built from monthly averages instead of daily values.) However, I am not sure whether that is representative, given that I am using averaged values. Is a box plot of averaged values useful in any way, or is it like averaging averages, giving a skewed view of the situation?
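To illustrate the two options, here is a minimal sketch in Python, assuming pandas and NumPy. The data are synthetic, right-skewed lognormal values standing in for one scenario's daily flows, not actual model output, and `box_stats` is a hypothetical helper written for this example. It computes the numbers a box plot would draw (quartiles and points beyond the 1.5 × IQR whiskers), once from the daily values and once from the monthly means.

```python
import numpy as np
import pandas as pd


def box_stats(values):
    """Quartiles, IQR, and count of points beyond the 1.5*IQR whiskers."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    n_outliers = int(((values < lower) | (values > upper)).sum())
    return {"n": len(values), "q1": q1, "median": median, "q3": q3,
            "iqr": iqr, "n_outliers": n_outliers}


# Synthetic stand-in for one scenario: ten years of right-skewed daily flows.
rng = np.random.default_rng(0)
days = pd.date_range("2000-01-01", "2009-12-31", freq="D")
daily = pd.Series(rng.lognormal(mean=1.0, sigma=1.0, size=len(days)), index=days)

# One value per calendar month: the mean of that month's daily flows.
monthly = daily.resample("MS").mean()

daily_stats = box_stats(daily.to_numpy())
monthly_stats = box_stats(monthly.to_numpy())

for name, stats in [("daily", daily_stats), ("monthly mean", monthly_stats)]:
    print(name, stats)
```

In this synthetic case the monthly box is much narrower and its median sits higher than the daily median, because each monthly mean is pulled up by that month's high-flow days. So the two boxes describe different things: the spread of daily flows versus the spread of monthly mean flows. Real flow series are autocorrelated, so the size of the effect will likely differ from this independent-draws example.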
Thank you!
Edit: The outliers are not relevant for what we are looking into. The focus is on the "normal" situation, not the extremes. It's okay to lose information on extremes.
Also, the extremes are mainly on the high side (high flows), with hardly any on the low side. Uploading an image doesn't work, but on a bean plot the bottom is almost flat and very wide. It then gets slightly wider going up, and then goes up exponentially, ending in a long, thin line at the top.