I'm working on a multiclass classification problem where the data is not treated as a time series. I'm doing feature engineering and trying to solve the problem with classic models such as KNN, RF, and boosting. I'm creating new features based on a rolling window and have found that people usually aggregate with the mean.
- Is the mean of the rolling-window distribution the most informative statistic?
- Are there other statistics (such as std, quantiles, IQR, etc.) that could be used?
- Is it worth applying some kind of data transformation (such as scaling, a quantile transformation, or Box-Cox)?
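
For concreteness, here is a minimal sketch of the kind of rolling features I mean, computed with pandas. The helper name `rolling_features`, the window size of 5, and the random input are hypothetical placeholders, not my actual setup.

```python
import numpy as np
import pandas as pd


def rolling_features(series: pd.Series, window: int = 5) -> pd.DataFrame:
    """Compute several rolling-window statistics for one numeric column.

    Hypothetical helper for illustration; rows before the first full
    window are left as NaN.
    """
    roll = series.rolling(window=window, min_periods=window)
    q25 = roll.quantile(0.25)
    q75 = roll.quantile(0.75)
    return pd.DataFrame(
        {
            "mean": roll.mean(),    # the usual aggregation
            "std": roll.std(),      # spread (sample std, ddof=1)
            "median": roll.median(),
            "q25": q25,
            "q75": q75,
            "iqr": q75 - q25,       # interquartile range
        }
    )


# Synthetic data, used only to show the call.
rng = np.random.default_rng(0)
x = pd.Series(rng.normal(size=20), name="x")
features = rolling_features(x, window=5)
```

Each output column could then be joined back to the feature matrix before fitting KNN, RF, or a boosting model.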