Comparing multivariate variables under different conditions

Question

Let's consider $N$ different musical instruments. For each instrument, $k$ notes have been recorded at different frequencies. $M$ audio descriptors have been extracted for every instrument at every frequency. Therefore, for each frequency we have an $N\times M$ matrix:

$ A_1 = \begin{bmatrix} s_{11}^1 & s_{12}^1 & \dots & s_{1M}^1 \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1}^1 & s_{N2}^1 & \dots & s_{NM}^1 \end{bmatrix} $

$ A_2 = \begin{bmatrix} s_{11}^2 & s_{12}^2 & \dots & s_{1M}^2 \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1}^2 & s_{N2}^2 & \dots & s_{NM}^2 \end{bmatrix} $

$ \dots $

$ A_k = \begin{bmatrix} s_{11}^k & s_{12}^k & \dots & s_{1M}^k \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1}^k & s_{N2}^k & \dots & s_{NM}^k \end{bmatrix} $

and so on, where $s_{ij}^k$ means instrument $i$, descriptor $j$, frequency $k$.

1) I want to check whether the differences between instruments remain constant at different frequencies. If they do not, I want to study how these differences change. What statistical tool should I use? Please note that I am not interested in how an audio descriptor changes at different frequencies, but in whether different instruments change homogeneously.

2) Suppose that the $N$ instruments can be grouped into 3 different classes (e.g. instruments 1 to 5 belong to class 1, instruments 6 to 20 belong to class 2, and so on). How can I take this grouping into account? For example, how can I check whether the differences within a group are more stable across frequencies than the differences between groups?

EDIT: It has been suggested that I use multilevel regression analysis, which seems suitable for this case, but I am a bit confused: what is the variable to be predicted in this case? How can I use this technique?

score 2 · Answer 1 · answered Jun 17 '17 at 05:47

I think what you're looking for is a multilevel regression model. Your observations would be nested within samples, and the samples nested within conditions (or vice versa; it is not clear from your description). To account for simple differences between the samples and conditions, you would need random intercepts in your model; to account for possible differences in the relationships between the predictors and the dependent variables within each condition, you would need random slopes.

More info on multilevel models: http://www.bristol.ac.uk/cmm/learning/multilevel-models/what-why.html
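
One possible reading of this suggestion, stated here as an assumption rather than as the answerer's exact model: for a single descriptor, take the descriptor value as the dependent variable, the note frequency $x_k$ as the predictor, and the instrument as the grouping level. The symbols $\beta_0$, $\beta_1$, $u_{0i}$, $u_{1i}$, $\tau_0$, $\tau_1$ and $x_k$ are introduced only for this sketch, and linearity in frequency is itself an assumption.

$$ s_{i}^k = (\beta_0 + u_{0i}) + (\beta_1 + u_{1i})\,x_k + \epsilon_{i}^k, \qquad u_{0i}\sim\mathcal{N}(0,\tau_0^2),\quad u_{1i}\sim\mathcal{N}(0,\tau_1^2) $$

Here $u_{0i}$ is the random intercept of instrument $i$ (its overall offset from the average instrument) and $u_{1i}$ is its random slope (how its trend over frequency departs from the average trend). Under this reading, a slope variance $\tau_1^2$ close to zero would mean that the instruments shift in parallel as frequency changes, i.e. the differences between them stay roughly constant.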

Adam B

Although I feel this could be the right way, I am not sure about how to proceed. I have no dependent variable to model with regression. What do you mean by random intercepts? – firion Jun 19 '17 at 08:52
I edited the question in order to make my problem clearer – firion Jun 19 '17 at 09:15

Karel Macek · Answer 2 · 2017-06-19T11:50:23.270

1

You can model your data as follows: $$ s_{i,j}^k = d^{k}_j+f_{i,j}+\epsilon_j $$ where $d_j^k$ is the center of the descriptor at a given frequency (a value that does not depend on the instrument), $f_{i,j}$ is the instrument-specific offset of the descriptor, and $\epsilon_j\sim\mathcal{N}(0,\sigma_j)$.

You can easily fit this model by least squares. Once you have the model, you can examine outliers in the following way (for each $j$):

1) Calculate the residuals $r_{i,j}^k = s_{i,j}^k - \left(d_j^k+f_{i,j}\right)$.
2) For each instrument $i$, calculate $\mu_j^k$ and $\sigma_{j}^k$ from the residuals of the other instruments, $\left(r_{i',j}^k\right)_{i'\neq i}$.
3) Check whether $r_{i,j}^k$ lies within the 95% interval of $\mathcal{N}(\mu_j^k,\sigma_{j}^k)$. If so, the value is as expected. If not, instrument $i$ has a value that the model did not expect.
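
The fit and the leave-one-out check above can be sketched in plain Python for a single descriptor $j$. This is only a minimal sketch, not the answerer's code. It assumes a complete table (every instrument recorded at every frequency) and instrument offsets constrained to sum to zero, in which case the least-squares solution reduces to row and column means. The names `fit_additive` and `flag_unexpected`, the toy data, and the 1.96 cutoff for the 95% interval are assumptions made for illustration.

```python
import random
from statistics import mean, stdev

def fit_additive(s):
    """Least-squares fit of s[i][k] = d[k] + f[i] + noise for one descriptor.

    s is a list of N rows (instruments), each holding K values (frequencies).
    Assumes a complete table and offsets f that sum to zero.
    """
    n, k = len(s), len(s[0])
    grand = mean(v for row in s for v in row)
    d = [mean(s[i][c] for i in range(n)) for c in range(k)]  # center per frequency
    f = [mean(row) - grand for row in s]                     # offset per instrument
    r = [[s[i][c] - d[c] - f[i] for c in range(k)] for i in range(n)]
    return d, f, r

def flag_unexpected(r, z=1.96):
    """Return (instrument, frequency) pairs whose residual lies outside the
    approximate 95% interval estimated from the other instruments.
    Needs at least three instruments so that stdev() has two values."""
    n, k = len(r), len(r[0])
    flagged = []
    for c in range(k):
        for i in range(n):
            others = [r[a][c] for a in range(n) if a != i]
            mu, sigma = mean(others), stdev(others)
            if abs(r[i][c] - mu) > z * sigma:
                flagged.append((i, c))
    return flagged

# Toy data (hypothetical): 5 instruments, 3 frequencies, small noise.
rng = random.Random(0)
true_d = [10.0, 20.0, 30.0]
true_f = [0.0, 1.0, 2.0, 3.0, 4.0]
table = [[dk + fi + rng.gauss(0, 0.1) for dk in true_d] for fi in true_f]
table[4][2] += 5.0  # instrument 4 departs from the additive pattern at frequency 2

d_hat, f_hat, resid = fit_additive(table)
print(flag_unexpected(resid))
```

A flagged pair marks an instrument whose position relative to the others at that frequency differs from what its constant offset predicts. Because one large deviation also shifts the fitted row and column means, the same instrument may be flagged at other frequencies too.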

In the case of groups, you can consider the model $$ s_{g,j}^k=d_j^k+f_{g,j}+\epsilon_{g} $$ where $g$ is the index of the group.
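
The group model can be fitted in the same spirit. The sketch below is an illustration under stated assumptions, not the answerer's implementation: it handles one descriptor, requires every instrument to be recorded at every frequency, and takes a hypothetical list `groups` giving the class label of each instrument. With that layout the least-squares estimates are again marginal means, with the group offsets centered on the grand mean.

```python
from statistics import mean

def fit_group_model(s, groups):
    """Fit s[i][k] = d[k] + f[groups[i]] + noise for one descriptor.

    s      : N rows (instruments), each with K values (frequencies)
    groups : N class labels, one per instrument (hypothetical input)
    """
    n, k = len(s), len(s[0])
    grand = mean(v for row in s for v in row)
    d = [mean(s[i][c] for i in range(n)) for c in range(k)]  # center per frequency
    f = {
        g: mean(v for i, row in enumerate(s) if groups[i] == g for v in row) - grand
        for g in sorted(set(groups))
    }                                                        # offset per group
    r = [[s[i][c] - d[c] - f[groups[i]] for c in range(k)] for i in range(n)]
    return d, f, r

# Toy data (hypothetical): two classes whose offsets differ by 5.
groups = [0, 0, 1, 1, 1]
offsets = {0: 0.0, 1: 5.0}
table = [[dk + offsets[g] for dk in (10.0, 20.0)] for g in groups]
d_hat, f_hat, resid = fit_group_model(table, groups)
print(f_hat[1] - f_hat[0])
```

The residuals of this fit contain whatever the group offset does not explain, so comparing their spread within each group across frequencies is one way to see how stable the within-group differences are.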

edited Jun 19 '17 at 11:50

answered Jun 19 '17 at 11:28

Thanks. I have two questions: 1) what do you mean by center of the descriptor? 2) How can I process the outliers in order to answer my question, i.e. how the differences between samples change with frequency? – firion Jun 19 '17 at 11:39
Did the update help? – Karel Macek Jun 19 '17 at 12:20
Yes, it helped a lot. I am just not sure whether the validation you show is the one I want to do. What I want to know is: given all the pairwise differences between instruments at a given frequency $f1$, do these differences remain (more or less) the same at another frequency $f2$, or do instruments that were "similar" at $f1$ become "different" at $f2$? – firion Jun 19 '17 at 14:51
Also, do you know which MATLAB function is best for performing this fit? What about nlmefit? – firion Jun 20 '17 at 14:21
If you have optimization toolbox, you can simply define the least squares minimization problem to estimate the parameters. – Karel Macek Jun 20 '17 at 14:23
Thanks. Can you clarify one last thing? Let's assume that we have just one descriptor. The model to be fitted would be $s_i^k = d^k + f_i + \epsilon$. Where does the frequency (which I believe to be the independent variable) go in this model? Is it multiplied by $d^k$? – firion Jun 26 '17 at 08:36
