I'm doing a data analysis on data with more than 100 dimensions.
After that, different ML algorithms such as NN are applied to it.
When I first run a PCA to reduce the dimensionality to roughly 3-10, I consistently get better results (i.e. fewer mispredictions) than without it.
I thought PCA should only speed up NN etc., not make them better?
Is this improvement realistic or did I make a mistake with my PCA?
This is how I'm doing it concretely:
Data; % training input
Test_Data; % test input
pca_size = 3; % pca size
%Scaling and Centering of Data
Scaled = (Data - mean(Data))./std(Data);
coeff = pca(Scaled);
Data_Reduced = Data * coeff(:, 1:pca_size);
Test_Data_Reduced = Test_Data * coeff(:, 1:pca_size);
[...] Data_Reduced = Data * coeff(:, 1:pca_size); are not correct. – Matthew Gunn Jan 25 '17 at 09:48

Is Scaled = (Data - mean(Data))./std(Data); a Matlab command or rather pseudo-code? If Data is a 2D matrix, then in Matlab Data - mean(Data) will not subtract column means, and dividing by std(Data) like that won't work at all. – amoeba Jan 25 '17 at 16:32

[...] Scaled with PCA coeff, not the Data. And for the test data, you need to scale it first with the mean/std of the training data. – amoeba Jan 25 '17 at 19:12

But does this have a big influence, especially since I am using unscaled versions of both Data and Test_Data, so there is no difference in how the two are prepared? And PCA is working well. So in sum, there is improvement potential in multiplying the scaled data, but in general my solution is working? – SwingNoob Jan 25 '17 at 20:42

[...] bsxfun. I don't know if it has a big influence or not. What you are doing is not really PCA (because you compute eigenvectors of the scaled data but transform unscaled data), but as you are doing the same thing with train and test then it's okay. Regarding your main question of how it's possible that results are better, please read http://stats.stackexchange.com/questions/141864/ and follow the links in the top answer. (Also, please include @amoeba in your comments, otherwise I am not notified of them.) – amoeba Jan 25 '17 at 23:54
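
For reference, the corrections raised in the comments (subtract and divide column-wise, reuse the training mean/std for the test data, and project the scaled data rather than the raw data) could be combined roughly as below. This is only a sketch, not the asker's actual code: the random matrices are hypothetical stand-ins for Data and Test_Data, the names mu, sigma and Test_Scaled are introduced here for illustration, and the guard for constant columns is an added assumption.

```matlab
% Hypothetical stand-in data so the sketch runs on its own;
% replace these with the real Data / Test_Data (rows = samples).
Data      = randn(200, 100);   % training input
Test_Data = randn(50, 100);    % test input
pca_size  = 3;                 % number of components to keep

% Column-wise mean and std, estimated from the training data only.
mu    = mean(Data, 1);
sigma = std(Data, 0, 1);
sigma(sigma == 0) = 1;         % assumption: avoid dividing by zero for constant columns

% Standardize both sets with the training statistics.
% bsxfun expands the 1-by-p row vectors across all rows.
Scaled      = bsxfun(@rdivide, bsxfun(@minus, Data, mu), sigma);
Test_Scaled = bsxfun(@rdivide, bsxfun(@minus, Test_Data, mu), sigma);

% Principal axes of the standardized training data.
coeff = pca(Scaled);

% Project the scaled (not the raw) data onto the leading components.
Data_Reduced      = Scaled      * coeff(:, 1:pca_size);
Test_Data_Reduced = Test_Scaled * coeff(:, 1:pca_size);
```

Data_Reduced and Test_Data_Reduced then have pca_size columns each and can be passed to the downstream classifier in place of the originals. Whether this changes the misprediction rate relative to the version in the question is something to check empirically.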