I read somewhere that, in binary classification problems, very strong correlation between features does not imply redundancy: for example, even if $X_i$ and $X_j$ have a correlation coefficient $\rho > 0.95$, dropping $X_j$ can still lose information and make the classifier less accurate.
Is that true? And if so, what use is the feature correlation matrix if you cannot drop highly correlated features?
Further info about my problem: I am classifying tuples of $50$ values as either signal or background. Many variables have pairwise correlation higher than $0.8$, and some are even more strongly correlated ($\rho > 0.9$). I doubt that dropping those variables is the right thing to do, but I cannot justify this theoretically.
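To make the claim concrete, here is a toy simulation (my own construction, not from my actual data) where two features share a common latent factor, so $\rho > 0.95$, yet the class signal lives almost entirely in their *difference*. Dropping either feature destroys the separation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000
y = rng.integers(0, 2, n)                    # binary labels (signal / background)
z = rng.normal(0.0, 1.0, n)                  # shared latent factor

# Both features are dominated by z, so they are very highly correlated;
# only x2 carries a small class-dependent shift.
x1 = z + rng.normal(0.0, 0.1, n)
x2 = z + 0.3 * y + rng.normal(0.0, 0.1, n)

rho = np.corrcoef(x1, x2)[0, 1]
print(f"correlation rho = {rho:.3f}")        # well above 0.95

X_both = np.column_stack([x1, x2])
X_one = x1.reshape(-1, 1)

# Cross-validated accuracy with both features vs. with x1 alone.
acc_both = cross_val_score(LogisticRegression(), X_both, y, cv=5).mean()
acc_one = cross_val_score(LogisticRegression(), X_one, y, cv=5).mean()
print(f"accuracy with both features: {acc_both:.3f}")
print(f"accuracy with x1 only:      {acc_one:.3f}")
```

With both features, the model can learn a weight pattern close to $x_2 - x_1$, which cancels the shared factor $z$ and isolates the class shift; with $x_1$ alone the accuracy collapses to roughly chance. This is the kind of situation I suspect might be happening in my data.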