I would like to project a set of points in a plane (stored as a 2D array) onto the principal axes, defined as the directions of maximum variance in the data. To do this I used PCA, imported from sklearn.decomposition:

pca = PCA(n_components=1)
pca.fit(myArray)
Transformed_points = pca.transform(myArray)
Projected_points = pca.inverse_transform(Transformed_points)

This works just fine, as you can see in the picture.

I was wondering if there is a way to get the projection of points along the second principal axis, which is defined by the second eigenvector that I get if I set:

pca = PCA(n_components=2)
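
For reference, a minimal sketch of what such a projection could look like with two components, using the fitted `components_` and `mean_` attributes of scikit-learn's PCA. The synthetic `myArray` and the names `k`, `scores`, `axis_k` and `projected_k` are assumptions made for illustration only:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for myArray: 200 correlated 2D points.
rng = np.random.default_rng(0)
myArray = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])
myArray = myArray + np.array([5.0, -2.0])

pca = PCA(n_components=2)
scores = pca.fit_transform(myArray)   # shape (n_samples, 2), one column per axis

k = 1                                 # 0 = first principal axis, 1 = second
axis_k = pca.components_[k]           # unit vector of axis k, shape (2,)

# Keep only the coordinate along axis k, then map back to the original frame.
projected_k = np.outer(scores[:, k], axis_k) + pca.mean_
```

With `k = 0` this should reproduce the result of `inverse_transform` from the one-component fit, since both keep only the coordinate along the first axis.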

I tried to implement PCA on my own, in order to have more control over these details. In particular, I followed the answer to the question "Principal Component Analysis (PCA) in Python". Here is the code I wrote:

        import numpy as np
        from numpy import linalg as LA

        def PCA_new(data, evecs_value):
            data = data.astype(np.float64)
            m, n = data.shape
            # mean center the data
            data -= data.mean(axis=0)
            # calculate the covariance matrix
            R = np.cov(data, rowvar=False)
            # calculate eigenvectors & eigenvalues of the covariance matrix
            # use 'eigh' rather than 'eig' since R is symmetric,
            # the performance gain is substantial
            evals, evecs = LA.eigh(R)
            # indices that sort the eigenvalues in decreasing order
            idx = np.argsort(evals)[::-1]
            # reorder the eigenvectors (columns) and eigenvalues with that index
            evecs = evecs[:,idx]
            evals = evals[idx]
            # select the eigenvector along which points need to be projected
            evecs_1 = evecs[evecs_value]
            # project the centered data onto the selected eigenvector
            data_mod = np.dot(data, evecs_1.T)
            # define the projection of points with the inverse transformation
            # (Center_X and Center_Y are defined outside this function)
            inverse = []
            for i in range(len(data_mod)):
                inverse.append([-data_mod[i]*evecs_1[0] + Center_X, data_mod[i]*evecs_1[1] + Center_Y])
            inverse = np.array(inverse)

            return inverse

However, for reasons I do not understand, the results are quite different from what I obtain with pca.inverse_transform. Here is the plot:
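
Two details in PCA_new may account for the difference. First, `LA.eigh` returns the eigenvectors as columns, so `evecs[evecs_value]` picks a row rather than an eigenvector. Second, the inverse step flips the sign of the x component and adds Center_X and Center_Y, which are not computed inside the function. Below is a minimal sketch of a variant that addresses both, assuming the center is meant to be the mean of the data; the function name `pca_project` and its parameter `k` are hypothetical:

```python
import numpy as np

def pca_project(data, k):
    """Project 2D points onto the k-th principal axis (0 = largest variance)."""
    data = np.asarray(data, dtype=np.float64)
    mean = data.mean(axis=0)
    centered = data - mean                      # leaves the caller's array untouched

    # Eigen-decomposition of the symmetric covariance matrix.
    evals, evecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(evals)[::-1]             # decreasing eigenvalue
    evecs = evecs[:, order]

    axis = evecs[:, k]                          # eigenvectors are COLUMNS
    scores = centered @ axis                    # coordinate of each point along the axis

    # Back to the original frame: same sign on both components, plus the mean.
    return np.outer(scores, axis) + mean
```

Calling `pca_project(myArray, 0)` should then agree with `pca.inverse_transform` for `n_components=1`, and `pca_project(myArray, 1)` gives the projection along the second axis.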

Jeremy