I would like to get the projection of a set of points in a plane (described by a 2D array) onto the principal axes of the data, i.e. the directions of maximum variance. For this I used PCA, imported from sklearn.decomposition:
pca = PCA(n_components=1)
pca.fit(myArray)
Transformed_points = pca.transform(myArray)
Projected_points = pca.inverse_transform(Transformed_points)
This works just fine, as you can see in the picture.
I was wondering if there is a way to get the projection of points along the second principal axis, which is defined by the second eigenvector that I get if I set:
pca = PCA(n_components=2)
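To make the question concrete, here is a minimal sketch of what I am after, on synthetic points (the array and random seed are just placeholders): keep only the second component's score, zero out the first, and map back with inverse_transform, which should match building the projection by hand from pca.mean_ and the second row of pca.components_.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for myArray: correlated 2D points
rng = np.random.default_rng(0)
myArray = rng.normal(size=(100, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])

pca = PCA(n_components=2)
scores = pca.fit_transform(myArray)          # shape (n, 2)

# Keep only the second component's score, zero out the first,
# then map back to the original coordinates
second_only = np.zeros_like(scores)
second_only[:, 1] = scores[:, 1]
projected_2nd = pca.inverse_transform(second_only)

# Equivalently: mean + score_2 * second eigenvector (a row of components_)
manual = pca.mean_ + np.outer(scores[:, 1], pca.components_[1])
print(np.allclose(projected_2nd, manual))    # True
```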
I tried to define the PCA method on my own, in order to have more control over these details. In particular, I followed the answer to this question: Principal Component Analysis (PCA) in Python. Here is the code I wrote:
import numpy as np
from numpy import linalg as LA

def PCA_new(data, evecs_value):
    data = data.astype(np.float64)
    m, n = data.shape
    # mean center the data
    data -= data.mean(axis=0)
    # calculate the covariance matrix
    R = np.cov(data, rowvar=False)
    # calculate eigenvectors & eigenvalues of the covariance matrix;
    # use 'eigh' rather than 'eig' since R is symmetric,
    # the performance gain is substantial
    evals, evecs = LA.eigh(R)
    # sort eigenvalues in decreasing order
    idx = np.argsort(evals)[::-1]
    evals = evals[idx]
    # sort eigenvectors according to the same index
    evecs = evecs[:, idx]
    # select the eigenvector along which the points need to be projected
    evecs_1 = evecs[evecs_value]
    # carry out the transformation on the data using the selected eigenvector
    data_mod = np.dot(data, evecs_1.T)
    # define the projection of the points with the inverse transformation
    # (Center_X, Center_Y are the coordinates of the data centroid,
    # defined elsewhere)
    inverse = []
    for i in range(len(data_mod)):
        inverse.append([-data_mod[i] * evecs_1[0] + Center_X,
                        data_mod[i] * evecs_1[1] + Center_Y])
    inverse = np.array(inverse)
    return inverse
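For reference, this is the kind of check I would use to compare a hand-rolled eigh-based projection against sklearn (again on synthetic placeholder points): taking the eigenvector as a column of evecs and building mean + outer(scores, axis) reproduces pca.inverse_transform for the first axis.

```python
import numpy as np
from numpy import linalg as LA
from sklearn.decomposition import PCA

# Synthetic stand-in for my data
rng = np.random.default_rng(1)
pts = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])

mean = pts.mean(axis=0)
centered = pts - mean
R = np.cov(centered, rowvar=False)
evals, evecs = LA.eigh(R)
order = np.argsort(evals)[::-1]
evecs = evecs[:, order]            # eigenvectors are COLUMNS of evecs

k = 0                              # 0 = first axis, 1 = second axis
axis = evecs[:, k]                 # select a column, not a row
scores = centered @ axis
# the sign ambiguity of the eigenvector cancels in the outer product
manual_proj = mean + np.outer(scores, axis)

pca = PCA(n_components=1)
sk_proj = pca.inverse_transform(pca.fit_transform(pts))
print(np.allclose(manual_proj, sk_proj))   # True
```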
However, for reasons I do not understand, the results are quite different from what I obtain with pca.inverse_transform. Here is the plot: