Why extract eigenvectors from PCA?

David_McClish_DerBr · June 28, 2024, 12:05pm

When doing PCA, why would we extract the eigenvectors from it as done here: image

Wouldn't we want to use PCA to transform the original data, and how would just using the eigenvectors help with clustering?

Also, shouldn't we scale the data before using PCA? Is this better done with MinMax or StandardScaler?

In the notebook, the features are scaled, but only AFTER PCA was conducted. Shouldn't scaling also be done before? image 2

Course: Unsupervised Learning in Trading

Notebook: Feature Engineering for Pairs Trading

Rushda_Ansari_A8OBX · June 29, 2024, 2:38pm

Hey David,

We'll check this and get back to you.

Rushda_Ansari_A8OBX · July 3, 2024, 12:32am

Hi David,

You're correct, thank you for pointing this out. We will make the necessary changes to the notebook.

Thanks

Rushda

Rushda_Ansari_A8OBX · July 13, 2024, 1:48pm

Hey David,

We have updated the notebook so that the data is scaled before PCA is applied. The data file created at the end of the notebook has also been updated.
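To illustrate why scaling before PCA matters, here is a minimal sketch. It is not the notebook's code: the data is synthetic, the helper name first_component is made up for this example, and standardization is done by hand with NumPy (zero mean, unit variance per feature, which is what StandardScaler computes).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two correlated features on very different scales
# (think of a price-like column next to a ratio-like column).
base = rng.normal(size=500)
X = np.column_stack([
    1000.0 * base + rng.normal(scale=100.0, size=500),  # large-scale feature
    0.01 * base + rng.normal(scale=0.01, size=500),     # small-scale feature
])


def first_component(data):
    """Return the unit eigenvector of the covariance matrix with the largest eigenvalue."""
    cov = np.cov(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return eigvecs[:, -1]


# Standardize each feature to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

pc_raw = first_component(X)
pc_scaled = first_component(X_scaled)

# The sign of an eigenvector is arbitrary, so compare absolute values.
print("first PC, raw data:   ", np.round(np.abs(pc_raw), 3))
print("first PC, scaled data:", np.round(np.abs(pc_scaled), 3))
```

On the raw data the first component points almost entirely along the large-scale feature, simply because its variance is numerically larger. After standardization both features carry equal weight in the first component.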

However, cell 3 is unchanged. The eigenvectors (principal component vectors) indicate the directions of maximum variance in the data, so extracting them shows where the data varies the most and helps reveal its underlying structure. They also show how much each feature contributes to each principal component.
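As a rough sketch of how eigenvectors relate to feature contributions and to the transformed data, the example below computes them directly with NumPy. It is an illustration under assumptions, not the notebook's cell 3: the feature names and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
feature_names = ["feat_a", "feat_b", "feat_c"]  # hypothetical feature names

# Synthetic data: feat_a and feat_b share a common driver, feat_c is independent.
latent = rng.normal(size=(300, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(300, 1)),
    latent + 0.1 * rng.normal(size=(300, 1)),
    rng.normal(size=(300, 1)),
])

# Scale first, then run PCA via an eigendecomposition of the covariance matrix.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(X_scaled, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort components by descending eigenvalue (variance explained).
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
explained_ratio = eigvals / eigvals.sum()

# Each column of eigvecs is one principal direction; its entries are the
# weights (loadings) of the original features on that component.
for name, loading in zip(feature_names, eigvecs[:, 0]):
    print(f"{name}: loading on PC1 = {loading:+.3f}")
print("explained variance ratio:", np.round(explained_ratio, 3))

# Projecting the scaled data onto the eigenvectors gives the transformed data.
scores = X_scaled @ eigvecs
```

The eigenvectors describe the directions, while scores holds the data expressed in those directions. Inspecting the loadings shows which features drive each component, and the projected scores are what a clustering step would typically consume.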