PCA

Principal component analysis (PCA) on features with numerical values

📘

PCA is an advanced Monitor that lets you monitor multidimensional correlations. If you have limited experience with PCA, we suggest involving a Data Scientist with PCA experience when setting up this Monitor.

Configuration parameters

| Parameter name and description | Parameter values |
| --- | --- |
| 1. Name | Arbitrary string |
| 2. Target features: the features to perform the PCA on | Multi-select of all features with numeric values |
| 3. Number of components: number of components to reduce the dimension to; needs to be less than the number of selected target features | Integer |
| 4. Computed metric | Primary component explained variance; Explained variance effective component number |

Parameter details

Principal component analysis (PCA) can be used to monitor changes in the eigenvectors of the dataset’s covariance matrix, and in how much variance each component (eigenvector) explains. In layman’s terms, PCA checks how ‘spread out’ the data is along certain axes.
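
To make the relationship between the covariance matrix and explained variance concrete, here is a minimal sketch using NumPy. It is not the Monitor's implementation: the helper name `explained_variance_ratios` and the sample data are assumptions made purely for illustration, and the sketch does not scale or standardize the features.

```python
import numpy as np


def explained_variance_ratios(data: np.ndarray) -> np.ndarray:
    """Share of total variance explained by each principal component.

    `data` has shape (n_samples, n_features). Hypothetical helper for
    illustration only, not the Monitor's implementation.
    """
    # Covariance matrix of the features (columns of `data`).
    cov = np.cov(data, rowvar=False)
    # The covariance matrix is symmetric, so eigvalsh applies.
    # It returns the eigenvalues in ascending order.
    eigenvalues = np.linalg.eigvalsh(cov)
    # Reverse so index 0 is the primary component, and clip tiny
    # negative values caused by floating-point error.
    eigenvalues = np.clip(eigenvalues[::-1], 0.0, None)
    return eigenvalues / eigenvalues.sum()


# Illustrative data: the second feature is an exact multiple of the
# first, and the third feature is unrelated noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
data = np.column_stack([x, 2.0 * x, rng.normal(size=500)])

ratios = explained_variance_ratios(data)
print(ratios)
```

Because two of the three features move together exactly, the last component explains essentially none of the variance, and most of the spread lies along the first component.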

Primary component explained variance

Validates what percentage of the variance the primary (first) component explains.

A higher percentage means that more of the data's variation lies along the direction of the first component, while a lower percentage means that less of it does.
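
For intuition, consider the simplest case of two target features. Assuming both are standardized to unit variance (an assumption for this sketch; nothing here says whether the Monitor standardizes its inputs), their covariance matrix is [[1, r], [r, 1]], where r is the correlation between them. Its eigenvalues are 1 + |r| and 1 - |r|, so the primary component explains (1 + |r|) / 2 of the variance. The function name below is hypothetical.

```python
def primary_share_two_features(correlation: float) -> float:
    """Primary component explained variance for two unit-variance features.

    Hypothetical helper for illustration. The covariance matrix
    [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|, which sum to 2,
    so the larger eigenvalue's share is (1 + |r|) / 2.
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must be in [-1, 1]")
    return (1.0 + abs(correlation)) / 2.0


for r in (0.0, 0.5, 0.9, 1.0):
    share = primary_share_two_features(r)
    print(f"correlation={r:.1f} -> primary share={share:.0%}")
```

Under these assumptions, two uncorrelated features give a primary share of 50%, and two perfectly correlated features give 100%.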

Explained variance effective component number

Validates how many principal components are needed to explain 90% of the variance. The number of components needed can be fractional.

The fewer components needed to explain 90% of the variance relative to the number of target features, the more correlated the selected features are.
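
One plausible way to arrive at a fractional component count is to interpolate linearly on the cumulative explained variance. The sketch below does that. How the Monitor actually computes the fractional value is not specified here, so treat the interpolation, the function name `effective_component_number`, and the example ratios as assumptions.

```python
from itertools import accumulate


def effective_component_number(ratios, threshold=0.9):
    """Fractional number of components needed to explain `threshold`.

    `ratios` are explained variance ratios sorted in descending order.
    Linear interpolation between components is an assumption made for
    this sketch, not a documented behaviour of the Monitor.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    previous = 0.0
    for count, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= threshold:
            # Fraction of the current component needed to hit the threshold.
            fraction = (threshold - previous) / (cumulative - previous)
            return (count - 1) + fraction
        previous = cumulative
    raise ValueError("ratios never reach the threshold")


# Four uncorrelated features with equal variance: each explains 25%.
uncorrelated = effective_component_number([0.25, 0.25, 0.25, 0.25])
# Four strongly correlated features: one component dominates.
correlated = effective_component_number([0.95, 0.03, 0.01, 0.01])

print(f"uncorrelated: {uncorrelated:.2f} of 4 components")
print(f"correlated:   {correlated:.2f} of 4 components")
```

With these illustrative ratios, the uncorrelated case needs roughly 3.6 of the 4 components, while the strongly correlated case needs less than one.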