Principal component analysis (PCA) is a statistical technique that can be used for data exploration. You do not need to understand the details of PCA to use it successfully to find patterns within your data, but understanding them can help you judge how significant such a pattern is. The screenshot below shows SpectralAnalysis’ interface for exploring PCA results.
Two methods for performing PCA are included within SpectralAnalysis. These can be found in the Data Reduction dropdown menu, as shown below.
The fastest method (PCA) requires the data to be in memory (see data representation), whereas Memory Efficient PCA only loads the bare minimum amount of data at any one time and can therefore be performed on datasets much larger than the available RAM. See Race et al. (https://pubs.acs.org/doi/10.1021/ac302528v) for more details on this method.
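One way to see how PCA can be run on data that does not fit in memory is to accumulate the sums needed for the covariance matrix one block of spectra at a time, so that only a single block and one channels-by-channels matrix are held in RAM. The Python/NumPy sketch below illustrates that idea only. It is not SpectralAnalysis code and may differ from the method described by Race et al.; the function name `chunked_pca` and the convention of passing an iterable of blocks are assumptions made for this example.

```python
import numpy as np

def chunked_pca(chunks, n_components):
    """PCA from an iterable of (n_spectra, n_channels) blocks (hypothetical helper).

    Only the current block and an (n_channels x n_channels) matrix
    are kept in memory at any one time.
    """
    n = 0
    total = None   # running sum of spectra, used for the mean
    outer = None   # running sum of X^T X over all blocks
    for block in chunks:
        block = np.asarray(block, dtype=float)
        if total is None:
            total = np.zeros(block.shape[1])
            outer = np.zeros((block.shape[1], block.shape[1]))
        n += block.shape[0]
        total += block.sum(axis=0)
        outer += block.T @ block

    mean = total / n
    # Sample covariance recovered from the accumulated sums
    cov = (outer - n * np.outer(mean, mean)) / (n - 1)

    # eigh returns eigenvalues in ascending order, so reverse and truncate
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]
```

In this sketch the principal components are the columns of the second return value, and the third return value holds the variance captured by each. The trade-off is that the covariance matrix itself must still fit in memory, which depends on the number of spectral channels rather than the number of spectra.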
PCA projects the data into a new space such that the first dimension (the first principal component) captures the largest amount of variance within the data. Each subsequent principal component captures the largest amount of the remaining variance, with the constraint that it must be orthogonal to all previous principal components.
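The two properties described above (decreasing variance and mutual orthogonality) can be checked on a small synthetic example. The following Python/NumPy sketch is an illustration only and is not taken from SpectralAnalysis; the function name `pca` and the random test data are assumptions made for this example. It computes the components from a singular value decomposition of the mean-centred data.

```python
import numpy as np

def pca(data, n_components):
    """Return (scores, components, explained_variance).

    `data` has one observation (e.g. one spectrum) per row.
    """
    centred = data - data.mean(axis=0)
    # SVD of the centred data: the rows of vt are the principal components,
    # ordered by decreasing singular value
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    explained_variance = s[:n_components] ** 2 / (data.shape[0] - 1)
    # Project the data onto the retained components
    scores = centred @ components.T
    return scores, components, explained_variance

rng = np.random.default_rng(0)
# Synthetic data: 200 observations of 6 channels with different spreads
data = rng.normal(size=(200, 6)) * np.array([4.0, 3.0, 2.0, 1.0, 0.5, 0.1])
scores, components, variance = pca(data, 3)

# Fraction of the total variance captured by each retained component
fraction = variance / np.var(data, axis=0, ddof=1).sum()
```

Here `variance` is in non-increasing order and `components @ components.T` is close to the identity matrix, which reflects the orthogonality constraint. The `fraction` values give one simple way to judge how much of the structure in the data a given component accounts for.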