pegasus.pca

pegasus.pca(data, n_components=50, features='highly_variable_features', standardize=True, max_value=10, robust=False, random_state=0)

Perform Principal Component Analysis (PCA) on the data.

The calculation uses the scikit-learn implementation.

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • n_components (int, optional, default: 50.) – Number of Principal Components to get.

  • features (str, optional, default: "highly_variable_features".) – Keyword in data.var to specify features used for PCA.

  • standardize (bool, optional, default: True.) – Whether to scale the data to unit variance and zero mean.

  • max_value (float, optional, default: 10.) – The threshold to truncate data after scaling. If None, do not truncate.

  • robust (bool, optional, default: False.) – If True, use the ‘arpack’ SVD solver instead of ‘randomized’ for large sparse matrices (i.e. when max(X.shape) > 500 and n_components < 0.8 * min(X.shape)).

  • random_state (int, optional, default: 0.) – Random seed, set so that results are reproducible.
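
The effect of standardize and max_value can be illustrated with a minimal NumPy sketch. This is not the Pegasus implementation: the helper name standardize_and_truncate is hypothetical, and both the use of the population standard deviation (ddof=0) and the symmetric clipping to [-max_value, max_value] are assumptions made for illustration.

```python
import numpy as np


def standardize_and_truncate(X, max_value=10.0):
    """Scale each column to zero mean and unit variance, then clip.

    Hypothetical helper for illustration only (not part of Pegasus).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)      # ddof=0 is an assumption
    std[std == 0] = 1.0      # leave constant features unscaled
    Z = (X - mean) / std
    if max_value is not None:
        # Assumed symmetric truncation of the scaled values.
        Z = np.clip(Z, -max_value, max_value)
    return Z


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))   # 200 cells x 5 features of toy data
X[0, 0] = 1000.0                # one extreme outlier in the first feature

Z = standardize_and_truncate(X, max_value=10.0)
Z_untruncated = standardize_and_truncate(X, max_value=None)
```

In this toy data the outlier has a scaled value above 10 in Z_untruncated and is capped at 10 in Z, so a single extreme cell contributes less to the leading components.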

Return type

None

Returns

  • None.

  • Update data.obsm

    • data.obsm["X_pca"]: PCA matrix of the data.

  • Update data.uns

    • data.uns["PCs"]: The principal components containing the loadings.

    • data.uns["pca_variance"]: Explained variance, i.e. the eigenvalues of the covariance matrix.

    • data.uns["pca_variance_ratio"]: Ratio of explained variance.
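
How the three outputs relate to one another can be sketched with plain NumPy on toy data. The variable names below only echo the keys listed above; the sketch uses an eigendecomposition of the sample covariance matrix rather than the scikit-learn solver, and the features-by-components orientation of the loadings is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 100 cells x 6 correlated features.
X = rng.normal(size=(100, 6)) @ rng.normal(size=(6, 6))
Xc = X - X.mean(axis=0)  # PCA operates on centered data

# Eigendecomposition of the sample covariance matrix (divides by n - 1).
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]  # sort by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_components = 3
pca_variance = eigvals[:n_components]              # explained variance
pca_variance_ratio = pca_variance / eigvals.sum()  # share of total variance
loadings = eigvecs[:, :n_components]               # features x components (assumed)
X_pca = Xc @ loadings                              # cells x components
```

Each column of X_pca has a sample variance equal to the matching entry of pca_variance, which is why the explained variance can be read as the eigenvalues of the covariance matrix.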

Examples

>>> import pegasus as pg
>>> pg.pca(data)