pegasus.umap

pegasus.umap(data, rep='pca', n_components=2, n_neighbors=15, min_dist=0.5, spread=1.0, densmap=False, dens_lambda=2.0, dens_frac=0.3, dens_var_shift=0.1, n_jobs=-1, full_speed=False, random_state=0, out_basis='umap')

Calculate UMAP embedding of cells.

This function uses the umap-learn package. See [McInnes18] for details on UMAP.

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • rep (str, optional, default: "pca") – Representation of data used for the calculation. By default, use PCA coordinates. If None, use the count matrix data.X.

  • n_components (int, optional, default: 2) – Dimension of calculated UMAP coordinates. By default, generate 2-dimensional data for 2D visualization.

  • n_neighbors (int, optional, default: 15) – Number of nearest neighbors considered during the computation.

  • min_dist (float, optional, default: 0.5) – The effective minimum distance between embedded data points.

  • spread (float, optional, default: 1.0) – The effective scale of embedded data points.

  • densmap (bool, optional, default: False) – Whether the density-augmented objective of densMAP should be used for optimization, which will generate an embedding where local densities are encouraged to be correlated with those in the original space.

  • dens_lambda (float, optional, default: 2.0) – Controls the regularization weight of the density correlation term in densMAP. Only takes effect when densmap is True. Larger values prioritize density preservation over the UMAP objective, while values closer to 0 prioritize the UMAP objective. Setting this parameter to 0 is equivalent to running the original UMAP algorithm.

  • dens_frac (float, optional, default: 0.3) – Controls the fraction of epochs (between 0 and 1) in which the density-augmented objective is used in densMAP. Only takes effect when densmap is True. The first (1 - dens_frac) fraction of epochs optimizes the original UMAP objective; the density correlation term is introduced for the remaining epochs.

  • dens_var_shift (float, optional, default: 0.1) – A small constant added to the variance of local radii in the embedding when calculating the density correlation objective, to prevent numerical instability from dividing by a small number. Only takes effect when densmap is True.

  • n_jobs (int, optional, default: -1) – Number of threads to use for computing kNN graphs. If -1, use all physical CPU cores.

  • full_speed (bool, optional, default: False) –

    • If True, use multiple threads to construct the HNSW index. The kNN results are then not reproducible.

    • Otherwise, use only one thread so that the results are reproducible.

  • random_state (int, optional, default: 0) – Random seed set for reproducing results.

  • out_basis (str, optional, default: "umap") – Key name for calculated UMAP coordinates to store.
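
The effect of dens_frac on the optimization schedule can be sketched with a little arithmetic. The helper below is hypothetical and not part of pegasus or umap-learn. It assumes a total epoch count n_epochs (a value this function's signature does not expose) and assumes the split point is rounded to a whole epoch; the library's exact rounding may differ.

```python
from typing import Tuple


def densmap_epoch_split(n_epochs: int, dens_frac: float = 0.3) -> Tuple[int, int]:
    """Return (plain_umap_epochs, density_epochs) for a run of n_epochs.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= dens_frac <= 1.0:
        raise ValueError("dens_frac must be between 0 and 1")
    # Assumption: the number of density-augmented epochs is rounded
    # to the nearest whole epoch.
    density_epochs = round(dens_frac * n_epochs)
    # The remaining (1 - dens_frac) fraction comes first and optimizes
    # the original UMAP objective.
    return n_epochs - density_epochs, density_epochs


plain, dense = densmap_epoch_split(n_epochs=200, dens_frac=0.3)
print(plain, dense)
```

Under these assumptions, a 200-epoch run with the default dens_frac spends its first 140 epochs on the original UMAP objective and the last 60 on the density-augmented one.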

Return type

None

Returns

  • None

  • Update data.obsm

    • data.obsm['X_' + out_basis]: UMAP coordinates of the data.

Examples

>>> pg.umap(data)
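
A slightly longer sketch, using only the parameters documented above, shows how out_basis keeps two embeddings side by side. The helper names obsm_key and run_umap_variants are hypothetical. The sketch assumes pegasus is installed and that data already holds the PCA representation, since rep defaults to "pca".

```python
def obsm_key(out_basis="umap"):
    """Key under which the coordinates are stored in data.obsm."""
    return "X_" + out_basis


def run_umap_variants(data):
    """Compute a standard UMAP and a densMAP embedding on the same data.

    Hypothetical helper: `data` is assumed to be a
    pegasusio.MultimodalData object with PCA coordinates computed.
    """
    import pegasus as pg  # imported lazily; requires pegasus to be installed

    # Default call: coordinates are stored under data.obsm['X_umap'].
    pg.umap(data)
    # densMAP variant stored under a separate key so that it does not
    # overwrite the default embedding.
    pg.umap(data, densmap=True, dens_lambda=2.0, out_basis="densmap")
    return data.obsm[obsm_key()], data.obsm[obsm_key("densmap")]


print(obsm_key(), obsm_key("densmap"))
```

Because the function returns None and writes its result into data.obsm, choosing a distinct out_basis for each run is what makes the embeddings comparable afterwards.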