pegasus.run_scvi(data, features='highly_variable_features', matkey='raw.X', n_jobs=-1, random_state=0, max_epochs=None, batch=None, categorical_covariate_keys=None, continuous_covariate_keys=None, use_gpu=None)

Run scVI embedding.

This is a wrapper around the scvi-tools package.

  • data (MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • features (str, optional, default: "highly_variable_features") – Keyword in data.var, which refers to a boolean array. If None, all features will be selected.

  • matkey (str, optional, default: "raw.X") – Matrix key for the raw count matrix.

  • n_jobs (int, optional, default: -1) – Number of threads to use. -1 means all physical CPU cores are used.

  • random_state (int, optional, default: 0) – Seed for the random number generator.

  • max_epochs (int | None, optional, default: None) – Maximum number of training epochs. If None, defaults to np.min([round((20000 / n_cells) * 400), 400]).

  • batch (str, optional, default: None) – The obs key representing the batches to correct for. Use this when there is only one categorical covariate.

  • categorical_covariate_keys (List[str], optional, default: None) – A list of obs keys for the categorical covariates to correct for. Use this when there are multiple categorical covariates.

  • continuous_covariate_keys (List[str], optional, default: None) – A list of obs keys for the continuous covariates to correct for.

  • use_gpu (str | int | bool | None, optional, default: None) – Use the default GPU if available (if None or True), the GPU with the given index (if int), the GPU with the given name (if str, e.g., cuda:0), or the CPU (if False).
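
The default for max_epochs quoted above depends on the number of cells. The following is a minimal sketch of that heuristic, not the library's own code: the helper name default_max_epochs is hypothetical, and the built-in min stands in for np.min so the snippet needs no imports.

```python
def default_max_epochs(n_cells: int) -> int:
    """Illustrative re-statement of the documented default:
    min(round((20000 / n_cells) * 400), 400).
    Hypothetical helper, not part of pegasus or scvi-tools.
    """
    return min(round((20000 / n_cells) * 400), 400)


# Datasets with up to 20,000 cells hit the cap of 400 epochs;
# larger datasets get proportionally fewer epochs.
for n_cells in (5_000, 20_000, 40_000, 100_000):
    print(n_cells, default_max_epochs(n_cells))
```

Under this formula the cap of 400 applies to any dataset with 20,000 cells or fewer, so an explicit max_epochs mainly matters for larger datasets or when shorter training runs are wanted.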

Return type



  • out_rep (str) – The keyword in data.obsm referring to the embedding calculated by scVI. out_rep is always equal to "scVI".

  • Update data.obsm

    • data.obsm['X_scVI']: The embedding calculated by scVI.
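
The returned name and the data.obsm key differ by an "X_" prefix, which is easy to overlook. The sketch below illustrates that relationship using a plain dict as a stand-in for data.obsm; the dict, the made-up embedding values, and the helper obsm_key are assumptions for illustration only, not pegasus objects or functions.

```python
# Stand-in for data.obsm (hypothetical; a real MultimodalData object is not used here).
# The values are a made-up embedding for 2 cells in 2 dimensions.
obsm = {"X_scVI": [[0.1, -0.3], [0.7, 0.2]]}

# The value run_scvi is documented to return.
out_rep = "scVI"


def obsm_key(rep: str) -> str:
    """Build the obsm key for a representation name by prefixing 'X_'."""
    return "X_" + rep


# Look up the embedding through the key derived from the returned name.
embedding = obsm[obsm_key(out_rep)]
print(obsm_key(out_rep), len(embedding))
```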


>>> import pegasus as pg
>>> pg.run_scvi(data, batch="Channel")
>>> pg.run_scvi(data, categorical_covariate_keys=["cell_source", "donor"], continuous_covariate_keys=["percent_mito", "percent_ribo"])