pegasus.run_harmony
- pegasus.run_harmony(data, batch='Channel', rep='pca', n_comps=None, n_jobs=-1, n_clusters=None, random_state=0, use_gpu=False, max_iter_harmony=10)
Batch correction on PCs using Harmony.
This is a wrapper around the harmony-pytorch package, a PyTorch implementation of the Harmony algorithm [Korsunsky19].
- Parameters
data (MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
batch (str or List[str], optional, default: "Channel") – Which attribute in data.obs represents batches. To use multiple attributes, specify their names in a list.
rep (str, optional, default: "pca") – Which representation to use as input to Harmony.
n_comps (int, optional, default: None) – Number of components of rep to use. If n_comps is None, use all components; otherwise, use the minimum of n_comps and the number of dimensions in rep.
n_jobs (int, optional, default: -1) – Number of threads to use in Harmony. -1 means use all physical CPU cores.
n_clusters (int, optional, default: None) – Number of Harmony clusters. If None, Harmony estimates this number from the data.
random_state (int, optional, default: 0) – Seed for the random number generator.
use_gpu (bool, optional, default: False) – If True, use a GPU if one is available. Otherwise, use CPU only.
max_iter_harmony (int, optional, default: 10) – Maximum number of Harmony iterations to run if it has not converged.
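The n_comps rule described above can be sketched in a few lines. This is only an illustration of the documented behavior, not Pegasus's actual code; resolve_n_comps and rep_dims are hypothetical names introduced here.

```python
from typing import Optional


def resolve_n_comps(n_comps: Optional[int], rep_dims: int) -> int:
    """Hypothetical helper: how many components of `rep` would be used.

    rep_dims is the number of dimensions available in the representation.
    """
    if n_comps is None:
        # None means "use every component of the representation".
        return rep_dims
    # Otherwise the request is capped at what the representation provides.
    return min(n_comps, rep_dims)


print(resolve_n_comps(None, 50))  # all 50 components
print(resolve_n_comps(20, 50))    # the first 20 components
print(resolve_n_comps(100, 50))   # capped at 50
```

Under this reading, asking for more components than rep contains is not an error; the request is simply capped.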
- Return type
str
- Returns
out_rep (str) – The keyword in data.obsm referring to the embedding calculated by the Harmony algorithm. This keyword is rep + '_harmony', where rep is the input parameter above.
Update data.obsm:
data.obsm['X_' + out_rep]: The embedding calculated by the Harmony algorithm.
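The naming convention above is easy to get wrong, since the returned keyword does not include the 'X_' prefix. The sketch below just spells out the string arithmetic as documented; harmony_keys is a hypothetical helper, not part of Pegasus.

```python
from typing import Tuple


def harmony_keys(rep: str = "pca") -> Tuple[str, str]:
    """Hypothetical helper: derive the names described in the Returns section."""
    out_rep = rep + "_harmony"   # the value run_harmony is documented to return
    obsm_key = "X_" + out_rep    # the key under which the embedding is stored in data.obsm
    return out_rep, obsm_key


out_rep, obsm_key = harmony_keys("pca")
print(out_rep)   # pca_harmony
print(obsm_key)  # X_pca_harmony
```

In practice you would take out_rep from the return value of pg.run_harmony and read the corrected embedding from data.obsm['X_' + out_rep].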
Examples
>>> pg.run_harmony(data, rep = "pca", n_jobs = 10, random_state = 25)