pegasus.run_harmony
- pegasus.run_harmony(data, batch='Channel', rep='pca', n_comps=None, n_jobs=-1, n_clusters=None, random_state=0, use_gpu=False, max_iter_harmony=10)
Batch correction on PCs using Harmony.
This is a wrapper around the harmony-pytorch package, a PyTorch implementation of the Harmony algorithm [Korsunsky19].
- Parameters
  - data (MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
  - batch (str or List[str], optional, default: "Channel") – Which attribute in the data.obs field represents batches; default is "Channel". If using multiple attributes, specify their names in a list.
  - rep (str, optional, default: "pca") – Which representation to use as input of Harmony; default is PCA.
  - n_comps (int, optional, default: None) – Number of components to be used in the rep. If n_comps is None, use all components; otherwise, use the minimum of n_comps and rep's dimensions.
  - n_jobs (int, optional, default: -1) – Number of threads to use in Harmony. -1 refers to using all physical CPU cores.
  - n_clusters (int, optional, default: None) – Number of Harmony clusters. Default is None, which asks Harmony to estimate this number from the data.
  - random_state (int, optional, default: 0) – Seed for the random number generator.
  - use_gpu (bool, optional, default: False) – If True, use GPU if available. Otherwise, use CPU only.
  - max_iter_harmony (int, optional, default: 10) – Maximum number of Harmony iterations to run if it has not converged.
- Return type
  str
- Returns
  out_rep (str) – The keyword in data.obsm referring to the embedding calculated by the Harmony algorithm. This keyword is rep + '_harmony', where rep is the input parameter above.
- Update data.obsm
  data.obsm['X_' + out_rep]: The embedding calculated by the Harmony algorithm.
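The naming convention described above can be illustrated with a minimal sketch. The helper `harmony_keys` is hypothetical and not part of Pegasus; it only mirrors the documented rule that the returned keyword is `rep + '_harmony'` and that the embedding is stored under `'X_' + out_rep` in `data.obsm`.

```python
from typing import Tuple


def harmony_keys(rep: str = "pca") -> Tuple[str, str]:
    """Hypothetical helper (not a Pegasus function) mirroring the documented naming rule."""
    out_rep = rep + "_harmony"   # the keyword that run_harmony is documented to return
    obsm_key = "X_" + out_rep    # the data.obsm entry documented to hold the embedding
    return out_rep, obsm_key


out_rep, obsm_key = harmony_keys("pca")
print(out_rep)   # pca_harmony
print(obsm_key)  # X_pca_harmony
```

With the default `rep="pca"`, the corrected embedding would therefore be expected at `data.obsm['X_pca_harmony']`.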
Examples
>>> pg.run_harmony(data, rep = "pca", n_jobs = 10, random_state = 25)
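The `n_comps` rule from the parameter list can be sketched in plain Python. The function `resolve_n_comps` is a hypothetical illustration of the documented behavior (use all components when `n_comps` is `None`, otherwise the minimum of `n_comps` and the representation's dimensions). It is not the library's internal code.

```python
from typing import Optional


def resolve_n_comps(n_comps: Optional[int], rep_dims: int) -> int:
    """Hypothetical sketch of how many components of `rep` would be used."""
    if n_comps is None:
        # No limit requested: use every component of the representation.
        return rep_dims
    # Never use more components than the representation actually has.
    return min(n_comps, rep_dims)


# Assuming a representation with 50 components:
print(resolve_n_comps(None, 50))  # 50
print(resolve_n_comps(20, 50))    # 20
print(resolve_n_comps(100, 50))   # 50
```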