pegasus.run_harmony

pegasus.run_harmony(data, rep='pca', n_jobs=-1, n_clusters=None, random_state=0)

Batch correction on PCs using Harmony.

This is a wrapper around the harmony-pytorch package, a PyTorch implementation of the Harmony algorithm [Korsunsky19].

Parameters
  • data (MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • rep (str, optional, default: "pca") – Representation to use as input to Harmony.

  • n_jobs (int, optional, default: -1) – Number of threads to use for the KMeans clustering in Harmony. -1 uses all available threads.

  • n_clusters (int, optional, default: None) – Number of Harmony clusters. If None, Harmony estimates this number from the data.

  • random_state (int, optional, default: 0) – Seed for the random number generator.

Return type

str

Returns

  • out_rep (str) – The keyword referring to the embedding calculated by the Harmony algorithm, which is stored in data.obsm under 'X_' + out_rep.

    This keyword is rep + '_harmony', where rep is the input parameter above.

  • Updates data.obsm

    • data.obsm['X_' + out_rep]: The embedding calculated by the Harmony algorithm.
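
The naming rule above can be sketched in plain Python. This is only an illustration of how the returned keyword relates to the data.obsm key; harmony_keys is a hypothetical helper and not part of pegasus.

```python
# Hypothetical helper (not part of pegasus): mirrors the naming rule
# described in the Returns section.
def harmony_keys(rep="pca"):
    out_rep = rep + "_harmony"   # the keyword run_harmony returns
    obsm_key = "X_" + out_rep    # the data.obsm key holding the embedding
    return out_rep, obsm_key


out_rep, obsm_key = harmony_keys("pca")
print(out_rep, obsm_key)  # prints: pca_harmony X_pca_harmony
```

With a real MultimodalData object, the same rule means the corrected embedding should be retrievable as data.obsm['X_' + out_rep], where out_rep is the value returned by run_harmony.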

Examples

>>> pg.run_harmony(data, rep = "pca", n_jobs = 10, random_state = 25)