pegasus.de_analysis

pegasus.de_analysis(data, cluster, condition=None, subset=None, de_key='de_res', n_jobs=-1, t=False, fisher=False, temp_folder=None, verbose=True)

Perform Differential Expression (DE) Analysis on data.

The analysis considers one cluster at a time. It compares gene expression levels in the cells of that cluster against all other cells using several statistical tests, and determines the up-regulated and down-regulated genes of the cluster.

The Mann-Whitney U test and AUROC are calculated by default. Welch’s t test and Fisher’s exact test are optional.

The calculation of the test statistics is made more scalable by an approach inspired by Presto.
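
The one-cluster-versus-rest comparison behind the default statistics can be illustrated for a single gene. The following is a minimal sketch in pure Python, not the Pegasus implementation: the helper names `average_ranks` and `one_vs_rest_auroc` and the toy data are hypothetical. It relies on the standard identity that AUROC equals the Mann-Whitney U statistic divided by the product of the two group sizes.

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every entry tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2.0 + 1.0  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = tied_rank
        start = end + 1
    return ranks


def one_vs_rest_auroc(
    expr: Sequence[float], labels: Sequence[str], cluster_id: str
) -> Tuple[float, float]:
    """Return (U, AUROC) for one gene: cells in `cluster_id` vs. all other cells."""
    ranks = average_ranks(expr)
    in_cluster = [lab == cluster_id for lab in labels]
    n1 = sum(in_cluster)
    n2 = len(labels) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    rank_sum = sum(r for r, inside in zip(ranks, in_cluster) if inside)
    u_stat = rank_sum - n1 * (n1 + 1) / 2.0  # U statistic of the cluster
    return u_stat, u_stat / (n1 * n2)        # AUROC = U / (n1 * n2)


# Hypothetical toy data: expression of one gene in six cells, three clusters.
expr = [5.0, 3.0, 4.0, 0.0, 1.0, 0.0]
labels = ["1", "1", "1", "2", "2", "3"]
for cid in sorted(set(labels)):
    u, auroc = one_vs_rest_auroc(expr, labels, cid)
    print(cid, u, auroc)
```

An AUROC near 1 suggests the gene is higher in the cluster than in the rest (up-regulated), a value near 0 suggests the opposite, and 0.5 indicates no separation.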

Parameters

data (MultimodalData, UnimodalData, or anndata.AnnData) – Data matrix with rows for cells and columns for genes.
cluster (str) – Cluster labels used in DE analysis. Must exist in data.obs.
condition (str, optional, default: None) – Sample attribute used as condition in DE analysis. If None, no condition is considered; otherwise, must exist in data.obs. If condition is used, the DE analysis is performed separately on the cells of each level of data.obs[condition], and the results are collected once all levels are finished.
subset (List[str], optional, default: None) – Perform DE analysis on only a subset of cluster IDs. The subset is specified as a list of strings, such as ['clust_1', 'clust_3', 'clust_5'], where all IDs must exist in data.obs[cluster].
de_key (str, optional, default: "de_res") – Key under which the DE analysis results are stored in data.varm.
n_jobs (int, optional, default: -1) – Number of threads to use. If -1, use all available threads.
t (bool, optional, default: False) – If True, calculate Welch’s t test.
fisher (bool, optional, default: False) – If True, calculate Fisher’s exact test.
temp_folder (str, optional, default: None) – Joblib temporary folder for memmapping numpy arrays.
verbose (bool, optional, default: True) – If True, show detailed intermediate output.
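
The two optional tests enabled by t and fisher can be sketched for a single gene as follows. This is an illustration using only the Python standard library, not the code Pegasus runs. The function names are hypothetical, and the 2x2 table layout for Fisher’s exact test (cells expressing the gene or not, inside the cluster or in the rest) is an assumption about how the test is applied.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and its Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2  # squared standard errors
    t_stat = (mean(x) - mean(y)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as likely as the observed one.
    total = sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical toy data: one gene, cells inside the cluster vs. all other cells.
in_cluster = [5.0, 3.0, 4.0]
rest = [0.0, 1.0, 0.0]

t_stat, df = welch_t(in_cluster, rest)

# Assumed table layout: rows = in cluster / rest, columns = expressed / not expressed.
a = sum(v > 0 for v in in_cluster)
b = len(in_cluster) - a
c = sum(v > 0 for v in rest)
d = len(rest) - c
p_fisher = fisher_exact_two_sided(a, b, c, d)
print(t_stat, df, p_fisher)
```

Welch’s t test compares mean expression without assuming equal variances, while Fisher’s exact test only looks at whether a gene is detected, so the two can disagree for genes that are widely expressed at different levels.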

Return type

None

Returns

None
Updates data.varm – data.varm[de_key]: DE analysis result.

Examples

>>> pg.de_analysis(data, cluster='spectral_leiden_labels')
>>> pg.de_analysis(data, cluster='louvain_labels', condition='anno')