pegasus.deseq2

pegasus.deseq2(pseudobulk, design, contrasts, backend='pydeseq2', de_key='deseq2', alpha=0.05, compute_all=False, verbose=True, n_jobs=-1)

Perform Differential Expression (DE) Analysis using DESeq2 on pseudobulk data.

Parameters
  • pseudobulk (MultimodalData or UnimodalData) – Pseudobulk data with rows for samples/pseudobulks and columns for genes. It may contain multiple count matrices of the same shape under different keys.

  • design (str or List[str]) – For the pydeseq2 backend, specify either a factor or a list of factors to be used as design variables. They must all be in pseudobulk.obs. For the deseq2 backend, specify the design formula that will be passed to DESeq2, e.g. ~group+condition or ~genotype+treatment+genotype:treatment.

  • contrasts (Tuple[str, str, str] or List[Tuple[str, str, str]]) – A tuple of three elements passed to DESeq2: a factor in the design formula, a level of that factor as the test level (numerator of the fold change), and a level as the reference level (denominator of the fold change). It also accepts multiple contrasts as a list. In that case, de_key must be a list of strings, and the DE result of each contrast will be stored in pseudobulk.varm under the corresponding key.

  • backend (str, optional, default: pydeseq2) – Specify which package to use as the backend for pseudobulk DE analysis. By default, PyDESeq2 is used, which is a pure Python implementation of the DESeq2 method. Alternatively, specify deseq2 to use the R package DESeq2, which requires the rpy2 package and an R installation.

  • de_key (str or List[str], optional, default: "deseq2") – Key name under which the DE analysis results are stored. For a count matrix named condition.X, the stored key will be condition.de_key. Provide a list of keys if contrasts is a list.

  • alpha (float, optional, default: 0.05) – The significance cutoff (between 0 and 1) used for optimizing the independent filtering to calculate the adjusted p-values (FDR).

  • compute_all (bool, optional, default: False) – Whether to perform DE analysis on all count matrices. By default (compute_all=False), DE analysis is applied only to the default count matrix, counts.

  • verbose (bool, optional, default: True) – Whether to show DESeq2 status updates during fitting. Only takes effect when backend="pydeseq2".

  • n_jobs (int, optional, default: -1) – Number of threads to use. If -1, use all physical CPU cores. This only works when backend="pydeseq2".
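
The direction of a contrast can be illustrated with a small sketch. The tuple order is (factor, test level, reference level), so a positive log2 fold change means higher expression in the test level. The mean counts below are made-up numbers, and naive_log2_fold_change is a hypothetical helper: it is a plain ratio of means, not the model-based estimate that DESeq2 or PyDESeq2 computes.

```python
import math


def naive_log2_fold_change(test_mean: float, ref_mean: float) -> float:
    """Hypothetical helper: log2 of test mean over reference mean.

    This only illustrates the sign convention of a contrast; it is
    NOT how DESeq2 estimates fold changes.
    """
    return math.log2(test_mean / ref_mean)


# Contrast order: (factor, test level, reference level)
contrast = ("gender", "female", "male")
factor, test_level, ref_level = contrast

# Made-up mean counts of one gene per level (illustration only)
mean_counts = {"female": 8.0, "male": 2.0}

lfc = naive_log2_fold_change(mean_counts[test_level], mean_counts[ref_level])
swapped = naive_log2_fold_change(mean_counts[ref_level], mean_counts[test_level])

print(lfc)      # 2.0: higher in the test level (female)
print(swapped)  # -2.0: swapping test and reference flips the sign
```

Swapping the second and third elements of the tuple therefore flips the sign of every reported fold change.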

Return type

None

Returns

  • None

  • Update pseudobulk.varm:
      pseudobulk.varm[de_key]: DE analysis result for the pseudobulk count matrix.
      (Optional) pseudobulk.varm[condition.de_key]: If compute_all=True, DE results for each condition-specific pseudobulk count matrix.
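
The key naming can be sketched as follows. varm_key_for is a hypothetical helper written only from the description on this page; it is not part of Pegasus. It assumes that the default matrix is named counts and that a condition-specific matrix named <condition>.X maps to <condition>.<de_key>, which is read from the de_key description rather than from the library source.

```python
def varm_key_for(matrix_name: str, de_key: str = "deseq2",
                 default_matrix: str = "counts") -> str:
    """Hypothetical helper: the varm key expected for a count matrix.

    Assumption (from this page only): the default matrix stores its
    result under de_key, and a matrix named '<condition>.X' stores
    its result under '<condition>.<de_key>'.
    """
    if matrix_name == default_matrix:
        return de_key
    # Split off the trailing '.X' suffix to recover the condition name
    condition, sep, _suffix = matrix_name.rpartition(".")
    if not sep:
        raise ValueError(f"Unexpected matrix name: {matrix_name!r}")
    return f"{condition}.{de_key}"


print(varm_key_for("counts"))                   # deseq2
print(varm_key_for("treated.X"))                # treated.deseq2
print(varm_key_for("treated.X", "deseq2_A_B"))  # treated.deseq2_A_B
```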

Examples

>>> pg.deseq2(pseudobulk, 'gender', ('gender', 'female', 'male'))
>>> pg.deseq2(pseudobulk, '~gender', ('gender', 'female', 'male'), backend="deseq2")
>>> pg.deseq2(pseudobulk, 'treatment', [('treatment', 'A', 'B'), ('treatment', 'A', 'C')], de_key=['deseq2_A_B', 'deseq2_A_C'])
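
When several contrasts are passed, de_key needs one key per contrast. A minimal sketch of deriving the keys from the contrasts, so the two lists stay in step, is shown below. The deseq2_<test>_<reference> naming pattern simply mirrors the last example above and is a convention, not a requirement; the pg.deseq2 call is left commented out because it needs a real pseudobulk object.

```python
# Contrasts as (factor, test level, reference level)
contrasts = [("treatment", "A", "B"), ("treatment", "A", "C")]

# One key per contrast, following the pattern used in the example above
de_keys = [f"deseq2_{test}_{ref}" for _factor, test, ref in contrasts]

# Keys must be unique, otherwise one result would overwrite another
assert len(set(de_keys)) == len(contrasts)

print(de_keys)  # ['deseq2_A_B', 'deseq2_A_C']

# With a pseudobulk object at hand, the call would then be:
# pg.deseq2(pseudobulk, 'treatment', contrasts, de_key=de_keys)
```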