pegasus.pseudobulk
- pegasus.pseudobulk(data, groupby, attrs=None, mat_key=None, condition=None)
Generate pseudo-bulk count matrices.
- Parameters
data (MultimodalData or UnimodalData object) – Annotated data matrix with rows for cells and columns for genes.
groupby (str) – Cell attribute used to aggregate cells into pseudo-bulks. The key must exist in data.obs.
attrs (str or List[str], optional, default: None) – Additional cell attributes to keep in the pseudo-bulk data. If set, every key must exist in data.obs. For a categorical attribute, each pseudo-bulk takes the most frequent value among its cells; for a numeric attribute, it takes the mean over its cells.
mat_key (str, optional, default: None) – Single-cell count matrix used for aggregating pseudo-bulk counts. If specified, use the count matrix with key mat_key from the matrices of data; otherwise, look for the key counts first, then fall back to raw.X if counts does not exist.
condition (str, optional, default: None) – If set, additionally generate pseudo-bulk matrices per condition, as specified in data.obs[condition].
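The aggregation rule described for attrs (most frequent value for categorical attributes, mean for numeric ones) can be illustrated with a small standalone sketch. This is plain Python, not Pegasus code: the helper aggregate_attr and the sample attribute names are hypothetical, and how Pegasus breaks ties between equally frequent categories is not stated above, so the sketch makes no claim about that.

```python
from collections import Counter, defaultdict
from statistics import mean


def aggregate_attr(groups, values):
    """Collapse one per-cell attribute into one value per pseudo-bulk.

    groups: per-cell group labels (the role of data.obs[groupby]).
    values: per-cell attribute values (the role of data.obs[attr]).
    Hypothetical helper for illustration only.
    """
    per_group = defaultdict(list)
    for g, v in zip(groups, values):
        per_group[g].append(v)

    result = {}
    for g, vals in per_group.items():
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in vals
        )
        if is_numeric:
            # Numeric attribute: mean among the group's cells.
            result[g] = mean(vals)
        else:
            # Categorical attribute: the most frequent value among the group's cells.
            result[g] = Counter(vals).most_common(1)[0][0]
    return result


# Five cells from two channels (made-up values).
channel = ["c1", "c1", "c1", "c2", "c2"]
sex = ["F", "F", "M", "M", "M"]
n_genes = [100, 200, 300, 400, 600]

sex_by_channel = aggregate_attr(channel, sex)
n_genes_by_channel = aggregate_attr(channel, n_genes)
```

Here sex_by_channel maps c1 to F and c2 to M, while n_genes_by_channel holds the per-channel means.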
- Returns
The returned mdata has the following count matrices:
  X: The pseudo-bulk count matrix over all cells.
  If condition is set, additional pseudo-bulk count matrices are added, one per condition, each restricted to the cells of that condition.
mdata.obs: It contains pseudo-bulk attributes aggregated from the corresponding single-cell attributes.
mdata.var: Gene names and Ensembl IDs are maintained.
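To make the layout of the returned matrices concrete, here is a minimal conceptual sketch in plain Python. It assumes that aggregation means summing the counts of the cells in each group, which the text above does not spell out. The function pseudobulk_counts, the toy data, and the use of the condition value as the matrix key are all hypothetical, since the keys Pegasus assigns to the per-condition matrices are not given here.

```python
def pseudobulk_counts(counts, groups, conditions=None):
    """Sum per-cell count rows into one row per group (illustration only).

    counts: list of per-cell rows, each a list of gene counts.
    groups: per-cell group labels (the role of data.obs[groupby]).
    conditions: optional per-cell condition labels (data.obs[condition]).

    Returns a dict of matrices. "X" covers all cells; when conditions is
    given, one extra matrix per condition value covers only that
    condition's cells. Each matrix maps group -> summed gene counts.
    """
    n_genes = len(counts[0])
    group_names = sorted(set(groups))

    def summed(keep):
        # Start every group at zero so all matrices share the same rows.
        mat = {g: [0] * n_genes for g in group_names}
        for i, (row, g) in enumerate(zip(counts, groups)):
            if keep(i):
                mat[g] = [a + b for a, b in zip(mat[g], row)]
        return mat

    matrices = {"X": summed(lambda i: True)}
    if conditions is not None:
        for cond in sorted(set(conditions)):
            matrices[cond] = summed(lambda i, c=cond: conditions[i] == c)
    return matrices


# Four cells, two genes, two channels, two conditions (made-up values).
counts = [[1, 0], [2, 3], [0, 5], [4, 1]]
channel = ["c1", "c1", "c2", "c2"]
cluster = ["T", "B", "T", "T"]

mats = pseudobulk_counts(counts, channel, cluster)
```

Every matrix has one row per group, so a group with no cells in a given condition ends up as a row of zeros in that condition's matrix under this sketch.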
- Return type
A MultimodalData object mdata containing pseudo-bulk information
Examples
>>> pg.pseudobulk(data, groupby="Channel")