pegasus.pseudobulk
- pegasus.pseudobulk(data, sample, attrs=None, mat_key=None, cluster=None)
Generate Pseudo-bulk count matrices.
- Parameters
  - data (MultimodalData or UnimodalData object) – Annotated data matrix with rows for cells and columns for genes.
  - sample (str) – The cell attribute used for aggregating pseudo-bulk data. The key must exist in data.obs.
  - attrs (str or List[str], optional, default: None) – Additional cell attributes to keep in the pseudo-bulk data. If set, all attribute keys must exist in data.obs. Note that for a categorical attribute, each pseudo-bulk's value is the most frequent value among its cells, and for a numeric attribute, each pseudo-bulk's value is the mean over its cells.
  - mat_key (str, optional, default: None) – The single-cell count matrix used for aggregating pseudo-bulk counts. If None, use the raw count matrix in data: look for the raw.X key in its matrices first; if it does not exist, use the X key. Otherwise, use the count matrix with key mat_key from the matrices of data.
  - cluster (str, optional, default: None) – If set, additionally generate pseudo-bulk matrices per cluster, as specified in data.obs[cluster].
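The aggregation rule described for attrs (most frequent value for categorical attributes, mean for numeric ones) can be illustrated with a small pandas sketch. This is not the pegasus implementation: the helper aggregate_attrs, the toy obs table, and its column names are hypothetical, and ties between equally frequent categories are resolved here by sorted order, which the documentation does not specify.

```python
import pandas as pd

# Toy per-cell metadata standing in for data.obs (all values are made up).
obs = pd.DataFrame(
    {
        "Channel": ["s1", "s1", "s1", "s2", "s2"],
        "cell_type": ["T", "T", "B", "B", "B"],           # categorical attribute
        "n_genes": [100.0, 200.0, 300.0, 400.0, 600.0],   # numeric attribute
    }
)


def aggregate_attrs(obs, sample, attrs):
    """Collapse per-cell attributes to one row per sample (hypothetical helper)."""
    grouped = obs.groupby(sample)
    result = {}
    for attr in attrs:
        if pd.api.types.is_numeric_dtype(obs[attr]):
            # Numeric attribute: mean over the cells of each sample.
            result[attr] = grouped[attr].mean()
        else:
            # Categorical attribute: most frequent value among the cells.
            # Assumption: ties go to the first value in sorted order.
            result[attr] = grouped[attr].agg(lambda s: s.mode().iloc[0])
    return pd.DataFrame(result)


pseudo_obs = aggregate_attrs(obs, "Channel", ["cell_type", "n_genes"])
print(pseudo_obs)
```

Each row of pseudo_obs corresponds to one value of the sample attribute, mirroring how udata.obs holds one entry per pseudo-bulk.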
- Return type
UnimodalData
- Returns
A UnimodalData object udata containing pseudo-bulk information. It has the following count matrices:
  - X: The pseudo-bulk count matrix over all cells.
  - If cluster is set, one additional pseudo-bulk count matrix per cluster, each aggregated over the cells belonging to that cluster.
udata.obs: Contains pseudo-bulk attributes aggregated from the corresponding single-cell attributes.
udata.var: Gene names and Ensembl IDs are maintained.
Update data – Add the returned UnimodalData object above to data with key <sample>-pseudobulk, where <sample> is replaced by the actual value of the sample argument.
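To make the shape of the returned matrices concrete, here is a minimal NumPy sketch of the overall matrix X and the per-cluster matrices. It assumes that aggregation means summing raw counts over the cells of each sample, which this page does not state explicitly. The helper sum_by_sample and all toy values are hypothetical, and the keys under which pegasus stores the per-cluster matrices are not shown because the page does not give them.

```python
import numpy as np

# Toy cells-by-genes count matrix (5 cells, 3 genes); values are made up.
counts = np.array(
    [
        [1, 0, 2],
        [0, 3, 1],
        [4, 0, 0],
        [2, 2, 2],
        [0, 1, 5],
    ]
)
sample = np.array(["s1", "s1", "s1", "s2", "s2"])    # stands in for data.obs[sample]
cluster = np.array(["c1", "c2", "c1", "c1", "c2"])   # stands in for data.obs[cluster]


def sum_by_sample(counts, sample, mask=None):
    """Sum counts over the (optionally masked) cells of each sample.

    Hypothetical helper; summation is an assumption about the aggregation.
    """
    samples = np.unique(sample)
    if mask is None:
        mask = np.ones(len(sample), dtype=bool)
    rows = [counts[(sample == s) & mask].sum(axis=0) for s in samples]
    return samples, np.vstack(rows)


# Pseudo-bulk matrix over all cells: one row per sample, one column per gene.
samples, X = sum_by_sample(counts, sample)

# One extra pseudo-bulk matrix per cluster, restricted to that cluster's cells.
per_cluster = {
    str(c): sum_by_sample(counts, sample, cluster == c)[1]
    for c in np.unique(cluster)
}
```

Under this assumption the per-cluster matrices add up to X, since every cell belongs to exactly one cluster.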
Examples
>>> import pegasus as pg
>>> pg.pseudobulk(data, sample="Channel")