pegasus.qc_metrics
- pegasus.qc_metrics(data, select_singlets=False, remap_string=None, subset_string=None, min_genes=None, max_genes=None, min_umis=None, max_umis=None, mito_prefix=None, percent_mito=None)
Generate quality control (QC) metrics for the cell barcodes in the dataset.
- Parameters
  - data (pegasusio.MultimodalData) – Use the currently selected modality in data, which should contain one RNA expression matrix.
  - select_singlets (bool, optional, default: False) – If True, select only singlets.
  - remap_string (str, optional, default: None) – Remap singlet names using <remap_string>, which takes the format “new_name_i:old_name_1,old_name_2;new_name_ii:old_name_3;…”. For example, if we hashed 5 libraries from 3 samples (sample1_lib1, sample1_lib2, sample2_lib1, sample2_lib2 and sample3), we can remap them to 3 samples using the string “sample1:sample1_lib1,sample1_lib2;sample2:sample2_lib1,sample2_lib2”. The new singlet names are stored in the metadata field with key ‘assignment’, while the old names are kept in the metadata field with key ‘assignment.orig’.
  - subset_string (str, optional, default: None) – If selecting singlets, select only the singlets listed in <subset_string>, which takes the format “name1,name2,…”. Note that if remap_string is specified, subsetting happens after remapping. For example, we can select only singlets from samples 1 and 3 using “sample1,sample3”.
  - min_genes (int, optional, default: None) – Only keep cells with at least min_genes genes.
  - max_genes (int, optional, default: None) – Only keep cells with fewer than max_genes genes.
  - min_umis (int, optional, default: None) – Only keep cells with at least min_umis UMIs.
  - max_umis (int, optional, default: None) – Only keep cells with fewer than max_umis UMIs.
  - mito_prefix (str, optional, default: None) – Prefix for mitochondrial genes.
  - percent_mito (float, optional, default: None) – Only keep cells in which mitochondrial genes account for less than percent_mito % of total counts.
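The remap_string and subset_string formats can be illustrated with a small standalone sketch. The helpers parse_remap_string and remap_and_subset are hypothetical names introduced here for illustration only; they are not part of Pegasus and do not reproduce its internal implementation. The sketch also assumes that names absent from the remap string (such as sample3) are left unchanged.

```python
# Illustrative sketch only: these helpers are hypothetical and NOT part of Pegasus.


def parse_remap_string(remap_string):
    """Turn "new:old1,old2;new2:old3" into an {old_name: new_name} dict."""
    old_to_new = {}
    for group in remap_string.split(";"):
        if not group:
            continue  # tolerate a trailing ';'
        new_name, old_names = group.split(":")
        for old_name in old_names.split(","):
            old_to_new[old_name] = new_name
    return old_to_new


def remap_and_subset(assignments, remap_string=None, subset_string=None):
    """Remap names first, then keep only the names listed in subset_string."""
    mapping = parse_remap_string(remap_string) if remap_string else {}
    # Assumption: names that are not mentioned in the remap string stay as they are.
    remapped = [mapping.get(name, name) for name in assignments]
    if subset_string is None:
        return remapped
    keep = set(subset_string.split(","))
    return [name for name in remapped if name in keep]


# The five hashed libraries from the example in the parameter description.
assignments = ["sample1_lib1", "sample1_lib2", "sample2_lib1", "sample2_lib2", "sample3"]
remap = "sample1:sample1_lib1,sample1_lib2;sample2:sample2_lib1,sample2_lib2"

remapped = remap_and_subset(assignments, remap_string=remap)
subset = remap_and_subset(assignments, remap_string=remap, subset_string="sample1,sample3")
print(remapped)
print(subset)
```

Because subsetting is applied to the remapped names, the subset string refers to the new names (sample1, sample3) rather than to the original library names.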
- Return type
None
- Returns
None
Update data.obs:
  - n_genes: Total number of genes detected in each cell.
  - n_counts: Total number of counts for each cell.
  - percent_mito: Percentage of counts from mitochondrial genes for each cell.
  - passed_qc: Boolean indicating whether a cell passes the QC process based on the QC metrics.
  - demux_type: This column might be deleted if select_singlets is on.
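As a rough illustration of how these per-cell columns relate to the filtering parameters, the following pure-Python sketch computes the metrics for a few toy cells and combines the thresholds into a passed_qc flag. It is an approximation of the documented rules, not the Pegasus implementation: the function cell_qc, the toy gene names and the counts are all hypothetical, and the sketch assumes that a threshold left as None is simply skipped.

```python
# Illustrative sketch only: approximates the documented QC rules in plain Python.
# cell_qc and the toy data below are hypothetical, not part of Pegasus.


def cell_qc(counts, gene_names, min_genes=None, max_genes=None,
            min_umis=None, max_umis=None, mito_prefix=None, percent_mito=None):
    """Compute QC metrics for one cell from its per-gene counts."""
    n_genes = sum(1 for c in counts if c > 0)  # genes with at least one count
    n_counts = sum(counts)                     # total counts (UMIs) in the cell
    mito_counts = 0
    if mito_prefix is not None:
        mito_counts = sum(c for c, g in zip(counts, gene_names)
                          if g.startswith(mito_prefix))
    pct_mito = 100.0 * mito_counts / n_counts if n_counts > 0 else 0.0

    # Assumption: a threshold of None means "do not filter on this metric".
    passed = True
    if min_genes is not None:
        passed = passed and n_genes >= min_genes      # at least min_genes
    if max_genes is not None:
        passed = passed and n_genes < max_genes       # fewer than max_genes
    if min_umis is not None:
        passed = passed and n_counts >= min_umis      # at least min_umis
    if max_umis is not None:
        passed = passed and n_counts < max_umis       # fewer than max_umis
    if percent_mito is not None:
        passed = passed and pct_mito < percent_mito   # less than percent_mito %

    return {"n_genes": n_genes, "n_counts": n_counts,
            "percent_mito": pct_mito, "passed_qc": passed}


genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH", "CD3E"]
cells = {
    "cell_ok": [1, 0, 10, 8, 1],
    "cell_high_mito": [6, 4, 5, 5, 0],
    "cell_few_genes": [0, 0, 3, 0, 0],
}
thresholds = dict(min_genes=2, max_genes=5, min_umis=5, max_umis=100,
                  mito_prefix="MT-", percent_mito=10)

results = {name: cell_qc(counts, genes, **thresholds) for name, counts in cells.items()}
for name, metrics in results.items():
    print(name, metrics)
```

In this toy data, only cell_ok satisfies every threshold; the other two cells fail on mitochondrial percentage and on gene count respectively.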
Examples
>>> pg.qc_metrics(data, min_genes=500, max_genes=6000, mito_prefix="MT-", percent_mito=10)