pegasus.qc_metrics
- pegasus.qc_metrics(data, select_singlets=False, remap_string=None, subset_string=None, min_genes=None, max_genes=None, min_umis=None, max_umis=None, mito_prefix=None, percent_mito=None)
Generate Quality Control (QC) metrics regarding cell barcodes on the dataset.
- Parameters
data (pegasusio.MultimodalData) – Use the currently selected modality in data, which should contain one RNA expression matrix.
select_singlets (bool, optional, default: False) – If True, select only singlets.
remap_string (str, optional, default: None) – Remap singlet names using <remap_string>, which takes the format “new_name_i:old_name_1,old_name_2;new_name_ii:old_name_3;…”. For example, if we hashed 5 libraries from 3 samples (sample1_lib1, sample1_lib2, sample2_lib1, sample2_lib2 and sample3), we can remap them to 3 samples using this string: “sample1:sample1_lib1,sample1_lib2;sample2:sample2_lib1,sample2_lib2”. The new singlet names are then stored in the metadata field with key ‘assignment’, while the old names are kept in the metadata field with key ‘assignment.orig’.
subset_string (str, optional, default: None) – If selecting singlets, only select the singlets listed in <subset_string>, which takes the format “name1,name2,…”. Note that if remap_string is specified, subsetting happens after remapping. For example, we can select only singlets from samples 1 and 3 using “sample1,sample3”.
min_genes (int, optional, default: None) – Only keep cells with at least min_genes genes.
max_genes (int, optional, default: None) – Only keep cells with fewer than max_genes genes.
min_umis (int, optional, default: None) – Only keep cells with at least min_umis UMIs.
max_umis (int, optional, default: None) – Only keep cells with fewer than max_umis UMIs.
mito_prefix (str, optional, default: None) – Prefix for mitochondrial genes.
percent_mito (float, optional, default: None) – Only keep cells whose mitochondrial genes account for less than percent_mito% of total counts.
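The remap_string and subset_string formats described above can be illustrated with a small standalone sketch. This is not Pegasus's implementation: parse_remap_string and remap_and_subset are hypothetical helpers written only to show how the documented string formats map old singlet names to new ones and how subsetting is applied after remapping.

```python
def parse_remap_string(remap_string):
    """Hypothetical helper: build an old_name -> new_name mapping.

    Expects the documented format
    "new_name_i:old_name_1,old_name_2;new_name_ii:old_name_3".
    """
    mapping = {}
    for group in remap_string.split(";"):
        group = group.strip()
        if not group:
            continue  # tolerate a trailing ";"
        new_name, old_names = group.split(":", 1)
        for old_name in old_names.split(","):
            mapping[old_name.strip()] = new_name.strip()
    return mapping


def remap_and_subset(assignments, remap_string=None, subset_string=None):
    """Hypothetical helper: remap names first, then flag which ones to keep."""
    mapping = parse_remap_string(remap_string) if remap_string else {}
    # Names that are not mentioned in the remap string keep their old name.
    remapped = [mapping.get(name, name) for name in assignments]
    if subset_string is None:
        keep = [True] * len(remapped)
    else:
        wanted = {name.strip() for name in subset_string.split(",")}
        keep = [name in wanted for name in remapped]
    return remapped, keep


original = ["sample1_lib1", "sample1_lib2", "sample2_lib1", "sample2_lib2", "sample3"]
remapped, keep = remap_and_subset(
    original,
    remap_string="sample1:sample1_lib1,sample1_lib2;sample2:sample2_lib1,sample2_lib2",
    subset_string="sample1,sample3",
)
print(remapped)  # ['sample1', 'sample1', 'sample2', 'sample2', 'sample3']
print(keep)      # [True, True, False, False, True]
```

Because subsetting runs on the remapped names, “sample1,sample3” selects both sample1 libraries even though neither original library name appears in the subset string.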
- Return type
None
- Returns
None
Update data.obs:
n_genes: Total number of genes for each cell.
n_counts: Total number of counts for each cell.
percent_mito: Percent of mitochondrial genes for each cell.
passed_qc: Boolean type indicating if a cell passes the QC process based on the QC metrics.
demux_type: This column might be deleted if select_singlets is on.
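To make the relationship between the thresholds and the passed_qc flag concrete, here is a minimal pure-Python sketch on a toy count matrix. It is an illustration under stated assumptions, not Pegasus's code: toy_qc is a hypothetical function, n_genes is assumed to mean the number of genes with at least one count, and percent_mito is assumed to be 100 × (mitochondrial counts) / (total counts). The inclusive lower bounds and exclusive upper bounds follow the parameter descriptions above.

```python
def toy_qc(counts, gene_names, min_genes=None, max_genes=None,
           min_umis=None, max_umis=None, mito_prefix=None, percent_mito=None):
    """Hypothetical re-creation of the documented QC columns for a
    cells x genes list-of-lists count matrix."""
    # Flag mitochondrial genes by name prefix (none if no prefix is given).
    is_mito = [mito_prefix is not None and name.startswith(mito_prefix)
               for name in gene_names]
    results = []
    for cell in counts:
        n_genes = sum(1 for c in cell if c > 0)  # assumption: detected genes
        n_counts = sum(cell)
        mito_counts = sum(c for c, m in zip(cell, is_mito) if m)
        pct = 100.0 * mito_counts / n_counts if n_counts > 0 else 0.0

        passed = True
        if min_genes is not None and n_genes < min_genes:
            passed = False  # "at least min_genes"
        if max_genes is not None and n_genes >= max_genes:
            passed = False  # "fewer than max_genes"
        if min_umis is not None and n_counts < min_umis:
            passed = False  # "at least min_umis"
        if max_umis is not None and n_counts >= max_umis:
            passed = False  # "fewer than max_umis"
        if percent_mito is not None and pct >= percent_mito:
            passed = False  # "less than percent_mito%"

        results.append({"n_genes": n_genes, "n_counts": n_counts,
                        "percent_mito": pct, "passed_qc": passed})
    return results


gene_names = ["MT-CO1", "GAPDH", "ACTB", "CD3E"]
counts = [
    [1, 5, 4, 0],  # 10% mitochondrial counts, 3 detected genes
    [6, 2, 2, 0],  # 60% mitochondrial counts
    [0, 1, 0, 0],  # only 1 detected gene
]
qc = toy_qc(counts, gene_names, min_genes=2, mito_prefix="MT-", percent_mito=20)
passed = [row["passed_qc"] for row in qc]
print(passed)  # [True, False, False]
```

In this sketch, thresholds left as None are simply skipped, which mirrors the idea that an unset parameter imposes no filter.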
Examples
>>> import pegasus as pg
>>> pg.qc_metrics(data, min_genes=500, max_genes=6000, mito_prefix="MT-", percent_mito=10)