pegasus.read_input

pegasus.read_input(input_file, file_type=None, mode='r', genome=None, modality=None, black_list=None, select_data=None, select_genome=None, select_modality=None)

Load input data into memory. Inputs can be in ‘zarr’, ‘h5ad’, ‘loom’, ‘10x’, ‘mtx’, ‘csv’, ‘tsv’, ‘fcs’ (for flow/mass cytometry data) or ‘nanostring’ (Nanostring GeoMx spatial data) formats.

Parameters
  • input_file (str) – Input file name.

  • file_type (str, optional (default: None)) – File type, choosing from ‘zarr’, ‘h5ad’, ‘loom’, ‘10x’, ‘mtx’, ‘csv’, ‘tsv’, ‘fcs’ (for flow/mass cytometry data), ‘nanostring’ or ‘visium’. If None, inferred from input_file.

  • mode (str, optional (default: ‘r’)) – File open mode, options are ‘r’ or ‘a’. If mode == ‘a’, file_type must be zarr and ngene/select_singlets cannot be set.

  • genome (str, optional (default: None)) – For formats such as loom, mtx, dge, csv and tsv, genome provides the genome name. If genome is None for one of these formats, “unknown” is used as the genome name, except for the mtx format.

  • modality (str, optional (default: None)) – Default modality, choosing from ‘rna’, ‘atac’, ‘tcr’, ‘bcr’, ‘crispr’, ‘hashing’, ‘citeseq’, ‘cyto’ (flow cytometry / mass cytometry) or ‘nanostring’. If None, use ‘rna’ as default.

  • black_list (Set[str], optional (default: None)) – Attributes in black_list will be popped out.

  • select_data (Set[str], optional (default: None)) – Only select data with keys in select_data. The select_data, select_genome and select_modality parameters are mutually exclusive.

  • select_genome (Set[str], optional (default: None)) – Only select data with genomes in select_genome. The select_data, select_genome and select_modality parameters are mutually exclusive.

  • select_modality (Set[str], optional (default: None)) – Only select data with modalities in select_modality. The select_data, select_genome and select_modality parameters are mutually exclusive.
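
The constraints described in the parameter list (mode is ‘r’ or ‘a’, mode ‘a’ requires the zarr file type, and at most one of select_data, select_genome and select_modality may be given) can be sketched as a small pre-flight check. This is only an illustration: check_read_input_args is a hypothetical helper, not part of pegasus, and the way pegasus itself validates or reports these conditions may differ. The sketch also assumes that a file_type of None is acceptable with mode ‘a’, since the type would then be inferred from the file name.

```python
from typing import Optional, Set


def check_read_input_args(
    file_type: Optional[str] = None,
    mode: str = "r",
    select_data: Optional[Set[str]] = None,
    select_genome: Optional[Set[str]] = None,
    select_modality: Optional[Set[str]] = None,
) -> None:
    """Hypothetical pre-flight check for read_input arguments (not part of pegasus).

    Raises ValueError if the documented constraints are violated.
    """
    # The documented open modes are 'r' and 'a'.
    if mode not in ("r", "a"):
        raise ValueError(f"mode must be 'r' or 'a', got {mode!r}")

    # Assumption: only an explicitly passed file_type is checked here.
    # When file_type is None, read_input infers it from the file name.
    if mode == "a" and file_type is not None and file_type != "zarr":
        raise ValueError("mode='a' requires file_type='zarr'")

    # select_data, select_genome and select_modality are mutually exclusive.
    selectors = {
        "select_data": select_data,
        "select_genome": select_genome,
        "select_modality": select_modality,
    }
    chosen = [name for name, value in selectors.items() if value is not None]
    if len(chosen) > 1:
        raise ValueError(
            "mutually exclusive arguments were combined: " + ", ".join(chosen)
        )
```

Calling such a check before read_input would surface a conflicting combination, for example select_genome together with select_modality, before any file is opened.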

Return type

A MultimodalData object.

Examples

>>> import pegasus as pg
>>> data = pg.read_input('example_10x.h5')
>>> data = pg.read_input('example.h5ad')
>>> data = pg.read_input('example_ADT.csv', genome='hashing_HTO', modality='hashing')