API

Pegasus can also be used as a Python package. Import it with:

import pegasus as pg
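
As a rough illustration of how the functions in this reference might be chained, the sketch below runs a minimal clustering workflow. It uses only function names and required arguments listed on this page. It assumes that `read_input` returns the data object, that the other calls modify that object in place, and that Louvain labels are stored under the key `louvain_labels`; the helper name `basic_workflow` is hypothetical.

```python
def basic_workflow(input_file, consider_batch=False):
    """Run a minimal clustering workflow and return the data object.

    A sketch only: it relies on default arguments everywhere and assumes
    each step after read_input updates `data` in place.
    """
    # Imported inside the function so that defining it does not
    # require Pegasus to be installed.
    import pegasus as pg

    data = pg.read_input(input_file)   # load data into memory
    pg.qc_metrics(data)                # generate QC metrics
    pg.filter_data(data)               # filter based on those metrics
    pg.log_norm(data)                  # normalize and log-transform
    pg.highly_variable_features(data, consider_batch)
    pg.pca(data)
    pg.neighbors(data)                 # kNN graph used by clustering
    pg.louvain(data)
    pg.umap(data)
    # ASSUMPTION: Louvain labels are stored under "louvain_labels".
    pg.de_analysis(data, "louvain_labels")
    return data
```

Each step accepts further optional arguments (see the signatures in this reference), so treat the defaults used here as a starting point rather than a recommended configuration.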

Analysis Tools

`read_input`(input_file[, genome, …])	Load data into memory.
`write_output`(data, output_file[, whitelist])	Write data back to disk.
`aggregate_matrices`(csv_file[, …])	Aggregate channel-specific count matrices into one big count matrix.

`qc_metrics`(data[, mito_prefix, min_genes, …])	Generate Quality Control (QC) metrics on the dataset.
`get_filter_stats`(data)	Calculate filtration stats on cell barcodes and genes, respectively.
`filter_data`(data)	Filter data based on the QC metrics calculated by `pg.qc_metrics`.
`log_norm`(data[, norm_count])	Normalize the data, then apply the natural logarithm.
`highly_variable_features`(data, consider_batch)	Highly variable features (HVF) selection.
`select_features`(data[, features])	Subset the features and store the resulting matrix in dense format in data.uns with ‘fmat_’ prefix.
`pca`(data[, n_components, features, …])	Perform Principal Component Analysis (PCA) on the data.
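
To make the "normalize, then take the natural logarithm" step concrete, here is a minimal pure-Python sketch. It is not Pegasus's implementation: it assumes that each cell is scaled so its counts sum to `norm_count` and that the transform is `log(1 + x)`. The function name `log_norm_sketch` and the default value of `norm_count` are hypothetical.

```python
import math

def log_norm_sketch(counts, norm_count=1e5):
    """Scale each cell to `norm_count` total counts, then apply log(1 + x).

    `counts` is a list of cells, each a list of raw gene counts.
    ASSUMPTION: per-cell scaling followed by log1p; cells with zero
    total counts are left as all zeros.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        scale = norm_count / total if total > 0 else 0.0
        normalized.append([math.log1p(c * scale) for c in cell])
    return normalized
```

Using `log1p` keeps zero counts at zero, which preserves sparsity in the normalized matrix.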

`set_group_attribute`(data, attribute_string)	Set group attributes used in batch correction.
`correct_batch`(data[, features])	Batch correction on data.
`run_harmony`(data[, rep, n_jobs, n_clusters, …])	Batch-correct PCs using Harmony.

`neighbors`(data[, K, rep, n_jobs, …])	Compute k nearest neighbors and affinity matrix, which will be used for diffmap and graph-based community detection algorithms.
`calc_kBET`(data, attr[, rep, K, alpha, …])	Calculate the kBET metric of the data with respect to the given attribute (`attr`).
`calc_kSIM`(data, attr[, rep, K, min_rate, …])	Calculate the kSIM metric of the data with respect to the given attribute (`attr`).
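
The idea of a k-nearest-neighbor search can be illustrated with a brute-force sketch over a small set of points. This shows the concept only: it is not the algorithm Pegasus uses, it excludes each point from its own neighbor list (an assumption), and the name `knn_brute_force` is hypothetical.

```python
import math

def knn_brute_force(points, k):
    """Return, for each point, the indices of its k nearest other points.

    Uses Euclidean distance and compares every pair of points, so the
    cost grows quadratically with the number of points.
    """
    result = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        result.append([j for _, j in dists[:k]])
    return result
```

For datasets with many cells, an approximate nearest-neighbor index is the usual choice because the quadratic cost of this approach becomes prohibitive.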

`diffmap`(data[, n_components, rep, solver, …])	Calculate Diffusion Map.
`reduce_diffmap_to_3d`(data[, random_state])	Reduce the high-dimensional Diffusion Map matrix to 3 dimensions.
`calc_pseudotime`(data, roots)	Calculate Pseudotime based on Diffusion Map.
`infer_path`(data, cluster, clust_id, path_name)	Inference on path of a cluster.

`cluster`(data[, algo, rep, resolution, …])	Cluster the data using the chosen algorithm.
`louvain`(data[, rep, resolution, …])	Cluster the cells using the Louvain algorithm.
`leiden`(data[, rep, resolution, n_iter, …])	Cluster the data using the Leiden algorithm.
`spectral_louvain`(data[, rep, resolution, …])	Cluster the data using the Spectral Louvain algorithm.
`spectral_leiden`(data[, rep, resolution, …])	Cluster the data using the Spectral Leiden algorithm.

`tsne`(data[, rep, n_jobs, n_components, …])	Calculate tSNE embedding using the MulticoreTSNE package.
`fitsne`(data[, rep, n_jobs, n_components, …])	Calculate FIt-SNE embedding using the fitsne package.
`umap`(data[, rep, n_components, n_neighbors, …])	Calculate UMAP embedding using the umap-learn package.
`fle`(data[, file_name, n_jobs, rep, K, …])	Construct the Force-directed (FLE) graph using the ForceAtlas2 implementation, with forceatlas2-python as the Python wrapper.
`net_tsne`(data[, rep, n_jobs, n_components, …])	Calculate an approximate tSNE embedding, using a deep learning model to improve speed.
`net_fitsne`(data[, rep, n_jobs, …])	Calculate an approximate FIt-SNE embedding, using a deep learning model to improve speed.
`net_umap`(data[, rep, n_jobs, n_components, …])	Calculate an approximate UMAP embedding, using a deep learning model to improve speed.
`net_fle`(data[, file_name, n_jobs, rep, K, …])	Construct an approximate Force-directed (FLE) graph, using a deep learning model to improve speed.

`de_analysis`(data, cluster[, condition, …])	Perform Differential Expression (DE) Analysis on data.
`markers`(data[, head, de_key, sort_by, alpha])	Write results into Excel workbook.
`find_markers`(data, label_attr[, de_key, …])	Find markers using gradient boosting method.

`infer_cell_types`(data, markers, de_test[, …])	Infer putative cell types for each cluster using legacy markers.
`annotate`(data, name, based_on, anno_dict)	Add annotation to the AnnData object.

`embedding`(adata, basis[, keys, cmap, …])	Generate an embedding plot.
`composition_plot`(adata, by, condition[, …])	Generate a composition plot, which shows the percentage of observations from every condition within each cluster (by).
`variable_feature_plot`(adata, **kwds)	Generate a variable feature plot.
`heatmap`(adata, keys, by[, reduce_function, …])	Generate a heatmap.
`dotplot`(adata, keys, by[, reduce_function, …])	Generate a dot plot.
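
The quantity shown by a composition plot, the percentage of observations from each condition within each cluster, can be computed as in the sketch below. This illustrates the arithmetic only and does not use Pegasus; the function name `composition_percentages` and its list-based inputs are hypothetical.

```python
from collections import Counter, defaultdict

def composition_percentages(clusters, conditions):
    """Return {cluster: {condition: percent}} for paired label lists.

    `clusters[i]` and `conditions[i]` are the cluster label and the
    condition label of observation i. Percentages sum to 100 per cluster.
    """
    counts = defaultdict(Counter)
    for cluster, condition in zip(clusters, conditions):
        counts[cluster][condition] += 1
    return {
        cluster: {cond: 100.0 * n / sum(c.values()) for cond, n in c.items()}
        for cluster, c in counts.items()
    }
```

For example, a cluster holding two observations from condition `a` and one from condition `b` yields roughly 66.7% and 33.3%.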

`violin`(adata, keys[, by, width, cmap, cols, …])	Generate a violin plot.
`scatter`(adata, x, y[, color, size, dot_min, …])	Generate a scatter plot.
`scatter_matrix`(adata, keys[, color, use_raw])	Generate a scatter plot matrix.

`estimate_background_probs`(adt[, random_state])	For cell-hashing data, estimate antibody background probability using the EM algorithm.
`demultiplex`(data, adt[, min_signal, alpha, …])	Demultiplex cell-hashing data, using the estimated antibody background probability calculated in `pg.estimate_background_probs`.

`calc_signature_score`(data, signatures[, n_bins])	Calculate signature / gene module score.
`search_genes`(data, gene_list[, rec_key, measure])	Extract and display gene expression for each cluster from an AnnData object.
`search_de_genes`(data, gene_list[, rec_key, …])	Extract and display differential expression analysis results of markers for each cluster.