pegasus.dendrogram
- pegasus.dendrogram(data, groupby, rep='pca', genes=None, correlation_method='pearson', n_clusters=None, affinity='euclidean', linkage='complete', compute_full_tree='auto', distance_threshold=0, panel_size=(6, 6), orientation='top', color_threshold=None, return_fig=False, dpi=300.0, **kwargs)
Generate a dendrogram from a hierarchical clustering result.
The metrics used here are consistent with SCANPY’s dendrogram implementation.
The scikit-learn Agglomerative Clustering implementation is used for hierarchical clustering.
- Parameters
data (MultimodalData, UnimodalData, or AnnData object) – Single cell expression data.
genes (List[str], optional, default: None) – List of genes to use. Gene names must exist in data.var. If set, use the counts in data.X for plotting; if None, use the embedding specified in rep for plotting.
rep (str, optional, default: pca) – Cell embedding to use. It only takes effect when genes is None, and its key "X_"+rep must exist in data.obsm. By default, use PCA coordinates.
groupby (str) – Categorical cell attribute to plot, which must exist in data.obs.
correlation_method (str, optional, default: pearson) – Method of correlation between the categories specified in data.obs. Available options are: pearson, kendall, spearman. See pandas corr documentation for details.
n_clusters (int, optional, default: None) – The number of clusters to find, used by hierarchical clustering. It must be None if distance_threshold is not None.
affinity (str, optional, default: euclidean) – Metric used to compute the linkage, used by hierarchical clustering. Valid values are:
  - From scikit-learn: cityblock, cosine, euclidean, l1, l2, manhattan.
  - From scipy.spatial.distance: braycurtis, canberra, chebyshev, correlation, dice, hamming, jaccard, kulsinski, mahalanobis, minkowski, rogerstanimoto, russellrao, seuclidean, sokalmichener, sokalsneath, sqeuclidean, yule.
  The default is the Euclidean distance. See scikit-learn distance documentation for details.
linkage (str, optional, default: complete) – Which linkage criterion to use, used by hierarchical clustering. Available options are:
  - ward minimizes the variance of the clusters being merged.
  - average uses the average of the distances of each observation of the two sets.
  - complete uses the maximum of the distances between all observations of the two sets. (Default)
  - single uses the minimum of the distances between all observations of the two sets.
  See scikit-learn documentation for details.
compute_full_tree (str or bool, optional, default: auto) – Controls whether construction of the tree stops early at n_clusters, used by hierarchical clustering. It must be True if distance_threshold is not None. By default, this option is auto, which is True if and only if distance_threshold is not None, or n_clusters is less than min(100, 0.02 * n_groups), where n_groups is the number of categories in data.obs[groupby].
distance_threshold (float, optional, default: 0) – The linkage distance threshold above which clusters will not be merged. If not None, n_clusters must be None and compute_full_tree must be True.
panel_size (Tuple[float, float], optional, default: (6, 6)) – The size (width, height) of the figure in inches.
orientation (str, optional, default: top) – The direction to plot the dendrogram. Available options are: top, bottom, left, right. See scipy dendrogram documentation for explanation.
color_threshold (float, optional, default: None) – Threshold for coloring clusters. See scipy dendrogram documentation for explanation.
return_fig (bool, optional, default: False) – Return a Figure object if True; return None otherwise.
dpi (float, optional, default: 300.0) – The resolution in dots per inch.
**kwargs – Passed to scipy.cluster.hierarchy.dendrogram.
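The rules linking n_clusters, compute_full_tree, and distance_threshold can be easier to follow as code. The sketch below is only an illustration of the constraints as documented in the parameter list; check_clustering_args is a hypothetical helper and not part of the pegasus API, and the actual validation is presumably performed by scikit-learn.

```python
from typing import Optional, Union


def check_clustering_args(
    n_clusters: Optional[int] = None,
    compute_full_tree: Union[str, bool] = "auto",
    distance_threshold: Optional[float] = 0,
) -> None:
    """Hypothetical helper: raise ValueError if the documented rules are violated."""
    if distance_threshold is not None:
        # A distance threshold and a fixed cluster count are mutually exclusive.
        if n_clusters is not None:
            raise ValueError("n_clusters must be None when distance_threshold is set")
        # 'auto' resolves to True whenever distance_threshold is set,
        # so only an explicit False conflicts with the documented rule.
        if compute_full_tree is False:
            raise ValueError("compute_full_tree must be True when distance_threshold is set")


# The documented defaults (n_clusters=None, distance_threshold=0) pass the check.
check_clustering_args()

# Asking for a fixed number of clusters requires clearing the threshold.
check_clustering_args(n_clusters=3, distance_threshold=None)
```

Under this reading, a call that sets n_clusters while leaving distance_threshold at its default of 0 would be rejected, so distance_threshold=None needs to be passed alongside n_clusters.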
- Returns
A matplotlib.figure.Figure object containing the dendrogram if return_fig == True; None otherwise.
- Return type
Figure object
Examples
>>> pg.dendrogram(data, genes=data.var_names, groupby='louvain_labels')
>>> pg.dendrogram(data, rep='pca', groupby='louvain_labels')
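
To make the description more concrete, here is a rough sketch of the kind of computation it suggests: average the embedding per category, correlate the category profiles, cluster hierarchically, and build a dendrogram. This is not the pegasus implementation. It runs on toy data, swaps in scipy.cluster.hierarchy.linkage in place of scikit-learn's Agglomerative Clustering, and assumes that clustering is applied to the rows of the category-by-category correlation matrix. The variable names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage

rng = np.random.default_rng(0)

# Toy stand-in for an embedding such as data.obsm["X_pca"]: 60 cells x 5 components.
embedding = rng.normal(size=(60, 5))
# Toy stand-in for data.obs[groupby]: one of four category labels per cell.
labels = np.repeat(["A", "B", "C", "D"], 15)
groups = sorted(set(labels.tolist()))

# One mean profile per category (rows: categories, columns: components).
group_means = np.vstack([embedding[labels == g].mean(axis=0) for g in groups])

# Pearson correlation between category profiles (4 x 4 matrix).
corr = np.corrcoef(group_means)

# Complete linkage with Euclidean distance on the rows of the correlation matrix
# (assumption: this mirrors linkage='complete' and affinity='euclidean').
Z = linkage(corr, method="complete", metric="euclidean")

# no_plot=True returns the tree layout without drawing anything.
tree = dendrogram(Z, labels=groups, orientation="top", no_plot=True)
leaf_order = tree["ivl"]
print(leaf_order)
```

The leaf order gives the left-to-right arrangement of categories that a plotted dendrogram with orientation='top' would show.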