pegasus.dendrogram
- pegasus.dendrogram(data, groupby, rep='pca', genes=None, correlation_method='pearson', n_clusters=None, affinity='euclidean', linkage='complete', compute_full_tree='auto', distance_threshold=0, panel_size=(6, 6), orientation='top', color_threshold=None, return_fig=False, dpi=300.0, **kwargs)
Generate a dendrogram from the result of hierarchical clustering.
The metrics used here are consistent with SCANPY’s dendrogram implementation.
Hierarchical clustering uses the scikit-learn Agglomerative Clustering implementation.
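The sketch below shows one common way to connect scikit-learn's agglomerative clustering to SciPy's dendrogram routine. It is not taken from the pegasus source: the helper `linkage_matrix_from_model` and the synthetic `group_profiles` array are assumptions made for illustration, and pegasus may build its tree differently.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram
from sklearn.cluster import AgglomerativeClustering


def linkage_matrix_from_model(model):
    """Hypothetical helper: convert a fitted model to a SciPy linkage matrix."""
    n_samples = len(model.labels_)
    counts = np.zeros(model.children_.shape[0])
    for i, merge in enumerate(model.children_):
        # A child index below n_samples is an original sample (a leaf);
        # otherwise it refers to a cluster created by an earlier merge.
        counts[i] = sum(
            1 if child < n_samples else counts[child - n_samples]
            for child in merge
        )
    return np.column_stack([model.children_, model.distances_, counts]).astype(float)


# Synthetic stand-in for one summary vector per category (6 groups, 4 features).
rng = np.random.default_rng(0)
group_profiles = rng.normal(size=(6, 4))

# distance_threshold=0 with n_clusters=None forces the full tree to be built,
# which also makes the merge distances available as model.distances_.
model = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0, linkage="complete"
).fit(group_profiles)

Z = linkage_matrix_from_model(model)
group_names = [f"group_{i}" for i in range(6)]
# no_plot=True returns the layout only, so no matplotlib figure is created here.
tree = dendrogram(Z, no_plot=True, labels=group_names)
```

The default `distance_threshold=0` in the signature fits this pattern: a full tree is needed before a complete dendrogram can be drawn.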
- Parameters
data (MultimodalData, UnimodalData, or AnnData object) – Single cell expression data.
genes (List[str], optional, default: None) – List of genes to use. Gene names must exist in data.var. If set, use the counts in data.X for plotting; if None, use the embedding specified in rep for plotting.
rep (str, optional, default: pca) – Cell embedding to use. It only takes effect when genes is None, and its key "X_"+rep must exist in data.obsm. By default, use PCA coordinates.
groupby (str) – Categorical cell attribute to plot, which must exist in data.obs.
correlation_method (str, optional, default: pearson) – Method of correlation between categories specified in data.obs. Available options are: pearson, kendall, spearman. See pandas corr documentation for details.
n_clusters (int, optional, default: None) – The number of clusters to find, used by hierarchical clustering. It must be None if distance_threshold is not None.
affinity (str, optional, default: euclidean) – Metric used to compute the linkage, used by hierarchical clustering. Valid values for metric are:
- From scikit-learn: cityblock, cosine, euclidean, l1, l2, manhattan.
- From scipy.spatial.distance: braycurtis, canberra, chebyshev, correlation, dice, hamming, jaccard, kulsinski, mahalanobis, minkowski, rogerstanimoto, russellrao, seuclidean, sokalmichener, sokalsneath, sqeuclidean, yule.
The default is the Euclidean distance, as given in the signature above. See scikit-learn distance documentation for details.
linkage (str, optional, default: complete) – Which linkage criterion to use, used by hierarchical clustering. Available options are:
- ward minimizes the variance of the clusters being merged.
- average uses the average of the distances of each observation of the two sets.
- complete uses the maximum distance between all observations of the two sets. (Default)
- single uses the minimum of the distances between all observations of the two sets.
See scikit-learn documentation for details.
compute_full_tree (str or bool, optional, default: auto) – Whether to build the full tree instead of stopping its construction early at n_clusters, used by hierarchical clustering. It must be True if distance_threshold is not None. By default, this option is auto, which is equivalent to True when distance_threshold is not None, or when n_clusters is less than max(100, 0.02 * n_groups), where n_groups is the number of categories in data.obs[groupby].
distance_threshold (float, optional, default: 0) – The linkage distance threshold above which clusters will not be merged. If not None, n_clusters must be None and compute_full_tree must be True.
panel_size (Tuple[float, float], optional, default: (6, 6)) – The size (width, height) of the figure in inches.
orientation (str, optional, default: top) – The direction to plot the dendrogram. Available options are: top, bottom, left, right. See scipy dendrogram documentation for explanation.
color_threshold (float, optional, default: None) – Threshold for coloring clusters. See scipy dendrogram documentation for explanation.
return_fig (bool, optional, default: False) – Return a Figure object if True; return None otherwise.
dpi (float, optional, default: 300.0) – The resolution in dots per inch.
**kwargs – Passed to scipy.cluster.hierarchy.dendrogram.
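The constraints linking n_clusters, distance_threshold, and compute_full_tree can be summarised as a small check. This is only a sketch of the rules as documented here; `check_tree_arguments` is a hypothetical helper, not part of pegasus or scikit-learn.

```python
def check_tree_arguments(n_clusters=None, distance_threshold=0, compute_full_tree="auto"):
    """Hypothetical helper: raise ValueError if the documented constraints are violated."""
    if distance_threshold is not None:
        if n_clusters is not None:
            raise ValueError("n_clusters must be None when distance_threshold is set")
        # 'auto' resolves to True whenever distance_threshold is set,
        # so only an explicit False conflicts.
        if compute_full_tree is False:
            raise ValueError("compute_full_tree must be True when distance_threshold is set")
    return True


# The defaults from the signature satisfy the constraints.
defaults_ok = check_tree_arguments()

# Asking for a fixed number of clusters while distance_threshold is still 0 fails.
try:
    check_tree_arguments(n_clusters=3)
    message = ""
except ValueError as err:
    message = str(err)

# Passing distance_threshold=None together with n_clusters is accepted.
fixed_ok = check_tree_arguments(n_clusters=3, distance_threshold=None)
```

In practice this means that a call requesting a specific n_clusters should also pass distance_threshold=None, because the default threshold is 0 rather than None.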
- Returns
A matplotlib.figure.Figure object containing the dendrogram if return_fig == True
- Return type
Figure object
Examples
>>> import pegasus as pg
>>> pg.dendrogram(data, genes=data.var_names, groupby='louvain_labels')
>>> pg.dendrogram(data, rep='pca', groupby='louvain_labels')
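To make the role of correlation_method more concrete, the sketch below computes a category-by-category correlation matrix with pandas. It assumes that "correlation between categories" means correlating per-category mean profiles. The names `embedding`, `labels`, `group_means`, and `group_corr` are illustrative, and the data is synthetic rather than real `data.obsm` or `data.obs` content.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Synthetic stand-ins: a 12-cell, 5-dimensional embedding and one label per cell.
embedding = pd.DataFrame(rng.normal(size=(12, 5)))
labels = pd.Series(["A", "B", "C"] * 4, name="cluster")

# One mean profile per category: rows are categories, columns are dimensions.
group_means = embedding.groupby(labels.values).mean()

# DataFrame.corr correlates columns, so transpose to put categories in columns.
# method may be 'pearson', 'kendall', or 'spearman'.
group_corr = group_means.T.corr(method="pearson")
```

The resulting square matrix has one row and one column per category, which is the kind of input a hierarchical clustering step can then operate on.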