pegasus.split_one_cluster

pegasus.split_one_cluster(data, clust_label, clust_id, n_clust, res_label, rep='pca', n_comps=None, random_state=0)[source]

Use the Leiden algorithm to split cluster ‘clust_id’ in ‘clust_label’ into ‘n_clust’ sub-clusters and write the new clustering results to ‘res_label’. The sub-cluster names are the concatenation of the original cluster name and the sub-cluster ID (e.g. ‘T’ -> ‘T-1’, ‘T-2’).

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • clust_label (str) – Use existing clustering stored in data.obs[‘clust_label’].

  • clust_id (str) – Cluster ID in data.obs[‘clust_label’].

  • n_clust (int) – Split ‘clust_id’ into ‘n_clust’ sub-clusters.

  • res_label (str) – Write the new clustering to data.obs[‘res_label’]. The sub-cluster names are the concatenation of the original cluster name and the sub-cluster ID (e.g. ‘T’ -> ‘T-1’, ‘T-2’).

  • rep (str, optional, default: "pca") – The embedding representation used for clustering. The key 'X_' + rep must exist in data.obsm. By default, use PCA coordinates.

  • n_comps (int, optional (default: None)) – Number of components of rep to use. If n_comps is None, use all components; otherwise, use the minimum of n_comps and the number of dimensions in rep.

  • n_jobs (int, optional (default: -1)) – Number of threads to use for the KMeans step in ‘spectral_louvain’ and ‘spectral_leiden’. -1 refers to using all physical CPU cores.

  • random_state (int, optional, default: 0) – Random seed for reproducing results.
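
The n_comps rule described above can be illustrated with a minimal sketch. This is not Pegasus's own code: the helper name select_components is hypothetical, and taking the leading columns of the embedding is an assumption made for illustration.

```python
import numpy as np


def select_components(X, n_comps=None):
    """Hypothetical helper: keep min(n_comps, X.shape[1]) columns of X.

    Assumption: the leading columns are the ones kept. If n_comps is None,
    the full embedding is returned unchanged.
    """
    if n_comps is None:
        return X
    return X[:, : min(n_comps, X.shape[1])]


# Toy stand-in for data.obsm['X_pca']: 5 cells, 50 components.
X_pca = np.random.default_rng(0).normal(size=(5, 50))

all_comps = select_components(X_pca)         # n_comps=None keeps all 50
first_20 = select_components(X_pca, 20)      # fewer than available: keeps 20
capped = select_components(X_pca, 100)       # more than available: capped at 50
print(all_comps.shape, first_20.shape, capped.shape)
```

Requesting more components than the representation has is therefore harmless under this rule: the request is capped at the available dimensions.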

Return type

None

Returns

  • None

  • Update data.obs

    • data.obs[res_label]: New cluster labels of cells as categorical data.

Examples

>>> import pegasus as pg
>>> pg.split_one_cluster(data, 'leiden_labels', '15', 2, 'leiden_labels_split')
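
The naming convention for the new labels (‘T’ -> ‘T-1’, ‘T-2’) can be sketched with plain pandas. This is only an illustration of the relabeling, not the library's implementation: the helper relabel_split is hypothetical, and the sub-cluster IDs are supplied by hand rather than computed by Leiden.

```python
import pandas as pd


def relabel_split(labels, clust_id, sub_ids):
    """Hypothetical helper: rename cells of clust_id to '<clust_id>-<sub_id>'.

    labels  : categorical Series of existing cluster labels
    sub_ids : one 1-based sub-cluster ID per cell in clust_id, in cell order
    Cells outside clust_id keep their original label.
    """
    new = labels.astype(str).copy()
    mask = new == clust_id
    new.loc[mask] = [f"{clust_id}-{i}" for i in sub_ids]
    return pd.Series(pd.Categorical(new), index=labels.index)


# Toy labels: three 'T' cells and two 'B' cells.
labels = pd.Series(pd.Categorical(["T", "B", "T", "T", "B"]))
split = relabel_split(labels, "T", [1, 2, 1])
print(list(split))
```

Only the cells of the chosen cluster are renamed, so the result can be stored under a new key without disturbing the original clustering.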