pegasus.calc_kSIM

pegasus.calc_kSIM(data, attr, rep='pca', K=25, min_rate=0.9, n_jobs=-1, random_state=0, use_cache=True)

Calculate the kSIM metric of the data with respect to a specific sample attribute and embedding.

The kSIM metric, defined in [Li20], measures whether a sample attribute is not diffused too much in each cell’s local neighborhood.

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • attr (str) – The sample attribute to consider. Must exist in data.obs.

  • rep (str, optional, default: "pca") – The embedding representation to consider. The key 'X_' + rep must exist in data.obsm.

  • K (int, optional, default: 25) – The number of nearest neighbors to be considered.

  • min_rate (float, optional, default: 0.9) – Acceptance rate threshold. A cell is accepted if its kSIM rate is larger than or equal to min_rate.

  • n_jobs (int, optional, default: -1) – Number of threads used. If -1, use all physical CPU cores.

  • random_state (int, optional, default: 0) – Random seed set for reproducing results.

  • use_cache (bool, optional, default: True) – Whether to use cached kNN results.

Return type

Tuple[float, float]

Returns

  • kSIM_mean (float) – Mean kSIM rate over all the cells.

  • kSIM_accept_rate (float) – kSIM acceptance rate of the sample, i.e. the fraction of cells whose kSIM rate is larger than or equal to min_rate.

Examples

>>> import pegasus as pg
>>> kSIM_mean, kSIM_accept_rate = pg.calc_kSIM(data, attr='cell_type')
>>> kSIM_mean, kSIM_accept_rate = pg.calc_kSIM(data, attr='cell_type', rep='umap')
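
To make the two return values concrete, the following is a minimal NumPy sketch of what a kSIM-style computation might look like. It is not the Pegasus implementation: the function name ksim_sketch is hypothetical, the neighbor search is brute force rather than the cached kNN that calc_kSIM uses, and it assumes that a cell's kSIM rate is the fraction of its K nearest neighbors (excluding the cell itself) that share the cell's attribute value. The exact neighbor-counting convention in [Li20] and in Pegasus may differ.

```python
import numpy as np


def ksim_sketch(embedding, labels, K=25, min_rate=0.9):
    """Illustrative kSIM-style computation (hypothetical helper, not Pegasus code).

    embedding: (n_cells, n_dims) array, e.g. what data.obsm['X_pca'] would hold.
    labels:    length-n_cells sequence, e.g. what data.obs[attr] would hold.
    Assumes K < n_cells.
    """
    X = np.asarray(embedding, dtype=float)
    y = np.asarray(labels)

    # Brute-force pairwise Euclidean distances; only suitable for small inputs.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Assumption: a cell does not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Indices of the K nearest neighbors of each cell.
    neighbors = np.argsort(dist, axis=1)[:, :K]

    # Per-cell rate: fraction of neighbors with the same attribute value.
    rates = (y[neighbors] == y[:, None]).mean(axis=1)

    ksim_mean = float(rates.mean())
    # A cell is accepted if its rate is at least min_rate.
    ksim_accept_rate = float((rates >= min_rate).mean())
    return ksim_mean, ksim_accept_rate


# Toy usage: two well-separated groups of three cells each.
toy_embedding = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
toy_labels = ["a", "a", "a", "b", "b", "b"]
print(ksim_sketch(toy_embedding, toy_labels, K=2))
```

In this reading, a high kSIM_mean together with a high kSIM_accept_rate indicates that cells sharing an attribute value stay close to one another in the chosen embedding.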