pegasus.calc_kBET

pegasus.calc_kBET(data, attr, rep='pca', K=25, alpha=0.05, n_jobs=-1, random_state=0, temp_folder=None, use_cache=True, full_speed=False)

Calculate the kBET metric of the data regarding a specific sample attribute and embedding.

The kBET metric, defined in [Büttner18], measures whether cells from different samples mix well in their local neighborhoods.

Parameters

data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
attr (str) – The sample attribute to consider. Must exist in data.obs.
rep (str, optional, default: "pca") – The embedding representation to be used. The key 'X_' + rep must exist in data.obsm. By default, use PCA coordinates.
K (int, optional, default: 25) – Number of nearest neighbors, using L2 metric.
alpha (float, optional, default: 0.05) – Acceptance rate threshold. A cell is accepted if its kBET p-value is greater than or equal to alpha.
n_jobs (int, optional, default: -1) – Number of threads used. If -1, use all physical CPU cores.
random_state (int, optional, default: 0) – Random seed set for reproducing results.
temp_folder (str, optional, default: None) – Temporary folder for joblib execution.
use_cache (bool, optional, default: True) – If True, use cached kNN results.
full_speed (bool, optional, default: False) – If True, use multiple threads to construct the hnsw index; the kNN results are then not reproducible. If False, use only one thread so that results are reproducible.

Return type

Tuple[float, float, float]

Returns

stat_mean (float) – Mean kBET chi-square statistic over all cells.
pvalue_mean (float) – Mean kBET p-value over all cells.
accept_rate (float) – kBET acceptance rate of the sample, i.e. the fraction of cells whose kBET p-value is greater than or equal to alpha.

Examples

>>> pg.calc_kBET(data, attr = 'Channel')

>>> pg.calc_kBET(data, attr = 'Channel', rep = 'umap')
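
For intuition, the per-cell test behind the three return values can be sketched as follows. This is a minimal illustration using NumPy and SciPy, not the pegasus implementation: the function name kbet_sketch, the brute-force neighbor search, and the choice to count each cell as its own neighbor are assumptions made for the example. Each cell's K-nearest-neighbor batch counts are compared with the counts expected from the global batch frequencies using a Pearson chi-square test, and a cell is accepted when its p-value is at least alpha.

```python
import numpy as np
from scipy.stats import chi2


def kbet_sketch(X, batches, K=25, alpha=0.05):
    """Illustrative kBET-style summary (hypothetical helper, not part of pegasus)."""
    X = np.asarray(X, dtype=float)
    labels, codes = np.unique(batches, return_inverse=True)
    n_cells, n_batches = X.shape[0], labels.size

    # Expected batch counts in a neighborhood of size K under perfect mixing.
    global_freq = np.bincount(codes, minlength=n_batches) / n_cells
    expected = K * global_freq

    # Brute-force L2 kNN; each cell is its own nearest neighbor (an assumption).
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    knn = np.argsort(sq_dist, axis=1)[:, :K]

    # Observed batch counts per neighborhood, then one chi-square statistic per cell.
    observed = np.stack([np.bincount(codes[row], minlength=n_batches) for row in knn])
    stats = ((observed - expected) ** 2 / expected).sum(axis=1)
    pvalues = chi2.sf(stats, df=n_batches - 1)

    # A cell is accepted if its p-value is greater than or equal to alpha.
    accept_rate = float((pvalues >= alpha).mean())
    return float(stats.mean()), float(pvalues.mean()), accept_rate


# Synthetic data: two batches that either overlap or are shifted far apart.
rng = np.random.default_rng(0)
X_mixed = rng.normal(size=(200, 5))
batch_labels = np.repeat(["a", "b"], 100)
X_split = X_mixed.copy()
X_split[100:] += 10.0

mixed = kbet_sketch(X_mixed, batch_labels)
split = kbet_sketch(X_split, batch_labels)
```

In this synthetic setup, the well-mixed embedding should give a high acceptance rate, while the shifted embedding, where every neighborhood contains a single batch, should give an acceptance rate near zero.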