pegasus.neighbors
- pegasus.neighbors(data, K=100, rep='pca', n_jobs=-1, random_state=0, full_speed=False, use_cache=True, dist='l2')
Compute the k nearest neighbors and the affinity matrix, which are used by diffmap and graph-based community detection algorithms.
The kNN calculation uses hnswlib introduced by [Malkov16].
- Parameters
  - data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
  - K (int, optional, default: 100) – Number of neighbors, including the data point itself.
  - rep (str, optional, default: "pca") – Embedding representation used to calculate kNN. If None, use data.X; otherwise, the keyword 'X_' + rep must exist in data.obsm.
  - n_jobs (int, optional, default: -1) – Number of threads to use. If -1, use all physical CPU cores.
  - random_state (int, optional, default: 0) – Random seed set for reproducing results.
  - full_speed (bool, optional, default: False) – If True, use multiple threads when constructing the hnsw index; the kNN results are then not reproducible. Otherwise, use only one thread so that results are reproducible.
  - use_cache (bool, optional, default: True) – If True and cached kNN results are found, Pegasus uses the cached results and does not recompute. Otherwise, compute kNN irrespective of caching status.
  - dist (str, optional, default: "l2") – Distance metric to use. By default, use squared L2 distance. Other available options are inner product "ip" and cosine similarity "cosine".
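The three dist options can be illustrated with a small pure-Python sketch. It follows the distance definitions documented for hnswlib's "l2", "ip" and "cosine" spaces. That Pegasus passes dist straight through to hnswlib is an assumption here, and the function names below are illustrative only, not part of the Pegasus API.

```python
import math

def l2_squared(a, b):
    """Squared L2 distance: sum((a_i - b_i)^2). No square root is taken."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ip_distance(a, b):
    """Inner-product distance: 1 - sum(a_i * b_i)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return 1.0 - dot / norm

# Hypothetical lookup keyed by the values accepted for `dist`.
DISTANCES = {"l2": l2_squared, "ip": ip_distance, "cosine": cosine_distance}

a = [1.0, 0.0]
b = [1.0, 1.0]
results = {name: fn(a, b) for name, fn in DISTANCES.items()}
```

Note that "ip" and "cosine" are expressed as distances (one minus the similarity), so smaller values still mean closer neighbors under all three options.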
- Return type
  None
- Returns
  None
  Update data.obsm:
  - data.obsm[rep + "_knn_indices"]: kNN index matrix. Row i is the index list of the kNN of cell i (excluding itself), sorted from nearest to farthest.
  - data.obsm[rep + "_knn_distances"]: kNN distance matrix. Row i is the distance list of the kNN of cell i (excluding itself), sorted from smallest to largest.
  Update data.obsp:
  - data.obsp["W_" + rep]: kNN graph of the data in terms of an affinity matrix.
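The layout of the two kNN matrices can be sketched with a brute-force search on toy data. This is not how Pegasus computes neighbors (it uses hnswlib, an approximate method); the helper name brute_force_knn is hypothetical. The sketch assumes each row holds K - 1 entries, which follows from K counting the point itself while the stored rows exclude it. It uses squared L2 distances; whether Pegasus stores squared or rooted values is not stated here, but the ordering is the same either way.

```python
def brute_force_knn(points, K):
    """Exact kNN by exhaustive search (illustrative only).

    Returns (indices, distances). Row i lists the K - 1 nearest
    neighbors of point i, excluding i itself, nearest first.
    """
    indices, distances = [], []
    for i, p in enumerate(points):
        # Squared L2 distance from point i to every other point.
        candidates = sorted(
            (sum((x - y) ** 2 for x, y in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )[: K - 1]
        indices.append([j for _, j in candidates])
        distances.append([d for d, _ in candidates])
    return indices, distances

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
knn_indices, knn_distances = brute_force_knn(points, K=3)
# knn_indices   -> [[1, 2], [0, 2], [0, 1], [2, 1]]
# knn_distances -> [[1.0, 9.0], [1.0, 10.0], [9.0, 10.0], [29.0, 41.0]]
```

Each row of knn_indices lines up position by position with the same row of knn_distances, which mirrors how the "_knn_indices" and "_knn_distances" entries are described above.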
Examples
>>> import pegasus as pg
>>> pg.neighbors(data)