pegasus.net_tsne

pegasus.net_tsne(data, rep='pca', n_jobs=-1, n_components=2, perplexity=30, early_exaggeration=12, learning_rate=1000, random_state=0, select_frac=0.1, select_K=25, select_alpha=1.0, net_alpha=0.1, polish_learning_frac=0.33, polish_n_iter=150, out_basis='net_tsne')

Calculate Net-tSNE embedding of cells.

Net-tSNE is an approximate tSNE embedding that uses a deep learning model to speed up the calculation.

Specifically, the deep model used is MLPRegressor, the scikit-learn implementation of a multi-layer perceptron regressor.

See [Li20] for details.
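
Conceptually, the algorithm embeds a density-aware subset of cells with standard tSNE, trains the regressor to map the chosen representation (e.g. PCA) to those coordinates, predicts coordinates for all cells, and then polishes them with a short tSNE run. The following is a minimal sketch of that pipeline written directly against scikit-learn; it is not Pegasus' internal implementation, and names such as X_pca, selected, and the regressor settings are illustrative assumptions.

import numpy as np
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_pca = rng.standard_normal((2000, 50))      # stand-in for data.obsm['X_pca']

# 1. Down sample a fraction of cells (Pegasus weights this step by local density;
#    a uniform sample is used here only to keep the sketch short).
selected = rng.random(X_pca.shape[0]) < 0.1

# 2. Embed only the selected cells with standard tSNE.
Y_sub = TSNE(n_components=2, perplexity=30, early_exaggeration=12,
             random_state=0).fit_transform(X_pca[selected])

# 3. Train an MLP regressor that maps the representation to tSNE coordinates.
reg = MLPRegressor(alpha=0.1, max_iter=500, random_state=0)   # alpha plays the role of net_alpha
reg.fit(X_pca[selected], Y_sub)

# 4. Predict approximate coordinates for every cell; Pegasus then polishes them
#    with a short additional tSNE run (polish_n_iter iterations).
Y_all = reg.predict(X_pca)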

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells (n_obs) and columns for genes (n_feature).

  • rep (str, optional, default: "pca") – Representation of data used for the calculation. By default, use PCA coordinates. If None, use the count matrix data.X.

  • n_jobs (int, optional, default: -1) – Number of threads to use. If -1, use all available threads.

  • n_components (int, optional, default: 2) – Dimension of calculated tSNE coordinates. By default, generate 2-dimensional data for 2D visualization.

  • perplexity (float, optional, default: 30) – The perplexity is related to the number of nearest neighbors used in other manifold learning algorithms. Larger datasets usually require a larger perplexity.

  • early_exaggeration (int, optional, default: 12) – Controls how tight natural clusters in the original space are in the embedded space, and how much space will be between them.

  • learning_rate (float, optional, default: 1000) – The learning rate can be a critical parameter, which should be between 100 and 1000.

  • random_state (int, optional, default: 0) – Random seed for reproducing results.

  • select_frac (float, optional, default: 0.1) – Fraction of cells to select during down sampling.

  • select_K (int, optional, default: 25) – Number of neighbors used to estimate the local density of each data point during down sampling.

  • select_alpha (float, optional, default: 1.0) – Weight the down sampling to be proportional to radius ** select_alpha (see the sketch after this parameter list).

  • net_alpha (float, optional, default: 0.1) – L2 penalty (regularization term) parameter of the deep regressor.

  • polish_learning_frac (float, optional, default: 0.33) – After running the deep regressor to predict new coordinates, use polish_learning_frac * n_obs as the learning rate to polish the coordinates.

  • polish_n_iter (int, optional, default: 150) – Number of iterations for the polishing tSNE run.

  • out_basis (str, optional, default: "net_tsne") – Key name for the approximated tSNE coordinates calculated.
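
The select_frac, select_K and select_alpha parameters together control the density-weighted down sampling referenced above. The function below is a minimal sketch of one plausible reading of that scheme, assuming the radius of a cell is the distance to its select_K-th nearest neighbor; it is an illustrative assumption, not Pegasus' exact implementation.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def density_downsample(X, select_frac=0.1, select_K=25, select_alpha=1.0, random_state=0):
    rng = np.random.default_rng(random_state)
    # radius = distance to the select_K-th nearest neighbor (sparse regions -> large radius)
    nn = NearestNeighbors(n_neighbors=select_K).fit(X)
    dist, _ = nn.kneighbors(X)
    radius = dist[:, -1]
    # sampling probability proportional to radius ** select_alpha,
    # so cells in sparse regions are kept preferentially
    prob = radius ** select_alpha
    prob /= prob.sum()
    n_select = int(select_frac * X.shape[0])
    idx = rng.choice(X.shape[0], size=n_select, replace=False, p=prob)
    selected = np.zeros(X.shape[0], dtype=bool)
    selected[idx] = True
    return selected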

Return type

None

Returns

  • None

  • Update data.obsm

    • data.obsm['X_' + out_basis]: Net-tSNE coordinates of the data.

  • Update data.obs

    • data.obs['ds_selected']: Boolean array indicating which cells were selected during the down-sampling phase.

Examples

>>> pg.net_tsne(data)
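
A slightly fuller session, assuming pegasus is imported as pg and data already carries a PCA representation (e.g. produced by pg.pca); the keyword values shown are the defaults:

>>> import pegasus as pg
>>> pg.net_tsne(data, rep='pca', select_frac=0.1, out_basis='net_tsne')
>>> data.obsm['X_net_tsne'].shape      # approximate tSNE coordinates for all cells
>>> data.obs['ds_selected'].sum()      # number of cells used in the down-sampled tSNE run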