pegasus.nmf

pegasus.nmf(data, n_components=20, features='highly_variable_features', space='log', init='nndsvdar', algo='halsvar', mode='batch', tol=0.0001, use_gpu=False, alpha_W=0.0, l1_ratio_W=0.0, alpha_H=0.0, l1_ratio_H=0.0, fp_precision='float', online_chunk_size=5000, n_jobs=-1, random_state=0)

Perform Nonnegative Matrix Factorization (NMF) on the data using the Frobenius norm. The steps are: select features, L2-normalize the data, run NMF, and L2-normalize the resulting coordinates.

The calculation uses the nmf-torch package.
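
To make the pipeline above concrete, here is a minimal, self-contained sketch of Frobenius-norm NMF with multiplicative updates on row-L2-normalized data. This is purely illustrative: pegasus.nmf delegates the actual computation to nmf-torch, and the function `nmf_mu` below is a hypothetical stand-in, not the library's implementation.

```python
import numpy as np

def nmf_mu(X, n_components, n_iter=200, seed=0):
    """Illustrative NMF with Frobenius-norm multiplicative updates.
    Factorizes X (n_cells x n_features) as X ~ H @ W, with H, W >= 0.
    This is a sketch only; pegasus.nmf uses the nmf-torch package."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    H = rng.random((n, n_components))   # cell coordinate factor
    W = rng.random((n_components, m))   # feature loading factor
    eps = 1e-10                         # guard against division by zero
    for _ in range(n_iter):
        # Standard multiplicative updates for the Frobenius loss.
        H *= (X @ W.T) / (H @ W @ W.T + eps)
        W *= (H.T @ X) / (H.T @ H @ W + eps)
    return H, W

# Toy nonnegative data: rows ~ cells, columns ~ (selected) features.
rng = np.random.default_rng(0)
X = rng.random((50, 30))
# Row-wise L2 normalization, mirroring the preprocessing step described above.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
H, W = nmf_mu(Xn, n_components=5)
err = np.linalg.norm(Xn - H @ W)   # Frobenius reconstruction error
```

The multiplicative-update rules keep both factors nonnegative as long as the data and the initialization are nonnegative, which is why the `eps` guard is the only safeguard needed.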

Parameters
  • data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.

  • n_components (int, optional, default: 20) – Number of components to compute.

  • features (str, optional, default: "highly_variable_features") – Keyword in data.var specifying the features used for NMF.

  • space (str, optional, default: "log") – Choose from "log" and "expression". "log" works in log-transformed expression space; "expression" works in the original expression space (normalized by total UMIs).

  • init (str, optional, default: nndsvdar.) – Method to initialize NMF. Options are ‘random’, ‘nndsvd’, ‘nndsvda’ and ‘nndsvdar’.

  • algo (str, optional, default: "halsvar") – Choose from "mu" (Multiplicative Update), "hals" (Hierarchical Alternating Least Squares), "halsvar" (a HALS variant that mimics "bpp" and can sometimes achieve better convergence), and "bpp" (alternating non-negative least squares with the Block Principal Pivoting method).

  • mode (str, optional, default: "batch") – Learning mode. Choose from "batch" and "online". Note that "online" only works for the Frobenius loss (beta=2.0); for other beta losses it falls back to the batch method.

  • tol (float, optional, default: 1e-4) – The tolerance used for the convergence check.

  • use_gpu (bool, optional, default: False) – If True, use GPU if available. Otherwise, use CPU only.

  • alpha_W (float, optional, default: 0.0) – A numeric scale factor which multiplies the regularization terms related to W. If zero or negative, no regularization regarding W is considered.

  • l1_ratio_W (float, optional, default: 0.0) – The ratio of the L1 penalty on W; must be between 0 and 1. The ratio of the L2 penalty on W is thus (1 - l1_ratio_W).

  • alpha_H (float, optional, default: 0.0) – A numeric scale factor which multiplies the regularization terms related to H. If zero or negative, no regularization regarding H is considered.

  • l1_ratio_H (float, optional, default: 0.0) – The ratio of the L1 penalty on H; must be between 0 and 1. The ratio of the L2 penalty on H is thus (1 - l1_ratio_H).

  • fp_precision (str, optional, default: "float") – The numeric precision of the results. Choose from "float" and "double".

  • online_chunk_size (int, optional, default: 5000) – The chunk / mini-batch size for online learning. Only used when mode='online'.

  • n_jobs (int, optional (default: -1)) – Number of threads to use. -1 refers to using all physical CPU cores.

  • random_state (int, optional, default: 0) – Random seed for reproducible results.
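
The alpha_W/l1_ratio_W and alpha_H/l1_ratio_H pairs combine L1 and L2 penalties in elastic-net fashion. The sketch below shows one common convention for how such a penalty is computed (the one used by scikit-learn); the exact scaling inside nmf-torch may differ, so treat this only as an illustration of how alpha and l1_ratio interact.

```python
import numpy as np

def elastic_net_penalty(M, alpha, l1_ratio):
    """Illustrative elastic-net penalty controlled by (alpha, l1_ratio).
    Assumed convention (scikit-learn style; nmf-torch's scaling may differ):
        alpha * (l1_ratio * ||M||_1 + 0.5 * (1 - l1_ratio) * ||M||_F^2)
    """
    if alpha <= 0:
        return 0.0  # zero or negative alpha disables regularization
    l1 = np.abs(M).sum()        # L1 norm: sum of absolute entries
    l2 = (M ** 2).sum()         # squared Frobenius norm
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)

W = np.array([[1.0, 2.0], [0.0, 3.0]])
p_l1 = elastic_net_penalty(W, alpha=0.5, l1_ratio=1.0)  # pure L1 penalty
p_l2 = elastic_net_penalty(W, alpha=0.5, l1_ratio=0.0)  # pure L2 penalty
p_off = elastic_net_penalty(W, alpha=0.0, l1_ratio=0.5) # regularization off
```

With l1_ratio=1.0 only the sparsity-inducing L1 term remains, which is the setting to use when sparse factors are desired.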

Return type

None

Returns

  • None.

  • Update data.obsm

    • data.obsm["X_nmf"]: Scaled NMF coordinates of shape (n_cells, n_components). Each column has a unit variance.

    • data.obsm["H"]: The coordinate factor matrix of shape (n_cells, n_components).

  • Update data.uns

    • data.uns["W"]: The feature factor matrix of shape (n_HVFs, n_components).

    • data.uns["nmf_err"]: The NMF loss.

    • data.uns["nmf_features"]: Records the features used for the NMF analysis.
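
The relationship stated above between data.obsm["H"] and data.obsm["X_nmf"] can be sketched as follows: X_nmf is H rescaled so that each component (column) has unit variance. The snippet uses a random stand-in for H; the exact internals belong to pegasus/nmf-torch.

```python
import numpy as np

# Stand-in for data.obsm["H"], the coordinate factor matrix of shape
# (n_cells, n_components). Real values come from pegasus.nmf.
rng = np.random.default_rng(0)
H = rng.random((100, 20))

# Scale each column to unit variance, matching the description of
# data.obsm["X_nmf"] as "scaled NMF coordinates" with unit-variance columns.
X_nmf = H / H.std(axis=0)
```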

Examples

>>> import pegasus as pg
>>> pg.nmf(data)