pegasus.highly_variable_features
- pegasus.highly_variable_features(data, batch=None, flavor='pegasus', n_top=2000, span=0.02, min_disp=0.5, max_disp=inf, min_mean=0.0125, max_mean=7, n_jobs=-1)
Highly variable features (HVF) selection. The input data should be logarithmized.
- Parameters
data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
batch (str, optional, default: None) – A key in data.obs specifying batch information. If batch is not set, batch effects are not considered when selecting highly variable features. Otherwise, if data.obs[batch] is not categorical, it is automatically converted to categorical before highly variable feature selection.
flavor (str, optional, default: "pegasus") – The HVF selection method to use. Available choices are "pegasus" or "Seurat".
n_top (int, optional, default: 2000) – Number of genes to be selected as HVF. If None, no gene will be selected.
span (float, optional, default: 0.02) – Only applicable when flavor is "pegasus". The smoothing factor used by scikit-learn loess model in pegasus HVF selection method.
min_disp (float, optional, default: 0.5) – Minimum normalized dispersion.
max_disp (float, optional, default: np.inf) – Maximum normalized dispersion. Set it to np.inf for no upper bound.
min_mean (float, optional, default: 0.0125) – Minimum mean.
max_mean (float, optional, default: 7) – Maximum mean.
n_jobs (int, optional, default: -1) – Number of threads to be used during calculation. If -1, all physical CPU cores will be used.
- Return type
None
- Returns
None
Update data.var:
highly_variable_features: replaced with a Boolean array indicating the selected highly variable features.
Examples
>>> pg.highly_variable_features(data)
>>> pg.highly_variable_features(data, batch="Channel")
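To make the roles of min_mean, max_mean, min_disp, max_disp and n_top more concrete, the following is a minimal pure-Python sketch of threshold-based feature selection. It is not the pegasus implementation: the helper name select_by_thresholds is hypothetical, the per-feature means and normalized dispersions are made-up numbers, and the order of operations (apply the cutoffs first, then keep the n_top features with the highest dispersion) is an assumption, since this page does not specify how the cutoffs and n_top interact.

```python
import math


def select_by_thresholds(means, disps, n_top=2000, min_disp=0.5,
                         max_disp=math.inf, min_mean=0.0125, max_mean=7):
    """Return a Boolean list marking features that pass the cutoffs.

    Hypothetical helper for illustration only; not part of pegasus.
    Assumes n_top is an integer.
    """
    # Indices of features whose mean and normalized dispersion are in range.
    passing = [
        i for i, (m, d) in enumerate(zip(means, disps))
        if min_mean <= m <= max_mean and min_disp <= d <= max_disp
    ]
    # Assumption: rank the passing features by dispersion and keep at most n_top.
    passing.sort(key=lambda i: disps[i], reverse=True)
    keep = set(passing[:n_top])
    return [i in keep for i in range(len(means))]


# Made-up per-feature statistics for five features.
means = [0.005, 0.5, 1.2, 3.0, 9.0]
disps = [2.0, 0.3, 1.5, 0.8, 4.0]

mask_all = select_by_thresholds(means, disps)
mask_top1 = select_by_thresholds(means, disps, n_top=1)
print(mask_all)
print(mask_top1)
```

The returned Boolean list plays the same role as the highly_variable_features column written to data.var: one True/False entry per feature.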