pegasus.highly_variable_features
- pegasus.highly_variable_features(data, batch=None, flavor='pegasus', n_top=2000, span=0.02, min_disp=0.5, max_disp=inf, min_mean=0.0125, max_mean=7, n_jobs=-1)
Highly variable features (HVF) selection. The input data should be logarithmized.
- Parameters
  - data (pegasusio.MultimodalData) – Annotated data matrix with rows for cells and columns for genes.
  - batch (str, optional, default: None) – A key in data.obs specifying batch information. If batch is not set, batch effects are not considered when selecting highly variable features. Otherwise, if data.obs[batch] is not categorical, it is automatically converted to categorical before highly variable feature selection.
  - flavor (str, optional, default: "pegasus") – The HVF selection method to use. Available choices are "pegasus" or "Seurat".
  - n_top (int, optional, default: 2000) – Number of genes to be selected as HVF. If None, no gene will be selected.
  - span (float, optional, default: 0.02) – Only applicable when flavor is "pegasus". The smoothing factor used by the scikit-misc loess model in the pegasus HVF selection method.
  - min_disp (float, optional, default: 0.5) – Minimum normalized dispersion.
  - max_disp (float, optional, default: np.inf) – Maximum normalized dispersion. Set it to np.inf for no upper bound.
  - min_mean (float, optional, default: 0.0125) – Minimum mean.
  - max_mean (float, optional, default: 7) – Maximum mean.
  - n_jobs (int, optional, default: -1) – Number of threads to be used during calculation. If -1, all physical CPU cores will be used.
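The four threshold parameters (min_mean, max_mean, min_disp, max_disp) can be read as a range filter on each feature's mean and normalized dispersion. The following is a minimal sketch of that idea in plain Python, not the pegasus implementation: the helper name within_bounds, the use of strict inequalities, and the example values are all assumptions made for illustration.

```python
import math


def within_bounds(mean, disp, min_disp=0.5, max_disp=math.inf,
                  min_mean=0.0125, max_mean=7.0):
    """Return True if mean and normalized dispersion fall inside the bounds.

    Hypothetical helper: strict inequalities are an assumption here.
    """
    return (min_mean < mean < max_mean) and (min_disp < disp < max_disp)


# Hypothetical (mean, normalized dispersion) pairs for four features.
features = {
    "geneA": (0.50, 1.20),   # inside every bound
    "geneB": (0.001, 2.00),  # mean below min_mean
    "geneC": (3.00, 0.10),   # dispersion below min_disp
    "geneD": (9.00, 4.00),   # mean above max_mean
}

selected = [name for name, (m, d) in features.items() if within_bounds(m, d)]
print(selected)
```

With the default max_disp of infinity, the dispersion check has no effective upper limit, so only the lower bound min_disp matters.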
- Return type
  None
- Returns
  None. Updates data.var:
  - highly_variable_features: replaced with a Boolean array indicating the selected highly variable features.
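To make the shape of the result concrete, the sketch below builds a Boolean mask with one entry per feature, marking the n_top highest-ranked features as True. It is only an illustration in plain Python: the function top_n_mask and the scores are hypothetical, and the ranking statistic pegasus actually uses depends on the chosen flavor.

```python
def top_n_mask(scores, n_top):
    """Return a Boolean list marking the n_top highest-scoring features.

    Hypothetical helper; if n_top is None, nothing is selected,
    mirroring the documented behaviour of the n_top parameter.
    """
    if n_top is None:
        return [False] * len(scores)
    # Feature indices ordered from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_top])
    return [i in keep for i in range(len(scores))]


# Hypothetical per-feature variability scores.
scores = [0.2, 1.5, 0.9, 3.1, 0.1]
mask = top_n_mask(scores, n_top=2)
print(mask)  # [False, True, False, True, False]
```

A mask of this form is what data.var["highly_variable_features"] holds after the call, so the number of True entries gives the number of selected features.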
Examples
>>> pg.highly_variable_features(data)
>>> pg.highly_variable_features(data, batch="Channel")