partipy.compute_bootstrap_variance
partipy.compute_bootstrap_variance(adata, n_bootstrap, n_archetypes_list=None, init=None, optim=None, weight=None, max_iter=None, early_stopping=True, rel_tol=None, coreset_algorithm=None, coreset_fraction=0.1, coreset_size=None, delta=0.0, seed=42, save_to_anndata=True, return_result=False, n_jobs=-1, verbose=False, force_recompute=False, **optim_kwargs)
Perform bootstrap sampling to compute archetypes and assess their stability.
This function generates bootstrap samples from the data, computes archetypes for each sample, aligns them with the reference archetypes, and stores the results in adata.uns["AA_bootstrap"]. This makes it possible to assess the stability of the archetypes across bootstrap iterations.
- Parameters:
adata (anndata.AnnData) – The AnnData object containing the data to fit the archetypes. The data should be available in adata.obsm[obsm_key].
n_bootstrap (int) – The number of bootstrap samples to generate.
n_archetypes_list (Union[int, List[int]], default list(range(2, 8))) – A list specifying the numbers of archetypes to evaluate. Can also be a single int.
optim ({"regularized_nnls", "projected_gradients", "frank_wolfe"}, default "projected_gradients") – Optimization algorithm to use (the aliases "PCHA" → "projected_gradients" and "FW" → "frank_wolfe" are also accepted). Options are:
  - "projected_gradients": Projected gradient descent (also known as PCHA) [MH12].
  - "frank_wolfe": Frank-Wolfe algorithm (often abbreviated FW) [BKHT15].
  - "regularized_nnls": Regularized non-negative least squares [CB94].
  See partipy.schema.OPTIM_ALGS for all available options.
init ({"uniform", "furthest_sum", "plus_plus"}, default "plus_plus") – Initialization method for the archetypes. Options are:
  - "plus_plus": Archetypal++ initialization [MS24].
  - "furthest_sum": Utilizes the furthest sum algorithm [MH12].
  - "uniform": Random initialization.
  See partipy.schema.INIT_ALGS for all available options.
seed (int, default 42) – Random seed to use for reproducible results.
save_to_anndata (bool, default True) – Whether to save the results to adata.uns["AA_bootstrap"]. If False, the result is returned.
n_jobs (int, default -1) – The number of jobs to run in parallel. -1 uses all available cores.
verbose (bool, default False) – Whether to print progress.
force_recompute (bool, default False) – Recompute bootstrap samples even if cached results for the given configuration already exist.
**optim_kwargs – Additional keyword arguments passed to the AA class.
weight (None | str)
max_iter (None | int)
early_stopping (bool)
rel_tol (None | float)
coreset_algorithm (None | str)
coreset_fraction (float)
coreset_size (None | int)
delta (float)
return_result (bool)
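The procedure summarized at the top of this page (resample, refit, align to the reference, measure the spread) can be illustrated with a small NumPy sketch. This is not partipy's implementation: `toy_archetypes` is a hypothetical stand-in estimator rather than archetypal analysis, the brute-force permutation alignment is an assumption about how bootstrap archetypes could be matched to the reference, and the exact variance formula used by the library may differ.

```python
from itertools import permutations

import numpy as np


def toy_archetypes(X, n_archetypes):
    """Hypothetical stand-in estimator (NOT archetypal analysis): return the
    2D data point lying furthest along each of n_archetypes evenly spaced directions."""
    angles = 2 * np.pi * np.arange(n_archetypes) / n_archetypes
    directions = np.column_stack([np.cos(angles), np.sin(angles)])
    centered = X - X.mean(axis=0)
    return X[np.argmax(centered @ directions.T, axis=0)]


def align_to_reference(Z, Z_ref):
    """Reorder the rows of Z to minimize the total distance to Z_ref.
    Brute force over permutations; only sensible for a handful of archetypes."""
    best = min(
        permutations(range(len(Z))),
        key=lambda p: np.linalg.norm(Z[list(p)] - Z_ref),
    )
    return Z[list(best)]


def bootstrap_variance(X, n_bootstrap, n_archetypes, seed=42):
    rng = np.random.default_rng(seed)
    Z_ref = toy_archetypes(X, n_archetypes)  # reference fit on the full data
    aligned = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, len(X), size=len(X))  # sample rows with replacement
        Z = toy_archetypes(X[idx], n_archetypes)
        Z = Z[rng.permutation(n_archetypes)]  # scramble order to mimic label switching
        aligned.append(align_to_reference(Z, Z_ref))
    aligned = np.stack(aligned)  # shape: (n_bootstrap, n_archetypes, n_dims)
    var_per_coord = aligned.var(axis=0)  # variance across bootstrap samples
    variance_per_archetype = var_per_coord.mean(axis=1)
    mean_variance = var_per_coord.mean()
    return Z_ref, variance_per_archetype, mean_variance


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
Z_ref, variance_per_archetype, mean_variance = bootstrap_variance(
    X, n_bootstrap=20, n_archetypes=3
)
```

Low values of `variance_per_archetype` indicate that an archetype lands in roughly the same place regardless of which cells were resampled. The alignment step matters because the order in which archetypes are returned is arbitrary from one fit to the next.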
- Return type:
None
- Returns:
The results are stored in adata.uns["AA_bootstrap"] as a DataFrame with the following columns:
- x_i: The coordinates of the archetypes along the i-th principal component.
- archetype: The archetype index.
- iter: The bootstrap iteration index (0 for the reference archetypes).
- reference: A boolean indicating whether the archetype is from the reference model.
- mean_variance: The mean variance of all archetype coordinates across bootstrap samples.
- variance_per_archetype: The mean variance of each archetype's coordinates across bootstrap samples.
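To make the column layout concrete, the following sketch builds a toy DataFrame with the coordinate, archetype, iter, and reference columns described above and summarizes it with pandas. The data are synthetic, and the concrete column names `x_0` and `x_1` are an assumption based on the `x_i` description. With real results you would start from `adata.uns["AA_bootstrap"]` instead, whose exact layout (for example when several archetype numbers are evaluated) may differ from this toy frame.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_bootstrap, n_archetypes = 5, 3
reference = np.array([[0.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])

# Build a synthetic frame; iter 0 holds the reference archetypes.
rows = []
for it in range(n_bootstrap + 1):
    noise = 0.0 if it == 0 else rng.normal(scale=0.1, size=reference.shape)
    coords = reference + noise
    for a in range(n_archetypes):
        rows.append(
            {
                "x_0": coords[a, 0],  # assumed name for the first coordinate
                "x_1": coords[a, 1],  # assumed name for the second coordinate
                "archetype": a,
                "iter": it,
                "reference": it == 0,
            }
        )
df = pd.DataFrame(rows)

# Separate the reference archetypes from the bootstrap replicates.
ref_df = df[df["reference"]]
boot_df = df[~df["reference"]]

# Variance of each coordinate across bootstrap iterations, averaged per archetype.
coord_cols = [c for c in df.columns if c.startswith("x_")]
per_archetype = boot_df.groupby("archetype")[coord_cols].var(ddof=0).mean(axis=1)
overall = per_archetype.mean()
```

Here `per_archetype` and `overall` play the roles described for variance_per_archetype and mean_variance, recomputed by hand so the relationship between the columns is visible.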