partipy.compute_archetypes
- partipy.compute_archetypes(adata, n_archetypes, n_restarts=5, init=None, optim=None, weight=None, max_iter=None, early_stopping=True, rel_tol=None, coreset_algorithm=None, coreset_fraction=0.1, coreset_size=None, delta=0.0, verbose=None, seed=42, n_jobs=-1, save_to_anndata=True, return_result=False, archetypes_only=False, force_recompute=False, **optim_kwargs)
Perform Archetypal Analysis (AA) on the input data.
This function is a wrapper around the AA class, offering a simplified interface for fitting the model and returning the results, or saving them to the AnnData object. It allows users to customize the archetype computation with various parameters for initialization, optimization, convergence, and output.
- Parameters:
adata (anndata.AnnData) – The AnnData object containing the data to fit the archetypes. The data should be available in adata.obsm[obsm_key].
n_archetypes (int) – The number of archetypes to compute.
n_restarts (int) – The optimization is run n_restarts times. The run with the lowest RSS is kept.
init ({"uniform", "furthest_sum", "plus_plus"}, default "plus_plus") – Initialization method for the archetypes. Options are:
- "plus_plus": Archetypal++ initialization [MS24].
- "furthest_sum": Utilizes the furthest sum algorithm [MH12].
- "uniform": Random initialization.
See partipy.schema.INIT_ALGS for all available options.
optim ({"regularized_nnls", "projected_gradients", "frank_wolfe"}, default "projected_gradients") – Optimization algorithm to use (aliases "PCHA" → "projected_gradients" and "FW" → "frank_wolfe" are also accepted). Options are:
- "projected_gradients": Projected gradient descent (also known as PCHA) [MH12].
- "frank_wolfe": Frank-Wolfe algorithm (often abbreviated FW) [BKHT15].
- "regularized_nnls": Regularized non-negative least squares [CB94].
See partipy.schema.OPTIM_ALGS for all available options.
weight ({None, "bisquare", "huber"}, default None) – Weighting scheme for robust archetypal analysis, based on [EL11]. Options are:
- None: No weighting (standard archetypal analysis).
- "bisquare": Bisquare weighting for robust estimation.
- "huber": Huber weighting for robust estimation.
See partipy.schema.WEIGHT_ALGS for all available options.
max_iter (int, default 500) – Maximum number of iterations for the optimization algorithm.
early_stopping (bool, default True) – Whether to stop the optimization early if the relative change in RSS is below a certain threshold.
rel_tol (float, default 0.0001) – Tolerance for convergence of the optimization algorithm.
coreset_algorithm ({None, "standard", "lightweight_kmeans", "uniform"}, default None) – Coreset algorithm to use for data reduction, based on [MB19]. Options are:
- None: No coreset is used.
- "standard": Coreset construction for archetypal analysis [MB19]. Recommended option if data reduction is needed.
- "lightweight_kmeans": Lightweight coreset for k-means clustering [LBK16].
- "uniform": Coreset based on uniform sampling.
See partipy.schema.CORESET_ALGS for all available options.
coreset_fraction (float, default 0.1) – Fraction of the data to use for the coreset. Only used if coreset_algorithm is not None and coreset_size is None.
coreset_size (int, default None) – If None, it is set to n_samples * coreset_fraction. Otherwise it overrides the coreset_fraction argument.
delta (float, default 0.0) – Parameter that relaxes the constraint that the rows of B must be convex combinations of the data points. Must be in the interval [0, 1).
verbose (bool, default False) – Whether to display progress messages and additional execution details.
seed (int, default 42) – Random seed to use for reproducible results.
n_jobs (int, default -1) – Number of jobs for parallel computation. -1 uses all available cores.
save_to_anndata (bool, default True) – Whether to save the results to the AnnData object. If False, the results are returned as a tuple. If adata is not an AnnData object, this is ignored.
archetypes_only (bool, default False) – Whether to save/return only the archetypes matrix Z (if set to True) or the full outputs: the matrices A, B, and Z, the RSS, and the variance explained varexpl.
optim_kwargs (dict | None, default None) – Additional arguments that are passed to partipy.arch.AA.
return_result (bool)
force_recompute (bool)
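The interplay of n_restarts, early_stopping/rel_tol, and coreset_fraction/coreset_size described above can be illustrated with a small standalone sketch. It does not use partipy: the helpers resolve_coreset_size, has_converged, best_of_restarts, and toy_fit are hypothetical names, the rounding rule and the exact form of the relative-change test are assumptions, and toy_fit is only a stand-in for a real archetypal analysis solver.

```python
import random


def resolve_coreset_size(n_samples, coreset_fraction=0.1, coreset_size=None):
    """An explicit coreset_size wins; otherwise derive it from the fraction."""
    if coreset_size is not None:
        return coreset_size
    # Assumption: round down. The docs only state n_samples * coreset_fraction.
    return int(n_samples * coreset_fraction)


def has_converged(rss_prev, rss_curr, rel_tol=1e-4):
    """Early-stopping test: relative change in RSS below rel_tol (assumed form)."""
    if rss_prev == 0:
        return True
    return abs(rss_prev - rss_curr) / rss_prev < rel_tol


def toy_fit(rng, max_iter=500, rel_tol=1e-4):
    """Toy optimizer: RSS decays towards a random floor (not a real AA solver)."""
    floor = rng.uniform(1.0, 2.0)
    rss = 10.0
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        new_rss = floor + 0.5 * (rss - floor)
        converged = has_converged(rss, new_rss, rel_tol)
        rss = new_rss
        if converged:
            break
    return {"RSS": rss, "n_iter": n_iter}


def best_of_restarts(fit_once, n_restarts=5, seed=42):
    """Run fit_once n_restarts times and keep the run with the lowest RSS."""
    best = None
    for restart in range(n_restarts):
        rng = random.Random(seed + restart)  # reproducible, distinct per restart
        result = fit_once(rng)
        if best is None or result["RSS"] < best["RSS"]:
            best = result
    return best


best = best_of_restarts(toy_fit, n_restarts=5, seed=42)
print(best["n_iter"] < 500)                  # early stopping ended the run before max_iter
print(resolve_coreset_size(2000, 0.25))      # size derived from the fraction
print(resolve_coreset_size(2000, 0.25, 50))  # explicit size takes precedence
```

In this sketch each restart gets its own seeded generator, so the same seed always selects the same best run.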
- Return type:
ndarray | tuple[ndarray, ndarray, ndarray, list[float] | ndarray, float] | None
- Returns:
np.ndarray or tuple or None. The output depends on the values of save_to_anndata and archetypes_only:
- If archetypes_only is True: Only the archetype matrix Z is returned or saved.
- If archetypes_only is False: A tuple is returned or saved, containing:
  - A: ndarray of shape (n_samples, n_archetypes)
    The matrix of weights for the data points.
  - B: ndarray of shape (n_archetypes, n_samples)
    The matrix of weights for the archetypes.
  - Z: ndarray of shape (n_archetypes, n_features)
    The archetypes matrix.
  - RSS: float
    The residual sum of squares from optimization.
  - varexpl: float
    The variance explained by the model.
- If save_to_anndata is True: Returns None. Results are saved to adata.uns["AA_results"].
- If save_to_anndata is False: The results described above are returned.
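To make the shapes of A, B, and Z concrete, here is a minimal pure-Python sketch of how these outputs relate in standard archetypal analysis: the archetypes are Z = B X and the data are approximated by A Z. It does not call partipy. The toy data, the helper matmul, and the definition of variance explained as 1 - RSS / TSS with a mean-centered TSS are assumptions made for illustration; the library may compute varexpl differently.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


# Toy data X: 5 samples, 2 features. The last point lies outside the triangle
# spanned by the first three, so it cannot be reconstructed exactly.
X = [[0, 0], [3, 0], [0, 3], [0.75, 0.75], [4, 0]]

# B (n_archetypes, n_samples): each archetype is a convex combination of samples.
B = [[1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0]]

# A (n_samples, n_archetypes): each sample is a convex combination of archetypes.
A = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [0.5, 0.25, 0.25],
     [0, 1, 0]]

Z = matmul(B, X)      # archetypes, shape (n_archetypes, n_features)
X_hat = matmul(A, Z)  # reconstruction of the data, shape (n_samples, n_features)

# Residual sum of squares between the data and the reconstruction.
rss = sum((x - xh) ** 2
          for row, row_hat in zip(X, X_hat)
          for x, xh in zip(row, row_hat))

# Assumed definition: variance explained relative to the mean-centered data.
n_samples = len(X)
means = [sum(col) / n_samples for col in zip(*X)]
tss = sum((x - m) ** 2 for row in X for x, m in zip(row, means))
varexpl = 1.0 - rss / tss

print(Z)
print(rss, varexpl)
```

With delta=0.0, every row of A and of B is non-negative and sums to 1, which is what keeps the archetypes inside the convex hull of the data and the reconstructions inside the convex hull of the archetypes.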