Anomaly component analysis
- DepthEucl.ACA(dim: int = 2, sample_size: None = None, sample: None = None, notion: str = 'projection', solver: str = 'neldermead', NRandom: int = 100, n_refinements: int = 10, sphcap_shrink: float = 0.5, alpha_Dirichlet: float = 1.25, cooling_factor: float = 0.95, cap_size: int = 1, start: str = 'mean', space: str = 'sphere', line_solver: str = 'goldensection', bound_gc: bool = True)
Computes the anomaly component analysis (ACA).
- Arguments
- dim: int, default=2
Number of dimensions to keep in the reduction
- sample_size: int, default=None
Number of observations, drawn by uniform sampling, to use in the ACA computation
- sample: list[int], default=None
Indices of the observations to use in the computation
- notion: str {'projection', 'halfspace'}, default='projection'
Chosen notion for depth computation
- solver: str {'refinedrandom', 'neldermead'}, default='neldermead'
The type of solver used to approximate the depth.
- NRandom: int, default=1000
The total number of iterations used to compute the depth. Some solvers converge faster, so they are run several times to reach NRandom iterations.
- n_refinements: int, default=10
Maximum number of iterations for computing the depth of one point. Used with solver='refinedrandom' or 'refinedgrid'.
- sphcap_shrink: float, default=0.5
Shrinking factor of the spherical cap. Used with solver='refinedrandom' or 'refinedgrid'.
- alpha_Dirichlet: float, default=1.25
Parameter of the Dirichlet distribution. Used with solver='randomsimplices'.
- cooling_factor: float, default=0.95
Cooling factor. Used with solver='simulatedannealing'.
- cap_size: int | float, default=1
Size of the spherical cap. Used with solver='simulatedannealing' or 'neldermead'.
- start: str {'mean', 'random'}, default='mean'
Method used to compute the first depth. Used with solver='simulatedannealing' or 'neldermead'.
- space: str {'sphere', 'euclidean'}, default='sphere'
Type of space in which the solver runs. Used with solver='coordinatedescent' or 'neldermead'.
- line_solver: str {'uniform', 'goldensection'}, default='goldensection'
Line search strategy used by the solver. Used with solver='coordinatedescent'.
- bound_gc: bool, default=True
If True, the search is limited to the closed hemisphere. Used with solver='neldermead'.
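To give a feel for what notion='projection' and NRandom control, here is a minimal NumPy sketch of projection depth approximated by random search over directions. It is not the library's implementation: the helper name `projection_depth_random` and its parameters are hypothetical, and the solvers listed above search the directions more cleverly. It uses the standard definition, where the outlyingness of a point is the largest value of |u·x − median(u·X)| / MAD(u·X) over unit directions u, and the depth is 1 / (1 + outlyingness).

```python
import numpy as np


def projection_depth_random(x, data, n_random=1000, seed=0):
    """Approximate the projection depth of point x with respect to data.

    Hypothetical helper for illustration; not part of the library.
    """
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    # Draw directions uniformly on the unit sphere by normalising Gaussian vectors.
    directions = rng.standard_normal((n_random, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project the data and the point onto every direction.
    proj_data = data @ directions.T  # shape (n, n_random)
    proj_x = directions @ x          # shape (n_random,)
    # Robust location and scale of each univariate projection.
    med = np.median(proj_data, axis=0)
    mad = np.median(np.abs(proj_data - med), axis=0)
    # Outlyingness is the largest standardised deviation over the directions tried.
    outlyingness = np.max(np.abs(proj_x - med) / mad)
    return 1.0 / (1.0 + outlyingness)


rng = np.random.default_rng(0)
data = rng.standard_normal((500, 3))
depth_center = projection_depth_random(np.zeros(3), data)
depth_far = projection_depth_random(np.array([10.0, 10.0, 10.0]), data)
print(depth_center > depth_far)  # True
```

Because only a finite set of directions is tried, this sketch can only underestimate the outlyingness, so the value it returns is an upper bound on the exact projection depth. Trying more directions tightens the bound.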
- Results
ACA directions for dimensional reduction
- References
Valla, R., Mozharovskyi, P., & d’Alché-Buc, F. (2023). Anomaly component analysis. arXiv preprint arXiv:2312.16139.
- Examples
>>> import numpy as np
>>> from depth.model import DepthEucl
>>> mat1=[[1, 0, 0, 0, 0],[0, 2, 0, 0, 0],[0, 0, 3, 0, 0],[0, 0, 0, 2, 0],[0, 0, 0, 0, 1]]
>>> mat2=[[1, 0, 0, 0, 0],[0, 1, 0, 0, 0],[0, 0, 1, 0, 0],[0, 0, 0, 1, 0],[0, 0, 0, 0, 1]]
>>> np.random.seed(0)
>>> data1 = np.random.multivariate_normal([0,0,0,0,0], mat1, 990)
>>> data2 = np.random.multivariate_normal([0,1,1,0,0], mat2, 10)
>>> data=np.concat((data1,data2),axis=0)
>>> model = DepthEucl().load_dataset(data)
>>> model.ACA(dim=2, sample_size=900)
array([[-0.13558675, -0.13558675],
       [ 2.65800844,  2.65800844],
       [-1.38230018, -1.38230018],
       [-0.11503065, -0.11503065],
       [ 0.55349281,  0.55349281]])
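The array returned in the example above has one row per input feature (5) and one column per requested dimension (dim=2). Assuming each column is one ACA direction, the reduced representation of the data would be obtained by projecting the observations onto those columns. The sketch below illustrates that step with NumPy only. The `directions` array is a hypothetical stand-in for the output of model.ACA, and the documentation does not say whether the data should be centred first.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 5))  # n = 1000 observations, d = 5 features

# Hypothetical stand-in for the array returned by model.ACA(dim=2):
# assumed shape (d, dim), one direction per column.
directions, _ = np.linalg.qr(rng.standard_normal((5, 2)))

# Coordinates of every observation along the two directions.
reduced = data @ directions
print(reduced.shape)  # (1000, 2)
```

The two columns of `reduced` can then be plotted against each other to inspect which observations stand out.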