---
title: "ELKI"
chunk: 2/2
source: "https://en.wikipedia.org/wiki/ELKI"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T10:11:18.253498+00:00"
instance: "kb-cron"
---

Cluster analysis:
K-means clustering (including fast algorithms such as Elkan, Hamerly, Annulus, and Exponion k-Means, and robust variants such as k-means--)
K-medians clustering
K-medoids clustering (PAM) (including FastPAM and approximations such as CLARA, CLARANS)
Expectation-maximization algorithm for Gaussian mixture modeling
Hierarchical clustering (including the fast SLINK, CLINK, NNChain and Anderberg algorithms)
Single-linkage clustering
Leader clustering
DBSCAN (Density-Based Spatial Clustering of Applications with Noise, with full index acceleration for arbitrary distance functions)
OPTICS (Ordering Points To Identify the Clustering Structure), including the extensions OPTICS-OF, DeLi-Clu, HiSC, HiCO and DiSH
HDBSCAN
Mean-shift clustering
BIRCH clustering
SUBCLU (Density-Connected Subspace Clustering for High-Dimensional Data)
CLIQUE clustering
ORCLUS and PROCLUS clustering
COPAC, ERiC and 4C clustering
CASH clustering
DOC and FastDOC subspace clustering
P3C clustering
Canopy clustering algorithm
Anomaly detection:
k-Nearest-Neighbor outlier detection
LOF (Local outlier factor)
LoOP (Local Outlier Probabilities)
OPTICS-OF
DB-Outlier (Distance-Based Outliers)
LOCI (Local Correlation Integral)
LDOF (Local Distance-Based Outlier Factor)
EM-Outlier
SOD (Subspace Outlier Degree)
COP (Correlation Outlier Probabilities)
Frequent itemset mining and association rule learning:
Apriori algorithm
Eclat
FP-growth
Dimensionality reduction:
Principal component analysis
Multidimensional scaling
T-distributed stochastic neighbor embedding (t-SNE)
Spatial index structures and other search indexes:
R-tree
R*-tree
M-tree
k-d tree
X-tree
Cover tree
iDistance
NN descent
Locality sensitive hashing (LSH)
Evaluation:
Precision and recall, F1 score, Average Precision
Receiver operating characteristic (ROC curve)
Discounted cumulative gain (including NDCG)
Silhouette index
Davies–Bouldin index
Dunn index
Density-based cluster validation (DBCV)
Visualization:
Scatter plots
Histograms
Parallel coordinates (also in 3D, using OpenGL)
Other:
Statistical distributions and many parameter estimators, including robust MAD-based and L-moment-based estimators
Dynamic time warping
Change point detection in time series
Intrinsic dimensionality estimators
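
To illustrate one entry from the list above, the following is a minimal sketch of k-nearest-neighbor outlier detection, assuming the common formulation in which each point is scored by the distance to its k-th nearest neighbor. It is plain Python written for illustration and is not ELKI's implementation or API; the function name `knn_outlier_scores` and the sample points are hypothetical. The sketch uses a brute-force search, whereas a toolkit such as ELKI would typically accelerate the neighbor queries with one of the index structures listed above.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Hypothetical helper for illustration only; larger scores suggest outliers.
    The point itself is excluded from its own neighbor list.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four points on a unit square plus one point far away (made-up sample data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)

# Index of the point with the highest outlier score.
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, scores)
```

With this sample data, the isolated point at (10, 10) receives the largest score, because its second-nearest neighbor is much farther away than those of the four clustered points.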

== Version history ==
Version 0.1 (July 2008) contained several algorithms from cluster analysis and anomaly detection, as well as some index structures such as the R*-tree. The focus of the first release was on subspace clustering and correlation clustering algorithms.
Version 0.2 (July 2009) added functionality for time series analysis, in particular distance functions for time series.
Version 0.3 (March 2010) extended the choice of anomaly detection algorithms and visualization modules.
Version 0.4 (September 2011) added algorithms for geo data mining and support for multi-relational database and index structures.
Version 0.5 (April 2012) focused on the evaluation of cluster analysis results, adding new visualizations and some new algorithms.
Version 0.6 (June 2013) introduced a new 3D adaptation of parallel coordinates for data visualization, in addition to the usual new algorithms and index structures.
Version 0.7 (August 2015) added support for uncertain data types and algorithms for the analysis of uncertain data.
Version 0.7.5 (February 2019) added further clustering algorithms, anomaly detection algorithms, evaluation measures, and indexing structures.
Version 0.8 (October 2022) added automatic index creation, garbage collection, and incremental priority search, as well as many more algorithms such as BIRCH.

== Similar applications ==
scikit-learn: A machine learning library in Python
Weka: A similar project by the University of Waikato, with a focus on classification algorithms
RapidMiner: An application available commercially (a restricted version is available as open source)
KNIME: An open source platform which integrates various components for machine learning and data mining

== See also ==
Comparison of statistical packages

== External links ==
Official website of ELKI with download and documentation.