---
title: "Concept drift"
chunk: 4/4
source: "https://en.wikipedia.org/wiki/Concept_drift"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T09:53:40.159460+00:00"
instance: "kb-cron"
---

==== Real ====
USP Data Stream Repository, 27 real-world stream datasets with concept drift compiled by Souza et al. (2020).

Airline, approximately 116 million flight arrival and departure records (cleaned and sorted) compiled by E. Ikonomovska. Reference: Data Expo 2009 Competition [1].

Chess.com (online games) and Luxembourg (social survey) datasets compiled by I. Zliobaite.

ECUE spam, two datasets, each consisting of more than 10,000 emails collected by an individual over a period of approximately two years. Available from S.J. Delany's webpage.

Elec2, electricity demand, 2 classes, 45,312 instances. Reference: M. Harries, Splice-2 comparative evaluation: Electricity pricing, Technical report, The University of New South Wales, 1999. Available from J. Gama's webpage.

PAKDD'09 competition data, a credit evaluation task with data collected over a five-year period. The true labels were released only for the first part of the data.

Sensor stream and Power supply stream datasets, available from X. Zhu's Stream Data Mining Repository.

SMEAR, a benchmark data stream with many missing values. It contains environmental observation data collected over 7 years; the task is to predict cloudiness.

Text mining, a collection of text mining datasets with concept drift, maintained by I. Katakis.

Gas Sensor Array Drift Dataset, a collection of 13,910 measurements from 16 chemical sensors used for drift compensation in a discrimination task of 6 gases at various concentration levels.

==== Other ====
KDD'99 competition data, which contains simulated intrusions in a military network environment. It is often used as a benchmark for evaluating how well methods handle concept drift.

==== Synthetic ====
Extreme verification latency benchmark: Souza, V.M.A.; Silva, D.F.; Gama, J.; Batista, G.E.A.P.A. (2015). "Data Stream Classification Guided by Clustering on Nonstationary Environments and Extreme Verification Latency". Proceedings of the 2015 SIAM International Conference on Data Mining (SDM). SIAM. pp. 873–881. doi:10.1137/1.9781611974010.98. ISBN 978-1-61197-401-0. S2CID 19198944. Available from Nonstationary Environments – Archive.

Sine, Line, Plane, Circle and Boolean data sets: Minku, L.L.; White, A.P.; Yao, X. (2010). "The Impact of Diversity on On-line Ensemble Learning in the Presence of Concept Drift" (PDF). IEEE Transactions on Knowledge and Data Engineering. 22 (5): 730–742. Bibcode:2010ITKDE..22..730M. doi:10.1109/TKDE.2009.156. S2CID 16592739. Available from L. Minku's webpage.

SEA concepts: Street, N.W.; Kim, Y. (2001). "A streaming ensemble algorithm (SEA) for large-scale classification" (PDF). KDD'01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 377–382. doi:10.1145/502512.502568. ISBN 978-1-58113-391-2. S2CID 11868540. Available from J. Gama's webpage.

STAGGER: Schlimmer, J.C.; Granger, R.H. (1986). "Incremental Learning from Noisy Data". Mach. Learn. 1 (3): 317–354. doi:10.1007/BF00116895. S2CID 33776987.

Mixed: Gama, J.; Medas, P.; Castillo, G.; Rodrigues, P. (2004). "Learning with drift detection". Brazilian symposium on artificial intelligence. Springer. pp. 286–295. doi:10.1007/978-3-540-28645-5_29. ISBN 978-3-540-28645-5. S2CID 2606652.

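To illustrate what a synthetic drift dataset of this kind looks like, the following minimal Python sketch generates a stream in the style of the SEA concepts: three features drawn uniformly from [0, 10], a binary label that depends on whether the first two features sum to at most a threshold, and an abrupt change of threshold between concepts. The function name `sea_stream`, the threshold values, the block length, and the 10% noise rate are assumptions made for illustration; they follow the commonly described setup and should be checked against Street & Kim (2001) before being used as a benchmark.

```python
import random

# Assumed thresholds, one per concept (illustrative values; verify against
# the original paper before relying on them).
THRESHOLDS = (8.0, 9.0, 7.0, 9.5)


def sea_stream(n_per_concept=1000, noise=0.1, seed=0):
    """Yield (features, label, concept_index) with abrupt drift between concepts."""
    rng = random.Random(seed)
    for concept, theta in enumerate(THRESHOLDS):
        for _ in range(n_per_concept):
            # Three features, each uniform on [0, 10].
            x = tuple(rng.uniform(0.0, 10.0) for _ in range(3))
            # Only the first two features determine the label; the third
            # is irrelevant by construction.
            label = int(x[0] + x[1] <= theta)
            # Flip the label with probability `noise` to simulate class noise.
            if rng.random() < noise:
                label = 1 - label
            yield x, label, concept


if __name__ == "__main__":
    # Show how the share of positive labels shifts from one concept to the next.
    stream = list(sea_stream(n_per_concept=500, noise=0.0))
    for c in range(len(THRESHOLDS)):
        labels = [y for _, y, k in stream if k == c]
        print(c, round(sum(labels) / len(labels), 2))
```

Because the concept index is yielded alongside each instance, a drift detector's alarms can be compared with the true change points, which is the main practical advantage of synthetic streams over real ones.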
==== Data generation frameworks ====
Minku, White & Yao (2010). Download from L. Minku's webpage.

Lindstrom, P.; Delany, S.J.; MacNamee, B. (2008). "Autopilot: Simulating Changing Concepts in Real Data" (PDF). Proceedings of the 19th Irish Conference on Artificial Intelligence & Cognitive Science. pp. 272–263.

Narasimhamurthy, A.; Kuncheva, L.I. (2007). "A framework for generating data to simulate changing environments". AIAP'07: Proceedings of the 25th IASTED International Multi-Conference: artificial intelligence and applications. pp. 384–389.

=== Projects ===
INFER: Computational Intelligence Platform for Evolving and Robust Predictive Systems (2010–2014), Bournemouth University (UK), Evonik Industries (Germany), Research and Engineering Centre (Poland)

HaCDAIS: Handling Concept Drift in Adaptive Information Systems (2008–2012), Eindhoven University of Technology (the Netherlands)

KDUS: Knowledge Discovery from Ubiquitous Streams, INESC Porto and Laboratory of Artificial Intelligence and Decision Support (Portugal)

ADEPT: Adaptive Dynamic Ensemble Prediction Techniques, University of Manchester (UK), University of Bristol (UK)

ALADDIN: Autonomous Learning Agents for Decentralised Data and Information Networks (2005–2010)

GAENARI: a C++ incremental decision tree algorithm designed to minimize the damage caused by concept drift (2022)

=== Benchmarks ===
NAB: the Numenta Anomaly Benchmark, a benchmark for evaluating anomaly detection algorithms in streaming, real-time applications (2014–2018)

=== Meetings ===
2014

Special Session on "Concept Drift, Domain Adaptation & Learning in Dynamic Environments" at IEEE IJCNN 2014

2013

RealStream: Real-World Challenges for Data Stream Mining, Workshop-Discussion at ECML PKDD 2013, Prague, Czech Republic

LEAPS 2013: The 1st International Workshop on Learning stratEgies and dAta Processing in nonStationary environments

2011

LEE 2011: Special Session on Learning in evolving environments and its application on real-world problems at ICMLA'11

HaCDAIS 2011: The 2nd International Workshop on Handling Concept Drift in Adaptive Information Systems

ICAIS 2011: Track on Incremental Learning

IJCNN 2011: Special Session on Concept Drift and Learning Dynamic Environments

CIDUE 2011: Symposium on Computational Intelligence in Dynamic and Uncertain Environments

2010

HaCDAIS 2010: International Workshop on Handling Concept Drift in Adaptive Information Systems: Importance, Challenges and Solutions

ICMLA'10: Special Session on Dynamic learning in non-stationary environments

SAC 2010: Data Streams Track at ACM Symposium on Applied Computing

SensorKDD 2010: International Workshop on Knowledge Discovery from Sensor Data

StreamKDD 2010: Novel Data Stream Pattern Mining Techniques

Concept Drift and Learning in Nonstationary Environments at IEEE World Congress on Computational Intelligence

MLMDS'2010: Special Session on Machine Learning Methods for Data Streams at the 10th International Conference on Intelligent Design and Applications, ISDA'10

== References ==