---
title: "Concept drift"
chunk: 4/4
source: "https://en.wikipedia.org/wiki/Concept_drift"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T09:53:40.159460+00:00"
instance: "kb-cron"
---
==== Real ====
USP Data Stream Repository, 27 real-world stream datasets with concept drift compiled by Souza et al. (2020). Access
Airline, approximately 116 million flight arrival and departure records (cleaned and sorted) compiled by E. Ikonomovska. Reference: Data Expo 2009 Competition [1]. Access
Chess.com (online games) and Luxembourg (social survey) datasets compiled by I. Zliobaite. Access
ECUE spam, 2 datasets, each consisting of more than 10,000 emails collected by an individual over a period of approximately 2 years. Access from S.J.Delany webpage
Elec2, electricity demand, 2 classes, 45,312 instances. Reference: M. Harries, Splice-2 comparative evaluation: Electricity pricing, Technical report, The University of New South Wales, 1999. Access from J.Gama webpage. Comment on applicability.
PAKDD'09 competition data, representing a credit evaluation task. The data was collected over a five-year period, but the true labels were released only for the first part of the data. Access
Sensor stream and Power supply stream datasets are available from X. Zhu's Stream Data Mining Repository. Access
SMEAR, a benchmark data stream with many missing values: environmental observation data collected over 7 years. The task is to predict cloudiness. Access
Text mining, a collection of text mining datasets with concept drift, maintained by I. Katakis. Access
Gas Sensor Array Drift Dataset, a collection of 13,910 measurements from 16 chemical sensors, used for drift compensation in the task of discriminating 6 gases at various concentration levels. Access
==== Other ====
KDD'99 competition data contains simulated intrusions in a military network environment. It is often used as a benchmark for evaluating how well methods handle concept drift. Access
==== Synthetic ====
Extreme verification latency benchmark. Reference: Souza, V.M.A.; Silva, D.F.; Gama, J.; Batista, G.E.A.P.A. (2015). "Data Stream Classification Guided by Clustering on Nonstationary Environments and Extreme Verification Latency". Proceedings of the 2015 SIAM International Conference on Data Mining (SDM). SIAM. pp. 873–881. doi:10.1137/1.9781611974010.98. ISBN 978-1-61197-401-0. S2CID 19198944. Access from Nonstationary Environments Archive.
Sine, Line, Plane, Circle and Boolean Data Sets. Reference: Minku, L.L.; White, A.P.; Yao, X. (2010). "The Impact of Diversity on On-line Ensemble Learning in the Presence of Concept Drift" (PDF). IEEE Transactions on Knowledge and Data Engineering. 22 (5): 730–742. Bibcode:2010ITKDE..22..730M. doi:10.1109/TKDE.2009.156. S2CID 16592739. Access from L.Minku webpage.
SEA concepts. Reference: Street, N.W.; Kim, Y. (2001). "A streaming ensemble algorithm (SEA) for large-scale classification" (PDF). KDD'01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining. pp. 377–382. doi:10.1145/502512.502568. ISBN 978-1-58113-391-2. S2CID 11868540. Access from J.Gama webpage.
STAGGER. Reference: Schlimmer, J.C.; Granger, R.H. (1986). "Incremental Learning from Noisy Data". Mach. Learn. 1 (3): 317–354. doi:10.1007/BF00116895. S2CID 33776987.
Mixed. Reference: Gama, J.; Medas, P.; Castillo, G.; Rodrigues, P. (2004). "Learning with drift detection". Brazilian symposium on artificial intelligence. Springer. pp. 286–295. doi:10.1007/978-3-540-28645-5_29. ISBN 978-3-540-28645-5. S2CID 2606652.
==== Data generation frameworks ====
Minku, White & Yao 2010 Download from L.Minku webpage.
Lindstrom, P.; Delany, S.J.; MacNamee, B. (2008). "Autopilot: Simulating Changing Concepts in Real Data" (PDF). Proceedings of the 19th Irish Conference on Artificial Intelligence & Cognitive Science. pp. 272–263.
Narasimhamurthy, A.; Kuncheva, L.I. (2007). "A framework for generating data to simulate changing environments". AIAP'07: Proceedings of the 25th IASTED International Multi-Conference: artificial intelligence and applications. pp. 384–389. Code
=== Projects ===
INFER: Computational Intelligence Platform for Evolving and Robust Predictive Systems (2010–2014), Bournemouth University (UK), Evonik Industries (Germany), Research and Engineering Centre (Poland)
HaCDAIS: Handling Concept Drift in Adaptive Information Systems (2008–2012), Eindhoven University of Technology (the Netherlands)
KDUS: Knowledge Discovery from Ubiquitous Streams, INESC Porto and Laboratory of Artificial Intelligence and Decision Support (Portugal)
ADEPT: Adaptive Dynamic Ensemble Prediction Techniques, University of Manchester (UK), University of Bristol (UK)
ALADDIN: autonomous learning agents for decentralised data and information networks (2005–2010)
GAENARI: C++ incremental decision tree algorithm that aims to minimize the damage caused by concept drift (2022)
=== Benchmarks ===
NAB: The Numenta Anomaly Benchmark, a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications (2014–2018)
=== Meetings ===
2014
Special Session on "Concept Drift, Domain Adaptation & Learning in Dynamic Environments" @IEEE IJCNN 2014
2013
RealStream Real-World Challenges for Data Stream Mining Workshop-Discussion at the ECML PKDD 2013, Prague, Czech Republic.
LEAPS 2013 The 1st International Workshop on Learning stratEgies and dAta Processing in nonStationary environments
2011
LEE 2011 Special Session on Learning in evolving environments and its application on real-world problems at ICMLA'11
HaCDAIS 2011 The 2nd International Workshop on Handling Concept Drift in Adaptive Information Systems
ICAIS 2011 Track on Incremental Learning
IJCNN 2011 Special Session on Concept Drift and Learning Dynamic Environments
CIDUE 2011 Symposium on Computational Intelligence in Dynamic and Uncertain Environments
2010
HaCDAIS 2010 International Workshop on Handling Concept Drift in Adaptive Information Systems: Importance, Challenges and Solutions
ICMLA10 Special Session on Dynamic learning in non-stationary environments
SAC 2010 Data Streams Track at ACM Symposium on Applied Computing
SensorKDD 2010 International Workshop on Knowledge Discovery from Sensor Data
StreamKDD 2010 Novel Data Stream Pattern Mining Techniques
Concept Drift and Learning in Nonstationary Environments at IEEE World Congress on Computational Intelligence
MLMDS'2010 Special Session on Machine Learning Methods for Data Streams at the 10th International Conference on Intelligent Design and Applications, ISDA'10
== References ==