diff --git a/_index.db b/_index.db
index 98a121505..591da7c85 100644
Binary files a/_index.db and b/_index.db differ
diff --git a/data/en.wikipedia.org/wiki/Data_mining-0.md b/data/en.wikipedia.org/wiki/Data_mining-0.md
new file mode 100644
index 000000000..2152d9db0
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Data_mining-0.md
@@ -0,0 +1,42 @@
+---
+title: "Data mining"
+chunk: 1/4
+source: "https://en.wikipedia.org/wiki/Data_mining"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:28.867669+00:00"
+instance: "kb-cron"
+---
+
+Data mining is the process of extracting and finding patterns in massive data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
+The term "data mining" is a misnomer because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself. It also is a buzzword and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support systems, including artificial intelligence (e.g., machine learning) and business intelligence. Often the more general terms (large scale) data analysis and analytics—or, when referring to actual methods, artificial intelligence and machine learning—are more appropriate.
+The actual data mining task is the semi-automatic or automatic analysis of massive quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, although they do belong to the overall KDD process as additional steps.
+The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data. In contrast, data mining uses machine learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.
+The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.
+
+== Etymology ==
+In the 1960s, statisticians and economists used terms like data fishing or data dredging to refer to what they considered the bad practice of analyzing data without an a priori hypothesis. The term "data mining" was used in a similarly critical way by economist Michael Lovell in an article published in the Review of Economic Studies in 1983. Lovell indicates that the practice "masquerades under a variety of aliases, ranging from 'experimentation' (positive) to 'fishing' or 'snooping' (negative)".
+The term data mining appeared around 1990 in the database community, with generally positive connotations. For a short time in the 1980s, the phrase "database mining"™ was used, but because it was trademarked by HNC, a San Diego–based company, to pitch their Database Mining Workstation, researchers turned to data mining instead. Other terms used include data archaeology, information harvesting, information discovery, and knowledge extraction. Gregory Piatetsky-Shapiro coined the term "knowledge discovery in databases" for the first workshop on the same topic (KDD-1989), and this term became more popular in the AI and machine learning communities. However, the term data mining became more popular in the business and press communities. Currently, the terms data mining and knowledge discovery are used interchangeably.
+
+== Background ==
+The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis (1800s). The proliferation, ubiquity and increasing power of computer technology have dramatically increased data collection, storage, and manipulation ability. As data sets have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, especially in the field of machine learning, such as neural networks, cluster analysis, genetic algorithms (1950s), decision trees and decision rules (1960s), and support vector machines (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns in large data sets. It bridges the gap from applied statistics and artificial intelligence (which usually provide the mathematical background) to database management by exploiting the way data is stored and indexed in databases to execute the actual learning and discovery algorithms more efficiently, allowing such methods to be applied to ever-larger data sets.
+
+== Process ==
+The knowledge discovery in databases (KDD) process is commonly defined with the stages:
+
+Selection
+Pre-processing
+Transformation
+Data mining
+Interpretation/evaluation.
+Many variations on this theme exist, however, such as the Cross-Industry Standard Process for Data Mining (CRISP-DM), which defines six phases:
+
+Business understanding
+Data understanding
+Data preparation
+Modeling
+Evaluation
+Deployment
+or a simplified process such as (1) Pre-processing, (2) Data Mining, and (3) Results Validation.
+Polls conducted in 2002, 2004, 2007 and 2014 show that the CRISP-DM methodology is the leading methodology used by data miners.
+The only other data mining standard named in these polls was SEMMA. However, 3–4 times as many people reported using CRISP-DM. Several teams of researchers have published reviews of data mining process models, and Azevedo and Santos conducted a comparison of CRISP-DM and SEMMA in 2008.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Data_mining-1.md b/data/en.wikipedia.org/wiki/Data_mining-1.md
new file mode 100644
index 000000000..57903c884
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Data_mining-1.md
@@ -0,0 +1,44 @@
+---
+title: "Data mining"
+chunk: 2/4
+source: "https://en.wikipedia.org/wiki/Data_mining"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:28.867669+00:00"
+instance: "kb-cron"
+---
+
+=== Pre-processing ===
+Before data mining algorithms can be used, a target data set must be assembled. As data mining can only uncover patterns actually present in the data, the target data set must be large enough to contain these patterns while remaining concise enough to be mined within an acceptable time limit. A common source for data is a data mart or data warehouse. Pre-processing is essential to analyze the multivariate data sets before data mining. The target set is then cleaned. Data cleaning removes the observations containing noise and those with missing data.
+
+=== Data mining ===
+Data mining involves six common classes of tasks:
+
+Anomaly detection (outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or of data errors that require further investigation because they fall outside the standard range.
+Association rule learning (dependency modeling) – Searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
+Clustering – is the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
+Classification – is the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
+Regression – attempts to find a function that models the data with the least error, that is, to estimate the relationships among data or datasets.
+Summarization – providing a more compact representation of the data set, including visualization and report generation.
+
+=== Results validation ===
+Data mining can unintentionally be misused, producing results that appear to be significant but that do not actually predict future behavior and cannot be reproduced on a new sample of data, and are therefore of little use. This is sometimes caused by investigating too many hypotheses and not performing proper statistical hypothesis testing. A simple version of this problem in machine learning is known as overfitting, but the same problem can arise at different phases of the process, and thus a train/test split (when applicable at all) may not be sufficient to prevent this from happening.
+The final step of knowledge discovery from data is to verify that the patterns produced by the data mining algorithms occur in the wider data set. Not all patterns found by the algorithms are necessarily valid. It is common for data mining algorithms to find patterns in the training set which are not present in the general data set. This is called overfitting. To overcome this, the evaluation uses a test set of data on which the data mining algorithm was not trained. The learned patterns are applied to this test set, and the resulting output is compared to the desired output. For example, a data mining algorithm trying to distinguish "spam" from "legitimate" e-mails would be trained on a training set of sample e-mails. Once trained, the learned patterns would be applied to the test set of e-mails on which it had not been trained. The accuracy of the patterns can then be measured from how many e-mails they correctly classify. Several statistical methods may be used to evaluate the algorithm, such as ROC curves.
+If the learned patterns do not meet the desired standards, it is necessary to re-evaluate and change the pre-processing and data mining steps. If the learned patterns do meet the desired standards, then the final step is to interpret the learned patterns and turn them into knowledge.
+
+== Research ==
+The premier professional body in the field is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining (SIGKDD). Since 1989, this ACM SIG has hosted an annual international conference and published its proceedings, and since 1999 it has published a biannual academic journal titled "SIGKDD Explorations".
+Computer science conferences on data mining include:
+
+CIKM Conference – ACM Conference on Information and Knowledge Management
+European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
+KDD Conference – ACM SIGKDD Conference on Knowledge Discovery and Data Mining
+Data mining topics are also present in many data management/database conferences such as the ICDE Conference, SIGMOD Conference and International Conference on Very Large Data Bases.
+
+== Standards ==
+There have been some efforts to define standards for the data mining process, for example, the 1999 European Cross-Industry Standard Process for Data Mining (CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Development on successors to these processes (CRISP-DM 2.0 and JDM 2.0) was active in 2006 but has stalled since. JDM 2.0 was withdrawn without reaching a final draft.
+For exchanging the extracted models—in particular for use in predictive analytics—the key standard is the Predictive Model Markup Language (PMML), which is an XML-based language developed by the Data Mining Group (DMG) and supported as exchange format by many data mining applications. As the name suggests, it only covers prediction models, a particular data mining task of high importance to business applications. However, extensions to cover (for example) subspace clustering have been proposed independently of the DMG.
+
+== Notable uses ==
+
+Data mining is used wherever there is digital data available. Notable examples of data mining can be found throughout business, medicine, science, finance, construction, and surveillance.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Data_mining-2.md b/data/en.wikipedia.org/wiki/Data_mining-2.md
new file mode 100644
index 000000000..047bf2938
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Data_mining-2.md
@@ -0,0 +1,41 @@
+---
+title: "Data mining"
+chunk: 3/4
+source: "https://en.wikipedia.org/wiki/Data_mining"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:28.867669+00:00"
+instance: "kb-cron"
+---
+
+== Privacy concerns and ethics ==
+While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to user behavior (ethical and otherwise).
+The ways in which data mining can be used can in some cases and contexts raise questions regarding privacy, legality, and ethics. In particular, data mining government or commercial data sets for national security or law enforcement purposes, such as in the Total Information Awareness Program or in ADVISE, has raised privacy concerns.
+Data mining requires data preparation, which can uncover information or patterns that compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. Data aggregation involves combining data (possibly from various sources) in a way that facilitates analysis (but that also might make identification of private, individual-level data deducible or otherwise apparent). The threat to an individual's privacy comes into play when the data, once compiled, enable the data miner, or anyone who has access to the newly compiled data set, to identify specific individuals, especially when the data were originally anonymous.
+Data may also be modified so as to become anonymous, so that individuals may not readily be identified. However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.
+The inadvertent revelation of personally identifiable information leading to the provider violates Fair Information Practices. This indiscretion can cause financial,
+emotional, or bodily harm to the indicated individual. In one instance of privacy violation, the patrons of Walgreens filed a lawsuit against the company in 2011 for selling
+prescription information to data mining companies who in turn provided the data
+to pharmaceutical companies.
+
+=== Situation in Europe ===
+Europe has rather strong privacy laws, and efforts are underway to further strengthen the rights of consumers. However, the U.S.–E.U. Safe Harbor Principles, developed between 1998 and 2000, currently effectively expose European users to privacy exploitation by U.S. companies. As a consequence of Edward Snowden's global surveillance disclosure, there has been increased discussion of revoking this agreement, in particular because the data will be fully exposed to the National Security Agency, and attempts to reach an agreement with the United States have failed.
+In the United Kingdom in particular there have been cases of corporations using data mining as a way to target certain groups of customers, forcing them to pay unfairly high prices. These groups tend to be people of lower socio-economic status who are not savvy to the ways they can be exploited in digital marketplaces.
+
+=== Situation in the United States ===
+In the United States, privacy concerns have been addressed by the US Congress via the passage of regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA). HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in Biotech Business Week, "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approaching a level of incomprehensibility to average individuals." This underscores the necessity for data anonymity in data aggregation and mining practices.
+U.S. information privacy legislation such as HIPAA and the Family Educational Rights and Privacy Act (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation.
+
+== Copyright law ==
+
+=== Situation in Europe ===
+
+==== European Union ====
+Even if there is no copyright in a dataset, the European Union recognises a Database right, so data mining becomes subject to intellectual property owners' rights that are protected by the Database Directive. Under European copyright database laws, the mining of in-copyright works (such as by web mining) without the permission of the copyright owner is permitted under Articles 3 and 4 of the 2019 Directive on Copyright in the Digital Single Market. A specific TDM exception for scientific research is described in Article 3, whereas a more general exception described in Article 4 only applies if the copyright holder has not opted out.
+The European Commission facilitated stakeholder discussion on text and data mining in 2013, under the title of Licences for Europe. The focus on licensing, rather than limitations and exceptions, as the solution to this legal issue led representatives of universities, researchers, libraries, civil society groups and open access publishers to leave the stakeholder dialogue in May 2013.
+
+==== United Kingdom ====
+On the recommendation of the Hargreaves review, the UK government amended its copyright law in 2014 to allow content mining as a limitation and exception. The UK was the second country in the world to do so after Japan, which introduced an exception in 2009 for data mining. However, due to the restriction of the Information Society Directive (2001), the UK exception only allows content mining for non-commercial purposes. UK copyright law also does not allow this provision to be overridden by contractual terms and conditions.
+
+==== Switzerland ====
+Since 2020, Switzerland has also regulated data mining, allowing it in the research field under certain conditions laid down by art. 24d of the Swiss Copyright Act. This new article entered into force on 1 April 2020.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Data_mining-3.md b/data/en.wikipedia.org/wiki/Data_mining-3.md
new file mode 100644
index 000000000..250f1d94b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Data_mining-3.md
@@ -0,0 +1,74 @@
+---
+title: "Data mining"
+chunk: 4/4
+source: "https://en.wikipedia.org/wiki/Data_mining"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:28.867669+00:00"
+instance: "kb-cron"
+---
+
+=== Situation in the United States ===
+US copyright law, and in particular its provision for fair use, upholds the legality of content mining in the United States and in other fair use countries such as Israel, Taiwan and South Korea. As content mining is transformative, that is, it does not supplant the original work, it is viewed as being lawful under fair use. For example, as part of the Google Book settlement the presiding judge on the case ruled that Google's digitization project of in-copyright books was lawful, in part because of the transformative uses that the digitization project displayed, one being text and data mining.
+
+== Software ==
+
+=== Free open-source data mining software and applications ===
+The following applications are available under free/open-source licenses. Public access to application source code is also available.
+
+Carrot2: Text and search results clustering framework.
+Chemicalize.org: A chemical structure miner and web search engine.
+ELKI: A university research project with advanced cluster analysis and outlier detection methods written in the Java language.
+GATE: a natural language processing and language engineering tool.
+KNIME: The Konstanz Information Miner, a user-friendly and comprehensive data analytics framework.
+Massive Online Analysis (MOA): a tool for real-time big data stream mining with concept drift, written in the Java programming language.
+MEPX: cross-platform tool for regression and classification problems based on a Genetic Programming variant.
+mlpack: a collection of ready-to-use machine learning algorithms written in the C++ language.
+NLTK (Natural Language Toolkit): A suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python language.
+OpenNN: Open neural networks library.
+Orange: A component-based data mining and machine learning software suite written in the Python language.
+PSPP: Data mining and statistics software under the GNU Project, similar to SPSS.
+R: A programming language and software environment for statistical computing, data mining, and graphics. It is part of the GNU Project.
+scikit-learn: An open-source machine learning library for the Python programming language.
+Torch: An open-source deep learning library for the Lua programming language and scientific computing framework with wide support for machine learning algorithms (development has mostly moved to the more widely used Python-based PyTorch).
+UIMA: The UIMA (Unstructured Information Management Architecture) is a component framework for analyzing unstructured content such as text, audio and video – originally developed by IBM.
+Weka: A suite of machine learning software applications written in the Java programming language.
+
+=== Proprietary data-mining software and applications ===
+The following applications are available under proprietary licenses.
+
+Angoss KnowledgeSTUDIO: data mining tool
+LIONsolver: an integrated software application for data mining, business intelligence, and modeling that implements the Learning and Intelligent OptimizatioN (LION) approach.
+PolyAnalyst: data and text mining software by Megaputer Intelligence.
+Microsoft Analysis Services: data mining software provided by Microsoft.
+NetOwl: suite of multilingual text and entity analytics products that enable data mining.
+Oracle Data Mining: data mining software by Oracle Corporation.
+PSeven: platform for automation of engineering simulation and analysis, multidisciplinary optimization and data mining provided by DATADVANCE.
+Qlucore Omics Explorer: data mining software.
+RapidMiner: An environment for machine learning and data mining experiments.
+SAS Enterprise Miner: data mining software provided by the SAS Institute.
+SPSS Modeler: data mining software provided by IBM.
+STATISTICA Data Miner: data mining software provided by StatSoft.
+Tanagra: Visualization-oriented data mining software, also for teaching.
+Vertica: data mining software provided by Hewlett-Packard.
+Google Cloud Platform: automated custom ML models managed by Google.
+Amazon SageMaker: managed service provided by Amazon for creating & productionising custom ML models.
+
+== See also ==
+Methods
+
+Application domains
+
+Application examples
+
+Related topics
+For more information about extracting information out of data (as opposed to analyzing data), see:
+
+Other resources
+International Journal of Data Warehousing and Mining
+
+== References ==
+
+== Further reading ==
+
+== External links ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Decision_theory-0.md b/data/en.wikipedia.org/wiki/Decision_theory-0.md
new file mode 100644
index 000000000..cee78f8de
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Decision_theory-0.md
@@ -0,0 +1,34 @@
+---
+title: "Decision theory"
+chunk: 1/2
+source: "https://en.wikipedia.org/wiki/Decision_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:30.080120+00:00"
+instance: "kb-cron"
+---
+
+Decision theory or the theory of rational choice is a branch of probability, economics, and analytic philosophy that uses expected utility and probability to model how individuals would behave rationally under uncertainty. It differs from the cognitive and behavioral sciences in that it is mainly prescriptive and concerned with identifying optimal decisions for a rational agent, rather than describing how people actually make decisions. Despite this, the field is important to the study of real human behavior by social scientists, as it lays the foundations to mathematically model and analyze individuals in fields such as sociology, economics, criminology, cognitive science, moral philosophy and political science.
+
+== History ==
+
+The roots of decision theory lie in probability theory, developed by Blaise Pascal and Pierre de Fermat in the 17th century, which was later refined by others like Christiaan Huygens. These developments provided a framework for understanding risk and uncertainty, which are central to decision-making.
+In the 18th century, Daniel Bernoulli introduced the concept of "expected utility" in the context of gambling, which was later formalized by John von Neumann and Oskar Morgenstern in the 1940s. Their work on Game Theory and Expected Utility Theory helped establish a rational basis for decision-making under uncertainty.
+After World War II, decision theory expanded into economics, particularly with the work of economists like Milton Friedman and others, who applied it to market behavior and consumer choice theory. This era also saw the development of Bayesian decision theory, which incorporates Bayesian probability into decision-making models.
+By the late 20th century, scholars like Daniel Kahneman and Amos Tversky challenged the assumptions of rational decision-making. Their work in behavioral economics highlighted cognitive biases and heuristics that influence real-world decisions, leading to the development of prospect theory, which modified expected utility theory by accounting for psychological factors.
+
+== Branches ==
+Normative decision theory is concerned with identification of optimal decisions where optimality is often determined by considering an ideal decision maker who is able to calculate with perfect accuracy and is in some sense fully rational. The practical application of this prescriptive approach (how people ought to make decisions) is called decision analysis and is aimed at finding tools, methodologies, and software (decision support systems) to help people make better decisions.
+In contrast, descriptive decision theory is concerned with describing observed behaviors often under the assumption that those making decisions are behaving under some consistent rules. These rules may, for instance, have a procedural framework (e.g. Amos Tversky's elimination by aspects model) or an axiomatic framework (e.g. stochastic transitivity axioms), reconciling the Von Neumann-Morgenstern axioms with behavioral violations of the expected utility hypothesis, or they may explicitly give a functional form for time-inconsistent utility functions (e.g. Laibson's quasi-hyperbolic discounting).
+Prescriptive decision theory is concerned with predictions about behavior that positive decision theory produces to allow for further tests of the kind of decision-making that occurs in practice. In recent decades, there has also been increasing interest in "behavioral decision theory", contributing to a re-evaluation of what useful decision-making requires.
+
+== Types of decisions ==
+
+=== Choice under uncertainty ===
+
+The area of choice under uncertainty represents the heart of decision theory. Known from the 17th century (Blaise Pascal invoked it in his famous wager, which is contained in his Pensées, published in 1670), the idea of expected value is that, when faced with a number of actions, each of which could give rise to more than one possible outcome with different probabilities, the rational procedure is to identify all possible outcomes, determine their values (positive or negative) and the probabilities that will result from each course of action, and multiply the two to give an "expected value", or the average expectation for an outcome; the action to be chosen should be the one that gives rise to the highest total expected value. In 1738, Daniel Bernoulli published an influential paper entitled Exposition of a New Theory on the Measurement of Risk, in which he uses the St. Petersburg paradox to show that expected value theory must be normatively wrong. He gives an example in which a Dutch merchant is trying to decide whether to insure a cargo being sent from Amsterdam to St. Petersburg in winter. In his solution, he defines a utility function and computes expected utility rather than expected financial value.
+In the 20th century, interest was reignited by Abraham Wald's 1939 paper pointing out that the two central procedures of sampling-distribution-based statistical-theory, namely hypothesis testing and parameter estimation, are special cases of the general decision problem. Wald's paper renewed and synthesized many concepts of statistical theory, including loss functions, risk functions, admissible decision rules, antecedent distributions, Bayesian procedures, and minimax procedures. The phrase "decision theory" itself was used in 1950 by E. L. Lehmann.
+The revival of subjective probability theory, from the work of Frank Ramsey, Bruno de Finetti, Leonard Savage and others, extended the scope of expected utility theory to situations where subjective probabilities can be used. At the time, von Neumann and Morgenstern's theory of expected utility proved that expected utility maximization followed from basic postulates about rational behavior.
+The work of Maurice Allais and Daniel Ellsberg showed that human behavior has systematic and sometimes important departures from expected-utility maximization (Allais paradox and Ellsberg paradox). The prospect theory of Daniel Kahneman and Amos Tversky renewed the empirical study of economic behavior with less emphasis on rationality presuppositions. It describes a way by which people make decisions when all of the outcomes carry a risk. Kahneman and Tversky found three regularities – in actual human decision-making, "losses loom larger than gains"; people focus more on changes in their utility-states than they focus on absolute utilities; and the estimation of subjective probabilities is severely biased by anchoring.
+
+=== Intertemporal choice ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Decision_theory-1.md b/data/en.wikipedia.org/wiki/Decision_theory-1.md
new file mode 100644
index 000000000..11adb44f4
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Decision_theory-1.md
@@ -0,0 +1,47 @@
+---
+title: "Decision theory"
+chunk: 2/2
+source: "https://en.wikipedia.org/wiki/Decision_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:30.080120+00:00"
+instance: "kb-cron"
+---
+
+Intertemporal choice is concerned with the kind of choice where different actions lead to outcomes that are realized at different stages over time. It is also described as cost-benefit decision making since it involves choices between rewards that vary according to magnitude and time of arrival. If someone received a windfall of several thousand dollars, they could spend it on an expensive holiday, giving them immediate pleasure, or they could invest it in a pension scheme, giving them an income at some time in the future. What is the optimal thing to do? The answer depends partly on factors such as the expected rates of interest and inflation, the person's life expectancy, and their confidence in the pensions industry. However, even with all those factors taken into account, human behavior again deviates greatly from the predictions of prescriptive decision theory, leading to alternative models in which, for example, objective interest rates are replaced by subjective discount rates.
+
+=== Interaction of decision makers ===
+
+Some decisions are difficult because of the need to take into account how other people in the situation will respond to the decision that is taken. The analysis of such social decisions is often treated under decision theory, though it involves mathematical methods. In the emerging field of socio-cognitive engineering, the research is especially focused on the different types of distributed decision-making in human organizations, in normal and abnormal/emergency/crisis situations.
+
+=== Complex decisions ===
+Other areas of decision theory are concerned with decisions that are difficult simply because of their complexity, or the complexity of the organization that has to make them. Individuals making decisions are limited in resources (i.e. time and intelligence) and are therefore boundedly rational; the issue is thus, more than the deviation between real and optimal behavior, the difficulty of determining the optimal behavior in the first place. Decisions are also affected by whether options are framed together or separately; this is known as the distinction bias.
+
+== Heuristics ==
+
+Heuristics are procedures for making a decision without working out the consequences of every option. Heuristics decrease the amount of evaluative thinking required for decisions, focusing on some aspects of the decision while ignoring others. While quicker than step-by-step processing, heuristic thinking is also more likely to involve fallacies or inaccuracies.
+One example of a common and erroneous thought process that arises through heuristic thinking is the gambler's fallacy: believing that an isolated random event is affected by previous isolated random events. For example, if flips of a fair coin give repeated tails, the coin still has the same probability (i.e., 0.5) of tails in future turns, though intuitively it might seem that heads becomes more likely. In the long run, heads and tails should occur equally often; people commit the gambler's fallacy when they use this heuristic to predict that a result of heads is "due" after a run of tails. Another example is that decision-makers may be biased towards preferring moderate alternatives to extreme ones. The compromise effect operates under a mindset that the most moderate option carries the most benefit. In an incomplete information scenario, as in most daily decisions, the moderate option will look more appealing than either extreme, independent of the context, based only on the fact that it has characteristics that can be found at either extreme.
+
+== Alternatives ==
+
+A highly controversial issue is whether one can replace the use of probability in decision theory with something else.
+
+=== Probability theory ===
+Advocates for the use of probability theory point to:
+
+the work of Richard Threlkeld Cox for justification of the probability axioms,
+the Dutch book paradoxes of Bruno de Finetti as illustrative of the theoretical difficulties that can arise from departures from the probability axioms, and
+the complete class theorems, which show that all admissible decision rules are equivalent to the Bayesian decision rule for some utility function and some prior distribution (or for the limit of a sequence of prior distributions). Thus, for every decision rule, either the rule may be reformulated as a Bayesian procedure (or a limit of a sequence of such), or there is a rule that is sometimes better and never worse.
+
+=== Alternatives to probability theory ===
+The proponents of fuzzy logic, possibility theory, Dempster–Shafer theory, and info-gap decision theory maintain that probability is only one of many alternatives and point to many examples where non-standard alternatives have been implemented with apparent success. Notably, probabilistic decision theory can sometimes be sensitive to assumptions about the probabilities of various events, whereas non-probabilistic rules, such as minimax, are robust in that they do not make such assumptions.
+
+=== Ludic fallacy ===
+
+A general criticism of decision theory based on a fixed universe of possibilities is that it considers the "known unknowns", not the "unknown unknowns": it focuses on expected variations, not on unforeseen events, which some argue have outsized impact and must be considered – significant events may be "outside model". This line of argument, called the ludic fallacy, is that there are inevitable imperfections in modeling the real world by particular models, and that unquestioning reliance on models blinds one to their limits.
+
+== See also ==
+
+== References ==
+
+== Further reading ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Exact_sciences-0.md b/data/en.wikipedia.org/wiki/Exact_sciences-0.md
new file mode 100644
index 000000000..75491c23b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Exact_sciences-0.md
@@ -0,0 +1,22 @@
+---
+title: "Exact sciences"
+chunk: 1/1
+source: "https://en.wikipedia.org/wiki/Exact_sciences"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:31.305081+00:00"
+instance: "kb-cron"
+---
+
+The exact sciences or quantitative sciences, sometimes called the exact mathematical sciences, are those sciences "which admit of absolute precision in their results"; especially the mathematical sciences. Examples of the exact sciences are mathematics, optics, astronomy, and physics, which many philosophers from René Descartes, Gottfried Leibniz, and Immanuel Kant to the logical positivists took as paradigms of rational and objective knowledge. These sciences have been practiced in many cultures from antiquity to modern times. Given their ties to mathematics, the exact sciences are characterized by accurate quantitative expression, precise predictions and/or rigorous methods of testing hypotheses involving quantifiable predictions and measurements.
+The distinction between the quantitative exact sciences and those sciences that deal with the causes of things is due to Aristotle, who distinguished mathematics from natural philosophy and considered the exact sciences to be the "more natural of the branches of mathematics." Thomas Aquinas employed this distinction when he said that astronomy explains the spherical shape of the Earth by mathematical reasoning while physics explains it by material causes. This distinction was widely, but not universally, accepted until the Scientific Revolution of the 17th century. Edward Grant has proposed that a fundamental change leading to the new sciences was the unification of the exact sciences and physics by Johannes Kepler, Isaac Newton, and others, which resulted in a quantitative investigation of the physical causes of natural phenomena.
+
+
+== See also ==
+
+Hard and soft science
+Fundamental science
+Demarcation problem
+
+
+== References ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-0.md b/data/en.wikipedia.org/wiki/Game_theory-0.md
new file mode 100644
index 000000000..64bfcdeff
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-0.md
@@ -0,0 +1,27 @@
+---
+title: "Game theory"
+chunk: 1/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+Game theory is the study of mathematical models of strategic interactions. It has applications in many fields of social science, and is used extensively in economics, logic, systems science and computer science. Initially, game theory addressed two-person zero-sum games, in which a participant's gains or losses are exactly balanced by the losses and gains of the other participant. In the 1950s, it was extended to the study of non-zero-sum games, and was eventually applied to a wide range of behavioral relations. It is now an umbrella term for the science of rational decision making in humans, animals, and computers.
+Modern game theory began with the idea of mixed-strategy equilibria in two-person zero-sum games and its proof by John von Neumann. Von Neumann's original proof used the Brouwer fixed-point theorem on continuous mappings into compact convex sets, which became a standard method in game theory and mathematical economics. His paper was followed by Theory of Games and Economic Behavior (1944), co-written with Oskar Morgenstern, which considered cooperative games of several players. The second edition provided an axiomatic theory of expected utility, which allowed mathematical statisticians and economists to treat decision-making under uncertainty.
+Game theory was developed extensively in the 1950s, and was explicitly applied to evolution in the 1970s, although similar developments go back at least as far as the 1930s. Game theory has been widely recognized as an important tool in many fields. John Maynard Smith was awarded the Crafoord Prize for his application of evolutionary game theory in 1999, and fifteen game theorists have won the Nobel Prize in economics as of 2020, including most recently Paul Milgrom and Robert B. Wilson.
+
+== History ==
+Discussions on the mathematics of games began long before the rise of modern, mathematical game theory. Cardano wrote on games of chance in Liber de ludo aleae (Book on Games of Chance), written around 1564 but published posthumously in 1663. Influenced by the work of Fermat and Pascal on the problem of points, Huygens developed the concept of expectation on reasoning about the structure of games of chance, publishing his gambling calculus in De ratiociniis in ludo aleæ (On Reasoning in Games of Chance) in 1657.
+In 1713, a letter attributed to Charles Waldegrave, an active Jacobite and uncle to British diplomat James Waldegrave, analyzed a game called "le her". Waldegrave provided a minimax mixed strategy solution to a two-person version of the card game, and the problem is now known as the Waldegrave problem.
+In 1838, Antoine Augustin Cournot provided a model of competition in oligopolies. Though he did not refer to it as such, he presented a solution that is the Nash equilibrium of the game in his Recherches sur les principes mathématiques de la théorie des richesses (Researches into the Mathematical Principles of the Theory of Wealth). In 1883, Joseph Bertrand critiqued Cournot's model as unrealistic, providing an alternative model of price competition which would later be formalized by Francis Ysidro Edgeworth.
+In 1913, Ernst Zermelo published Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels (On an Application of Set Theory to the Theory of the Game of Chess), which proved that the optimal chess strategy is strictly determined.
+
+=== Foundation ===
+
+The work of John von Neumann established game theory as its own independent field in the early-to-mid 20th century, with von Neumann publishing his paper On the Theory of Games of Strategy in 1928. Von Neumann's original proof used Brouwer's fixed-point theorem on continuous mappings into compact convex sets, which became a standard method in game theory and mathematical economics. Von Neumann's work in game theory culminated in his 1944 book Theory of Games and Economic Behavior, co-authored with Oskar Morgenstern. The second edition of this book provided an axiomatic theory of utility, which reincarnated Daniel Bernoulli's old theory of utility (of money) as an independent discipline. This foundational work contains the method for finding mutually consistent solutions for two-person zero-sum games. Subsequent work focused primarily on cooperative game theory, which analyzes optimal strategies for groups of individuals, presuming that they can enforce agreements between them about proper strategies.
+In his 1938 book Applications aux Jeux de Hasard and earlier notes, Émile Borel proved a minimax theorem for two-person zero-sum matrix games only when the pay-off matrix is symmetric and provided a solution to a non-trivial infinite game (known in English as Blotto game). Borel conjectured the non-existence of mixed-strategy equilibria in finite two-person zero-sum games, a conjecture that was proved false by von Neumann.
+
+In 1950, John Nash developed a criterion for mutual consistency of players' strategies known as the Nash equilibrium, applicable to a wider variety of games than the criterion proposed by von Neumann and Morgenstern. Nash proved that every finite n-player, non-zero-sum (not just two-player zero-sum) non-cooperative game has what is now known as a Nash equilibrium in mixed strategies.
+Game theory experienced a flurry of activity in the 1950s, during which the concepts of the core, the extensive form game, fictitious play, repeated games, and the Shapley value were developed. The 1950s also saw the first applications of game theory to philosophy and political science. The first mathematical discussion of the prisoner's dilemma appeared, and an experiment was undertaken by mathematicians Merrill M. Flood and Melvin Dresher, as part of the RAND Corporation's investigations into game theory. RAND pursued the studies because of possible applications to global nuclear strategy.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-1.md b/data/en.wikipedia.org/wiki/Game_theory-1.md
new file mode 100644
index 000000000..288588fcf
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-1.md
@@ -0,0 +1,38 @@
+---
+title: "Game theory"
+chunk: 2/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+==== Prize-winning achievements ====
+In 1965, Reinhard Selten introduced his solution concept of subgame perfect equilibria, which further refined the Nash equilibrium. Later he would introduce trembling hand perfection as well. In 1994 Nash, Selten and Harsanyi became Economics Nobel Laureates for their contributions to economic game theory.
+In the 1970s, game theory was extensively applied in biology, largely as a result of the work of John Maynard Smith and his evolutionarily stable strategy. In addition, the concepts of correlated equilibrium, trembling hand perfection and common knowledge were introduced and analyzed.
+In 1994, John Nash was awarded the Nobel Memorial Prize in the Economic Sciences for his contribution to game theory. Nash's most famous contribution to game theory is the concept of the Nash equilibrium, which is a solution concept for non-cooperative games, published in 1951. A Nash equilibrium is a set of strategies, one for each player, such that no player can improve their payoff by unilaterally changing their strategy.
+In 2005, game theorists Thomas Schelling and Robert Aumann followed Nash, Selten, and Harsanyi as Nobel Laureates. Schelling worked on dynamic models, early examples of evolutionary game theory. Aumann contributed more to the equilibrium school, introducing equilibrium coarsening and correlated equilibria, and developing an extensive formal analysis of the assumption of common knowledge and of its consequences.
+In 2007, Leonid Hurwicz, Eric Maskin, and Roger Myerson were awarded the Nobel Prize in Economics "for having laid the foundations of mechanism design theory". Myerson's contributions include the notion of proper equilibrium, and an important graduate text: Game Theory, Analysis of Conflict. Hurwicz introduced and formalized the concept of incentive compatibility.
+In 2012, Alvin E. Roth and Lloyd S. Shapley were awarded the Nobel Prize in Economics "for the theory of stable allocations and the practice of market design". In 2014, the Nobel went to game theorist Jean Tirole.
+
+== Different types of games ==
+
+=== Cooperative / non-cooperative ===
+
+A game is cooperative if the players are able to form binding commitments that are externally enforced (e.g. through contract law). A game is non-cooperative if players cannot form alliances or if all agreements need to be self-enforcing (e.g. through credible threats).
+Cooperative games are often analyzed through the framework of cooperative game theory, which focuses on predicting which coalitions will form, the joint actions that groups take, and the resulting collective payoffs. It is different from non-cooperative game theory which focuses on predicting individual players' actions and payoffs by analyzing Nash equilibria.
+Cooperative game theory provides a high-level approach as it describes only the structure and payoffs of coalitions, whereas non-cooperative game theory also looks at how strategic interaction will affect the distribution of payoffs. As non-cooperative game theory is more general, cooperative games can be analyzed through the approach of non-cooperative game theory (the converse does not hold) provided that sufficient assumptions are made to encompass all the possible strategies available to players due to the possibility of external enforcement of cooperation.
+
+=== Symmetric / asymmetric ===
+
+A symmetric game is a game where each player earns the same payoff when making the same choice. In other words, the identity of the player does not change the resulting game facing the other player. Many of the commonly studied 2×2 games are symmetric. The standard representations of chicken, the prisoner's dilemma, and the stag hunt are all symmetric games.
+The most commonly studied asymmetric games are games where there are not identical strategy sets for both players. For instance, the ultimatum game and similarly the dictator game have different strategies for each player. It is possible, however, for a game to have identical strategies for both players, yet be asymmetric. For example, the game pictured in this section's graphic is asymmetric despite having identical strategy sets for both players.
+
+=== Zero-sum / non-zero-sum ===
+
+Zero-sum games (more generally, constant-sum games) are games in which choices by players can neither increase nor decrease the available resources. In zero-sum games, the total benefit to all players in a game, for every combination of strategies, always adds to zero (more informally, a player benefits only at the equal expense of others). Poker exemplifies a zero-sum game (ignoring the possibility of the house's cut), because one wins exactly the amount one's opponents lose. Other zero-sum games include matching pennies and most classical board games including Go and chess.
+Many games studied by game theorists (including the famed prisoner's dilemma) are non-zero-sum games, because the outcome has net results greater or less than zero. Informally, in non-zero-sum games, a gain by one player does not necessarily correspond with a loss by another.
+Furthermore, constant-sum games correspond to activities like theft and gambling, but not to the fundamental economic situation in which there are potential gains from trade. It is possible to transform any constant-sum game into a (possibly asymmetric) zero-sum game by adding a dummy player (often called "the board") whose losses compensate the players' net winnings.
+
+=== Simultaneous / sequential ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-10.md b/data/en.wikipedia.org/wiki/Game_theory-10.md
new file mode 100644
index 000000000..01406d378
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-10.md
@@ -0,0 +1,33 @@
+---
+title: "Game theory"
+chunk: 11/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+=== Trust game ===
+The Trust Game is an experiment designed to measure trust in economic decisions. It is also called "the investment game" and is designed to investigate trust and demonstrate its importance rather than "rationality" of self-interest. The game was designed by Joyce Berg, John Dickhaut and Kevin McCabe in 1995.
+In the game, one player (the investor) is given a sum of money and must decide how much of it to give to another player (the trustee). The amount given is then tripled by the experimenter. The trustee then decides how much of the tripled amount to return to the investor. If the trustee is completely self-interested, then they would return nothing. However, experiments have shown that this isn't the expected behavior of the trustee. The outcome instead suggests that people are willing to place trust, by risking some amount of money, in the belief that there will be reciprocity.
+
+=== Cournot Competition ===
+
+The Cournot competition model involves players independently and simultaneously choosing the quantity of a homogeneous product to produce, where marginal cost can be different for each firm and the firm's payoff is profit. The production costs are public information, and each firm aims to find its profit-maximizing quantity based on what it believes the other firm will produce, behaving like a monopolist with respect to the remaining demand. In this game firms want to produce at the monopoly quantity, but there is a high incentive to deviate and produce more, which decreases the market-clearing price. For example, firms may be tempted to deviate from the monopoly quantity if there is a low monopoly quantity and high price, with the aim of increasing production to maximize profit. However, this option does not provide the highest payoff, as a firm's ability to maximize profits depends on its market share and the elasticity of the market demand. The Cournot equilibrium is reached when each firm operates on its reaction function with no incentive to deviate, as it is playing the best response to the other firms' output. Within the game, firms reach the Nash equilibrium when the Cournot equilibrium is achieved.
+
+=== Bertrand Competition ===
+
+The Bertrand competition model assumes homogeneous products and a constant marginal cost, with players choosing prices. The equilibrium of price competition is where the price is equal to marginal cost, assuming complete information about the competitors' costs. At any higher price, firms have an incentive to undercut one another, because a homogeneous product with a lower price will gain all of the market share.
+
+== In popular culture ==
+Based on the 1998 book by Sylvia Nasar, the life story of game theorist and mathematician John Nash was turned into the 2001 biopic A Beautiful Mind, starring Russell Crowe as Nash.
+The 1959 military science fiction novel Starship Troopers by Robert A. Heinlein mentioned "games theory" and "theory of games". In the 1997 film of the same name, the character Carl Jenkins referred to his military intelligence assignment as being assigned to "games and theory".
+The 1964 film Dr. Strangelove satirizes game theoretic ideas about deterrence theory. For example, nuclear deterrence depends on the threat to retaliate catastrophically if a nuclear attack is detected. A game theorist might argue that such threats can fail to be credible, in the sense that they can lead to subgame imperfect equilibria. The movie takes this idea one step further, with the Soviet Union irrevocably committing to a catastrophic nuclear response without making the threat public.
+The 1980s power pop band Game Theory was founded by singer/songwriter Scott Miller, who described the band's name as alluding to "the study of calculating the most appropriate action given an adversary ... to give yourself the minimum amount of failure".
+Liar Game, a 2005 Japanese manga and 2007 television series, presents the main characters in each episode with a game or problem that is typically drawn from game theory, as demonstrated by the strategies applied by the characters.
+The 1974 novel Spy Story by Len Deighton explores elements of game theory in regard to cold war army exercises.
+The 2008 novel The Dark Forest by Liu Cixin explores the relationship between extraterrestrial life, humanity, and game theory.
+The Joker, the main antagonist in the 2008 film The Dark Knight, presents game theory concepts, notably the prisoner's dilemma, in a scene where he asks passengers on two different ferries to bomb the other one to save their own.
+In the 2018 film Crazy Rich Asians, the female lead Rachel Chu is a professor of economics and game theory at New York University. At the beginning of the film she is seen in her NYU classroom playing a game of poker with her teaching assistant and wins the game by bluffing; then in the climax of the film, she plays a game of mahjong with her boyfriend's disapproving mother Eleanor, losing the game to Eleanor on purpose but winning her approval as a result.
+In the 2017 film Molly's Game, Brad, an inexperienced poker player, makes an irrational betting decision without realizing it and causes his opponent Harlan to deviate from his Nash equilibrium strategy, resulting in a significant loss when Harlan loses the hand.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-11.md b/data/en.wikipedia.org/wiki/Game_theory-11.md
new file mode 100644
index 000000000..73aa9c67e
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-11.md
@@ -0,0 +1,79 @@
+---
+title: "Game theory"
+chunk: 12/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+== See also ==
+Applied ethics – Practical application of moral considerations
+Bandwidth-sharing game – Type of resource allocation game
+Chainstore paradox – Game theory paradox
+Collective intentionality – Social concept in philosophy of mind
+Core (game theory) – Set in game theory
+Glossary of game theory
+Intra-household bargaining
+Kingmaker scenario – Endgame situation in game theory
+Law and economics – Analysis of law using economic theory
+Mutual assured destruction – Doctrine of military strategy
+Outline of artificial intelligence
+Parrondo's paradox – Paradox of combining strategies
+Precautionary principle – Risk management strategy
+Quantum refereed game
+Risk management – Identification, evaluation and control of risks
+Self-confirming equilibrium – Aspect of game theory
+Tragedy of the commons – Overuse of a shared resource
+Traveler's dilemma – Non-zero-sum game thought experiment
+Wilson doctrine (economics) – Argument in economic theory
+Compositional game theory
+Lists
+
+List of cognitive biases
+List of emerging technologies
+List of games in game theory
+
+== Notes ==
+
+== References ==
+
+== Further reading ==
+
+Ben-David, S.; Borodin, A.; Karp, R.; Tardos, G.; Wigderson, A. (January 1994). "On the power of randomization in on-line algorithms". Algorithmica. 11 (1): 2–14. doi:10.1007/BF01294260. S2CID 26771869.
+Downs, Anthony (1957), An Economic theory of Democracy, New York: Harper
+Fisher, Sir Ronald Aylmer (1930). The Genetical Theory of Natural Selection. Clarendon Press.
+Gauthier, David (1986), Morals by agreement, Oxford University Press, ISBN 978-0-19-824992-4
+Grim, Patrick; Kokalis, Trina; Alai-Tafti, Ali; Kilb, Nicholas; St Denis, Paul (2004), "Making meaning happen", Journal of Experimental & Theoretical Artificial Intelligence, 16 (4): 209–243, Bibcode:2004JETAI..16..209G, doi:10.1080/09528130412331294715, S2CID 5737352
+Harper, David; Maynard Smith, John (2003), Animal signals, Oxford University Press, ISBN 978-0-19-852685-8
+Howard, Nigel (1971), Paradoxes of Rationality: Games, Metagames, and Political Behavior, Cambridge, MA: The MIT Press, ISBN 978-0-262-58237-7
+Kavka, Gregory S. (1986). Hobbesian Moral and Political Theory. Princeton University Press. ISBN 978-0-691-02765-4.
+Lewis, David (1969), Convention: A Philosophical Study, ISBN 978-0-631-23257-5 (2002 edition)
+Maynard Smith, John; Price, George R. (1973), "The logic of animal conflict", Nature, 246 (5427): 15–18, Bibcode:1973Natur.246...15S, doi:10.1038/246015a0, S2CID 4224989
+Osborne, Martin J.; Rubinstein, Ariel (1994), A course in game theory, MIT Press, ISBN 978-0-262-65040-3. A modern introduction at the graduate level.
+Poundstone, William (1993). Prisoner's Dilemma (1st Anchor Books ed.). New York: Anchor. ISBN 0-385-41580-X.
+Quine, W.v.O (1967), "Truth by Convention", Philosophical Essays for A.N. Whitehead, Russell and Russell Publishers, ISBN 978-0-8462-0970-6
+Quine, W.v.O (1960), "Carnap and Logical Truth", Synthese, 12 (4): 350–374, doi:10.1007/BF00485423, S2CID 46979744
+Skyrms, Brian (1996), Evolution of the social contract, Cambridge University Press, ISBN 978-0-521-55583-8
+Skyrms, Brian (2004), The stag hunt and the evolution of social structure, Cambridge University Press, ISBN 978-0-521-53392-8
+Sober, Elliott; Wilson, David Sloan (1998), Unto others: the evolution and psychology of unselfish behavior, Harvard University Press, ISBN 978-0-674-93047-6
+Webb, James N. (2007), Game theory: decisions, interaction and evolution, Undergraduate mathematics, Springer, ISBN 978-1-84628-423-6 Consistent treatment of game types usually claimed by different applied fields, e.g. Markov decision processes.
+
+=== Textbooks and general literature ===
+Aumann, Robert J (1987), "game theory", The New Palgrave: A Dictionary of Economics, vol. 2, pp. 460–82.
+Camerer, Colin (2003), "Introduction", Behavioral Game Theory: Experiments in Strategic Interaction, Russell Sage Foundation, pp. 1–25, ISBN 978-0-691-09039-9, archived from the original on 14 May 2011, retrieved 9 February 2011, Description.
+Dutta, Prajit K. (1999), Strategies and games: theory and practice, MIT Press, ISBN 978-0-262-04169-0. Suitable for undergraduate and business students.
+Fernandez, L F.; Bierman, H S. (1998), Game theory with economic applications, Addison-Wesley, ISBN 978-0-201-84758-1. Suitable for upper-level undergraduates.
+Gaffal, Margit; Padilla Gálvez, Jesús (2014). Dynamics of Rational Negotiation: Game Theory, Language Games and Forms of Life. Springer.
+Gibbons, Robert D. (1992), Game theory for applied economists, Princeton University Press, ISBN 978-0-691-00395-5. Suitable for advanced undergraduates.
+Published in Europe as Gibbons, Robert (2001), A Primer in Game Theory, London: Harvester Wheatsheaf, ISBN 978-0-7450-1159-2.
+Gintis, Herbert (2000), Game theory evolving: a problem-centered introduction to modeling strategic behavior, Princeton University Press, ISBN 978-0-691-00943-8
+Green, Jerry R.; Mas-Colell, Andreu; Whinston, Michael D. (1995), Microeconomic theory, Oxford University Press, ISBN 978-0-19-507340-9. Presents game theory in formal way suitable for graduate level.
+Joseph E. Harrington (2008) Games, strategies, and decision making, Worth, ISBN 0-7167-6630-2. Textbook suitable for undergraduates in applied fields; numerous examples, fewer formalisms in concept presentation.
+Isaacs, Rufus (1999), Differential Games: A Mathematical Theory With Applications to Warfare and Pursuit, Control and Optimization, New York: Dover Publications, ISBN 978-0-486-40682-4
+Michael Maschler; Eilon Solan; Shmuel Zamir (2013), Game Theory, Cambridge University Press, ISBN 978-1-108-49345-1. Undergraduate textbook.
+Miller, James H. (2003), Game theory at work: how to use game theory to outthink and outmaneuver your competition, New York: McGraw-Hill, ISBN 978-0-07-140020-6. Suitable for a general audience.
+Shoham, Yoav; Leyton-Brown, Kevin (2009), Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, New York: Cambridge University Press, ISBN 978-0-521-89943-7, retrieved 8 March 2016
+Watson, Joel (2013), Strategy: An Introduction to Game Theory (3rd edition), New York: W.W. Norton and Co., ISBN 978-0-393-91838-0. A leading textbook at the advanced undergraduate level.
+McCain, Roger A. (2010). Game Theory: A Nontechnical Introduction to the Analysis of Strategy. World Scientific. ISBN 978-981-4289-65-8.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-12.md b/data/en.wikipedia.org/wiki/Game_theory-12.md
new file mode 100644
index 000000000..a4a22c81d
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-12.md
@@ -0,0 +1,66 @@
+---
+title: "Game theory"
+chunk: 13/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+=== Historically important texts ===
+Aumann, R. J.; Shapley, L. S. (1974), Values of Non-Atomic Games, Princeton University Press
+Cournot, A. Augustin (1838), "Recherches sur les principes mathématiques de la théorie des richesses", Libraire des Sciences Politiques et Sociales
+Edgeworth, Francis Y. (1881), Mathematical Psychics, London: Kegan Paul
+Farquharson, Robin (1969), Theory of Voting, Blackwell (Yale U.P. in the U.S.), ISBN 978-0-631-12460-3
+Luce, R. Duncan; Raiffa, Howard (1957), Games and decisions: introduction and critical survey, New York: Wiley
+reprinted edition: R. Duncan Luce; Howard Raiffa (1989), Games and decisions: introduction and critical survey, New York: Dover Publications, ISBN 978-0-486-65943-5
+Maynard Smith, John (1982), Evolution and the theory of games, Cambridge University Press, ISBN 978-0-521-28884-2
+Nash, John (1950), "Equilibrium points in n-person games", Proceedings of the National Academy of Sciences of the United States of America, 36 (1): 48–49, Bibcode:1950PNAS...36...48N, doi:10.1073/pnas.36.1.48, PMC 1063129, PMID 16588946
+Shapley, L.S. (1953), A Value for n-person Games, In: Contributions to the Theory of Games volume II, H. W. Kuhn and A. W. Tucker (eds.)
+Shapley, L. S. (October 1953). "Stochastic Games". Proceedings of the National Academy of Sciences. 39 (10): 1095–1100. Bibcode:1953PNAS...39.1095S. doi:10.1073/pnas.39.10.1095. PMC 1063912. PMID 16589380.
+von Neumann, John (1928), "Zur Theorie der Gesellschaftsspiele", Mathematische Annalen, 100 (1): 295–320, Bibcode:1928MatAn.100..295V, doi:10.1007/bf01448847, S2CID 122961988 English translation: "On the Theory of Games of Strategy," in A. W. Tucker and R. D. Luce, ed. (1959), Contributions to the Theory of Games, v. 4, p. 42. Princeton University Press.
+von Neumann, John; Morgenstern, Oskar (1944), "Theory of games and economic behavior", Nature, 157 (3981), Princeton University Press: 172, Bibcode:1946Natur.157..172R, doi:10.1038/157172a0, S2CID 29754824
+Zermelo, Ernst (1913), "Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels", Proceedings of the Fifth International Congress of Mathematicians, 2: 501–4
+
+=== Other material ===
+Allan Gibbard, "Manipulation of voting schemes: a general result", Econometrica, Vol. 41, No. 4 (1973), pp. 587–601.
+McDonald, John (1950–1996), Strategy in Poker, Business & War, W. W. Norton, ISBN 978-0-393-31457-1. A layman's introduction.
+Papayoanou, Paul (2010), Game Theory for Business: A Primer in Strategic Gaming, Probabilistic, ISBN 978-0-9647938-7-3.
+Satterthwaite, Mark Allen (April 1975). "Strategy-proofness and Arrow's conditions: Existence and correspondence theorems for voting procedures and social welfare functions" (PDF). Journal of Economic Theory. 10 (2): 187–217. doi:10.1016/0022-0531(75)90050-2.
+Siegfried, Tom (2006), A Beautiful Math, Joseph Henry Press, ISBN 978-0-309-10192-9
+Skyrms, Brian (1990), The Dynamics of Rational Deliberation, Harvard University Press, ISBN 978-0-674-21885-7
+Thrall, Robert M.; Lucas, William F. (1963), "$n$-person games in partition function form", Naval Research Logistics Quarterly, 10 (4): 281–298, doi:10.1002/nav.3800100126
+Dolev, Shlomi; Panagopoulou, Panagiota N.; Rabie, Mikaël; Schiller, Elad M.; Spirakis, Paul G. (2011). "Rationality authority for provable rational behavior". Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing. pp. 289–290. doi:10.1145/1993806.1993858. ISBN 978-1-4503-0719-2.
+Chastain, Erick; Livnat, Adi; Papadimitriou, Christos; Vazirani, Umesh (June 2014), "Algorithms, games, and evolution", Proceedings of the National Academy of Sciences of the United States of America, 111 (29): 10620–10623, Bibcode:2014PNAS..11110620C, doi:10.1073/pnas.1406556111, PMC 4115542, PMID 24979793
+
+== External links ==
+
+James Miller (2015): Introductory Game Theory Videos.
+"Games, theory of", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
+Paul Walker: History of Game Theory Page.
+David Levine: Game Theory. Papers, Lecture Notes and much more stuff.
+Alvin Roth:"Game Theory and Experimental Economics page". Archived from the original on 15 August 2000. Retrieved 13 September 2003.  — Comprehensive list of links to game theory information on the Web
+Adam Kalai: Game Theory and Computer Science — Lecture notes on Game Theory and Computer Science
+Mike Shor: GameTheory.net — Lecture notes, interactive illustrations and other information.
+Jim Ratliff's Graduate Course in Game Theory Archived 29 March 2010 at the Wayback Machine (lecture notes).
+Don Ross: Review Of Game Theory in the Stanford Encyclopedia of Philosophy.
+Bruno Verbeek and Christopher Morris: Game Theory and Ethics
+Elmer G. Wiens: Game Theory — Introduction, worked examples, play online two-person zero-sum games.
+Marek M. Kaminski: Game Theory and Politics Archived 20 October 2006 at the Wayback Machine — Syllabuses and lecture notes for game theory and political science.
+Websites on game theory and social interactions
+Kesten Green's Conflict Forecasting at the Wayback Machine (archived 11 April 2011) — See Papers for evidence on the accuracy of forecasts from game theory and other methods Archived 15 September 2019 at the Wayback Machine.
+McKelvey, Richard D., McLennan, Andrew M., and Turocy, Theodore L. (2007) Gambit: Software Tools for Game Theory.
+Benjamin Polak: Open Course on Game Theory at Yale Archived 3 August 2010 at the Wayback Machine videos of the course
+Benjamin Moritz, Bernhard Könsgen, Danny Bures, Ronni Wiersch, (2007) Spieltheorie-Software.de: An application for Game Theory implemented in JAVA.
+Antonin Kucera: Stochastic Two-Player Games.
+Yu-Chi Ho: What is Mathematical Game Theory; What is Mathematical Game Theory (#2); What is Mathematical Game Theory (#3); What is Mathematical Game Theory (#4)-Many person game theory; What is Mathematical Game Theory ?( #5) – Finale, summing up, and my own view
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-2.md b/data/en.wikipedia.org/wiki/Game_theory-2.md
new file mode 100644
index 000000000..7ad8c7dbb
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-2.md
@@ -0,0 +1,35 @@
+---
+title: "Game theory"
+chunk: 3/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+Simultaneous games are games where both players move simultaneously, or where the later players are unaware of the earlier players' actions (making the moves effectively simultaneous). Sequential games (a type of dynamic game) are games where players do not make decisions simultaneously, and players' earlier actions affect the outcomes and decisions of other players. Later players need not have perfect information about every action of earlier players; they might have very little knowledge. For instance, a player may know that an earlier player did not perform one particular action, while not knowing which of the other available actions that player actually performed.
+The difference between simultaneous and sequential games is captured in the different representations discussed above. Often, normal form is used to represent simultaneous games, while extensive form is used to represent sequential ones. The transformation of extensive to normal form is one way, meaning that multiple extensive form games correspond to the same normal form. Consequently, notions of equilibrium for simultaneous games are insufficient for reasoning about sequential games; see subgame perfection.
+In short, the differences between sequential and simultaneous games are as follows:
+
+=== Perfect information and imperfect information ===
+
+An important subset of sequential games consists of games of perfect information. In a game of perfect information, all players, at every move in the game, know the previous history of the game and the moves previously made by all other players. In a game of imperfect information, such as a simultaneous-move game, the players do not know all the moves already made by their opponents. Examples of perfect-information games include tic-tac-toe, checkers, chess, and Go.
+Many card games are games of imperfect information, such as poker and bridge. Perfect information is often confused with complete information, which is a similar concept pertaining to the common knowledge of each player's sequence, strategies, and payoffs throughout gameplay. Complete information requires that every player know the strategies and payoffs available to the other players but not necessarily the actions taken, whereas perfect information is knowledge of all aspects of the game and players. Games of incomplete information can be reduced, however, to games of imperfect information by introducing "moves by nature".
+
+=== Bayesian game ===
+
+One of the assumptions of the Nash equilibrium is that every player has correct beliefs about the actions of the other players. However, there are many situations in game theory where participants do not fully understand the characteristics of their opponents. Negotiators may be unaware of their opponent's valuation of the object of negotiation, companies may be unaware of their opponent's cost functions, combatants may be unaware of their opponent's strengths, and jurors may be unaware of their colleague's interpretation of the evidence at trial. In some cases, participants may know the character of their opponent well, but may not know how well their opponent knows his or her own character.
+A Bayesian game is a strategic game with incomplete information. In a strategic game, the decision makers are the players, and every player has a set of actions. A core part of the imperfect information specification is the set of states. Every state completely describes a collection of characteristics relevant to the players, such as their preferences and details about them. There must be a state for every set of features that some player believes may exist.
+
+For example, suppose Player 1 is unsure whether Player 2 would rather date her or get away from her, while Player 2 understands Player 1's preferences as before. Specifically, suppose Player 1 believes that Player 2 wants to date her with probability 1/2 and wants to get away from her with probability 1/2 (this assessment probably comes from Player 1's experience: in such situations, half of the players she faces want to date her and half want to avoid her). Because probabilities are involved, analyzing this situation requires knowing the players' preferences over lotteries, even if one is interested only in pure-strategy equilibria.
+
+=== Combinatorial games ===
+Games in which the difficulty of finding an optimal strategy stems from the multiplicity of possible moves are called combinatorial games. Examples include chess, shogi, and Go. Games that involve imperfect information may also have a strong combinatorial character, for instance backgammon. There is no unified theory addressing combinatorial elements in games. There are, however, mathematical tools that can solve some particular problems and answer some general questions.
+Games of perfect information have been studied in combinatorial game theory, which has developed novel representations, e.g. surreal numbers, as well as combinatorial and algebraic (and sometimes non-constructive) proof methods to solve games of certain types, including "loopy" games that may result in infinitely long sequences of moves. These methods address games with higher combinatorial complexity than those usually considered in traditional (or "economic") game theory. A typical game that has been solved this way is Hex. A related field of study, drawing from computational complexity theory, is game complexity, which is concerned with estimating the computational difficulty of finding optimal strategies.
+Research in artificial intelligence has addressed both perfect and imperfect information games that have very complex combinatorial structures (like chess, Go, or backgammon) for which no provable optimal strategies have been found. The practical solutions involve computational heuristics, like alpha–beta pruning or the use of artificial neural networks trained by reinforcement learning, which make games more tractable in computing practice.
+
+=== Discrete and continuous games ===
+Much of game theory is concerned with finite, discrete games that have a finite number of players, moves, events, outcomes, etc. Many concepts can be extended, however. Continuous games allow players to choose a strategy from a continuous strategy set. For instance, Cournot competition is typically modeled with players' strategies being any non-negative quantities, including fractional quantities.
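+As a sketch of what such a continuous game looks like, consider a Cournot duopoly in which each firm $i \in \{1, 2\}$ chooses a quantity $q_i \in [0, \infty)$. The inverse demand function $P$ and the cost functions $C_i$ are notation assumed here for illustration; they are not defined in the text above. Under those assumptions, firm $i$'s payoff could be written as
+
+```latex
+\pi_i(q_1, q_2) = q_i \, P(q_1 + q_2) - C_i(q_i), \qquad q_i \in [0, \infty).
+```
+
+Because each strategy set is an interval rather than a finite list, the game cannot be written as a finite payoff matrix.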
+
+=== Differential games ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-3.md b/data/en.wikipedia.org/wiki/Game_theory-3.md
new file mode 100644
index 000000000..79192d4e2
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-3.md
@@ -0,0 +1,37 @@
+---
+title: "Game theory"
+chunk: 4/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+Differential games, such as the continuous pursuit and evasion game, are continuous games where the evolution of the players' state variables is governed by differential equations. The problem of finding an optimal strategy in a differential game is closely related to optimal control theory. In particular, there are two types of strategies: open-loop strategies are found using the Pontryagin maximum principle, while closed-loop strategies are found using Bellman's dynamic programming method.
+A particular case of differential games is the class of games with a random time horizon. In such games, the terminal time is a random variable with a given probability distribution function. Therefore, the players maximize the mathematical expectation of the cost function. It was shown that the modified optimization problem can be reformulated as a discounted differential game over an infinite time interval.
+
+=== Evolutionary game theory ===
+
+Evolutionary game theory studies players who adjust their strategies over time according to rules that are not necessarily rational or farsighted. In general, the evolution of strategies over time according to such rules is modeled as a Markov chain with a state variable such as the current strategy profile or how the game has been played in the recent past. Such rules may feature imitation, optimization, or survival of the fittest.
+In biology, such models can represent evolution, in which offspring adopt their parents' strategies and parents who play more successful strategies (i.e. corresponding to higher payoffs) have a greater number of offspring. In the social sciences, such models typically represent strategic adjustment by players who play a game many times within their lifetime and, consciously or unconsciously, occasionally adjust their strategies.
+
+=== Stochastic outcomes (and relation to other fields) ===
+Individual decision problems with stochastic outcomes are sometimes considered "one-player games". They may be modeled using similar tools within the related disciplines of decision theory, operations research, and areas of artificial intelligence, particularly AI planning (with uncertainty) and multi-agent systems. Although these fields may have different motivations, the mathematics involved is substantially the same, e.g. using Markov decision processes (MDPs).
+Stochastic outcomes can also be modeled in terms of game theory by adding a randomly acting player who makes "chance moves" ("moves by nature"). This player is not typically considered a third player in what is otherwise a two-player game, but merely serves to provide a roll of the dice where required by the game.
+For some problems, different approaches to modeling stochastic outcomes may lead to different solutions. For example, the difference in approach between MDPs and the minimax solution is that the latter considers the worst case over a set of adversarial moves, rather than reasoning in expectation about these moves given a fixed probability distribution. The minimax approach may be advantageous where stochastic models of uncertainty are not available, but it may also overweight extremely unlikely (but costly) events, dramatically swaying the strategy in such scenarios if it is assumed that an adversary can force such an event to happen. (See Black swan theory for more discussion on this kind of modeling issue, particularly as it relates to predicting and limiting losses in investment banking.)
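+The contrast can be made concrete with a small hypothetical example; the actions, payoffs $u$, and probabilities $p$ below are invented for illustration. Let $a$ range over the decision maker's actions and $e$ over the uncertain events. Reasoning in expectation and reasoning about the worst case select, respectively,
+
+```latex
+a_{\text{exp}} \in \arg\max_{a} \sum_{e} p(e)\, u(a, e), \qquad a_{\text{mm}} \in \arg\max_{a} \min_{e} u(a, e).
+```
+
+Suppose action $A$ pays $10$ with probability $0.99$ and $-100$ with probability $0.01$, while action $B$ pays $5$ in either event. The expected payoffs are $0.99 \cdot 10 + 0.01 \cdot (-100) = 8.9$ for $A$ and $5$ for $B$, so the expectation criterion picks $A$. The worst-case payoffs are $-100$ for $A$ and $5$ for $B$, so the worst-case criterion picks $B$: a rare but costly event dominates the choice.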
+General models that include all elements of stochastic outcomes, adversaries, and partial or noisy observability (of moves by other players) have also been studied. The "gold standard" is considered to be the partially observable stochastic game (POSG), but few realistic problems are computationally feasible in POSG representation.
+
+=== Metagames ===
+These are games the play of which is the development of the rules for another game, the target or subject game. Metagames seek to maximize the utility value of the rule set developed. The theory of metagames is related to mechanism design theory.
+The term metagame analysis is also used to refer to a practical approach developed by Nigel Howard, whereby a situation is framed as a strategic game in which stakeholders try to realize their objectives by means of the options available to them. Subsequent developments have led to the formulation of confrontation analysis.
+
+=== Mean field game theory ===
+
+Mean field game theory is the study of strategic decision making in very large populations of small interacting agents. This class of problems was considered in the economics literature by Boyan Jovanovic and Robert W. Rosenthal, in the engineering literature by Peter E. Caines, and by mathematicians Pierre-Louis Lions and Jean-Michel Lasry.
+
+== Representation of games ==
+The games studied in game theory are well-defined mathematical objects. To be fully defined, a game must specify the following elements: the players of the game, the information and actions available to each player at each decision point, and the payoffs for each outcome. (Eric Rasmusen refers to these four "essential elements" by the acronym "PAPI".) A game theorist typically uses these elements, along with a solution concept of their choosing, to deduce a set of equilibrium strategies for each player such that, when these strategies are employed, no player can profit by unilaterally deviating from their strategy. These equilibrium strategies determine an equilibrium to the game—a stable state in which either one outcome occurs or a set of outcomes occur with known probability.
+Most cooperative games are presented in the characteristic function form, while the extensive and the normal forms are used to define noncooperative games.
+
+=== Extensive form ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-4.md b/data/en.wikipedia.org/wiki/Game_theory-4.md
new file mode 100644
index 000000000..e4a03e6b5
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-4.md
@@ -0,0 +1,62 @@
+---
+title: "Game theory"
+chunk: 5/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+The extensive form can be used to formalize games with a time sequencing of moves. Extensive form games can be visualized using game trees (as pictured here). Here each vertex (or node) represents a point of choice for a player. The player is specified by a number listed by the vertex. Each line out of the vertex represents a possible action for that player. The payoffs are specified at the bottom of the tree. The extensive form can be viewed as a multi-player generalization of a decision tree. Finite extensive form games of perfect information can be solved by backward induction. It involves working backward up the game tree to determine what a rational player would do at the last vertex of the tree, what the player with the previous move would do given that the player with the last move is rational, and so on until the first vertex of the tree is reached.
+The game pictured involves two players. The way this particular game is structured (i.e., with sequential decision making and perfect information), Player 1 "moves" first by choosing either F or U (fair or unfair). Next in the sequence, Player 2, who has now observed Player 1's move, can choose to play either A or R (accept or reject). Once Player 2 has made their choice, the game is considered finished and each player gets their respective payoff, represented in the image as two numbers, where the first number represents Player 1's payoff, and the second number represents Player 2's payoff. Suppose that Player 1 chooses U and then Player 2 chooses A: Player 1 then gets a payoff of "eight" (which in real-world terms can be interpreted in many ways, the simplest of which is in terms of money but could mean things such as eight days of vacation or eight countries conquered or even eight more opportunities to play the same game against other players) and Player 2 gets a payoff of "two".
+The extensive form can also capture simultaneous-move games and games with imperfect information. To represent imperfect information, either a dotted line connects different vertices to show that they are part of the same information set (i.e. the players do not know at which point they are), or a closed line is drawn around them. (See example in the imperfect information section.)
+
+=== Normal form ===
+
+The normal (or strategic form) game is usually represented by a matrix which shows the players, strategies, and payoffs (see the example to the right). More generally it can be represented by any function that associates a payoff for each player with every possible combination of actions. In the accompanying example there are two players; one chooses the row and the other chooses the column. Each player has two strategies, which are specified by the number of rows and the number of columns. The payoffs are provided in the interior. The first number is the payoff received by the row player (Player 1 in our example); the second is the payoff for the column player (Player 2 in our example). Suppose that Player 1 plays Up and that Player 2 plays Left. Then Player 1 gets a payoff of 4, and Player 2 gets 3.
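+The function view can be written out as follows; the symbols $A_i$ and $u_i$ are notation assumed here, not taken from the text above. With action sets $A_1, \dots, A_n$, a normal-form game assigns each player $i$ a payoff function
+
+```latex
+u_i : A_1 \times \cdots \times A_n \to \mathbb{R}, \qquad i = 1, \dots, n.
+```
+
+In the two-player example described above, the cell for (Up, Left) corresponds to $u_1(\text{Up}, \text{Left}) = 4$ and $u_2(\text{Up}, \text{Left}) = 3$.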
+When a game is presented in normal form, it is presumed that each player acts simultaneously or, at least, without knowing the actions of the other. If players have some information about the choices of other players, the game is usually presented in extensive form.
+Every extensive-form game has an equivalent normal-form game; however, the transformation to normal form may result in an exponential blowup in the size of the representation, making it computationally impractical.
+
+=== Characteristic function form ===
+
+In cooperative game theory the characteristic function lists the payoff of each coalition. The origin of this formulation is in John von Neumann and Oskar Morgenstern's book Theory of Games and Economic Behavior (1944).
+Formally, a characteristic function is a function $v : 2^{N} \to \mathbb{R}$ from the set of all possible coalitions of players to a set of payments, and also satisfies $v(\emptyset) = 0$. The function describes how much collective payoff a set of players can gain by forming a coalition.
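+As a small hypothetical illustration (the numbers are invented, not taken from a source), take two players, $N = \{1, 2\}$, so that the coalitions are $\emptyset$, $\{1\}$, $\{2\}$, and $\{1, 2\}$. One possible characteristic function is
+
+```latex
+v(\emptyset) = 0, \quad v(\{1\}) = 1, \quad v(\{2\}) = 2, \quad v(\{1, 2\}) = 5.
+```
+
+Here the two players acting together obtain $5$, which exceeds the $1 + 2 = 3$ they could secure separately, so forming the coalition creates a surplus of $2$ to be divided.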
+
+=== Alternative game representations ===
+
+Alternative game representation forms are used for some subclasses of games or adjusted to the needs of interdisciplinary research. In addition to classical game representations, some of the alternative representations also encode time-related aspects.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-5.md b/data/en.wikipedia.org/wiki/Game_theory-5.md
new file mode 100644
index 000000000..15dc44d79
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-5.md
@@ -0,0 +1,29 @@
+---
+title: "Game theory"
+chunk: 6/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+== General and applied uses ==
+As a method of applied mathematics, game theory has been used to study a wide variety of human and animal behaviors. It was initially developed in economics to understand a large collection of economic behaviors, including behaviors of firms, markets, and consumers. The first use of game-theoretic analysis was by Antoine Augustin Cournot in 1838 with his solution of the Cournot duopoly. The use of game theory in the social sciences has expanded, and game theory has been applied to political, sociological, and psychological behaviors as well.
+Although pre-twentieth-century naturalists such as Charles Darwin made game-theoretic kinds of statements, the use of game-theoretic analysis in biology began with Ronald Fisher's studies of animal behavior during the 1930s. This work predates the name "game theory", but it shares many important features with this field. The developments in economics were later applied to biology largely by John Maynard Smith in his 1982 book Evolution and the Theory of Games.
+In addition to being used to describe, predict, and explain behavior, game theory has also been used to develop theories of ethical or normative behavior and to prescribe such behavior. In economics and philosophy, scholars have applied game theory to help in the understanding of good or proper behavior. Game-theoretic approaches have also been suggested in the philosophy of language and philosophy of science. Game-theoretic arguments of this type can be found as far back as Plato. An alternative version of game theory, called chemical game theory, represents the player's choices as metaphorical chemical reactant molecules called "knowlecules".  Chemical game theory then calculates the outcomes as equilibrium solutions to a system of chemical reactions.
+
+=== Description and modeling ===
+
+The primary use of game theory is to describe and model how human populations behave. Some scholars believe that by finding the equilibria of games they can predict how actual human populations will behave when confronted with situations analogous to the game being studied. This particular view of game theory has been criticized. It is argued that the assumptions made by game theorists are often violated when applied to real-world situations. Game theorists usually assume players act rationally, but in practice, human rationality and behavior often deviate from the model of rationality used in game theory. Game theorists respond by comparing their assumptions to those used in physics. Thus while their assumptions do not always hold, they can treat game theory as a reasonable scientific ideal akin to the models used by physicists. However, empirical work has shown that in some classic games, such as the centipede game, guess 2/3 of the average game, and the dictator game, people regularly do not play Nash equilibria. There is an ongoing debate regarding the importance of these experiments and whether the analysis of the experiments fully captures all aspects of the relevant situation.
+Some game theorists, following the work of John Maynard Smith and George R. Price, have turned to evolutionary game theory in order to resolve these issues. These models presume either no rationality or bounded rationality on the part of players. Despite the name, evolutionary game theory does not necessarily presume natural selection in the biological sense. Evolutionary game theory includes both biological as well as cultural evolution and also models of individual learning (for example, fictitious play dynamics).
+
+=== Prescriptive or normative analysis ===
+
+Some scholars see game theory not as a predictive tool for the behavior of human beings, but as a suggestion for how people ought to behave. Since a strategy corresponding to a Nash equilibrium of a game constitutes one's best response to the actions of the other players, provided they are in (the same) Nash equilibrium, playing a strategy that is part of a Nash equilibrium seems appropriate. This normative use of game theory has also come under criticism.
+
+=== Economics ===
+Game theory is a major method used in mathematical economics and business for modeling competing behaviors of interacting agents. Applications include a wide array of economic phenomena and approaches, such as auctions, bargaining, mergers and acquisitions pricing, fair division, duopolies, oligopolies, social network formation, agent-based computational economics, general equilibrium, mechanism design, and voting systems; and across such broad areas as experimental economics, behavioral economics, information economics, industrial organization, and political economy.
+This research usually focuses on particular sets of strategies known as "solution concepts" or "equilibria". A common assumption is that players act rationally. In non-cooperative games, the most famous solution concept is the Nash equilibrium. A set of strategies is a Nash equilibrium if each represents a best response to the other strategies. If all the players are playing the strategies in a Nash equilibrium, they have no unilateral incentive to deviate, since their strategy is the best they can do given what others are doing.
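+A formal statement of this condition may help; the notation below (strategy sets $S_i$, payoff functions $u_i$, and the profile $s^*$) is introduced here for illustration and does not appear in the text above. A strategy profile $s^* = (s_1^*, \dots, s_n^*)$ is a Nash equilibrium if, for every player $i$,
+
+```latex
+u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*) \quad \text{for all } s_i \in S_i,
+```
+
+where $s_{-i}^*$ denotes the equilibrium strategies of all players other than $i$.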
+The payoffs of the game are generally taken to represent the utility of individual players.
+A prototypical paper on game theory in economics begins by presenting a game that is an abstraction of a particular economic situation. One or more solution concepts are chosen, and the author demonstrates which strategy sets in the presented game are equilibria of the appropriate type. Economists and business professors suggest two primary uses (noted above): descriptive and prescriptive.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-6.md b/data/en.wikipedia.org/wiki/Game_theory-6.md
new file mode 100644
index 000000000..69ea90270
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-6.md
@@ -0,0 +1,36 @@
+---
+title: "Game theory"
+chunk: 7/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+==== Managerial economics ====
+Game theory is also used extensively in managerial economics, a specific branch of economics. One important use in this field is analyzing strategic interactions between firms. For example, firms may be competing in a market with limited resources, and game theory can help managers understand how their decisions impact their competitors and the overall market outcomes. Game theory can also be used to analyze cooperation between firms, such as in forming strategic alliances or joint ventures. Another use of game theory in managerial economics is in analyzing pricing strategies. For example, firms may use game theory to determine the optimal pricing strategy based on how they expect their competitors to respond to their pricing decisions. Overall, game theory serves as a useful tool for analyzing strategic interactions and decision making in the context of managerial economics.
+
+=== Business ===
+The Chartered Institute of Procurement & Supply (CIPS) promotes knowledge and use of game theory within the context of business procurement. CIPS and TWS Partners have conducted a series of surveys designed to explore the understanding, awareness and application of game theory among procurement professionals. Some of the main findings in their third annual survey (2019) include:
+
+application of game theory to procurement activity has increased – at the time it was at 19% across all survey respondents
+65% of participants predict that use of game theory applications will grow
+70% of respondents say that they have "only a basic or a below basic understanding" of game theory
+20% of participants had undertaken on-the-job training in game theory
+50% of respondents said that new or improved software solutions were desirable
+90% of respondents said that they do not have the software they need for their work.
+
+=== Project management ===
+Sensible decision-making is critical for the success of projects.  In project management, game theory is used to model the decision-making process of players, such as investors, project managers, contractors, sub-contractors, governments and customers.  Quite often, these players have competing interests, and sometimes their interests are directly detrimental to other players, making project management scenarios well-suited to be modeled by game theory.
+Piraveenan (2019) in his review provides several examples where game theory is used to model project management scenarios. For instance, an investor typically has several investment options, and each option will likely result in a different project, and thus one of the investment options has to be chosen before the project charter can be produced. Similarly, any large project involving subcontractors, for instance, a construction project, has a complex interplay between the main contractor (the project manager) and subcontractors, or among the subcontractors themselves, which typically has several decision points. For example, if there is an ambiguity in the contract between the contractor and subcontractor, each must decide how hard to push their case without jeopardizing the whole project, and thus their own stake in it. Similarly, when projects from competing organizations are launched, the marketing personnel have to decide what is the best timing and strategy to market the project, or its resultant product or service, so that it can gain maximum traction in the face of competition. In each of these scenarios, the required decisions depend on the decisions of other players who, in some way, have competing interests to the interests of the decision-maker, and thus can ideally be modeled using game theory.
+Piraveenan summarizes that two-player games are predominantly used to model project management scenarios, and based on the identity of these players, five distinct types of games are used in project management.
+
+Government-sector–private-sector games (games that model public–private partnerships)
+Contractor–contractor games
+Contractor–subcontractor games
+Subcontractor–subcontractor games
+Games involving other players
+In terms of types of games, both cooperative as well as non-cooperative, normal-form as well as extensive-form, and zero-sum as well as non-zero-sum are used to model various project management scenarios.
+
+=== Political science ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-7.md b/data/en.wikipedia.org/wiki/Game_theory-7.md
new file mode 100644
index 000000000..f18ce797a
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-7.md
@@ -0,0 +1,23 @@
+---
+title: "Game theory"
+chunk: 8/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+The application of game theory to political science focuses on the overlapping areas of fair division, political economy, public choice, war bargaining, positive political theory, and social choice theory. In each of these areas, researchers have developed game-theoretic models in which the players are often voters, states, special interest groups, and politicians.
+Early examples of game theory applied to political science are provided by Anthony Downs. In his 1957 book An Economic Theory of Democracy, he applies the Hotelling firm location model to the political process. In the Downsian model, political candidates commit to ideologies on a one-dimensional policy space. Downs first shows how the political candidates will converge to the ideology preferred by the median voter if voters are fully informed, but then argues that voters choose to remain rationally ignorant, which allows for candidate divergence. Game theory was applied in 1962 to the Cuban Missile Crisis during the presidency of John F. Kennedy.
+It has also been proposed that game theory explains the stability of any form of political government.  Taking the simplest case of a monarchy, for example, the king, being only one person, does not and cannot maintain his authority by personally exercising physical control over all or even any significant number of his subjects.  Sovereign control is instead explained by the recognition by each citizen that all other citizens expect each other to view the king (or other established government) as the person whose orders will be followed.  Coordinating communication among citizens to replace the sovereign is effectively barred, since conspiracy to replace the sovereign is generally punishable as a crime.  Thus, in a process that can be modeled by variants of the prisoner's dilemma, during periods of stability no citizen will find it rational to move to replace the sovereign, even if all the citizens know they would be better off if they were all to act collectively.
+A game-theoretic explanation for democratic peace is that public and open debate in democracies sends clear and reliable information regarding their intentions to other states. In contrast, it is difficult to know the intentions of nondemocratic leaders, what effect concessions will have, and if promises will be kept. Thus there will be mistrust and unwillingness to make concessions if at least one of the parties in a dispute is a non-democracy.
+However, game theory predicts that two countries may still go to war even if their leaders are cognizant of the costs of fighting. War may result from asymmetric information; two countries may have incentives to misrepresent the amount of military resources they have on hand, rendering them unable to settle disputes agreeably without resorting to fighting. Moreover, war may arise because of commitment problems: if two countries wish to settle a dispute via peaceful means, but each wishes to go back on the terms of that settlement, they may have no choice but to resort to warfare. Finally, war may result from issue indivisibilities.
+Game theory could also help predict a nation's responses when there is a new rule or law to be applied to that nation. One example is Peter John Wood's (2013) research looking into what nations could do to help reduce climate change. Wood thought this could be accomplished by making treaties with other nations to reduce greenhouse gas emissions. However, he concluded that this idea could not work because it would create a prisoner's dilemma for the nations.
+
+=== Defence science and technology ===
+Game theory has been used extensively to model decision-making scenarios relevant to defence applications. Most studies that have applied game theory in defence settings are concerned with Command and Control Warfare, and can be further classified into studies dealing with (i) Resource Allocation Warfare, (ii) Information Warfare, (iii) Weapons Control Warfare, and (iv) Adversary Monitoring Warfare. Many of the problems studied are concerned with sensing and tracking, for example a surface ship trying to track a hostile submarine and the submarine trying to evade being tracked, and the interdependent decision making that takes place with regard to bearing, speed, and the sensor technology activated by both vessels.
+One tool, for example, automates the transformation of public vulnerability data into models, allowing defenders to synthesize optimal defence strategies through Stackelberg equilibrium analysis. This approach enhances cyber resilience by enabling defenders to anticipate and counteract attackers’ best responses, making game theory increasingly relevant in adversarial cybersecurity environments.
+Ho et al. provide a broad summary of game theory applications in defence, highlighting its advantages and limitations across both physical and cyber domains.
+
+=== Biology ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-8.md b/data/en.wikipedia.org/wiki/Game_theory-8.md
new file mode 100644
index 000000000..d4232a5b6
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-8.md
@@ -0,0 +1,25 @@
+---
+title: "Game theory"
+chunk: 9/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+Unlike those in economics, the payoffs for games in biology are often interpreted as corresponding to fitness. In addition, the focus has been less on equilibria that correspond to a notion of rationality and more on ones that would be maintained by evolutionary forces. The best-known equilibrium in biology is known as the evolutionarily stable strategy (ESS), first introduced by Maynard Smith & Price (1973). Although its initial motivation did not involve any of the mental requirements of the Nash equilibrium, every ESS is a Nash equilibrium.
+In biology, game theory has been used as a model to understand many different phenomena. It was first used to explain the evolution (and stability) of the approximately 1:1 sex ratios. Fisher (1930) suggested that the 1:1 sex ratios are a result of evolutionary forces acting on individuals who could be seen as trying to maximize their number of grandchildren.
+Additionally, biologists have used evolutionary game theory and the ESS to explain the emergence of animal communication. The analysis of signaling games and other communication games has provided insight into the evolution of communication among animals. For example, the mobbing behavior of many species, in which a large number of prey animals attack a larger predator, seems to be an example of spontaneous emergent organization. Ants have also been shown to exhibit feed-forward behavior akin to fashion (see Paul Ormerod's Butterfly Economics).
+Biologists have used the game of chicken to analyze fighting behavior and territoriality.
+According to Maynard Smith, in the preface to Evolution and the Theory of Games, "paradoxically, it has turned out that game theory is more readily applied to biology than to the field of economic behaviour for which it was originally designed". Evolutionary game theory has been used to explain many seemingly incongruous phenomena in nature.
+One such phenomenon is known as biological altruism. This is a situation in which an organism appears to act in a way that benefits other organisms and is detrimental to itself. This is distinct from traditional notions of altruism because such actions are not conscious, but appear to be evolutionary adaptations to increase overall fitness. Examples can be found in species ranging from vampire bats that regurgitate blood they have obtained from a night's hunting and give it to group members who have failed to feed, to worker bees that care for the queen bee for their entire lives and never mate, to vervet monkeys that warn group members of a predator's approach, even when it endangers that individual's chance of survival. All of these actions increase the overall fitness of a group, but occur at a cost to the individual.
+Evolutionary game theory explains this altruism with the idea of kin selection. Altruists discriminate between the individuals they help and favor relatives. Hamilton's rule explains the evolutionary rationale behind this selection with the equation c < b × r, where the cost c to the altruist must be less than the benefit b to the recipient multiplied by the coefficient of relatedness r. The more closely related two organisms are, the more frequent altruism between them becomes, because they share many of the same alleles. This means that the altruistic individual, by ensuring that the alleles of its close relative are passed on through survival of its offspring, can forgo the option of having offspring itself because the same number of alleles are passed on. For example, helping a sibling (in diploid animals) has a coefficient of 1⁄2, because (on average) an individual shares half of the alleles in its sibling's offspring. Ensuring that enough of a sibling's offspring survive to adulthood precludes the necessity of the altruistic individual producing offspring. The coefficient values depend heavily on the scope of the playing field; for example, if the choice of whom to favor includes all genetic living things, not just all relatives, and we assume the discrepancy between all humans only accounts for approximately 1% of the diversity in the playing field, a coefficient that was 1⁄2 in the smaller field becomes 0.995. Similarly, if it is considered that information other than that of a genetic nature (e.g. epigenetics, religion, science, etc.) persisted through time, the playing field becomes larger still, and the discrepancies smaller.
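+A short worked instance of Hamilton's rule may make the inequality concrete. The benefit and cost values below are hypothetical, chosen only for illustration; the coefficient r = 1⁄2 is the sibling value quoted above.
+
+```latex
+% Hamilton's rule: altruism is favored when c < b \times r
+% Hypothetical case 1: b = 3, c = 1, r = 1/2
+b \times r = 3 \times \tfrac{1}{2} = 1.5 > 1 = c \quad \text{(altruism favored)}
+% Hypothetical case 2: b = 1.5, c = 1, r = 1/2
+b \times r = 1.5 \times \tfrac{1}{2} = 0.75 < 1 = c \quad \text{(altruism not favored)}
+```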
+
+=== Computer science and logic ===
+Game theory has come to play an increasingly important role in logic and in computer science. Several logical theories have a basis in game semantics. In addition, computer scientists have used games to model interactive computations. Also, game theory provides a theoretical basis to the field of multi-agent systems.
+Separately, game theory has played a role in online algorithms; in particular, the k-server problem, which has in the past been referred to as games with moving costs and request-answer games. Yao's principle is a game-theoretic technique for proving lower bounds on the computational complexity of randomized algorithms, especially online algorithms.
+The emergence of the Internet has motivated the development of algorithms for finding equilibria in games, markets, computational auctions, peer-to-peer systems, and security and information markets. Algorithmic game theory and within it algorithmic mechanism design combine computational algorithm design and analysis of complex systems with economic theory.
+Game theory has multiple applications in the field of artificial intelligence and machine learning. It is often used in developing autonomous systems that can make complex decisions in uncertain environments. Other areas of application in the AI/ML context include multi-agent system formation, reinforcement learning, and mechanism design. By using game theory to model the behavior of other agents and anticipate their actions, AI/ML systems can make better decisions and operate more effectively.
+
+=== Philosophy ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Game_theory-9.md b/data/en.wikipedia.org/wiki/Game_theory-9.md
new file mode 100644
index 000000000..5466135f3
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Game_theory-9.md
@@ -0,0 +1,38 @@
+---
+title: "Game theory"
+chunk: 10/13
+source: "https://en.wikipedia.org/wiki/Game_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:32.715747+00:00"
+instance: "kb-cron"
+---
+
+Game theory has been put to several uses in philosophy. Responding to two papers by W.V.O. Quine (1960, 1967), Lewis (1969) used game theory to develop a philosophical account of convention. In so doing, he provided the first analysis of common knowledge and employed it in analyzing play in coordination games. In addition, he first suggested that one can understand meaning in terms of signaling games. This latter suggestion has been pursued by several philosophers since Lewis. Following Lewis's (1969) game-theoretic account of conventions, Edna Ullmann-Margalit (1977) and Bicchieri (2006) have developed theories of social norms that define them as Nash equilibria that result from transforming a mixed-motive game into a coordination game.
+Game theory has also challenged philosophers to think in terms of interactive epistemology: what it means for a collective to have common beliefs or knowledge, and what are the consequences of this knowledge for the social outcomes resulting from the interactions of agents. Philosophers who have worked in this area include Bicchieri (1989, 1993), Skyrms (1990), and Stalnaker (1999).
+The synthesis of game theory with ethics was championed by R. B. Braithwaite. The hope was that rigorous mathematical analysis of game theory might help formalize the more imprecise philosophical discussions. However, this expectation materialized only to a limited extent.
+In ethics, some authors (most notably David Gauthier, Gregory Kavka, and Jean Hampton) have attempted to pursue Thomas Hobbes' project of deriving morality from self-interest. Since games like the prisoner's dilemma present an apparent conflict between morality and self-interest, explaining why cooperation is required by self-interest is an important component of this project. This general strategy is a component of the general social contract view in political philosophy (for examples, see Gauthier (1986) and Kavka (1986)).
+Other authors have attempted to use evolutionary game theory in order to explain the emergence of human attitudes about morality and corresponding animal behaviors. These authors look at several games including the prisoner's dilemma, stag hunt, and the Nash bargaining game as providing an explanation for the emergence of attitudes about morality (see, e.g., Skyrms (1996, 2004) and Sober and Wilson (1998)).
+
+=== Epidemiology ===
+Since the decision to take a vaccine for a particular disease is often made by individuals, who may consider a range of factors and parameters in making this decision (such as the incidence and prevalence of the disease, perceived and real risks associated with contracting the disease, mortality rate, perceived and real risks associated with vaccination, and financial cost of vaccination), game theory has been used to model and predict vaccination uptake in a society.
+
+== Well known examples of games ==
+
+=== Prisoner's dilemma ===
+
+William Poundstone described the game in his 1993 book Prisoner's Dilemma:
+Two members of a criminal gang, A and B, are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communication with their partner. The principal charge would lead to a sentence of ten years in prison; however, the police do not have the evidence for a conviction. They plan to sentence both to two years in prison on a lesser charge but offer each prisoner a Faustian bargain: If one of them confesses to the crime of the principal charge, betraying the other, they will be pardoned and free to leave while the other must serve the entirety of the sentence instead of just two years for the lesser charge.
+The dominant strategy (and therefore the best response to any possible opponent strategy) is to betray the other, which aligns with the sure-thing principle. However, both prisoners staying silent would yield a greater reward for both of them than mutual betrayal.
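+The sentences in this description can be arranged as a payoff table. The passage does not state the sentence for mutual betrayal, so p below is a placeholder, assumed here to satisfy 2 < p < 10. Under that assumption betrayal gives the shorter sentence against either choice by the other prisoner, yet mutual silence (2, 2) is better for both than mutual betrayal (p, p).
+
+```latex
+% Years in prison (lower is better); each entry is (A, B).
+% p is an assumed placeholder for the mutual-betrayal sentence, with 2 < p < 10.
+\begin{array}{c|cc}
+ & B \text{ stays silent} & B \text{ betrays} \\ \hline
+A \text{ stays silent} & (2,\ 2) & (10,\ 0) \\
+A \text{ betrays} & (0,\ 10) & (p,\ p)
+\end{array}
+```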
+
+=== Battle of the sexes ===
+
+The "battle of the sexes" is a term used to describe the perceived conflict between men and women in various areas of life, such as relationships, careers, and social roles. This conflict is often portrayed in popular culture, such as movies and television shows, as a humorous or dramatic competition between the genders. This conflict can be depicted in a game theory framework, where it is an example of a non-cooperative game.
+An example of the "battle of the sexes" can be seen in the portrayal of relationships in popular media, where men and women are often depicted as being fundamentally different and in conflict with each other. For instance, in some romantic comedies, the male and female protagonists are shown as having opposing views on love and relationships, and they have to overcome these differences in order to be together.
+In this game, there are two pure strategy Nash equilibria: one where both players choose one option and one where both players choose the other option. If the players may also use mixed strategies, where each player chooses their strategy randomly, there is one additional Nash equilibrium in which both players randomize. However, in the context of the "battle of the sexes" game, the assumption is usually made that the game is played in pure strategies.
+
+=== Ultimatum game ===
+
+The ultimatum game is a game that has become a popular instrument of economic experiments. An early description is by Nobel laureate John Harsanyi in 1961.
+One player, the proposer, is endowed with a sum of money. The proposer is tasked with splitting it with another player, the responder (who knows what the total sum is). Once the proposer communicates his decision, the responder may accept it or reject it. If the responder accepts, the money is split per the proposal; if the responder rejects, both players receive nothing. Both players know in advance the consequences of the responder accepting or rejecting the offer. The game demonstrates how social acceptance, fairness, and generosity influence the players' decisions.
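+The rules above can be summarized as a payoff function. The symbols are introduced here only for illustration: S is assumed to denote the total sum and x the amount the proposer offers to the responder.
+
+```latex
+% S: total sum; x: amount offered to the responder, with 0 \le x \le S
+(\pi_{\text{proposer}},\ \pi_{\text{responder}}) =
+\begin{cases}
+(S - x,\ x) & \text{if the responder accepts} \\
+(0,\ 0) & \text{if the responder rejects}
+\end{cases}
+```
+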
+The ultimatum game has a variant, the dictator game. The two are mostly identical, except that in the dictator game the responder has no power to reject the proposer's offer.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Grammar_systems_theory-0.md b/data/en.wikipedia.org/wiki/Grammar_systems_theory-0.md
new file mode 100644
index 000000000..57102c102
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Grammar_systems_theory-0.md
@@ -0,0 +1,170 @@
+---
+title: "Grammar systems theory"
+chunk: 1/1
+source: "https://en.wikipedia.org/wiki/Grammar_systems_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:33.967957+00:00"
+instance: "kb-cron"
+---
+
+Grammar systems theory is a field of theoretical computer science that studies systems of finite collections of formal grammars generating a formal language. Each grammar works on a string, a so-called sequential form that represents an environment. Grammar systems can thus be used as a formalization of decentralized or distributed systems of agents in artificial intelligence.
+Let $\mathbb{A}$ be a simple reactive agent moving on the table and trying not to fall down from the table with two reactions, t for turning and ƒ for moving forward. The set of possible behaviors of $\mathbb{A}$ can then be described as formal language
+
+$$\mathbb{L_A} = \{(f^{m}t^{n}f^{r})^{+} : 1 \leq m \leq k;\ 1 \leq n \leq \ell;\ 1 \leq r \leq k\},$$
+
+where ƒ can be done maximally k times and t can be done maximally ℓ times considering the dimensions of the table.
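+
+As a small illustration, assuming hypothetical table dimensions k = 2 and ℓ = 1 (values chosen only for this sketch), the following strings satisfy the definition of $\mathbb{L_A}$:
+
+```latex
+% Assumed bounds: k = 2, \ell = 1, so m, r \in \{1, 2\} and n = 1
+ftf,\quad fftf,\quad ftff,\quad fftff \in \mathbb{L_A}
+% The + operator allows a block to be repeated, e.g. two copies of ftf:
+ftfftf = (ftf)(ftf) \in \mathbb{L_A}
+```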
+
+Let $\mathbb{G_A}$ be a formal grammar which generates language $\mathbb{L_A}$. The behavior of $\mathbb{A}$ is then described by this grammar. Suppose the $\mathbb{A}$ has a subsumption architecture; each component of this architecture can be then represented as a formal grammar, too, and the final behavior of the agent is then described by this system of grammars.
+The schema on the right describes such a system of grammars which shares a common string representing an environment. The shared sequential form is sequentially rewritten by each grammar, which can represent either a component or generally an agent.
+If grammars communicate together and work on a shared sequential form, the system is called a Cooperating Distributed (CD) grammar system. The shared sequential form is a similar concept to the blackboard approach in AI, which is inspired by the idea of experts solving a problem together while sharing their proposals and ideas on a shared blackboard.
+Each grammar in a grammar system can also work on its own string and communicate with other grammars in the system by sending its sequential form on request. Such a grammar system is then called a Parallel Communicating (PC) grammar system.
+PC and CD grammar systems are inspired by distributed AI. If there is no communication between grammars, the system is close to the decentralized approaches in AI. These kinds of grammar systems are sometimes called colonies or Eco-Grammar systems, depending (among other things) on whether the environment is changing on its own (Eco-Grammar system) or not (colonies).
+
+
+== See also ==
+Artificial life
+Agent-based model
+Distributed artificial intelligence
+Multi-agent system
+
+
+== References ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Homeokinetics-0.md b/data/en.wikipedia.org/wiki/Homeokinetics-0.md
new file mode 100644
index 000000000..b748aec9b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Homeokinetics-0.md
@@ -0,0 +1,36 @@
+---
+title: "Homeokinetics"
+chunk: 1/1
+source: "https://en.wikipedia.org/wiki/Homeokinetics"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:35.253721+00:00"
+instance: "kb-cron"
+---
+
+Homeokinetics is the study of self-organizing, complex systems. Standard physics studies systems at separate levels, such as atomic physics, nuclear physics, biophysics, social physics, and galactic physics. Homeokinetic physics studies the up-down processes that bind these levels. Tools such as mechanics, quantum field theory, and the laws of thermodynamics provide the key relationships. The subject, described as the physics and thermodynamics associated with the up down movement between levels of systems, originated in the late 1970s work of American physicists Harry Soodak and Arthur Iberall. Complex systems are universes, galaxies, social systems, people, or even those that seem as simple as gases. The basic premise is that the entire universe consists of atomistic-like units bound in interactive ensembles to form systems, level by level, in a nested hierarchy. Homeokinetics treats all complex systems on an equal footing, animate and inanimate, providing them with a common viewpoint. The complexity in studying how they work is reduced by the emergence of common languages in all complex systems.
+
+
+== History ==
+Arthur Iberall, Warren McCulloch and Harry Soodak developed the concept of homeokinetics as a new branch of physics. It began through Iberall's biophysical research for the NASA exobiology program into the dynamics of mammalian physiological processes. They were observing an area that physics has neglected, that of complex systems with their very long internal factory day delays. They were observing systems associated with nested hierarchy and with an extensive range of time scale processes. It was such connections, referred to as both up-down or in-out connections (as nested hierarchy) and side-side or flatland physics among atomistic-like components (as heterarchy) that became the hallmark of homeokinetic problems. By 1975, they began to put a formal catch-phrase name on those complex problems, associating them with nature, life, human, mind, and society. The major method of exposition that they began using was a combination of engineering physics and a more academic pure physics. In 1981, Iberall was invited to the Crump Institute for Medical Engineering of UCLA, where he further refined the key concepts of homeokinetics, developing a physical scientific foundation for complex systems.
+
+
+== Self-organizing complex Systems ==
+A system is a collective of interacting ‘atomistic’-like entities. The word ‘atomism’ is used to stand both for the entity and the doctrine. As is known from ‘kinetic’ theory, in mobile or simple systems, the atomisms share their ‘energy’ in interactive collisions. That so-called ‘equipartitioning’ process takes place within a few collisions. Physically, if there is little or no interaction, the process is considered to be very weak. Physics deals basically with the forces of interaction—few in number—that influence the interactions. They all tend to emerge with considerable force at high ‘density’ of atomistic interaction. In complex systems, there is also a result of internal processes in the atomisms. They exhibit, in addition to the pair-by-pair interactions, internal actions such as vibrations, rotations, and association. If the energy and time involved internally creates a very large—in time—cycle of performance of their actions compared to their pair interactions, the collective system is complex. If you eat a cookie and you do not see the resulting action for hours, that is complex; if boy meets girl and they become ‘engaged’ for a protracted period, that is complex. What emerges from that physics is a broad host of changes in state and stability transitions in state. Viewing Aristotle as having defined a general basis for systems in their static-logical states and trying to identify a logic-metalogic for physics, e.g., metaphysics, then homeokinetics is viewed to be an attempt to define the dynamics of all those systems in the universe.
+
+
+== Flatland physics vs. homeokinetic physics ==
+Ordinary physics is a flatland physics, a physics at some particular level. Examples include nuclear and atomic physics, biophysics, social physics, and stellar physics. Homeokinetic physics combines flatland physics with the study of the up-down processes that bind the levels. Tools such as mechanics, quantum field theory, and the laws of thermodynamics provide key relationships for the binding of the levels, how they connect, and how the energy flows up and down. Whether the atomisms are atoms, molecules, cells, people, stars, galaxies, or universes, the same tools can be used to understand them. Homeokinetics treats all complex systems on an equal footing, animate and inanimate, providing them with a common viewpoint. The complexity in studying how they work is reduced by the emergence of common languages in all complex systems.
+
+
+== Applications ==
+A homeokinetic approach to complex systems has been applied to understanding life, ecological psychology, mind, anthropology, geology, law, motor control, bioenergetics, healing modalities, and political science.
+It has also been applied to social physics where a homeokinetics analysis shows that one must account for flow variables such as the flow of energy, of materials, of action, reproduction rate, and value-in-exchange. Iberall's conjectures on life and mind have been used as a springboard to develop theories of mental activity and action.
+
+
+== External links ==
+www.homeokinetics.org
+commons.trincoll.edu/homeokinetics
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Image_analysis-0.md b/data/en.wikipedia.org/wiki/Image_analysis-0.md
new file mode 100644
index 000000000..f42a92136
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Image_analysis-0.md
@@ -0,0 +1,91 @@
+---
+title: "Image analysis"
+chunk: 1/1
+source: "https://en.wikipedia.org/wiki/Image_analysis"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:36.441246+00:00"
+instance: "kb-cron"
+---
+
+Image analysis or imagery analysis is the extraction of meaningful information from images, mainly from digital images by means of digital image processing techniques. Image analysis tasks can be as simple as reading bar-coded tags or as sophisticated as identifying a person from their face.
+Computers are indispensable for the analysis of large amounts of data, for tasks that require complex computation, or for the extraction of quantitative information.  On the other hand, the human visual cortex is an excellent image analysis apparatus, especially for extracting higher-level information, and for many applications — including medicine, security, and remote sensing — human analysts still cannot be replaced by computers.  For this reason, many important image analysis tools such as edge detectors and neural networks are inspired by human visual perception models.
+
+
+== Digital ==
+
+Digital image analysis, or computer image analysis, is the automatic study of an image by a computer or other electrical device to obtain useful information from it. The device is often a computer but may also be an electrical circuit, a digital camera or a mobile phone. It involves the fields of computer or machine vision and medical imaging, and makes heavy use of pattern recognition, digital geometry, and signal processing. This field of computer science developed in the 1950s at academic institutions such as the MIT A.I. Lab, originally as a branch of artificial intelligence and robotics.
+It is the quantitative or qualitative characterization of two-dimensional (2D) or three-dimensional (3D) digital images. 2D images are analyzed, for example, in computer vision, and 3D images in medical imaging. The field was established in the 1950s–1970s, for example with pioneering contributions by Azriel Rosenfeld, Herbert Freeman, Jack E. Bresenham, and King-Sun Fu.
+
+
+== Techniques ==
+
+Many different techniques are used to analyse images automatically. Each technique may be useful for a small range of tasks; however, no known method of image analysis is generic enough to handle a wide range of tasks in the way that human image analysis can. Examples of image analysis techniques in different fields include:
+
+2D and 3D object recognition,
+image segmentation,
+motion detection, e.g. single particle tracking,
+video tracking,
+optical flow,
+medical scan analysis,
+3D pose estimation.
+
+
+== Deep learning ==
+Since the early 2010s, deep learning methods have substantially advanced the field of image analysis. In 2012, a deep convolutional neural network (CNN) known as AlexNet achieved a significant reduction in error rates on the ImageNet large-scale image classification benchmark, demonstrating the effectiveness of deep learning for visual recognition tasks. Subsequent architectures such as ResNet introduced residual connections that enabled training of much deeper networks, further improving accuracy across image analysis tasks.
+Real-time object detection became practical with frameworks such as YOLO (You Only Look Once), which unified detection and classification into a single network pass. In 2020, the Vision Transformer (ViT) demonstrated that transformer architectures, originally developed for natural language processing, could achieve competitive results on image classification when applied directly to sequences of image patches.
+More recently, foundation models trained on large-scale datasets have enabled zero-shot generalisation across image analysis tasks. The Segment Anything Model (SAM), trained on over one billion masks, can segment arbitrary objects in images without task-specific fine-tuning. These advances have made image analysis techniques increasingly accessible through browser-based tools and open-source implementations.
+
+
+== Applications ==
+The applications of digital image analysis are continuously expanding through all areas of science and industry, including:
+
+anatomy, such as precise measurement, visualization, and statistical analysis of anatomical structures.
+assay micro plate reading, such as detecting where a chemical was manufactured.
+astronomy, such as calculating the size of a planet.
+automated species identification (e.g. plant and animal species)
+defense
+error level analysis
+filtering
+machine vision, such as automatically counting items on a factory conveyor belt.
+materials science, such as determining if a metal weld has cracks.
+medicine, such as detecting cancer in a mammography scan.
+metallography, such as determining the mineral content of a rock sample.
+microscopy, such as counting the germs in a swab.
+automatic number plate recognition
+optical character recognition, such as automatic license plate detection.
+remote sensing, such as detecting intruders in a house, and producing land cover/land use maps.
+robotics, such as to avoid steering into an obstacle.
+security, such as detecting a person's eye color or hair color.
+
+
+== Object-based ==
+
+Object-based image analysis (OBIA) involves two typical processes, segmentation and classification. Segmentation helps to group pixels into homogeneous objects. The objects typically correspond to individual features of interest, although over-segmentation or under-segmentation is very likely. Classification then can be performed at object levels, using various statistics of the objects as features in the classifier. Statistics can include geometry, context and texture of image objects. Over-segmentation is often preferred over under-segmentation when classifying high-resolution images.
+Object-based image analysis has been applied in many fields, such as cell biology, medicine, earth sciences, and remote sensing. For example, it can detect changes of cellular shapes in the process of cell differentiation; it has also been widely used in the mapping community to generate land cover.
+When applied to earth images, OBIA is known as geographic object-based image analysis (GEOBIA), defined as "a sub-discipline of geoinformation science devoted to (...) partitioning remote sensing (RS) imagery into meaningful image-objects, and assessing their characteristics through spatial, spectral and temporal scale". The international GEOBIA conference has been held biennially since 2006.
+OBIA techniques are implemented in software such as eCognition or the Orfeo toolbox.
+
+
+== See also ==
+Archeological imagery
+Imaging technologies
+Image processing
+imc FAMOS (1987), graphical data analysis
+Land cover mapping
+Military intelligence
+Remote sensing
+
+
+== Further reading ==
+The Image Processing Handbook by John C. Russ, ISBN 0-8493-7254-2 (2006)
+Image Processing and Analysis - Variational, PDE, Wavelet, and Stochastic Methods by Tony F. Chan and Jianhong (Jackie) Shen, ISBN 0-89871-589-X (2005)
+Front-End Vision and Multi-Scale Image Analysis by Bart M. ter Haar Romeny, Paperback, ISBN 1-4020-1507-0 (2003)
+Practical Guide to Image Analysis by J.J. Friel, et al., ASM International, ISBN 0-87170-688-1 (2000).
+Fundamentals of Image Processing by Ian T. Young, Jan J. Gerbrands, Lucas J. Van Vliet, Paperback, ISBN 90-75691-01-7 (1995)
+Image Analysis and Metallography edited by P.J. Kenny, et al., International Metallographic Society and ASM International (1989).
+Quantitative Image Analysis of Microstructures by H.E. Exner & H.P. Hougardy,  DGM Informationsgesellschaft mbH, ISBN 3-88355-132-5 (1988).
+"Metallographic and Materialographic Specimen Preparation, Light Microscopy, Image Analysis and Hardness Testing", Kay Geels in collaboration with Struers A/S, ASTM International 2006.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-0.md b/data/en.wikipedia.org/wiki/Information_theory-0.md
new file mode 100644
index 000000000..116d82e6b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-0.md
@@ -0,0 +1,45 @@
+---
+title: "Information theory"
+chunk: 1/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+Information theory is the mathematical study of the quantification, storage, and communication of a particular type of mathematically defined information. The field was established and formalized by Claude Shannon in the 1940s, though early contributions were made in the 1920s through the works of Harry Nyquist and Ralph Hartley. 
+Information theory was initially formed in the context of telecommunication but soon found a wide range of other applications. It is now at the intersection of mathematics, statistics and computer science, and has applications in diverse fields ranging from electrical engineering and physics to neurobiology.
+As a simple example of the concept, if one flips a fair coin and does not yet know the outcome (heads or tails), then they lack a certain amount of information. After looking at the coin, they gain information about the outcome. For a fair coin, the probability of either heads or tails is 1/2 and the amount of information is expressed as $-\log_2(1/2) = 1$ bit of information.
+A key concept in information theory is information entropy. In Shannon's formulation, entropy is equal to the lack of information about an event. In the above coin flip example, the entropy in the case where you don't know the outcome is 1 bit. When you know the outcome after the coin has landed, the entropy is zero because you have gained one bit of information.
+Information theory has been used in a wide range of applications, such as source coding/data compression (e.g. for ZIP files), and channel coding/error detection and correction (e.g. for DSL). Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones and the development of the Internet and artificial intelligence. The theory has also found applications in other areas, including statistical inference, cryptography, neurobiology, perception, signal processing, linguistics, the evolution and function of molecular codes (bioinformatics), thermal physics, molecular dynamics, black holes, quantum computing, information retrieval, intelligence gathering, plagiarism detection, pattern recognition, anomaly detection, the analysis of music, art creation, imaging system design, study of outer space, the dimensionality of space, and epistemology.
+
+== Overview ==
+Information theory, as conceived by Claude Shannon, studies the processing and utilization of information within a probabilistic context. Abstractly, in this approach information can be thought of as the resolution of uncertainty. In the case of communication of information over a noisy channel, this abstract concept was formalized in 1948 by Claude Shannon in a paper entitled A Mathematical Theory of Communication, in which information is thought of as a set of possible messages, and the goal is to send these messages over a noisy channel, and to have the receiver reconstruct the message with low probability of error, in spite of the channel noise. Shannon's main result, the noisy-channel coding theorem, showed that, in the limit of many channel uses, the rate of information that is asymptotically achievable is equal to the channel capacity, a quantity dependent merely on the statistics of the channel over which the messages are sent.
+Coding theory is concerned with finding explicit methods, called codes, for increasing the efficiency and reducing the error rate of data communication over noisy channels to near the channel capacity. These codes can be roughly subdivided into data compression (source coding) and error-correction (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible.
+A third class of information theory codes is cryptographic algorithms (both codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used in cryptography and cryptanalysis, such as the unit ban.
+
+== Historical background ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-1.md b/data/en.wikipedia.org/wiki/Information_theory-1.md
new file mode 100644
index 000000000..d1ebb48ef
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-1.md
@@ -0,0 +1,234 @@
+---
+title: "Information theory"
+chunk: 2/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+The landmark event establishing the discipline of information theory and bringing it to immediate worldwide attention was the publication of Claude Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October 1948. Historian James Gleick rated the paper as the most important development of 1948, noting that the paper was "even more profound and more fundamental" than the transistor. Shannon came to be known as the "father of information theory". He outlined some of his initial ideas of information theory as early as 1939 in a letter to Vannevar Bush.
+Prior to this paper, limited information-theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability. Harry Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation $W = K \log m$ (recalling the Boltzmann constant), where W is the speed of transmission of intelligence, m is the number of different voltage levels to choose from at each time step, and K is a constant. Ralph Hartley's 1928 paper, Transmission of Information, uses the word information as a measurable quantity, reflecting the receiver's ability to distinguish one sequence of symbols from any other, thus quantifying information as $H = \log S^n = n \log S$, where S was the number of possible symbols, and n the number of symbols in a transmission. The unit of information was therefore the decimal digit, which has since sometimes been called the hartley in his honor as a unit or scale or measure of information. Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German Second World War Enigma ciphers.
+Much of the mathematics behind information theory with events of different probabilities was developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored in Entropy in thermodynamics and information theory.
+In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion: 
+
+"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."
+With it came the ideas of: 
+
+The information entropy and redundancy of a source, and its relevance through the source coding theorem;
+The mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;
+The practical result of the Shannon–Hartley law for the channel capacity of a Gaussian channel; as well as
+The bit—a new way of seeing the most fundamental unit of information.
+
+== Quantities of information ==
+
+Information theory is based on probability theory and statistics, where quantified information is usually described in terms of bits. Information theory often concerns itself with measures of information of the distributions associated with random variables. One of the most important measures is called entropy, which forms the building block of many other measures. Entropy quantifies the information in a single random variable.
+Another useful concept is mutual information, defined on two random variables, which quantifies the dependence between those variables by comparing the conditional and unconditional distributions. Entropy is a property of the probability distribution of a random variable and gives a limit on the rate at which data generated by independent samples with the given distribution can be reliably compressed. Mutual information is a property of the joint distribution of two random variables and is the maximum rate of reliable communication across a noisy channel in the limit of long block lengths, when the channel statistics are determined by the joint distribution.
+The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. A common unit of information is the bit or shannon, based on the binary logarithm. Other units include the nat, which is based on the natural logarithm, and the decimal digit, which is based on the common logarithm.
+In what follows, an expression of the form $p \log p$ is considered by convention to be equal to zero whenever $p = 0$. This is justified because $\lim_{p \to 0^{+}} p \log p = 0$ for any logarithmic base.
+
+=== Entropy of an information source ===
+Based on the probability mass function of a source, the Shannon entropy H, in units of bits per symbol, is defined as the expected value of the information content of the symbols.
+The amount of information conveyed by an individual source symbol $x_i$ with probability $p_i$ is known as its self-information or surprisal, $I(p_i)$. This quantity is defined as:
+
+$$I(p_i) = -\log_2(p_i)$$
+
+A less probable symbol has a larger surprisal, meaning its occurrence provides more information. The entropy $H$ is the weighted average of the surprisal of all possible symbols from the source's probability distribution:
+
+$$H(X) = \mathbb{E}_X[I(x)] = \sum_i p_i I(p_i) = -\sum_i p_i \log_2(p_i)$$
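+
+As a worked illustration of this definition, consider a hypothetical three-symbol source whose probabilities are assumed, purely for the example, to be 1/2, 1/4 and 1/4. The surprisals are then 1, 2 and 2 bits, and the entropy works out as:
+
+$$H(X) = \tfrac{1}{2}\log_2 2 + \tfrac{1}{4}\log_2 4 + \tfrac{1}{4}\log_2 4 = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = 1.5 \text{ bits per symbol}$$
+
+This is below $\log_2 3 \approx 1.585$ bits, the value the same source would have if its three symbols were equally likely.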
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-2.md b/data/en.wikipedia.org/wiki/Information_theory-2.md
new file mode 100644
index 000000000..839c84dc4
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-2.md
@@ -0,0 +1,660 @@
+---
+title: "Information theory"
+chunk: 3/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+Intuitively, the entropy $H(X)$ of a discrete random variable $X$ is a measure of the amount of uncertainty associated with the value of $X$ when only its distribution is known. A high entropy indicates the outcomes are more evenly distributed, making the result harder to predict.
+For example, if one transmits 1000 bits (0s and 1s), and the value of each of these bits is known to the receiver (has a specific value with certainty) ahead of transmission, no information is transmitted. If, however, each bit is independently and equally likely to be 0 or 1, 1000 shannons of information (more often called bits) have been transmitted.
+
+==== Properties ====
+A key property of entropy is that it is maximized when all the messages in the message space are equiprobable. For a source with $n$ possible symbols, where $p_i = \frac{1}{n}$ for all $i$, the entropy is given by:
+
+$$H(X) = \log_2(n)$$
+
+This maximum value represents the most unpredictable state.
+For a source that emits a sequence of $N$ symbols that are independent and identically distributed (i.i.d.), the total entropy of the message is $N \cdot H$ bits. If the source data symbols are identically distributed but not independent, the entropy of a message of length $N$ will be less than $N \cdot H$.
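+
+A small worked case may make these two properties concrete. Assume, hypothetically, that the source is a fair six-sided die ($n = 6$) rolled $N = 10$ times independently; neither number comes from the text above. Then:
+
+$$H(X) = \log_2 6 \approx 2.585 \text{ bits per roll}, \qquad N \cdot H = 10 \log_2 6 \approx 25.85 \text{ bits}$$
+
+If the rolls were somehow correlated, the entropy of the ten-roll message would fall below this value.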
+
+==== Units ====
+The choice of the logarithmic base in the entropy formula determines the unit of entropy used:
+
+A base-2 logarithm (as shown in the main formula) measures entropy in bits per symbol. This unit is also sometimes called the shannon in honor of Claude Shannon.
+A natural logarithm (base e) measures entropy in nats per symbol. This is often used in theoretical analysis as it avoids the need for scaling constants (like ln 2) in derivations.
+Other bases are also possible. A base-10 logarithm measures entropy in decimal digits, or hartleys, per symbol. A base-256 logarithm measures entropy in bytes per symbol, since $2^8 = 256$.
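+
+The units differ only by a constant factor, which a short conversion can illustrate. Writing $H_{(b)}$ for entropy computed with a base-$b$ logarithm (a notation assumed here for the example only), the change-of-base rule gives $H_{(b)}(X) = H_{(2)}(X) / \log_2 b$. For a fair coin, with $H_{(2)} = 1$ bit:
+
+$$H_{(e)} = \ln 2 \approx 0.693 \text{ nats}, \qquad H_{(10)} = \log_{10} 2 \approx 0.301 \text{ hartleys}, \qquad H_{(256)} = \tfrac{1}{8} = 0.125 \text{ bytes}$$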
+
+==== Binary entropy function ====
+The special case of information entropy for a random variable with two outcomes (a Bernoulli trial) is the binary entropy function. This is typically calculated using a base-2 logarithm, and its unit is the shannon. If one outcome has probability p, the other has probability 1 − p. The entropy is given by:
+
+$$H_{\mathrm{b}}(p) = -p \log_2 p - (1-p) \log_2 (1-p)$$
+
+This function is depicted in the plot shown above, reaching its maximum of 1 bit when p = 0.5, corresponding to the highest uncertainty.
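+
+As a numerical check on the formula, take a biased coin with $p = 1/4$ (a value chosen here only for illustration):
+
+$$H_{\mathrm{b}}(\tfrac{1}{4}) = \tfrac{1}{4}\log_2 4 + \tfrac{3}{4}\log_2 \tfrac{4}{3} \approx 0.5 + 0.75 \times 0.415 \approx 0.811 \text{ bits}$$
+
+The result is below the 1 bit obtained at $p = 0.5$, as expected for a more predictable outcome.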
+
+=== Joint entropy ===
+The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing: (X, Y). This implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies.
+For example, if (X, Y) represents the position of a chess piece, with X the row and Y the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.
+
+$$H(X,Y) = \mathbb{E}_{X,Y}[-\log p(x,y)] = -\sum_{x,y} p(x,y) \log p(x,y)$$
+
+Despite similar notation, joint entropy should not be confused with cross-entropy.
+The joint entropy of $n$ discrete random variables $X^n \triangleq (X_1, X_2, \ldots, X_n)$ is
+
+$$H(X^n) = H(X_1, X_2, \ldots, X_n) = \mathbb{E}\left[-\log P_{X_1,\ldots,X_n}(X_1,\ldots,X_n)\right]$$
+
+This can also be represented as a summation of their joint probability mass function:
+
+$$H(X^n) = -\sum_{x_1} \cdots \sum_{x_n} P_{X_1,\ldots,X_n}(x_1,\ldots,x_n) \log P_{X_1,\ldots,X_n}(x_1,\ldots,x_n).$$
+Thus, joint entropy is simply a special case of entropy in which the random variable is a vector taking values in the product space.
+
+=== Conditional entropy (equivocation) ===
+The conditional entropy or conditional uncertainty of X given random variable Y (also called the equivocation of X about Y) is the average conditional entropy over Y:
+
+$$H(X|Y)=\mathbb {E} _{Y}[H(X|y)]=-\sum _{y\in Y}p(y)\sum _{x\in X}p(x|y)\log p(x|y)=-\sum _{x,y}p(x,y)\log p(x|y).$$
+
+Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:
+
+$$H(X|Y)=H(X,Y)-H(Y).$$
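+
+As a quick numerical check of this identity, consider a small joint distribution chosen here purely for illustration (it is an assumption, not taken from the text above): $p(0,0)=\tfrac12$, $p(0,1)=\tfrac14$, $p(1,0)=0$, $p(1,1)=\tfrac14$, with logarithms to base 2 and the convention $0\log 0=0$. Then:
+
+$$
+\begin{aligned}
+H(X,Y)&=\tfrac12\log 2+\tfrac14\log 4+\tfrac14\log 4=1.5{\text{ bits}},\\
+H(Y)&=1{\text{ bit}}\quad {\text{since }}p(Y=0)=p(Y=1)=\tfrac12,\\
+H(X|Y)&=H(X,Y)-H(Y)=0.5{\text{ bits}}.
+\end{aligned}
+$$
+
+The direct definition gives the same value: $H(X|Y=0)=0$ because $X=0$ with certainty when $Y=0$, and $H(X|Y=1)=1$ bit because $X$ is uniform when $Y=1$, so the average over $Y$ is $\tfrac12\cdot 0+\tfrac12\cdot 1=0.5$ bits.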
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-3.md b/data/en.wikipedia.org/wiki/Information_theory-3.md
new file mode 100644
index 000000000..f33f8026d
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-3.md
@@ -0,0 +1,656 @@
+---
+title: "Information theory"
+chunk: 4/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+=== Mutual information (transinformation) ===
+Mutual information measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of X relative to Y is given by:
+
+$$I(X;Y)=\mathbb {E} _{X,Y}[SI(x,y)]=\sum _{x,y}p(x,y)\log {\frac {p(x,y)}{p(x)\,p(y)}}$$
+
+where SI (Specific mutual Information) is the pointwise mutual information.
+A basic property of the mutual information is that:
+
+$$I(X;Y)=H(X)-H(X|Y).$$
+
+That is, knowing $Y$, we can save an average of $I(X;Y)$ bits in encoding $X$ compared to not knowing $Y$.
+Mutual information is symmetric:
+
+$$I(X;Y)=I(Y;X)=H(X)+H(Y)-H(X,Y).$$
+
+Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the posterior probability distribution of $X$ given the value of $Y$ and the prior distribution on $X$:
+
+$$I(X;Y)=\mathbb {E} _{p(y)}[D_{\mathrm {KL} }(p(X|Y=y)\|p(X))].$$
+
+In other words, this is a measure of how much, on average, the probability distribution on $X$ will change if we are given the value of $Y$. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:
+
+$$I(X;Y)=D_{\mathrm {KL} }(p(X,Y)\|p(X)p(Y)).$$
+
+Mutual information is closely related to the log-likelihood ratio test in the context of contingency tables and the multinomial distribution and to Pearson's χ² test: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.
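+
+The identities above can be checked on a small example. The joint distribution below is an illustrative assumption, not taken from the text: $p(0,0)=\tfrac12$, $p(0,1)=\tfrac14$, $p(1,0)=0$, $p(1,1)=\tfrac14$, with logarithms to base 2. The marginals are $p(X=0)=\tfrac34$, $p(X=1)=\tfrac14$ and $p(Y=0)=p(Y=1)=\tfrac12$, so:
+
+$$
+\begin{aligned}
+H(X)&=\tfrac34\log \tfrac43+\tfrac14\log 4\approx 0.811{\text{ bits}},\\
+H(Y)&=1{\text{ bit}},\qquad H(X,Y)=\tfrac12\log 2+\tfrac14\log 4+\tfrac14\log 4=1.5{\text{ bits}},\\
+I(X;Y)&=H(X)+H(Y)-H(X,Y)\approx 0.811+1-1.5=0.311{\text{ bits}}.
+\end{aligned}
+$$
+
+Evaluating the defining sum directly gives the same result: $\tfrac12\log \tfrac{1/2}{3/8}+\tfrac14\log \tfrac{1/4}{3/8}+\tfrac14\log \tfrac{1/4}{1/8}\approx 0.311$ bits (the term with $p(1,0)=0$ contributes nothing).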
+
+=== Kullback–Leibler divergence (information gain) ===
+The Kullback–Leibler divergence (or information divergence, information gain, or relative entropy) is a way of comparing two distributions: a "true" probability distribution $p(X)$, and an arbitrary probability distribution $q(X)$. If we compress data in a manner that assumes $q(X)$ is the distribution underlying some data, when, in reality, $p(X)$ is the correct distribution, the Kullback–Leibler divergence is the average number of additional bits per datum necessary for compression. It is thus defined as:
+
+$$D_{\mathrm {KL} }(p(X)\|q(X))=\sum _{x\in X}-p(x)\log {q(x)}\,-\,\sum _{x\in X}-p(x)\log {p(x)}=\sum _{x\in X}p(x)\log {\frac {p(x)}{q(x)}}.$$
+
+Although it is sometimes used as a 'distance metric', KL divergence is not a true metric since it is not symmetric and does not satisfy the triangle inequality (making it a semi-quasimetric).
+Another interpretation of the KL divergence is the "unnecessary surprise" introduced by a prior that departs from the truth: suppose a number $X$ is about to be drawn randomly from a discrete set with probability distribution $p(x)$. If Alice knows the true distribution $p(x)$, while Bob believes (has a prior) that the distribution is $q(x)$, then Bob will be more surprised than Alice, on average, upon seeing the value of $X$. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the log is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.
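+
+A short worked example may make the asymmetry concrete. The two distributions are illustrative assumptions, not taken from the text: let $p=(\tfrac12,\tfrac12)$ and $q=(\tfrac14,\tfrac34)$ on a two-element set, with logarithms to base 2:
+
+$$
+\begin{aligned}
+D_{\mathrm {KL} }(p\|q)&=\tfrac12\log \tfrac{1/2}{1/4}+\tfrac12\log \tfrac{1/2}{3/4}=1-\tfrac12\log 3\approx 0.208{\text{ bits}},\\
+D_{\mathrm {KL} }(q\|p)&=\tfrac14\log \tfrac{1/4}{1/2}+\tfrac34\log \tfrac{3/4}{1/2}=\tfrac34\log 3-1\approx 0.189{\text{ bits}}.
+\end{aligned}
+$$
+
+In the terms used above, a Bob who assumes $q$ when the truth is $p$ is expected to be about 0.208 bits per draw more surprised than Alice, and swapping the roles of the two distributions gives a different value.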
+
+=== Directed Information ===
+Directed information, $I(X^{n}\to Y^{n})$, is an information-theoretic measure that quantifies the information flow from the random process $X^{n}=\{X_{1},X_{2},\dots ,X_{n}\}$ to the random process $Y^{n}=\{Y_{1},Y_{2},\dots ,Y_{n}\}$. The term directed information was coined by James Massey and is defined as:
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-4.md b/data/en.wikipedia.org/wiki/Information_theory-4.md
new file mode 100644
index 000000000..286fe11bc
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-4.md
@@ -0,0 +1,448 @@
+---
+title: "Information theory"
+chunk: 5/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+$$I(X^{n}\to Y^{n})\ \triangleq \ \sum _{i=1}^{n}I(X^{i};Y_{i}|Y^{i-1}),$$
+where $I(X^{i};Y_{i}|Y^{i-1})$ is the conditional mutual information $I(X_{1},X_{2},\dots ,X_{i};Y_{i}|Y_{1},Y_{2},\dots ,Y_{i-1})$.
+In contrast to mutual information, directed information is not symmetric. The quantity $I(X^{n}\to Y^{n})$ measures the bits of information that are transmitted causally from $X^{n}$ to $Y^{n}$. Directed information has many applications in problems where causality plays an important role, such as the capacity of channels with feedback, the capacity of discrete memoryless networks with feedback, gambling with causal side information, compression with causal side information, real-time control communication settings, and statistical physics.
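+
+To see how the definition differs from ordinary mutual information, it may help to write out the case $n=2$ (a sketch that relies only on the standard chain rule for mutual information):
+
+$$
+\begin{aligned}
+I(X^{2}\to Y^{2})&=I(X_{1};Y_{1})+I(X_{1},X_{2};Y_{2}|Y_{1}),\\
+I(X^{2};Y^{2})&=I(X_{1},X_{2};Y_{1})+I(X_{1},X_{2};Y_{2}|Y_{1}),\\
+I(X^{2};Y^{2})-I(X^{2}\to Y^{2})&=I(X_{2};Y_{1}|X_{1})\geq 0.
+\end{aligned}
+$$
+
+The two quantities differ only in the first term: directed information excludes any dependence between the earlier output $Y_{1}$ and the later input $X_{2}$, which is the kind of dependence that feedback can create.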
+
+=== Other quantities ===
+Other important information theoretic quantities include the Rényi entropy and the Tsallis entropy (generalizations of the concept of entropy), differential entropy (a generalization of quantities of information to continuous distributions), and the conditional mutual information. Also, pragmatic information has been proposed as a measure of how much information has been used in making a decision.
+
+== Coding theory ==
+
+Coding theory is one of the most important and direct applications of information theory. It can be subdivided into source coding theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.
+
+Data compression (source coding): There are two formulations for the compression problem:
+Lossless data compression: the data must be reconstructed exactly;
+Lossy data compression: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function. This subset of information theory is called rate–distortion theory.
+Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error-correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.
+This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more general networks, compression followed by transmission may no longer be optimal. For general sources and channels that are not necessarily stationary or ergodic, information-spectrum methods characterize coding limits using asymptotic distributions of information density rather than only single-letter entropies or mutual information. A related problem, channel resolvability, asks what rate is required for channel inputs to approximate a target output distribution; Han and Sergio Verdú connected this approximation problem to coding theorems for general channels.
+
+Hayashi later derived general nonasymptotic and asymptotic formulas connecting channel resolvability and identification capacity, and applied these formulas to secrecy analysis for the wiretap channel.
+
+=== Source theory ===
+Any process that generates successive messages can be considered a source of information. A memoryless source is one in which each message is an independent identically distributed random variable, whereas the properties of ergodicity and stationarity impose less restrictive constraints. All such sources are stochastic. These terms are well studied in their own right outside information theory.
+
+==== Rate ====
+Information rate is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is:
+
+$$r=\lim _{n\to \infty }H(X_{n}|X_{n-1},X_{n-2},X_{n-3},\ldots );$$
+
+that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the average rate is:
+
+$$r=\lim _{n\to \infty }{\frac {1}{n}}H(X_{1},X_{2},\dots ,X_{n});$$
+
+that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.
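+
+For a memoryless source the two limits can be evaluated by hand. As an illustrative assumption (not from the text), take independent symbols with probabilities $(\tfrac12,\tfrac14,\tfrac14)$ and logarithms to base 2:
+
+$$
+\begin{aligned}
+H(X_{n}|X_{n-1},X_{n-2},\ldots )&=H(X_{n})=\tfrac12\log 2+\tfrac14\log 4+\tfrac14\log 4=1.5{\text{ bits}},\\
+\tfrac{1}{n}H(X_{1},X_{2},\dots ,X_{n})&=\tfrac{1}{n}\cdot n\cdot 1.5=1.5{\text{ bits}}.
+\end{aligned}
+$$
+
+Both expressions therefore give a rate of 1.5 bits per symbol for every $n$, consistent with the statement that they agree for stationary sources.
+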
+The information rate is defined as:  
+
+$$r=\lim _{n\to \infty }{\frac {1}{n}}I(X_{1},X_{2},\dots ,X_{n};Y_{1},Y_{2},\dots ,Y_{n}).$$
+
+It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of source coding.
+
+=== Channel capacity ===
+
+Communication over a channel is the primary motivation of information theory. However, channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.
+Consider the communications process over a discrete channel. A simple model of the process is shown below:
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-5.md b/data/en.wikipedia.org/wiki/Information_theory-5.md
new file mode 100644
index 000000000..a554b4e0f
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-5.md
@@ -0,0 +1,501 @@
+---
+title: "Information theory"
+chunk: 6/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+$${\xrightarrow[{\text{Message}}]{W}}{\begin{array}{|c| }\hline {\text{Encoder}}\\f_{n}\\\hline \end{array}}{\xrightarrow[{\mathrm {Encoded \atop sequence} }]{X^{n}}}{\begin{array}{|c| }\hline {\text{Channel}}\\p(y|x)\\\hline \end{array}}{\xrightarrow[{\mathrm {Received \atop sequence} }]{Y^{n}}}{\begin{array}{|c| }\hline {\text{Decoder}}\\g_{n}\\\hline \end{array}}{\xrightarrow[{\mathrm {Estimated \atop message} }]{\hat {W}}}$$
+
+Here $X$ represents the space of messages transmitted, and $Y$ the space of messages received during a unit time over our channel. Let $p(y|x)$ be the conditional probability distribution function of $Y$ given $X$. We will consider $p(y|x)$ to be an inherent fixed property of our communications channel (representing the nature of the noise of our channel). Then the joint distribution of $X$ and $Y$ is completely determined by our channel and by our choice of $f(x)$, the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the signal, we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the channel capacity and is given by:
+
+$$C=\max _{f}I(X;Y).$$
+
+This capacity has the following property related to communicating at information rate R (where R is usually bits per symbol). For any information rate R < C and coding error ε > 0, for large enough N, there exists a code of length N and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate R > C, it is impossible to transmit with arbitrarily small block error.
+Channel coding is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.
+
+==== Capacity of particular channel models ====
+A continuous-time analog communications channel subject to Gaussian noise—see Shannon–Hartley theorem.
+A binary symmetric channel (BSC) with crossover probability p is a binary input, binary output channel that flips the input bit with probability p. The BSC has a capacity of $1-H_{b}(p)$ bits per channel use, where $H_{b}$ is the binary entropy function with the logarithm taken to base 2.
+
+A binary erasure channel (BEC) with erasure probability p is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is 1 − p bits per channel use.
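+
+As a numerical illustration of the two capacity formulas above, take $p=0.1$ (a value chosen here only as an example; the subscripted symbols $C_{\mathrm {BSC} }$ and $C_{\mathrm {BEC} }$ are notation introduced for this example). The binary entropy function is $H_{b}(p)=-p\log _{2}p-(1-p)\log _{2}(1-p)$, so:
+
+$$
+\begin{aligned}
+H_{b}(0.1)&=-0.1\log _{2}0.1-0.9\log _{2}0.9\approx 0.332+0.137=0.469,\\
+C_{\mathrm {BSC} }&=1-H_{b}(0.1)\approx 0.531{\text{ bits per channel use}},\\
+C_{\mathrm {BEC} }&=1-0.1=0.9{\text{ bits per channel use}}.
+\end{aligned}
+$$
+
+For the same value of $p$, erasures cost less capacity than bit flips, since the receiver knows which positions were lost.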
+
+==== Channels with memory and directed information ====
+In practice many channels have memory. Namely, at time $i$ the channel is given by the conditional probability $P(y_{i}|x_{i},x_{i-1},x_{i-2},\dots ,x_{1},y_{i-1},y_{i-2},\dots ,y_{1})$.
+It is often more convenient to use the notation $x^{i}=(x_{i},x_{i-1},x_{i-2},...,x_{1})$, with which the channel becomes $P(y_{i}|x^{i},y^{i-1})$.
+In such a case the capacity is given by the mutual information rate when no feedback is available, and by the directed information rate whether or not feedback is available (without feedback, the directed information equals the mutual information).
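+
+For reference, the two quantities can be contrasted for blocks of length n. The definitions below follow the usual convention for directed information (due to Massey) and are given only as a sketch; X^n and Y^n denote the input and output sequences, and the corresponding rates are obtained by dividing by n and letting n grow:
+
+```latex
+\begin{aligned}
+I(X^{n};Y^{n}) &= \sum_{i=1}^{n} I(X^{n};Y_{i}\mid Y^{i-1}),\\
+I(X^{n}\to Y^{n}) &= \sum_{i=1}^{n} I(X^{i};Y_{i}\mid Y^{i-1}).
+\end{aligned}
+```
+
+The only difference is that each term of the directed information uses the inputs up to time i rather than the whole input block.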
+
+=== Fungible information ===
+Fungible information is information for which the means of encoding is not important. Classical information theorists and computer scientists are mainly concerned with information of this sort. It is sometimes referred to as speakable information.
+
+== Applications to other fields ==
+
+=== Network physiology ===
+Information theory concepts, methods and approaches have broad applications in network physiology, a field which provides a quantitative framework, based on adaptive networks of dynamical systems, to investigate how physiological systems exchange, process, and integrate information as a network to (i) coordinate their functions across levels and scales (from sub-cellular to organs and organism level) and (ii) generate distinct physiological states in health and disease. Through measures such as mutual information, transfer entropy, and co-information, information theory enables the detection of coupling strength, directionality, synergy/redundancy and higher-order interactions among physiological systems and sub-systems, revealing how network cross-communication and regulation occur within the organism. Applications of information-theoretic approaches include analyzing information transfer between brain and body networks during various states, cardio-respiratory interactions, cardio-muscular interactions, cortico-muscular interactions, brain wave interactions and brain functional networks, and network physiology in extreme environments.
+
+=== Intelligence uses and secrecy applications ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Information_theory-6.md b/data/en.wikipedia.org/wiki/Information_theory-6.md
new file mode 100644
index 000000000..c5b7239de
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Information_theory-6.md
@@ -0,0 +1,64 @@
+---
+title: "Information theory"
+chunk: 7/7
+source: "https://en.wikipedia.org/wiki/Information_theory"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:37.735412+00:00"
+instance: "kb-cron"
+---
+
+Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the ban, was used in the Ultra project, breaking the German Enigma machine code and hastening the end of World War II in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give a minimum amount of ciphertext necessary to ensure unique decipherability.
+Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on asymmetric key algorithms or on most commonly used methods of symmetric key algorithms (sometimes called secret key algorithms), such as block ciphers. The security of all such methods comes from the assumption that no known attack can break them in a practical amount of time.
+Information theoretic security refers to methods such as the one-time pad that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the key) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the Venona project was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.
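+
+The two mutual-information statements can be checked directly for the one-time pad. As a sketch, assume an n-bit message M, a key K drawn uniformly from {0,1}^n independently of M, and ciphertext C = M ⊕ K (these symbols are introduced here for illustration only):
+
+```latex
+\begin{aligned}
+P(C=c\mid M=m) &= P(K=c\oplus m) = 2^{-n} \quad\Rightarrow\quad I(M;C)=0,\\
+I(M;C\mid K) &= H(M\mid K)-H(M\mid C,K) = H(M)-0 = H(M).
+\end{aligned}
+```
+
+The ciphertext alone is independent of the message, while ciphertext and key together determine it, since M = C ⊕ K. Reusing K for a second message reveals M1 ⊕ M2 = C1 ⊕ C2, so the first line no longer holds for the pair of messages.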
+
+=== Pseudorandom number generation ===
+
+Pseudorandom number generators are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed cryptographically secure pseudorandom number generators, but even they require random seeds external to the software to work as intended. These can be obtained via extractors, if done carefully. The measure of sufficient randomness in extractors is min-entropy, a value related to Shannon entropy through Rényi entropy; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor, and hence for cryptographic use.
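+
+The gap between the two measures can be made concrete with a small constructed example (the distribution below is hypothetical, chosen only for illustration). Let X take one fixed value with probability 1/2 and otherwise be uniform over 2^254 further values, and use the standard definition of min-entropy, H∞(X) = −log2 max P(X = x):
+
+```latex
+\begin{aligned}
+H(X) &= \tfrac{1}{2}\log_2 2 + 2^{254}\cdot 2^{-255}\log_2 2^{255} = \tfrac{1}{2} + \tfrac{255}{2} = 128 \text{ bits},\\
+H_{\infty}(X) &= -\log_2 \tfrac{1}{2} = 1 \text{ bit}.
+\end{aligned}
+```
+
+Despite 128 bits of Shannon entropy, an adversary who guesses the most likely value is right half the time on the first try.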
+
+=== Seismic exploration ===
+One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and digital signal processing offer a major improvement of resolution and image clarity over previous analog methods.
+
+=== Semiotics ===
+Semioticians Doede Nauta and Winfried Nöth both considered Charles Sanders Peirce as having created a theory of information in his works on semiotics. Nauta defined semiotic information theory as the study of "the internal processes of coding, filtering, and information processing."
+Concepts from information theory such as redundancy and code control have been used by semioticians such as Umberto Eco and Ferruccio Rossi-Landi to explain ideology as a form of message transmission whereby a dominant social class emits its message by using signs that exhibit a high degree of redundancy such that only one message is decoded among a selection of competing ones.
+
+=== Integrated process organization of neural information ===
+Quantitative information theoretic methods have been applied in cognitive science to analyze the integrated process organization of neural information in the context of the binding problem in cognitive neuroscience. In this context, either an information-theoretical measure, such as functional clusters (Gerald Edelman and Giulio Tononi's functional clustering model and dynamic core hypothesis (DCH)) or effective information (Tononi's integrated information theory (IIT) of consciousness), is defined (on the basis of a reentrant process organization, i.e. the synchronization of neurophysiological activity between groups of neuronal populations), or the measure of the minimization of free energy on the basis of statistical methods (Karl J. Friston's free energy principle (FEP), an information-theoretical measure which states that every adaptive change in a self-organized system leads to a minimization of free energy, and the Bayesian brain hypothesis).
+
+=== Miscellaneous applications ===
+Information theory also has applications in the search for extraterrestrial intelligence, black holes, bioinformatics, and gambling.
+
+== See also ==
+
+=== History ===
+Hartley, R.V.L.
+History of information theory
+Shannon, C.E.
+Timeline of information theory
+Yockey, H.P.
+Andrey Kolmogorov
+
+== External links ==
+
+"Information", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
+Lambert F. L. (1999), "Shuffled Cards, Messy Desks, and Disorderly Dorm Rooms - Examples of Entropy Increase? Nonsense!", Journal of Chemical Education
+IEEE Information Theory Society and ITSOC Monographs, Surveys, and Reviews
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-0.md b/data/en.wikipedia.org/wiki/Logic-0.md
new file mode 100644
index 000000000..4cfe501f8
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-0.md
@@ -0,0 +1,40 @@
+---
+title: "Logic"
+chunk: 1/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the study of deductively valid inferences or logical truths. It examines how conclusions follow from premises based on the structure of arguments alone, independent of their topic and content. Informal logic is associated with informal fallacies, critical thinking, and argumentation theory. Informal logic examines arguments expressed in natural language whereas formal logic uses formal language. When used as a countable noun, the term "a logic" refers to a specific logical formal system that articulates a proof system. Logic plays a central role in many fields, such as philosophy, mathematics, computer science, and linguistics.
+Logic studies arguments, which consist of a set of premises that leads to a conclusion. An example is the argument from the premises "it's Sunday" and "if it's Sunday then I don't have to work" leading to the conclusion "I don't have to work." Premises and conclusions express propositions or claims that can be true or false. An important feature of propositions is their internal structure. For example, complex propositions are made up of simpler propositions linked by logical vocabulary like $\land$ (and) or $\to$ (if...then). Simple propositions also have parts, like "Sunday" or "work" in the example. The truth of a proposition usually depends on the meanings of all of its parts. However, this is not the case for logically true propositions. They are true only because of their logical structure independent of the specific meanings of the individual parts.
+Arguments can be either correct or incorrect. An argument is correct if its premises support its conclusion. Deductive arguments have the strongest form of support: if their premises are true then their conclusion must also be true. This is not the case for ampliative arguments, which arrive at genuinely new information not found in the premises. Many arguments in everyday discourse and the sciences are ampliative arguments. They are divided into inductive and abductive arguments. Inductive arguments are statistical generalizations, such as inferring that all ravens are black based on many individual observations of black ravens. Abductive arguments are inferences to the best explanation, for example, when a doctor concludes that a patient has a certain disease which explains the symptoms they suffer. Arguments that fall short of the standards of correct reasoning often embody fallacies. Systems of logic are theoretical frameworks for assessing the correctness of arguments.
+Logic has been studied since antiquity. Early approaches include Aristotelian logic, Stoic logic, Nyaya, and Mohism. Aristotelian logic focuses on reasoning in the form of syllogisms. It was considered the main system of logic in the Western world until it was replaced by modern formal logic, which has its roots in the work of late 19th-century mathematicians such as Gottlob Frege. Today, the most commonly used system is classical logic. It consists of propositional logic and first-order logic. Propositional logic only considers logical relations between full propositions. First-order logic also takes the internal parts of propositions into account, like predicates and quantifiers. Extended logics accept the basic intuitions behind classical logic and apply it to other fields, such as metaphysics, ethics, and epistemology, as frameworks for reasoning about what is possible or necessary, what is or should be, and what is believed or known. Deviant logics, on the other hand, reject certain classical intuitions and provide alternative explanations of the basic laws of logic.
+
+== Definition ==
+The word "logic" originates from the Greek word logos, which has a variety of translations, such as reason, discourse, or language. Logic is traditionally defined as the study of the laws of thought or correct reasoning, and is usually understood in terms of inferences or arguments. Reasoning is the activity of drawing inferences. Arguments are the outward expression of inferences. An argument is a set of premises together with a conclusion. Logic is interested in whether arguments are correct, i.e. whether their premises support the conclusion. These general characterizations apply to logic in the widest sense, i.e., to both formal and informal logic since they are both concerned with assessing the correctness of arguments. Formal logic is the traditionally dominant field, and some logicians restrict logic to formal logic.
+
+=== Formal logic ===
+
+Formal logic (also known as symbolic logic) is widely used in mathematical logic. It uses a formal approach to study reasoning: it replaces concrete expressions with abstract symbols to examine the logical form of arguments independent of their concrete content. In this sense, it is topic-neutral since it is only concerned with the abstract structure of arguments and not with their concrete content.
+Formal logic is interested in deductively valid arguments, for which the truth of their premises ensures the truth of their conclusion. This means that it is impossible for the premises to be true and the conclusion to be false. For valid arguments, the logical structure that leads from the premises to the conclusion follows a pattern called a rule of inference. For example, modus ponens is a rule of inference according to which all arguments of the form "(1) p, (2) if p then q, (3) therefore q" are valid, independent of what the terms p and q stand for. In this sense, formal logic can be defined as the science of valid inferences. An alternative definition sees logic as the study of logical truths. A proposition is logically true if its truth depends only on the logical vocabulary used in it. This means that it is true in all possible worlds and under all interpretations of its non-logical terms, like the claim "either it is raining, or it is not". These two definitions of formal logic are not identical, but they are closely related. For example, if the inference from p to q is deductively valid then the claim "if p then q" is a logical truth.
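+
+The link between the two definitions can be illustrated with modus ponens. As a sketch using the classical truth-functional reading of "if...then", the table below evaluates the claim (p ∧ (p → q)) → q, which corresponds to the inference from p and "if p then q" to q, under every assignment (T and F abbreviate true and false):
+
+```latex
+\begin{array}{cc|c|c|c}
+p & q & p \to q & p \land (p \to q) & (p \land (p \to q)) \to q \\
+\hline
+T & T & T & T & T \\
+T & F & F & F & T \\
+F & T & T & F & T \\
+F & F & T & F & T
+\end{array}
+```
+
+The last column is T in every row, so the claim is a logical truth, matching the validity of the rule.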
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-1.md b/data/en.wikipedia.org/wiki/Logic-1.md
new file mode 100644
index 000000000..11d7d7e5d
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-1.md
@@ -0,0 +1,106 @@
+---
+title: "Logic"
+chunk: 2/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Formal logic uses formal languages to express, analyze, and clarify arguments. They normally have a very limited vocabulary and exact syntactic rules. These rules specify how their symbols can be combined to construct sentences, so-called well-formed formulas. This simplicity and exactness of formal logic make it capable of formulating precise rules of inference. They determine whether a given argument is valid. Because of the reliance on formal language, natural language arguments cannot be studied directly. Instead, they need to be translated into formal language before their validity can be assessed.
+The term "logic" can also be used in a slightly different sense as a countable noun. In this sense, a logic is a logical formal system. Distinct logics differ from each other concerning the rules of inference they accept as valid and the formal languages used to express them. Starting in the late 19th century, many new formal systems have been proposed. There are disagreements about what makes a formal system a logic. For example, it has been suggested that only logically complete systems, like first-order logic, qualify as logics. For such reasons, some theorists deny that higher-order logics are logics in the strict sense.
+
+=== Informal logic ===
+
+When understood in a wide sense, logic encompasses both formal and informal logic. Informal logic uses non-formal criteria and standards to analyze and assess the correctness of arguments. Its main focus is on everyday discourse. Its development was prompted by difficulties in applying the insights of formal logic to natural language arguments. In this regard, it considers problems that formal logic on its own is unable to address. Both provide criteria for assessing the correctness of arguments and distinguishing them from fallacies.
+Many characterizations of informal logic have been suggested but there is no general agreement on its precise definition. The most literal approach sees the terms "formal" and "informal" as applying to the language used to express arguments. On this view, informal logic studies arguments that are in informal or natural language. Formal logic can only examine them indirectly by translating them first into a formal language while informal logic investigates them in their original form. On this view, the argument "Birds fly. Tweety is a bird. Therefore, Tweety flies." belongs to natural language and is examined by informal logic. But the formal translation "(1) $\forall x(\mathrm{Bird}(x)\to \mathrm{Flies}(x))$; (2) $\mathrm{Bird}(\mathrm{Tweety})$; (3) $\mathrm{Flies}(\mathrm{Tweety})$" is studied by formal logic. The study of natural language arguments comes with various difficulties. For example, natural language expressions are often ambiguous, vague, and context-dependent. Another approach defines informal logic in a wide sense as the normative study of the standards, criteria, and procedures of argumentation. In this sense, it includes questions about the role of rationality, critical thinking, and the psychology of argumentation.
+Another characterization identifies informal logic with the study of non-deductive arguments. In this way, it contrasts with deductive reasoning examined by formal logic. Non-deductive arguments make their conclusion probable but do not ensure that it is true. An example is the inductive argument from the empirical observation that "all ravens I have seen so far are black" to the conclusion "all ravens are black".
+A further approach is to define informal logic as the study of informal fallacies. Informal fallacies are incorrect arguments in which errors are present in the content and the context of the argument. A false dilemma, for example, involves an error of content by excluding viable options. This is the case in the fallacy "you are either with us or against us; you are not with us; therefore, you are against us". Some theorists state that formal logic studies the general form of arguments while informal logic studies particular instances of arguments. Another approach is to hold that formal logic only considers the role of logical constants for correct inferences while informal logic also takes the meaning of substantive concepts into account. Further approaches focus on the discussion of logical topics with or without formal devices and on the role of epistemology for the assessment of arguments.
+
+== Basic concepts ==
+
+=== Premises, conclusions, and truth ===
+
+==== Premises and conclusions ====
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-10.md b/data/en.wikipedia.org/wiki/Logic-10.md
new file mode 100644
index 000000000..36c3adc49
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-10.md
@@ -0,0 +1,28 @@
+---
+title: "Logic"
+chunk: 11/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Logic was developed independently in several cultures during antiquity. One major early contributor was Aristotle, who developed term logic in his Organon and Prior Analytics. He was responsible for the introduction of the hypothetical syllogism and temporal modal logic. Further innovations include inductive logic as well as the discussion of new logical concepts such as terms, predicables, syllogisms, and propositions. Aristotelian logic was highly regarded in classical and medieval times, both in Europe and the Middle East. It remained in wide use in the West until the early 19th century. It has now been superseded by later work, though many of its key insights are still present in modern systems of logic.
+Ibn Sina (Avicenna) was the founder of Avicennian logic, which replaced Aristotelian logic as the dominant system of logic in the Islamic world. It influenced Western medieval writers such as Albertus Magnus and William of Ockham. Ibn Sina wrote on the hypothetical syllogism and on the propositional calculus. He developed an original "temporally modalized" syllogistic theory, involving temporal logic and modal logic. He also made use of inductive logic, such as his methods of agreement, difference, and concomitant variation, which are critical to the scientific method. Fakhr al-Din al-Razi was another influential Muslim logician. He criticized Aristotelian syllogistics and formulated an early system of inductive logic, foreshadowing the system of inductive logic developed by John Stuart Mill.
+During the Middle Ages, many translations and interpretations of Aristotelian logic were made. The works of Boethius were particularly influential. Besides translating Aristotle's work into Latin, he also produced textbooks on logic. Later, the works of Islamic philosophers such as Ibn Sina and Ibn Rushd (Averroes) were drawn on. This expanded the range of ancient works available to medieval Christian scholars, since Muslim scholars had access to more Greek works than had been preserved in Latin commentaries. In 1323, William of Ockham's influential Summa Logicae was released. It is a comprehensive treatise on logic that discusses many basic concepts of logic and provides a systematic exposition of types of propositions and their truth conditions.
+In Chinese philosophy, the School of Names and Mohism were particularly influential. The School of Names focused on the use of language and on paradoxes. For example, Gongsun Long proposed the white horse paradox, which defends the thesis that a white horse is not a horse. The school of Mohism also acknowledged the importance of language for logic and tried to relate the ideas in these fields to the realm of ethics.
+In India, the study of logic was primarily pursued by the schools of Nyaya, Buddhism, and Jainism. It was not treated as a separate academic discipline and discussions of its topics usually happened in the context of epistemology and theories of dialogue or argumentation. In Nyaya, inference is understood as a source of knowledge (pramāṇa). It follows the perception of an object and tries to arrive at conclusions, for example, about the cause of this object. A similar emphasis on the relation to epistemology is also found in Buddhist and Jainist schools of logic, where inference is used to expand the knowledge gained through other sources. Some of the later theories of Nyaya, belonging to the Navya-Nyāya school, resemble modern forms of logic, such as Gottlob Frege's distinction between sense and reference and his definition of number.
+The syllogistic logic developed by Aristotle predominated in the West until the mid-19th century, when interest in the foundations of mathematics stimulated the development of modern symbolic logic. Many see Gottlob Frege's Begriffsschrift as the birthplace of modern logic. Gottfried Wilhelm Leibniz's idea of a universal formal language is often considered a forerunner. Other pioneers were George Boole, who invented Boolean algebra as a mathematical system of logic, and Charles Peirce, who developed the logic of relatives. Alfred North Whitehead and Bertrand Russell, in turn, condensed many of these insights in their work Principia Mathematica. Modern logic introduced novel concepts, such as functions, quantifiers, and relational predicates. A hallmark of modern symbolic logic is its use of formal language to precisely codify its insights. In this regard, it departs from earlier logicians, who relied mainly on natural language. Of particular influence was the development of first-order logic, which is usually treated as the standard system of modern logic. Its analytical generality allowed the formalization of mathematics and drove the investigation of set theory. It also made Alfred Tarski's approach to model theory possible and provided the foundation of modern mathematical logic.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-2.md b/data/en.wikipedia.org/wiki/Logic-2.md
new file mode 100644
index 000000000..870d1b05b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-2.md
@@ -0,0 +1,147 @@
+---
+title: "Logic"
+chunk: 3/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Premises and conclusions are the basic parts of inferences or arguments and therefore play a central role in logic. In the case of a valid inference or a correct argument, the conclusion follows from the premises, or in other words, the premises support the conclusion. For instance, the premises "Mars is red" and "Mars is a planet" support the conclusion "Mars is a red planet". For most types of logic, it is accepted that premises and conclusions have to be truth-bearers. This means that they have a truth value: they are either true or false. Contemporary philosophy generally sees them either as propositions or as sentences. Propositions are the denotations of sentences and are usually seen as abstract objects. For example, the English sentence "the tree is green" is different from the German sentence "der Baum ist grün" but both express the same proposition.
+Propositional theories of premises and conclusions are often criticized because they rely on abstract objects. For instance, philosophical naturalists usually reject the existence of abstract objects. Other arguments concern the challenges involved in specifying the identity criteria of propositions. These objections are avoided by seeing premises and conclusions not as propositions but as sentences, i.e. as concrete linguistic objects like the symbols displayed on a page of a book. But this approach comes with new problems of its own: sentences are often context-dependent and ambiguous, meaning an argument's validity would not only depend on its parts but also on its context and on how it is interpreted. Another approach is to understand premises and conclusions in psychological terms as thoughts or judgments. This position is known as psychologism. It was discussed at length around the turn of the 20th century but it is not widely accepted today.
+
+==== Internal structure ====
+Premises and conclusions have an internal structure. As propositions or sentences, they can be either simple or complex. A complex proposition has other propositions as its constituents, which are linked to each other through propositional connectives like "and" or "if...then". Simple propositions, on the other hand, do not have propositional parts. But they can also be conceived as having an internal structure: they are made up of subpropositional parts, like singular terms and predicates. For example, the simple proposition "Mars is red" can be formed by applying the predicate "red" to the singular term "Mars". In contrast, the complex proposition "Mars is red and Venus is white" is made up of two simple propositions connected by the propositional connective "and".
+Whether a proposition is true depends, at least in part, on its constituents. For complex propositions formed using truth-functional propositional connectives, their truth only depends on the truth values of their parts. But this relation is more complicated in the case of simple propositions and their subpropositional parts. These subpropositional parts have meanings of their own, like referring to objects or classes of objects. Whether the simple proposition they form is true depends on their relation to reality, i.e. what the objects they refer to are like. This topic is studied by theories of reference.
+
+==== Logical truth ====
+
+Some complex propositions are true independently of the substantive meanings of their parts. In classical logic, for example, the complex proposition "either Mars is red or Mars is not red" is true independent of whether its parts, like the simple proposition "Mars is red", are true or false. In such cases, the truth is called a logical truth: a proposition is logically true if its truth depends only on the logical vocabulary used in it. This means that it is true under all interpretations of its non-logical terms. In some modal logics, this means that the proposition is true in all possible worlds. Some theorists define logic as the study of logical truths.
+
+==== Truth tables ====
+Truth tables can be used to show how logical connectives work or how the truth values of complex propositions depend on their parts. They have a column for each input variable. Each row corresponds to one possible combination of the truth values these variables can take; for truth tables presented in the English literature, the symbols "T" and "F" or "1" and "0" are commonly used as abbreviations for the truth values "true" and "false". The first columns present all the possible truth-value combinations for the input variables. Entries in the other columns present the truth values of the corresponding expressions as determined by the input values. For example, the expression "$p \land q$" uses the logical connective $\land$ (and). It could be used to express a sentence like "yesterday was Sunday and the weather was good". It is only true if both of its input variables, $p$ ("yesterday was Sunday") and $q$ ("the weather was good"), are true. In all other cases, the expression as a whole is false. Other important logical connectives are $\lnot$ (not), $\lor$ (or), $\to$ (if...then), and $\uparrow$ (Sheffer stroke). Given the conditional proposition $p \to q$, one can form truth tables of its converse $q \to p$, its inverse ($\lnot p \to \lnot q$), and its contrapositive ($\lnot q \to \lnot p$). Truth tables can also be defined for more complex expressions that use several propositional connectives.
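+
+As a rough illustration of the paragraph above, the following Lean 4 sketch (core library only, no Mathlib assumed) checks the four rows of the truth table for conjunction by computation on `Bool`, then relates a conditional to its contrapositive and converse. The hypothesis names are illustrative choices, not part of the source text.
+
+```lean
+-- Truth table for conjunction: each line checks one row by evaluation.
+example : (true  && true)  = true  := rfl
+example : (true  && false) = false := rfl
+example : (false && true)  = false := rfl
+example : (false && false) = false := rfl
+
+-- A conditional always yields its contrapositive.
+example (p q : Prop) (h : p → q) : ¬q → ¬p :=
+  fun hnq hp => hnq (h hp)
+
+-- A conditional does not in general yield its converse:
+-- take p := False and q := True as a counterexample.
+example : ¬ (∀ p q : Prop, (p → q) → (q → p)) :=
+  fun h => h False True (fun _ => True.intro) True.intro
+```
+
+The last example mirrors what a truth table shows row by row: the row where $p$ is false and $q$ is true makes $p \to q$ true but $q \to p$ false.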
+
+=== Arguments and inferences ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-3.md b/data/en.wikipedia.org/wiki/Logic-3.md
new file mode 100644
index 000000000..cab01b05b
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-3.md
@@ -0,0 +1,48 @@
+---
+title: "Logic"
+chunk: 4/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Logic is commonly defined in terms of arguments or inferences as the study of their correctness. An argument is a set of premises together with a conclusion. An inference is the process of reasoning from these premises to the conclusion. But these terms are often used interchangeably in logic. Arguments are correct or incorrect depending on whether their premises support their conclusion. Premises and conclusions, on the other hand, are true or false depending on whether they are in accord with reality. In formal logic, a sound argument is an argument that is both correct and has only true premises. Sometimes a distinction is made between simple and complex arguments. A complex argument is made up of a chain of simple arguments. This means that the conclusion of one argument acts as a premise of later arguments. For a complex argument to be successful, each link of the chain has to be successful.
+
+Arguments and inferences are either correct or incorrect. If they are correct then their premises support their conclusion. In the incorrect case, this support is missing. It can take different forms corresponding to the different types of reasoning. The strongest form of support corresponds to deductive reasoning. But even arguments that are not deductively valid may still be good arguments because their premises offer non-deductive support to their conclusions. For such cases, the term ampliative or inductive reasoning is used. Deductive arguments are associated with formal logic in contrast to the relation between ampliative arguments and informal logic.
+
+==== Deductive ====
+A deductively valid argument is one whose premises guarantee the truth of its conclusion. For instance, the argument "(1) all frogs are amphibians; (2) no cats are amphibians; (3) therefore no cats are frogs" is deductively valid. For deductive validity, it does not matter whether the premises or the conclusion are actually true. So the argument "(1) all frogs are mammals; (2) no cats are mammals; (3) therefore no cats are frogs" is also valid because the conclusion follows necessarily from the premises.
+According to an influential view by Alfred Tarski, deductive arguments have three essential features: (1) they are formal, i.e. they depend only on the form of the premises and the conclusion; (2) they are a priori, i.e. no sense experience is needed to determine whether they obtain; (3) they are modal, i.e. they hold by logical necessity for the given propositions, independent of any other circumstances.
+Because of the first feature, the focus on formality, deductive inference is usually identified with rules of inference. Rules of inference specify the form of the premises and the conclusion: how they have to be structured for the inference to be valid. Arguments that do not follow any rule of inference are deductively invalid. The modus ponens is a prominent rule of inference. It has the form "p; if p, then q; therefore q". Knowing that it has just rained ($p$) and that after rain the streets are wet ($p \to q$), one can use modus ponens to deduce that the streets are wet ($q$).
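+
+Modus ponens can be sketched in Lean 4 as plain function application, under the usual propositions-as-types reading (an assumption of this example, not a claim from the source). The names `rained` and `wetIfRained` are hypothetical labels for the two premises.
+
+```lean
+-- p: "it has just rained"; q: "the streets are wet".
+-- Applying the proof of p → q to the proof of p yields a proof of q.
+example (p q : Prop) (rained : p) (wetIfRained : p → q) : q :=
+  wetIfRained rained
+```
+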
+The third feature can be expressed by stating that deductively valid inferences are truth-preserving: it is impossible for the premises to be true and the conclusion to be false. Because of this feature, it is often asserted that deductive inferences are uninformative since the conclusion cannot arrive at new information not already present in the premises. But this point is not always accepted since it would mean, for example, that most of mathematics is uninformative. A different characterization distinguishes between surface and depth information. The surface information of a sentence is the information it presents explicitly. Depth information is the totality of the information contained in the sentence, both explicitly and implicitly. According to this view, deductive inferences are uninformative on the depth level. But they can be highly informative on the surface level by making implicit information explicit. This happens, for example, in mathematical proofs.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-4.md b/data/en.wikipedia.org/wiki/Logic-4.md
new file mode 100644
index 000000000..a59697ab7
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-4.md
@@ -0,0 +1,19 @@
+---
+title: "Logic"
+chunk: 5/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+==== Ampliative ====
+Ampliative arguments are arguments whose conclusions contain additional information not found in their premises. In this regard, they are more interesting since they contain information on the depth level and the thinker may learn something genuinely new. But this feature comes with a certain cost: the premises support the conclusion in the sense that they make its truth more likely but they do not ensure its truth. This means that the conclusion of an ampliative argument may be false even though all its premises are true. This characteristic is closely related to non-monotonicity and defeasibility: it may be necessary to retract an earlier conclusion upon receiving new information or in light of new inferences drawn. Ampliative reasoning plays a central role in many arguments found in everyday discourse and the sciences. Ampliative arguments are not automatically incorrect. Instead, they just follow different standards of correctness. The support they provide for their conclusion usually comes in degrees. This means that strong ampliative arguments make their conclusion very likely while weak ones are less certain. As a consequence, the line between correct and incorrect arguments is blurry in some cases, such as when the premises offer weak but non-negligible support. This contrasts with deductive arguments, which are either valid or invalid with nothing in-between.
+The terminology used to categorize ampliative arguments is inconsistent. Some authors, like James Hawthorne, use the term "induction" to cover all forms of non-deductive arguments. But in a more narrow sense, induction is only one type of ampliative argument alongside abductive arguments. Some philosophers, like Leo Groarke, also allow conductive arguments as another type. In this narrow sense, induction is often defined as a form of statistical generalization. In this case, the premises of an inductive argument are many individual observations that all show a certain pattern. The conclusion then is a general law that this pattern always obtains. In this sense, one may infer that "all elephants are gray" based on one's past observations of the color of elephants. A closely related form of inductive inference has as its conclusion not a general law but one more specific instance, as when it is inferred that an elephant one has not seen yet is also gray. Some theorists, like Igor Douven, stipulate that inductive inferences rest only on statistical considerations. This way, they can be distinguished from abductive inference.
+Abductive inference may or may not take statistical observations into consideration. In either case, the premises offer support for the conclusion because the conclusion is the best explanation of why the premises are true. In this sense, abduction is also called the inference to the best explanation. For example, given the premise that there is a plate with breadcrumbs in the kitchen in the early morning, one may infer the conclusion that one's house-mate had a midnight snack and was too tired to clean the table. This conclusion is justified because it is the best explanation of the current state of the kitchen. For abduction, it is not sufficient that the conclusion explains the premises. For example, the conclusion that a burglar broke into the house last night, got hungry on the job, and had a midnight snack, would also explain the state of the kitchen. But this conclusion is not justified because it is not the best or most likely explanation.
+
+=== Fallacies ===
+Not all arguments live up to the standards of correct reasoning. When they do not, they are usually referred to as fallacies. Their central aspect is not that their conclusion is false but that there is some flaw with the reasoning leading to this conclusion. So the argument "it is sunny today; therefore spiders have eight legs" is fallacious even though the conclusion is true. Some theorists, like John Stuart Mill, give a more restrictive definition of fallacies by additionally requiring that they appear to be correct. This way, genuine fallacies can be distinguished from mere mistakes of reasoning due to carelessness. This explains why people tend to commit fallacies: because they have an alluring element that seduces people into committing and accepting them. However, this reference to appearances is controversial because it belongs to the field of psychology, not logic, and because appearances may be different for different people.
+
+Fallacies are usually divided into formal and informal fallacies. For formal fallacies, the source of the error is found in the form of the argument. For example, denying the antecedent is one type of formal fallacy, as in "if Othello is a bachelor, then he is male; Othello is not a bachelor; therefore Othello is not male". But most fallacies fall into the category of informal fallacies, of which a great variety is discussed in the academic literature. The source of their error is usually found in the content or the context of the argument. Informal fallacies are sometimes categorized as fallacies of ambiguity, fallacies of presumption, or fallacies of relevance. For fallacies of ambiguity, the ambiguity and vagueness of natural language are responsible for their flaw, as in "feathers are light; what is light cannot be dark; therefore feathers cannot be dark". Fallacies of presumption have a wrong or unjustified premise but may be valid otherwise. In the case of fallacies of relevance, the premises do not support the conclusion because they are not relevant to it.
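+
+The formal fallacy of denying the antecedent can be shown to be invalid by exhibiting a counterexample. The Lean 4 sketch below (core library only) picks a false antecedent and a true consequent; this particular choice of propositions is an illustrative assumption.
+
+```lean
+-- If "(p → q) and ¬p, therefore ¬q" were valid for all p and q,
+-- then with p := False and q := True we could derive ¬True, which is absurd.
+example : ¬ (∀ p q : Prop, (p → q) → ¬p → ¬q) :=
+  fun h => h False True (fun _ => True.intro) (fun f => f) True.intro
+```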
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-5.md b/data/en.wikipedia.org/wiki/Logic-5.md
new file mode 100644
index 000000000..ec4c355a7
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-5.md
@@ -0,0 +1,115 @@
+---
+title: "Logic"
+chunk: 6/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+=== Definitory and strategic rules ===
+The main focus of most logicians is to study the criteria according to which an argument is correct or incorrect. A fallacy is committed if these criteria are violated. In the case of formal logic, they are known as rules of inference. They are definitory rules, which determine whether an inference is correct or which inferences are allowed. Definitory rules contrast with strategic rules. Strategic rules specify which inferential moves are necessary to reach a given conclusion based on a set of premises. This distinction does not just apply to logic but also to games. In chess, for example, the definitory rules dictate that bishops may only move diagonally. The strategic rules, on the other hand, describe how the allowed moves may be used to win a game, for instance, by controlling the center and by defending one's king. It has been argued that logicians should give more emphasis to strategic rules since they are highly relevant for effective reasoning.
+
+=== Formal systems ===
+
+A formal system of logic consists of a formal language together with a set of axioms and a proof system used to draw inferences from these axioms. In logic, axioms are statements that are accepted without proof. They are used to justify other statements. Some theorists also include a semantics that specifies how the expressions of the formal language relate to real objects. Starting in the late 19th century, many new formal systems have been proposed.
+A formal language consists of an alphabet and syntactic rules. The alphabet is the set of basic symbols used in expressions. The syntactic rules determine how these symbols may be arranged to result in well-formed formulas. For instance, the syntactic rules of propositional logic determine that "$P \land Q$" is a well-formed formula but "$\land Q$" is not since the logical conjunction $\land$ requires terms on both sides.
+A proof system is a collection of rules to construct formal proofs. It is a tool to arrive at conclusions from a set of axioms. Rules in a proof system are defined in terms of the syntactic form of formulas independent of their specific content. For instance, the classical rule of conjunction introduction states that $P \land Q$ follows from the premises $P$ and $Q$. Such rules can be applied sequentially, giving a mechanical procedure for generating conclusions from premises. There are different types of proof systems including natural deduction and sequent calculi.
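+
+As a small sketch of how such rules are applied mechanically, Lean 4 exposes conjunction introduction as `And.intro` and the two elimination rules as `And.left` and `And.right`. The second example chains them to swap the conjuncts; the hypothesis names are illustrative.
+
+```lean
+-- Conjunction introduction: from P and Q, conclude P ∧ Q.
+example (P Q : Prop) (hP : P) (hQ : Q) : P ∧ Q :=
+  And.intro hP hQ
+
+-- Rules applied in sequence: eliminate, then reintroduce in the other order.
+example (P Q : Prop) (h : P ∧ Q) : Q ∧ P :=
+  And.intro h.right h.left
+```
+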
+A semantics is a system for mapping expressions of a formal language to their denotations. In many systems of logic, denotations are truth values. For instance, the semantics for classical propositional logic assigns the formula $P \land Q$ the denotation "true" whenever $P$ and $Q$ are true. From the semantic point of view, a premise entails a conclusion if the conclusion is true whenever the premise is true.
+A system of logic is sound when its proof system cannot derive a conclusion from a set of premises unless it is semantically entailed by them. In other words, its proof system cannot lead to false conclusions, as defined by the semantics. A system is complete when its proof system can derive every conclusion that is semantically entailed by its premises. In other words, its proof system can lead to any true conclusion, as defined by the semantics. Thus, soundness and completeness together describe a system whose notions of validity and entailment line up perfectly.
+
+== Systems of logic ==
+Systems of logic are theoretical frameworks for assessing the correctness of reasoning and arguments. For over two thousand years, Aristotelian logic was treated as the canon of logic in the Western world, but modern developments in this field have led to a vast proliferation of logical systems. One prominent categorization divides modern formal logical systems into classical logic, extended logics, and deviant logics.
+
+=== Aristotelian ===
+
+Aristotelian logic encompasses a great variety of topics. They include metaphysical theses about ontological categories and problems of scientific explanation. But in a more narrow sense, it is identical to term logic or syllogistics. A syllogism is a form of argument involving three propositions: two premises and a conclusion. Each proposition has three essential parts: a subject, a predicate, and a copula connecting the subject to the predicate. For example, the proposition "Socrates is wise" is made up of the subject "Socrates", the predicate "wise", and the copula "is". The subject and the predicate are the terms of the proposition. Aristotelian logic does not contain complex propositions made up of simple propositions. It differs in this aspect from propositional logic, in which any two propositions can be linked using a logical connective like "and" to form a new complex proposition.
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-6.md b/data/en.wikipedia.org/wiki/Logic-6.md
new file mode 100644
index 000000000..99ddb9d92
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-6.md
@@ -0,0 +1,146 @@
+---
+title: "Logic"
+chunk: 7/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+In Aristotelian logic, the subject can be universal, particular, indefinite, or singular. For example, the term "all humans" is a universal subject in the proposition "all humans are mortal". A similar proposition could be formed by replacing it with the particular term "some humans", the indefinite term "a human", or the singular term "Socrates".
+Aristotelian logic only includes predicates for simple properties of entities. But it lacks predicates corresponding to relations between entities. The predicate can be linked to the subject in two ways: either by affirming it or by denying it. For example, the proposition "Socrates is not a cat" involves the denial of the predicate "cat" to the subject "Socrates". Using combinations of subjects and predicates, a great variety of propositions and syllogisms can be formed. Syllogisms are characterized by the fact that the premises are linked to each other and to the conclusion by sharing one term in each case. Thus, these three propositions contain three terms, referred to as major term, minor term, and middle term. The central aspect of Aristotelian logic involves classifying all possible syllogisms into valid and invalid arguments according to how the propositions are formed. For example, the syllogism "all men are mortal; Socrates is a man; therefore Socrates is mortal" is valid. The syllogism "all cats are mortal; Socrates is mortal; therefore Socrates is a cat", on the other hand, is invalid.
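+
+The two syllogisms above can be checked in Lean 4 once they are rendered in predicate-logic notation. That rendering is an assumption of this sketch: Aristotelian term logic itself does not use quantifiers and variables. The predicate names and the one-element domain `Unit` used for the counterexample are likewise illustrative.
+
+```lean
+-- Valid: all men are mortal; Socrates is a man; therefore Socrates is mortal.
+example (α : Type) (Man Mortal : α → Prop) (socrates : α)
+    (h1 : ∀ x, Man x → Mortal x) (h2 : Man socrates) : Mortal socrates :=
+  h1 socrates h2
+
+-- Invalid: all cats are mortal; Socrates is mortal; therefore Socrates is a cat.
+-- Counterexample: a domain where nothing is a cat and everything is mortal.
+example : ¬ (∀ (Cat Mortal : Unit → Prop) (s : Unit),
+    (∀ x, Cat x → Mortal x) → Mortal s → Cat s) :=
+  fun h => h (fun _ => False) (fun _ => True) () (fun _ _ => True.intro) True.intro
+```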
+
+=== Classical ===
+
+Classical logic is distinct from traditional or Aristotelian logic. It encompasses propositional logic and first-order logic. It is "classical" in the sense that it is based on basic logical intuitions shared by most logicians. These intuitions include the law of excluded middle, the double negation elimination, the principle of explosion, and the bivalence of truth. It was originally developed to analyze mathematical arguments and was only later applied to other fields as well. Because of this focus on mathematics, it does not include logical vocabulary relevant to many other topics of philosophical importance. Examples of concepts it overlooks are the contrast between necessity and possibility and the problem of ethical obligation and permission. Similarly, it does not address the relations between past, present, and future. Such issues are addressed by extended logics. They build on the basic intuitions of classical logic and expand it by introducing new logical vocabulary. This way, the exact logical approach is applied to fields like ethics or epistemology that lie beyond the scope of mathematics.
+
+==== Propositional logic ====
+
+Propositional logic comprises formal systems in which formulae are built from atomic propositions using logical connectives. For instance, propositional logic represents the conjunction of two atomic propositions $P$ and $Q$ as the complex formula $P \land Q$. Unlike predicate logic, where terms and predicates are the smallest units, propositional logic takes full propositions with truth values as its most basic component. Thus, propositional logic can only represent logical relationships that arise from the way complex propositions are built from simpler ones. It cannot represent inferences that result from the inner structure of a proposition.
+
+==== First-order logic ====
+
+First-order logic includes the same propositional connectives as propositional logic but differs from it because it articulates the internal structure of propositions. This happens through devices such as singular terms, which refer to particular objects, predicates, which refer to properties and relations, and quantifiers, which treat notions like "some" and "all". For example, to express the proposition "this raven is black", one may use the predicate $B$ for the property "black" and the singular term $r$ referring to the raven to form the expression $B(r)$. To express that some objects are black, the existential quantifier $\exists$ is combined with the variable $x$ to form the proposition $\exists x B(x)$. First-order logic contains various rules of inference that determine how expressions articulated this way can form valid arguments, for example, that one may infer $\exists x B(x)$ from $B(r)$.
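+
+The inference from $B(r)$ to $\exists x B(x)$ corresponds to existential introduction. A minimal Lean 4 sketch, assuming an arbitrary domain `α` (a name introduced here for illustration):
+
+```lean
+-- From "this raven is black" (B r), conclude "something is black" (∃ x, B x).
+-- The anonymous constructor packages the witness r with the proof that B r holds.
+example (α : Type) (B : α → Prop) (r : α) (h : B r) : ∃ x, B x :=
+  ⟨r, h⟩
+```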
+
+=== Extended ===
+Extended logics are logical systems that accept the basic principles of classical logic. They introduce additional symbols and principles to apply it to fields like metaphysics, ethics, and epistemology.
+
+==== Modal logic ====
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-7.md b/data/en.wikipedia.org/wiki/Logic-7.md
new file mode 100644
index 000000000..3cb8695ff
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-7.md
@@ -0,0 +1,225 @@
+---
+title: "Logic"
+chunk: 8/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Modal logic is an extension of classical logic. In its original form, sometimes called "alethic modal logic", it introduces two new symbols: $\Diamond$ expresses that something is possible while $\Box$ expresses that something is necessary. For example, if the formula $B(s)$ stands for the sentence "Socrates is a banker" then the formula $\Diamond B(s)$ articulates the sentence "It is possible that Socrates is a banker". To include these symbols in the logical formalism, modal logic introduces new rules of inference that govern what role they play in inferences. One rule of inference states that, if something is necessary, then it is also possible. This means that $\Diamond A$ follows from $\Box A$. Another principle states that if a proposition is necessary then its negation is impossible and vice versa. This means that $\Box A$ is equivalent to $\lnot \Diamond \lnot A$.
+Other forms of modal logic introduce similar symbols but associate different meanings with them to apply modal logic to other fields. For example, deontic logic concerns the field of ethics and introduces symbols to express the ideas of obligation and permission, i.e. to describe whether an agent has to perform a certain action or is allowed to perform it. The modal operators in temporal modal logic articulate temporal relations. They can be used to express, for example, that something happened at one time or that something is happening all the time. In epistemology, epistemic modal logic is used to represent the ideas of knowing something in contrast to merely believing it to be the case.
+
+==== Higher-order logic ====
+
+Higher-order logics extend classical logic not by using modal operators but by introducing new forms of quantification. Quantifiers correspond to terms like "all" or "some". In classical first-order logic, quantifiers are only applied to individuals. The formula "$\exists x (\mathrm{Apple}(x) \land \mathrm{Sweet}(x))$" (some apples are sweet) is an example of the existential quantifier "$\exists$" applied to the individual variable "$x$". In higher-order logics, quantification is also allowed over predicates. This increases its expressive power. For example, to express the idea that Mary and John share some qualities, one could use the formula "$\exists Q (Q(\mathrm{Mary}) \land Q(\mathrm{John}))$". In this case, the existential quantifier is applied to the predicate variable "$Q$". The added expressive power is especially useful for mathematics since it allows for more succinct formulations of mathematical theories. But it has drawbacks in regard to its meta-logical properties and ontological implications, which is why first-order logic is still more commonly used.
+
+=== Deviant ===
+
+Deviant logics are logical systems that reject some of the basic intuitions of classical logic. Because of this, they are usually seen not as its supplements but as its rivals. Deviant logical systems differ from each other either because they reject different classical intuitions or because they propose different alternatives to the same issue.
+Intuitionistic logic is a restricted version of classical logic. It uses the same symbols but excludes some rules of inference. For example, according to the law of double negation elimination, if a sentence is not not true, then it is true. This means that $A$ follows from $\lnot \lnot A$. This is a valid rule of inference in classical logic but it is invalid in intuitionistic logic. Another classical principle not part of intuitionistic logic is the law of excluded middle. It states that for every sentence, either it or its negation is true. This means that every proposition of the form $A \lor \lnot A$ is true. These deviations from classical logic are based on the idea that truth is established by verification using a proof. Intuitionistic logic is especially prominent in the field of constructive mathematics, which emphasizes the need to find or construct a specific example to prove its existence.
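+
+Lean 4 offers one way to see this contrast, since its base logic is constructive and classical principles are made available separately through the `Classical` namespace. The sketch below assumes only the core library. The third example is included as an illustration: it shows that the double negation of excluded middle is provable without any classical axiom.
+
+```lean
+-- Double negation elimination, obtained via the classical axioms.
+example (A : Prop) (h : ¬¬A) : A :=
+  Classical.byContradiction h
+
+-- Law of excluded middle, likewise classical.
+example (A : Prop) : A ∨ ¬A :=
+  Classical.em A
+
+-- Provable constructively: excluded middle cannot be refuted.
+example (A : Prop) : ¬¬(A ∨ ¬A) :=
+  fun h => h (Or.inr (fun a => h (Or.inl a)))
+```
+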
+Multi-valued logics depart from classicality by rejecting the principle of bivalence, which requires all propositions to be either true or false. For instance, Jan Łukasiewicz and Stephen Cole Kleene both proposed ternary logics which have a third truth value representing that a statement's truth value is indeterminate. These logics have been applied in the field of linguistics. Fuzzy logics are multivalued logics that have an infinite number of "degrees of truth", represented by a real number between 0 and 1.
+Paraconsistent logics are logical systems that can deal with contradictions. They are formulated to avoid the principle of explosion: for them, it is not the case that anything follows from a contradiction. They are often motivated by dialetheism, the view that contradictions are real or that reality itself is contradictory. Graham Priest is an influential contemporary proponent of this position and similar views have been ascribed to Georg Wilhelm Friedrich Hegel.
+
+=== Informal ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-8.md b/data/en.wikipedia.org/wiki/Logic-8.md
new file mode 100644
index 000000000..591cdd869
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-8.md
@@ -0,0 +1,26 @@
+---
+title: "Logic"
+chunk: 9/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+Informal logic is usually carried out in a less systematic way. It often focuses on more specific issues, like investigating a particular type of fallacy or studying a certain aspect of argumentation. Nonetheless, some frameworks of informal logic have also been presented that try to provide a systematic characterization of the correctness of arguments.
+The pragmatic or dialogical approach to informal logic sees arguments as speech acts and not merely as a set of premises together with a conclusion. As speech acts, they occur in a certain context, like a dialogue, which affects the standards of right and wrong arguments. A prominent version by Douglas N. Walton understands a dialogue as a game between two players. The initial position of each player is characterized by the propositions to which they are committed and the conclusion they intend to prove. Dialogues are games of persuasion: each player has the goal of convincing the opponent of their own conclusion. This is achieved by making arguments: arguments are the moves of the game. They affect which propositions the players are committed to. A winning move is a successful argument that takes the opponent's commitments as premises and shows how one's own conclusion follows from them. This is usually not possible straight away. For this reason, it is normally necessary to formulate a sequence of arguments as intermediary steps, each of which brings the opponent a little closer to one's intended conclusion. Besides these positive arguments leading one closer to victory, there are also negative arguments preventing the opponent's victory by denying their conclusion. Whether an argument is correct depends on whether it promotes the progress of the dialogue. Fallacies, on the other hand, are violations of the standards of proper argumentative rules. These standards also depend on the type of dialogue. For example, the standards governing scientific discourse differ from the standards in business negotiations.
+The epistemic approach to informal logic, on the other hand, focuses on the epistemic role of arguments. It is based on the idea that arguments aim to increase our knowledge. They achieve this by linking justified beliefs to beliefs that are not yet justified. Correct arguments succeed at expanding knowledge while fallacies are epistemic failures: they do not justify the belief in their conclusion. For example, the fallacy of begging the question is a fallacy because it fails to provide independent justification for its conclusion, even though it is deductively valid. In this sense, logical normativity consists in epistemic success or rationality. The Bayesian approach is one example of an epistemic approach. Central to Bayesianism is not just whether the agent believes something but the degree to which they believe it, the so-called credence. Degrees of belief are seen as subjective probabilities in the believed proposition, i.e. how certain the agent is that the proposition is true. On this view, reasoning can be interpreted as a process of changing one's credences, often in reaction to new incoming information. Correct reasoning and the arguments it is based on follow the laws of probability, for example, the principle of conditionalization. Bad or irrational reasoning, on the other hand, violates these laws.
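+
+As a minimal illustration of the principle of conditionalization, consider a worked update. The numbers are hypothetical and chosen only for this sketch; they do not come from the source. Suppose an agent's prior credence in a hypothesis $H$ is $P(H) = 0.3$, and the agent takes the evidence $E$ to have probability $P(E \mid H) = 0.8$ if $H$ is true and $P(E \mid \lnot H) = 0.2$ otherwise. On learning $E$, conditionalization sets the new credence to the old conditional credence, which Bayes' theorem computes as:
+
+$$
+P_{\text{new}}(H) = P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \lnot H)\,P(\lnot H)} = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.2 \times 0.7} = \frac{0.24}{0.38} \approx 0.63.
+$$
+
+Under these assumed values, the evidence roughly doubles the agent's credence in $H$. An agent who, given the same priors, moved to a different credence after learning $E$ would count as violating the laws of probability on this view.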
+
+== Areas of research ==
+Logic is studied in various fields. In many cases, this is done by applying its formal method to specific topics outside its scope, like to ethics or computer science. In other cases, logic itself is made the subject of research in another discipline. This can happen in diverse ways. For instance, it can involve investigating the philosophical assumptions linked to the basic concepts used by logicians. Other ways include interpreting and analyzing logic through mathematical structures as well as studying and comparing abstract properties of formal logical systems.
+
+=== Philosophy of logic and philosophical logic ===
+
+Philosophy of logic is the philosophical discipline studying the scope and nature of logic. It examines many presuppositions implicit in logic, like how to define its basic concepts or the metaphysical assumptions associated with them. It is also concerned with how to classify logical systems and considers the ontological commitments they incur. Philosophical logic is one of the areas within the philosophy of logic. It studies the application of logical methods to philosophical problems in fields like metaphysics, ethics, and epistemology. This application usually happens in the form of extended or deviant logical systems.
+
+=== Metalogic ===
+
+Metalogic is the field of inquiry studying the properties of formal logical systems. For example, when a new formal system is developed, metalogicians may study it to determine which formulas can be proven in it. They may also study whether an algorithm could be developed to find a proof for each formula and whether every provable formula in it is a tautology. Finally, they may compare it to other logical systems to understand its distinctive features. A key issue in metalogic concerns the relation between syntax and semantics. The syntactic rules of a formal system determine how to deduce conclusions from premises, i.e. how to formulate proofs. The semantics of a formal system governs which sentences are true and which ones are false. This determines the validity of arguments since, for valid arguments, it is impossible for the premises to be true and the conclusion to be false. The relation between syntax and semantics concerns issues like whether every valid argument is provable and whether every provable argument is valid. Metalogicians also study whether logical systems are complete, sound, and consistent. They are interested in whether the systems are decidable and what expressive power they have. Metalogicians usually rely heavily on abstract mathematical reasoning when examining and formulating metalogical proofs. This way, they aim to arrive at precise and general conclusions on these topics.
+
+=== Mathematical logic ===
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Logic-9.md b/data/en.wikipedia.org/wiki/Logic-9.md
new file mode 100644
index 000000000..c2bb958a7
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Logic-9.md
@@ -0,0 +1,61 @@
+---
+title: "Logic"
+chunk: 10/11
+source: "https://en.wikipedia.org/wiki/Logic"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:39.118518+00:00"
+instance: "kb-cron"
+---
+
+The term "mathematical logic" is sometimes used as a synonym of "formal logic". But in a more restricted sense, it refers to the study of logic within mathematics. Major subareas include model theory, proof theory, set theory, and computability theory. Research in mathematical logic commonly addresses the mathematical properties of formal systems of logic. However, it can also include attempts to use logic to analyze mathematical reasoning or to establish logic-based foundations of mathematics. The latter was a major concern in early 20th-century mathematical logic, which pursued the program of logicism pioneered by philosopher-logicians such as Gottlob Frege, Alfred North Whitehead, and Bertrand Russell. Mathematical theories were supposed to be logical tautologies, and their program was to show this by means of a reduction of mathematics to logic. Many attempts to realize this program failed, from the crippling of Frege's project in his Grundgesetze by Russell's paradox, to the defeat of Hilbert's program by Gödel's incompleteness theorems.
+Set theory originated in the study of the infinite by Georg Cantor, and it has been the source of many of the most challenging and important issues in mathematical logic. They include Cantor's theorem, the status of the Axiom of Choice, the question of the independence of the continuum hypothesis, and the modern debate on large cardinal axioms.
+Computability theory is the branch of mathematical logic that studies effective procedures to solve calculation problems. One of its main goals is to understand whether it is possible to solve a given problem using an algorithm. For instance, given a certain claim about the positive integers, it examines whether an algorithm can be found to determine if this claim is true. Computability theory uses various theoretical tools and models, such as Turing machines, to explore this type of issue.
+
+=== Computational logic ===
+
+Computational logic is the branch of logic and computer science that studies how to implement mathematical reasoning and logical formalisms using computers. This includes, for example, automatic theorem provers, which employ rules of inference to construct a proof step by step from a set of premises to the intended conclusion without human intervention. Logic programming languages are designed specifically to express facts using logical formulas and to draw inferences from these facts. For example, Prolog is a logic programming language based on predicate logic. Computer scientists also apply concepts from logic to problems in computing. The works of Claude Shannon were influential in this regard. He showed how Boolean logic can be used to understand and implement computer circuits. This can be achieved using electronic logic gates, i.e. electronic circuits with one or more inputs and usually one output. The truth values of propositions are represented by voltage levels. In this way, logic functions can be simulated by applying the corresponding voltages to the inputs of the circuit and determining the value of the function by measuring the voltage of the output.
+
+=== Formal semantics of natural language ===
+
+Formal semantics is a subfield of logic, linguistics, and the philosophy of language. The discipline of semantics studies the meaning of language. Formal semantics uses formal tools from the fields of symbolic logic and mathematics to give precise theories of the meaning of natural language expressions. It understands meaning usually in relation to truth conditions, i.e. it examines in which situations a sentence would be true or false. One of its central methodological assumptions is the principle of compositionality. It states that the meaning of a complex expression is determined by the meanings of its parts and how they are combined. For example, the meaning of the verb phrase "walk and sing" depends on the meanings of the individual expressions "walk" and "sing". Many theories in formal semantics rely on model theory. This means that they employ set theory to construct a model and then interpret the meanings of expressions in relation to the elements in this model. For example, the term "walk" may be interpreted as the set of all individuals in the model that share the property of walking. Early influential theorists in this field were Richard Montague and Barbara Partee, who focused their analysis on the English language.
+
+=== Epistemology of logic ===
+The epistemology of logic studies how one knows that an argument is valid or that a proposition is logically true. This includes questions like how to justify that modus ponens is a valid rule of inference or that contradictions are false. The traditionally dominant view is that this form of logical understanding belongs to knowledge a priori. In this regard, it is often argued that the mind has a special faculty to examine relations between pure ideas and that this faculty is also responsible for apprehending logical truths. A similar approach understands the rules of logic in terms of linguistic conventions. On this view, the laws of logic are trivial since they are true by definition: they just express the meanings of the logical vocabulary.
+Some theorists, like Hilary Putnam and Penelope Maddy, object to the view that logic is knowable a priori. They hold instead that logical truths depend on the empirical world. This is usually combined with the claim that the laws of logic express universal regularities found in the structural features of the world. According to this view, they may be explored by studying general patterns of the fundamental sciences. For example, it has been argued that certain insights of quantum mechanics refute the principle of distributivity in classical logic, which states that the formula $A \land (B \lor C)$ is equivalent to $(A \land B) \lor (A \land C)$. This claim can be used as an empirical argument for the thesis that quantum logic is the correct logical system and should replace classical logic.
+
+== History ==
\ No newline at end of file
diff --git a/data/en.wikipedia.org/wiki/Mathematical_linguistics-0.md b/data/en.wikipedia.org/wiki/Mathematical_linguistics-0.md
new file mode 100644
index 000000000..a854411c7
--- /dev/null
+++ b/data/en.wikipedia.org/wiki/Mathematical_linguistics-0.md
@@ -0,0 +1,514 @@
+---
+title: "Mathematical linguistics"
+chunk: 1/1
+source: "https://en.wikipedia.org/wiki/Mathematical_linguistics"
+category: "reference"
+tags: "science, encyclopedia"
+date_saved: "2026-05-05T03:56:40.346427+00:00"
+instance: "kb-cron"
+---
+
+Mathematical linguistics is the application of mathematics to model phenomena and solve problems in general linguistics and theoretical linguistics. Mathematical linguistics has a significant amount of overlap with computational linguistics.
+
+
+== Discrete mathematics ==
+Discrete mathematics is used in language modeling, including formal grammars, language representation, and historical linguistic trends.
+
+
+=== Set theory ===
+Semantic classes, word classes, natural classes, and the allophonic variations of each phoneme in a language are all examples of applied set theory. Set theory and concatenation theory are used extensively in phonetics and phonology.
+
+
+=== Combinatorics ===
+In phonotactics, combinatorics is useful for determining which sequences of phonemes are permissible in a given language, and for calculating the total number of possible syllables or words, based on a given set of phonological constraints. Combinatorics on words can reveal patterns within words, morphemes, and sentences.
+
+
+=== Finite-state transducers ===
+Context-sensitive rewriting rules of the form a → b / c _ d, used in linguistics to model phonological rules and sound change, are computationally equivalent to finite-state transducers, provided that application is nonrecursive, i.e. the rule is not allowed to rewrite the same substring twice.
+Weighted FSTs have found applications in natural language processing, including machine translation, and in machine learning. An implementation for part-of-speech tagging can be found as one component of the OpenGrm library.
+
+
+=== Algorithms ===
+Optimality theory (OT) and maximum entropy (Maxent) phonotactics use algorithmic approaches when evaluating candidate forms (phoneme strings) for determining the phonotactic constraints of a language.
+
+
+=== Graph theory ===
+Trees have several applications in linguistics, including:
+
+Parsing trees
+Sentence diagrams
+Language family trees
+Etymology trees
+Other graphs that are used in linguistics include:
+
+Weighted graphs, which are used to model the lexical similarity between different languages (after computing lexicostatistics).
+Semantic networks
+Lattice graphs, which can model optimality theory.
+
+
+=== Topology ===
+The concept of topology has recently been introduced to linguistics. Semantic topology is a framework for discourse analysis that applies circuit topology to measure the semantic structure of sentence arrangements within a text. By representing recurring themes through series, parallel, or cross configurations, one can uncover statistical differences in communication styles.
+
+
+== Formal linguistics ==
+
+Formal linguistics is the branch of linguistics which uses formal languages, formal grammars and first-order logical expressions for the analysis of natural languages. Since the 1980s, the term has often been used to refer to Chomskyan linguistics. Generative models of formal linguistics, such as head-driven phrase structure grammar, have also been used in natural language processing.
+
+
+=== Logic ===
+
+Logic is used to model syntax, formal semantics, and pragmatics. Modal logic can model syntax that employs different grammatical moods. Most linguistic universals (e.g. Greenberg's linguistic universals) employ propositional logic. Lexical relations between words can be determined based on whether a pair of words satisfies conditional propositions.
+
+
+=== Semiotics ===
+
+Methods of formal linguistics were introduced by semioticians such as Charles Sanders Peirce and Louis Hjelmslev. Building on the work of David Hilbert and Rudolf Carnap, Hjelmslev proposed the use of formal grammars to analyse, generate and explain language in his 1943 book Prolegomena to a Theory of Language. In this view, language is regarded as arising from a mathematical relationship between meaning and form.
+The formal description of language was further developed by linguists including J. R. Firth and Simon Dik, giving rise to modern grammatical frameworks such as systemic functional linguistics and functional discourse grammar. Computational methods have been developed within frameworks such as functional generative description, among others.
+Dependency grammar, created by French structuralist Lucien Tesnière, has been used widely in natural language processing.
+
+
+== Differential equations and multivariate calculus ==
+The fast Fourier transform, Kalman filters, and autoencoding are all used in signal processing (advanced phonetics, speech recognition).
+
+
+== Statistics ==
+In linguistics, statistical methods are necessary to describe and validate research results, as well as to understand observations and trends within an area of study.
+
+
+=== Corpus statistics ===
+
+Student's t-test can be used to determine whether the occurrence of a collocation in a corpus is statistically significant. For a bigram $w_1 w_2$, let $P(w_1) = \frac{\#w_1}{N}$ be the unconditional probability of occurrence of $w_1$ in a corpus of size $N$, and let $P(w_2) = \frac{\#w_2}{N}$ be the unconditional probability of occurrence of $w_2$ in the corpus. The t-score for the bigram $w_1 w_2$ is calculated as:
+
+$$t = \frac{\bar{x} - \mu}{\sqrt{\frac{s^2}{N}}},$$
+
+where $\bar{x} = \frac{\#w_1 w_2}{N}$ is the sample mean of the occurrence of $w_1 w_2$, $\#w_1 w_2$ is the number of occurrences of $w_1 w_2$, $\mu = P(w_1)P(w_2)$ is the probability of $w_1 w_2$ under the null hypothesis that $w_1$ and $w_2$ appear independently in the text, and $s^2 = \bar{x}(1 - \bar{x}) \approx \bar{x}$ is the sample variance. With a large $N$, the t-test is equivalent to a Z-test.
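+
+The calculation can be illustrated with a short worked example. The counts below are hypothetical, chosen only to make the arithmetic easy to follow: a corpus of $N = 10^6$ bigram positions in which $w_1$ occurs 1000 times, $w_2$ occurs 500 times, and the bigram $w_1 w_2$ occurs 20 times.
+
+$$
+\begin{aligned}
+P(w_1) &= \frac{1000}{10^6} = 10^{-3}, \qquad P(w_2) = \frac{500}{10^6} = 5 \times 10^{-4}, \\
+\mu &= P(w_1)P(w_2) = 5 \times 10^{-7}, \\
+\bar{x} &= \frac{20}{10^6} = 2 \times 10^{-5}, \qquad s^2 \approx \bar{x} = 2 \times 10^{-5}, \\
+t &= \frac{2 \times 10^{-5} - 5 \times 10^{-7}}{\sqrt{2 \times 10^{-5} / 10^6}} = \frac{1.95 \times 10^{-5}}{4.47 \times 10^{-6}} \approx 4.36.
+\end{aligned}
+$$
+
+Under the large-$N$ normal approximation, a t-score of about 4.36 exceeds the commonly used critical value of 2.576 (two-sided significance level of 0.01), so with these assumed counts the null hypothesis that the two words occur independently would be rejected.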
+
+
+=== Lexicostatistics ===
+
+Lexicostatistics can model the lexical similarities between languages that share a language family, sprachbund, language contact, or other historical connections.
+
+
+=== Quantitative linguistics ===
+
+Quantitative linguistics (QL) deals with language learning, language change, and the application as well as the structure of natural languages. QL investigates languages using statistical methods; its most demanding objective is the formulation of language laws and, ultimately, of a general theory of language in the sense of a set of interrelated language laws. Synergetic linguistics was from its very beginning specifically designed for this purpose.
+QL is empirically based on the results of language statistics, a field which can be interpreted as statistics of languages or as statistics of any linguistic object. This field is not necessarily connected to substantial theoretical ambitions. Corpus linguistics and computational linguistics are other fields which contribute important empirical evidence.
+
+
+=== Quantitative comparative linguistics ===
+
+Quantitative comparative linguistics is a subfield of quantitative linguistics which applies quantitative analysis to comparative linguistics. It makes use of lexicostatistics and glottochronology, and borrows phylogenetic methods from biology.
+
+
+== See also ==
+International Linguistics Olympiad
\ No newline at end of file