6.1 KiB
| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Statistics | 6/8 | https://en.wikipedia.org/wiki/Statistics | reference | science, encyclopedia | 2026-05-05T03:56:59.364659+00:00 | kb-cron |
The mathematical foundations of statistics developed from discussions concerning games of chance among mathematicians such as Gerolamo Cardano, Blaise Pascal, Pierre de Fermat, and Christiaan Huygens. Although the idea of probability was already examined in ancient and medieval law and philosophy (such as the work of Juan Caramuel), probability theory as a mathematical discipline only took shape at the very end of the 17th century, particularly in Jacob Bernoulli's posthumous work Ars Conjectandi. This was the first book where the realm of games of chance and the realm of the probable (which concerned opinion, evidence, and argument) were combined and submitted to mathematical analysis. The method of least squares was first described by Adrien-Marie Legendre in 1805, though Carl Friedrich Gauss presumably made use of it a decade earlier in 1795.
In the 1830s-1850s, "statistical offices" and national "statistical societies" were founded in Europe and America, and in the mid-19th century, the idea arose of "organized contacts between the statisticians of different countries although informal contacts occurred earlier". In those days, the name "statistics" referred mainly to "matters of state", and British statisticians were often called "statists". Belgian scientist Adolphe Quetelet (1796–1874) introduced the notion of the "average man" (l'homme moyen) as a means of understanding complex social phenomena such as crime rates, marriage rates, and suicide rates. In 1853 Quetelet organised in Brussels the First International Statistical Congress in order to unify measurement in statistical research. The modern field of statistics emerged in the late 19th and early 20th century in three stages. The first wave, at the turn of the century, was led by the work of Francis Galton and Karl Pearson, who transformed statistics into a rigorous mathematical discipline used for analysis, not just in science, but in industry and politics as well. Galton's contributions included introducing the concepts of standard deviation, correlation, regression analysis and the application of these methods to the study of the variety of human characteristics—height, weight and eyelash length among others. Pearson developed the Pearson product-moment correlation coefficient, defined as a product-moment, the method of moments for the fitting of distributions to samples and the Pearson distribution, among many other things. Galton and Pearson founded Biometrika as the first journal of mathematical statistics and biostatistics (then called biometry), and the latter founded the world's first university statistics department at University College London. The second wave of the 1910s and 20s was initiated by William Sealy Gosset, and reached its culmination in the insights of Ronald Fisher, who wrote the textbooks that were to define the academic discipline in universities around the world. Fisher's most important publications were his 1918 seminal paper The Correlation between Relatives on the Supposition of Mendelian Inheritance (which was the first to use the statistical term, variance), his classic 1925 work Statistical Methods for Research Workers and his 1935 The Design of Experiments, where he developed rigorous design of experiments models. He originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information. He also coined the term null hypothesis during the Lady tasting tea experiment, which "is never proved or established, but is possibly disproved, in the course of experimentation". In his 1930 book The Genetical Theory of Natural Selection, he applied statistics to various biological concepts such as Fisher's principle (which A. W. F. Edwards called "probably the most celebrated argument in evolutionary biology") and Fisherian runaway, a concept in sexual selection about a positive feedback runaway effect found in evolution. The final wave, which mainly saw the refinement and expansion of earlier developments, emerged from the collaborative work between Egon Pearson and Jerzy Neyman in the 1930s. They introduced the concepts of "Type II" error, power of a test and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling was in general a better method of estimation than purposive (quota) sampling. Among the early attempts to measure national economic activity were those of William Petty in the 17th century. In the 20th century the uniform System of National Accounts was developed. Today, statistical methods are applied in all fields that involve decision making, for making accurate inferences from a collated body of data and for making decisions in the face of uncertainty based on statistical methodology. The use of modern computers has expedited large-scale statistical computations and has also made possible new methods that are impractical to perform manually. Statistics continues to be an area of active research, for example on the problem of how to analyze big data.
== Applications ==
=== Applied statistics, theoretical statistics and mathematical statistics === Applied statistics, sometimes referred to as Statistical science, comprises descriptive statistics and the application of inferential statistics. Theoretical statistics concerns the logical arguments underlying justification of approaches to statistical inference, as well as encompassing mathematical statistics. Mathematical statistics includes not only the manipulation of probability distributions necessary for deriving results related to methods of estimation and inference, but also various aspects of computational statistics and the design of experiments. Statistical consultants can help organizations and companies that do not have in-house expertise relevant to their particular questions.
=== Machine learning and data mining === Machine learning models are statistical and probabilistic models that capture patterns in the data through use of computational algorithms.