kb/data/en.wikipedia.org/wiki/History_of_statistics-2.md

| title | chunk | source | category | tags | date_saved | instance |
| --- | --- | --- | --- | --- | --- | --- |
| History of statistics | 3/8 | https://en.wikipedia.org/wiki/History_of_statistics | reference | science, encyclopedia | 2026-05-05T04:00:26.751121+00:00 | kb-cron |

The term 'statistic' was introduced by the Italian scholar Girolamo Ghilini in 1589 with reference to this science. The birth of statistics is often dated to 1662, when John Graunt, along with William Petty, developed early human statistical and census methods that provided a framework for modern demography. Graunt produced the first life table, giving probabilities of survival to each age. His book Natural and Political Observations Made upon the Bills of Mortality used analysis of the mortality rolls to make the first statistically based estimate of the population of London. He knew that there were around 13,000 funerals per year in London and that three people died per eleven families per year. He estimated from the parish records that the average family size was 8 and calculated that the population of London was about 384,000; this is the first known use of a ratio estimator. Laplace in 1802 estimated the population of France with a similar method; see Ratio estimator § History for details. Although the original scope of statistics was limited to data useful for governance, the approach was extended to many fields of a scientific or commercial nature during the 19th century. The mathematical foundations for the subject drew heavily on the new probability theory, pioneered in the 16th and 17th centuries by Gerolamo Cardano, Pierre de Fermat and Blaise Pascal. Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713) and Abraham de Moivre's The Doctrine of Chances (1718) treated the subject as a branch of mathematics. In his book Bernoulli introduced the idea of representing complete certainty as one and probability as a number between zero and one.
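Graunt's ratio estimate described above can be retraced with the figures quoted in the text. The following is a minimal sketch; the function name `graunt_population` is hypothetical, and the rounding of the family count to 48,000 is an assumption made here to reconcile the arithmetic with the quoted total, not something the text states.

```python
from fractions import Fraction


def graunt_population(funerals_per_year, deaths, per_families, family_size):
    """Ratio estimate: scale yearly funerals up to families, then to people."""
    # deaths / per_families is the yearly death rate per family (3 per 11).
    families = funerals_per_year / Fraction(deaths, per_families)
    return families, families * family_size


families, population = graunt_population(13_000, 3, 11, 8)

# Exact arithmetic on the quoted inputs.
print(round(families))    # number of families, to the nearest whole family
print(round(population))  # population implied by the unrounded family count

# Assumption: rounding the family count up to 48,000 before multiplying
# reproduces the 384,000 quoted in the text.
print(48_000 * 8)
```

With the unrounded family count the result is a little below the quoted total, which suggests the published figure rests on a rounded intermediate value.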
In 1700, Isaac Newton carried out the earliest known form of linear regression, writing the first of the ordinary least squares normal equations, averaging astronomical data, and summing the residuals to zero in his analysis of Hipparchus's equinox observations. He distinguished between two inhomogeneous sets of data and might have thought of an optimal solution in terms of bias, but not in terms of effectiveness. A key early application of statistics in the 18th century was to the human sex ratio at birth. John Arbuthnot studied this question in 1710, examining birth records in London for each of the 82 years from 1629 to 1710. In every year, the number of males born in London exceeded the number of females. Considering more male or more female births as equally likely, the probability of the observed outcome is 0.5^82, or about 1 in 4,836,000,000,000,000,000,000,000; in modern terms, this is the p-value. This is vanishingly small, leading Arbuthnot to conclude that this was not due to chance, but to divine providence: "From whence it follows, that it is Art, not Chance, that governs." This and other work by Arbuthnot is credited as "the first use of significance tests", the first example of reasoning about statistical significance and moral certainty, and "... perhaps the first published report of a nonparametric test ...", specifically the sign test; see details at Sign test § History. The formal study of the theory of errors may be traced back to Roger Cotes' Opera Miscellanea (posthumous, 1722), but a memoir prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous errors are discussed and a probability curve is given. Simpson discussed several possible distributions of error.
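Arbuthnot's figure for the sex ratio at birth can be reproduced exactly. This is a minimal sketch of the one-sided calculation described above, assuming independent years with a probability of one half each; the function name `all_years_same_sign` is hypothetical.

```python
from fractions import Fraction


def all_years_same_sign(n_years):
    """Probability that a fair 50/50 outcome favours males in every one of n_years."""
    return Fraction(1, 2) ** n_years


p_value = all_years_same_sign(82)

# The denominator is 2**82, i.e. the "1 in ..." figure.
print(f"1 in {p_value.denominator:,}")  # 1 in 4,835,703,278,458,516,698,824,704
print(float(p_value))                   # roughly 2.07e-25
```

Using `Fraction` keeps the result exact, so the rounded "1 in 4,836,000,..." value in the text can be checked digit by digit.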
Simpson first considered the uniform distribution and then the discrete symmetric triangular distribution, followed by the continuous symmetric triangular distribution. Tobias Mayer, in his study of the libration of the moon (Kosmographische Nachrichten, Nuremberg, 1750), invented the first formal method for estimating unknown quantities by generalizing the averaging of observations under identical circumstances to the averaging of groups of similar equations. Roger Joseph Boscovich, in his 1755 work on the shape of the earth, proposed in his book De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani gradus a PP. Maire et Boscovich that the true value of a series of observations would be that which minimises the sum of absolute errors. In modern terminology this value is the median. The first example of what later became known as the normal curve was studied by Abraham de Moivre, who plotted this curve on November 12, 1733. De Moivre was studying the number of heads that occurred when a 'fair' coin was tossed. In 1763 Richard Price transmitted to the Royal Society Thomas Bayes's proof of a rule for using a binomial distribution to calculate a posterior probability on a prior event. In 1765 Joseph Priestley invented the first timeline charts. Johann Heinrich Lambert in his 1765 book Anlage zur Architectonic proposed the semicircle as a distribution of errors:

$$f(x) = \frac{1}{2}\sqrt{1 - x^{2}}$$

with -1 < x < 1.
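Boscovich's criterion mentioned earlier, choosing the value that minimises the sum of absolute errors, can be checked numerically against the median. The sketch below uses a small set of hypothetical observations (not historical data) and a brute-force grid search; the helper name `sum_abs_errors` is likewise an illustration only.

```python
import statistics


def sum_abs_errors(candidate, observations):
    """Sum of absolute deviations of the observations from a candidate value."""
    return sum(abs(x - candidate) for x in observations)


# Hypothetical observations with one large outlier.
observations = [2.0, 3.0, 5.0, 9.0, 40.0]

median = statistics.median(observations)
mean = statistics.mean(observations)

# Brute-force search over candidates 0.0, 0.1, ..., 40.0.
grid = [i / 10 for i in range(0, 401)]
best = min(grid, key=lambda c: sum_abs_errors(c, observations))

print(best, median)                          # the grid minimiser coincides with the median
print(sum_abs_errors(median, observations))  # smaller than the value at the mean
print(sum_abs_errors(mean, observations))
```

With an odd number of observations the minimiser is unique; the mean, by contrast, is pulled toward the outlier and gives a larger sum of absolute errors.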