kb/data/en.wikipedia.org/wiki/Statistics-4.md


| title | chunk | source | category | tags | date_saved | instance |
| --- | --- | --- | --- | --- | --- | --- |
| Statistics | 5/8 | https://en.wikipedia.org/wiki/Statistics | reference | science, encyclopedia | 2026-05-05T06:38:24.334738+00:00 | kb-cron |

The standard approach is to test a null hypothesis against an alternative hypothesis. A critical region is the set of values of the estimator that leads to rejecting the null hypothesis. The probability of type I error is therefore the probability that the estimator belongs to the critical region given that the null hypothesis is true (statistical significance), and the probability of type II error is the probability that the estimator does not belong to the critical region given that the alternative hypothesis is true. The statistical power of a test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false.

Referring to statistical significance does not necessarily mean that the overall result is significant in real-world terms. For example, a large study of a drug may show that the drug has a statistically significant but very small beneficial effect, such that it is unlikely to help the patient noticeably.

Although in principle the acceptable level of statistical significance may be subject to debate, the significance level is the largest p-value that allows the test to reject the null hypothesis. The p-value, in turn, is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the test statistic. Therefore, the smaller the significance level, the lower the probability of committing a type I error.

Some problems are usually associated with this framework (see criticism of hypothesis testing):
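The relationship between the p-value, the significance level, and power described above can be made concrete with a small sketch. It assumes a one-sided z-test for a mean with known standard deviation; the function names and the numbers (`mu0=100`, `sigma=15`, `n=100`) are illustrative choices, not taken from the article.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def one_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu > mu0, assuming sigma is known."""
    std_error = sigma / n ** 0.5
    z = (sample_mean - mu0) / std_error
    # p-value: probability, under H0, of a statistic at least as extreme as z.
    p_value = 1.0 - STANDARD_NORMAL.cdf(z)
    # Reject H0 when the p-value does not exceed the significance level.
    return z, p_value, p_value <= alpha


def power(mu1, mu0, sigma, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean is mu1."""
    std_error = sigma / n ** 0.5
    # Boundary of the critical region on the z scale.
    z_crit = STANDARD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STANDARD_NORMAL.cdf(z_crit - (mu1 - mu0) / std_error)


# Hypothetical numbers chosen only for illustration.
z, p_value, reject = one_sided_z_test(sample_mean=103.0, mu0=100.0, sigma=15.0, n=100)
test_power = power(mu1=103.0, mu0=100.0, sigma=15.0, n=100)
type_ii_error = 1.0 - test_power

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
print(f"power = {test_power:.3f}, type II error = {type_ii_error:.3f}")
```

When the true mean equals `mu0`, `power` returns the significance level itself, which is the type I error probability of the test.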

* A difference that is highly statistically significant can still be of no practical significance, but it is possible to properly formulate tests to account for this. One response involves going beyond reporting only the significance level to include the p-value when reporting whether a hypothesis is rejected or accepted. The p-value, however, does not indicate the size or importance of the observed effect and can also seem to exaggerate the importance of minor differences in large studies. A better and increasingly common approach is to report confidence intervals. Although these are produced from the same calculations as those of hypothesis tests or p-values, they describe both the size of the effect and the uncertainty surrounding it.
* Fallacy of the transposed conditional, also known as the prosecutor's fallacy: criticisms arise because the hypothesis testing approach forces one hypothesis (the null hypothesis) to be favored, since what is being evaluated is the probability of the observed result given the null hypothesis and not the probability of the null hypothesis given the observed result. An alternative to this approach is offered by Bayesian inference, although it requires establishing a prior probability.
* Rejecting the null hypothesis does not automatically prove the alternative hypothesis.
* Like everything in inferential statistics, it relies on sample size, and therefore under fat tails p-values may be seriously mis-computed.
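The first point above, that a p-value says nothing about effect size while a confidence interval does, can be illustrated with a rough sketch. It assumes two equal-sized groups with a common, known standard deviation and uses a normal approximation; the function name and all numbers are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def mean_difference_summary(diff, sd, n_per_group, confidence=0.95):
    """Two-sided z-test and confidence interval for a difference in means.

    Assumes two equal-sized groups sharing a known standard deviation `sd`.
    """
    std_error = sd * (2.0 / n_per_group) ** 0.5
    z = diff / std_error
    p_value = 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z)))
    # Critical value for the requested two-sided confidence level.
    z_star = STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2.0)
    interval = (diff - z_star * std_error, diff + z_star * std_error)
    return p_value, interval


# A very large hypothetical study with a tiny observed difference
# (0.3 units on a scale whose standard deviation is 10).
p_value, (ci_low, ci_high) = mean_difference_summary(
    diff=0.3, sd=10.0, n_per_group=50_000
)

print(f"p-value = {p_value:.2e}")
print(f"95% CI for the difference = ({ci_low:.3f}, {ci_high:.3f})")
```

The p-value here is far below conventional thresholds, yet the interval shows that the whole plausible range of the effect is a small fraction of one standard deviation, which is the information the p-value alone hides.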

===== Examples =====

Some well-known statistical tests and procedures are:

=== Bayesian statistics ===

An alternative to the popular frequentist paradigm is to use Bayes' theorem to update the prior probability of the hypotheses under consideration, based on the relative likelihood of the evidence gathered, to obtain a posterior probability. Bayesian methods have been aided by the increase in available computing power, which makes it practical to compute the posterior probability using numerical approximation techniques such as Markov chain Monte Carlo. For statistical modelling purposes, Bayesian models tend to be hierarchical: for example, one could model each YouTube channel as having video views distributed as a normal distribution with channel-dependent mean and variance, $\mathcal{N}(\mu_i, \sigma_i)$, while modeling the channel means as themselves coming from a normal distribution representing the distribution of average video view counts per channel, and the variances as coming from another distribution. The concept of using likelihood ratios can also be prominently seen in medical diagnostic testing.
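As a minimal sketch of the prior-to-posterior update and of the likelihood ratio in diagnostic testing, the following uses the odds form of Bayes' theorem. The prevalence, sensitivity, and specificity values are made up for illustration, and `posterior_probability` is a hypothetical helper name.

```python
def posterior_probability(prior, sensitivity, specificity):
    """Posterior P(condition | positive test) via the odds form of Bayes' theorem."""
    # Likelihood ratio of a positive result:
    # P(positive | condition) / P(positive | no condition).
    likelihood_ratio = sensitivity / (1.0 - specificity)
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical test: 1% prevalence, 95% sensitivity, 90% specificity.
posterior = posterior_probability(prior=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(condition | positive test) = {posterior:.3f}")
```

With these assumed numbers the posterior stays below 10% even though the test detects 95% of true cases, because the prior probability is low.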

=== Exploratory data analysis ===

Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.
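A text-only sketch of a first exploratory pass is shown below, using a five-number summary and the common 1.5 × IQR rule for flagging unusual points in place of a plot. The data set and the `summarize` helper are invented for illustration, and the quartiles use the default "exclusive" method of Python's `statistics.quantiles`, which is only one of several conventions.

```python
from statistics import mean, median, quantiles


def summarize(values):
    """Return basic descriptive statistics and points flagged by the 1.5 * IQR rule."""
    ordered = sorted(values)
    q1, _, q3 = quantiles(ordered, n=4)  # quartiles, default "exclusive" method
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        "max": ordered[-1],
        "mean": mean(ordered),
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }


# Hypothetical observations with one unusually large value.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15, 40]
summary = summarize(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The gap between the mean and the median, together with the flagged point, is the kind of feature an exploratory look is meant to surface before any formal model is fitted.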

=== Mathematical statistics ===

Mathematical statistics is the application of mathematics to statistics. Mathematical techniques used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure-theoretic probability theory. All statistical analyses make use of at least some mathematics, and mathematical statistics can therefore be regarded as a fundamental component of general statistics.

== History ==

Formal discussions on inference date back to the mathematicians and cryptographers of the Islamic Golden Age between the 8th and 13th centuries. Al-Khalil (717–786) wrote the Book of Cryptographic Messages, which contains one of the first uses of permutations and combinations, to list all possible Arabic words with and without vowels. Al-Kindi's Manuscript on Deciphering Cryptographic Messages gave a detailed description of how to use frequency analysis to decipher encrypted messages, providing an early example of statistical inference for decoding. Ibn Adlan (1187–1268) later made an important contribution on the use of sample size in frequency analysis.

Although the term statistic was introduced by the Italian scholar Girolamo Ghilini in 1589 with reference to a collection of facts and information about a state, it was the German Gottfried Achenwall in 1749 who started using the term as a collection of quantitative information, in the modern use for this science. The earliest writing containing statistics in Europe dates back to 1663, with the publication of Natural and Political Observations upon the Bills of Mortality by John Graunt. Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data, hence its stat- etymology. The scope of the discipline of statistics broadened in the early 19th century to include the collection and analysis of data in general. Today, statistics is widely employed in government, business, and natural and social sciences.