6.6 KiB
| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Statistics | 7/8 | https://en.wikipedia.org/wiki/Statistics | reference | science, encyclopedia | 2026-05-05T03:56:59.364659+00:00 | kb-cron |
=== Statistics in academia === Statistics is applicable to a wide variety of academic disciplines, including natural and social sciences, government, and business. Business statistics applies statistical methods in econometrics, auditing and production and operations, including services improvement and marketing research. A study of two journals in tropical biology found that the 12 most frequent statistical tests are: analysis of variance (ANOVA), chi-squared test, Student's t-test, linear regression, Pearson's correlation coefficient, Mann-Whitney U test, Kruskal-Wallis test, Shannon's diversity index, Tukey's range test, cluster analysis, Spearman's rank correlation coefficient and principal component analysis. A typical statistics course covers descriptive statistics, probability, binomial and normal distributions, test of hypotheses and confidence intervals, linear regression, and correlation. Modern fundamental statistical courses for undergraduate students focus on correct test selection, results interpretation, and use of free statistics software.
=== Statistical computing ===
The rapid and sustained increases in computing power starting from the second half of the 20th century have had a substantial impact on the practice of statistical science. Early statistical models were almost always from the class of linear models, but powerful computers, coupled with suitable numerical algorithms, caused an increased interest in nonlinear models (such as neural networks) as well as the creation of new types, such as generalized linear models and multilevel models. Increased computing power has also led to the growing popularity of computationally intensive methods based on resampling, such as permutation tests and the bootstrap, while techniques such as Gibbs sampling have made use of Bayesian models more feasible. The computer revolution has implications for the future of statistics with a new emphasis on "experimental" and "empirical" statistics. A large number of both general and special purpose statistical software are now available. Examples of available software capable of complex statistical computation include programs such as Mathematica, SAS, SPSS, and R.
=== Business statistics ===
In business, "statistics" is a widely used management- and decision support tool. It is particularly applied in financial management, marketing management, and production, services and operations management. Statistics is also heavily used in management accounting and auditing. The discipline of Management Science formalizes the use of statistics, and other mathematics, in business. (Econometrics is the application of statistical methods to economic data in order to give empirical content to economic relationships.) A typical "Business Statistics" course is intended for business majors, and covers descriptive statistics (collection, description, analysis, and summary of data), probability (typically the binomial and normal distributions), test of hypotheses and confidence intervals, linear regression, and correlation; (follow-on) courses may include forecasting, time series, decision trees, multiple linear regression, and other topics from business analytics more generally. Professional certification programs, such as the CFA, often include topics in statistics.
== Specialized disciplines ==
Statistical techniques are used in a wide range of types of scientific and social research, including: biostatistics, computational biology, computational sociology, network biology, social science, sociology and social research. Some fields of inquiry use applied statistics so extensively that they have specialized terminology. These disciplines include:
In addition, there are particular types of statistical analysis that have also developed their own specialised terminology and methodology:
Statistics form a key basis tool in business and manufacturing as well. It is used to understand measurement systems variability, control processes (as in statistical process control or SPC), for summarizing data, and to make data-driven decisions.
== Misuse ==
Misuse of statistics can produce subtle but serious errors in description and interpretation—subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. For instance, social policy, medical practice, and the reliability of structures like bridges all rely on the proper use of statistics. Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise. The statistical significance of a trend in the data—which measures the extent to which a trend could be caused by random variation in the sample—may or may not agree with an intuitive sense of its significance. The set of basic statistical skills (and skepticism) that people need to deal with information in their everyday lives properly is referred to as statistical literacy. There is a general perception that statistical knowledge is all-too-frequently intentionally misused by finding ways to interpret only the data that are favorable to the presenter. A mistrust and misunderstanding of statistics is associated with the quotation, "There are three kinds of lies: lies, damned lies, and statistics". Misuse of statistics can be both inadvertent and intentional, and the book How to Lie with Statistics, by Darrell Huff, outlines a range of considerations. In an attempt to shed light on the use and misuse of statistics, reviews of statistical techniques used in particular fields are conducted (e.g. Warne, Lazo, Ramos, and Ritter (2012)). Ways to avoid misuse of statistics include using proper diagrams and avoiding bias. Misuse can occur when conclusions are overgeneralized and claimed to be representative of more than they really are, often by either deliberately or unconsciously overlooking sampling bias. Bar graphs are arguably the easiest diagrams to use and understand, and they can be made either by hand or with simple computer programs. Most people do not look for bias or errors, so they are not noticed. Thus, people may often believe that something is true even if it is not well represented. To make data gathered from statistics believable and accurate, the sample taken must be representative of the whole. According to Huff, "The dependability of a sample can be destroyed by [bias]... allow yourself some degree of skepticism." To assist in the understanding of statistics Huff proposed a series of questions to be asked in each case: