kb/data/en.wikipedia.org/wiki/Analysis_of_variance-4.md

5.8 KiB

title chunk source category tags date_saved instance
Analysis of variance 5/7 https://en.wikipedia.org/wiki/Analysis_of_variance reference science, encyclopedia 2026-05-05T09:48:53.349210+00:00 kb-cron

==== The number of experimental units ==== In the design of an experiment, the number of experimental units is planned to satisfy the goals of the experiment. Experimentation is often sequential. Early experiments are often designed to provide mean-unbiased estimates of treatment effects and of experimental error. Later experiments are often designed to test a hypothesis that a treatment effect has an important magnitude; in this case, the number of experimental units is chosen so that the experiment is within budget and has adequate power, among other goals. Reporting sample size analysis is generally required in psychology. "Provide information on sample size and the process that led to sample size decisions." The analysis, which is written in the experimental protocol before the experiment is conducted, is examined in grant applications and administrative review boards. Besides the power analysis, there are less formal methods for selecting the number of experimental units. These include graphical methods based on limiting the probability of false negative errors, graphical methods based on an expected variation increase (above the residuals) and methods based on achieving a desired confidence interval.

==== Power analysis ==== Power analysis is often applied in the context of ANOVA in order to assess the probability of successfully rejecting the null hypothesis if we assume a certain ANOVA design, effect size in the population, sample size and significance level. Power analysis can assist in study design by determining what sample size would be required in order to have a reasonable chance of rejecting the null hypothesis when the alternative hypothesis is true.

==== Effect size ====

Several standardized measures of effect have been proposed for ANOVA to summarize the strength of the association between a predictor(s) and the dependent variable or the overall standardized difference of the complete model. Standardized effect-size estimates facilitate comparison of findings across studies and disciplines. However, while standardized effect sizes are commonly used in much of the professional literature, a non-standardized measure of effect size that has immediately "meaningful" units may be preferable for reporting purposes.

==== Model confirmation ==== Sometimes tests are conducted to determine whether the assumptions of ANOVA appear to be violated. Residuals are examined or analyzed to confirm homoscedasticity and gross normality. Residuals should have the appearance of (zero mean normal distribution) noise when plotted as a function of anything including time and modeled data values. Trends hint at interactions among factors or among observations.

==== Follow-up tests ==== A statistically significant effect in ANOVA is often followed by additional tests. This can be done in order to assess which groups are different from which other groups or to test various other focused hypotheses. Follow-up tests are often distinguished in terms of whether they are "planned" (a priori) or "post hoc." Planned tests are determined before looking at the data, and post hoc tests are conceived only after looking at the data (though the term "post hoc" is inconsistently used). The follow-up tests may be "simple" pairwise comparisons of individual group means or may be "compound" comparisons (e.g., comparing the mean pooling across groups A, B and C to the mean of group D). Comparisons can also look at tests of trend, such as linear and quadratic relationships, when the independent variable involves ordered levels. Often the follow-up tests incorporate a method of adjusting for the multiple comparisons problem. Follow-up tests to identify which specific groups, variables, or factors have statistically different means include the Tukey's range test, and Duncan's new multiple range test. In turn, these tests are often followed with a Compact Letter Display (CLD) methodology in order to render the output of the mentioned tests more transparent to a non-statistician audience.

== Study designs == There are several types of ANOVA. Many statisticians base ANOVA on the design of the experiment, especially on the protocol that specifies the random assignment of treatments to subjects; the protocol's description of the assignment mechanism should include a specification of the structure of the treatments and of any blocking. It is also common to apply ANOVA to observational data using an appropriate statistical model. Some popular designs use the following types of ANOVA:

One-way ANOVA is used to test for differences among two or more independent groups (means), e.g. different levels of urea application in a crop, or different levels of antibiotic action on several different bacterial species, or different levels of effect of some medicine on groups of patients. However, should these groups not be independent, and there is an order in the groups (such as mild, moderate and severe disease), or in the dose of a drug (such as 5 mg/mL, 10 mg/mL, 20 mg/mL) given to the same group of patients, then a linear trend estimation should be used. Typically, however, the one-way ANOVA is used to test for differences among at least three groups, since the two-group case can be covered by a t-test. When there are only two means to compare, the t-test and the ANOVA F-test are equivalent; the relation between ANOVA and t is given by F = t2. Factorial ANOVA is used when there is more than one factor. Repeated measures ANOVA is used when the same subjects are used for each factor (e.g., in a longitudinal study). Multivariate analysis of variance (MANOVA) is used when there is more than one response variable.