kb/data/en.wikipedia.org/wiki/Design_of_experiments-2.md

---
title: "Design of experiments"
chunk: 3/5
source: "https://en.wikipedia.org/wiki/Design_of_experiments"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T03:42:27.446946+00:00"
instance: "kb-cron"
---

Let $y_i$ be the measured difference for $i = 1, \dots, 8$. The relationship between the true weights and experimental measurements may be represented with a general linear model, with the design matrix $W$ having entries from $\{-1, 0, 1\}$:

$$y = W\theta + \epsilon$$


The first design is represented by an identity matrix while the second design is represented by an 8×8 Hadamard matrix, $H$, both examples of weighing matrices.
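
As a rough illustration, the two design matrices can be written out in plain Python. The sketch below assumes the Sylvester construction for the Hadamard matrix, which is only one of several possible 8×8 Hadamard matrices and need not have the same row and column order as the weighing schedule in the example. The function names are hypothetical.

```python
def identity(n):
    """Design matrix for weighing each object on its own."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def sylvester_hadamard(n):
    """Hadamard matrix of order n (a power of two) via Sylvester's doubling.

    Assumption: this construction is chosen only because it is short to
    write down; it is not necessarily the matrix used in the example.
    """
    h = [[1]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the order and keeps the rows orthogonal.
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def gram(w):
    """Return W^T W, which equals k*I for a weighing matrix of weight k."""
    n = len(w)
    return [[sum(w[r][i] * w[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


I8 = identity(8)
H8 = sylvester_hadamard(8)
print(gram(I8)[0])  # [1, 0, 0, 0, 0, 0, 0, 0]
print(gram(H8)[0])  # [8, 0, 0, 0, 0, 0, 0, 0]
```

Both Gram matrices are multiples of the identity, which is the defining property of a weighing matrix; the factor 8 for the Hadamard design is what later produces the smaller variance.
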
The weights are typically estimated using the method of least squares. When the design matrix is a weighing matrix, this is equivalent to applying the inverse of that matrix to the measurements:


$$\hat{\theta}_A = I^{-1} y = y$$

$$\hat{\theta}_B = H^{-1} y$$
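
Because an 8×8 Hadamard matrix satisfies $H^{T}H = 8I$, its inverse is $H^{T}/8$, so the estimate can be computed without a general matrix inversion. A minimal sketch, assuming the Sylvester form of $H$ and made-up true weights (both are illustrative and not taken from the example):

```python
def sylvester_hadamard(n):
    """Hadamard matrix of order n (a power of two) via Sylvester's doubling."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def estimate(h, y):
    """Least-squares estimate H^{-1} y, using H^{-1} = H^T / n."""
    n = len(h)
    return [sum(h[r][j] * y[r] for r in range(n)) / n for j in range(n)]


H = sylvester_hadamard(8)
theta = [10, 20, 30, 40, 50, 60, 70, 80]  # hypothetical true weights

# Noise-free measurements y = H theta, so the estimate should recover theta.
y = [sum(H[r][j] * theta[j] for j in range(8)) for r in range(8)]
theta_hat = estimate(H, y)
print(theta_hat)  # [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
```
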


The question of design of experiments is: which experiment is better?
Investigating estimate A vs B for the first weight:


$$\operatorname{Var}(\hat{\theta}_{A,1}) = \operatorname{Var}(y_1) = \sigma^2$$

$$\operatorname{Var}(\hat{\theta}_{B,1}) = \operatorname{Var}\left(\frac{y_1 + y_2 - y_3 - y_4 + y_5 + y_6 - y_7 - y_8}{8}\right) = \frac{\sigma^2}{8}$$
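
The last equality rests on an assumption worth stating: the measurement errors are taken to be independent with a common variance $\sigma^2$. Under that assumption the signs do not matter, the variance of the signed sum is the sum of the variances, and the factor $1/8$ comes out squared:

$$\operatorname{Var}(\hat{\theta}_{B,1}) = \frac{1}{8^2}\sum_{i=1}^{8}\operatorname{Var}(y_i) = \frac{8\sigma^2}{64} = \frac{\sigma^2}{8}$$
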


A similar result follows for the remaining weight estimates. Thus, the second experiment gives 8 times as much precision (one-eighth the variance) for the estimate of a single item, while using the same resources (the same number of weighings).
Many problems of the design of experiments involve combinatorial designs, as in this example and others.
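
The variance comparison can also be checked by simulation. The sketch below is illustrative only: it assumes independent Gaussian errors with $\sigma = 1$, a Sylvester-type Hadamard matrix, and hypothetical true weights, and it compares the empirical variance of the first weight's estimate under the two designs.

```python
import random
import statistics


def sylvester_hadamard(n):
    """Hadamard matrix of order n (a power of two) via Sylvester's doubling."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def first_weight_variance(design, theta, sigma, trials, rng):
    """Empirical variance of the first weight's estimate under a design.

    Assumes `design` is a weighing matrix, so its inverse is W^T / k.
    """
    n = len(design)
    k = sum(x * x for x in design[0])  # 1 for the identity, 8 for Hadamard
    estimates = []
    for _ in range(trials):
        # Simulate y = W theta + noise, one noisy reading per weighing.
        y = [sum(design[r][j] * theta[j] for j in range(n)) + rng.gauss(0, sigma)
             for r in range(n)]
        # First component of W^T y / k.
        estimates.append(sum(design[r][0] * y[r] for r in range(n)) / k)
    return statistics.variance(estimates)


rng = random.Random(0)
theta = [10, 20, 30, 40, 50, 60, 70, 80]  # hypothetical true weights
identity = [[1 if i == j else 0 for j in range(8)] for i in range(8)]

var_a = first_weight_variance(identity, theta, 1.0, 5000, rng)
var_b = first_weight_variance(sylvester_hadamard(8), theta, 1.0, 5000, rng)
print(var_a, var_b, var_a / var_b)  # the ratio should come out close to 8
```
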

== Avoiding false positives ==

False positive conclusions, often resulting from the pressure to publish or the author's own confirmation bias, are an inherent hazard in many fields.
Use of double-blind designs can prevent biases potentially leading to false positives in the data collection phase. When a double-blind design is used, participants are randomly assigned to experimental groups, but the researcher is unaware of which participants belong to which group. Therefore, the researcher cannot affect the participants' response to the intervention.
Experimental designs with undisclosed degrees of freedom are a problem, in that they can lead to conscious or unconscious "p-hacking": trying multiple things until you get the desired result. It typically involves the manipulation – perhaps unconsciously – of the process of statistical analysis and the degrees of freedom until they return a figure below the p<.05 level of statistical significance.
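
How quickly undisclosed flexibility inflates the false positive rate can be illustrated with a small simulation. The sketch below is a simplified stand-in for p-hacking, not a model of any particular study: it assumes there is no true effect, that a researcher measures several independent outcomes with known unit variance, and that the result is reported as significant if any one of them reaches p < .05. All function names and parameter values are hypothetical.

```python
import math
import random


def p_value(group_a, group_b):
    """Two-sided z-test p-value for a difference in means.

    Assumes equal group sizes and a known variance of 1 in both groups.
    """
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_outcomes, n_per_group, trials, rng):
    """Share of null experiments in which at least one outcome has p < .05."""
    hits = 0
    for _ in range(trials):
        significant = False
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: no real effect.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            if p_value(a, b) < 0.05:
                significant = True
        hits += significant
    return hits / trials


rng = random.Random(1)
honest = false_positive_rate(1, 20, 2000, rng)  # one pre-specified outcome
hacked = false_positive_rate(5, 20, 2000, rng)  # best of five outcomes
print(honest, hacked)
```

With one pre-specified outcome the rate should stay near the nominal 5%, while picking the best of five independent outcomes pushes it toward $1 - 0.95^5 \approx 23\%$.
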
P-hacking can be prevented by preregistering studies: researchers send their data analysis plan to the journal in which they wish to publish before they start collecting data, so that the analysis cannot be adjusted after the results are seen without the change being evident.
Another way to prevent this is to extend the double-blind design to the data-analysis phase, making the study triple-blind: the data are sent to a data analyst unrelated to the research, who scrambles the data so that there is no way to know which group participants belong to before any of them are removed as outliers.
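
One simple way to picture the scrambling step is to replace the group names with opaque codes before the data reach the person deciding on outliers. The sketch below is only an assumption about how this might be done; the record layout, the `blind_groups` helper, and the code labels are all hypothetical.

```python
import random


def blind_groups(records, rng):
    """Replace group names with opaque codes; return (blinded records, key).

    The key is kept by someone outside the analysis until outlier
    decisions have been made.
    """
    groups = sorted({r["group"] for r in records})
    codes = [f"G{i + 1}" for i in range(len(groups))]
    rng.shuffle(codes)  # which code maps to which group is random
    key = dict(zip(groups, codes))
    blinded = [{"id": r["id"], "group": key[r["group"]], "score": r["score"]}
               for r in records]
    rng.shuffle(blinded)  # row order should not reveal the grouping either
    return blinded, key


# Hypothetical data for illustration only.
records = [
    {"id": 1, "group": "treatment", "score": 4.2},
    {"id": 2, "group": "control", "score": 3.9},
    {"id": 3, "group": "treatment", "score": 9.7},
    {"id": 4, "group": "control", "score": 4.1},
]
blinded, key = blind_groups(records, random.Random(42))
print(blinded)
```
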
Clear and complete documentation of the experimental methodology is also important in order to support replication of results.

== Discussion topics when setting up an experimental design ==
An experimental design or randomized clinical trial requires careful consideration of several factors before actually doing the experiment. An experimental design is the laying out of a detailed experimental plan in advance of doing the experiment. Some of the following topics have already been discussed in the principles of experimental design section:

How many factors does the design have, and are the levels of these factors fixed or random?
Are control conditions needed, and what should they be?
Manipulation checks: did the manipulation really work?
What are the background variables?
What is the sample size? How many units must be collected for the experiment to be generalisable and have enough power?
What is the relevance of interactions between factors?
What is the influence of delayed effects of substantive factors on outcomes?
How do response shifts affect self-report measures?
How feasible is repeated administration of the same measurement instruments to the same units at different occasions, with a post-test and follow-up tests?
What about using a proxy pretest?
Are there confounding variables?
Should the client/patient, researcher or even the analyst of the data be blind to conditions?
What is the feasibility of subsequent application of different conditions to the same units?
How many control factors and how many noise factors should be taken into account?
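
The sample-size question in the list above can be made concrete with a standard normal-approximation formula for comparing two group means, $n = 2\,(z_{1-\alpha/2} + z_{1-\beta})^2 / d^2$ per group, where $d$ is the standardized effect size. The sketch below assumes a two-sided test, equal group sizes, and a known common variance; the function name and default values are illustrative choices, and an exact t-test calculation gives slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate units needed per group to detect a standardized effect.

    Normal approximation for a two-sided, two-sample comparison of means.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


print(n_per_group(0.5))  # 63
print(n_per_group(0.2))  # 393
```

Smaller effects need far more units: halving the effect size roughly quadruples the required sample.
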
The independent variable of a study often has many levels or different groups. In a true experiment, researchers can have an experimental group, in which the intervention testing the hypothesis is implemented, and a control group, which has all the same elements as the experimental group except the intervention. Thus, when everything except the one intervention is held constant, researchers can conclude with some certainty that this one element is what caused the observed change. In some instances, having a control group is not ethical. This is sometimes solved by using two different experimental groups. In some cases, independent variables cannot be manipulated, for example when testing the difference between two groups who have different diseases, or testing the difference between genders (variables to which it would be hard or unethical to assign participants). In these cases, a quasi-experimental design may be used.
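
For the true-experiment case, random assignment to an experimental and a control group might be sketched as follows. The participant IDs and the `assign_groups` helper are hypothetical, and the even split is an assumption; real trials often use blocked or stratified schemes instead.

```python
import random


def assign_groups(participants, rng):
    """Randomly split participants into experimental and control groups.

    With an odd number of participants the control group gets the extra one.
    """
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}


participants = [f"P{i:02d}" for i in range(1, 11)]  # hypothetical IDs
groups = assign_groups(participants, random.Random(7))
print(groups)
```
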