---
title: "Group testing"
chunk: 5/10
source: "https://en.wikipedia.org/wiki/Group_testing"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T09:50:23.496143+00:00"
instance: "kb-cron"
---

Suppose a non-adaptive group testing procedure for $n$ items consists of the tests $S_1, S_2, \dots, S_t$ for some $t \in \mathbb{N}_{\geq 0}$. The testing matrix for this scheme is the $t \times n$ binary matrix, $M$, where $(M)_{ij} = 1$ if and only if $j \in S_i$ (and is zero otherwise).
Thus each column of $M$ represents an item and each row represents a test, with a $1$ in the $(i,j)$-th entry indicating that the $i$-th test included the $j$-th item and a $0$ indicating otherwise.
As well as the vector $\mathbf{x}$ (of length $n$) that describes the unknown defective set, it is common to introduce the result vector, which describes the results of each test.

Let $t$ be the number of tests performed by a non-adaptive algorithm. The result vector, $\mathbf{y} = (y_1, y_2, \dots, y_t)$, is a binary vector of length $t$ (that is, $\mathbf{y} \in \{0,1\}^t$) such that $y_i = 1$ if and only if the result of the $i$-th test was positive (i.e. contained at least one defective).
With these definitions, the non-adaptive problem can be reframed as follows: first a testing matrix, $M$, is chosen, after which the vector $\mathbf{y}$ is returned. Then the problem is to analyse $\mathbf{y}$ to find some estimate for $\mathbf{x}$.
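
As a rough illustration of this reframing, the noiseless result vector can be computed as the Boolean product of the testing matrix with the defectivity vector. The name `result_vector` and the example values are hypothetical and chosen only for this sketch; the example marks item 2 (0-based) as the only defective, so only the second test comes back positive.

```python
def result_vector(M, x):
    """Return y, where y[i] = 1 iff test i contains at least one defective.

    M: t x n binary testing matrix (list of rows).
    x: binary vector of length n, x[j] = 1 iff item j is defective.
    """
    return [int(any(m_ij and x_j for m_ij, x_j in zip(row, x))) for row in M]


# Hypothetical example: three tests over four items, item 2 defective.
M = [
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
]
x = [0, 0, 1, 0]
y = result_vector(M, x)
print(y)  # [0, 1, 0]
```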
In the simplest noisy case, where there is a constant probability, $q$, that a group test will have an erroneous result, one considers a random binary vector, $\mathbf{v}$, of length $t$, where each entry has a probability $q$ of being $1$, and is $0$ otherwise. The vector that is returned is then $\hat{\mathbf{y}} = \mathbf{y} + \mathbf{v}$, with the usual addition on $(\mathbb{Z}/2\mathbb{Z})^t$ (equivalently this is the element-wise XOR operation). A noisy algorithm must estimate $\mathbf{x}$ using $\hat{\mathbf{y}}$ (that is, without direct knowledge of $\mathbf{y}$).
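
The noise model above can be simulated with a short sketch. It assumes each test outcome is flipped independently with probability $q$; the function name `noisy_result` and the use of a seeded `random.Random` instance are choices made for this illustration only.

```python
import random


def noisy_result(y, q, rng):
    """Return (y_hat, v), where v flips each entry of y with probability q.

    y:   true binary result vector of length t.
    q:   probability that any single test result is erroneous.
    rng: a random.Random instance (passed in so runs are reproducible).
    """
    v = [1 if rng.random() < q else 0 for _ in y]   # noise vector
    y_hat = [y_i ^ v_i for y_i, v_i in zip(y, v)]   # element-wise XOR
    return y_hat, v


rng = random.Random(0)
y = [0, 1, 0, 1, 1]
y_hat, v = noisy_result(y, q=0.1, rng=rng)
# XOR-ing the noise back in recovers the true results.
assert [a ^ b for a, b in zip(y_hat, v)] == y
```

A decoder only ever sees `y_hat`; the vector `v` is returned here purely so the relationship $\hat{\mathbf{y}} = \mathbf{y} + \mathbf{v}$ can be checked.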

=== Bounds for non-adaptive algorithms ===
The matrix representation makes it possible to prove some bounds on non-adaptive group testing. The approach mirrors that of many deterministic designs, where $d$-separable matrices are considered, as defined below.

A binary matrix, $M$, is called $d$-separable if the Boolean sums (logical OR) of any two distinct sets of $d$ of its columns are distinct. Additionally, the notation $\bar{d}$-separable indicates that the Boolean sums of any two distinct sets of up to $d$ of $M$'s columns are distinct. (This is not the same as $M$ being $k$-separable for every $k \leq d$.)
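
A brute-force check of both definitions can be sketched as follows. The function names are hypothetical, the search is exponential and only suitable for tiny matrices, and including the empty set of columns in the $\bar{d}$-separable check is an assumption of this sketch. The example matrix is 1-separable and 2-separable but not $\bar{2}$-separable, because the union of columns 0 and 1 equals column 0 alone.

```python
from itertools import combinations


def boolean_sum(M, cols):
    """Logical OR of the chosen columns of M, returned as a tuple."""
    return tuple(int(any(row[j] for j in cols)) for row in M)


def is_separable(M, d, up_to=False):
    """Brute-force test of d-separability (or d-bar-separability if up_to).

    With up_to=True, all column sets of size 0..d are compared; counting
    the empty set is an assumption made for this sketch.
    """
    n = len(M[0])
    sizes = range(0, d + 1) if up_to else [d]
    seen = set()
    for k in sizes:
        for cols in combinations(range(n), k):
            s = boolean_sum(M, cols)
            if s in seen:          # two different column sets collide
                return False
            seen.add(s)
    return True


# Columns are (1,1,0), (1,0,0) and (0,0,1).
M = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
print(is_separable(M, 1))              # True
print(is_separable(M, 2))              # True
print(is_separable(M, 2, up_to=True))  # False
```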
When $M$ is a testing matrix, the property of being $d$-separable ($\bar{d}$-separable) is equivalent to being able to distinguish between (up to) $d$ defectives. However, it does not guarantee that this will be straightforward. A stronger property, called disjunctness, does.

A binary matrix, $M$, is called $d$-disjunct if the Boolean sum of any $d$ columns does not contain any other column. (In this context, a column A is said to contain a column B if for every index where B has a 1, A also has a 1.)
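
The disjunctness definition translates directly into a brute-force check, sketched here for small matrices only. The name `is_disjunct` is hypothetical. In the example, the $3 \times 3$ identity matrix is 2-disjunct, whereas the second matrix is not even 1-disjunct because its column 0 contains its column 1.

```python
from itertools import combinations


def is_disjunct(M, d):
    """Brute-force test: no Boolean sum of d columns contains another column."""
    n = len(M[0])
    for cols in combinations(range(n), d):
        union = [any(row[j] for j in cols) for row in M]
        for k in range(n):
            if k in cols:
                continue
            # Column k is contained in the union if every row where
            # column k has a 1 is also a row where the union has a 1.
            if all(u for row, u in zip(M, union) if row[k]):
                return False
    return True


identity = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
nested = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
print(is_disjunct(identity, 2))  # True
print(is_disjunct(nested, 1))    # False
```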
A useful property of $d$-disjunct testing matrices is that, with up to $d$ defectives, every non-defective item will appear in at least one test whose outcome is negative. This means there is a simple procedure for finding the defectives: just remove every item that appears in a negative test, and the items that remain are exactly the defectives.
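
The elimination procedure just described can be sketched as below. The function name `decode_by_elimination` is hypothetical, and the example matrix (whose six columns are the six two-element subsets of four tests) is an illustrative choice that happens to be 1-disjunct; it is not taken from the source.

```python
def decode_by_elimination(M, y):
    """Return the items that never appear in a negative test.

    M: t x n binary testing matrix.
    y: noiseless result vector of length t.
    If M is d-disjunct and there are at most d defectives, the
    returned items are exactly the defectives.
    """
    n = len(M[0])
    candidates = set(range(n))
    for row, y_i in zip(M, y):
        if y_i == 0:
            # Every item in a negative test is certainly non-defective.
            candidates -= {j for j in range(n) if row[j]}
    return sorted(candidates)


# 4 tests, 6 items; column j is the j-th two-element subset of the tests.
M = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
y = [0, 1, 1, 0]   # outcomes when item 3 is the only defective
print(decode_by_elimination(M, y))  # [3]
```

Here six items are resolved with four tests rather than six individual ones, and decoding takes a single pass over the negative tests.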
Using the properties of $d$-separable and $d$-disjunct matrices, the following can be shown for the problem of identifying $d$ defectives among $n$ total items.

The number of tests needed for an asymptotically small average probability of error scales as $O(d \log_2 n)$.
The number of tests needed for an asymptotically small maximum probability of error scales as $O(d^2 \log_2 n)$.
The number of tests needed for a zero probability of error scales as $O\left(\frac{d^2 \log_2 n}{\log_2 d}\right)$.

== Generalised binary-splitting algorithm ==

The generalised binary-splitting algorithm is an essentially-optimal adaptive group-testing algorithm that finds $d$ or fewer defectives among $n$ items as follows: