---
title: "Group testing"
chunk: 5/10
source: "https://en.wikipedia.org/wiki/Group_testing"
category: "reference"
tags: "science, encyclopedia"
date_saved: "2026-05-05T09:50:23.496143+00:00"
instance: "kb-cron"
---

Suppose a non-adaptive group testing procedure for $n$ items consists of the tests $S_1, S_2, \dots, S_t$ for some $t \in \mathbb{N}_{\geq 0}$. The testing matrix for this scheme is the $t \times n$ binary matrix, $M$, where $(M)_{ij} = 1$ if and only if $j \in S_i$ (and is zero otherwise).
Thus each column of $M$ represents an item and each row represents a test, with a $1$ in the $(i,j)$-th entry indicating that the $i$-th test included the $j$-th item and a $0$ indicating otherwise.
As well as the vector $\mathbf{x}$ (of length $n$) that describes the unknown defective set, it is common to introduce the result vector, which describes the results of each test.

Let $t$ be the number of tests performed by a non-adaptive algorithm. The result vector, $\mathbf{y} = (y_1, y_2, \dots, y_t)$, is a binary vector of length $t$ (that is, $\mathbf{y} \in \{0,1\}^t$) such that $y_i = 1$ if and only if the result of the $i$-th test was positive (i.e. contained at least one defective).
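
As a minimal sketch of this definition, the result vector can be computed from $M$ and $\mathbf{x}$ as a Boolean (OR-based) matrix-vector product. The function name `result_vector` and the example data are assumptions made for illustration.

```python
def result_vector(M, x):
    """Return y with y[i] = 1 iff test i contains at least one defective.

    M is a list of t rows (each of length n); x is a 0/1 list of length n
    marking the defective items. The combination is a logical OR, not a
    sum modulo 2.
    """
    return [1 if any(m and xj for m, xj in zip(row, x)) else 0 for row in M]


# Hypothetical example: 3 tests on 4 items, item 3 is the only defective.
M = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
x = [0, 0, 0, 1]
y = result_vector(M, x)
print(y)  # [0, 0, 1]
```

Only the third test includes the defective item, so it is the only positive entry of the result vector.
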
With these definitions, the non-adaptive problem can be reframed as follows: first a testing matrix, $M$, is chosen, after which the vector $\mathbf{y}$ is returned. The problem is then to analyse $\mathbf{y}$ to find some estimate for $\mathbf{x}$.
In the simplest noisy case, where there is a constant probability, $q$, that a group test will have an erroneous result, one considers a random binary vector, $\mathbf{v}$, where each entry has a probability $q$ of being $1$, and is $0$ otherwise. The vector that is returned is then $\hat{\mathbf{y}} = \mathbf{y} + \mathbf{v}$, with the usual addition on $(\mathbb{Z}/2\mathbb{Z})^t$ (equivalently this is the element-wise XOR operation). A noisy algorithm must estimate $\mathbf{x}$ using $\hat{\mathbf{y}}$ (that is, without direct knowledge of $\mathbf{y}$).

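The noise model above can be simulated as follows. This is a sketch under the stated assumptions only: each test result is flipped independently with probability $q$, and the names `noisy_result` and `rng` are hypothetical.

```python
import random


def noisy_result(y, q, rng=None):
    """Return (y_hat, v) where v has i.i.d. Bernoulli(q) entries and y_hat = y XOR v."""
    rng = rng or random.Random()
    # v[i] = 1 marks an erroneous outcome for test i.
    v = [1 if rng.random() < q else 0 for _ in y]
    # Addition in Z/2Z is the same as element-wise XOR.
    y_hat = [yi ^ vi for yi, vi in zip(y, v)]
    return y_hat, v


# Hypothetical example with a seeded generator for repeatability.
y = [0, 1, 1, 0]
y_hat, v = noisy_result(y, q=0.1, rng=random.Random(0))
print(y_hat, v)
```

A decoder would be given only `y_hat`; the noise vector `v` is returned here purely so that the simulation can be inspected.
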
=== Bounds for non-adaptive algorithms ===
The matrix representation makes it possible to prove some bounds on non-adaptive group testing. The approach mirrors that of many deterministic designs, where $d$-separable matrices are considered, as defined below.

A binary matrix, $M$, is called $d$-separable if every Boolean sum (logical OR) of any $d$ of its columns is distinct. Additionally, the notation $\bar{d}$-separable indicates that every sum of any of up to $d$ of $M$'s columns is distinct. (This is not the same as $M$ being $k$-separable for every $k \leq d$.)
When $M$ is a testing matrix, the property of being $d$-separable ($\bar{d}$-separable) is equivalent to being able to distinguish between (up to) $d$ defectives. However, it does not guarantee that this will be straightforward. A stronger property, called disjunctness, does.

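For small matrices, $d$-separability can be checked by brute force, directly from the definition. The sketch below covers only the exact-$d$ case, not the $\bar{d}$ variant. The function name `is_d_separable` is hypothetical, and the matrix is assumed to have at least one row.

```python
from itertools import combinations


def is_d_separable(M, d):
    """Return True iff the Boolean sums of all d-subsets of columns are pairwise distinct."""
    t, n = len(M), len(M[0])
    seen = set()
    for subset in combinations(range(n), d):
        # Boolean sum (logical OR) of the chosen columns, one entry per test.
        union = tuple(int(any(M[i][j] for j in subset)) for i in range(t))
        if union in seen:
            return False  # two different defective sets give the same results
        seen.add(union)
    return True


# Hypothetical example: 3 tests on 4 items.
M = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
print(is_d_separable(M, 1))  # True: all four columns are distinct
print(is_d_separable(M, 2))  # False: columns {0, 2} and {1, 2} both give (1, 1, 1)
```

The check enumerates all $\binom{n}{d}$ column subsets, so it is only practical for small $n$ and $d$.
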
A binary matrix, $M$, is called $d$-disjunct if the Boolean sum of any $d$ columns does not contain any other column. (In this context, a column A is said to contain a column B if for every index where B has a 1, A also has a 1.)
A useful property of $d$-disjunct testing matrices is that, with up to $d$ defectives, every non-defective item will appear in at least one test whose outcome is negative. This means there is a simple procedure for finding the defectives: just remove every item that appears in a negative test.
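
The disjunctness condition and the elimination procedure just described can be sketched as follows. The function names `is_d_disjunct` and `decode_by_elimination` and the example matrix are assumptions made for illustration. The disjunctness check is brute force and only suitable for small matrices.

```python
from itertools import combinations


def is_d_disjunct(M, d):
    """Return True iff no Boolean sum of d columns contains any other column."""
    t, n = len(M), len(M[0])
    for subset in combinations(range(n), d):
        union = [int(any(M[i][j] for j in subset)) for i in range(t)]
        for k in range(n):
            if k in subset:
                continue
            # The union contains column k if it has a 1 wherever column k does.
            if all(union[i] >= M[i][k] for i in range(t)):
                return False
    return True


def decode_by_elimination(M, y):
    """Remove every item that appears in a negative test; return the remaining items."""
    n = len(M[0])
    cleared = set()
    for row, outcome in zip(M, y):
        if outcome == 0:
            cleared.update(j for j in range(n) if row[j])
    return [j for j in range(n) if j not in cleared]


# Hypothetical example: 4 tests on 6 items, each item placed in a distinct pair of tests.
M = [[1, 1, 1, 0, 0, 0],
     [1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]
print(is_d_disjunct(M, 1))  # True
print(is_d_disjunct(M, 2))  # False
# With item 3 as the only defective, tests 1 and 2 are positive.
print(decode_by_elimination(M, [0, 1, 1, 0]))  # [3]
```

Because this matrix is 1-disjunct but not 2-disjunct, the elimination step is only guaranteed to return exactly the defective set when there is at most one defective.
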
Using the properties of $d$-separable and $d$-disjunct matrices, the following can be shown for the problem of identifying $d$ defectives among $n$ total items.

The number of tests needed for an asymptotically small average probability of error scales as $O(d \log_2 n)$.
The number of tests needed for an asymptotically small maximum probability of error scales as $O(d^2 \log_2 n)$.
The number of tests needed for a zero probability of error scales as $O\left(\frac{d^2 \log_2 n}{\log_2 d}\right)$.

== Generalised binary-splitting algorithm ==

The generalised binary-splitting algorithm is an essentially-optimal adaptive group-testing algorithm that finds $d$ or fewer defectives among $n$ items as follows: