5.8 KiB
| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Group testing | 1/10 | https://en.wikipedia.org/wiki/Group_testing | reference | science, encyclopedia | 2026-05-05T09:50:23.496143+00:00 | kb-cron |
In statistics and combinatorial mathematics, group testing is any procedure that breaks up the task of identifying objects into tests on groups of items, rather than testing each item individually. First studied by Robert Dorfman in 1943, group testing is a relatively new field of mathematics that can be applied to a wide range of practical applications and is an active area of research today. A familiar example of group testing involves a string of light bulbs connected in series, where exactly one of the bulbs is known to be broken. The objective is to find the broken bulb using the smallest number of tests (where a test is when some of the bulbs are connected to a power supply). A simple approach is to test each bulb individually. However, when there are a large number of bulbs it would be much more efficient to pool the bulbs into groups. For example, by connecting the first half of the bulbs at once, it can be determined which half the broken bulb is in, ruling out half of the bulbs in just one test. Schemes for carrying out group testing can be simple or complex and the tests involved at each stage may be different. Schemes in which the tests for the next stage depend on the results of the previous stages are called adaptive procedures, while schemes designed so that all the tests are known beforehand are called non-adaptive procedures. The structure of the scheme of the tests involved in a non-adaptive procedure is known as a pooling design. Group testing has many applications, including statistics, biology, computer science, medicine, engineering and cyber security. Modern interest in these testing schemes has been rekindled by the Human Genome Project.
== Basic description and terms == Unlike many areas of mathematics, the origins of group testing can be traced back to a single report written by a single person: Robert Dorfman. The motivation arose during the Second World War when the United States Public Health Service and the Selective service embarked upon a large-scale project to weed out all syphilitic men called up for induction. Testing an individual for syphilis involves drawing a blood sample from them and then analysing the sample to determine the presence or absence of syphilis. At the time, performing this test was expensive, and testing every soldier individually would have been very expensive and inefficient. Supposing there are
n
{\displaystyle n}
soldiers, this method of testing leads to
n
{\displaystyle n}
separate tests. If a large proportion of the people are infected then this method would be reasonable. However, in the more likely case that only a very small proportion of the men are infected, a much more efficient testing scheme can be achieved. The feasibility of a more effective testing scheme hinges on the following property: the soldiers can be pooled into groups, and in each group the blood samples can be combined. The combined sample can then be tested to check if at least one soldier in the group has syphilis. This is the central idea behind group testing. If one or more of the soldiers in this group has syphilis, then a test is wasted (more tests need to be performed to find which soldier(s) it was). On the other hand, if no one in the pool has syphilis then many tests are saved, since every soldier in that group can be eliminated with just one test. The items that cause a group to test positive are generally called defective items (these are the broken lightbulbs, syphilitic men, etc.). Often, the total number of items is denoted as
n
{\displaystyle n}
and
d
{\displaystyle d}
represents the number of defectives if it is assumed to be known.
=== Classification of group-testing problems === There are two independent classifications for group-testing problems; every group-testing problem is either adaptive or non-adaptive, and either probabilistic or combinatorial. In probabilistic models, the defective items are assumed to follow some probability distribution and the aim is to minimise the expected number of tests needed to identify the defectiveness of every item. On the other hand, with combinatorial group testing, the goal is to minimise the number of tests needed in a 'worst-case scenario' – that is, create a minmax algorithm – and no knowledge of the distribution of defectives is assumed. The other classification, adaptivity, concerns what information can be used when choosing which items to group into a test. In general, the choice of which items to test can depend on the results of previous tests, as in the above lightbulb problem. An algorithm that proceeds by performing a test, and then using the result (and all past results) to decide which next test to perform, is called adaptive. Conversely, in non-adaptive algorithms, all tests are decided in advance. This idea can be generalised to multistage algorithms, where tests are divided into stages, and every test in the next stage must be decided in advance, with only the knowledge of the results of tests in previous stages. Although adaptive algorithms offer much more freedom in design, it is known that adaptive group-testing algorithms do not improve upon non-adaptive ones by more than a constant factor in the number of tests required to identify the set of defective items. In addition to this, non-adaptive methods are often useful in practice because one can proceed with successive tests without first analysing the results of all previous tests, allowing for the effective distribution of the testing process.