| title | chunk | source | category | tags | date_saved | instance |
|---|---|---|---|---|---|---|
| Group testing | 10/10 | https://en.wikipedia.org/wiki/Group_testing | reference | science, encyclopedia | 2026-05-05T09:50:23.496143+00:00 | kb-cron |
=== Data forensics ===
Data forensics is a field dedicated to finding methods for compiling digital evidence of a crime. Such crimes typically involve an adversary modifying the data, documents or databases of a victim, with examples including the altering of tax records, a virus hiding its presence, or an identity thief modifying personal data.

A common tool in data forensics is the one-way cryptographic hash. This is a function that takes the data and, through a difficult-to-reverse procedure, produces a number called a hash that is, for practical purposes, unique to that data. Hashes, which are often much shorter than the data, make it possible to check whether the data has been changed without wastefully storing complete copies of the information: the hash of the current data is compared with a previously stored hash to determine whether any changes have occurred. An unfortunate property of this method is that, although it is easy to tell that the data has been modified, there is no way of determining how: it is impossible to recover which part of the data has changed.

One way around this limitation is to store more hashes, now of subsets of the data structure, to narrow down where the attack has occurred. However, to find the exact location of the attack with a naive approach, a hash would need to be stored for every datum in the structure, which would defeat the point of using hashes in the first place. (One may as well store a regular copy of the data.)

Group testing can be used to dramatically reduce the number of hashes that need to be stored. A test becomes a comparison between the stored and current hashes of a group of data, and it is positive when there is a mismatch. A positive test indicates that at least one edited datum (an edited datum plays the role of a defective item in this model) is contained in the group that generated the current hash. In fact, the number of hashes needed is so low that they, along with the testing matrix they refer to, can even be stored within the organisational structure of the data itself. This means that, as far as memory is concerned, the test can be performed 'for free'. (The one exception is a master key or password that is used to secretly determine the hashing function.)
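The idea of hashing groups rather than individual data can be illustrated with a minimal sketch. It rests on assumptions that are not part of the description above: at most one record is edited, the groups are chosen by the binary digits of each record's 1-based index (one simple nonadaptive testing matrix among many), and HMAC-SHA256 from the Python standard library stands in for the secretly keyed hashing function. The names `group_digest`, `bit_groups`, `store_hashes` and `locate_change` are hypothetical.

```python
import hashlib
import hmac


def group_digest(key: bytes, records: list[bytes], members: list[int]) -> bytes:
    """Keyed hash (HMAC-SHA256) over the records whose indices are in `members`."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for i in members:
        # Length-prefix each record so that record boundaries are unambiguous.
        mac.update(len(records[i]).to_bytes(8, "big"))
        mac.update(records[i])
    return mac.digest()


def bit_groups(n: int) -> list[list[int]]:
    """Group j holds every record i whose 1-based index (i + 1) has bit j set."""
    num_tests = n.bit_length()  # enough bits to write the values 1..n
    return [[i for i in range(n) if ((i + 1) >> j) & 1] for j in range(num_tests)]


def store_hashes(key: bytes, records: list[bytes]) -> list[bytes]:
    """Compute one hash per group; these are the only values kept for later."""
    return [group_digest(key, records, g) for g in bit_groups(len(records))]


def locate_change(key: bytes, records: list[bytes], stored: list[bytes]):
    """Return the index of the single edited record, or None if nothing changed.

    Assumes at most one record was edited; with more edits the result is unreliable.
    """
    code = 0
    for j, (group, old) in enumerate(zip(bit_groups(len(records)), stored)):
        # A positive test is a mismatch between the stored and current hashes.
        if not hmac.compare_digest(group_digest(key, records, group), old):
            code |= 1 << j
    # The positive tests spell out the 1-based index in binary; 0 means no change.
    return code - 1 if code else None


if __name__ == "__main__":
    key = b"demo-master-key"  # assumption: a secret shared by writer and verifier
    records = [f"row {i}".encode() for i in range(8)]
    stored = store_hashes(key, records)  # 4 hashes instead of 8
    records[5] = b"tampered"
    print(locate_change(key, records, stored))  # index of the edited record
```

Under these assumptions, n records need only about log2(n) stored hashes rather than n. Locating several edited records at once requires a richer testing matrix than this one.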
== See also ==
Balance puzzle