kb/Algorithmic_bias-5.md at a36fea87f27c916f4e8e49e4d0e70e23f4a1e902

turtle89431 173b4d5af1 Scrape wikipedia-science: 20542 new, 4794 updated, 25967 total (kb-cron)

2026-05-05 09:31:32 -07:00

5.8 KiB

Raw Blame History

title	chunk	source	category	tags	date_saved	instance
Algorithmic bias	6/13	https://en.wikipedia.org/wiki/Algorithmic_bias	reference	science, encyclopedia	2026-05-05T16:31:03.393915+00:00	kb-cron

=== Emergent === Emergent bias is the result of the use and reliance on algorithms across new or unanticipated contexts. Algorithms may not have been adjusted to consider new forms of knowledge, such as new drugs or medical breakthroughs, new laws, business models, or shifting cultural norms. This may exclude groups through technology, without providing clear outlines to understand who is responsible for their exclusion. Similarly, problems may emerge when training data (the samples "fed" to a machine, by which it models certain conclusions) do not align with contexts that an algorithm encounters in the real world. In 1990, an example of emergent bias was identified in the software used to place US medical students into residencies, the National Residency Match Program (NRMP). The algorithm was designed at a time when few married couples would seek residencies together. As more women entered medical schools, more students were likely to request a residency alongside their partners. The process called for each applicant to provide a list of preferences for placement across the US, which was then sorted and assigned when a hospital and an applicant both agreed to a match. In the case of married couples where both sought residencies, the algorithm weighed the location choices of the higher-rated partner first. The result was a frequent assignment of highly preferred schools to the first partner and lower-preferred schools to the second partner, rather than sorting for compromises in placement preference. Additional emergent biases include:

==== Correlations ==== Unpredictable correlations can emerge when large data sets are compared to each other. For example, data collected about web-browsing patterns may align with signals marking sensitive data (such as race or sexual orientation). By selecting according to certain behavior or browsing patterns, the end effect would be almost identical to discrimination through the use of direct race or sexual orientation data. In other cases, the algorithm draws conclusions from correlations, without being able to understand those correlations. For example, one triage program gave lower priority to asthmatics who had pneumonia than asthmatics who did not have pneumonia. The program algorithm did this because it simply compared survival rates: asthmatics with pneumonia are at the highest risk. Historically, for this same reason, hospitals typically give such asthmatics the best and most immediate care.

==== Unanticipated uses ==== Emergent bias can occur when an algorithm is used by unanticipated audiences. For example, machines may require that users can read, write, or understand numbers, or relate to an interface using metaphors that they do not understand. These exclusions can become compounded, as biased or exclusionary technology is more deeply integrated into society. Apart from exclusion, unanticipated uses may emerge from the end user relying on the software rather than their own knowledge. In one example, an unanticipated user group led to algorithmic bias in the UK, when the British National Act Program was created as a proof-of-concept by computer scientists and immigration lawyers to evaluate suitability for British citizenship. The designers had access to legal expertise beyond the end users in immigration offices, whose understanding of both software and immigration law would likely have been unsophisticated. The agents administering the questions relied entirely on the software, which excluded alternative pathways to citizenship, and used the software even after new case laws and legal interpretations led the algorithm to become outdated. As a result of designing an algorithm for users assumed to be legally savvy on immigration law, the software's algorithm indirectly led to bias in favor of applicants who fit a very narrow set of legal criteria set by the algorithm, rather than by the more broader criteria of British immigration law.

==== Feedback loops ==== Emergent bias may also create a feedback loop, or recursion, if data collected for an algorithm results in real-world responses which are fed back into the algorithm. For example, simulations of the predictive policing software (PredPol), deployed in Oakland, California, suggested an increased police presence in black neighborhoods based on crime data reported by the public. The simulation showed that the public reported crime based on the sight of police cars, regardless of what police were doing. The simulation interpreted police car sightings in modeling its predictions of crime, and would in turn assign an even larger increase of police presence within those neighborhoods. The Human Rights Data Analysis Group, which conducted the simulation, warned that in places where racial discrimination is a factor in arrests, such feedback loops could reinforce and perpetuate racial discrimination in policing. Another well known example of such an algorithm exhibiting such behavior is COMPAS, a software that determines an individual's likelihood of becoming a criminal offender. The software is often criticized for labeling Black individuals as criminals much more likely than others, and then feeds the data back into itself in the event individuals become registered criminals, further enforcing the bias created by the dataset the algorithm is acting on. Recommender systems such as those used to recommend online videos or news articles can create feedback loops. When users click on content that is suggested by algorithms, it influences the next set of suggestions. Over time this may lead to users entering a filter bubble and being unaware of important or useful content.

== Impact ==

5.8 KiB Raw Blame History

5.8 KiB

Raw Blame History