Inside the Black Box: Built-in Bias of Policing Risk Assessment Tools
From the COVID-19 pandemic to the murder of George Floyd, 2020 is a year that will go down in history. George Floyd’s death fueled protests around the world and calls for governments, especially in the United States, to address systemic misconduct in police departments and prosecutors’ offices (Cockrell, 2021). As the world grappled with the intensity and emotional turmoil of these events, the pervasive reality of anti-Black racism, among other social disparities, became evident and could no longer be ignored. As this blog post will highlight, these demands for widespread change in policing and for racial justice intersect with the growing presence and power of algorithmic crime forecasting tools. More and more police departments, municipalities, and states are using predictive algorithms to make their decisions (Cockrell, 2021; Hälterlein, 2021). Thus, it is imperative to review research that explores the potential benefits and dangers of such tools, as well as the relevant theme of ‘big’ data. To that end, Jens Hälterlein’s article “Epistemologies of predictive policing: Mathematical social science, social physics and machine learning” will serve as the main literature for reviewing the implications of my object of study: the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) predictive risk assessment software.
While proponents believe that predictive policing tools can bring greater objectivity to decisions about where police departments should send officers on patrol, which defendants should be released on bail, and how judges should hand out sentences, others fear that they will further perpetuate discrimination and become a new source of inequality (Cockrell, 2021; Hälterlein, 2021). COMPAS, software owned by Northpointe, is among the most widely used risk assessment tools in the United States (Angwin et al., 2016). At its core, COMPAS is used to predict the likelihood that an individual will commit a future crime. Some judges have cited the software in their sentencing decisions, which in some cases resulted in the release of dangerous criminals or in longer sentences than warranted (Angwin et al., 2016). Research conducted by ProPublica found that its formula was especially prone to falsely identifying black defendants as future criminals, “wrongly labeling them this way at almost twice the rate as white defendants” (Angwin et al., 2016:3). Conversely, white defendants were “mislabeled as low risk more often than black defendants” (Angwin et al., 2016:3). Yet very few people know what its formula consists of (Angwin et al., 2016).
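The two error types ProPublica describes correspond to what statisticians call the false positive rate (people labeled high risk who did not reoffend) and the false negative rate (people labeled low risk who did). The sketch below shows how such rates can be compared across groups. It is only an illustration: the records, the group names "A" and "B", and the function `error_rates` are invented for this post and are not ProPublica's data or method.

```python
from collections import defaultdict

# Hypothetical toy records, NOT ProPublica's data:
# (group, labeled_high_risk, actually_reoffended)
records = [
    ("A", True, False), ("A", True, False), ("A", True, True),
    ("A", False, False), ("A", False, False), ("A", False, True),
    ("B", True, False), ("B", True, True),
    ("B", False, False), ("B", False, False), ("B", False, False),
    ("B", False, True), ("B", False, True),
]


def error_rates(rows):
    """Return the false positive and false negative rate for each group."""
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, high_risk, reoffended in rows:
        if high_risk:
            key = "tp" if reoffended else "fp"
        else:
            key = "fn" if reoffended else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        did_reoffend = c["fn"] + c["tp"]
        rates[group] = {
            # Share of non-reoffenders who were wrongly labeled high risk.
            "false_positive_rate": c["fp"] / did_not_reoffend if did_not_reoffend else None,
            # Share of reoffenders who were wrongly labeled low risk.
            "false_negative_rate": c["fn"] / did_reoffend if did_reoffend else None,
        }
    return rates


rates = error_rates(records)
for group, r in sorted(rates.items()):
    print(group, r)
```

In this made-up data, group A's false positive rate is twice group B's, while group B's false negative rate is higher. That is the same shape of disparity the ProPublica quotes describe, though the numbers here carry no empirical weight.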
The company does not divulge its algorithm’s formula, claiming it is intellectual property. Furthermore, as the company is a for-profit organization, it has no financial incentive to do so. Indeed, “knowledge about the computational methods that generate predictions remains vague or concealed behind the claims of software companies and other actors directly involved with the development and implementation of the tools that are supposed to identify crime before it happens” (Hälterlein, 2021:10). Consequently, neither defendants nor the public can see what variables might be driving the racial disparity in the scores produced by COMPAS (Angwin et al., 2016). Northpointe eventually released the basics of its future-crime formula to ProPublica, revealing that its algorithm considers factors such as education levels and whether a defendant has a job (Angwin et al., 2016). As people become more aware of the potential risks associated with predictive policing software like COMPAS, they are becoming angered by “a growing dependence on automated systems that are taking humans and transparency out of the process” (Metz & Satariano, 2020, para.9). Such public outcry has led the state of Idaho to pass a law requiring that the “methods and data used in bail algorithms be publicly available so the general public can understand how they work” (Metz & Satariano, 2020, para.13).
Nevertheless, as pointed out by Hälterlein, “many systems that are built on [machine learning] algorithms would remain a black box even when transparency is given” (2021:9). In other words, it is difficult to hold algorithms accountable if individuals cannot trace their internal operations and evaluate their results (Hälterlein, 2021). Thus, many questions remain unanswered about algorithmic objectivity, data collection, and machine bias in these technologies, and about who is held accountable when they fail.
Algorithmic Objectivity, Data, and Machine Learning Bias
The notion of algorithmic objectivity is prominent in the contemporary landscape. Algorithms are generally perceived as neutral or objective because individuals understand them as mathematical entities that produce automated results, and trust them under the assumption that they are free from human interference. Although this may appear so at first glance, it is a “carefully crafted fiction” (Gillespie, 2014). Algorithms are designed and coded by human beings to attend to specific observations while neglecting others. Consequently, who develops the algorithm matters, because designers of these technologies might embed certain inequalities, knowingly or not (Caplan et al., 2018). After an algorithm is created, it must be trained, “fed large amounts of data on past decisions,” to teach it how to make future decisions: “If that training data is itself biased, the algorithm can inherit that bias” (Caplan et al., 2018:3). Thus, decisions made by computers and reinforced through machine learning are not fundamentally more logical or unbiased than those made by humans (Caplan et al., 2018). Algorithms, and by extension the data they encode, are used as material means to achieve particular social outcomes or arrangements that serve some interests at the expense of others. They are also strongly compatible with particular forms of political and economic relationships and not others. Consequently, algorithms and data are never neutral; instead, they are the combined outcome of deliberate technical design and of social and historical context.
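The point that an algorithm can inherit bias from its training data can be made concrete with a deliberately crude sketch. Everything in it is an assumption made for illustration: the `history` records are invented, the "north" and "south" neighborhoods are hypothetical, and the frequency-counting `train` function stands in for a real machine learning method. It does not describe how COMPAS or any actual tool works.

```python
from collections import defaultdict

# Hypothetical past decisions, invented for illustration:
# (neighborhood, prior_arrests, was_flagged_high_risk)
# People with the same record were flagged more often in "north".
history = [
    ("north", 1, True), ("north", 1, True), ("north", 1, False), ("north", 0, True),
    ("south", 1, False), ("south", 1, False), ("south", 1, True), ("south", 0, False),
]


def train(rows):
    """Learn, per neighborhood, how often past decision-makers flagged people."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for neighborhood, _prior_arrests, was_flagged in rows:
        total[neighborhood] += 1
        flagged[neighborhood] += was_flagged  # True counts as 1, False as 0
    return {n: flagged[n] / total[n] for n in total}


def predict(model, neighborhood, threshold=0.5):
    """Flag someone as high risk if their neighborhood was usually flagged before."""
    return model[neighborhood] > threshold


model = train(history)
print(model)
print(predict(model, "north"), predict(model, "south"))
```

Two people with identical records receive different predictions here solely because past decisions treated their neighborhoods differently. The model has not discovered anything about risk; it has learned to repeat the pattern in the decisions it was fed.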
In the past few years, people have become increasingly concerned that too much of their personal information is available online. However, when it comes to algorithms, the contrary is true: algorithms do not know nearly enough about individuals. As a result, their outputs can have dire consequences for members of society, particularly those who belong to marginalized communities. Algorithms do not evaluate individuals equally, and their decisions do not carry the same weight for everyone. For instance, should a white, middle-class man search for the Montreal Canadiens’ highlights using Google’s search engine, the fact that the algorithm holds only partial information about him is of no concern. It does not matter whether the top search result is Sportsnet or TSN, because both will show the same highlights. However, if a judge is relying on an algorithm to make a sentencing decision for a Black, homeless woman, the stakes of being judged automatically on partial data are suddenly far higher and more democratically significant. Therefore, the contentious debate over whether the use of predictive risk assessment software is acceptable is very much alive and relevant, as these tools far too often condone, normalize, perpetrate, and perpetuate inequality and injustice against Black people.
References
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias. ProPublica. 1-16.
Caplan, R., Donovan, J., Hanson, L., & Matthews, J. (2018). Algorithmic accountability: A primer. Data & Society. 1-12.
Cockrell, J. (2021). Will algorithms fix what’s wrong with American justice, or make things worse? Chicago Booth Review. N.p.
Gillespie, T. (2014). The relevance of algorithms. Media technologies: Essays on communication, materiality, and society. 167, 1-20.
Hälterlein, J. (2021). Epistemologies of predictive policing: Mathematical social science, social physics, and machine learning. Big Data & Society. 1-13.
Metz, C., & Satariano, A. (2020). An algorithm that grants freedom, or takes it away. The New York Times. N.p.