Can an Algorithm Identify Repeat Offenders?

Assessing the likelihood that a defendant will reoffend in the future is a vital task in the criminal justice system. Judges weigh the risk of recidivism when making decisions about bail and sentencing, and their conclusions can have a significant impact on defendants’ lives. In 1998, the software company Northpointe (since rebranded as “equivant”) released a criminal risk assessment tool called COMPAS—Correctional Offender Management Profiling for Alternative Sanctions. COMPAS features an algorithm that is purported to objectively predict recidivism risk. The software uses 137 personal characteristics, including criminal history, to assess the probability that a given individual will commit a misdemeanor or felony within two years following his or her assessment.

Although the algorithm does not explicitly take race into account, it does include characteristics that can be correlated with race and thus, according to ProPublica, contribute to racially biased sentencing. These characteristics include criminal history and past disciplinary action at school. In the wake of the 2016 ProPublica report showing that COMPAS underpredicted recidivism for white defendants and overpredicted it for black defendants, researchers Julia Dressel and Hany Farid of Dartmouth College tested the algorithm’s accuracy. They compared the degree of bias in COMPAS assessments to that in predictions made by human volunteers with no background in criminal justice.

Volunteers reviewed short descriptions drawn from a pool of 1,000 defendants; each description listed seven pieces of information, including the defendant’s sex, age, criminal charge and criminal history. The volunteers were asked to predict whether each defendant would recidivate within two years, and their answers were measured against those produced by COMPAS for the same defendants.

The authors reported that the median accuracy rate of people’s predictions was 62.8 percent, just under COMPAS’ accuracy rate of 65.2 percent for this particular group of defendants. When the authors pooled respondents’ answers using a majority-rules criterion—that is, they compared the answers selected by the majority of respondents to those produced by COMPAS—they found that COMPAS was not significantly more accurate than the human assessors.
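
The pooling step is simple to state: a defendant is labeled "will recidivate" if more than half of the volunteers who rated that defendant predicted recidivism, and accuracy is the share of defendants whose actual two-year outcome matches the prediction. The sketch below illustrates that computation; the counts, predictions and outcomes in it are invented placeholders, not the study’s data.

```python
# Majority-rules ("wisdom of the crowd") pooling, sketched with placeholder data.
import numpy as np

rng = np.random.default_rng(0)

n_defendants = 1000   # size of the defendant pool in the study
n_volunteers = 20     # hypothetical number of volunteers rating each defendant

# 1 = recidivated (or predicted to) within two years, 0 = did not
actual = rng.integers(0, 2, size=n_defendants)                           # placeholder outcomes
volunteer_votes = rng.integers(0, 2, size=(n_volunteers, n_defendants))  # placeholder volunteer predictions
compas_pred = rng.integers(0, 2, size=n_defendants)                      # placeholder COMPAS predictions

# Majority vote: predict recidivism when more than half of the volunteers said so.
crowd_pred = (volunteer_votes.mean(axis=0) > 0.5).astype(int)

# Accuracy = fraction of defendants whose outcome was predicted correctly.
median_individual_accuracy = np.median((volunteer_votes == actual).mean(axis=1))
crowd_accuracy = (crowd_pred == actual).mean()
compas_accuracy = (compas_pred == actual).mean()

print(f"median individual accuracy: {median_individual_accuracy:.3f}")
print(f"crowd (majority vote) accuracy: {crowd_accuracy:.3f}")
print(f"COMPAS accuracy: {compas_accuracy:.3f}")
```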

The researchers also determined that the human assessors exhibited the same racial biases as the algorithm, despite the fact that race was not explicitly included in the defendant descriptions. In particular, both the volunteers and COMPAS predicted that black defendants would reoffend more often than they did, and whites less often. The rates of error were reasonably consistent between human and machine.
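
The bias finding rests on two error rates computed separately for each racial group: the false positive rate (defendants who did not reoffend but were predicted to) and the false negative rate (defendants who did reoffend but were predicted not to). Here is a minimal sketch of that comparison; the group labels and data are placeholders rather than the study’s records.

```python
# False positive / false negative rates by group, with placeholder data.
import numpy as np

def error_rates(predicted, actual):
    """Return (false positive rate, false negative rate) for 0/1 arrays."""
    predicted, actual = np.asarray(predicted), np.asarray(actual)
    fpr = ((predicted == 1) & (actual == 0)).sum() / max((actual == 0).sum(), 1)
    fnr = ((predicted == 0) & (actual == 1)).sum() / max((actual == 1).sum(), 1)
    return fpr, fnr

rng = np.random.default_rng(1)
for group in ("black defendants", "white defendants"):
    actual = rng.integers(0, 2, size=500)     # placeholder outcomes for this group
    predicted = rng.integers(0, 2, size=500)  # placeholder predictions for this group
    fpr, fnr = error_rates(predicted, actual)
    print(f"{group}: false positive rate {fpr:.2f}, false negative rate {fnr:.2f}")
```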

To explore whether the explicit inclusion of the defendant’s race would affect prediction accuracy or bias, Dressel and Farid recruited a new pool of volunteers and presented the same defendant descriptions but with race included. The resulting crowd-based accuracy was 66.5 percent—again, not substantially different from COMPAS’ performance. Once again, the test showed a tendency to overpredict recidivism by black defendants and underpredict that by whites. The authors did not identify any effect of participant age, gender or level of education on accuracy. They noted that there were not enough nonwhite respondents in the volunteer sample to determine differences in accuracy by participant race.

The internal workings of the COMPAS software are proprietary to equivant, so it is not known how the algorithm processes a given defendant’s 137 descriptor variables. However, the authors tested the accuracy of a simple linear predictor using only the seven characteristics provided to the study volunteers. They trained a logistic regression model and found that this linear predictor achieved a prediction accuracy rate of 66.6 percent. That is not significantly different from COMPAS’ overall accuracy rate of 65.4 percent, even though the linear model uses 130 fewer explanatory variables. In fact, the authors trained another logistic regression model using only two explanatory variables (age and total number of previous convictions) and found that the two-variable model performed almost identically to COMPAS with respect to both prediction accuracy and bias.
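
The two-variable baseline is straightforward to reproduce in spirit. The sketch below fits a logistic regression on age and prior-conviction count and reports held-out accuracy; the data are synthetic stand-ins (the authors worked with the ProPublica COMPAS records, and their exact preprocessing and validation procedure may differ).

```python
# Two-feature logistic regression baseline (age, number of prior convictions),
# trained on synthetic placeholder data for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 7000

age = rng.integers(18, 70, size=n)
priors = rng.poisson(3, size=n)

# Synthetic two-year recidivism outcome loosely tied to the two features.
logit = -1.5 - 0.03 * (age - 35) + 0.25 * priors
recidivated = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([age, priors])
X_train, X_test, y_train, y_test = train_test_split(
    X, recidivated, test_size=0.2, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
```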

Algorithmic criminal risk assessment tools have been widely deployed in the United States—COMPAS alone has assessed more than 1 million offenders since its launch in 1998. However, Dressel and Farid’s work calls into question whether these predictive tools are actually an improvement over human judgment and simple regression models. Clearly, much work remains to be done to improve the quality of the computational approaches and tools that broadly impact the American criminal justice system.

Article source: Dressel, Julia, and Hany Farid. “The accuracy, fairness, and limits of predicting recidivism.” Science Advances 4, no. 1 (2018).

