NIST Face Recognition Study Finds That Algorithms Vary Greatly, Biases Tend to Be Regional

The use of face recognition software by governments is a subject of controversy around the globe. The world’s major powers, primarily the United States and China, have made major advances in both developing and deploying this technology over the past decade, and both have been exporting it to other countries. The rapid spread of facial recognition systems has alarmed privacy advocates, who are concerned about the increased ability of governments to profile and track people and about private companies like Facebook tying face data to intimately detailed personal profiles.

A recent study by the US National Institute of Standards and Technology (NIST) that examines software from facial recognition vendors has found merit in claims of racial bias and poor accuracy for specific demographics. Among other things, the study found that law enforcement systems tended to have trouble accurately identifying Native American and African American faces, and that women, the very young and the elderly were more likely to experience false positives.

Perhaps even more importantly, the study found major regional variation depending on where an algorithm was developed; accuracy tended to be highest for the majority population of that region.

Key findings

The study, published in December 2019, is the third in the series of tests of vendor face recognition capability and accuracy, and it focused specifically on how well these systems identify various demographic groups.

The tests used both one-to-one and one-to-many face recognition processes, drawing on a variety of government photo databases: “mugshots” taken during arrests, visa photographs, border crossing photographs and pictures attached to immigration applications. Most of these data sets contain high-quality images suitable for modern face recognition applications, but the border crossing photographs were included intentionally as an example of low-quality data. The study tested for both false positives and false negatives.
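To make the terms above concrete, here is a minimal sketch of how one-to-one verification, one-to-many identification, and the two error types relate. It is an illustration only and not NIST's methodology: the `Comparison` type, the 0-to-1 score scale, the threshold, and the function names are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One face-pair comparison (hypothetical structure for illustration)."""
    score: float       # similarity score; an assumed 0..1 scale
    same_person: bool  # ground truth: do both photos show the same person?


def error_rates(comparisons, threshold):
    """One-to-one verification: return (false_positive_rate, false_negative_rate).

    A false positive is a pair of different people scored at or above the
    threshold; a false negative is a pair of the same person scored below it.
    """
    impostor = [c for c in comparisons if not c.same_person]
    genuine = [c for c in comparisons if c.same_person]
    false_pos = sum(c.score >= threshold for c in impostor)
    false_neg = sum(c.score < threshold for c in genuine)
    fpr = false_pos / len(impostor) if impostor else 0.0
    fnr = false_neg / len(genuine) if genuine else 0.0
    return fpr, fnr


def identify(gallery_scores, threshold):
    """One-to-many identification: return gallery IDs whose score against the
    probe photo meets the threshold, best match first."""
    hits = [gid for gid, s in gallery_scores.items() if s >= threshold]
    return sorted(hits, key=lambda gid: gallery_scores[gid], reverse=True)
```

Raising the threshold in a sketch like this trades false positives for false negatives, which is why a study needs to report both.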

A key finding of all of these studies is that algorithm accuracy varies widely between vendors. One important finding specific to the most recent study is that algorithms tend to have a racial bias that corresponds to where they were created. For example, many of the algorithms were developed in China, and these tend to be more accurate when matching East Asian faces.

Across all of the algorithms, false positives are much more frequent than false negatives. Higher rates of false positives for women, children and the elderly were also consistent across all algorithms, though none of these effects was as large as the false positive differences attributable to race.

Taken as a whole, the higher-quality data sets produced elevated false positive rates for African and East Asian subjects (with the exception of the algorithms developed in Asian countries), at 10 to 100 times the rates for Caucasian subjects. Those used by law enforcement agencies had the highest false positive rates for Native American subjects, and significantly elevated rates for African American and Asian subjects.
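A "10 to 100 times" differential of this kind is a ratio of false positive rates between groups. The sketch below shows one way such a ratio could be computed from raw counts. The function name, the group labels, and the counts in the usage note are hypothetical placeholders chosen for round numbers; they are not figures from the NIST report.

```python
def false_positive_ratios(counts, reference):
    """Return each group's false positive rate relative to a reference group.

    counts maps a group label to (false_positives, impostor_comparisons).
    The ratio (fp / n) / (ref_fp / ref_n) is computed as a single division
    of integer products to limit floating-point rounding.
    """
    ref_fp, ref_n = counts[reference]
    if ref_fp == 0:
        raise ValueError("reference group has no false positives; ratio undefined")
    return {
        group: (fp * ref_n) / (n * ref_fp)
        for group, (fp, n) in counts.items()
    }
```

For example, with made-up counts of `{"group_a": (1, 10000), "group_b": (10, 10000), "group_c": (100, 10000)}` and `"group_a"` as the reference, groups b and c come out at 10 and 100 times the reference rate.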

Among the high-quality data sets, false negatives were most frequent in the mugshots of Asian and Native American subjects. In the border crossing photo sets, false negative rates were highest for Africans and subjects from the Caribbean, particularly older people.

Face recognition bias

Racial bias is a long-established problem in face recognition systems. The likely culprit is the same as in previous incidents: training data sets that skew too heavily toward the majority race where the algorithm is developed, and that also appear to skew male and middle-aged across cultures.

As the Algorithmia blog points out, many commonly used face recognition processes rely on core training data sets that were scraped from the internet 10 years ago. These may also have leaned heavily on celebrity sources, which would naturally skew toward the ethnic majority of the area in which they were collected.

The report stresses that individual algorithms tend to have their own biases, and it is critical for anyone using an algorithm to be aware of the biases it contains. NIST reports that it found demographic differentials, particularly false positives, in the majority of the algorithms that it tested.

The study included 189 algorithms from 99 developers, which represents most of those used today. NIST cited NEC Corporation’s NEC-3 algorithm as the most accurate one it evaluated. A few major algorithms were absent from the study, however, most notably Amazon’s Rekognition system.

Problems for government agencies

While numerous concerns and allegations have been voiced over the past decade, there has been little empirical evidence until now to support the notion of widespread racial bias in face recognition. This federal study makes clear that the bias is real and present in many commonly used facial recognition algorithms, and that changes (such as more diverse training data) are needed to produce more equitable outcomes and reduce error rates.

Obvious problems start at national borders, where facial recognition algorithms have been used to identify potential security threats for some time now. These algorithms may not only be making erroneous matches of minority demographics at an alarming rate; they may also be missing legitimate threats if they rely on photo data that is not up to standard.

The NIST study results will almost certainly amplify calls to ban law enforcement agencies from using facial recognition software to identify people, as the cities of Oakland and San Francisco have already done. At present there is no federal law in the United States restricting the use of face recognition software. Government agencies have recently backed off some of their proposed face recognition plans; most notably, Customs and Border Protection no longer plans to scan US citizens when they enter and exit the country.
