Big data and artificial intelligence are revolutionizing the ways in which financial firms, governments, and employers classify individuals. Surprisingly, however, one of the most important threats that this revolution poses to anti-discrimination regimes remains largely unexplored or misunderstood in the extant literature: the risk that modern algorithms will produce “proxy discrimination.” Proxy discrimination is a specific type of practice producing a disparate impact. It occurs when two conditions are met. The first is widely recognized: a facially-neutral characteristic that is relevant to achieving a discriminator’s objectives must be correlated with membership in a protected class. By contrast, the second defining feature of proxy discrimination is generally overlooked: in addition to producing a disparate impact, proxy discrimination requires that the predictive power of a facially-neutral characteristic be at least partially attributable to its correlation with a suspect classifier. For this to happen, the suspect classifier must itself have some predictive power, making it ‘rational’ for an insurer, employer, or other actor to take it into consideration. As AIs become even smarter and big data becomes even bigger, proxy discrimination will represent an increasingly fundamental challenge to many anti-discrimination regimes. This is because AIs are inherently structured to engage in proxy discrimination whenever they are deprived of information about predictive variables. Denying AIs access to the most intuitive proxies for those variables does nothing to alter this process; it simply causes AIs to locate less intuitive proxies. The proxy discrimination produced by AIs therefore has the potential to cause substantial social and economic harms by undermining many of the central goals of existing anti-discrimination regimes. For these reasons, anti-discrimination law must adapt to combat proxy discrimination in the age of AI and big data.
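The mechanism described above can be illustrated with a minimal simulation (a hypothetical sketch, not drawn from the Article: the variable names `group`, `proxy`, and `legit` are invented for illustration). When a model is denied the suspect classifier itself, ordinary least squares shifts predictive weight onto a facially-neutral feature precisely because that feature is correlated with the classifier:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: 'group' is the suspect classifier (itself predictive),
# 'proxy' is a facially-neutral feature correlated with it, and
# 'legit' is a genuinely neutral feature.
group = rng.integers(0, 2, n).astype(float)
proxy = group + rng.normal(0, 0.5, n)
legit = rng.normal(0, 1, n)
y = 1.0 * group + 1.0 * legit + rng.normal(0, 0.5, n)

# Fit WITHOUT access to 'group': the model loads weight onto the proxy,
# because part of the proxy's predictive power comes from its
# correlation with the suspect classifier.
X = np.column_stack([proxy, legit, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"weight on proxy: {coef[0]:.2f}")
print(f"weight on legit: {coef[1]:.2f}")
```

Removing `proxy` from the feature set would not solve the problem; given any other feature correlated with `group`, the fit would shift weight onto that feature instead, which is the sense in which denying access to intuitive proxies merely relocates the discrimination.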
This Article offers a menu of potential responses to the risk of proxy discrimination by AI. These include prohibiting the use of non-approved types of discrimination, requiring the collection and disclosure of data about impacted individuals’ membership in legally protected classes, and requiring firms to eliminate proxy discrimination by employing statistical models that isolate only the predictive power of non-suspect variables.
Keywords: Proxy Discrimination, Artificial Intelligence, Insurance, Big Data, GINA