Abstract

Data-driven tools are increasingly used to make consequential decisions. In recent years, they have begun to advise employers on which job applicants to interview, judges on which defendants to grant bail, lenders on which homeowners to give loans, and more. In such settings, different data-driven rules result in different decisions. The problem is, to every data-driven rule, there are exceptions. While a data-driven rule may be appropriate for some, it may not be appropriate for all. In this case study, we argue that individuals have the right to be an exception to a data-driven rule. That is, they should not be held responsible when they are, through no fault of their own, the data-driven exceptions. We motivate the right and explain why it is not addressed by existing frameworks. The right also places a duty on decision makers—that the presumption should not always be that a data-driven rule is appropriate for every decision-subject, but rather that the data-driven rule must be justified for use on a given decision-subject, providing a level of consideration fit to the risk of harm. We provide a framework for justifying and contesting on the basis of the right to be an exception. The framework requires that a data-driven rule be assessed in terms of three components: individualization, uncertainty, and harm. We emphasize the importance of uncertainty—that the decision maker utilize a data-driven recommendation only if the levels of individualization and certainty are high enough to justify the level of harm that would result from that recommendation.


Keywords: data-driven decision-making, rights and duties, individualization, uncertainty, harm

Sarah H. Cen
Department of Electrical Engineering and Computer Science, MIT

Manish Raghavan
Department of Electrical Engineering and Computer Science and Sloan School of Management, MIT

Learning Objectives

Identify the different ways that data-driven exceptions arise, such as sampling bias and distribution shift.
Explore differences between human- and data-driven decisions that make the governance of data-driven decisions challenging.
Understand how data-driven exceptions fail to be protected by various existing criteria, such as requiring that a data-driven rule be accurate or calibrated.
Unpack the benefits and limitations of individualization (the process of tailoring a data-driven rule to each decision-subject by, for instance, adding covariates or training data).
Examine the importance of uncertainty evaluation and quantification in upholding an individual’s right to be an exception.

Introduction: Rules and Exceptions

We make sense of our world through rules. A bed is for sleeping, while a desk is for working. Dogs have non-retractable claws, while cats have retractable ones. Doing well in school improves one’s chances of landing a job.

But, to every rule, there are exceptions. Some people work on their beds, Huskies have retractable claws, and good grades do not always lead to good jobs. Although exceptions are by definition uncommon, they often carry significance disproportionate to their numbers. Exceptions not only improve our understanding of the rules, but they also help us develop better ones. Which dog breeds have retractable claws and why? If a certain dog breed has retractable claws, what other characteristics best distinguish them from cats?

Designing rules to distinguish between dogs and cats may be a fun thought experiment, but rules are also used to make decisions under much higher stakes. For instance, rules are used to determine which applicants a bank approves for loans, which defendants a judge grants bail, and which blobs in an image an autonomous vehicle classifies as human. In these contexts, exceptions matter. An individual who fails to get a loan may lose their house, a defendant who is denied bail may be unable to provide for their dependents, and an autonomous vehicle may not stop for a person misclassified as a part of the road.

No matter how good the rule, mistakes are inevitable, which raises the question: What happens to the individuals on which the rule fails, that is, the exceptions?

What Happens to Exceptions?

In some cases, nothing. We accept that rules and generalizations are, on occasion, tolerable and even necessary. Indeed, the law allows landlords to put no-pet clauses in rental agreements (a rule based on the generalization that renters with pets cause more damage to homes than renters without pets) and airlines to remove passengers for safety reasons (a policy that relies on judgments about actions that a passenger could but has not yet committed).

In other cases—typically, when the risk of harm is high—the state steps in to shield individuals from the adverse effects that can follow from the overapplication of rules. Consider sentencing decisions as an example. For many crimes, there are mandatory minimum sentences: a set of standardized rules that prescribe the minimum sentence a defendant must serve for a crime, if convicted. These rules arose in the 1970s as a way to “make sentencing procedures fairer and sentencing outcomes more predictable and consistent.”1 Notably, mandatory minimum sentencing rules were also used in capital cases, which effectively meant that the death penalty was mandatory for certain crimes, regardless of mitigating circumstances.

In 1976, however, the U.S. Supreme Court ruled in Woodson v. North Carolina that capital cases should not be subject to mandatory minimum sentencing rules. The Court wrote that there must be “consideration of the character and record of the individual offender and the circumstances of the particular offense” before imposing a sentence as serious and irrevocable as the death penalty.2 In other words, the Court decided that, when it comes to the death penalty, rules that regularly yield exceptions (defendants on which the rule, but not the presiding judge, would impose the death penalty) are unacceptable. In response, the courts gave greater discretion to judges.

Exceptions in Data-Driven Decision-Making

In this piece, we turn our attention to data-driven rules. By “data-driven rules,” we mean the decision rules that define the relationship between the inputs (i.e., covariates) and outputs (i.e., recommendations) of data-driven decision aids.3 For instance, a data-driven decision aid in criminal justice may take in information about a defendant (i.e., the defendant’s covariates) and advise a judge on whether to grant parole. Such decision aids typically produce a score that “predicts” the likelihood that a defendant would reoffend if released on parole.4 Different rules generate different scores. While one rule may give higher scores to defendants with families, another may not. One rule may use details about the defendant’s crime as a covariate while another may not. As such, a defendant may be granted parole under some rules but not others.

Such data-driven rules are behind an increasing number of consequential decisions. They inform decisions not only in criminal justice, but also in hiring, college admissions, health care, social welfare, and lending, to name a few. As many scholars have acknowledged, however, there is a gap in the governance of data-driven decisions because individuals who are subject to data-driven rules are not always protected by a legal system that has been built around human decisions.5

In an effort to close this gap, we argue that individuals have the right to be an exception to a data-driven rule.6 This right is built on the following intuition. For a given context, there are many possible data-driven rules. Each rule yields a different set of exceptions. In most (if not all) cases, it is impossible to know who the exceptions are. Indeed, if a decision maker could identify the exceptions a priori, then they would be able to correct the rule on those individuals, which would imply that the decision maker does not need a decision aid to begin with. It is therefore inevitable that some individuals are—through no fault of their own—the exceptions.

Should data-driven exceptions be held responsible for failing to be captured by a data-driven rule? That is, should an individual be required to live such that their covariates elicit the correct treatment under every data-driven rule, or otherwise accept the consequences? In this piece, we argue no—individuals have the right to be an exception to a data-driven rule. It is unreasonable to expect individuals to anticipate every data-driven rule that may be applied during their lifetime and either ensure that they are not an exception to each or be willing to accept the consequences. In fact, such a requirement is generally impossible to satisfy since data-driven rules are opaque and constantly evolving.7

Currently, however, decision makers are permitted to argue that excellent (or even good) average-case performance justifies the poor treatment of exceptions, placing the burden of proof on decision-subjects.8 Under the right to be an exception, the burden of proof shifts to the decision maker. It is their duty to ensure that the data-driven rule is appropriate for each decision-subject. This requirement can be summarized as follows: the decision maker should not presume that a data-driven rule is suitable for the given decision-subject. A decision maker should instead consider the possibility that the decision-subject is an exception, giving a level of consideration appropriate to the decision’s risk of harm (in some cases, the risk of harm may be so low that no more than cursory consideration is required). When the risk of harm is high, the decision maker should presume that the decision-subject is an exception, inflicting harm only if they are sufficiently confident that the decision-subject is not an exception.

In this case study, we examine the right to be an exception in detail. We split our discussion into two parts. In the first part, we motivate the right to be an exception, exploring why data-driven exceptions matter and how they fall through the cracks of existing frameworks. In the second part of this piece, we propose a framework that can be used to determine whether a data-driven rule respects or infringes on an individual’s right to be an exception. This framework consists of three ingredients—individualization, uncertainty, and harm—and we investigate why all three are needed to assess a data-driven rule. We emphasize the importance of uncertainty, which is often omitted in discussions of a data-driven rule’s suitability.

Back to Basics: How Do Data-Driven Exceptions Arise?

An exception is an instance that is not captured by a general rule or statement. One can therefore think of a data-driven exception as a decision-subject on which a data-driven rule fails. Before we delve into the details, we briefly describe four ways data-driven exceptions arise. The takeaway is that there are many causes of data-driven exceptions. It is not always the case that exceptions are the individuals who are poorly represented in the training data. (If they were, data-driven exceptions would be relatively predictable and, as a result, easy to handle.) Here are some of the most common causes of data-driven exceptions:

Sampling bias causes one type of sample to appear fewer times in the data than others.9 For instance, suppose that a medical condition is rare and appears only once in a data set collected over the general population. Unless a data-driven rule accounts for the fact that this condition is rare, training on this data set can result in poor performance on patients with this condition.
Model capacity is a measure of a model’s expressiveness.10 Exceptions can occur when a model’s capacity is too low to capture patterns beyond broad stroke generalizations.11 If the relationship between the input variables and the target variable is more complex than the expressiveness of the model, then the model must make simplifications.12 Such a model often performs well on some inputs but at the cost of performing poorly on others.
Distribution shift arises when a model is trained on samples drawn from one probability distribution but deployed on another.13 The resulting data-driven rule learns patterns that are in the training data but not necessarily in the test environment. For instance, one would not expect an algorithm that is trained on criminal justice data in the United States to perform well for criminal justice decisions in the United Kingdom. Those in the United Kingdom who do not fit the patterns learned from the training data are exceptions to the US-based rule.
Partial observability captures scenarios in which some relevant information is missing.14 For instance, suppose that two types of individuals exist, but a data-driven rule cannot distinguish between them due to missing information. Such a rule typically performs well on one type but not the other because, unable to tell them apart, it treats them similarly. For instance, if a computer science department bases graduate admissions on an applicant’s major and GPA, it may not admit qualified applicants who did not major in computer science but have relevant work experience.15
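The graduate admissions case of partial observability can be made concrete with a small sketch. Everything in it is hypothetical and chosen only for illustration: the `Applicant` fields, the 3.5 GPA cutoff, and the `is_qualified` ground truth are assumptions, not details from any real admissions process. The point is that two applicants who look identical to the rule receive the same decision even though only one of them is an exception.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    major: str
    gpa: float
    has_relevant_work_experience: bool  # exists in the world, but the rule never sees it


def admissions_rule(major: str, gpa: float) -> bool:
    """Hypothetical rule that observes only major and GPA."""
    return major == "cs" and gpa >= 3.5


def is_qualified(a: Applicant) -> bool:
    """Hypothetical ground truth, which also depends on the unobserved field."""
    return a.gpa >= 3.5 and (a.major == "cs" or a.has_relevant_work_experience)


applicants = [
    Applicant("cs", 3.8, False),
    Applicant("cs", 3.2, False),
    Applicant("math", 3.9, True),   # qualified through work experience
    Applicant("math", 3.9, False),  # looks identical to the rule, but not qualified
]

# An exception is an applicant for whom the rule's decision differs from the truth.
exceptions = [
    a for a in applicants
    if admissions_rule(a.major, a.gpa) != is_qualified(a)
]

for a in exceptions:
    print("exception:", a)
```

The two mathematics applicants share the same observed covariates, so no rule restricted to major and GPA can treat them differently. Whichever decision it makes for the pair, it is wrong for one of them.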

Sampling bias, model (in)capacity, distribution shift, and partial observability illustrate that there are many ways data-driven exceptions surface. But what makes them different from human-driven exceptions? Why do they deserve special attention?

Why Data-Driven Exceptions Matter

In some sense, exceptions are exceptions, regardless of their origins. After all, both humans and machines use rules to make decisions, and these rules inevitably make mistakes. There are, however, several reasons why data-driven exceptions deserve special attention.

For one, averages appear in every part of the data-driven decision-making pipeline, making exceptions unavoidable. The most popular data-driven methods—including maximum likelihood estimation16 and empirical risk minimization17—use averages as their objective functions. Not only that, but the most common performance metrics—including accuracy, precision, and recall—are, in fact, averages.18 The pervasive use of averages matters because, by definition, averages work well on average, but there is no guarantee that they work well for a specific individual. In fact, when the population is heterogeneous, an average-based rule always yields exceptions. The message is not that we must abandon averages, but that data-driven assessments should be viewed for what they are. Indeed, an algorithm that is known to work well in aggregate should be applied with caution when the outcome of each individual matters.19

That modern machine learning is built on averages does not, on its own, motivate special attention to data-driven exceptions. After all, humans also rely on averages. Data-driven rules, however, can be applied rapidly and repeatedly, which turns an isolated oversight into a systemic one. Imagine, for example, that an individual is looking for a job and that all their potential employers use a popular algorithmic hiring tool to sift through résumés. Suppose that although the tool works well in general, it vastly underestimates this particular applicant’s qualifications. While no tool is expected to be perfect, the scale at which data-driven algorithms are deployed means that this applicant may never receive a job offer.20 Humans, by comparison, make decisions more slowly and in a less patterned manner. As Arvind Narayanan once wrote, humans may make mistakes, but they do so in a diverse way:21 although one employer may not like an applicant’s qualifications, another may give them a chance.22 In this way, data-driven decisions can inflict systemic, long-term harm on an individual.

On top of all this, data-driven exceptions are difficult to anticipate and understand. While the reasoning behind human-driven decisions can often be surfaced through intuition, the “reasoning” behind data-driven decisions is so opaque that there are multiple research fields dedicated to interpreting them.23 For example, adding a data point to an algorithm’s training set can, in some cases, completely change its predictions.24 In other cases, slightly perturbing the input in ways that are negligible to humans can flip a data-driven prediction.25 This barrier to anticipating and understanding data-driven exceptions has several implications. First, it makes frameworks designed to catch and rectify exceptions in human-driven decision contexts difficult to apply in data-driven contexts. Second, it supports the right to be an exception. Indeed, it is unreasonable to hold individuals responsible for being data-driven exceptions if individuals are unable to intuit or even keep up with the data-driven rule.

Together, the fact that data-driven exceptions are (i) unavoidable, (ii) capable of inflicting significant and repeated harm, as well as (iii) difficult to anticipate and understand suggests that we not only need a framework for protecting data-driven exceptions, but also that it is unreasonable to hold data-driven exceptions responsible when they—through no fault of their own—fail to be captured by a data-driven rule.

Why Exceptions Fall through the Gaps of Existing Frameworks

The realization that data-driven exceptions matter is, of course, not new. There are many works in law, philosophy, and computer science that study the downfalls of data-driven decision-making and their consequences.26 These works have covered enormous ground, highlighting issues that arise during the application of data-driven technologies and their governance. So, is the right to be an exception to a data-driven rule addressed by existing frameworks? To answer this question, we examine several approaches to the governance of data-driven decisions.

One approach is abolitionist: that data-driven algorithms should not make decisions when the stakes are sufficiently high. The argument goes: decisions without a human in the loop rob individuals of their agency and leave them vulnerable to “inhumane machine logic.”27 Many worry that a machine decision simply cannot incorporate the same nuance and values that a human could.28 The General Data Protection Regulation (GDPR), for instance, states that individuals have “the right not to be subject to a decision based solely on automated processing.”29 Such proposals are often motivated by the same concerns expressed in this piece—that data-driven rules can fail in unexpected ways that can cause significant harm.30 However, the “right to a human decision” leaves a large gap. For one, it is unlikely that all data-driven decisions will have humans in the loop. Perhaps more importantly, adding a human in the loop does not remove the influence of a data-driven decision aid. As such, data-driven exceptions will continue to arise, even with human participation.

In an effort to bring data-driven decisions into alignment with human values, many have asked that data-driven rules be “fair.” One notion of fairness that has gained popularity is group fairness, which requires that a data-driven rule not discriminate between groups in some pre-defined sense.31 For instance, one might require that the proportion of Black defendants granted parole is equal (or close) to the proportion of White defendants granted parole in order to ensure that the algorithm is fair with respect to race.32 Another popular notion of fairness is individual fairness, which asks that individuals with similar covariates be treated similarly.33 For instance, it asks that defendants who commit similar crimes under similar circumstances receive similar sentences. While these two conceptions of fairness bring data-driven outcomes closer to what society considers “right,” neither guarantees good performance because both are inherently comparative. As an extreme example, an algorithm that makes the wrong prediction on every decision-subject satisfies both group and individual fairness: all groups and individuals are treated equally poorly.
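The extreme example can be checked on a toy data set. The sketch below is illustrative only: the eight records are made up, the two groups are assumed to have equal base rates, and group fairness is measured as the gap in positive-prediction rates between groups (one common formalization among several). Under those assumptions, a rule that is wrong on every record shows no disparity at all.

```python
# Each record is (group, true_outcome). Both groups have the same base rate (assumed).
records = [
    ("A", 1), ("A", 0), ("A", 1), ("A", 0),
    ("B", 1), ("B", 1), ("B", 0), ("B", 0),
]

# A rule that is wrong on every decision-subject: it predicts the opposite outcome.
predictions = [1 - outcome for _, outcome in records]


def rate(values):
    """Mean of an iterable of 0/1 values."""
    values = list(values)
    return sum(values) / len(values)


def positive_rate(group):
    """Share of the group that receives a positive prediction."""
    return rate(p for (g, _), p in zip(records, predictions) if g == group)


def accuracy(group):
    """Share of the group on which the prediction matches the true outcome."""
    return rate(int(p == y) for (g, y), p in zip(records, predictions) if g == group)


parity_gap = abs(positive_rate("A") - positive_rate("B"))

print("parity gap:", parity_gap)
print("accuracy A:", accuracy("A"), "accuracy B:", accuracy("B"))
```

The comparative check passes because it only asks whether groups are treated alike, not whether anyone is treated correctly.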

In response to this observation, various scholars have suggested performance-based requirements—for instance, that a data-driven rule be accurate or calibrated.34 While essential, these performance metrics imply good performance on average. They do not imply, however, that a data-driven rule performs well in a specific decision context. Consider the following examples.

Example 1: High accuracy can be misleading. High accuracy indicates that an algorithm performs well in a high proportion of decision contexts.35 Yet a data-driven rule that is 95 percent accurate will still get it wrong 5 percent of the time. There is no guarantee that the specific decision-subject under consideration is not in the 5 percent. In this way, high accuracy on its own is not enough to uphold an individual’s right to be an exception.
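A minimal numerical sketch of Example 1 follows. The population of 100 decision-subjects and the "typical" and "atypical" labels are hypothetical, chosen so that the overall accuracy matches the 95 percent figure in the text while all of the errors fall on one small subgroup.

```python
# Each entry is (subgroup, rule_was_correct). The split is an illustrative assumption.
population = [("typical", True)] * 95 + [("atypical", False)] * 5


def accuracy(entries):
    """Fraction of entries on which the rule was correct."""
    entries = list(entries)
    return sum(correct for _, correct in entries) / len(entries)


overall_accuracy = accuracy(population)
atypical_accuracy = accuracy(e for e in population if e[0] == "atypical")

print("overall accuracy:", overall_accuracy)
print("accuracy on atypical subgroup:", atypical_accuracy)
```

The aggregate figure is the same whether the 5 percent of errors are spread evenly or concentrated on a handful of people, so it says little about any one decision-subject.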

Example 2: A calibrated rule can take the easy way out. Good calibration implies that prediction frequencies match true frequencies in the long term.36 That is, if a rule predicts an outcome occurs with X percent probability, it occurs X percent of the time in the long term. However, a rule can be calibrated without being fit for the specific decision context under consideration. Suppose, for instance, that it rains 10 percent of days of the year. A rule that simply predicts, on each day, that the chance of rain is 10 percent is perfectly calibrated, but it does not incorporate day-specific information (e.g., the weather conditions that day). As such, calibration does not guarantee that a data-driven rule is appropriate for a specific decision context.
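Example 2 can also be sketched numerically. The data below are invented for illustration: 1,000 days, 100 of them rainy, with a hypothetical "humid" or "dry" label standing in for day-specific information. The Brier score (mean squared difference between forecast and outcome) is not part of the original discussion; it is used here only as one standard way to show that two calibrated rules can differ in how informative they are.

```python
from collections import defaultdict

# (day_type, rained) pairs: 200 humid days (90 rainy) and 800 dry days (10 rainy).
days = (
    [("humid", 1)] * 90 + [("humid", 0)] * 110
    + [("dry", 1)] * 10 + [("dry", 0)] * 790
)


def constant_rule(day_type):
    """Ignores the day entirely and always forecasts the yearly base rate."""
    return 0.10


def informed_rule(day_type):
    """Uses day-specific information (hypothetical humidity label)."""
    return 0.45 if day_type == "humid" else 0.0125


def calibration_table(rule):
    """Map each forecast value to the observed rain frequency on days with that forecast."""
    buckets = defaultdict(list)
    for day_type, rained in days:
        buckets[rule(day_type)].append(rained)
    return {forecast: sum(v) / len(v) for forecast, v in buckets.items()}


def is_calibrated(rule, tol=1e-9):
    """True if every forecast value matches its observed frequency."""
    return all(abs(f - freq) <= tol for f, freq in calibration_table(rule).items())


def brier_score(rule):
    """Mean squared error of the forecast; lower is better."""
    return sum((rule(t) - rained) ** 2 for t, rained in days) / len(days)


for rule in (constant_rule, informed_rule):
    print(rule.__name__, is_calibrated(rule), round(brier_score(rule), 6))
```

Both rules pass the calibration check, yet only the second one responds to the conditions on a given day.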

A Framework for the Right to Be an Exception

We can therefore revisit the question: Is the right to be an exception addressed by existing proposals? Our discussion suggests that, no, requiring human oversight, algorithmic fairness, high accuracy, or calibration does not prevent an individual from being held responsible for being a data-driven exception. That is, we lack a framework—a language—with which to discuss and protect data-driven exceptions. For instance, as we saw in Example 1, if a decision-subject is only permitted to legally contest a data-driven rule on the basis of accuracy, as long as a data-driven rule performs well on the vast majority of individuals, it is highly accurate and therefore uncontestable, even for individuals on which it performs poorly.

In the next few sections, we argue that the right to be an exception can be upheld by considering three components of a data-driven rule: individualization, uncertainty, and harm. Notably, these three factors are not only meaningful to humans, but also meaningful concepts in machine learning. They therefore provide a clear language with which to assess, justify, or contest data-driven rules. Via these three components, we provide a concrete framework by which a decision maker can determine when a data-driven rule is appropriate or a decision-subject can determine whether to contest a data-driven decision.

The framework can be summarized as follows. Recall that, under the right to be an exception, the presumption should not be that a data-driven rule is suitable for the given decision-subject. Rather, a decision maker should only make a decision that inflicts harm if they are sufficiently confident (relative to the risk of harm) that the data-driven rule is appropriate for the decision-subject. To this end, individualization is important as it moves data-driven rules away from generalizations and toward more fine-grained assessments, and uncertainty measures the decision maker’s confidence in the data-driven recommendation. Combining these three ingredients, the right to be an exception is upheld if the rule’s levels of individualization and certainty are high enough to justify the harm that would follow from the data-driven recommendation. In the next few sections, we unpack this framework and explain why all three ingredients are needed to uphold the right.

Importantly, a decision-subject can claim the right to be an exception even if the data-driven rule does not make a mistake. While a mistake occurs when a prediction does not match the observed outcome, the right to be an exception can be violated without observing the true outcome of interest. It can be violated when a data-driven rule that is not appropriate for a decision-subject is applied to them. For example, consider a judge deciding whether to deny parole. This decision is largely based on whether the judge believes the defendant will commit a crime if released. If a defendant is denied parole, they cannot prove that they would not have committed a crime if they had been granted parole instead. In this case, it is impossible to determine if the decision to deny parole was a mistake. One can, however, determine whether the decision to deny parole was justified under the circumstances using the proposed framework.

Individualization: Moving from the Aggregate to the Individual

The first ingredient in our framework is individualization: the process of tailoring a rule such that it performs well on the specific decision-subject under consideration.37 For many, individualization is the natural first step to designing a data-driven rule that is appropriate for specific decision-subjects. Indeed, Lippert-Rasmussen argues for the right to be treated as an individual in order to reduce statistical discrimination (treating an individual as if they are the statistical average of similar individuals).38

Individualization involves two components: (i) more fine-grained consideration and (ii) improved performance. That is, the more individualized a rule, the more suitable it is for a particular individual.39 For example, one way to make a data-driven rule more individualized is to add features, or inputs, to the model. Holding everything else equal, a data-driven rule that uses an applicant’s age, home address, and occupation in order to decide whether to grant a loan is more individualized than one that uses only their age and home address if these additional covariates improve the rule. Another way to individualize a rule is to incorporate more training data that is relevant to the decision-subject.

Individualization is a particularly useful concept because it appears in both legal and technical texts.40 As such, a law requiring individualization in data-driven rules would pave a clear path for computer scientists. Indeed, much of machine learning echoes the belief that, with enough information and data, a data-driven rule can predict the target outcome with perfect accuracy. Individualization has become so central to machine learning that data-driven rules are often justified based on their level of individualization. Many theorems in machine learning, for instance, follow the template: “As N goes to infinity, the estimate becomes optimal,” where N quantifies the number of covariates or data samples.

There are, however, limits to individualization. For one, there may be privacy concerns associated with individualization (e.g., feeding a data-driven rule detailed information about decision-subjects). For another, current data-driven methods are incapable of individualizing in ways that humans do naturally. A judge, for example, can update their belief when given novel information about a defendant (e.g., that the defendant volunteers or has dependents). In contrast, most (if not all) data-driven rules have fixed inputs and cannot incorporate features that are not present in the training data. In addition, attempts to individualize a rule do not always work; some unintentionally hurt performance. For instance, adding covariates (without regularizing) can cause what is known as overfitting: performing extremely well in training but poorly when deployed.

Although individualization has its limits, it brings us one step closer to the right to be an exception because it discourages the use of coarse data-driven rules and encourages attention to each decision-subject’s circumstances. For example, objecting to the use of a data-driven rule because it disregards an important piece of information is an objection on the basis of poor individualization. So, is the right to be an exception simply the right to individualized data-driven rules?

Individualization Is Not Enough: Uncertainty Also Matters

No—individualization is not the only pertinent factor in upholding the right to be an exception. There are two additional components: uncertainty and harm, and we focus on the former in this section. The takeaway is that while individualizing a data-driven rule takes an important step toward ensuring that it does not neglect relevant information, no amount of individualization can remove all the uncertainty in a data-driven rule, and this uncertainty matters when the risk of harm is high.

Recall that individualization relies on the belief that, holding everything else equal, adding information improves a data-driven rule. Conveniently, this reasoning also underlies machine learning, which is founded on the idea that data is king (i.e., that with enough information, a data-driven rule can perform perfectly). In reality, however, even the best data-driven models make mistakes. There are very few (if any) meaningful settings in which a perfect rule exists, and the main barrier is uncertainty.

To illustrate this point, consider the following two types of uncertainty:41

Epistemic uncertainty is systematic or reducible uncertainty that arises from lack of knowledge. For example, a prediction of tomorrow’s temperature that is based on past years’ temperatures at this time of year has greater epistemic uncertainty than the prediction of tomorrow’s temperature based on past years’ temperature at this time of year and today’s temperature.
Aleatoric uncertainty is statistical or irreducible uncertainty that arises from the inherent randomness or “unknowability” of an event. At the time of prediction, no information exists that can reduce this type of uncertainty. For example, the randomness in the wind patterns that may occur between today and tomorrow prevents a temperature prediction that is made today from being perfectly certain about tomorrow’s temperature, and this randomness can be attributed to aleatoric uncertainty.
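The temperature example can be turned into a small exact calculation. The numbers are assumptions made purely for illustration: a seasonal baseline of 15 degrees, a deviation of today's temperature from that baseline of plus or minus 3, and a wind effect of plus or minus 1 that cannot be known today. Enumerating all four equally likely scenarios shows how much error is removed by adding today's temperature (epistemic) and how much remains no matter what is known today (aleatoric).

```python
from itertools import product

BASELINE = 15.0                 # seasonal average (hypothetical)
TODAY_ANOMALIES = (-3.0, 3.0)   # today's deviation from the baseline; knowable today
WIND_SHOCKS = (-1.0, 1.0)       # overnight wind effect; not knowable today

# All four scenarios are treated as equally likely.
scenarios = list(product(TODAY_ANOMALIES, WIND_SHOCKS))


def tomorrow(anomaly, shock):
    """Tomorrow's temperature under this toy model."""
    return BASELINE + anomaly + shock


def mean_squared_error(predict):
    """Exact expected squared error of a predictor that may use today's anomaly."""
    return sum(
        (predict(anomaly) - tomorrow(anomaly, shock)) ** 2
        for anomaly, shock in scenarios
    ) / len(scenarios)


def season_only(anomaly):
    """Uses past years' temperatures only, ignoring today."""
    return BASELINE


def season_plus_today(anomaly):
    """Also uses today's temperature."""
    return BASELINE + anomaly


mse_coarse = mean_squared_error(season_only)
mse_fine = mean_squared_error(season_plus_today)

print("error using the season only:", mse_coarse)
print("error after adding today's temperature:", mse_fine)
print("reduction (epistemic part):", mse_coarse - mse_fine)
```

In this toy model the remaining error equals the variance of the wind effect, which no covariate available today can explain away.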

Through these two types of uncertainty, it becomes clear that while individualization may reduce epistemic uncertainty, it cannot reduce aleatoric uncertainty. Consider the following example.

Example 3: The unknowability of unobserved outcomes. Consider a data-driven decision aid for college admissions, specifically one that predicts how well a student will perform if admitted. Beyond random noise, there are multiple ways that aleatoric uncertainty arises. Even if the student is similar to previous students for which there is data, one could argue that a student's performance is not predetermined, that is, that they have the ability to perform differently from past individuals, and a future student should not be held responsible for the performance of a previous (albeit similar) student.42 Another way aleatoric uncertainty arises is through selection bias: the training data only contains the performance of students who were admitted, and the performance of a student who was not admitted is unknowable.43 Perhaps a student who was not admitted would have performed very well. Moreover, even if a decision maker had perfect knowledge of previous students' outcomes, any decision that is made now can only use information obtained up until this moment. There are, however, countless factors (or, in the language of causal inference, “interventions”) that could influence a student’s performance between the time of acceptance and graduation, including whether they receive tutoring, which students they befriend, whether they take a part-time job, and more.

Weighing Individualization and Certainty against the Risk of Harm

In short, individualization can, at best, remove epistemic uncertainty, but no amount of individualization can remove aleatoric uncertainty.44 We argue, however, that eliminating uncertainty is beside the point. It is unreasonable to ask for a perfect data-driven rule that makes no mistakes. Instead, the right to be an exception asks that the levels of individualization and certainty be balanced against the risk of harm.

More precisely, suppose that one of the decision outcomes would inflict significant harm. Then, no matter how individualized a decision rule may be, the decision to inflict harm should only follow if the level of certainty is high enough. If, on the other hand, the level of uncertainty (epistemic and aleatoric, combined) is too high, then the decision maker should err on the side of caution (less harm). As an extreme example, suppose a decision maker is presented with a newborn and must decide whether to confine them based on an evaluation of whether they will commit murder during their lifetime. The decision is made at the time of birth, so it can rely only on information that is available at the time of birth. A rule could be perfectly individualized (based on the information at the time of birth), but most would agree that there are so many unknowable factors that could contribute to the newborn’s future actions that no amount of individualization would justify inflicting a harm as severe as confining a newborn for life.
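One way to picture this balancing is as a certainty threshold that rises with the harm at stake. The sketch below is a hypothetical illustration only: the case study does not prescribe a formula, and the harm scale in [0, 1], the linear schedule, and the names `required_certainty` and `recommend` are all assumptions introduced here.

```python
def required_certainty(harm, base=0.5, ceiling=0.999):
    """Minimum certainty needed before acting on a harmful recommendation.

    harm is a hypothetical score in [0, 1]; the linear schedule between
    base and ceiling is an assumption made purely for illustration.
    """
    if not 0.0 <= harm <= 1.0:
        raise ValueError("harm must be in [0, 1]")
    return base + (ceiling - base) * harm


def recommend(certainty, harm):
    """Follow the data-driven rule only if certainty clears the bar set by harm."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be in [0, 1]")
    if certainty >= required_certainty(harm):
        return "follow rule"
    # Too uncertain for the stakes: err on the side of caution (less harm).
    return "choose less harmful option"


if __name__ == "__main__":
    print(recommend(certainty=0.9, harm=0.2))  # low stakes
    print(recommend(certainty=0.9, harm=1.0))  # very high stakes
```

In this sketch, the same level of certainty can justify a low-harm recommendation but not a high-harm one. In the newborn example, harm sits at the top of the scale while aleatoric uncertainty caps the achievable certainty, so the bar is never cleared.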

A decision outcome’s risk of harm therefore determines the amount of individualization and certainty necessary to utilize a data-driven rule whose recommendation inflicts harm. It is then natural to ask: How should harm be measured? While providing an explicit framework for quantifying harm is outside the scope of this work, we note that prior works have laid out a path for doing so, including Wachter and Mittelstadt’s work on the right to reasonable inferences (in which they discuss the determination of “high-risk inferences”) and Kaminski and Urban’s right to contest AI (in which they characterize risk of harm in terms of “significant effects”).45 The European Union’s Artificial Intelligence Act also provides a “risk methodology” for evaluating and categorizing high-risk decision contexts.

Putting It All Together

Data-driven decision aids are increasingly used to make decisions in hiring, lending, criminal justice, education, health care, and more. Each decision aid is governed by a data-driven rule that defines the relationship between the inputs to the decision aid and the recommendations it issues. But, to every rule, there are exceptions. While a data-driven rule may work well for some individuals, it may not work well for all.

At the moment, decision-subjects are often held responsible for the failures of data-driven rules. In particular, data-driven decision makers are permitted to argue that excellent average-case performance justifies the poor treatment of exceptions, placing the burden of proof on decision-subjects.46 The right to be an exception argues that a decision-subject who finds themself—through no fault of their own—poorly captured by a data-driven rule should not be held responsible for being a data-driven exception. As long as an individual makes decisions that are lawful and responsible, they should be able to live free from worry that their actions may lead them to be a data-driven exception (and to the consequences that come with it).

The right to be a data-driven exception therefore places a duty on data-driven decision makers. It says that the presumption should not be that a data-driven rule is fit for use on the decision-subject in question. Rather, a decision maker should consider the possibility that the decision-subject is an exception, giving a level of consideration appropriate to the decision’s risk of harm. When the risk of harm is high, the decision maker should presume that the decision-subject is an exception, inflicting harm only if they are sufficiently confident that the decision-subject is not an exception. To make this duty concrete, we provide a framework for assessing, justifying, and contesting the suitability of a data-driven rule for a given decision-subject. The framework consists of three components: individualization, uncertainty, and harm. The right to be an exception is upheld if the data-driven rule’s levels of individualization and certainty are high enough to justify the harm that would follow from the data-driven rule’s recommendation.

Importantly, the right does not imply that data-driven rules should be dropped altogether, nor does it suggest that they be used all the time. The right does not even suggest that there is a clear line between the types of decisions in which data-driven rules are appropriate (e.g., that data-driven decision aids should be used in lending but not sentencing). Rather, the right argues that there are some contexts in which the stakes are so high that each decision-subject deserves appropriate consideration of whether the data-driven rule is fit for them. In the same way that certain information is discarded as irrelevant (e.g., a college admissions board may discard a student’s sophomore fall grades if a family tragedy occurred that semester), a data-driven recommendation may need to be discarded. While useful, this analogy does not carry over perfectly because it is unclear when to discard a data-driven rule. Data-driven rules behave quite differently from human ones; for instance, the “intent” and “reasoning” behind a data-driven recommendation are often inscrutable.

Although outside the scope of this piece, such a right could be accompanied by ex-ante requirements on data-driven decision rules (e.g., that they provide an evaluation of their harm, individualization, and uncertainty) as well as a procedure for contesting data-driven decisions based on this right.47

Discussion Questions

What is a data-driven rule, and what does it mean to be a data-driven exception? Is an exception the same as an error?
In addition to those listed above, what other factors differentiate data-driven decisions from human ones?
Beyond what is discussed above, what are some of the benefits and downsides of individualization?
Why is uncertainty so critical to the right to be an exception? When the stakes are high (e.g., in criminal sentencing), is there any evaluation metric (e.g., accuracy) that can justify the use of a data-driven rule without the consideration of uncertainty?
How would you go about quantifying and balancing the three components of the proposed framework: individualization, uncertainty, and harm?
Are there other components that you would add to the framework proposed in this piece?

Bibliography

Ajunwa, I. “An Auditing Imperative for Automated Hiring Systems.” Harvard Journal of Law and Technology 34, no. 2 (2021): 622–85. https://jolt.law.harvard.edu/assets/articlePDFs/v34/5.-Ajunwa-An-Auditing-Imperative-for-Automated-Hiring-Systems.pdf.

Alexander, M. The New Jim Crow: Mass Incarceration in the Age of Colorblindness. New York: New Press, 2010.

Barocas, S., M. Hardt, and A. Narayanan. Fairness and Machine Learning. fairmlbook.org, 2019. https://fairmlbook.org/.

Barocas, S., and A. D. Selbst. “Big Data’s Disparate Impact.” California Law Review 104 (2016): 671–732. http://dx.doi.org/10.15779/Z38BG31.

Benjamin, R. Race After Technology: Abolitionist Tools for the New Jim Code. New York: Polity Press, 2018.

Berry, W. W., III. “Individualized Sentencing.” Washington & Lee Law Review 76 (2019): 13–92. https://scholarlycommons.law.wlu.edu/wlulr/vol76/iss1/4.

Bishop, C. M. Pattern Recognition and Machine Learning. New York: Springer, 2016.

Bommasani, R., K. A. Creel, A. Kumar, D. Jurafsky, and P. Liang. “Picking on the Same Person: Does Algorithmic Monoculture Lead to Outcome Homogenization?” Preprint submitted November 25, 2022. https://doi.org/10.48550/arXiv.2211.13972.

Chen, X., C. Liu, B. Li, K. Lu, and D. Song. “Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning.” Preprint submitted December 25, 2017. https://doi.org/10.48550/arXiv.1712.05526.

Chouldechova, A. “Fair Prediction with Disparate Impact: A Study of Bias in Recidivism Prediction Instruments.” Big Data 5, no. 2 (2017): 153–63. https://doi.org/10.1089/big.2016.0047.

Citron, D. K. “Technological Due Process.” Washington University Law Review 85 (2008): 1249–314. https://openscholarship.wustl.edu/law_lawreview/vol85/iss6/2.

Citron, D. K., and F. Pasquale. “The Scored Society: Due Process for Automated Predictions.” Washington Law Review 89 (2014): 1–33. https://digitalcommons.law.uw.edu/wlr/vol89/iss1/2.

Creel, K., and D. Hellman. “The Algorithmic Leviathan: Arbitrariness, Fairness, and Opportunity in Algorithmic Decision-Making Systems.” Canadian Journal of Philosophy 52, no. 1 (2022): 26–43. https://doi.org/10.1017/can.2022.3.

Cuddeback, G., E. Wilson, J. G. Orme, and T. Combs-Orme. “Detecting and Statistically Correcting Sample Selection Bias.” Journal of Social Service Research 30, no. 3 (2004): 19–33. https://doi.org/10.1300/J079v30n03_02.

Dawid, A. P. “The Well-Calibrated Bayesian.” Journal of the American Statistical Association 77 (1982): 605–10. https://doi.org/10.1080/01621459.1982.10477856.

Dawid, P. “On Individual Risk.” Synthese 194 (2017): 3445–74. https://doi.org/10.1007/s11229-015-0953-4.

Devroye, L., L. Györfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition (Vol. 31, Stochastic Modelling and Applied Probability). New York: Springer Science & Business Media, 2013.

Doshi-Velez, F., and B. Kim. “Towards a Rigorous Science of Interpretable Machine Learning.” Preprint submitted March 2, 2017. https://doi.org/10.48550/arXiv.1702.08608.

Dwork, C., M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. “Fairness Through Awareness.” In ITCS’12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, edited by Shafi Goldwasser, 214–26. New York: ACM, 2012. https://doi.org/10.1145/2090236.2090255.

Eubanks, V. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York: St. Martin’s, 2018.

Fawcett, T. “An Introduction to ROC Analysis.” Pattern Recognition Letters 27, no. 8 (2006): 861–74. https://doi.org/10.1016/j.patrec.2005.10.010.

General Data Protection Regulation. Complete Guide to GDPR Compliance. GDPR.EU, 2016. https://gdpr.eu/.

Gilpin, L. H., D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, and L. Kagal. “Explaining Explanations: An Overview of Interpretability of Machine Learning.” In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics, 80–89. Piscataway, NJ: IEEE, 2018. https://doi.org/10.1109/DSAA.2018.00018.

Goodfellow, I. J., J. Shlens, and C. Szegedy. “Explaining and Harnessing Adversarial Examples.” Preprint submitted March 20, 2015. https://doi.org/10.48550/arXiv.1412.6572.

Gray, M. L., and S. Suri. Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass. Boston: Houghton Mifflin Harcourt, 2019.

Hashimoto, T., M. Srivastava, H. Namkoong, and P. Liang. “Fairness without Demographics in Repeated Loss Minimization.” Proceedings of the 35th International Conference on Machine Learning (ICML). PMLR 80 (2018): 1929–38. https://proceedings.mlr.press/v80/hashimoto18a.html.

Hébert-Johnson, U., M. Kim, O. Reingold, and G. Rothblum. “Multicalibration: Calibration for the (Computationally-Identifiable) Masses.” Proceedings of the 35th International Conference on Machine Learning (ICML). PMLR 80 (2018): 1939–48. https://proceedings.mlr.press/v80/hebert-johnson18a.html.

Hu, X., L. Chu, J. Pei, W. Liu, and J. Bian. “Model Complexity of Deep Learning: A Survey.” Preprint submitted August 3, 2021. https://arxiv.org/abs/2103.05127.

Huq, A. Z. “A Right to a Human Decision.” Virginia Law Review 106, no. 3 (2020): 611–88. https://www.virginialawreview.org/articles/right-human-decision/.

Jorgensen, R. “Algorithms and the Individual in Criminal Law.” Canadian Journal of Philosophy 52, no. 1 (2021): 61–77. https://doi.org/10.1017/can.2021.28.

Kaelbling, L. P., M. L. Littman, and A. R. Cassandra. “Planning and Acting in Partially Observable Stochastic Domains.” Artificial Intelligence 101, nos. 1–2 (1998): 99–134. https://doi.org/10.1016/S0004-3702(98)00023-X.

Kalman, R. E. “Mathematical Description of Linear Dynamical Systems.” Journal of the Society for Industrial and Applied Mathematics, Series A: Control 1, no. 2 (1963): 152–92. https://doi.org/10.1137/0301010.

Kaminski, M. E., and J. M. Urban. “The Right to Contest AI.” Columbia Law Review 121, no. 7 (2021): 1957–2048. https://columbialawreview.org/content/the-right-to-contest-ai/.

Kendall, A., and Y. Gal. “What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?” In NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems, edited by U. von Luxburg, I. Guyon, S. Bengio, H. Wallach, and R. Fergus, 5580–90. Red Hook, NY: Curran Associates, 2017. https://dl.acm.org/doi/10.5555/3295222.3295309.

Kleinberg, J., H. Lakkaraju, J. Leskovec, J. Ludwig, and S. Mullainathan. “Human Decisions and Machine Predictions.” The Quarterly Journal of Economics 133, no. 1 (2018): 237–93. https://doi.org/10.1093/qje/qjx032.

Kleinberg, J., and M. Raghavan. “Algorithmic Monoculture and Social Welfare.” PNAS 118, no. 22 (2021): e2018340118. https://doi.org/10.1073/pnas.2018340118.

Koh, P. W., S. Sagawa, H. Marklund, S. M. Xie, M. Zhang, A. Balsubramani, W. Hu et al. “WILDS: A Benchmark of in-the-Wild Distribution Shifts.” Proceedings of the 38th International Conference on Machine Learning (ICML). PMLR 139 (2021): 5637–64. https://proceedings.mlr.press/v139/koh21a.html.

Lippert-Rasmussen, K. “‘We Are All Different’: Statistical Discrimination and the Right to Be Treated as an Individual.” The Journal of Ethics 15 (2011): 47–59. https://doi.org/10.1007/s10892-010-9095-6.

Lockett v. Ohio, 438 U.S. 586 (U.S. Supreme Court July 3, 1978). https://supreme.justia.com/cases/federal/us/438/586/.

Madry, A., A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. “Towards Deep Learning Models Resistant to Adversarial Attacks.” Preprint submitted September 4, 2019. https://doi.org/10.48550/arXiv.1706.06083.

National Research Council, Committee on Law and Justice. The Growth of Incarceration in the United States: Exploring Causes and Consequences. Washington, DC: National Academies Press, 2014. https://nap.nationalacademies.org/catalog/18613/the-growth-of-incarceration-in-the-united-states-exploring-causes.

O'Neil, C. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Crown, 2017.

Perdomo, J., T. Zrnic, C. Mendler-Dünner, and M. Hardt. “Performative Prediction.” Proceedings of the 37th International Conference on Machine Learning (ICML). PMLR 119 (2020): 7599–609. https://proceedings.mlr.press/v119/perdomo20a.html.

Pleiss, G., M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger. “On Fairness and Calibration.” In NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, edited by U. von Luxburg, I. Guyon, S. Bengio, H. Wallach, and R. Fergus, 5684–93. Red Hook, NY: Curran Associates, 2017. https://dl.acm.org/doi/10.5555/3295222.3295319.

Rosen, J. “Right to Be Forgotten.” Stanford Law Review Journal 64 (2012): 88–92. https://www.stanfordlawreview.org/online/privacy-paradox-the-right-to-be-forgotten/.

State v. Loomis, 881 N.W.2d 749 (Wisconsin Supreme Court July 13, 2016). https://caselaw.findlaw.com/wi-supreme-court/1742124.html.

Steinhardt, J., P. W. Koh, and P. S. Liang. “Certified Defenses for Data Poisoning Attacks.” In NIPS’17: Proceedings of the 31st International Conference on Neural Information Processing Systems, edited by U. von Luxburg, I. Guyon, S. Bengio, H. Wallach, and R. Fergus, 3520–32. Red Hook, NY: Curran Associates, 2017. https://dl.acm.org/doi/10.5555/3294996.3295110.

Vapnik, V., E. Levin, and Y. Le Cun. “Measuring the VC-Dimension of a Learning Machine.” Neural Computation 6, no. 5 (1994): 851–76. https://doi.org/10.1162/neco.1994.6.5.851.

Wachter, S., and B. Mittelstadt. “A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI.” Columbia Business Law Review no. 2 (2019): 494–620. https://doi.org/10.7916/cblr.v2019i2.3424.

Wolfram, S. A New Kind of Science. Boston: Wolfram Media, 2002.

Woodson v. North Carolina, 428 U.S. 280 (U.S. Supreme Court 1976). https://supreme.justia.com/cases/federal/us/428/280/.

The Right to Be an Exception to a Data-Driven Rule