The Ethical Algorithm, with Michael Kearns

November 11, 2019

In the span of a generation, algorithms have gone from mathematical abstractions to powerful mediators of everyday life. They have made our lives more efficient, yet they increasingly encroach on our basic rights. University of Pennsylvania professor Michael Kearns shares some ideas on how to better embed human principles into machine code without impeding the progress of data-driven scientific discovery.

This talk was accompanied by a PowerPoint presentation.

MICHAEL KEARNS: Thank you all for coming. My name is Michael Kearns.

This talk is about a book that I wrote with a close friend and colleague at Penn, Aaron Roth. Both of us are card-carrying career researchers in artificial intelligence (AI) generally, but specifically in machine learning, more on the algorithm design side, thinking about what the right core algorithms are and what the principles underlying those algorithms are. We do a fair amount of experimental research as well.

Many people in my field have watched the developments of the last 10 years or so with some combination of surprise and alarm. The field that I've been working in for many decades went from a relatively obscure corner of computer science to permeating all of society with algorithms, particularly algorithms that are the result of a learning process—machine learning—making very, very consequential decisions about the lives of ordinary individuals.

Just to make things very concrete here, I'm talking about things like algorithms or predictive models deciding things like: whether you get a loan or not, whether you are admitted to the college of your choice, human resource departments using these models to screen résumés, judges using risk-assessment models to decide whether an incarcerated individual should get parole or not, or what sentence they should receive in the first place. We have in a relatively short period of time—primarily due to the rise of the consumer Internet, which has allowed all of us the opportunity to provide incredibly granular data about our movements, our interests, our fears, our hopes, our medical records, what we're Googling, etc.—moved from making aggregate decisions about scientific systems like weather prediction or the directional movement of the stock market, to making very, very personalized decisions about you.

Machine learning has experienced a lot of amazing success in the last 20 years or so, and I would crudely characterize what has happened in the last 10 years—at least from a mainstream media standpoint—as follows: from 2010 to 2015 there was this rush of excitement as deep learning and related technologies made serious inroads into longstanding core technologies like speech recognition and image processing and the like. The last five years or so have been a little bit more of a buzzkill, as we've realized that those same systems and models can essentially engage in violations of privacy in a systematic way or result in algorithmic decision making that is discriminatory against racial, gender, or other groups.

These days it's difficult to pick up mainstream media publications and not find at least one article a day on these kinds of phenomena. Many of you may have seen this one just last week in Science magazine that got a lot of attention, where a model for predictive health care was systematically discriminating against black people.

So, why did we decide to write this book? There were a number of good books before us—three of which I've shown here—that we admire, that we think do a very good job to a lay audience describing what the problems are, but we felt like they were a little bit short on solutions. These books all do a very good job at pointing out the ways in which machine learning can violate individual privacy or notions of fairness in prediction, but when you get to "What should we do about this?" in these books—they all have a section that discusses this, usually toward the end—their answers are basically, "We need better laws, we need better regulations, we need better watchdog groups, we really have to keep an eye on this stuff."

We agree with all of that, but we also think that there are things that can be done at a technical level. In particular, if algorithmic misbehavior is largely the problem, we could think about making the algorithms better in the first place—better in the social sense—like not engaging in discriminatory behavior, leaks of private data, and the like.

We're not proposing these types of algorithmic solutions to algorithmic problems to the exclusion of these institutional solutions. It's just we think, This is what we know about, first of all. We're computer scientists, we're not policymakers, social workers, or legal scholars. We also think, This can be done right now. Institutional change, laws, and regulations take a lot of time. Companies like Google, Facebook, what have you, can make their algorithms better right now, and they actually know how to do it.

We are part of a growing subcommunity of the machine learning community that is taking these ideas seriously and thinking about, literally in the code of these algorithms, putting in conditions that prevent different types of anti-social behavior. We thought there was enough literature on this topic now and that it was interesting and important enough that we would write a general audience book, trying to explain what that underlying science looks like, what its promise is, and also what its limitations are.

In the one sense, it is a technologically optimistic book. We are technological optimists, we're not technological utopians. But we try to do a balanced job of saying: "This is what we know how to do now. This is what we don't know how to do now. This is what we think we'll know how to do 10 years from now, and these are some things that we think algorithms should never do." That's what the book is about.

Some people, when they hear the title of the book, think it's a conundrum or almost a contradiction. We've had the reaction before, like, "Ethical algorithms? Isn't that like discussing ethical hammers?" After all, a hammer is a tool designed by human beings for a particular purpose, and anything unethical or ethical about the use of a hammer, you can directly attribute to whoever wields that hammer. So, even though a hammer is designed for building stuff and pounding nails and all that good stuff, I could hit you on the hand with the hammer, and we might consider that an unethical use of this tool. But nobody would say, "Well, the hammer was unethical." You would ascribe it to the user of that tool, not to the tool itself.

We actually argue that algorithms are different. Yes, algorithms are human-designed artifacts and tools to solve specific problems, but especially the modern nature of algorithmic decision making when it's acting on particular individuals has a different moral character, a moral character that cannot be ascribed to the designer or the user of that algorithm.

At a high level, what I mean by that is the way algorithmic decision making is done these days is usually not by explicit programming. If I am a large institutional lender and I am trying to build a model to decide to whom to give mortgages or not, depending on the data that's in your mortgage application—and by the way, maybe a lot of other data about you that I got from somewhere else, like your social media activity. This is a thing if you didn't know it, to not just use the information on your application but to use any other data that I can buy about you to make that decision.

Instead of a human being sitting down and thinking, Okay, based on all these different variables I know about you, what should the rule be about who gets a loan or doesn't get a loan? That's not how it works. It's all done with machine learning.

So, I take a large, perhaps very high-dimensional, complicated data set describing previous loan applications that, let's say, I did grant a loan to, and an indication about whether you repaid the loan or not. This is my training data. I have these X/Y pairs; X is all the stuff I know about you, Y is whether you repaid the loan or not. And I want to use machine learning to learn a model to predict that, and that's what I'm going to use for making my decisions.

This complicated data set gets turned into some complicated objective function or "landscape" in which I'm trying to maximize some mathematically well-defined objective, which typically—almost exclusively these days—has to do with predictive accuracy. I say: "Find the neural network on this historical lending data set that minimizes the number of mistakes of prediction made," meaning people that the model would have denied the loan to who would have repaid the loan or people the model gave the loan to who didn't repay.
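To make that pipeline concrete, here is a minimal sketch of what such a training step can look like in code. This is not the speaker's example: the data is synthetic, the feature names are invented, and scikit-learn stands in for whatever a real lender would use; the point is only that the objective handed to the learner is pure predictive accuracy.

```python
# A minimal, hypothetical sketch of the training pipeline described above.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.normal(50_000, 20_000, n),   # income (invented feature)
    rng.integers(300, 850, n),       # credit score
    rng.integers(0, 30, n),          # years at current job
])
# Hypothetical "ground truth": 1 = repaid the loan, 0 = defaulted.
y = (0.00001 * X[:, 0] + 0.01 * X[:, 1] + 0.1 * X[:, 2]
     + rng.normal(0, 2, n) > 8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Find the neural network on this historical lending data that minimizes
# the number of prediction mistakes" -- and nothing else.
model = make_pipeline(StandardScaler(),
                      MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                                    random_state=0))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))

# The designer can't tell you in advance what the model does on a new
# applicant; you have to run it and see.
applicant = np.array([[42_000, 700, 2]])
print("predicted decision (1 = grant loan):", model.predict(applicant))
```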

You go from the data to the objective function. The objective function is used to search through some complicated space of models—like a deep neural network, for those of you who know what that means, but it doesn't really matter—and it's that model that gets deployed.

Even though every step of this process is scientific and the machine learning scientist entirely understands that "Okay, what we're going to do is use the data to define this objective and then search through some space of models for the model that best meets that objective," that human designer isn't going to be able to tell you what this model will do on any particular input.

So, if you ask the model, "Here's a loan application. Do you think the model will give them a loan or not," the human designer isn't going to be able to say, "Oh, yeah. Because I wrote the code I know that they'll get the loan or that they won't because on line 17 of the program they haven't been in their current job long enough." The designer's going to say: "I don't know. Run the model on it, and we'll find out whether they get a loan or not."

Maybe more to the point, if you ask the question to the designer, "Well, is it possible that this model that you built systematically rejects the loans of creditworthy black applicants compared to creditworthy white applicants," again the answer will be, "I don't know. Why don't we try it and see?"

Or, is it possible that just releasing the details of this model or even using it in the field might leak the private information about the individuals used to train the model? Again, the answer would be, "I don't know, because I oversaw this pipeline, but I didn't specifically engage in the automated optimization of the objective function defined by a very large, complicated data set."

In this way, even though hammers and algorithms are both human-designed artifacts for specific purposes, algorithms are quite different from a moral and ethical standpoint in that any ethical misbehaviors about these algorithms are quite removed from the individual who oversaw this process. Also, their flexibility of purpose—a hammer is designed under normal circumstances to do exactly one thing very well that everybody completely understands. An algorithm is much more complicated and has much greater flexibility of purpose, so it's much harder to assign blame to any part of this pipeline or to understand what the consequences will be.

In our book, what we try to argue is that one important component of the solutions to these problems—in addition to the ones I mentioned before that are more about social change or institutional change—is that we need to embed social values that we care about directly into the code of our algorithms. If there's one takeaway from our book, it's that we're here to tell you that that is scientifically possible—not in all cases, and there will also be costs and tradeoffs that I'll talk about a little bit—but the road is clear to making our algorithms better than they are today. I didn't say perfect, but better than they are today.

If you're going to tell an algorithm how to avoid violations of privacy or of fairness or other social norms—you might care about accountability or interpretability—the first step is not to go write computer code. It's to think extremely hard about definitions, because if I'm going to tell an algorithm—the thing about algorithms is that you can't leave anything uncertain. You can't say, "Well, you know, make sure that you're not unfair to this population." You need to really say, "What does 'unfair' mean, and how do you measure an unfairness, and how much unfairness is too much?"

One of the interesting things about this kind of scientific research is that scholars and practitioners thought very deeply about things like fairness many centuries or even eons before people like me came along. Computer scientists are late to the study of fairness. Philosophers, economists, and social scientists have all thought much longer and more deeply about fairness than computer scientists. What's different is that they never had to think about it in so precise a way that you could tell it to a computer program. Sometimes there's great virtue in just that rigor by itself, even if you don't go on and do anything with it. Sometimes thinking that precisely about definitions exposes flaws in your thinking that were only going to be exposed by being that precise, and I'll give concrete examples as we go.

The basic research agenda that Aaron and I and our students and many others in the machine learning community have been engaged in in the past few years is going through this process of thinking about different social properties or ethical properties we would like from our algorithms. Thinking hard about what the right definition—or, as it might be, definitions—should be, and then thinking about how do you implement them in an algorithm, and then what is the cost to other things that you might care about. To preview one big message of our book, nobody should expect that by asking for "fairness" from an algorithm, or "privacy" from an algorithm, that you won't degrade its accuracy because it's an additional constraint.

If the optimal model—ignoring fairness—for making loans accurately happens to be discriminatory against some minority, then eradicating that discrimination by definition is going to make the error worse. So, there are going to be real costs. There are going to be monetary costs for the company that adopts the solutions that we suggest, and hard decisions to be made about how much to adopt them in exchange for how much profitability, for example.

You've noticed I've written down a number of social norms here in different levels of gray scale. The gray scale is basically representing how much scientific progress we have made on the algorithmic study of these different social norms. Our opinion is that notions of data privacy or algorithmic privacy are on the firmest scientific foundations right now. The feeling is that we've settled on the right definition on the one hand, and we have made a fair amount of progress at algorithmically implementing that definition.

Fairness has seen quite a bit of progress. It's not just messier right now; it's going to be messier, period, going forward, in the sense that we're going to have to entertain multiple competing definitions of fairness.

Other social norms that you might have heard about are accountability or interpretability or even morality in algorithms. We don't think that any of these are less interesting or less important, but we put them in a lighter shade because less is known about them. I promise you that "the singularity" is written down there, but you can't see it because it's so light.

What I want to do with my time is give you a flavor of what this research agenda looks like and where the science is right now for the areas of privacy and for fairness, and then just very briefly at the end talk about some other topics that are in the book.

Let's talk about data privacy for a second. To highlight this point that I made about sometimes thinking very precisely about definitions helping you expose the flaws in your own intuitions about these topics, let me start by picking on the notion of anonymization of data, which unfortunately is by an extremely wide margin the prevailing definition of privacy that is used in practice in industry. Any tech company whose services you use that has some privacy policy statement is almost certainly using a privacy definition that is some form of data anonymization. I'm here to deliver the unfortunate news that not only are those definitions flawed technically, they're fundamentally flawed conceptually. They cannot be fixed. It's a waste of time. Let me tell you why.

What does anonymization, first of all, mean? Anonymization basically means taking an original data set—like this little toy example; this table, the top one for just right now—and redacting certain columns from it or coarsening certain columns in order to reduce the resolution of the data. Another term that's used that sounds fancier but is basically the same, is "removing personally identifiable information (PII)."

Here's a hypothetical medical record database of a hospital, in which they decided sensibly, "If we're going to release this data for scientific research"—which is something, by the way, that we'd all like them to do; scientific research is one of the better uses of things like machine learning these days—"we're not going to include people's names. We're just going to redact that column entirely and reduce the granularity of the data. I'm not going to specify people's exact age in years, I will just group them into decades. I'll say whether you're a 20-30, 30-40, etc. I won't put your full ZIP code in, but I'll put the first three digits so researchers have some idea of where you live, and then maybe I include some medical fields without coarsening or redaction."

What is the goal of this kind of anonymization? The primary goal is that if I had a neighbor, let's say, because she's my neighbor I know what her name is, and I know her age, and maybe I know or suspect that she has been a patient at this particular hospital, then I shouldn't be able to identify her medical record from that information.

In particular, if I have a neighbor and her name is Rebecca and I know she's 55 years old and I know she's female, because of this redaction and coarsening in this top database, I go and look at this database, and there are two records that match the information I know about Rebecca—these two that are highlighted in red. The idea is that, if there are enough matches after the redaction and the coarsening, I will not really be able to pick out Rebecca's medical record, and she should somehow feel reassured by this.

Of course, in this toy example, you can see I've already learned from this that she's either HIV-positive or has colitis, and Rebecca might prefer that I not know that she has one of those or the other, but you could imagine, Well, in a real database, there would be tens of thousands of these records, and if I did enough of this coarsening and redaction, maybe there would be 100 different medical records that matched Rebecca, and now I wouldn't be able to glean much information about her true medical status at all.

The problem is, suppose there's a second data set that has also been coarsened and redacted for exactly the same purposes, and I also happen to know or suspect, because it's another local hospital, that Rebecca might have been treated or seen there as well. So, I go to the second database, and I say, "Okay, 55-year-old females, how many of them are in the data set?" This time there are three matches after the coarsening and redaction. But when I do the join of these two databases—when I take the red records from the top database and the bottom database—I now uniquely know that my neighbor is HIV-positive.

You might again argue, "Oh, but if the data sets are big enough and I do enough of this coarsening"—the idea is just broken. What's fundamentally broken about notions of anonymization is—and this is a technical consequence of this flaw—they pretend like the data set that's in front of you is the only data set that is ever going to exist in the world and that there are no auxiliary sources of information or additional databases that you can combine to try to triangulate and reidentify people.

Lest you think, Oh, is this a real problem? It's a real problem. Many of the biggest breaches of private data—other than the ones that are just due to cryptographic hacks—are exactly due to this kind of reidentification, sometimes called "linkage analysis," in which I take the allegedly anonymized database, but then I combine it with even publicly available information about individuals, and then I'm able to figure out, "Okay, this allegedly anonymized individual in the database is this particular real-world individual." This happens all the time.
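As a toy illustration of this kind of linkage analysis, here is roughly what the Rebecca example looks like in code. The tables, names, and diagnoses are all invented for this sketch; the only point is that two releases that each look "anonymized" on their own can be joined into a unique re-identification.

```python
# A toy re-identification ("linkage") attack on two hypothetical hospital releases.
import pandas as pd

hospital_a = pd.DataFrame({
    "age_range": ["50-60", "50-60", "30-40"],
    "sex":       ["F",     "F",     "M"],
    "zip3":      ["191",   "191",   "190"],
    "diagnosis": ["HIV+",  "colitis", "flu"],
})
hospital_b = pd.DataFrame({
    "age_range": ["50-60", "50-60", "50-60"],
    "sex":       ["F",     "F",     "F"],
    "zip3":      ["191",   "191",   "191"],
    "diagnosis": ["HIV+",  "asthma", "diabetes"],
})

# What the attacker knows about the neighbor: 55-year-old female, lives nearby.
known = {"age_range": "50-60", "sex": "F", "zip3": "191"}

match_a = hospital_a.loc[(hospital_a[list(known)] == pd.Series(known)).all(axis=1)]
match_b = hospital_b.loc[(hospital_b[list(known)] == pd.Series(known)).all(axis=1)]

# Each table alone leaves some ambiguity (2 and 3 candidate records)...
print(len(match_a), "candidates in hospital A,", len(match_b), "in hospital B")

# ...but joining the two releases collapses it to a single record.
print(match_a.merge(match_b, on=["age_range", "sex", "zip3", "diagnosis"]))
```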

This is an example where thinking hard about a definition will cause you to conclude that there's something irretrievably bad about this particular one. You might say, "Okay, well, what's your better idea?"

Again, to demonstrate the value of thinking in a very precise way, let me propose what I think we could all agree—at least in my English description of it—is the strongest, most desirable definition of individual data privacy that we could possibly propose. It would go something like this: Suppose I promised you that any analysis or computation that involved your private data, that no harm could come to you as a consequence of that computation or analysis. You can write this down mathematically, but I think the spirit of it is clear from the English. Any computation that involves your private data cannot result in any harm to you, no matter what happens in the future, including new data sets becoming available that we didn't foresee at the time of this computation.

I want to argue that this is asking too much. It would be great if we could do it, but it's asking too much in the sense that if we enforce this definition of privacy, we'll never be able to do anything useful with data.

Here's the example. This is the front page of a famous paper from the 1950s. Suppose it's 1950, and you are a smoker. If it's 1950, you are a smoker because in 1950 everybody is a smoker because there are no known health risks of smoking, there's no social stigma associated with smoking, it's seen as glamorous and fashionable, and everybody does it. Suppose you're asked by some medical researchers, "Hey, would you be willing to let your medical record be included in a study about the potential harmful effects of smoking," and you say, "Sure."

This study is then done, and this study firmly establishes the correlation between smoking and lung cancer, and your data was part of that study, and you're a smoker. We can argue that real harm was caused to you as a result of this study, because now the world knows that smoking and lung cancer are correlated.

And if we want to make these harms concrete, since you didn't hide the fact that you were a smoker from anyone, including your health insurer, and they know that you're a smoker, and now they know this fact about smoking and lung cancer, they might raise your premiums. Among other things, literal financial harm has come to you as a result of this study. If we adopt the definition of privacy that I suggested, we would disallow this kind of scientific study, which I think we can all agree is a good type of study.

But here's an observation about this study and this definition, which is that actually the inclusion of your specific idiosyncratic medical record in this study was in no way necessary to establish the link between smoking and lung cancer. The link between smoking and lung cancer is a fact about the world that can be reliably established with any sufficiently large database of medical records. It wasn't like your medical records' inclusion was the key piece of data that really nailed the correlation. As long as they had enough data of smokers and non-smokers and their medical history, they were going to learn this correlation.

This leads us to a slight modification of the definition that I gave that I'm claiming is too strong, which is instead of saying "no computation that involved your data should create any harm for you," we'll say, "no harm should be created that wasn't going to be created if the same computation was done with your data removed." This is what's called "differential privacy," which was introduced in roughly 2005.

The thought experiment here isn't like, Well, your data was included, does harm come to you or not? We start with a database of n medical records, and yours is one of the n medical records in this database—you are Xavier, the individual whose medical record is in the file folder highlighted in red.

We perform two thought experiments. We say, "Suppose we perform this computation or study or analysis using all the medical records," and the counterfactual is that we consider the same computational analysis done with n-1 medical records, where the -1 is we remove yours.

If the harms that can come to you downstream from including your medical record among the n and excluding it in the n-1 are basically the same, then we're going to call that "privacy," and the only harms we're going to protect you against are the ones where the harm really came from the inclusion of your data versus the other n-1.

Notice this definition allows the smoking study because you are a smoker, yes, but there were lots of smokers in the data set. The same conclusion was going to be reached if your data was removed, but any harm that really relied on your particular data, that's going to be disallowed by this definition of privacy. This is called differential privacy.
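For readers who want the formal statement behind this thought experiment, the standard definition (usually credited to Dwork, McSherry, Nissim, and Smith) can be sketched as follows; the parameter ε is the "knob" that comes up later in the talk.

```latex
% epsilon-differential privacy: a randomized algorithm M is
% epsilon-differentially private if, for every pair of datasets D and D'
% differing in a single individual's record, and every set S of possible
% outputs,
\[
\Pr\bigl[\,M(D) \in S\,\bigr] \;\le\; e^{\varepsilon}\, \Pr\bigl[\,M(D') \in S\,\bigr].
\]
% Small epsilon means the "your record included" and "your record removed"
% worlds are nearly indistinguishable from the algorithm's output alone.
```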

Differential privacy basically acts by adding noise to computation. It turns out when you think about this—when you get under the hood of differential privacy—it really requires that algorithms be randomized or probabilistic, that they flip coins during computation.

Let me describe one of the earliest industry deployments of differential privacy, which was done by Apple. They were apparently so excited about this that a company that we normally associate with good design taste decided to do this incredibly tacky thing of renting out the side of a Best Western or Residence Inn in Las Vegas and put this big ad for themselves on the side.

So, what do they do? If you are a user of a recent-model iPhone or iPad, one of the things your phone periodically does is report statistics about your app usage back to the Apple mothership. Maybe on a weekly basis, it says, "How many hours did you look at your email? How many hours did you play Angry Birds," etc.

Why is Apple interested in these app usage statistics? They claim they're not actually interested in your app usage statistics, they want to know aggregates, they want to know what are the most popular apps platform-wide. Their app developers, of course, are very interested in these popularity statistics as well.

So, rather than report your detailed app usage statistics, like you played Angry Birds for 7.2 hours, you read The Wall Street Journal for 2.1 hours, etc., they'll take that histogram of your app usage, and they will add a lot of noise to it. They'll basically add random positive or negative numbers to each of your different app usage statistics. Maybe you read The Wall Street Journal for 2.1 hours, and it adds +5 hours to that, and maybe it takes your 7.2 hours of Angry Birds usage and subtracts 3.7 hours from that. It just adds noise all across the board.

It adds so much noise to this histogram that actually if I looked at this post-randomized version, I wouldn't learn much of anything at all about your actual app usage statistics because there's so much noise. But, because this noise is independent—we would say it's zero-mean; it could be positive, it could be negative, and it's symmetrically distributed around zero—if I have 100 million users and I get 100 million noisy reports like this and I add them together, I get extremely accurate estimates of aggregate usage without compromising any individual's privacy. This is the core concept behind differential privacy. Where it gets interesting is when you want to make much more complicated computations or analyses differentially private.
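Here is a minimal sketch, in the spirit of the scheme just described (Apple's production system is more elaborate), of how individually useless noisy reports can still yield accurate aggregates. The app names, usage numbers, and noise scale are all made up.

```python
# Local noise on each user's report, accurate averages in aggregate.
import numpy as np

rng = np.random.default_rng(0)
apps = ["Mail", "Angry Birds", "WSJ"]
n_users = 100_000

# Hypothetical true weekly hours per user (n_users x 3).
true_usage = rng.gamma(shape=2.0, scale=[1.5, 2.0, 0.8], size=(n_users, 3))

noise_scale = 10.0  # lots of zero-mean Laplace noise per individual report
noisy_reports = true_usage + rng.laplace(0.0, noise_scale, size=true_usage.shape)

one_user = 0
print("one user's true hours :", np.round(true_usage[one_user], 2))
print("that user's report    :", np.round(noisy_reports[one_user], 2))  # nearly useless alone

# Because the noise is zero-mean and independent, averaging over many users
# recovers the aggregates quite accurately.
print("true average hours    :", np.round(true_usage.mean(axis=0), 3))
print("estimated from reports:", np.round(noisy_reports.mean(axis=0), 3))
```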

The good news is that many of the most useful kinds of computations that we do these days, including pretty much everything in statistics and machine learning, can be made differentially private. The backpropagation algorithm for neural networks, which is the core algorithm underlying deep learning technology, is not differentially private. It does not give any privacy promises of any kind. There is a variant of it that in carefully chosen places adds carefully chosen types of noise that provably obeys this very strong definition of privacy.
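The best-known such variant is usually described along the lines of DP-SGD (Abadi et al., 2016): clip each individual example's gradient so that no single record can move the model too much, then add noise to the summed gradient. A rough sketch on a toy logistic-regression problem, with illustrative rather than carefully calibrated parameters, might look like this:

```python
# A rough numpy sketch of the idea behind DP-SGD: per-example gradient
# clipping plus Gaussian noise. Hyperparameters are illustrative, not tied
# to a specific privacy budget.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w + rng.normal(0, 0.5, n) > 0).astype(float)

w = np.zeros(d)
clip_norm, noise_mult, lr, batch = 1.0, 1.1, 0.1, 200

for step in range(300):
    idx = rng.choice(n, size=batch, replace=False)
    preds = 1.0 / (1.0 + np.exp(-(X[idx] @ w)))
    # Per-example gradients of the logistic loss: shape (batch, d).
    per_example_grads = (preds - y[idx])[:, None] * X[idx]
    # Clip each example's gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    per_example_grads *= np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Sum, add noise scaled to the clipping norm, then average and step.
    noisy_sum = per_example_grads.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=d)
    w -= lr * noisy_sum / batch

acc = ((1.0 / (1.0 + np.exp(-(X @ w))) > 0.5) == y).mean()
print("noisily trained model accuracy:", round(acc, 3))
```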

You might say, "Okay, great. So, Apple has a way of protecting your app usage statistics. This seems like a rather low-stakes application," and I would agree with you.

A big moonshot test for differential privacy is in the works. The U.S. Census is required by law to preserve the privacy of individual data, but the U.S. Census has never before committed to a definition of privacy, so it didn't really matter. They had very ad hoc anonymization-style ways of implementing privacy in the past. Recently, the U.S. Census has decided to bite the bullet, and every single statistic or report that they issue from the underlying data of the 2020 Census will be done under the constraint of differential privacy.

This is a big endeavor because there are a lot of engineering details to think about here. In particular, one of the engineering details—and one of the themes of our book—is that in adding noise to your app usage statistics, there's a knob we can turn. I can add a little bit of noise to your app usage statistics, but then people will be able to infer more about your app usage statistics. If you really played an extraordinary amount of Angry Birds this week and I only add a little bit of noise, I won't know exactly how much Angry Birds you played, but I'm going to be able to look at this and say, "You're playing a lot of Angry Birds."

If I add a little bit of noise, I provide less privacy to you, but then the aggregate will be more accurate. Or, I can add much, much more noise, and then the aggregate will be less accurate. So, there's a knob provided by differential privacy which lets you choose the tradeoff between the promises you make to individuals about their level of privacy and the accuracy of the aggregate computations that you're doing.

Differential privacy, by the way, is correctly silent on how you should set that knob because how you should set that knob should depend on what's at stake. Maybe we don't really care that much about how much Apple knows about our usage statistics, and so we're okay with relatively little noise being added, but maybe if it's our medical record or even our census data or our financial history, we want a lot more noise to be added. So, differential privacy provides you a framework to manage this tradeoff and think about it quantitatively, but it doesn't tell you how to set that knob. One of the big engineering details, of course, for the U.S. Census is how to set that knob and how to set that knob on different computations.
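For the simplest mechanism of this kind, the Laplace mechanism from the differential privacy literature, the knob is exactly the ε in the formal definition above, and the relationship between ε and noise can be written down directly:

```latex
% The Laplace mechanism: to release a numeric query f while satisfying
% epsilon-differential privacy, where Delta is the query's "sensitivity"
% (the most f can change when one person's record is added or removed),
% publish
\[
\tilde{f}(D) \;=\; f(D) + \mathrm{Laplace}\!\left(\frac{\Delta}{\varepsilon}\right).
\]
% A smaller epsilon is a stronger privacy promise and forces proportionally
% more noise, hence a less accurate released statistic; a larger epsilon is
% the reverse. Choosing epsilon is the policy question the talk describes.
```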

That's all I'm going to say about privacy for now, but I hope this gives you some sense of how thinking in this field goes about good definitions and bad definitions and about tradeoffs between social norms like privacy and things like utility, profitability, and accuracy.

The study of algorithmic fairness, and more specifically, fairness in machine learning, is in a much more nascent state than privacy, and in particular, differential privacy. But we already know it's going to be a little bit messier than privacy, where I think many people who have thought deeply about privacy have converged on differential privacy as the right core notion.

There's not agreement on definitions in fairness. In fact, it's even worse than that. It is known that entirely reasonable definitions of fairness that each makes sense in isolation are provably mathematically impossible to achieve simultaneously. This was actually discovered through a controversy between a company that developed a criminal risk assessment model used in sentencing decisions and a watchdog group that audited that model. The watchdog group pointed out that, under a particular definition of fairness, the model was racially discriminatory.

The company came back and said, "Well, we're very concerned about racial fairness, and we implemented racial fairness in our model. We used this definition of racial fairness." There was some back-and-forth between these two parties. Then some more mathematically minded researchers in the community said, "Huh. I wonder if it's even mathematically possible to achieve these two things simultaneously?" and they proved a theorem showing that it wasn't.

Not only will there be these tradeoffs between fairness and accuracy, as in the study of privacy. It's even worse than that. There might be tradeoffs between different notions of fairness or even the same notion of fairness in different groups. In particular, there's no guarantee that if I build a predictive model for lending and make sure that it doesn't falsely reject black people more often than it falsely rejects white people, in the process of enforcing that fairness condition I won't actually magnify gender discrimination.

To put it bluntly, something that we say early and often in the book is that when machine learning is involved and you pick some objective function to optimize like error, you should never expect to get for free anything that you didn't explicitly state in the objective, and you shouldn't expect to avoid any behavior that you didn't specify should be explicitly avoided. Because if you're searching some complicated model space looking for the lowest error and there's some little corner of the model space where you can even incrementally, infinitesimally improve your error at the expense of some social norm, machine learning is going to go for that corner because that's what it does.

But let me give you a quick visceral example of the ways in which machine learning can naturally engender unfairness or discrimination of various kinds. Let's suppose that we're a college admissions office and we're trying to develop a predictive model for collegiate success based on, let's say, just high school grade point average (GPA) on the X-axis and SAT score on the Y-axis.

Each little plus or minus here is a previous applicant to our college that we actually admitted, so we know whether they succeeded in college or not. Pick any quantitative definition of success in college that you want. Let's say it's "graduate within five years of matriculating with at least a 3.0 GPA." It could be "donates at least $10 million back to the college within 20 years of graduating." As long as we can measure it, I don't care what it is.

Each little point's X-value is the GPA of that past admit; the Y-value is the SAT score of that past admit. But then the plus or minus is whether they succeeded in college or not.

I'd like you to notice a couple of things about this population of individuals. First of all, if you stared at this carefully, you'd notice that slightly less than half of the admits succeeded. There are slightly more than 50 percent minuses and slightly less than 50 percent pluses. That's observation number one.

The other observation is that if I had to fit a model to this data, it's pretty clear that there's a simple model that separates the pluses from the minuses. If I draw this diagonal line and I say, "Everybody whose combination of SAT score and GPA is above that line, I admit, the other ones I reject," I would make very few mistakes on this data set. There are a couple. There are some false admits and some false rejects, but this simple model does a pretty good job of separating the successes from the non-successes.

But suppose in my data set there's also a second population and that their data looks like this. I want to make a couple of points about this population. First of all, they are the minority. There are many fewer of these red points than there were of these orange points.

Observation number two is that this population is slightly more qualified for college. There is an equal number of pluses and minuses in this minority population, compared to the slight majority of minuses in the majority population.

Observation number three is that there is actually a perfect model separating the pluses from the minuses here, which is this diagonal line.

Now, if I train a model on the aggregate data—I've got both the reds and the oranges here now—and I say just minimize my predictive error on the combined data set, well, because the red points are such a small fraction of the data, I'm still going to end up choosing the model that is basically the best model for the orange population, at the cost of rejecting every single qualified minority applicant.
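A small simulation, not taken from the talk, makes this concrete: generate a large majority and a small minority whose SAT scores are shifted downward for reasons unrelated to success, train one error-minimizing model on the combined data, and compare false rejection rates. In a typical run the qualified minority applicants are falsely rejected at a much higher rate.

```python
# Synthetic version of the slide's thought experiment; all numbers invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def population(n, sat_shift):
    gpa = rng.uniform(2.0, 4.0, n)
    true_sat = rng.uniform(900, 1600, n)
    observed_sat = true_sat - sat_shift          # e.g., no test-prep resources
    # Success depends on underlying preparation, not on the shift.
    success = (400 * gpa + true_sat + rng.normal(0, 100, n) > 2400).astype(int)
    return np.column_stack([gpa, observed_sat]), success

X_maj, y_maj = population(5000, sat_shift=0)
X_min, y_min = population(250, sat_shift=400)

X = np.vstack([X_maj, X_min])
y = np.concatenate([y_maj, y_min])
group = np.array([0] * len(y_maj) + [1] * len(y_min))   # 0 = majority, 1 = minority

model = LogisticRegression(max_iter=5000).fit(X, y)     # minimize error, nothing else
pred = model.predict(X)

for g, name in [(0, "majority"), (1, "minority")]:
    qualified = (group == g) & (y == 1)
    frr = (pred[qualified] == 0).mean()
    print(f"{name}: false rejection rate among the qualified = {frr:.2f}")
```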

You might ask, "Can this really happen in the real world?" Here's one story about why this might happen. The difference between the orange and the red population doesn't have to do with collegiate success. It basically is the case that the SAT scores of the minority population are systematically shifted downward, regardless of success or not.

One explanation for that could be that the orange population comes from a wealthy demographic in which you pay for SAT preparation courses and multiple re-takes of the exam, all of which cost money. The minority population can't afford any of that, and so, even though they're no less prepared for college, they're less financially able to game this exam, and so they have systematically deflated SAT scores.

This is one of several ways in which the natural optimization process of machine learning can result in visceral discrimination. There are a number of things you might say about this. You might say, "If I just look at this data, I would realize it's not that they're less qualified, it's just that their SAT scores are lower, and there are very natural socioeconomic explanations for that. Why not just build two separate models? Why not use this model for the orange population and this model for the minority population?" In fact, by doing that, I would avoid this tradeoff between accuracy and fairness. I would actually have a model that's both more fair and more accurate on both populations.

The problem is many laws forbid the use of race, for example, as an input to the model at all, and a model that says, "If you're from this race, then use this line, and if you're from that race, use that line," is a model that is using race as an input. It's like a decision tree that says, "First, look at the race, then branch one way for one race and the other way for the other." A lot of laws that we have that are explicitly designed to protect some minority group can have the unintended effect of guaranteeing that we discriminate against that minority group if machine learning is the process by which we're developing our model.

What's a fix to this? One fix would be to not enact laws that forbid making these observations and building different models when doing so increases fairness. Another thing I could do is, instead of saying the objective is to minimize the error on the combined data set, I could basically specify a new objective that says, "The goal is to minimize the error, subject to the constraint that the false rejection rates between the two populations can't be too different." The false rejection rate on the red population of this model is 100 percent; every red plus here is rejected. The false rejection rate on the orange population is close to 0. The most accurate model has the maximum possible unfairness.

I could instead say, "You can still have accuracy as an objective, but you have to find the single line that minimizes the error, subject to the condition that the fraction of red pluses that you reject and the fraction of orange pluses that you reject have to be within 1 percent of each other or 5 percent of each other or 15 percent of each other." So, now I have another knob. I have a knob that basically says how much unfairness do I allow?, and conditioned on that amount of unfairness, then you optimize the error. Right now, for this model, that knob has to be set at 100 percent. The disparity in false rejection rate is close to 100 percent. But as I crank that knob down and ask for less and less unfairness, I'm going to change the model, and it'll make the accuracy worse.
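A self-contained sketch of that knob, on synthetic data rather than the slide's: among a family of simple threshold models, pick the lowest-error one whose false-rejection-rate gap is at most γ. Sweeping γ is exactly what traces out the tradeoff curves described next.

```python
# Lowest-error model subject to an unfairness budget gamma (toy data).
import numpy as np

rng = np.random.default_rng(1)

def group(n, shift):
    score = rng.normal(0, 1, n) - shift          # observed score, shifted for one group
    qualified = (score + shift + rng.normal(0, 0.5, n) > 0).astype(int)
    return score, qualified

s_a, q_a = group(5000, shift=0.0)    # majority
s_b, q_b = group(500, shift=1.0)     # minority with deflated scores
scores = np.concatenate([s_a, s_b])
qual = np.concatenate([q_a, q_b])
grp = np.array([0] * len(s_a) + [1] * len(s_b))

def stats(threshold):
    accept = scores > threshold
    error = (accept != qual).mean()
    frr = [((~accept) & (qual == 1) & (grp == g)).sum() /
           max(((qual == 1) & (grp == g)).sum(), 1) for g in (0, 1)]
    return error, abs(frr[0] - frr[1])

all_stats = [stats(t) for t in np.linspace(-3, 3, 301)]

for gamma in [1.0, 0.2, 0.05]:
    feasible = [(err, unf) for err, unf in all_stats if unf <= gamma]
    err, unf = min(feasible)   # lowest error among models meeting the constraint
    print(f"gamma={gamma:4.2f}: error={err:.3f}, unfairness={unf:.3f}")
```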

This is not some kind of vague conceptual thing. On real data sets, you can actually plot out quantitatively the tradeoffs you face between accuracy and unfairness. I won't go into details, but on three different data sets in which fairness is a concern—one of these is a criminal recidivism data set, another one is about predicting school performance. On the X-axis here—if you could see what the X-axis is—is error, so smaller is better; the Y-axis is unfairness, so smaller is better. In a perfect world, we'd be at zero error and zero unfairness. In machine learning in general on real problems you're never going to get to zero error, period.

But you can see that on this plot there is this tradeoff. I can minimize my error at this point, but I'll get the maximum unfairness. I can also ask for zero unfairness, and I get the worst error, or I can be anywhere in between.

This is where science has to stop and policy has to start. Because people like me can do a very good job at creating the theory and algorithms that result in plots like this, but at some point somebody or society has to decide, "In this particular application, like criminal risk assessment, this is the right tradeoff between error and unfairness. In this other application, like gender bias in science, technology, engineering, and mathematics advertising on Google, this is the right balance between accuracy and fairness."

At some point, this dialogue has to happen between scientists and policymakers, regulators, legal, and practitioners. We think that a very good starting point for that dialogue is to make these tradeoffs quantitative and really discuss the hard numbers around: When you ask for more fairness, how much accuracy are you giving up, and vice versa?

What I've just described to you along with the introduction covers roughly the first half of our book, and our book is meant to be a hopefully engaging, entertaining, readable general-audience treatment of these topics. There are no equations in the book whatsoever. We try to populate it with many real-world examples.

But it only covers the first half of the book. You might wonder, What's in the second half of the book? A big part of the second half of the book concerns situations in which algorithms are exhibiting behaviors that we might think of as socially undesirable, but it's not so easy to blame the algorithm exclusively.

In the examples I've been giving so far, it's like, "You got rejected for a loan or rejected from college, even though you deserved to get the loan or get in." You may not even know that an algorithm was making this decision about you. Or, you may not even know that your data was used to train the model that is making this decision on other people. To a first approximation it really does seem fair to think of algorithms as victimizing individual people.

There are a lot of other modern technological settings where there's a population of users of an algorithm, or I might say more specifically an app, and the misbehavior is an emergent phenomenon, not of just the algorithm or app itself, but the incentives and use of that app or algorithm by the entire population.

There are many examples of this. Let me give the cleanest one. The cleanest one is navigation apps like Google Maps and Waze. These apps clearly are taking all of our collective data, including our real-time Global Positioning System coordinates, so they know about real-time traffic on all of the roadways, and you type in, "I want to drive from point A to point B. What's the fastest route?" And the answer is tailored to conditions right now, not computed in some abstract way by looking at a fold-out map and saying, "This is the minimum distance," because that's not what you care about. You want to get from point A to point B the fastest.

On the one hand, what could be better? What could be better than this app that in response to your particular desires right this minute, knowing what everybody else is doing on the roads, optimizes your driving route. Trust me, I use them every time I drive.

But if you step back from this for a second, you might ask the following question: Is it possible that by all of us using these apps that are greedily maximizing our own selfish interests—to use game-theoretic language because much of the second half of the book is about settings in which game theory is a valuable tool for thinking—could it be that we're actually all worse off by using these apps?

You might think, How could that happen? How could it be that everybody being selfish for themselves results in a collective outcome that's worse for many individuals or maybe even the entire population? If you've ever taken a game theory class, probably the first example you were given is prisoner's dilemma. Prisoner's dilemma is the canonical simple game-theoretic model in which, by everybody being self-interested and optimizing against what everyone else is doing, the outcome for the two players is much, much worse for both of them than it could have been under some alternate, non-equilibrium solution.

I think a very correct view of these types of apps is that, in game theory terms, they are "helping all of us compute our best response" in a game, and that is driving us toward the competitive or literally the Nash equilibrium of these games, if you've taken some game theory.

In fact, this happens not just on paper—in the book we give a simple mathematical example where everybody optimizing selfishly for their driving time causes the collective driving time to go up by 33 percent. There are actually real instances of this in the world. For the locals here, there was huge concern about closing much of Times Square to vehicular traffic. People were like, "Oh, my god! Are you out of your mind? The busiest urban traffic area in the United States, you're going to close that and make it a pedestrian mall?"

In fact, it's not that bad. In fact, maybe it has even gotten better. When you add capacity to a network of roads or take it away, people react to it in a game-theoretic, competitive, or self-interested way, and so you have to think not just about the capacity you've added or taken away but about the equilibrium behavior that will result from it.
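The 33 percent figure mentioned a moment ago is exactly what comes out of a classic textbook routing example in this spirit, often called Pigou's example (not necessarily the one used in the book):

```latex
% Pigou's example: a unit mass of drivers travels from A to B over two roads.
% The wide highway always takes 1 hour; the narrow road takes x hours when a
% fraction x of the drivers use it.
%
% Selfish (Nash) equilibrium: whenever x < 1 the narrow road is strictly
% faster, so everyone piles onto it and x = 1, giving average travel time
\[
T_{\text{eq}} = 1 .
\]
% Coordinated optimum: send a fraction x on the narrow road and minimize the
% average time
\[
T(x) = x \cdot x + (1 - x) \cdot 1, \qquad
T'(x) = 2x - 1 = 0 \;\Rightarrow\; x = \tfrac12, \qquad
T\!\left(\tfrac12\right) = \tfrac34 .
\]
% Selfish routing gives average time 1 versus 3/4 under coordination --
% about 33 percent worse, even though every driver is individually optimizing.
```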

Much of the second half of the book is concerned with these kinds of things, and we take some liberties in this discussion that I think are not unreasonable. Another domain in which we have all become quite used to personalization, or what I would call the "algorithmically aided computation of best responses," is in social media. There's an algorithm in Facebook's newsfeed that decides what content to show you: which of your friends' posts to prioritize, what ads to show you, what news or content to show you. It's like Google Maps and Waze. It's learning about your preferences and showing you the stuff that you like; the code word for this being "maximizing user engagement."

These models built by machine learning have discovered that it's better to show you, let's say, political news that you are inclined to agree with than political news that you find offensive or disagreeable. That increases engagement.

Again, like Google Maps and Waze, what could be better? What could be better than, instead of my having to look at a bunch of stuff that I don't like or irrelevant stuff, what could be better than just having an algorithm tell me which route to drive or show me what content?

The result, of course, is an equilibrium that we might not like. We might feel like, yes, each individual is having this app personally optimized for them. Maybe this has come at the cost, let's say, not to driving time in the case of Facebook's newsfeed but to a deliberative democratic society, for example. I think many people feel like that is in fact what has happened. What we try to do is put this on slightly firmer scientific footing, as thinking about this as a bad equilibrium that results from all this self-optimization enabled by apps.

We do have algorithmic proposals for these kinds of problems as well. They're a little bit different than just going into the code, but again, there are things that could be done now. As a concrete example, if we don't like the equilibrium that Facebook's current newsfeed algorithm has driven us to—trust me, the way machine learning works, for the same reason that Facebook knows what content you like, that same model also tells it what you don't like, or what you like slightly less than what you like the most. It would not take a big change to their code to do a little bit more exploration and mix into your newsfeed some stuff that you might find less agreeable, or what we might call "opposing viewpoints."

They don't have to do this all at once wholesale. They could in fact just offer a slider bar to individual users that they could experiment with and say, "Show me a little bit more stuff that's further afield than what I agree with," and this might help the algorithm nudge us out of this bad equilibrium that many people feel like we've gotten into.
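As a purely illustrative sketch of what such a slider could mean in code (nothing here reflects Facebook's actual system; the item fields and scores are invented):

```python
# Re-rank candidate feed items by mixing the engagement score the model already
# produces with a bonus for content it predicts the user will disagree with.
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    engagement_score: float    # model's prediction of how much the user likes it
    disagreement_score: float  # model's prediction of how "far afield" it is

def rank_feed(items, slider):
    """slider = 0.0 -> pure engagement ranking; 1.0 -> heavily favor opposing views."""
    return sorted(
        items,
        key=lambda it: (1 - slider) * it.engagement_score
                       + slider * it.disagreement_score,
        reverse=True,
    )

feed = [
    Item("Story you already agree with",   0.9, 0.1),
    Item("Neutral local news",             0.6, 0.4),
    Item("Well-argued opposing viewpoint", 0.4, 0.9),
]

for slider in (0.0, 0.5):
    print(f"slider={slider}:", [it.title for it in rank_feed(feed, slider)])
```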

We also do talk about things that we think have a much longer way to go scientifically—like interpretability of algorithms, accountability, morality—and we do even talk a little bit about the singularity toward the end, but I'll leave that unspoken to entice you to go read the book.

Let me stop there, and I'll take any questions.

Questions

QUESTION: I had one question regarding the constraints. You said that some of these sensitive variables by law cannot be included when we are modeling. Is it okay if you are using them in the constraints then?

MICHAEL KEARNS: First of all, just to separate two issues, I basically think these laws are a bad idea. They're a bad idea for at least two reasons. One is the one that I discussed, which is that by refusing the use of race in a predictive model you might actually be ensuring discrimination against the very group that you were trying to protect by disincluding it.

Second, you're fooling yourself. Maybe it made sense in the 1970s for credit scoring to forbid the use of race because such limited data was available about people. These days there are so many proxies for race that I can find in other sources of data. Unfortunately, in the United States for the most part your ZIP code is a rather good indicator of your race already, as well as many other things that would surprise you—like what kind of car you drive, whether you're a Mac or a PC user.

One joke is that 10 years ago or so I was at an academic conference, and academics are famous for their liberal politics. After a few drinks at a dinner, the host said, "I want you all to think. Do any of you know a Republican who uses a Mac?" Everybody is absorbed in thought for a minute, and then I said, "Actually, what everybody's thinking about is whether they know a Republican first, and then they'll think about whether they use a Mac."

I'm joking, but the point is a lot of apparently innocuous attributes about you that you don't even think are particularly sensitive, can be very correlated statistically with things like race or even other things that you might not even know about yourself. I think that to "get fairness by forbidding the use of certain features" is a losing proposition and that the right alternative is to say, "Use any data that you want. Don't racially discriminate." That's the right solution, not trying to get to it indirectly by assuming that the models work a certain way and that if you don't give them certain information that they can't discriminate. That has just proven to be a flawed way of thinking, especially in the modern, data-rich era.

QUESTION: Thank you so much for a fascinating talk. With nanotechnology increasing the velocity of computation and the existence of both good actors and bad actors, how from an ethical point of view can the influence—it might be very hard to undo something when it's happening at nano speed. How do we navigate this kind of territory?

MICHAEL KEARNS: I don't think I have a general answer to what we might think of as the computational arms race. I think that arms race is being played out in many domains. In particular, computer security has this flavor all the time. In computer security you try to anticipate all of the ways in which hackers might breach your system or your data, but they have the luxury of figuring out your vulnerabilities and the things that you didn't think about in getting you there.

I think that that area—the more data you have, the more compute power you have—that all helps, but at the end of the day the thing is fundamentally some kind of game between two players, and in the game it is harder to be the defender than the attacker. It's like a universal versus an existential quantifier: the defender has to protect against everything; the attacker just has to find the one thing you didn't think of.

One concrete example where I think things do need to change is in the regulatory approach to tech companies. To the extent that large regulatory agencies in the United States want to get serious about preventing the kinds of harms to users that we discuss in the book, I think we are in a very flawed state right now. First of all, in defense of the regulatory agencies, they have very strong limitations on what they can and can't do right now. They are basically at a great disadvantage to the big tech companies.

People think of it as a big win for the regulatory community when Facebook has to pay a huge fine for severe privacy violations. Then, the main remedy to that going forward is, "You need to fix your internal processes and people."

Imagine an alternative world where the Federal Trade Commission or the Department of Justice or what have you actually is allowed internal access to the systems, data, and algorithms of a company like Facebook and can find misdeeds before they cause widespread damage. This is going to require legal and regulatory change. Tech companies are going to be resistant to it, but there's nothing conceptual about it that's difficult. If an algorithm is exhibiting racial, gender, or other discrimination, there's a way of auditing those algorithms and discovering that without great difficulty. Why not let the regulatory agencies have that access rather than waiting until the media or some other hacker group has to expose a flaw, and then there's a big fine and procedural change?

I think that's an area where—I'm not sure about nanotechnology per se—in this battle between two opposing parties, there are concrete ways you could level that playing field I think in very productive ways.

This is a longer conversation, but I also don't really buy the argument that this would infringe severely on the intellectual property of the technology companies. I think it could be done in a way that lets them have their intellectual property but provides much greater consumer protections.

QUESTIONER: So that would be the same idea of the defender and the attacker. At a regulatory level, there would be the attacker, if they were able to go in—

MICHAEL KEARNS: Maybe I wouldn't call it attacker and defender, I would call it an "auditor."

QUESTIONER: That's fine, whatever, but I understand the dynamic. Thank you for that.

QUESTION: Parveen Singh, Carnegie New Leaders.

There has been a big conversation on the West Coast with Uber and Lyft about shared mobility and the data that Uber, Lyft, and Waze collect being given to city and municipal governments. As a New Yorker, we would love to have data-driven technology in our transit system. I want to get your perspective on what you think about city and local governments having access to the data that private companies collect for their transit systems.

MICHAEL KEARNS: I'm generally a great believer in any socially good use of data, provided there are protections against the kinds of things that I describe here. I think it would be great if Uber and Lyft wanted to share their data with urban municipalities in order to improve public transit or mobility.

You have to also be concerned about the privacy of the individual data that Uber and Lyft collect being shared with a wider set of people. You might already worry about them just having it in the first place.

But in general, I'm very much in favor of that kind of effort. It just has to be managed and negotiated in a way that balances the good that is done against the potential harms. A lot of those things, I think, are negotiations that need to happen on the policy side, and maybe they are less technical than some of the more limited problems that we're discussing in the book, but I think it's all on the same spectrum.

QUESTION: I have a question about the ethics in terms of collaborating and teaching techniques—advanced techniques—to people who are developing AI systems who don't incorporate these ethical elements, who are only concerned with optimization. Is there a trend within the academic world, or the world more broadly, to try to make sure that people are only collaborating with people who incorporate these? I'm thinking specifically in terms of international relations related to Chinese efforts to use AI to identify potential nodes of dissent, and how that's shifting.

MICHAEL KEARNS: I don't have any special insight into Chinese governmental use of data. My view is probably no more informed than many people's in this room, but it doesn't look good. It doesn't look good to have one party setting policy with unfettered access to data. If we worry about the United States being a surveillance state, over there it's not even an open secret; it's just out in the open that it is one.

I don't have a lot to say about China in particular. I do think that the use of things like social media data to identify dissident groups and to use it for political purposes, or to disadvantage or harm your enemies, is the opposite of the kinds of applications that he was discussing over there, and so it should be disallowed basically.

We've had our own versions of these things—or at least claims of our own versions of these things—in the United States. They seem small in comparison to places where that kind of data is used to really harm people physically, but you don't want to walk down that slope at all. I don't like the use of people's social media data to target political misinformation at them, to the extent that that's happening. I think these are all bad uses of algorithms and machine learning.

QUESTION: Thank you so much for your lecture. I really enjoyed it.

I think my question is a little similar. I work for an organization. We are part of a global coalition called Campaign to Stop Killer Robots. In essence, we advocate for international law to prohibit development of fully autonomous weapons, which basically means any weapons system that can deploy force without meaningful human control. I just wondered, on behalf of the campaign, if you had any thoughts about the use of algorithms in weapons systems or even in law enforcement.

MICHAEL KEARNS: Yes. We talk very briefly about this in the last chapter of the book. Let me use your example to ask, more generally, the question that I think people individually and society as a whole need to think about. Are there certain types of decisions that we just don't want algorithms to make, or certain types of things that we don't want algorithms to do? Not because of the problems that I'm identifying here, which is that they do them badly in some way or another, but because even if they did them perfectly, we don't want them to do it, because of the moral character involved.

We reference in this part of the book a great book by Michael Sandel, a well-known ethicist at Harvard University. He is the source of these trolley car thought experiments: What would you do? Should the car turn and kill the passenger or kill the school kids? He has a great book called What Money Can't Buy: The Moral Limits of Markets, and I was very influenced by it in writing our book. His book is about the rise of economic and market-based thinking after World War II and the gradual creep of markets into things that he, morally, doesn't think should be markets at all.

He points out that sometimes when you make a market out of something, it changes the nature of the good being sold. I was very influenced by that because I felt like in many places in this book you could swap out the word "markets" for "algorithms" and "things that shouldn't be sold" for "decisions algorithms shouldn't make," and it would make sense, and I think automated warfare is a good one.

Even if, hypothetically, algorithms could make perfectly targeted drone strikes with no collateral damage whatsoever in an autonomous, unsupervised, no-human-in-the-loop fashion, maybe we think that shouldn't be done, just because only a human being can accept the moral agency or responsibility of the decision to kill another human being in a way that an algorithm just can't. So, even if it makes the decision perfectly, you change the moral character of the decision by having an algorithm make it.

I definitely agree. There are domains—and I would personally count for myself automated warfare as one—where I would want to tread extremely carefully, even if algorithms are much better than human actors. I think I agree with the mission that you describe.

QUESTION: Thank you again. I just wanted to ask: from the title of your book it seems like the "ethical algorithm" might exist, or at least that's what you're arguing. You mentioned that fairness, for example, could be gained in these algorithms by introducing constraints, and that privacy could be considered an additional constraint. Then, as you go on to list the others, the more constraints we introduce into this optimization problem, the more of a utility tradeoff there is going to be, as you mentioned.

What evidence have you seen in your research that would support that? Even with as many constraints as we can think of in terms of formulating morality into an algorithm, how can we still get something that works and provides value?

QUESTION: I have a question about privacy and fairness. One of the things you mentioned is that fairness might require knowing variables like race or gender to make sure that the algorithm is fair. Do you think that could be at odds with privacy, because the user may not want to reveal those variables in the first place? What do you think about that?

MICHAEL KEARNS: I think these two questions are extremely similar.

Basically, the more constraints you add, the more tradeoffs there are. If you ask for more fairness, you might get less accuracy; if you ask for more fairness, you might get less privacy; if you ask for more fairness of one type, you might get less for another. If you ask for all these things simultaneously, you might not be able to do anything useful with data, and that's the reality.

When we say the "ethical algorithm," we're not suggesting that there's going to be some master algorithm which encodes all of the social norms we might care about into a single algorithm that's still useful for anything. I don't think that thing exists. What I think science can do is take us to the point of making these tradeoffs explicit and quantitative. Then, society in the abstract and stakeholders in particular problems are going to have to navigate which of these norms they really care about and how much, make those tradeoffs, and hope that there is still something useful to do with algorithms and data once they've specified their constraints.
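
One way to see the tradeoff being described, purely as an illustration and not as the book's method, is to train a simple model with an explicit fairness penalty and sweep the penalty weight. The sketch below, with hypothetical names, uses logistic regression and penalizes the gap in average predicted score between two groups; a weight of zero recovers the accuracy-only model, and raising it trades accuracy for a smaller gap.

```python
import numpy as np

def train_with_fairness_penalty(X, y, group, lam, lr=0.1, steps=2000):
    """Logistic regression whose loss adds lam * (gap in mean predicted
    score between group 1 and group 0)^2. lam = 0 ignores fairness;
    larger lam trades accuracy for a smaller gap."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))          # predicted probabilities
        grad_acc = X.T @ (p - y) / len(y)         # gradient of the log-loss
        gap = p[group == 1].mean() - p[group == 0].mean()
        # chain rule: d p_i / d w = p_i (1 - p_i) x_i
        dgap = (X[group == 1] * (p * (1 - p))[group == 1, None]).mean(axis=0) \
             - (X[group == 0] * (p * (1 - p))[group == 0, None]).mean(axis=0)
        w -= lr * (grad_acc + lam * 2 * gap * dgap)
    return w

# Sweeping lam over a range and recording (accuracy, |gap|) for each trained
# model traces out a tradeoff curve: every point is a different compromise
# between predictive performance and this particular fairness notion.
```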

QUESTION: I just wanted to segue into policy a little bit. What do we do when, for example with Jonathan Haidt's "foundations of morality," we've seen that there are very different ways that different cultures interpret morality, fairness, and even privacy to a certain degree? How do we think about social norms and policies that can be applied on a larger scale globally, where we can actually try to come up with some sort of a social norm on anything—be it on privacy or on fairness—that is applicable across cultures?

MICHAEL KEARNS: That's a great question, and we talk very briefly about this kind of thing. It really is true, especially in fairness, where there are multiple definitions in competition with each other, that each of these definitions is in some sense received wisdom. Who decides what group should be protected, and what would constitute harm to that group? That might mean something very different in one culture or another. It might even mean something different to different people.

As far as I can tell, there's very little known about ordinary people's subjective notions of privacy or fairness. There is very little behavioral work on this kind of thing. We did a little bit in a mini-project at Penn back in the spring, where we showed subjects pairs of criminal records from the Correctional Offender Management Profiling for Alternative Sanctions data set and said, "Do you think these two individuals are sufficiently similar that they should be assessed to have similar risk of recommitting a violent crime?"

We still haven't really grokked the data in detail yet, but the first thing you realize from it is that different people have different opinions on this. Just in this pair-wise assessment task, some people have very restrictive notions of fairness that would cost a lot in accuracy if you implemented them, and other people are more liberal, and you could build more accurate models and still satisfy their subjective notion.
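
As a hedged sketch of how one might quantify "restrictive" here (our illustration, not the Penn study's actual analysis), you could take each respondent's set of "treat these two alike" pairs and measure how often a given risk model violates them:

```python
import numpy as np

def constraint_violation_rate(scores, similar_pairs, eps=0.1):
    """Fraction of a respondent's 'these two should be treated alike' pairs
    that a model violates, i.e., assigns risk scores more than eps apart.

    scores        : array of model risk scores, indexed by record id
    similar_pairs : list of (i, j) record-id pairs the respondent judged similar
    eps           : tolerance for how close 'similar treatment' must be
    """
    gaps = np.abs([scores[i] - scores[j] for i, j in similar_pairs])
    return float((gaps > eps).mean())

# A respondent whose pairs force many violations on an otherwise accurate model
# has, in this operational sense, a more costly (restrictive) fairness notion.
```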

I think this line of work is basically absent in the literature that we're describing in this book right now, because we're trying to talk about the science. But to do science on this we have to know, first of all, what different cultures would think constitutes fairness or privacy and how that differs across cultures. While I'm interested in doing this kind of research, I'm not a social scientist by training, and I think we need more trained social scientists doing this kind of work in multiple cultures and across multiple demographic groups and the like.

So, I think it's a pretty wide-open landscape for this kind of stuff. I think it's going to be increasingly important because, especially once these received-wisdom fairness notions prove to really be costing you in other dimensions, then people are going to look hard at them and say, "Wait. Who says that this is the group to protect and that this is what constitutes harm to them?" I don't think there are great answers to that question.

Thank you, everybody. I appreciate your coming out.
