In this episode of the AI & Information Accessibility podcast, host Ayushi Khemka speaks with Brazilian federal judge Isabela Ferrari and AIEI Advisory Board member Dr. Kobi Leins about artificial intelligence, law, and social justice. They discuss data security, digital access, and cyber-resilience, particularly in the Brazilian and Australian contexts, and share what first drew them to the field of AI. Ferrari and Leins also reflect on the (im)possibility of responsible and equitable AI.
The AI4IA podcast series is connected to the Artificial Intelligence for Information Accessibility 2022 Conference, held on September 28 in commemoration of the International Day for Universal Access to Information. The AI4IA Conference and podcast series are hosted in collaboration with AI4Society and the Kule Institute for Advanced Studies at the University of Alberta, the Centre for New Economic Diplomacy at the Observer Research Foundation in India, and the Broadcasting Commission of Jamaica.
To access the conference presentations, please use this link.
CORDEL GREEN: Hello and welcome. My name is Cordel Green, chairman of the UNESCO Information for All Programme Working Group on Information Accessibility. Welcome to the AI for Information Accessibility podcast, organized by Carnegie Council for Ethics in International Affairs. Your host is Ayushi Khemka, a Ph.D. student at the University of Alberta.
AYUSHI KHEMKA: Hello and welcome back to another episode of the AI & Information Accessibility podcast. This is your host, Ayushi Khemka and today I have with me Dr. Kobi Leins and Judge Isabela Ferrari.
Judge Ferrari has been a Brazilian federal judge since 2012 and a member of the UNESCO WGIA since 2019. She was a visiting researcher at Harvard Law from 2016 to 2017 and teaches in UNESCO's MOOC on AI and the rule of law. She was a 2021 Distinguished Jurist Lecturer for the Judiciary of Trinidad & Tobago and has published several books on law and technology, especially in the field of online courts.
Kobi Leins is a visiting senior research fellow at King's College London, an expert for Standards Australia providing technical advice to the International Standards Organisation on forthcoming AI Standards, a co-founder of IEEE's Responsible Innovation of AI and the Life Sciences, a non-resident fellow of the United Nations Institute for Disarmament Research, and advisory board member of the Carnegie Artificial Intelligence & Equality Initiative (AIEI).
To start our conversation, I talked with Dr. Leins and Judge Ferrari about what initially sparked their interest in AI and law and what that has led them to today.
Here’s Dr. Leins.
KOBI LEINS: My background in law is old and I refer to myself as a reformed lawyer. I undertook two degrees. I studied languages and I studied law because I was really interested in how the world worked and how power worked, and for me, law was always a source of power. It was trying to understand what runs the world and what governs what we do, who gets to tell us what we can do.
But years ago I hung up the law hat when I went to the UN and worked on disarmament issues, including chemical and biological weapons control, and other issues within the UN, where I really started to become more interested in governance more generally, beyond law: what structures control human behavior, including culture and networks. I looked at a lot of different angles of power.
More recently, I've become interested in AI through my Ph.D. work and the book that came out of it, on regulating nanomaterials in warfare. Looking at new technologies and how they're used got me thinking about technology more broadly, not just how it's used in warfare but how it's used day to day, and how our lives have changed profoundly because of these automated systems.
AYUSHI KHEMKA: Similarly for Judge Ferrari, her interest in the law came before she started researching emerging technologies, like AI.
ISABELA FERRARI: So the thing is, as I'm a judge, the interest in law is in my DNA. Ever since I was a kid I wanted to be a judge. What happened was that in 2016 I went to study abroad as a visiting researcher at Harvard Law. When I was there studying comparative law, I came into contact with the topic of AI for the first time. I realized that AI was changing the whole legal world, and that it would keep changing it over the coming years.
And I started to think about how this could affect my specific environment, how AI could affect courts. That's when I started studying online courts. What I first realized were all the benefits that AI could bring to our daily lives, and then I started studying the problems we might have using this new technology. So I decided to focus first on the topic of opacity and then on bias. That is my topic of study at the moment.
AYUSHI KHEMKA: Judge Ferrari then spoke specifically about how AI is being used in the Brazilian judicial system, describing a tool called Codex and some of the opportunities and challenges related to it.
ISABELA FERRARI: We have a new project that has been impressing me with its potential. It's called Codex. What happened in Brazil? Some years ago, we started looking at the situation of our lawsuits. Brazil is one of the countries with the most lawsuits in the world; I think only China compares. We now have a backlog of around 18 million lawsuits, so we needed to develop something to deal with the situation.
And the National Council of Justice started trying to analyze the Brazilian situation regarding lawsuits, collecting information on the lawsuits: the names of the parties, who the judge was, what the topic was, at which stage the lawsuit was, etc. Once this information started being gathered, we could understand our situation much better and could plan accordingly. At first, this information was delivered by each of the courts compiling a file.
So the civil servants compiled a file gathering this information and sent it to the National Council of Justice. Then the National Council of Justice thought, "Okay, but we could develop an AI system to gather this information, since almost 100 percent of lawsuits in Brazil are filed electronically." So they started developing this tool called Codex, an AI system that pulls in the information when it's published online.
So Codex gathers this information. In the first stage, Codex will only gather the macro information about the lawsuits, the most important information: as I told you, the names of the parties, who the judge is, what the topic is, what the situation of the lawsuit is, whether it is already at the appeal stage or at an initial stage, etc. With this information gathered, we can already think about a lot of intelligent solutions for the problems we have.
But the second stage is where things change completely. In the second stage, Codex intends to read the information within the lawsuit itself, the information that appears in the documents of the lawsuits. We want to do this with all the lawsuits in Brazil, including the ones in the Supreme Court, because the Supreme Court sits outside the control of the National Council of Justice, which controls all the jurisdictions beneath it.
If we can have information on all the lawsuits in Brazil, including the ones in the Supreme Court, at this level of density, let's say, being able to look inside the lawsuit to understand the arguments that are posed, which laws are effectively being applied, how long the lawsuit takes, which decisions are taken, how consistent our judges' decisions are, and so on, then, with all this information organized, that would be step one toward effectively doing data science within the judiciary.
So when this happens, we'll be able to change the question we ask today, which is, "What can we do with AI?" We should definitely start asking instead, "What shouldn't we do with AI?" Because when we have access to all of this information, we'll be able to do basically everything we want. We'll be able to draft decisions. We could build a robot judge, for example. We could do basically anything with this information.
It is a change that might affect not only the judiciary. It's important also to say that Codex allows us to change the logic of decision making from an individual logic ("I, Isabela, decide one lawsuit, two lawsuits, three, four, five, six, seven") to an algorithmic logic. What is decision making through an algorithmic logic? It is basically taking a big problem, unpacking that big problem into smaller problems, and thinking about solutions for each of these smaller problems. And once you have solutions for each of these small problems, you can train an algorithm to solve them for you.
So this is the change from human decision making to an algorithmic logic. It is something that people who have children in school are starting to become acquainted with, because schools are now starting to train children in these kinds of skills. So this could change the whole way we decide issues within the judiciary.
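For readers who want to see the decomposition Judge Ferrari describes in code, here is a minimal, hypothetical sketch in Python: an invented "triage" task is split into two small sub-problems (topic and procedural stage), each handled by its own tiny classifier. The data, labels, and models are illustrative assumptions only and are not drawn from Codex or any real court system.

```python
# A minimal, hypothetical sketch of the "algorithmic logic" Judge Ferrari describes:
# a big task (triaging a lawsuit) is unpacked into small sub-problems, and a
# separate model is trained for each one. All data and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Sub-problem 1: what is the topic of the lawsuit?
topic_texts = [
    "claim for unpaid overtime against employer",
    "dismissal without just cause, severance owed",
    "consumer seeks refund for defective phone",
    "airline denied boarding, passenger seeks damages",
]
topic_labels = ["labor", "labor", "consumer", "consumer"]

# Sub-problem 2: what procedural stage is the case at?
stage_texts = [
    "initial complaint filed by plaintiff",
    "complaint served, awaiting defendant's answer",
    "appeal lodged against first-instance judgment",
    "appellate briefs submitted for review",
]
stage_labels = ["first instance", "first instance", "appeal", "appeal"]

def train_subproblem(texts, labels):
    """Train one small classifier for one small, well-defined sub-problem."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression())
    model.fit(texts, labels)
    return model

topic_model = train_subproblem(topic_texts, topic_labels)
stage_model = train_subproblem(stage_texts, stage_labels)

# The "big problem" (triage) is then just the composition of the small answers.
new_case = "worker claims overtime pay, appeal filed against the judgment"
print("topic:", topic_model.predict([new_case])[0])
print("stage:", stage_model.predict([new_case])[0])
```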
AYUSHI KHEMKA: With Codex and similar tools becoming a reality, AI and its effect on the justice system is something that Judge Ferrari spends a lot of time thinking about. But huge questions remain about how we can check the racial, gender, class, and other biases that might come with this technology. Here's Judge Ferrari on some ways to think about these issues.
ISABELA FERRARI: So this is actually the million-dollar question. It would be really important if we could find out, at first look or at an initial moment, whether there is racial, gender, class, or another kind of bias in the decision that is suggested by AI. For now, what helps us is auditing. Through auditing you can analyze the data you have gathered and understand whether there appears to be a biased pattern in it. But the main problem is that auditing requires that a lot of decisions have already been taken on that matter.
That's how people found out there was a problem with COMPAS, an algorithm that was being used in Wisconsin to assess the likelihood that a defendant would flee the jurisdiction. After many years of it being used, they found out that it was biased against African Americans. So auditing is an answer, but it's not a good enough answer, because auditing demands that we let the system work for some time.
So I'd say that this is the answer we have for now. It would be really important if, before using these systems, we could have a really strong testing phase. We should do testing and monitoring before we let the systems decide. And this testing and monitoring phase should be longer when the systems are deciding things that are more important or closer to the core values of our society, like freedom, or decisions that could dramatically affect your life, or values of that kind.
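As a concrete illustration of the kind of audit Judge Ferrari describes, the sketch below compares flag rates and false-positive rates across demographic groups in an invented decision log. The column names, threshold, and data are assumptions for illustration only; they are not drawn from COMPAS or any real court dataset.

```python
# A minimal sketch of a disparity audit over a log of past algorithm-assisted
# decisions. All data and column names are hypothetical; a real audit (like the
# COMPAS analyses) would also control for legally relevant factors.
import pandas as pd

# Hypothetical decision log: one row per past case.
log = pd.DataFrame({
    "group":             ["A", "A", "A", "A", "B", "B", "B", "B"],
    "flagged_high_risk": [1, 1, 1, 0, 1, 0, 0, 0],   # algorithm's output
    "reoffended":        [0, 1, 0, 0, 1, 0, 1, 0],   # observed outcome
})

# Rate at which each group is flagged high risk.
flag_rate = log.groupby("group")["flagged_high_risk"].mean()

# False positive rate: flagged high risk but did not reoffend.
no_reoffense = log[log["reoffended"] == 0]
fpr = no_reoffense.groupby("group")["flagged_high_risk"].mean()

print("Flag rate by group:\n", flag_rate, "\n")
print("False positive rate by group:\n", fpr, "\n")

# A crude screening rule (threshold chosen arbitrarily for illustration).
if (fpr.max() - fpr.min()) > 0.2:
    print("Warning: large false-positive gap between groups; investigate for bias.")
```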
AYUSHI KHEMKA: I also asked Dr. Leins and Judge Ferrari if they can imagine an accountable AI, one that doesn’t discriminate or create new forms of injustice.
Here’s Judge Ferrari.
ISABELA FERRARI: Yeah, we can imagine something like that the same way we could imagine heaven. I mean, I can imagine what heaven is for me. It's probably very different from what heaven actually is, but I can have an idea. So an accountable AI is something that is very far from our reality, and the only thing we can do is have it as a target, knowing that the target is probably in a place very different from where it will be when we really get there, if we get there. But I would say that in this situation the only thing we can do is keep this target and take the next step.
I mean, it is as if we were in a dark place with a flashlight, or with an oil lamp if you want to be more poetic, and the only thing we can do is take the next step. We don't know exactly where we're heading, but we know we need to do something to get there. So instead of trying to imagine something that would be perfect, that is so far from what we have, I think our question should be, "What can I do now, with the information I have now, with the technology I have now, to get closer to this big aim of an accountable AI?" And that's how we get, step by step, closer to this big, great, wonderful, and so-far-from-reality idea.
AYUSHI KHEMKA: Dr. Leins agreed that we are extremely far from a reality in which there is accountable AI. Her answer then led to more questions about the justice system, automation, and terminology.
KOBI LEINS: It's an excellent question, and I'd probably start the response like a lot of other fellow experts in this area by saying that AI is not ever going to be accountable; humans are always going to be accountable, and the challenge for those working with these systems is: where does that accountability lie? Who is responsible for the building of, the repair of, the connection of, the use and application of these systems? Sheila Jasanoff refers to sociotechnical imaginaries. My sociotechnical imaginary is that we would ideally be in a world where those using these systems are held to account, where anyone who is harmed by or affected by a system can raise alarm bells, or whistleblow as it's more professionally known, and also, though I know this is a deeply unpopular view, where the area is regulated. Take the example of banking: I am now working in one of the most regulated industries, and yet it is also one of the industries that has adopted AI with the most speed, because where there are safety protections and guardrails, companies are then safe to use these systems.
Where there are no guardrails, concerns are constantly raised, there is outrage, and every day there's a new story. Today there was a story about AI that has been developed to identify skin color and is built into a lot of systems. Then we find that those systems are often in breach of the law, often problematic, and undermine human rights, which are things that obviously we don't want to see. So having those sorts of safeguards in place is necessary, because humans ultimately are the ones responsible for these systems, or should be.
AYUSHI KHEMKA: I'm thinking more in terms of the justice system; certainly companies have their own regulation systems and humans need to be accountable. Post COVID-19, there is a lot of conversation around regulating and automating many decisions so as to make things faster and process everything faster, and there's so much data out there. In those terms, whose responsibility is it to keep racial biases, gender biases, and other sorts of biases in check when it comes to AI?
KOBI LEINS: Well, as a counter-argument, I just came from a conference where, with a dead straight face, someone on a panel said, "Humans are biased and humans do horrible things all the time, so if these systems aren't worse, that's fine." To which there was an uproar and some fairly outraged feedback, and justifiably so, because the difference is that humans are governed by the law; the machines are not. It is different, so again, coming back to that human-centric view. But to your point, just tapping into one of those thoughts: we talk about biases, and something I think we don't talk about enough is how much we talk past each other. If you are a data scientist, you understand and know that bias is implicit in your work. Everything is ones and zeros, everything is coded. You are always preferencing something over something else.
And so with this idea of removing bias, I think we're often talking past each other. For data scientists, really, it's quite complicated. I'm seeing a lot of systems now where tools are being sold that are AI to review AI. It's turtles all the way down. How do you even start to critique this, when you've got systems that already have biases reviewing systems that have biases, including tolerances for things like racism and sexism? To the point of the speaker on the panel: it's okay if you're this racist, because this is how racist we generally are. That's not really how the law works, nor should it be. We have human rights, and we have various laws in many jurisdictions to prevent these kinds of things.
So I think one of the most important points is that we're often talking past each other when we use terminology. Having done a lot of work on the ISO standards with Standards Australia, there are really a lot of conversations that we're not having, not just about how we define these words, because that's a very legalistic approach, but about what we are actually trying to fix. I think a lot of this comes back to human rights and how we're thinking about complying with the fundamental rights of humans.
The other thing which I think we often don't talk about is who's driving these conversations about expediting these processes? We have more humans now than we've ever had in the history of humankind. We are going to have even more. There isn't a shortage of workforce so there's a particular economic imperative to minimize human labor. But at the same time, even as we're talking about doing that, a lot of these systems have a lot of hidden labor in them and people like Luke Munn and Kate Crawford have written about this, saying even though we say these things are automated, when you actually break it down, there's a lot of unseen, lowly paid labor, both in the systems and in the data, often in the data tagging as well.
So I'm not sure if that actually answers your question, but in terms of the actual justice system itself, which I think was your real question: France, years ago, prohibited any analytics on judicial decisions. We know that judges tend to make different decisions before and after lunch, for example. We know that they tend to make different decisions based on different factors, and being able to use data means we can analyze those. The French drew a line and said, "No, we don't want that kind of inspection of our work. We actually want the human element to remain."
Now, that itself could be criticized, because potentially there are biases there that are not being seen, but my main concern with automation in the justice system is that law is already an elite sport. Those who practice and understand law already come from a very narrow demographic in society generally, and that's probably true globally. If you add to that data science, and require both skills, understanding law and understanding data science, to interrogate systems that automate legal processes, we'd have an even more elite, inaccessible series of systems for people to ask about. And I would like people to be able to ask questions: Why is this system being used in this way? How is it affecting me? Why was that decision made that way? Questions which, in the case of a court with a human, you can ask.
On the other hand, I'm perfectly happy for automation to do the discovery work. I remember spending months as a junior lawyer in basements reviewing thousands of documents manually. If that's something that can be automated, then by all means, sure. But it's a very long answer to say I think we need to be really careful and thoughtful, not just looking at the processes but also looking at the meta level: Why are we automating? What is the driving force behind what we're trying to achieve?
AYUSHI KHEMKA: As Dr. Leins mentioned in this answer, there are ongoing issues in the field of AI related to the inaccessibility of the technology and the highly technical nature of these systems. This is something that Judge Ferrari thinks about as well. Here, she shares some thoughts on how to improve public knowledge of this technology.
ISABELA FERRARI: Usually when we talk about these advancements, there is information available. The problem, or the point, is not having information available, but being able to understand that the situation exists and that you should search for that information, and, more than that, being able to understand what you are being told. So I would say that there is a duty on developers, or on those who are using the technology, to make this information available. But we also need to educate our society to understand the importance of that, the importance of searching for that knowledge.
Because if you don't understand the changes that are happening in our society, you don't even know that you should search for that information, or that you should be aware of what is being done with your data, et cetera. So building this culture is something that must be taught not only within the judicial system but, much more than that, within our educational system, so that as a society we understand that change is happening and that we should adapt to this new moment.
And in this new moment we should be aware of these kinds of things. So whereas I see that many times there is information available, I perceive that many people do not look for that information because they have not yet understood that it is important information, that it is important to understand what is happening. So I would start with the educational system: educate people in a different way, because we're heading toward a digital society and this demands change.
AYUSHI KHEMKA: Dr. Leins expanded her answer on how we can make AI more accessible to larger societal questions, explaining why it's so important to begin having these conversations. This led into a wider discussion of data security, leading back to the justice system and, finally, wrapping up with cyber-resilience.
KOBI LEINS: How do you enable democracy? How do you support human rights? How do you have any kind of accountability around systems that only a very small elite is able to ask questions about? We're already at that point, not just in the justice system but in other systems, and that's part of the concern: who has control over these systems, over these conversations?
There are certain conversations, which I've written about for the Carnegie Council's AI & Equality Initiative, that are not being heard or are being silenced. Missy Cummings, who's a professor in this area, has been heavily criticized for talking about the shortcomings of these systems. There is a very utopian kind of view that these systems will solve all of our problems, which is deeply concerning because these systems bring their own problems with them, and part of that is a certain form of elitism that I think we're not really talking about or engaging with.
AYUSHI KHEMKA: I want to connect this to the other question that I had, and I think this would maybe overlap with the previous question that I've also asked, but I just want to pick your brains on the issue of data security and privacy.
So going back to legal systems, justice systems, and AI trying to automate decisions: with COVID, post-COVID, however we want to term it, there's a lot of data out there and online courts have become a thing, but we don't know where the data for those online courts is being stored. How can the legal system then manage such concerns of data security and data privacy? Who is using that data and what is it being used for?
KOBI LEINS: So I think there are two parts to your question. One is the data itself and the other is what the systems use or how the systems use the data. I'll start with the second question because it's actually easier to answer I think in some ways.
So a couple of years ago, I co-authored a paper with Tim Baldwin in response to a presentation at a natural language processing conference. Natural language processing, for those who don't do data science, is a form of data science that is used all the time: language recognition, finding language patterns and using them in increasingly sophisticated ways. At this conference, a group came forward and presented the idea that you could automate sentencing, up to and including the death penalty, without any human involvement at all. The conference didn't really have an ethics review committee at that point, just a review committee, but the reviewers were concerned by this proposal and suggested that the authors go away and think about some of the ethical implications of automating this kind of serious-impact legal work.
So the group went away, wrote a small paragraph on ethics, and the paper was presented as it was. As a result, concerns were raised in the community about this idea of automation, up to and including the death penalty, without any kind of review. What data was being taken into account? Was it previous convictions? What if you were potentially misidentified as someone else? Was it based on affiliation? Was it based on other behaviors outside of the courts? It wasn't clear, it wasn't transparent in terms of being fair and being just, and I think people often assume that law is just and equitable when it's often not.
So many questions around the actual process arose, and we addressed some of those, including by looking at other fields for guidance. In biology and chemistry, and in the nuclear field as well, there have already been these concerns about misuse. So what lessons could be learned about transparency, about upskilling those with the specialist expertise to think about risks, to think about dual use, unintended consequences, all those kinds of things?
In terms of the data, I still don't think many people have got their heads around how much data is actually out there, and particularly the link between civilian and military use; I've just come off the back of a conference about this. What we're clicking on and what we're looking at is being tracked, collated, and curated in the data broker ecosystem. Before it even gets to the courts or any kind of legal review, there is so much information out there about us, to your point, that although it is ostensibly de-identified, it can still be connected back to us, either as individuals or even as groups, which the Germans, for example, have a specific prohibition on in their constitution because of their history. So it's not just about me as an individual. It's also about what other people like me do and how they behave.
So then you bring that back in, zooming back into the justice system, and that raises an incredible number of risks in the public sphere. When it's in a justice system and you're talking about outcomes for people's lives, obviously the stakes are that much higher in terms of what a justice system knows about me from a data point of view. There are probably hundreds of thousands of data points about me on the web; even though I have all of my trackers locked down, I do public speaking and I have outward engagements. So how do you ensure that the correct ones are used, that the correct inferences are drawn, and then, to the first piece that I was covering, that the systems are using that data in relevant ways? And that's not even talking about the security of the data, which I think was your actual question: how do you ensure that that data is then not shared, that it's secure? There is a lot there. That's a really, really big question.
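To make the re-identification risk Dr. Leins mentions concrete, here is a minimal, hypothetical sketch: a "de-identified" table is joined to a public register on a few quasi-identifiers, and names are recovered. All records, columns, and values are invented for illustration.

```python
# A minimal sketch of a linkage attack: joining "de-identified" records to a
# public dataset on quasi-identifiers (postcode, birth year, gender) can
# recover identities. All data here is invented for illustration only.
import pandas as pd

# Hypothetical "de-identified" release (names removed, sensitive field kept).
deidentified = pd.DataFrame({
    "postcode":   ["3000", "3000", "2000"],
    "birth_year": [1980, 1975, 1990],
    "gender":     ["F", "M", "F"],
    "diagnosis":  ["asthma", "diabetes", "migraine"],
})

# Hypothetical public register (e.g., an electoral roll) with names.
public_register = pd.DataFrame({
    "name":       ["Ana Souza", "Bruno Lima", "Carla Dias"],
    "postcode":   ["3000", "3000", "2000"],
    "birth_year": [1980, 1975, 1990],
    "gender":     ["F", "M", "F"],
})

# The join on quasi-identifiers re-attaches names to the "anonymous" records.
relinked = deidentified.merge(
    public_register, on=["postcode", "birth_year", "gender"], how="inner"
)
print(relinked[["name", "diagnosis"]])
```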
AYUSHI KHEMKA: No, I'm sure, but this definitely does explain a lot of it. I would like to sum up by asking you my final question. I read your work around cybersecurity and cyber-resilience, so could you talk about that a bit? What does it mean? How can we practice cybersecurity and cyber-resilience as general laypersons with no practical, academic, or industry knowledge of either?
KOBI LEINS: Well, to a certain extent, I think we need to flip the question. I think it's on governments to educate citizens about data literacy, and Estonia is a really good example: it has been directly in the line of needing cyber-resilience, and so it has run some fairly simple and straightforward courses that have upskilled its population. This doesn't answer your part of the question, but I think one of the biggest things that's missing in cyber-resilience and security, and it is becoming more apparent, is the need for diversity. Diversity is not a nice-to-have; it's actually absolutely imperative.
I have this flashback to the launch of Germany's AI strategy, where there was a panel on security made up of nine white men. Now, you're not going to capture all the risks with that, and I'm not saying there weren't different views among that panel, but many minoritized groups have very different perspectives on what the risks are. It took me years to recognize that a lot of my own perception of risk comes from being half German and being raised with this acute awareness of the role of IBM and the use of punch cards in tracking people during the Holocaust, of the way that you can collect and misuse data.
In terms of individuals, I think it's really hard. My parents don't understand and don't know how to protect themselves, so it's a combination: some of this needs to come from higher authorities, and some of it needs to be trained. I don't think we have anywhere near the data literacy or the cyber literacy that we should have in our societies, particularly given the impact this is having, again, on democracy, as we see people not only being pushed to extremes but also, as a paper that came out more recently showed, being pushed into bubbles, so that we are just hanging out with the same people, having the same conversations, not challenging ourselves.
So beyond the practical questions of where data goes and how data can be hacked, there's also the question of how our behaviors are being changed by our online activity, which I think is just as important. Sorry, I don't have a simple answer. I wish I could say, "Take your grandma to this website and it will all be good," but actually it's really complicated, and I think most people just don't understand what's at stake.
AYUSHI KHEMKA: That was Dr. Kobi Leins and Judge Isabela Ferrari on all things AI and accessibility. Thanks for listening.
CORDEL GREEN: The AI4IA Conference and the podcast series are being hosted in collaboration with AI4Society and the Kule Institute for Advanced Studies, both at the University of Alberta, the Centre for New Economic Diplomacy at the Observer Research Foundation in India, and the Broadcasting Commission of Jamaica.