
When AI Can Fake Reality, Who Can You Trust? | Sam Gregory | TED

TED

12m 3s · 1,916 words · ~10 min read
Auto-generated transcript

[0:04] It's getting harder, isn't it? To spot real from fake, AI-generated from human-generated. With generative AI, along with other advances in deep fakery, it doesn't take many seconds of your voice, many images of your face, to fake you, and the realism keeps increasing. I first started working on deepfakes in 2017, when the threat to our trust in information was overhyped, and the big harm in reality was falsified sexual images. Now that problem keeps growing, harming women and girls worldwide. But with advances in generative AI, we're also approaching a world where it's broadly easier to make fake reality, but also to dismiss reality as possibly faked. Now, deceptive and malicious audiovisual AI is not the root of our societal problems, but it's likely to contribute to them. Audio clones are proliferating in a range of electoral contexts. "Is it or isn't it?" claims cloud human rights evidence from war zones. Sexual deepfakes target women in public and in private, and synthetic avatars impersonate news anchors. I lead WITNESS. We're a human rights group that helps people use video and technology to protect and defend their rights. And for the last five years, we've coordinated a global effort, "Prepare, Don't Panic," around these new ways to manipulate and synthesize reality, and on how to fortify the truth of critical frontline journalists and human rights defenders. Now, one element in that is a deepfake rapid-response task force, made up of media forensics experts and companies who donate their time and skills to debunk deepfakes and claims of deepfakes. The task force recently received three audio clips, from Sudan, West Africa and India. People were claiming that the clips were deepfaked, not real. In the Sudan case, experts used a machine-learning algorithm trained on over a million examples of synthetic speech to prove, almost without a shadow of a doubt, that it was authentic.
In the West Africa case, they couldn't reach a definitive conclusion because of the challenges of analyzing audio taken from Twitter, with background noise. The third clip was leaked audio of a politician from India. Nilesh Christopher of Rest of World brought the case to the task force. The experts used almost an hour of samples to develop a personalized model of the politician's authentic voice. Despite his loud and fast claims that it was all falsified with AI, the experts concluded that it was at least partially real, not AI. As you can see, even experts cannot rapidly and conclusively separate true from false, and the ease of calling "that's deepfaked" on something real is increasing. The future is full of profound challenges, both in protecting the real and detecting the fake. We're already seeing the warning signs of this challenge of discerning fact from fiction. Audio and video deepfakes have targeted politicians, major political leaders in the EU, Turkey and Mexico, and US mayoral candidates. Political ads are incorporating footage of events that never happened, and people are sharing AI-generated imagery from crisis zones, claiming it to be real. Now, again, this problem is not entirely new. The human rights defenders and journalists I work with are used to having their stories dismissed, and they're used to widespread, deceptive shallow fakes: videos and images taken from one context, time or place and claimed as if they're from another, used to sow confusion and spread disinformation. And of course, we live in a world that is full of partisanship and plentiful confirmation bias. Given all that, the last thing we need is a diminishing baseline of the shared, trustworthy information upon which democracies thrive, where the specter of AI is used to plausibly believe things you want to believe and plausibly deny things you want to ignore. But I think there's a way we can prevent that future if we act now.
If we prepare and don't panic, we'll make our way through this, somehow. Panic won't serve us well. It plays into the hands of governments and corporations who will abuse our fears, and into the hands of people who want a fog of confusion and will use AI as an excuse. How many people were taken in, just for a minute, by the Pope in his dripped-out puffer jacket? You can admit it. More seriously, how many of you know someone who's been scammed by an audio that sounds like their kid? And for those of you who are thinking, "I wasn't taken in; I know how to spot a deepfake": any tip you know now is already outdated. Deepfakes didn't used to blink; they do now. Six-fingered hands used to be more common in deepfake land than in real life; not so much anymore. Technical advances erase the visible and audible clues that we so desperately want to hang on to as proof that we can discern real from fake. But it also really shouldn't be on us to make that guess without any help. Between real deepfakes and claimed deepfakes, we need big-picture, structural solutions. We need robust foundations that enable us to discern authentic from simulated, tools to fortify the credibility of critical voices and images, and powerful detection technology that doesn't raise more doubts than it fixes. There are three steps we need to take to get to that future. Step one is to ensure that detection skills and tools are in the hands of the people who need them. I've talked to hundreds of journalists, community leaders and human rights defenders, and they're in the same boat as you and me. They're listening really closely to the audio, trying to think, "Can I spot a glitch?" Looking at the image, saying, "Does that look right or not?" Or maybe they're going online to find a detector, and with the detector they find, they don't know whether they're getting a false positive, a false negative or a reliable result. Here's an example: I used a detector, and it got the Pope in the puffer jacket right.
But then, when I put in the Easter Bunny image that I had made for my kids, it said that it was human-generated. This is because of some big challenges in deepfake detection. Detection tools often only work on one single way to make a deepfake, so you need multiple tools, and they don't work well on low-quality social media content. A confidence score of 0.76, or 0.87: how do you know whether that's reliable if you don't know whether the underlying technology is reliable, or whether it works on the manipulation that has been used? And tools to spot an AI manipulation don't spot a manual edit. These tools also won't be available to everyone. There's a trade-off between security and access, which means that if we make them available to anyone, they become useless to everybody, because the people designing the new deception techniques will test them on the publicly available detectors and evade them. But we do need to make sure these tools are available to the journalists, the community leaders and the election officials globally who are our first line of defense, thought through with attention to real-world accessibility and use. Even in the best circumstances, detection tools will be only 85 to 90 percent effective. They have to be in the hands of that first line of defense, and they're not right now. So in step one, I've been talking about detection after the fact. Step two: AI is going to be everywhere in our communication, creating, changing, editing. It's not going to be a simple binary of "yes, it's AI" or "phew, it's not." AI is part of all of our communication, so we need to better understand the recipe of what we're consuming. Some people call this content provenance and disclosure. Technologists have been building ways to add invisible watermarking to AI-generated media. They've also been designing ways, and I've been part of these efforts, within a standard called C2PA, to add cryptographically signed metadata to files.
This means data that provides details about the content, cryptographically signed in a way that reinforces our trust in that information. It's an updating record of how AI was used to create or edit it, where humans and other technologies were involved, and how it was distributed. It's basically a recipe and serving instructions for the mix of AI and human that's in what you're seeing and hearing. And it's a critical part of a new, AI-infused media literacy. This actually shouldn't sound that crazy; our communication is already moving in this direction. If you're like me, you can admit it: you browse your TikTok For You page, and you're used to seeing videos that have an audio source, an AI filter, a green screen, a background, a stitch with another edit. This, in some sense, is the alpha version of this transparency in some of the major platforms we use today. It's just that it does not yet travel across the internet; it's not reliable, it's not updatable, and it's not secure. Now, there are also big challenges in this type of infrastructure for authenticity. As we create these durable signs of how AI and human were mixed, signs that carry across the trajectory of how media is made, we need to ensure they don't compromise privacy or backfire globally. We have to get this right. We can't oblige a citizen journalist filming in a repressive context, or a satirical maker using novel generative AI tools to parody the powerful, to disclose their identity or personally identifiable information in order to use their camera or ChatGPT. Because it's important that they be able to retain their anonymity at the same time as the tools they use to create are transparent. This needs to be about the "how" of AI-human media-making, not the "who." This brings me to the final step.
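To make the "signed recipe" idea concrete, here is a minimal sketch of tamper-evident provenance metadata. This is not the actual C2PA manifest format (the real standard uses certificate-based signatures and a much richer manifest structure embedded in the media file); the function names, the demo key, and the manifest fields are all hypothetical, chosen only to show why signing both the content hash and the edit history makes after-the-fact tampering detectable.

```python
# Illustrative sketch only -- NOT the C2PA format. It demonstrates the idea of
# a cryptographically signed "recipe": a content hash plus an edit record,
# signed so that changing either the media or the record breaks verification.
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # hypothetical; real systems use asymmetric keys and certificates


def make_manifest(media_bytes: bytes, edits: list[str]) -> dict:
    """Bundle a content hash with a record of how the media was made, then sign it."""
    manifest = {
        "content_sha256": hashlib.sha256(media_bytes).hexdigest(),
        "edits": edits,  # e.g. ["camera capture", "AI background fill"]
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["signature"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return manifest


def verify(media_bytes: bytes, manifest: dict) -> bool:
    """Check that the signature is valid AND the media still matches the recorded hash."""
    claimed = dict(manifest)
    signature = claimed.pop("signature")
    payload = json.dumps(claimed, sort_keys=True).encode()
    ok_sig = hmac.compare_digest(
        signature, hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    )
    ok_hash = claimed["content_sha256"] == hashlib.sha256(media_bytes).hexdigest()
    return ok_sig and ok_hash


media = b"example-image-bytes"
m = make_manifest(media, ["camera capture", "AI filter applied"])
print(verify(media, m))        # True: media and its recipe are intact
print(verify(b"tampered", m))  # False: content no longer matches the manifest
```

The key design point, which the talk's "updating record" phrase captures, is that disclosure travels with the file: anyone with the verification key can check the recipe, and in a certificate-based scheme like the real standard, later edits append new signed entries rather than silently rewriting history.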
None of this works without a pipeline of responsibility that runs from the foundation models and the open-source projects, through to the way they are deployed into systems, APIs and apps, to the platforms where we consume media and communicate. I've spent much of the last 15 years fighting, essentially, a rearguard action, like so many of my colleagues in the human rights world, against the failures of social media. We can't make those mistakes again in this next generation of technology. What this means is that governments need to ensure that within this pipeline of responsibility for AI, there is transparency, accountability and liability. Without these three steps, detection for the people who need it most, provenance that is rights-respecting, and that pipeline of responsibility, we're going to get stuck looking in vain for the six-fingered hand or the eyes that don't blink. We need to take these steps; otherwise, we risk a world where it gets easier and easier both to fake reality and to dismiss reality as potentially faked. And that is a world that the political philosopher Hannah Arendt described in these terms: "A people that no longer can believe anything cannot make up its own mind. It is deprived not only of its capacity to act but also of its capacity to think and to judge. And with such a people you can then do what you please." That's a world I know none of us want, and that I think we can prevent. Thanks.
