
The sunrise of digital media forgery, and the sunset of trust

on Fri, 06/08/2018 - 17:59

2018 - 2020 will be remembered as the dawn of an era when the only way to trust what you hear or see someone say is to witness it in person.  It will be the sunset of the era when you could trust that the words of a speaker captured and rebroadcast digitally are actually what that person said.

This is because the application of machine learning and other artificial intelligence techniques is advancing into areas we (society) have historically taken for granted. Three things combined over the last several days to give me pause.

  • Manipulated video.  Researchers collaborating across institutions have created software that can process video of one person speaking and accurately render that person's facial expressions onto another person's face. The dry, scholarly demonstration video is jaw-dropping to watch. This technology can be used to create a video that convincingly makes it appear that a person said something they never actually said.

Imagine, for instance, that a political action committee (PAC) for any political office used this to fabricate video of an opponent saying something damning that the opponent did not in fact say.  Even if the PAC owned up and said "This is just a simulation" in text below the image, voters could easily be left with the perception that the candidate actually said the damning thing.

The same technique could also be used to fabricate evidence in legal cases, or to ruin the reputations of innocent, otherwise-upstanding people.  For anyone with enough motivation, there will soon be a technique to fabricate reality.

  • Manipulated audio.  Last week I heard the investor "pitch" from a startup seeking capital from an angel group to which I belong. This company can do for the spoken word what the group above does for video: its software can convert the speech of one person into the speech of another quite accurately.  The active speaker talks using their own words, rhythm, and pace, and the software creates a resulting audio file that renders that speech in the voice of, say, Jeff Bezos. Or Barack Obama. Or any other speaker from whom the company can obtain a small, clean sample of recorded speech on which its machine learning algorithms can train.

Imagine a rogue state (nation) creating audio of another nation's leader saying something damning, and combining that with the manipulated video technique above. Now it is not simply the image accurately reflecting the fabricated dialog; the voice you hear on the "recorded" video sounds exactly like the leader the rogue state is trying to damage. Post it on YouTube, and the damage is done.

And the same legal and reputational risks described above apply here, too.

  • Fake news allegations. The use of the term "fake news" to generate mistrust in media reporting has already eroded consumer trust across the world. The ability to simply allege bias on the part of a reporter has raised the bar for the accuracy and diligence a reporter must exercise before telling a story.  And at the very moment that deeper diligence is required, the reporter's ability to trust what she sees in digital form is rapidly disappearing.

To their credit, the startup that pitched the manipulated-audio technology is adding an (inaudible) audio watermark to its output, and providing tools to detect whether a given audio file carries that watermark; a sketch of how such detection might work appears below. However, it must be stated that if these founders can build this technology, others can, too - and those others are under no obligation to add the watermark. The rogue state actor described above would have no interest in adding such fraud detection.
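
The pitch did not disclose how the startup's watermark actually works, but one classic, simple approach is spread-spectrum watermarking: add a very quiet pseudorandom signal derived from a secret key, then detect it later by correlating against that same keyed signal. The Python sketch below is purely my own illustration of that idea; every name and parameter here is hypothetical, not the company's actual scheme.

```python
import numpy as np

def embed_watermark(samples: np.ndarray, key: int, strength: float = 0.005) -> np.ndarray:
    """Add a keyed, inaudibly quiet +/-1 pseudorandom sequence to mono audio."""
    rng = np.random.default_rng(key)
    chips = rng.choice([-1.0, 1.0], size=samples.shape[0])
    return samples + strength * chips

def watermark_score(samples: np.ndarray, key: int) -> float:
    """Z-score of the correlation between the audio and the keyed sequence.

    Hovers near 0 for unmarked audio; grows like sqrt(len(samples)) when
    the watermark is present, so a score above ~4 is strong evidence.
    """
    rng = np.random.default_rng(key)
    chips = rng.choice([-1.0, 1.0], size=samples.shape[0])
    return float(chips @ samples / (samples.std() * np.sqrt(samples.shape[0])))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = 0.2 * rng.standard_normal(5 * 16_000)  # stand-in for 5 s of 16 kHz speech
    marked = embed_watermark(audio, key=1234)
    print(watermark_score(audio, key=1234))    # near 0: no watermark
    print(watermark_score(marked, key=1234))   # roughly 7: watermarked
```

The detector's score grows with the square root of the number of samples, which is why even a watermark far too quiet to hear becomes statistically unmistakable over a few seconds of audio. It also makes plain the weakness noted above: a forger who controls the software can simply omit the embedding step.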

This can certainly leave us concerned. Just at the moment when many of us feel unmoored, unable to trust institutions, we often turn to seeing "what somebody actually said" - as opposed to what they are rumored or reported to have said - as the way to ground our beliefs.  I fear that ability to get properly grounded is about to disappear.

This leaves me with two final things to consider:

  1. I don't know how we re-create trust in digital media.  (And by "media", I mean the files, not the institutions.) And no, I'm not at all thinking blockchain solves the problem.
  2. The next IT security software companies to get really big will be those that provide software to detect digital media forgeries.
