Once upon a time we used to say “read my lips”. Well, those times will soon be gone. Say thanks to AI and Deep Learning.
Several research teams around the world have been working on rendering lip movements and animating facial expressions so that they are in sync with the words being spoken.
One application would be to help people with hearing impairments read the spoken words on the lips of a character reconstructed at the receiving end. The calling party speaks on a normal phone, and the receiving party can look at the lips to “see” the words being spoken. The face can even be the real face of the caller, taken from a photo.
Another application can be in gaming and entertainment: making cartoon characters talk with realistic facial expressions and correct lip movements, cutting production costs and speeding up development.
Voice can also be imitated by software, to the point of becoming indistinguishable from the original, with all the inflections and habits typical of that particular person.
To achieve these amazing feats (watch the video), researchers analyse video clips of that person and her voice. The analysis is done with Artificial Intelligence and Deep Learning methods, leading to amazing fidelity. Now, this is all great, but…
How can you tell whether what you see and hear is actually coming from the person you see and hear? Basically, you can’t. Researchers say that the only way to catch the fake is to run AI and Deep Learning applications to analyse the video: they would be way smarter than us at detecting the fraud. Well, it is the old adage: it takes a thief to catch a thief… Nothing really new under the Sun.
The only difference is that there are many more thieves than in the past, and we are the ones unleashing them!