Technology is opening up unexpected possibilities that, in turn, may lead to undesired results.
This is surely the case with the results obtained in a collaboration between Princeton and Adobe: the editing of voice messages.
The joint team has created a voice editor, VoCo, that works much like a text editor. With a text editor you can scroll to, or search for, a specific word and then delete it or replace it with another word. VoCo lets you manipulate spoken messages in the same way. Take a look at the clip and see what can be done.
Although it has been possible for many years to edit digital sound, the editing of voices has proven challenging, probably because we are so good at detecting nuances in voices that any glitch, no matter how tiny, is immediately noticed. The problem is that there is no single “sound” corresponding to a given word. The sound depends on which other words are in the sentence and on the emotion the speaker attaches to it. Basically, every sentence we utter has its own “music”, and this is reflected in the words composing it. Hence editing a voice message requires the capability to extract, in a way, the emotion impressed by the speaker, and to reinsert that “emotion” into the word you are substituting. Likewise, you cannot delete a word by just deleting its sound. Anybody would immediately spot that there is something wrong in that sentence.
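To get a feeling for why simply cutting out a word's sound does not work, here is a minimal toy sketch in Python. It has nothing to do with how VoCo works internally: it uses a plain sine tone instead of a voice, and the sample rate, the frequency and the helper names (tone, max_step, naive_delete) are all assumptions made up for this illustration. It only shows that butting two pieces of audio together can create a jump in the waveform far larger than anything in the original, which is the kind of glitch our ears catch at once.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this illustration)
FREQ = 220.0        # tone frequency in Hz (assumed for this illustration)


def tone(n_samples):
    """Return n_samples of a pure sine tone as a list of floats."""
    return [math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE)
            for i in range(n_samples)]


def max_step(samples):
    """Largest absolute jump between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


def naive_delete(samples, start, end):
    """Cut samples[start:end] out and butt the remaining pieces together."""
    return samples[:start] + samples[end:]


original = tone(800)
# Remove 18 samples, roughly half a period of the tone, from the middle.
edited = naive_delete(original, 300, 318)

print("largest step in the original:", max_step(original))
print("largest step after the cut:  ", max_step(edited))
```

The second number printed is several times larger than the first: the waveform jumps abruptly at the splice point. Real speech is far more complex than a single tone, so hiding the seam takes much more than avoiding such a jump.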
VoCo is based on a sophisticated algorithm that learns the characteristics of the speaker in that specific sentence and is able to replicate them when a new word is introduced, seamlessly joining the word to the previous and following ones.
Now, all this is great. At the same time, it is not difficult to imagine some bad guy hacking your voice and introducing changes that may completely reverse the sense of the sentence and the intention of the speaker. And yet, the voice is that of the person you know; to your ears, it is he who said that.
In the US people sometimes say “read my lips”. Well, that might be the answer to detecting that someone has tampered with your voice: your lips will no longer be in sync… Unfortunately, a program developed at Disney Research, FaceDirector, can take care of that, changing the facial expression and the lip movements to synchronise the video with the sound!
We are really entering a new dimension where the boundary between the virtual and the real is blurred and trust is challenged ever more.