Beware of the potato chip bag

"Any sufficiently advanced technology is indistinguishable from magic". 
I couldn’t help but remember this phrase from Arthur C. Clarke as I read the MIT news story on a way to extract sound from video.
Through ever better signal processing, researchers have been able to extract information on your pulse from a cell phone camera filming your face. The tiny changes in hue that your skin undergoes, as more or less blood flows through the capillaries of your face with each heartbeat, provide the information required. That was an amazing feat, but it did not strike me as "impossible".
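To give a feel for how a pulse can come out of a video, here is a minimal Python sketch. It is not the researchers' method. It assumes the video has already been reduced to one number per frame (say, the average green value of the face region, simulated here with a synthetic signal), and it simply looks for the heart rate whose sine wave best matches that series. The names `estimate_bpm`, `FPS` and `TRUE_BPM` are made up for this illustration.

```python
import math

def estimate_bpm(samples, fps, lo_bpm=40, hi_bpm=180):
    """Return the beats-per-minute value whose sinusoid best matches the samples."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # drop the constant skin tone
    best_bpm, best_power = None, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        freq = bpm / 60.0  # candidate heart rate in Hz
        re = sum(c * math.cos(2 * math.pi * freq * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im  # how strongly this rate is present
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm

# Synthetic stand-in (an assumption) for the per-frame mean green value:
# a steady skin tone of 120 plus a tiny 72 bpm ripple, 10 seconds at 30 fps.
FPS = 30
TRUE_BPM = 72
frames = [120.0 + 0.5 * math.sin(2 * math.pi * (TRUE_BPM / 60.0) * i / FPS)
          for i in range(FPS * 10)]

estimated = estimate_bpm(frames, FPS)
print(estimated)
```

A real recording would be far noisier (motion, lighting changes, compression), so this only illustrates the principle: the heartbeat hides in the video as a very small periodic ripple.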
But now further sophistication in signal processing has allowed a joint team of MIT, Microsoft and Adobe researchers to extract sound information by looking at the extremely tiny vibrations that sound waves create on objects, like a simple potato chip bag. And this is really magic!
To reconstruct the sound, the camera has to film the object at a frame rate higher than the sound frequency one wants to recover (by the sampling theorem, at least twice that frequency for a faithful reconstruction). In the experiment they placed a person in a soundproof room behind soundproof glass and filmed the potato chip bag from outside the room, at a distance of 5 m, as the person talked. They then analysed the micro-vibrations of the potato chip bag and reconstructed the sound: the voice of that person.
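The frame rate requirement can be made concrete with a small Python sketch. It relies only on the standard sampling theorem, not on anything specific to the researchers' software, and the function names are invented for this example. It shows that a 440 Hz tone filmed at 60 frames per second produces exactly the same samples as a 20 Hz tone, so the camera cannot tell the two apart, while at 2,000 frames per second the tone keeps its true frequency.

```python
import math

def max_recoverable_hz(fps):
    """Highest tone a camera at `fps` can capture without ambiguity."""
    return fps / 2.0

def apparent_frequency(tone_hz, fps):
    """Frequency a pure tone seems to have once sampled at `fps` frames per second."""
    return abs(tone_hz - round(tone_hz / fps) * fps)

def sample_tone(tone_hz, fps, count):
    """Value of a cosine tone at each of the first `count` frames."""
    return [math.cos(2 * math.pi * tone_hz * n / fps) for n in range(count)]

print(max_recoverable_hz(60))         # 30.0
print(apparent_frequency(440, 60))    # 20
print(apparent_frequency(440, 2000))  # 440

# At 60 fps the 440 Hz tone and a 20 Hz tone yield the same frame values.
fast = sample_tone(440, 60, 12)
slow = sample_tone(20, 60, 12)
indistinguishable = all(abs(a - b) < 1e-9 for a, b in zip(fast, slow))
print(indistinguishable)
```

Human speech carries energy from roughly a hundred hertz up to several kilohertz, which is why recovering intelligible voice calls for frame rates in the thousands.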
This requires a special camera, capable of frame rates on the order of 60 thousand frames per second, well above what we have in our smartphones but below what the fastest special cameras can do (up to 100,000 frames per second).
They also showed that with a normal camera filming at 60 frames per second (something some cell phones can do today) they can derive a signature of a person's voice. That signature lets them tell whether a man or a woman was talking, and even match the sound against the voice of a specific person whose signature is on file. This is not easy at all, since 60 frames per second is really low, but the researchers found a way of examining neighbouring pixels that effectively multiplies the sampling frequency.
Also, the detection of such tiny vibrations is a real challenge. Clearly it is beyond our 
Interestingly, just by looking at the video they have also been able to determine the material the object is made of, e.g. whether the potato chip bag is made of plastic, tin or paper foil. It turns out that different materials absorb sound waves differently, and each one produces a specific signature that leads to the identification of the substance.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE, where he leads the New Initiative Committee and co-chairs the Digital Reality Initiative. He is a member of the IEEE in 2050 Ad Hoc Committee. He teaches a Master's course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.