AI for protein folding

A protein is a complex molecule whose interaction with other molecules depends on the way it folds. Image credit: AlphaFold, Google

Proteins are a crucial component of our body. It is always difficult to make rankings, but if you want one I would say that proteins take first place in terms of importance. A protein consists of a long sequence of amino acids. There are 20 different amino acids, and they can be combined in very many ways to form a protein. And yes, proteins are the result of the transcription of the DNA code in our genome into (messenger) RNA, which is then turned into a specific protein by the ribosomes in our cells.
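To give a feeling for what "very many ways" means, here is a minimal sketch that counts the possible sequences of a given length. It assumes that every position in the chain can hold any of the 20 standard amino acids; the function name `count_sequences` is made up for this illustration.

```python
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_sequences(length: int) -> int:
    """Count the distinct amino acid chains of the given length.

    Assumption: each position is chosen independently among the
    20 standard amino acids, so the count is 20 ** length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return len(AMINO_ACIDS) ** length


if __name__ == "__main__":
    for n in (2, 10, 100):
        # Scientific notation keeps the larger counts readable.
        print(f"length {n}: {count_sequences(n):.3e} possible sequences")
```

Even a short chain of 100 amino acids allows about 1.27 × 10^130 different sequences, and many proteins are considerably longer than that.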

The sequence of the amino acids is very important because it results in a very specific folding of the protein (see the ribbon-like structures in the image). Understanding this folding means understanding the protein, since the folding gives a “shape” to the protein and this “shape” is what makes the protein able to connect to other molecules. Viruses and bacteria attack specific proteins in the cell membrane, and many drugs work by creating a sort of shield that does not allow the noxious agent to attach to the protein.

Today, and more so in the future, drugs are designed using computers that look for places in proteins (ours and those of bacteria and viruses) where a drug can be effective.

The problem is that working out the folding of a protein is extremely difficult and requires supercomputer support.

Back in 2005 software from the Rosetta Commons was released by the Baker Lab at the University of Washington to harness the processing power of computers all around the world, and by 2016 the project comprised over sixty thousand volunteer computers (owned by private individuals) delivering an average of 210 TeraFLOPS. If you want to participate in the project, called Rosetta@Home, take a Rosetta tutorial.

The Human Proteome Project aims to create an atlas of all the proteins we have in our body, and part of this endeavour is to identify their folding.

Now Google has announced that DeepMind’s artificial intelligence can be used to predict the folding of proteins, thus greatly accelerating the study of the human proteome and the research for new drugs. Predicting the folding by looking at the amino acid sequence is extremely complex, since each amino acid interacts with all the others and, as the protein folds, the strength of each of these interactions changes.
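The "each one interacts with all the others" part can be made concrete with a little arithmetic. The sketch below only counts the pairs of amino acids in a chain, on the simplifying assumption that every pair counts as one interaction; it does not model the physics in any way, and the name `count_pairs` is made up for this illustration.

```python
def count_pairs(n_residues: int) -> int:
    """Count the distinct pairs of amino acids in a chain.

    Assumption: every amino acid interacts with every other one,
    so a chain of n residues has n * (n - 1) / 2 pairs.
    """
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * (n_residues - 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"{n} amino acids -> {count_pairs(n)} pairwise interactions")
```

The number of pairs grows with the square of the chain length, and since the strength of each interaction shifts as the protein folds, all of them have to be re-evaluated again and again.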

So far, evaluating the possible foldings of a protein has been a trial-and-error endeavour. Using artificial intelligence to get hints on the possible foldings greatly reduces the number of trials.

DeepMind trained its system on the large, and growing, amount of data from genome sequencing. These data were analysed using deep learning techniques, resulting in AlphaFold, a program that can be used to predict the possible folding of a protein with a given amino acid sequence. The program was entered into the CASP (Critical Assessment of Structure Prediction) competition and was ranked first! (AlphaFold is labelled A7D in the contestants’ list.)

This is just one more area where artificial intelligence is gaining an edge over us… What is impressive is that the intelligence results from the capability of the machine to learn from data, rather than from being programmed.

About Roberto Saracco

Roberto Saracco fell in love with technology and its implications a long time ago. His background is in math and computer science. Until April 2017 he led the EIT Digital Italian Node and then was head of the Industrial Doctoral School of EIT Digital up to September 2018. Previously, up to December 2011, he was the Director of the Telecom Italia Future Centre in Venice, looking at the interplay of technology evolution, economics and society. At the turn of the century he led a World Bank-Infodev project to stimulate entrepreneurship in Latin America. He is a senior member of IEEE where he leads the New Initiative Committee and co-chairs the Digital Reality Initiative. He is a member of the IEEE in 2050 Ad Hoc Committee. He teaches a Master's course on Technology Forecasting and Market Impact at the University of Trento. He has published over 100 papers in journals and magazines and 14 books.