Proteins are a crucial component of our body. Rankings are always difficult, but if you want one I would put proteins in first place in terms of importance. A protein consists of a long sequence of amino acids. There are 20 different amino acids, and they can be combined in a huge number of ways to form a protein. And yes, proteins are the result of the transcription of the DNA code in our genome into (messenger) RNA, which is then translated into a specific protein by the ribosomes in our cells.
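To make the DNA to RNA to protein chain a bit more concrete, here is a toy sketch in Python. It is only an illustration: the DNA string is made up, the function names are my own, and the codon table lists just six of the 64 codons of the standard genetic code (enough for this example).

```python
# Toy illustration of DNA -> mRNA -> protein.
# Assumption: only a handful of the 64 codons are listed; a real table covers all of them.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "GCU": "A",  # alanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Turn the coding strand of DNA into mRNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three letters at a time, roughly as a ribosome does."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTGGCAAATGGTAA"  # made-up example sequence
mrna = transcribe(dna)
print(mrna)             # AUGGCUGGCAAAUGGUAA
print(translate(mrna))  # MAGKW
```

Each letter in the output stands for one amino acid, so this tiny "protein" is a chain of five. Real proteins are typically hundreds of amino acids long.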
The sequence of amino acids is very important because it determines the very specific way the protein folds (see the ribbon-like structures in the image). Understanding this folding means understanding the protein, since the folding gives the protein a “shape”, and this “shape” is what lets the protein connect to other molecules. Viruses and bacteria attack specific proteins in the cell membrane, and many drugs work by creating a sort of shield that does not allow the noxious agent to attach to the protein.
Today, and even more so in the future, drugs are designed using computers that look for places in proteins (ours and those of bacteria and viruses) where a drug can be effective.
The problem is that working out the folding of a protein is extremely difficult and requires the support of supercomputers.
Back in 2005 a piece of software from the Rosetta Commons was released by the Baker Lab at the University of Washington to harness the processing power of computers all around the world, and by 2016 the project comprised over sixty thousand volunteer computers (belonging to private individuals) delivering an average of 210 TeraFLOPS. If you want to participate in the project, called Rosetta@Home, take a Rosetta tutorial.
The Human Proteome Project aims to create an atlas of all the proteins we have in our body, and part of this endeavour is to identify how they fold.
Now Google has announced that DeepMind's artificial intelligence can be used to predict the folding of proteins, thus greatly accelerating the study of the human proteome and the research for new drugs. Predicting the folding by looking at the amino acid sequence is extremely complex, since each amino acid interacts with all the others and, as the protein folds, the strength of each of these interactions changes.
So far, evaluating the possible foldings of a protein has been a trial-and-error endeavour. Using artificial intelligence to get hints on the possible foldings greatly reduces the number of trials.
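To get a feeling for why hints matter so much, here is a back-of-the-envelope sketch in Python. The numbers are my own assumptions, not figures from DeepMind: I pretend that each link between two neighbouring amino acids can take only 3 orientations, and that a "hint" simply pins down some of those links in advance. Real folding is far messier, and this is not how AlphaFold works internally; it only shows how quickly the number of trials shrinks.

```python
# Assumption (not from the article): each link between two neighbouring
# amino acids can take one of 3 orientations.
STATES_PER_LINK = 3


def count_shapes(n_amino_acids: int, states_per_link: int = STATES_PER_LINK) -> int:
    """Number of shapes to try if every link is free (pure trial and error)."""
    return states_per_link ** (n_amino_acids - 1)


def trials_with_hints(n_amino_acids: int, links_fixed_by_hints: int,
                      states_per_link: int = STATES_PER_LINK) -> int:
    """Number of shapes left to try once hints have pinned down some links."""
    free_links = max(n_amino_acids - 1 - links_fixed_by_hints, 0)
    return states_per_link ** free_links


n = 101  # a small protein with 101 amino acids, hence 100 links
print(f"{count_shapes(n):.2e}")   # 3**100, a number with 48 digits
print(trials_with_hints(n, 80))   # 3**20 = 3486784401
```

Under these toy assumptions, pinning down 80 of the 100 links takes us from a number with 48 digits to a few billion trials, which is the kind of reduction that turns an impossible search into a feasible one.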
DeepMind's system has been trained using the large, and growing, amount of data on genome sequencing. These data were analysed using deep learning techniques, resulting in AlphaFold, a program that can be used to predict the possible folding of a protein with a given amino acid sequence. The program was entered into the CASP (Critical Assessment of Structure Prediction) competition and was ranked first! (AlphaFold is labelled A7D in the contestants’ list.)
This is just one more area where artificial intelligence is gaining an edge on us… What is impressive is that the intelligence results from the capability of the machine to learn from data, rather than from being programmed.