Protein folding to unravel the origin of life

Computer analyses of protein folding have shed light on the evolution of early life on earth. Researchers from the Heidelberg Institute for Theoretical Studies and the University of Illinois, USA, have examined the folding speed of protein domains and found that there has been a trend towards the optimisation of protein folding since proteins first appeared 3.8 billion years ago. Around 1.5 billion years ago, more complex domain structures and multi-domain proteins emerged and caused a ‘Big Bang’ of proteins.

Different types of protein folding © C. Debes, HITS

In order for proteins to fulfil their functions, their long amino acid chains have to fold into three-dimensional structures. These structures consist of domains, i.e. compact protein regions whose evolution, structure and function differ from those of other regions. Proteins usually consist of several domains that are combined into complex arrangements. Depending on their amino acid make-up, the domains or parts thereof can fold into characteristic secondary structures, the most important being α-helices and β-sheets. The speed with which the protein structures fold into functional units varies considerably: from microseconds to several hours.

Classification of protein domains

Dr. Frauke Gräter, head of the research group “Molecular Biomechanics” © HITS

Together with her colleague Prof. Gustavo Caetano-Anollés at the University of Illinois at Urbana-Champaign, chemist Dr. Frauke Gräter, head of the research group Molecular Biomechanics at the Heidelberg Institute for Theoretical Studies (HITS), used comprehensive computer analyses to examine the folding speed of all currently known protein domains. The researchers used 92,000 domains defined by the Structural Classification of Proteins (SCOP), derived from 989 fully sequenced genomes and classified according to age. Their analyses were based on phylogenomic trees that Caetano-Anollés had built from the SCOP-defined protein domains and used to describe the early history of the protein world. The appearance of domains, which obeys a molecular clock, was used to date the appearance of proteins, the dynamics of domain organisation in proteins and the first appearance of free oxygen on earth.

Gräter and her colleague Cedric Debes developed a mathematical model on the basis of experimentally determined structures to predict the theoretical folding rate of proteins. An all-atom computer simulation that calculates how fast proteins fold from their denatured into their native, functional conformation would by far exceed available computer capacities. This is why the researchers from Heidelberg used the concept of “size-modified contact order” (SMCO), which showed an improved correlation with folding speeds previously determined in experiments. SMCO is derived from the contacts that amino acids form with one another in the folded structure. It predicts how fast these intramolecular contact points will meet and thus how fast the protein will fold. According to Gräter, this is the first analysis to combine all known protein structures and genomes with folding rates as a physical parameter.
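The article does not give the SMCO formula, so the following Python sketch is only an illustration of the general idea. It assumes the relative contact order commonly used in the folding literature (the average sequence separation of residues in contact, divided by chain length) and multiplies it by chain length raised to an exponent. The function names, the 6 Å cutoff, the use of one representative coordinate per residue and the exponent value are all assumptions made for this example, not details taken from the study.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_pairs(coords: Sequence[Coord], cutoff: float = 6.0) -> List[Tuple[int, int]]:
    """Return residue index pairs (i, j), i < j, whose points lie within `cutoff`.

    `coords` holds one representative point per residue (an assumption of
    this sketch); the cutoff of 6.0 is likewise illustrative.
    """
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.append((i, j))
    return pairs


def relative_contact_order(coords: Sequence[Coord], cutoff: float = 6.0) -> float:
    """Average sequence separation of contacting residues, divided by chain length."""
    pairs = contact_pairs(coords, cutoff)
    if not pairs:
        return 0.0
    total_separation = sum(j - i for i, j in pairs)
    return total_separation / (len(coords) * len(pairs))


def size_modified_contact_order(
    coords: Sequence[Coord], exponent: float, cutoff: float = 6.0
) -> float:
    """Relative contact order scaled by chain length ** exponent.

    The exponent is a free parameter here; the value used in the study
    is not stated in the article.
    """
    return relative_contact_order(coords, cutoff) * len(coords) ** exponent


# Toy chains with 3.8 spacing between neighbouring residues.
extended = [(3.8 * k, 0.0, 0.0) for k in range(6)]
hairpin = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]

# The hairpin brings residues that are far apart in sequence into contact,
# so its contact order is higher than that of the extended chain.
print(relative_contact_order(extended), relative_contact_order(hairpin))
print(size_modified_contact_order(hairpin, exponent=0.7))  # illustrative exponent
```

In this picture, a lower value means that the contacts are mostly between residues close together in the sequence, which is the situation the article associates with faster folding.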

From early life to the Big Bang of proteins

Prof. Dr. Gustavo Caetano-Anollés, Evolutionary Bioinformatics Laboratory, University of Illinois, Urbana, Illinois, USA © University of Illinois

Experts believe that the first life forms and the first proteins emerged around 3.8 billion years ago. Some rocks from that time can still be found, for example in Greenland and Western Australia. They date back to the period when the destructive bombardment of meteorites had ceased and the earth’s crust had cooled, leading to the emergence of primeval continents and oceans. In some of these rocks, tiny graphite and carbonate depositions inside resistant crystals show a carbon isotope ratio (13C/12C) that is characteristic of life processes and differs from that of mineral carbon depositions. The oldest bacteria-like cells, or what are believed to be bacteria-like cells, were discovered in 3.5-billion-year-old rocks. Microfossils of bacteria and archaea, including photosynthetic cyanobacteria that started to enrich the atmosphere with oxygen, were discovered in rocks that were two billion years old.

Gräter’s and Caetano-Anollés’ investigations reveal that the function of proteins was gradually optimised from archaea to multicellular organisms during the long period of time from 3.8 to 1.5 billion years ago. The researchers found that protein folding speed increased (and SMCO decreased) and that protein domains became shorter over the course of evolution, from an average length of 300 amino acids to fewer than 150.

Around 1.5 billion years ago, there was a trend reversal in both the foldability and the length of the domains. But above all, an almost explosive increase in the number of domain architectures and rearrangements in multi-domain proteins occurred, which was mainly triggered by increased rates of domain fusion and fission. While one-domain proteins had previously dominated, rearrangements of multi-domain proteins now took over. Caetano-Anollés therefore refers to the period around 1.5 billion years ago as the ‘Big Bang’ of proteins.

Emergence of eukaryotes

Proteins are highly complex three-dimensional structures consisting of many domains. The different colours show domains of a different evolutionary age. © G. Caetano-Anollés, University of Illinois

The researchers speculate that the development of slower-folding protein structures is due to an increase in the number of β-sheet structures in comparison to the faster-folding α-helix structures. Alternatively, the observed slow-down after the ‘Big Bang’ might be related to the appearance of protein architectures like chaperones that help proteins to fold and repair misfolded proteins. Of great importance is the observation that protein architectures specific to eukaryotes appeared around 1.5 billion years ago. Eukaryotes have a more elaborate protein synthesis machinery than prokaryotes, involving enzymes for posttranslational modification. This machinery may have eased the pressure towards faster folding, as it prevented misfolding and aggregation before the native fold was attained. Although a further increase in folding speed has no longer been in the foreground since then, the overall tendency towards fast folding has remained. The researchers speculate that fast-folding proteins are less susceptible to aggregation, a process known from the misfolded proteins that are the hallmark of Alzheimer’s disease and prion diseases.

It seems reasonable to suppose that the biological ‘Big Bang’ correlates with the emergence of eukaryotes, the third kingdom of organisms, which attained a new degree of organisation and complexity compared with the previously dominant kingdoms of bacteria and archaea. However, very little palaeontological evidence is available to substantiate this assumption. The fossil of an alga that most experts accept as the earliest record of eukaryote evolution was found in 1.25-billion-year-old sediments in Arctic Canada. But perhaps another lucky find will one day provide rock-based evidence of the profound transformation of organisms that happened after the biological ‘Big Bang’ 1.5 billion years ago.

Cédric Debès, Minglei Wang, Gustavo Caetano-Anollés, Frauke Gräter: Evolutionary Optimization of Protein Folding. PLOS Computational Biology 9(1), 1-9; 2013.

Website address: https://www.gesundheitsindustrie-bw.de/en/article/news/protein-folding-to-unravel-the-origin-of-life