In the primordial soup that was early Earth, life started small. Elements joined to form the simple carbon-based molecules that were the precursors of everything that was to come. But there is debate about the next step.

One popular hypothesis suggests that ribonucleic acid (RNA) molecules, which contain the genetic blueprints for proteins and can perform simple chemical reactions, kick-started life. Some scientists dispute this idea, however, saying RNA is too large and complex a molecule to have started it all. That group says simpler molecules had to evolve the ability to perform metabolic functions before macromolecules such as RNA could be built. This idea is aptly named "metabolism-first," and new evidence out of the University of Illinois backs it up.

"All living organisms have a metabolism, a set of life-sustaining chemical transformations that provide the energy and matter needed for the functions of the cell. These metabolic transformations are assumed to have occurred very early in life, in primitive Earth. Organisms probably replaced chemical reactions already going on in the planet and internalized them into cells through development of enzymatic activities," says Gustavo Caetano-Anollés, bioinformatician and professor in the Department of Crop Sciences at U of I. 

Caetano-Anollés and Ibrahim Koç, a visiting scholar in the department, found evidence for the "metabolism-first" hypothesis by studying the evolution of molecular functions in organisms representing all realms of life. The genomes, or complete sets of genes, of 249 organisms were available in a searchable database. What sets this resource, known as the Gene Ontology (GO) database, apart is that each gene product (a protein or RNA molecule) comes with a set of terms describing its function.

"You can take an entire genome that represents an organism, like the human genome, and visualize it through the collection of functionalities of its genes. The study of these 'functionomes' tells us what genes do, instead of focusing on their names and locations. For example, we can find out what kinds of catalytic, recognition, or binding activities a gene product has, which is much more intuitive," Caetano-Anollés notes. "The best way to understand an organism is through its functions."

According to Caetano-Anollés, the number of times a function appears in a genome provides historical information. So the team took the GO terms describing all of the molecular functions in each organism and counted them up. The idea was that an ancient function, such as the catalytic activity of metabolism, is likely shared by all organisms and will be found in large numbers. On the other hand, more recent functions are found in lower numbers and in smaller subsets of organisms. 
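
The counting step can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the team's method: the organisms and annotations are invented toy data, the names `summarize_functions` and `rank_by_presumed_age` are hypothetical, and the simple sort stands in for the far more involved tree-building analysis the researchers actually performed. It tallies how often each function term appears and how many organisms share it, then orders the terms from most widespread and abundant to least.

```python
from collections import Counter

# Hypothetical toy annotations (NOT data from the study):
# each organism maps to the molecular-function terms of its gene products,
# one list entry per annotated gene product.
annotations = {
    "organism_A": ["catalytic activity", "catalytic activity", "binding",
                   "transporter activity"],
    "organism_B": ["catalytic activity", "binding", "binding"],
    "organism_C": ["catalytic activity", "binding", "transporter activity",
                   "structural molecule activity"],
}


def summarize_functions(annotations):
    """Return {term: (total occurrences, number of organisms with the term)}."""
    total = Counter()   # how many times a term appears across all genomes
    spread = Counter()  # how many organisms carry the term at least once
    for terms in annotations.values():
        counts = Counter(terms)
        total.update(counts)
        spread.update(counts.keys())
    return {term: (total[term], spread[term]) for term in total}


def rank_by_presumed_age(summary):
    """Order terms from most to least widespread, then by abundance.

    Assumption for illustration only: wider sharing and higher counts
    are treated as a rough proxy for an older function.
    """
    return sorted(summary, key=lambda t: (-summary[t][1], -summary[t][0], t))


summary = summarize_functions(annotations)
ranking = rank_by_presumed_age(summary)
for term in ranking:
    occurrences, organisms = summary[term]
    print(f"{term}: {occurrences} occurrences in {organisms} organisms")
```

In this toy data, catalytic activity and binding appear in every organism and head the list, while the function found in a single organism falls to the bottom, mirroring the reasoning that widely shared, abundant functions are likely the oldest.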

The team used these counts and advanced supercomputing methods to construct a tree tracing the most likely evolutionary path of molecular functions through time. The most ancient functions sat at the base of the tree, close to its roots, and the most recent sat near the crown.

At the base of the tree, corresponding to the origin of life on Earth, were functions related to metabolism and binding. "It is logical that these two functions started very early because molecules first needed to generate energy through metabolism and had to interact with other molecules through binding," Caetano-Anollés