The amino acid (green) slithers into the chemical reaction center, moving through an evolutionarily ancient corridor of the ribosome (purple). The amino acid is delivered to the reaction core by the transfer RNA molecule (yellow).

Largest Computational Biology Simulation Mimics The Ribosome

Researchers at Los Alamos National Laboratory have set a new world record by performing the first million-atom computer simulation in biology. Using the "Q Machine" supercomputer, Los Alamos computer scientists have created a molecular simulation of the cell's protein-making structure, the ribosome. The project, which simulates 2.64 million atoms in motion, is more than six times larger than any biological simulation performed to date. The effort is featured today in a paper in the Proceedings of the National Academy of Sciences.

The ribosome is a living factory, the essential element within cells that creates proteins by decoding each protein type's specific recipe that is stored within messenger RNA. Ribosomes are a fundamental model for future nano-machines, producing the protein building blocks of all living tissue. Credit: Los Alamos National Laboratory

The ribosome is the ancient molecular factory responsible for synthesizing proteins in all organisms. Using the new tool, the Los Alamos team led by Kevin Sanbonmatsu is the first to observe the entire ribosome in motion at atomic detail. This first simulation of the ribosome offers a new method for identifying potential antibiotic targets for such diseases as anthrax. Until now, only static, snapshot structures of the ribosome have been available.

Sanbonmatsu posits that this technique offers a powerful new tool for understanding molecular machines and improving the efficacy of antibiotics. Antibiotic drugs are less than one one-thousandth the size of the ribosome and act like a monkey wrench in the machinery of the cell. Such drugs diffuse into the most critical sites of this molecular machine and grind the inner workings of the ribosome to a halt.

"Designing drugs based only on static structures of the ribosome might be akin to intercepting a missile knowing only the launch location and the target location, with no radar information. Our simulations enable us to map out the path of the missile's trajectory," Sanbonmatsu said. "The methods and implications lie at the interface between biochemistry, computer science, molecular biology, physics, structural biology and materials science. I believe the results serve as a proof-of-principle for materials scientists, chemists and physicists performing similar simulations of artificial molecular machines in the emerging field of nano-scale information processing."

Sanbonmatsu's study focuses on decoding, the essential phase of protein synthesis in which information transfers from RNA to protein, completing the information flow specified by Francis Crick in 1958 and known as the Central Dogma of Molecular Biology. "The ribosome is, in fact, a nano-scale computer and is very much analogous to the 'CPU' of the cell," he said.

The ribosome is so fundamental to life that many portions of this molecular machine are identical in every organism ever genetically sequenced. In developing the project, the team identified a corridor inside the ribosome that the transfer RNA must pass through for the decoding to occur, and it appears to be constructed almost entirely of universal bases, implying that it is evolutionarily ancient. The corridor represents a new region of the ribosome containing a variety of potential new antibiotic targets. The simulations also reveal that the essential translating molecule, transfer RNA, must be flexible in two places for decoding to occur, furthering the growing belief that transfer RNA is a major player in the machine-like movement of the ribosome. The simulation also sets the stage for future biochemical research into decoding by identifying 20 universally conserved ribosomal bases important for accommodation, as well as a new structural gate, which may act as a control mechanism during transfer RNA selection.

The aminoacyl-transfer-RNA (red) caught in the act of delivering its amino acid to the growing protein hanging off the peptidyl-transfer-RNA (yellow). The ribosome (large subunit in white and small subunit in cyan) uses the transfer RNA molecules to read the genetic information from the messenger RNA (green). Water molecules are shown in blue. For visualization purposes, only 1 of every 10 water molecules is shown, and the top portion of the ribosome is cut away so that the transfer RNA molecules are visible. Credit: Los Alamos National Laboratory

The multi-million-atom simulation was run on 768 of the "Q" machine's 8,192 available processors. Sanbonmatsu worked to develop the simulation with Chang-Shung Tung of Los Alamos, as well as Simpson Joseph of the University of California at San Diego. Funding for the research was provided by the National Institutes of Health, Los Alamos National Laboratory's research and development fund, and support from the Laboratory's Institutional Computing Project.
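Simulations like this advance every atom step by step with a time-reversible integrator. The sketch below is a minimal, purely illustrative version of the velocity-Verlet scheme commonly used in molecular dynamics, applied to just two Lennard-Jones particles in reduced units; the actual Los Alamos simulation propagated 2.64 million atoms with a full biomolecular force field, which this toy does not attempt to reproduce.

```python
# Toy velocity-Verlet molecular dynamics: two Lennard-Jones particles in 1D.
# Reduced units (epsilon = sigma = mass = 1); illustrative only.

def lj_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones potential energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4 * eps * (sr6 ** 2 - sr6)

def lj_force(r, eps=1.0, sigma=1.0):
    """Force magnitude along +x on the right-hand particle (negative = attractive)."""
    sr6 = (sigma / r) ** 6
    return 24 * eps * (2 * sr6 ** 2 - sr6) / r

def velocity_verlet(x1, x2, v1, v2, dt=0.001, steps=1000, m=1.0):
    """Integrate the pair for `steps` timesteps; returns final positions/velocities."""
    f = lj_force(x2 - x1)
    for _ in range(steps):
        # half-kick: update velocities with current force (equal and opposite)
        v1 -= f / m * dt / 2
        v2 += f / m * dt / 2
        # drift: advance positions with half-step velocities
        x1 += v1 * dt
        x2 += v2 * dt
        # recompute force at new positions, then second half-kick
        f = lj_force(x2 - x1)
        v1 -= f / m * dt / 2
        v2 += f / m * dt / 2
    return x1, x2, v1, v2
```

Because the scheme is symplectic, total energy stays nearly constant over the run, which is the property that makes long trajectories of large systems trustworthy.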

SDSC Researchers Accurately Predict Protein Docking

Computational biologist Lynn Ten Eyck and colleagues at the San Diego Supercomputer Center (SDSC) at UC San Diego have used software known as DOT to produce accurate predictions of protein-protein interaction as part of the Critical Assessment of PRedicted Interactions (CAPRI), an ongoing evaluation of docking algorithms. CAPRI is a community-organized experiment hosted at the European Bioinformatics Institute. The SDSC group's entries were among the most accurate submitted in the seventh round of CAPRI.

Accurately Predicting Protein Docking - This protein-protein complex was judged best in round seven of the CAPRI experiment. The structure shown is a blind test prediction using the DOT program developed at SDSC. The small dots show the degree of surface complementarity at the interface of the proteins as measured by the Fast Atomic Density Evaluator (FADE), another SDSC software product. Warmer colors are better. L. Ten Eyck, M. Hotchko, D. Law, E. Thompson, SDSC; M. Pique, V. Roberts, TSRI. Graphics rendered by PyMOL (DeLano Scientific, 2002).

"The strength of DOT is that we approach the problem in stages," said Ten Eyck, Associate Director for Science Research and Development at SDSC. "First, DOT finds fast, approximate answers using a scalable algorithm that allows us to take advantage of modern parallel computing to carry out a comprehensive search."

DOT's speed also comes from using an algorithm that computes estimated interaction energy for all possible relative positions in a single step for each orientation. After DOT quickly screens the billions of possibilities to find a small number of promising cases, more computationally demanding methods and visual inspection can then be applied to find the correct protein docking configuration.
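Evaluating a score at every relative translation in one pass is the classic FFT correlation trick used by grid-based docking codes. The sketch below is a hedged, one-dimensional toy version with a shape-only score on periodic grids; DOT itself works on 3D grids with electrostatic as well as shape terms, so this is an illustration of the technique, not DOT's implementation.

```python
# Toy FFT-based docking score: evaluate a "ligand" grid against a
# "receptor" grid at every circular shift in one FFT pass.
import numpy as np

def correlate_all_shifts(receptor, ligand):
    """Score of every shift s: sum_i receptor[i+s] * ligand[i],
    computed for all s at once via the correlation theorem."""
    R = np.fft.fft(receptor)
    L = np.fft.fft(ligand)
    # corr = IFFT( FFT(receptor) * conj(FFT(ligand)) )
    return np.real(np.fft.ifft(R * np.conj(L)))

def brute_force(receptor, ligand):
    """Same score by explicit looping, O(N^2) instead of O(N log N)."""
    n = len(receptor)
    return np.array([
        sum(receptor[(i + s) % n] * ligand[i] for i in range(n))
        for s in range(n)
    ])
```

The FFT version returns identical scores to the brute-force loop, but for a 3D grid of a few million points it turns billions of per-translation sums into a handful of transforms, which is what makes a comprehensive search tractable.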

The motivation of the NIH-funded research is the ongoing search by biologists to better predict the interactions among proteins, the molecules of life. Examples of these problems include examining cellular metabolism, finding the most stable relative orientations between two proteins, studying protein subunit aggregation, performing computer-aided drug design, and solving problems of cellular signaling and expression. The benefits of this research include both greater scientific understanding and advances in efficient drug discovery.

"There is an amazing diversity of conditions in which proteins are found, from cell membranes to being loose in the blood, or buried in cells with little free water," said Ten Eyck. "So there is no 'one-size-fits-all' solution for protein interactions."

Using DOT as an initial screening method allows the researchers to efficiently zero in on a solution for each different case. In the four-year-old CAPRI evaluation, which encourages the development of improved protein docking algorithms, organizers solicit from crystallographers protein structures that have been solved but not published. Then they invite scientists to use their best docking algorithms to predict how the protein pairs will fit together. The problem is presented as two isolated protein molecules, so that if there is conformational change, that is, the molecules change shape as they dock, the researchers have to account for this in their solution. Participants can submit up to 10 predictions of how the proteins will interact in the double-blind evaluation.

While modern sophisticated methods that allow flexibility in binding would be expected to perform the best, the SDSC DOT entries have done surprisingly well over the course of the CAPRI experiment, getting acceptable predictions for around one-third of the targets overall. This is considered quite good performance, according to Ten Eyck, especially in light of DOT's relatively modest resource requirements.

In a remarkable story of software longevity, the DOT software was originally developed in 1994 by Ten Eyck and then-graduate student Jeffrey Mandell, now at The Scripps Research Institute (TSRI), as well as Victoria Roberts and Mike Pique of TSRI. The software was christened DOT for "Daughter of Turnip" since it was based on the earlier docking program, TURNIP, developed by TSRI's Victoria Roberts.

Although DOT then moved to the back burner, it is finding new use in the CAPRI evaluation. While the DOT algorithm is efficient in comparison with other docking algorithms, this class of problems is still computationally intensive. Running on 64 processors of Blue Gene, for example, DOT can produce an answer in about an hour or run on several Linux workstations in about a day.

The SDSC DOT software is open source and available on the CCMS website, and Ten Eyck notes that an updated version will be released soon. For the future, Ten Eyck's group is working to further improve their predictions. One way they are doing this is by collecting information on the location where binding occurs and then using this with methods that take protein flexibility into account.

10 Gig E Team members: Wes Bethel, John Christman, John Shalf, Chip Smith, Mike Bennett

Berkeley Lab Proves 10-Gigabit Ethernet Data Transfer is a Reality

Just yesterday, Lawrence Berkeley National Laboratory and several key partners put together a demonstration system running a real-world scientific application to produce data on one cluster and then send the resulting data across a 10 Gigabit Ethernet connection to another cluster, where it was rendered for visualization. Publicly proving more than switch interoperability, the demonstration was a first.

On June 17th it was announced that the final milestone in the IEEE standards approval process had been reached the previous week, when the IEEE 802.3ae specification for 10 Gigabit Ethernet was approved as a standard by the IEEE Standards Association (IEEE-SA) Standards Board. With that announcement, the speed of Ethernet operations, at least on paper, increased tenfold. Achieving that 10-fold increase in actual Ethernet performance, however, is still a challenge that can only be met with very high-end equipment and expertise. Yesterday, Lawrence Berkeley National Laboratory announced that it had teamed with Force10 Networks, SysKonnect, FineTec Computers and Ixia to put together a demonstration system running a real-world scientific application (Cactus, developed by Professor Ed Seidel and his team at the Albert Einstein Institute in Potsdam, Germany) to produce data on one cluster and then ship the resulting data across a 10-Gigabit Ethernet connection to another cluster, where it was rendered for visualization. Specifically, the application visualizes the gravity waves resulting from the collision of two black holes.

To say the test went well would be something of an understatement. The Berkeley team not only met its goal of demonstrating sustained 10 Gigabit Ethernet performance, it surpassed it, delivering a sustained data transfer rate of 10.6 gigabits per second. The demonstration consisted of two powerful Linux clusters, one at each end of a pair of Force10 Networks switches connected via two pairs of 10 Gigabit Ethernet interfaces. One cluster of dual-CPU Linux PCs ran the Cactus simulation code and fed data to another cluster of PCs, which ran the Visapult application (a remote visualization tool developed by LBNL's Wes Bethel) to render the received data for real-time visual display and analysis. Each machine in the clusters is capable of delivering at least 930 Mb/s of load to the network. The team ran traffic from 10 of the 11 machines through one 10-gigabit link and the remaining traffic through the other. The Ixia equipment used in the demo was for monitoring purposes only; there was no analyzer-generated background traffic.

"The only things that limit this demonstration are time and money, as the cluster is scalable and the network equipment has the capacity. We are very grateful to FineTec, SysKonnect, Force10 Networks, Quartet Network Storage and Ixia for their generosity in supporting this demonstration," said LBNL team member Mike Bennett. Bennett continued, "In spite of some last-minute glitches, we did what we said we'd do -- achieve more than 10 Gig of data throughput. It was great to see it all come together and run as predicted -- with real-world applications that's not always the case."

According to Bennett, the point of the demonstration was twofold: first, the demo proves that 10 Gigabit Ethernet, recently ratified by the IEEE, is real and necessary to solve bandwidth problems; and second, there are real applications today that can use this new technology. A significant factor to consider as well is that prior to this demonstration, all public 10 Gigabit Ethernet demonstrations had been staged to show interoperability, not to test the performance of real-world applications. In that sense, it really is a first. "It is very important to show that all of the various vendors that have 10 Gigabit Ethernet products can actually operate correctly when interconnected. This proves that the IEEE 802.3ae standard is a success," Bennett stated. "It is equally important to demonstrate the ability of the network system to deliver the capacity needed by bandwidth-hungry applications like Cactus and Visapult."

Commenting on the mood around the lab, team member John Shalf said, "We're very excited, and certainly the people with the Cactus base are pretty excited, but it's kind of limited by us actually trying to convince people that we really need to deploy this technology, so that kind of tempers the excitement. This is more of a technology demonstration so we can make the argument that this really is the way to go." Bennett added, "I think it's safe to say that we're all really excited about the new technology. It's been hectic, sometimes frustrating, but exciting - typical pre-demonstration stuff. We were sure the demo would run fine (several successful dry runs), so everyone's definitely stoked." Another team member, John Christman, had this to say: "It wasn't the lack of bandwidth that held us back, it was the lack of resources. Now that we've done 10 Gig, it's time to start looking at 100." A gutsy statement to be sure, but given some time and additional resources, I bet they can do it. I'll put my money on Berkeley's team any day.
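Sustained-throughput numbers like the 10.6 gigabits per second reported above come down to timing how many bits arrive over a measured interval. The loopback TCP sketch below is purely illustrative (localhost instead of a 10 GbE link, made-up transfer and buffer sizes) and is not the team's actual measurement harness, but it shows the basic shape of a memory-to-memory throughput test.

```python
# Toy loopback throughput test: send bytes over a local TCP socket,
# time the transfer, and report bits per second. Illustrative only.
import socket
import threading
import time

def measure_throughput(total_bytes=50_000_000, chunk=1 << 16):
    """Send `total_bytes` over a loopback TCP connection; return bits/sec."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))   # OS picks a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = []

    def sink():
        # Receiver side: count bytes until the full payload has arrived.
        conn, _ = server.accept()
        got = 0
        while got < total_bytes:
            data = conn.recv(chunk)
            if not data:
                break
            got += len(data)
        conn.close()
        received.append(got)

    t = threading.Thread(target=sink)
    t.start()

    payload = b"\x00" * chunk
    client = socket.create_connection(("127.0.0.1", port))
    start = time.perf_counter()
    sent = 0
    while sent < total_bytes:
        client.sendall(payload)
        sent += len(payload)
    client.close()
    t.join()
    elapsed = time.perf_counter() - start
    server.close()
    return received[0] * 8 / elapsed   # bits per second
```

On real hardware the same idea is applied per host and per link, with the aggregate of all senders compared against the link's rated capacity.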
Additional technical information about the demo can be found here. Information on Cactus and Visapult can be found at www.cactuscode.org and here respectively.