Cloud computing has proven to be a cost-efficient model for many commercial web applications, but will it work for scientific computing? Not unless the cloud is optimized for it, writes a team from the Lawrence Berkeley National Laboratory.

After running a series of benchmarks designed to represent a typical midrange scientific workload—applications that use less than 1,000 cores—on Amazon's EC2 system, the researchers found that the EC2's interconnect severely limits performance and causes significant variability. Overall, the cloud ran six times slower than a typical mid-range Linux cluster, and 20 times slower than a modern high performance computing system.

The team's paper, "Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud," was honored with the Best Paper Award at the IEEE's International Conference on Cloud Computing Technology and Science (CloudCom 2010) held Nov. 30-Dec.1 in Bloomington, Ind.

"We saw that the communication pattern of the application can impact performance. Applications like PARATEC, with significant global communication, perform relatively worse than those with less global communication," says Keith Jackson, a computer scientist in the Berkeley Lab’s Computational Research Division (CRD) and lead author of the paper.

He also notes that the EC2 cloud performance varied significantly for scientific applications because of the shared nature of the virtualized environment, the network, and differences in the underlying non-virtualized hardware.
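Jackson's point about communication patterns can be illustrated with a back-of-the-envelope latency model. All figures below are illustrative assumptions for the sketch, not measurements from the paper:

```python
import math

# Illustrative model of why globally communicating applications suffer most
# on a high-latency interconnect. All numbers are hypothetical assumptions,
# not measurements from the Berkeley Lab study.

def comm_time(latency_s, procs, collectives_per_step):
    """Time per step spent in latency-bound collective communication.
    A tree-based global collective costs roughly log2(P) message hops."""
    return collectives_per_step * latency_s * math.log2(procs)

PROCS = 256
COMPUTE_S = 0.010           # assume 10 ms of computation per step

# Assumed one-way latencies: a dedicated cluster interconnect vs. a
# virtualized cloud network (orders of magnitude, not exact figures).
CLUSTER_LAT = 2e-6          # ~2 microseconds
CLOUD_LAT = 150e-6          # ~150 microseconds

for name, collectives in [("few collectives", 1), ("many collectives", 100)]:
    t_cluster = COMPUTE_S + comm_time(CLUSTER_LAT, PROCS, collectives)
    t_cloud = COMPUTE_S + comm_time(CLOUD_LAT, PROCS, collectives)
    print(f"{name}: cloud/cluster slowdown = {t_cloud / t_cluster:.1f}x")
```

Under these assumed numbers, a code that rarely synchronizes sees only a modest slowdown on the high-latency network, while one dominated by global collectives slows down by an order of magnitude, mirroring the PARATEC result.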

The benchmarks and performance monitoring software used in this research were adapted from the large-scale codes used in the National Energy Research Scientific Computing Center's (NERSC) procurement process. NERSC is located at the Berkeley Lab and serves approximately 4,000 Department of Energy (DOE) supported researchers annually in disciplines ranging from cosmology and climate to chemistry and nanoscience. In this study, the researchers essentially cut these benchmarks down to midrange size before running them on the Amazon cloud.

"This set of applications was carefully selected to cover both the diversity of science areas and the diversity of algorithms," said John Shalf, who leads NERSC’s Advanced Technologies Group. "They provide us with a much more accurate view of the true usefulness of a computing system than ‘peak flops’ measured under ideal computing conditions."

The benchmark modifications and performance analysis in this research were done in collaboration with the DOE’s Magellan project, funded by the American Recovery and Reinvestment Act. "The purpose of the Magellan Project is to understand how cloud computing may be used to address the computing needs for the Department of Energy's Office of Science. Understanding how our applications run in these environments is a critical piece of the equation," says Shane Canon, who leads the Technology Integration Group at NERSC.

In addition to Canon, Jackson and Shalf, Berkeley Lab's Lavanya Ramakrishnan, Krishna Muriki, Shreyas Cholia, Harvey Wasserman and Nicholas Wright are also authors on the paper.

"This was a real collaborative effort between researchers in Berkeley Lab's CRD, Information Technologies and NERSC divisions, with generous support from colleagues at UC Berkeley—it is a great honor to be recognized by our global peers with a Best Paper Award," adds Jackson.

The award is the second such honor for Jackson and Ramakrishnan this year. Along with Berkeley Lab colleagues Karl Runge of the Physics Division and Rollin Thomas of the Computational Cosmology Center, they won the Best Paper Award at the Association for Computing Machinery’s ScienceCloud 2010 workshop for "Seeking Supernovae in the Clouds: A Performance Study."

The Department of Energy's Office of Advanced Scientific Computing Research and the National Science Foundation funded the work; and CITRIS at the University of California, Berkeley donated Amazon EC2 time.


Appro has announced that it has been awarded a subcontract from Lockheed Martin to deliver 147.5 teraflops of Appro 1U-Tetra supercomputers in support of the DoD High Performance Computing Modernization Program (HPCMP). The HPCMP supports DoD objectives to strengthen national prominence by advancing critical technologies and expertise through use of High Performance Computing (HPC). Research scientists and engineers benefit from HPC innovation to solve complex US defense challenges.

As a subcontractor to Lockheed Martin, Appro will provide system integration, project management, support and technical expertise for the installation and operation of the supercomputers. Lockheed Martin, as prime contractor, will provide overall systems administration, computer operations management, applications user support, and data visualization services supporting five major DoD Supercomputing Resource Centers (DSRCs). The agreement is based on a common goal of helping customers reduce the complexity of deploying, managing and servicing commodity HPC solutions while lowering total cost of ownership.

The following are the supercomputing centers where Appro clusters will be deployed through the end of 2010:
Army Research Laboratory DSRC at Aberdeen Proving Ground, MD,
US Air Force Research Laboratory DSRC at Wright Patterson AFB, OH,
US Army Engineer Research and Development Center DSRC in Vicksburg, MS,
Navy DoD Supercomputing Resource Center at Stennis Space Center, MS,
Arctic Region Supercomputing Center DSRC in Fairbanks, AK.

“We are extremely pleased to work with Lockheed Martin and be part of providing advanced cluster technologies and expertise in High Performance Computing (HPC) in support of the DoD High Performance Computing Modernization Program (HPCMP)," said Daniel Kim, CEO of Appro. "Lockheed Martin leads its industry in innovation and has raised the bar for reducing costs, decreasing development time, and enhancing product quality for this important government program, and our products and solutions are a perfect fit for their demanding expectations."

Fulcrum Microsystems has announced the FocalPoint FM6000 series of fully integrated wire-speed 10G and 40G Ethernet switch chips, which incorporate the company's new and innovative Alta high-speed Ethernet switching architecture.

One key new feature of the FM6000 series switches is Fulcrum's FlexPipe low-latency packet-processing pipeline, which can parse, modify and apply multiple rules to traffic at more than 1 billion packets per second in a completely deterministic manner. FlexPipe also can be upgraded in the field to support future datacenter networking protocols as they emerge.

The FM6000 series devices are based on Fulcrum's Alta switch architecture, which, in addition to FlexPipe, features flexible 10G and 40G Ethernet port logic and a third-generation RapidArray output-queued shared-memory architecture. Alta-based FocalPoint devices achieve this performance while maintaining low cut-through packet latencies of less than 300 ns, regardless of configuration or features enabled. Fulcrum's pioneering efforts in developing low-latency Ethernet switch technology have made FocalPoint the preferred datacenter fabric building-block for applications such as financial trading and computer clustering in today's virtualized and high-scale datacenters.
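The significance of sub-300 ns cut-through latency is easiest to see next to store-and-forward switching, which must buffer an entire frame before forwarding it. A quick serialization-delay calculation (standard Ethernet arithmetic, not Fulcrum data):

```python
# Serialization delay: the time just to receive a frame at line rate.
# A store-and-forward switch pays this delay in full before it can
# forward; a cut-through switch forwards once the header is parsed.

LINE_RATE_BPS = 10e9        # 10 Gigabit Ethernet

def serialization_ns(frame_bytes, rate_bps=LINE_RATE_BPS):
    return frame_bytes * 8 / rate_bps * 1e9

# A 1500-byte frame takes 1200 ns just to arrive at 10G, so no
# store-and-forward design can beat that floor for large frames;
# the FM6000's quoted <300 ns cut-through latency sits well below it.
print(f"1500B frame: {serialization_ns(1500):.0f} ns")
print(f"  64B frame: {serialization_ns(64):.1f} ns")
```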

Virtualization is increasing the density of server farms and enabling datacenter operators to efficiently deploy cloud services. In addition, new server architectures include multi-core processors, increasing network bandwidth requirements. The FlexPipe packet-processing pipeline in the FM6000 series devices delivers full line-rate performance across up to 72 ports, offering non-blocking throughput for thousands of virtualized flows.
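The billion-packet-per-second figure is consistent with line rate on 72 ports of 10GbE: counting the 20 bytes of preamble and inter-frame gap alongside a minimum-size 64-byte frame, each port carries about 14.88 million packets per second. A quick sanity check (standard Ethernet framing arithmetic, not vendor data):

```python
# Line-rate packet arithmetic for 10GbE with minimum-size frames.
MIN_FRAME = 64              # bytes, minimum Ethernet frame
OVERHEAD = 20               # bytes: 8 preamble + 12 inter-frame gap
RATE_BPS = 10e9             # 10 Gigabit Ethernet
PORTS = 72

pps_per_port = RATE_BPS / ((MIN_FRAME + OVERHEAD) * 8)
total_mpps = PORTS * pps_per_port / 1e6

print(f"per port: {pps_per_port / 1e6:.2f} Mpps")   # ~14.88 Mpps
print(f"{PORTS} ports: {total_mpps:.0f} Mpps")      # ~1071 Mpps, > 1 billion pps
```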

"Our Dell'Oro forecasting models show that 10GE server ports will grow dramatically in coming years, which will drive demand for high density 10Gb switches and also the demand for 40Gb Ethernet uplinks," said Alan Weckel, director of Ethernet research for Dell'Oro Group. "Given this expected demand, the launch of this switch from Fulcrum is very timely."

FlexPipe allows the functionality of several key logic blocks, such as the packet parser and egress frame modification unit, to be upgraded to support new datacenter networking standards or proprietary performance-enhancing application tags. With this functionality, switch manufacturers can sell switching systems that are field upgradable with support for emerging datacenter interconnect topologies such as TRILL and SPB, as well as emerging virtualized networking standards such as 802.1bg (Edge Virtual Bridging) and 802.1bh (Port Extenders).

There are nine devices in the FM6000 series, each offering a different port configuration and total bandwidth, ranging from 160 Gbps to 720 Gbps. FM6000 series switches can be used to build very high port-count top-of-rack or end-of-row datacenter switches with industry-leading latency, performance, and scale. With the ability to drive SFP+ direct-attach copper cable directly without the need for an external PHY, the FM6000 series reduces the latency, cost and power of these top-of-rack switch designs. To enable network convergence, the FM6000 supports an efficient mix of storage, HPC and LAN data traffic with extensive QoS and datacenter bridging (DCB) features such as PFC, ETS, and QCN, simultaneously supporting lossless operation alongside bandwidth and latency guarantees. Additionally, system-wide management features offer line-rate per-flow monitoring and policing for clear visibility and a single point of management, reducing overall system complexity.

"Fulcrum is changing the game again by delivering standards-based switching solutions that provide advanced features and shatter all established benchmarks for high bandwidth, low latency and power efficiency," said Mike Zeile, Fulcrum Microsystems president and COO. "The FM6000 series, with our revolutionary Alta architecture, is helping define the future of virtual computing by delivering the performance needed for next-generation datacenter fabrics."

Configuration and Availability

All nine members of the FocalPoint FM6000 series will be generally available in 2Q 2011.
