Framework for processing LSST astronomy data undergoing first annual challenge
The Large Synoptic Survey Telescope (LSST) won't begin operation until 2013, but researchers are already rehearsing for the massive volume of data the telescope will produce. Three TeraGrid sites, the National Center for Supercomputing Applications (NCSA), the San Diego Supercomputer Center (SDSC), and the Texas Advanced Computing Center (TACC), are collaborating on the LSST Data Challenge, the first annual test of the planned end-to-end astronomy cyberenvironment for transferring, processing, storing, and sharing the terabytes of data LSST will produce every night.

The telescope's comprehensive, time-lapse imaging will provide an unprecedented census of the solar system, including transient objects like comets and potentially hazardous near-Earth asteroids. LSST's repeated sweeps of the sky will also help to reduce noise, allowing astronomers to home in on ever fainter objects; by seeing farther into the universe, they also see further into the past. And LSST aims to discover the nature of "dark energy," the enigma that is causing the expansion of the universe to accelerate.

It's estimated that LSST will generate 15 terabytes of raw data and more than 100 terabytes of processed data every night. The raw data will move from the telescope to a nearby base camp, where near-real-time processing will provide feedback to the telescope to optimize imaging and promptly alert the astronomy community to interesting observations. The raw data will then be transmitted to the archive center for thorough processing, with the processed data stored and disseminated to the astronomy research community.

In the current Data Challenge, the three TeraGrid sites are standing in for the telescope (TACC), the base camp (SDSC), and the archive center (NCSA). Data will be transferred from site to site and processed along the way to evaluate the design of the prototype data management system. The prototype integrates grid technologies with components developed by partners at the LSST Corporation, the National Optical Astronomy Observatory (NOAO), the Stanford Linear Accelerator Center, and the University of Washington.

"The challenge mimics the data transport and processing as it will happen in real life once the telescope is operating," says Cristina Beldica, project manager for NCSA's LSST effort.

First, the challenge will test the data replication software used to transfer data from site to site. Developed at NOAO, the Data Service (DS) software leverages SDSC's Storage Resource Broker (SRB). Then the basic functionality proposed for the data processing pipeline will be evaluated using prototype science codes and "resource consumers" that model how actual algorithms would consume compute cycles. These codes will be stitched together through middleware components developed by NCSA and its partners to mimic the actions and applications that will make up the final pipeline. Information gathered through the challenge will guide the team's further development of the LSST data management system.
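
To put those nightly volumes in perspective, a quick back-of-the-envelope calculation shows the sustained network bandwidth the site-to-site links would need. Only the 15-terabyte figure comes from the article; the 12-hour transfer window is an assumption chosen purely for illustration.

    # Back-of-the-envelope: sustained bandwidth needed to move one night's
    # raw data. The 15 TB/night figure comes from the article; the 12-hour
    # transfer window is an assumed value, for illustration only.
    RAW_TB_PER_NIGHT = 15
    TRANSFER_WINDOW_HOURS = 12  # assumption: ship the data during the following day

    bits_total = RAW_TB_PER_NIGHT * 1e12 * 8
    seconds = TRANSFER_WINDOW_HOURS * 3600
    gbps = bits_total / seconds / 1e9

    print(f"~{gbps:.1f} Gb/s sustained")  # ~2.8 Gb/s

Even under this generous assumption, the links must sustain multi-gigabit rates night after night, which is why the replication software is the first thing the challenge exercises.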
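The article doesn't describe the Data Service's actual interfaces, so the following sketch is hypothetical throughout: it models the challenge's telescope-to-base-camp-to-archive route as a chain of sites and hands each hop to a pluggable transfer function, with a stub standing in for the SRB-backed replication layer. All function and type names are invented for illustration.

    from dataclasses import dataclass
    from typing import Callable

    # Hypothetical model of the Data Challenge's site-to-site route.
    # Site names mirror the roles described in the article; transfer_fn is a
    # stand-in for the NOAO Data Service / SDSC SRB replication layer, whose
    # real interfaces the article does not describe.

    @dataclass
    class Site:
        name: str
        role: str  # "telescope", "base camp", or "archive center"

    ROUTE = [
        Site("TACC", "telescope"),
        Site("SDSC", "base camp"),
        Site("NCSA", "archive center"),
    ]

    def replicate(data: bytes, src: Site, dst: Site,
                  transfer_fn: Callable[[bytes, str, str], None]) -> None:
        """Move one night's data one hop along the route."""
        print(f"replicating {len(data)} bytes: {src.name} ({src.role}) "
              f"-> {dst.name} ({dst.role})")
        transfer_fn(data, src.name, dst.name)

    def stub_transfer(data: bytes, src: str, dst: str) -> None:
        pass  # the real system would invoke the SRB-backed Data Service here

    night = b"\x00" * 1024  # placeholder for a night's raw images
    for src, dst in zip(ROUTE, ROUTE[1:]):
        replicate(night, src, dst, stub_transfer)

Keeping the transfer function pluggable mirrors the challenge's design goal: the route and the processing steps can be rehearsed before the production replication software is final.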
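The "resource consumers" are described only as stand-ins that model how real algorithms would consume compute cycles. One minimal way to build such a stand-in is a stub stage that busy-loops for a configured duration; the sketch below assumes this approach, and the stage names and timings are invented for illustration.

    import time

    # Minimal "resource consumer" sketch: a stub pipeline stage that burns
    # CPU for a configured duration, standing in for a real science code.
    # Stage names and durations are invented; the article does not specify
    # the actual pipeline stages.

    def consume_cycles(seconds: float) -> int:
        """Busy-loop for roughly `seconds`, emulating a compute-bound stage."""
        end = time.perf_counter() + seconds
        iterations = 0
        while time.perf_counter() < end:
            iterations += 1
        return iterations

    PIPELINE = [("calibration", 0.5), ("detection", 1.0), ("association", 0.3)]

    for stage, duration in PIPELINE:
        n = consume_cycles(duration)
        print(f"{stage}: consumed ~{duration}s of CPU ({n} iterations)")

Chaining stubs like these through the middleware lets the team measure scheduling, data movement, and end-to-end timing without waiting for the final science algorithms.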