LSST Database Architecture - Preparing for the Extreme Scale Analytics

Becla, Jacek

The Large Synoptic Survey Telescope will generate a data set larger than 25 petabytes, encompassing billions of astronomical objects and trillions of detections. We present a scalable architecture capable of cost-effectively supporting the simple and complex analytics that LSST users are expected to run at this extreme scale, including time-series analysis and full-sky near-neighbor spatial correlations. We examine the important architectural trade-offs, such as a map-reduce approach versus a database-centric approach, and solid-state drives versus electromechanical disks. We then describe the features necessary to support the needs of LSST (or of another project at a similar data scale), such as spatially oriented overlapping partitioning, multi-level data chunking, and shared scans. As science becomes increasingly data-intensive, understanding these issues and techniques is a prerequisite for designing current and future scientific data management systems.
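
As an illustration of why overlapping partitions help near-neighbor queries, the following minimal sketch counts close pairs one chunk at a time, with no communication between chunks. It is not the LSST implementation: it assumes a flat two-dimensional integer grid instead of the celestial sphere, and every name in it (`chunk_of`, `partition_with_overlap`, `count_pairs_chunked`, `count_pairs_brute`) is hypothetical. Each chunk stores its own points plus copies of any points lying within a margin of its border, so a pair separated by at most that margin is always visible from the chunk that owns one of its members.

```python
import random
from collections import defaultdict


def chunk_of(x, y, size):
    """Map integer coordinates to the (column, row) of the square chunk holding them."""
    return (x // size, y // size)


def partition_with_overlap(points, size, margin):
    """Split points into chunks; also copy each point into nearby chunks.

    Returns (own, overlap): own[c] lists the points that live in chunk c,
    overlap[c] lists points from other chunks within `margin` of c's border.
    Assumption of this sketch: margin <= size, so only adjacent chunks matter.
    """
    if margin > size:
        raise ValueError("margin must not exceed the chunk size")
    own = defaultdict(list)
    overlap = defaultdict(list)
    for x, y in points:
        home = chunk_of(x, y, size)
        own[home].append((x, y))
        # Shifting the point by +/- margin on each axis reveals every
        # neighbouring chunk whose border it is close to.
        nearby = {
            chunk_of(x + dx, y + dy, size)
            for dx in (-margin, 0, margin)
            for dy in (-margin, 0, margin)
        }
        nearby.discard(home)
        for c in nearby:
            overlap[c].append((x, y))
    return own, overlap


def within(a, b, radius):
    """True if points a and b are at most `radius` apart (exact integer arithmetic)."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 <= radius ** 2


def count_pairs_chunked(points, size, radius):
    """Count distinct pairs within `radius`, processing each chunk independently."""
    own, overlap = partition_with_overlap(points, size, margin=radius)
    total = 0
    for c, members in own.items():
        candidates = members + overlap.get(c, [])
        for a in members:
            for b in candidates:
                # The a < b test counts each pair once, even when both of
                # its members see each other through overlap copies.
                if a < b and within(a, b, radius):
                    total += 1
    return total


def count_pairs_brute(points, radius):
    """Reference answer: compare every pair of points directly."""
    pts = sorted(points)
    return sum(
        1
        for i, a in enumerate(pts)
        for b in pts[i + 1:]
        if within(a, b, radius)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    cells = [(x, y) for x in range(50) for y in range(50)]
    sample = rng.sample(cells, 300)  # distinct points, so a < b is well defined
    print(count_pairs_chunked(sample, size=10, radius=2))
    print(count_pairs_brute(sample, radius=2))
```

The price of this independence is storage: points near chunk borders are stored more than once, and the margin caps the largest search radius a single chunk can answer on its own.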
