NOAO E2E next generation distributed file management using iRODS

Barg, Irene

The NOAO Mass Storage System (MSS) holds astronomical data collected from about two dozen scientific instruments at eleven telescopes on three mountaintops in two countries, north and south of the Equator. Data are transferred over the network from each mountain to data centers in La Serena, Chile, and Tucson, Arizona, and are then replicated across both hemispheres. A third copy is saved on tape at NCSA. This system is collectively called the End-to-End system (E2E). Data flow and file repository management are handled by a collection of custom code built on top of the Storage Resource Broker (SRB), developed by the Data Intensive Cyber Environments (DICE) research group at the University of North Carolina at Chapel Hill and the Institute for Neural Computation at the University of California, San Diego.

iRODS, the Integrated Rule-Oriented Data System, is a data grid software system developed by DICE and collaborators, and is the successor to SRB. Both SRB and iRODS can manage large volumes of data distributed across data centers. This paper describes why NOAO Science Data Management (SDM) chose iRODS as the next-generation file repository management system for the E2E data management system. We explain: 1) why we chose iRODS over other similar technologies; 2) our phased migration from SRB to iRODS; 3) how the use of iRODS 'rules' and 'micro services' dramatically reduced the amount of application code; and 4) how the iRODS schema simplifies integration of other external E2E services.

