Automated Curation of Infrared Imaging Data in the WFCAM & VISTA Science Archives.

Cross, Nicholas

The two fastest near-infrared survey telescopes are UKIRT-WFCAM in the north and VISTA in the south. Data from both instruments are archived by WFAU using the same pipeline, which takes the instrumental differences into account. The final catalogues from VISTA surveys will contain tens of billions of detections.

For both instruments, data are taken for a range of pointed surveys and smaller PI programmes. The surveys vary from shallow hemisphere surveys to ultra-deep single pointings with hundreds of individual epochs. The scientific goals range from finding rare objects to measuring large-scale statistical properties, so the products and database tables that need to be created vary widely from one data set to the next. The curation pipeline uses an SQL schema for each programme, together with a set of curation tables, to drive all the main processes: creation of deep images and catalogues from multiple observations, creation of band-merged catalogues, and creation of variability tables for analysing light curves and measuring proper motions.

For the main science surveys, curation is semi-automated: the survey team sets the high-level requirements, while the low-level requirements are derived from the data. Curation of the smaller programmes is completely automated. The decision-making process that drives the curation pipeline is therefore a crucial element of archiving these data.

Setting up involves "building" each programme: the science frames are grouped, and decisions are made about which processes need to be run. The ProgrammeBuilder code updates all the curation tables (the tables that drive the processing of data) and creates the SQL schema file for each survey from a template that is processed using the current programme requirements. We explain the important aspects of the ProgrammeBuilder code and the automated pipeline.
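
As a rough illustration of the building step described above, the following minimal Python sketch groups science frames, derives which processes to run, and fills in a schema template from those requirements. It is not the ProgrammeBuilder code: the frame records, the function names build_programme and render_schema, the epoch threshold, and the table and column names are all assumptions made for this example.

```python
from collections import defaultdict
from string import Template

# Hypothetical science-frame records: (pointing id, filter name, epoch in MJD).
frames = [
    ("p1", "J", 55000.1), ("p1", "J", 55010.3), ("p1", "K", 55000.2),
    ("p2", "J", 55001.0), ("p2", "K", 55001.1), ("p1", "J", 55020.7),
]


def build_programme(frames, min_epochs_for_variability=3):
    """Group frames by pointing and filter, then decide which steps to run.

    The threshold is an assumed value chosen only for illustration.
    """
    groups = defaultdict(list)
    for pointing, band, mjd in frames:
        groups[(pointing, band)].append(mjd)

    bands = sorted({band for _, band in groups})
    max_epochs = max((len(epochs) for epochs in groups.values()), default=0)
    return {
        "bands": bands,
        # Stack only if some pointing/filter pair was observed more than once.
        "make_deep_stacks": max_epochs > 1,
        # Band merging needs at least two filters.
        "make_band_merged": len(bands) > 1,
        # Light curves need enough epochs to be meaningful.
        "make_variability": max_epochs >= min_epochs_for_variability,
    }


# Hypothetical schema template; real archive tables have many more columns.
SCHEMA_TEMPLATE = Template(
    "CREATE TABLE ${programme}Source (\n"
    "    sourceID BIGINT NOT NULL,\n"
    "${band_columns}\n"
    "    PRIMARY KEY (sourceID)\n"
    ");\n"
)


def render_schema(programme, requirements):
    """Fill the template with one magnitude column per filter in use."""
    columns = "\n".join(
        f"    {band.lower()}Mag REAL," for band in requirements["bands"]
    )
    return SCHEMA_TEMPLATE.substitute(programme=programme, band_columns=columns)


requirements = build_programme(frames)
schema_sql = render_schema("demo", requirements)
```

The point of the sketch is that the requirements are derived once from the data and then reused, both to decide which processes run and to generate the schema, so a fully automated programme needs no hand-written SQL.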

