Next: Script language Up: ALMA Memo #293 ALMA Previous: Interaction with other Actors Contents

Archiving

The archive should enable astronomers to easily use data which has already been obtained with ALMA. This includes being able to judge the quality of the data, to produce new images from parts of the uv-data which have not previously been analysed, and to re-analyse data using new data reduction algorithms. The archive should produce the data products requested by the user. An astronomer unfamiliar with the project should be able to easily find out what data is available in the archive. This might involve successive inquiries to the archive to: i) find what sources have been observed, ii) find what spectral lines, velocity coverage, velocity resolution, continuum frequencies and bandwidths were used, iii) inspect the images which were produced by the on-line pipeline, iv) make new images from the archived uv-data. The data in the archive must be easily accessible in a number of ways, e.g. by date, project, source, frequency, or spectral line.
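The access paths listed above (date, project, source, frequency, spectral line) can be sketched as a simple filter over catalogue entries. This is only an illustration: the `ArchiveEntry` fields, the `search` function and the sample entries are assumptions made for this sketch, not a schema defined by the memo.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ArchiveEntry:
    """One archived observation (hypothetical schema, for illustration only)."""
    project: str
    source: str
    observed_on: date
    line: Optional[str]      # spectral line name, None for continuum
    freq_ghz: float          # centre frequency of the band
    bandwidth_ghz: float     # total bandwidth of the band


def search(entries: Iterable[ArchiveEntry], *, project: Optional[str] = None,
           source: Optional[str] = None, line: Optional[str] = None,
           freq_ghz: Optional[float] = None, after: Optional[date] = None,
           before: Optional[date] = None) -> List[ArchiveEntry]:
    """Return the entries matching every criterion that was supplied."""
    results = []
    for e in entries:
        if project is not None and e.project != project:
            continue
        if source is not None and e.source != source:
            continue
        if line is not None and e.line != line:
            continue
        # A frequency matches when it falls inside the observed band.
        if freq_ghz is not None and abs(e.freq_ghz - freq_ghz) > e.bandwidth_ghz / 2:
            continue
        if after is not None and e.observed_on < after:
            continue
        if before is not None and e.observed_on > before:
            continue
        results.append(e)
    return results


# Made-up catalogue entries, purely to exercise the queries.
CATALOGUE = [
    ArchiveEntry("P001", "Orion KL", date(2000, 1, 10), "CO(2-1)", 230.538, 2.0),
    ArchiveEntry("P001", "Orion KL", date(2000, 1, 11), None, 345.0, 8.0),
    ArchiveEntry("P002", "M82", date(2000, 2, 3), "CO(2-1)", 230.3, 2.0),
]

# Successive inquiries: first by source, then narrowed to one spectral line.
orion = search(CATALOGUE, source="Orion KL")
orion_co = search(orion, line="CO(2-1)")
```

Because `search` accepts any iterable of entries, the result of one inquiry can be fed into the next, which mirrors the successive narrowing described in the text.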

Clearly, the archive needs to include the raw uv-data, the calibration data and the images produced by the on-line pipeline. In the case of irreversible on-line corrections, such as atmospheric phase correction on time scales shorter than the integration time of the uv-data, it is desirable to keep both the corrected and the uncorrected uv-data (the French solution).

It would be useful to include in the archive the observing scripts, the data reduction scripts used to produce the images in the on-line pipeline, and a description of the scope and goals of the observations. Since new data reduction algorithms will be developed and existing code and algorithms will evolve, the original scripts might no longer be usable at a later date, but they will serve as a record of how the data was produced and reduced at the time it was taken. The description is very useful at a later date, when a user unfamiliar with the project can read it rather than reverse engineering the observing parameters to deduce what information the data might reveal. To enable the use of alternative calibration routines, the uv-data within each observing block should be identified by its intended function (phase calibrator, target source, coherence calibrator, etc.). Generally an observing block should be self-contained and able to be calibrated independently, although it might use system calibrations, e.g. Jy/Kelvin scales and bandpass.
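A minimal sketch of tagging uv-data by intended function might look as follows. The class names, the `Intent` values (taken from the examples in the text) and the rule used in `is_self_contained` are assumptions for illustration; the memo does not define what exactly makes a block self-contained.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Intent(Enum):
    """Intended function of a scan, using the examples named in the text."""
    PHASE_CALIBRATOR = "phase calibrator"
    TARGET = "target source"
    COHERENCE_CALIBRATOR = "coherence calibrator"


@dataclass
class Scan:
    source: str
    intent: Intent


@dataclass
class ObservingBlock:
    block_id: str
    scans: List[Scan] = field(default_factory=list)

    def by_intent(self) -> Dict[Intent, List[str]]:
        """Group source names by intent, keeping first-seen order."""
        groups: Dict[Intent, List[str]] = {}
        for scan in self.scans:
            names = groups.setdefault(scan.intent, [])
            if scan.source not in names:
                names.append(scan.source)
        return groups

    def is_self_contained(self) -> bool:
        # Assumption for this sketch: a block can be calibrated on its own
        # if it holds at least one target and one phase calibrator.
        intents = {scan.intent for scan in self.scans}
        return Intent.TARGET in intents and Intent.PHASE_CALIBRATOR in intents


# Made-up block: calibrator and target scans interleaved.
block = ObservingBlock("OB-1", [
    Scan("J0530+135", Intent.PHASE_CALIBRATOR),
    Scan("Orion KL", Intent.TARGET),
    Scan("J0530+135", Intent.PHASE_CALIBRATOR),
])
groups = block.by_intent()
```

With the intent stored alongside the data, an alternative calibration routine can select its inputs from `by_intent()` without consulting the original observing script.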

The post-processing of an observing block can be described by the use of objects and methods. Objects include such things as phase calibrators, amplitude calibrators, coherence calibrators, bandpass calibrators, source(s) and bands (including both multiple observing frequencies and spectral windows). The calibration objects can be selected at the time of the observation, so the detailed data reduction should follow from the actual observations, and not from a previously prepared observing script. The methods specify how to use the phase calibrator objects to apply phase calibration to the target sources, etc. New algorithms might be developed, so that archived data can be re-analysed using new methods on the previously defined objects within the observing blocks. One could think about inheritance, so that an observing block which does not have a specific bandpass calibrator could use a default bandpass defined at the system or project level. The designation of the calibration objects by the observer is used by the pipeline post-processing and in the calibration of current or archived data. That is, we do not draw any distinction between data written in close to real time and data recovered from the archive at a later date; both can be post-processed using the chosen methods to deliver the requested data product: "raw" or calibrated uv-data, selected spectral windows and spectral resolution, synthesized images and beams, or deconvolved images.
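The inheritance idea can be illustrated with a chain of scopes, where a lookup that fails at the observing-block level falls back to the project and then the system level. This is one possible reading of the suggestion, not a design stated in the memo; `CalibrationScope`, `resolve`, the role names and the calibrator names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CalibrationScope:
    """A level (system, project or observing block) that may define calibrators."""
    name: str
    calibrators: Dict[str, str] = field(default_factory=dict)  # role -> source name
    parent: Optional["CalibrationScope"] = None

    def resolve(self, role: str) -> str:
        """Return the calibrator for `role`, searching this scope then its parents."""
        scope: Optional[CalibrationScope] = self
        while scope is not None:
            if role in scope.calibrators:
                return scope.calibrators[role]
            scope = scope.parent
        raise KeyError(f"no {role!r} calibrator defined for {self.name}")


# Made-up hierarchy: only the system level defines a bandpass calibrator.
system = CalibrationScope("system", {"bandpass": "3C273"})
project = CalibrationScope("project P001", parent=system)
block = CalibrationScope("block OB-1", {"phase": "J0530+135"}, parent=project)

bandpass = block.resolve("bandpass")   # inherited from the system level
phase = block.resolve("phase")         # defined on the block itself
```

The same `resolve` call serves data arriving in close to real time and data recovered from the archive, since it depends only on the objects recorded with the block.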

The software to retrieve the data should check whether they are actually public before forwarding them to astronomers outside the original team of investigators of a given project.
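Such a check might be sketched as below. The memo does not say how public status is determined; the release date (`public_from`) and the list of investigators used here are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class Dataset:
    project: str
    public_from: date              # assumed: date on which the data becomes public
    investigators: FrozenSet[str]  # assumed: original team of the project


def may_release(dataset: Dataset, requester: str, today: date) -> bool:
    """Allow retrieval by the original team, or by anyone once the data is public."""
    if requester in dataset.investigators:
        return True
    return today >= dataset.public_from


# Made-up dataset and users.
ds = Dataset("P001", date(2001, 3, 8), frozenset({"alice", "bob"}))
team_ok = may_release(ds, "alice", date(2000, 6, 1))
outsider_early = may_release(ds, "carol", date(2000, 6, 1))
outsider_late = may_release(ds, "carol", date(2001, 3, 8))
```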

There may be several archives which hold all or subsets of the data. The principal archives should be easily accessed by users in the USA, Europe, and Japan (elsewhere?). There also needs to be an archive which is easily accessed from the ALMA site, both to buffer the data from the telescope and to provide data for on-line imaging of multiple array configurations.


Kate Weatherall
2000-03-08