Requirements to archive data at the HEASARC
Introduction
This document lists general guidelines for archiving X-ray,
gamma-ray and EUV data at the HEASARC.
These guidelines are intended to ensure and maintain the capability of
the multi-mission approach of the HEASARC archive as described
in the HEASARC charter. Project-specific needs have to be agreed upon
on a case-by-case basis.
The HEASARC's general policy is that archival data, to be effective, must
include, in addition to the data themselves, documentation, software and
calibration data. The lack of any of these components prevents
the full exploitation of the archival data.
NASA Astrophysics projects usually produce a Project Data Management
Plan (PDMP) that describes how their data will be analyzed and archived.
NASA HQ will ask the HEASARC to review this plan and concur that it is acceptable.
The projects are encouraged to work with the HEASARC in writing their PDMP.
Data
Delivery
The HEASARC can receive the data either at the end of mission operations
as the final mission archive site, or during the mission operations phase
as the primary mission archive site.
To use the HEASARC as the primary archive during the mission operations phase,
the project should contact the HEASARC to establish the details of the data
delivery and the archive structure. All data delivered to the HEASARC are made
public as soon as they are archived, unless the mission requires a proprietary
period, in which case the HEASARC will store the data in a protected format.
Data can be delivered to the HEASARC using different media or methods.
The media currently supported are 8 mm tapes, 4 mm tapes, DLT tapes and CDs.
Support for DVDs is planned in the near future.
No other media are supported. Data can also be delivered electronically
to the HEASARC staging area. This area consists of a set of disks located
outside of the HEASARC main archive. Data are transferred to the staging area,
checked for correct transmission and then moved into the main archive.
For active missions, electronic data transfer is the method preferred by
the HEASARC. The HEASARC has adopted a data transfer protocol, DTS, originally
developed by the XMM consortium. DTS uses the FTP protocol and requires
the DTS software to be installed at the site that initiates the transfer.
Agreements with HEASARC personnel need to be established to use an
electronic data transfer method.
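The check for correct transmission in the staging area could be sketched as follows. This is only an illustration, not the DTS procedure: it assumes the delivering project supplies a manifest of SHA-256 digests, and the function names `sha256_of` and `verify_staged_files` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_staged_files(staging_dir, manifest):
    """Compare staged files against a manifest of expected digests.

    manifest maps a file name (relative to staging_dir) to the hex digest
    computed by the sender. This manifest format is an assumption.
    Returns a dict with the lists 'ok', 'corrupt' and 'missing'.
    """
    report = {"ok": [], "corrupt": [], "missing": []}
    for name, expected in sorted(manifest.items()):
        path = Path(staging_dir) / name
        if not path.is_file():
            report["missing"].append(name)
        elif sha256_of(path) == expected.lower():
            report["ok"].append(name)
        else:
            report["corrupt"].append(name)
    return report
```

Under this scheme, only files listed under `ok` would be candidates for the move into the main archive. The others would be requested again from the sender.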
Format
NASA has mandated the archival of astrophysical data in FITS format.
Following this mandate, the HEASARC has adopted and promoted FITS as the standard
format for all levels of data, from the basic reformatted telemetry
to higher-level products such as lightcurves, spectra or images.
To help projects provide data in FITS format, the HEASARC has developed FITS
standards for headers and data structures that describe most high-energy
astrophysical data.
Data delivered to the HEASARC should comply with these existing standards.
If these standards are not appropriate for a particular data set, projects are
encouraged to interact with the HEASARC personnel to define headers and data
structure suitable for their data.
Using standard headers and data structures makes it possible to use
existing software to manipulate the FITS files and, where suitable, the analysis
packages available at the HEASARC. In the past, this has proven
effective in reducing the costs associated with data analysis software.
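To make the idea of a standard header concrete, here is a minimal sketch of how FITS header cards are laid out: 80-character records, with the keyword in columns 1-8, the value indicator `= ` in columns 9-10, and the header padded with blanks to a multiple of 2880 bytes. It is not a substitute for a FITS library such as FITSIO. The helper names `fits_card` and `fits_header` are hypothetical, and only logical, integer and short string values are handled.

```python
def fits_card(keyword, value, comment=""):
    """Format one 80-character FITS header card in fixed format."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are at most 8 characters long")
    if isinstance(value, bool):
        # Logical values: T or F, right-justified so it lands in column 30.
        field = ("T" if value else "F").rjust(20)
    elif isinstance(value, int):
        # Integers: right-justified ending in column 30.
        field = str(value).rjust(20)
    elif isinstance(value, str):
        # Strings: single-quoted, embedded quotes doubled,
        # padded to at least 8 characters inside the quotes.
        text = value.replace("'", "''")
        field = ("'" + text.ljust(8) + "'").ljust(20)
    else:
        raise TypeError("this sketch handles only bool, int and str values")
    card = keyword.upper().ljust(8) + "= " + field
    if comment:
        card += " / " + comment
    if len(card) > 80:
        raise ValueError("card exceeds 80 characters")
    return card.ljust(80)


def fits_header(cards):
    """Join cards, append END, and pad to a multiple of 2880 bytes."""
    body = "".join(cards) + "END".ljust(80)
    padding = (-len(body)) % 2880
    return (body + " " * padding).encode("ascii")


# Placeholder values; real TELESCOP/INSTRUME values are mission-specific.
header = fits_header([
    fits_card("SIMPLE", True, "conforms to the FITS standard"),
    fits_card("BITPIX", 8),
    fits_card("NAXIS", 0),
    fits_card("TELESCOP", "EXAMPLE", "placeholder mission name"),
    fits_card("INSTRUME", "EXAMPLE", "placeholder instrument name"),
])
```

Because every mission writes keywords such as `TELESCOP` and `INSTRUME` in the same place and form, generic tools can identify and process the file without mission-specific code.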
Simple FITS wrappers of the raw data are discouraged for both science data and
calibration data files. The HEASARC will also accept GIF, JPEG and/or PostScript files
as quick-view or preview versions of the FITS data products.
As a general policy, the HEASARC does not archive the original telemetry, which
is in general not in FITS format. If a project wishes to archive it, suitable working
software to read the telemetry data, as well as documentation, has to be
delivered with the archival data.
The non-FITS data (and the accompanying software and documentation) will not
be available from the HEASARC on-line archive, but only upon request.
As a general policy, data that are not in FITS format, and for which insufficient
software or documentation exists, are not suitable to be archived at the HEASARC.
Projects that use HEASARC as their archive should not assume that HEASARC
will reformat non-FITS data into FITS. The project should make
the HEASARC aware of their plans and agree upon their PDMP with the HEASARC
well before the mission is active. The HEASARC might reformat non-FITS data
into FITS, depending on the available software, documentation and
HEASARC resources. However, this would be the exception and not the rule.
Calibration
Any data archived at HEASARC need to be accompanied by their relevant
calibration data. As for the science data, calibration data should also
be delivered in FITS format.
High-level calibration data, e.g. response matrices, should
conform to the standard HEASARC format.
Lower-level calibration data, used for example by the reduction
software, can be stored in a mission-specific FITS format.
At the HEASARC, calibration data are stored either with the science data
or in the HEASARC calibration database.
The FITS files for the calibration database must contain appropriate
keywords to access the data via the calibration database tools.
All calibration data should clearly state their range of validity.
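One reason the range of validity matters is that software must be able to pick the right calibration file for a given observation date. The sketch below illustrates such a selection. The record layout (`file`, `valid_from`, `valid_until`) and the function name `select_calibration` are assumptions made for this example and do not describe the actual calibration database keywords or tools.

```python
from datetime import date


def select_calibration(entries, observation_date):
    """Return the calibration entry valid on observation_date.

    Each entry is a dict with 'file', 'valid_from' (date) and
    'valid_until' (date, or None for open-ended validity). The end
    date is treated as exclusive. If several entries match, the one
    with the most recent start date wins.
    """
    candidates = [
        entry for entry in entries
        if entry["valid_from"] <= observation_date
        and (entry["valid_until"] is None
             or observation_date < entry["valid_until"])
    ]
    if not candidates:
        raise LookupError(
            "no calibration entry is valid on %s" % observation_date)
    return max(candidates, key=lambda entry: entry["valid_from"])


# Hypothetical calibration index with two response matrix versions.
entries = [
    {"file": "rmf_v1.fits", "valid_from": date(2000, 1, 1),
     "valid_until": date(2003, 1, 1)},
    {"file": "rmf_v2.fits", "valid_from": date(2003, 1, 1),
     "valid_until": None},
]
chosen = select_calibration(entries, date(2001, 6, 1))
```

A calibration file delivered without a stated validity range could not be selected this way, which is why the range should be recorded with the data.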
Software
The HEASARC provides a suite of programs suitable for manipulating
FITS files and also multi-mission software to analyze high-level products
that are in the HEASARC standard format. These programs are part of the HEAsoft
software package. HEAsoft also includes many mission specific tasks
to deal with the specifics of the experiment calibration or the screening
of the archival data.
The HEASARC requires the archiving of any mission-specific software or scripts
that may be needed to reduce the data in the archive.
If several levels of FITS data for a specific mission are archived at the
HEASARC, the scripts, programs and/or recipes used to screen the data and
derive the higher data levels should be delivered along with the data.
The HEASARC encourages missions to use the HEAsoft infrastructure and to add
their specific packages to HEAsoft. To secure a long lifetime for the
software, HEASARC encourages and promotes: software portability, to ensure
software operability on many (of the popular) operating systems; modular
software, with each program dedicated to a specific task rather than trying to
do 'everything' in one program; clean interfaces, e.g. ones not dependent on
commercial database systems or databases in general. The HEASARC does not
support software built on commercial packages (e.g. IDL).
To help developers meet these standards, the HEASARC distributes libraries
to read and write FITS files, and to implement parameter file interfaces
(e.g. FITSIO, XPI).
The HEASARC cannot accept software that does not meet these standards, because
it would be much too difficult to maintain.
Any special requirements of a mission software package that do not meet these
HEASARC standards would have to be agreed to by the HEASARC beforehand.
Just as for data, the HEASARC can receive the mission-specific software and
ingest it into the HEAsoft package, either at the end of mission operations
as part of the final archive or during the mission operations phase.
During the mission operations phase, the mission-specific software can be
included as part of the HEAsoft software distribution. However, the project
retains the responsibility for maintaining and updating the software.
The details of the software delivery during operations and of the turn-over
of the responsibility for software maintenance should be agreed upon with
the HEASARC personnel.
Documentation
The HEASARC requires the delivery of every level of useful documentation
that is relevant to the usage of the archival data. The documentation
should include satellite and instrument descriptions, data format
descriptions, software documentation and calibration documents. Any
important events that occurred during the mission lifetime that have
relevance to the archival data should also be documented. When possible,
the documentation should be provided in electronic form.
Web pages that contain vital mission information can be imported to the HEASARC
and made accessible via the HEASARC web pages once the mission is completed
and the mission website is closed down.
Database Table
The final archive of an experiment also includes catalogs.
These catalogs may be contained in database tables that hold various types
of information, e.g. a source catalog as the final product of an experiment, or
timelines and observing logs. These catalogs should be ingested into the
HEASARC database system.
Database tables are also used in the HEASARC database system to access and
retrieve the data from the archive. For this type of usage, the table must
contain a field that uniquely identifies a dataset or a file located in the
archive. The HEASARC standard for database tables is a plain ASCII file in which
the fields are pipe-delimited, accompanied by an ASCII header that
describes the data types of the fields and other table characteristics.
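As an illustration of the pipe-delimited layout and of the unique-identifier requirement, the sketch below reads such a table and rejects duplicate keys. The layout of the accompanying ASCII header is not described here, so the field names are passed in explicitly. The function name `read_pipe_table` and the sample fields (`obsid`, `target`, `date`) are hypothetical.

```python
import csv
import io


def read_pipe_table(text, field_names, key_field):
    """Parse pipe-delimited rows into a dict keyed by key_field.

    Raises ValueError if a row has the wrong number of fields or if
    two rows share the same key, since the key must identify exactly
    one dataset or file in the archive.
    """
    reader = csv.reader(io.StringIO(text), delimiter="|",
                        quoting=csv.QUOTE_NONE)
    rows = {}
    for line_number, values in enumerate(reader, start=1):
        if not values:
            continue  # skip blank lines
        if len(values) != len(field_names):
            raise ValueError(
                "line %d: expected %d fields, found %d"
                % (line_number, len(field_names), len(values)))
        record = dict(zip(field_names, (v.strip() for v in values)))
        key = record[key_field]
        if key in rows:
            raise ValueError(
                "line %d: duplicate key %r" % (line_number, key))
        rows[key] = record
    return rows


# Hypothetical observing log with three fields per row.
sample = "obs001|Crab|1999-03-01\nobs002|Cyg X-1|1999-03-02\n"
table = read_pipe_table(sample, ["obsid", "target", "date"], "obsid")
```

Running a check like this before delivery would catch rows that cannot be mapped to a single dataset or file.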
If the HEASARC is the main archive during mission operations, the project
and the HEASARC should agree on a schedule for updating all mission-specific
tables that will be available through the HEASARC on-line system interface "Browse".
The HEASARC archive "Browse" interface can also support databases that
remotely access data that are located at other institutions.
Last modified: Monday, 19-Jun-2006 11:24:57 EDT