This Legacy journal article was published in Volume 7, June 1998, and has not been updated since publication.
INTEGRAL Data Models

D. Jennings (INTEGRAL Science Data Centre/RSTX),
P. Dubath, S. Paltani, R. Walter (INTEGRAL Science Data Centre)


INTEGRAL is a gamma-ray observatory mission consisting of 4 co-aligned coded-mask high-energy instruments and an optical monitor camera. It is due for launch in 2001 and will observe in the 2 keV - 10 MeV energy range. Because of the "dithering" observation strategy employed for coded-mask source reconstruction, most target observations will consist of hundreds of individual pointed exposures. A FITS-based data model has been developed that allows the analysis software and mission archive to organize and access the numerous data structures that make up each target observation.

  1. Introduction to the INTEGRAL Mission

    INTEGRAL (INTErnational Gamma Ray Astrophysics Laboratory) is a European Space Agency-led Medium Class gamma-ray observatory mission scheduled for launch in April of 2001, with a nominal lifetime of 2 years and a possible extension of an additional 3 years. The spacecraft and payload modules reuse the design developed for ESA-XMM in an effort to minimize costs, and the payload consists of four co-aligned instruments: a wide field gamma-ray imager (IBIS) and spectrometer (SPI), dual X-ray monitors (JEM-X) and an optical monitor (OMC). Each of the four instruments is built by a consortium of institutions and funded through national space agencies. Significant U.S. and Russian participation in the mission is anticipated through provision of groundstation support (NASA DSN) and the satellite launch vehicle (Russian Proton), respectively. The data center for the mission, the INTEGRAL Science Data Centre (ISDC), is provided by a consortium of institutions led by the Geneva Observatory and is located in Versoix, Switzerland.

    The instruments of the INTEGRAL scientific payload have been carefully chosen to facilitate simultaneous multiwavelength analysis of the observed targets with good imaging and spectral coverage in the high energy bands. Their measurements will be made available to Guest Observers (GOs) as a single comprehensive data set for each target. IBIS is optimized for imaging discrete gamma-ray sources over the energy range from 20 keV to 10 MeV with an angular resolution of 12 arcmin FWHM; source localization on the order of 30 arcsec is anticipated. SPI is optimized for detailed measurements of gamma-ray lines and the mapping of diffuse sources with an angular resolution from 2.5 to 25 degrees depending on targets. The 19 SPI detectors make precise measurements (E/dE = 500) of the gamma-ray energies over the 20 keV - 8 MeV range. The dual JEM-X detectors will provide imaging with an angular resolution of 3 arcmin in the 2 - 120 keV band, simultaneously with the primary instruments' observations, to determine the X-ray behavior of the gamma-ray sources and to better identify the counterparts of the sources. OMC will observe the optical emission from the prime targets of the three other instruments, offering the first opportunity to make long observations in the optical band simultaneously with those at X-rays and gamma-rays.

    The scientific topics to be covered by the INTEGRAL mission observing program include, in brief: the Galactic center (detection of transients, mapping of the 511 keV annihilation line and the 1809 keV aluminum-26 emission line), compact objects, nucleosynthesis, Active Galactic Nuclei (AGN), clusters of galaxies, and Gamma Ray Bursts (GRBs).

  2. Observing Strategy and Dithering

    The IBIS imager, SPI spectrometer and JEM-X X-ray monitors all employ coded mask detection technology. Mainly due to the needs of SPI, the observation strategy will require the spacecraft to "dither" its pointing position in order to accumulate the required number of coded sky images for source reconstruction. For a given target, the dithering strategy will nominally consist of 20-minute spacecraft pointings (i.e., the FOV at fixed RA, DEC), each followed by a 2 degree slew to the next pointing coordinates. The ditherings will follow either a hexagonal pattern or a 5-by-5 pointing raster scan pattern, with the target of interest typically located in the center of the pattern.

    Diagram 1: Examples of the three dithering patterns (hexagonal, raster, GPS) used in the observing strategy.

    INTEGRAL will also perform scans of the galactic plane (GPS) for approximately 15 percent of its total observing time (i.e., one day per week) searching for transient sources. These scans employ a "saw tooth" pattern and result in spacecraft pointings of 12 minute durations separated by 4.5 degree slews.

    The data acquired from a number of these pointings and slews will make up what is typically termed an "observation" of a celestial target. For a typical 100,000 to 300,000 second observation, hundreds of pointing and slew data sets will result. Note that INTEGRAL may spend a large fraction of its time, perhaps up to 20 percent, performing slews; therefore, it is important to make use of the data acquired during slews.
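
    As a rough check of the "hundreds of data sets" figure, the sketch below counts pointings and slews for the quoted observation lengths. It assumes, purely for illustration, that slews take the full 20 percent upper bound mentioned above and that exactly one slew separates consecutive 20-minute pointings; the function and parameter names are hypothetical.

```python
def estimate_data_sets(observation_s, pointing_s=20 * 60, slew_percent=20):
    """Roughly count the pointing and slew data sets in one observation.

    Assumptions (illustrative only): slews consume `slew_percent` of the
    observation time, and one slew separates consecutive pointings.
    """
    # Time left for stable pointings after removing the slew share.
    pointing_time_s = observation_s * (100 - slew_percent) // 100
    n_pointings = pointing_time_s // pointing_s
    n_slews = max(n_pointings - 1, 0)
    return n_pointings, n_slews, n_pointings + n_slews


for duration_s in (100_000, 300_000):
    print(duration_s, estimate_data_sets(duration_s))
```

    Under these assumptions a 100,000 second observation yields about 130 data sets and a 300,000 second observation about 400.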

    The dithering strategy creates some unique issues for INTEGRAL data organization and analysis. Not only do observations of a single target consist of collections of data acquired during numerous pointings and slews, but data from a single pointing or slew may belong to several observations simultaneously. Pointings and slews are themselves collections of science, instrument housekeeping and auxiliary data, all of which may be organized into various combinations for reasons of efficiency and conceptual elegance. The data model upon which the INTEGRAL analysis software, production pipelines and archive are designed must be able to process and store pointing and slew data sets as independent units of data, and allow observations to be constructed from large, dynamic collections of these data sets.

    Diagram 2: Schematic relationship between data acquired during pointings and slews to the target observations.

  3. ISDC Data Model

    To meet the requirements and constraints of the mission observation strategy, the ISDC has developed a data model that generalizes pointings, slews, target observations, and their associated auxiliary data into a four-level hierarchy: (1) data elements, (2) data groupings, (3) science windows, and (4) observations. All levels are supported within the FITS data format (ADF, 1997) and the ISDC software system via CFITSIO (Pence, 1998) and the ISDC Data Access Layer (ISDC, 1997).

    A.   Data Elements

    The first level of data within the model is the Data Element, which represents the atomic unit of data organization. A data element corresponds to a single structure such as an array of numbers or a table of information. Each data element resides in a single FITS Header Data Unit (HDU) for purposes of data format access and storage.

    Data elements of a particular pre-defined structure are given identifiers of the form "AAAA-BBBB-CCC". "AAAA" is a sequence of 4 characters that defines the class of the data structure. Currently defined classes are:

    IBIS     IBIS specific elements 
    SPI.     SPI specific elements
    JMXi     JEM-X specific elements (JMX1 and JMX2 for JEM-X 1 
             and JEM-X 2, respectively) 
    OMC.     OMC specific elements
    IREM     REM specific elements
    AUX.     elements containing auxiliary data  
    OSM.     OSM specific elements 
    QLA.     Quick-Look Analysis specific elements 
    GNRL     General elements, not uniquely related to any of the above classes 

    "BBBB" is a sequence of 4 characters that defines the subclass of the data element and distinguishes between different types of data. Some examples of currently defined subclasses, with the instrument each applies to, are:

    SUBW  OMC    Subwindows (either in normal, or in fast mode) 
    SEVT  SPI    Single events 
    PEVT  SPI    PSD events 
    MExy  SPI    Multiple (xy-uple) events 
    ISGR  IBIS   ISGRI events 
    PICS  IBIS   PICsIT events 
    COMP  IBIS   Compton events 
    FULL  JEM-X  Full imaging mode events 
    REST  JEM-X  Restricted imaging mode events 
    SPTI  JEM-X  Spectral-timing mode events 

    Finally, "CCC" is a sequence of 3 characters that defines the type of data. Some examples of currently defined data element types are:

    PRW  Telemetry packets   
    ARW  Additional raw scientific data  
    RAW  Raw scientific event or histogram data
    PRP  Prepared scientific events
    COR  Corrected scientific event data
    ARR  Corrected array data
    RHK  Raw housekeeping data 
    CNV  Converted housekeeping data 
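
    The "AAAA-BBBB-CCC" naming scheme lends itself to simple mechanical parsing. The following is a minimal sketch, not part of the ISDC software: the names ElementId, parse_element_id and format_element_id are hypothetical, and the sample identifier is assembled from the tables above only to illustrate the format.

```python
from typing import NamedTuple


class ElementId(NamedTuple):
    data_class: str  # "AAAA", e.g. "IBIS" or "SPI."
    subclass: str    # "BBBB", e.g. "SEVT"
    data_type: str   # "CCC", e.g. "RAW"


def parse_element_id(identifier: str) -> ElementId:
    """Split an identifier into its class, subclass and type fields."""
    parts = identifier.split("-")
    # The three fields must be exactly 4, 4 and 3 characters long.
    if [len(part) for part in parts] != [4, 4, 3]:
        raise ValueError(f"expected AAAA-BBBB-CCC, got {identifier!r}")
    return ElementId(*parts)


def format_element_id(element: ElementId) -> str:
    """Rebuild the identifier string from its three fields."""
    return "-".join(element)


example = parse_element_id("SPI.-SEVT-RAW")  # illustrative identifier only
print(example.data_class, example.subclass, example.data_type)
```

    Note that the trailing period in three-letter classes such as "SPI." is part of the 4-character class field, so the fields can be checked by length alone.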

    B.  Data Groupings

    The second level of data organization is the Data Grouping, used to form generic associations between data elements and to create compound data structures. The association between the various data elements of a data grouping is defined at the FITS format level using the FITS Hierarchical Grouping Convention (Jennings et al., 1997).

    Data groupings are implemented within FITS as a grouping table HDU. The rows of the grouping tables contain information about their member elements (HDUs) such as the member file location, file name, and location or identification within the file. This member information essentially forms a pointer to the member, allowing software applications to locate and access the member elements (HDUs) regardless of their true location.
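
    The pointer-like role of a grouping table row can be sketched as follows. This is an in-memory illustration only, not the FITS grouping table layout or the DAL interface; the class names, field names and file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MemberRow:
    element_name: str                # identifier of the member element
    hdu_position: int                # position of the member HDU in its file
    file_name: Optional[str] = None  # None: same file as the grouping table


@dataclass
class GroupingTable:
    file_name: str
    rows: List[MemberRow] = field(default_factory=list)

    def locate_members(self) -> List[Tuple[str, int]]:
        """Resolve every row to a (file, HDU position) pair."""
        return [
            (row.file_name or self.file_name, row.hdu_position)
            for row in self.rows
        ]


table = GroupingTable(
    "group.fits",  # hypothetical file names throughout
    [MemberRow("SPI.-SEVT-RAW", 2), MemberRow("IBIS-ISGR-RAW", 1, "ibis.fits")],
)
print(table.locate_members())
```

    The second member lives in a different file from the grouping table, yet an application resolves both members in the same way.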

    Diagram 3: Schematic of the relationship between a grouping table and its members. Note that the member HDUs may reside in different FITS files.

    The CFITSIO library will soon support the direct creation and management of grouping tables, and the level 2 ISDC Data Access Layer (DAL) library has been written to support the creation and manipulation of entire data groupings as single data objects. The level 3 ISDC DAL libraries will provide application programming interfaces to specific pre-defined data groupings and data elements, so that applications may manipulate all member data elements of select data groupings as if they were single entities. The support for data groupings, especially the more complex data groupings, within the data access software libraries is a key feature of the ISDC data model.

    The use of data groupings as a data organization tool provides the following advantages:

        1.   The data groupings may be defined without regard to FITS file content, location, or organization.

        2.   The data element associations defined in the data groupings are part of the data itself, thus allowing for long term archival storage.

        3.   Data groupings may span multiple files, multiple file systems and multiple computer systems.

        4.   Data elements may be shared by many data groupings with no duplication of information.

        5.   Data groupings may contain other data groupings as elements, thus allowing for the construction of arbitrary data hierarchies.
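
    Advantages 4 and 5 can be illustrated with a small in-memory sketch: one element is shared by two groupings, and a parent grouping contains both. The classes and the unique_elements helper are hypothetical and only model the idea; they do not represent the FITS convention or the DAL libraries.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    name: str


@dataclass
class Grouping:
    name: str
    members: List[Union["Element", "Grouping"]] = field(default_factory=list)


def unique_elements(grouping: Grouping) -> List[Element]:
    """Walk a grouping hierarchy and return each element exactly once."""
    seen, found = set(), []
    stack = [grouping]
    while stack:
        member = stack.pop()
        if id(member) in seen:  # shared member already visited
            continue
        seen.add(id(member))
        if isinstance(member, Grouping):
            # Reverse so that members are visited in their listed order.
            stack.extend(reversed(member.members))
        else:
            found.append(member)
    return found


events = Element("events")
housekeeping = Element("housekeeping")
group_a = Grouping("A", [events, housekeeping])
group_b = Grouping("B", [events])  # shares "events" with group A
top = Grouping("top", [group_a, group_b])
print([element.name for element in unique_elements(top)])
```

    The shared element is stored once and referenced twice, so the walk reports it only once.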

    C.   Science Windows

    The third level of data organization is the Science Window. A science window is a specific type of data grouping that contains many other data groupings created at various stages of data processing. Each science window corresponds to a time period, or "window", within the mission observing schedule during which useful science data have been acquired by the INTEGRAL instruments and a valid attitude solution exists. Thus, all data acquired during a given science window share a common spacecraft attitude (i.e., RA and DEC coordinates) or spacecraft slew path (i.e., RA and DEC interpolations with positions given every 10 seconds). The science windows also constitute the fundamental unit of data for purposes of processing within the ISDC system.

    Science Windows are each identified by a 12 digit Science Window ID (ScWID) of the form "RRRRPPPPFSSS", where:

    "RRRR" is the mission orbital revolution number,

    "PPPP" is the mission pointing identifier,

    "F" is a flag specifying if the science window results from a stable pointing (0) or slew (1), and

    "SSS" is a number that enumerates possible analysis-induced subdivisions of the original planned spacecraft pointing or slew.

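    The ScWID layout described above can be decoded by position. The sketch below is illustrative only: the ScWID class, the parse_scwid function and the sample value are hypothetical, and the field widths are taken directly from the "RRRRPPPPFSSS" form.

```python
from typing import NamedTuple


class ScWID(NamedTuple):
    revolution: int   # "RRRR"
    pointing: int     # "PPPP"
    is_slew: bool     # "F": 0 = stable pointing, 1 = slew
    subdivision: int  # "SSS"


def parse_scwid(scwid: str) -> ScWID:
    """Decode a 12-digit Science Window ID into its four fields."""
    if len(scwid) != 12 or not (scwid.isascii() and scwid.isdigit()):
        raise ValueError(f"ScWID must be 12 digits, got {scwid!r}")
    flag = scwid[8]
    if flag not in "01":
        raise ValueError(f"pointing/slew flag must be 0 or 1, got {flag!r}")
    return ScWID(
        revolution=int(scwid[0:4]),
        pointing=int(scwid[4:8]),
        is_slew=(flag == "1"),
        subdivision=int(scwid[9:12]),
    )


print(parse_scwid("002500121000"))  # hypothetical ScWID: a slew in revolution 25
```
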
    D.   Observations

    The highest level of the data model organization is the observation. Observations, themselves data groupings, are collections of science windows that have been processed together through Standard Analysis, along with the resulting high level data products. The scientific content of observation data groupings will usually correspond to target observations resulting from accepted proposals as defined in the mission scheduling system; however, it is possible for users to combine arbitrary collections of science windows into an observation and process them in Standard Analysis. The observation data grouping will also define the primary unit of data for purposes of distribution to the GOs.

    Observations are each identified by an Observation ID (obsID) of the form "AAPPPPOO", where:

    "AA" is the Announcement of Opportunity identifier,

    "PPPP" is the proposal ID, and

    "OO" is the observation number unique to the proposal.

    When "AA" is set to 00 it implies data acquired during the Pre-Verification phase; likewise, an "AA" value of 99 implies an observation data grouping resulting from an arbitrary collection of science windows.
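
    The obsID fields and the two reserved "AA" values can be decoded in the same positional way. This is a minimal sketch with hypothetical names (describe_obsid and the dictionary keys) and a made-up sample obsID; the fields are kept as strings because the text does not say whether they are strictly numeric.

```python
def describe_obsid(obsid: str) -> dict:
    """Split an "AAPPPPOO" Observation ID and interpret the "AA" field."""
    if len(obsid) != 8:
        raise ValueError(f"obsID must have 8 characters, got {obsid!r}")
    ao, proposal, number = obsid[0:2], obsid[2:6], obsid[6:8]
    if ao == "00":
        origin = "Pre-Verification phase"
    elif ao == "99":
        origin = "arbitrary collection of science windows"
    else:
        origin = "Announcement of Opportunity " + ao
    return {"ao": ao, "proposal": proposal, "observation": number, "origin": origin}


print(describe_obsid("99000103"))  # hypothetical obsID
```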

  4. Data Grouping, Science Window and Observation Production

    The ISDC system will operate two parallel processing streams, one for the Near Real Time (NRT) telemetry data stream and another for the non-real time consolidated telemetry stream. Both processing streams create data products arranged into data groupings, science windows and observations. The following table summarizes the data groupings and products created by the processing streams.

    Data Grouping   Created By                    Contents
    Mnemonic        (Subsystem)

    ScWG1           Pre-Processing                Raw data (decoded telemetry)
    ScWG2           Data Preparation              HK, event time conversions
    OSMG            Operation Status Monitoring   GTIs, inst. monitoring results
    ScWG3           Data Correction and Binning   selected events, array binning
    SQLAG           Quick-Look Analysis           QLA sci. win. analysis products
    DQLAG           Quick-Look Analysis           QLA observation grouping
    DQLAPG          Quick-Look Analysis           QLA analysis products
    SAG             Standard Analysis             Standard Analysis observation
    SAPG            Standard Analysis             Standard Analysis products

    The NRT processing stream works from an incomplete telemetry feed routed from the spacecraft via the Mission Operations Centre (MOC) at ESOC/Darmstadt. Its task is to detect Gamma Ray Bursts and Targets of Opportunity (TOOs), and to monitor the scientific performance of the instruments.

    The consolidated processing stream works from a complete-as-possible "consolidated" set of telemetry delivered to ISDC approximately 10 days after data acquisition, and processes the consolidated telemetry into standard data products that are archived and distributed to GOs.

    Diagram 4: Architecture of the near real time and consolidated pipeline streams, showing input and output at the data grouping level.

    Each processing stream consists of three top-level pipelines: (1) Input Pipeline, (2) Science Window Pipeline, and (3) Analysis Pipeline. The input pipeline reads the incoming telemetry frames and produces raw data products (decoded telemetry) belonging to a data grouping of class ScWG1. ScWG1 is the first level of data created for a given science window.

    The ScWG1 is input to the science window pipeline, which performs several instrumental and housekeeping transformations and monitors the status of the instruments. The output data consists of data products belonging to a given science window of grouping classes ScWG2 (prepared data) and OSMG (results of instrument monitoring, good time intervals). The contents of all science windows are ingested into the archive at the level of ScWG2/OSMG.

    For the NRT pipeline stream, the next step is to perform a "quick look" analysis of the science window data in order to monitor the scientific quality of the ongoing observation schedule and to search for TOOs. Using science window groupings at level ScWG2/OSMG as input, the data for each science window is corrected and binned producing data products of grouping class ScWG3, and then analyzed via automatic processing to create products of grouping class SQLAG. Science window data from levels ScWG3 and SQLAG are then grouped into observations of grouping class DQLAG and analyzed via automatic processing, resulting in observation data products of grouping class DQLAPG.

    For the consolidated pipeline stream, science windows at the level of ScWG2/OSMG are grouped into observations of grouping class SAG and submitted to the Standard Analysis pipeline. Each ScWG2 grouping is corrected and binned, producing data products of grouping class ScWG3. The SAG is then processed, producing high level observation products of grouping class SAPG.
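
    The flow of grouping classes through the two streams can be summarized as a small lookup table. This is a sketch of the description above, not ISDC code: the dictionary and function names are hypothetical, and placing the Quick-Look and Standard Analysis steps under the Analysis Pipeline is an assumption based on the three-pipeline structure.

```python
# Grouping classes produced by each pipeline, in processing order.
PIPELINE_OUTPUTS = {
    "nrt": [
        ("Input Pipeline", ["ScWG1"]),
        ("Science Window Pipeline", ["ScWG2", "OSMG"]),
        ("Analysis Pipeline (Quick-Look)", ["ScWG3", "SQLAG", "DQLAG", "DQLAPG"]),
    ],
    "consolidated": [
        ("Input Pipeline", ["ScWG1"]),
        ("Science Window Pipeline", ["ScWG2", "OSMG"]),
        ("Analysis Pipeline (Standard Analysis)", ["SAG", "ScWG3", "SAPG"]),
    ],
}


def grouping_classes(stream: str) -> list:
    """List every grouping class a stream produces, in order."""
    return [
        grouping
        for _pipeline, outputs in PIPELINE_OUTPUTS[stream]
        for grouping in outputs
    ]


for stream in PIPELINE_OUTPUTS:
    print(stream, "->", " -> ".join(grouping_classes(stream)))
```

    Both streams share the first two pipelines and diverge only at the analysis stage.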

  6. Summary and Conclusions

    The ISDC data model is FITS-based and focuses primarily upon managing and organizing the large number of individual data sets resulting from the observation dithering strategy. It provides for four basic levels of data organization, from individual data structures known as data elements, to associations of data elements known as data groupings, to science window groupings, and finally observation groupings. The model is supported in software via the CFITSIO and ISDC DAL libraries.

    At the time of writing, INTEGRAL is still well over 2 years from launch and the ISDC systems are in their architectural design phase. Hence, the details of the data model presented in this article will certainly change and clarify before the full system becomes operational. Because ISDC has the (limited) luxury of time, enhancements in software such as CFITSIO will certainly be incorporated into the functionality of the data model. It will also be possible to see the software and models developed by missions such as XMM and AXAF in operation and "extract" the best features for INTEGRAL's use.

  7. References

    ADF, 1997, A Users Guide for the Flexible Image Transport System (FITS), Ver. 4.

    Pence, W. D., 1998, CFITSIO Users Guide, Ver. 1.4.

    ISDC, 1997, Data Access Layer Users Guide, Ver. 0.5 Beta.

    Jennings, D. G., et al., 1997, A Hierarchical Grouping Convention for FITS.


    Last modified: Wednesday, 04-May-2011 14:58:54 EDT