Minutes of HEASARC Users Committee Meeting

Feb. 25, 2000

The HEASARC Users Group (HUG) re-convened (the last meeting was in Sept. 1997) with the following members present:

Niel Brandt (Penn State), Josh Grindlay (CfA), Chryssa Kouveliotou (MSFC), Jonathan McDowell (CfA), Ron Remillard (MIT), Pat Slane (CfA), Allyn Tennant (MSFC), Ray White (U. AL), and Nick White (GSFC).

John Nousek (Penn State), HUG Chair, had hoped to attend but was unable to do so because of illness. Josh Grindlay, the new HUG chair, was therefore asked (by Nick) to run the meeting.

The first item of business was to thank John Nousek for his fine stewardship of the HUG over the past several years.

The (revised) Agenda for the meeting is given below as Appendix A. It was followed, though with increasing time dilation due to good discussion of good presentations. These Minutes provide a brief summary of key points made in each presentation, together with ***Concerns, which may have been noted by members of the Committee, and >>>Recommendations, for consideration by the HEASARC or NASA.

Overall, the HUG was very impressed by the continued high quality of work and services provided to the general astronomical community and public at large by the HEASARC. The meeting occurred soon after the tragic loss of Astro-E (Feb. 10), with effects on the HEASARC operations and budget still not fully understood. The Committee was also concerned about the possible de-orbit of CGRO, discussed at the GSFC Management Council the day before, and the adverse impact this would likely have on the COSSC and transition of CGRO data support to HEASARC.

HEASARC Overview (N. White)

The user community for data archives maintained by the HEASARC continues to expand, with three general groups of users: active "observers" using current or recent mission data, archive data users, and education/outreach users. The archive now includes some 20 missions, over 250 catalogues, and ~2 Tb (compressed) data. The Committee was generally pleased by the new "corporate look" HEASARC website design, although there was some

*** concern that the old x-ray account (used by "expert" HEASARC users) will soon be disabled due to security concerns. The committee was pleased to learn there are plans to migrate the capabilities of this account to the web and feels that this is an important action.

The committee was also pleased to learn of the latest releases of Ftools (5.0), XSPEC (11.0) and Xanadu in a combined package, HEAsoft, just the week before.

The budget for overall HEASARC operations has remained flat at a level of ~$1.3M for the past 7 years. A significant portion of the support for development of its widely used packages has come from GOFs for individual missions. In particular, the Astro-E GOF, as successor to ASCA, was supporting a significant portion of the development of FTOOLS and Xanadu. With the loss of Astro-E, the committee

>>> recommended that the support for software development and maintenance of primary analysis tools previously supported by Astro-E be migrated into the HEASARC.

The upcoming Senior Review provides a challenge to the HEASARC to maintain and assume functions (e.g. from GOFs) as well as its visibility among non-HEA astronomers (e.g. the optical and IR community). The Committee discussed ways to make tools such as W3BROWSE known to these colleagues; it was

>>> recommended that simple website changes such as adding top-level buttons/links to other websites could promote HEASARC tools.

In support of the proposal for the Senior Review, the committee members were urged to provide (to K. Arnaud and N. White) examples of new science enabled by HEASARC data or tools. These inputs, which could come from colleagues or students of committee members, are needed by April 1.

Archive Usage (N. White for S. Drake)

Access to the HEASARC data and archives continues to expand. Statistics were provided on the data transfer rates by mission, with RXTE out in front. The Committee requested additional input on data access by observation; this has been provided now by Steve Drake (see Appendix B below) and shows that RXTE is clearly still the most requested data.

The optical data jukeboxes are being phased out and replaced by disk farms (RAID). The data volume overall is growing at a rapid pace (about 1 Gb/day) and is expected to increase significantly this year and next with HETE-II (400 Gb/yr), XMM (300 Gb/yr) and INTEGRAL (1000 Gb/yr). The committee expressed

*** concern that HETE-II data formats had still not been provided to HEASARC, though launch is now scheduled for May.

ROSAT Archive (M. Corcoran)

A new "enhanced" HRI images archive is now available, as are two releases of HRI sources. About 62% of the HRI data has released RRA products. A PSPC source release is coming. The Final Archive of ROSAT data will be completed in April, 2000, and will include final boresight error corrections and data verification. Funding for the US ROSAT Data Center ends in April.

ASCA Archive (K. Mukai)

Data processing is now Rev. 2 (since '98), and new calibration corrections (e.g. 4th order polynomial for radial gain) are now available for GIS data. SIS calibration updates are planned. In view of the loss of Astro-E, the recently started (Feb 13) long observation phase of ASCA may be re-evaluated. The orbital decay is still relatively modest, but re-entry is still expected sometime in 2001. Current funding has FY00 as the final year of the ASCA GOF. The Committee was impressed by the continued increase in numbers of scientific papers from ASCA, with 222 in 1999. Accordingly, the Committee

>>> recommends that in view of the loss of Astro-E, support for ASCA GOF activities be continued at least through FY01 to allow final calibrations and support of the mission through its remaining active phase.

BeppoSAX Archive (L. Angelini)

The Committee was pleased that 70% of the NFI public data from BeppoSAX is now available at the HEASARC. However,

*** concern was expressed that the WFC data is not available despite repeated requests to SRON and the SAX-SDC. The Committee

>>> suggests that renewed attempts be made, perhaps via higher level bi-lateral negotiations, to obtain the WFC data which will become an increasingly relevant historical database for planned wide-field surveys (e.g. Swift and beyond).

Software Issues (B. Pence and K. Arnaud)

The new and unified Ftools/Xanadu release HEAsoft will not be available on CD-ROM, which may be a problem for some home Linux users without fast network connections. The new FTOOLS (5.0) includes Astro-E tools, which will now be archived for expected future use. The Committee regards the software included in HEAsoft as very important to maintain and further develop. XSPEC, in particular, is the universal standard for spectral model fitting. The (many) new features added in XSPEC v11, such as inclusion of model uncertainties and provision for spectral line fitting and identification, are promising. Thus the Committee was

*** concerned to learn that programmers for Ftools are being reduced from 5 to 2, who are (in turn) specific to XMM. Therefore,

>>> the Committee recommends that two general Ftools programmers, and one Xanadu programmer, be added to maintain and further develop these important resources for the HEA community.

The Committee also noted that the XIMAGE package is in need of improved documentation. This is a powerful package, but many of its capabilities cannot be used effectively by many high-energy astronomers. A relatively small investment in improving the documentation would be beneficial to the community.

The Committee was interested to hear about the new effort (supported by ADP) by Pence and McGlynn to develop Hera, the generalized web-based analysis environment for HEASARC data and software.

Interdisciplinary Activities (T. McGlynn)

The Skyview program remains one of the centerpieces of HEASARC, providing more than 1000 images/day to the general public. Similarly, the W3Browse tool is widely used for access to current data. The Committee

>>> recommends that W3Browse be enhanced by providing postage stamp GIF images of the requested image so that users may quickly ascertain whether this is indeed the image they wish to download.

Astrobrowse is a new (prototype) tool under development which allows for access to more than 1750 databases. A phase 2 development effort will allow it to integrate or combine information from different sources.

Given the growing support for the National Virtual Observatory (NVO), integrated display, analysis and search tools such as those being developed at HEASARC are going to be increasingly valuable. The HEASARC has already developed links to other archives: EUVE data can be accessed through HEASARC though it is at STScI, and ROSAT data can be accessed at HEASARC via the STScI interface.

Education and Outreach (J. Lochner)

The number and scope of activities in E/PO developed at the HEASARC have been impressive. Some, like APOD and Imagine the Universe, are among the most widely contacted/used astronomy sites on the Web. The Committee was interested to hear about the ways in which activities were coordinated with other E/PO sites (e.g. SEU Forum) as well as how its Workshops and materials can be made available to an even larger number of teachers and classrooms.

RXTE Archive (A. Smale)

Interest in RXTE remains very high, with its data access (by volume as well as observation) now highest from HEASARC (cf. Appendix B). The current rate of production of scientific papers is also the highest. The GOF data distribution is now completely electronic, and re-processing of all cycle 1-4 data with improved background models is expected to be complete by July 2000.

CGRO Archive (T. McGlynn)

Both EGRET and BATSE data are well-integrated into the HEASARC archives (though some BATSE data is delayed), but COMPTEL and OSSE data are incomplete and suffer from a lack of portable analysis software. With the end of CGRO mission life possibly imminent, the data analysis support center (COSSC) must transition to the HEASARC. There is currently no gamma-ray expertise within HEASARC, though this will be increasingly important not only for CGRO (archival data) but also for the upcoming INTEGRAL and Swift missions. Thus, the Committee

>>> recommends that at least one gamma-ray astronomer position be provided at the HEASARC.

Data Restoration (L. Angelini)

The Committee was pleased to hear that conversion to FITS of the major historical x-ray datasets has been nearly completed and that software (e.g. lightcurve analysis) for these early missions is also operative. The programming staff has declined to 1 (from 4) for data restoration; the Committee felt this was reasonable, and that more extensive data restoration was generally not warranted.

The agreement for Archive Exchange to obtain the BeppoSAX archive in exchange for copies of the ASCA and ROSAT archives appears to be limited by the rate at which SAX data is made available. Also, as noted above, the WFC data have not been provided, despite repeated requests from the HEASARC.

NSSDC Report (J. King)

The historical data (pre-HEASARC) holdings of the NSSDC were discussed and the Committee was asked for its views on how to maintain the historical collection of data tapes. The Committee

>>> recommends that additional resources not be spent to maintain original data tapes for which original software to read them is no longer available. The Committee also recommends that data tapes which have been largely copied to FITS format and are in the HEASARC archive (e.g. HEAO-1; the 129 tapes mentioned as having "just a little data not at HEASARC") not be copied or maintained with HEASARC resources.

XMM Archive and GOF (S. Snowden)

An overview of XMM support was provided. A number of important steps are being taken at GSFC (GOF), including conversion of the SAS software to Linux; translation software to allow XMM data to be analyzed with Ftools; extension of hardware team (RGS and OM) software to Ftools; and extension of proposal preparation tools (e.g. PIMMS and Quicksim) to more readily useable packages. The original XMM simulation package, for example, was exceedingly slow. The Committee was told that from preliminary discussions with ESA, the complete data archive (not just US PI data, as now allowed by ESA) should be made available; the Committee

>>> recommends that formal agreements be reached between NASA and ESA to ensure the complete XMM archive will be available at the HEASARC.

Chandra Archive (A. Rots)

The state of Chandra data pipeline processing is evolving so that a rev 2 processing of data and complete archive of publicly available data will be available soon. Access to the Chandra archive from W3Browse will be via links from the W3Browse Missions page and could be designed as nearly transparent. As the data processing at the CXC becomes more finalized, the form of the interface can be better defined. The Committee was pleased to hear that for users accessing Chandra archival data through the HEASARC (rather than directly from the CXC), the interface can be seamless eventually.


Appendix A: Agenda for HUG Meeting, Feb. 25, 2000

9.00 Introductions
9.15 The Archive scene and the upcoming senior review - Nick White
10.00 Archive usage and future plans - Nick White
10.15 ROSAT archive - Mike Corcoran
10.30 ASCA archive - Koji Mukai
10.45 Coffee break
11.00 BeppoSAX archive - Lorella Angelini
11.15 Software - Bill Pence/Keith Arnaud
11.45 W3Browse future plans - Tom McGlynn
12.00 Lunch
13.00 Interdisciplinary activities & Skyview - Tom McGlynn
13.30 Outreach and Education - Jim Lochner
13.45 RXTE archive - Alan Smale
14.00 CGRO archive - Tom McGlynn
14.15 Data restoration - Lorella Angelini
14.30 NSSDC report - Joe King
14.45 XMM archive and GOF - Steve Snowden
15.00 Relationship to Chandra archive  - Arnold Rots/Nick White
15.30 Break
15.45 Exec session
16.45 New members
17.00 Adjourn

Appendix B: Mission observation access of archives

(report by Steve Drake)

From: Stephen Drake 
Subject: no subject (file transmission)
To: nwhite@lheapop.gsfc.nasa.gov
Date: Fri, 10 Mar 2000 14:51:28 -0500 (EST)

Nick:

     In response to the 2 action items from the HUG that you relayed to me
a week ago, here are the 1999 ftp stats for all HEASARC missions big and
small. You are welcome to pass this message on to the HUG chair and/or other
members. I have defined a typical 'observation' for each mission (more
of an art than a science) and thus give, for each mission, the number of
observations ftped as well as the raw number of Gigabytes ftped. I also
give the 'data attractiveness quotient' (amount ftped/archive size) for
each mission: the bigger this number the more 'attractive' the mission
data are. Finally, using all 3 measures of popularity, I gave rankings
for all missions: the basic result is that RXTE really is arguably our
most popular mission despite the bias in the ftped data volume due to
the large size of a typical RXTE 'observation'.

                                  Steve

    PS Alan Smale is checking my estimate of the size of a typical RXTE
observation (I assumed 300 Megabytes), and if he comes up with a very
different number I will send you the revised table.

--------------------------------------------------------------------------

          1999 ftp Statistics: detailed breakdown by category/mission

   Note: These stats include both legacy and cossc ftp transfers.

     The 1999 data volume ftped was 1391 Gigabytes (1.65 times the 842 GB
ftped in 1998), which can be compared with the 1999 data volume of web pages
httped of 484 Gigabytes. Thus, the total data volume transferred by both
ftp and http combined in 1999 was 1875 Gigabytes, which is comparable to the
size of the HEASARC data holdings (1660 GB as of December 1999).

----------------------------------------------------------------------

Breakdown of legacy + cossc ftp Transfer Rate Statistics: 1999

Mission              1999                                        R1  R2  R3
Directory        Transferred
            # Files   Amount   % Total   # Obs.   Data Attractiveness
                    Gigabytes                  (ftp amount/archive size)
 
ARIEL5         72     0.050    0.0%       55         22.6%       16  10   8
ASCA      134,306   172.0     12.4%     1458         34.7%        2   4   5
BBXRT          81     0.006    0.0%        1          0.3%       19  17  16
BSAX       10,558     6.6      0.5%      183         18.3%       11   8   9
CGRO      381,754    50.9      3.7%     1818         28.9%        5   3   7
COPERNICUS     98     0.003    0.0%        8          0.9%       21  14  15
COSB           62     0.002    0.0%        2          2.6%       22  16  10
DXS             3     0.000    0.0%        0          0.0%       24  20  20
EINSTEIN      583     0.202    0.0%       72          1.3%       13   9  13
EUVE        2,609    17.0      1.2%      472         30.4%        8   7   6
EXOSAT      6,013    30.5      2.2%      709         37.2%        7   5   3
GINGA         781    10.7      0.8%      596         52.8%        9   6   2
HEAO1         214     0.028    0.0%       11          0.3%       17  13  17
HEAO3          29     0.004    0.0%        1          0.1%       20  18  19
HEASARC    11,846     8.8      0.6%         not applicable       10  --  --
OSO8          106     0.101    0.0%       34          1.5%       14  12  12
RETRIEVE    8,960   154.0     11.1%         not applicable        3  --  --
ROSAT      56,253    37.1      2.7%     4122         36.7%        6   1   4
RXTE      494,256   764.4     55.0%     2548        112.2%        1   2   1
SAS2           49     0.001    0.0%        1          1.3%       23  19  14
SAS3          263     0.012    0.0%       47          0.2%       18  11  18
SOFTWARE  100,238   136.7      9.8%         not applicable        4  --  --
VELA5B        269     0.098    0.0%        5          1.8%       15  15  11
XMM         3,042     3.7      0.3%         not yet clear        12  --  --

----------------------------------------------------------------------------

TOTAL   1,213,520  1390.8 GB

R1 is ranking in order of transferred data amount
R2 is ranking in order of number of 'observations' (See Appendix)
R3 is ranking in order of data attractiveness

---------------------------------------------------------------------------

      (1) Using transferred data amount to rank missions (parameter R1 in table)

      No real surprises in the breakdown by mission/directory: RXTE is the
dominant mission (55% of the total volume, increasing from 33% in 1998), with
ASCA 2nd (12% of the total volume). All other missions are in the single
digits percentage-wise, e.g., CGRO 5th at 4%, ROSAT 6th at 3%, EXOSAT 7th at
2%, and EUVE 8th at 1% of the total volume. Data files staged to the retrieve
area accounted for 11% (3rd rank) of the volume transferred, while software
files were 10% (4th rank) of the total.

     (2) Using transferred number of 'observations' to rank missions
         (parameter R2 in table)

     The definition of an observation is easy for some missions, much less
clear for others. In the Appendix I give the assumed number of megabytes
for an observation of each mission together with how I arrived at this
estimate. Using this method to rank missions, ROSAT is the winner, followed
by RXTE (2nd), CGRO (3rd), ASCA (4th), EXOSAT (5th) and Ginga (6th).

     (3) Using data attractiveness quotient R3 = total data transferred divided
by the total archive size for the mission (parameter R3 in table)

Using this method to rank missions,
RXTE is the clear winner with R3=112%, followed by Ginga (2nd, 53%),
EXOSAT (3rd, 37.2%), ROSAT (4th, 36.7%), ASCA (5th, 35%), EUVE (6th, 30%),
CGRO (7th, 29%), ARIEL5 (8th, 23%), and BSAX (9th, 18%). All other missions
have data attractiveness quotients of 3% or less.
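
The three ranking measures described above can be reproduced with a few lines of Python. The following is only a minimal sketch for five of the missions: the class and function names are illustrative, the transferred volumes come from the 1999 table, and the archive sizes and typical observation sizes are the ones listed at the end of this message. It assumes 1 GB = 1000 MB, which is the convention that reproduces the '# Obs.' column. Ranks are computed within the five missions only, so they show relative order rather than the full-table rank numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mission:
    name: str
    transferred_gb: float  # 1999 ftp volume, from the statistics table
    archive_gb: float      # total archive size for the mission
    typical_obs_mb: float  # adopted size of one 'observation'

    @property
    def n_obs(self) -> int:
        # Number of 'observations' transferred, assuming 1 GB = 1000 MB.
        return round(self.transferred_gb * 1000 / self.typical_obs_mb)

    @property
    def attractiveness(self) -> float:
        # Data attractiveness quotient: ftp amount / archive size.
        return self.transferred_gb / self.archive_gb


MISSIONS = [
    Mission("RXTE", 764.4, 681, 300),
    Mission("ASCA", 172.0, 495, 118),
    Mission("ROSAT", 37.1, 101, 9),
    Mission("CGRO", 50.9, 176, 28),
    Mission("EXOSAT", 30.5, 82, 43),
]


def ranking(key):
    """Mission names ordered from the highest to the lowest value of `key`."""
    return [m.name for m in sorted(MISSIONS, key=key, reverse=True)]


r1 = ranking(lambda m: m.transferred_gb)   # by transferred data amount
r2 = ranking(lambda m: m.n_obs)            # by number of 'observations'
r3 = ranking(lambda m: m.attractiveness)   # by data attractiveness

for m in MISSIONS:
    print(f"{m.name:7s} {m.n_obs:5d} obs  {m.attractiveness:6.1%}")
```

Because the inputs are the rounded values printed in the tables, results for some other missions (e.g. Ginga) can differ from the tabulated figures in the last digit.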

         CONCLUSION

     RXTE is the most popular mission in 2 of the 3 ranking schemes, and 2nd
in the third scheme. ASCA and ROSAT are always in the top 6 in all 3
ranking schemes and thus are essentially neck and neck. In the next tier
are EXOSAT, CGRO, Ginga, and EUVE. Missions for which the HEASARC received
data files with little or no documentation or analysis software such as
DXS and HEAO3 have (not surprisingly) the lowest ranks in all 3 ranking
schemes.

-------------------------------------------------------------------------

    Appendix: How did I come up with the size of a typical observation for
              each mission?


     Mission     Estimate  Comment                         Mission Size #obs's  Size/#obs
                   (MB)                                        (GB)               (MB)

     ARIEL5                no. of distinct ASM entries           0.22      250      0.9
     ASCA          118     Mukai/Pier estimate                 495        3642    136
     BBXRT                                                       2.1       157     13
     BSAX           50     Lorella estimate                     36        1603     22
     CGRO                                                      176        6359?    28
     COPERNICUS            no. of distinct XCOPRAW entries       0.39      867      0.45
     COSB                  no. of distinct COSBRAW entries       0.094      62      1.5
     DXS                   no. of orbits data acquired           0.36       91      4.0
     EINSTEIN                                                   15.6      5659      2.8
     EUVE           25                                          56        1183     47
     EXOSAT                                                     82        1901$    43
     GINGA                                                      20.3      1141!    18
     HEAO1                                                       9.9      4036^     2.5
     HEAO3                 no. of files in /data area            5.6       832      6.8
     OSO8                                                        6.7      2224&     3.0
     ROSAT           7     <56 PSPC procdirs>                  101        9462     11
     RXTE                                                      681        2253*   302
     SAS2                  no. of distinct SAS2RAW entries       0.049      24      2.0
     SAS3                  no. of distinct sequences (guess)     7.2     27629      0.26
     VELA5B                                                      5.5       268     20
--------------------------------------------------------------------------
    ? assumes no of entries in cgrotl is ~# of obs's
    $ assumes no. of obs = no. of FOTID's/3
    ! guess an average of 12 Sirius numbers per observation, and there are
      13689 distinct Sirius numbers
    ^ assumes no. of obs = no. of entries in A2RTRAW
    & assumes no. of obs = no. of entries in OSO8RTRAW
    * assumes an average of 12 obs_id's per observation, and there are 27489
      obs_id's
-------------------------------------------------------------------------
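
The last (MB) column of the table above is consistent with dividing each mission's archive size by its assumed number of observations. A minimal sketch of that arithmetic for a few missions follows; the dictionary and function names are illustrative, and 1 GB = 1000 MB is assumed.

```python
# (archive size in GB, assumed number of observations), copied from the table
ARCHIVE = {
    "ASCA": (495, 3642),
    "BSAX": (36, 1603),
    "EXOSAT": (82, 1901),
    "GINGA": (20.3, 1141),
    "ROSAT": (101, 9462),
    "RXTE": (681, 2253),
}


def typical_obs_mb(size_gb: float, n_obs: int) -> float:
    """Average archive volume per observation, in MB (1 GB = 1000 MB)."""
    return size_gb * 1000 / n_obs


sizes = {name: typical_obs_mb(gb, n) for name, (gb, n) in ARCHIVE.items()}

for name, mb in sizes.items():
    print(f"{name:7s} {mb:6.1f} MB per observation")
```

For BSAX, EUVE and ROSAT the finally adopted values equal the mean of this figure and the independent estimate in the first (MB) column, although the message does not state how the adopted values were chosen.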

     Final adopted values for a typical observation for each HEASARC mission:

        ARIEL5           0.9 MB
        ASCA           118
        BBXRT           13
        BSAX            36
        CGRO            28
        COPERNICUS       0.45
        COSB             1.5
        DXS              4.0
        EINSTEIN         2.8
        EUVE            36
        EXOSAT          43
        GINGA           18
        HEAO1            2.5
        HEAO3            6.8
        OSO8             3.0
        ROSAT            9
        RXTE           300
        SAS2             2.0
        SAS3             0.26
        VELA5B          20