The HEASARC Database System

Accessing the HEASARC Database and Metadata

This section describes three mechanisms through which (primarily internal) users can query the HEASARC database interactively beyond standard Browse user table queries. Users may wish to examine the HEASARC metadata, perform queries that cannot be made by Browse directly, investigate problems with table queries, or have other reasons for such access. APIs for software access are discussed elsewhere.

The three mechanisms are the HDBmc command-line interface, the Browse Keyword Search interface to query HEASARC metadata, and vendor-supplied tools for database access. After discussing these access methods, a few examples of the kinds of queries a user might attempt using either HDBmc or other access methods supporting SQL queries are given.

The HDBmc command

The HDBmc command is available for querying the HEASARC database from most HEASARC servers in the /heasarc/bin/ directory. This command provides a tool for doing a single simple query of the database or metadata.

The HDBmc command may be used in the following manner:

    HDBmc observatory=AnObservatoryName
    HDBmc table=ATableName
    HDBmc parameter=ATableName
    HDBmc sql="Any sql query"

The optional command-line parameter -host=hostname[:port] may be used to specify the database host name and socket port. The default port is 40975. This argument can be used to connect to the backup database if desired, but it is mostly used when debugging databases or software. There are four commands available within HDBmc.

HDBmc: observatory

The observatory command lists tables available for a given observatory with each table giving a pair of lines of the form

    table_name=sometable
    observatory_name=someobservatory

The observatory may include a '*' as a wildcard character. This query uses the observatory name as specified in the ZZEXT observatory_name virtual parameter.

Example:

   HDBmc observatory="GENERAL CATALOG"

Response:

   table_name=B/2mass
   observatory_name=GENERAL CATALOG
   table_name=I/252
   observatory_name=GENERAL CATALOG
   table_name=I/271
   observatory_name=GENERAL CATALOG
   table_name=heasarc_class
   observatory_name=GENERAL CATALOG
   table_name=heasarc_messier
   observatory_name=GENERAL CATALOG
   table_name=heasarc_ngc2000
   observatory_name=GENERAL CATALOG
  table_name=heasarc_pg
   observatory_name=GENERAL CATALOG

HDBmc: table

The table request returns the ZZGEN and ZZEXT metadata associated with a given table as a list of the form

   field=value

where field gives the values in the ZZGEN table and then any virtual parameters for this table from the ZZEXT table. The input table name can be specified as a wildcard so that the output can include many tables. Output for each table begins with a line of the form

  table_name=theName

Example:

    HDBmc table=heasarc_messier

Response:

   table_name=heasarc_messier
   table_location=dbms1.gsfc.nasa.gov
   table_description=Messier Nebulae
   table_document_url=http://heasarc.gsfc.nasa.gov/W3Browse/general-catalog/messier.html
   create_date=2003-05-05 18:39:01
   modify_date=2003-05-05 18:39:01
   table_rows=109
   equinox=2000
   unique_key=name
   declination=@dec
   table_priority=3
   table_type=Object
   target_name=@name
   right_ascension=@ra
   default_search_radius=60
   frequency_regime=Optical
   observatory_name=GENERAL CATALOG

HDBmc: parameter

The parameter request gives the details for the parameters of the table. A new parameter is signaled by a pair of lines of the form

  table_name=aTableName
  parameter_name=aParameterNAme

Example:

   HDBmc parameter=heasarc_messier

Response:

   table_name=heasarc_messier
   parameter_name=object_type
   parameter_description=Object Category
   parameter_format=char2
   parameter_unit=
   parameter_is_index=Y
   parameter_minval=DI
   parameter_maxval=S
   parameter_comment=
   parameter_default=0
   table_name=heasarc_messier
   parameter_name=bii
   parameter_description=Galactic Latitude
   parameter_format=float8
   parameter_unit=degree
   parameter_is_index=Y
   parameter_minval=-51.932565
   parameter_maxval=84.424982
   parameter_comment=
   parameter_default=0
   table_name=heasarc_messier
   parameter_name=lii
   parameter_description=Galactic Longitude
   parameter_format=float8
   parameter_unit=degree
   parameter_is_index=Y
   parameter_minval=1.722578
   parameter_maxval=356.865940
   parameter_comment=
   parameter_default=0
   ...

HDBmc: sql

The SQL command lets the user send an arbitrary SQL query to the database. The result will be pipe delimited table. Note that the SQL statement needs to be enclosed in quotes to ensure that it is seen as a single token. Double quotes are recommended since SQL uses single quotes as the delimiter for strings. E.g.,

   HDBmc sql="select * from zzext where table_name='zzdp'"

would return all of the ZZEXT entries for the ZZDP table (the shortcuts discussed in the previous chapter).

Example:

   HDBmc sql="select * from heasarc_messier"

Response:

+--------------+----------------------+-----------+--------------+----------------------+...
|alt_name      |bii                   |class      |constell      |dec                   |
+--------------+----------------------+-----------+--------------+----------------------+
|NGC 6475      |            -4.5212902|       3600|SCO           |      -34.816664907176|
|NGC 6637      |           -10.2731749|       3080|SGR           |      -32.350009145856|
|NGC 6681      |           -12.5107943|       3080|SGR           |      -32.300012468954|
|NGC 6405      |            -0.7155348|       3600|SCO           |       -32.21666091568|
|NGC 6809      |           -23.2733634|       3080|SGR           |      -30.966694770854|
|NGC 6715      |           -14.0974428|       3080|SGR           |      -30.483349176839|
|NGC 6266      |             7.3185708|       3080|OPH           |      -30.116649736484|
...
|NGC 3034      |            40.5596785|       6600|UMA           |       69.683390351975|
+--------------+----------------------+-----------+--------------+----------------------+
(109 rows)

(Note that only the first five fields of the output were shown.)

Note: The HDBfilter utility is extremely useful for formatting the results of a SQL query made using HDBmc.

A more detailed discussion (suitable for programmers) pertaining to the HDBmc command and its inner workings is in another document.

Querying HEASARC Metadata Using Browse Keyword Search

The Browse Keyword Search interface (help) provides a way that users can query Browse metadata through the Web. If users enter the name of a Browse metadata table, then that table will be queried. If there are other keyword parameters, then the specified metadata table will be queried for values for the Browse user tables whose metadata matches the other keywords.

E.g., to simply dump the first few values in the ZZGEN table the user can enter

      zzgen

But if the user wants only the XMMMASTER entry for ZZGEN they could try

      zzgen xmmmaster

Or to get the ZZGEN entries for all ASCA tables

      zzgen asca

Keywords are searched against all metadata so one can get more results than one expects. E.g.,

      zzgen xtemaster

will return results for both XTEMASTER and XTEINDEX since XTEINDEX mentions XTEMASTER in its metadata. If the user is searching a table that does not have a table_name field, then the additional keywords should not be specified. The table name is matched against the list of tables that the user has specified.

Users can put qualifications on the query themselves. E.g.,

      zzext table_name=zzdp

will list the data product shortcuts which are stored in the ZZEXT table with the table_name set to zzdp. Or try

      zzext parameter_name=observatory_name

to get the mission/observatory associated with each table.

The ZZGEN, ZZPAR, ZZEXT, ZZDP, ZZDPSETS, ZZMASTER, ZZWORDS, ZZLINK, and ZZBIB tables can all be queried using this interface. One note regarding querying the ZZBIB table: It is possible for Browse Keyword Search interface to get confused with the bibcode strings. These often have multiple consecutive periods which can be interpreted as the Browse parameter value range operator ("..").

The row and table limits of the keyword search interface apply to these metdata queries. Only the first 200 matching rows will be shown by default if keywords are specified only the top 10 matching tables will be included. The rows and tables qualifiers can be used to change these settings. E.g.,

     zzpar x-ray r:all t:all

will describe all parameters of all rows for any table that includes x-ray in its metadata.

Vendor-Supplied Methods of Accessing the HEASARC Database

All major database vendors provide tools to access their respective database systems. These tools usually need to run on the machines on which the database software is installed, but they provide full access to the capabilities of the database. With these tools users can create, view, modify and query tables. The specific vendor-supplied tools available at the HEASARC are discussed in the appendix. Generally, they provide a mechanism for executing an arbitrary SQL command and displaying the results to a terminal. More sophisticated GUI tools may also be available.

Typically, database configuration and performance tuning are possible only through these vendor-supplied interfaces.

Common Database Tasks

While the Browse tools simplify access to the database, they provide only read access to the database and do not give the users access to all the query capibilities availble through SQL. In this section we describe a few queries that cannot be executed in Browse but might be of interest to a user. This is not intended as an SQL primer. SQL documentation is supplied with the vendor-supplied database system and is also widely available on the Web.

List all of the tables in the database

   SQL:
       select table_name from zzgen

Note that this will include tables which are not displayed in Browse either because they are metadata tables, they are missing required ZZEXT metadata (e.g., a mission to be associated with), or they have been explicitly turned off using the w3browse_display=N virtual parameter mechanism. An equivalent query can be done through the Browse Keyword Search interface.

List all of the tables with data products

  SQL:
      select distinct table_name from zzdpsets

Note the usage of the distinct attribute since we don't want to display a table multiple times when if it has multiple data products.

List all of the missions that have data products

  SQL:
      select distinct parameter_value from zzext
         where table_name in (select distinct table_name from zzdpsets) and
	       parameter_name='observatory_name'

This involves a join of tables. We put on a requirement that the tables we're interested have data products and then display the missions.

Fix a particular value in a table where it is wrong in only a single location

  SQL:
     update heasarc_table
        set theparameter=newValue
	where _unique_id=12345

This uses the SQL update command. Normally, updates should be done by modifying the appropriate TDAT file and re-ingesting the table (see the next section), but this can be used under some circumstances. Note the use of the '_unique_id'. Almost all tables have this field which increments for each row so that it is guranteed to be unique. This field should be used in preference to other fields if you want to be absolutely certain you are modifying only a single row.

Change a field value whereever it has a certain value

Suppose a table uses the string 'STORED' to indicate the data is in our archive, but you wish to change that string to 'ARCHIVED' to be more compatible with the rest of our tables.

  SQL:
    update heasarc_table
       set status='ARCHIVED' where status='STORED'

The system will normally give an indication of the number of rows updated. Once such a change has been made, you may wish to immediately exgest the table to a TDAT file after making such changes.

Create a copy of a table

  SQL
    select  * into newtable from heasarc_table

This syntax is not standard SQL, but it is widely supported. Note that this does not copy the metadata.

Note: You can also create a copy of a table including the metadata easily using the HDBexgest tool to export the table as a TDAT. Then, after a simple modification of the TDAT file, you can use the HDBingest tool to ingest the new table. This is probably the preferred procedure under most circumstances and is descibed in another section.

Copy the ZZEXT metadata from an old table to a new table

E.g., the table created in the previous example.

  SQL
    insert into zzext
       select "newtable", parameter_name, parameter_value
          from zzext
	  where table_name='heasarc_table'

This is a little more tedious for the other tables, since they have more fields, but it is easy enough to write a script and then edit it when needed. Note how we do not just copy the old table name (heasarc_table) into zzext.

Once you have a copy of the table and metadata, you may modify the new table however you like. The HDBexgest tool can be used to save the table you end up with.

How many objects of each different class are there in the WGACAT catalog?

  SQL:
    select distinct count(*),class from heasarc_wgacat
      group by class
      order by class

Using the group by clause is advanced SQL. The standard Browse interface does not expose such advanced querying methods.


Documentation prepared by the HEASARC Database Group
HEASARC Home | Observatories | Archive | Calibration | Software | Tools | Students/Teachers/Public

Last modified: Wednesday, 01-Feb-2006 11:14:05 EST

The Astrophysics Science Division (ASD) at NASA's Goddard Space Flight Center (GSFC) seeks a creative, innovative individual with strong teamwork and leadership skills to serve as Director of the High Energy Astrophysics Science Archive Research Center (HEASARC). This will be a permanent civil servant position. + Learn more.