The HEASARC Database System

Appendix: Detailed System Configuration and Vendor Software

The discussion of the previous sections generally tries to avoid mention of specific implementation details for the HEASARC system. This includes the actual location of the database and executables, and details relating to the specific relational database system used by the HEASARC. These details are discussed in this appendix.

Physical System Architecture

Three major elements comprise the physical architecture for the HEASARC databases.

Web Services

The HEASARC's Web service computers comprise a load balancer and several Web servers (HEASARC1, HEASARC2, ...). Only system administrators generally work with the load balancer. Its function is to choose which machine should satisfy a particular request.

It is important in building HEASARC Web applications to recognize that any of several machines may process a request. While the load balancer can be tweaked to try to ensure that a series of related requests are processed by the same machine, services that require creation and use of persistent data should be carefully constucted. There are data areas which are shared among all of the Web processors which can be used to help. If your program does not create any files that are to be used in later applications, this issue need not be considered.

In addition to the operational Web servers there is a development Web server, HEASARCDEV. Both the development and operational Web software is mounted on this machine. The development hierarchy is rooted at /www which the operational hierarchy is rooted at /www.prod. These two hierarchies are not mirrored automatically. It is the responsibility of individual developers to copy files from one hierarchy to the other. E.g., the HEASARC development Browse CGI scripts are found at HEASARCDEV:/www/htdocs/cgi-bin/W3Browse. When a change in one of these scripts is to be delivered to the operational environment it is copied to HEASARCDEV:/www.prod/htdocs/cgi-bin/W3Browse.

Services and pages in the development environment can be tested by using Web addresses at http://heasarcdev.gsfc.nasa.gov. These addresses are not visible outside Goddard. Services and pages in the operational environment are visible using the Web addresses at http://heasarc.gsfc.nasa.gov.

Users planning to build Web applications should discuss the consequence of the 'chrooting' of the Web environment with the system staff. Basically this means that when Web scripts run they can see only a small fraction of the files mounted on the Web server. Applications need to ensure that all needed files are included in the 'chrooted' environment. Many system executables that are available in when logged in, are not available in the chrooted environment, nor is the Perl environment guaranteed to be the same, i.e., different modules may be loaded in the Web environment that in the login environment.

The Archive Data

The data for the HEASARC archive is stored in a Web appliance which is mounted on all Web servers as both /FTP and /www/htdocs/FTP. Both of these directories will work in both the normal and chroot()'ed Web environments.

A backup copy of the archive is mounted on the SAFEHOUSE machine. In principle this copy could be used to replace the current operational version, but there is no procedure in place to do this automatically. Please contact the system administrators if you feel this is required.

The HEASARC Database Machines

The HEASARC currently licenses two copies of the Sybase database system (version 12.5). These run on the machines DBMS1 and DBMS2. Users who wish to use Sybase software to directly access the databases, need to request permission to log into these machines. This is not required for accessing the databases through HEASARC developed software tools, e.g., the HDBmc command or the DBQ API that are discussed elsewhere.

Sybase software is installed at /usr/sybase. Binary executables are found at /usr/sybase/OCS/bin and detailed documentation for Sybase can be found at /usr/sybase/doc. The file /usr/sybase/doc/CONTENTS describes the documentation available. The available documentation includes details on the specific Sybase implementation of SQL, how to tune Sybase tables, and how to use Sybase utilities.

In particular users may wish to log onto one of these machine to run the Sybase supplied ISQL command. This provides the interactive user with the full power of the the Sybase system. Generally, users should refrain from using the operational Sybase environment on DBMS1 and limit any activities to the development database on DBMS2. If a user has modified a table in the development environment and wishes that change to be propogated to the operational environment, the table should be exported into a TDAT file using the HDBexgest command. This TDAT file can then be ingested in the operational environment using HDBingest.

Software Installation

This section describes the areas in which major elements of software are installed.

/heasarc/bin
Most generic HEASARC command-line executables and tools are found here, notably the HDBmc and zzlink tools. These should be usable from most HEASARC machines.
/heasarc/hrcdba/bin
Most software executables beginning with HDB are found here, notably the HDBingest and HDBexgest. These should be usable from dbms1.gsfc.nasa.gov and dbms2.gsfc.nasa.gov, and then usually only when logged into the hrcdba account.
/usr/sybase/OCS/bin, /heasarc/sybase/OCS/bin
On the DBMS1 and DBMS2 machines this directory contains the Sybase- supplied executables, notably "isql" and "bcp".
/www/htdocs/W3Browse
This directory contains the static pages and files used for Browse, notably the help files for each database table.
/www/htdocs/cgi-bin/W3Browse
This directory contains the CGI programs for Browse.
/www/htdocs/cgi-bin/W3Browse/lib/Browse
This directory contains the Browse Perl modules. This includes the latest version of the DBQ module that other users may wish to use (though this may migrate to the directory below).
/www/htdocs/cgi-bin/lib, /www/htdocs/cgi-bin/lib/heasarc
These directories contain Perl modules that are of general interest to the HEASARC software community, including time, data transformation and other utilities and libraries for querying the SIMBAD and NED name resolvers.

Database Vendor Software: Using ISQL

The design of the HEASARC database system is intended to be independent of the specific database system chosen. It currently is running over both Sybase and Oracle systems, and it has been used with INGRES and OpenINGRES in the past. The discussion of how to use the Sybase ISQL command is placed in this appendix to mininize the dependencies of the core document on our current configuration.

The Sybase database system, like many comparable products, uses a set of users ids and passwords that are additional to the the login user IDs and passwords. When a user invokes Sybase, the Sybase user ID and password must be supplied. There need not be (and generally is not) any correlation between the system ID of a user (i.e., the ID they log onto the system as), and the ID they use in accessing the HEASARC databases. Indeed, most access to the HEASARC databases uses one of two Sybase IDs: the ID 'hrcdba' is used when complete write access is needed for the database. The ID 'HDBmcd' is used when only read access is required. In fact even the HDBmcd account can create temporary tables (which is used during some cross-correlations). Activities building and modifiying tables use the hrcdba account, while Browse and most query tools use the HDBmcd account.

The passwords for these accounts can be obtained from the HEASARC Database Administrator (currently Ed Sabol) or Tom McGlynn. Note that if users have special requirements, additional accounts can be created and tables can be tailored to be accessible only to specific accounts.

Before using ISQL, the user needs to define two environment variables: SYBASE, which defines the directory location of the Sybase installation, and DSQUERY, which indicates the name of the Sybase database sever to which it should connect. Using CSH, the user might enter:

   setenv SYBASE /usr/sybase
   setenv DSQUERY SYBASE_DBMS2

Then the user can invoke the ISQL as

   /usr/sybase/OCS/bin/isql -Uuser -Ppassword -w2000

where user is a valid Sybase user name and password is the appropriate password. The -w2000 option indicates that ISQL should allow output lines to be 2000 characters long before breaking them. If this isn't specified, the default is only about 80, so tables are very commonly broken into multiple lines of output for each row.

ISQL will respond with the prompt:

  1>

A user then types in an SQL statement on as many lines as desired. When the user has finished the SQL statement it is executed by entering a line with just the text GO and hitting return. E.g., a user might enter

  1> select *
  2>    from heasarc_messier
  3> 
  4> go
 alt_name   bii                  class  constell dec                  dimension lii                  name   ...
 ---------- -------------------- ------ -------- -------------------- --------- -------------------- ------ 
 NGC 6475              -4.521290   3600 SCO                -34.816665     80              355.845674 M 7    
 NGC 6637             -10.273175   3080 SGR                -32.350009      7                1.722578 M 69   
 NGC 6681             -12.510794   3080 SGR                -32.300012      8                2.844483 M 70   
 NGC 6405              -0.715535   3600 SCO                -32.216661     15              356.585216 M 6    
 NGC 6809             -23.273363   3080 SGR                -30.966695     19                8.790994 M 55   
 NGC 6715             -14.097443   3080 SGR                -30.483349      9                5.607723 M 54   
 NGC 6266               7.318571   3080 OPH                -30.116650     14              353.569881 M 62   
 NGC 5236              31.972476   6200 HYA                -29.866606  11X10              314.579665 M 83   
 ...
 NGC 3031              40.902749   6200 UMA                 69.066724  26X14              142.087086 M 81   
 NGC 3034              40.559678   6600 UMA                 69.683390   11X5              141.410461 M 82   

(109 rows affected)

 1>

(The result lines have been truncated after the first few fields).

Note that ISQL reserves a certain amount of space for each data element even if that space is never entirely used up. E.g., if a datatype is a char80 type, then 80 characters will be reserved in the output even if the longest actual value is just 5 characters. Once ISQL has displayed the results the user can enter a new command.

A user can query a metadata table in just the same way. To get all of the shortcuts used in defining URLs in the ZZDP table we might enter

  1> select *
  2>    from zzext
  3>    where table_name='zzdp'
  4> go
 table_name           parameter_name          	 parameter_value                                                                  
 -------------------- ------------------------	 -------------------------------------------------------------------------------- 
 zzdp                 a1                      	 ${heao1}/a1                                                                      
 zzdp                 a2                      	 ${heao1}//a2                                                                     
 zzdp                 a4                      	 ${heao1}//a4/pingpong                                                            
 zzdp                 ariel5                  	 ${prefix}/ariel5                                                                 
 zzdp                 asca                    	 ${prefix}/asca/data                                                              
 zzdp                 asca_abst               	 ${prefix}/asca/abstracts//abstracts                                              
 zzdp                 ascasis                 	 ${asca}//ascasis                                                                 
 zzdp                 batse                   	 ${cgro}//batse                                                                   
 ...
 zzdp                 xteasmdef               	 ${xteasm}//definitive_1dwell                                                     
 zzdp                 xteasmquick             	 ${xteasm}//realtime_current                                                      
(91 rows affected)

This example shows what happens when we do not specify the -w option. The results are broken into two lines per result. Other parameters of the ISQL command can be used to change the delimiters between fields which can make the output a little more parseable.

Here is a more complex example illustrating a series of ISQL commands and responses. The lines which do not begin n> are system responses.

1> create table xyzzy(a int,b int)
2> go
1> create table xyzzy(a int,b int)
2> go
Msg 2714, Level 16, State 1:
Server 'SYBASE_DBMS1', Line 1:
There is already an object named 'xyzzy' in the database.
1> insert int xyzzy values(1,2)
2> go
Msg 102, Level 15, State 1:
Server 'SYBASE_DBMS1', Line 1:
Incorrect syntax near 'xyzzy'.
1> insert into xyzzy values(1,2)
2> go
(1 row affected)
1> update xyzzy set a=2 where a=1
2> go
(1 row affected)

The ISQL command has some facilities for redirecting inputs and outputs, but this is also easily done using standard Unix I/O redirection.

Complete documentation for the ISQL command is found at /usr/sybase/doc/PDF/asg1250e/util.pdf on DBMS1 or DBMS2. Documentation on the detailed SQL syntax supported by Sybase is also available in the same directories.


Documentation prepared by the HEASARC Database Group

HEASARC Home | Observatories | Archive | Calibration | Software | Tools | Students/Teachers/Public

Last modified: Monday, 21-Jun-2004 17:37:32 EDT