

BROWSE

Introduction

This chapter introduces BROWSE and explains the terminology used throughout this user's guide. These explanations also give valuable insight into the internal structure of BROWSE. The chapter is essential reading for a new user and a useful reference dictionary as the user explores the capabilities of BROWSE.

Capabilities

BROWSE is a command-driven program that uses a database management system to select data from tables of information and then executes analysis tasks on those data. The program name reflects the fact that it gives astronomers a tool to "browse" through large volumes of data, selecting or discarding them as required. Emphasis is placed on data visualization so that data quality, source properties, and overall characteristics can be assessed immediately. BROWSE has been used to access EXOSAT, Einstein, and IUE data files, as well as many popular astronomical catalogs. Interfaces are provided to the XANADU X-ray astronomy data analysis packages (such as XSPEC and XRONOS) and to SAOIMAGE. These interfaces allow the analysis packages to be run from BROWSE on selected data files without the need to extract the files or specify their names. The user can thus display an image or make a spectral fit without worrying about the location or names of the files. BROWSE also provides commands to extract the data files so that they can be copied to the user's local directory for use by analysis packages outside the BROWSE environment. The BROWSE functions include:

the display of parameter values
searching by coordinates, name, or any other parameter
plotting of parameter values and data files
creation of samples that view only part of a database table
establishing correlations (relationships) between tables
the retrieval of data files
analysis of selected data files
a structured query language interface

History

BROWSE was originally created to provide access to the EXOSAT Database, which contains the results and data products from the European Space Agency's X-ray astronomy mission, EXOSAT (1983-1986). The rapid growth in the late 1980s of international computer networks with remote login capability (such as the Space Physics Analysis Network, SPAN) provided the opportunity for immediate access via online services.

BROWSE uses a database management system (DBMS) that was developed by the EXOSAT Observatory team and is optimized for astronomical queries (such as coordinate searches). It is based on original DBMS software written early in the EXOSAT mission by P. Giommi and L. Chiappetti as part of a program to survey serendipitous sources detected by EXOSAT. The original software included a "BROWSE" function for examining the detected sources and their associated images in order to verify that the detections were real. The XANADU analysis infrastructure developed at the Institute of Astronomy, Cambridge, UK, was adopted in 1985 to provide the parser (XPARSE), a plotting and function fitting package (QDP), and a spectral analysis environment (XSPEC). Timing (XRONOS) and image analysis (XIMAGE) packages were added by EXOSAT Observatory team members. The PGPLOT package developed by Tim Pearson at Caltech provides the underlying graphics display.

BROWSE was developed in conjunction with the astronomical community via two "database workshops" held at ESTEC in December 1988 and February 1990. At these workshops more than 40 astronomers used a prototype version of BROWSE. These workshops, and other demonstrations of the system at astronomical meetings around the world, were used to develop a user interface and functionality that meet the needs of astronomers. More recently BROWSE has been adopted by the High Energy Astrophysics Science Archive Research Center (HEASARC), which has continued its development as a multimission database environment, in collaboration with ESA.

Basics

Databases

A database consists of

a base table consisting of one or more parameters and a number of records;
associated data product files, referenced in the table; and
a description of the contents of the database.

There is always one base table that is referred to as the current database, and upon which the BROWSE commands operate by default.

Parameters

The parameters within each database are predefined and tailored to the function of that database. Each parameter field contains either a numeric value or a character string. Numeric values are stored internally as one of the following: integer*2, integer*4, real*4, or real*8. Character strings can be between c*2 and c*254 in length. Parameter fields may contain observation details, results of data analysis, and the filenames of associated data products or files. Every parameter has a name and a number associated with it, and either can be used to specify the parameter. If the number is used, it must be prefixed by a # mark WITH NO SPACE between the two. For example, in the LE database, count rate has #17 associated with it. To specify this parameter, the user can type either count or #17. Parameter names can also include the database name as a prefix, with a dot separating the two (for example, db_name.par_name). If no prefix is given, the current database name is used by default. The database name need only be specified when two or more base tables are accessed simultaneously.
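The naming rules above can be sketched in Python. This is only an illustration of the rules as stated, not BROWSE code: the function name resolve_parameter, the in-memory dictionary of tables, the placeholder parameter names, and the assumption that parameter numbers start at 1 are all hypothetical.

```python
def resolve_parameter(ref, tables, current_db):
    """Return (database name, parameter name) for a reference such as
    'count', '#17', or 'le.count'."""
    db_name, dot, par = ref.rpartition(".")
    if not dot:
        # No database prefix: fall back to the current database.
        db_name = current_db
    names = tables[db_name]  # ordered parameter names (assumed numbered from 1)
    if par.startswith("#"):
        digits = par[1:]
        if not digits.isdigit():
            # Covers '# 17': no space is allowed between '#' and the number.
            raise ValueError("bad parameter number: %r" % par)
        return db_name, names[int(digits) - 1]
    if par not in names:
        raise ValueError("unknown parameter: %r" % par)
    return db_name, par


# Hypothetical layout for the LE database: 16 placeholder parameters,
# then 'count' as parameter number 17, matching the example in the text.
tables = {"le": ["par%d" % i for i in range(1, 17)] + ["count"]}

by_name = resolve_parameter("count", tables, "le")
by_number = resolve_parameter("#17", tables, "le")
with_prefix = resolve_parameter("le.count", tables, "le")
```

All three references resolve to the same parameter of the same database, which is the behavior the text describes.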

Entries

ENTRIES refer to different rows in the table. When part or all of a table is viewed using BROWSE, each entry is assigned a number that gives its position in descending order of the value of a specified parameter (see samples, indexes, and subsamples). These entry numbers are not unique, and they change if the current index is changed or if the selection of entries is changed. There is always a CURRENT ENTRY, which is used as the default by a command if no entry number is specified.

Samples

SAMPLES are sets of pointers to base table records selected using a range of one or more parameter values. Each sample gives a particular view of the base table (and is the same as the result of the SQL CREATE VIEW). Samples allow pointers to selected records to be saved in a file for later recall. A number of SYSTEM SAMPLES are provided which contain commonly used selections. For every base table there is a TOTAL sample containing pointers to all of the records in that table.

Indexes

The record numbers that make up a sample are stored in a file which is ordered on the value of a particular parameter. The pointers contained in the file are then used to access entries in that order. The index file is used both to search rapidly for all entries corresponding to a range of values, and to list the selected sample in the order of a particular parameter value. All indexes are in descending order. The number of disk reads required to perform the search determines the speed at which the records are found. To optimize the search, an index file also contains the ordered parameter values. In many cases, the NUMBER OF DISTINCT VALUES in an index is considerably less than the NUMBER IN THE SAMPLE and, again to speed up the search process, the pointers to those entries that mark the beginning of a sequence of identical values are also stored in the index file. Only one index can be used at a time. Standard indexes are provided for commonly used parameters, such as coordinates, time, and name. If a search parameter is not indexed, a sequential search is made. For coordinate searches the indexed parameter is the declination (Dec). The search locates all records within the specified declination range and then sequentially checks that the right ascension (RA) of each is within the prescribed limits. Both Dec and RA are stored in the index file to improve the search speed for large catalogs.
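The coordinate search described above can be sketched in Python: a binary search on the Dec-ordered index finds the candidate range, and RA is then checked sequentially within it. This is a minimal illustration under stated assumptions, not the BROWSE implementation. The index is held in memory rather than in a file, the function names build_dec_index and box_search are hypothetical, the record layout is invented, and RA wrap-around at 0/360 degrees is ignored.

```python
from bisect import bisect_left, bisect_right


def build_dec_index(records):
    """Build an in-memory stand-in for a Dec index file.

    Returns three parallel lists ordered by descending Dec: negated Dec
    keys (ascending, so that bisect can be used), RA values, and pointers
    (record numbers) into the base table.
    """
    pointers = sorted(range(len(records)),
                      key=lambda i: records[i]["dec"], reverse=True)
    keys = [-records[p]["dec"] for p in pointers]
    ras = [records[p]["ra"] for p in pointers]
    return keys, ras, pointers


def box_search(index, dec_min, dec_max, ra_min, ra_max):
    """Return record numbers inside the Dec range, then filtered on RA."""
    keys, ras, pointers = index
    # Binary search for the slice of the index inside the Dec range.
    first = bisect_left(keys, -dec_max)
    last = bisect_right(keys, -dec_min)
    # Sequential RA check, using the RA values stored with the index.
    return [pointers[k] for k in range(first, last)
            if ra_min <= ras[k] <= ra_max]


# Invented example records; coordinates are in degrees.
records = [
    {"name": "A", "ra": 10.0, "dec": -5.0},
    {"name": "B", "ra": 200.0, "dec": 30.0},
    {"name": "C", "ra": 12.0, "dec": 31.0},
    {"name": "D", "ra": 11.0, "dec": 29.5},
    {"name": "E", "ra": 10.5, "dec": 80.0},
]
index = build_dec_index(records)
found = box_search(index, 29.0, 32.0, 10.0, 13.0)  # record numbers 2 and 3
```

Keeping the RA values alongside the Dec keys means the RA check never touches the base table, which mirrors the reason the text gives for storing both coordinates in the index file.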

Subsamples

The search process selects a number of records that satisfy the search criteria. These record numbers are saved in memory and are referred to as the SUBSAMPLE. The basic assumption behind samples and subsamples is that the number in the latter is orders of magnitude less than in the former. For searches through a large number of records (greater than a thousand), an index is essential; otherwise the search time becomes prohibitive. For a few hundred records or less, sequential search times are acceptable. In astronomical database applications the typical search is for all observations of a particular object. This usually results in a massive reduction in the number of records required. However, once these records are found, the astronomer may wish to make further selections, reselections, and reordering of the records based on various parameter values. Accordingly, the subsample can be filtered and sorted. In addition, the subsample can be used to plot and list parameter values, as well as to display associated data products or execute other analysis tasks. If no subsample has been selected, then the SAMPLE and SUBSAMPLE are the same. There is, however, a limit to the size of a subsample. The limit of 50,000 records is imposed by the number of pointers stored in memory (this number may be changed from time to time; use the show command to see the current value). The sample size may be limited by the memory capacity of the underlying system, and in most cases this limit is not an issue. However, one exception is the HST Guide Star Catalogue, and in this case the database is split into sub-databases using different ranges of RA. After a search has been made, any further searches operate on the original sample, not on the subsample, and the subsample will be lost unless it is saved first.
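As a rough Python sketch of the idea, a subsample can be modeled as a list of record numbers that is filtered and sorted without copying the records themselves. The function names, the record layout, and the example values below are hypothetical and are not taken from BROWSE.

```python
def filter_subsample(table, subsample, predicate):
    """Keep only the pointers whose records satisfy the predicate."""
    return [p for p in subsample if predicate(table[p])]


def sort_subsample(table, subsample, parameter):
    """Order pointers by descending value of one parameter."""
    return sorted(subsample, key=lambda p: table[p][parameter], reverse=True)


# Invented base table; the sample here is simply every record number.
table = [
    {"name": "X1", "count": 0.5, "exposure": 1000},
    {"name": "X1", "count": 2.0, "exposure": 500},
    {"name": "Y", "count": 1.0, "exposure": 3000},
    {"name": "X1", "count": 1.2, "exposure": 50},
]
sample = list(range(len(table)))

# The search selects all observations of one object.
subsample = filter_subsample(table, sample, lambda r: r["name"] == "X1")
# A further selection and a reordering then act on the subsample only.
long_exposures = filter_subsample(table, subsample,
                                  lambda r: r["exposure"] >= 100)
by_count = sort_subsample(table, long_exposures, "count")
```

Because only pointers are held, the cost of each further selection scales with the size of the subsample rather than with the size of the base table.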

Cross Correlations

There are a number of parameters which are common to many database tables and can be used to establish cross-correlations between database records. The most commonly used of these are coordinates (RA and Dec) and time. These correlation samples can be saved and restored. They are very similar to the samples discussed earlier. The major difference is that the related record numbers from several base tables are stored together. Cross-correlation samples connect records in each base table so that, for instance, the V Magnitude from an optical catalog can be plotted against the X-ray count rate in an X-ray catalog.
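A coordinate cross-correlation of this kind can be sketched in Python as a list of paired record numbers, one from each base table. This is an illustrative brute-force version under assumptions that do not come from the guide: the function names, the matching radius, the haversine formula for angular separation, and the example catalogs are all invented here, and BROWSE's own matching method is not described in this chapter.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula).
    All inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    h = (sin((dec2 - dec1) / 2) ** 2
         + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2)
    return degrees(2 * asin(min(1.0, sqrt(h))))


def cross_correlate(table_a, table_b, radius_deg):
    """Return (record number in A, record number in B) pairs whose
    positions agree to within radius_deg. Checks every pair."""
    return [(i, j)
            for i, a in enumerate(table_a)
            for j, b in enumerate(table_b)
            if angular_separation(a["ra"], a["dec"],
                                  b["ra"], b["dec"]) <= radius_deg]


# Invented catalogs; coordinates are in degrees.
optical = [
    {"ra": 10.0, "dec": 20.0, "vmag": 12.3},
    {"ra": 150.0, "dec": -30.0, "vmag": 9.1},
]
xray = [
    {"ra": 150.001, "dec": -30.001, "count": 0.8},
    {"ra": 10.0005, "dec": 20.0, "count": 0.05},
    {"ra": 300.0, "dec": 0.0, "count": 1.0},
]

pairs = cross_correlate(optical, xray, 0.01)
# V magnitude against X-ray count rate for the matched records.
points = [(optical[i]["vmag"], xray[j]["count"]) for i, j in pairs]
```

Storing the related record numbers together, as the pairs list does, is what lets a parameter from one table be plotted against a parameter from the other.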

Data Files

Data files can be accessed from BROWSE, and can be plotted, extracted, or analyzed. Typically these files contain data that have been preprocessed into an event list, a spectrum, a lightcurve, or an image. The files may be the result of automatic processing of the raw data, or may have been generated interactively by an astronomer. The XANADU programs XSPEC, XIMAGE, and XRONOS are used to plot and analyze data files. These programs are spawned from BROWSE; they are not part of BROWSE. The files may also be extracted and copied to another site, where analysis software (XANADU or otherwise) can be run on them.

Structured Query Language (SQL)

The structured query language, SQL, is an ANSI standard language used to make database queries. It provides a vendor-independent interface to a DBMS. However, it can be rather clumsy to use directly and it does not support all of the functions required to query an astronomical database. A subset of SQL commands is available from BROWSE that enables users familiar with this language to query the database tables and establish subsamples. A special function has been introduced to allow conical coordinate selections.
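The idea of extending SQL with a conical selection function can be illustrated with Python's standard sqlite3 module, which allows a user-defined function to be called from a query. This is only an analogy: the function name separation, the table layout, and the query syntax below are assumptions made for the sketch, and the actual name and syntax of the BROWSE function are not given in this chapter.

```python
import sqlite3
from math import acos, cos, degrees, radians, sin


def separation(ra1, dec1, ra2, dec2):
    """Angular separation in degrees from the spherical law of cosines.
    All inputs are in degrees."""
    a1, d1, a2, d2 = map(radians, (ra1, dec1, ra2, dec2))
    c = sin(d1) * sin(d2) + cos(d1) * cos(d2) * cos(a1 - a2)
    # Clamp against rounding error before taking the arc cosine.
    return degrees(acos(max(-1.0, min(1.0, c))))


conn = sqlite3.connect(":memory:")
# Make the Python function callable from SQL (hypothetical name).
conn.create_function("separation", 4, separation)
conn.execute("CREATE TABLE sources (name TEXT, ra REAL, dec REAL)")
conn.executemany(
    "INSERT INTO sources VALUES (?, ?, ?)",
    [("near", 83.6, 22.0), ("edge", 84.5, 22.0), ("far", 200.0, -45.0)],
)

# Select every source within 0.5 degrees of (RA, Dec) = (83.63, 22.01).
rows = conn.execute(
    "SELECT name FROM sources "
    "WHERE separation(ra, dec, ?, ?) <= ? ORDER BY name",
    (83.63, 22.01, 0.5),
).fetchall()
cone = [name for (name,) in rows]
conn.close()
```

A plain SQL WHERE clause on RA and Dec ranges selects a box on the sky; a function of this kind is what turns the selection into a cone around the target position.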


Michael Arida
1998-04-10