This chapter introduces BROWSE and explains the terminology used throughout
this user's guide. The terminology gives valuable insight into the internal
structure of BROWSE. The chapter is essential reading for a new user and a
useful reference as the user explores the capabilities of BROWSE.
BROWSE is a command-driven program that uses a database management system to
select data from tables of information and then executes analysis tasks on
those data. The program name reflects the fact that it provides astronomers
with a tool to "browse" through large volumes of data, selecting or discarding
them as required. Emphasis is placed on data visualization so that an immediate
assessment of data quality, source properties, and overall characteristics can
be made.
BROWSE has been used to access EXOSAT, Einstein, and IUE data files, as well as
many popular astronomical catalogs. Interfaces are provided to the XANADU
X-ray astronomy data analysis packages (such as XSPEC and XRONOS) and to
SAOIMAGE. These interfaces allow the analysis packages to be run from BROWSE
on selected data files without the need to extract the files or specify the
file names. In this way the user can display an image or make a spectral
fit without worrying about the location or names of the files. BROWSE also
provides commands to extract the data files, so they can be copied to the
user's local directory for use by analysis packages outside the BROWSE
environment.
The BROWSE functions include:
    displaying parameter values;
    searching by coordinates, name, or any other parameter;
    plotting parameter values and data files;
    creating samples that view only part of a database table; and
    establishing correlations (relationships) between tables.
BROWSE was originally created to provide access to the EXOSAT Database, which
contains the results and data products from the European Space Agency's X-ray
astronomy mission, EXOSAT (1983-1986). The rapid growth in the late 1980s of
international computer networks with remote login capability (such as
the Space Physics Analysis Network, SPAN) provided the opportunity for
immediate access via online services.
BROWSE uses a database management system (DBMS) that was developed by the
EXOSAT Observatory team and is optimized for astronomical queries (such as
coordinate searches). It is based on DBMS software written early in the
EXOSAT mission by P. Giommi and L. Chiappetti as part of a program to survey
serendipitous sources detected by EXOSAT. The original software included a
"BROWSE" function for examining the detected sources and their associated
images, in order to verify that the detections were real.
The XANADU analysis infrastructure developed at the Institute of Astronomy,
Cambridge, UK, was adopted in 1985 to provide the parser (XPARSE), a
plotting and function fitting package (QDP) and a spectral analysis
environment (XSPEC). Timing (XRONOS) and image analysis (XIMAGE) packages
were added by the EXOSAT Observatory team members. The PGPLOT package
developed by Tim Pearson at Caltech provides the underlying graphics
display.
BROWSE was developed in conjunction with the astronomical community via two
"database workshops" held at ESTEC in December 1988 and February 1990. At
these workshops more than 40 astronomers used a prototype version of
BROWSE. These workshops, and other demonstrations of the system at
astronomical meetings around the world, were used to develop a user
interface and functionality that meet the needs of astronomers.
More recently BROWSE has been adopted by the High Energy Astrophysics Science
Archive Research Center (HEASARC), which has continued its development as a
multimission database environment, in collaboration with ESA.
The parameters within each database are predefined and tailored to the function
of that database. The parameter fields contain either a numeric value or a
character string. The numeric values are stored internally as one of the
following: integer*2, integer*4, real*4, or real*8. Character strings
can be between 2 and 254 characters long (c*2 to c*254). Parameter fields may contain
observation details, results of data analysis, and the filenames of associated
data products or files.
Every parameter has a name and a number associated with it. Either the
name or the number can be used to specify the parameter. If the number
is used, it must be prefixed by a # mark WITH NO SPACE. For example, in
the LE database, the count rate is parameter number 17. To specify it,
the user can type either count or #17.
Parameter names can also include the database name as a prefix, with the dot
separating the two (for example, db_name.par_name). If the parameter is from
the current database, then the current database name is used by default. The
database name need only be specified when two or more base tables are accessed
simultaneously.
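The naming rules above can be illustrated with a minimal sketch. It is not the BROWSE parser: the function resolve, the PARAMS dictionary, the ME entry, and every parameter number other than #17 for the LE count rate are hypothetical and exist only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical catalog: database name -> {parameter name: parameter number}.
# Only the LE count-rate number (#17) comes from the text; everything else
# here is invented for the example.
PARAMS: Dict[str, Dict[str, int]] = {
    "le": {"name": 1, "count": 17},
    "me": {"name": 1, "count": 9},
}


def resolve(ref: str, current_db: str) -> Tuple[str, int]:
    """Resolve 'count', '#17', or 'le.count' to (database, parameter number)."""
    db, sep, par = ref.partition(".")
    if not sep:                      # no prefix: default to the current database
        db, par = current_db, ref
    db = db.lower()
    table = PARAMS[db]
    if par.startswith("#"):
        digits = par[1:]
        if not digits.isdigit():     # rejects "# 17" (space after the # mark)
            raise ValueError(f"bad parameter number: {par!r}")
        number = int(digits)
        if number not in table.values():
            raise KeyError(number)
        return db, number
    return db, table[par.lower()]
```

With LE as the current database, resolve("count", "le") and resolve("#17", "le") refer to the same parameter, while resolve("me.count", "le") reaches into a second table.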
ENTRIES are the rows of a table. When part or all of a table
is viewed using BROWSE, each entry is assigned a number that gives its
position in descending order of the value of a specified parameter (see samples,
indexes, and subsamples). These entry numbers are not unique identifiers; they
change if the current index is changed, or if the selection of entries is
changed. There is always a CURRENT ENTRY which is used as the default by a
command if no entry number is specified.
SAMPLES are sets of pointers to base table records selected using a range of
one or more parameter values. Each sample gives a particular view of the base
table (equivalent to the result of an SQL CREATE VIEW statement). Samples
allow pointers to selected records to be saved in a file for later recall. A
number of SYSTEM SAMPLES are provided that contain commonly used selections.
For every base table there is a TOTAL sample containing pointers to every
record in the table.
The record numbers that make up a sample are stored in an index file, which is
ordered on the value of a particular parameter. The pointers contained in the
file are then used to access entries in that order. The index file is used both to
search rapidly for all entries corresponding to a range of values, and to list
the selected sample in the order of a particular parameter value. All indexes
are in descending order.
The number of disk reads required to perform the search determines the speed at
which the records are found. To optimize the search, an index file also
contains the ordered parameter values. In many cases, the NUMBER OF DISTINCT
VALUES in an index is considerably less than the NUMBER IN THE SAMPLE and,
again to speed up the search process, the pointers to those entries that mark
the beginning of a sequence of identical values are also stored in the index
file. Only one index can be used at a time. Standard indexes are provided for
commonly used parameters, such as coordinates, time, and name.
If a search parameter is not indexed, a sequential search is made.
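The index layout described above can be sketched in a few lines. This is an illustration only, not the BROWSE file format: the names build_index and range_search are hypothetical, and the index is held in memory rather than on disk. It stores the record pointers in descending order of the parameter, the distinct values, and the position at which each run of identical values begins.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple

Index = Tuple[List[int], List[float], List[int]]


def build_index(values: Sequence[float]) -> Index:
    """Return (pointers, distinct, starts) for a descending-order index."""
    # Record numbers ordered by descending parameter value.
    pointers = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    distinct: List[float] = []   # each distinct value, descending
    starts: List[int] = []       # position in `pointers` where its run begins
    for pos, rec in enumerate(pointers):
        if not distinct or values[rec] != distinct[-1]:
            distinct.append(values[rec])
            starts.append(pos)
    return pointers, distinct, starts


def range_search(index: Index, lo: float, hi: float) -> List[int]:
    """Return record numbers with lo <= value <= hi, in index order."""
    pointers, distinct, starts = index
    # bisect expects ascending keys, so search the negated distinct values.
    # (A real index would store them in searchable form instead of copying.)
    keys = [-v for v in distinct]
    first = bisect_left(keys, -hi)    # first distinct value <= hi
    last = bisect_right(keys, -lo)    # one past the last distinct value >= lo
    if first >= last:
        return []
    begin = starts[first]
    end = starts[last] if last < len(starts) else len(pointers)
    return pointers[begin:end]
```

Because the binary search runs over the distinct values only, the cost of locating a range depends on the number of distinct values rather than on the number of records in the sample.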
For coordinate searches the indexed parameter is the declination (Dec). The
search locates all records within the specified declination range and then
sequentially checks that the right ascension (RA) is within the prescribed
limits. Both Dec and RA are stored in the index file to improve the search
speed for large catalogs.
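A minimal sketch of this two-stage coordinate search follows. The names build_dec_index and box_search are hypothetical, coordinates are assumed to be in degrees, and the handling of an RA range that wraps through 0 is an assumption added for completeness rather than something stated in this guide.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple

DecIndex = List[Tuple[float, float, int]]


def build_dec_index(coords: Sequence[Tuple[float, float]]) -> DecIndex:
    """coords[i] = (ra_deg, dec_deg). Return (dec, ra, record) rows, Dec descending."""
    rows = [(dec, ra, rec) for rec, (ra, dec) in enumerate(coords)]
    rows.sort(key=lambda row: -row[0])
    return rows


def box_search(index: DecIndex, ra_min: float, ra_max: float,
               dec_min: float, dec_max: float) -> List[int]:
    """Indexed cut on Dec, then a sequential check on RA."""
    keys = [-row[0] for row in index]          # ascending keys for bisect
    begin = bisect_left(keys, -dec_max)
    end = bisect_right(keys, -dec_min)
    hits = []
    for _dec, ra, rec in index[begin:end]:     # only the Dec strip is scanned
        if ra_min <= ra_max:
            inside = ra_min <= ra <= ra_max
        else:                                  # assumed: range wraps through RA = 0
            inside = ra >= ra_min or ra <= ra_max
        if inside:
            hits.append(rec)
    return hits
```

Keeping RA alongside Dec in each index row means the sequential check never has to read the base table, which is the point of storing both values in the index file.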
The search process selects a number of records that satisfy the search
criteria. These record numbers are saved in memory and are referred to
as the SUBSAMPLE.
The basic assumption behind samples and subsamples is that the number of
records in a subsample is orders of magnitude smaller than in its sample. For
searches through a large number of records (more than a thousand), an index is
essential; otherwise the search time becomes prohibitive. For a few hundred
records or fewer, sequential search times are acceptable. In astronomical
database applications the typical search is for all observations of a
particular object. This usually results in a massive reduction in the number of
records required. However, once these records are found, the astronomer may
wish to make further selections, reselections, and reordering of the records
based on various parameter values. Accordingly, the subsample can be filtered
and sorted. In addition, the subsample can be used to plot and list parameter
values, as well as to display associated data products or execute other
analysis tasks.
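Filtering and sorting a subsample can be pictured as operations on a short in-memory list of record numbers. The sketch below is illustrative only: the function names, the parameter names count and exposure, and the table values are all invented, and the table is a plain Python list standing in for a base table.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]


def filter_subsample(table: List[Record], subsample: List[int],
                     keep: Callable[[Record], bool]) -> List[int]:
    """Keep only the record numbers whose records satisfy the predicate."""
    return [rec for rec in subsample if keep(table[rec])]


def sort_subsample(table: List[Record], subsample: List[int],
                   param: str, descending: bool = True) -> List[int]:
    """Reorder the record numbers by the value of one parameter."""
    return sorted(subsample, key=lambda rec: table[rec][param], reverse=descending)


# Illustrative table; parameter names and values are invented.
table = [
    {"count": 0.8, "exposure": 1200.0},
    {"count": 2.5, "exposure": 300.0},
    {"count": 1.1, "exposure": 5000.0},
    {"count": 0.2, "exposure": 900.0},
]
subsample = [0, 1, 2, 3]                      # result of an earlier search
bright = filter_subsample(table, subsample, lambda r: r["count"] > 0.5)
by_exposure = sort_subsample(table, bright, "exposure")
```

Because the subsample is small, both operations are sequential passes over the list and need no index.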
If no subsample has been selected, then the SAMPLE and SUBSAMPLE are the
same. There is, however, a limit to the size of a subsample. The limit of
50,000 records is imposed by the number of pointers stored in memory (this
number may be changed from time to time; use the show command to see the
current value). The sample size may be limited by the memory capacity of
the underlying system, but in most cases this limit is not an issue.
One exception is the HST Guide Star Catalogue, for which the
database is split into sub-databases covering different ranges of RA.
After a search has been made, any further searches operate on the original
sample, not on the subsample, and the subsample will be lost, unless it is
saved first.
There are a number of parameters which are common to many database tables and
can be used to establish cross-correlations between database records. The most
commonly used of these are coordinates (RA and Dec) and time. The resulting
correlation samples can be saved and restored. They are very similar to the
samples discussed earlier; the major difference is that the related record
numbers from several base tables are stored together.
Cross-correlation samples connect records in each base table so that, for
instance, the V magnitude from an optical catalog can be plotted against the
X-ray count rate in an X-ray catalog.
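As a rough illustration of a coordinate cross-correlation, the sketch below pairs records from two tables whose positions agree to within a matching radius and stores the related record numbers together. The function names, the field names vmag and count_rate, the coordinates, and the 0.01 degree radius are all invented for the example; the brute-force double loop is chosen for clarity and is not a description of how BROWSE performs the match.

```python
import math
from typing import Dict, List, Sequence, Tuple


def separation_deg(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_correlate(table_a: Sequence[Dict[str, float]],
                    table_b: Sequence[Dict[str, float]],
                    radius_deg: float) -> List[Tuple[int, int]]:
    """Return (record in A, record in B) pairs closer than radius_deg."""
    pairs = []
    for i, a in enumerate(table_a):
        for j, b in enumerate(table_b):
            if separation_deg(a["ra"], a["dec"], b["ra"], b["dec"]) <= radius_deg:
                pairs.append((i, j))
    return pairs


# Invented example tables.
optical = [
    {"ra": 10.0, "dec": 20.0, "vmag": 12.3},
    {"ra": 150.0, "dec": -5.0, "vmag": 9.8},
]
xray = [
    {"ra": 150.001, "dec": -5.001, "count_rate": 0.42},
    {"ra": 10.0005, "dec": 20.0005, "count_rate": 1.7},
    {"ra": 200.0, "dec": 45.0, "count_rate": 0.05},
]
pairs = cross_correlate(optical, xray, radius_deg=0.01)
# Values ready to plot: V magnitude against X-ray count rate.
points = [(optical[i]["vmag"], xray[j]["count_rate"]) for i, j in pairs]
```

The list of pairs plays the role of the correlation sample: it holds only record numbers, so it can be saved and later used to pull any parameter from either table.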
Data files can be accessed from BROWSE, and can be plotted, extracted or
analyzed. Typically these files contain data that have been preprocessed
into an event list, spectrum, light curve, or image. These may be the
result of automatic processing of the raw data, or generated
interactively by an astronomer. The XANADU programs XSPEC, XIMAGE, and
XRONOS are used to plot and analyze data files. These programs are spawned
from BROWSE; they are not part of BROWSE. The files may also be extracted and
copied to another site, where analysis software (XANADU or otherwise) can be
run on them.
The structured query language, SQL, is an ANSI standard language used to make
database queries. It provides a vendor-independent interface to a DBMS.
However, it can be rather clumsy to use directly and it does not support all of
the functions required to query an astronomical database. A subset of SQL
commands is available from BROWSE that enables users familiar with this
language to query the database tables and establish subsamples. A special
function has been introduced to allow conical coordinate selections.
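The idea of adding a cone-selection function to SQL can be illustrated with Python's built-in sqlite3 module. This is an analogy only: it does not show the BROWSE SQL syntax, and the function name SEPARATION, the table sources, and its contents are hypothetical.

```python
import math
import sqlite3


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL (hypothetical name).
conn.create_function("SEPARATION", 4, separation_deg)
conn.execute("CREATE TABLE sources (name TEXT, ra REAL, dec REAL)")
conn.executemany("INSERT INTO sources VALUES (?, ?, ?)", [
    ("A", 83.63, 22.01),      # invented positions, in degrees
    ("B", 83.90, 22.20),
    ("C", 120.00, -30.00),
])
# Select everything within 0.5 degrees of the cone axis at (83.63, 22.01).
rows = conn.execute(
    "SELECT name FROM sources WHERE SEPARATION(ra, dec, ?, ?) <= ? ORDER BY name",
    (83.63, 22.01, 0.5),
).fetchall()
cone_names = [row[0] for row in rows]
conn.close()
```

Standard SQL comparison operators can express a rectangular RA and Dec box, but a circular region on the sky needs the spherical distance, which is why a dedicated function is the natural extension.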