Xamin can be used through a web interface designed as a Rich Internet Application (RIA) and through a command line interface. All facilities are available through both interfaces.
http://heasarc.gsfc.nasa.gov/xamin
The Xamin web application contains a number of end-point URLs. This
web application includes not only the run-time code, but also full documentation
and code for the installation and maintenance of Xamin.
Click the mission's add icon to add all of the tables for that
mission. Then use Find matches in... Selected Tables to
search all of the tables in the mission for the selected times or positions.
You can add tables from other missions, or delete some of the tables before
querying if you like. You can then query the selected tables for
position or time constraints.
To cross-correlate the selected tables, use Query to...Correlate. Spatial cross-match
constraints will be added automatically, but you can override or modify
these to meet your needs.
Use Query to...Save to save this query in your account. Bring up
the summary of your saved tables with the Session/User Tables--Info
menu. If you click on the Frequency column next to your saved table you should
get a pane that allows you to ask to be informed whenever the result of this
query changes.
All tables can take part in correlations. However, remote tables must be downloaded to the HEASARC before the correlation can take place, so in practice correlations of remote resources can only be done if the tables are of modest size, or if constraints passed to the remote site restrict the result to a reasonable size. It may be impractical to query remote tables where the size of the downloaded data is more than ~10,000 rows.
In the web interface spatial correlation constraints are assumed and created for the user -- they can be deleted if not desired. In the command line version the OFFSET keyword is used to specify positional correlations, but the user must include this explicitly.
Temporal cross-correlations can be made as simple constraints on the time columns of the two tables. At the HEASARC times are stored as MJDs, so to ask for observations within a day of one another we can add a constraint of the format
b.time-1 .. b.time+1
in the Parameters explorer for the field a.time.
In the command line interface that might be specified as
constraint='a.time between b.time-1 and b.time+1'
where the quotes are to ensure that the shell treats the value as a single string.
While any number of tables can be correlated in principle, queries tend to become very slow when large numbers of tables are involved. The web interface imposes a maximum of four tables in a correlation; no limit is specified for the command line. If necessary, this limit can be worked around by breaking the correlation into a series of correlations and saving intermediate results as tables.
Note that Xamin does not do careful analysis of the input constraints to determine if pieces of an input constraint can be included in the second query. E.g., compare the constraint
constraint='d.exposure>1000 and c.exposure>1000'
versus
constraint='d.exposure>1000,c.exposure>1000'
From Xamin's perspective, in the first example we have a single constraint.
If d is the anticorrelation table, then in the first case
Xamin will not include any constraint on the exposure from table c
when it runs the second part of the anticorrelation. In the second case,
since the constraint on table c is specified independently,
this constraint will be included in the second query.
When doing an autocorrelation it is usually necessary to make sure that you don't return a row matching itself. A simple way to ensure that is to add a constraint of the form
constraint='a.id != b.id'
in the query. In the Xamin web interface just enter != b.id
next to the entry for a.id. Of course the appropriate field
may be named something other than id.
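For example, in the command line interface an autocorrelation might be run as follows, where seq_id is a hypothetical identifier column that should be replaced by an appropriate field of the table being used:
java -jar users.jar table=rosmaster,rosmaster offset=a:b:10 constraint='a.seq_id != b.seq_id'
This finds pairs of distinct rows within 10 arcminutes of one another.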
There are two fundamental types of products: archival products, which can be collected and downloaded, and link products, which point to data that is intended to be viewed in the browser or that is found at remote sites.
As data products are collected they are added to a shopping cart. The shopping cart can contain products from multiple selections and from multiple queries. Users can also download individual data products directly (or, in the case of directories, open a web link to the directory). Link products cannot be added to the shopping cart -- they generally involve data that is intended to be viewed in the browser or that is found at remote sites.
At any point in an Xamin session users can download the data in their shopping cart. They can download individual products one by one from the shopping cart, but if there are more than two or three products, users may prefer to collect the data products together. A tar file of all of the products in the cart can be created; this tar file is created dynamically, so the download should start almost immediately.
Whenever a significant amount of data is to be downloaded we recommend that a data download script be created rather than directly tarring the data. Scripts can be created which use either the WGET or CURL commands, both of which are widely available. Scripts are much more easily restarted and allow users complete control of the data download process. The script can be displayed and copied directly, or it can be downloaded explicitly.
In the command line environment a query can be used to build a download script to download archival products or to provide a list of links to link products.
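As an illustration only, a generated script is essentially a list of retrieval commands, one per product, built from the wget or curl skeletons shown in the settings file below; the path here is a hypothetical example:
wget -q -nH -r -l0 -c -N -np -R 'index*' -erobots=off --retr-symlinks http://heasarc.gsfc.nasa.gov/FTP/rosat/data/...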
On-line users can also save the current query configuration at any time into
a named configuration. This configuration can be restored at any later time.
Note that the configuration only affects the Query window, not any of the
other Xamin windows. If a configuration is saved with the
name default, then a user can automatically restore this configuration
in subsequent logins. To change the default configuration the current
default must first be deleted.
This section describes the steps in creating the complete Xamin data access system. The discussion of each step includes both the general issues that need to be resolved by an implementor in their system, and the specific choices made at the HEASARC.
This section describes a complete installation of the Xamin server. Users who only wish to use Xamin from their CLI environment need only download the very small users.jar file from the Xamin distribution directory (e.g. http://heasarc.gsfc.nasa.gov/xamin/distrib/users.jar).
Xamin is designed to use a PostgreSQL database. Some PostgreSQL-specific features are used in the software that populates the system and in the query system. The keyword search interface uses PostgreSQL features that are paralleled in some other systems (e.g., SQLServer), but may not be generally available. Postgres-specific features of the system are called out where they occur.
Postgres is freely available at:
http://postgresql.org
While one-click install downloads are available, these generally require superuser privileges on the destination machine.
The tar file should be unpacked with
> tar zxf postgresql-X.Y.Z.tar.gz
this creates a subdirectory of the current directory named postgresql-X.Y.Z (e.g., postgresql-8.4.2). An INSTALL file with
detailed directions for installing Postgres is available in this directory. The path used for the
HEASARC installation is noted below.
Within the postgresql-X.Y.Z directory
> ./configure --prefix=desired_postgres_directory
e.g.,
> ./configure --prefix=/local/db/postgres
Then execute:
> make
> make install
to compile and install Postgres in the desired location.
The current operational Postgres server at the HEASARC is installed on the ASDDB machine.
The Postgres server is ideally run under a special user account but may be run under a normal account. The special account can be created using normal system procedures.
To initialize the database first create the directory in which the database data is to be stored. This directory should be owned by the same user as will run the server.
> mkdir directory_for_data
then initialize the database using
> initdb -D directory_for_data
where initdb is in the bin subdirectory of your Postgres installation.
The output from the initdb command indicates how one can start the database server. This command
should be put in the startup scripts for your database machine (though this has not currently been
done for the HEASARC).
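For example, the server can be started with the pg_ctl utility (also in the bin subdirectory of your Postgres installation); the data directory and log file names here are placeholders:
> pg_ctl -D directory_for_data -l logfile.log start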
Once the database has been initialized some small amount of customization of the configuration file (postgresql.conf) is desirable. This is found in the data directory mentioned above.
shared_buffers = 500MB
temp_buffers = 8MB
max_prepared_transactions = 5
work_mem = 16MB
maintenance_work_mem = 64MB
max_stack_depth = 8MB
All of these lines override the values in the default configuration, which is sized for a small
machine.
Before starting data ingest, an installer will need to download the needed jar files. Jar files are modified zip files which are used to store Java classes. However, they can also be used to store other kinds of files, including data files used for running programs. The Xamin jar includes a number of utility programs (mostly written in Perl) that need to be unpacked to be used. The Xamin war file is created in building a full distribution of Xamin. This war file (a jar file containing the complete Web application) is currently created using NetBeans. To unpack the war, copy it to the directory in which you want to unpack it and
> jar xf xamin.war
Postgres allows for multiple databases to be supported by a single Postgres installation. Each database can have multiple schemas and each schema multiple tables. Cross-database queries are not supported, but cross-schema queries are. All tables used within Xamin should be placed within a single database which may now be created either by using the psql command (the CLI interpreter for the database), or using the createdb command. E.g., using psql we might have:
> psql
prompt=# create database xamin;
CREATE DATABASE
prompt=#
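Equivalently, the database can be created from the shell:
> createdb xamin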
At the HEASARC we place all tables into a single database named heasarc.
The metadata table, metainfo, needs to be created before any other tables can be read into the system. The
buildmeta.csh
command sets this up. If a feedback table is to be supported in the system this
can be created using the
buildfeedback.csh
command.
The scripts for initializing and populating the HEASARC database are included with the Xamin Web service. The service should be installed as described below. The scripts described below can all be found in the operations area of the installed service (typically something like .../xamin/WEB-INF/classes/ops).
An initial ingest of HEASARC tables may be done using the [run]fullcopy command. The runfullcopy script (in ops/ingest) sets up logs and such for the fullcopy command. Simply running
> runfullcopy
will copy all HEASARC tables into the system using the current values of the TDAT files in the HEASARC's
http://heasarc.gsfc.nasa.gov/FTP/heasarc/dbase/tdat_files directory. As discussed in operations
below, this script is also used to update the database with new and updated tables.
The runfullcopy command uses other scripts in the directory. It calls
fullcopy after setting up logs. This calls copyit
for each matching TDAT file. The real workhorse inside copyit is pgcopy.pl which
parses the TDAT files and creates the SQL to ingest the table. This is then
sent to the Postgres psql to be executed.
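Schematically, the effect for each TDAT file is something like the following, where the exact invocation is internal to copyit and the file and database names are placeholders:
> pgcopy.pl rosmaster.tdat | psql heasarc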
The procedure populates both the tables themselves and the metadata regarding the tables. However it does not populate the data products related to a table.
In addition to populating the science tables, the runfullcopy command updates the
documentation tables within Xamin. Xamin keeps a copy of all HEASARC and Vizier table documentation
in the database. These can either be fully refreshed or updated depending upon
whether an argument is specified in the command. The updateDocs.pl script
is used by runfullcopy to find and ingest documentation.
The resulting SQL file should be run, and then the fixdocs.sql script should be run. This last script adds a bit of HEASARC metadata to the HEASARC descriptions, populates the TSQUERY column in the tabledocs table, and gets rid of descriptions of HEASARC tables that are not present in the database.
The metadata for HEASARC products is ingested using the updateProducts command in ops/products, specifying a directory with the product definition files for each mission. E.g., as the code is distributed, if the user makes the directory specified above the default directory, then the command
updateProducts missions
will ingest the data products for the missions (found in ops/products).
The HEASARC has a products file for each mission and special products files for table links, external links, and bibliographic links that should be individually processed.
The
createMasters.pl
script does an initial creation of the master tables. Run the
buildFullMasters.pl
script to populate the master tables: pos_big, pos_small, pos_time_big, pos_time_small
and time. This process can take a couple of hours.
Master tables are views of two underlying tables: a _prime table and a
_bck table. Creating the master tables is fairly expensive because we create
a clustered index on fairly large tables. We do
not wish to recreate them and regenerate the index each time we update a table.
After buildFullMasters.pl completes, the _bck
tables are empty. As tables in the database are updated, part of the update process
is to add the position/time information from those tables into the corresponding _bck
tables. Another table keeps track of the tables that have been updated since the last time
we ran buildFullMasters.pl. When a user makes a subsequent query of the
master tables, we look in the _bck table for information about science
tables that have been updated, and in the _prime table for information
about unchanged tables. Since only a small fraction of the tables get updates, the
_bck table is much smaller than the _prime table. Thus even
though it is not as efficiently indexed, master table queries can run quickly.
Periodically we run buildFullMasters.pl to bring the data from the _bck
tables into the primary tables. Since this takes a couple of hours, the approach is
to build up the new versions of the master tables using temporary table names.
Only when the new master tables are complete do we drop the previous version and
rename the new tables to the standard names. This takes seconds.
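A minimal sketch of this final swap for one master table, assuming a hypothetical temporary name pos_small_prime_new (the actual script may differ, and the dependent view must be recreated after the rename):
begin;
drop table pos_small_prime cascade;
alter table pos_small_prime_new rename to pos_small_prime;
create view pos_small as
select AA from pos_small_prime
where table_name not in (select table_name from update_list)
union
select AA from pos_small_bck;
commit;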
The heasarc.settings file is used to set up the defaults for the Xamin system. The file is normally found in the WEB-INF subdirectory of the Xamin web application. Below we provide an annotated version. Other installations will need to adjust the values to meet their requirements. End users should generally not need to worry about it. It is not used on the client side of the Xamin batch interface.
# The default database
# This will be used as the suffix in the jdbc_xxxx settings
database=postgres
# The JDBC parameters for Postgres
# This is probably the same for everyone.
jdbc_class_postgres=org.postgresql.Driver
# The host and database name will need to be updated.
# The port is the Postgres default port.
jdbc_url_postgres=jdbc:postgresql://db.host:5432/postgres
jdbc_user_postgres=dbuser
jdbc_pwd_postgres=password
# Nominal hosts versus actual host names. We want to
# be sure to translate any hostname we get into a valid
# host. So if we get heasarc we want the full path
# heasarc.gsfc.nasa.gov. If running on a load balancer
# then we want to translate the actual server name
# that we are running on, to the name that we want to
# publish.
heaweb1=heasarc.gsfc.nasa.gov
heaweb2=heasarc.gsfc.nasa.gov
heaweb3=heasarc.gsfc.nasa.gov
heaweb4=heasarc.gsfc.nasa.gov
# What to translate the host "localhost" to.
localhost=heasarc.gsfc.nasa.gov
# These are settings that the user is not allowed to change.
fixedSettings=database,jdbc_class_postgres,jdbc_url_postgres,jdbc_user_postgres
# The User account for non-privileged access
standard_user=dbuser
# Is there forwarding to a TOMCAT server? This happens only
# for the nominal released and operational versions, not development versions.
# When we are constructing a local URL from the elements available in the
# server environment we may find that the servlet is running on port 6080,
# but we want to publish the original port 80 for the Apache server.
ForwardedPort=80,6080
# The current version of Xamin.
# The first digit is incremented with major releases.
# The second digit is incremented when there is a significant change to user functionality.
# Letters are appended to show bug/documentation fixes that do not appreciably
# change end user functionality.
Version=1.3g
# The following two fields are placeholders that we fill in during the installation
# procedure with appropriate versions for the HDB and Xamin projects that are used
# in building the Xamin web application.
HDBVersion=0
XaminVersion=0
# This is replaced by the current date in the installation procedure.
ReleaseDate=2011-01-01
# This is a relative or absolute URL to a service that should be invoked
# to request a user login. Note that Xamin itself does not manage user logins.
# While a user is logged in, all Xamin requests use the login service
# to validate that the indicated user is logged in to the current session.
LoginServlet=/srvlogin
# This is a pointer to the location of the HEASARC hera service.
HeraServer=herasrv.gsfc.nasa.gov
HeraPort=60000
DftHeraName=AnonHera
# These are the E-mail addresses for administrators who will receive notifications
# when Xamin users submit feedback.
feedbackRecipients=user1@xxx.gov,user2@xxx.gov
# These provide a skeleton for the structure of download requests.
wgetcommand="wget -q -nH -r -l0 -c -N -np -R 'index*' -erobots=off --retr-symlinks "
curlcommand="curl -s --remote-name "
# User download requests using the TAR download feature are constrained
# to the tree below.
DownloadPath=/FTP
Smaller versions of the settings file, ops.settings and dev.settings,
are used by the operations scripts. These include the Postgres settings that
should be used by the operations scripts (which will typically be different from
those used in Xamin user queries).
The following Valve entry in the TOMCAT server configuration file enables access logging:
<Valve className="org.apache.catalina.valves.AccessLogValve"
directory="logs" prefix="localhost_access_log." suffix=".log"
pattern="common" resolveHosts="false"/>
The following entries in the VirtualHost information in the Apache
server configuration files enable the TOMCAT server to run under the Apache server,
with any URL that begins with /xamin processed by TOMCAT and all other URLs
handled directly by Apache.
<VirtualHost>
....
ProxyPass /xamin http://localhost:nnnn/xamin
ProxyPassReverse /xamin http://localhost:nnnn/xamin
....
</VirtualHost>
The first line says that when the Apache server sees the request it should delegate
to another web server running at port nnnn. The second line tells Apache
that when it gets back a response from the remote port it should scrub references
in the headers to the actual server and replace them with references to the
Apache server.
Xamin is intended to run in a servlet environment and at the HEASARC is developed using
TOMCAT. To install Xamin in TOMCAT simply copy the xamin.war file to the TOMCAT
webapps directory and restart TOMCAT. This should automatically install the
system. Installers may
need to update the heasarc.settings file in the webapps/xamin directory.
Alternatively, the war file can be unpacked manually:
mkdir webapps/xamin
cd webapps/xamin
jar xf some/path/to/xamin.war
where the user is assumed to start in the directory in which TOMCAT has been
installed. The web application may need to be restarted within TOMCAT.
The Xamin war file contains the full Xamin distribution including operational scripts. Scripts are generally found under the xamin/WEB-INF/ops directory. A site may wish to move these operational scripts to other directories; however, the WEB-INF area and its subdirectories are invisible to the outside world, so this area can be used without revealing operational details to the public.
Since direct access to the database is not available outside of GSFC, the Xamin batch mode gets around this by providing a query proxy to the user. The user need download only a very small executable jar file that contains two classes. The classes in this jar take the user's input arguments and convert them into an HTTP POST request to the Xamin web server. If there are file arguments (i.e., UPLOAD) then these files are read and appended to the POST. If the user has specified an OUTPUT argument, then the output is written to the designated file.
The server side of the batch interface is a thin layer which invokes a servlet that runs the user-specified query and returns the results. The CLI servlet is very similar to the servlet used in processing Web requests except for the handling of user authentication. If a user wishes to access an Xamin account through the batch interface, both the user name and password must be supplied in the command arguments. These are authenticated using the HEASARC login service.
To execute the users.jar file the user should enter:
java -jar users.jar arg1 arg2 ...
For example
java -jar users.jar table=rosmaster,ascamaster offset=a:b:10
will run a positional cross-correlation of the rosmaster and ascamaster tables with
a maximum allowed offset of 10 arcminutes.
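As a further sketch, an uploaded source list can be cross-matched and the results written to a file. Here the file names are placeholders, and we assume the UPLOAD and OUTPUT arguments described above are given in lower case like the other keywords, with the uploaded table participating as one of the aliased tables:
java -jar users.jar table=rosmaster upload=mysources.csv offset=a:b:5 output=matches.csv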
This section describes the general flow of control in processing a query, with a view to showing how the various classes work together.
The first order of business for the Controller is to set up three global structures. The Connecter is used to get connections to the underlying database or databases. The Settings are used to manage the system and user settings. The MessageHandler is used to convey information about processing to the user. Each of these is available in a global context. To allow for robust access even in an environment where there may be many simultaneous queries, these three globals use static ThreadLocal variables to ensure that each query gets an independent instance during query processing.
The Connecter allows a single connection to the underlying database to be reused or, when multiple connections are desired, as many connections as needed to be obtained. During shutdown it ensures that all connections are closed.
The Settings object is essentially just a Map from keywords to string values. Settings are set by system setup files, by user arguments, and by program operations.
The Message Handler provides a standard way to send messages back to the user. Initially it is set to simply echo whatever is sent on the standard error output. Once the query processing has progressed to a stage where the desired output format has been established, a message handler compatible with that format will be used.
After establishing the Settings the Controller begins to analyze the request the user has made. There are a few non-query requests that are handled first.
If the user has requested help documentation, the Help class is used to satisfy the request. If a USER setting has been specified, then the UserAccount class is used to ensure that the password specified is correct (or that the user has consistently forgone password protection on the account).
Next there is a check to see if this is an account processing directive. A user may create an account or modify it using the functions in the AccountManager or view or delete tables using the TableManager.
After verifying that the request is not one of the non-query requests, the Controller.runQuery method starts setting up the query environment. The first step is determining the output. A StandardDisplay object is created which will handle the communications with the user. A FormatFactory which creates the WriterElements appropriate for the selected format is created, and the WriterElements are plugged into the StandardDisplay. WriterElements separate out tasks for writing: notably there are WriterElements for writing table information, and distinct elements (associated through the FormatFactory) for writing out the data products associated with the element. The global MessageHandler is updated so that messages will be incorporated appropriately into the output; e.g., if the output is FITS, the messages may be rendered as FITS comments.
If the user has requested information about the setup environment (showArguments or ShowSettings), then these are handled by sending the appropriate messages.
The first step in building the query is determining which tables are used in the query. Tables can be either local (i.e., already present in the database) or uploaded. Uploaded tables will be added to the database temporarily for the duration of the query and then deleted afterwards. If there is a user account associated with the query, then the uploaded table can be persisted.
The Controller looks for a TableFinder class to use. Normally this will be the StandardTableFinder, but in principle this can be overridden by the user. This flexibility is built into many of the key operational classes.
The TableFinder looks to the TABLE keyword to find any existing tables that are to be used in the query. It creates a QueryBuilder class that will be used to store information about the query and adds Tables to it that represent each of the user-selected tables. A MetadataInfo object is created that holds all of the metadata the system requires, obtained by querying the system metadata table. The metadata for tables not participating in the query is not included in the MetadataInfo object.
Next come the uploaded tables. These are specified using the UPLOAD keyword. Format-specific classes (TDat, CSV, VOTable or SourceList) implement the Table interface for the input data. A generic TableCopier class then copies this table into the underlying database as a temporary table. Metadata for the uploaded tables is added into the MetadataInfo object.
Once the tables that are to be used in the query are found, work begins on setting up all of the SQL clauses in the query using a QueryConstructor. User-specified constraints are added to the Where clause. These may be explicit constraints using the CONSTRAINT keyword, or user-specified time or positional constraints converted to explicit constraints. Constraints use ParsedExpressions which identify the underlying columns used in each constraint. This allows us to deal with the aliases that tables may have in the query. Generally the first table is aliased to 'a', the second to 'b', and so forth. A TableAliaser manages the mapping between tables and aliases.
Positional constraints involve substantial work. User-specified positions may be in a variety of coordinate systems and may involve source names. The SkyView SourceCoordinates class is used to handle this. DistanceCalculator classes use different ways to specify a maximum radius constraint for a query. This is used both for point sources and cross-correlations.
The output fields for the query may be defaulted or may be explicitly specified.
As each of the user inputs is processed the QueryBuilder object is updated appropriately. E.g., position, time, and explicit constraints are added in separate steps. Any requested sort order is saved. If the user has requested a grouping clause, that can be added.
After all of the user inputs are processed, the QueryBuilder generates the full SQL for the query. This SQL is first tested for validity, and to give an estimate of the time the query will take, using Postgres' explain function. If there are no problems, then the QueryBuilder returns a Query object which is executed to return a Table.
If the user has requested that the results of a query be saved as a user
table, then two SQL statements are actually executed.
The first statement uses a
select into user.xxx ...
to run the query and save the result as a user table. The second query
is just
select * from user.xxx
and displays the results of the table that has just been created.
When the web client starts up, it first queries the Xamin server to get the current list of available tables. It processes any user arguments and then displays the primary Query pane to the user. Users can select tables and add constraints to queries. Depending upon what the user has done, various capabilities are enabled. Eventually the user submits a query. The user's inputs are processed, and a query, in the same format a user may use in the CLI interface, is sent using an XMLHttpRequest to the Xamin server. The results are processed in the web client and displayed to the user in a new pane.
The following are the major elements of the JavaScript interface:
| Key Servlet Packages and Classes | |
|---|---|
| heasarc.xamin | The package containing the primary servlets for both CLI and web requests |
| QueryServlet | The main class for processing web client query requests. It parses the inputs and returns the results as needed |
| CLIServlet | The main class for processing cli queries |
| TableParams | Get the parameters for a given table (or tables) |
| BatchEmul | Emulate Browse batch queries |
| heasarc.xamin.node | Classes for building nodes in the table selection trees including both HEASARC and Vizier tables |
| heasarc.vo | Support Cone and TAP queries using VO protocols |
| TapAsync | Support TAP Asynchronous queries |
| Vosi* | Support TAP VOSI capabilities |
| ADQLServlet | Synchronous ADQL query |
| heasarc.stilts | Invoke STILTS library for plotting |
| heasarc.sources | Invoke coordinate converter to see if token is source name |
| heasarc.downloads | Support download requests |
| Key JSP files | |
| xamin.jsp | The primary file for the Web interface |
| getDoc.jsp | Get the documentation for a table |
| delete/addUser.jsp | Delete or add a user as requested by the login service |
A single metadata table, metainfo, is used to store all metadata for the Xamin system. When a user account is created, an extension metainfo table is created in the schema for the user, and the join of this user table -- which contains metadata specific to user tables -- and the system table is used. The metainfo table uses a generic structure for representing metadata rather than providing specific structured tables. This allows for great flexibility in the growth of the metadata at the cost of some complexity in the queries and in the size of the metadata.
To find information in a table we generally need to know the table, row and column we are interested in. Each row of the metadata table is essentially a table, row, and column identifier followed by the information that belongs there.
There are some additional sources of information that are used in Xamin.
Bibliographic information is stored in the zzbib table, but zzbib is not a special metadata table; it is queryable just as any other table. The only special feature of this table is that it is used in the When clauses of a number of conditional products -- but using the same syntax as other conditional links.
select distinct type from metainfo is the equivalent
of determining what tables are present. Commonly used types include mission, table, product,
linkfield, parameter.
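For example, assuming (hypothetically) that the row identifier and content are held in columns such as name and value alongside type and relation, the metadata for a single table might be pulled with something like:
select relation, value from metainfo where type = 'table' and name = 'rosmaster';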
| Type (table) | Relation (column) | Description |
|---|---|---|
| document | | The document type stores information about a document used somewhere in the system. Currently the only information for a document is its URL, but author, update time, format, ... may be added in the future |
| document | located | The URL of a document |
| linkfield | | A linkfield describes the mapping between the values available in a given table and the variables used in a product template. |
| linkfield | field | The variable name used in a product template description |
| linkfield | value | The SQL used to extract the template value from a row of the table |
| mission | | The mission type is currently mostly a placeholder, but may include information about instruments, PIs, wavelength coverage and more in the future |
| mission | end_time | The end of mission science operations |
| mission | start_time | The start of mission science operations |
| parameter | | The parameter type gives the characteristics for a particular parameter of a given table. |
| parameter | description | A short text description of the parameter |
| parameter | format | The format in which the parameter should be displayed |
| parameter | maximum | The maximum value for the parameter in the table |
| parameter | minimum | The minimum value for the parameter in the table |
| parameter | note | A comment on the parameter |
| parameter | order | The order in which the parameter is to be displayed. If positive then the parameter is a default/standard parameter. If negative, then when all parameters are displayed the order should be the absolute value of this parameter, except that values of 0 come last. |
| parameter | otype | The type of the parameter in the Browse system |
| parameter | ucd | The uniform content descriptor for the parameter |
| parameter | unit | The unit[s] of the parameter |
| product | | The product type is used to describe a data product, which may be an archival product, bibliographic reference, table link or generic external product. |
| product | description | A short text description of the product |
| product | includes | A link to a child product |
| product | links | Used to link the generic products |
| product | match | A template string used to find files or URLs for a product given values in a row |
| product | type | The general type of the product (e.g., FITS, ASCII, HTML, ...) |
| table | | This type describes the characteristics of a particular table in the database. |
| table | author | The author of the table |
| table | bibcode | The bibcode for a paper describing the table |
| table | containedBy | A mission which includes this table (may be repeated) |
| table | contains | A link to a parameter entry |
| table | defaultSearchRadius | The default radius for position queries of this table |
| table | describedBy | A document describing the table |
| table | description | A short text description of the table |
| table | end_time | The column name for the column containing the end time for a table |
| table | includesTypes | Object types described by this table |
| table | j2000coordinates | A pair of columns giving the J2000 coordinates for the row |
| table | lastUpdated | The date when this table was last updated |
| table | priority | A numeric priority associated with the table |
| table | regime | The regime of the table (may be repeated) |
| table | start_time | A column name for the start time associated with each row |
| table | tableType | The kind of table (observation, object, ...) |
| tablelink | | A tablelink maps a table to a product. It indicates the linkfield mappings that are to be used in finding the products for a given row. |
| tablelink | linkfield | A linkfield used in this link from a table to a product |
| tablelink | product | The name of the product being linked to |
| tablelink | table | The name of the table being linked from |
| tablelink | when | A conditional expression indicating if this link should be made for the current row |
| user | | User entries are only found in the user metainfo tables, not in the primary metainfo table. Exactly one user name should be found in each user metainfo table. |
| user | email | The E-mail address of the user. |
| user | password | An encrypted version of the user password (if supplied). |
| user | status | The status of the account: active/inactive/validating |
| Name | Table/View | Description |
|---|---|---|
| pos_big_prime | Table | Weekly positions where the position error is > 1 degree. |
| pos_small_prime | Table | Weekly positions where the position error is < 1 degree. |
| pos_big_bck | Table | Incremental positions where the position error is > 1 degree. |
| pos_small_bck | Table | Incremental positions where the position error is < 1 degree. |
| pos_big | View | Current positions where the position error is > 1 degree. |
| pos_small | View | Current positions where the position error is < 1 degree. |
| pos_time_big_prime | Table | Weekly positions/times where the position error is > 1 degree. |
| pos_time_small_prime | Table | Weekly positions/times where the position error is < 1 degree. |
| pos_time_big_bck | Table | Incremental positions/times where the position error is > 1 degree. |
| pos_time_small_bck | Table | Incremental positions/times where the position error is < 1 degree. |
| pos_time_big | View | Current positions/times where the position error is > 1 degree. |
| pos_time_small | View | Current positions/times where the position error is < 1 degree. |
| time_prime | Table | Weekly times |
| time_bck | Table | Incremental times |
| time | View | Current times |
| update_list | Table | Tables that have been updated since weekly cleanup |
To make positional queries with default search radii efficient, we separate out the tables with relatively small default search radii and index them in the pos_small master tables. Those with large search radii are placed in pos_big. When we want to query all of the tables we run two queries (as a union). The query of the pos_small tables includes a constraint using the maximum default search radius that can take advantage of the dec index. The query of the pos_big tables does not include this additional constraint, but this table is much smaller since only a very few tables have such large error bars.
Each master view xx is then essentially defined as:
select AA from xx_prime
where table_name not in (select table_name from update_list) and BB
union
select AA from xx_bck where BB
By referring to the views, the users of the master tables are isolated
from the update process.
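For example, a query such as the following runs against the current data without any knowledge of the _prime/_bck split (here "dec" is the indexed declination column of the positional masters; other column names may differ):
select distinct table_name from pos_small where "dec" between 29.5 and 30.5;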
The operations settings must be sourced before the cron jobs are executed.
The scripts are given with respect to xamin/WEB-INF/classes/ops. Logs are generally kept in /www/server/xamin/ops/logs.
| Process | Frequency | Initiation | Script | Description | Log |
|---|---|---|---|---|---|
| Update tables | Every 12 hours | Automatic | ingest/runfullcopy | Update HEASARC tables in the Postgres database. Also update HEASARC and Vizier documentation. | xamin.ingest.DATE |
| Update documentation table | Every 12 hours | Automatic | Included in runfullcopy | Update the table documents table. | See above |
| Update master tables | As needed | Automatic | Included in runfullcopy | When tables are updated, add position/time information to the master _bck tables. | See above |
| Update bibliography tables | As needed | Automatic | Included in runfullcopy | Add entries to bibliography tables. As far as Xamin is concerned this is just a table update, but the table is maintained by the HEASARC. This happens automatically whenever runfullcopy sees the zzbib.tdat file updated. | See above |
| Clean master tables | Weekly | Automatic | master/buildFullMasters.pl | Put all master table information into _prime tables. | fullMasters.DATE |
| TAP async user tables/jobs cleanup | Daily | Automatic | daily/cleanasync.pl | Clean up all information on jobs in the TAP_ASYNC schema that are past their destruction time. | cleanup.log.DATE |
| User usage review | Weekly | Automatic | ops/daily/sizeReport | Review disk usage of all users. This will also show TAP asynchronous and master table usage. The output of this cron is not logged, so it is sent as a mail message to the cron initiator. | Cron user e-mail |
| Run table monitor | Daily | Automatic | daily/monitor.pl | See if the queries that have created user tables would now give different results and update/notify tables/users as requested. | monitor.log.DATE |
| HTTP logs | Daily? | Automatic | TBD | Copy TOMCAT http logs -- may want to join with Apache logs | TBD |
| User backup | Daily | Automatic | daily/userbck.pl | Backup Xamin user account information. The HEASARC's TDAT files serve as a backup for the science tables so a backup of those is not required. | userbck.log.DATE |
| Testing | Hourly | Automatic | tests/batchTest.pl | Run tests that show that system is operational. At least one test (comprising multiple queries) is run each hour. Every table is queried at least once each day. | autochecks.log |
| Update tables test | Daily | Automatic | tests/makeTables.pl | Since the tables in the Xamin system can change we need to keep the tables test up to date. This updates tests/tables.tests. | Not logged |
| Statistics | Quarterly | Manual | TBD | Compute statistics including at least: ... | NA |
| Compare Xamin and Browse | Weekly | Automatic | ingest/compareXaminBrowse | Ensure that the tables seen in Xamin and Browse are consistent. | Cron user e-mail |
| Delete tables from Xamin | As needed | Manual | ingest/deleteTable | Delete a table that is no longer used in Xamin. | delete.TABLE |
| Add user schema | As requested by srvlogin | Automatic | xamin/jsp/createUser.jsp | Add a schema for a new user account. This script is invoked by the HEASARC login service and is not externally invokable. | Not logged |
| Delete user schema | As requested by srvlogin | Automatic | xamin/jsp/deleteUser.jsp | Delete a schema for a user account. This script is invoked by the HEASARC login service and is not externally invokable. | Not logged |
| Add/update data products | As needed | Manual | products/updateProducts | When new products are defined, the mission products definition file in products/mission/MISSION.xml must be created or updated. This script reads this file and updates the metadata table to define the products. | Not logged |
| Time | Event |
|---|---|
| 00:50 | Test: basic tests |
| 01:50 | Test: infrastructure tests |
| 02:30 | Clean asynchronous |
| 02:50 | Test: basic tests |
| 03:00 | User backups |
| 03:10 | Daily monitor |
| 03:30 Tu | Build full masters |
| 03:50 | Test: infrastructure tests |
| 04:50 | Test: basic tests |
| 05:00 | Table ingest |
| 05:50 | Test: infrastructure tests |
| 06:50 | Test: basic tests |
| 07:50 | Test: infrastructure tests |
| 08:50 | Test: basic tests |
| 09:50 | Test: infrastructure tests |
| 10:50 | Test: basic tests |
| 11:50 | Test: infrastructure tests |
| 12:50 | Test: basic tests |
| 13:50 | Test: infrastructure tests |
| 14:50 | Test: basic tests |
| 15:50 | Test: infrastructure tests |
| 16:50 | Test: basic tests |
| 17:00 | Table ingest |
| 17:50 | Test: infrastructure tests |
| 18:50 | Test: basic tests |
| 19:50 | Test: infrastructure tests |
| 20:50 | Test: basic tests |
| 21:00 | Update tables test |
| 21:00 Th | Generate user size report |
| 21:50 | Test: infrastructure tests |
| 22:00 | Compare Xamin and Browse |
| 22:10 | Test: table tests |
| 22:50 | Test: basic tests |
| 23:50 | Test: infrastructure tests |
This work was authored by Timothy Gerard Endres, time@gjt.org.
This work has been placed into the public domain.
You are free to use this work in any way you wish.