A parameter that is common to two base tables can be cross-correlated to find
all matches within a specified range. A good example of this is using
coordinates to find all matching sources in two databases based on
positional coincidences within a specified radius. The result of such a
cross-correlation in BROWSE is to link records in one database table with one
or more matching records in another. Once such a relation is established, it is
possible to plot, list, or otherwise access parameters from the two different
database tables.
A related concept is that in which a database table is cross-correlated
against itself using an allowed range for a specified parameter. Such an
auto-correlation will find all matches within the given range and can be
used to "compress" a database table to show only unique matches. Again,
coordinates provide a good example where all of the unique objects in a
database can be obtained by auto-correlation on position within a particular
radius.
The cross and compress commands are used to make these cross- and
auto-correlations.
The current sample or subsample can be auto-correlated using compress and
specifying a parameter and range to find all matches within the specified range.
Important Note: The compression parameter must be indexed.
If the compression is to be done using coordinates, then it is sufficient to
use only the /radius=min.xx qualifier where min.xx is the cone radius in arc
minutes, or /sradius=sec.xx where sec.xx is in arc seconds. Similarly,
specification of the time compression parameter and range is combined to be
/time=day.xx where day.xx is the range in time specified in days.
For other parameters the parameter and range is specified with the
/param=name and /range=x.x qualifiers. If a parameter contains a character
string, then specifying a range is not meaningful and only exact matches are
taken.
Upon completion of the compression, a new subsample is created containing only
one entry for each set of matched entries.
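The idea behind a positional compress can be illustrated outside BROWSE with a short Python sketch. It is not the BROWSE implementation: the function names `sep_arcsec` and `compress`, the brute-force grouping against the first member of each group, and the sample positions are all assumptions made for illustration. BROWSE itself requires an indexed parameter, which suggests a faster lookup than the pairwise loop shown here.

```python
import math

def sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; inputs are in decimal degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def compress(positions, radius_arcsec):
    """Group positions that fall within radius_arcsec of a group's first member.

    Returns a list of groups, each a list of indices into positions.
    """
    groups = []
    for i, (ra, dec) in enumerate(positions):
        for group in groups:
            ra0, dec0 = positions[group[0]]
            if sep_arcsec(ra, dec, ra0, dec0) <= radius_arcsec:
                group.append(i)
                break
        else:
            groups.append([i])  # no existing group is close enough
    return groups

# Hypothetical positions (RA, Dec in degrees): two detections of one source
# a few arc seconds apart, plus an unrelated source.
positions = [(9.8100, -79.5133), (9.8133, -79.5125), (100.0, 10.0)]
groups = compress(positions, radius_arcsec=15.0)
unique = [g[0] for g in groups]  # keep the first-found entry of each group
print(len(unique), "distinct entries found out of a total of", len(positions))
```

Keeping the first-found entry of each group mirrors the default behaviour described for compress when no /min, /max, or /mean qualifier is given.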
The following example compresses all entries in the Einstein IPC data
using a 45 arc second cone radius:
IPC_TOTAL_DEC > comp/srad=45
4956 distinct entries found out of a total of 5958
IPC_TOTAL_DEC 4956>
The /list qualifier gives a line summary output of the matched entries
with a break after each set. This break can be suppressed using the /full
qualifier. In the following example, all unique sources within 15 arc seconds
are found and listed for the EXOSAT LE database:
CMA_SOURCES_DEC > comp/srad=15/list
Source no: 1
name seq off expos time count filt ra dec inst
(field name) (#) (min) (sec) (yr.day) rate (1950) (1950)
1 ESO012-G2 1515 2 6674 85.108 6.0E-03 6 0 39 14.4 -79 30 48 L1
2 ESO012-G2 1515 2 5716 85.108 1.6E-02 7 0 39 15.2 -79 30 45 L1
<CR> continues, to exit type any character <CR> or Ctrlz
Statistics can be accumulated on a specified statistics parameter, which can be
different from the compression parameter. This is specified using the
/stats_par=name qualifier. Statistics containing the minimum, maximum, mean,
sigma deviation, and number of occurrences are summarized for each matched
set of entries using the /summary qualifier. In addition, the entry used to
represent this matched set in the subsample can be selected to be the minimum,
maximum, or the closest to the average value of the specified statistics
parameter using /min, /max, or /mean (otherwise the first found record
is used).
Using the same EXOSAT LE example as before, the statistics are summarized and
the subsample is left with the following maximum values:
CMA_SOURCES_DEC > comp/srad=40/stats=count/summary/max
Statistics for parameter COUNT RATE
    Name        Number  Mean       Sigma      Min        Max
  1 ESO012-G2        2  1.102E-02  3.559E-03  5.989E-03  1.606E-02
  2 ZCHA             1  2.877E-03             2.877E-03  2.877E-03
  3 HD32918          2  1.336E-02  2.941E-03  9.200E-03  1.752E-02
  4 HD32918          1  8.194E-03             8.194E-03  8.194E-03
  5 PKS0637-75       2  4.221E-03  5.315E-04  3.469E-03  4.972E-03
  6 E0003.0-74       1  1.768E-03             1.768E-03  1.768E-03
  7 SMCX-1          18  0.149      1.664E-02  2.000E-02  0.248
  8 E0101.3-73       1  1.719E-02             1.719E-02  1.719E-02
  etc.
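The per-set statistics and the choice of a representative entry can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `summarize` and `representative` are hypothetical, the count rates are made-up values, and the sigma shown is the sample standard deviation. The guide does not state which definition of sigma BROWSE uses, so the Sigma column in the listing above need not be reproduced by this sketch.

```python
import statistics

def summarize(values):
    """Summary statistics for one matched set of a statistics parameter."""
    n = len(values)
    return {
        "number": n,
        "mean": statistics.fmean(values),
        # Assumption: sample standard deviation; undefined for a single value.
        "sigma": statistics.stdev(values) if n > 1 else None,
        "min": min(values),
        "max": max(values),
    }

def representative(values, mode="first"):
    """Index of the entry chosen to represent a matched set."""
    indices = range(len(values))
    if mode == "min":
        return min(indices, key=lambda i: values[i])
    if mode == "max":
        return max(indices, key=lambda i: values[i])
    if mode == "mean":
        mean = statistics.fmean(values)
        return min(indices, key=lambda i: abs(values[i] - mean))
    return 0  # default: the first record found

# Hypothetical count rates for one matched set of three entries.
rates = [1.0, 2.0, 6.0]
print(summarize(rates))
print("entry kept with /max :", representative(rates, "max"))
print("entry kept with /mean:", representative(rates, "mean"))
```

The `mode` argument stands in for the /min, /max, and /mean qualifiers; with no qualifier the first record of the set is kept.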
Relations can be established between the various base tables by using the
cross command. The database against which the cross-correlation is to be
made is specified using the /db=name qualifier.
Important note: The parameter to be cross-correlated must be indexed in BOTH
databases.
If the cross-correlation is to be made using coordinates, it is sufficient
to give only the /radius=min.xx qualifier where min.xx is the cone radius in
arc minutes, or /sradius=sec.xx where sec.xx is in arc seconds. Similarly,
specification of the time parameter and range is combined to be
/time=day.xx where day.xx is the range in time specified in days. Any
other parameter and range is specified using the /param=name
and /range=x.x qualifiers. For a parameter containing a character string, the
range cannot be specified and only exact matches are taken.
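The matching rule for a single parameter can be written down as a small Python sketch. The function name `values_match` is hypothetical, and treating the range as a symmetric tolerance around the value is an assumption, since the guide does not say whether /range is a half-width or a full width.

```python
def values_match(a, b, rng=None):
    """Return True if two parameter values count as a match.

    Character strings match only when identical; numbers match when they
    differ by no more than rng (assumed here to be a symmetric tolerance).
    """
    if isinstance(a, str) or isinstance(b, str):
        return a == b           # strings: exact matches only
    if rng is None:
        return a == b           # no range given: require equality
    return abs(a - b) <= rng    # numbers: within the allowed range

print(values_match("SMCX-1", "SMCX-1"))    # identical strings
print(values_match("SMCX-1", "SMC X-1"))   # differing strings
print(values_match(10.0, 10.4, rng=0.5))   # numeric values inside the range
```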
In the following example, the EXOSAT LE database is cross-correlated against
the Einstein IPC using a cone radius of 40 arc seconds:
CMA_SOURCES_DEC > cross/db=ipc/srad=40
In database IPC sample TOTAL
3143 out of 5958 entries found
757 have more than one match per entry
using coordinates with a cone radius of 40.00 arc sec
and database LE sample SOURCES where
1498 out of 2418 entries matched
CMA_IPC[dec] 3143>
Further details about the matched entries in each correlation are obtained
with the /list option. This produces a line summary of the
matched entries. For the above example:
CMA_SOURCES_DEC > cross/db=ipc/srad=40/list
Match no: 1
Matching coordinates with a cone radius of 40.00 arc sec
for the following entry in database LE sample SOURCES
name seq off expos time count filt ra dec inst
(field name) (#) (min) (sec) (yr.day) rate (1950) (1950)
7 PKS0637-75 1027 1 14340 84.252 5.0E-03 7 6 37 23.7 -75 13 44 L1
The following correlation has been found in database IPC sample TOTAL
Source name ra(1950) dec(1950) count rate error snr har seq # off
1 PKS 0637-75 06 37 26.7 -75 13 38 0.149 5.3E-03 28.1 0.5 8494 0.4
2 PKS 0637-75 06 37 24.1 -75 13 34 0.262 1.6E-02 16.2 0.4 5404 0.2
<CR> continues, to exit type any character <CR> or Ctrlz
To check the probability that the matches found are due to chance,
the /shift=x.y qualifier is provided to "shift" the value of the
parameter to be matched by x.y. This is particularly useful for
coordinate searches, where the number of chance matches can be significant. In
this case, the RA is shifted by x.y degrees.
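The reasoning behind the shift test can be illustrated with a Python sketch: count the matches at the true positions, then count again after offsetting the RA of one table, so that any matches that remain can only be chance coincidences. The function names, the choice to shift the first table, and the sample positions are assumptions made for illustration and are not taken from BROWSE.

```python
import math

def sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; inputs are in decimal degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def count_matches(table_a, table_b, radius_arcsec, ra_shift_deg=0.0):
    """Count entries of table_a with at least one table_b entry in the cone.

    Assumption: the shift is applied to the RA of table_a only.
    """
    n = 0
    for ra, dec in table_a:
        ra_shifted = (ra + ra_shift_deg) % 360.0
        if any(sep_arcsec(ra_shifted, dec, rb, db) <= radius_arcsec
               for rb, db in table_b):
            n += 1
    return n

# Hypothetical positions (RA, Dec in degrees).
table_a = [(10.0, 20.0), (50.0, -30.0)]
table_b = [(10.001, 20.0), (50.0, -30.002)]

real = count_matches(table_a, table_b, radius_arcsec=40.0)
chance = count_matches(table_a, table_b, radius_arcsec=40.0, ra_shift_deg=1.0)
print("matches at true positions:", real)
print("matches after 1 degree RA shift:", chance)
```

Comparing the two counts gives a rough estimate of how many of the real matches could be accidental.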
When the cross-correlation is completed, BROWSE will leave the two databases
open so that the parameters in correlated records from either database can be
displayed. To alert the user to this new state, the BROWSE prompt will change to
CMA_IPC[dec]. This shows the names of the two open databases CMA_IPC, with
the current database in upper case. Also given is the name of the correlation
parameter in brackets [dec].
Only those entries which matched in each database are left as the subsample.
Several entries in the IPC database may match a single entry in the LE
database; in the IPC/LE case above this happened for 757 entries. Multiple
matches are preserved by increasing the number of "apparent" entries in the LE
database so that each matched IPC record has a corresponding LE record. This
results in duplicate LE entries in the subsample. Multiple matches can lead to
strange effects: in the above example there are apparently more entries in
the subsample than in the original LE sample. It is usually wise to remove
multiple entries for each source in both databases. A compress can be run
before a cross to remove multiple entries in a database.
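How multiple matches inflate the subsample can be seen in a small Python sketch of a cone cross-correlation that returns one row per matched pair. The names `cross` and `summarize_pairs`, the brute-force pairing, and the positions are hypothetical; the counts it reports are analogous to, but not guaranteed to be defined identically to, those printed by BROWSE.

```python
import math
from collections import Counter

def sep_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; inputs are in decimal degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def cross(primary, other, radius_arcsec):
    """Return (i, j) pairs where other[j] lies within the cone of primary[i]."""
    return [(i, j)
            for i, p in enumerate(primary)
            for j, q in enumerate(other)
            if sep_arcsec(*p, *q) <= radius_arcsec]

def summarize_pairs(pairs):
    """Counts describing a correlation, one row per matched pair."""
    per_primary = Counter(i for i, _ in pairs)
    return {
        "rows": len(pairs),
        "primary_matched": len(per_primary),
        "primary_multi": sum(1 for c in per_primary.values() if c > 1),
        "other_matched": len({j for _, j in pairs}),
    }

# Hypothetical positions: one LE source with two nearby IPC detections.
le = [(99.35, -75.2289)]
ipc = [(99.36, -75.2272), (99.35, -75.2261), (200.0, 0.0)]

pairs = cross(le, ipc, radius_arcsec=40.0)
print(pairs)
print(summarize_pairs(pairs))
```

The single LE entry appears in two rows of the result, which is why a correlated subsample can seem larger than the sample it came from.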
Once a correlation is established, all of the BROWSE commands can be used,
including the search commands. The commands will by default operate on the
current database, which can be changed using the cdb command.
The entry number given with the display commands (such as dline or dsam)
will be the same for both databases and can be used to connect the listings
obtained for the two databases. In the following example, the IPC and
LE line summaries are displayed for the first six entries:
When a correlation has been established between two base tables, the dcoord
command with the /separation qualifier can be used to list the separation
between the positions given in the first database and those in the correlated
database.
To display a parameter which is not from the current database, its name must be
prefixed with the name of the database, with a "dot" between the database name
and the parameter name. For example, to plot the LE against IPC count rates use
CMA_IPC[DEC] 345> ps le.count ipc.count
Similarly, the ASCII table commands mat and dat can be used to
list parameters from the two databases.
Further cross-correlations can be made up to a maximum of five open databases.
A correlation can be written to a sample so that it can be retrieved at a
later date. This is done using msam name, where name is the name of the
"correlation sample".
A correlation sample can be loaded at any time using the csam (change
sample) command, just like any normal sample. Correlation samples that have
been made are listed using the lsam command.
If only a sample containing the entries from the current database is required,
then the /no_correlation qualifier is used.
In some cases it may be necessary to cross-correlate using two parameters,
such as time and coordinates. As many as three parameters can be specified
simultaneously and BROWSE will find all matches within each specified range.
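A multi-parameter match can be sketched in Python as a test that every criterion is satisfied at once. The names `multi_match` and `cross_multi`, the record layout, and the values are hypothetical, and every parameter is treated as a plain number with a symmetric tolerance. A coordinate criterion would need a cone test in place of the simple difference used here.

```python
MAX_PARAMS = 3  # limit on simultaneous parameters quoted in the text

def multi_match(rec_a, rec_b, criteria):
    """True if rec_a and rec_b agree within the range of every criterion.

    criteria is a list of (parameter_name, allowed_range) pairs.
    """
    if not 1 <= len(criteria) <= MAX_PARAMS:
        raise ValueError("between 1 and 3 parameters may be specified")
    return all(abs(rec_a[name] - rec_b[name]) <= rng
               for name, rng in criteria)

def cross_multi(table_a, table_b, criteria):
    """Return (i, j) index pairs of records matching on all criteria."""
    return [(i, j)
            for i, a in enumerate(table_a)
            for j, b in enumerate(table_b)
            if multi_match(a, b, criteria)]

# Hypothetical records with a time (days) and a second numeric parameter.
table_a = [{"time": 100.0, "value": 5.0}, {"time": 200.0, "value": 9.0}]
table_b = [{"time": 100.3, "value": 5.2}, {"time": 200.1, "value": 2.0}]

criteria = [("time", 0.5), ("value", 0.5)]
print(cross_multi(table_a, table_b, criteria))
```

The second pair agrees in time but not in the second parameter, so it is rejected; requiring all ranges to hold narrows the result compared with a single-parameter match.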
Michael Arida 1998-04-10