Next: SQL Interface Up: HEASARC Users Guide Previous: Data Products


BROWSE Correlations

This chapter describes how to cross-correlate and auto-correlate (compress) database tables in BROWSE.

Overview

A parameter that is common to two base tables can be cross-correlated to find all matches within a specified range. A good example is using coordinates to find all matching sources in two databases, based on positional coincidence within a specified radius. A cross-correlation in BROWSE links records in one database table with one or more matching records in another. Once such a relation is established, it is possible to plot, list, or otherwise access parameters from the two different database tables.

A related concept is the auto-correlation, in which a database table is cross-correlated against itself using an allowed range for a specified parameter. An auto-correlation finds all matches within the given range and can be used to "compress" a database table so that it shows only unique entries. Again, coordinates provide a good example: all of the unique objects in a database can be obtained by auto-correlating on position within a particular radius. The cross and compress commands are used to make these cross- and auto-correlations.

Compression

The current sample or subsample can be auto-correlated using the compress command, specifying a parameter and a range within which entries are considered to match. Important note: The compression parameter must be indexed.

Parameter Specification

If the compression is to be done using coordinates, then it is sufficient to use only the /radius=min.xx qualifier, where min.xx is the cone radius in arc minutes, or /sradius=sec.xx, where sec.xx is in arc seconds. Similarly, the time compression parameter and its range are combined into /time=day.xx, where day.xx is the range in time specified in days. For other parameters, the parameter and range are specified with the /param=name and /range=x.x qualifiers. If a parameter contains a character string, then specifying a range is not meaningful and only exact matches are taken. Upon completion of the compression, a new subsample is created containing only one entry for each set of matched entries. The following example compresses all entries in the Einstein IPC data using a 45 arc second cone radius:


IPC_TOTAL_DEC > comp/srad=45
     4956 distinct entries found out of a total of     5958
IPC_TOTAL_DEC 4956>
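
The idea behind a positional compression can be sketched in a few lines of Python. This is only an illustration of the concept, not the BROWSE implementation: the function names, the greedy "keep the first entry found" grouping, and the sample positions are all assumptions made for the example.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def compress(entries, radius_arcsec):
    """Keep the first entry found in each set of positional matches (assumed rule)."""
    kept = []
    for entry in entries:
        is_duplicate = any(
            separation_arcsec(entry["ra"], entry["dec"], k["ra"], k["dec"]) <= radius_arcsec
            for k in kept
        )
        if not is_duplicate:
            kept.append(entry)
    return kept


# Made-up positions in degrees: A and B lie 10 arc seconds apart, C is far away.
sample = [
    {"name": "A", "ra": 10.0, "dec": -30.0},
    {"name": "B", "ra": 10.0, "dec": -30.0 + 10.0 / 3600.0},
    {"name": "C", "ra": 50.0, "dec": 20.0},
]
unique = compress(sample, radius_arcsec=45.0)
print(f"{len(unique)} distinct entries found out of a total of {len(sample)}")
```

The sketch compares every entry against every kept entry, which is fine for a handful of rows; a real database would presumably rely on the index that the compression parameter is required to have.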

Listing

The /list qualifier gives a line summary output of the matched entries, with a break after each set. This break can be suppressed using the /full qualifier. In the following example, all unique sources within 15 arc seconds are found and listed for the EXOSAT LE database:


CMA_SOURCES_DEC > comp/srad=15/list
 
Source no:      1
 
     name       seq   off  expos   time    count filt   ra        dec   inst
 (field name)   (#)  (min) (sec) (yr.day)   rate      (1950)    (1950)
 
 1 ESO012-G2   1515    2    6674  85.108  6.0E-03  6  0 39 14.4 -79 30 48 L1
 2 ESO012-G2   1515    2    5716  85.108  1.6E-02  7  0 39 15.2 -79 30 45 L1
 
 <CR> continues, to exit type any character <CR> or Ctrlz

Statistics

Statistics can be accumulated on a specified statistics parameter, which can be different from the compression parameter. This is specified using the /stats_par=name qualifier. The /summary qualifier summarizes the minimum, maximum, mean, sigma, and number of occurrences for each matched set of entries. In addition, the entry used to represent each matched set in the subsample can be selected to be the one with the minimum, the maximum, or the closest-to-average value of the statistics parameter using /min, /max, or /mean (otherwise the first record found is used). In the following example, again on the EXOSAT LE database but with a 40 arc second radius, the count rate statistics are summarized and the subsample is left with the maximum-value entry of each set:


CMA_SOURCES_DEC > comp/srad=40/stats=count/summary/max
 
Statistics for parameter COUNT RATE
 
       Name             Number     Mean     Sigma       Min       Max
 
   1 ESO012-G2               2  1.102E-02   3.559E-03  5.989E-03  1.606E-02
   2 ZCHA                    1  2.877E-03              2.877E-03  2.877E-03
   3 HD32918                 2  1.336E-02   2.941E-03  9.200E-03  1.752E-02
   4 HD32918                 1  8.194E-03              8.194E-03  8.194E-03
   5 PKS0637-75              2  4.221E-03   5.315E-04  3.469E-03  4.972E-03
   6 E0003.0-74              1  1.768E-03              1.768E-03  1.768E-03
   7 SMCX-1                 18  0.149       1.664E-02  2.000E-02  0.248
   8 E0101.3-73              1  1.719E-02              1.719E-02  1.719E-02
 
etc.
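
As a rough illustration of what is accumulated for each matched set, the following Python sketch summarizes one group and picks its representative entry. The function names and sample values are hypothetical, and the sketch uses the population standard deviation for sigma as an assumption; the exact definition of sigma used by BROWSE is not stated here and may differ.

```python
import statistics


def summarize(values):
    """Number, mean, sigma, min and max for one matched set of values."""
    values = list(values)
    return {
        "number": len(values),
        "mean": statistics.fmean(values),
        # Assumption: population standard deviation; left empty for a single entry.
        "sigma": statistics.pstdev(values) if len(values) > 1 else None,
        "min": min(values),
        "max": max(values),
    }


def representative(group, key, pick="first"):
    """Choose the entry that stands for the whole matched set."""
    if pick == "min":
        return min(group, key=lambda e: e[key])
    if pick == "max":
        return max(group, key=lambda e: e[key])
    if pick == "mean":
        mean = statistics.fmean([e[key] for e in group])
        return min(group, key=lambda e: abs(e[key] - mean))
    return group[0]  # default: the first record found


# Made-up matched set of three observations of one source.
group = [
    {"seq": 1, "count": 1.0},
    {"seq": 2, "count": 4.0},
    {"seq": 3, "count": 2.0},
]
summary = summarize(e["count"] for e in group)
brightest = representative(group, "count", pick="max")
print(summary["number"], summary["min"], summary["max"], brightest["seq"])
```

Keeping the summary and the choice of representative separate mirrors the way /summary and /min, /max, or /mean are independent qualifiers.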

Cross-Correlations

Relations can be established between the various base tables by using the cross command. The database against which the cross-correlation is to be made is specified using the /db=name qualifier. Important note: The parameter to be cross-correlated must be indexed in BOTH databases.

Establishing

If the cross-correlation is to be made using coordinates, it is sufficient to give only the /radius=min.xx qualifier, where min.xx is the cone radius in arc minutes, or /sradius=sec.xx, where sec.xx is in arc seconds. Similarly, the time parameter and its range are combined into /time=day.xx, where day.xx is the range in time specified in days. Any other parameter and range are specified using the /param=name and /range=x.x qualifiers. For a parameter containing a character string, the range cannot be specified and only exact matches are taken. In the following example, the EXOSAT LE database is cross-correlated against the Einstein IPC using a cone radius of 40 arc seconds:


CMA_SOURCES_DEC > cross/db=ipc/srad=40
 
In database IPC sample TOTAL
 
     3143 out of     5958 entries found
      757 have more than one match per entry
 
using coordinates with a cone radius of  40.00 arc sec
 
and database LE sample SOURCES where
 
     1498 out of     2418 entries matched
 
CMA_IPC[dec] 3143>
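
A positional cross-correlation can be pictured as building a list of matching pairs and then counting how many entries on each side took part. The Python sketch below is a minimal illustration under that assumption; the function names and the sample positions are invented, and it is not a description of how BROWSE computes its own counts.

```python
import math
from collections import Counter


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match(primary, other, radius_arcsec):
    """Return (i, j) index pairs of entries that lie within the cone radius."""
    pairs = []
    for i, p in enumerate(primary):
        for j, o in enumerate(other):
            if separation_arcsec(p["ra"], p["dec"], o["ra"], o["dec"]) <= radius_arcsec:
                pairs.append((i, j))
    return pairs


# Made-up positions in degrees.
le = [
    {"ra": 10.0, "dec": -30.0},
    {"ra": 50.0, "dec": 20.0},
    {"ra": 100.0, "dec": 5.0},
]
ipc = [
    {"ra": 10.0, "dec": -30.0 + 5.0 / 3600.0},
    {"ra": 10.0, "dec": -30.0 - 8.0 / 3600.0},
    {"ra": 50.0, "dec": 20.0 + 20.0 / 3600.0},
    {"ra": 200.0, "dec": 0.0},
]
pairs = cross_match(le, ipc, radius_arcsec=40.0)
per_le_entry = Counter(i for i, _ in pairs)
le_matched = len(per_le_entry)
le_multiple = sum(1 for n in per_le_entry.values() if n > 1)
ipc_matched = len({j for _, j in pairs})
print(pairs, le_matched, le_multiple, ipc_matched)
```

Because each pair becomes one row, an entry with several counterparts appears once per counterpart, so the pair list can be longer than the number of distinct matched entries on either side.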

Listing

Further details about the matched entries in each correlation are obtained with the /list option, which produces a line summary of the matched entries. For the above example:


CMA_SOURCES_DEC > cross/db=ipc/srad=40/list
 
Match no:      1
 
Matching coordinates with a cone radius of  40.00 arc sec
for the following entry in database LE sample SOURCES
 
     name       seq   off  expos   time    count filt   ra        dec   inst
 (field name)   (#)  (min) (sec) (yr.day)   rate      (1950)    (1950)
 
 7 PKS0637-75  1027    1   14340  84.252  5.0E-03  7  6 37 23.7 -75 13 44 L1
 
The following correlation has been found in database IPC sample TOTAL
 
    Source name    ra(1950)   dec(1950)  count rate error  snr  har seq # off
 
 1 PKS 0637-75   06 37 26.7 -75 13 38 0.149     5.3E-03  28.1  0.5 8494  0.4
 2 PKS 0637-75   06 37 24.1 -75 13 34 0.262     1.6E-02  16.2  0.4 5404  0.2
 
 <CR> continues, to exit type any character <CR> or Ctrlz

Shift

To check the probability that the matches found are due to chance, the /shift=x.y qualifier is provided to "shift" the value of the parameter to be matched by x.y. This is particularly useful for coordinate searches, where the number of chance matches can be significant. In this case, the RA is shifted by x.y degrees.
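
The reasoning behind a shift test is that real associations disappear once one set of positions is displaced, so any matches that remain give an estimate of the chance coincidence rate. The Python sketch below illustrates this with invented positions; the function names are hypothetical, and the choice of which table to shift is an assumption of the sketch rather than a statement about BROWSE.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def count_matches(primary, other, radius_arcsec, shift_deg=0.0):
    """Count matching pairs after shifting the RA of the primary entries."""
    count = 0
    for p in primary:
        shifted_ra = (p["ra"] + shift_deg) % 360.0  # wrap around at 360 degrees
        for o in other:
            if separation_arcsec(shifted_ra, p["dec"], o["ra"], o["dec"]) <= radius_arcsec:
                count += 1
    return count


# Made-up positions in degrees.
le = [{"ra": 10.0, "dec": -30.0}, {"ra": 50.0, "dec": 20.0}]
ipc = [
    {"ra": 10.0, "dec": -30.0 + 5.0 / 3600.0},
    {"ra": 10.0, "dec": -30.0 - 8.0 / 3600.0},
    {"ra": 50.0, "dec": 20.0 + 20.0 / 3600.0},
]
real = count_matches(le, ipc, radius_arcsec=40.0)
chance = count_matches(le, ipc, radius_arcsec=40.0, shift_deg=1.0)
print(f"{real} matches unshifted, {chance} after a 1 degree RA shift")
```

Comparing the two counts gives a feel for how many of the unshifted matches could be accidental; with a crowded field, the shifted count would not drop to zero.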

Using

When the cross-correlation is completed, BROWSE leaves the two databases open so that the parameters in correlated records from either database can be displayed. To alert the user to this new state, the BROWSE prompt changes, in this example to CMA_IPC[dec]. The prompt shows the names of the two open databases, CMA and IPC, with the current database in upper case, followed by the name of the correlation parameter in brackets, [dec]. Only those entries which matched in each database are left as the subsample.

There may be several entries in the IPC database which match a single entry in the LE database; in the IPC/LE case this happened for 757 entries. Multiple matches are preserved by increasing the number of "apparent" entries in the LE so that each matched IPC record has a corresponding LE record. This results in duplicate LE entries in the subsample. Multiple matches can lead to strange effects: in the above example there are apparently more entries in the subsample than in the original LE sample. It is usually wise to remove multiple entries for each source in both databases, and a compress can be run before a cross to do so.

Once a correlation is established, all of the BROWSE commands can be used, including the search commands. By default the commands operate on the current database, which can be changed using the cdb command. The entry number given with the display commands (such as dline or dsam) is the same for both databases and can be used to connect the listings obtained for the two databases. In the following example, the IPC and LE line summaries are displayed for the first six entries:


CMA_IPC[dec] 3143> dl 1-6
 
    Source name    ra(1950)   dec(1950)  count rate error  snr  har seq # off
 
  1 PKS 0637-75   06 37 26.7 -75 13 38 0.149     5.3E-03  28.1  0.5 8494  0.4
  2 PKS 0637-75   06 37 24.1 -75 13 34 0.262     1.6E-02  16.2  0.4 5404  0.2
  3 PKS 0637-75   06 37 26.7 -75 13 38 0.149     5.3E-03  28.1  0.5 8494  0.4
  4 PKS 0637-75   06 37 24.1 -75 13 34 0.262     1.6E-02  16.2  0.4 5404  0.2
  5  unknown      00 02 53.6 -74 43 21 8.902E-02 1.3E-02   6.9  0.5  614 24.2
  6  unknown      01 15 43.3 -73 42 31 0.271     3.3E-02   8.2  0.6  623 16.3
CMA_IPC[dec] 3143> cd le
CMA_ipc[dec] 3143> dl 1-10
 
     name       seq   off  expos   time    count filt   ra        dec   inst
 (field name)   (#)  (min) (sec) (yr.day)   rate      (1950)    (1950)
 
>1 PKS0637-75  1027    1   14340  84.252  5.0E-03  7  6 37 23.7 -75 13 44 L1
 2 PKS0637-75  1027    1   14340  84.252  5.0E-03  7  6 37 23.7 -75 13 44 L1
 3 PKS0637-75   857    1   28952  84.169  3.5E-03  7  6 37 22.6 -75 13 40 L1
 4 PKS0637-75   857    1   28952  84.169  3.5E-03  7  6 37 22.6 -75 13 40 L1
 5 E0003.0-74  1141    2   25254  84.295  1.8E-03  7  0  2 51.3 -74 43 12 L1
 6 SMCX-1       914    2    3274  84.198  7.2E-02  7  1 15 46.0 -73 42 23 L1
CMA_ipc[dec] 3143>
When a correlation has been established between two base tables, the dcoord command with the /separation qualifier can be used to list the separation between the positions given in the first database and those in the correlated database. To display a parameter which is not from the current database, its name must be prefixed with the name of the database, with a "dot" between the database name and the parameter name. For example, to plot the LE against IPC count rates use:


CMA_IPC[DEC] 345> ps le.count ipc.count
Similarly, the ASCII table commands mat and dat can be used to list parameters from the two databases. Further cross-correlations can be made, up to a maximum of five open databases.

Resetting

The correlation can be reset and BROWSE returned to accessing only the original database table by using the rcor (reset correlation) command.

Correlation Samples

A correlation can be written to a sample so that it can be retrieved at a later date. This is done using msam name, where name is the name of the "correlation sample". A correlation sample can be loaded at any time using the csam (change sample) command, just like any normal sample. Correlation samples that have been made are listed using the lsam command. If only a sample containing the entries from the current database is required, then the /no_correlation qualifier is used.

Multiple Parameters

In some cases it may be necessary to cross-correlate using two parameters, such as time and coordinates. As many as three parameters can be specified simultaneously and BROWSE will find all matches within each specified range.
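
Matching on several parameters at once amounts to requiring that a pair falls within every specified range. The following Python sketch shows this for coordinates plus time; the function names, the field names, and the sample values are assumptions made for illustration only.

```python
import math


def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arc seconds; all inputs are in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match_multi(primary, other, radius_arcsec, time_days):
    """Return (i, j) pairs that agree in position AND in time."""
    pairs = []
    for i, p in enumerate(primary):
        for j, o in enumerate(other):
            close_in_sky = (
                separation_arcsec(p["ra"], p["dec"], o["ra"], o["dec"]) <= radius_arcsec
            )
            close_in_time = abs(p["time"] - o["time"]) <= time_days
            if close_in_sky and close_in_time:
                pairs.append((i, j))
    return pairs


# Made-up entries: positions in degrees, times in days.
first = [{"ra": 10.0, "dec": -30.0, "time": 100.0}]
second = [
    {"ra": 10.0, "dec": -30.0 + 5.0 / 3600.0, "time": 100.5},  # near in sky and time
    {"ra": 10.0, "dec": -30.0 + 5.0 / 3600.0, "time": 250.0},  # near in sky only
    {"ra": 80.0, "dec": 10.0, "time": 100.2},                  # near in time only
]
both = cross_match_multi(first, second, radius_arcsec=40.0, time_days=1.0)
print(both)
```

Only the first entry of the second table satisfies both conditions, which is the behaviour one would expect when each added parameter narrows the set of matches.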
Michael Arida
1998-04-10