OGIP Memo OGIP/95-008
The MATHPHA User's Guide
Ian M George
Version: 1995 Aug 23
Code 668,
NASA/GSFC,
Greenbelt, MD20771
MATHPHA v4.0.0
|
Release | Sections Changed | Brief Notes |
Date | ||
1995 Mar 30 | Original Version | |
1995 Apr 14 | All | Update for MATHPHAv3.5.0 |
1995 Aug 23 | 3, 8 | Update for MATHPHAv4.0.0 |
MATHPHA takes a user-defined input expression representing
mathematical operations to be performed on (one or more) PHA
datasets. The input expression is thus allowed to contain PHA
filenames (and optionally, an extension number), a number of
mathematical operators, integers, reals, and pairs of curved
brackets ("(" & ")"). Assuming the expression is able to be parsed (and
a number of other criteria are satisfied), the operations specified
are performed on the PHA dataset, the errors propagated
(see Section 3),
and a new file containing the resultant PHA dataset is written.
The units in which the algebra is performed is controlled
by a parameter (units; see Section 2),
the default of which is COUNTS
(as opposed to COUNTS PER SECOND). This is true
IRRESPECTIVE of
whether the original PHA data themselves are stored as counts or
counts per second. The exposure time written for the new PHA
dataset is controlled by a user-defined parameter (exposure;
see Section 4). Users also have the option of specifying
the values written to the o/p file of several other mandatory keywords
(see Section 6).
Users should note that MATHPHA is a moderately powerful tool,
and thus open to confusion/abuse. Users are strongly urged to read &
attempt to fully understand the deatils of the task.
In particular, users should understand the implied units for any
integers/reals which form part of the input expression, the details of the
error propagation, and options available to calculate the exposure time
written to the o/p file. Both of these issues are discussed in detail
below. Any remaining confusion/uncertainties/strange results should
be reported to the author.
At the current time, only a modest number of mathematical
operations are possible (as listed
in Section 2.1), though it is anticipated
that they will serve the vast majority of users' needs. These will
be added to as dictated by demand. In addition, due to the relative
crudeness of the expression parser, there are number of rules which
users must conform to when constructing their input expression
(Section 2).
Violation of these rules will result in the task
(hopefully) stopping, (possibly) crashing, or (maybe even)
producing incorrect results.
Only the the OGIP recognised PHA file formats described in
OGIP/92-007
(see also Arnaud etal 1992)
and its appendix provided in OGIP memo
OGIP/92-007a
are supported for both the input and output files.
The input expression is a character string (currently with a
maximum length of 1024 characters) which forms a mathematical
expression consisting of:
Within the constraint of the total number of characters specified
above, any number of spaces can be added to the input expression.
The following mathematical operations are currently supported:
It should be noted that the parser will interpret all characters
besides those listed above and the open/close curved
brackets/parentheses characters (ie ( and ))
as (parts of) input PHA filenames.
There several options open to users regarding the calculation or
propagation of errors, controlled by the parameters errmeth and
properr.
The (hidden) parameter errmeth controls which prescription is used to
calculate errors, should the task need to do so. The value of this keyword
is therefore important if the errors are to be propagated
(ie properr='yes')
and one or more of the i/p files contain the FITS
keyword POISERR = T indicating Poissonian statistics should be used to
calculate the statistical errors, or if errors are not to be propagated
(ie properr='no') and the algebra is performed in counts/sec
(ie units='Rate').
Currently, the following prescriptions are available (where N
is the number of counts observed in a given PHA channel):
The following tables enables direct comparisons to be made between these
approximations:
1 INTRODUCTION
2 RULES GOVERNING THE CONSTRUCTION OF THE INPUT EXPRESSION
Only the öbvious" restrictions are currently placed on
filenames, namely that the path must be specified, and that the
parser must not be able to interpret the path/filename as a
mathematical operation, an integer/real, or some combination
thereof (this currently poses a problem under unix & ultrix -
see below). A specific extension of the input file containing
the PHA dataset to be operated on can be specified by including
the integer extension number in square ([extn#])
brackets
after the filename. If such a specification is not used, then
the entire PHA file will be searched for extensions likely to
contain a PHA dataset. In such cases, if the input file is then
found to consist of more than one PHA dataset, then the task
will warn the user and stop.
The mathematical operators currently allowed are listed
in Section 2.1.
Any character which is not in this list, is not a curved
bracket, and which cannot be interpreted as (part of) an
integer/real, will be interpreted as a filename, or part
thereof.
Numerical constants consist of a combination of ASCII
characters in the range 0 to 9 along with
an optional,
single decimal point (.). For internal purposes, the
parser
interprets all integers as reals, thus the inclusion/exclusion
of the decimal point for whole numbers is of no consequence.
Curved brackets should be used in the standard mathematical
mannor to denote the priority in which various calculations are
performed. For safety, the task will stop if the number of
open-brackets (() does not equal the number
of
close-brackets()).
The only limitations currently in place
are that neither the first nor last entry within a pair of
brackets is a mathematical operator, in which case the task
SHOULD stop.
2.1 Currently Supported Mathematical Operations
3 CALCULATION & PROGATION OF ERRORS
Caution is urged, particularly when using errmeth = 'Gauss', as
unexpected and/or misleading results can be produced.
whereby the errors are calculated using
error = √{N}.
(the default) whereby the algorithm of Gehrels (1986 ApJ, 303, 336)
eqn 7 with S=1 is used
(error = 1.0 + √{N + 0.75}).
The value is statistically that of the (larger) +ve error of a
Poissonian distribution, but within this task this value is
assigned to both the +ve and -ve error on the counts in that channel.
For small N, the errors created using this prescription is
significantly GREATER (and hence more conservative) than
both the true -ve error, and that obtained by simply using
√{N}. Thus, this prescription is recommended unless a
user fully understands the implications of using a different
prescription. Both differences quickly reduce as one moves to
larger N (and the Poissonian distribution becomes more
symmetric/Gaussian).
whereby the algorithm of Gehrels (1986 ApJ, 303, 336) eqn 11,
with S=1 is used
(error = √{N − 0.25}).
The value is statistically that of the (smaller) -ve error of a
Poissonian distribution, but within this task this value is
assigned to both the +ve and -ve error on the counts in that channel.
This error prescription UNDERESTIMATES the true error for small
N.
whereby the mean of the errors given by POISS-1 and
POISS-2 above is used.
N | true 1-sigma | errors calcd using | errors calcd | |||
Poisson errors | Gehrels approx | using √{N} | ||||
0 | +1.84 | -0.00 | +1.87 | -0.00 | +0.00 | -0.00 |
1 | +2.30 | -0.83 | +2.32 | -0.67 | +1.00 | -1.00 |
2 | +2.63 | -1.92 | +2.66 | -1.33 | +1.41 | -1.41 |
3 | +2.92 | -1.63 | +2.94 | -1.66 | +1.73 | -1.73 |
4 | +3.16 | -1.91 | +3.18 | -1.94 | +2.00 | -2.00 |
5 | +3.38 | -2.16 | +3.40 | -2.18 | +2.24 | -2.24 |
10 | +4.27 | -3.11 | +4.28 | -3.12 | +3.16 | -3.16 |
50 | +8.12 | -7.05 | +8.12 | -7.05 | +7.07 | -7.07 |
100 | +11.00 | -9.98 | +11.00 | -9.99 | +10.00 | -10.00 |
The (hidden) parameter properr controls whether the errors are to be
propagated during the algebra or (if properr='no') whether the errors are
simply calculated from the resultant PHA dataset. In the former case,
the errors are propagated in the normal manner
(e3 = √{[e12 + e22]}).
This is an approximation when in the Poissonian regime (ie for low N).
Whilst it
is true that the variances of the Poissonian distribution are combined in
this normal way, confidence limit error bars are not simply related to the
variance like they are for Gaussian statistics.
HOWEVER, these approximations work well for all but the smallest N,
and is certainly superior to either assuming
√{N} errors, or neglecting
errors altogther:
Severely misleading and/or incorrect results are possible if the errors
are not propagated (ie if properr=no). Turning off error propagation
is only reccomended when one is adding non-background-subtracted datasets,
and one fully understands the risks.
The value written as the exposure time in the o/p PHA dataset is
indicated by a user-defined flag (exposure).
This facility
provides added flexibility to users, but with a risk of some
confusion. It should be remembered however, that if desired, the
exposure time given for a PHA dataset can always be reset using the
chkey command within the ftools/heasarc task
GRPPHA.
The following options are currently available:
The area scaling factor is a numerical constant stored in the
OGIP-mandatory FITS keyword AREASCAL in the
SPECTRUM extension of a
PHA file. Details of its use (and misuse) are too varied and
complex to be fully described here, but most often it is used to
renormalize the response matrix associated with that PHA file
within XSPEC.
As such the value of the AREASCAL keyword is often
unity, but (particularly in the case of older response matrices)
can have values of the order of the effective area (in cm2) of
the telescope/instrument.
The value written as the AREASCAL keyword in the o/p PHA dataset is
determined by a hidden parameter, areascal.
The following options are currently available:
There are a number of mandatory keywords required to be present in the
header of extension containing the resultant PHA dataset in the o/p file.
A number of these are crucial for the correct interpretation of the
o/p dataset.
The prime example is the value of the EXPOSURE keyword, and
as described in Section 4, MATHPHA has a
parameter exposure which controls how the value to be written to
the o/p file is determined.
Similarly MATHPHA compares the values of the AREASCAL
keywords found in the i/p files, and attempts to determine the most
appropriate value to be written to the o/p file
(Section 5).
MATHPHA
also checks the
values of the TELESCOP, INSTRUME, DETNAM,
FILTER & CHANTYPE keywords. If there are
discrepancies with any of these keywords, MATHPHA
will warn the user and writes the appropriate null/default value to
the o/p file.
However, there are a number of mandatory keywords associated with
'auxiliary' files, for which MATHPHA is in general unable to
calculate 'appropriate' values (since it largely depends upon what
the user is attempting to do via the input expression).
These keywords are:
However, for convenience MATHPHA does provide users some
opportunity to specify the values to be written to the o/p
file via the hidden parameters
auxfiles, backfile, backscal, corrfile,
corrscal, arfile and rmfile
(see also Section 8).
The internal logic for MATHPHA regarding these parameters is as
follows:
The parser is written in FORTRAN (!), and converts the
user-supplied expression to ÏMG" Reverse Polish (a little-known
dialect thought to be understood by but one person on the planet).
The parser attempts to identify the various elements of the input
expression, and order the mathematical operations and the reading
of PHA datasets appropriately. The parser attempts to work out from
the last-entered (innermost) bracket-pair. The mathematical
expression contained therewithin is evaluated and the answer
stored. The parser then moves to the 2nd-to-last bracket-pair. The
mathematical expression contained therewithin is evaluated
(including the results already calculated, if appropriate), and the
answer again stored. This process thus continues until the full
expression has been evaluated, and this is the result which is
finally written as the output PHA dataset.
Users should keep in mind the following constraints/behaviour:
The task will STOP (hopefully with an appropriate error message)
if:
Users who encounter/suspect problems with the task are urged to
contact the author, but also to experiment with the input
expression (by increasing/moving pairs of brackets, and/or
performing the required calculation with two or more passes through
MATHPHA).
The author, USRA, the OGIP & NASA assume no responsibility for
errors resulting from either bugs, or the misuse of this task.
However, please report any comments/problems or bugs to Ian M George
(george@lheavx.gsfc.nasa.gov).
Unfortunately, the following bugs are known to exist in the
current version of the task:
Section incomplete
Information regarding on-line versions of any of the following references
with an OGIP Memo number (ie documents starting OGIP/.. or
CAL/..) can most easily be found via the WorldWide Web by following
the links from the URL:
Most OGIP Calibration Memos of general community interest will eventually
appear as articles in Legacy, but are also available on request
from The Office of Guest Investigator Programs, Code 668,
NASA/GSFC, Greenbelt, MD 20771, USA.
Arnaud, K.A., George, I.M., Tennant, A.F.,
1992, Legacy, 2, 65.
George, I.M., Arnaud, K.A.
1992.
addendum to
OGIP/92-007)
Gehrels, N., 1986, ApJ, 303, 336
The following useful links are available (in the HTML version of this
document only):
+4.24 -4.24 (using errmeth='POISS-1')
+3.08 -3.08 (using errmeth='POISS-2')
+3.95 -3.95 (using errmeth='POISS-3')
compared to:
+4.27 -3.11 (statistically correct values)
+3.16 -3.16 (propagating 'Gaussian' errors)
4 CALCULATION OF THE EXPOSURE TIME IN THE O/P PHA DATASET
where an exposure time of 1 second is written to the o/p
dataset.
whereby the exposure time read from the specified i/p file
is written to the o/p dataset.
where the entered value (assuming it can be parsed as a
real), is written to the o/p dataset. The value will be assumed
to be in units of seconds. At the current time, numerals of the
form 1E04 or 1E-04 are NOT supported
(and will be
interpreted as i/p filenames).
where the exposure time is calculated from the i/p
expression by sustituting each filename with its exposure time,
and performing the specified calculation.
5 CALCULATION OF THE ÄREA SCALING FACTOR" FOR THE O/P PHA DATASET
where an AREASCAL of 1.0 is written to the o/p dataset.
whereby the AREASCAL read from the specified i/p file is written
to the o/p dataset.
where the entered value (assuming it can be parsed as a real), is
written to the o/p dataset. At the current time, numerals of the
form 1E04 or 1E-04 are NOT supported
(and will be interpreted as i/p filenames).
(the default setting) where MATHPHA
employs the following logic:
In both cases, the user is informed of the value written to the o/p
file.
Essentially, if you dont see warning messages regarding AREASCAL
during execution, don't worry about it. If you do, and the values
on the various i/p files vary dramatically and you dont understand
why, then you should check whether the input expression was really
that which you intended.
this value (actually the straight mean) will be written to the
o/p file.
the user is informed of the various values encounted in the
i/p files, an AREASCAL of unity is written to the o/p file,
and the user invited to determine and change the value of this
keyword outside of MATHPHA.
6 DETERMINATION OF THE AUXILIARY FILES FOR THE O/P PHA DATASET
(see also
OGIP/92-007).
7 WARNINGS ON USAGE
7.1 Common Mistakes/Limitations
expr = directory1/directory2/filename
to divide the PHA file
called 'directory1' by PHA file 'directory2',
then divide the
result by PHA file 'filename'.
This problem may be addressed in a
future release of the task, however can be overcome in the
meantime
by copying the necessary input PHA file(s) to the current
working directory.
8 INPUT PARAMETERS
The i/p expression to be evaluated. This must represent a
series of mathematical operations to be performed on (one or
more) PHA datasets. The expression is allowed to contain PHA
filenames (and optionally, an extension number), the
mathematical operators listed above, integers, reals, and
pairs of curved brackets (( & )).
A number of other
criteria, as specified above MUST also be satisfied.
A flag indicating the physical units in which the algebraic
expression is to be evaluated (and the units in which the o/p
file will be written). The allowed values are 'C'
(ie 'COUNTS' - the default), and 'R' (ie 'RATE'),
implying that
the algebra will be performed in 'COUNTS' and 'COUNTS PER
SECOND' space respectively. The algebra will be performed in
this space irrespective of whether the input files contain
data stored in counts or in counts per second (ie, if
units=C, then i/p PHA histograms stored in counts/s
will be
converted to counts prior to any mathematical operations being
performed). Similarly, this flag gives the implied units of any
numerical constants within the input expression. A third
option value is also allowed, units=F (ie 'FILE')
whereby
the algebra is performed in which ever units most of the files
are stored in ('COUNTS' in the event of a tie). Extreme
caution is urged using this final option.
The name of the o/p file to be written containing the derived
PHA dataset. For safety, the name of the o/p file CANNOT
be
the same as one of the i/p files, and the task will stop if
this is the case. Under unix/ultrix, an o/p filename identical
to a file which already exists on disk is also considered
illegal, and the task will stop. However, the existing file can
be automatically removed, and the new file written if the o/p
filename is preceeded by "!" at the outfil prompt.
A flag indicating how the exposure time in the o/p PHA dataset
is calculated/written. The following options are currently
available: NULL, CALC,
a real number (giving the exposure
in seconds), or an i/p filename (see Section 4).
A flag indicating how the AREASCAL
keyword in the o/p PHA dataset is determined.
The following options are currently available:
'NULL', '%', a real number,
or an i/p filename (see Section 5).
A flag whether the errors are to propagated during the algebra
(the default), or (if properr='no') whether the errors are
simply calculated from the resultant PHA dataset.
The latter can be dangerous - see Section 3
A flag indicating what error presciption is to be used should errors need to
calculated by the task at any stage. The following values are currently allowed:
Caution is urged, particularly when using
errmeth = 'Gauss', as unexpected
and/or misleading results can be produced (see Section 3).
whereby the errors are calculated using √{N}.
(the default) whereby the algorithm of
Gehrels (1986 ApJ, 303, 336) eqn 7 (+ve error) is used.
whereby the algorithm of Gehrels (1986 ApJ, 303, 336) eqn 11
(-ve error) is used.
whereby 0.5 × (POISS-1 + POISS-2) is used
The number of 70 character comment strings (up to a maximum of
4) to be added by the user. ncomments = 0
is allowed for users
not wishing to add their own notes. The default value is 1.
The i/p parameters, including the expression, will be written
as COMMENTS to the o/p file;
these additional comments provide
users the facility to add their own notes. Each string will be
truncated beyond the 70th character.
A flag indicating whether the values of ALL the
auxiliary file keywords
(BACKFILE, BACKSCAL, CORRFILE, CORRSCAL,
RESPFILE & ANCRFILE) written to the o/p
file are to be copied from a user-specified i/p file.
The default value ('NONE') indicates this is not the
case, and will result in these keywords being written with
their null/default values in the o/p file
(see Section 6).
It should be remembered that setting the auxfiles parameter
overrides any values specified via the
backfile, backscal, corrfile,
corrscal, arfile & rmfile parameters
(below) such that any values will be ignored.
A flag indicating how the BACKFILE
keyword written in the o/p PHA dataset is
determined. The following options are currently available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (NONE)
for this keyword will be written in the o/p file
The task will check whether all the i/p PHA files have the same
value of the BACKFILE keyword.
If this is the case, then that value
will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the BACKFILE keyword from the specified
i/p file will be copied to the o/p file.
A flag indicating how the BACKSCAL keyword written in the
o/p PHA dataset is determined.
The following options are currently available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (1.0)
for this keyword will be written in the o/p file
The task will check whether all the i/p PHA files
have the same value of the BACKSCAL keyword.
If this is the case (to within 1%), then
that value will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the BACKSCAL keyword from the specified
i/p file will be copied to the o/p file.
A flag indicating how the CORRFILE keyword written in the
o/p PHA dataset is determined. The following options are currently
available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (NONE) for this keyword will
be written in the o/p file
The task will check whether all the i/p PHA files have the same
value of the CORRFILE keyword. If this is the case,
then that value will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the CORRFILE keyword from the specified
i/p file will be copied to the o/p file.
A flag indicating how the CORRSCAL keyword written in
the o/p PHA dataset is determined. The following options are
currently available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (1.0) for this keyword will be
written in the o/p file
The task will check whether all the i/p PHA files have the same
value of the CORRSCAL keyword. If this is the case
(to within 1%), then that value will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the CORRSCAL keyword from the specified
i/p file will be copied to the o/p file.
A flag indicating how the ANCRFILE keyword written in the
o/p PHA dataset is determined. The following options are currently
available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (NONE) for this keyword will
be written in the o/p file
The task will check whether all the i/p PHA files have the same
value of the ANCRFILE keyword. If this is the case,
then that value will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the ANCRFILE keyword from the specified
i/p file will be copied to the o/p file.
A flag indicating how the RESPFILE keyword written in the
o/p PHA dataset is determined. The following options are
currently available:
In the latter two cases, any problems will be reported by the task, and
the default value written to the o/p file.
The default value (NONE) for this
keyword will be written in the o/p file
The task will check whether all the i/p PHA files have the same
value of the RESPFILE keyword. If this is the case,
then that value will be written to o/p file.
Otherwise, the default value will be written to the o/p file.
The value of the RESPFILE keyword from the specified
i/p file will be copied to the o/p file.
The OGIP FITS format version for PHA datasets in which the o/p
FITS file is to be written. The default is
PHAVERS1='1.1.0'
(also known as known as PHAVERSN = '1992a'),
and is currently
to only format allowed.
A real to be written into the o/p PHA dataset for channels
which suffer a divide-by-zero during the evaluation of the i/p
expression. The default value is -99.
Flag to indicate how chatty the task is at execution. A value
of 9 is the default, with lower/higher values
producing
quieter/verbose output respectively.
Flag specifying whether or not a pre-existing file with the
same name as that requested as the output file from this task
will be overwritten. This parameter is required to be present
by the FTOOLS group and NASA/GSFC and provides the same functionality
as that described above whereby the o/p filename can be preceeded
by "!" at the outfil prompt. It should be noted that
the "!"filename syntax will overwrite any existing
files irrespective of the value of the clobber parameter.
9 KNOWN BUGS
10 WORKED EXAMPLES
REFERENCES
/docs/heasarc/caldb/caldb_docs_index.html
(early, published version of OGIP/92-007)
(OGIP/92-007a
USEFUL LINKS TO OTHER HTML PAGES