So, you've received an XMM-Newton EPIC data set. What are you going
to do with it? After checking what the observation consists of
(see § 3.2), you should note
when the observation was taken. If it is a recent observation, it was likely
processed with the most recent calibrations and SAS, and you can immediately
start to analyze the Pipeline Processed data. However, if it is more
than a year old, it was probably processed with older versions of CCF and SAS
prior to archiving, and the pipeline should be rerun to generate event files
with the latest calibrations.
As noted in
Chapter 4, a variety of analysis packages can be used for the
following steps. However, as the SAS was designed for the basic reduction
and analysis of XMM-Newton data (extraction of spatial, spectral, and
temporal data), it will be used here for demonstration purposes.
SAS will be required at any rate for the production of detector response
files (RMFs and ARFs) and other observatory-specific requirements,
though for the simple case of on-axis point sources the canned
response files provided by the SOC can be used.
NOTE: For PN observations with very bright sources, out-of-time
events can provide a serious contamination of the image. Out-of-time
events occur because the read-out period for the CCDs can be up to
several percent of the frame time. Since events that occur during the
read-out period can't be distinguished from other events, they are
included in the event files but have invalid locations. For
observations with bright sources, this can cause bright stripes
in the image along the CCD read-out direction. For a more
detailed description of this issue, check:
It is strongly recommended that you keep all reprocessed data in its own
directory! SAS places output files in whichever directory it is in when a task
is called. Throughout this primer, it is assumed that the Pipeline Processed data are
in the PPS directory, the ODF data (with upper case file names, and uncompressed)
are in the directory ODF, the analysis is taking place in the PROC directory, and
the CCF data are in the directory CCF.
If your data are recent, you need only to gunzip the files and prepare the data for processing (see §5). Feel free to skip the section on repipelining (§6.1) and proceed to later discussions. In any case, for simplicity, it is recommended that you change the name of the unzipped event file to something easy to type. For example, an MOS1 event list:
Various analysis procedures are demonstrated using the Lockman Hole SV1 dataset,
ObsID 0123700101, which definitely needs to be repipelined. The following procedures
are applicable to all XMM-Newton datasets, so it is not required that you use this
particular dataset; any observation should be sufficient.
If you simply want to have a quick look at your data, the ESKYIM files contain
EPIC sky images in different energy bands whose ranges are listed in
Table 3.3. While the zipped FITS files may need to be unzipped
before display in ds9 (depending on the version of ds9), they can be
displayed when zipped using fv (fv is a FITS file viewer available in the
HEASoft package). In addition, the image of the total band pass for all three EPIC
detectors is also provided in PNG format which can be displayed with a web browser.
Also, the PP source list is provided in both zipped FITS format (readable by
fv) and as an HTML file.
For detailed descriptions of PP data nomenclature, file contents, and
which tasks can be used to view them, see Tables 3.2 and 3.3.
For detailed descriptions of ODF data nomenclature and file contents, see Table 3.1.
We assume that the data was prepared and environment variables were set
according to §5. In the window where SAS was
initialized, in your ``processing directory'' PROC, run emchain
or emproc to produce calibrated photon event files for the MOS
cameras, and epchain or epproc to do the same for the PN camera.
Note that emproc and epproc will automatically detect
what mode the data were taken in. However, if you are using data that
were not taken in Imaging mode and want to run emchain or
epchain, you will need to set the relevant parameter.
To process the MOS data, type
Similarly, to process the PN data, type
If the dataset has more than one exposure, a specific exposure can be accessed using the exposure parameter, e.g.:
where n is the exposure number. To create an out-of-time event file for your PN data, add the parameter withoutoftime to your epchain invocation:
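The invocations described above can be sketched as follows. This is a minimal sketch in Python that only assembles the command lines; it does not run SAS. The helper sas_cmd is hypothetical, the value "yes" for withoutoftime is an assumption about how the boolean is spelled, and the exposure number 1 is a placeholder for n.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

mos_cmd = sas_cmd("emchain")                          # emproc is called the same way
pn_cmd = sas_cmd("epchain")                           # epproc is called the same way
pn_exposure_cmd = sas_cmd("epchain", exposure=1)      # 1 stands in for n
pn_oot_cmd = sas_cmd("epchain", withoutoftime="yes")  # "yes" is an assumed spelling

# Print the lines as they would be typed in the SAS-initialized window.
for cmd in (mos_cmd, pn_cmd, pn_exposure_cmd, pn_oot_cmd):
    print(shlex.join(cmd))
```

Each list can be handed to subprocess.run in the PROC directory once SAS has been initialized.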
By default, none of these tasks keep any intermediate files they generate. Emchain and epchain maintain the naming convention described in §3.3.3. Emproc and epproc designate their output event files with ``*ImagingEvts.ds''. In any case, you may want to name the new files something easy to type. For example, to rename one of the new MOS1 event files output from emchain or emproc, respectively, type
To create an image in sky coordinates, type
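A minimal sketch of the image-making call, assuming the renamed event file is called mos1.fits (a hypothetical name) and that a 600 x 600 pixel image is wanted (an assumed size). The evselect parameter names are given from memory of the SAS documentation and should be checked against the task's help; the Python code only builds the command line.

```python
import shlex

def evselect_image_cmd(table, imageset, size=600):
    """Build an evselect call that bins the sky coordinates X, Y into an image."""
    return [
        "evselect",
        f"table={table}",
        "withimageset=yes",
        f"imageset={imageset}",
        "xcolumn=X",
        "ycolumn=Y",
        "imagebinning=imageSize",   # fix the image size, let the bin size follow
        f"ximagesize={size}",
        f"yimagesize={size}",
    ]

image_cmd = evselect_image_cmd("mos1.fits", "image.fits")
print(shlex.join(image_cmd))
```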
The output file image.fits can be viewed by using a standard FITS display, such as ds9 (see Figure 6.1):
The filtering expressions for the MOS and PN are:
The first keyword in each expression selects good events with PATTERN in the 0 to 12 range for the MOS (or 0 to 4 for the PN). The PATTERN
value is similar to the GRADE selection for ASCA data, and is related to the
number and pattern of the CCD pixels triggered for a given event. The
PATTERN assignments are: single pixel events: PATTERN == 0,
double pixel events: PATTERN in [1:4], triple and quadruple events:
PATTERN in [5:12].
The second keyword in the expressions, PI, selects the preferred pulse
height of the event; for the MOS, this should be between 200 and 12000 eV.
For the PN, this should be between 200 and 15000 eV. This should clean up the
image significantly with most of the rest of the obvious contamination due to low
pulse height events. Setting the lower PI channel limit somewhat higher (e.g.,
to 300 eV) will eliminate much of the rest.
Finally, the #XMMEA_EM (#XMMEA_EP for the PN) filter provides
a canned screening set of FLAG values for the event. (The FLAG value provides a
bit encoding of various event conditions, e.g., near hot pixels or outside of the
field of view.) Setting FLAG == 0 in the selection expression provides the
most conservative screening criteria and should always be used when serious spectral
analysis is to be done on the PN. It typically is not necessary for the MOS.
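The three parts of the selection described above (PATTERN, PI, and the flag screening) can be assembled as in the following minimal sketch. The function name standard_filter and its arguments are hypothetical; the limits are the ones quoted in the text.

```python
def standard_filter(instrument, strict_flag=False, pi_min=200):
    """Assemble the PATTERN / PI / flag selection for 'MOS' or 'PN'."""
    limits = {
        "MOS": (12, 12000, "#XMMEA_EM"),
        "PN": (4, 15000, "#XMMEA_EP"),
    }
    max_pattern, pi_max, canned_flag = limits[instrument]
    # FLAG == 0 is the most conservative screening; otherwise use the canned set.
    flag = "(FLAG == 0)" if strict_flag else canned_flag
    return f"(PATTERN <= {max_pattern})&&(PI in [{pi_min}:{pi_max}])&&{flag}"

print(standard_filter("MOS"))
print(standard_filter("PN", strict_flag=True))
print(standard_filter("MOS", pi_min=300))  # raised lower PI limit
```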
It is a good idea to keep the output filtered event files and use them in your
analyses, as opposed to re-filtering the original file with every task. This will
save much time and computer memory. As an example, the Lockman Hole data's original
event file is 48.4 MB; the fully filtered list (that is, filtered spatially, temporally,
and spectrally) is only 4.0 MB!
To filter the data, type
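A minimal sketch of the filtering call for the MOS, assuming the input is mos1.fits and the output is mos1_filt.fits (both hypothetical names). The evselect parameter names are given from memory of the SAS documentation and should be verified; the code only builds the command line.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

mos_expression = "(PATTERN <= 12)&&(PI in [200:12000])&&#XMMEA_EM"

filter_cmd = sas_cmd(
    "evselect",
    table="mos1.fits",
    withfilteredset="yes",
    filteredset="mos1_filt.fits",
    filtertype="expression",
    expression=mos_expression,
    keepfilteroutput="yes",     # keep the filtered event list on disk
    updateexposure="yes",
    filterexposure="yes",
)

# shlex.join quotes the expression so the shell does not interpret & or #.
print(shlex.join(filter_cmd))
```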
Sometimes, it is necessary to use filters on time in addition to those
mentioned above. This is because of soft proton background flaring, which
can have count rates of 100 counts/sec or higher across the entire bandpass.
It should be noted that the amount of flaring that needs to be removed depends
in part on the object observed; a faint, extended object will be more affected
than a very bright X-ray source.
To determine if our observation is affected by background flaring, we can examine the light curve:
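A minimal sketch of the light-curve extraction, assuming the filtered event file is named mos1_filt.fits and a time bin of 100 s (both assumptions). The parameter names are given from memory of the evselect documentation; the code only builds the command line.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

lightcurve_cmd = sas_cmd(
    "evselect",
    table="mos1_filt.fits",
    withrateset="yes",
    rateset="mos1_ltcrv.fits",
    maketimecolumn="yes",
    timecolumn="TIME",
    timebinsize=100,          # assumed bin width in seconds
    makeratecolumn="yes",     # write a RATE column rather than raw counts
)
print(shlex.join(lightcurve_cmd))
```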
The output file mos1_ltcrv.fits can be viewed by using fv:
In the pop-up window, the RATE extension will be available in the second row (index 1, as numbering begins with 0). Select ``PLOT'' from this row, and select the column name and axis on which to plot it. The light curve is shown in Fig. 6.2.
Taking a look at the light curve, we can see that there is a very large
flare toward the end of the observation and two much smaller ones in the
middle of the exposure.
There are many ways to filter on time: with an explicit reference to the
TIME parameter in the filtering expression; by making a secondary
Good Time Interval (GTI) file with the task tabgtigen,
which will allow you to filter on TIME or RATE;
or by making a new GTI file with the task gtibuild using
TIME as the filter. All of these will get the job done, so which to
use is a matter of the user's preference. All of these are demonstrated below.
Filter on RATE with tabgtigen
Examining the light curve shows us that during non-flare times, the count rate is quite low, about 1.3 ct/s, with a small increase at 7.3223e7 seconds to about 6 ct/s. We can use that to generate the GTI file:
We can use evselect to apply it:
where the parameters are as defined in §6.3.
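The two steps above (build a GTI file from RATE, then apply it) can be sketched as follows. The threshold of 6 ct/s comes from the light curve described in the text; the file names gtiset.fits, mos1_filt.fits and mos1_filt_time.fits are hypothetical, and the tabgtigen and evselect parameter names are given from memory and should be checked.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

rate_limit = 6  # ct/s, read off the light curve

gti_cmd = sas_cmd(
    "tabgtigen",
    table="mos1_ltcrv.fits",
    gtiset="gtiset.fits",
    timecolumn="TIME",
    expression=f"(RATE <= {rate_limit})",
)

apply_cmd = sas_cmd(
    "evselect",
    table="mos1_filt.fits",
    withfilteredset="yes",
    filteredset="mos1_filt_time.fits",
    filtertype="expression",
    expression="GTI(gtiset.fits,TIME)",   # keep only events inside the GTIs
    keepfilteroutput="yes",
    updateexposure="yes",
    filterexposure="yes",
)

for cmd in (gti_cmd, apply_cmd):
    print(shlex.join(cmd))
```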
Filter on TIME with tabgtigen
Alternatively, we could have chosen to make a new GTI file by noting the times of the flaring in the light curve and using that as a filtering parameter. The big flare starts around 7.32276e7 s, and the smaller ones are at 7.32119e7 s and 7.32205e7 s. The expression to remove these would be (TIME <= 73227600)&&!(TIME IN [7.32118e7:7.3212e7])&&!(TIME IN [7.32204e7:7.32206e7]). The syntax (TIME <= 73227600) includes only events with times up to 73227600, and the "!" symbol stands for the logical "not", so use &&!(TIME IN [7.32118e7:7.3212e7]) to exclude events in that time interval. Once the new GTI file is made, we apply it with evselect.
where the parameters are as defined above.
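The TIME expression can be assembled from the flare times rather than typed by hand. This is a minimal sketch; the function name time_filter is hypothetical, and the comparison operator "<=" is an assumption about the intended cut at 73227600.

```python
def time_filter(end_time, excluded_intervals):
    """Keep events up to end_time, dropping each (start, stop) interval."""
    parts = [f"(TIME <= {end_time})"]
    parts += [f"!(TIME IN [{start}:{stop}])" for start, stop in excluded_intervals]
    return "&&".join(parts)

# Times are passed as strings so they appear exactly as written in the text.
small_flares = [("7.32118e7", "7.3212e7"), ("7.32204e7", "7.32206e7")]
time_expression = time_filter("73227600", small_flares)
print(time_expression)
```

The resulting string is what would be passed as the expression when making the GTI file.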
Filter on TIME with gtibuild
This method requires a text file as input. In the first two columns, enter the start and end times (in seconds) that you are interested in, and in the third column, indicate with either a + or - sign whether that region should be kept or removed. Each good (or bad) time interval should get its own line, with any optional comments preceded by a ``#''. In the example case, we would write in our ASCII file (named gti.txt):
and proceed to gtibuild:
And we apply it in the usual manner:
where the parameters are as described in §6.3.
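A minimal sketch of the gti.txt layout described above, using the flare times from the light curve. The start time of 0 for the kept interval, the output name gti.fits, and the gtibuild parameter names file and table are assumptions; the sketch writes the file into a temporary directory so nothing in PROC is overwritten.

```python
import shlex
import tempfile
from pathlib import Path

def gti_text(intervals):
    """One line per interval: start, stop, +/- sign, then a '#' comment."""
    lines = [f"{start} {stop} {sign} # {note}" for start, stop, sign, note in intervals]
    return "\n".join(lines) + "\n"

intervals = [
    ("0", "73227600", "+", "keep everything before the large flare"),
    ("7.32118e7", "7.3212e7", "-", "remove the first small flare"),
    ("7.32204e7", "7.32206e7", "-", "remove the second small flare"),
]

with tempfile.TemporaryDirectory() as workdir:
    gti_path = Path(workdir) / "gti.txt"
    gti_path.write_text(gti_text(intervals))
    written = gti_path.read_text()

print(written, end="")

gtibuild_cmd = ["gtibuild", "file=gti.txt", "table=gti.fits"]
print(shlex.join(gtibuild_cmd))
```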
Filter on TIME by Explicit Reference
Finally, we could have chosen to forgo making a secondary GTI file altogether, and simply filtered on TIME with the standard filtering expression (see §6.3). In that case, the full filtering expression would be:
This expression can then be used to filter the original event file, as shown in §6.3, or only the times can be used to filter the file that has already had the standard filters applied:
where the keywords are as described in §6.3.
The edetect_chain task does nearly all the work involved with EPIC source
detection. It can process up to three instruments (both MOS cameras and the PN) with
up to five images in different energy bands simultaneously. All images must have
identical binning and WCS keywords. For this example, we will perform source detection
on MOS1 images in two bands (``soft'' X-rays with energies between 300 and 2000 eV,
and ``hard'' X-rays, with energies between 2000 and 10000 eV) using the filtered event
files produced here.
We will start by generating some files that edetect_chain needs: an
attitude file and images of the sources in the desired energy bands, with the
image binning sizes as needed according to the detector. For the MOS, we'll
set the bin size to 22.
The example uses the filtered event file produced in §6.5, with the
assumption that it is located in the current directory.
First, make the attitude file by typing
Next, make the soft and hard X-ray images with evselect by typing
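The attitude file and the two band images can be sketched as follows. The band limits and the bin size of 22 come from the text; the file names (attitude.fits, mos1_filt_time.fits, mos1-s.fits, mos1-h.fits), the timestep of 1, and the exact parameter names are assumptions to be checked against the atthkgen and evselect documentation. The code only builds the command lines.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

def band_image_cmd(table, imageset, pi_min, pi_max, binsize=22):
    """evselect call for an image restricted to PI in [pi_min:pi_max] (eV)."""
    return sas_cmd(
        "evselect",
        table=table,
        withimageset="yes",
        imageset=imageset,
        imagebinning="binSize",      # fix the bin size, let the image size follow
        xcolumn="X", ximagebinsize=binsize,
        ycolumn="Y", yimagebinsize=binsize,
        filtertype="expression",
        expression=f"(PI in [{pi_min}:{pi_max}])",
    )

attitude_cmd = sas_cmd("atthkgen", atthkset="attitude.fits", timestep=1)
soft_cmd = band_image_cmd("mos1_filt_time.fits", "mos1-s.fits", 300, 2000)
hard_cmd = band_image_cmd("mos1_filt_time.fits", "mos1-h.fits", 2000, 10000)

for cmd in (attitude_cmd, soft_cmd, hard_cmd):
    print(shlex.join(cmd))
```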
We will also make an image with both soft and hard X-rays for display purposes:
where the parameters are
Now we can run edetect_chain:
The energy conversion factors (ECFs) convert the source count rates into
fluxes. The ECFs for each detector and energy band depend on the pattern selection and filter
used during the observation. For more information, please consult the calibration paper
``SSC-LUX-TN-0059'', available at the XMM-Newton Science Operations Center or see Table 8 in the
3XMM Catalogue User Guide.
Those used here are derived from PIMMS using the flux in the 0.1-10.0 keV band, a source
power-law index of 1.9, an absorption of
We can display the results of eboxdetect using the task srcdisplay and produce a region file for the sources.
Figure 6.3 shows the MOS1 event file overlaid with the detected sources.
Throughout the following, please keep in mind that some parameters are instrument-dependent.
The parameter specchannelmax should be set to 11999 for the MOS, or 20479 for the PN.
Also, for the PN, the most stringent filters, (FLAG==0)&&(PATTERN<=4), must be included
in the expression to get a high-quality spectrum.
For the MOS, the standard filters should be appropriate for many cases, though there are
some instances where tightening the selection requirements might be needed. For example,
if obtaining the best-possible spectral resolution is critical to your work, and the
corresponding loss of counts is not important, only the single pixel events should be
selected (PATTERN==0). If your observation is of a bright source, you again might want
to select only the single pixel events to mitigate pile up (see §6.8
and §6.9 for a more detailed discussion).
In any case, you'll need to know spatial information about the area over which you want to extract the spectrum, so display the filtered event file with ds9:
Select the object whose spectrum you wish to extract. This will produce a circle
(extraction region), centered on the object. The circle's radius can be changed by
clicking on it and dragging to the desired size. Adjust the size and position of
the circle until you are satisfied with the extraction region; then, double-click on
the region to bring up a window showing the center coordinates and radius of the
circle. For this example, we will choose the source at (26188.5,22816.5) and
set the extraction radius to 300 (in physical units).
To extract the source spectrum, type
When extracting the background spectrum, follow the same procedures, but change the
extraction area. For example, make an annulus around the source; this can be done
using two circles, one defining the inner edge and the other the outer edge of the annulus, then
change the filtering expression (and output file name) as necessary.
To extract the background spectrum, type
where the keywords are as described above.
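The source and background extractions differ only in the region expression and the output names, as this minimal sketch shows. The source position and radius are those chosen above; the annulus radii (500 and 1500), all file names, the spectral bin size of 5, and the exact parameter names are assumptions. The value 11999 for specchannelmax is the MOS value given in the text.

```python
import shlex

def circle(x, y, radius):
    """Region term selecting events inside a circle (physical units)."""
    return f"((X,Y) in CIRCLE({x},{y},{radius}))"

def annulus(x, y, inner, outer):
    """Outer circle with the inner circle excluded."""
    return f"{circle(x, y, outer)}&&!{circle(x, y, inner)}"

def spectrum_cmd(table, region, filteredset, spectrumset, channel_max=11999):
    """evselect call that writes a region-filtered event list and a spectrum."""
    return [
        "evselect", f"table={table}", "energycolumn=PI",
        "withfilteredset=yes", f"filteredset={filteredset}",
        "keepfilteroutput=yes", "filtertype=expression",
        f"expression={region}",
        "withspectrumset=yes", f"spectrumset={spectrumset}",
        "spectralbinsize=5", "withspecranges=yes",
        "specchannelmin=0", f"specchannelmax={channel_max}",
    ]

x, y = 26188.5, 22816.5
source_cmd = spectrum_cmd("mos1_filt_time.fits", circle(x, y, 300),
                          "mos1_filtered.fits", "mos1_pi.fits")
background_cmd = spectrum_cmd("mos1_filt_time.fits", annulus(x, y, 500, 1500),
                              "bkg_filtered.fits", "bkg_pi.fits")

for cmd in (source_cmd, background_cmd):
    print(shlex.join(cmd))
```

For the PN, channel_max would be 20479 and the stricter (FLAG==0)&&(PATTERN<=4) selection would be joined to the region term with &&.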
Depending on how bright the source is and what modes the EPIC detectors are in, event pile
up may be a problem. Pile up occurs when a source is so bright that incoming X-rays strike
two neighboring pixels or the same pixel in the CCD more than once in a read-out cycle. In
such cases the energies of the two events are in effect added together to form one event.
If this happens sufficiently often, 1) the spectrum will appear to be harder than it actually
is, and 2) the count rate will be underestimated, since multiple events will be undercounted.
To check whether pile up may be a problem, use the SAS task epatplot. Heavily
piled sources will be immediately obvious, as they will have a ``hole'' in the center,
but pile up is not always so conspicuous. Therefore, we recommend always checking for pile up.
Note that this procedure requires as input the event files created when the spectrum was
made, not the usual time-filtered event file.
To check for pile up in our Lockman Hole example, type
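A minimal sketch of the epatplot call, assuming the region-filtered event lists written during spectral extraction are named mos1_filtered.fits and bkg_filtered.fits (hypothetical names). The parameter names are given from memory of the epatplot documentation and should be verified; the code only builds the command line.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

epatplot_cmd = sas_cmd(
    "epatplot",
    set="mos1_filtered.fits",          # events from the source region
    plotfile="mos1_epat.ps",
    useplotfile="yes",
    withbackgroundset="yes",
    backgroundset="bkg_filtered.fits", # events from the background region
)
print(shlex.join(epatplot_cmd))
```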
The output of epatplot is a postscript file, mos1_epat.ps, which may be
viewed with viewers such as gv, containing two graphs describing the distribution
of counts as a function of PI channel; see Figure 6.4.
A few words about interpreting the plots are in order. The top is the distribution of
counts versus PI channel for each pattern class (single, double, triple, quadruple),
and the bottom is the expected pattern distribution (smooth lines) plotted over
the observed distribution (histogram). The lower plot shows the model
distributions for single and double events and the observed distributions. It also
gives the ratio of observed-to-modeled events with 1-σ uncertainties for single and
double pattern events over a given energy range. (The default is 0.5-2.0 keV; this can be
changed with the pileupnumberenergyrange parameter.) If the data is not piled up,
there will be good agreement between the modeled and observed single and double event
pattern distributions. Also, the observed-to-modeled fractions for both singles and doubles
in the 0.5-2.0 keV range will be unity, within errors. In contrast, if the data is piled up,
there will be clear divergence between the modeled and observed pattern distributions, and the
observed-to-modeled fraction for singles will be less than 1.0, and for doubles, it will
be greater than 1.0.
Finally, when examining the plots, it should be noted that the observed-to-modeled fractions
can be inaccurate. Therefore, the agreement between the modeled and observed single and double
event pattern distributions should be the main factor in determining if an observation is
affected by pile up or not.
The source used in our Lockman Hole example is too faint to provide reasonable
statistics for epatplot and is far from being affected by pile up.
For comparison, an example of a bright source (from a different observation)
which is strongly affected by pile up is shown in Figure 6.5.
Note that the observed-to-model fraction for doubles is over 1.0, and there is
severe divergence between the model and the observed pattern distribution.
If you're working with a different (much brighter) dataset that does show signs of
pile up, there are a few ways to deal with it. First, using the region selection
and event file filtering procedures demonstrated in earlier sections, you can excise
the innermost regions of a source (as they are the most heavily piled up),
re-extract the spectrum, and continue your analysis on the excised event file.
For this procedure, it is recommended that you take an iterative approach: remove
an inner region, extract a spectrum, check with epatplot, and repeat, each time
removing a slightly larger region, until the model and observed distribution functions
agree. If you do this, be aware that removing too small a region with respect to the
instrumental pixel size (1.1'' for the MOS, 4.1'' for the PN) can introduce systematic
inaccuracies when calculating the source flux; these are less than 4%, and decrease
to less than 1% when the excised region is more than 5 times the instrumental pixel
half-size. In any case, be certain that the excised region is larger than the
instrumental pixel size!
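The size guidance above amounts to a simple comparison, sketched below. The function name excision_quality is hypothetical, and treating the "excised region" as a single size in arcseconds is an assumption about how the guidance is meant to be applied.

```python
PIXEL_SIZE_ARCSEC = {"MOS": 1.1, "PN": 4.1}

def excision_quality(instrument, excised_arcsec):
    """Classify an excised region size against the instrumental pixel size."""
    pixel = PIXEL_SIZE_ARCSEC[instrument]
    if excised_arcsec <= pixel:
        return "too small"          # must exceed the pixel size
    if excised_arcsec <= 5 * (pixel / 2):
        return "<4%"                # systematic flux error below 4%
    return "<1%"                    # more than 5 pixel half-sizes

for instrument, size in [("MOS", 1.0), ("MOS", 2.0), ("PN", 11.0)]:
    print(instrument, size, excision_quality(instrument, size))
```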
You can also use the event file filtering procedures to include only single pixel events
(PATTERN==0), as these events are less sensitive to pile up than other patterns.
Now that we are confident that our spectrum is not piled up, we can continue by
finding the source and background region areas. This is done with the task
backscale, which takes into account any bad pixels or chip gaps, and writes
the result into the BACKSCAL keyword of the spectrum table. Alternatively, we
can skip running backscale, and use a keyword in arfgen below. We
will show both options for the curious.
To find the source and background extraction areas explicitly, type
Now that a source spectrum has been extracted, we need to reformat the detector response by making a redistribution matrix file (RMF) and ancillary response file (ARF). To make the RMF:
Now use the RMF, spectrum, and event file to make the ancillary file:
If we had not run backscale, we could set a keyword in arfgen to find the region area:
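The backscale, RMF, and ARF steps above can be sketched together. All file names (mos1_pi.fits, bkg_pi.fits, mos1_filt_time.fits, mos1_rmf.fits, mos1_arf.fits) are hypothetical, and the parameter names, including setbackscale as the arfgen keyword that replaces a separate backscale run, are given from memory of the SAS documentation and should be verified. The code only builds the command lines.

```python
import shlex

def sas_cmd(task, **params):
    """Hypothetical helper: build the argument list 'task key=value ...'."""
    return [task] + [f"{key}={value}" for key, value in params.items()]

events = "mos1_filt_time.fits"

# Option 1: write BACKSCAL into each spectrum explicitly.
backscale_cmds = [
    sas_cmd("backscale", spectrumset=spectrum, badpixlocation=events)
    for spectrum in ("mos1_pi.fits", "bkg_pi.fits")
]

rmf_cmd = sas_cmd("rmfgen", rmfset="mos1_rmf.fits", spectrumset="mos1_pi.fits")

def arf_cmd(set_backscale=False):
    """arfgen call; set_backscale=True is option 2 (no separate backscale run)."""
    params = dict(arfset="mos1_arf.fits", spectrumset="mos1_pi.fits",
                  withrmfset="yes", rmfset="mos1_rmf.fits",
                  withbadpixcorr="yes", badpixlocation=events)
    if set_backscale:
        params["setbackscale"] = "yes"
    return sas_cmd("arfgen", **params)

for cmd in backscale_cmds + [rmf_cmd, arf_cmd(), arf_cmd(set_backscale=True)]:
    print(shlex.join(cmd))
```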
At this point, the spectrum is ready to be analyzed, so skip ahead to prepare the spectrum for fitting (§13).