So, you've received an XMM-Newton EPIC data set. What are you going
to do with it? After checking what the observation consists of
(see § 3.2), you should note
when the observation was taken. If it is a recent observation, it was likely
processed with the most recent calibrations and SAS, and you can immediately
start to analyze the Pipeline Processed data. However, if it is more
than a year old, it was probably processed with older versions of CCF and SAS
prior to archiving, and the pipeline should be rerun to generate event files
with the latest calibrations.
As noted in
Chapter 4, a variety of analysis packages can be used for the
following steps. However, as the SAS was designed for the basic reduction
and analysis of XMM-Newton data (extraction of spatial, spectral, and
temporal data), it will be used here for demonstration purposes.
SAS will be required in any case for the production of detector response
files (RMFs and ARFs) and other observatory-specific requirements,
although for the simple case of on-axis point sources the canned
response files provided by the SOC can be used.
NOTE: For PN observations with very bright sources, out-of-time events can seriously contaminate the image. Out-of-time events occur because the read-out period for the CCDs can be up to several percent of the frame time. Since events that occur during the read-out period can't be distinguished from other events, they are included in the event files but have invalid locations. For observations with bright sources, this can cause bright stripes in the image along the CCD read-out direction.
It is strongly recommended that you keep all reprocessed data in its own
directory! SAS places output files in whichever directory it is in when a task
is called. Throughout this primer, it is assumed that the Pipeline Processed data are
in the PPS directory, the ODF data (with upper case file names, and uncompressed)
are in the directory ODF, the analysis is taking place in the PROC directory, and
the CCF data are in the directory CCF.
If your data are recent, you need only to gunzip the files and prepare the data for processing (see §5). Feel free to skip the discussion on repipelining (§7.1) and proceed to later discussions. In any case, for simplicity, it is recommended that you change the name of the unzipped event file to something easy to type; a MOS1 event list, for example, could be renamed mos1.fits.
Various analysis procedures are demonstrated using the Lockman Hole SV1 dataset,
ObsID 0123700101, which definitely needs to be repipelined. The following procedures
are applicable to all XMM-Newton datasets, so it is not required that you use this
particular dataset; any observation should be sufficient.
If you simply want to have a quick look at your data, the ESKYIM files contain
EPIC sky images in different energy bands whose ranges are listed in
Table 3.3. While the zipped FITS files may need to be unzipped
before display in ds9 (depending on the version of ds9), they can be
displayed when zipped using fv (a FITS file viewer available in the
HEASoft package). In addition, the image of the total band pass for all three EPIC
detectors is also provided in PNG format which can be displayed with a web browser.
Also, the PP source list is provided in both zipped FITS format (readable by
fv) and as an HTML file.
For detailed descriptions of PP data nomenclature, file contents, and which tasks can be used to view them, see Tables 3.2 and 3.3. For detailed descriptions of ODF data nomenclature and file contents, see Table 3.1.
We assume that the data were prepared and environment variables were set
according to §5, the GUI has been invoked (see
§5.3), and we are in our working directory, ``PROC''.
From the upper window of the GUI, select emproc to process the
MOS data, and epproc to process the PN data.
Double-clicking the task will bring up pop-up windows that will allow you
to change the (many) parameters; however, for most cases, the default
settings are fine, so just click "Run".
By default, these tasks do not keep any intermediate files they generate. The emproc task designates its output event files with ``*ImagingEvts.ds''. In any case, you may want to give the new files names that are easy to type; for example, one of the new MOS1 event files output from emproc could be renamed mos1.fits.
Remember that tasks place output files in whatever directory you happened to be in when the SAS GUI was called, so either open and close the GUI in the directory where you want the output or move the files to the directory they should be in.
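The renaming step can also be scripted. The following is a minimal sketch, assuming the emproc output sits in a single directory and that the first file matching ``*ImagingEvts.ds'' is the one wanted. The helper name copy_to_short_name and the default target name are illustrative choices and are not part of SAS.

```python
import shutil
from pathlib import Path


def copy_to_short_name(directory, short_name="mos1.fits",
                       pattern="*ImagingEvts.ds"):
    """Copy the first event file matching `pattern` to `short_name`.

    Hypothetical helper: the pattern follows the emproc naming quoted in
    the text. Which instrument or exposure a match belongs to is NOT
    checked here, so inspect the matches when several files are present.
    """
    directory = Path(directory)
    matches = sorted(directory.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"no file matching {pattern} in {directory}")
    target = directory / short_name
    shutil.copyfile(matches[0], target)  # copy, so the original is kept
    return target
```

Copying rather than moving keeps the original output available in case the short name is later overwritten.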
The task xmmselect is used for many procedures in the GUI. Like all
tasks, it can easily be invoked by starting to type the name and pressing enter
when it is highlighted.
When xmmselect is invoked a dialog box will first appear requesting a
file name. You can either use the browser button or just type the file name
in the entry area, ``mos1.fits:EVENTS'' in this case. To use the browser, select
the file folder icon; this will bring up a second window for the file selection.
Choose the desired event file, then the ``EVENTS'' extension in the right-hand column,
and click ``OK''. The directory window will then disappear and you can click ``Run''
on the selection window.
When the file name has been submitted the xmmselect GUI (see
Figure 7.1) will appear, along with a dialog box offering
to display the selection expression. The selection expression will include the
filtering done to this point on the event file, which for the pipeline processing
includes for the most part CCD and GTI selections.
To create an image in sky coordinates using xmmselect, call xmmselect and load the event file as in §7.2. Then,
Different binnings and other selections can be invoked by accessing the ``Image'' tab at the top of the GUI. The default settings are reasonable, however, for a basic image. The resultant image is written to the file image.fits, and is automatically displayed using ds9 (see Figure 7.3).
The filtering expressions for the MOS and PN are, respectively, (PATTERN <= 12)&&(PI in [200:12000])&&#XMMEA_EM and (PATTERN <= 4)&&(PI in [200:15000])&&#XMMEA_EP.
The first keyword in the expressions, PATTERN, selects good events with PATTERN in the 0 to 12 (or, for the PN, 0 to 4) range. The PATTERN
value is similar to the GRADE selection for ASCA data, and is related to the
number and pattern of the CCD pixels triggered for a given event. The
PATTERN assignments are: single pixel events: PATTERN == 0,
double pixel events: PATTERN in [1:4], triple and quadruple events:
PATTERN in [5:12].
The second keyword in the expressions, PI, selects the preferred pulse
height of the event; for the MOS, this should be between 200 and 12000 eV.
For the PN, this should be between 200 and 15000 eV. This should clean up the
image significantly with most of the rest of the obvious contamination due to low
pulse height events. Setting the lower PI channel limit somewhat higher (e.g.,
to 300 eV) will eliminate much of the rest.
Finally, the #XMMEA_EM (#XMMEA_EP for the PN) filter provides
a canned screening set of FLAG values for the event. (The FLAG value provides a
bit encoding of various event conditions, e.g., near hot pixels or outside of the
field of view.) Setting FLAG == 0 in the selection expression provides the
most conservative screening criteria and should always be used when serious spectral
analysis is to be done on the PN. It typically is not necessary for the MOS.
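The selections described above can be assembled programmatically, which helps avoid typing mistakes in long expressions. The sketch below only builds the expression string from the PATTERN, PI, and FLAG limits quoted in the text; the function name standard_filter and its arguments are hypothetical, and the result still has to be pasted into the selection expression field or passed to a task.

```python
def standard_filter(instrument, pi_min=200, strict_flag=False):
    """Build an EPIC event-selection expression as a string.

    instrument  : "MOS" or "PN"
    pi_min      : lower PI limit in eV (200 by default, 300 is stricter)
    strict_flag : use FLAG == 0 instead of the canned #XMMEA_* screening
    """
    limits = {
        "MOS": {"pattern_max": 12, "pi_max": 12000, "canned": "#XMMEA_EM"},
        "PN": {"pattern_max": 4, "pi_max": 15000, "canned": "#XMMEA_EP"},
    }
    cfg = limits[instrument.upper()]
    flag = "(FLAG == 0)" if strict_flag else cfg["canned"]
    return (
        f"(PATTERN <= {cfg['pattern_max']})"
        f"&&(PI in [{pi_min}:{cfg['pi_max']}])"
        f"&&{flag}"
    )


print(standard_filter("MOS"))
print(standard_filter("PN", pi_min=300, strict_flag=True))
```

The second call illustrates the more conservative PN screening recommended for serious spectral analysis, with the raised lower PI limit.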
It is a good idea to keep the output filtered event files and use them in your
analyses, as opposed to re-filtering the original file with every task. This will
save much time and computer memory. As an example, the Lockman Hole data's original
event file is 48.4 Mb; the fully filtered list (that is, filtered spatially, temporally,
and spectrally) is only 4.0 Mb!
To filter the data using xmmselect,
Sometimes, it is necessary to use filters on time in addition to those mentioned
above. This is because of soft proton background flaring, which can have count
rates of 100 counts/sec or higher across the entire bandpass.
It should be noted that the amount of flaring that needs to be removed depends
in part on the object observed; a faint, extended object will be more affected
than a very bright X-ray source.
To determine if our observation is affected by background flaring, we can make a light curve with xmmselect. Load the event file as shown in § 7.2. Then,
In the fv pop-up window, the RATE extension will be available in the second row (index 1, as numbering begins with 0). Select ``PLOT'' from this row, and select the column name and axis on which to plot it.
Taking a look at the light curve, we can see that there is a very large flare
toward the end of the observation and two much smaller ones in the middle of it.
There are several ways to filter an event file: on TIME, with an explicit reference to the TIME parameter in the filtering expression or by creating a secondary Good Time Interval (GTI) file, or on RATE, which requires making a new GTI file. New GTI files are easily made with the task tabgtigen or gtibuild. These are discussed in detail below. Any of these methods will produce a cleaned file, so which one to use is a matter of the user's preference.
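To make the relationship between these methods concrete, the following sketch derives good time intervals from a binned light curve with a simple rate threshold, then writes them both as gtibuild-style text lines and as a TIME selection expression. It illustrates the idea only and is not how tabgtigen is implemented. It assumes contiguous bins of equal width, and the function names, the sample light curve, and the use of "||" to join intervals are assumptions made for the example.

```python
def good_time_intervals(bin_starts, rates, bin_width, max_rate):
    """Return [(start, stop), ...] covering bins with rate <= max_rate.

    Assumes the bins are contiguous and of equal width (an assumption;
    real light curves can contain gaps).
    """
    intervals = []
    current_start = None
    current_stop = None
    for start, rate in zip(bin_starts, rates):
        if rate <= max_rate:
            if current_start is None:
                current_start = start
            current_stop = start + bin_width
        elif current_start is not None:
            intervals.append((current_start, current_stop))
            current_start = None
    if current_start is not None:
        intervals.append((current_start, current_stop))
    return intervals


def gtibuild_lines(intervals):
    """Format intervals as 'start stop +' lines (kept intervals)."""
    return [f"{start} {stop} +" for start, stop in intervals]


def time_expression(intervals):
    """Format intervals as a TIME selection expression."""
    return "||".join(f"(TIME IN [{a}:{b}])" for a, b in intervals)


# Made-up light curve: one flaring bin in the middle.
starts = [0, 100, 200, 300, 400]
rates = [1.2, 1.4, 50.0, 1.3, 1.1]
gti = good_time_intervals(starts, rates, bin_width=100, max_rate=6)
print(gti)
print(gtibuild_lines(gti))
print(time_expression(gti))
```

Whichever representation is used, the same intervals end up selecting the same events, which is why the choice of method is largely a matter of preference.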
Filter on RATE With tabgtigen
Examining the light curve shows us that during non-flare times, the count rate is quite low, about 1.3 ct/s, with a small increase at 7.3223e7 seconds to about 6 ct/s. We can use that to generate the GTI file by calling tabgtigen from the SAS GUI and loading the light curve's RATE table as the input table. Then,
The new GTI file can be applied with xmmselect. With the mos1_filt.fits event file loaded,
Filter on TIME With tabgtigen
Alternatively, we could have chosen to make a new GTI file by noting the times of the flaring in the light curve and using those as filtering parameters. The big flare starts around 7.32276e7 s, and the smaller ones are at 7.32119e7 s and 7.32205e7 s. The expression to remove these would be (TIME <= 73227600)&&!(TIME IN [7.32118e7:7.3212e7])&&!(TIME IN [7.32204e7:7.32206e7]). The syntax (TIME <= 73227600) includes only events with times up to 73227600 s, and the "!" symbol stands for the logical "not", so use &&!(TIME IN [7.32118e7:7.3212e7]) to exclude events in that time interval. To use these filtering parameters, call tabgtigen from the SAS GUI and load the light curve's RATE table as the input table. Then,
The new GTI file can be applied with xmmselect. With the mos1_filt.fits event file loaded,
Filter on TIME With gtibuild
This task requires a text file as input. In the first two columns, enter the start and end times (in seconds) that you are interested in, and in the third column, indicate with either a + or - sign whether that region should be kept or removed. Each good (or bad) time interval should get its own line. In the example case, we would write three lines in our ASCII file (named gti.txt), ``0 73227600 +'', ``7.32118e7 7.3212e7 -'', and ``7.32204e7 7.32206e7 -'',
and proceed to gtibuild. Invoke the task, then
Filter on TIME by Explicit Reference
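Finally, we could have chosen to forgo using tabgtigen or gtibuild altogether, and simply filtered on TIME together with the standard filtering expression seen in §7.4. In that case, the full filtering expression would be (PATTERN <= 12)&&(PI in [200:12000])&&#XMMEA_EM&&(TIME <= 73227600)&&!(TIME IN [7.32118e7:7.3212e7])&&!(TIME IN [7.32204e7:7.32206e7]).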
This expression can then be used to filter the original event file, or only the times can be used to filter the file that has already had the standard filters applied. To do this, load the filtered event file mos1_filt.fits in xmmselect by going to "File → New Table" at the top of the window. Then,
The edetect_chain task does nearly all the work involved with EPIC source
detection. It can process up to three instruments (both MOS cameras and the PN) with
up to five images in different energy bands simultaneously. All images must have
identical binning and WCS keywords. For this example, we will perform source detection
on MOS1 images in two bands (``soft'' X-rays with energies between 300 and 2000 eV, and
``hard'' X-rays, with energies between 2000 and 10000 eV) using the filtered event file
produced in §7.6.
We will start by generating some files that edetect_chain needs: an attitude file and
images of the sources in the desired energy bands, with the image binning sizes as needed
according to the detector. For the MOS, we'll let the binsize be 22.
First, make the attitude file by calling atthkgen. Then,
Next, make the soft and hard X-ray images. We'll also make an image that includes both bands, for display purposes. Call evselect, then
Follow the same procedure to make the hard X-ray image, changing the output name to
mos1-h.fits and the filtering expression to (FLAG == 0)&&(PI in [2000:10000]).
For our combined band image, we'll set the output name to mos1-all.fits and the
filtering expression to (FLAG == 0)&&(PI in [300:10000]).
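For readers who prefer to script this step, the three band images can be described as evselect argument lists. This is a sketch under stated assumptions: the input file name mos1_filt_time.fits and the soft-band output name mos1-s.fits are chosen by analogy with the names used in this chapter, and the evselect parameter names are given as recalled from the SAS documentation, so they should be checked against the task description before use. The code only builds the argument lists and does not run SAS.

```python
# Output image name -> PI range in eV, following the bands in the text.
BANDS = {
    "mos1-s.fits": (300, 2000),     # soft band (name is an assumption)
    "mos1-h.fits": (2000, 10000),   # hard band
    "mos1-all.fits": (300, 10000),  # combined band, for display
}


def image_command(event_file, image_file, pi_range, binsize=22):
    """Return an evselect argument list for one sky image.

    binsize=22 is the MOS value used in the text; all images passed to
    edetect_chain must share the same binning.
    """
    lo, hi = pi_range
    return [
        "evselect",
        f"table={event_file}",
        "withimageset=yes",
        f"imageset={image_file}",
        "xcolumn=X",
        "ycolumn=Y",
        "imagebinning=binSize",
        f"ximagebinsize={binsize}",
        f"yimagebinsize={binsize}",
        f"expression=(FLAG == 0)&&(PI in [{lo}:{hi}])",
    ]


for name, band in BANDS.items():
    print(" ".join(image_command("mos1_filt_time.fits", name, band)))
```

Keeping the arguments as a list, rather than one shell string, avoids quoting problems with the "&&" and brackets if the list is later handed to subprocess.run inside an initialized SAS session.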
Now we can run edetect_chain. Call the task, and then
The energy conversion factors (ECFs) convert the source count rates into
fluxes. The ECFs for each detector and energy band depend on the pattern selection and filter
used during the observation. For more information, please consult the calibration paper
``SSC-LUX-TN-0059'', available at the XMM-Newton Science Operations Center or see Table 8 in the
3XMM Catalogue User Guide.
Those used here are derived from PIMMS using the flux in the 0.1-10.0 keV band, a source
power-law index of 1.9, an absorption of
We can display the results of eboxdetect using the task srcdisplay and produce a region file for the sources. Call srcdisplay, then
Figure 7.5 shows the MOS1 event file overlaid with the detected sources.
Throughout the following, please keep in mind that some parameters are instrument-dependent.
The parameter specchannelmax should be set to 11999 for the MOS, or 20479 for the PN.
Also, for the PN, the most stringent filters, (FLAG==0)&&(PATTERN<=4), must be included
in the expression to get a high-quality spectrum.
For the MOS, the standard filters should be appropriate for many cases, though there are
some instances where tightening the selection requirements might be needed. For example,
if obtaining the best-possible spectral resolution is critical to your work, and the
corresponding loss of counts is not important, only the single pixel events should be
selected (PATTERN==0). If your observation is of a bright source, you again might want
to select only the single pixel events to mitigate pile up (see §7.9
and §7.10 for a more detailed discussion).
To extract the source spectrum, load the filtered file mos1_filt_time.fits into xmmselect if it isn't already loaded. Then,
The background spectrum can be extracted following the same method, setting the region to an annulus around the source: ((X,Y) in CIRCLE(26188.5,22816.5,1500))&&!((X,Y) in CIRCLE(26188.5,22816.5,500)). We will call the filtered event file bkg_filtered.fits and the output spectrum bkg_pi.fits.
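Region expressions like the annulus above are easy to mistype, so it can help to generate them. The sketch below builds circle and annulus expressions in the same syntax and includes a small geometric check of whether a sky position falls inside the annulus. The function names are hypothetical, and the treatment of points lying exactly on a boundary is an assumption that may differ from how SAS evaluates CIRCLE.

```python
import math


def circle(x, y, r):
    """Selection expression for a circle in sky (X,Y) coordinates."""
    return f"((X,Y) in CIRCLE({x},{y},{r}))"


def annulus(x, y, r_in, r_out):
    """Outer circle with the inner circle excluded."""
    return f"{circle(x, y, r_out)}&&!{circle(x, y, r_in)}"


def in_annulus(px, py, x, y, r_in, r_out):
    """Geometric check only; boundary handling is an assumption."""
    d = math.hypot(px - x, py - y)
    return r_in < d <= r_out


# Background annulus from the text: radii 500 to 1500 around the source.
print(annulus(26188.5, 22816.5, 500, 1500))
print(in_annulus(27188.5, 22816.5, 26188.5, 22816.5, 500, 1500))
```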
Depending on how bright the source is and what modes the EPIC detectors are in, event pile
up may be a problem. Pile up occurs when a source is so bright that incoming X-rays strike
two neighboring pixels or the same pixel in the CCD more than once in a read-out cycle. In
such cases the energies of the two events are in effect added together to form one event.
If this happens sufficiently often, 1) the spectrum will appear to be harder than it actually
is, and 2) the count rate will be underestimated, since multiple events will be undercounted.
To check whether pile up may be a problem, use the SAS task epatplot. Heavily
piled sources will be immediately obvious, as they will have a ``hole'' in the center of
their image, but pile up is not always so conspicuous. Therefore, we recommend always
checking for it.
Note that this procedure requires as input the event file created when the spectrum was
made, not the usual time-filtered event file.
To check for pile up, invoke epatplot. Then,
The output of epatplot is a postscript file, mos1_epat.ps, which may be
viewed with viewers such as gv, containing two graphs describing the distribution of
counts as a function of PI channel, as seen in Figure 7.6.
A few words about interpreting the plots are in order. The top is the distribution of
counts versus PI channel for each pattern class (single, double, triple, quadruple),
and the bottom is the expected pattern distribution (smooth lines) plotted over the
observed distribution (histogram). The lower plot shows the model distributions for
single and double events and the observed distributions. It also gives the ratio of
observed-to-modeled events with 1σ uncertainties for single and double pattern
events over a given energy range. (The default is 0.5-2.0 keV; this can be changed with
the pileupnumberenergyrange parameter.) If the data are not piled up, there will be good
agreement between the modeled and observed single and double event pattern distributions.
Also, the observed-to-modeled fractions for both singles and doubles in the specified energy
range will be unity, within errors. In contrast, if the data are piled up, there will be
clear divergence between the modeled and observed pattern distributions, and the
observed-to-modeled fraction for singles will be less than 1.0, and for doubles, it
will be greater than 1.0.
Finally, when examining the plots, it should be noted that the observed-to-modeled fractions
can be inaccurate. Therefore, the agreement between the modeled and observed single and
double event pattern distributions should be the main factor in determining if an
observation is affected by pile up or not.
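As a secondary aid only, the numerical criterion described above can be written down as a small check on the observed-to-modeled fractions reported by epatplot. This is a sketch: the function name, the labels it returns, and the input values are illustrative, and since the fractions can be inaccurate, the result should never replace a visual comparison of the pattern distributions.

```python
def pileup_hint(singles, singles_err, doubles, doubles_err, n_sigma=1.0):
    """Classify observed-to-modeled fractions from epatplot.

    Pile up is suggested when the singles fraction is below 1.0 and the
    doubles fraction is above 1.0, each by more than n_sigma errors.
    """
    singles_low = singles + n_sigma * singles_err < 1.0
    doubles_high = doubles - n_sigma * doubles_err > 1.0
    if singles_low and doubles_high:
        return "possible pile-up"
    if singles_low or doubles_high:
        return "inconclusive"
    return "consistent with no pile-up"


# Illustrative numbers, not taken from any observation.
print(pileup_hint(0.95, 0.01, 1.10, 0.02))
print(pileup_hint(1.00, 0.02, 0.99, 0.03))
```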
The source used in our Lockman Hole example is too faint to provide reasonable statistics
for epatplot and is far from being affected by pile up. For comparison, an
example of a bright source (from a different observation) which is strongly affected by
pileup is shown in Figure 7.7. Note that the observed-to-model
fraction for doubles is over 1.0, and there is severe divergence between the model and
the observed pattern distribution.
If you are working with a different (much brighter) dataset that does show signs of
pile up, there are a few ways to deal with it. First, using the region selection and
event file filtering procedures demonstrated in earlier sections, you can excise the
inner-most regions of a source (as they are the most heavily piled up), re-extract
the spectrum, and continue your analysis on the excised event file. For this procedure,
it is recommended that you take an iterative approach: remove an inner region, extract
a spectrum, check with epatplot, and repeat, each time removing a slightly
larger region, until the model and observed distribution functions agree. If you do
this, be aware that removing too small a region with respect to the instrumental pixel
size (1.1'' for the MOS, 4.1'' for the PN) can introduce systematic inaccuracies when
calculating the source flux; these are less than 4%, and decrease to less than 1%
when the excised region is more than 5 times the instrumental pixel half-size. In
any case, be certain that the excised region is larger than the instrumental pixel size!
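The size guideline above reduces to simple arithmetic, sketched here using the pixel sizes quoted in the text. The function name is hypothetical, and the text does not say whether the "excised region" size refers to a radius or a diameter, so the returned value should be read as the quoted threshold (5 times the pixel half-size) and nothing more.

```python
# Instrumental pixel sizes in arcseconds, as quoted in the text.
PIXEL_SIZE_ARCSEC = {"MOS": 1.1, "PN": 4.1}


def one_percent_threshold_arcsec(instrument, factor=5.0):
    """Size above which the quoted flux inaccuracy drops below 1%."""
    half_size = PIXEL_SIZE_ARCSEC[instrument] / 2.0
    return factor * half_size


for name in ("MOS", "PN"):
    threshold = one_percent_threshold_arcsec(name)
    # The excised region must in any case exceed one pixel.
    assert threshold > PIXEL_SIZE_ARCSEC[name]
    print(name, round(threshold, 2), "arcsec")
```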
You can also use the event file filtering procedures to include only those events
with PATTERN==0, as these events are less sensitive to pile up than other patterns.
Now that we are confident that our spectrum is not piled up, we can continue by
finding the source and background region areas. This is done with the task backscale,
which takes into account any bad pixels or chip gaps, and writes the result into the
BACKSCAL keyword of the spectrum table. Alternatively, we can skip running backscale,
and use a keyword in arfgen below. We will show both options for the curious.
To find the source extraction area explicitly, call backscale and then
Follow the same steps to find the background spectrum area, changing the input spectrum file to bkg_pi.fits.
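The reason both areas are needed is that the background counts must be rescaled to the source extraction area before subtraction. Spectral fitting packages normally do this automatically from the BACKSCAL keywords; the sketch below shows the arithmetic under the assumptions that both spectra share the same exposure time and that the BACKSCAL values have already been read from the two files. The function name and all numbers are made up for illustration.

```python
def net_source_counts(src_counts, bkg_counts, src_backscal, bkg_backscal):
    """Background-subtracted counts using the BACKSCAL area ratio.

    Assumes equal exposure times for the source and background spectra.
    """
    scale = src_backscal / bkg_backscal  # area ratio, source / background
    return src_counts - scale * bkg_counts


# Illustrative values: the background region is 8 times the source area.
print(net_source_counts(1200, 3000, 1.0e6, 8.0e6))
```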
The following assumes that an appropriate source spectrum, named mos1_pi.fits,
has been extracted as in §7.8.
To make the RMF,
To make the ARF,
At this point, the spectrum is ready to be analyzed, so skip ahead to §13 to prepare the spectrum for fitting.