Subsections


12. Introduction to Spectral Fitting

Before we get to fitting the spectrum, it may be useful to have a short introduction on different approaches to take, especially regarding statistics, rebinning, and optimization. Much has been written on this subject; users are referred to these sources and references therein for more in-depth discussions: Nousek & Shue (1989), Humphrey et al. (2009), Kaastra (2017), and Chapter 7 (Statistics) in the Handbook of X-ray Astronomy.

Traditional thinking holds that for data sets of high signal-to-noise and low background, where counting statistics are within the Gaussian regime, an analysis using the default fitting scheme in Xspec and Sherpa ($\chi ^{2}$-minimization) is appropriate. For low count rates ($<$ 5 counts/bin), in the Poisson regime, $\chi ^{2}$-minimization is no longer suitable, as the error per channel can dominate over the count rate. Since channels are weighted by the inverse-square of the errors during $\chi ^{2}$ model fitting, channels with the lowest count rates are given overly-large weights if the data are in the Poisson regime. This leads to spectral continua that are fit incorrectly, with the model lying below the true continuum level.

The traditional way to increase the signal-to-noise of a data set is to rebin or group the channels, since, if channels are grouped in sufficiently large numbers, the combined signal-to-noise of the groups will jump into the Gaussian regime. However, this results in the loss of information - sharp features like an absorption edge or emission line can be washed out. This is a particular problem for high resolution RGS data; even if count rates are large, much of the flux from these sources can be contained within emission lines, rather than the continuum. Consequently, even obtaining correct equivalent widths for such sources is non-trivial. Further, in the Poisson regime, the background spectrum cannot simply be subtracted, as is commonly done in the Gaussian regime, since this could result in negative counts. Therefore, for analysis of high resolution spectra, using the Cash statistic (cstat) on unbinned spectra, or spectra binned only to spectral resolution, has become standard practice; if an assessment of the background is necessary, the background spectrum is fitted simultaneously. Rebinning and using $\chi ^{2}$ should be reserved for fast, preliminary analysis of spectra without sharp features, or for making pretty plots for publication.

For analysis of (mostly smooth) low resolution spectra, such as a spectrum from the PN or MOS instruments, which are typically in the high count regime, spectra are often rebinned and $\chi ^{2}$ statistics are used. However, investigations into the validity of $\chi ^{2}$ statistics in the high count regime have shown that they can still yield biased results. To illustrate this, we have run sets of simulations of an EPIC PN spectrum (a lightly absorbed power law with $\Gamma $=1.46) using a canned response file. In each simulation, the spectrum was either binned to 20 or 30 counts per bin, or not binned at all, and fitted with $\chi ^{2}$ statistics with the Gehrels variance function and cstat. The simulations were run considering different source fluxes, with 10$^{4}$, 5$\times$10$^{4}$, 10$^{5}$, 5$\times$10$^{5}$, and 10$^{6}$ counts per spectrum over the range 0.3-15.0 keV. Each simulation was run 1,000 times. The resulting best-fit values of $\Gamma $ were then compared to the true value. Plots of the results are in Figure 12.1.

Figure 12.1: Comparison of the numbers of best-fit $\Gamma $ values versus the true $\Gamma $. The 1 $\sigma $ values indicate the typical measurement uncertainty for each fit. In each image, the top, middle, and bottom plots were the results of fitting to data that were binned to 30 counts/bin, 20 counts/bin, and unbinned, respectively. In all images, the red line indicates the true $\Gamma $, and the Gehrels $\chi ^{2}$ and cstat fits are blue and green histograms, respectively.

\includegraphics[scale=0.20]{bin30bin20nobin.001.eps} \includegraphics[scale=0.20]{bin30bin20nobin.002.eps} \includegraphics[scale=0.20]{bin30bin20nobin.003.eps} \includegraphics[scale=0.20]{bin30bin20nobin.004.eps} \includegraphics[scale=0.20]{bin30bin20nobin.005.eps}

It can be seen that the Gehrels $\chi ^{2}$ fits tended to find best-fit values of $\Gamma $ that were 2 or 3 $\sigma $ (or more!) higher than the true value, even in spectra with high fluxes. This is particularly true for the fits to spectra that were rebinned to only 20 counts/bin. In contrast, cstat was able to recover $\Gamma $ with much better accuracy in both the high and low count regimes, regardless of binning. The difference in the accuracy of the results obtained with $\chi ^{2}$ and cstat diminished only when the number of counts in the spectrum were very high ($\ge$ 1 million). It is therefore recommended that users examine their data carefully before deciding which statistic and binning, if any, is appropriate for their spectra.

The choice of which optimization method to use is also worth thinking about. The optimization method is what is used to find the minimum or maximum of a function, and it, too, can affect the accuracy of the fit. This is because a fitting function can have one or more local minima, and optimization methods can become stuck in a local minimum rather than finding the global minimum. A commonly used optimization method is Levenberg-Marquardt; this is the default of many software packages. It has the benefit of being very fast, but this comes at the price of being susceptible to getting pulled into local minima, especially if the initial values of the fitting parameters were far from their real values. Levenberg-Marquardt can also fail to converge if the model is complex. Nelder-Mead is more robust, being able to handle complex models better than Levenberg-Marquardt and being less susceptible to getting trapped in local minima, but it takes longer to run. Monte Carlo can also handle complex models well, and it samples the broadest parameter space, so it too will correctly find the global minimum. However, it can take a long time to run, especially if the model is complex. Therefore, users might want to use a combination of these methods to ensure that their fit is robust, especially if they are not sure that their parameters' initial values are reasonable.

If fitting RGS data with XSpec, users should be sure to run v11.1.0 or later. This is because RGS spectrum files have prompted a slight modification to the OGIP standard, since the RGS spatial extraction mask has a spatial-width which is a varying function of wavelength. Thus, it has become necessary to characterize the BACKSCL and AREASCL parameters as vectors (i.e., one number for each wavelength channel), rather than scalar keywords as they are for data from the EPIC cameras and past missions. These quantities map the size of the source extraction region to the size of the background extraction region and are essential for accurate fits. Only Xspec v11.1.0, or later versions, are capable of reading these vectors.

Finally, a caveat of using the Cash statistic in Xspec is that the scheme requires a “total” and “background” spectrum to be loaded into Xspec. This is in order to calculate parameter errors correctly. Consequently, be sure not to use the “net” spectra that were created as part of product packages by SAS v5.2 or earlier.


12.1 Fitting an EPIC Spectrum in Xspec

We will do some simple spectral fitting, using as an example the source and background that was extracted from the Lockman Hole observation (Obs ID 0123700101) in §7 and its RMF and ARF files. All tasks will be called from the command line.

For this example, we are interested in just a quick fit to see what we are dealing with, so we will rebin the data. The ftool procedure grppha provides a good mechanism to do that, though there is a caveat. grppha can group channels using an algorithm which bins up consecutive channels until a count rate threshold is reached. This method conserves the resolution in emission lines above the threshold while improving statistics in the continuum. However, while channel errors are propagated through the binning process correctly, the errors column in the original spectrum product is not strictly accurate. The problem arises because there is no good way to treat the errors within channels containing no counts. To allow statistical fitting, these channels are arbitrarily given an error value of unity, which is subsequently propagated through the binning. Consequently, the errors are overestimated in the resulting spectra.

The following commands not only group the source spectrum for Xspec but also associate the appropriate background and response files for the source. Users who are working in an environment where grppha is not supported should be aware that the SAS task specgroup7.11) can also be used for this purpose.

This is done simply by calling the task and editting parameters as is appropriate:

   > ftools
   > grppha 
      Please enter PHA filename[] mos1_pi.fits      ! input spectrum file name
      Please enter output filename[] mos1_grp30.fits  ! output grouped spectrum
      GRPPHA[] chkey BACKFILE bkg_pi.fits          ! include the background spectrum
      GRPPHA[] chkey RESPFILE mos1_rmf.fits         ! include the RMF
      GRPPHA[] chkey ANCRFILE mos1_arf.fits         ! include the ARF
      GRPPHA[] group min 30                         ! group the data by 30 counts/bin
      GRPPHA[] exit

Next, use Xspec to fit the spectrum, editting the parameters as needed. We'll use a power law as a first try.

      XSPEC> data mos1_grp30.fits    ! input data
      XSPEC> ignore 0.0-0.2,6.6-**   ! set a range appropriate for the data, in keV
      XSPEC> model pow               ! set spectral model to a power law
      1:powerlaw:PhoIndex> 2.0       ! set the power law photon index to 2.0
      2:powerlaw:norm>               ! default model normalization
      XSPEC> renorm                  ! renormalize the model spectrum
      XSPEC> fit                     ! fit the model to the data
      XSPEC> setplot device /xw      ! set the plot device
      XSPEC> setplot energy          ! plot energy along the X axis
      XSPEC> plot ldata ratio        ! plot two panels with the log of the data and
                                     !     the data/model ratio values along the Y axes
      XSPEC> exit                    ! exit Xspec
      Do you really want to exit? (y) y

The spectrum is shown in Figure 12.2.

Figure 12.2: An EPIC MOS spectrum fitted with Xspec.

\includegraphics[scale=0.5]{mos1-lh-source-spec-fit.ps}


12.2 Fitting an RGS Spectrum in Sherpa

We will use for our example the first order spectra from RGS1 and RGS2 of Capella (ObsID 0134720401) that we reprocessed in §10. While they are piled up, they serve our purpose here of simply demonstrating how to fit two RGS spectra and their associated backgrounds simultaneously.

First, lets set the BACKFILE and RESPFILE keywords in the header to the associated files (recall that with spectra from RGS, unlike EPIC, there are no ARFs to consider). SAS and Sherpa do not play well together, so it is best to do this in a new window, or at least in a window where we have not invoked SAS.

 
   > ciao
   > dmhedit P0134720401R1S007SRSPEC1001.FIT filelist=none operation=add key=BACKFILE \ 
        value=P0134720401R1S007BGSPEC1001.FIT datatype=indef
   > dmhedit P0134720401R1S007SRSPEC1001.FIT filelist=none operation=add key=RESPFILE \ 
        value=P0134720401R1S007RSPMAT1001.FIT datatype=indef
   > dmhedit P0134720401R2S008SRSPEC1001.FIT filelist=none operation=add key=BACKFILE \ 
        value=P0134720401R2S008BGSPEC1001.FIT datatype=indef
   > dmhedit P0134720401R2S008SRSPEC1001.FIT filelist=none operation=add key=RESPFILE \ 
        value=P0134720401R2S008RSPMAT1001.FIT datatype=indef


12.3 Fitting a Model

Now we can invoke Sherpa and fit our data. We will use the Monte Carlo optimization method, so it will take several minutes to run. Also note that, prior to determining the uncertainties, we will change the statistic. This is because, by default, Sherpa does not read in the error bars when it is loading data (though this can be changed). Instead, it calculates the error bars for each bin using the statistics that are applied during a session. Cstat, however, does not generate error bars - only flavors of the $\chi ^{2}$ statistic do, so if we want to compare a fit that was found with cstat to the data and have reasonably accurate error bars, we need to switch the statistic to some type of $\chi ^{2}$. Comments below that end with a “\” are continued on the next line.

   > set_method("moncar")            # we'll use the Monte Carlo optimization method 
   > set_stat("cstat")               # use cstat 
   > set_xsabund("wilm")             # set the elemental abundances to the interstellar \ 
                                     #    values of Wilms et al. (2000)
   > set_xsxsect("vern")             # set the atomic cross sections to those of Verner \ 
                                     #    et al. (1996)
   > load_pha(1,"P0134720401R1S007SRSPEC1001.FIT")   # read the RGS1 data
   > load_pha(2,"P0134720401R2S008SRSPEC1001.FIT")   # read the RGS2 data 
   > ignore(":0.6,1.2:")             # ignore data with E < 0.6 keV or E > 1.2 keV
   > plot_data(1)                    # plot the first data set
   > plot_data(2,overplot=True)      # overplot the second data set on the first
   > set_source(1,xsapec.a1)         # set the model for the RGS1 data
   > set_source(2,xsapec.a1)         # ... and do the same for RGS2
   > set_bkg_model(1,xsapec.ba1)     # set the background model for the RGS1 data
   > set_bkg_model(2,xsapec.ba1)     # ... and do the same for RGS2
   > fit()                           # sherpa will automatically fit the spectra \ 
                                     #    simultaneously

The fit results and statistics are

   Datasets              = 1, 2
   Method                = moncar
   Statistic             = cstat
   Initial fit statistic = 22749.2
   Final fit statistic   = 9440.23 at function evaluation 5205
   Data points           = 4128
   Degrees of freedom    = 4124
   Probability [Q-value] = 0
   Reduced statistic     = 2.2891
   Change in statistic   = 13309
      a1.kT          0.629616
      a1.norm        0.0422649
      ba1.kT         0.754037
      ba1.norm       0.00124926

We can easily find the flux and put error bars on our best-fit parameters.

   > calc_energy_flux(0.6,1.2)       # find the source flux between 0.6-1.2 keV in \ 
                                     #    erg/cm^2/s 
   > calc_energy_flux(0.6,1.2,bkg_id=1)  # and now the background flux 
   > set_stat("chi2gehrels")         # lets use Gehrel's chi^2 to find the error bars
   > covar()                         # the models are simple, so "covar" is fine. If we were \
                                     #    using more complex models, "conf" would be a better \
                                     #    choice.
   > set_analysis("wave")            # lets see what this looks like in wavelength space
   > plot_fit(1)
   > plot_fit(2,overplot=True)
   > exit()

We find that between 0.6 and 1.2 keV, the source and background have fluxes of 6.74e-11 erg/cm$^2$/s and 1.96e-12 erg/cm$^2$/s, respectively, and the best fit parameters are:

   Param            Best-Fit  Lower Bound  Upper Bound
   -----            --------  -----------  -----------
   a1.kT            0.629616  -0.00158166   0.00158166
   a1.norm         0.0422649 -0.000142045  0.000142045
   ba1.kT           0.754037    -0.017492     0.017492
   ba1.norm       0.00124926 -4.73433e-05  4.73433e-05

The source data are in Figure 12.3 (top). The fits are overplotted in Figure 12.3 (bottom) and shown in wavelength space.

Figure 12.3: Top: The 1st order RGS1 (blue) and RGS2 (orange) spectra in energy space. Bottom: The data fitted with a collisionally-ionized diffuse plasma model, shown in wavelength space. RGS1 data and fit are blue and orange, respectively, and RGS2 data and fit are green and red, respectively. The gap between 10.6-13.8 Å in RGS1 is due to the absence of CCD7. RGS2 has a gap between 20-24 Å due to the absence of CCD4.

\includegraphics[scale=0.15]{capella_data_energyspace.eps} \includegraphics[scale=0.15]{capella_fit_wavespace.eps}