Next: Practice Up: Fitting with few counts/bin Previous: Fitting with few counts/bin



No background

Cash (ApJ 228, 939) showed that the $\chi^2$ minimization criterion is a poor one if any of the observed data bins has few counts. A better criterion is to use a likelihood function:

\begin{displaymath}C = 2 \sum_{i=1}^N (y(x_i) - y_i \ln y(x_i) + \ln y_i !)\end{displaymath}

where $y_i$ are the observed data and $y(x_i)$ the values of the function. Minimizing C for some model gives the best-fit parameters. Furthermore, this statistic can be used in the same, familiar way as the $\chi^2$ statistic to find confidence intervals. One finds the parameter values that give $C = C_{min} + N$, where N is the same number that gives the required confidence for the number of interesting parameters as for the $\chi^2$ case (for example, 2.706 for 90% confidence on one interesting parameter). This N is not the number of bins in the sum above.
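
The confidence-interval recipe can be illustrated with a minimal Python sketch. It assumes a deliberately simple one-parameter model that predicts the same number of counts `mu` in every bin, for which the minimum of C lies at the mean of the data. The data values, the function names (`cash`, `cash_constant`, `bisect`) and the bracketing limits are illustrative assumptions, not part of XSPEC.

```python
import math

def cash(model, data):
    """Cash statistic C = 2 * sum(m - y*ln(m) + ln(y!)) for Poisson data."""
    return 2.0 * sum(m - y * math.log(m) + math.lgamma(y + 1)
                     for m, y in zip(model, data))

def cash_constant(mu, data):
    """C for a hypothetical one-parameter model: mu counts in every bin."""
    return cash([mu] * len(data), data)

def bisect(func, lo, hi, tol=1e-10):
    """Locate a root of func in [lo, hi]; func must change sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

data = [2, 0, 3, 1, 4, 2, 1, 3]      # made-up counts per bin
mu_best = sum(data) / len(data)      # analytic minimum for a constant model
c_min = cash_constant(mu_best, data)

delta = 1.0                          # 68.3% confidence, one interesting parameter

def target(mu):
    """Zero where C(mu) = C_min + delta."""
    return cash_constant(mu, data) - (c_min + delta)

mu_lo = bisect(target, 1e-6, mu_best)            # lower limit
mu_hi = bisect(target, mu_best, 10.0 * mu_best)  # upper limit
print(mu_best, mu_lo, mu_hi)
```

The two limits are not symmetric about the best fit, which is the expected behaviour of a Poisson likelihood at low counts.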

Castor (priv. comm.) has pointed out that a better function to use is:

\begin{displaymath}C = 2 \sum_{i=1}^N \left[ y(x_i) - y_i + y_i (\ln y_i - \ln y(x_i)) \right]\end{displaymath}

This differs from the first function by a quantity that depends only upon the data, so both functions are minimized by the same parameter values. In the limit of a large number of counts this second function does provide a goodness-of-fit criterion similar to that of $\chi^2$, and it is now used in XSPEC. It is important to note that the C-statistic assumes that the error on the counts is pure Poisson, and thus it cannot deal with data that have already been background subtracted or that have systematic errors.
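
The relation between the two forms can be checked numerically. The following Python sketch evaluates both for two different models and shows that their difference is the same in each case. The function names (`xlogy`, `cash_original`, `cstat`) and the numbers are illustrative assumptions, and the convention that a bin with zero counts contributes no logarithmic term is the usual limit rather than anything stated in the text.

```python
import math

def xlogy(x, y):
    """x * ln(y), with the x = 0 case taken as 0."""
    return 0.0 if x == 0 else x * math.log(y)

def cash_original(model, data):
    """First form: 2 * sum(m - y*ln(m) + ln(y!))."""
    return 2.0 * sum(m - xlogy(y, m) + math.lgamma(y + 1)
                     for m, y in zip(model, data))

def cstat(model, data):
    """Second form: 2 * sum(m - y + y*(ln(y) - ln(m)))."""
    return 2.0 * sum(m - y + xlogy(y, y) - xlogy(y, m)
                     for m, y in zip(model, data))

data = [3, 0, 5, 2, 1]                    # made-up counts per bin
model_a = [2.5, 0.8, 4.0, 2.2, 1.5]       # two arbitrary trial models
model_b = [1.0, 1.0, 6.0, 3.0, 0.5]

# The offset between the two forms does not depend on the model.
offset_a = cash_original(model_a, data) - cstat(model_a, data)
offset_b = cash_original(model_b, data) - cstat(model_b, data)
print(offset_a, offset_b)

# The second form vanishes when the model reproduces the data exactly.
perfect = cstat(data, data)
print(perfect)
```

Because the second form is zero for a perfect fit, its value carries goodness-of-fit information, whereas the value of the first form depends on the arbitrary data-only offset.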

With background

Arnaud (2001, ApJ submitted) has extended the method of Cash to include the case when a background spectrum is also in use. Note that this requires both the source and background spectra to be available; it does not work on a background-subtracted spectrum.

Suppose we have an observation which produces $S_i$ events in the $i=\{1,\ldots,N\}$ spectral bins in an exposure time of $t_s$. This observation includes events from the source of interest along with background events. Further suppose that we perform a background observation which generates $B_i$ events in an exposure time $t_b$. If the model source count rate in bin i is $y_i$ then the new fit statistic is

\begin{displaymath}W = 2 \sum \{ t_s y_i + (t_s + t_b) f_i - S_i \log (t_s y_i + t_s f_i) - B_i \log (t_b f_i) - S_i (1-\log S_i) - B_i (1-\log B_i) \}\end{displaymath}

where

\begin{displaymath}f_i = \frac{S_i + B_i - (t_s+t_b)y_i + d_i}{2(t_s+t_b)}\end{displaymath}

and

\begin{displaymath}d_i = \sqrt{[(t_s+t_b)y_i-S_i-B_i]^2 + 4(t_s+t_b)B_i y_i}\end{displaymath}
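
A minimal Python sketch of this statistic follows. It assumes that the source-observation model in each bin is $t_s y_i + t_s f_i$, consistent with the large-count limit quoted in this section, and that bins with zero counts contribute no logarithmic term. All function names (`background_rate`, `w_bin_at`, `w_bin`, `w_stat`) and all numbers are illustrative assumptions. This is not XSPEC's implementation, which may treat special cases differently.

```python
import math

def xlogy(x, y):
    """x * ln(y), with the x = 0 case taken as 0."""
    return 0.0 if x == 0 else x * math.log(y)

def background_rate(S, B, y, ts, tb):
    """f_i from the closed-form expression, via d_i."""
    T = ts + tb
    d = math.sqrt((T * y - S - B) ** 2 + 4.0 * T * B * y)
    return (S + B - T * y + d) / (2.0 * T)

def w_bin_at(S, B, y, f, ts, tb):
    """One bin's contribution to W for an arbitrary background rate f."""
    return 2.0 * (ts * y + (ts + tb) * f
                  - xlogy(S, ts * y + ts * f) - xlogy(B, tb * f)
                  - (S - xlogy(S, S)) - (B - xlogy(B, B)))

def w_bin(S, B, y, ts, tb):
    """One bin's contribution to W with f_i from the closed form."""
    return w_bin_at(S, B, y, background_rate(S, B, y, ts, tb), ts, tb)

def w_stat(src, bkg, rates, ts, tb):
    """W summed over bins; rates are model source count rates (y_i > 0)."""
    return sum(w_bin(S, B, y, ts, tb) for S, B, y in zip(src, bkg, rates))

# Made-up spectra: 1000 s on source, 2000 s on background.
src = [12, 7, 0, 3]
bkg = [10, 8, 1, 0]
rates = [0.008, 0.004, 0.001, 0.002]
w_total = w_stat(src, bkg, rates, 1000.0, 2000.0)
print(w_total)

# With the closed-form f_i, a small change in f only increases the bin's W.
f_hat = background_rate(12, 10, 0.008, 1000.0, 2000.0)
w_hat = w_bin_at(12, 10, 0.008, f_hat, 1000.0, 2000.0)
print(w_hat, w_bin_at(12, 10, 0.008, 1.01 * f_hat, 1000.0, 2000.0))
```

The last two lines illustrate that the closed-form $f_i$ is the background rate that minimizes the bin's contribution for a given $y_i$, so only the source model parameters remain to be fitted.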

In the limit of large numbers of counts/bin a second-order Taylor expansion shows that W tends to

\begin{displaymath}\sum \left\{ \frac{[S_i - t_s y_i - t_s f_i]^2}{t_s y_i + t_s f_i} + \frac{[B_i - t_b f_i]^2}{t_b f_i} \right\}\end{displaymath}

which is distributed as $\chi^2$ with $(N-M)$ degrees of freedom, where the model $y_i$ has M parameters (including the normalization).
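
The approach to the large-count limit can be seen numerically. The Python sketch below restates the single-bin definitions so that it is self-contained, then compares W with the quadratic form for increasing numbers of counts. It assumes equal exposure times and a source spectrum lying roughly two standard deviations above the model. The function name `w_and_chi2` and all numbers are illustrative assumptions, and the function requires non-zero counts in both spectra.

```python
import math

def w_and_chi2(S, B, y, ts, tb):
    """Single-bin W and its quadratic approximation (needs S > 0, B > 0)."""
    T = ts + tb
    d = math.sqrt((T * y - S - B) ** 2 + 4.0 * T * B * y)
    f = (S + B - T * y + d) / (2.0 * T)
    src_model = ts * y + ts * f      # predicted counts, source observation
    bkg_model = tb * f               # predicted counts, background observation
    w = 2.0 * (src_model + bkg_model
               - S * math.log(src_model) - B * math.log(bkg_model)
               - S * (1.0 - math.log(S)) - B * (1.0 - math.log(B)))
    chi2 = ((S - src_model) ** 2 / src_model
            + (B - bkg_model) ** 2 / bkg_model)
    return w, chi2

rel_diffs = []
for n in (10, 1000, 100000):
    S = 2 * n + round(2.0 * math.sqrt(2 * n))   # about 2 sigma above the model
    B = n
    w, chi2 = w_and_chi2(S, B, float(n), 1.0, 1.0)
    rel_diffs.append(abs(w - chi2) / chi2)
    print(n, w, chi2)
```

The fractional difference between the two values shrinks as the counts per bin grow, which is the behaviour the Taylor expansion predicts.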

Ben Dorman