XMMSTACKOB - XMM-Newton Serendipitous Source Catalog: 5XMM-DR15 Stacked Observations Data
The 5XMM-DR15 catalog contains source detections drawn from a total of 14,616 XMM-Newton EPIC observations made between 2000 January 19 and 2024 October 14. All datasets included were publicly available by 2024 October 31, but not all public observations are included in this catalog because some have very poor signal-to-noise or processing issues. The net area of the catalog fields, taking account of the substantial overlaps between observations, is ~1,397 deg^2.
5XMM-DR15 contains 818,656 unique X-ray sources and 2,578,752 X-ray detections or upper limits above the processing likelihood threshold (column STACK_DET_ML) of 6. About half of all sources (411,307) have more than one detection in the catalog (up to 98 repeat observations in the most extreme case).
The catalog distinguishes between extended emission and point-like detections. Parameters of detections of extended sources are only reliable up to the maximum extent measure of 80 arcseconds. There are 42,669 detections of extended emission, only about half of the number in 4XMM-DR14, but about twice the number of 'clean' extended sources in 4XMM-DR14. This reflects the improved signal-to-noise of 5XMM-DR15 as a stacked catalog, which allows extended sources to be identified more reliably. Indeed, more than 60% of the extended sources in 5XMM-DR15 (25,845) are identified as clean (SUM_FLAG < 3).
Due to intrinsic features of the instrumentation as well as some shortcomings of the source detection process, some detections are considered to be spurious or their parameters are considered to be unreliable. It is recommended to use SUM_FLAG as a filter to obtain what can be considered a 'clean' sample. Of the 818,656 sources, 764,140 are considered to be clean (i.e., SUM_FLAG < 3).
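The recommended filter can be sketched as follows. This is only an illustration: the sample rows and the helper name select_clean are hypothetical, and only the SUM_FLAG and SRCID column names and the threshold of 3 come from the description above.

```python
# Hypothetical rows standing in for catalog records; only the column
# names SRCID and SUM_FLAG are taken from the catalog description.
rows = [
    {"SRCID": 1, "SUM_FLAG": 0},
    {"SRCID": 2, "SUM_FLAG": 3},
    {"SRCID": 3, "SUM_FLAG": 1},
    {"SRCID": 4, "SUM_FLAG": 4},
]


def select_clean(rows, max_flag=3):
    """Keep rows whose summary flag is below max_flag (SUM_FLAG < 3 is 'clean')."""
    return [row for row in rows if row["SUM_FLAG"] < max_flag]


clean = select_clean(rows)
print([row["SRCID"] for row in clean])
```

The same comparison can be applied as a query constraint when the table is searched through a database interface.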
EPIC time series were automatically extracted during processing for 408,694 detections and EPIC spectra for 408,901 detections, and a chi^2 variability test was applied to the time series. This is a significant increase since 4XMM-DR14, as these products are now extracted for detections with 50 EPIC counts, whereas 100 EPIC counts were previously required. In total, 12,330 detections in the catalog are considered variable within the timespan of the specific observation, at a probability of 10^-5 or less based on the null hypothesis that the source is constant. Of these, 10,907 have SUM_FLAG < 3. On long timescales, 41,187 sources vary by a factor of five or greater.
The median flux (in the total photon-energy band 0.2 - 12 keV) of the catalog detections is ~1.3 x 10^-14 erg/cm^2/s; in the soft energy band (0.2 - 2 keV) the median flux is ~2.9 x 10^-15 erg/cm^2/s, and in the hard band (2 - 12 keV) it is ~7.6 x 10^-15 erg/cm^2/s. The flux values from the three EPIC cameras are, overall, in agreement to ~10% for most energy bands. The median positional accuracy of the catalog point source detections is generally < 1.52 arcseconds (with a standard deviation of 1.41 arcseconds).
To maintain a structure similar to that used before the 5XMM release, the HEASARC now provides the sources-only subset of 5XMM-DR15 as XMMSSC, which contains one row for each of the 818,656 unique sources. XMMSTACK is the complete version of the catalog: it has one row per source, followed by subsequent rows with information stemming from each detection or upper limit, and thus has 3,397,248 rows. Both tables have the same number of columns (421). This table, XMMSTACKOB, contains key details about each stack used in the construction of the XMMSTACK catalog.
The energy bands used in the 5XMM-DR15 processing were the same as for the 3XMM and 4XMM catalogs.
The following are the basic energy bands:
1 = 0.2 - 0.5 keV
2 = 0.5 - 1.0 keV
3 = 1.0 - 2.0 keV
4 = 2.0 - 4.5 keV
5 = 4.5 - 12.0 keV
while these are the broad energy bands:
6 = 0.2 - 2.0 keV soft band, no images made
7 = 2.0 - 12.0 keV hard band, no images made
8 = 0.2 - 12.0 keV total band
The 5XMM-DR15 catalog is about 18% larger than the 4XMM-DR14 catalog in terms of sources (126,547 more sources) and contains almost twice as many sources as the previous stacked catalog, 4XMM-DR14s, because observations with no overlap were not considered in previous versions of the stacked catalog. In terms of the number of X-ray sources, it is 88% of the size of the eROSITA DR1 catalog, which covers half of the sky (Merloni et al. 2024), and it contains more than twice the number of sources and detections in the Chandra Source Catalog version 2.1 (Evans et al. 2010). 5XMM-DR15 complements deeper Chandra and XMM-Newton small-area surveys, probing a large sky area at the flux limit where the bulk of the objects that contribute to the X-ray background lie. The 5XMM-DR15 catalog provides a rich resource for generating large, well-defined samples for specific studies, utilizing the fact that X-ray selection is a highly efficient (arguably the most efficient) way of selecting certain types of object, notably active galaxies (AGN), clusters of galaxies, interacting compact binaries and active stellar coronae. The large sky area covered by the serendipitous survey, or equivalently the large size of the catalog, also means that 5XMM-DR15 is a superb resource for exploring the variety of the X-ray source population and identifying rare source types.
The production of the 5XMM-DR15 has been undertaken by the XMM-Newton SSC and XMM2ATHENA consortia in collaboration with the XMM-Newton Science Operations Center in fulfillment of one of its major responsibilities within the XMM-Newton project. The catalog production process has been designed to fully exploit the capabilities of the XMM-Newton EPIC cameras and to ensure the integrity and quality of the resultant catalog through rigorous screening of the data.
5XMM-DR15 is based on pipeline configuration 21.51. This pipeline version contains many changes with respect to the pipeline used to make the previous major version of the catalog, 4XMM. The main changes to the EPIC processing include: an empirical correction to the MOS effective area to align it with the pn effective area, along with a correction to the pn effective area above 3.0 keV to align EPIC pn to NuSTAR, which has the advantage of carrying out calibration without the mirror module and is therefore more accurate; an update to the CCD layout in the LINCOORD current calibration file (CCF) to align the source positions with the pn camera source positions (see Webb, Traulsen et al. 2026); the introduction of an Energy Conversion Factor (ECF) that evolves with time for the MOS cameras; the extraction of spectra and lightcurves for each detection with more than 50 EPIC counts (previously 100 EPIC counts were required); and new source detection techniques developed for stacked source detection, which involve first fitting the position, extent and common flux and spectral parameters to each detection using the ECF in spectral fitting before maximum likelihood fitting and determining the final source parameters and extracting the variability information through point spread function (PSF) photometry for each detection. More information on these changes can be found in Webb, Traulsen et al. (2026), currently available as a draft version.
As in previous versions of the stacked catalog, the SRCID is not propagated from previous versions.
The following is the preferred citation of the 5XMM version of the catalog:
Webb, Traulsen et al. (2026), "The XMM-Newton serendipitous survey. XI. The fifth
XMM-Newton serendipitous source catalogue", in prep.
The following is the preferred citation of the 4XMM version of the catalog:
Webb et al. (2020), "The XMM-Newton serendipitous survey. IX. The fourth
XMM-Newton serendipitous source catalogue", <A&A, 641, A136 (2020)>
=2020A&A...641A.136W
Traulsen et al. (2020), "The XMM-Newton serendipitous survey. X: The
second source catalogue from overlapping XMM-Newton observations and its
long-term variable content", <A&A, 641, A137 (2020)>
=2020A&A...641A.137T
The following is the preferred citation of the 3XMM-DR8 version of the catalog:
Rosen, Webb, Watson et al. (2016), "The XMM-Newton Serendipitous Survey.
VII. The Third XMM-Newton Serendipitous Source Catalogue", A&A, 590, A1.
Should you use this catalog for your research and publish the results, the
authors request that you use the following acknowledgment:
"This research has made use of data obtained from the 5XMM serendipitous source catalog compiled by the XMM-Newton Survey Science Center, the XMM2ATHENA project and in collaboration with the XMM-Newton SOC."
The previous versions of the Serendipitous Source Catalog, 3XMM-DR5, 3XMM-DR6, 3XMM-DR7, 3XMM-DR8, 4XMM-DR9, 4XMM-DR10, 4XMM-DR11, 4XMM-DR12, 4XMM-DR13, and 4XMM-DR14 are also available in the same directory for comparison purposes as the files 3XMM_DR5cat_v1.0.fits.gz, 3XMM_DR6cat_v1.0.fits.gz, 3XMM_DR7cat_v1.0.fits.gz, 3XMM_DR8_cat_v1.0.fits.gz, 4XMM_DR9_cat_v1.0.fits.gz, 4XMM_DR10cat_v1.0.fits.gz, 4XMM_DR11cat_v1.0.fits.gz, 4XMM_DR12cat_v1.0.fits.gz, 4XMM_DR13cat_v1.0.fits.gz, and 4XMM_DR14cat_v1.0.fits.gz, respectively.
(2) For 5XMM-DR15, warm pixels (pixels that can become hot during a limited period of time) were identified after the source detection on stacked data. Affected detections are marked as T (true) in the 10th detector flag (PN_FLAG, M1_FLAG or M2_FLAG), which is then propagated to SUM_FLAG to indicate a possibly spurious detection/source. From DR16 onwards, this step will be carried out before the stacked source detection.
Overview: The catalog contains source detections drawn from 14,616 XMM-Newton EPIC observations made between 2000 January 19 and 2024 October 14 and which were publicly available by 2024 November 30. Net exposure times in these observations range from < 1000 up to 2.7 million seconds. Figure 5.1 of the User Guide shows the distribution of fields on the sky.
The sky area of the catalog observations corrected for field overlaps is ~1,397 deg^2.
The catalog contains 2,578,752 X-ray detections or upper limits with total-band (0.2 - 12 keV) likelihood values >= 6. These are detections of 818,656 unique X-ray sources, of which 411,307 have multiple detections in separate observations (up to 98 observations). Of the 818,656 X-ray sources, 42,699 are classified as extended with 25,845 of these being in regions considered to be 'clean' (SUM_FLAG < 1).
Data Quality: As part of extensive quality evaluation for the catalog, each field has been visually screened. Regions where there were obvious deficiencies with the automatic source detection and parametrization process were identified and all detections within those regions were flagged (cf. 2XMM UG, Sec. 3.2.6 at http://xmmssc.irap.omp.eu/2XMM/UserGuide_xmmcat.html#CatVisScreen but importantly, note Section 3.11 at http://xmmssc.irap.omp.eu/3XMM-DR4/UserGuide_xmmcat.html#VisScreen). Such flagged detections include clearly spurious detections (many of which are classified as extended) as well as detections where the source parameters may be unreliable. For most uses of the catalog it is recommended to use SUM_FLAG as a filter to obtain what can be considered a 'clean' sample.
Note that no attempt is made to flag spurious detections arising from statistical fluctuations in the background.
Sensitivity and Photometry: Figure 5.2 presents, for each of the three cameras, the distributions of flux for energy bands 1 to 5 and also for the combined (EPIC) data. These give an indication of the limiting flux available in the catalogs for each of the bands.
Astrometry: Considerable improvements have been made to the astrometry for 5XMM; these are detailed in Webb, Traulsen et al. (2026), currently available as a draft version.
Products: Spectral Energy Distributions, Auxiliary stack data and source images are also available. More information about these data products is available from the User Guide.
It should be pointed out that the SAS used for the bulk reprocessing (for 5XMM) was from manifest pipeline version 21.51, which is based on SAS 21. A description of the columns and possible cross-references follows below.
The following table gives an overview of the statistics of this catalog in comparison with 4XMM-DR14:
                                              5XMM-DR15               4XMM-DR14               Increment
Number of observations                        14,616                  13,864                  752
Observing interval                            19-Jan-00 - 14-Oct-24   03-Feb-00 - 31-Nov-23   11 months
Sky coverage, taking overlaps into account    1,397 sq. deg           1,383 sq. deg           14 sq. deg
  (>= 1 ksec exposure)
Number of unique sources                      818,656                 692,109                 126,547
Number of detections/upper limits             2,578,752               1,035,832               1,542,920
Number of 'clean' sources                     764,140                 585,899                 179,241
  (i.e., summary flag < 3)
Number of 'clean' extended sources            25,845                  22,147                  3,698
  (summary flag < 3)
Number of detections with spectra             408,901                 372,603                 36,298
Number of detections with timeseries          408,694                 372,313                 36,381
Number of detections where the probability    12,330                  8,380                   3,950
  of timeseries being constant is < 1.0E-05
Institut de Recherche en Astrophysique et Planetologie, Toulouse, France
Leibniz-Institut für Astrophysik, Potsdam (AIP), Germany
Observatoire Astronomique de Strasbourg, France
Département d'Astrophysique, CEA/DRF/IRFU, Saclay, France
Instituto de Física de Cantabria, Santander, Spain
University of Leicester, UK
Mullard Space Science Laboratory, University College London, UK
Max-Planck-Institut für extraterrestrische Physik, Germany
IAASARS, National Observatory of Athens, I. Metaxa & V. Pavlou, 15236, Greece
The SSC team are grateful to the XMM-Newton SOC for their support in the catalog production activities.
This project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement number 101004168, the XMM2ATHENA project.
The SSC acknowledges the use of the TOPCAT and STILTS software packages (written by Mark Taylor, University of Bristol) in the production and testing of the 5XMM-DR15 catalog.
Data Processing: Data processing for the 5XMM-DR15 catalog was based on SAS version 21 and carried out with pipeline version 21.51 and the latest set of current calibration files at the time of processing (November and December 2024). This new version includes a number of improvements compared to previous versions. The EPIC (and RGS) effective areas were improved using an empirical correction from MOS to pn (and RGS to pn), as well as a further correction above 3 keV to align the pn with the NuSTAR spectral fits. A correction was also included to update the MOS CCD positions to improve the astrometry. The main data processing steps used to produce the 5XMM data products were similar to those outlined in Webb et al. (2020), Rosen et al. (2016) and Watson et al. (2009) and described on the SOC web pages. For all the 5XMM data, the observation data files were processed to produce calibrated event lists. The optimized background time intervals were identified and used to generate the filtered exposures (taking into account exposure time, instrument mode, etc.), multi-energy-band X-ray images, and exposure maps. The initial detections were made on single observations, simultaneously using all images in bands one to five from the three cameras when available, see Table 1. A detection mask was made for each camera that defines the area of the detector which is suitable for source detection. An initial source list was made using a 'box detection' algorithm, which slides a search box (20" x 20") across the image area defined by the detection mask. The probability, and corresponding likelihood, were computed from the null hypothesis that the measured counts in the search box result from a Poissonian fluctuation in the estimated background level. Sources were cut out using a radius that depended on the source brightness in each band, and these areas of the image where sources had been detected were blanked out.
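The null-hypothesis test used in the box detection step can be illustrated with a short sketch. It assumes the conventional definition of the detection likelihood as L = -ln(P), where P is the probability that a Poissonian background fluctuation produces at least the measured number of counts. The function names and example numbers are illustrative and are not taken from the pipeline.

```python
import math


def poisson_tail_probability(counts, background):
    """P(N >= counts) for a Poisson distribution with mean `background`."""
    if counts <= 0:
        return 1.0
    term = math.exp(-background)      # P(N = 0)
    cumulative = term
    for k in range(1, counts):
        term *= background / k        # P(N = k) from P(N = k - 1)
        cumulative += term
    # Direct summation loses precision for extremely small probabilities.
    return max(1.0 - cumulative, 0.0)


def detection_likelihood(counts, background):
    """Likelihood L = -ln(P) of the null (background-only) hypothesis."""
    p = poisson_tail_probability(counts, background)
    return float("inf") if p == 0.0 else -math.log(p)


# Example: 12 counts in the search box over an expected background of 3.
print(detection_likelihood(12, 3.0))
```

A larger likelihood means that the counts in the box are less likely to be a background fluctuation.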
The source-excised images, normalized by the exposure maps, and the corresponding masks were convolved with a Gaussian kernel to create the background map (Traulsen et al. 2019). A second box-source-detection pass was then carried out, creating a new source list, this time using the background maps ('map mode'), which increased the source detection sensitivity compared to the first pass. The box size was again set to 20" x 20". A maximum likelihood fitting procedure was then applied to the sources to calculate source parameters in each input image, by fitting a model to the distribution of counts over a circular area of radius 60" (Watson et al. 2009). In total, 1.19 million detections were made before the stacking procedure. These detections were then used to produce detection-level spectra and lightcurves if more than 50 EPIC counts per detection were detected, where previously 100 EPIC counts were required. This resulted in almost 409,000 spectra and lightcurves, an increase of 10% with respect to 4XMM-DR14. For the catalog of sources (5XMM-DR15), the exposures were stacked and source detection was carried out by first fitting the position, extent and common flux and spectral parameters to each detection using the ECF in spectral fitting before maximum likelihood fitting and determining the final source parameters and extracting the variability information through point spread function (PSF) photometry for each detection. Automatic and visual screening procedures were carried out to check for any problems in the data products.
Stacking and source detection: Source detection on XMM-Newton EPIC observations uses maximum-likelihood fits under Cash statistics, as described for example by Watson et al. (2009) and Traulsen et al. (2019). With 5XMM, the authors introduced a revised approach to stacked source detection in order to handle all 5XMM data from single observations to 99 directly overlapping observations. During the source-detection step, the authors assumed that the flux of each source remains constant over all exposures and that its spectrum in the five standard energy bands can be described by a simple model. The authors chose an absorbed power law as the spectral model, which is a reasonable approximation to most XMM-Newton sources (Watson et al. 2009). Under these assumptions, the equations of the maximum-likelihood detection take the same form and the same degrees of freedom irrespective of the number of exposures in which a source is fitted. The degrees of freedom are the source coordinates, the mean source flux, and the spectral parameters (column density and power-law index) if the source is fitted as point-like, and additionally the radius of the extent model if the source is fitted as extended. The results of the five-band spectral fit in the detection step are given in the catalog in the columns with prefix "STACK_".
The photon flux is related to the measured count rates in each input image to source detection by energy conversion factors (ECFs). In the new XMM-Newton source detection, the ECFs for each fitted pair of spectral parameters, for each fitted detector position, and for each instrumental setup (EPIC/pn, MOS1, MOS2 with their respective filters) are extrapolated on the fly over a grid of pre-compiled values. They cover column densities between 10^19 and 10^23 cm^-2 and power-law indices between 0 and 5. Time-dependence of the EPIC instrumental cross-calibration is taken into account over six different epochs.
Once a source is reliably detected with a log-likelihood STACK_DET_ML >= 6, the assumptions of constant flux and power-law spectrum are dropped, and image-level count rates and related parameters are determined by forced PSF photometry at the detected source position and extent radius. During PSF photometry, the count rate in each contributing image is treated as a free fit parameter; this is the method used for source detection in the previous Serendipitous Source Catalogs from EPIC data. The photometry results are given in the catalog in the RATES, FLUX, DET_ML and related columns without the prefix STACK.
For each fit parameter, the lower and upper confidence limits are calculated by searching for the parameter values at which the minimum Cash statistic value plus one is reached. For an efficient and robust search, the source-detection task emldetect employs the so-called false-position method, which is a numerical bracketing approach. If the calculation of an error component does not converge, this component is now set to undefined in all cases. Previously, a count-rate dependent fall-back value was used for coordinates, extent, and count rates. The total 1-sigma error on a parameter is the arithmetic mean of the lower and the upper error if both are defined. If one component does not converge, the other component is taken as the total error. In addition to the total errors, 5XMM also includes the asymmetric upper and lower errors on the image coordinates, the extent radius, and the spectral fit parameters STACK_FLUX, STACK_NH, and STACK_GAMMA.
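The error search described above can be illustrated with a toy problem. The sketch below is not the emldetect implementation: it applies a plain false-position search to the Cash statistic of a single Poisson measurement, with hypothetical function names and bracket widths, and then combines the two error components following the rule stated in the text.

```python
import math


def cash(mu, counts):
    """Cash statistic for one Poisson measurement with model mean mu."""
    return 2.0 * (mu - counts * math.log(mu))


def false_position(func, lo, hi, tol=1e-9, max_iter=200):
    """Root of func bracketed by [lo, hi]; None if it does not converge."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        return None                   # no sign change inside the bracket
    for _ in range(max_iter):
        # Zero crossing of the straight line through the bracket ends.
        mid = hi - f_hi * (hi - lo) / (f_hi - f_lo)
        f_mid = func(mid)
        if abs(f_mid) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
    return None


def total_error(lower_err, upper_err):
    """Mean of both components, or the defined one if the other is undefined."""
    if lower_err is not None and upper_err is not None:
        return 0.5 * (lower_err + upper_err)
    return lower_err if lower_err is not None else upper_err


counts = 100
best = float(counts)                  # the Cash minimum lies at mu = counts


def target(mu):
    """Zero where the Cash statistic equals its minimum plus one."""
    return cash(mu, counts) - (cash(best, counts) + 1.0)


upper = false_position(target, best, best + 50.0)
lower = false_position(target, best - 50.0, best)
upper_err = None if upper is None else upper - best
lower_err = None if lower is None else best - lower
print(lower_err, upper_err, total_error(lower_err, upper_err))
```

For 100 counts the two components come out slightly asymmetric around the Gaussian expectation of 10, which is why the catalog lists asymmetric errors in addition to the total error.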
Systematic position error: The systematic uncertainty of the 5XMM-DR15 astrometry is estimated using a statistical approach based on the cross-matching of the X-ray sources with an external catalog with accurate positions. The adopted methodology is similar to that described in Section 6.2 of Merloni et al. (2024). It is assumed that the X-ray positional errors are symmetric in the direction of the right ascension and declination and are described by the normal distribution. Under these assumptions the probability of a radial offset, r, of an X-ray source from its true position is given by the Rayleigh distribution with parameter sigma that represents the astrometric standard deviation in the right ascension or declination direction. The authors assumed that sigma has a statistical (sigmastat) and a systematic (sigmasys) component:
sigma = sqrt(sigmastat^2 + sigmasys^2)    (1)
with sigmastat = RADEC_ERR/sqrt(2), i.e. the statistical error is
approximated by the RADEC_ERR parameter estimated by the detection chain.
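As a small numerical illustration of these definitions, the sketch below combines the statistical and systematic components in quadrature and evaluates the Rayleigh distribution of radial offsets. The function names and the example input values are illustrative only.

```python
import math


def total_sigma(radec_err, sigma_sys):
    """Per-axis positional sigma from RADEC_ERR and a systematic term."""
    sigma_stat = radec_err / math.sqrt(2.0)
    return math.hypot(sigma_stat, sigma_sys)


def rayleigh_pdf(r, sigma):
    """Probability density of a radial offset r."""
    return (r / sigma**2) * math.exp(-r**2 / (2.0 * sigma**2))


def rayleigh_cdf(r, sigma):
    """Probability that the radial offset is smaller than r."""
    return 1.0 - math.exp(-r**2 / (2.0 * sigma**2))


# Example with illustrative values in arcseconds.
sigma = total_sigma(radec_err=1.0, sigma_sys=0.5)
print(sigma, rayleigh_cdf(2.0, sigma))
```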
The systematic uncertainty is inferred at the population level by modeling
the angular separation distribution between the positions of X-ray sources
and an external catalog with vanishing astrometric errors. The linear part
at large angular distances represents chance alignments and a pronounced peak
at small separations corresponds to true associations. Modeling the observed
number of pairs at a given angular separation can constrain the fraction of
X-ray sources with true associations in the external catalog, the sky
density of the external catalog, the X-ray source positional uncertainty
and hence sigmasys^2.
The total number of X-ray versus external catalog pairs in a given angular separation bin theta is a Poisson variate with expectation value Lambda(theta) = N_rand(theta) + N_assoc(theta). The likelihood of the model can therefore be expressed as the product of the Poisson probabilities at each angular separation bin, see Webb, Traulsen et al. (2026).
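A simplified version of this likelihood can be sketched as follows. The parametrization is an assumption made for illustration: chance alignments are taken as the external-catalog sky density times the annulus area, true associations follow a Rayleigh distribution with a single sigma for all sources, and all names and numbers are hypothetical. The actual model is described in Webb, Traulsen et al. (2026).

```python
import math


def expected_pairs(theta_lo, theta_hi, n_xray, density, frac_assoc, sigma):
    """Expected pair count Lambda in one angular-separation bin."""
    annulus_area = math.pi * (theta_hi**2 - theta_lo**2)
    n_rand = n_xray * density * annulus_area      # chance alignments

    def cdf(r):
        return 1.0 - math.exp(-r**2 / (2.0 * sigma**2))

    n_assoc = frac_assoc * n_xray * (cdf(theta_hi) - cdf(theta_lo))
    return n_rand + n_assoc


def log_likelihood(observed, bin_edges, n_xray, density, frac_assoc, sigma):
    """Sum of Poisson log-probabilities over all separation bins."""
    total = 0.0
    for k, lo, hi in zip(observed, bin_edges[:-1], bin_edges[1:]):
        lam = expected_pairs(lo, hi, n_xray, density, frac_assoc, sigma)
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


# Hypothetical pair counts in four 1-arcsecond bins.
bin_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [200, 245, 78, 27]
for sigma in (1.0, 3.0):
    print(sigma, log_likelihood(observed, bin_edges, 1000, 0.001, 0.5, sigma))
```

Maximizing this quantity over the association fraction, the sky density and sigma is what constrains the systematic term at the population level.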
The modeling assumes that the positional uncertainty of an X-ray source is given by Equation 1 and that the systematic uncertainty is the same for all sources. Although sigmasys depends on the number of X-ray photon counts of a source, this dependence is weak and therefore assuming a single catalog-wide value for this parameter is an acceptable approximation, see Webb, Traulsen et al. (2026). The external astrometric catalog used was the catalog of quasars from Gaia and unWISE data (Shu et al. 2019). The authors only considered Gaia/unWISE sources with a probability of being a quasar >0.8 and g-band magnitude <20.5 mag. The latter criterion is adopted to minimize variations in the sky density of quasar candidates caused by the variable depth of the Gaia survey, which results from the scanning law of the mission. The authors limited the 5XMM catalog to sources with emldetect detection likelihood EP_DET_ML>15 (to increase the purity of the sample), that are not spatially extended (parameter EXTENT=0), are not close to CCD gaps or the edges of the field of view (PN_MASKFRAC>0.9 or M1_MASKFRAC>0.9 or M2_MASKFRAC>0.9), have quality flags that do not indicate issues during the detection (SUM_FLAG=0) and lie outside the Galactic plane (Galactic latitude >30 degrees). For this sample, the authors inferred sigmasys = 0.88 +/- 0.01 arcsec.
This sigmasys is larger than the one derived for 4XMM-DR10s by Traulsen et al. (2020), because the 5XMM data were not rectified astrometrically when producing DR15. The astrometric correction will be included in DR16 to further improve the source positions.
Long term variability: 5XMM-DR15 is a stacked catalog, containing all of the sources detected following the stacking of overlapping observations, but also includes the individual detections in each of the contributing observations and non-detections when no detection was made. This provides the user with long-term XMM-Newton variability over the 25 years of data used to construct the catalog. These data can be visualized in lightcurves produced for each source.
However, to increase the timeframe over which variability can be examined and to increase the number of data points for each source, the STONKS algorithm was implemented (Quintin et al. 2024). This algorithm uses a master catalog constructed from data from a variety of different observatories. To create the 5XMM-DR15 catalog, this master catalog was generated in February 2026 using the most recent versions of the XMM-Newton catalog (4XMM-DR14, Webb et al. 2020), the Chandra Source Catalog version 2.1 (Evans et al. 2010), the Living Swift/XRT point-source catalog (Evans et al. 2023), the eROSITA eRASS1 catalog (Merloni et al. 2024), the XMM-Newton Slew survey catalog version 3, XMMSL3 (Saxton et al. 2008), and the two ROSAT catalogs, 2RXS (Boller et al. 2016) and WGACAT (White et al. 1994). The authors also generated upper limits for the XMM-Newton non-detections using the RapidXMM (Ruiz et al. 2022) version of HILIGT (Saxton et al. 2022). Matching was done on a two-by-two basis using an algorithm based on Budavari & Szalay (2008) and implemented in NWAY (Salvato et al. 2018); see Quintin et al. (2024) for the details of the algorithm. The authors ensured that the estimated fluxes were comparable by converting each flux to a single, common energy band. The common band chosen was the 0.1-12 keV band, as it contains the energy bands of every one of the missions used, and the conversion assumed an absorbed power-law spectrum with parameters Gamma = 1.7 and NH = 3 x 10^20 cm^-2.
The pessimistic variability ratio was calculated as the ratio of the highest flux point minus its 1-sigma error to the lowest flux point plus its 1-sigma error. Alternatively, in the case of an upper limit, the ratio is calculated using the highest flux point minus its 1-sigma error and the 3-sigma upper limit. The authors report significant long-term variability for sources that have a ratio of five or greater. The variability is calculated for the detections made with the standard pipeline before stacking, and is provided in the column 'APPROX_SOURCE_VAR'. There are 41,187 detections with a variability ratio of five or greater, with the highest reaching a ratio of 78,000. The mean variability is a factor of 80.
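The pessimistic ratio can be sketched in a few lines. This is one reading of the description above, with a hypothetical function name: the numerator is the highest flux minus its error, and the denominator is the smaller of the lowest flux plus its error and any 3-sigma upper limits supplied.

```python
def pessimistic_variability(fluxes, errors, upper_limits=()):
    """Conservative max/min flux ratio from detections and upper limits."""
    # Highest flux point, lowered by its own 1-sigma error.
    i_max = max(range(len(fluxes)), key=lambda i: fluxes[i])
    highest = fluxes[i_max] - errors[i_max]
    # Lowest flux point, raised by its own 1-sigma error.
    i_min = min(range(len(fluxes)), key=lambda i: fluxes[i])
    lowest = fluxes[i_min] + errors[i_min]
    # Assumption: a 3-sigma upper limit replaces the lowest point if deeper.
    if upper_limits:
        lowest = min(lowest, min(upper_limits))
    return highest / lowest


# Illustrative fluxes in erg/cm^2/s.
fluxes = [1.0e-13, 1.0e-12]
errors = [1.0e-14, 1.0e-13]
print(pessimistic_variability(fluxes, errors))
print(pessimistic_variability(fluxes, errors, upper_limits=[5.0e-14]))
```

With these numbers the second call exceeds the threshold of five by a wider margin, because the upper limit lies below the faintest detection.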
Spectral fitting: The procedure to select, merge and analyze the 5XMM spectra is similar to that of Viitanen et al. (2025), with some changes in the procedure used to merge the spectra and in the output quality flags. Standard PPS processing of individual observations includes spectral extraction for detections with more than 50 EPIC counts, with a corresponding background spectrum. For each of the stacked sources, the associated individual detections are checked for extracted spectra. Each spectrum is checked for a strictly positive number of total counts (in the extracted detection spectrum, including source and background), background counts (in the extracted background spectrum), and net counts (calculated by subtracting the background counts from the total counts, after scaling by the relative extraction areas), in the 0.2-12 keV band. If any of these conditions is not fulfilled, the spectrum is discarded from further processing.
The selected spectra are then separated by instrument (pn or MOS). Spectra from each instrument are merged. The procedure to decide which spectra to merge for each source has been simplified with respect to that in Viitanen et al. (2025). The spectra are sorted in decreasing signal-to-noise ratio (defined as the net counts divided by twice the total counts minus the re-scaled background counts) and the cumulative signal-to-noise ratio of the spectra with higher or equal signal-to-noise ratio than the one under consideration is calculated. Spectra are only merged down to the point at which the maximum cumulative signal-to-noise ratio is reached. The merging is done using the SAS task epicspeccombine. The merged total and background spectra for each instrument are then re-binned to have one or more counts per bin. Spectral fitting and modeling were done with XSPEC (Arnaud 1996) through the Python interface together with the Bayesian X-ray Analysis (BXA) tool (Buchner et al. 2014), which connects XSPEC to the nested-sampling package UltraNest (Buchner et al. 2021). The spectral models were implemented in XSPEC and explored with BXA. To speed up the Bayesian fitting, the authors first performed a quick Levenberg-Marquardt fit to obtain approximate best-fit values, and then reduced the prior volume explored by UltraNest by centering the prior bounds on those preliminary estimates. For the power-law model used here, the prior range for NH was set to the preliminary best-fit value, clipped to the interval [0.001,10.0] (in units of 10^22 cm^-2), the prior range for Gamma was set to the preliminary best-fit value +/- 1.0, clipped to [1.0,3.0], and the prior range for log10(Flux) was set to the preliminary best-fit value +/- 2, clipped to [-15,-9]. This substantially reduced the prior volume explored by UltraNest and accelerated convergence. The posterior probability distributions were then obtained from the BXA/UltraNest sampling and stored as chains for further summary and export.
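The construction of the reduced prior ranges can be sketched as below. The helper name is hypothetical, and clamping the preliminary estimate into the hard range before building the interval is an assumption made here so that the result is always a valid interval. The half-widths and hard limits for Gamma and log10(Flux) are those quoted above.

```python
def clipped_prior(best_fit, half_width, hard_lo, hard_hi):
    """Interval best_fit +/- half_width, clipped to [hard_lo, hard_hi]."""
    # Assumption: an estimate outside the hard range is moved to its edge.
    centre = min(max(best_fit, hard_lo), hard_hi)
    lo = max(centre - half_width, hard_lo)
    hi = min(centre + half_width, hard_hi)
    return lo, hi


# Preliminary best-fit values are illustrative.
gamma_range = clipped_prior(2.6, 1.0, 1.0, 3.0)
logflux_range = clipped_prior(-14.2, 2.0, -15.0, -9.0)
print(gamma_range, logflux_range)
```

Both example intervals are truncated on one side by the hard limits, which is the situation in which the pegging flags described further below become relevant.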
The merged spectra were fitted with an absorbed power law (in XSPEC notation cflux * phabs * zpowerlw). This model has three free spectral parameters: the flux (FLUX in the catalog, observed flux not corrected for absorption), the column density (NH, not constrained by the column density of our Galaxy in the direction of the source) and the spectral slope (GAMMA). In addition, when pn and MOS spectra are fitted jointly, the authors included an inter-instrument normalization parameter (IIN), implemented as a multiplicative constant, with the pn normalization fixed to unity and the MOS normalization left free. Future versions of the catalog will include spectral fitting with other models as well. All spectral fitting was performed using the Cash statistic (Cash 1979), which is appropriate for Poisson-distributed data and particularly effective in the low-count regime. The Cash statistic does not provide a goodness-of-fit (GoF) indicator, so the GoF was estimated separately. As a first step, the merged background spectra were fitted with an empirical, camera-specific background model. The GoF of this fit was determined by re-binning the background spectra to have at least 20 counts per bin and then comparing the chi^2 of the best-fit Cash model to the expected value for an equivalent number of degrees of freedom (see Viitanen et al. 2025). Fits with probabilities p<0.01 were discarded and excluded from further analysis. The second step, for the merged spectra whose corresponding background spectra passed the previous filter, was fitting the pn and MOS spectra using a combined source+background model, in which all background-shape parameters were fixed to the best-fit values obtained in the initial background-only fit, leaving only the background normalization free to vary. This approach ensures consistency and mitigates overfitting. The method of Buchner et al. (2014) was used to estimate the GoF of the source+background fits. The p-values were estimated using a permutation test.
For each source, the authors generated 1,000 resampled datasets by randomly redistributing the combined data+model counts into two equal-size subsamples, allowing each energy-bin count to originate from either the observed or modeled spectrum. For each resampling, the authors computed the corresponding Kolmogorov-Smirnov (KS) statistic. The p-value was then defined as the fraction of permutations yielding a KS statistic larger than that of the original data-model comparison. Models with KS p-values >=0.01 were considered acceptable fits.
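The permutation test above can be sketched in a few lines of Python. This is an illustrative interpretation rather than the authors' implementation: it assumes the spectra are available as per-bin counts, reads "redistributing" as independently swapping the observed and modeled count in each energy bin with probability 0.5, and takes the KS statistic as the maximum distance between the normalized cumulative count distributions. The function names are hypothetical.

```python
import random
from itertools import accumulate


def ks_statistic(a, b):
    """Maximum distance between the normalized cumulative counts of a and b."""
    total_a, total_b = sum(a), sum(b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both spectra need a positive total")
    cdf_a = [c / total_a for c in accumulate(a)]
    cdf_b = [c / total_b for c in accumulate(b)]
    return max(abs(x - y) for x, y in zip(cdf_a, cdf_b))


def permutation_ks_pvalue(data, model, n_perm=1000, seed=0):
    """Fraction of resamplings whose KS statistic exceeds the observed one.

    Assumes the model predicts a positive count in every bin, so that no
    resampled spectrum can end up empty.
    """
    rng = random.Random(seed)
    observed = ks_statistic(data, model)
    larger = 0
    for _ in range(n_perm):
        first, second = [], []
        for d, m in zip(data, model):
            # Each bin count may originate from either spectrum.
            if rng.random() < 0.5:
                d, m = m, d
            first.append(d)
            second.append(m)
        if ks_statistic(first, second) > observed:
            larger += 1
    return larger / n_perm


# Toy spectra: a model that badly misses the data, and one that matches well.
p_bad = permutation_ks_pvalue([100, 0, 0, 0], [25, 25, 25, 25])
p_ok = permutation_ks_pvalue([24, 26, 25, 25], [25, 25, 25, 25])
print(p_bad, p_ok, p_bad >= 0.01, p_ok >= 0.01)
```

With the acceptance rule quoted above, a fit is kept when the returned fraction is at least 0.01.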
Catalog values provided include the median and the 5th and 95th percentiles, degrees of freedom and the KS GoF p-value (PVALUE) of the source+background fit. The INFO parameter links to the list of spectra included in the merged spectrum (designated by their observation identification, OBS_ID, and source number, SRCNUM). A flag (FLAG) is provided, with the following possible values:
0 : no issues detected
1 : zero or negative source counts
2 : zero or negative source counts (also implies <=0 net counts)
3 : could not create merged spectrum
4 : source+background or background fit failed
5 : poor goodness-of-fit, with KS p-value <0.01 (or pre-fit p-value <0.01 if KS is unavailable)
6 : photon index pegged, with PhoIndex median within 0.05 of the hard prior limits (<= 1.05 or >= 2.95, for priors in the range [1.0,3.0])
7 : poor goodness-of-fit and photon index pegged
8 : NH pegged, with median NH >= 9.5 (near the upper cap of 10.0, in units of 10^22 cm^-2)
9 : poor goodness-of-fit and NH pegged
10 : photon index pegged and NH pegged
11 : poor goodness-of-fit, photon index pegged, and NH pegged
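Values 5 to 11 encode the possible combinations of three independent conditions (poor goodness-of-fit, pegged photon index, pegged NH). As a rough illustration, the sketch below maps those conditions to the flag values using the thresholds listed above. The function and constant names are hypothetical, and values 1 to 4 are not covered because they describe processing failures that cannot be derived from fit results.

```python
P_MIN = 0.01                         # fits with p-value below this are poor
GAMMA_LOW, GAMMA_HIGH = 1.05, 2.95   # within 0.05 of the [1.0, 3.0] prior limits
NH_HIGH = 9.5                        # near the upper cap of 10.0 (10^22 cm^-2)

# Flag value for each combination of conditions, as in the list above.
FLAG_BY_CONDITIONS = {
    frozenset(): 0,
    frozenset({"gof"}): 5,
    frozenset({"gamma"}): 6,
    frozenset({"gof", "gamma"}): 7,
    frozenset({"nh"}): 8,
    frozenset({"gof", "nh"}): 9,
    frozenset({"gamma", "nh"}): 10,
    frozenset({"gof", "gamma", "nh"}): 11,
}


def fit_quality_flag(pvalue, gamma_median, nh_median):
    """Return the FLAG value (0 or 5 to 11) implied by a completed fit."""
    conditions = set()
    if pvalue < P_MIN:
        conditions.add("gof")
    if gamma_median <= GAMMA_LOW or gamma_median >= GAMMA_HIGH:
        conditions.add("gamma")
    if nh_median >= NH_HIGH:
        conditions.add("nh")
    return FLAG_BY_CONDITIONS[frozenset(conditions)]


print(fit_quality_flag(0.5, 1.9, 0.1))     # no condition met
print(fit_quality_flag(0.001, 3.0, 10.0))  # all three conditions met
```

Because the values are not a bitmask (for example, 7 is not 5 plus 6 in a bitwise sense), a lookup table is a safer way to interpret them than bit arithmetic.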
Optical monitor products: The XMM-OM observes the sky simultaneously with the X-ray instruments onboard XMM-Newton. For 5XMM, XMM-OM counterparts to X-ray sources are drawn from version 6.2 of the XMM-Newton Serendipitous Ultraviolet Source Survey (XMM-SUSS) catalog (Page et al. 2012). The XMM-SUSS is compiled from images obtained through the six primary photometric filters of XMM-OM, which have effective wavelengths from 2120 Angstrom (UVW2) to 5430 Angstrom (V). XMM-SUSS 6.2 includes sources detected in any of the six optical and ultraviolet photometric filters.
For XMM-OM counterparts, 5XMM contains the corresponding source ID in XMM-SUSS 6.2, the match probability to the X-ray source (using an NWAY-like algorithm, Salvato et al. 2018) and the following information for each XMM-OM passband in which the counterpart is detected: AB magnitude and magnitude uncertainty, a quality flag, an extended flag, a chi2 value and the degrees of freedom for which it is calculated. The AB magnitude provided for each band is a weighted mean of the measurements over all XMM-Newton observations in which the source is detected, and the uncertainty is the corresponding uncertainty on that mean. The quality flag is an integer equivalent to a binary number in which each bit corresponds to a different quality issue; a bit is set to 1 when a data quality concern is identified and to 0 otherwise. Sources with the highest quality in the corresponding photometric band will thus have a value of 0.
The meanings of the quality flags (columns OM_filter_QUALITY_FLAG) in 5XMM are as follows:
bit 0 (value 1) source on a bad pixel
bit 1 (value 2) source on a readout streak
bit 2 (value 4) source on a smoke-ring
bit 3 (value 8) source on a diffraction spike
bit 4 (value 16) source affected by Mod-8 pattern
bit 5 (value 32) source within the central enhancement
bit 6 (value 64) source near a bright source
bit 7 (value 128) source near the edge
bit 8 (value 256) point source within an extended source
bit 9 (value 512) weird source (bright pixel)
bit 10 (value 1,024) multiple exposure values within photometry aperture
bit 11 (value 2,048) the source is affected by the reduced sensitivity patch [**]
bit 12 (value 4,096) the source is too bright (rate > 0.97 c/frame)
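Since the quality flag is a sum of the bit values above, it can be unpacked with bitwise tests. The following is a small sketch of such a decoder; the function name decode_om_quality is hypothetical and the descriptions are copied from the list above.

```python
OM_QUALITY_BITS = [
    "source on a bad pixel",                                    # bit 0
    "source on a readout streak",                               # bit 1
    "source on a smoke-ring",                                   # bit 2
    "source on a diffraction spike",                            # bit 3
    "source affected by Mod-8 pattern",                         # bit 4
    "source within the central enhancement",                    # bit 5
    "source near a bright source",                              # bit 6
    "source near the edge",                                     # bit 7
    "point source within an extended source",                   # bit 8
    "weird source (bright pixel)",                              # bit 9
    "multiple exposure values within photometry aperture",      # bit 10
    "the source is affected by the reduced sensitivity patch",  # bit 11
    "the source is too bright (rate > 0.97 c/frame)",           # bit 12
]


def decode_om_quality(flag):
    """Return the list of quality issues encoded in an integer flag."""
    if flag < 0 or flag >= 1 << len(OM_QUALITY_BITS):
        raise ValueError("flag outside the range of defined bits")
    return [text for bit, text in enumerate(OM_QUALITY_BITS)
            if flag & (1 << bit)]


# 66 = 2 + 64: on a readout streak and near a bright source.
print(decode_om_quality(66))
```

A value of 0 decodes to an empty list, which corresponds to the highest-quality photometry in that band.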
The extended flag is set to 0 if the counterpart is consistent with the shape of a point source in the corresponding band or 1 if the source appears extended. Note that it is possible for a source to appear point-like in one passband but have measurable extent in another. Where a source has been detected in multiple XMM-Newton observations in the same XMM-OM passband a chi2 value is computed for the sequence of magnitude measurements compared to a single, constant magnitude. The corresponding degrees of freedom is one less than the number of measurements in that band. The chi2 divided by the degrees of freedom can be used as an indicator as to whether there is evidence for variability between observations in that XMM-OM passband. Where variability is suggested by the chi2, the individual measurements can be consulted in XMM-SUSS 6.2. Caution is advised in inferring variability from chi2 when the corresponding quality flag is other than 0, or for sources which appear point-like in some XMM-Newton observations and extended in others, because the photometry is measured differently for extended and point-like sources (see Page et al. 2012). XMM-OM counterparts have been classified probabilistically into Galactic and extragalactic source types and this classification information is included in 5XMM; see Section Classification for more details.
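The chi2 test for a constant magnitude described above can be illustrated as follows. This is a sketch under the assumption that the constant magnitude is the inverse-variance weighted mean of the measurements; the catalog's exact computation may differ, and the function name is hypothetical.

```python
def constant_magnitude_chi2(mags, errs):
    """Return (chi2, dof) for magnitudes compared to their weighted mean."""
    if len(mags) != len(errs) or len(mags) < 2:
        raise ValueError("need at least two magnitudes with matching errors")
    # Assumption: inverse-variance weights define the constant magnitude.
    weights = [1.0 / e ** 2 for e in errs]
    mean = sum(w * m for w, m in zip(weights, mags)) / sum(weights)
    chi2 = sum(((m - mean) / e) ** 2 for m, e in zip(mags, errs))
    dof = len(mags) - 1  # one less than the number of measurements
    return chi2, dof


# Illustrative measurements, not catalog values.
chi2, dof = constant_magnitude_chi2([20.0, 20.2, 20.1], [0.1, 0.1, 0.1])
print(chi2 / dof)
```

A reduced chi2 well above 1 hints at variability between observations, subject to the caveats on quality flags and extent noted above.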
Redshifts: Photometric redshifts (photo-z) were calculated for all 5XMM sources classified as AGN (see Section Classification) and outside the Galactic plane (|b| > 20 degrees), using the optical and NIR-MIR photometry provided in their SEDs (Section Products). The authors used two different algorithms for estimating redshifts. The first is TPZ (Carrasco Kind & Brunner, 2013), applied with the training sample described below. MLZ-TPZ is a machine learning algorithm based on a supervised technique with prediction trees and random forest. The photometric redshifts and the corresponding cross-validation of the results were obtained through the photo-z pipeline developed for 5XMM, which includes a k-fold cross-validation method for evaluating the accuracy and reliability of the method, the selection of the optimal feature set for photo-z calculations using a Recursive Feature Elimination algorithm, and the quality evaluation of the individual photo-z using the shape of the redshift probability distribution given by TPZ. The second is LePhare (Arnouts et al. 1999, Ilbert et al. 2006), an SED template fitting code that is well adapted to finding high-redshift sources that would otherwise be missed by TPZ, since the results of machine learning methods are limited to the redshift range of the corresponding training sample (redshifts below 3.5 in this case). Moreover, LePhare allows redshifts to be estimated for sources with only partial photometry in the optical or infrared bands. The authors used two different sets of templates for LePhare, depending on the optical morphological classification of the sources. For extended objects the authors used the templates proposed by Salvato et al. 2009, 2011 for the COSMOS survey. For point-like objects the authors used the eFEDS templates (Salvato et al. 2022).
The authors compiled a large spectroscopic sample of X-ray selected extragalactic sources that can be used for the training and cross-validation of the machine learning and template fitting algorithms the authors used for calculating photometric redshifts. The training sample was selected from the second version of the Millions of Optical-Radio/X-ray Associations (MORX) Catalog (Flesch 2024). The authors selected sources with secure spectroscopic redshifts, with an X-ray counterpart and classified as AGN or galaxies. The authors defined four different subsamples based on the photometry available in these large area optical surveys: SDSS sample (~55,000 sources), PanSTARRS sample (~47,000 sources), SkyMapper sample (~6,000 sources), and DES sample (~14,000 sources). Ancillary photometry in the near- (from the 2MASS, UKIDSS and VHS surveys) and mid-infrared (AllWISE catalog) was included if available.
This training sample is an order of magnitude larger than those previously used in similar efforts for estimating photometric redshifts for X-ray sources using machine learning techniques (e.g. Mountrichas et al. 2017, Ruiz et al. 2018). A variety of validation techniques for the photometric redshifts were carried out (see Webb, Traulsen et al. 2026 for more details). The spectroscopic redshifts are provided in the 5XMM catalog under the column REDSHIFT_ZSP. 5XMM-DR15 provides 31,831 spectroscopic redshifts and 154,734 photometric redshifts (REDSHIFT_TPZ_Z_BEST and REDSHIFT_LPH_Z_BEST, along with the confidence limits on these redshifts and a link to further information on distance determination per source).
Classification: Both the X-ray sources and the optical/ultraviolet sources in XMM-SUSS 6.2 have been classified using an adapted version of the Naive Bayes classifier CLAXBOI (Tranin et al. 2022). For the X-ray sources this algorithm uses the XMM-Newton X-ray properties of each source, such as the hardness ratios and spectral fits (with a power law, but also an APEC model), along with the X-ray to r-band flux ratio and the X-ray to W1 infra-red band ratio when these complementary data are available, the maximum X-ray variability, the X-ray luminosity when the distance is known from Gaia or the Glade+ catalog (Dálya et al. 2022), and the distance to the center of the galaxy in the case of extra-galactic sources associated with a galaxy. For the X-ray sources, the most likely classifications are given in the column CLASSX_CLASS. These are AGN, star, Galactic X-ray binary, cataclysmic variable, background AGN, extra-galactic X-ray binary and extended source. The authors also provided an outlier classification, for the case when none of the above source types matches the source. Seven further columns provide the probability attributed to each classification. This allows the user to make an informed decision about the reliability of the classification. For the X-ray sources, there are 556,337 AGN, 119,661 stars, 26,100 Galactic X-ray binaries, 1,276 cataclysmic variables, 49,969 background AGN, 22,732 extra-galactic X-ray binaries and 42,581 extended sources. The higher the outlier value (maximum 10), the more likely it is that the source does not fit any of the designated categories, but see also Tranin et al. (2022). Of the 49,404 sources with an outlier value greater than five, 3,943 have the best classification as AGN, 17,834 as a star, 15,554 as a Galactic X-ray binary, 3,657 as an extra-galactic X-ray binary and 2,394 as an extended source. This could imply that these are extreme types of each classification, or indeed, different objects.
For the XMM-OM sources only three source classes were retained: quasars (QSO), galaxies and stars. The most probable classification is given in the 'CLASSOPT_CLASS' column. Again, the probability attributed to each of the three classes for each source is provided in the three subsequent columns. A total of 201,536 sources have an optical classification. There are 66,306 QSO, 29,971 galaxies and 105,259 stars. Of the 20,564 sources with both an X-ray classification of star and an XMM-OM classification, all are classified as stars, implying that the classification is reliable. The other classifications are more complicated to compare, as an AGN does not necessarily have the same definition as a quasar. However, of the X-ray sources classified as an AGN and with an XMM-OM classification, two-thirds are classified as a quasar.
Hot areas in the detector plane: Warm pixels on a CCD (at a few counts per exposure) are too faint to be detected as such by the automatic processing, but can either push faint sources above detection level, or create spurious sources when combined with statistical fluctuations. This is an intrinsically random process, not visible over a short period of time, but which creates hot areas when projecting all sources detected over 24 years onto the detector plane.
The authors addressed this by projecting all sources onto CCD coordinates PN/M1/M2_RAWX/Y, keeping only sources above the detection threshold with the current instrument alone. In that way, the authors could distinguish hot areas coming from different instruments. The authors proceeded to detect hot pixels or columns in each CCD, using a similar method to the SAS task embadpixfind. For more information see Webb et al. (2020). Many of the warm pixels were not present at the beginning of the mission, and some appear for a short amount of time. So the authors tested each hot area for variability using revolution number, and the same Kolmogorov-Smirnov-based algorithm used to detect segments of bright columns, compared to the reference established over all sources on all CCDs and all instruments. This resulted in a revolution interval for each hot area.
Sources on a hot area for a particular instrument and within the corresponding revolution interval are flagged with flag 10 (PN_FLAG, M1_FLAG or M2_FLAG) as T (true), which is then propagated to the SUM_FLAG to indicate a possibly spurious detection/source. In version 5XMM-DR15 this flagging was done after source detection, but from version DR16 onward this step will be carried out before the stacked source detection. The quality flags in 5XMM are listed below.
Table 3.1: Meaning of the characters in the quality flags PN_FLAG, M1_FLAG, M2_FLAG, EP_FLAG and their distribution in 5XMM-DR15.
Flag Description                                        EP            PN            M1            M2
all                                           818,656 100%  753,847 100%  692,614 100%  777,900 100%
0    No warning issued                        537,679  66%  550,693  73%  594,018  86%  679,272  87%
1    PSF coverage < 50%                       160,445  20%   97,178  13%   42,025   6%   41,749   5%
2    Near a bright point-like source            4,090   0%    3,875   1%    3,588   1%    3,876   0%
3    Near a bright extended source             60,955   7%   56,180   7%   54,894   8%   58,519   8%
4    Extended near a bright point source          728   0%      683   0%      629   0%      675   0%
5    Extended near a bright extended source    12,201   1%    8,203   1%    9,917   1%   11,088   1%
6    Extended, significant in one band          6,146   1%    5,467   1%    5,644   1%    5,862   1%
7    Extended, flag 4, 5, or 6                 14,605   2%   11,699   2%   12,654   2%   13,694   2%
8    On a bad pixel or CCD area                24,433   3%   24,342   3%      170   0%       34   0%
9    Near a bad CCD area                       65,913   8%   65,711   9%      476   0%      268   0%
10   On a warm CCD pixel                       13,662   2%
11   Flagged during visual screening           54,516   7%
Table 3.2: Meaning of the SUM_FLAG values:
Value Description
0     Good
1     if the warning flags EP_FLAG 1, 2, 3 or 9 are set to true but not 7, 8, 10 or 11
2     if the possibly-spurious or warm pixel flags EP_FLAG 7, 8 or 10 are set to true but not the manual flag 11
3     if the manual flag EP_FLAG 11 is set to true but not the spurious or warm pixel flags 7, 8 or 10
4     if the manual flag 11 as well as one of the spurious or warm pixel flags 7, 8 or 10 are set to true
The default value of every flag is F for False. When a flag is set, it has been changed to T for True.
The task dpssflag sets all flags except the camera-specific flags (i.e., flags 2,3,4,5,6,7) on the summary row (EPIC band 8) which are then propagated backwards to the individual cameras and bands.
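The rules of Table 3.2 can be written compactly as a function of the set of EP_FLAG numbers that are true. This is a minimal sketch of those rules, not the catalog pipeline; the function name is hypothetical, and returning 0 when only flags not named in the table (4, 5 or 6) are set is an assumption, since the table does not cover that case.

```python
WARNING_FLAGS = {1, 2, 3, 9}   # warning flags
SPURIOUS_FLAGS = {7, 8, 10}    # possibly-spurious or warm pixel flags
MANUAL_FLAG = 11               # flagged during visual screening


def sum_flag(true_flags):
    """Return SUM_FLAG (0 to 4) from the EP_FLAG numbers set to true."""
    flags = set(true_flags)
    manual = MANUAL_FLAG in flags
    spurious = bool(flags & SPURIOUS_FLAGS)
    if manual and spurious:
        return 4
    if manual:
        return 3
    if spurious:
        return 2
    if flags & WARNING_FLAGS:
        return 1
    # Assumption: nothing relevant set, so the entry counts as good.
    return 0


print(sum_flag({1, 9}), sum_flag({3, 8}), sum_flag({10, 11}))
```

A 'clean' selection as used in this catalog then corresponds to keeping entries for which the result is below 3.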
Stack_ID
A unique number assigned to the observation stack.
N_Observations
The total number of observations that are contained within the stack.
N_Contrib_Max
The maximum number of directly overlapping observations in the stack.
N_Exposures
The number of exposures in the stack, i.e. the number of active instruments, covering the stacked region.
N_Sources_Stack
The number of sources detected within the stack.
Stack_RA
The Right Ascension reference coordinate for the stack, given in J2000.0 decimal degrees.
Stack_Dec
The Declination reference coordinate for the stack, given in J2000.0 decimal degrees.
Area_Est
The approximate sky area, in square degrees, covered by the stack.
ObsID
The unique identifier of the XMM-Newton observation.
Name
The name of the main target of the observation.
RA
The mean pointing Right Ascension of the telescope's optical axis, given in J2000.0 decimal degrees.
Dec
The mean pointing Declination of the telescope's optical axis, given in J2000.0 decimal degrees.
LII
The mean pointing Galactic Longitude of the telescope's optical axis in degrees.
BII
The mean pointing Galactic Latitude of the telescope's optical axis in degrees.
XMM_Revolution
The XMM-Newton revolution number of the observation.
Time
The date/time of the start of the contributing observation, given as the Modified Julian Date JD-2400000.5.
End_Time
The date/time of the end of the contributing observation, given as the Modified Julian Date JD-2400000.5.
Astcorr_Flag
A flag which indicates that the observation was astrometrically corrected.
CC_Pos_Offset
The catcorr total position shift of the field, in arcseconds, based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Pos_Offset_Error
The 1-sigma uncertainty in the catcorr total position shift of the field, in arcseconds. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_RA_Offset
The catcorr shift of the Right Ascension in the field, in arcseconds. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_RA_Offset_Error
The 1-sigma uncertainty in the catcorr shift of the Right Ascension in the field, in arcseconds. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Dec_Offset
The catcorr shift of the Declination in the field, in arcseconds. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Dec_Offset_Error
The 1-sigma uncertainty in the catcorr shift of the Declination in the field, in arcseconds. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Rot_Corr
The catcorr shift of the position angle in the field, in degrees. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Rot_Corr_Error
The 1-sigma uncertainty in the catcorr shift of the position angle in the field, in degrees. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Refcat
The catcorr reference catalog. This is based on the XMM SAS catcorr tool which uses the named catalog to provide positional information to correct the astrometric positions (see documentation here).
CC_Nmatches
The catcorr number of usable matches with the reference catalog. This is based on the XMM SAS catcorr tool which uses the CC_REFCAT catalog to provide positional information to correct the astrometric positions, by matching sources within the observation to sources in the reference catalog (see documentation here).
M2_Usage_Flag
This flag indicates whether EPIC/MOS2 data were used (if not null) in source detection ('S') or photometry ('P').
M1_Usage_Flag
This flag indicates whether EPIC/MOS1 data were used (if not null) in source detection ('S') or photometry ('P').
PN_Usage_Flag
This flag indicates whether EPIC/pn data were used (if not null) in source detection ('S') or photometry ('P').
PN_Submode
The EPIC/pn submode used for the observation.
M1_Submode
The EPIC/MOS1 submode used for the observation.
M2_Submode
The EPIC/MOS2 submode used for the observation.
PN_Filter
The EPIC/pn filter used for the observation.
M1_Filter
The EPIC/MOS1 filter used for the observation.
M2_Filter
The EPIC/MOS2 filter used for the observation.
PN_Ontime
The total EPIC/pn good exposure time, in seconds, at the central position of the source. The times are calculated by the SAS task evselect and are not vignetting corrected. This is read from the header of the individual observation file and set to zero if the center of the source is located on a bad chip area.
M1_Ontime
The total EPIC/MOS1 good exposure time, in seconds, at the central position of the source. The times are calculated by the SAS task evselect and are not vignetting corrected. This is read from the header of the individual observation file and set to zero if the center of the source is located on a bad chip area.
M2_Ontime
The total EPIC/MOS2 good exposure time, in seconds, at the central position of the source. The times are calculated by the SAS task evselect and are not vignetting corrected. This is read from the header of the individual observation file and set to zero if the center of the source is located on a bad chip area.
PN_Bkg_Cprob
The EPIC/pn Cauchy probability derived from PN_BKG_CRAREA.
M1_Bkg_Cprob
The EPIC/MOS1 Cauchy probability derived from M1_BKG_CRAREA.
M2_Bkg_Cprob
The EPIC/MOS2 Cauchy probability derived from M2_BKG_CRAREA.
PN_Bkg_Crarea
The EPIC/pn background rate per area, in ct/s/arcsec2.
M1_Bkg_Crarea
The EPIC/MOS1 background rate per area, in ct/s/arcsec2.
M2_Bkg_Crarea
The EPIC/MOS2 background rate per area, in ct/s/arcsec2.
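The Time and End_Time parameters above are given as Modified Julian Dates (JD-2400000.5). For convenience, the conversion to a calendar date can be sketched as below. The function names are hypothetical, and the sketch assumes the values can be treated as UTC days counted from the MJD epoch of 1858 November 17, ignoring leap seconds.

```python
from datetime import datetime, timedelta, timezone

# MJD 0 corresponds to 1858 November 17, 00:00.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def jd_to_mjd(jd):
    """Convert a Julian Date to a Modified Julian Date."""
    return jd - 2400000.5


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a timezone-aware datetime."""
    return MJD_EPOCH + timedelta(days=mjd)


# MJD 51544.0 is the start of 2000 January 1.
print(mjd_to_datetime(51544.0).isoformat())
```

The difference between End_Time and Time, multiplied by 86400, gives the elapsed duration of an observation in seconds, which is generally longer than the good exposure times listed in the Ontime parameters.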