Accessing HEASARC and LAMBDA data in the Cloud
Introduction
Since 2023, the Year of Open Science, HEASARC data have been available in the cloud as part of NASA's Open Science Initiative and in collaboration with the Amazon Web Services (AWS) Open Data project. This effort is motivated by the need to make these data more accessible to the broader community and to enable the kind of science that requires the significant resources of cloud computing.
HEASARC and LAMBDA data are now on AWS and registered in its Open Data Registry in two buckets called "nasa-heasarc" and "nasa-lambda". Below we describe several ways to access them, including a tutorial notebook:
- in Python, using tools like Astropy,
- using our new lightweight Hark search and download tool,
- direct access using HTTPS or the AWS command line interface (CLI).
NASA Astrophysics, including HEASARC, is building cloud analysis capabilities with the Fornax Initiative. See details ....
Pythonic Data Access Tutorial
We have updated our Astroquery module in a number of ways, including providing cloud access as described on its documentation page. In addition, we have a quick tutorial on accessing HEASARC or LAMBDA data in the cloud using more direct tools such as PyVO and boto3. You can download the Python notebook or view it rendered as HTML.
Some software, such as Astropy's FITS IO routines, can read data directly from the S3 bucket, including options to read only a subset of a FITS file. Tools based on cfitsio, such as HEASoft, can also read a file directly from a URL. See below.
Note that some HEASoft tools that rely on knowing the directory structure of an input dataset might require you to copy the data out of the S3 object store and into a file system they can access.
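As an illustration of the direct read described above, here is a minimal Python sketch. It assumes a recent Astropy (5.2 or later) with the optional fsspec and s3fs packages installed (plus aiohttp for HTTPS URLs). The helper names `fits_open_options` and `read_header` are hypothetical and chosen for this example only. Because the buckets are public, the sketch asks for anonymous access.

```python
def fits_open_options(location):
    """Return keyword arguments for astropy.io.fits.open for a cloud location.

    Hypothetical helper for this example, not part of Astropy.
    """
    if location.startswith("s3://"):
        # Public bucket: ask s3fs (through fsspec) for anonymous access.
        return {"use_fsspec": True, "fsspec_kwargs": {"anon": True}}
    if location.startswith(("https://", "http://")):
        # fsspec's HTTP backend lets Astropy request byte ranges on demand.
        return {"use_fsspec": True}
    raise ValueError(f"not a cloud location: {location!r}")


def read_header(location, ext=0):
    """Fetch the header of one HDU without downloading the whole file."""
    # Imported lazily so that the helper above works without Astropy installed.
    from astropy.io import fits

    with fits.open(location, **fits_open_options(location)) as hdul:
        # HDUs are loaded lazily, so only the bytes needed to reach
        # this header are transferred.
        return hdul[ext].header
```

Calling `read_header` with an S3 URI or HTTPS URL from one of the two buckets should return the header of the primary HDU. For image HDUs, Astropy's `.section` attribute can then be used to read a cutout instead of the full array.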
Tools
The HEASARC now provides a standalone tool, hark, that can be used to search for data and download them directly from AWS. We recommend using this tool for fast access to data in the cloud, especially if you are downloading large amounts of data and/or running parallel pipelines, because it does not depend on the speed of the HEASARC servers.
See the hark documentation for details.
Direct Bucket Access
These data can currently be accessed by using the HEASARC or LAMBDA web tools to browse the archive and retrieve a list of observations or files to download, or by doing the same with one of our APIs. (See our archive pages for the HEASARC options or the LAMBDA data portal.) If a given tool does not return cloud URIs, they can be inferred from the on-premises URL: simply replace the beginning of the traditional access URL with the AWS S3 bucket address. For example, a Chandra image located at
https://heasarc.gsfc.nasa.gov/FTP/chandra/data/byobsid/5/4475/primary/acisf04475N004_full_img2.fits.gz
can also be found in the "nasa-heasarc" bucket at
s3://nasa-heasarc/chandra/data/byobsid/5/4475/primary/acisf04475N004_full_img2.fits.gz
or
https://nasa-heasarc.s3.amazonaws.com/chandra/data/byobsid/5/4475/primary/acisf04475N004_full_img2.fits.gz
For LAMBDA data, similar URLs can be turned into URIs using the bucket name "nasa-lambda". Note that for WMAP, there is one small change to the path, from "map" to "wmap", to make clear that it is the mission name. For example,
https://lambda.gsfc.nasa.gov/data/map/dr5/skymaps/9yr/smoothed/wmap_band_smth_iqumap_r9_9yr_K_v5.fits
can also be found at
s3://nasa-lambda/wmap/dr5/skymaps/9yr/smoothed/wmap_band_smth_iqumap_r9_9yr_K_v5.fits
or
https://nasa-lambda.s3.amazonaws.com/wmap/dr5/skymaps/9yr/smoothed/wmap_band_smth_iqumap_r9_9yr_K_v5.fits
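The substitution described above can be sketched in a few lines of Python. The function name `to_cloud_locations` and the lookup table are hypothetical, written for this example. The path prefixes (`/FTP/` for HEASARC, `/data/` for LAMBDA) are inferred from the two examples given here, and the sketch assumes that the "map" to "wmap" rename is the only path change.

```python
from urllib.parse import urlparse

# On-prem host -> (bucket name, leading path to strip).
# Prefixes are inferred from the examples above (an assumption).
_ARCHIVES = {
    "heasarc.gsfc.nasa.gov": ("nasa-heasarc", "/FTP/"),
    "lambda.gsfc.nasa.gov": ("nasa-lambda", "/data/"),
}


def to_cloud_locations(url):
    """Return (s3_uri, https_url) for an on-prem HEASARC or LAMBDA URL."""
    parsed = urlparse(url)
    try:
        bucket, prefix = _ARCHIVES[parsed.netloc]
    except KeyError:
        raise ValueError(f"unrecognized archive host: {parsed.netloc!r}") from None
    if not parsed.path.startswith(prefix):
        raise ValueError(f"expected path to start with {prefix!r}: {parsed.path!r}")

    key = parsed.path[len(prefix):]
    # WMAP is stored under "wmap/" in the bucket rather than "map/".
    if bucket == "nasa-lambda" and key.startswith("map/"):
        key = "w" + key

    return f"s3://{bucket}/{key}", f"https://{bucket}.s3.amazonaws.com/{key}"
```

Applied to the Chandra and WMAP URLs above, this should reproduce the S3 URIs and HTTPS addresses listed in the text.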
For bulk data access, e.g. to download a directory and its contents, you can use the AWS CLI. For example, to list the contents of a directory on AWS and then copy one observation directory and its contents to your local disk:
aws s3 ls s3://nasa-heasarc/swift/data/obs/ --no-sign-request
aws s3 cp --recursive --no-sign-request s3://nasa-heasarc/swift/data/obs/2025_03/00014197056/ my_local_directory/00014197056
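For pipelines written in Python, the recursive copy above might be approximated with boto3. This is a sketch under stated assumptions, not an official HEASARC recipe. It assumes boto3 is installed, and the helper names `anonymous_s3_client`, `local_path_for`, and `download_prefix` are hypothetical. The unsigned client configuration plays the role of `--no-sign-request`.

```python
from pathlib import Path


def anonymous_s3_client():
    """Create an S3 client that sends unsigned requests (public buckets only)."""
    # Imported lazily so the path helper below works without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client("s3", config=Config(signature_version=UNSIGNED))


def local_path_for(key, prefix, dest):
    """Map an object key under `prefix` to a path under the local `dest`."""
    return Path(dest) / key[len(prefix):].lstrip("/")


def download_prefix(bucket, prefix, dest, client=None):
    """Download every object under `prefix`, mirroring the directory layout."""
    if client is None:
        client = anonymous_s3_client()
    paginator = client.get_paginator("list_objects_v2")
    downloaded = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "directory" placeholder objects
            target = local_path_for(key, prefix, dest)
            target.parent.mkdir(parents=True, exist_ok=True)
            client.download_file(bucket, key, str(target))
            downloaded.append(target)
    return downloaded
```

For the Swift observation above, the call would be `download_prefix("nasa-heasarc", "swift/data/obs/2025_03/00014197056/", "my_local_directory/00014197056")`. Files are fetched one at a time here, so for large transfers the AWS CLI or hark is likely to be faster.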
Thanks to Amazon's Open Data project, these data are free to access from anywhere, not subject to cloud data egress costs. As described on HEASARC's data policy web page, these data are available freely for your use.
Datasets
Data are synchronized on a weekly basis. Please let us know if you would benefit from a higher cadence for a particular dataset. Currently, we also trigger a sync of the fermi/data/gbm/triggers/ directory as soon as the data come in, for close to real-time access. The datasets currently available include:
- High-energy astrophysics datasets
  - Ariel5
  - ASCA
  - BBXRT
  - BeppoSAX
  - CALDB
  - Chandra
  - Compton
  - Copernicus
  - COS-B
  - DXS
  - EXOSAT
  - Fermi (lat/weekly/{photon,spacecraft,1s_spacecraft,extended,diffuse} and gbm/{triggers,bursts}/)
  - Ginga
  - HaloSat
  - HEAO-1
  - Hitomi
  - IXPE
  - NICER
  - NuSTAR
  - OSO-8
  - ROSAT
  - SAS-2
  - SRT-eRosita
  - Suzaku
  - Swift
  - VELA 5B
  - WASS
  - XQC
  - Rossi XTE
  - XMM-Newton
- CMB datasets
  - WMAP
  - COBE
Please also see the HEASARC and LAMBDA entries in the AWS Open Data Registry.
Caveats
We have selected which datasets to place in the cloud, leaving out data that we don't believe will be useful to access this way, such as older mission data in non-standard file formats. We will also keep the nasa-heasarc bucket in sync with the on-prem archive on a best-effort basis for the ongoing missions. Therefore, the most recent data products may be available only from the HEASARC on-prem archive for a few days until the next sync.