HEASARC@SciServer User Guide
Please note that HEASARC@SciServer is in beta. If you encounter any issues running code or accessing the data, please contact the HEASARC help desk. If your issue is not related to the HEASARC setup specifically, then you can email the SciServer helpdesk. Click on your username at the top right of your dashboard and select Help from the dropdown menu.
I. Getting started
Create your own account on https://www.sciserver.org/.
Then go to "SciServer -> Compute" (note: not "Compute Jobs", which is the batch service). It will list your containers; if this is the first time you've looked, there won't be any. Here's a screen shot:
Note the grid symbol on the top menu bar that lets you navigate. Compute is the second item down. The first is the Home dashboard, where you'll find your files and groups, which we'll get to later.
Click "Create container". You'll be presented with some options. First, enter a name for your container. (You can do this repeatedly, so don't worry too much about the choice.) Then ignore the Domain. Third is the Compute Image your container will be based on; select the HEASARC image.
Further down, your User Volumes are shown (more on these later). Below that are the Data Volumes, where you should select the HEASARC data volume. This will make our data and software areas accessible from within your container. (The "HEASARC software" volume is deprecated and will go away, as will the compute image named simply "Heasarc"; these are remnants from early testing.)
When you've created it, you'll see its status in the list.
In this screen shot, it isn't already running, so there's a green arrow to start it. The red x lets you delete it. If the green arrow is instead a red square, that means it's already running, and you can stop it if you wish. Note that if you close your browser without stopping the container, it'll still be running in the same state in which you left it.
If you click on the name of the container, a new tab will open with a Jupyter interface. By default, this is the simple notebook interface, but there's an option toward the right to switch to the Jupyter Lab:
II. File systems
There are a couple of things to know about the file systems available to you within the container. By default, you'll have areas called "temporary" and "persistent", which are what they sound like. The persistent area will have the same contents if you stop the container and come back to it later, and it will be visible from other containers you define. It has a limit of 10GB total.
These will be under your HOME directory:
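On a standard SciServer container, these areas live under a workspace directory in your home area. A sketch of the typical layout (the username "myusername" is illustrative):

```
/home/idies/workspace/
├── Temporary/myusername/scratch/     (the "temporary" area)
└── Storage/myusername/persistent/    (the "persistent" area)
```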
If you choose to define your own user volume (see below) and mount it to this container, it will also appear here, as will any that are shared with you by other users under the corresponding username:
So under Storage is where you should put your work if you want it to survive outside the containers. If you want to share stuff, you should create a user volume rather than stash it in persistent, which nobody else can see. You should also back it up yourself, see Miscellaneous below.
If you chose to mount the HEASARC data volume (when creating the container), it will be found at the same level as your Storage and Temporary areas, e.g., you'll see
The FTP area will contain all of the HEASARC data holdings exactly as they are organized on our own FTP site. Our compute image also puts a link to this area at /FTP (though this link will be broken if you forget to mount the volume).
The default software installation is not on that volume but is built into the compute image itself.
The image is non-trivial to update but will be kept current with major software releases. The data volume, however, also has a software area that we can update as needed with development software builds and extras as required.
Please note that the word "volume" is used in two different contexts: the "user volumes" that you create and share yourself, and the system's volumes, such as the HEASARC data volume, which you can also choose to mount. It may help to think of user volumes as "folders" to avoid confusion, because the two kinds are mounted in different places within the container.
There's also a shared user volume (see below) where we've put some example Jupyter notebooks. We hope to expand this into a large set of executable analysis threads written by us, our instrument teams, and our users who wish to share them.
III. Files, groups and sharing
The Home dashboard looks like this:
Under Files, you can create your own "User volumes" and see what "Data volumes" are accessible (you cannot create your own). You can also browse files through this interface.
Under Groups, you can create groups. Here is an example showing the HEASARC software user group among others. For each group, it shows the "Shared Files" (i.e., user volumes), the "Shared Data Volumes", and the "Shared Compute Images". As you can see, the HEASARC software user group has access to several different things. The image itself is the instance of Linux plus all the required libraries and software builds, ready to go. The headata volume is where the HEASARC's data holdings live. (The volume called "heasoft" is deprecated and will go away; any additional software builds will be put on the same volume as the data.)
You can create your own groups, add your own shared files/volumes, and invite your collaborators to join your group. This way, you and your group can share data that are not available to anybody else.
IV. Moving files in and out
In the Jupyter Lab interface, the file navigator on the left side lets you download (right click) and upload (button on top) files.
You can also use scp from a terminal within your Jupyter Lab session to copy files from publicly visible external sites.
Thirdly, outside of JupyterLab, the SciServer web site has a Files tab where you can manage the contents of your user volumes. (Note that on occasion, within the container's JupyterLab session, I've found that I got a permission denied trying to remove a file that is listed as owned and writable by the "idies" user I'm running under in the container. In this case, I found I could delete it outside the container in this SciServer->Files tab. This may be because the user volume is mounted with a different set of permissions not visible within the container.)
V. Software environment
The image currently has HEASoft 6.28:
This is the HEASoft installation you get by default (see your .bashrc). Updating this image requires the SciServer admins. But you'll also see that you have access to a software area on the HEASARC data volume, which we can write to ourselves. If you need it, we can install a development version of HEASoft there (though it takes some time); you would then change your HEADAS to point to that version. We can also add other software you might need if you cannot install it yourself.
CALDB is currently set to the archive's calibration area, which will be kept up to date.
The software environment is a work in progress. You may need additional libraries. E.g., for additional machine-learning packages, you can:
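A minimal sketch of a user install (the package name below is only an example; substitute whatever you need, e.g. scikit-learn or keras):

```shell
# Install into your user site-packages so the package survives container
# restarts (but not new containers). "tqdm" is just an example package.
python -m pip install --user tqdm
```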
Alternatively, you can upload your own software and install it. If it uses the distutils, you can run:
and it will be installed in your environment. Note that such installs will persist in this container if you stop
and restart it, but they will not be there if you create a new container. (Closing your
browser and/or logging out of SciServer does not stop your container.
When you log back in, your Jupyter Lab session will be exactly as you
left it. The container stops when you use your dashboard to stop it
as above.) Feel free to request
additions to the standard environment that may be generally useful.
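As a sketch of the distutils route mentioned above (the package here is a throwaway example created purely for illustration; in practice you would run the install command in your own code directory):

```shell
# Create a minimal distutils package to demonstrate the install step.
mkdir -p /tmp/distutils_demo/mypkg
cd /tmp/distutils_demo
printf 'def hello():\n    return "hello"\n' > mypkg/__init__.py
printf 'from distutils.core import setup\nsetup(name="mypkg", version="0.1", packages=["mypkg"])\n' > setup.py
# The actual install command (run this in your own code directory):
python setup.py install --user
```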
If the code doesn't have an install script, then you have to do a bit more. On your desktop, you could add the code location to your PYTHONPATH, but on SciServer, you cannot change the environment that the notebook server uses. Instead you have to add your code directory to the path inside each notebook using, e.g.,
in every notebook that depends on that code. Note that if you upload
the code into your persistent storage area, it will be available there for
all new or restarted containers.
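For example, at the top of a notebook (the directory name here is hypothetical; point it at wherever you put your code):

```python
import os
import sys

# Hypothetical location: code uploaded to your persistent storage area.
code_dir = os.path.expanduser("~/workspace/Storage/myusername/persistent/mycode")
sys.path.insert(0, code_dir)  # lets `import yourmodule` find code in code_dir
```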
VI. Test an example notebook
We have several example notebooks in the HEASARC data volume; if you mounted it in your container, you will find them at:
Make yourself a workspace in your persistent area:
Get a copy of one of our notebooks to try out:
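The two steps above might look like this (the username and the notebook path are illustrative placeholders):

```shell
# Make a workspace in your persistent area ("myusername" is a placeholder).
mkdir -p "$HOME/workspace/Storage/myusername/persistent/example_notebooks"
# Then copy a notebook from the shared volume, e.g. (path is a placeholder):
# cp /home/idies/workspace/headata/FTP/.../example.ipynb \
#    "$HOME/workspace/Storage/myusername/persistent/example_notebooks/"
```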
(You can also do this with the navigation sidebar, where you can right click and copy and paste files.)
Then in the sidebar on the left, navigate to that directory and double-click on the name of the notebook to open it.
To be fixed: for now, click on "Python 3" at the upper right and switch it to "Python 3.8 (heasarc)", unless that kernel is already set.
And off you go.
VII. Data discovery with HEASARC tools
The usual ways of discovering HEASARC data with our Browse and Xamin tools have not yet been integrated seamlessly into SciServer. For the moment, if you use these tools to generate a download script, you can edit that script into a simple file list; paths starting with /FTP (just as at the HEASARC) will work on SciServer within your container, provided you use the HEASARC image and mount the HEASARC data volume.
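As an illustration (the wget line below is a mock-up of the kind of line such a script contains; the flags and observation path are not from a real script):

```shell
# Mock-up of one line from a Browse-generated download script:
cat > download_script.txt <<'EOF'
wget -q -nH --cut-dirs=2 https://heasarc.gsfc.nasa.gov/FTP/rxte/data/archive/AO4/P40001/40001-01-01-00/
EOF
# Strip each line down to the local /FTP path, which is valid in the container:
sed -n 's|.*\(/FTP/[^ ]*\).*|\1|p' download_script.txt > file_list.txt
cat file_list.txt
```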
Alternatively, we invite you to explore the Python possibilities. There's a data access notebook in the cookbooks directory with a few examples, one of which uses PyVO. This is a powerful new way to explore not just HEASARC data but data from any VO-compliant archive; see NAVO's collection of notebook tutorials for generic use cases. The RXTE example in that notebook shows how to get a list of observations and construct a file list using knowledge of the RXTE archive structure. We hope to provide Python wrappers to make this even more straightforward.
VIII. Batch jobs
You can submit a notebook that you have tested interactively to the batch system for longer processing. The batch service is called Compute Jobs (while Compute is interactive) under the main menus. There are a couple of things to be aware of, though. First, we use a non-default Python 3.8 kernel; set up your Jupyter notebook to use the "Python 3.8 (heasarc)" kernel before submitting it to the batch.
Submitting to the batch starts with a process a bit like setting up a new container. You have to select the compute image and the volumes that you want to mount. If it starts successfully, it will look something like this:
Help
If you encounter any issues running code or accessing the data, please contact the HEASARC help desk. If your issue is not related to the HEASARC setup specifically, you can email the SciServer helpdesk: click on your username at the top right of your dashboard and select Help from the dropdown menu. If you could use some additional software that you cannot install yourself, you can also ask at the HEASARC help desk whether we can install it for you.
User Contributions
One of the benefits of SciServer is how easy it makes it to share data, code, results, etc. among collaborators. But you can also contribute them to the community of HEASARC@SciServer users. If you have things you think would be generally useful, place them in a user volume that you can share with us (e.g., user tjaffe), and we'll consider whether it would be appropriate to include them on the HEASARC volume for all to use.
The CIAO software is also built on the HEASARC image and located in /opt/ciao-4.12/ (currently). Note that running CIAO currently requires changing Python versions. To switch to CIAO analysis in your bash shell:
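The setup step is typically a script under the install directory; the exact script name below is an assumption, so check /opt/ciao-4.12/bin/ in your container for the actual file:

```shell
# Assumed setup-script name for CIAO installs of this vintage.
source /opt/ciao-4.12/bin/ciao.bash
```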
Note also that one cannot use some CIAO tools with the archive itself as the input path, since they expect to be able to write to the data directories; i.e., pointing a tool directly at a /FTP data directory will result in an error about the read-only file system. You will have to copy the input data directory to your own workspace (temporary or persistent as appropriate; see above).
The XMM software is available in the default environment under /opt/xmmsas/. As with Chandra analysis, you must copy the input data from the main archive into your own workspace, as the software will expect to write to those directories. This is under development, so please send us any issues.