TESS Intro

An introduction to the tools and tutorials available for the analysis of TESS data

Welcome everyone to our TESS Lightkurve tutorial!


Rebekah Hounsell - Support scientist for TESS in the NASA GSFC GI Office.


Learning Goals

In this tutorial, we will teach the user how to access, analyze, and manipulate data from the NASA Exoplanet mission TESS (this can also be applied to Kepler & K2). All tools presented will teach the user how to work with time series data for the purpose of scientific research.

The tutorial assumes a basic knowledge of Python and astronomy, and will walk the user through several of the concepts outlined below:

  1. How to obtain TESS data products from the MAST archive
  2. How to use Lightkurve to access the various data products and create time series
  3. How to analyze and assess various data anomalies and how you might visualize them


This tutorial requires the use of specific packages:

  • Lightkurve to work with TESS data (v2.0.1).
  • Matplotlib for plotting.
  • NumPy for manipulating the data.

import lightkurve as lk
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

First time users

If you are not that experienced with Python, or cannot download Lightkurve, you can run this notebook as a Colab notebook. Colaboratory allows you to write and execute Python in your browser with zero configuration required.

All you need is a Google account. Then copy and paste the following command at the top of your Colab notebook:

!pip install lightkurve

This downloads and installs the Lightkurve package.

Introduction to TESS:

The Transiting Exoplanet Survey Satellite (TESS) is a NASA-sponsored Astrophysics Explorer-class mission that is performing a near all-sky survey to search for planets transiting nearby stars. TESS completed its primary mission in July of 2020, and has now entered its extended mission. The current extended mission will last until September 2022, and will continue to scan the sky for exoplanets and transient events. The TESS mission is now more community focused with a larger guest investigator (GI) program.

Over the last three years TESS has observed both the northern and southern hemispheres, with each hemisphere being split into ~13 sectors. Each sector is observed for ~27 days by TESS’s four cameras.

The main data products collected by the TESS mission are described below.

To learn more about the TESS mission and its data products, please visit the TESS GI pages.

1. How to obtain TESS data products from the MAST archive

You can access TESS, Kepler, and K2 data via the Mikulski Archive for Space Telescopes (MAST) or via the Lightkurve package (see Section 2).

Here, we are focusing on obtaining your data via the MAST portal. Using the portal, you can enter the name of your object, its TIC number, or its position (i.e., R.A. and Dec.). If the object is listed in the archive, a table listing each observation will be returned.

2. How to use Lightkurve to access the various data products and create a time series

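Lightkurve offers a user-friendly way to analyze time series data obtained by telescopes, in particular NASA’s Kepler and TESS exoplanet missions. You can search for the various data products for TESS on MAST using the following Lightkurve functions:

  • search_tesscut(): search for cutouts of the Full Frame Images (FFIs)
  • search_targetpixelfile(): search for Target Pixel Files (TPFs)
  • search_lightcurve(): search for Light Curve Files (LCFs)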

For the purpose of this tutorial, we will be examining L 98-59, a bright M dwarf star at a distance of 10.6 pc. This star is host to three terrestrial-sized planets and is also known in the TESS system as TIC 307210830.

2.1 Accessing the data products

Let’s go through each one of the above functions and see what data is available.

search_ffi = lk.search_tesscut('L 98-59')
search_tpf = lk.search_targetpixelfile('L 98-59')
search_lcf = lk.search_lightcurve('L 98-59')
SearchResult containing 15 data products.

 #   mission         year  author   exptime (s)  target_name  distance (arcsec)
 0   TESS Sector 01  2018  TESScut  1426         L 98-59      0.0
 1   TESS Sector 02  2018  TESScut  1426         L 98-59      0.0
 2   TESS Sector 05  2018  TESScut  1426         L 98-59      0.0
 3   TESS Sector 08  2019  TESScut  1426         L 98-59      0.0
 4   TESS Sector 09  2019  TESScut  1426         L 98-59      0.0
 5   TESS Sector 10  2019  TESScut  1426         L 98-59      0.0
 6   TESS Sector 11  2019  TESScut  1426         L 98-59      0.0
 7   TESS Sector 12  2019  TESScut  1426         L 98-59      0.0
 8   TESS Sector 28  2020  TESScut  475          L 98-59      0.0
 9   TESS Sector 29  2020  TESScut  475          L 98-59      0.0
 10  TESS Sector 32  2020  TESScut  475          L 98-59      0.0
 11  TESS Sector 35  2021  TESScut  475          L 98-59      0.0
 12  TESS Sector 36  2021  TESScut  475          L 98-59      0.0
 13  TESS Sector 37  2021  TESScut  475          L 98-59      0.0
 14  TESS Sector 38  2021  TESScut  475          L 98-59      0.0

The above table provides several important pieces of information:

  • The sector in which the object was observed.
  • The year in which the object was observed.
  • The author of the data. This has multiple options, and each is a hyperlink that, when clicked, provides more information.
  • The cadence of the observation.
  • The name of the target.
  • The distance of the observation from your target of interest. This is useful if you conduct a cone search around your object’s coordinates.

The table above indicates that our object was observed in multiple sectors. Note that in years 1 and 2 (2018 and 2019) the cadence of the FFI data was 30 min, whereas in year 3 (2020/2021) it is 10 min.
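The exposure-time column for the TESScut rows reads 1426 s and 475 s rather than a round 1800 s and 600 s. One plausible reading is that the column reports effective exposure time. The short sketch below checks that arithmetic under two assumptions that are not stated in this tutorial: that each 2 s frame integrates for 1.98 s, and that on-board cosmic-ray mitigation discards 2 of every 10 frames. The function name effective_exptime is hypothetical.

```python
# Assumed instrument factors (not given in this tutorial).
FRAME_EFFICIENCY = 1.98 / 2.0   # integration time per 2 s frame
COSMIC_RAY_KEEP = 8 / 10        # fraction of frames kept after mitigation


def effective_exptime(cadence_seconds):
    """Return the effective exposure time in seconds for an FFI cadence."""
    return cadence_seconds * FRAME_EFFICIENCY * COSMIC_RAY_KEEP


for cadence_min in (30, 10):
    exptime = effective_exptime(cadence_min * 60)
    print(f"{cadence_min}-min cadence: {exptime:.1f} s effective exposure")
```

Under these assumptions the two cadences round to the 1426 s and 475 s shown in the table.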

Let’s see if any other data exists - i.e., was it observed as a target of interest and does it have a Target Pixel File.

SearchResult containing 28 data products.

 #   mission         year  author     exptime (s)  target_name  distance (arcsec)
 0   TESS Sector 02  2018  SPOC       120          307210830    0.0
 1   TESS Sector 02  2018  TESS-SPOC  1800         307210830    0.0
 2   TESS Sector 05  2018  SPOC       120          307210830    0.0
 3   TESS Sector 05  2018  TESS-SPOC  1800         307210830    0.0
 4   TESS Sector 08  2019  SPOC       120          307210830    0.0
 5   TESS Sector 09  2019  SPOC       120          307210830    0.0
 6   TESS Sector 10  2019  SPOC       120          307210830    0.0
 7   TESS Sector 11  2019  SPOC       120          307210830    0.0
 8   TESS Sector 12  2019  SPOC       120          307210830    0.0
 9   TESS Sector 28  2020  SPOC       20           307210830    0.0
 ...
 18  TESS Sector 35  2021  SPOC       20           307210830    0.0
 19  TESS Sector 35  2021  SPOC       120          307210830    0.0
 20  TESS Sector 36  2021  SPOC       20           307210830    0.0
 21  TESS Sector 36  2021  SPOC       120          307210830    0.0
 22  TESS Sector 37  2021  SPOC       20           307210830    0.0
 23  TESS Sector 37  2021  SPOC       120          307210830    0.0
 24  TESS Sector 38  2021  SPOC       20           307210830    0.0
 25  TESS Sector 38  2021  SPOC       120          307210830    0.0
 26  TESS Sector 39  2021  SPOC       20           307210830    0.0
 27  TESS Sector 39  2021  SPOC       120          307210830    0.0
Length = 28 rows

Great! Our object was observed as a target of interest and has 2-min and 20-sec cadenced data. This means that there should be light curve files already on the archive. Let’s check those out.

SearchResult containing 38 data products.

 #   mission         year  author     exptime (s)  target_name  distance (arcsec)
 0   TESS Sector     2018  DIAMANTE   1800         307210830    0.0
 1   TESS Sector 02  2018  SPOC       120          307210830    0.0
 2   TESS Sector 02  2018  TESS-SPOC  1800         307210830    0.0
 3   TESS Sector 02  2018  QLP        1800         307210830    0.0
 4   TESS Sector 02  2018  TASOC      120          307210830    0.0
 5   TESS Sector 02  2018  TASOC      1800         307210830    0.0
 6   TESS Sector 05  2018  SPOC       120          307210830    0.0
 7   TESS Sector 05  2018  TESS-SPOC  1800         307210830    0.0
 8   TESS Sector 05  2018  QLP        1800         307210830    0.0
 9   TESS Sector 08  2019  SPOC       120          307210830    0.0
 ...
 28  TESS Sector 35  2021  SPOC       20           307210830    0.0
 29  TESS Sector 35  2021  SPOC       120          307210830    0.0
 30  TESS Sector 36  2021  SPOC       20           307210830    0.0
 31  TESS Sector 36  2021  SPOC       120          307210830    0.0
 32  TESS Sector 37  2021  SPOC       20           307210830    0.0
 33  TESS Sector 37  2021  SPOC       120          307210830    0.0
 34  TESS Sector 38  2021  SPOC       20           307210830    0.0
 35  TESS Sector 38  2021  SPOC       120          307210830    0.0
 36  TESS Sector 39  2021  SPOC       20           307210830    0.0
 37  TESS Sector 39  2021  SPOC       120          307210830    0.0
Length = 38 rows

Wonderful! Light curves for our object of interest have already been created.

2.2 Creating a light curve using a Light Curve File:

Now on to getting the light curve for our object of interest. From the above table, it looks like there are multiple authors for our target. For the purpose of this tutorial, let’s stick to “SPOC” data products which have a 2-min cadence. We can return only these results using the following commands.

search_lcf_refined = lk.search_lightcurve('L 98-59', author="SPOC", exptime=120)
SearchResult containing 15 data products.

 #   mission         year  author  exptime (s)  target_name  distance (arcsec)
 0   TESS Sector 02  2018  SPOC    120          307210830    0.0
 1   TESS Sector 05  2018  SPOC    120          307210830    0.0
 2   TESS Sector 08  2019  SPOC    120          307210830    0.0
 3   TESS Sector 09  2019  SPOC    120          307210830    0.0
 4   TESS Sector 10  2019  SPOC    120          307210830    0.0
 5   TESS Sector 11  2019  SPOC    120          307210830    0.0
 6   TESS Sector 12  2019  SPOC    120          307210830    0.0
 7   TESS Sector 28  2020  SPOC    120          307210830    0.0
 8   TESS Sector 29  2020  SPOC    120          307210830    0.0
 9   TESS Sector 32  2020  SPOC    120          307210830    0.0
 10  TESS Sector 35  2021  SPOC    120          307210830    0.0
 11  TESS Sector 36  2021  SPOC    120          307210830    0.0
 12  TESS Sector 37  2021  SPOC    120          307210830    0.0
 13  TESS Sector 38  2021  SPOC    120          307210830    0.0
 14  TESS Sector 39  2021  SPOC    120          307210830    0.0

We now see 15 search results, one per sector. Let’s download these and see what the light curve looks like.

lcf = search_lcf_refined.download_all()
LightCurveCollection of 15 objects:
    0: <TessLightCurve LABEL="TIC 307210830" SECTOR=2 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    1: <TessLightCurve LABEL="TIC 307210830" SECTOR=5 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    2: <TessLightCurve LABEL="TIC 307210830" SECTOR=8 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    3: <TessLightCurve LABEL="TIC 307210830" SECTOR=9 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    4: <TessLightCurve LABEL="TIC 307210830" SECTOR=10 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    5: <TessLightCurve LABEL="TIC 307210830" SECTOR=11 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    6: <TessLightCurve LABEL="TIC 307210830" SECTOR=12 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    7: <TessLightCurve LABEL="TIC 307210830" SECTOR=28 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    8: <TessLightCurve LABEL="TIC 307210830" SECTOR=29 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    9: <TessLightCurve LABEL="TIC 307210830" SECTOR=32 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    10: <TessLightCurve LABEL="TIC 307210830" SECTOR=35 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    11: <TessLightCurve LABEL="TIC 307210830" SECTOR=36 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    12: <TessLightCurve LABEL="TIC 307210830" SECTOR=37 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    13: <TessLightCurve LABEL="TIC 307210830" SECTOR=38 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    14: <TessLightCurve LABEL="TIC 307210830" SECTOR=39 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>

This has downloaded the light curve for each sector and stored them in a LightCurveCollection. You can look at the data for a specific sector by indexing the collection; for example, lcf[0] displays the data for Sector 2 as a table.

TessLightCurve length=18300 LABEL="TIC 307210830" SECTOR=2 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux
[Table output not reproduced here; column units include electron / s, d, and pix.]

In this table, you are given the time and the flux for your object of interest. There are, however, three flux entries: flux, sap_flux, and pdcsap_flux. By default, flux = pdcsap_flux, but what do these entries mean?

  • Simple Aperture Photometry (SAP): The SAP light curve is calculated by summing together the brightness of pixels that fall within an aperture set by the TESS mission. This is often referred to as the optimal aperture, but in spite of its name, it can sometimes be improved upon! Because the SAP light curve is a sum of the brightness in chosen pixels, it is still subject to systematic artifacts of the mission.
  • Pre-search Data Conditioning SAP (PDCSAP) flux: SAP flux from which long-term trends have been removed using so-called Co-trending Basis Vectors (CBVs). PDCSAP flux is usually cleaner than the SAP flux and will have fewer systematic trends.
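To make the SAP definition concrete, here is a minimal sketch of the summation itself, written in plain Python on made-up numbers. The function name simple_aperture_photometry, the two 3x3 frames, and the mask are all hypothetical; this is not the TESS pipeline or Lightkurve code.

```python
def simple_aperture_photometry(frames, mask):
    """Sum the pixels flagged in `mask` for every frame (one flux per cadence)."""
    fluxes = []
    for frame in frames:
        total = 0.0
        for row_values, row_mask in zip(frame, mask):
            for value, in_aperture in zip(row_values, row_mask):
                if in_aperture:
                    total += value
        fluxes.append(total)
    return fluxes


# Two made-up 3x3 frames (electron / s) and a hypothetical aperture mask.
frames = [
    [[1.0, 2.0, 1.0], [2.0, 50.0, 3.0], [1.0, 2.0, 1.0]],
    [[1.0, 2.0, 1.0], [2.0, 48.0, 3.0], [1.0, 2.0, 1.0]],
]
mask = [
    [False, True, False],
    [True, True, True],
    [False, True, False],
]

sap_flux = simple_aperture_photometry(frames, mask)
print(sap_flux)
```

Any signal that changes the summed pixels, whether astrophysical or instrumental, ends up in this flux, which is why the SAP light curve still carries systematics.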

You can switch between fluxes using the following commands:

pdcsap = lcf[0].pdcsap_flux

sapflux = lcf[0].sap_flux

Let’s now plot both the pdcsap and sap light curves and see what they look like.

ax = lcf[0].plot(column='sap_flux', normalize=True, label="SAP");
lcf[0].plot(ax=ax, column='pdcsap_flux', normalize=True, label="PDCSAP");

There are some big differences between these two light curves, specifically the dips in the SAP light curve and its overall gradient. These differences are caused by scattered light and other noise issues. For more information refer to these tutorials. For now, let’s think about how we can manipulate the light curves.

2.2.1 Manipulating a light curve:

There are a set of useful functions in Lightkurve which you can use to work with the data. These include:

  • flatten(): Remove long term trends using a Savitzky–Golay filter
  • remove_outliers(): Remove outliers using simple sigma clipping
  • remove_nans(): Remove infinite or NaN values (these can occur during thruster firings)
  • fold(): Fold the data at a particular period
  • bin(): Reduce the time resolution of the array, taking the average value in each bin.
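As an illustration of what "simple sigma clipping" means, the sketch below repeatedly discards points that lie more than a chosen number of standard deviations from the median. It is a stand-alone toy on made-up flux values, not Lightkurve's implementation; the function name sigma_clip and its parameters are assumptions made for this example.

```python
import statistics


def sigma_clip(flux, sigma=5.0, max_iters=5):
    """Iteratively drop points more than `sigma` standard deviations from the median."""
    kept = list(flux)
    for _ in range(max_iters):
        center = statistics.median(kept)
        spread = statistics.pstdev(kept)
        clipped = [f for f in kept if abs(f - center) <= sigma * spread]
        if len(clipped) == len(kept):
            break  # nothing was removed, so the clipping has converged
        kept = clipped
    return kept


# Made-up normalized fluxes with one obvious outlier at the end.
flux = [1.00, 1.01, 0.99, 1.00, 1.02, 0.98, 1.00, 1.01, 0.99, 1.00, 1.50]
cleaned = sigma_clip(flux, sigma=3.0)
print(len(flux), len(cleaned))
```

Note that the outlier inflates the standard deviation on the first pass, so the choice of sigma matters: with these values a 5-sigma cut would keep the outlier, while a 3-sigma cut removes it.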

These can be applied directly to a light curve object. For this tutorial, let’s stick with the PDCSAP flux.

ax = lcf[0].plot()
ax.set_title("PDCSAP light curve of L 98-59")

flat_lc = lcf[0].flatten(window_length=401)

Folding the light curve

From the L 98-59 System paper, we know that planet c has a period of 3.690621 days. We can use the fold() function to find the transit in our data as shown below.

folded_lc = flat_lc.fold(period=3.690621)

Binning the light curve

Often, to see a trend, it can be beneficial to bin the data. This can be achieved via the bin() function.

binned_lc = folded_lc.bin(time_bin_size=0.01)
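For readers who want to see what folding and binning do to the numbers, here is a minimal plain-Python sketch on a made-up light curve. The helper names fold and bin_mean, the epoch parameter, and the synthetic data (a 2-day period with a 1% dip) are assumptions made for illustration; they are not Lightkurve's internals.

```python
def fold(times, period, epoch=0.0):
    """Map each time to a phase in [-period/2, period/2) centered on `epoch`."""
    half = 0.5 * period
    return [((t - epoch + half) % period) - half for t in times]


def bin_mean(phases, fluxes, bin_size):
    """Average the flux in phase bins of width `bin_size`; returns {bin_index: mean}."""
    sums, counts = {}, {}
    for phase, flux in zip(phases, fluxes):
        index = int(phase // bin_size)
        sums[index] = sums.get(index, 0.0) + flux
        counts[index] = counts.get(index, 0) + 1
    return {i: sums[i] / counts[i] for i in sorted(sums)}


# Synthetic data: one sample every 0.25 d, with a 1% dip once per 2-day period.
period = 2.0
times = [0.25 * i for i in range(40)]
fluxes = [0.99 if t % period == 0.0 else 1.0 for t in times]

phases = fold(times, period)
binned = bin_mean(phases, fluxes, bin_size=0.5)
deepest_bin = min(binned, key=binned.get)
print(deepest_bin, round(binned[deepest_bin], 3))
```

Folding stacks every transit at the same phase, and binning then averages the stacked points, so the dip stands out in the bin that contains phase zero.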

Great, we can now see our transit very clearly! Note that we can achieve the same plot from our data by chaining the method calls into a single line of code instead of several.
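One way to write that chain is sketched below. It wraps the three calls used above in a small helper so the steps can be reused; the helper name flatten_fold_bin is hypothetical, and the defaults simply repeat the values used earlier in this tutorial.

```python
def flatten_fold_bin(lc, period, window_length=401, time_bin_size=0.01):
    """Chain the three Lightkurve steps used above and return the binned light curve."""
    return (
        lc.flatten(window_length=window_length)
        .fold(period=period)
        .bin(time_bin_size=time_bin_size)
    )


# In the notebook, with `lcf` downloaded as above, this would be used as:
#     flatten_fold_bin(lcf[0], period=3.690621).plot()
# which is equivalent to the single chained line:
#     lcf[0].flatten(window_length=401).fold(period=3.690621).bin(time_bin_size=0.01).plot()
```

Chaining works because each of these methods returns a new light curve object rather than modifying the original in place.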


Interact with your light curve

There is also an interactive tool for light curves called interact_bls(). Box Least Squares (BLS) is a method for identifying transit signals in a light curve.

The interact_bls() method allows you to identify periodic transit signals in light curves by manually selecting the period and duration of the signal.


The light curve in the top right panel is phase-folded with the highest power period. When you zoom in on a region of period space in the BLS periodogram, it will automatically update the phase plot with the new period-at-max-power. Changing the duration using the slider in the bottom left will also update the BLS periodogram and phase-folded light curve. Finally, the parameters of the BLS model can be found in the bottom right panel.
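The idea behind the BLS periodogram can be illustrated with a stripped-down box search: fold the data at each trial period, slide a box over the phases, and score how strongly the flux inside the box dips. The sketch below is not the algorithm used by interact_bls(); it fixes the box to a single phase bin and works in integer cadence units so the arithmetic stays exact. The score s²/(r(1 − r)) follows the usual BLS signal-residue form, and the function name box_search and the synthetic data are assumptions.

```python
def box_search(times, fluxes, trial_periods, box_width):
    """Return (period, power) for the trial period with the strongest box-shaped dip.

    `times`, `trial_periods`, and `box_width` are integers in units of one cadence.
    """
    n = len(fluxes)
    mean_flux = sum(fluxes) / n
    best_period, best_power = None, 0.0
    for period in trial_periods:
        # Accumulate mean-subtracted flux and point counts in phase bins.
        n_bins = -(-period // box_width)  # ceiling division
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for t, f in zip(times, fluxes):
            index = (t % period) // box_width
            sums[index] += f - mean_flux
            counts[index] += 1
        for total, count in zip(sums, counts):
            if count == 0 or count == n or total >= 0:
                continue  # skip empty boxes and brightenings
            r = count / n   # fraction of points inside the box
            s = total / n   # summed, mean-subtracted flux inside the box
            power = s * s / (r * (1.0 - r))
            if power > best_power:
                best_period, best_power = period, power
    return best_period, best_power


# Synthetic light curve: 300 cadences with a 1% dip lasting 2 cadences every 30.
times = list(range(300))
fluxes = [0.99 if t % 30 < 2 else 1.0 for t in times]

period, power = box_search(times, fluxes, trial_periods=[20, 25, 30, 35, 40], box_width=2)
print(period)
```

At the wrong trial periods the transits land in different phase bins and are diluted by out-of-transit points, so the score drops; this is the behavior the periodogram panel of the interactive tool visualizes.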

What if your object is not a target of interest but was simply observed within the full frame images (FFIs)? You can still extract the data and create a 30-min or 10-min cadence light curve.

2.3 Creating a light curve using FFI data:

In our previous FFI search, we found that L 98-59 was observed in Sector 2 with a 30-min cadence. This observation is the second entry in the search_ffi results, i.e., search_ffi[1].

To create the light curve from the FFI data, we must first download the relevant images. Note that we do not want the entirety of the Sector 2 FFI, only a small region surrounding our object of interest. We can specify the size of the region we want to cut out using the commands below; in this case we want a 10x10 pixel region.

ffi_data = search_ffi[1].download(cutout_size=10)

Let’s now see what this cut out looks like and also check that our object is at the center of it.


The above figure shows the pixels on the CCD with which L 98-59 was observed. The color indicates the amount of flux in each pixel, in electrons per second. The y-axis shows the pixel row, and the x-axis shows the pixel column. The title tells us the TESS Input Catalog (TIC) identification number of the target and the observing cadence of this image. By default, plot() shows the first observation cadence in the Sector.

It looks like our star is isolated, so we can extract a light curve by simply summing up all the pixel values in each image. To do this, we first need to define an aperture mask.

Many decisions go into the choice of aperture mask, including the significant blending of the large TESS pixels. In this tutorial, we are going to define an aperture by computing the median flux and selecting only those pixels that lie a given number of sigma above it.

In most situations, a threshold mask will be the best choice for custom aperture photometry, as it doesn’t involve trial and error beyond finding the best sigma value. You can define a threshold mask using the following code:

target_mask = ffi_data.create_threshold_mask(threshold=15, reference_pixel='center')
n_target_pixels = target_mask.sum()

Here, n_target_pixels is 9, meaning that nine pixels are above our threshold and included in our mask. We can now check that our target is covered by this mask using plot().
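The thresholding rule can be sketched in a few lines of plain Python. The version below flags pixels brighter than the median plus a number of sigma, with sigma estimated from the median absolute deviation (MAD). It is a simplified illustration on a made-up 5x5 image: the function name threshold_mask is hypothetical, and the sketch omits the step in which Lightkurve keeps only the contiguous region around the reference pixel.

```python
import statistics


def threshold_mask(image, threshold=3.0):
    """Flag pixels brighter than median + threshold * sigma (sigma from the MAD)."""
    values = [v for row in image for v in row]
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    sigma = 1.4826 * mad  # scales the MAD to a Gaussian standard deviation
    cut = median + threshold * sigma
    return [[v > cut for v in row] for row in image]


# Made-up 5x5 cutout (electron / s) with a bright star near the center.
image = [
    [10, 11, 9, 10, 11],
    [9, 12, 40, 11, 10],
    [10, 45, 90, 42, 9],
    [11, 10, 38, 12, 10],
    [10, 9, 11, 10, 11],
]

mask = threshold_mask(image, threshold=15)
n_pixels = sum(sum(row) for row in mask)
print(n_pixels)
```

Raising the threshold shrinks the mask toward the brightest pixels, while lowering it admits more background, so the sigma value is the one knob to tune.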

ffi_data.plot(aperture_mask=target_mask, mask_color='r');

Nice! We see our target mask centered on the 9 brightest pixels in the center of the image. Let’s see what the light curve looks like. Note that this light curve will be uncorrected for any anomalies or noise, and that the flux is therefore based upon “Simple Aperture Photometry” (SAP).

To create our light curve, we will pass our aperture_mask to the to_lightcurve() function (https://docs.lightkurve.org/reference/api/lightkurve.KeplerTargetPixelFile.to_lightcurve.html?highlight=to_lightcurve).

ffi_lc = ffi_data.to_lightcurve(aperture_mask=target_mask)

Once again, we can examine the light curve data as a table, but note that this time there is only one flux value and that, by default, this is the SAP flux.

TessLightCurve length=1196 LABEL="" SECTOR=2
[Table output not reproduced here; column units include electron / s and pix.]

Let’s now plot this.

ffi_lc.plot(label="SAP FFI")

Looking at the above light curve, we can see two dominant peaks and observe that the flux in the aperture is dominated by what is known as scattered light. We can tell this because TESS orbits Earth twice in each sector, thus patterns which appear twice within a sector are typically related to TESS’ orbit (such as the scattered light effect).

We will discuss this issue in more detail below.

3. How to analyze and assess various data anomalies and how you might visualize them

Let’s take a look at the SAP light curve derived from our FFI data and the PDCSAP light curve derived from our Light Curve File.

ax = lcf[0].plot(column='pdcsap_flux', normalize=True, label="PDCSAP");
ffi_lc.plot(ax=ax, normalize=True, label="SAP FFI")

Looking at the figure above, you can see that the SAP light curve has a long-term change in brightness that has been removed in the PDCSAP light curve, while keeping the transits at the same depth. For most inspections, a PDCSAP light curve is what you want to use, but when looking at astronomical phenomena that aren’t planets (e.g. long-term variability), the SAP flux may be preferred.

The primary source of noise removed from the SAP light curve is scattered light. Each of TESS’s cameras has a lens hood to reduce the scattered light from the Earth and the Moon. Due to TESS’s wide field of view and the physical restrictions of the Sun shade, the lens hood is not 100% efficient. The effect of the scattered light on the CCDs can be seen in this video.

Interactive inspection:

By interactively inspecting the area around your object of interest, you can see when scattered light comes into play, and also how it affects the light curve. To do this, we use the interact() function.


You can move the large bottom left slider to change the location of the vertical red bar, which indicates which cadence is being shown in the TPF postage stamp image. The slider beneath the TPF postage stamp image controls the screen stretch, which defaults to logarithmic scaling initialized to 1% and 95% lower and upper limits respectively.

You can move your cursor over individual data points to show hover-over tooltips indicating additional information about that datum. Currently, the tooltips list the cadence, time, flux, and quality flags. The tools on the right hand side of the plots enable zooming and pixel selection.

Interaction modes:

  • Clicking on a single pixel shows the time series light curve of that pixel alone.
  • Shift-clicking on multiple pixels shows the light curve using that pixel mask.
  • Shift-clicking on an already selected pixel will deselect that pixel.
  • Clicking and dragging a box will make a rectangular aperture mask — individual pixels can be deselected from this mask by shift-clicking (box deselecting does not work).
  • The screen stretch high and low limits can be changed independently by clicking and dragging each end, or simultaneously by clicking and dragging in the middle.
  • The cadence slider updates the postage stamp image at the position of the vertical red bar in the light curve.
  • Clicking on a position in the light curve automatically seeks to that cadence number.
  • The left and right arrows can be clicked to increment the cadence number by one.
  • The interact() tool works for TESS data and Kepler/K2.

This tool can also be used to see how crowded the field around your source is and whether anything else unusual happened during the observation.

Interact Sky:

Lightkurve has an additional tool to interactively inspect target pixel files: interact_sky(). This method brings up a single frame of the target pixel file with targets identified by Gaia marked by red circles. The size of each circle scales with the magnitude of the target, where brighter sources are larger and fainter sources are smaller. Using your cursor, you can hover over the red circles to display useful information from Gaia, including each source’s Gaia ID, G band magnitude, and coordinates.


This tool is useful for inspecting crowded fields.

Cadence Quality Flags:

The TESS pipeline populates a series of quality flags to indicate when a cadence may have been taken during an anomalous event. These flags are available in the Light Curve Files, the Target Pixel Files, and a subset are available for the FFIs.
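Quality flags are stored as integers in which each bit marks a different event, so individual flags are tested with a bitwise AND. The sketch below decodes a made-up quality column. The bit values listed are a subset assumed from the archive documentation and should be checked against the official flag tables; the function names decode_quality and good_cadences are hypothetical.

```python
# Subset of TESS quality flag bits (values assumed; consult the official
# flag tables for the full, authoritative list).
QUALITY_BITS = {
    1: "Attitude tweak",
    2: "Safe mode",
    4: "Coarse point",
    8: "Earth point",
    32: "Reaction wheel desaturation event",
    128: "Manual exclude",
}


def decode_quality(value):
    """Return the names of the known flags set in an integer quality value."""
    return [name for bit, name in QUALITY_BITS.items() if value & bit]


def good_cadences(quality, bitmask):
    """Return the indices of cadences with none of the bits in `bitmask` set."""
    return [i for i, q in enumerate(quality) if (q & bitmask) == 0]


quality = [0, 0, 32, 0, 12, 0]  # made-up quality column
print(decode_quality(12))
print(good_cadences(quality, bitmask=4 | 8 | 32))
```

Combining bits with the OR operator, as in the last line, lets you choose how strict the filtering should be for a given analysis.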

Aperture Mask Image Flags:

The Light Curve Files and Target Pixel Files contain an image in the APERTURE FITS extension that describes how each pixel was used in the processing.

Tables of these flags can be found here, where a description of each flag is provided.
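The APERTURE image is also a bit field, so the same bitwise test recovers which pixels the pipeline used for a given purpose. The sketch below assumes that the value 1 marks pixels collected by the spacecraft and the value 2 marks pixels in the optimal photometric aperture; confirm both against the flag tables before relying on them. The image values and the function name pixels_with_flag are made up for illustration.

```python
COLLECTED = 1            # assumed: pixel was collected by the spacecraft
IN_OPTIMAL_APERTURE = 2  # assumed: pixel is in the optimal photometric aperture


def pixels_with_flag(aperture_image, flag):
    """Return a boolean mask of pixels whose value has `flag` set."""
    return [[(value & flag) != 0 for value in row] for row in aperture_image]


# Made-up 4x4 APERTURE image: 3 = collected and in the optimal aperture.
aperture_image = [
    [1, 1, 1, 1],
    [1, 3, 3, 1],
    [1, 3, 3, 1],
    [0, 1, 1, 1],
]

optimal = pixels_with_flag(aperture_image, IN_OPTIMAL_APERTURE)
n_optimal = sum(sum(row) for row in optimal)
print(n_optimal)
```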

Additional Resources

In this tutorial, we have covered the basics of how to obtain, reduce and analyze TESS data using Lightkurve. We have, however, only skimmed the surface of what Lightkurve can do and how to investigate the data. For more detailed tutorials as well as other useful tools, please visit the following pages.

  • Lightkurve Tutorials page: A set of 21 tutorials dealing with Kepler/K2 and TESS data
  • TESS GI data products page: A set of 7 TESS specific tutorials.
  • STScI Kepler K3 notebooks: A set of notebooks produced by a collaboration between NumFocus, MAST, Lightkurve, and the TESS GI office. They make use of Python astronomical data packages to demonstrate how to analyze time series data from these NASA missions. New tools and techniques for the advanced user are presented here.