How to find an Exoplanet with TESS data

When a planet passes in front of its host star, as seen from a particular viewpoint, it causes the light of that star to dim. This is known as a transit.

Many space missions have been specifically designed to detect planets using the transit method. One such mission is the Transiting Exoplanet Survey Satellite (TESS).

TESS is a NASA-sponsored Astrophysics Explorer-class mission that is performing a near all-sky survey to search for planets transiting nearby stars. The mission observes from a unique elliptical high Earth orbit (HEO) that provides an unobstructed view of its field to obtain continuous light curves and a more stable platform for precise photometry than a low Earth orbit.

TESS is equipped with four CCD cameras with adjacent fields of view, forming a 4 x 1 array, or ‘observing Sector’, with a combined field of view of 96 x 24 degrees, as illustrated below.

Each hemisphere is split into these observing Sectors, and each Sector is observed for ~27 days. Since 2018, TESS has observed approximately 80% of the sky, mapping both the northern and southern hemispheres, and detecting thousands of planet candidates.

Data from the TESS mission are publicly available from the Mikulski Archive for Space Telescopes (MAST). The main data products collected by the TESS mission are described below:

To learn more about the TESS mission and its data products, please visit the TESS GI pages.


Learning Goals

In this tutorial, we will teach the user how to access, analyze, and manipulate data from the TESS mission (this can also be applied to Kepler & K2). We will be utilizing a Python package called Lightkurve which offers a user-friendly way to analyze time series data on the brightness of planets, stars, and galaxies. The package is focused on supporting science with NASA’s Kepler and TESS space telescopes but can equally be used to analyze light curves obtained by your backyard telescope.

This tutorial assumes a basic knowledge of Python and astronomy, and will walk the user through several of the concepts outlined below:

  • How to use Lightkurve to access the various data products and create time series.
  • How to account for instrumental and noise effects within your data.
  • How to recover a planet transit from your data.


This tutorial requires the use of specific packages:

  • Lightkurve (v2.0.1) to work with TESS data.
  • Matplotlib for plotting.
  • NumPy for manipulating the data.

import lightkurve as lk
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

First time users

If you are not experienced with Python, or cannot download Lightkurve, you can run this notebook as a Colab notebook. Colaboratory allows you to write and execute Python in your browser with zero configuration required.

All you need is a Google account; then copy and paste the following command at the top of your Colab notebook:

!pip install lightkurve --quiet

This downloads and installs the Lightkurve package.

1. How to use Lightkurve to access the various data products and create a time series

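You can search for the various TESS data products using the following Lightkurve functions:

  • search_tesscut(): Search for cutouts from the Full Frame Images (FFIs).
  • search_targetpixelfile(): Search for Target Pixel Files.
  • search_lightcurve(): Search for Light Curve Files.

In this tutorial, we will be examining a nearby, bright target, Pi Mensae (TIC ID 261136679), around which TESS scientists discovered a short-period planet candidate on a 6.27-day orbit. See the ApJ paper by Huang et al. (2018) for more details.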

1.1 Accessing the data products

Let’s go through each one of the above functions and see what data is available.

search_ffi = lk.search_tesscut('Pi Mensae')
search_tpf = lk.search_targetpixelfile('Pi Mensae')
search_lcf = lk.search_lightcurve('Pi Mensae')
SearchResult containing 13 data products.
#   mission         year  author   exptime (s)  target_name  distance (arcsec)
0   TESS Sector 01  2018  TESScut  1426         Pi Mensae    0.0
1   TESS Sector 04  2018  TESScut  1426         Pi Mensae    0.0
2   TESS Sector 08  2019  TESScut  1426         Pi Mensae    0.0
3   TESS Sector 11  2019  TESScut  1426         Pi Mensae    0.0
4   TESS Sector 12  2019  TESScut  1426         Pi Mensae    0.0
5   TESS Sector 13  2019  TESScut  1426         Pi Mensae    0.0
6   TESS Sector 27  2020  TESScut  475          Pi Mensae    0.0
7   TESS Sector 28  2020  TESScut  475          Pi Mensae    0.0
8   TESS Sector 31  2020  TESScut  475          Pi Mensae    0.0
9   TESS Sector 34  2021  TESScut  475          Pi Mensae    0.0
10  TESS Sector 35  2021  TESScut  475          Pi Mensae    0.0
11  TESS Sector 38  2021  TESScut  475          Pi Mensae    0.0
12  TESS Sector 39  2021  TESScut  475          Pi Mensae    0.0

The above table provides several important pieces of information:

  • The sector in which the object was observed.
  • The year in which the object was observed.
  • The author of the data. This has multiple options, and each is a hyperlink that, when clicked, will provide you with more information.
  • The cadence of the observation.
  • The name of the target.
  • The distance of the observation from your target of interest. This is useful if you conduct a cone search around your object’s coordinates.

The table above indicates that our object was observed in multiple Sectors. Note that in Sectors 1 - 13 (2018 & 2019) the cadence of the FFI data was 30-min, but in Sectors 27 and above (2020 & 2021) it is 10-min.
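To get a feel for what these cadences mean in practice, the short calculation below estimates the maximum number of samples in one Sector, assuming a Sector lasts exactly 27 days with no interruptions. It covers the two FFI cadences mentioned above as well as the 2-min and 20-sec cadences available for selected targets. The variable names are illustrative, and real light curves contain somewhat fewer points because some cadences are lost to data gaps.

```python
# Upper bound on the number of samples in one ~27-day Sector at each cadence.
SECTOR_DAYS = 27                   # approximate Sector duration
SECONDS_PER_DAY = 24 * 60 * 60

# Cadence name -> time between consecutive samples, in seconds.
cadences_s = {"30-min": 1800, "10-min": 600, "2-min": 120, "20-sec": 20}

samples = {
    name: SECTOR_DAYS * SECONDS_PER_DAY // step
    for name, step in cadences_s.items()
}

for name, n in samples.items():
    print(f"{name:>6}: up to {n} samples per Sector")
```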

Let’s see if any other data exists - i.e., was it observed as a target of interest and does it have a Target Pixel File.

SearchResult containing 30 data products.
#   mission         year  author     exptime (s)  target_name  distance (arcsec)
0   TESS Sector 01  2018  SPOC       120          261136679    0.0
1   TESS Sector 01  2018  TESS-SPOC  1800         261136679    0.0
2   TESS Sector 04  2018  SPOC       120          261136679    0.0
3   TESS Sector 04  2018  TESS-SPOC  1800         261136679    0.0
4   TESS Sector 08  2019  SPOC       120          261136679    0.0
5   TESS Sector 08  2019  TESS-SPOC  1800         261136679    0.0
6   TESS Sector 11  2019  SPOC       120          261136679    0.0
7   TESS Sector 11  2019  TESS-SPOC  1800         261136679    0.0
8   TESS Sector 12  2019  SPOC       120          261136679    0.0
9   TESS Sector 12  2019  TESS-SPOC  1800         261136679    0.0
...
20  TESS Sector 31  2020  TESS-SPOC  600          261136679    0.0
21  TESS Sector 34  2021  SPOC       20           261136679    0.0
22  TESS Sector 34  2021  SPOC       120          261136679    0.0
23  TESS Sector 34  2021  TESS-SPOC  600          261136679    0.0
24  TESS Sector 38  2021  SPOC       20           261136679    0.0
25  TESS Sector 38  2021  SPOC       120          261136679    0.0
26  TESS Sector 38  2021  TESS-SPOC  600          261136679    0.0
27  TESS Sector 39  2021  SPOC       20           261136679    0.0
28  TESS Sector 39  2021  SPOC       120          261136679    0.0
29  TESS Sector 39  2021  TESS-SPOC  600          261136679    0.0
Length = 30 rows

Great! Our object was observed as a target of interest and has 2-min and 20-sec cadenced data. This means that there should be light curve files already on the archive. Let’s check those out.

SearchResult containing 41 data products.
#   mission         year  author     exptime (s)  target_name  distance (arcsec)
0   TESS Sector 01  2018  SPOC       120          261136679    0.0
1   TESS Sector 01  2018  TESS-SPOC  1800         261136679    0.0
2   TESS Sector 01  2018  QLP        1800         261136679    0.0
3   TESS Sector 01  2018  TASOC      120          261136679    0.0
4   TESS Sector 01  2018  TASOC      1800         261136679    0.0
5   TESS Sector 04  2018  SPOC       120          261136679    0.0
6   TESS Sector 04  2018  TESS-SPOC  1800         261136679    0.0
7   TESS Sector 04  2018  QLP        1800         261136679    0.0
8   TESS Sector 08  2019  SPOC       120          261136679    0.0
9   TESS Sector 08  2019  TESS-SPOC  1800         261136679    0.0
...
31  TESS Sector 31  2020  QLP        600          261136679    0.0
32  TESS Sector 34  2021  SPOC       20           261136679    0.0
33  TESS Sector 34  2021  SPOC       120          261136679    0.0
34  TESS Sector 34  2021  TESS-SPOC  600          261136679    0.0
35  TESS Sector 38  2021  SPOC       20           261136679    0.0
36  TESS Sector 38  2021  SPOC       120          261136679    0.0
37  TESS Sector 38  2021  TESS-SPOC  600          261136679    0.0
38  TESS Sector 39  2021  SPOC       20           261136679    0.0
39  TESS Sector 39  2021  SPOC       120          261136679    0.0
40  TESS Sector 39  2021  TESS-SPOC  600          261136679    0.0
Length = 41 rows

Wonderful! Light curves for our object of interest have already been created.

1.2 Creating a light curve using a Light Curve File

Now on to getting the light curve for our object of interest. From the above table, it looks like there are multiple authors for our target. For this tutorial, let’s stick to “SPOC” data products which have a 2-min cadence. We can return only these results using the following commands.

search_lcf_refined = lk.search_lightcurve('Pi Mensae', author="SPOC", exptime=120)
SearchResult containing 12 data products.
#   mission         year  author     exptime (s)  target_name  distance (arcsec)
0   TESS Sector 01  2018  SPOC       120          261136679    0.0
1   TESS Sector 04  2018  SPOC       120          261136679    0.0
2   TESS Sector 08  2019  SPOC       120          261136679    0.0
3   TESS Sector 11  2019  SPOC       120          261136679    0.0
4   TESS Sector 12  2019  SPOC       120          261136679    0.0
5   TESS Sector 13  2019  SPOC       120          261136679    0.0
6   TESS Sector 27  2020  SPOC       120          261136679    0.0
7   TESS Sector 28  2020  SPOC       120          261136679    0.0
8   TESS Sector 31  2020  SPOC       120          261136679    0.0
9   TESS Sector 34  2021  SPOC       120          261136679    0.0
10  TESS Sector 38  2021  SPOC       120          261136679    0.0
11  TESS Sector 39  2021  SPOC       120          261136679    0.0

We now see 12 search results. Let’s download these and see what the light curve looks like.

lcf = search_lcf_refined.download_all()
LightCurveCollection of 12 objects:
    0: <TessLightCurve LABEL="TIC 261136679" SECTOR=1 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    1: <TessLightCurve LABEL="TIC 261136679" SECTOR=4 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    2: <TessLightCurve LABEL="TIC 261136679" SECTOR=8 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    3: <TessLightCurve LABEL="TIC 261136679" SECTOR=11 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    4: <TessLightCurve LABEL="TIC 261136679" SECTOR=12 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    5: <TessLightCurve LABEL="TIC 261136679" SECTOR=13 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    6: <TessLightCurve LABEL="TIC 261136679" SECTOR=27 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    7: <TessLightCurve LABEL="TIC 261136679" SECTOR=28 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    8: <TessLightCurve LABEL="TIC 261136679" SECTOR=31 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    9: <TessLightCurve LABEL="TIC 261136679" SECTOR=34 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    10: <TessLightCurve LABEL="TIC 261136679" SECTOR=38 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>
    11: <TessLightCurve LABEL="TIC 261136679" SECTOR=39 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux>

The above indicates that we have downloaded the light curves for each Sector and stored them in a LightCurveCollection. You can look at the data for a specific Sector by indexing the collection; for example, lcf[0] displays the data for Sector 1 as a table.

TessLightCurve length=18279 LABEL="TIC 261136679" SECTOR=1 AUTHOR=SPOC FLUX_ORIGIN=pdcsap_flux
[Table output: time, flux, sap_flux, pdcsap_flux and further columns; the units appearing in the table are electron / s, d and pix.]

In this table, you are given the time and the flux for your object of interest. There do, however, seem to be three entries for flux: flux, sap_flux, and pdcsap_flux. By default, flux = pdcsap_flux, but what do these entries mean?

  • Simple Aperture Photometry (SAP): The SAP light curve is calculated by summing together the brightness of pixels that fall within an aperture set by the TESS mission. This is often referred to as the optimal aperture, but despite its name, it can sometimes be improved upon! Because the SAP light curve is a sum of the brightness in chosen pixels, it is still subject to systematic artifacts of the mission.
  • Pre-search Data Conditioning SAP (PDCSAP) flux: SAP flux from which long-term trends have been removed using so-called Co-trending Basis Vectors (CBVs). PDCSAP flux is usually cleaner than the SAP flux and will have fewer systematic trends.
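The SAP definition above can be sketched with NumPy on a synthetic stack of pixel images. This is only an illustration of "sum the pixels inside an aperture at every cadence": the array shapes, the brightness values and the 3x3 aperture are assumptions made for the example, not TESS pipeline settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a pixel time series: 50 cadences of a 5x5 pixel stamp.
n_cadences, n_rows, n_cols = 50, 5, 5
background = 10.0
pixels = background + rng.normal(0.0, 0.1, size=(n_cadences, n_rows, n_cols))
pixels[:, 1:4, 1:4] += 100.0  # a "star" occupying the central 3x3 pixels

# Boolean aperture: True for the pixels whose flux is attributed to the target.
aperture = np.zeros((n_rows, n_cols), dtype=bool)
aperture[1:4, 1:4] = True

# SAP flux: at every cadence, sum the pixels that fall inside the aperture.
sap_flux = pixels[:, aperture].sum(axis=1)

print(sap_flux.shape)   # one flux value per cadence
print(sap_flux.mean())  # close to 9 * (100 + 10) = 990
```

Because the sum includes whatever background and instrumental signal lands in those pixels, any systematic effect in the pixels passes straight through to the SAP light curve.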

You can switch between fluxes using the following commands:

pdcsap = lcf[0].pdcsap_flux

sapflux = lcf[0].sap_flux

Let’s now plot both the PDCSAP and SAP light curves and see what they look like.

ax = lcf[0].plot(column='sap_flux', normalize=True, label="SAP");
lcf[0].plot(ax=ax, column='pdcsap_flux', normalize=True, label="PDCSAP");

There are some big differences between these two light curves, specifically the dips in the SAP light curve and its overall gradient. These differences will be discussed later in the tutorial. For now, let’s think about how we can manipulate the light curves.

1.2.1 Manipulating a light curve

There are a set of useful functions in Lightkurve which you can use to work with the data. These include:

  • flatten(): Remove long term trends using a Savitzky–Golay filter
  • remove_outliers(): Remove outliers using simple sigma clipping
  • remove_nans(): Remove infinite or NaN values (these can occur during thruster firings)
  • fold(): Fold the data at a particular period
  • bin(): Reduce the time resolution of the array, taking the average value in each bin.
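To make the idea behind flatten() concrete, the sketch below removes a slow trend from a synthetic light curve by dividing the flux by a smoothed copy of itself. It is a minimal illustration only: it uses a running median as a stand-in for the Savitzky-Golay filter, and the synthetic data, window size and variable names are assumptions made for this example rather than Lightkurve internals.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Synthetic light curve: a slow sinusoidal trend plus box-shaped transits.
time = np.linspace(0.0, 27.0, 2001)                    # days
trend = 1.0 + 0.01 * np.sin(2 * np.pi * time / 20.0)   # slow variation
in_transit = np.abs((time % 6.0) - 3.0) < 0.1          # a transit every 6 days
flux = trend.copy()
flux[in_transit] -= 0.001

# Running median over an odd window that is much longer than one transit.
window = 101
padded = np.pad(flux, window // 2, mode="edge")        # keep the output length
smooth = np.median(sliding_window_view(padded, window), axis=1)

# Dividing by the smooth trend keeps the short transits but removes the slow trend.
flat = flux / smooth

print(np.ptp(flux[~in_transit]), np.ptp(flat[~in_transit]))
```

The window length is the key choice: it must be long compared with the transit duration, otherwise the smoothing follows the transit and the division removes part of the signal.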

We can use these simply on a light curve object. For this tutorial, let’s stick with the PDCSAP flux.

ax = lcf[0].plot()
ax.set_title("PDCSAP light curve of  Pi Mensae")

A possible transit is just about visible, but let us manipulate the light curve some more to see if we can pull it out.


flat_lc = lcf[0].flatten(window_length=1001)

The light curve looks much flatter. Unfortunately, there is a portion of the light curve that is very noisy, due to jitter in the TESS spacecraft. We can remove this simply by masking the light curve. First, we’ll build a mask that selects the times unaffected by the jitter.

# Flag the times that are good quality
mask = (flat_lc.time.value < 1346) | (flat_lc.time.value > 1350)
masked_lc = flat_lc[mask]

We can use Lightkurve to plot these two light curves over each other to see the difference.

# First define the `matplotlib.pyplot.axes`
ax = flat_lc.plot()

# Pass that axis to the next plot
masked_lc.plot(ax=ax, label='masked');

This looks much better. Now we might want to clip out some outliers from the light curve. We can do that with a simple Lightkurve function remove_outliers().

Remove outliers

clipped_lc = masked_lc.remove_outliers(sigma=6)
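For readers who want to see what sigma clipping does, here is a minimal NumPy sketch on synthetic data. It is an assumption-laden illustration, not the routine Lightkurve calls: the helper name sigma_clip_mask, the use of the median and standard deviation, and the iteration limit are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic flux: white noise around 1.0 with two injected outliers.
flux = rng.normal(1.0, 0.001, size=1000)
flux[[100, 500]] = [1.05, 0.90]


def sigma_clip_mask(values, sigma=6.0, max_iter=5):
    """Return a boolean mask that is True for the points kept after clipping."""
    keep = np.ones(values.size, dtype=bool)
    for _ in range(max_iter):
        center = np.median(values[keep])   # robust estimate of the typical value
        spread = np.std(values[keep])      # scatter of the currently kept points
        new_keep = np.abs(values - center) <= sigma * spread
        if np.array_equal(new_keep, keep):
            break                          # converged: nothing else was clipped
        keep = new_keep
    return keep


keep = sigma_clip_mask(flux, sigma=6.0)
clipped = flux[keep]
print(flux.size - clipped.size, "points removed")
```

A fairly high threshold such as 6 sigma is a cautious choice for transit work, since a tighter cut can start to remove genuine in-transit points.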

Finally, let’s use Lightkurve to fold the data at the exoplanet orbital period and see if we can detect the transit.

Folding the light curve and finding the transit

From the Pi Mensae paper, we know that planet c has a period of 6.27 days. We can use the fold() function to find the transit in our data as shown below.

folded_lc = clipped_lc.fold(period=6.27, epoch_time=1325.504)
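Conceptually, folding maps every timestamp onto its offset from the nearest reference transit, so that all transits line up at phase zero. The NumPy sketch below shows that arithmetic for a handful of made-up timestamps, using the same period and epoch as above; it illustrates the idea and is not taken from Lightkurve's source.

```python
import numpy as np

period = 6.27          # days
epoch_time = 1325.504  # reference transit time, in the light curve's time system

# A few illustrative timestamps (not real data).
time = np.array([1325.504, 1326.504, 1330.774, 1331.774, 1338.044])

# Offset from the nearest reference transit, wrapped into [-period/2, period/2).
phase = (time - epoch_time + 0.5 * period) % period - 0.5 * period

for t, p in zip(time, phase):
    print(f"t = {t:.3f}  ->  phase = {p:+.3f} d")
```

Timestamps that are a whole number of periods after the epoch land at phase zero, while points one day after or before a transit land at +1 and -1 day respectively.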

It looks like there’s something there, but it’s hard to see. Let’s bin the light curve to reduce the number of points, but also reduce the uncertainty of those points.

Binning the light curve

import astropy.units as u
binned_lc = folded_lc.bin(time_bin_size=5*u.minute)
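The reason binning helps is that averaging N independent points reduces the scatter by about a factor of sqrt(N). The sketch below demonstrates this with synthetic white noise. Note that it bins by a fixed number of points for simplicity, whereas the call above bins by time interval; the sizes and names used here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

n_points, bin_size = 10_000, 25
flux = rng.normal(1.0, 0.001, size=n_points)   # white noise around 1.0

# Average consecutive groups of `bin_size` points.
binned = flux.reshape(-1, bin_size).mean(axis=1)

ratio = flux.std() / binned.std()
print(f"scatter before: {flux.std():.2e}")
print(f"scatter after:  {binned.std():.2e}")
print(f"improvement:    {ratio:.1f}x (sqrt({bin_size}) = {np.sqrt(bin_size):.1f})")
```

The trade-off is time resolution: bins that are too wide compared with the transit duration will smear out the ingress and egress.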

And now we can see the transit of Pi Mensae c!

2. Creating a light curve using FFI data

In our previous FFI search, we found that Pi Men was observed in Sector 1 with a 30-min cadence. This data is stored as the first entry (index 0) of the search_ffi result.

To create the light curve from the FFI data, we must first download the relevant images. Note that we do not want the entirety of the Sector 1 FFI, only a small region surrounding our object of interest. We can specify the size of the region we want to cut out using the command below; in this case we want a 10x10 pixel region.

ffi_data = search_ffi[0].download(cutout_size=10)

Let’s now see what this cutout looks like and also check that our object is at the center of it.

ffi_data.plot()

The above figure shows the pixels on the CCD camera with which Pi Men was observed. The color indicates the amount of flux in each pixel, in electrons per second. The y-axis shows the pixel row, and the x-axis shows the pixel column. The title tells us the TESS Input Catalog (TIC) identification number of the target, and the observing cadence of this image. By default, plot() shows the first observation cadence in the Sector.

It looks like our star is isolated, so we can extract a light curve by simply summing up all the pixel values in each image. To do this, we need to first define an aperture mask.

Many decisions go into the choice of aperture mask, including the significant blending caused by the large TESS pixels. In this tutorial, we are going to define an aperture by taking the median flux value and selecting only the pixels that lie a certain number of sigma above it.
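The thresholding idea can be sketched with NumPy on a synthetic image. This is a simplified illustration: the image, the robust sigma estimate based on the median absolute deviation, and the variable names are assumptions made for the example. Lightkurve's own implementation may differ in detail, for example in how it uses a reference pixel to pick out one connected group of pixels.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic 10x10 median image: flat background plus a bright 3x3 "star".
median_image = 100.0 + rng.normal(0.0, 1.0, size=(10, 10))
median_image[4:7, 4:7] += 500.0

threshold = 10  # number of "sigma" above the median required for selection

# Robust sigma estimate from the median absolute deviation (MAD).
center = np.median(median_image)
mad = np.median(np.abs(median_image - center))
sigma = 1.4826 * mad  # scales the MAD to a Gaussian standard deviation

# Keep only pixels that are well above the typical background level.
target_mask = median_image > center + threshold * sigma

print(target_mask.sum(), "pixels selected")
```

Using the median and MAD rather than the mean and standard deviation keeps the bright star itself from inflating the estimate of the background scatter.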

In most situations, a threshold mask will be the best choice for custom aperture photometry, as it doesn’t involve trial and error beyond finding the best sigma value. You can define a threshold mask using the following code:

target_mask = ffi_data.create_threshold_mask(threshold=10, reference_pixel='center')
n_target_pixels = target_mask.sum()

This indicates that there are 18 pixels which are above our threshold and so in our mask. We can now check to make sure that our target is covered by this mask using plot.

ffi_data.plot(aperture_mask=target_mask, mask_color='r')

Nice! We see our target mask centered on the 18 brightest pixels in the center of the image. Let’s see what the light curve looks like. Note that this light curve will be uncorrected for any anomalies or noise, and that the flux is therefore based upon “Simple Aperture Photometry” (SAP).

To create our light curve, we will pass our aperture_mask to the to_lightcurve() function.

ffi_lc = ffi_data.to_lightcurve(aperture_mask=target_mask)

Once again, we can examine the light curve data as a table, but note this time that there is only one flux value and that, by default, this is the SAP flux.

TessLightCurve length=1267 LABEL="" SECTOR=1
[Table output: time and flux columns; the units appearing in the table are electron / s and pix.]

Let’s now plot this:

ffi_lc.scatter(label="SAP FFI")

We can see that there are problematic data points in this light curve which are probably due to jitter. Once again, we can remove these data points via creating and applying a mask.

mask_ffi = (ffi_lc.time.value < 1346) | (ffi_lc.time.value > 1350)
masked_lc_ffi = ffi_lc[mask_ffi]

OK, this looks a bit better but we should also clip the data again.

clipped_ffi = masked_lc_ffi.remove_outliers(sigma=6)

Looking at the above light curve, we can see that there are still a few odd trends that need to be addressed, but there is also strong evidence for the previously observed transit! We can try to clean up our data a little using Lightkurve’s built-in corrector classes. These are very useful for removing scattered light and other effects. You can learn more about them in the Lightkurve documentation.

In this example, we are going to use the Pixel Level Decorrelation (PLD) corrector (PLDCorrector). The PLD method has primarily been used to remove systematic trends introduced by small spacecraft motions during observations and has been shown to be successful at improving the precision of data taken by the Spitzer space telescope. PLD works by identifying a set of trends in the pixels surrounding the target star and performing linear regression to create a combination of these trends that effectively models the systematic noise introduced by spacecraft motion. This noise model is then subtracted from the uncorrected light curve. We can apply it to our data using the code shown below.

from lightkurve.correctors import PLDCorrector
pld = PLDCorrector(ffi_data[mask_ffi], aperture_mask=target_mask)
# Run the correction first: diagnose() reports on the most recent call to correct()
corrected_ffi = pld.correct(pca_components=3)
pltAxis = pld.diagnose()
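The regression step at the heart of this approach can be illustrated with plain NumPy. The sketch below is a deliberately simplified stand-in: it fits two known synthetic trends to a synthetic light curve by least squares and subtracts the fitted noise model. In real PLD the regressors are derived from the surrounding pixels (for example via principal components); every array and name here is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 500
time = np.linspace(0.0, 10.0, n)

# Two systematic trends shared by the target and nearby pixels (synthetic).
trend_a = np.sin(2 * np.pi * time / 7.0)
trend_b = (time - time.mean()) / 5.0

# Design matrix: a constant term plus the two trends used as regressors.
design_matrix = np.column_stack([np.ones(n), trend_a, trend_b])

# Uncorrected flux = constant star + systematics + white noise.
raw_flux = 1000.0 + 3.0 * trend_a - 2.0 * trend_b + rng.normal(0.0, 0.1, n)

# Linear regression: find the weights that best reproduce the raw flux.
weights, *_ = np.linalg.lstsq(design_matrix, raw_flux, rcond=None)

# Noise model built from the trend terms only, so the mean flux level is kept.
noise_model = design_matrix[:, 1:] @ weights[1:]
corrected_flux = raw_flux - noise_model

print(f"scatter before: {raw_flux.std():.2f}")
print(f"scatter after:  {corrected_flux.std():.2f}")
```

In practice the number of regressors (pca_components in the call above) controls how aggressive the correction is: too few leaves systematics behind, while too many risks fitting away real astrophysical signal.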

The diagnostic plots indicate the corrections applied to our light curve: a background was removed and a spline was applied, and outliers are also shown. Let’s now plot our corrected light curve and compare the corrected flux to the non-corrected flux.

ax = ffi_lc.plot(normalize=True, label="SAP FFI");
corrected_ffi.remove_outliers().plot(ax=ax,normalize=True,label="SAP FFI corrected")

We can see that the corrector removed a lot of the trends that we were seeing. Let’s now proceed as we did before and compare the results. First we need to flatten().

ffi_flat_lc = corrected_ffi.flatten(window_length=1001)

Now we need to fold().

folded_ffi = ffi_flat_lc.fold(period=6.27, epoch_time=1325.504)

It is a little noisier than before and a bit more difficult to see due to the longer cadence (30-min), but we can clearly make out the transit again. Let’s compare this to our earlier light curve.

ax = folded_lc.plot(label="LightCurve Object")
folded_ffi.plot(ax=ax, label="FFI")

Great! The transit is shown in both cases. It is clear more work needs to be done on the FFI to remove noise and instrumental trends from the data, but this is a good start!

Additional Resources

In this tutorial, we have covered the basics of how to obtain, reduce and analyze TESS data using Lightkurve. We have, however, only skimmed the surface of what Lightkurve can do and how to investigate the data. For more detailed tutorials as well as other useful tools, please visit the following pages.

  • Lightkurve Tutorials page: A set of 21 tutorials dealing with Kepler/K2 and TESS data
  • TESS GI data products page: A set of 7 TESS specific tutorials.
  • STScI Kepler K2 notebooks: A set of notebooks produced by a collaboration between NumFOCUS, MAST, Lightkurve, and the TESS GI office. They make use of Python astronomical data packages to demonstrate how to analyze time series data from these NASA missions. New tools and techniques for the advanced user are also presented here.


Rebekah Hounsell (with help from the Lightkurve Collaboration, 2018) - Support scientist for TESS in the NASA GSFC GI Office. For more help with TESS data, please contact us at