lightkurve.LightCurve.remove_outliers

LightCurve.remove_outliers(sigma=5.0, sigma_lower=None, sigma_upper=None, return_mask=False, **kwargs)

Removes outlier data points using sigma-clipping.

This method returns a new LightCurve object from which data points have been removed if their flux values lie above or below the median flux by more than sigma times the standard deviation.

Sigma-clipping works by iterating over data points, each time rejecting values that are discrepant by more than a specified number of standard deviations from a center value. If the data contains invalid values (NaNs or infs), they are automatically masked before performing the sigma clipping.
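The iteration described above can be sketched in plain Python. This is an illustration only, not the astropy implementation: the helper name sigma_clip_mask is hypothetical, and the sketch assumes the median as the center value, the population standard deviation as the spread, and a cap of five iterations.

```python
import math
import statistics


def sigma_clip_mask(values, sigma=5.0, sigma_lower=None, sigma_upper=None, maxiters=5):
    """Hypothetical helper: return a list of booleans, True marking a rejected value."""
    lower = sigma if sigma_lower is None else sigma_lower
    upper = sigma if sigma_upper is None else sigma_upper

    # Invalid values (NaN or inf) are masked before any clipping takes place.
    rejected = [not math.isfinite(v) for v in values]

    for _ in range(maxiters):
        kept = [v for v, bad in zip(values, rejected) if not bad]
        if len(kept) < 2:
            break
        center = statistics.median(kept)   # assumed center function
        spread = statistics.pstdev(kept)   # assumed spread function (population std)

        newly_rejected = 0
        for i, v in enumerate(values):
            if rejected[i]:
                continue
            # Reject points that fall outside the (possibly asymmetric) limits.
            if v < center - lower * spread or v > center + upper * spread:
                rejected[i] = True
                newly_rejected += 1

        # Stop as soon as a pass rejects nothing new.
        if newly_rejected == 0:
            break
    return rejected


flux = [1, 1000, 1, -1000, 1]
symmetric = sigma_clip_mask(flux, sigma=1)
upper_only = sigma_clip_mask(flux, sigma_lower=float("inf"), sigma_upper=1)
```

In this sketch an infinite sigma_lower pushes the lower limit to negative infinity, so no point can ever fall below it; that is why only the high outlier is flagged in upper_only.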

Note

This function is a convenience wrapper around astropy.stats.sigma_clip() and provides the same functionality. Any extra arguments passed to this method will be passed on to sigma_clip.

Parameters
sigma : float

The number of standard deviations to use for both the lower and upper clipping limit. These limits are overridden by sigma_lower and sigma_upper, if input. Defaults to 5.

sigma_lower : float or None

The number of standard deviations to use as the lower bound for the clipping limit. Can be set to float('inf') in order to avoid clipping outliers below the median at all. If None then the value of sigma is used. Defaults to None.

sigma_upper : float or None

The number of standard deviations to use as the upper bound for the clipping limit. Can be set to float('inf') in order to avoid clipping outliers above the median at all. If None then the value of sigma is used. Defaults to None.

return_mask : bool

Whether or not to return a mask (i.e. a boolean array) indicating which data points were removed. Entries marked as True in the mask are considered outliers. This mask is not returned by default.

**kwargs : dict

Dictionary of arguments to be passed to astropy.stats.sigma_clip.

Returns
clean_lc : LightCurve

A new light curve object from which outlier data points have been removed.

outlier_mask : NumPy array, optional

Boolean array flagging which cadences were removed. Only returned if return_mask=True.

Examples

This example generates a new light curve in which all points that are more than 1 standard deviation from the median are removed:

>>> from lightkurve import LightCurve
>>> lc = LightCurve(time=[1, 2, 3, 4, 5], flux=[1, 1000, 1, -1000, 1])
>>> lc_clean = lc.remove_outliers(sigma=1)
>>> lc_clean.time
<Time object: scale='tdb' format='jd' value=[1. 3. 5.]>
>>> lc_clean.flux
<Quantity [1., 1., 1.]>

Instead of specifying sigma, you may specify separate sigma_lower and sigma_upper parameters to remove only outliers above or below the median. For example:

>>> lc = LightCurve(time=[1, 2, 3, 4, 5], flux=[1, 1000, 1, -1000, 1])
>>> lc_clean = lc.remove_outliers(sigma_lower=float('inf'), sigma_upper=1)
>>> lc_clean.time
<Time object: scale='tdb' format='jd' value=[1. 3. 4. 5.]>
>>> lc_clean.flux
<Quantity [    1.,     1., -1000.,     1.]>

Optionally, you may use the return_mask parameter to return a boolean array which flags the outliers identified by the method. For example:

>>> lc_clean, mask = lc.remove_outliers(sigma=1, return_mask=True)
>>> mask
array([False,  True, False,  True, False])
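Because True marks a removed point, the mask can also be used to look up which cadences were flagged. The following is a minimal sketch that uses plain Python lists in place of the light curve columns (an assumption made so that the snippet runs without lightkurve installed); the mask values are copied from the example above.

```python
from itertools import compress

time = [1, 2, 3, 4, 5]
flux = [1, 1000, 1, -1000, 1]
# Mask copied from the example above: True marks a removed cadence.
mask = [False, True, False, True, False]

# Select the entries where the mask is True (the flagged outliers).
removed_times = list(compress(time, mask))
removed_flux = list(compress(flux, mask))

# Invert the mask to select the entries that were kept.
kept_flux = list(compress(flux, (not m for m in mask)))
```

With NumPy arrays the same selection is usually written as boolean indexing, for example time[mask] for the flagged points and time[~mask] for the retained ones.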