DesignMatrix

class lightkurve.correctors.DesignMatrix(df, columns=None, name='unnamed_matrix', prior_mu=None, prior_sigma=None)

Bases: object

A matrix of column vectors for use in linear regression.

This class provides a convenient way to work with a set of one or more regressors that are known to correlate with trends or systematic noise signals that should be removed from a light curve. Specifically, it is designed to provide the design matrix used by Lightkurve’s RegressionCorrector class.
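To make the role of a design matrix concrete, here is a minimal NumPy-only sketch of regressing a signal against a set of column vectors and subtracting the fitted trend. It does not use Lightkurve or RegressionCorrector; the synthetic data and the names `regressors`, `flux`, and `coefficients` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows = 200
time = np.linspace(0, 10, n_rows)

# Two hypothetical regressors assumed to correlate with the systematics,
# stored as the columns of an (n_rows, n_columns) matrix.
regressors = np.column_stack([time, np.sin(time)])

# Synthetic light curve: a linear combination of the regressors plus noise.
true_coefficients = np.array([0.5, -2.0])
flux = regressors @ true_coefficients + rng.normal(0, 0.01, n_rows)

# Ordinary least squares: find the coefficients that best reproduce the flux.
coefficients, *_ = np.linalg.lstsq(regressors, flux, rcond=None)

# Subtracting the fitted model leaves the residual (detrended) signal.
corrected_flux = flux - regressors @ coefficients
```

Each column of the matrix receives one fitted coefficient, which is why the choice of columns determines which trends can be removed.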

Parameters
df : dict, array, or pandas.DataFrame object

Columns to include in the design matrix. If this object is not a DataFrame then it will be passed to the DataFrame constructor.

columns : iterable of str (optional)

Column names, if not already provided via df.

name : str

Name of the matrix.

prior_mu : array

Prior means of the coefficients associated with each column in a linear regression problem.

prior_sigma : array

Prior standard deviations of the coefficients associated with each column in a linear regression problem.
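One common way for Gaussian priors on the coefficients to enter a linear regression is through regularized normal equations. The sketch below assumes independent Gaussian priors and a single flux uncertainty `sigma`. The helper `solve_with_priors` is hypothetical and is not part of Lightkurve; how RegressionCorrector actually applies the priors is not described here.

```python
import numpy as np

def solve_with_priors(X, y, sigma, prior_mu, prior_sigma):
    """Maximum a posteriori coefficients for y = X @ w + noise.

    Hypothetical helper: assumes independent Gaussian priors
    w_j ~ N(prior_mu_j, prior_sigma_j**2) and Gaussian noise of width sigma.
    """
    prior_mu = np.asarray(prior_mu, dtype=float)
    prior_sigma = np.asarray(prior_sigma, dtype=float)
    # An infinite prior_sigma gives zero prior precision, i.e. no constraint.
    prior_precision = 1.0 / prior_sigma**2
    A = X.T @ X / sigma**2 + np.diag(prior_precision)
    b = X.T @ y / sigma**2 + prior_precision * prior_mu
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, 3.0]) + rng.normal(0, 0.1, 100)

# Uninformative priors reproduce ordinary least squares.
w_flat = solve_with_priors(X, y, 0.1, prior_mu=[0, 0], prior_sigma=[np.inf, np.inf])
# A very tight prior pins the first coefficient near its prior mean of zero.
w_tight = solve_with_priors(X, y, 0.1, prior_mu=[0, 0], prior_sigma=[1e-6, np.inf])
```

In this formulation a small prior_sigma pulls a coefficient toward its prior_mu, while an infinite prior_sigma leaves it unconstrained.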

Attributes Summary

columns: List of column names.
rank: Matrix rank computed using numpy.linalg.matrix_rank.
shape: Tuple specifying the shape of the matrix as (n_rows, n_columns).
values: 2D numpy array containing the matrix values.

Methods Summary

append_constant(self[, prior_mu, prior_sigma]): Returns a new DesignMatrix with a column of ones appended.
pca(self[, nterms]): Returns a new DesignMatrix with a smaller number of regressors.
plot(self[, ax]): Visualize the design matrix values as an image.
plot_priors(self[, ax]): Visualize the coefficient priors.
split(self, row_indices): Returns a new DesignMatrix with regressors split into multiple columns.
standardize(self): Returns a new DesignMatrix in which the columns have been median-subtracted and sigma-divided.

Attributes Documentation

columns

List of column names.

rank

Matrix rank computed using numpy.linalg.matrix_rank.

shape

Tuple specifying the shape of the matrix as (n_rows, n_columns).

values

2D numpy array containing the matrix values.
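The rank is worth checking because a rank lower than the number of columns means some regressors are linear combinations of others, so their coefficients cannot be determined uniquely. The following NumPy-only sketch illustrates this with a small hypothetical matrix; it calls numpy.linalg.matrix_rank directly rather than going through DesignMatrix.

```python
import numpy as np

# A hypothetical matrix with three columns, the third being the sum of the
# first two, so only two columns are linearly independent.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([0.0, 1.0, 0.0, 1.0])
values = np.column_stack([a, b, a + b])

n_rows, n_columns = values.shape        # (4, 3)
rank = np.linalg.matrix_rank(values)    # 2, which is less than n_columns
is_rank_deficient = rank < n_columns
```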

Methods Documentation

append_constant(self, prior_mu=0, prior_sigma=inf)

Returns a new DesignMatrix with a column of ones appended.

Returns
DesignMatrix

New design matrix with a column of ones appended. This column is named “offset”.
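The effect of appending a constant column can be sketched with plain NumPy: a column of ones lets the regression fit a constant offset in addition to the other regressors. The array below is made up for illustration and the code does not call Lightkurve.

```python
import numpy as np

values = np.array([[0.1, 2.0],
                   [0.2, 1.0],
                   [0.3, 0.0]])

# Append a column of ones so the regression can fit a constant offset.
offset = np.ones((values.shape[0], 1))
with_offset = np.hstack([values, offset])
```

With the default prior_sigma=inf shown in the signature, the coefficient of the offset column is left unconstrained by its prior.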

pca(self, nterms=6)

Returns a new DesignMatrix with a smaller number of regressors.

This method uses principal component analysis (PCA) to reduce the number of columns in the matrix.

Parameters
nterms : int

Number of columns in the new matrix.

Returns
DesignMatrix

A new design matrix with PCA applied.
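One way to reduce a set of regressors to a few dominant components is a singular value decomposition. The sketch below keeps the first `nterms` left singular vectors as the new columns. The helper `pca_columns` is hypothetical; which PCA routine the library calls, and whether the data are centered first, are not stated in the text above and are assumptions here.

```python
import numpy as np

def pca_columns(values, nterms):
    """Return the first `nterms` left singular vectors of `values`.

    Hypothetical helper: applies a plain SVD without centering the columns.
    """
    u, s, vt = np.linalg.svd(values, full_matrices=False)
    return u[:, :nterms]

rng = np.random.default_rng(1)
values = rng.normal(size=(50, 10))   # 10 made-up regressors
reduced = pca_columns(values, nterms=6)
```

The resulting columns are mutually orthogonal, which removes redundancy between regressors before fitting.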

plot(self, ax=None, **kwargs)

Visualize the design matrix values as an image.

Uses Lightkurve’s plot_image utility function, which is built on Matplotlib, to visualize the matrix values.

Parameters
ax : Axes

A matplotlib axes object to plot into. If no axes is provided, a new one will be created.

**kwargs : dict

Extra parameters to be passed to plot_image.

Returns
Axes

The matplotlib axes object.

plot_priors(self, ax=None)

Visualize the coefficient priors.

Parameters
ax : Axes

A matplotlib axes object to plot into. If no axes is provided, a new one will be created.

Returns
Axes

The matplotlib axes object.

split(self, row_indices)

Returns a new DesignMatrix with regressors split into multiple columns.

This method returns a new design matrix containing n_columns * (len(row_indices) + 1) regressors, because splitting at k row indices yields k + 1 contiguous segments. This is useful in situations where the linear regression can be improved by fitting separate coefficients for different contiguous parts of the regressors.

Parameters
row_indices : iterable of integers

Every regressor (i.e. column) in the design matrix will be split up over multiple columns separated at the indices provided.

Returns
DesignMatrix

A new design matrix with shape (n_rows, (len(row_indices) + 1) * n_columns).
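The splitting described above can be illustrated with a NumPy-only sketch. The helper `split_columns` is hypothetical and is not part of Lightkurve. It assumes that each index marks the first row of a new segment (so k indices give k + 1 segments per column) and that rows outside a segment are filled with zeros.

```python
import numpy as np

def split_columns(values, row_indices):
    """Split every column at `row_indices`, zero-filling outside each segment.

    Hypothetical helper: each index is assumed to start a new segment.
    """
    n_rows, n_columns = values.shape
    edges = [0, *row_indices, n_rows]
    blocks = []
    for start, stop in zip(edges[:-1], edges[1:]):
        # Copy only the rows of this segment; all other rows stay zero.
        block = np.zeros_like(values)
        block[start:stop] = values[start:stop]
        blocks.append(block)
    return np.hstack(blocks)

values = np.arange(1.0, 7.0).reshape(6, 1)   # one regressor, six rows
split = split_columns(values, row_indices=[2, 4])
# split has three columns:
#   [1, 2, 0, 0, 0, 0], [0, 0, 3, 4, 0, 0], [0, 0, 0, 0, 5, 6]
```

Because each segment has its own column, a regression can assign it its own coefficient, for example when the systematics change after a gap in the data.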

standardize(self)

Returns a new DesignMatrix in which the columns have been median-subtracted and sigma-divided.

For each column in the matrix, this method subtracts the column’s median and divides by the column’s standard deviation. The result is analogous to the column’s “standard scores” or “z-values”, except that the median is used in place of the mean.

This operation is useful because it makes the matrix easier to visualize and the fitted coefficients easier to interpret.

Notes:
* Standardizing a spline design matrix will break the splines.
* Columns with constant values (i.e. zero standard deviation) will be left unchanged.

Returns
DesignMatrix

A new design matrix with median-subtracted and sigma-divided columns.
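The standardization described above can be sketched in a few lines of NumPy. The helper `standardize_columns` is hypothetical and not part of Lightkurve. It follows the description and notes in this section, and any further handling the library may apply (for example of missing values) is not reproduced.

```python
import numpy as np

def standardize_columns(values):
    """Subtract each column's median and divide by its standard deviation.

    Hypothetical helper: columns with zero standard deviation are returned
    unchanged, following the note in this section.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values, axis=0)
    std = np.std(values, axis=0)
    result = values.copy()
    varying = std > 0
    result[:, varying] = (values[:, varying] - median[varying]) / std[varying]
    return result

values = np.array([[1.0, 5.0],
                   [2.0, 5.0],
                   [3.0, 5.0]])
standardized = standardize_columns(values)
# The first column now has median 0 and standard deviation 1;
# the constant second column is left as it was.
```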