Basis.constrain()

Basis.constrain(constraint)

Apply a linear constraint to the basis and corresponding penalty.

When a constraint is applied, the type of constraint is saved to Basis.constraint, and the reparameterization matrix is saved to Basis.reparam_matrix.

Parameters:

constraint (Array | ndarray | bool | number | bool | int | float | complex | Literal['sumzero_term', 'sumzero_coef', 'constant_and_linear']) – Type of constraint or custom linear constraint matrix to apply. If an array is supplied, the constraint will be A @ coef == 0, where A is the supplied array (the constraint matrix).
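As a minimal NumPy sketch of what a custom constraint matrix expresses: the matrix `A` and the coefficient vectors below are illustrative assumptions, and the snippet only checks the condition A @ coef == 0 by hand rather than calling Basis.constrain().

```python
import numpy as np

J = 5  # number of basis functions / coefficients (illustrative value)

# Hypothetical custom constraint: the first and last coefficient must be
# equal, i.e. coef[0] - coef[-1] == 0. One row per constraint.
A = np.zeros((1, J))
A[0, 0] = 1.0
A[0, -1] = -1.0

coef_ok = np.array([2.0, -1.0, 0.5, 3.0, 2.0])   # satisfies the constraint
coef_bad = np.array([2.0, -1.0, 0.5, 3.0, 0.0])  # violates the constraint

# The constraint holds exactly when A @ coef is the zero vector.
satisfied_ok = bool(np.allclose(A @ coef_ok, 0.0))
satisfied_bad = bool(np.allclose(A @ coef_bad, 0.0))
```

Each row of the supplied array encodes one linear restriction, so the array needs as many columns as the basis has coefficients.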

Return type:

Self

Returns:

The modified basis instance (self).

Notes

This method implements the procedure detailed by Kneib et al. (2019). For the following exposition, which is quoted almost verbatim from Kneib et al. (2019), assume that this basis is used to evaluate a function

\[s(\mathbf{x}_i) = \sum_{j=1}^J B_j(\mathbf{x}_i) \beta_j = \mathbf{b}(\mathbf{x}_i)^\top \boldsymbol{\beta},\]

where

  • \(i=1, \dots, N\) is the observation index,

  • \(\mathbf{x}_i^\top = [x_{i,1}, \dots, x_{i,M}]\) are covariate observations, where \(M\) denotes the number of covariates,

  • \(\mathbf{b}(\mathbf{x})^\top = [B_1(\mathbf{x}), \dots, B_J(\mathbf{x})]\) is the vector of basis function evaluations, and

  • \(\boldsymbol{\beta}^\top = [\beta_1, \dots, \beta_J]\) are the corresponding coefficients.

The basis matrix for such a term is

\[\begin{split}\mathbf{B} = \begin{bmatrix} \mathbf{b}(\mathbf{x}_1)^\top \\ \vdots \\ \mathbf{b}(\mathbf{x}_N)^\top \end{bmatrix},\end{split}\]

and the term can be written in matrix form as

\[\mathbf{s} = \mathbf{B} \boldsymbol{\beta},\]

where \(\mathbf{B}\) is the basis matrix of dimension \(N \times J\). We consider \(\boldsymbol{\beta} \in \mathbb{R}^J\) to be subject to linear constraints of the form

\[\mathbf{A} \boldsymbol{\beta} = \mathbf{0}.\]

Here, \(\mathbf{A}\) is an \(A \times J\) constraint matrix, where the scalar \(A\) denotes the number of constraints. To explicitly remove the constrained component, we construct a complementary matrix \(\bar{\mathbf{A}} \in \mathbb{R}^{(J-A) \times J}\) such that

\[\bar{\mathbf{A}} \mathbf{A}^\top = \mathbf{0},\]

and the stacked matrix \([\mathbf{A}^\top, \bar{\mathbf{A}}^\top]^\top\) is of full rank. One possible construction of \(\bar{\mathbf{A}}\) is based on the eigenvalue decomposition of \(\mathbf{A}^\top \mathbf{A}\), using the eigenvectors corresponding to zero eigenvalues. This is the construction of \(\bar{\mathbf{A}}\) used in this method. Under the full-rank assumption, the inverse of the composed matrix exists and can be written as

\[\begin{split}\begin{bmatrix} \mathbf{A} \\ \bar{\mathbf{A}} \end{bmatrix}^{-1} = \begin{bmatrix} \mathbf{C}, \bar{\mathbf{C}} \end{bmatrix},\end{split}\]

where \(\mathbf{C} \in \mathbb{R}^{J \times A}\) and \(\bar{\mathbf{C}} \in \mathbb{R}^{J \times (J-A)}\). This yields the reparameterisation

\[\boldsymbol{\beta} = \mathbf{C} \boldsymbol{\alpha} + \bar{\mathbf{C}} \boldsymbol{\gamma},\]

where \(\boldsymbol{\alpha} = \mathbf{A} \boldsymbol{\beta} = \mathbf{0}\) vanishes due to the constraint and \(\boldsymbol{\gamma} = \bar{\mathbf{A}} \boldsymbol{\beta}\) represents the remaining unconstrained coefficients. Applying this reparameterisation to the functional effect gives \(\bar{\mathbf{s}} = \bar{\mathbf{B}} \boldsymbol{\gamma}\), where the basis matrix is reparameterized as

\[\bar{\mathbf{B}} = \mathbf{B} \bar{\mathbf{C}}.\]

Accordingly, the original penalty matrix \(\mathbf{K}\) is reparameterized as

\[\bar{\mathbf{K}} = \bar{\mathbf{C}}^\top \mathbf{K} \bar{\mathbf{C}}.\]
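The steps above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's actual implementation: the basis matrix `B` is random stand-in data, the penalty `K` is an assumed second-order difference penalty, and `A` is assumed to have full row rank so that exactly J - A eigenvalues of A^T A are zero.

```python
import numpy as np

rng = np.random.default_rng(0)
N, J = 50, 6

B = rng.normal(size=(N, J))            # stand-in basis matrix (assumption)
D = np.diff(np.eye(J), n=2, axis=0)    # second-order difference matrix
K = D.T @ D                            # stand-in penalty matrix (assumption)

A = np.ones((1, N)) @ B                # example constraint matrix, shape (1, J)
n_constraints = A.shape[0]

# Eigendecomposition of the symmetric matrix A^T A. eigh returns the
# eigenvalues in ascending order, so the first J - A eigenvectors belong
# to the (numerically) zero eigenvalues and span the null space of A.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
A_bar = eigvecs[:, : J - n_constraints].T      # shape (J - A, J)

# Invert the stacked matrix and split the inverse column-wise into C, C_bar.
stacked = np.vstack([A, A_bar])
inverse = np.linalg.inv(stacked)
C = inverse[:, :n_constraints]                 # shape (J, A)
C_bar = inverse[:, n_constraints:]             # shape (J, J - A)

# Reparameterized basis and penalty matrices.
B_bar = B @ C_bar
K_bar = C_bar.T @ K @ C_bar

# Any unconstrained gamma maps to a beta that satisfies A @ beta == 0.
gamma = rng.normal(size=J - n_constraints)
beta = C_bar @ gamma
```

With this choice of `A`, the evaluated term `B_bar @ gamma` sums to zero up to floating-point error, whatever value `gamma` takes.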

Default constraint options

The default options correspond to the following constraint matrices:

  • "sumzero_term": \(\mathbf{A} = \mathbf{1}^\top \mathbf{B}\), where \(\mathbf{B}\) is the basis matrix and \(\mathbf{1}\) is a vector of ones of length \(N\). This is the preferred option for a sum-to-zero constraint, because it centers the evaluated term.

  • "sumzero_coef": \(\mathbf{A} = \mathbf{1}^\top\), where \(\mathbf{1}\) is a vector of ones of length \(J\). This is an alternative sum-to-zero constraint, which only ensures that the coefficients sum to zero.

  • "constant_and_linear": \(\mathbf{A}=(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top \mathbf{B}\), where \(\mathbf{X} = [\mathbf{1}, \mathbf{x}]\) is a design matrix built with the covariate observations \(\mathbf{x}\) used in this basis. This constraint removes both a constant (like "sumzero_term") and a linear trend from the term modeled with this basis.
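The three constraint matrices listed above can be written out in NumPy as a rough sketch. The Gaussian-bump basis, the covariate values, and the helper `project_to_null_space` are all hypothetical and serve only to check what each constraint does to a coefficient vector; none of them are part of the library.

```python
import numpy as np

rng = np.random.default_rng(1)
N, J = 40, 5

x = np.sort(rng.uniform(-1.0, 1.0, size=N))    # covariate observations
centers = np.linspace(-1.0, 1.0, J)
# Stand-in basis matrix of Gaussian bumps (assumption, not a B-spline basis).
B = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / 0.5) ** 2)

# "sumzero_term": A = 1^T B, with 1 of length N.
A_sumzero_term = np.ones((1, N)) @ B             # shape (1, J)

# "sumzero_coef": A = 1^T, with 1 of length J.
A_sumzero_coef = np.ones((1, J))                 # shape (1, J)

# "constant_and_linear": A = (X^T X)^{-1} X^T B with X = [1, x].
X = np.column_stack([np.ones(N), x])             # shape (N, 2)
A_const_lin = np.linalg.solve(X.T @ X, X.T @ B)  # shape (2, J)


def project_to_null_space(A, beta):
    """Hypothetical helper: return the closest vector to beta with A @ beta == 0."""
    return beta - np.linalg.pinv(A) @ (A @ beta)


beta = rng.normal(size=J)

s_term = B @ project_to_null_space(A_sumzero_term, beta)  # evaluated term sums to zero
b_coef = project_to_null_space(A_sumzero_coef, beta)      # coefficients sum to zero
s_lin = B @ project_to_null_space(A_const_lin, beta)      # no constant or linear trend

# Least-squares fit of s_lin on [1, x]: both fitted coefficients are ~0.
trend, *_ = np.linalg.lstsq(X, s_lin, rcond=None)
```

The projection is used here only to obtain coefficient vectors that satisfy each constraint. The method described in the Notes instead removes the constrained directions by reparameterizing the basis and penalty.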

References

Kneib, T., Klein, N., Lang, S., & Umlauf, N. (2019). Modular regression—A Lego system for building structured additive distributional regression models with tensor product interactions. TEST, 28(1), 1–39. https://doi.org/10.1007/s11749-019-00631-z