TermBuilder.mrf()

TermBuilder.mrf(x, k=-1, scale='default', inference='default', polys=None, nb=None, penalty=None, absorb_cons=True, diagonal_penalty=True, scale_penalty=True, factor_scale=False, prefix='', name=None)

Gaussian Markov random field.

The preferred way to initialize these is by supplying polys, because this enables plotting via plot_regions().

Parameters:
  • x (str) – Name of the region variable.

  • k (int, default: -1) – If -1, this is a “full-rank” (up to identifiability constraint) Markov random field. If k is an integer smaller than the number of unique regions, a low-rank field will be returned, see Wood (2017), Sections 5.8.1 and 5.4.2.

  • scale (ScaleIG | Var | float | VarIGPrior | Literal['default'], default: 'default') –

    Scale parameter passed to the coefficient prior, StrctTerm.scale.

    • If "default", the scale will be initialized according to the default scale function defined for this TermBuilder instance. Please refer to the TermBuilder documentation for more information.

    • If you pass a float, this will be taken as the constant value of the scale, and the scale will not be estimated as part of the model without further action.

    • If you pass a liesel.model.Var, this will be used as the scale. Make sure to define the inference attribute of your custom scale variable (or a latent, transformed version of it).

    • If you pass a VarIGPrior, a scale variable will be set up for you using ScaleIG. This means that the scale will be \(\tau\), with an inverse Gamma prior on its square, i.e. \(\tau^2 \sim \operatorname{InverseGamma}(a, b)\), where a and b are taken from the VarIGPrior object. A fitting Gibbs kernel will be set up automatically to sample \(\tau^2\) in this case, see ScaleIG for details.

  • inference (Any | None | Literal['default'], default: 'default') – Inference specification for this term’s coefficient. Note that this inference is only used for the coefficient variables of the terms created by this builder (StrctTerm.coef), not for the scale variables (StrctTerm.scale). The default ("default") uses the TermBuilder’s default inference specification defined during initialization. Please refer to the TermBuilder documentation for more information.

  • polys (dict[str, Array | ndarray | bool | number | bool | int | float | complex] | None, default: None) – Dictionary of arrays. The keys of the dict are the region labels. The corresponding values define the region by defining polygons. The neighborhood structure can be inferred from this polygon information.

  • nb (Mapping[str, Array | ndarray | bool | number | bool | int | float | complex | list[str] | list[int]] | None, default: None) – Dictionary of arrays or lists. The keys of the dict are the region labels. The corresponding values indicate the neighbors of the region. If the values are lists or arrays of strings, they are the labels of the neighbors. If they are lists or arrays of integers, they are the indices of the neighbors. Indices correspond to regions based on an alphabetical ordering of the region labels.

  • penalty (Array | ndarray | bool | number | bool | int | float | complex | None, default: None) – If a penalty is supplied explicitly, it takes precedence over a potential penalty derived from both nb and polys.

  • penalty_labels – If a penalty is supplied explicitly, labels must also be specified. The labels create the association between penalty columns and region labels. The values of this sequence should be the string labels of unique regions in x.

  • absorb_cons (bool, default: True) – Whether the default identification constraint should be applied by reparameterization, absorbing the reparameterization matrix into the basis and penalty matrices for computational efficiency. If False, the basis is unconstrained; if True, it receives a sum-to-zero constraint. Also see Basis.constrain().

  • diagonal_penalty (bool, default: True) – Whether the penalty matrix associated with this term should be reparameterized into a diagonal matrix. In this case, the basis matrix is reparameterized accordingly. This can be beneficial for posterior geometry, which is why it is the default. Also see Basis.diagonalize_penalty().

  • scale_penalty (bool, default: True) – Whether the penalty matrix should be scaled such that its infinity norm is one. This can improve numerical stability, which is why it is done by default. Also see Basis.scale_penalty().

  • factor_scale (bool, default: False) – Whether to factor out the scale in the prior for this term, turning it into a partially (or fully) standardized form. See StrctTerm.factor_scale() for details.

  • prefix (str, default: '') – A string prefix to be added to the returned term’s name.

  • name (str | None, default: None) – Manually defined name of the term. If a prefix is specified, the prefix will be added to this name.
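
To make the nb convention and the effect of scale_penalty concrete, here is a minimal NumPy sketch. It is not the code that liesel_gam or mgcv runs. It assumes the usual Markov random field penalty (number of neighbors on the diagonal, -1 for each pair of neighboring regions), orders regions alphabetically as described for nb, and rescales the result so that its infinity norm is one. The helper name nb_to_penalty is hypothetical.

```python
import numpy as np


def nb_to_penalty(nb):
    """Hypothetical helper: build an MRF penalty from a dict of neighbor labels."""
    labels = sorted(nb)  # alphabetical ordering of the region labels
    index = {label: i for i, label in enumerate(labels)}
    K = np.zeros((len(labels), len(labels)))
    for region, neighbors in nb.items():
        i = index[region]
        for neighbor in neighbors:
            j = index[neighbor]
            K[i, j] = -1.0
            K[j, i] = -1.0  # keep the penalty symmetric
    # The diagonal is still zero here, so the negated row sums
    # equal the number of neighbors of each region.
    np.fill_diagonal(K, -K.sum(axis=1))
    return labels, K


nb = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
labels, K = nb_to_penalty(nb)

# Rescale so that the infinity norm (maximum absolute row sum) is one.
K_scaled = K / np.linalg.norm(K, ord=np.inf)
```

Every row of K sums to zero, so a constant shift across all regions is unpenalized. This is the reason an identification constraint such as the one controlled by absorb_cons is needed.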

See also

plot_regions

Plots MCMC results on a map of the regions.

plot_polys

Plots a map based on polygons.

plot_forest

Plots regions with uncertainty in a forest plot.

Notes

This method internally calls the R package mgcv to set up the basis and penalty. The mgcv documentation provides further details.

Return type:

MRFTerm

Returns:

Comments on the additional attributes available on the returned MRFTerm variable:

  • If either polys or nb is supplied, the returned term will contain information in MRFTerm.neighbors.

  • If only a penalty matrix is supplied, the returned term will not contain information in MRFTerm.neighbors.

  • MRFTerm.mapping contains the map of region labels to integer codes.

  • MRFTerm.labels contains the region labels.

  • Returning the label order only makes sense if the basis is not reparameterized, because only then is there a clear correspondence between parameters and labels. If the basis is reparameterized with absorb_cons=True, or is of low rank (k other than -1), there is no such clear correspondence, so MRFTerm.ordered_labels is None.
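
The loss of correspondence between coefficients and labels can be illustrated with a small NumPy sketch. It uses a common QR-based way of absorbing a sum-to-zero constraint into an indicator basis. Whether liesel_gam performs exactly these steps is an assumption, and all variable names are illustrative.

```python
import numpy as np

labels = np.array(["a", "b", "c"])
x = np.array(["a", "b", "c", "a", "b", "c"])

# Unconstrained indicator basis: one column per region,
# so column j corresponds to labels[j].
X = (x[:, None] == labels[None, :]).astype(float)

# Sum-to-zero constraint on the fitted values: 1^T X beta = 0.
C = X.sum(axis=0, keepdims=True)  # shape (1, 3)

# Columns 2..p of the complete Q factor span the null space of C.
Q, _ = np.linalg.qr(C.T, mode="complete")
Z = Q[:, 1:]

# Constrained basis: one column fewer, each column a mix of all regions.
X_constrained = X @ Z
```

After the reparameterization there are two coefficients for three regions, and each coefficient contributes to several regions at once, so no ordering of labels can describe them.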

Examples

>>> import liesel_gam as gam
>>> df = gam.demo_data(n=100)
>>> nb = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
>>> print(df.x_cat.unique().tolist())
['a', 'b', 'c']
>>> tb = gam.TermBuilder.from_df(df)
>>> tb.mrf("x_cat", nb=nb)
MRFTerm(name="mrf(x_cat)")

References