BasisBuilder.lin()

BasisBuilder.lin(formula, xname='', basis_name='X', include_intercept=False, context=None)

Linear design matrix without penalty.

Parameters:
  • formula (str) – Right-hand side of a model formula, as understood by formulaic. Most of formulaic’s grammar is supported. See notes for details.

  • xname (str, default: '') – If provided, the design matrix will be named {basis_name}({xname}), for example B(x) if basis_name="B" and xname="x".

  • basis_name (str, default: 'X') – Name of the basis variable.

  • include_intercept (bool, default: False) – Whether to include an intercept column in the basis.

  • context (dict[str, Any] | None, default: None) – Dictionary of additional Python objects that should be made available to formulaic when constructing the design matrix. Gets passed to formulaic.ModelSpec.get_model_matrix().
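
The naming rule described for xname and basis_name can be sketched in plain Python. This is only an illustration: design_matrix_name is a hypothetical helper, not part of liesel_gam, and the fallback to the bare basis_name when xname is empty is an assumption inferred from the LinBasis(name="X") output shown in the examples.

```python
def design_matrix_name(basis_name: str = "X", xname: str = "") -> str:
    """Hypothetical helper mirroring the documented naming rule.

    Assumption: when xname is empty, the bare basis_name is used.
    """
    # Documented pattern: {basis_name}({xname}) when xname is provided.
    return f"{basis_name}({xname})" if xname else basis_name


print(design_matrix_name("B", "x"))
print(design_matrix_name())
```

With basis_name="B" and xname="x" this gives B(x), which matches the example in the parameter description.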

Return type:

LinBasis

Notes

The following formulaic syntax is supported:

  • + for adding a term

  • a:b for simple interactions

  • a*b for expanding to a + b + a:b

  • (a + b)**n for n-th order interactions

  • a / b for nesting

  • C(a, ...) for categorical effects

  • b %in% a for inverted nesting

  • {a+1} for quoted Python code to be executed

  • `weird name` backtick-strings for weird names

  • Other transformations such as center(a), scale(a), or lag(a); see the formulaic grammar documentation.

  • Python functions
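
To make the expansion operators listed above concrete, the following pure-Python sketch enumerates the terms that a*b, (a + b)**n, and a / b stand for. It does not call formulaic and is not how formulaic parses formulas. The function names are hypothetical, and the readings of **n as "all interactions up to order n" and of a / b as "a + a:b" are assumptions based on the usual Wilkinson conventions.

```python
from itertools import combinations


def expand_star(a: str, b: str) -> list[str]:
    """a*b expands to a + b + a:b."""
    return [a, b, f"{a}:{b}"]


def expand_power(names: list[str], n: int) -> list[str]:
    """(a + b + ...)**n: main effects plus interactions of up to n distinct terms."""
    terms = []
    for order in range(1, n + 1):
        for combo in combinations(names, order):
            terms.append(":".join(combo))
    return terms


def expand_nest(a: str, b: str) -> list[str]:
    """a / b: b nested within a, assumed here to mean a + a:b."""
    return [a, f"{a}:{b}"]


print(" + ".join(expand_star("x_lin", "x_cat")))
print(" + ".join(expand_power(["a", "b", "c"], 2)))
print(" + ".join(expand_nest("a", "b")))
```

Under these assumptions, "x_lin * x_cat" is shorthand for "x_lin + x_cat + x_lin:x_cat".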

Not supported:

  • String literals

  • Numeric literals

  • Wildcard "."

  • | for splitting a formula

  • "~" in formula, since this method supports only the right-hand side of a Wilkinson formula.

  • 1 +, 0 +, or -1 in formula, since intercept addition is handled via the argument include_intercept.

Examples

Simple example:

>>> import liesel_gam as gam
>>> df = gam.demo_data(n=100)
>>> registry = gam.PandasRegistry(df)
>>> bb = gam.BasisBuilder(registry)
>>> bb.lin("x_lin + x_nonlin + x_cat")
LinBasis(name="X")

Customized categorical encoding:

>>> import liesel_gam as gam
>>> df = gam.demo_data(n=100)
>>> registry = gam.PandasRegistry(df)
>>> bb = gam.BasisBuilder(registry)
>>> bb.lin("x_lin + x_nonlin + C(x_cat, contr.sum)")
LinBasis(name="X")

Interaction:

>>> import liesel_gam as gam
>>> df = gam.demo_data(n=100)
>>> registry = gam.PandasRegistry(df)
>>> bb = gam.BasisBuilder(registry)
>>> bb.lin("x_lin * x_cat")
LinBasis(name="X")