API Reference

This reference provides detailed documentation for user functions in the current release of kulprit.

kulprit

Kulprit.

Kullback-Leibler projections for Bayesian model selection.

class kulprit.ProjectionPredictive(model: Model, idata: Optional[InferenceData] = None)

Projection Predictive class from which we perform the model selection procedure.
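A minimal end-to-end sketch of how this class might be used, assuming that Bambi, pandas, and kulprit are installed and that the reference model is a Bambi model fitted with the log-likelihood stored. The helpers `make_toy_data` and `run_workflow`, the formula, and the synthetic data are hypothetical and serve only as illustration; the calls to `search`, `plot_compare`, and `project` follow the signatures documented on this page.

```python
import random


def make_toy_data(n=100, seed=0):
    """Hypothetical synthetic data: y depends on x1 only; x2 and x3 are noise."""
    rng = random.Random(seed)
    data = {"x1": [], "x2": [], "x3": [], "y": []}
    for _ in range(n):
        x1, x2, x3 = (rng.gauss(0, 1) for _ in range(3))
        data["x1"].append(x1)
        data["x2"].append(x2)
        data["x3"].append(x3)
        data["y"].append(2.0 * x1 + rng.gauss(0, 0.5))
    return data


def run_workflow(data):
    """Fit a reference model, search for submodels, and project onto one of them."""
    # Imported lazily so the sketch can be loaded without these libraries installed.
    import bambi as bmb
    import kulprit as kpt
    import pandas as pd

    df = pd.DataFrame(data)
    model = bmb.Model("y ~ x1 + x2 + x3", df)
    # Store pointwise log-likelihood values; assumed to be needed for the ELPD comparison.
    idata = model.fit(idata_kwargs={"log_likelihood": True})

    ppi = kpt.ProjectionPredictive(model, idata)
    ppi.search()                             # forward search by default
    cmp, axes = ppi.plot_compare(plot=True)  # comparison table and plot axes
    submodel = ppi.project(terms=["x1"])     # project onto a chosen subset
    return cmp, submodel


toy = make_toy_data(n=50, seed=1)
# With the libraries installed: cmp, submodel = run_workflow(toy)
```

Sampling can take a while, so `run_workflow` is left uncalled here.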

plot_compare(plot: Optional[bool] = False, legend: Optional[bool] = True, title: Optional[bool] = True, figsize: Optional[tuple] = None, plot_kwargs: Optional[dict] = None) → Tuple[DataFrame, Axes]

Compare the ELPD of the projected models along the search path.

Parameters:

plot : bool

Plot the results of the comparison. Defaults to False.

legend : bool

Add legend to figure. Defaults to True.

title : bool

Show a title with a description of how to interpret the plot. Defaults to True.

figsize : tuple

Figure size. If None, the size is (10, number of submodels) inches.

plot_kwargs : dict

Optional arguments for plot elements. Currently accepts ‘color_elpd’, ‘marker_elpd’, ‘marker_fc_elpd’, ‘color_dse’, ‘marker_dse’, ‘ls_reference’, ‘color_ls_reference’.
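As an illustration of the keys listed above, the following sketch builds a `plot_kwargs` dictionary and checks it against the documented names before it is passed on. The helper `check_plot_kwargs` is hypothetical and not part of kulprit, and the Matplotlib-style values are an assumption, since the accepted value types are not stated here.

```python
# Keys listed in the plot_kwargs description of plot_compare.
ACCEPTED_KEYS = {
    "color_elpd",
    "marker_elpd",
    "marker_fc_elpd",
    "color_dse",
    "marker_dse",
    "ls_reference",
    "color_ls_reference",
}


def check_plot_kwargs(plot_kwargs):
    """Hypothetical helper: return a copy of the kwargs, or raise on unknown keys."""
    unknown = sorted(set(plot_kwargs) - ACCEPTED_KEYS)
    if unknown:
        raise ValueError(f"Unknown plot_kwargs keys: {unknown}")
    return dict(plot_kwargs)


# Assumed Matplotlib-style values, chosen only for illustration.
style = check_plot_kwargs(
    {
        "color_elpd": "k",
        "marker_elpd": "o",
        "ls_reference": "--",
        "color_ls_reference": "grey",
    }
)
# With kulprit installed: ppi.plot_compare(plot=True, plot_kwargs=style)
```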

Returns:

cmp : DataFrame

Comparison table, ordered from the largest to the smallest model. The columns are:

  • rank: The rank-order of the models. 0 is the best.

  • elpd: ELPD estimated using PSIS-LOO-CV. Higher ELPD indicates higher out-of-sample predictive fit (“better” model).

  • pIC: Estimated effective number of parameters.

  • elpd_diff: The difference in ELPD between two models. The difference is computed relative to the reference model.

  • weight: Relative weight for each model. This can be loosely interpreted as the probability of each model (among the compared models) given the data.

  • SE: Standard error of the ELPD estimate.

  • dSE: Standard error of the difference in ELPD between each model and the reference model. It’s always 0 for the reference model.

  • warning: A value of 1 indicates that the computation of the ELPD may not be reliable. This could be an indication that PSIS-LOO-CV is starting to fail; see http://arxiv.org/abs/1507.04544 for details.

  • scale: Scale used for the ELPD. This is always the log scale.

axes : matplotlib_axes or bokeh_figure

plot_densities(var_names: Optional[List[str]] = None, submodels: Optional[List[int]] = None, include_reference: bool = True, labels: Literal['formula', 'size'] = 'formula', kind: Literal['density', 'forest'] = 'density', figsize: Optional[Tuple[int, int]] = None, plot_kwargs: Optional[dict] = None) → Axes

Compare the projected posterior densities of the submodels.

Parameters:

var_names : list of str, optional

List of variables to plot.

submodels : list of int, optional

List of submodels to plot. 0 is the intercept-only model, and the largest valid integer is the total number of variables in the reference model. If None, all submodels are plotted.

include_reference : bool

Whether to include the reference model in the plot. Defaults to True.

labels : str

If “formula”, the labels are the formulas of the submodels. If “size”, the labels are the number of covariates in the submodels.

figsize : tuple

Figure size. If None it will be defined automatically.

plot_kwargs : dict

Dictionary passed to ArviZ’s plot_density function (if kind is “density”) or to plot_forest (if kind is “forest”).

Returns:

axes : matplotlib_axes or bokeh_figure

project(terms: Union[List[str], Tuple[str], int]) → SubModel

Project the reference model onto a variable subset.

Parameters:

terms : Union[List[str], Tuple[str], int]

Collection of strings containing the names of the parameters to include in the submodel, or the number of parameters to include in the submodel, not including the intercept term.

Returns:

kulprit.data.SubModel: Projected submodel object.

search(max_terms: Optional[int] = None, method: Literal['forward', 'l1'] = 'forward') → Optional[dict]

Model search method through parameter space.

If max_terms is not provided, then the search path runs from the intercept-only model up to but not including the full model.

Parameters:

max_terms : int

The number of parameters of the largest submodel in the search path, not including the intercept term.

method : str

The search method to employ, either “forward” to employ a forward search heuristic through the space, or “l1” to use the L1-regularized search path.

Returns:

dict: The model selection procedure search path, containing the submodels along the search path, keyed by their model size.
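The idea behind the “forward” option can be illustrated with a generic greedy forward search that returns a path keyed by model size. This is a sketch under stated assumptions, not kulprit's implementation: the function `forward_search`, the `score` callable (a loss where lower is better), and the toy weights are all hypothetical. The default for `max_terms` mirrors the documented behaviour of stopping before the full model.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def forward_search(
    terms: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
    max_terms: Optional[int] = None,
) -> Dict[int, Tuple[str, ...]]:
    """Greedy forward search; `score` is a hypothetical loss (lower is better)."""
    if max_terms is None:
        # Run from the intercept-only model up to, but not including, the full model.
        max_terms = len(terms) - 1
    max_terms = min(max_terms, len(terms))

    selected: List[str] = []
    remaining = list(terms)
    path: Dict[int, Tuple[str, ...]] = {0: ()}  # size 0: intercept-only model
    for size in range(1, max_terms + 1):
        # Add the single term that gives the lowest loss at this step.
        best = min(remaining, key=lambda t: score(tuple(selected + [t])))
        selected.append(best)
        remaining.remove(best)
        path[size] = tuple(selected)
    return path


# Toy loss: the total (made-up) importance of the terms left out of the submodel.
weights = {"x1": 0.1, "x2": 2.0, "x3": 0.7}


def toy_loss(subset: Tuple[str, ...]) -> float:
    return sum(w for name, w in weights.items() if name not in subset)


path = forward_search(list(weights), toy_loss)
# path == {0: (), 1: ("x2",), 2: ("x2", "x3")}
```

Each key is the number of terms in the submodel, not counting the intercept, which matches the way the returned search path is described above.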

kulprit.plots

Top-level plotting module.

kulprit.plots.plot_compare(cmp_df, legend=True, title=True, figsize=None, plot_kwargs=None)

Plot model comparison.

Parameters:

cmp_df : pd.DataFrame

Dataframe containing the comparison data. Should have columns elpd_loo and elpd_diff containing the ELPD values and the differences to the reference model.

legend : bool

Flag for plotting the legend, default True.

title : bool

Flag for plotting the title, default True.

figsize : tuple

Figure size. If None it will be defined automatically.

plot_kwargs : dict