API Reference
This reference provides detailed documentation for user functions in the current release of kulprit.
kulprit
Kulprit.
Kullback-Leibler projections for Bayesian model selection.
- class kulprit.ProjectionPredictive(model: Model, idata: Optional[InferenceData] = None)
Projection predictive class used to perform the model selection procedure.
- plot_compare(plot: Optional[bool] = False, legend: Optional[bool] = True, title: Optional[bool] = True, figsize: Optional[tuple] = None, plot_kwargs: Optional[dict] = None) -> Tuple[DataFrame, Axes]
Compare the ELPD of the projected models along the search path.
Parameters:
- plot : bool
Plot the results of the comparison. Defaults to False.
- legend : bool
Add a legend to the figure. Defaults to True.
- title : bool
Show a title with a description of how to interpret the plot. Defaults to True.
- figsize : tuple
Figure size. If None, the size is (10, number of submodels) inches.
- plot_kwargs : dict
Optional arguments for plot elements. Currently accepts ‘color_elpd’, ‘marker_elpd’, ‘marker_fc_elpd’, ‘color_dse’, ‘marker_dse’, ‘ls_reference’, ‘color_ls_reference’.
Returns:
- cmp : DataFrame
Ordered from the largest to the smallest model. The columns are:
- rank: The rank order of the models. 0 is the best.
- elpd: ELPD estimated using PSIS-LOO-CV. Higher ELPD indicates higher out-of-sample predictive fit (“better” model).
- pIC: Estimated effective number of parameters.
- elpd_diff: The difference in ELPD between two models. The difference is computed relative to the reference model.
- weight: Relative weight for each model. This can be loosely interpreted as the probability of each model (among the compared models) given the data.
- SE: Standard error of the ELPD estimate.
- dSE: Standard error of the difference in ELPD between each model and the top-ranked model. It is always 0 for the reference model.
- warning: A value of 1 indicates that the computation of the ELPD may not be reliable. This could be an indication that PSIS-LOO-CV is starting to fail; see http://arxiv.org/abs/1507.04544 for details.
- scale: Scale used for the ELPD. This is always the log scale.
- axes : matplotlib_axes or bokeh_figure
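A common heuristic for reading the comparison table is to pick the smallest submodel whose ELPD difference to the reference model is within roughly one standard error of that difference. The following is a minimal sketch of that rule in plain Python and is not part of kulprit. The rows, the numbers, the `model` and `terms` keys, and the `smallest_adequate` helper are hypothetical; only the `elpd`, `elpd_diff`, and `dSE` names follow the columns documented above.

```python
# Hypothetical comparison rows; the values are made up for illustration.
rows = [
    {"model": "reference", "terms": 4, "elpd": -120.0, "elpd_diff": 0.0, "dSE": 0.0},
    {"model": "size 3", "terms": 3, "elpd": -120.6, "elpd_diff": 0.6, "dSE": 0.9},
    {"model": "size 2", "terms": 2, "elpd": -121.1, "elpd_diff": 1.1, "dSE": 1.4},
    {"model": "size 1", "terms": 1, "elpd": -127.5, "elpd_diff": 7.5, "dSE": 3.0},
    {"model": "size 0", "terms": 0, "elpd": -140.2, "elpd_diff": 20.2, "dSE": 5.1},
]


def smallest_adequate(rows, n_se=1.0):
    """Return the smallest model whose ELPD gap is within n_se * dSE."""
    # abs() keeps the rule independent of the sign convention of elpd_diff.
    adequate = [r for r in rows if abs(r["elpd_diff"]) <= n_se * r["dSE"]]
    return min(adequate, key=lambda r: r["terms"])


chosen = smallest_adequate(rows)
print(chosen["model"], chosen["terms"])
```

The threshold `n_se` is a judgment call; a stricter value keeps more terms, and rows flagged in the warning column deserve a closer look before the rule is trusted.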
- plot_densities(var_names: Optional[List[str]] = None, submodels: Optional[List[int]] = None, include_reference: bool = True, labels: Literal['formula', 'size'] = 'formula', kind: Literal['density', 'forest'] = 'density', figsize: Optional[Tuple[int, int]] = None, plot_kwargs: Optional[dict] = None) -> Axes
Compare the projected posterior densities of the submodels.
Parameters:
- var_names : list of str, optional
List of variables to plot.
- submodels : list of int, optional
List of submodels to plot. 0 is the intercept-only model, and the largest valid integer is the total number of variables in the reference model. If None, all submodels are plotted.
- include_reference : bool
Whether to include the reference model in the plot. Defaults to True.
- labels : str
If “formula”, the labels are the formulas of the submodels. If “size”, the labels are the number of covariates in the submodels.
- figsize : tuple
Figure size. If None, it is defined automatically.
- plot_kwargs : dict
Dictionary passed to ArviZ’s plot_density function (if kind is “density”) or to plot_forest (if kind is “forest”).
Returns:
axes : matplotlib_axes or bokeh_figure
- project(terms: Union[List[str], Tuple[str], int]) -> SubModel
Project the reference model onto a variable subset.
Parameters:
- terms : Union[List[str], Tuple[str], int]
Collection of strings containing the names of the parameters to include in the submodel, or the number of parameters to include in the submodel, not including the intercept term.
Returns:
kulprit.data.SubModel: Projected submodel object
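The two accepted forms of `terms` can be illustrated with a short usage sketch. It assumes that Bambi and kulprit are installed, that the reference model is a Bambi model fitted with pointwise log-likelihood stored, and that `search` is run before projecting by size; none of this is confirmed by this page beyond the signatures shown. The `run_projection` helper, the formula, and the column names `y`, `x1`, `x2`, `x3` are hypothetical.

```python
def run_projection(data, formula="y ~ x1 + x2 + x3"):
    """Fit a reference model, then project it by term names and by size.

    Imports are local so that defining this helper does not require
    bambi or kulprit to be installed.
    """
    import bambi as bmb
    import kulprit as kpt

    # Reference model: assumed to be a Bambi model with log-likelihood stored.
    model = bmb.Model(formula, data)
    idata = model.fit(idata_kwargs={"log_likelihood": True})

    ppi = kpt.ProjectionPredictive(model, idata)
    ppi.search()  # assumption: a search path exists before projecting by size

    by_name = ppi.project(terms=["x1", "x2"])  # collection of term names
    by_size = ppi.project(terms=2)  # number of terms, intercept excluded
    return by_name, by_size
```

Passing names fixes exactly which terms enter the submodel, while passing an integer only fixes its size.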
- search(max_terms: Optional[int] = None, method: Literal['forward', 'l1'] = 'forward') -> Optional[dict]
Model search method through parameter space.
If max_terms is not provided, then the search path runs from the intercept-only model up to but not including the full model.
Parameters:
- max_terms : int
The number of parameters of the largest submodel in the search path, not including the intercept term.
- method : str
The search method to employ, either “forward” to employ a forward search heuristic through the space, or “l1” to use the L1-regularized search path.
Returns:
- dict: The model selection procedure search path, containing the submodels along the search path, keyed by their model size.
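To make the idea of a forward search path keyed by model size concrete, here is a generic greedy forward selection in plain Python. It is a sketch of the general heuristic, not kulprit's implementation: the `forward_search` function, the `toy_loss` scoring function, and the term names are hypothetical, and a real projection predictive search would score each candidate by how well its projection matches the reference model.

```python
def forward_search(candidates, loss, max_terms=None):
    """Greedy forward search; returns {model size: tuple of selected terms}."""
    if max_terms is None:
        # Mirror the documented default: stop before reaching the full model.
        max_terms = len(candidates) - 1
    selected = []
    remaining = list(candidates)
    path = {0: ()}  # size 0 is the intercept-only model
    while remaining and len(selected) < max_terms:
        # Add the term that gives the lowest loss when joined to the current set.
        best = min(remaining, key=lambda term: loss(selected + [term]))
        selected.append(best)
        remaining.remove(best)
        path[len(selected)] = tuple(selected)
    return path


# Hypothetical loss: each term removes a fixed amount of discrepancy.
weights = {"x1": 3.0, "x2": 1.0, "x3": 2.0}


def toy_loss(terms):
    return 6.0 - sum(weights[t] for t in terms)


path = forward_search(list(weights), toy_loss)
print(path)
```

Because each step only extends the previous selection, the path contains one submodel per size, which is why a dictionary keyed by size is a natural return type.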
kulprit.plots
Top-level plotting module.
- kulprit.plots.plot_compare(cmp_df, legend=True, title=True, figsize=None, plot_kwargs=None)
Plot model comparison.
Parameters:
- cmp_df : pd.DataFrame
Dataframe containing the comparison data. Should have columns elpd_loo and elpd_diff containing the ELPD values and the differences to the reference model.
- legend : bool
Flag for plotting the legend. Defaults to True.
- title : bool
Flag for plotting the title. Defaults to True.
- figsize : tuple
Figure size. If None, it is defined automatically.
plot_kwargs : dict