Regression quickstart

Run interval regression and probabilistic regression from one calibrated explainer.

Prerequisites

pip install calibrated-explanations scikit-learn

Interval regression semantics note

Calibration prerequisites: fit on x_proper, y_proper and calibrate on held-out x_cal, y_cal.
Mode-specific guarantees: percentile intervals use Conformal Predictive Systems (CPS) to produce the requested lower and upper percentile bounds.
Assumptions: calibration and deployment data are exchangeable or distribution-matched.
Explicit non-guarantees: no guarantee under drift, and no guarantee of a fixed interval width across subpopulations.
Explanation-envelope limits: feature-level intervals summarize model behavior under perturbation.
Formal semantics: Calibrated interval semantics.
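
To make the percentile guarantee above more concrete, the sketch below shows one simplified way a split conformal predictive system can turn calibration residuals into lower and upper percentile bounds. It is an illustration under stated assumptions, not the library's implementation: the helper name `cps_percentile_bounds`, the use of unnormalized residuals, and the conservative index rounding are all assumptions made here.

```python
import math


def cps_percentile_bounds(y_hat, cal_residuals, low=5, high=95):
    """Return (lower, upper) bounds for a single point prediction.

    Hypothetical helper: shifts the sorted calibration residuals
    (y_cal minus the model's prediction on x_cal) by the point
    prediction and reads off the requested percentiles.
    """
    scores = sorted(cal_residuals)
    n = len(scores)
    # Round the lower index down and the upper index up so rounding
    # never narrows the band, then clip to the valid index range.
    low_idx = min(max(math.floor(low * (n + 1) / 100) - 1, 0), n - 1)
    high_idx = min(max(math.ceil(high * (n + 1) / 100) - 1, 0), n - 1)
    return y_hat + scores[low_idx], y_hat + scores[high_idx]


# Toy residuals standing in for y_cal - model.predict(x_cal).
residuals = list(range(-9, 10))
lower, upper = cps_percentile_bounds(100.0, residuals)
print(lower, upper)
```

Because the bounds are read from held-out residuals, they are only as trustworthy as the exchangeability assumption listed above.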

Probabilistic or thresholded regression semantics note

Calibration prerequisites: fit on x_proper, y_proper and calibrate on held-out x_cal, y_cal.
Mode-specific guarantees: threshold queries combine CPS with Venn-Abers calibration to return calibrated event probabilities.
Assumptions: calibration and deployment data are exchangeable or distribution-matched.
Explicit non-guarantees: no guarantee under drift, and no causal guarantee from threshold probabilities.
Explanation-envelope limits: feature-level intervals summarize model behavior under perturbation.
Formal semantics: Calibrated interval semantics.
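
As a rough illustration of how a threshold query can be answered from calibration data, the sketch below evaluates a split conformal predictive distribution at a threshold and returns a lower and upper probability. Several things here are assumptions made for illustration: the helper name `cps_threshold_probability`, the choice of the event `y <= threshold`, and the use of unnormalized residuals. The Venn-Abers step mentioned above is deliberately left out, so this is not the library's method.

```python
def cps_threshold_probability(y_hat, cal_residuals, threshold):
    """Return (lower, upper) for P(y <= threshold) at one prediction.

    Hypothetical helper: the calibration residuals are shifted by the
    point prediction, and the conformal predictive distribution is
    evaluated at the threshold for both extremes of the tie-breaking
    variable.
    """
    shifted = [y_hat + r for r in cal_residuals]
    n = len(shifted)
    strictly_below = sum(v < threshold for v in shifted)
    at_or_below = sum(v <= threshold for v in shifted)
    return strictly_below / (n + 1), (at_or_below + 1) / (n + 1)


# Toy residuals standing in for y_cal - model.predict(x_cal).
residuals = list(range(-9, 10))
p_low, p_high = cps_threshold_probability(100.0, residuals, threshold=100.0)
print(p_low, p_high)
```

The gap between the two values shrinks as the calibration set grows, which is one reason a larger calibration split gives tighter probability statements.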

1. Prepare the dataset

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

dataset = load_diabetes()
x = dataset.data
y = dataset.target

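# Hold out 20% of the data for testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
# Split the remaining 80% into a proper training set (about 60% of the
# total) and a held-out calibration set (about 20% of the total).
x_proper, x_cal, y_proper, y_cal = train_test_split(
    x_train, y_train, test_size=0.25, random_state=0
)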
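
The two nested splits above can be checked with a little arithmetic. The sketch below assumes that `train_test_split` rounds a fractional `test_size` up to a whole number of samples and gives the remainder to the training side. The helper `split_sizes` is hypothetical and only mirrors that rounding. The diabetes dataset has 442 samples.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Hypothetical helper mirroring the assumed rounding: the test side
    is rounded up and the training side receives what is left.
    """
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_train, n_test = split_sizes(442, 0.2)        # first split: train vs. test
n_proper, n_cal = split_sizes(n_train, 0.25)   # second split: proper vs. calibration
print(n_proper, n_cal, n_test)
```

Under these assumptions the calibration set and the test set end up the same size, each roughly a fifth of the data.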

2. Fit and calibrate the explainer

from sklearn.ensemble import RandomForestRegressor
from calibrated_explanations import WrapCalibratedExplainer

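# Wrap the learner, fit it on the proper training set, then calibrate
# on the held-out calibration set.
explainer = WrapCalibratedExplainer(RandomForestRegressor(random_state=0))
explainer.fit(x_proper, y_proper)
explainer.calibrate(x_cal, y_cal, feature_names=dataset.feature_names)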

3. Interval and threshold predictions

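# Point predictions with calibrated lower and upper bounds for three test rows.
pred, (low, high) = explainer.predict(x_test[:3], uq_interval=True)
# Calibrated probability for a threshold query at 150, with its uncertainty interval.
probabilities, probability_interval = explainer.predict_proba(
    x_test[:1], threshold=150, uq_interval=True
)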
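
One way to sanity-check interval predictions like the ones above is to measure how often the true targets fall inside their intervals on held-out data. The sketch below is a minimal, library-independent check. The helper name `interval_coverage` and the toy numbers are assumptions for illustration, not part of the calibrated-explanations API.

```python
def interval_coverage(y_true, low, high):
    """Return (coverage, mean_width) for a batch of intervals.

    Hypothetical helper: coverage is the fraction of targets inside
    [low, high]; mean_width is the average interval width.
    """
    inside = [lo <= y <= hi for y, lo, hi in zip(y_true, low, high)]
    widths = [hi - lo for lo, hi in zip(low, high)]
    return sum(inside) / len(inside), sum(widths) / len(widths)


# Toy values standing in for y_test, low, and high.
coverage, mean_width = interval_coverage(
    y_true=[1.0, 5.0, 10.0],
    low=[0.0, 6.0, 8.0],
    high=[2.0, 9.0, 12.0],
)
print(coverage, mean_width)
```

With the quickstart's own variables, the same helper could be called as `interval_coverage(y_test[:3], low, high)`. Three rows are far too few to judge coverage, so a larger slice of the test set gives a more meaningful estimate.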

4. Explore alternatives

# Generate alternative explanations for two test rows against the same threshold.
alternatives = explainer.explore_alternatives(x_test[:2], threshold=150)

Next steps:

Entry-point tier: Tier 2.