Regression quickstart
Run interval regression and probabilistic regression from one calibrated explainer.
Prerequisites
pip install calibrated-explanations scikit-learn
Interval regression semantics note
Calibration prerequisites: fit on x_proper, y_proper and calibrate on held-out x_cal, y_cal.
Mode-specific guarantees: percentile intervals use a conformal predictive system (CPS) to produce the requested percentile bounds.
Assumptions: calibration and deployment data are exchangeable or distribution-matched.
Explicit non-guarantees: no guarantee under drift, and no guarantee of fixed interval width across subpopulations.
Explanation-envelope limits: feature-level intervals summarize model behavior under perturbation.
Formal semantics: Calibrated interval semantics.
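To make the percentile-interval idea concrete, here is a minimal sketch of how a split conformal predictive system could turn held-out calibration residuals into percentile bounds. It is an illustration under stated assumptions, not the library's implementation: the function name `cps_percentile_interval`, the signed-residual scoring, and the index clipping are all hypothetical choices made for this example.

```python
import math


def cps_percentile_interval(y_hat, cal_residuals, low_pct=5.0, high_pct=95.0):
    """Illustrative split-conformal percentile interval (not the library's code).

    cal_residuals are signed residuals y_cal - prediction computed on the
    held-out calibration set. The interval shifts their empirical
    distribution by the point prediction y_hat.
    """
    scores = sorted(cal_residuals)
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration residual")

    # 1-indexed order statistics on n + 1 positions: round the lower rank
    # down and the upper rank up so the interval errs on the wide side.
    k_low = math.floor(low_pct / 100.0 * (n + 1))
    k_high = math.ceil(high_pct / 100.0 * (n + 1))

    # Clip to valid ranks. When clipping triggers, the calibration set is
    # too small for the requested percentiles and coverage is not assured.
    k_low = min(max(k_low, 1), n)
    k_high = min(max(k_high, 1), n)

    return y_hat + scores[k_low - 1], y_hat + scores[k_high - 1]


# Toy calibration residuals: the integers -50 .. 49.
toy_residuals = list(range(-50, 50))
low, high = cps_percentile_interval(100, toy_residuals)
print(low, high)
```

The sketch also shows why the exchangeability assumption matters: the bounds are read directly off the calibration residuals, so they are only meaningful if deployment residuals follow the same distribution.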
Probabilistic or thresholded regression semantics note
Calibration prerequisites: fit on x_proper, y_proper and calibrate on held-out x_cal, y_cal.
Mode-specific guarantees: threshold queries use CPS with Venn-Abers for calibrated event probabilities.
Assumptions: calibration and deployment data are exchangeable or distribution-matched.
Explicit non-guarantees: no guarantee under drift, and no causal guarantee from threshold probabilities.
Explanation-envelope limits: feature-level intervals summarize model behavior under perturbation.
Formal semantics: Calibrated interval semantics.
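As a rough illustration of a threshold query, the sketch below estimates a probability band for the event y <= threshold from calibration residuals alone. It covers only the CPS part of the idea and omits the Venn-Abers step entirely; the function name `cps_threshold_probability`, the direction of the event, and the (count, count + 1) / (n + 1) band are assumptions of this example rather than the library's method.

```python
from bisect import bisect_right


def cps_threshold_probability(y_hat, cal_residuals, threshold):
    """Illustrative probability band for the event y <= threshold.

    cal_residuals are signed residuals y_cal - prediction from the
    held-out calibration set. This is a sketch of the CPS idea only; it
    does not apply Venn-Abers calibration.
    """
    scores = sorted(cal_residuals)
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration residual")

    # Count calibration residuals r with y_hat + r <= threshold.
    count = bisect_right(scores, threshold - y_hat)

    # The unseen test point can fall on either side of the threshold,
    # which gives a band of width 1 / (n + 1).
    return count / (n + 1), (count + 1) / (n + 1)


toy_residuals = list(range(-50, 50))
lower, upper = cps_threshold_probability(140, toy_residuals, threshold=150)
print(lower, upper)
```

The band narrows as the calibration set grows, which is one reason to keep a reasonably sized held-out x_cal, y_cal.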
1. Prepare the dataset
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

dataset = load_diabetes()
x = dataset.data
y = dataset.target

# Hold out 20% of the data for testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
# Split the remainder into a proper training set and a held-out calibration set.
x_proper, x_cal, y_proper, y_cal = train_test_split(
    x_train, y_train, test_size=0.25, random_state=0
)
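The two-stage split above leaves roughly 60% of the data for fitting, 20% for calibration, and 20% for testing. The arithmetic can be sketched without scikit-learn, assuming the test share is rounded up to a whole number of rows (my understanding of `train_test_split` with a float `test_size`) and that the diabetes dataset has 442 rows. The helper `split_sizes` is hypothetical and exists only for this example.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test), assuming the test share is rounded up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


# First split: train versus test.
n_train, n_test = split_sizes(442, 0.2)
# Second split: proper training set versus calibration set.
n_proper, n_cal = split_sizes(n_train, 0.25)

print(n_proper, n_cal, n_test)
```

Checking `x_proper.shape`, `x_cal.shape`, and `x_test.shape` after running the real split is a quick way to confirm these sizes.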
2. Fit and calibrate the explainer
from sklearn.ensemble import RandomForestRegressor
from calibrated_explanations import WrapCalibratedExplainer
explainer = WrapCalibratedExplainer(RandomForestRegressor(random_state=0))
explainer.fit(x_proper, y_proper)
explainer.calibrate(x_cal, y_cal, feature_names=dataset.feature_names)
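3. Interval and threshold predictions
# Point predictions with calibrated lower and upper interval bounds.
pred, (low, high) = explainer.predict(x_test[:3], uq_interval=True)
# Calibrated probability for the threshold query at 150, with its uncertainty interval.
probabilities, probability_interval = explainer.predict_proba(
    x_test[:1], threshold=150, uq_interval=True
)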
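One way to sanity-check the returned intervals is to measure how often the true targets fall inside them on held-out data. The helper below is a small sketch and not part of the library; `empirical_coverage` is a hypothetical name, and it accepts any equal-length sequences, so lists or NumPy arrays should both work.

```python
def empirical_coverage(y_true, low, high):
    """Fraction of targets that lie inside their [low, high] interval."""
    triples = list(zip(y_true, low, high))
    if not triples:
        raise ValueError("need at least one prediction to measure coverage")
    hits = sum(1 for y, lo, hi in triples if lo <= y <= hi)
    return hits / len(triples)


# Toy values: the first and last targets are covered, the middle two are not.
coverage = empirical_coverage(
    y_true=[1.0, 2.0, 3.0, 4.0],
    low=[0.0, 0.0, 4.0, 3.0],
    high=[2.0, 1.0, 5.0, 5.0],
)
print(coverage)
```

With the variables from this step, a call such as `empirical_coverage(y_test[:3], low, high)` would apply, although three rows are far too few for a stable estimate. Predicting on all of `x_test` and comparing the result with the nominal level of the requested percentiles gives a more useful check.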
4. Explore alternatives
alternatives = explainer.explore_alternatives(x_test[:2], threshold=150)
Next steps:
Entry-point tier: Tier 2.