Probabilistic (Thresholded) Regression

Probabilistic regression allows you to ask probability questions about a real-valued target (e.g., “What is the probability the house price <= 5 million?”).

This is technically implemented as Thresholded Regression: you supply a threshold, and CE returns the calibrated probability that the outcome satisfies that threshold condition.

Probabilistic or thresholded regression semantics note

  • Calibration prerequisites: fit on x_proper, y_proper and calibrate on held-out x_cal, y_cal.

  • Mode-specific guarantees: threshold queries use CPS with Venn-Abers calibrated event probabilities.

  • Assumptions: calibration and deployment data are exchangeable or distribution-matched.

  • Explicit non-guarantees: no guarantee under drift or causal actionability from threshold probabilities.

  • Explanation-envelope limits: feature-level probability shifts summarize model behavior under perturbation.

  • Formal semantics: Calibrated interval semantics.

Supported signatures

Method

Description

predict_proba(x, threshold=t)

Threshold probability P(y <= t)

predict_proba(x, threshold=(low, high))

Interval event probability P(low < y <= high)

predict_proba(x, uq_interval=True, ...)

Probability + uncertainty interval

explain_factual(x, threshold=...)

Explains why the specific threshold condition is met

Note: Returns P(y <= t) for scalar thresholds and P(low < y <= high) for interval thresholds.

Examples

1. Threshold probability (Scalar threshold)

# Probability that y is at or below 150
probs, (low_p, high_p) = explainer.predict_proba(
    x_test,
    threshold=150,
    uq_interval=True
)
print(f"P(y <= 150): {probs[0]}  Confidence: [{low_p[0]}, {high_p[0]}]")

2. Interval event probability (Range threshold)

Calculate the probability that the true value lies inside a specific user-defined range.

# Probability that y is between 100 and 200
probs, (low_p, high_p) = explainer.predict_proba(
    x_test,
    threshold=(100, 200),
    uq_interval=True
)
print(f"P(100 < y <= 200): {probs[0]}")

3. Explaining the probability

You can generate feature rules explaining exactly why the probability is high or low for your chosen threshold.

# Why is P(y <= 150) so high (or low)?
explanation = explainer.explain_factual(
    x_test,
    threshold=150,
)

Key parameters

  • threshold:

    • Scalar t: treated as a binary classification boundary.

    • Tuple (low, high): treated as an interval containment query.

  • uq_interval: Returns the uncertainty bound on the probability estimate itself (aleatoric + epistemic uncertainty on the score).

Entry-point tier: Tier 2.