calibrated_explanations.utils.helper

Helper utilities for filesystem, typing, and data transformations.

Centralizes small routines for safe imports, conversions, and metric calculations shared across calibrated explanations.

calibrated_explanations.utils.helper.make_directory(path: str, save_ext=None, add_plots_folder=True) → None

Create directory if it does not exist.

Parameters:

path (str) – The path to the directory to create
save_ext (str or list, optional) – The extension of the file to save, by default None
add_plots_folder (bool, optional) – Whether to add a ‘plots’ folder to the path, by default True

calibrated_explanations.utils.helper.safe_isinstance(obj, class_path_str)

Acts as a safe version of isinstance that does not require explicitly importing packages which may not exist in the user's environment.

Checks if obj is an instance of type specified by class_path_str.

Parameters:

obj (Any) – Some object you want to test against
class_path_str (str or list) – A string or list of strings specifying full class paths, for example sklearn.ensemble.RandomForestRegressor

Returns:

True if isinstance is true and the package exists, False otherwise

Return type:

bool
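One way such a check can avoid importing optional packages is to look the module up in sys.modules instead of importing it. The sketch below illustrates that idea only; safe_isinstance_sketch is a hypothetical name, and the library's actual implementation may differ.

```python
import sys
from collections import OrderedDict


def safe_isinstance_sketch(obj, class_path_str):
    """Return True if obj is an instance of any class named by a dotted path."""
    # Accept a single path or a list of paths.
    if isinstance(class_path_str, str):
        class_paths = [class_path_str]
    else:
        class_paths = list(class_path_str)

    for class_path in class_paths:
        if "." not in class_path:
            raise ValueError("class_path_str must look like 'module.ClassName'")
        module_name, class_name = class_path.rsplit(".", 1)

        # Assumption of this sketch: a module that is not already loaded is
        # treated as a non-match, so no optional package is ever imported.
        module = sys.modules.get(module_name)
        if module is None:
            continue

        candidate = getattr(module, class_name, None)
        if isinstance(candidate, type) and isinstance(obj, candidate):
            return True
    return False


print(safe_isinstance_sketch(OrderedDict(), "collections.OrderedDict"))  # True
print(safe_isinstance_sketch(OrderedDict(), "no_such_pkg.Model"))        # False
```

Because the lookup only reads sys.modules, a class path that points at a package missing from the environment yields False rather than an ImportError.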

calibrated_explanations.utils.helper.safe_import(module_name, class_name=None)

Safely import a module. If the module is not installed, print a message and return None.
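The behavior described above can be sketched with importlib. This is a minimal illustration, not the library's code: safe_import_sketch is a hypothetical name, the message text is made up, and the handling of class_name (returning the named attribute of the module) is an assumption.

```python
import importlib


def safe_import_sketch(module_name, class_name=None):
    """Import module_name (and optionally one attribute); return None on failure."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Message wording is illustrative only.
        print(f"The module '{module_name}' is not installed.")
        return None
    if class_name is None:
        return module
    # Assumption: class_name names an attribute of the imported module.
    return getattr(module, class_name, None)


math_module = safe_import_sketch("math")
missing = safe_import_sketch("surely_not_an_installed_package")
print(math_module is not None, missing)  # True None
```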

calibrated_explanations.utils.helper.check_is_fitted(estimator, attributes=None, *, msg=None, all_or_any=all)

Perform is_fitted validation for estimator.

Checks if the estimator is fitted by verifying the presence of fitted attributes (ending with a trailing underscore) and otherwise raises a NotFittedError with the given message.

If an estimator does not set any attributes with a trailing underscore, it can define a __sklearn_is_fitted__ method returning a boolean to specify if the estimator is fitted or not.

Parameters:

estimator (estimator instance) – estimator instance for which the check is performed.
attributes (str, list or tuple of str, default=None) –
Attribute name(s) given as a string or a list/tuple of strings, e.g. ["coef_", "estimator_", ...] or "coef_".

If None, the estimator is considered fitted if there exists an attribute that ends with an underscore and does not start with a double underscore.
msg (str, default=None) –
The default error message is, “This %(name)s instance is not fitted yet. Call ‘fit’ with appropriate arguments before using this estimator.”

For custom messages, if “%(name)s” is present in the message string, it is replaced with the estimator name.

E.g.: “Estimator, %(name)s, must be fitted before sparsifying”.
all_or_any (callable, {all, any}, default=all) – Specify whether all or any of the given attributes must exist.

Return type:

None

Raises:

NotFittedError – If the attributes are not found.
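The checks described above can be sketched in plain Python. The code below is a self-contained illustration under stated assumptions: check_is_fitted_sketch, NotFittedErrorSketch, and MeanModel are hypothetical names defined here, and the order in which the three checks run is an assumption modeled on the description.

```python
class NotFittedErrorSketch(ValueError):
    """Stand-in for NotFittedError so the sketch is self-contained."""


def check_is_fitted_sketch(estimator, attributes=None, *, msg=None, all_or_any=all):
    if msg is None:
        msg = ("This %(name)s instance is not fitted yet. Call 'fit' with "
               "appropriate arguments before using this estimator.")

    if attributes is not None:
        # Explicit attribute names: all (or any) of them must exist.
        if isinstance(attributes, str):
            attributes = [attributes]
        fitted = all_or_any([hasattr(estimator, attr) for attr in attributes])
    elif hasattr(estimator, "__sklearn_is_fitted__"):
        # The estimator reports its own fitted state.
        fitted = estimator.__sklearn_is_fitted__()
    else:
        # Fallback: any instance attribute with a trailing underscore.
        fitted = any(
            name.endswith("_") and not name.startswith("__")
            for name in vars(estimator)
        )

    if not fitted:
        raise NotFittedErrorSketch(msg % {"name": type(estimator).__name__})


class MeanModel:
    """Toy estimator that stores its fitted state in `mean_`."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self


model = MeanModel()
try:
    check_is_fitted_sketch(model)
except NotFittedErrorSketch as err:
    print(err)  # the default message with the class name filled in

check_is_fitted_sketch(model.fit([1.0, 2.0, 3.0]), "mean_")  # passes silently
```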

calibrated_explanations.utils.helper.is_notebook()

Check if the code is running in a Jupyter notebook.
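A common way to perform this kind of check is to inspect the class of the active IPython shell. The sketch below shows that pattern; is_notebook_sketch is a hypothetical name, and whether the library uses the same test is an assumption.

```python
def is_notebook_sketch():
    """Best-effort check for a Jupyter kernel; returns False when unsure."""
    try:
        # get_ipython is only defined as a global inside IPython-based shells.
        shell_name = get_ipython().__class__.__name__  # noqa: F821
    except NameError:
        return False  # plain Python interpreter
    # Jupyter notebooks and JupyterLab use the ZMQ-based shell; the terminal
    # `ipython` shell reports TerminalInteractiveShell instead.
    return shell_name == "ZMQInteractiveShell"


print(is_notebook_sketch())
```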

calibrated_explanations.utils.helper.transform_to_numeric(df, target, mappings=None)

Transform the categorical features to numeric.

Parameters:

df (pd.DataFrame) – The dataframe to transform
target (str) – The target column name
mappings (dict, optional) – The mapping created by previous calls to this function, by default None

Returns:

pd.DataFrame – The transformed dataframe
Categorical features – A list of the indices of the categorical features, or None if there are none
Categorical labels – A dictionary with the categorical labels (value) for each categorical feature index (key), or None if there are no categorical features
Target labels – A dictionary with target label-index pairs
Mappings – A dictionary with the mapping of each categorical feature and the target

Examples

>>> import pandas as pd
>>> df = pd.DataFrame({'target': ['a','b']})
>>> transform_to_numeric(df,'target')
(   target
0       0
1       1, None, None, {0: 'a', 1: 'b'}, {'target': {'a': 0, 'b': 1}})

>>> df = pd.DataFrame({'numerical': [2,3], 'nominal': ['c','d'], 'target': ['a','b']})
>>> ndf, categorical_features, categorical_labels, target_labels, mappings = transform_to_numeric(df,'target')
>>> ndf
   numerical  nominal  target
0          2        0       0
1          3        1       1
>>> categorical_features
[1]
>>> categorical_labels
{1: {0: 'c', 1: 'd'}}
>>> target_labels
{0: 'a', 1: 'b'}
>>> mappings
{'nominal': {'c': 0, 'd': 1}, 'target': {'a': 0, 'b': 1}}

>>> ddf = pd.DataFrame({'numerical': [2,3], 'nominal': ['d','c'], 'target': ['b','a']})
>>> nddf, _, _, _, _ = transform_to_numeric(ddf,'target', mappings)
>>> nddf
   numerical  nominal  target
0          2        1       1
1          3        0       0

calibrated_explanations.utils.helper.assert_threshold(threshold, x)

Test if the thresholds are valid.

Parameters:

threshold (int, float, tuple, list, or np.ndarray) – The threshold(s) to be validated. It can be a scalar (int or float), a tuple with two values, or a list/np.ndarray of scalars or tuples.
x (list or np.ndarray) – The data against which the thresholds are validated. Used to check the length of list/np.ndarray thresholds.

Returns:

The validated threshold(s).

Return type:

int, float, tuple, or list

Raises:

AssertionError – If the length of a list/np.ndarray threshold does not equal the number of samples, or if a tuple threshold does not have exactly two values.
ValueError – If the threshold is not a scalar, binary tuple, or list of scalars or binary tuples.

Examples

>>> assert_threshold(0.5, [1, 2, 3])
0.5
>>> assert_threshold((0.2, 0.8), [1, 2, 3])
(0.2, 0.8)
>>> assert_threshold([0.1, 0.2, 0.3], [1, 2, 3])
[0.1, 0.2, 0.3]
>>> assert_threshold([(0.1, 0.9), (0.2, 0.8)], [1, 2])
[(0.1, 0.9), (0.2, 0.8)]
>>> assert_threshold(None, [1, 2, 3])
>>> assert_threshold([0.1, 0.2], [1])
Traceback (most recent call last):
    ...
AssertionError: list thresholds must have the same length as the number of samples

calibrated_explanations.utils.helper.calculate_metrics(uncertainty=None, prediction=None, w=0.5, metric=None, normalize=False)

Calculate different metrics based on the uncertainty and probability values.

Parameters:

uncertainty (float) – The uncertainty of the explanation. For classification, this is a value between 0 and 1, where 0 means the explanation is certain and 1 means it is uncertain. For regression, this is the width of the uncertainty interval determined by the user-defined percentiles.
prediction (float) – The prediction of the explanation. For classification, this is the probability of the predicted class. For regression, this is the predicted value.
w (float, default=0.5) – The weight of the uncertainty in the metric calculation. Must be between -1 and 1.
metric (str, list of str, or None, default=None) – The metric or metrics to calculate. If None, all available metrics are calculated. If a string or a list of strings, only the named metrics are calculated. The available metrics are:
- ‘ensured’: Weighted Sum Method
normalize (bool, default=False) – Whether to normalize the uncertainty and prediction values.

Notes

If the function is called with no arguments, it returns the list of available metrics.

calibrated_explanations.utils.helper.convert_targets_to_numeric(y)

Convert string/categorical targets to numeric values while preserving labels.

Parameters:

y (array-like) – The target values to convert

Returns:

array-like: Numeric version of the target values
dict or None: Mapping of original labels to numeric values if conversion was needed

Return type:

tuple
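The return contract above (numeric targets plus an optional label mapping) can be illustrated in plain Python. This is a sketch only: convert_targets_sketch is a hypothetical name, and assigning codes in sorted label order is an assumption rather than documented behavior.

```python
def convert_targets_sketch(y):
    """Map non-numeric labels to integer codes; pass numeric targets through."""
    values = list(y)
    if all(isinstance(v, (int, float)) for v in values):
        return values, None  # already numeric, so no mapping is needed

    # Assumption: codes follow the sorted order of the distinct labels.
    mapping = {label: code for code, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


numeric, mapping = convert_targets_sketch(["cat", "dog", "cat"])
print(numeric)  # [0, 1, 0]
print(mapping)  # {'cat': 0, 'dog': 1}
print(convert_targets_sketch([1.5, 2.0]))  # ([1.5, 2.0], None)
```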

calibrated_explanations.utils.helper.concatenate_thresholds(perturbed_threshold, threshold, indices)

Concatenates the given threshold values to the perturbed_threshold based on the provided indices.

Parameters:

perturbed_threshold (np.ndarray) – The existing perturbed thresholds.
threshold (list or np.ndarray) – The original thresholds.
indices (np.ndarray) – The indices to select from the threshold.

Returns:

The concatenated thresholds.

Return type:

np.ndarray
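For per-sample scalar thresholds, the operation can be pictured as selecting entries by index and appending them. The sketch below assumes exactly that case; concatenate_thresholds_sketch is a hypothetical name, and the library may treat scalar or tuple thresholds differently.

```python
import numpy as np


def concatenate_thresholds_sketch(perturbed_threshold, threshold, indices):
    """Append threshold[indices] to perturbed_threshold (scalar thresholds only)."""
    selected = np.asarray(threshold)[indices]
    return np.concatenate((perturbed_threshold, selected))


existing = np.array([0.5])
result = concatenate_thresholds_sketch(existing, [0.1, 0.2, 0.3], np.array([0, 2]))
print(result.tolist())  # [0.5, 0.1, 0.3]
```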

calibrated_explanations.utils.helper.immutable_array(array)

Convert a list or numpy array to an immutable (read-only) numpy array.

Parameters:

array (list or np.ndarray) – The list or numpy array to convert.

Returns:

The immutable numpy array.

Return type:

np.ndarray

Examples

>>> arr = immutable_array([1, 2, 3])
>>> arr.flags.writeable
False
>>> int(arr[0])
1
>>> arr[0] = 10
Traceback (most recent call last):
    ...
ValueError: assignment destination is read-only

calibrated_explanations.utils.helper.prepare_for_saving(filename)

Prepare the file path, name, title, and extension for saving a file.

Parameters:

filename (str) – The full path to the file to save.

Returns:

str: The path to the file.
str: The filename.
str: The title of the file.
str: The extension of the file.

Return type:

tuple
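The four returned parts can be illustrated with os.path. The sketch below is an assumption-laden illustration, not the library's code: prepare_for_saving_sketch is a hypothetical name, and details such as whether the path keeps a trailing separator or whether the extension includes the leading dot may differ in the real function.

```python
import os


def prepare_for_saving_sketch(filename):
    """Split a file path into (path, filename, title, ext)."""
    path, name = os.path.split(filename)   # directory part and file name
    title, ext = os.path.splitext(name)    # name without extension, extension
    return path, name, title, ext


print(prepare_for_saving_sketch("plots/run1/explanation.png"))
# ('plots/run1', 'explanation.png', 'explanation', '.png')
```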

calibrated_explanations.utils.helper.safe_mean(values, default=0.0)

Return the mean of values, but return default if values is empty.

This prevents numpy from emitting a “Mean of empty slice” RuntimeWarning and gives callers a deterministic fallback for empty inputs.
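The guard described above amounts to checking the size before averaging. A minimal sketch, assuming the input can be converted to a float array; safe_mean_sketch is a hypothetical name used for illustration.

```python
import numpy as np


def safe_mean_sketch(values, default=0.0):
    """Mean of values, or default when there is nothing to average."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        # Returning early avoids numpy's "Mean of empty slice" warning.
        return float(default)
    return float(arr.mean())


print(safe_mean_sketch([1.0, 2.0, 3.0]))   # 2.0
print(safe_mean_sketch([]))                # 0.0
print(safe_mean_sketch([], default=-1.0))  # -1.0
```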

calibrated_explanations.utils.helper.safe_first_element(values, default=0.0, col=None)

Return a sensible first element from values.

If values is scalar, return it as float.
If values is empty (size == 0), return default.
If col is None, return the first flattened element.
If col is given and values is 1D, return values[col] when available.
If col is given and values is 2D, return values[0, col] when available.

This protects callers that index [0] (or [0, 1]) on prediction outputs when fallback/edge cases may produce empty arrays.
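The rules listed above can be written out as a short sequence of checks. The sketch below follows them in order; safe_first_element_sketch is a hypothetical name, and falling back to default when col is out of range is an assumption about what "when available" means.

```python
import numpy as np


def safe_first_element_sketch(values, default=0.0, col=None):
    arr = np.asarray(values)
    if arr.ndim == 0:
        return float(arr)              # scalar input
    if arr.size == 0:
        return float(default)          # empty input
    if col is None:
        return float(arr.ravel()[0])   # first flattened element
    if arr.ndim == 1 and 0 <= col < arr.shape[0]:
        return float(arr[col])
    if arr.ndim == 2 and 0 <= col < arr.shape[1]:
        return float(arr[0, col])
    # Assumption: an unavailable column falls back to default.
    return float(default)


print(safe_first_element_sketch([[0.2, 0.8], [0.4, 0.6]], col=1))  # 0.8
print(safe_first_element_sketch([], col=1))                        # 0.0
```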

calibrated_explanations.utils.helper.assign_threshold(threshold: Any) → Any

Normalize regression threshold for prediction tasks.

Returns empty containers for list/array inputs to prevent threshold broadcast errors. For scalar thresholds, returns the value unchanged. Used in probabilistic regression to validate and prepare thresholds before making predictions.

Parameters:

threshold (scalar, list, array-like, or None) – Optional threshold value for regression explanations.

Returns:

For None: returns None. For scalar: returns the scalar unchanged. For list/array: returns empty array (no threshold broadcast).

Return type:

None, scalar, or empty array

Examples

Scalar threshold (valid for single prediction):

>>> assign_threshold(5.0)
5.0