etiq package

Subpackages

Submodules

etiq.biasparams module

class etiq.biasparams.BiasParams(protected: str, privileged: Any, unprivileged: Any, positive_outcome_label: Any, negative_outcome_label: Any)

Bases: object

negative_outcome_label: Any
positive_outcome_label: Any
privileged: Any
protected: str
unprivileged: Any
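
For illustration, a BiasParams instance for a binary outcome could be constructed as follows (the protected column name and group values are hypothetical):

from etiq.biasparams import BiasParams

bias_params = BiasParams(
    protected="gender",           # hypothetical protected feature name
    privileged="male",            # value treated as the privileged group
    unprivileged="female",        # value treated as the unprivileged group
    positive_outcome_label=1,
    negative_outcome_label=0,
)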

etiq.charting module

etiq.charting.create_data_profile_records(adata_profile: DataProfile) List[DataFeatureProfile]
etiq.charting.generate_data_profile(dataset: AbstractDataset, group_by: str | None = None)
etiq.charting.generate_summary_histograms(dataset: AbstractDataset, bins=15) dict

Generates summary histograms of a dataframe
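
A minimal sketch of calling this on a small dataset, assuming the SimpleDatasetBuilder shown later in this document (dataframe contents and dataset name are illustrative):

import pandas as pd
from etiq import SimpleDatasetBuilder
from etiq.charting import generate_summary_histograms

df = pd.DataFrame(
    {"age": [25, 32, 47, 51], "income": [30000, 42000, 58000, 61000], "target": [0, 1, 1, 0]}
)
dataset = SimpleDatasetBuilder.datasets(
    validation_features=df, label="target", name="histogram_demo"
)
histograms = generate_summary_histograms(dataset, bins=10)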

etiq.charting.str_index(index, use_int_bins: bool) str

Convert index value to a string

If an index is a Pandas Interval, use the left-hand side for the boundary.


etiq.config module

Argument configuration

class etiq.config.DataclassEnhancedJSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

default(o)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
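
As an illustrative sketch, the encoder can be passed to json.dumps via the cls argument. This assumes PipelineDetails is a plain dataclass (the kind of object this encoder is designed to handle) and the field values are hypothetical:

import json
from etiq.config import DataclassEnhancedJSONEncoder, PipelineDetails

details = PipelineDetails(name="nightly-scoring", run_id="run-001")
json.dumps(details, cls=DataclassEnhancedJSONEncoder)
# roughly: '{"name": "nightly-scoring", "run_id": "run-001"}'
# (assuming the encoder serializes dataclasses to dictionaries)
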
class etiq.config.PipelineDetails(name: str, run_id: str)

Bases: object

name: str
run_id: str
etiq.config.clear_config()
etiq.config.etiq_config(src: str | Path)
etiq.config.etiq_pipeline_details(pipeline_name: str, run_id: str | None = None) PipelineDetails
etiq.config.get_config() Dict

Return the currently loaded configuration

etiq.config.get_function_config() Dict | None
etiq.config.get_pipeline_details() PipelineDetails | None
etiq.config.load_config(src: str | Path) Dict

Load a configuration

etiq.config.use_config_for_missing_args(fn: Callable)

Decorate a function so that if it is called with missing parameters, we can retrieve them from the currently loaded configuration

etiq.config.use_config_section_for_missing_args(section_name: str)

Decorate a function so that if it is called with missing parameters, we can retrieve them from the specified section in the currently loaded configuration
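
A hedged sketch of how these decorators are intended to be used; the function and parameter names are hypothetical, and a configuration containing matching entries is assumed to have been loaded already:

from etiq.config import use_config_for_missing_args

@use_config_for_missing_args
def run_scan(project_name, dataset_path, threshold=0.5):
    ...

# With a configuration loaded (for example via etiq.config.etiq_config(...)),
# calling run_scan(threshold=0.8) would retrieve `project_name` and
# `dataset_path` from the loaded configuration instead of failing.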

etiq.custom_metrics module

Decorators to help define custom metrics to be used in model prediction accuracy and bias scoring.

etiq.custom_metrics.actual_values(akeyword: str) Callable

A decorator that maps a specified keyword argument to the ground truth parameter used in metric calculations.

This decorator designates which argument in your function represents the actual values, i.e., ground truth labels.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1], actual=[1, 1, 0],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the actual values.

Returns:

A decorator that maps the specified keyword to the actual values.

Return type:

Callable

etiq.custom_metrics.custom_metric(custom_metric_callable: Callable) Callable

A decorator that allows an arbitrary function to be used as a custom metric with the etiq library.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg
Parameters:

custom_metric_callable – The function to be decorated.

Returns:

A function that returns a dictionary with the function name as the key and the function return value as the value.

etiq.custom_metrics.data_values(akeyword: str) Callable

A decorator that maps a specified keyword argument to the data values parameter used in metric calculations.

Example usage:

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
import etiq

@etiq.metrics.bias_metric
@etiq.custom_metric
@etiq.prediction_values("predictions")
@etiq.data_values("data")
def individual_fairness_metric(data, predictions, n_neighbours: int = 10,
                               closest_neighbours: int = 5, **kwargs) -> float:
    data = np.asarray(data)
    pred = np.asarray(predictions)
    if data.shape[0] < n_neighbours:
        return np.nan
    scaler = StandardScaler().fit(data)
    data_scaled = scaler.transform(data)
    nbrs = NearestNeighbors(n_neighbors=n_neighbours, algorithm='ball_tree', n_jobs=-1).fit(data_scaled)
    indices = nbrs.kneighbors(data_scaled, return_distance=False)
    return 1.0 - np.mean(np.abs(pred - np.mean(pred[indices[:, 0:closest_neighbours]], axis=1)))
Parameters:

akeyword – The keyword argument to be mapped to the data values.

Returns:

A decorator that maps the specified keyword to the data values.

etiq.custom_metrics.negative_outcome(akeyword: str) Callable

A decorator that maps a specified keyword argument to the negative outcome parameter used in metric calculations.

This decorator allows you to specify which argument in your function represents the negative outcome in the context of fairness or bias metrics.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1, 0], actual=[1, 1, 0, 1],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the negative outcome.

Returns:

A decorator that maps the specified keyword to ‘negative_outcome_label’.

Return type:

Callable

etiq.custom_metrics.positive_outcome(akeyword: str) Callable

A decorator that maps a specified keyword argument to the positive outcomes parameter used in metric calculations.

This decorator allows you to specify which argument in your function represents the positive outcome in the context of fairness or bias metrics.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1], actual=[1, 1, 0],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the positive outcome.

Returns:

A decorator that maps the specified keyword to ‘positive_outcome_label’.

Return type:

Callable

etiq.custom_metrics.prediction_values(target_keyword: str) Callable

A decorator that maps a specified keyword argument to the predictions parameter used in metric calculations.

This decorator designates which argument in your function represents the predictions made by a model.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg
Parameters:

target_keyword – The target keyword to map to.

Returns:

A decorator that maps the ‘pred’ keyword to the target keyword.

etiq.drift_measures module

class etiq.drift_measures.DriftMeasure(name, value)

Bases: tuple

name

Alias for field number 0

value

Alias for field number 1

etiq.drift_measures.concept_drift_measure(f: Callable) Callable

A decorator that allows an arbitrary function to be decorated to return a DriftMeasure. In addition, the decorated function is registered for use as a concept drift measure with Etiq. This is how user-defined concept drift measures are specified.

This can be used as follows:

import etiq

@etiq.drift_measures.concept_drift_measure
def total_variational_distance(expected_dist, new_dist):
    return sum(0.5 * abs(x-y) for (x,y) in zip(expected_dist, new_dist))
Parameters:

f – The function to be wrapped

Returns:

A user-defined concept drift measure.

etiq.drift_measures.drift_measure(custom_measure_callable: Callable | None = None, *, autobin: bool = False) Callable

A decorator that allows an arbitrary function to be decorated to return a DriftMeasure. This is used to specify user-defined drift measures.

This can be used as follows:

import etiq

@etiq.drift_measures.drift_measure
def total_variational_distance(expected_dist, new_dist):
    return sum(0.5 * abs(x-y) for (x,y) in zip(expected_dist, new_dist))
Parameters:
  • custom_measure_callable (Callable) – The function to be wrapped

  • autobin – This is set to True to automatically bin the distributions being compared. Defaults to False.

Returns:

A user-defined drift measure.

etiq.drift_measures.earth_mover_distance(expected_dist: List[float], new_dist: List[float]) float

Calculates the Earth-Mover distance between two categorical distributions

Parameters:
  • expected_dist (List[float]) – The expected categorical distribution

  • new_dist (List[float]) – The comparison categorical distribution

Returns:

The Earth-Mover distance between the two categorical distributions

Return type:

float

etiq.drift_measures.jensen_shannon(expected_dist: List[float], new_dist: List[float]) float | None

Calculates the Jensen-Shannon value for a single feature given its expected and newly sampled probability distributions.

\[\text{JS(P,Q)} = \sum_{x} P(x) \ln\left(\frac{P(x)}{Q(x)}\right) + \sum_{x} Q(x) \ln\left(\frac{Q(x)}{P(x)}\right)\]
Parameters:
  • expected_dist (List[float]) – Expected probability distribution

  • new_dist (List[float]) – New probability distribution

Returns:

Calculated Jensen-Shannon value.

Return type:

float

etiq.drift_measures.jensen_shannon_distance(expected_dist: List[float], new_dist: List[float]) float

Calculates the Jensen-Shannon distance between two categorical distributions.

\[\text{JS(P,Q)} = \sum_{x} P(x) \ln\left(\frac{P(x)}{Q(x)}\right) + \sum_{x} Q(x) \ln\left(\frac{Q(x)}{P(x)}\right)\]
Parameters:
  • expected_dist (List[float]) – The expected categorical distribution

  • new_dist (List[float]) – The comparison categorical distribution

Returns:

The Jensen-Shannon distance between the two categorical distributions

Return type:

float

etiq.drift_measures.kl_divergence(expected_dist: List[float], new_dist: List[float]) float

Calculates the Kullback-Leibler divergence between two categorical distributions.

\[\text{KL(P||Q)} = \sum_{x} P(x) \ln\left(\frac{P(x)}{Q(x)}\right)\]
Parameters:
  • expected_dist (List[float]) – The expected categorical distribution

  • new_dist (List[float]) – The comparison categorical distribution

Returns:

The Kullback-Leibler divergence between the two categorical distributions

Return type:

float

etiq.drift_measures.kolmogorov_smirnov(expected_dist: List[float], new_dist: List[float]) float

Calculates the p-value of the two-sample Kolmogorov-Smirnov test for two population samples.

Parameters:
  • expected_dist (List[float]) – List of original population values

  • new_dist (List[float]) – List of new population values

Returns:

The p-value of the two-tailed Kolmogorov-Smirnov test

Return type:

float

etiq.drift_measures.psi(expected_dist: List[float], new_dist: List[float]) float

Calculate the population stability index (PSI) for a single variable given its expected and new empirical distributions.

\[\text{PSI} = \sum_{i=1}^{n} (p_{1i} - p_{0i}) \times \ln\left(\frac{p_{1i}}{p_{0i}}\right)\]
Parameters:
  • expected_dist (List[float]) – Expected probability distribution

  • new_dist (List[float]) – New probability distribution

Returns:

Calculated PSI value

Return type:

float
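
A small worked example, assuming psi accepts plain Python lists of already-binned probabilities, as the type hints suggest:

from etiq.drift_measures import psi

expected = [0.25, 0.25, 0.25, 0.25]
observed = [0.30, 0.20, 0.30, 0.20]

# PSI = sum over bins of (p1 - p0) * ln(p1 / p0)
# Here: 2 * (0.05 * ln(1.2) + (-0.05) * ln(0.8)) ≈ 0.04, i.e. very little drift.
psi(expected, observed)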

etiq.expectations module

ETIQ AI integration with Great Expectations

class etiq.expectations.ExpectationResult(success: bool, name: str, column: str | None, count: int | None, failed: int | None, threshold: Tuple[float, float])

Bases: NamedTuple

Normalised expectation result

column: str | None

Alias for field number 2

count: int | None

Alias for field number 3

failed: int | None

Alias for field number 4

name: str

Alias for field number 1

success: bool

Alias for field number 0

threshold: Tuple[float, float]

Alias for field number 5

etiq.expectations.add_json_suite_to_validator(validator: Validator, json_suite: List[Dict[str, Dict[str, Any] | Any]]) None

Apply a JSON representation of an expectation suite to a validator. We expect the suite to be specified as a list of dictionary entries:

{<expectation_name:str>: <value:any>}  # Shorthand for single argument expectations.

or:

{<expectation_name:str>: {<kwarg:str>: <value:any>}}

Example format:

[
    {"expect_column_values_to_not_be_null": "age"},
    {"expect_column_values_to_be_between": {"column": "age", "min_value": 0, "max_value": 120}},
]
etiq.expectations.expectations_from_data_profile(record: FeatureDataProfile, margin: float = 0.1) List[Dict[str, Any]]

Create one or more expectations from a data feature profile

Parameters:
  • record – FeatureDataProfile - Record to generate expectations from.

  • margin – float - (0.0 - 1.0) Percentage allowable margin of error (0 = no error allowed, 1 = complete error).

etiq.expectations.get_context(dataset: BasePandasDatasetMixin)

Return the associated context for an expectation validator

etiq.expectations.get_expectation_results(run_results: Any) List[ExpectationResult]

Get results from the nested result object as a more basic list of objects

etiq.expectations.get_min_max_threshold(kwargs: Dict[str, Any]) Tuple[float, float]

Get min/max values for threshold of expectation if any

etiq.expectations.get_validator(dataset: BasePandasDatasetMixin, context: AbstractDataContext | None = None) Validator

Get the Great Expectations validator associated with the given dataset

Parameters:
  • dataset – Etiq dataset

  • context – Optional context argument
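
A hedged end-to-end sketch combining these helpers. The `dataset` variable is assumed to be an existing pandas-backed Etiq dataset, validate() is the standard Great Expectations Validator method, and passing its result directly to get_expectation_results is an assumption:

from etiq.expectations import (
    add_json_suite_to_validator,
    get_expectation_results,
    get_validator,
)

validator = get_validator(dataset)  # `dataset` assumed to exist
add_json_suite_to_validator(
    validator,
    [
        {"expect_column_values_to_not_be_null": "age"},
        {"expect_column_values_to_be_between": {"column": "age", "min_value": 0, "max_value": 120}},
    ],
)
results = get_expectation_results(validator.validate())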

etiq.measures module

class etiq.measures.Measure(name, value)

Bases: tuple

name

Alias for field number 0

value

Alias for field number 1

etiq.measures.correlation_measure(custom_measure_callable: Callable[[List[Any], List[Any]], float]) Callable[[List[Any], List[Any]], Measure]

A decorator used to create and register a custom correlation measure

An example of the use of this decorator is as follows:

import scipy.stats as stats
import etiq
@etiq.measures.correlation_measure
def kendalls(x, y) -> float:
    return stats.kendalltau(x,y).statistic
Parameters:

custom_measure_callable – The correlation function

Returns:

A function which returns a Measure.

Return type:

Callable

etiq.measures.cramersv(x: List[Any], y: List[Any]) float

Calculates Cramer’s V between two (categorical) variables.

Parameters:
  • x – List like values representing the first variable.

  • y – List like values representing the second variable.

Returns:

Cramer’s V association coefficient for the two variables.

Return type:

float

etiq.measures.epsilonSquare(x: List[Any], y: List[float]) float

Calculates the epsilon squared effect size of the Kruskal-Wallis test between two variables where the first variable is categorical and the second continuous.

Parameters:
  • x – List like values representing the categorical variable

  • y – List like values representing the continuous variable

Returns:

The epsilon squared effect size of the Kruskal-Wallis test.

Return type:

float

etiq.measures.pearsons(x: List[float], y: List[float]) float

Calculates Pearson’s product-moment correlation coefficient between two variables.

Parameters:
  • x – List like values representing the first variable.

  • y – List like values representing the second variable.

Returns:

Pearson’s correlation coefficient of the two variables.

Return type:

float

etiq.measures.pointbiserial(x: List[float], y: List[float]) float

Calculates the point biserial coefficient between two variables. This should be used to calculate correlation between a binary variable and a continuous one.

Parameters:
  • x – List like values representing the first variable.

  • y – List like values representing the second variable.

Returns:

Point biserial correlation coefficient of the two variables.

Return type:

float

etiq.measures.rankbiserial(x: List[Any], y: List[float]) float

Calculates the rank-biserial correlation between two variables where the first variable is categorical and the second continuous. If the categorical variable has more than two categories then the rank-biserial correlation for all pairs of categories is calculated and the rank-biserial correlation with the largest absolute value is returned.

Parameters:
  • x – List like values representing the categorical variable.

  • y – List like values representing the continuous variable.

Returns:

The largest rank-biserial correlation value for all pairs of categories.

Return type:

float

etiq.measures.reset_correlation_measures() None

This removes all correlation measures from the list of correlation measures.

etiq.measures.spearmans(x: List[float], y: List[float]) float

Calculates Spearman’s rank order correlation coefficient between two variables

Parameters:
  • x – List like values representing the first variable

  • y – List like values representing the second variable

Returns:

Spearman’s rank order correlation coefficient of the two variables.

Return type:

float
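
A short sketch of calling these correlation measures directly on list-like data; the values below are purely illustrative:

from etiq.measures import cramersv, pearsons, spearmans

x_cat = ["a", "a", "b", "b", "b", "c"]
y_cat = ["u", "u", "v", "v", "u", "v"]
cramersv(x_cat, y_cat)     # association between two categorical variables

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
pearsons(x, y)             # close to 1.0 for this near-linear relationship
spearmans(x, y)            # 1.0, since the ranking is perfectly monotonic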

etiq.metrics module

etiq.metrics.accuracy(pred: FeatureOrNameType, label: FeatureOrNameType, **kwargs) float

Returns accuracy given predictions and the corresponding ground truth labels.

Accuracy = number of correct predictions / number of predictions. See accuracy score.

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

Returns:

Accuracy score for the predictions.

etiq.metrics.accuracy_metric(f: Callable) Callable

Decorate a function so that it is added to the list of accuracy metrics.

An example of the use of this decorator is as follows:

@etiq.metrics.accuracy_metric
@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome_label")
@etiq.negative_outcome("negative_outcome_label")
def f_score(predictions, actual, positive_outcome_label, negative_outcome_label):
    true_pos = sum((predictions == actual) & (actual == positive_outcome_label))
    false_pos = sum((predictions != actual) & (actual == positive_outcome_label))
    false_neg = sum((predictions != actual) & (actual == negative_outcome_label))
    return true_pos / (true_pos + 0.5*(false_pos + false_neg))
Parameters:

f – The function to be decorated.

Returns:

The unchanged function after it has been added to the list of accuracy metrics.

etiq.metrics.bias_metric(f: Callable) Callable

Decorate a function so that it is added to the list of bias metrics.

An example of the use of this decorator is as follows:

@etiq.metrics.bias_metric
@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome_label")
@etiq.negative_outcome("negative_outcome_label")
def treatment_equality(predictions, actual, positive_outcome_label, negative_outcome_label):
    false_neg = sum((predictions != actual) & (predictions == negative_outcome_label))
    false_pos = sum((predictions != actual) & (predictions == positive_outcome_label))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    try:
        val = false_pos/false_neg
    except ZeroDivisionError:
        return np.inf
    return val
Parameters:

f – The function to be decorated.

Returns:

The unchanged function after it has been added to the list of bias metrics.

etiq.metrics.demographic_parity(pred: Any, positive_outcome_label: Any, **kwargs) float

Calculates demographic parity using the positive prediction rate (note this is NOT the true positive rate).

Demographic parity = number of predicted positive labels / number of predictions

Parameters:
  • pred – Predictions.

  • positive_outcome_label – The positive outcome label.

Returns:

The positive prediction rate.
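
A minimal worked example, assuming the metric can be called directly with an array-like of predictions:

import numpy as np
from etiq.metrics import demographic_parity

preds = np.array([1, 0, 1, 1, 0])
demographic_parity(pred=preds, positive_outcome_label=1)
# 3 of the 5 predictions are positive, so the positive prediction rate is 0.6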

etiq.metrics.equal_odds_tnr(pred: FeatureOrNameType, label: FeatureOrNameType, negative_outcome_label: Any, **kwargs) float

Calculates equal odds true negative rate differences between two demographic populations e.g. male and female. Returns a true negative rate given predictions, the corresponding ground truth labels and the negative outcome label.

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

  • negative_outcome_label – The negative outcome label.

Returns:

The true negative rate.

etiq.metrics.equal_odds_tpr(pred: FeatureOrNameType, label: FeatureOrNameType, positive_outcome_label: Any, **kwargs) float

Calculates equal odds true positive rate differences between two demographic populations e.g. male and female. Returns a true positive rate given predictions, the corresponding ground truth labels and the positive outcome label. This corresponds to the recall score in sklearn.

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

  • positive_outcome_label – The positive outcome label.

Returns:

The true positive rate.

etiq.metrics.equal_opportunity(pred: FeatureOrNameType, label: FeatureOrNameType, positive_outcome_label: Any, **kwargs) float

This is used to calculate equal opportunity differences between two demographic groups e.g. male and female. Returns a true positive rate given predictions, the corresponding ground truth labels and the positive outcome label. This corresponds to the recall score in sklearn. Note that this is identical to equal_odds_tpr().

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

  • positive_outcome_label – The positive outcome label.

Returns:

The true positive rate.

etiq.metrics.individual_fairness(data: Any, pred: Any, n_neighbours: int = 10, closest_neighbours: int = 5, *args, **kwargs) float

Calculate an individual fairness metric as the mean consistency of the predictions. Consistency for an observation is based on the difference between the prediction for that observation and the mean prediction over the observation's k-nearest neighbours.

Parameters:
  • data – The data.

  • pred – The predictions made on the data.

Returns:

An individual fairness metric based on consistency of prediction.

etiq.metrics.individual_fairness_cf(pred: FeatureOrNameType, protected: FeatureOrNameType, privileged: ProtectedType, unprivileged: ProtectedType, positive_outcome_label: OutcomeType, negative_outcome_label: OutcomeType, data: Any, n_neighbours: int = 10, *args, **kwargs) dict

This calculates the proportion of the unprivileged group with a negative outcome prediction which have at least one “counterfactual” amongst their nearest neighbours. In this instance a counterfactual is defined as a member of the privileged group with a positive outcome prediction.

Parameters:
  • pred – The predictions made on the data. Defaults to None.

  • data – The data. Defaults to None.

  • protected – Protected (sensitive) labels. Defaults to None.

  • privileged – Privileged label. Defaults to None.

  • unprivileged – Unprivileged label. Defaults to None.

  • positive_outcome_label – The positive outcome label. Defaults to None.

  • negative_outcome_label – The negative outcome label. Defaults to None.

Returns:

The proportion of the unprivileged group with a negative outcome prediction which have at least one “counterfactual” amongst their nearest neighbours.

etiq.metrics.reset_metric_lists() None

This removes all custom metrics from the list of accuracy and bias metrics.

etiq.metrics.true_neg_rate(pred: FeatureOrNameType, label: FeatureOrNameType, negative_outcome_label: Any, **kwargs) float

Returns a true negative rate given predictions and the corresponding ground truth labels.

True Negative Rate = all correctly predicted negative outcome labels / all negative outcome labels

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

  • negative_outcome_label – The negative outcome label.

Returns:

True negative rate for the predictions.

etiq.metrics.true_pos_rate(pred: FeatureOrNameType, label: FeatureOrNameType, positive_outcome_label: Any, **kwargs) float

Returns a true positive rate given predictions and the corresponding ground truth labels. This corresponds to the recall score in sklearn.

True Positive Rate = all correctly predicted positive outcome labels / all positive outcome labels

Parameters:
  • pred – Predictions.

  • label – Ground truth labels.

  • positive_outcome_label – The positive outcome label.

Returns:

True positive rate for the predictions.
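
A short worked example covering both rates, assuming the functions accept array-like predictions and labels when called directly:

import numpy as np
from etiq.metrics import true_neg_rate, true_pos_rate

label = np.array([1, 1, 0, 0, 1])
pred = np.array([1, 0, 0, 1, 1])

true_pos_rate(pred=pred, label=label, positive_outcome_label=1)   # 2 of 3 positives found -> 0.667
true_neg_rate(pred=pred, label=label, negative_outcome_label=0)   # 1 of 2 negatives found -> 0.5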

etiq.model module

ETIQ Core data model representations

class etiq.model.BaseModel(*, model_fitted=None, model_architecture=None, model_label=None)

Bases: object

NAME = 'BaseModel'
fit(x, y)

Run the model fit

property has_model_fit_run
property isfitted
predict(x)

Predict

score(x, y)
class etiq.model.DefaultLogisticRegression(model_label=None, **kwargs)

Bases: BaseModel

Wrapper for sklearn Logistic Regression model

NAME = 'LogisticRegression'
class etiq.model.DefaultRandomForestClassifier(model_label=None, **kwargs)

Bases: BaseModel

Wrapper for sklearn Random Forest Classifier

NAME = 'RandomForest'
class etiq.model.DefaultXGBoostClassifier(model_label=None, **kwargs)

Bases: BaseModel

Wrapper for XGBoost Classifier

NAME = 'XGBoost'
class etiq.model.Model(model_architecture=None, model_fitted=None, model_label=None, **kwargs)

Bases: BaseModel

User supplied dataset model

NAME = 'UserModel'
class etiq.model.PrecalculatedModel(features: DataFrame, prediction_label: str, model_label: str | None = None)

Bases: BaseModel

NAME = 'Precalculated'
fit(x, y)

Run the model fit

predict(x)

Predict
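
A hedged sketch of fitting one of the built-in wrappers; the training arrays here are hypothetical, and any model following the fit/predict semantics above can be used in the same way:

import numpy as np
from etiq.model import DefaultLogisticRegression

X_train = np.array([[0.1, 1.0], [0.4, 0.8], [0.9, 0.2], [0.7, 0.3]])
y_train = np.array([0, 0, 1, 1])

model = DefaultLogisticRegression(model_label="baseline-logreg")
model.fit(X_train, y_train)
predictions = model.predict(X_train)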

etiq.projects module

Define user facing project API

etiq.projects.get_all_projects() List[Project]

Return a list of all projects

etiq.projects.open(name: str, create_if_missing: bool = True) Project

Open a project, creating it if desired

Parameters:
  • name – Project name

  • create_if_missing – Create this project if it can’t be found?

Returns:

A new or existing project with the given name

Return type:

Project | None
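
For example (the project name is hypothetical):

import etiq

project = etiq.projects.open("credit-default-monitoring", create_if_missing=True)
all_projects = etiq.projects.get_all_projects()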

etiq.scans module

Objects to represent scan results to the end user

class etiq.scans.Aggregate(issue_name: str, feature_count: float, segment_count: float, issues_tested_count: float, issues_found_count: float, metric: str, measure: str, lower_threshold: float, upper_threshold: float)

Bases: object

feature_count: float
issue_name: str
issues_found_count: float
issues_tested_count: float
lower_threshold: float
measure: str
metric: str
segment_count: float
upper_threshold: float
class etiq.scans.Issue(issue: str, feature: str, metric: str, metric_value: float, measure: str, measure_value: float, lower_threshold: float, upper_threshold: float, value: str, record: str)

Bases: object

feature: str
issue: str
lower_threshold: float
measure: str
measure_value: float
metric: str
metric_value: float
record: str
upper_threshold: float
value: str
class etiq.scans.Scan(id: str, name: str, created: datetime.datetime, modified: datetime.datetime, start_time: datetime.datetime, end_time: datetime.datetime, status: datetime.datetime, log: str, meta: Dict[Any, Any], type: str, segments: List[etiq.scans.Segment], issues: List[etiq.scans.Issue], aggregates: List[etiq.scans.Aggregate])

Bases: object

aggregates: List[Aggregate]
created: datetime
end_time: datetime
classmethod from_record(scan_record: Scan) Scan
id: str
issues: List[Issue]
log: str
meta: Dict[Any, Any]
modified: datetime
name: str
segments: List[Segment]
start_time: datetime
status: datetime
type: str
class etiq.scans.Segment(name: str, business_rule: str, volume: float, volume_percent_total: float, metric_tag: str)

Bases: object

business_rule: str
metric_tag: str
name: str
volume: float
volume_percent_total: float
etiq.scans.populate_record(record_type: Any, record: Any, **extra_properties) Any

Populate dataclass records from a sqlalchemy record

Parameters:
  • record_type – The dataclass type to populate (one of the dataclasses in this module)

  • record – SQLAlchemy record instance to get values from

  • **extra_properties – Any extra properties to add to the dataclass

etiq.snapshot_stage module

class etiq.snapshot_stage.SnapshotStage(value)

Bases: Enum

An enumeration.

PRE_PRODUCTION = 0
PRODUCTION = 1
class etiq.snapshot_stage.SnapshotStatus(value)

Bases: Enum

An enumeration.

FINAL = 1
INITIAL = 0

etiq.snapshots module

Define the Etiq Snapshot class

class etiq.snapshots.Snapshot(id: Optional[str], name: str, dataset: etiq.datasets.abstract_dataset.AbstractDataset, model: etiq.model.BaseModel, project: 'etiq.projects.Project', comparison_dataset: Optional[etiq.datasets.abstract_dataset.AbstractDataset], bias_params: Optional[etiq.biasparams.BiasParams] = None, stage: etiq.snapshot_stage.SnapshotStage = <SnapshotStage.PRE_PRODUCTION: 0>, status: etiq.snapshot_stage.SnapshotStatus = <SnapshotStatus.INITIAL: 0>, significant_features: list = <factory>, generate_data_profiles: bool = False, enable_significance_calculation: bool = False, pipeline_id: Optional[str] = None, run_id: Optional[str] = None)

Bases: object

bias_params: BiasParams | None = None

Bias parameters (demographic parameters).

calculate_significant_features()

Calculates the order of significance of the (at most) 20 most significant features of the snapshot’s dataset (using the validation samples of the dataset) given the snapshot’s model.

Note that these are stored in the snapshot’s significant_features attribute and not returned as output.

comparison_dataset: AbstractDataset | None

Comparison dataset.

dataset: AbstractDataset

Base dataset.

enable_significance_calculation: bool = False

Flag set/unset to enable/disable calculation for feature significances.

generate_data_profiles: bool = False

Flag set/unset to enable/disable generation of data profiles for the dataset features.

get_all_scans() List[Scan]

Gets a list of all scans that have been performed on the snapshot.

Returns:

A list of the scans that have been performed on the snapshot.

get_validator()

Get the Great Expectations validator. Requires the great-expectations Python package to be installed.

id: str | None

Snapshot’s identifier.

model: BaseModel

Model. Sklearn compatible model i.e. one which conforms to the fit/predict semantics.

name: str

Snapshot name.

pipeline_id: str | None = None

(Optional) identifier to identify the pipeline to which the snapshot belongs.

project: etiq.projects.Project

Project to which the snapshot belongs.

run_id: str | None = None

(Optional) identifier to identify the run within a pipeline to which the snapshot belongs.

scan_accuracy_metrics(thresholds: Dict[str, Tuple[float, float]], ignore_lower_threshold: bool = False, ignore_upper_threshold: bool = True, metric_filter: List[str] | None = None, positive_outcome_label: int = 1, negative_outcome_label: int = 0, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Scan for accuracy issues. This scans the base dataset of the snapshot (more specifically the validation samples of the dataset) for issues where the value of the specified accuracy metrics is below and/or above the specified thresholds.

Use this scan like this:

thresholds = {"accuracy": [0.85, 1.0]}
segments, issues, issue_summary = (
    snapshot.scan_accuracy_metrics(
        thresholds = thresholds, metric_filter = ["accuracy"]
    )
)
Parameters:
  • thresholds – A dictionary of the accuracy thresholds (both upper and lower) e.g. {“precision” : [0.8, 1.0]} where “precision” is the metric with 0.8 and 1.0 the lower and upper thresholds respectively, defaults to None (in which case the lower and upper thresholds default to 0.5 and 1.0).

  • ignore_lower_threshold – This is set to True if we want to ignore the lower threshold. This is useful for metrics where a higher value means a poorer accuracy like RMSE. Defaults to False.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper threshold. This is useful for metrics where a higher value means better accuracy. Defaults to True.

  • metric_filter – This restricts the scan to only use the metrics whose names are specified in this list. If this is not specified it defaults to None, in which case all known accuracy metrics, including any user defined ones, are included in the scan. Defaults to None.

  • positive_outcome_label – The label of the “positive” outcome where this is required by an accuracy metric e.g. for the calculation of a precision metric. This defaults to 1.

  • negative_outcome_label – The label of the “negative” outcome where this is required by an accuracy metric e.g. for the calculation of a recall metric. This defaults to 0.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned by the scan.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either has no suitable model or because the dataset does not contain predictions.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_accuracy_metrics_rca(thresholds: Dict[str, Tuple[float, float]] | None = None, ignore_lower_threshold: bool = False, ignore_upper_threshold: bool = True, metric_filter: List[str] | None = None, minimum_segment_size: int | None = None, positive_outcome_class: int = 1, negative_outcome_class: int = 0, encoders: Dict[str, LabelEncoder] | None = None, find_most_granular: bool = False, rca_search_features: List[str] | None = None, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Scan for the root cause of accuracy issues.

This scans the base dataset of the snapshot (more specifically the validation samples of the dataset) for segments where the value of the specified accuracy metrics is below and/or above the specified thresholds.

Use this scan like this:

thresholds = {"accuracy": [0.85, 1.0]}
segments, issues, issue_summary = (
    snapshot.scan_accuracy_metrics_rca(
        thresholds=thresholds,
        metric_filter=["accuracy"],
        find_most_granular=False,
        rca_search_features=["gender", "age", "relationship"]
    )
)
Parameters:
  • thresholds – A dictionary of the accuracy thresholds (both upper and lower) e.g. {“precision” : [0.8, 1.0]} where “precision” is the metric with 0.8 and 1.0 the lower and upper thresholds respectively, defaults to None (in which case the lower and upper thresholds default to 0.5 and 1.0).

  • ignore_lower_threshold – This is set to True if we want to ignore the lower threshold. This is useful for metrics where a higher value means a poorer accuracy like RMSE. Defaults to False.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper threshold. This is useful for metrics where a higher value means better accuracy. Defaults to True.

  • metric_filter – This restricts the scan to only use the metrics whose names are specified in this list. If this is not specified it defaults to None, in which case all known accuracy metrics, including any user defined ones, are included in the scan. Defaults to None.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. If this is not set it defaults to None in which case 2% of the total number of validation samples is used as the minimum segment size by the RCA scan.

  • positive_outcome_class – The label of the “positive” outcome where this is required by an accuracy metric e.g. for the calculation of a precision metric. This defaults to 1.

  • negative_outcome_class – The label of the “negative” outcome where this is required by an accuracy metric e.g. for the calculation of a recall metric. This defaults to 0.

  • encoders – Dictionary of the encoders (if any) used for the features. This enables the segment logic to be translated back to the pre-encoded version. This defaults to None.

  • find_most_granular – This flag determines the RCA mode. If this set to False the RCA will stop scanning a segment if the segment has an accuracy below the specified threshold, however if this is set to True the RCA will keep scanning sub-segments of segments where the metric is below/above specified thresholds. This defaults to False.

  • rca_search_features – Set of features to use in the RCA segment scan. This defaults to None in which case the dataset’s features are used when scanning for segments.

  • issue_storage_limit (int, optional) – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned for the scan.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either has no suitable model or because the dataset does not contain predictions.

Returns:

A scan result consisting of the segments found during the scan, the individual issues found and a summary of the issues.

scan_bias_metrics(thresholds: Dict[str, Tuple[float, float]], ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, metric_filter: List[str] | None = None, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Scan for bias issues.

This scans the base dataset of the snapshot (more specifically the validation samples of the dataset) for issues where the difference in the specified bias metrics is below and/or above the specified thresholds.

Note that this will only run if the snapshot contains a base dataset that contains bias information i.e. a BiasDataset.

Use this scan like this:

thresholds = {"precision": [0.0, 0.25]}
segments, issues, issue_summary = (
    snapshot.scan_bias_metrics(
        thresholds=thresholds, metric_filter=["precision"]
    )
)
Parameters:
  • thresholds – A dictionary of the bias thresholds (both upper and lower) e.g. {“precision” : [0.0, 0.2]} where “precision” is the metric with 0.0 and 0.2 as the lower and upper thresholds respectively, defaults to None (in which case the lower and upper thresholds default to 0.0 and 0.2). Note that these are thresholds on the difference of the statistic between a favoured and not favoured group e.g. male and female.

  • ignore_lower_threshold – This is set to True if we want to ignore the lower difference threshold. Defaults to True.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper difference threshold. Defaults to False.

  • metric_filter – This restricts the scan to only use the metrics specified in this list. If this is not specified it defaults to None, in which case all known bias metrics, including any user defined ones, are included in the scan.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned for the scan.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either has no suitable model or because the dataset does not contain predictions and/or bias information.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_bias_metrics_rca(thresholds: Dict[str, Tuple[float, float]] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, metric_filter: List[str] | None = None, minimum_segment_size: int | None = None, encoders: Dict[str, LabelEncoder] | None = None, find_most_granular: bool = False, rca_search_features: List[str] | None = None, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Scan for bias issues root cause.

This scans the base dataset of the snapshot (more specifically the validation samples of the dataset) for segments where the difference in the specified bias metrics is below and/or above the specified thresholds.

Note that this will only run if the snapshot contains a base dataset that contains bias information i.e. a BiasDataset.

Use this scan like this:

thresholds = {"precision": [0.0, 0.25]}
segments, issues, issue_summary = (
    snapshot.scan_bias_metrics_rca(
        thresholds=thresholds, metric_filter=["precision"]
    )
)
Parameters:
  • thresholds – A dictionary of the bias thresholds (both upper and lower) e.g. {“precision” : [0.0, 0.2]} where “precision” is the metric with 0.0 and 0.2 as the lower and upper thresholds respectively, defaults to None (in which case the lower and upper thresholds default to 0.0 and 0.2). Note that these are thresholds on the difference of the statistic between a favoured and not favoured group e.g. male and female.

  • ignore_lower_threshold – This is set to True if we want to ignore the lower difference threshold. Defaults to True.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper difference threshold. Defaults to False.

  • metric_filter – This restricts the scan to only use the metrics specified in this list. If this is not specified it defaults to None, in which case all known bias metrics, including any user defined ones, are included in the scan.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. If this is not set it defaults to None in which case 2% of the total number of validation samples is used as the minimum segment size by the RCA scan.

  • encoders – Dictionary of the encoders (if any) used for the features. This enables the segment logic to be translated back to the pre-encoded version. This defaults to None.

  • find_most_granular – This flag determines the RCA mode. If this set to False the RCA will stop scanning a segment if the segment has a bias difference below/above the specified thresholds, however if this is set to True the RCA will keep scanning sub-segments of segments where the metric difference is below/above specified thresholds. This defaults to False.

  • rca_search_features – Set of features to use in the RCA segment scan. This defaults to None in which case the dataset’s features are used when scanning for segments.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned for the scan.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either has no suitable model or because the dataset does not contain predictions and/or bias information.

Returns:

A scan result consisting of the segments, the individual issues found and a summary of the issues.

scan_bias_sources(auto: bool = False, nr_groups: int = 20, minimum_segment_size: int | None = None, continuous_continuous_measure: str = 'pearsons', categorical_categorical_measure: str = 'cramersv', categorical_continuous_measure: str = 'rankbiserial', binary_continuous_measure='pointbiserial', issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify potential sources of bias in the snapshot’s base dataset.

segments, issues, issue_summary = (
    snapshot.scan_bias_sources(auto=True)
)
Parameters:
  • auto – Set to True to use automated tree based segmentation to identify potentially biased segments. If set to False K-Means clustering is used to identify segments to examine for potential data bias. Defaults to False.

  • nr_groups – The number of groups to use if K-Means clustering is used to create data segments to examine for potential bias. Defaults to 20.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant when using the automated tree based segmentation method. Defaults to None in which case 2% of the total number of validation samples is used.

  • continuous_continuous_measure – The correlation measure to use when checking correlations between two continuous features. Defaults to ‘pearsons’.

  • categorical_categorical_measure – The correlation measure to use when checking for correlations between two categorical features. Defaults to ‘cramersv’.

  • categorical_continuous_measure – The correlation measure to use when checking for correlations between a categorical and a continuous feature. Defaults to ‘rankbiserial’.

  • binary_continuous_measure – The correlation measure to use when checking for correlations between a binary categorical feature and a continuous feature. Defaults to ‘pointbiserial’.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to 10000.

Raises:
  • NotImplementedError – Raised if this scan is run for non-pandas datasets.

  • ValueError – Raised if the snapshot’s base dataset contains no demographic information.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_concept_drift_metrics(thresholds, concept_drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, number_of_bins: int = 10, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify concept drift issues between a snapshot’s base and comparison datasets.

A concept drift issue occurs where a concept drift measure between the base and comparison datasets is below and/or above the specified thresholds for the concept drift measure.

Use this scan like this:

thresholds = {"jensen_shannon_distance": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_concept_drift_metrics(
        thresholds = thresholds,
        drift_measures = ["jensen_shannon_distance"]
    )
)
Parameters:
  • thresholds

    A user-defined dictionary of the drift measure thresholds (both upper and lower) e.g. {“jensen_shannon_distance” : [0.0, 0.15]} where “jensen_shannon_distance” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift while the upper threshold indicates a maximum level of acceptable drift. This defaults to None.

    Note that if no user defined thresholds are provided or a drift measure is not present in the thresholds dictionary then upper and lower thresholds default to 0.05 and 1.0 respectively.

  • concept_drift_measures – A list of concept drift measures to use in the scan. If this is not specified it defaults to None in which case all known concept drift measures, including any user defined ones, are included in the scan.

  • ignore_lower_threshold – This is set to True if we want to ignore the lower threshold otherwise the lower threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a high measure value indicates a possible drift issue. This defaults to True.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper threshold otherwise the upper threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a low measure value indicates a possible drift issue. This defaults to False.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned by the scan.

Raises:
  • NotImplementedError – Raised if this scan is run for non-pandas datasets.

  • ValueError – This is raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had unknown drift measures specified.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_concept_drift_metrics_rca(thresholds: Dict[str, Tuple[float, float]] | None = None, concept_drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, features: List[str] | None = None, minimum_segment_size: int | None = None, find_most_granular: bool = False, encoders: Dict[str, LabelEncoder] | None = None, issue_storage_limit: int = 10000, **kwargs) Tuple[DataFrame, DataFrame, DataFrame]

Run a root cause analysis (RCA) for concept drift issues between a snapshot’s base and comparison datasets.

A concept drift issue occurs where a concept drift measure between the base and comparison datasets is below and/or above the specified thresholds for the concept drift measure.

Use this scan like this:

thresholds = {"jensen_shannon_distance": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_concept_drift_metrics_rca(
        thresholds = thresholds,
        drift_measures = ["jensen_shannon_distance"]
    )
)
Parameters:
  • thresholds

    A user-defined dictionary of the concept drift measure thresholds (both upper and lower) e.g. {“jensen_shannon_distance” : [0.0, 0.15]} where “jensen_shannon_distance” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift while the upper threshold indicates a maximum level of acceptable drift. This defaults to None.

    Note that if no user defined thresholds are provided or a drift measure is not present in the thresholds dictionary then upper and lower thresholds default to 0.05 and 1.0 respectively.

  • concept_drift_measures – A list of concept drift measures to use in the scan. If this is not specified it defaults to None in which case all known concept drift measures, including any user defined ones, are included in the scan.

  • ignore_lower_threshold – This is set to True if we want to ignore the lower threshold otherwise the lower threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a high measure value indicates a possible drift issue. This defaults to True.

  • ignore_upper_threshold – This is set to True if we want to ignore the upper threshold otherwise the upper threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a low measure value indicates a possible drift issue. This defaults to False.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. Defaults to None, in which case 2% of the total number of validation samples is used as the minimum segment size by the RCA scan.

  • encoders – Dictionary of the encoders (if any) used for the features. This enables the segment logic to be translated back to the pre-encoded version. This defaults to None.

  • find_most_granular – Determines the RCA (Root Cause Analysis) mode. If set to False, the RCA will stop scanning a segment if the segment has concept drift below/above the specified thresholds. If set to True, the RCA will keep scanning sub-segments of segments where the metric difference is below/above specified thresholds. Defaults to False.

  • issue_storage_limit – The total number of individual issues to return for the scan. Defaults to None in which case a maximum of 10000 issues will be returned by the scan.

Raises:
  • NotImplementedError – Raised if this scan is run for non-pandas datasets.

  • ValueError – This is raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had unknown drift measures specified.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_data_changes(snapshot: Snapshot, group_by: str | None = None, margin: float = 0.1, margin_per_field: Dict[str, float] | None = None) Tuple[DataFrame, DataFrame, DataFrame]

Create expectations based off a previous snapshot and compare to this snapshot.

We’ll create a data profile of the dataset associated with the snapshot and compare it against this one using Great Expectations. The great-expectations package must be installed for this to work.

Parameters:
  • snapshot – Snapshot - The other snapshot to scan against.

  • group_by – Str - Optional field to group the data by.

  • margin – Float - (0.0 - 1.0) Optional error margin to apply to the tests (where 0.0 = no deviation allowed).

  • margin_per_field – Dict[str, float] - Optional dictionary of fieldname/margin pairs to specify margin differently for specific fields.

Returns:

ScanResult - Segments, Issues, AggregateIssue dataframes.

scan_data_issues(orderings: List[Tuple[str, str]] | None = None, filter_ids: Sequence[str] | None = None, search_for_missing_features: bool = True, search_for_unknown_features: bool = True, identical_feature_filter: Sequence[str] | None = None, missing_category_feature_filter: Sequence[str] | None = None, unknown_category_feature_filter: Sequence[str] | None = None, range_feature_filter: Sequence[str] | None = None, duplicate_features_subset: Sequence[str] | None = None, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify data issues.

import pandas as pd
import etiq
from etiq import SimpleDatasetBuilder

df = pd.DataFrame(
    [
        ["ID1", "2022-10-10", "2022-10-17", 2, 3, 4, 5, 6, 1],
        ["ID2", "2022-10-11", "2022-10-18", 8, 9, 10, 11, 12, 0],
        ["ID3", "2022-10-12", "2022-10-19", 2, 3, 4, 5, 6, 1],
        ["ID4", "2022-10-13", "2022-10-20", 8, 9, 10, 11, 12, 0],
        ["ID5", "2022-10-14", "", 2, 3, 4, 5, 6, 1],
        ["ID5", "2022-10-15", "2022-10-22", 8, 9, 10, 11, 12, 0],
        ["ID6", "2022-10-16", "2022-10-23", 2, 3, 4, 5, 6, 1],
        ["ID7", "2022-10-17", "2022-10-24", 8, 9, 10, 11, 12, 0],
        ["ID8", "2022-10-18", "2022-10-25", 14, 15, 16, 17, 18, 1],
        ["ID9", "2022-10-19", "2022-09-26", 15, 16, 17, 18, 19, 1],
        ["", "2022-10-19", "2023-09-26", 15, 16, 17, 18, 19, 1],
    ],
    columns=[
        "key",
        "start",
        "end",
        "F1",
        "F2",
        "F3",
        "F4",
        "F5",
        "T",
    ]
)
base_dataset = SimpleDatasetBuilder.datasets(
    validation_features=df,
    label="T",
    id_col=["key"],
    cat_col=["F1", "F2", "T"],
    date_col=["start", "end"],
    convert_date_cols=True,
    name="test_dataset",
)
project = etiq.projects.open(name="Data Issues")
# Creating a snapshot
snapshot = project.snapshots.create(
    name="Test Data Issues Snapshot", dataset=base_dataset, model=None
)
(segments, issues, issue_summary) = snapshot.scan_data_issues(
    orderings=[("start", "end")],
    duplicate_features_subset=["key"]
)

This will either scan the base dataset for data issues or, if there is a comparison dataset present in the snapshot, scan the comparison dataset for data issues using the base dataset as an exemplar dataset.

Parameters:
  • orderings – List of tuples defining feature orderings. Each tuple is a pair of features with a defined order, e.g. the tuple (‘start_date’, ‘end_date’) indicates that for every sample in the dataset ‘start_date’ must be less than or equal to ‘end_date’. This defaults to None in which case no order violations are checked.

  • filter_ids – List of ID features to scan for missing IDs. This defaults to None in which case we do not search for missing IDs.

  • search_for_missing_features – Set to True to have the method scan for missing features in the comparison dataset, i.e. features that appear in the base dataset but not in the comparison dataset. Otherwise no check for missing features is performed. Defaults to True.

  • search_for_unknown_features – Set to True to have the method scan for unknown features in the comparison dataset, i.e. features that appear in the comparison dataset but not in the base dataset. Otherwise no check for unknown features is performed. Defaults to True.

  • identical_feature_filter – List of features to scan for issues where the feature is identical between the base and comparison dataset. Defaults to None in which case no features are checked to see if they are identical.

  • missing_category_feature_filter – List of features to scan for data issues where there are missing categories i.e. a category is present for the feature in the base dataset but not present in the comparison dataset. Defaults to None in which case no features are scanned for missing categories.

  • unknown_category_feature_filter – List of features to scan for data issues where there are unknown categories i.e. a category is present for a dataset feature in the comparison dataset but not present in the base dataset. Defaults to None in which case no features are scanned for unknown categories.

  • range_feature_filter – List of features to scan for data issues where a feature has values in the comparison dataset which are outside of the observed range for this feature in the base dataset. Defaults to None in which case all continuous features are scanned for out of range issues.

  • duplicate_features_subset – The list of features which, when taken in combination, should be unique, i.e. they form a key for the dataset. These are then used to scan for duplicates in the base dataset. Defaults to None in which case no duplicate issue scan is performed.

  • issue_storage_limit (int, optional) – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:

ValueError – Raised if there is no base dataset in the snapshot.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

Return type:

ScanResult

scan_demographic_leakage(leakage_threshold: float = 0.9, minimum_segment_size: int | None = None, continuous_continuous_measure: str = 'pearsons', categorical_categorical_measure: str = 'cramersv', categorical_continuous_measure: str = 'rankbiserial', binary_continuous_measure='pointbiserial', issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify demographic leakage in the snapshot’s base dataset.

Demographic leakage occurs when one or more of a dataset’s non-target features reveals (or has a high chance of revealing) a demographic feature which we do not want to train a model on. For example, we may want to exclude gender from a model but have a feature, such as a military spouse flag, which leaks the gender demographics.

Parameters:
  • leakage_threshold – The threshold of the correlation metric over which we identify a leakage issue as occurring. Defaults to 0.9.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. Defaults to None, in which case 2% of the total number of validation samples is used as the minimum segment size.

  • continuous_continuous_measure – The correlation measure to use where the feature being checked for demographic leakage is continuous as is the demographic feature. Defaults to ‘pearsons’.

  • categorical_categorical_measure – The correlation measure to use where the feature being checked for demographic leakage is categorical and the demographic feature is also categorical. Defaults to ‘cramersv’.

  • categorical_continuous_measure – The correlation measure to use where the feature being checked for demographic leakage is categorical and the demographic feature is continuous. Defaults to ‘rankbiserial’.

  • binary_continuous_measure – The correlation measure to use where the feature being checked for demographic leakage is binary and the demographic feature is continuous. Defaults to ‘pointbiserial’.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:
  • NotImplementedError – Raised if this scan is run where the snapshot’s base dataset is a non-pandas dataset.

  • ValueError – Raised if the snapshot’s base dataset contains no demographic information.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.
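
A minimal usage sketch, assuming the snapshot’s base dataset was built with demographic information (for example via BiasDatasetBuilder); the values shown are illustrative:

# Flag any non-target feature whose correlation with the protected
# feature exceeds 0.85, ignoring segments smaller than 100 samples.
segments, issues, issue_summary = snapshot.scan_demographic_leakage(
    leakage_threshold=0.85,
    minimum_segment_size=100,
)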

scan_drift_metrics(thresholds: Dict[str, Tuple[float, float]] | None = None, drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, features: List[str] | None = None, issue_storage_limit: int = 10000, **kwargs) Tuple[DataFrame, DataFrame, DataFrame]

Scan for feature drift issues.

This scans the snapshot for feature drift issues between the snapshot’s base and comparison datasets. A drift issue is an instance where a drift measure on a feature is below and/or above the specified thresholds for that measure.

Use this scan like this:

thresholds = {"psi": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_drift_metrics(
        thresholds = thresholds, drift_measures = ["psi"]
    )
)
Parameters:
  • thresholds – A user-defined dictionary of the drift measure thresholds (both upper and lower) e.g. {“psi” : [0.0, 0.15]} where “psi” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift (not widely applicable) while the upper threshold indicates a maximum level of acceptable drift. This defaults to None. Note that if no user defined thresholds are provided, or a drift measure is not present in the thresholds dictionary, then the upper and lower thresholds default to 0.5 and 1.0 respectively.

  • drift_measures – A list of drift measures to use in the scan. If this is not specified it defaults to None in which case all known drift measures, including any user defined ones, are included in the scan.

  • ignore_lower_threshold – Set to True to ignore the lower threshold; otherwise the lower threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a high measure value indicates feature drift. This defaults to True.

  • ignore_upper_threshold – Set to True to ignore the upper threshold; otherwise the upper threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a low measure value indicates feature drift. This defaults to False.

  • features – The list of features to use when scanning for feature drift issues. This defaults to None, in which case the most significant features are scanned for drift if they are available; otherwise all features are scanned for drift.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had invalid drift measures specified.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_drift_metrics_rca(thresholds: Dict[str, Tuple[float, float]] | None = None, drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, features: List[str] | None = None, minimum_segment_size: int | None = None, find_most_granular: bool = False, encoders: Dict[str, LabelEncoder] | None = None, issue_storage_limit: int = 10000, **kwargs) Tuple[DataFrame, DataFrame, DataFrame]

Identify the root cause of feature drift issues between a snapshot’s base and comparison datasets.

This method works by searching for sub-segments of the data (e.g., age > 21) in which there are feature drift issues between the snapshot’s base and comparison datasets. A feature drift issue occurs when a drift measure on a feature is below and/or above the specified thresholds for the drift measure.

Use this scan like this:

thresholds = {"psi": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_drift_metrics_rca(
        thresholds=thresholds,
        drift_measures=["psi"],
        minimum_segment_size=500
    )
)
Parameters:
  • thresholds – A user-defined dictionary of the drift measure thresholds (both upper and lower) e.g. {“psi” : [0.0, 0.15]} where “psi” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift (not widely applicable) while the upper threshold indicates a maximum level of acceptable drift. This defaults to None. Note that if no user defined thresholds are provided or a drift measure is not present in the thresholds dictionary then upper and lower thresholds default to 0.5 and 1.0 respectively.

  • drift_measures – A list of drift measures to use in the scan. If this is not specified it defaults to None, in which case all known drift measures, including any user-defined ones, are included in the scan.

  • ignore_lower_threshold – Set to True to ignore the lower threshold; otherwise, the lower threshold is taken into account when scanning for drift issues. Defaults to True.

  • ignore_upper_threshold – Set to True to ignore the upper threshold; otherwise, the upper threshold is taken into account when scanning for drift issues. Defaults to False.

  • features – The list of features to use when scanning for feature drift issues. Defaults to None, where the most significant features are scanned for drift, provided they are available; otherwise, all features are scanned for drift.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. Defaults to None, in which case 2% of the total number of validation samples is used as the minimum segment size by the RCA scan.

  • encoders – Dictionary of the encoders (if any) used for the features. This enables the segment logic to be translated back to the pre-encoded version. This defaults to None.

  • find_most_granular – Determines the RCA (Root Cause Analysis) mode. If set to False, the RCA will stop scanning a segment if the segment has feature drift below/above the specified thresholds. If set to True, the RCA will keep scanning sub-segments of segments where the metric difference is below/above specified thresholds. Defaults to False.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:

ValueError – Raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had invalid drift measures specified.

Returns:

A scan result consisting of the segments, the individual issues found, and a summary of the issues.

scan_expectations(*, validator=None, context=None, suite_name: str | None = None, results=None, json_suite: list | None = None) Tuple[DataFrame, DataFrame, DataFrame]

Run each of the given expectations against the current data source.

Parameters:
  • validator – GE validator object having an expectation suite. Defaults to None.

  • context – An existing GE context. Defaults to None.

  • suite_name – Name of the suite to use from the existing context. Defaults to None.

  • results – An existing GE result object to create scan results from. Defaults to None.

  • json_suite – Takes a JSON defined suite as defined in an Etiq config file. Defaults to None.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.
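
A hedged sketch, assuming validator is an existing Great Expectations validator object with an expectation suite already attached:

# Run the validator's expectation suite against the snapshot's dataset.
segments, issues, issue_summary = snapshot.scan_expectations(
    validator=validator,
)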

scan_fingerprints(other: Snapshot, margin=0.99, margin_per_field=None, grouping=None, metrics: Iterable[str] | None = None) Tuple[DataFrame, DataFrame, DataFrame]

Given another snapshot to scan, we generate metrics about the associated Dataset on this and the other snapshot. Metrics from both datasets are compared to produce a list of possible issues and the possible relationship between the two datasets.

An aggregate issue record is created for each test we apply - one for each of the below (or selected) metrics. An issue record is added if the metrics do not match within the target margin.

Note

Fingerprint scanning is only presently supported on Pandas based data sources.

Metrics we test:

  Metric      Description                              Applies to
  “count”     Number of rows contained in the table    Whole table
  “min”       Minimum value                            Continuous features
  “max”       Maximum value                            Continuous features
  “mean”      Mean value                               Continuous features
  “median”    Median value                             Continuous features
  “missing”   Number of rows missing values            Both continuous and categorical
  “sum”       Sum of all values                        Continuous features
  “unique”    Count of distinct values                 Both continuous and categorical
  “std”       Standard deviation                       Continuous features

Parameters:
  • other (Snapshot) – Snapshot to compare against this

  • margin (float) – Default margin for tests: the similarity required between two metrics for the comparison to pass, where 1.0 means the metrics must match exactly and 0.0 means no match is required. Defaults to 0.99.

  • margin_per_field (dict[str, float]) – Dictionary of {feature: margin} to apply specific margins on a per-feature basis. Useful to tweak the margin when we know that some fields differ more or there are fields we really want to match up in metric properties.

  • grouping (str|List[str]|Callable) – How to group the data. A DataFrame.groupby-style grouping argument: a single feature name, a list of feature names, or a Pandas grouping function. See the Pandas groupby documentation on the by parameter for full usage details.

  • metrics (List[str]) – We may only be interested in specific metrics so you can specify which metrics are run. The names are as per the above table.

Returns:

Segments, Issues and issue aggregate summary dataframes.

Return type:

Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]

Examples

The most basic usage is passing in another snapshot to get comparison results:

segments, issues, issue_summary = snapshot.scan_fingerprints(other_snapshot)

Applying global margin setting of 0.8 for when we know that the datasets differ:

segments, issues, issue_summary = snapshot.scan_fingerprints(other_snapshot, margin=0.80)

Applying margins on a per-field basis:

# We expect more variation on these fields:
margins = {
    "Order Count": 0.8,
    "Age": 0.9
}

segments, issues, issue_summary = snapshot.scan_fingerprints(
    other_snapshot,
    margin_per_field=margins,
)

Only running the “sum” and “mean” metrics over the data:

segments, issues, issue_summary = snapshot.scan_fingerprints(
    other_snapshot,
    metrics=["sum", "mean"],
)
scan_leakage(leakage_threshold: float = 0.9, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify target and demographic leakage in the snapshot’s base dataset.

Target leakage (also known as data leakage) occurs when one or more of a dataset’s non-target features has information about the target feature that would not be available at the time of prediction. For example, a monthly income feature would “leak” information about overall yearly income if it were included in a dataset.

Demographic leakage occurs when one or more of a dataset’s non-target features reveals (or has a high chance of revealing) a demographic feature which we do not want to train a model on. For example, we may want to exclude gender from a model but have a feature, such as a military spouse flag, which leaks the gender demographics.

Parameters:
  • leakage_threshold – The threshold of the correlation metric over which we identify a leakage issue as occurring. Defaults to 0.9.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:
  • NotImplementedError – Raised if this scan is run where the snapshot’s base dataset is a non-pandas dataset.

  • ValueError – Raised if the snapshot’s base dataset contains no demographic information.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

Return type:

ScanResult

Deprecated since version 1.4: Use etiq.snapshot.scan_demographic_leakage() or etiq.snapshot.scan_target_leakage()
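
Following the deprecation notice, the equivalent usage on current versions is the two dedicated scans:

demo_segments, demo_issues, demo_summary = snapshot.scan_demographic_leakage()
target_segments, target_issues, target_summary = snapshot.scan_target_leakage()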

scan_target_drift_metrics(thresholds: Dict[str, Tuple[float, float]], drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, issue_storage_limit: int = 10000, **kwargs) Tuple[DataFrame, DataFrame, DataFrame]

Identify drift issues between a snapshot’s base and comparison datasets for the target feature.

A target drift issue occurs where a drift measure on the target feature is below and/or above the specified thresholds for the drift measure.

Use this scan like this:

thresholds = {"psi": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_target_drift_metrics(
        thresholds = thresholds,
        drift_measures = ["psi"]
    )
)
Parameters:
  • thresholds

    A user-defined dictionary of the drift measure thresholds (both upper and lower) e.g. {“psi” : [0.0, 0.15]} where “psi” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift while the upper threshold indicates a maximum level of acceptable drift.

    Note that if no user defined thresholds are provided or a drift measure is not present in the thresholds dictionary then upper and lower thresholds default to 0.5 and 1.0 respectively.

  • drift_measures – A list of drift measures to use in the scan. If this is not specified it defaults to None in which case all known drift measures, including any user defined ones, are included in the scan.

  • ignore_lower_threshold – Set to True to ignore the lower threshold; otherwise the lower threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a high measure value indicates a possible drift issue. This defaults to True.

  • ignore_upper_threshold – Set to True to ignore the upper threshold; otherwise the upper threshold is taken into account when scanning for drift issues. This makes sense in the context of drift measures where a low measure value indicates a possible drift issue. This defaults to False.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:

ValueError – This is raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had unknown drift measures specified.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.

scan_target_drift_metrics_rca(thresholds: Dict[str, Tuple[float, float]] | None = None, drift_measures: List[str] | None = None, ignore_lower_threshold: bool = True, ignore_upper_threshold: bool = False, minimum_segment_size: int | None = None, find_most_granular: bool = False, encoders: Dict[str, LabelEncoder] | None = None, issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify the root cause of drift issues between a snapshot’s base and comparison datasets for the target feature.

This method works by searching for sub-segments of the data (e.g., age > 21) for which there are drift issues between the snapshot’s base and comparison datasets. A target drift issue occurs when a drift measure on a target feature is below and/or above the specified thresholds for the drift measure.

Use this scan like this:

thresholds = {"psi": [0.0, 0.15]}
segments, issues, issue_summary = (
    snapshot.scan_target_drift_metrics_rca(
        thresholds=thresholds,
        drift_measures=["psi"],
        minimum_segment_size=500
    )
)
Parameters:
  • thresholds – A user-defined dictionary of the drift measure thresholds (both upper and lower) e.g. {“psi” : [0.0, 0.15]} where “psi” is the measure with 0.0 and 0.15 the lower and upper thresholds respectively. The lower threshold indicates a minimum level of acceptable drift (not widely applicable) while the upper threshold indicates a maximum level of acceptable drift. This defaults to None. Note that if no user defined thresholds are provided or a drift measure is not present in the thresholds dictionary then upper and lower thresholds default to 0.5 and 1.0 respectively.

  • drift_measures – A list of drift measures to use in the scan. If this is not specified it defaults to None, in which case all known drift measures, including any user-defined ones, are included in the scan.

  • ignore_lower_threshold – This is set to True to ignore the lower threshold when scanning for drift issues, otherwise the lower threshold is taken into account when scanning for drift issues. It makes sense to set this to True in the context of drift measures where a high measure value indicates a possible drift issue. This defaults to True.

  • ignore_upper_threshold – This is set to True to ignore the upper threshold when scanning for drift issues; otherwise the upper threshold is taken into account. It makes sense to set this to True in the context of drift measures where a low measure value indicates a possible drift issue. This defaults to False.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. Defaults to None, in which case 2% of the total number of validation samples is used as the minimum segment size by the RCA scan.

  • encoders – Dictionary of the encoders (if any) used for the features. This enables the segment logic to be translated back to the pre-encoded version. This defaults to None.

  • find_most_granular – Determines the RCA (Root Cause Analysis) mode. If set to False, the RCA will stop scanning a segment if the segment has target drift below/above the specified thresholds. If set to True, the RCA will keep scanning sub-segments of segments where the metric difference is below/above specified thresholds. Defaults to False.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:

ValueError – Raised if the snapshot cannot run the scan because it either does not contain suitable datasets or has had invalid drift measures specified.

Returns:

A scan result consisting of the segments, the individual issues found, and a summary of the issues.

scan_target_leakage(leakage_threshold: float = 0.9, minimum_segment_size: int | None = None, continuous_continuous_measure: str = 'pearsons', categorical_categorical_measure: str = 'cramersv', categorical_continuous_measure: str = 'rankbiserial', binary_continuous_measure: str = 'pointbiserial', issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]

Identify target leakage in the snapshot’s base dataset.

Target leakage (also known as data leakage) occurs when one or more of a dataset’s non-target features has information about the target feature that would not be available at the time of prediction. For example, a monthly income feature would “leak” information about overall yearly income if it were included in a dataset.

Parameters:
  • leakage_threshold – The threshold of the correlation metric over which we identify a leakage issue as occurring. Defaults to 0.9.

  • minimum_segment_size – The minimum number of samples in a segment for it to be considered significant. Defaults to None, in which case 2% of the total number of validation samples is used as the minimum segment size.

  • continuous_continuous_measure – The correlation measure to use where the feature being checked for target leakage is continuous as is the target feature. Defaults to ‘pearsons’.

  • categorical_categorical_measure – The correlation measure to use where the feature being checked for target leakage is categorical and the target feature is also categorical. Defaults to ‘cramersv’.

  • categorical_continuous_measure – The correlation measure to use where the feature being checked for target leakage is categorical and the target feature is continuous. Defaults to ‘rankbiserial’.

  • binary_continuous_measure – The correlation measure to use where the feature being checked for target leakage is binary and the target feature is continuous. Defaults to ‘pointbiserial’.

  • issue_storage_limit – The maximum number of individual issues to return for the scan. Defaults to 10000.

Raises:
  • NotImplementedError – Raised if this scan is run where the snapshot’s base dataset is a non-pandas dataset.

  • ValueError – Raised if the snapshot’s base dataset contains no demographic information.

Returns:

A scan result consisting of the segments (always just the global segment), the individual issues found and a summary of the issues.
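
A usage sketch (the threshold and segment size shown are illustrative):

# Flag non-target features whose correlation with the target exceeds 0.9.
segments, issues, issue_summary = snapshot.scan_target_leakage(
    leakage_threshold=0.9,
    minimum_segment_size=100,
)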

scan_target_leakage_comparison(leakage_threshold: float = 0.9, minimum_segment_size: int | None = None, continuous_continuous_measure: str = 'pearsons', categorical_categorical_measure: str = 'cramersv', categorical_continuous_measure: str = 'rankbiserial', binary_continuous_measure: str = 'pointbiserial', issue_storage_limit: int = 10000) Tuple[DataFrame, DataFrame, DataFrame]
significant_features: list

The dataset features in significance order with respect to the model.

stage: SnapshotStage = 0

Snapshot’s process stage.

status: SnapshotStatus = 0

Snapshot’s status.

class etiq.snapshots.SnapshotManager(_project: etiq.projects.Project)

Bases: object

create(*, name: str, dataset: AbstractDataset, model: BaseModel, comparison_dataset: AbstractDataset | None = None, generate_data_profiles: bool = False, enable_significance_calculation: bool = False, pipeline_id: str | None = None, pipeline_run_id: str | None = None, **params) Snapshot

Create snapshot instance

Parameters:
  • name – The name of the snapshot.

  • dataset – The snapshot’s base dataset.

  • model – The model to add to the snapshot.

  • comparison_dataset – The snapshot’s comparison dataset. Defaults to None.

  • generate_data_profiles – Set this to True to enable the calculation of the snapshot’s data profile. Defaults to False.

  • enable_significance_calculation – Set this to True to enable the most significant features to be calculated. Defaults to False.

  • pipeline_id – The snapshot’s pipeline identifier. Defaults to None in which case this is automatically determined.

  • pipeline_run_id – The snapshot’s pipeline run identifier. Defaults to None in which case this is automatically determined.

Returns:

A snapshot with the specified settings.
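
A hedged sketch of creating a snapshot with a comparison dataset; base_dataset, production_dataset and my_fitted_model are placeholders for objects built elsewhere:

import etiq

project = etiq.projects.open(name="My Project")
snapshot = project.snapshots.create(
    name="Weekly production snapshot",
    dataset=base_dataset,                    # base AbstractDataset
    comparison_dataset=production_dataset,   # optional comparison AbstractDataset
    model=etiq.Model(model_fitted=my_fitted_model, model_label="model-v1"),
    generate_data_profiles=True,
    enable_significance_calculation=True,
)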

get_all() List[Snapshot]

Return list of all snapshots.

Returns:

List of snapshots.

get_by_id(snapshot_id: int) Snapshot | None

Retrieve the snapshot with the corresponding ID.

Parameters:

snapshot_id – Snapshot id

Returns:

Snapshot with the corresponding ID or None if no such snapshot exists.

get_by_name(name: str) Snapshot | None

Retrieve the snapshot with the corresponding name.

Parameters:

name – Snapshot name

Returns:

Snapshot with the corresponding name or None if no such snapshot exists.
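
For example, assuming project is an open etiq project containing the snapshot created earlier:

all_snapshots = project.snapshots.get_all()
by_id = project.snapshots.get_by_id(1)  # the id value is illustrative
by_name = project.snapshots.get_by_name("Weekly production snapshot")
if by_name is not None:
    segments, issues, issue_summary = by_name.scan_data_issues()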

etiq.transforms module

class etiq.transforms.ConvertUnknownToNaN

Bases: object

static encode(data=None, unknown_symbol='?', **kwargs)
class etiq.transforms.Dropna

Bases: object

static encode(data=None, **kwargs)
class etiq.transforms.EncodeLabels

Bases: object

static decode(encoder=None, data=None, cont_col=None, cat_col=None, **kwargs)
static encode(data=None, cont_col=None, cat_col=None, **kwargs)

etiq.utils module

etiq.utils.abscorrcoef(x, y)
etiq.utils.ensure_mutual_exclusion(new_list: Sequence[str], other_lists: Sequence[Sequence[str]])
etiq.utils.flag_biased_groups(metric_group, cutoff)

metric_group: dictionary whose key is the segment label or group identifier and whose value is 100 * (result_privileged_group - result_unprivileged_group).

etiq.utils.guess_cont_cat_col(data=None, names_col=None)

Attempts to guess which columns hold categorical data and which hold continuous data.

Parameters:
  • data (numpy array or pandas Dataframe) – input data

  • names_col (numpy array of strings) – a list of all the column names in data

Returns:

A tuple of Numpy arrays. The first array has column names that we assume are categorical features. The second array has column names that we assume are continuous features.

Test to implement: there is no column name that is not categorized as either continuous or categorical.
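
A small illustrative sketch (the exact split produced by the heuristic is not guaranteed for such a tiny frame):

import numpy as np
import pandas as pd
from etiq.utils import guess_cont_cat_col

df = pd.DataFrame({
    "gender": ["M", "F", "F", "M", "F"],
    "age": [25.0, 31.0, 47.0, 52.0, 38.0],
    "income": [1, 0, 1, 0, 1],
})
# First array: assumed categorical columns; second: assumed continuous columns.
cat_cols, cont_cols = guess_cont_cat_col(data=df, names_col=np.array(df.columns))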

etiq.utils.load_obj(name, path)
etiq.utils.load_sample(filename: str) Any

Load example data from the sample data folder

If it’s a CSV or pickle (pkl) file, we process it first and return a pandas dataframe (for CSV) or the original pickled object.

Parameters:

filename – Name of the file to import - only needs the file name, not the full path.

Returns:

The item loaded from disk

etiq.utils.safecorrcoef(x, y)
etiq.utils.save_obj(obj, name, path)
etiq.utils.split_dataset(df, splits, random_seed)

Split a dataset into train/test pairs

Parameters:
  • df (list) – data to split

  • splits – size of training, size of validation, size of testing data

Returns:

If the size of the validation dataset is 0, it will return the last row of the dataframe as a Series for the validation dataset.
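
A hedged sketch, assuming the function returns the training, validation and testing subsets in that order:

import pandas as pd
from etiq.utils import split_dataset

df = pd.DataFrame({"age": range(10), "income": [0, 1] * 5})
train_df, valid_df, test_df = split_dataset(df, (0.8, 0.2, 0.0), 42)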

Module contents

ETIQ CORE

  © etiq.ai 2023

class etiq.BiasDatasetBuilder

Bases: SimpleDatasetBuilder

classmethod bias_params(bias_params: BiasParams | None = None, protected: str | None = None, privileged: Any | None = None, unprivileged: Any | None = None, positive_outcome_label: Any | None = None, negative_outcome_label: Any | None = None)

Returns a BiasParams object

Parameters:
  • bias_params – A bias params object to clone. This defaults to None if not provided.

  • protected – Protected feature name for example ‘gender’. Defaults to None.

  • privileged – Privileged label within the protected feature for example ‘male’. This defaults to None if not provided.

  • unprivileged – Unprivileged label within the protected feature, for example ‘female’. This defaults to None if not provided.

  • positive_outcome_label – The label of a “positive” outcome within a target feature. Defaults to None.

  • negative_outcome_label – The label of a “negative” outcome within a target feature. Defaults to None.

Returns:

A BiasParams object
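
For example:

from etiq import BiasDatasetBuilder

bias_params = BiasDatasetBuilder.bias_params(
    protected="gender",
    privileged="M",
    unprivileged="F",
    positive_outcome_label=1,
    negative_outcome_label=0,
)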

classmethod dataset(features: DataFrame, target: DataFrame | None = None, label: str | None = None, prediction: str | None = None, cat_col: List[str] | None = None, cont_col: List[str] | None = None, train_valid_test_splits: Tuple[float, float, float] = (0.8, 0.2, 0.0), id_col: List[str] | None = None, date_col: List[str] | None = None, bias_params: BiasParams | None = None, convert_date_cols: bool = False, datetime_format: str = '', remove_protected_from_features: bool = True, random_seed: int = 2, name: str = None, register_creation: bool = True) BiasDataset

Creates a BiasDataset object given pandas dataframe(s).

Use the dataset builder like this:

from etiq import BiasDatasetBuilder
from etiq.biasparams import BiasParams
import pandas as pd
a = [
        ["2022-10-10", 'M', 2, 3, 4, 5, 6, 1],
        ["2022-10-11", 'F', 8, 9, 10, 11, 12, 0],
        ["", 'F', 2, 3, 4, 5, 6, 1],
        ["2022-10-13", 'M', 8, 9, 10, 11, 12, 0],
        ["2022-10-14", 'F', 2, 3, 4, 5, 6, 1],
        ["2022-10-15", 'F', 8, 9, 10, 11, 12, 0],
        ["2022-10-16", 'M', 2, 3, 4, 5, 6, 1],
        ["2022-10-17", 'F', 8, 9, 10, 11, 12, 0],
        ["2022-10-18", 'M', 14, 15, 16, 17, 18, 1],
        ["2022-10-19", 'M', 15, 16, 17, 18, 19, 1],
    ]
df = pd.DataFrame(a,
                columns=["start_date", "gender", "age2", "age3", "age4",
                        "age5", "age6", "income"])
adataset = BiasDatasetBuilder.dataset(
                features=df,
                label="income",
                cat_col=["age2", "age3", "income"],
                cont_col=["age4", "age5", "age6"],
                date_col=["start_date"],
                bias_params = BiasParams(protected='gender',
                                         privileged='M',
                                         unprivileged='F',
                                         positive_outcome_label= 1,
                                         negative_outcome_label= 0),
                remove_protected_from_features = True,
                convert_date_cols=True,
                name="test_dataset")
Parameters:
  • features – Pandas dataframe containing the dataset features as columns.

  • target – Pandas dataframe containing the target feature as a column. This defaults to None in which case the target feature is assumed to be either the last column in features dataset or the column name specified in the label argument.

  • label – The name of the column containing the target. This defaults to None in which case the target is assumed to either be the last column of the features dataframe or the first column of the target dataframe if this is not None.

  • prediction – The name of the column containing the prediction data. This defaults to None in which case the assumption is that the dataset contains no prediction data.

  • cat_col – List of categorical features. This defaults to None in which case categorical features are determined automatically.

  • cont_col – List of continuous features. This defaults to None in which case continuous features are determined automatically.

  • id_col – List of id features. This defaults to None in which case it is assumed the dataset contains no id features.

  • date_col – List of datetime features. This defaults to None in which case it is assumed the dataset contains no datetime features.

  • bias_params – This contains demographic data (the protected feature) needed to create the bias dataset. This defaults to None in which case a fake random protected feature is created.

  • train_valid_test_splits – This parameter specifies the proportions to use when splitting the data into training, validation and test subsets. This defaults to (0.8, 0.2, 0.0).

  • random_seed – Random number seed (for reproducibility) used when splitting the data into random training, validation and test subsets. This defaults to 2.

  • remove_protected_from_features – This is set to True in order to remove the protected feature from the normal features i.e. the protected feature is then not considered a feature used by the model. Otherwise the protected feature is treated as a normal feature.

  • convert_date_cols – This is set to True in order to convert any date features into datetime objects. This defaults to False.

  • datetime_format – The specific datetime format (assumes a common datetime format is used for all datetime features). This defaults to an empty string in which case the datetime format is guessed.

  • name – The name to use for the dataset. This defaults to None in which case a random name is assigned.

  • register_creation – This is set to True to enable the dataset to be registered to the database (note that only a hash and/or fingerprint of the data is stored). This defaults to True.

Returns:

A BiasDataset object.

classmethod datasets(training_features: DataFrame | None = None, training_target: DataFrame | None = None, validation_features: DataFrame | None = None, validation_target: DataFrame | None = None, testing_features: DataFrame | None = None, testing_target: DataFrame | None = None, label: str | None = None, prediction: str | None = None, cat_col: List[str] | None = None, cont_col: List[str] | None = None, bias_params: BiasParams | None = None, remove_protected_from_features: bool = True, id_col: List[str] | None = None, date_col: List[str] | None = None, convert_date_cols: bool = False, datetime_format: str = '', name: str | None = None, register_creation: bool = True) BiasDataset

Creates a BiasDataset object given pandas dataframe(s).

Use this builder like:

from etiq import BiasDatasetBuilder
from etiq.biasparams import BiasParams
import pandas as pd
training = [
        ["2022-10-10", 'M', 2, 3, 4, 5, 6, 1],
        ["2022-10-11", 'F', 8, 9, 10, 11, 12, 0],
        ["", 'F', 2, 3, 4, 5, 6, 1],
        ["2022-10-13", 'M', 8, 9, 10, 11, 12, 0],
        ["2022-10-14", 'F', 2, 3, 4, 5, 6, 1]
        ]
validation = [
        ["2022-10-15", 'F', 8, 9, 10, 11, 12, 0],
        ["2022-10-16", 'M', 2, 3, 4, 5, 6, 1],
        ["2022-10-17", 'F', 8, 9, 10, 11, 12, 0],
        ["2022-10-18", 'M', 14, 15, 16, 17, 18, 1],
        ["2022-10-19", 'M', 15, 16, 17, 18, 19, 1]
        ]
df1 = pd.DataFrame(training,
                columns=["start_date", "gender", "age2", "age3", "age4",
                        "age5", "age6", "income"])
df2 = pd.DataFrame(validation,
                columns=["start_date", "gender", "age2", "age3", "age4",
                        "age5", "age6", "income"])
adataset = BiasDatasetBuilder.datasets(
                training_features=df1,
                validation_features=df2,
                label="income",
                cat_col=["age2", "age3", "income"],
                cont_col=["age4", "age5", "age6"],
                date_col=["start_date"],
                bias_params = BiasParams(protected='gender',
                                         privileged='M',
                                         unprivileged='F',
                                         positive_outcome_label= 1,
                                         negative_outcome_label= 0),
                remove_protected_from_features = True,
                convert_date_cols=True,
                name="test_dataset")
Parameters:
  • training_features – Pandas dataframe containing the training dataset features. This defaults to None in which case we assume there is no training data.

  • training_target – Pandas dataframe containing the target training data as a column. This defaults to None in which case the target feature is assumed to be either the last column in features dataset or the column name specified in the label argument.

  • validation_features – Pandas dataframe containing the validation dataset features. This defaults to None in which case we assume there is no validation data.

  • validation_target – Pandas dataframe containing the target validation data as a column. This defaults to None in which case the target feature is assumed to be either the last column in validation features dataset or the column name specified in the label argument.

  • testing_features – Pandas dataframe containing the testing dataset features. This defaults to None in which case we assume there is no testing data.

  • testing_target – Pandas dataframe containing the target testing data as a column. This defaults to None in which case the target feature is assumed to be either the last column in testing features dataset or the column name specified in the label argument.

  • label – The name of the column containing the target. This defaults to None in which case the target is assumed to either be the last column of the features dataframe or the first column of the target dataframe if this is not None.

  • prediction – The name of the column containing the prediction data. This defaults to None in which case the assumption is that the dataset contains no prediction data.

  • cat_col – List of categorical features. This defaults to None in which case categorical features are determined automatically.

  • cont_col – List of continuous features. This defaults to None in which case continuous features are determined automatically.

  • id_col – List of id features. This defaults to None in which case it is assumed the dataset contains no id features.

  • date_col – List of datetime features. This defaults to None in which case it is assumed the dataset contains no datetime features.

  • bias_params – This contains demographic data (the protected feature) needed to create the bias dataset. This defaults to None in which case a fake random protected feature is created.

  • train_valid_test_splits – This parameter specifies the proportions to use when splitting the data into training, validation and test subsets. This defaults to (0.8, 0.2, 0.0).

  • random_seed – Random number seed used when splitting the data into random training, validation and test subsets. This defaults to 2.

  • remove_protected_from_features – This is set to True in order to remove the protected feature from the normal features i.e. the protected feature is then not considered a feature used by the model. Otherwise the protected feature is treated as a normal feature.

  • convert_date_cols – This is set to True in order to convert any date features into datetime objects. This defaults to False.

  • datetime_format – The specific datetime format (assumes a common datetime format is used for all datetime features). This defaults to an empty string in which case the datetime format is guessed.

  • name – The name to use for the dataset. This defaults to None in which case a random name is assigned.

  • register_creation – This is set to True to enable the dataset to be registered to the database (note that only a hash and/or fingerprint of the data is stored). This defaults to True.

Returns:

A BiasDataset object.

class etiq.DataIssue(name: str, feature: str, value: str | float, segment: str)

Bases: object

A simple data class to represent a data issue, e.g. “unknown_category”, found by a pipeline in the data.

feature: str
name: str
segment: str
value: str | float
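
Constructing one directly (the field values are illustrative):

import etiq

issue = etiq.DataIssue(
    name="unknown_category",
    feature="gender",
    value="X",
    segment="global",
)
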
class etiq.Model(model_architecture=None, model_fitted=None, model_label=None, **kwargs)

Bases: BaseModel

User supplied dataset model

NAME = 'UserModel'
class etiq.SimpleDatasetBuilder

Bases: object

A builder for the SimpleDataset class

classmethod dataset(features: DataFrame, target: DataFrame | None = None, label: str | None = None, prediction: str | None = None, cat_col: List[str] | None = None, cont_col: List[str] | None = None, id_col: List[str] | None = None, date_col: List[str] | None = None, train_valid_test_splits: Tuple[float, float, float] = (0.8, 0.2, 0.0), random_seed: int = 2, convert_date_cols: bool = False, datetime_format: str = '', name: str | None = None, register_creation: bool = True) SimpleDataset

Creates a SimpleDataset object given pandas dataframe(s).

Use this builder like:

from etiq import SimpleDatasetBuilder
import pandas as pd
a = [
        ["2022-10-10", 2, 3, 4, 5, 6, 1],
        ["2022-10-11", 8, 9, 10, 11, 12, 0],
        ["", 2, 3, 4, 5, 6, 1],
        ["2022-10-13", 8, 9, 10, 11, 12, 0],
        ["2022-10-14", 2, 3, 4, 5, 6, 1],
        ["2022-10-15", 8, 9, 10, 11, 12, 0],
        ["2022-10-16", 2, 3, 4, 5, 6, 1],
        ["2022-10-17", 8, 9, 10, 11, 12, 0],
        ["2022-10-18", 14, 15, 16, 17, 18, 1],
        ["2022-10-19", 15, 16, 17, 18, 19, 1],
    ]
df = pd.DataFrame(a,
                columns=["start_date", "age2", "age3", "age4",
                        "age5", "age6", "income"])
adataset = SimpleDatasetBuilder.dataset(
                features=df,
                label="income",
                cat_col=["age2", "age3", "income"],
                cont_col=["age4", "age5", "age6"],
                date_col=["start_date"],
                convert_date_cols=True,
                name="test_dataset")
Parameters:
  • features – Pandas dataframe containing the dataset features as columns.

  • target – Pandas dataframe containing the target feature as a column. This defaults to None in which case the target feature is assumed to be either the last column in features dataset or the column name specified in the label argument.

  • label – The name of the column containing the target. This defaults to None in which case the target is assumed to either be the last column of the features dataframe or the first column of the target dataframe if this is not None.

  • prediction – The name of the column containing the prediction data. This defaults to None in which case the assumption is that the dataset contains no prediction data.

  • cat_col – List of categorical features. This defaults to None in which case categorical features are determined automatically.

  • cont_col – List of continuous features. This defaults to None in which case continuous features are determined automatically.

  • id_col – List of id features. This defaults to None in which case it is assumed the dataset contains no id features.

  • date_col – List of datetime features. This defaults to None in which case it is assumed the dataset contains no datetime features.

  • train_valid_test_splits – This parameter specifies the proportions to use when splitting the data into training, validation and test subsets. This defaults to (0.8, 0.2, 0.0).

  • random_seed – Random number seed used when splitting the data into random training, validation and test subsets. This defaults to 2.

  • convert_date_cols – This is set to True in order to convert any date features into datetime objects. This defaults to False.

  • datetime_format – The specific datetime format (assumes a common datetime format is used for all datetime features). This defaults to an empty string in which case the datetime format is guessed.

  • name – The name to use for the dataset. This defaults to None in which case a random name is assigned.

  • register_creation – This is set to True to enable the dataset to be registered to the database (note that only a hash and/or fingerprint of the data is stored). This defaults to True.

Returns:

A SimpleDataset object.

classmethod datasets(training_features: DataFrame | None = None, training_target: DataFrame | None = None, validation_features: DataFrame | None = None, validation_target: DataFrame | None = None, testing_features: DataFrame | None = None, testing_target: DataFrame | None = None, label: str | None = None, prediction: str | None = None, cat_col: List[str] | None = None, cont_col: List[str] | None = None, id_col: List[str] | None = None, date_col: List[str] | None = None, convert_date_cols=False, datetime_format='', name: str | None = None, register_creation: bool = True) SimpleDataset

Creates a SimpleDataset object given pandas dataframe(s).

Use this builder like:

from etiq import SimpleDatasetBuilder
import pandas as pd
training = [
                ["2022-10-10", 2, 3, 4, 5, 6, 1],
                ["2022-10-11", 8, 9, 10, 11, 12, 0],
                ["", 2, 3, 4, 5, 6, 1],
                ["2022-10-13", 8, 9, 10, 11, 12, 0]
                ["2022-10-14", 2, 3, 4, 5, 6, 1]
            ]
validation = [
                ["2022-10-15", 8, 9, 10, 11, 12, 0],
                ["2022-10-16", 2, 3, 4, 5, 6, 1],
                ["2022-10-17", 8, 9, 10, 11, 12, 0],
                ["2022-10-18", 14, 15, 16, 17, 18, 1],
                ["2022-10-19", 15, 16, 17, 18, 19, 1]
            ]
df1 = pd.DataFrame(training,
                columns=["start_date", "age2", "age3", "age4",
                        "age5", "age6", "income"])
df2 = pd.DataFrame(validation,
                columns=["start_date", "age2", "age3", "age4",
                        "age5", "age6", "income"])
adataset = SimpleDatasetBuilder.datasets(
                training_features=df1,
                validation_features=df2,
                label="income",
                cat_col=["age2", "age3", "income"],
                cont_col=["age4", "age5", "age6"],
                date_col=["start_date"],
                convert_date_cols=True,
                name="test_dataset")
Parameters:
  • training_features – Pandas dataframe containing the training dataset features. This defaults to None in which case we assume there is no training data.

  • training_target – Pandas dataframe containing the target training data as a column. This defaults to None in which case the target feature is assumed to be either the last column in features dataset or the column name specified in the label argument.

  • validation_features – Pandas dataframe containing the validation dataset features. This defaults to None in which case we assume there is no validation data.

  • validation_target – Pandas dataframe containing the target validation data as a column. This defaults to None in which case the target feature is assumed to be either the last column in validation features dataset or the column name specified in the label argument.

  • testing_features – Pandas dataframe containing the testing dataset features. This defaults to None in which case we assume there is no testing data.

  • testing_target – Pandas dataframe containing the target testing data as a column. This defaults to None in which case the target feature is assumed to be either the last column in testing features dataset or the column name specified in the label argument.

  • label – The name of the column containing the target. This defaults to None in which case the target is assumed to either be the last column of the features dataframe or the first column of the target dataframe if this is not None.

  • prediction – The name of the column containing the prediction data. This defaults to None in which case the assumption is that the dataset contains no prediction data.

  • cat_col – List of categorical features. This defaults to None in which case categorical features are determined automatically.

  • cont_col – List of continuous features. This defaults to None in which case continuous features are determined automatically.

  • id_col – List of id features. This defaults to None in which case it is assumed the dataset contains no id features.

  • date_col – List of datetime features. This defaults to None in which case it is assumed the dataset contains no datetime features.

  • convert_date_cols – This is set to True in order to convert any date features into datetime objects. This defaults to False.

  • datetime_format – The specific datetime format (assumes a common datetime format is used for all datetime features). This defaults to an empty string in which case the datetime format is guessed.

  • name – The name to use for the dataset. This defaults to None in which case a random name is assigned.

  • register_creation – This is set to True to enable the dataset to be registered to the database (note that only a hash and/or fingerprint of the data is stored). This defaults to True.

Returns:

A SimpleDataset object.

class etiq.SnapshotStage(value)

Bases: Enum

An enumeration.

PRE_PRODUCTION = 0
PRODUCTION = 1
class etiq.SnapshotStatus(value)

Bases: Enum

An enumeration.

FINAL = 1
INITIAL = 0
etiq.actual_values(akeyword: str) Callable

A decorator that maps a specified keyword argument to the ground truth parameter used in metric calculations.

This decorator designates which argument in your function represents the actual values, i.e., ground truth labels.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1], actual=[1, 1, 0],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the actual values.

Returns:

A decorator that maps the specified keyword to the actual values.

Return type:

Callable

etiq.concept_drift_measure(f: Callable) Callable

A decorator that allows an arbitrary function to be decorated to return a DriftMeasure. In addition, the decorated function is registered to be used as a concept drift measure with Etiq. This is used to specify user-defined concept drift measures.

This can be used as follows:

import etiq

@etiq.drift_measures.concept_drift_measure
def total_variational_distance(expected_dist, new_dist):
    return sum(0.5 * abs(x-y) for (x,y) in zip(expected_dist, new_dist))
Parameters:

f – The function to be wrapped

Returns:

A user-defined concept drift measure.

etiq.correlation_measure(custom_measure_callable: Callable[[List[Any], List[Any]], float]) Callable[[List[Any], List[Any]], Measure]

A decorator used to create and register a custom correlation measure

An example of the use of this decorator is as follows:

import scipy.stats as stats
import etiq
@etiq.measures.correlation_measure
def kendalls(x, y) -> float:
    return stats.kendalltau(x,y).statistic
Parameters:

custom_measure_callable – The correlation function

Returns:

A function which returns a Measure.

Return type:

Callable

etiq.custom_metric(custom_metric_callable: Callable) Callable

A decorator that allows an arbitrary function to be used as a custom metric with the etiq library.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg
Parameters:

custom_metric_callable – The function to be decorated.

Returns:

A function that returns a dictionary with the function name as the key and the function return value as the value.

etiq.data_values(akeyword: str) Callable

A decorator that maps a specified keyword argument to the data values parameter used in metric calculations.

Example usage:

import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler
import etiq

@etiq.metrics.bias_metric
@etiq.custom_metric
@etiq.prediction_values("predictions")
@etiq.data_values("data")
def individual_fairness_metric(data, predictions, n_neighbours: int = 10,
                               closest_neighbours: int = 5, **kwargs) -> float:
    data = np.asarray(data)
    pred = np.asarray(predictions)
    if data.shape[0] < n_neighbours:
        return np.nan
    scaler = StandardScaler().fit(data)
    data_scaled = scaler.transform(data)
    nbrs = NearestNeighbors(n_neighbors=n_neighbours, algorithm='ball_tree', n_jobs=-1).fit(data_scaled)
    indices = nbrs.kneighbors(data_scaled, return_distance=False)
    return 1.0 - np.mean(np.abs(pred - np.mean(pred[indices[:, 0:closest_neighbours]], axis=1)))
Parameters:

akeyword – The keyword argument to be mapped to the data values.

Returns:

A decorator that maps the specified keyword to the data values.

etiq.disable_telemetry()

Disable usage reporting of the library

etiq.drift_measure(custom_measure_callable: Callable | None = None, *, autobin: bool = False) Callable

A decorator that allows an arbitrary function to be decorated to return a DriftMeasure. This is used to specify user-defined drift measures.

This can be used as follows:

import etiq

@etiq.drift_measures.drift_measure
def total_variational_distance(expected_dist, new_dist):
    return sum(0.5 * abs(x-y) for (x,y) in zip(expected_dist, new_dist))
Parameters:
  • custom_measure_callable (Callable) – The function to be wrapped

  • autobin – This is set to True to automatically bin the distributions being compared. Defaults to False.

Returns:

A user-defined drift measure.
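
Since the decorator also accepts keyword-only arguments, a measure that expects binned distributions can ask etiq to bin the raw values first by passing autobin=True (a minimal sketch; the measure body is unchanged from the example above):

import etiq

@etiq.drift_measure(autobin=True)
def total_variational_distance(expected_dist, new_dist):
    return sum(0.5 * abs(x - y) for (x, y) in zip(expected_dist, new_dist))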

etiq.enable_telemetry()

Enable usage reporting of the library

etiq.etiq_config(src: str | Path)
etiq.etiq_pipeline_details(pipeline_name: str, run_id: str | None = None) PipelineDetails
etiq.get_config() Dict

Load a configuration

etiq.get_data_issues(base_dataset: DataFrame, comparison_dataset: DataFrame, categorical_features: List[str] | None = None, continuous_features: List[str] | None = None, search_for_missing_features: bool = True, search_for_unknown_features: bool = True, identical_feature_filter: List[str] | None = None, range_feature_filter: List[str] | None = None, missing_category_feature_filter: List[str] | None = None, unknown_category_feature_filter: List[str] | None = None) List[DataIssue]
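
A usage sketch for get_data_issues (the DataFrames, column names, and values below are hypothetical):

import pandas as pd
import etiq

base = pd.DataFrame({"age": [25, 32, 47], "segment": ["a", "b", "a"]})
comparison = pd.DataFrame({"age": [29, 51, 38], "segment": ["a", "c", "b"]})

issues = etiq.get_data_issues(
    base_dataset=base,
    comparison_dataset=comparison,
    categorical_features=["segment"],
    continuous_features=["age"],
)
for issue in issues:
    print(issue)
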
etiq.get_histogram_calculator(atype: Any) Callable
etiq.get_pipeline_details() PipelineDetails | None
etiq.is_telemetry_enabled() bool

Return whether telemetry is enabled. The setting is read from the database when the library is first loaded.
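
The telemetry helpers above can be combined as follows (a minimal sketch):

import etiq

if etiq.is_telemetry_enabled():
    etiq.disable_telemetry()  # opt out of usage reporting
# ... later, opt back in if desired
etiq.enable_telemetry()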

etiq.load_config(src: str | Path) Dict

Load a configuration

etiq.load_sample(filename: str) Any

Load example data from the sample data folder

If the file is a CSV or pickle (pkl) file, it is processed first: CSV files are returned as a pandas DataFrame, and pkl files are returned as the original pickled object.

Parameters:

filename – Name of the file to import - only needs the file name, not the full path.

Returns:

The item loaded from disk
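
A usage sketch (the file name below is hypothetical; pass only the name, not the full path):

import etiq

df = etiq.load_sample("example.csv")  # a CSV sample is returned as a pandas DataFrame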

etiq.login(server: str, token: str) str

Set login details so that pipeline details are submitted to the given server

Parameters:
  • server – str, dashboard server to connect to

  • token – str, Authentication token to use.

etiq.login_using_token(host: str, token: str)
etiq.logout()

Remove authentication config
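
A typical session sketch (the server URL and token below are placeholders):

import etiq

etiq.login(server="https://dashboard.example.com", token="<YOUR_TOKEN>")
# ... run pipelines; their details are submitted to the configured dashboard ...
etiq.logout()  # remove the stored authentication config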

etiq.negative_outcome(akeyword: str) Callable

A decorator that maps a specified keyword argument to the negative outcome parameter used in metric calculations.

This decorator allows you to specify which argument in your function represents the negative outcome in the context of fairness or bias metrics.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1, 0], actual=[1, 1, 0, 1],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the negative outcome.

Returns:

A decorator that maps the specified keyword to ‘negative_outcome_label’.

Return type:

Callable

etiq.positive_outcome(akeyword: str) Callable

A decorator that maps a specified keyword argument to the positive outcomes parameter used in metric calculations.

This decorator allows you to specify which argument in your function represents the positive outcome in the context of fairness or bias metrics.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg

# Example call
treatment_equality(predictions=[1, 0, 1], actual=[1, 1, 0],
                   positive_outcome=1, negative_outcome=0)
Parameters:

akeyword – The keyword argument representing the positive outcome.

Returns:

A decorator that maps the specified keyword to ‘positive_outcome_label’.

Return type:

Callable

etiq.prediction_values(target_keyword: str) Callable

A decorator that maps a specified keyword argument to the predictions parameter used in metric calculations.

This decorator designates which argument in your function represents the predictions made by a model.

Example usage:

import numpy as np
import etiq

@etiq.custom_metric
@etiq.actual_values("actual")
@etiq.prediction_values("predictions")
@etiq.positive_outcome("positive_outcome")
@etiq.negative_outcome("negative_outcome")
def treatment_equality(predictions, actual, positive_outcome, negative_outcome):
    false_neg = np.sum((predictions != actual) &
                       (predictions == negative_outcome))
    false_pos = np.sum((predictions != actual) &
                       (predictions == positive_outcome))
    if false_pos == 0:
        return 0.0
    elif false_neg == 0:
        return np.inf
    return false_pos / false_neg
Parameters:

target_keyword – The keyword argument representing the model predictions.

Returns:

A decorator that maps the specified keyword to the internal ‘pred’ parameter.

etiq.proxy_calculation_for_type(calculation_name: str, atype: Type[Any])
etiq.register_histogram_handler(atype: Any, handler: Callable)