Visualizers

Modules under Visualizers are responsible for all interactive graphing. They take direct inputs, or post-processed inputs from the interpreters, and generate plots using the plotly framework. The type of graph generated depends on the feature component, which is linked to a specific task.

Viz - General Metrics

rarity.visualizers.general_metrics.plot_classification_report(yTrue: pandas.core.series.Series, yPred: pandas.core.series.Series, model_names: List)

Create classification report in table form

Parameters
  • yTrue (pd.Series) – true labels, output from int_general_metrics

  • yPred (pd.Series) – predicted labels, output from int_general_metrics

  • model_names (List[str]) – model names, output from interpreter int_general_metrics

Returns

list of tables displaying classification report details

Return type

List[plotly.graph_objects.Figure]

rarity.visualizers.general_metrics.plot_confusion_matrix(yTrue: pandas.core.series.Series, yPred: pandas.core.series.Series, model_names: List)

Create confusion matrix

Parameters
  • yTrue (pd.Series) – true labels, output from int_general_metrics

  • yPred (pd.Series) – predicted labels, output from int_general_metrics

  • model_names (List[str]) – model names, output from interpreter int_general_metrics

Returns

figure displaying confusion matrix details

Return type

Figure
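As an illustration of the data a confusion matrix figure encodes, the following is a minimal sketch in plain Python. It is not rarity's implementation: the helper name confusion_counts and the sample labels are hypothetical, and the row/column orientation (rows for actual labels, columns for predicted labels) is an assumption.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) label pairs. Hypothetical helper, not part of rarity."""
    labels = sorted(set(y_true) | set(y_pred))
    pair_counts = Counter(zip(y_true, y_pred))
    # Assumed orientation: rows follow actual labels, columns follow predicted labels.
    matrix = [
        [pair_counts[(actual, predicted)] for predicted in labels]
        for actual in labels
    ]
    return labels, matrix


# Made-up labels for illustration only.
y_true = ["cat", "dog", "dog", "cat", "dog"]
y_pred = ["cat", "dog", "cat", "cat", "dog"]
labels, matrix = confusion_counts(y_true, y_pred)
```

Here one "dog" sample is predicted as "cat", so it lands in the off-diagonal cell of the "dog" row; the diagonal holds the correctly predicted samples.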

rarity.visualizers.general_metrics.plot_precisionRecall_curve(yTrue: pandas.core.series.Series, yPred: pandas.core.series.Series, model_names: List)

Display precision-recall curves for comparing models

Parameters
  • yTrue (pd.Series) – true labels, output from int_general_metrics

  • yPred (pd.Series) – predicted labels, output from int_general_metrics

  • model_names (List[str]) – model names, output from interpreter int_general_metrics

Returns

figure displaying line curves comparing precision-recall for various models

Return type

Figure
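The points that make up a precision-recall curve can be sketched as below. This is a generic illustration, not the code behind plot_precisionRecall_curve: it assumes binary labels (0/1) and that predictions are available as positive-class scores, and the helper name and thresholds are hypothetical.

```python
def precision_recall_points(y_true, scores, thresholds):
    """Return (threshold, precision, recall) tuples. Hypothetical helper, not part of rarity."""
    points = []
    for threshold in thresholds:
        predicted = [score >= threshold for score in scores]
        tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p)
        fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p)
        fn = sum(1 for y, p in zip(y_true, predicted) if y == 1 and not p)
        # Convention assumed here: precision is 1.0 when nothing is predicted positive.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((threshold, precision, recall))
    return points


# Made-up labels and scores for illustration only.
y_true = [1, 0, 1, 1, 0]
scores = [0.9, 0.6, 0.4, 0.8, 0.2]
points = precision_recall_points(y_true, scores, [0.3, 0.5, 0.7])
```

Raising the threshold from 0.3 to 0.7 trades recall for precision in this toy data; plotting one such series of points per model is what allows the models to be compared on a single figure.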

rarity.visualizers.general_metrics.plot_prediction_offset_overview(df: pandas.core.frame.DataFrame)

Display scatter plot for overview on prediction offset values

Parameters

df (DataFrame) – dataframe containing yTrue and yPred values, output from int_general_metrics

Returns

figure displaying scatter plot outlining overview on prediction offset values

Return type

Figure

rarity.visualizers.general_metrics.plot_prediction_vs_actual(df: pandas.core.frame.DataFrame)

Display scatter plot for comparison on actual values vs prediction values

Parameters

df (pd.DataFrame) – dataframe containing yTrue and yPred values, output from int_general_metrics

Returns

figure displaying scatter plot comparing actual values vs prediction values

Return type

Figure

rarity.visualizers.general_metrics.plot_roc_curve(yTrue: pandas.core.series.Series, yPred: pandas.core.series.Series, model_names: List)

Display ROC curves for comparing models

Parameters
  • yTrue (pd.Series) – true labels, output from int_general_metrics

  • yPred (pd.Series) – predicted labels, output from int_general_metrics

  • model_names (List[str]) – model names, output from interpreter int_general_metrics

Returns

figure displaying line curves comparing ROC-AUC scores for various models

Return type

Figure

rarity.visualizers.general_metrics.plot_std_error_metrics(df: pandas.core.frame.DataFrame)

Display table comparing various standard metrics for regression task

Parameters

df (DataFrame) – dataframe containing info on error metrics, output from int_general_metrics

Returns

table object comparing various standard metrics for regression task

Return type

DataTable
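The source does not list which error metrics the table contains. As a hedged sketch, the block below computes four metrics commonly reported for regression (MAE, MSE, RMSE and R2); the choice of metrics, the helper name and the sample values are assumptions and do not reflect what int_general_metrics actually produces.

```python
import math


def regression_error_metrics(y_true, y_pred):
    """Compute common regression error metrics. Hypothetical helper, not part of rarity."""
    n = len(y_true)
    errors = [pred - actual for actual, pred in zip(y_true, y_pred)]
    ss_residual = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_total = sum((actual - mean_true) ** 2 for actual in y_true)
    mse = ss_residual / n
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # R2 compares the residual error against the variance of the true values.
        "R2": 1 - ss_residual / ss_total,
    }


# Made-up values for illustration only.
metrics = regression_error_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```

Computing one such dictionary per model and placing the results side by side gives the kind of comparison table described above.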

Viz - Miss Predictions

rarity.visualizers.miss_predictions.plot_prediction_offset_overview(df: pandas.core.frame.DataFrame)

Display scatter plot for overview on prediction offset values

Parameters

df (DataFrame) – dataframe containing calculated offset values, output from int_miss_predictions

Returns

figure displaying scatter plot outlining overview on prediction offset values by index

Return type

Figure

rarity.visualizers.miss_predictions.plot_probabilities_spread_pattern(df_specific_label: pandas.core.frame.DataFrame)

Display scatter plot for probabilities comparison on correct data point vs miss-predicted data point for each class label

Parameters

df_specific_label (DataFrame) – dataframe of 1 specific label of 1 model type, output from int_miss_predictions

Returns

figure displaying scatter plot outlining probabilities comparison on correct data point vs miss-predicted data point for each class label

Return type

Figure

rarity.visualizers.miss_predictions.plot_simple_probs_spread_overview(df_label_state: pandas.core.frame.DataFrame)

Display data table listing simple stats (ss, %correct, %wrong, accuracy) for each label class

Parameters

df_label_state (DataFrame) – dataframe containing info on simple stats, output from int_miss_predictions

Returns

table object outlining simple stats (ss, %correct, %wrong, accuracy) for each label class

Return type

DataTable
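A rough sketch of how per-label stats of this kind can be derived is shown below. It assumes that ss stands for sample size, which the source does not state, and it omits the accuracy column because the source does not define how it differs from %correct. The helper name and sample labels are hypothetical.

```python
def label_stats(y_true, y_pred):
    """Per-label sample size and share of correct/wrong predictions. Hypothetical helper."""
    stats = {}
    for label in sorted(set(y_true)):
        # Predictions made for the samples whose true label is `label`.
        preds = [pred for actual, pred in zip(y_true, y_pred) if actual == label]
        size = len(preds)
        correct = sum(1 for pred in preds if pred == label)
        stats[label] = {
            "ss": size,  # assumed meaning: sample size
            "%correct": 100 * correct / size,
            "%wrong": 100 * (size - correct) / size,
        }
    return stats


# Made-up labels for illustration only.
stats = label_stats(
    ["a", "a", "b", "b", "b", "b"],
    ["a", "b", "b", "b", "a", "b"],
)
```

Each entry of the resulting dictionary corresponds to one row of the table, keyed by label class.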

Viz - Loss Clusters

rarity.visualizers.loss_clusters.plot_logloss_clusters(dfs: List[pandas.core.frame.DataFrame], analysis_type: str)

For classification tasks only. Plot a figure displaying cluster groups by log-loss values

Parameters
  • dfs (List[pd.DataFrame]) – list of dataframes containing cluster info, output from int_loss_clusters

  • analysis_type (str) – indicates whether the analysis is regression or classification, inherited from data_loader

Returns

figure displaying violin plot outlining cluster groups by log-loss values

Return type

Figure
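The log-loss values that the clusters are built from can be illustrated with a per-sample calculation. The block below is a generic binary log-loss sketch, not the code in int_loss_clusters; it assumes 0/1 labels and positive-class probabilities, and the helper name and clipping constant are hypothetical.

```python
import math


def per_sample_log_loss(y_true, prob_positive, eps=1e-15):
    """Binary log-loss for each sample. Hypothetical helper, not part of rarity."""
    losses = []
    for actual, p in zip(y_true, prob_positive):
        # Clip probabilities away from 0 and 1 so log() stays finite.
        p = min(max(p, eps), 1 - eps)
        losses.append(-(actual * math.log(p) + (1 - actual) * math.log(1 - p)))
    return losses


# Made-up labels and probabilities for illustration only.
losses = per_sample_log_loss([1, 0, 1], [0.9, 0.2, 0.3])
```

The third sample has true label 1 but a predicted probability of only 0.3, so it receives the largest loss; grouping samples by such values is what separates confident-and-correct points from poorly predicted ones.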

rarity.visualizers.loss_clusters.plot_offset_clusters(df: pandas.core.frame.DataFrame, analysis_type: str)

For regression tasks only. Plot a figure displaying cluster groups by prediction offset values

Parameters
  • df (DataFrame) – dataframe containing cluster info, output from int_loss_clusters

  • analysis_type (str) – indicates whether the analysis is regression or classification, inherited from data_loader

Returns

figure displaying violin plot outlining cluster groups by offset values

Return type

Figure

rarity.visualizers.loss_clusters.plot_optimum_cluster_via_elbow_method(cluster_range: List[int], sum_squared_distance: List[float], models: List[str])

Figure to guide the choice of a reasonable number of clusters to form with the K-Means method

Parameters
  • cluster_range (List[int]) – list of integers indicating the number of clusters

  • sum_squared_distance (List[float]) – list of sum of squared distance generated via kmean_inertia

  • models (List[str]) – list of models used to generate yPred

Returns

figure displaying line plot outlining the change in sum of squared distances along the cluster range

Return type

Figure
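To show where the inputs of an elbow plot come from, the sketch below computes the sum of squared distances for several cluster counts on one-dimensional data. It is a simplified pure-Python stand-in and not the kmean_inertia routine mentioned above: the function name, the deterministic initialisation and the sample values are all assumptions made for illustration.

```python
def kmeans_inertia_1d(values, k, iterations=20):
    """Sum of squared distances to the nearest centre after a basic 1-D k-means run."""
    data = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for value in data:
            nearest = min(range(k), key=lambda i: (value - centres[i]) ** 2)
            groups[nearest].append(value)
        # Move each centre to the mean of its group; keep it in place if the group is empty.
        centres = [
            sum(group) / len(group) if group else centres[i]
            for i, group in enumerate(groups)
        ]
    return sum(min((value - c) ** 2 for c in centres) for value in data)


# Made-up values forming three obvious groups.
offsets = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 9.8, 9.9, 10.0]
cluster_range = [1, 2, 3, 4]
sum_squared_distance = [kmeans_inertia_1d(offsets, k) for k in cluster_range]
```

With this toy data the sum of squared distances drops sharply up to three clusters and barely changes afterwards, which is the "elbow" a reader would look for in the line plot.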

Viz - xFeature Distribution

rarity.visualizers.xfeature_distribution.plot_distribution_by_kl_div_ranking(kl_div_dict_sorted: Dict, display_option: str, display_value: int, comparison_base: str, model_name: str)

Create distribution plot by kl-divergence score ranking in descending order

Parameters
  • kl_div_dict_sorted (Dict) – dictionary storing kl-divergence score by feature in descending order

  • display_option (str) –

    • indicates whether to display distribution plots for the top-N features, the bottom-N features, or both

    • Available options: top, bottom or both

  • display_value (int) –

    • maximum number of graphs to display, capped at 10

    • if the dataset has fewer than 10 features, the limit equals the number of features in the dataset

  • comparison_base (str) – baseline for distribution comparison: dataset_type for regression tasks and pred_state for classification tasks

  • model_name (str) – model used to generate yPred

Returns

Dictionary storing distribution figures by display_option

Return type

Dict[str, plotly.graph_objects.Figure]
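The interaction between display_option and display_value described above can be sketched as a feature-selection step. The block below is an illustration under stated assumptions, not the function's actual logic: the helper name select_features, the "top"/"bottom" result keys, the error raised for an unknown option and the sample scores are all hypothetical.

```python
def select_features(kl_div_dict_sorted, display_option, display_value):
    """Pick the features to plot. Hypothetical helper, not part of rarity."""
    if display_option not in ("top", "bottom", "both"):
        raise ValueError("display_option must be 'top', 'bottom' or 'both'")
    # Dict order is assumed to be descending by kl-divergence score.
    features = list(kl_div_dict_sorted)
    # Cap at 10, and at the number of features when the dataset has fewer.
    limit = min(display_value, 10, len(features))
    selected = {}
    if display_option in ("top", "both"):
        selected["top"] = features[:limit]
    if display_option in ("bottom", "both"):
        selected["bottom"] = features[len(features) - limit:]
    return selected


# Made-up scores, already sorted in descending order.
kl_scores = {"age": 0.9, "income": 0.5, "height": 0.2, "zip": 0.05}
both_two = select_features(kl_scores, "both", 2)
top_capped = select_features(kl_scores, "top", 25)
```

Requesting 25 graphs on a four-feature dataset yields only four, matching the rule that the limit falls back to the number of features available.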

rarity.visualizers.xfeature_distribution.plot_distribution_by_specific_feature(ls_specific_feature: List[str], kl_div_dict_sorted: Dict, comparison_base: str, model_name: str)

Create distribution plot for a specific feature

Parameters
  • ls_specific_feature (List[str]) – list of features whose distribution graphs are to be plotted

  • kl_div_dict_sorted (Dict) – dictionary storing kl-divergence score by feature in descending order

  • comparison_base (str) – baseline for distribution comparison: dataset_type for regression tasks and pred_state for classification tasks

  • model_name (str) – model used to generate yPred

Returns

List of figures displaying distribution plot of specific feature

Return type

List[plotly.graph_objects.Figure]

Viz - Shared Viz Component

rarity.visualizers.shared_viz_component.reponsive_table_to_filtered_datapoints(data: Dict, customized_cols: List[str], header: Dict, exp_format: str)

Create table outlining dataframe content

Parameters
  • data (Dict) – dictionary-like object storing dataframe info under the ‘record’ key

  • customized_cols (List[str]) – list of customized column names

  • header (Dict) – dictionary format storing the style info for table header

  • exp_format (str) – text info indicating the export format

Returns

table object outlining the dataframe content with specific styles

Return type

DataTable

rarity.visualizers.shared_viz_component.reponsive_table_to_filtered_datapoints_similaritiesCF(df, customized_cols, feature_cols, header, exp_format)

Create table outlining dataframe content specific to Counter-Factuals component

Parameters
  • df (DataFrame) – dataframe containing calculated distance info

  • customized_cols (List[str]) – list of customized column names

  • feature_cols (List[str]) – list of feature column names

  • header (Dict) – dictionary format storing the style info for table header

  • exp_format (str) – text info indicating the export format

Returns

table object outlining the dataframe content with dynamic-conditional styles

Return type

DataTable