Evaluation

APIs for post-run analysis and comparison utilities.

API path: apem.unit_based_model.evaluation

Public entry points for post-processing and comparing saved unit-based model results.

compare_price_algorithms(df, *, align_on=None, baseline=None)

Compare algorithm price series on aligned observations.

Parameters:
  • df (DataFrame) – input price table with at most one price value per algorithm and aligned observation

  • align_on (Sequence[str] | None) – explicit alignment key columns; when omitted, all columns other than the required ones are used, and if there are none, an inferred row index is used

  • baseline (str | None) – optional algorithm name; if provided, only pairs including this baseline are returned

Returns:

pairwise comparison table with mean levels, mean differences, absolute differences, and correlation per algorithm pair

Raises:

ValueError – if duplicates exist for the same alignment key and algorithm, if fewer than two algorithms are present, or if baseline is not present

Return type:

DataFrame
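The pairwise logic described above can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper name pairwise_mean_differences, the row layout (dicts with node, algorithm, price), and the algorithm labels are assumptions made for illustration, and only the mean difference and mean absolute difference are computed.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_mean_differences(rows, align_on, baseline=None):
    """Hypothetical stand-in: compare algorithms on shared alignment keys."""
    # Map each algorithm to {alignment key: price}; reject duplicates.
    series = defaultdict(dict)
    for row in rows:
        key = tuple(row[col] for col in align_on)
        if key in series[row["algorithm"]]:
            raise ValueError(f"duplicate observation for {row['algorithm']} at {key}")
        series[row["algorithm"]][key] = row["price"]
    if len(series) < 2:
        raise ValueError("at least two algorithms are required")
    if baseline is not None and baseline not in series:
        raise ValueError(f"baseline {baseline!r} is not present")

    result = []
    for left, right in combinations(sorted(series), 2):
        if baseline is not None and baseline not in (left, right):
            continue
        # Only observations present for both algorithms are compared.
        shared = sorted(series[left].keys() & series[right].keys())
        if not shared:
            continue
        diffs = [series[left][k] - series[right][k] for k in shared]
        result.append({
            "left": left,
            "right": right,
            "n": len(diffs),
            "mean_diff": sum(diffs) / len(diffs),
            "mean_abs_diff": sum(abs(d) for d in diffs) / len(diffs),
        })
    return result


rows = [
    {"node": "n1", "algorithm": "a", "price": 10.0},
    {"node": "n2", "algorithm": "a", "price": 20.0},
    {"node": "n1", "algorithm": "b", "price": 12.0},
    {"node": "n2", "algorithm": "b", "price": 16.0},
]
comparison = pairwise_mean_differences(rows, align_on=["node"])
```

Here the single pair (a, b) has differences -2.0 and 4.0, so the mean difference and the mean absolute difference diverge, which is why both are worth reporting.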

create_timestamped_output_dir(evaluation_root, *name_parts, max_dir_name_length=64)

Create a timestamped output directory with bounded folder-name length.

On Windows, absolute paths longer than the default 260-character limit (MAX_PATH) can fail. This helper keeps the timestamped folder segment compact and, when truncation is required, appends a stable hash.

Parameters:
  • evaluation_root (Path)

  • name_parts (str)

  • max_dir_name_length (int) – upper bound on the length of the timestamped folder name

Return type:

Path
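A minimal sketch of the truncate-and-hash idea, using only the standard library. The helper names, the underscore separator, the timestamp format, and the 8-character SHA-1 suffix are assumptions for illustration and may differ from what the library does.

```python
import hashlib
import tempfile
from datetime import datetime
from pathlib import Path


def bounded_dir_name(timestamp, name_parts, max_len=64):
    """Hypothetical helper: join parts and truncate with a stable hash suffix."""
    full_name = "_".join([timestamp, *name_parts])
    if len(full_name) <= max_len:
        return full_name
    # SHA-1 of the untruncated name gives the same suffix on every call.
    digest = hashlib.sha1(full_name.encode("utf-8")).hexdigest()[:8]
    keep = max_len - len(digest) - 1
    return f"{full_name[:keep]}_{digest}"


def make_output_dir(root, *name_parts, max_len=64):
    """Create root/<timestamp>_<parts> with a bounded folder name."""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    out_dir = Path(root) / bounded_dir_name(timestamp, name_parts, max_len)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


short = bounded_dir_name("20240101_120000", ["prices"])
long = bounded_dir_name("20240101_120000", ["x" * 100])
with tempfile.TemporaryDirectory() as tmp:
    created = make_output_dir(tmp, "prices", "comparison")
    created_exists = created.is_dir()
```

Because the suffix is derived from the full untruncated name, two long names that share a prefix still map to different folders, and the same inputs always map to the same folder name.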

ensure_lost_opp_cost_run_for_configuration(
results_root,
repo_root,
dataset,
pricing_algorithm,
power_flow_model,
power_flow_model_name,
)

Reuse or compute a run that includes lost-opportunity-cost analysis outputs.

Parameters:
  • results_root (Path) – root directory containing run folders

  • repo_root (Path) – repository root used to normalize computed paths

  • dataset (UnitBased_Datasets) – dataset enum to solve

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum to solve

  • power_flow_model – instantiated power-flow model object

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

tuple (run_dir, status) where status is "reused" or "computed"

Return type:

tuple[Path, str]

ensure_redispatch_run_for_configuration(
results_root,
repo_root,
dataset,
power_flow_model,
power_flow_model_name,
redispatch_algorithm,
redispatch_constraint_units=False,
redispatch_threshold=0,
)

Reuse or compute a run that includes redispatch metrics.

Parameters:
  • results_root (Path) – root directory containing run folders

  • repo_root (Path) – repository root used to normalize computed paths

  • dataset (UnitBased_Datasets) – dataset enum to solve

  • power_flow_model – instantiated power-flow model object

  • power_flow_model_name (str) – model name used in run folder structure

  • redispatch_algorithm (RedispatchAlgorithms) – redispatch algorithm enum

  • redispatch_constraint_units (bool) – redispatch option forwarded to solver

  • redispatch_threshold (float) – threshold option forwarded to solver

Returns:

tuple (run_dir, status) where status is "reused" or "computed"

Return type:

tuple[Path, str]

ensure_run_for_configuration(
results_root,
repo_root,
dataset,
pricing_algorithm,
power_flow_model,
power_flow_model_name,
)

Reuse or compute a run for one dataset/pricing/model configuration.

Parameters:
  • results_root (Path) – root directory containing run folders

  • repo_root (Path) – repository root used to normalize computed paths

  • dataset (UnitBased_Datasets) – dataset enum to solve

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum to solve

  • power_flow_model – instantiated power-flow model object

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

tuple (run_dir, status) where status is "reused" or "computed"

Return type:

tuple[Path, str]
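The reuse-or-compute contract shared by the ensure_* helpers can be sketched as follows. This is an illustration only: ensure_run, find_existing, and compute are hypothetical names, and the lookup and the solver call are replaced by stand-ins that work on a temporary directory.

```python
import tempfile
from pathlib import Path


def ensure_run(find_existing, compute):
    """Hypothetical stand-in for the reuse-or-compute pattern."""
    run_dir = find_existing()
    if run_dir is not None:
        return run_dir, "reused"
    return compute(), "computed"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    def find_existing():
        # Stand-in lookup: any folder holding run_config.txt counts as a match.
        matches = sorted(p.parent for p in root.glob("*/run_config.txt"))
        return matches[-1] if matches else None

    def compute():
        # Stand-in solver call: writes a placeholder run folder.
        run_dir = root / "run_0001"
        run_dir.mkdir()
        (run_dir / "run_config.txt").write_text("dataset: example\n")
        return run_dir

    first = ensure_run(find_existing, compute)
    second = ensure_run(find_existing, compute)
    statuses = (first[1], second[1])
```

The first call finds nothing and computes; the second call finds the folder written by the first and reuses it, so callers can branch on the returned status string.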

ensure_welfare_run_for_configuration(
results_root,
repo_root,
dataset,
power_flow_model,
power_flow_model_name,
)

Reuse or compute a run that includes allocation welfare stats.

Parameters:
  • results_root (Path) – root directory containing run folders

  • repo_root (Path) – repository root used to normalize computed paths

  • dataset (UnitBased_Datasets) – dataset enum to solve

  • power_flow_model – instantiated power-flow model object

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

tuple (run_dir, status) where status is "reused" or "computed"

Return type:

tuple[Path, str]

find_latest_matching_lost_opp_cost_run(
results_root,
dataset,
pricing_algorithm,
power_flow_model_name,
zonal_path='',
)

Return the newest matching run folder with lost-opportunity-cost stats.

Parameters:
  • results_root (Path) – root directory containing run folders

  • dataset (UnitBased_Datasets) – dataset enum expected in run metadata

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum expected in run metadata

  • power_flow_model_name (str) – selected power-flow model name

  • zonal_path (str) – expected zonal path metadata value (empty for non-zonal)

Returns:

latest matching run directory with stats file, or None

Return type:

Path | None

find_latest_matching_redispatch_run(
results_root,
dataset,
power_flow_model_name,
redispatch_algorithm,
redispatch_constraint_units=False,
redispatch_threshold=0,
zonal_path='',
)

Return the newest matching run folder with redispatch metric outputs.

Parameters:
  • results_root (Path) – root directory containing run folders

  • dataset (UnitBased_Datasets) – dataset enum expected in run metadata

  • power_flow_model_name (str) – selected power-flow model name

  • redispatch_algorithm (RedispatchAlgorithms) – redispatch algorithm enum

  • redispatch_constraint_units (bool) – redispatch option expected in run output

  • redispatch_threshold (float) – threshold option expected in run output

  • zonal_path (str) – expected zonal path metadata value (empty for non-zonal)

Returns:

latest matching run directory with redispatch files, or None

Return type:

Path | None

find_latest_matching_run(
results_root,
dataset,
pricing_algorithm,
power_flow_model_name,
zonal_path='',
)

Return the newest run folder for a dataset, pricing algorithm, and model.

Parameters:
  • results_root (Path) – root directory containing run folders

  • dataset (UnitBased_Datasets) – dataset enum expected in run metadata

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum expected in run metadata

  • power_flow_model_name (str) – selected power-flow model name

  • zonal_path (str) – expected zonal path metadata value (empty for non-zonal)

Returns:

latest matching run directory, or None if no match is found

Return type:

Path | None
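A rough sketch of how such a lookup might work, assuming each run folder holds a run_config.txt with one "key: value" pair per line and that folder names begin with a sortable timestamp. Both assumptions, the helper name latest_matching_run, and the metadata keys and values are illustrative; the library may select the newest run differently.

```python
import tempfile
from pathlib import Path


def latest_matching_run(results_root, expected):
    """Hypothetical stand-in: newest run folder whose metadata matches."""
    matches = []
    for config_path in Path(results_root).glob("*/run_config.txt"):
        # Assumed metadata format: one 'key: value' pair per line.
        config = dict(
            line.split(": ", 1)
            for line in config_path.read_text().splitlines()
            if ": " in line
        )
        if all(config.get(key) == value for key, value in expected.items()):
            matches.append(config_path.parent)
    # Assumption: folder names start with a sortable timestamp.
    return max(matches, key=lambda p: p.name) if matches else None


with tempfile.TemporaryDirectory() as tmp:
    runs = [("20240101_a", "ip"), ("20240102_a", "ip"), ("20240103_b", "chp")]
    for name, algorithm in runs:
        run_dir = Path(tmp) / name
        run_dir.mkdir()
        (run_dir / "run_config.txt").write_text(
            f"dataset: example\npricing_algorithm: {algorithm}\n"
        )
    found = latest_matching_run(tmp, {"dataset": "example", "pricing_algorithm": "ip"})
    found_name = found.name
    missing = latest_matching_run(tmp, {"dataset": "other"})
```

The newest folder overall (20240103_b) is skipped because its metadata does not match, and a query with no match yields None rather than raising.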

find_latest_matching_welfare_run(
results_root,
dataset,
power_flow_model_name,
zonal_path='',
)

Return the newest matching run folder with allocation welfare stats.

Parameters:
  • results_root (Path) – root directory containing run folders

  • dataset (UnitBased_Datasets) – dataset enum expected in run metadata

  • power_flow_model_name (str) – selected power-flow model name

  • zonal_path (str) – expected zonal path metadata value (empty for non-zonal)

Returns:

latest matching run directory with allocation stats, or None

Return type:

Path | None

load_lost_opp_cost_table(
path,
*,
algorithm_column='algorithm',
lost_opp_cost_column='lost_opp_cost',
component_column='component',
value_column='value',
sheet_name='Sheet1',
)

Load a lost-opportunity-cost table from disk and normalize core columns.

Supported file types are .csv, .parquet, .txt, .xlsx, and .xls.

Parameters:
  • path (str | Path) – file path to load

  • algorithm_column (str) – source column name mapped to algorithm

  • lost_opp_cost_column (str) – source column name mapped to lost_opp_cost

  • component_column (str) – source column name mapped to component

  • value_column (str) – source column name mapped to value

  • sheet_name (str) – Excel sheet name when loading .xlsx/.xls

Returns:

validated normalized table with columns algorithm, lost_opp_cost, component, value

Raises:

ValueError – if the file type is unsupported or parsed data fails validation

Return type:

DataFrame
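The column-mapping parameters can be pictured with a standard-library sketch that handles the .csv case only. The helper load_renamed_csv, the source column names (method, type, part, amount), and the sample row are hypothetical; the real loader returns a validated DataFrame and supports more file types.

```python
import csv
import tempfile
from pathlib import Path


def load_renamed_csv(path, column_map):
    """Hypothetical stand-in: read a CSV and map source columns to core names."""
    path = Path(path)
    if path.suffix.lower() != ".csv":
        raise ValueError(f"unsupported file type: {path.suffix!r}")
    with path.open(newline="") as handle:
        rows = [
            {column_map.get(col, col): val for col, val in row.items()}
            for row in csv.DictReader(handle)
        ]
    for row in rows:
        row["value"] = float(row["value"])  # raises ValueError on bad input
    return rows


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "loc.csv"
    source.write_text("method,type,part,amount\nip,glocs,generators,12.5\n")
    table = load_renamed_csv(
        source,
        {
            "method": "algorithm",
            "type": "lost_opp_cost",
            "part": "component",
            "amount": "value",
        },
    )
```

Each *_column argument plays the role of one key in column_map: it names the source column that should end up under the corresponding core name.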

load_lost_opp_costs_from_run(
run_dir,
scenario_name,
pricing_algorithm,
power_flow_model_name,
)

Load one pricing algorithm’s lost-opportunity-cost components from a run.

Parameters:
  • run_dir (Path) – run directory containing run_config.txt

  • scenario_name (str) – dataset/scenario label added to the output table

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

normalized table with dataset, algorithm, lost_opp_cost, component, value

Return type:

DataFrame

load_prices_from_run(run_dir, scenario_name, pricing_algorithm, power_flow_model_name)

Load one pricing algorithm’s node-period prices from a selected run folder.

Parameters:
  • run_dir (Path) – run directory containing run_config.txt

  • scenario_name (str) – dataset/scenario label added to the output table

  • pricing_algorithm (PricingAlgorithms) – pricing algorithm enum

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

normalized price table with dataset, algorithm, node, period, and price

Return type:

DataFrame

load_redispatch_metric_file(path, *, redispatch_algorithm=None, metric=None)

Load one redispatch metric file and normalize it to tabular format.

The file is expected to contain one <label>: <value> line.

Parameters:
  • path (str | Path) – redispatch metric file path

  • redispatch_algorithm (str | None) – algorithm label override; inferred from the file name when omitted

  • metric (str | None) – metric label override (for example costs or volumes); inferred from the file name when omitted

Returns:

validated one-row table with redispatch_algorithm, metric, value

Raises:

ValueError – if parsing fails or inferred values are invalid

Return type:

DataFrame
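A sketch of parsing a single "<label>: <value>" line and inferring labels from the file name. The naming scheme <algorithm>_<metric>.txt, the helper read_metric_file, and the sample file are assumptions for illustration; the library's actual inference rules are not documented here.

```python
import tempfile
from pathlib import Path


def read_metric_file(path, redispatch_algorithm=None, metric=None):
    """Hypothetical stand-in: parse one '<label>: <value>' line into a record."""
    path = Path(path)
    lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
    if len(lines) != 1 or ":" not in lines[0]:
        raise ValueError(f"expected one '<label>: <value>' line in {path.name}")
    # Split on the last colon so the label itself may contain colons.
    _label, _, raw_value = lines[0].rpartition(":")
    try:
        value = float(raw_value)
    except ValueError as exc:
        raise ValueError(f"non-numeric value in {path.name}: {raw_value!r}") from exc
    # Assumed naming scheme: <algorithm>_<metric>.txt
    stem_algorithm, _, stem_metric = path.stem.rpartition("_")
    return {
        "redispatch_algorithm": redispatch_algorithm or stem_algorithm,
        "metric": (metric or stem_metric).lower(),
        "value": value,
    }


with tempfile.TemporaryDirectory() as tmp:
    metric_path = Path(tmp) / "nodal_costs.txt"
    metric_path.write_text("Redispatch costs: 1250.5\n")
    record = read_metric_file(metric_path)
    overridden = read_metric_file(metric_path, redispatch_algorithm="custom")
```

Explicit arguments take precedence over whatever the file name suggests, mirroring the override behaviour described for redispatch_algorithm and metric.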

load_redispatch_metrics_from_run(
run_dir,
scenario_name,
power_flow_model_name,
redispatch_algorithm,
redispatch_constraint_units=False,
redispatch_threshold=0,
)

Load redispatch costs/volumes from a selected run folder.

Parameters:
  • run_dir (Path) – run directory containing run_config.txt

  • scenario_name (str) – dataset/scenario label added to the output table

  • power_flow_model_name (str) – model name used in run folder structure

  • redispatch_algorithm (RedispatchAlgorithms) – redispatch algorithm enum

  • redispatch_constraint_units (bool) – redispatch option used to build file names and output metadata

  • redispatch_threshold (float) – threshold used to build file names and output metadata

Returns:

normalized table with dataset, power_flow_model, redispatch_algorithm, redispatch options, metric, and value

Return type:

DataFrame

load_welfare_from_run(run_dir, scenario_name, power_flow_model_name)

Load welfare values from a selected run folder.

Parameters:
  • run_dir (Path) – run directory containing run_config.txt

  • scenario_name (str) – dataset/scenario label added to the output table

  • power_flow_model_name (str) – model name used in run folder structure

Returns:

normalized welfare table with dataset, power_flow_model, welfare_scope, period, and welfare

Return type:

DataFrame

load_welfare_table(
path,
*,
power_flow_model_name=None,
welfare_scope_column='welfare_scope',
period_column='period',
welfare_column='welfare',
sheet_name='Sheet1',
)

Load a welfare table from disk and normalize core columns.

Supported file types are .txt, .csv, .parquet, .xlsx, and .xls.

Parameters:
  • path (str | Path) – file path to load

  • power_flow_model_name (str | None) – model name override used when the loaded file does not include power_flow_model

  • welfare_scope_column (str) – source column name mapped to welfare_scope

  • period_column (str) – source column name mapped to period

  • welfare_column (str) – source column name mapped to welfare

  • sheet_name (str) – Excel sheet name when loading .xlsx/.xls

Returns:

validated normalized welfare table

Raises:

ValueError – if file type is unsupported or parsed data fails validation

Return type:

DataFrame

normalize_run_dir(path, repo_root)

Resolve a run directory path, using repo_root for relative paths.

Parameters:
  • path (Path | str) – absolute or relative run-directory path

  • repo_root (Path) – repository root used to resolve relative paths

Returns:

normalized path; relative inputs are rooted at repo_root

Return type:

Path
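The resolution rule can be sketched with pathlib. The helper resolve_run_dir is a hypothetical stand-in that only covers the rule stated above: absolute paths pass through and relative paths are joined onto repo_root.

```python
from pathlib import Path


def resolve_run_dir(path, repo_root):
    """Hypothetical stand-in: root relative paths at repo_root."""
    path = Path(path)
    if path.is_absolute():
        return path
    return Path(repo_root) / path


repo_root = Path.cwd() / "repo"
relative = resolve_run_dir("results/run_0001", repo_root)
absolute = resolve_run_dir(Path.cwd() / "run_0002", repo_root)
```

With this rule, run directories stored as relative strings in saved metadata stay portable across machines, since only repo_root changes.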

parse_run_config(run_config_path)

Parse key-value run metadata stored in run_config.txt.

Parameters:

run_config_path (Path) – path to a run configuration file

Returns:

dictionary with parsed metadata entries

Return type:

dict[str, str]
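A minimal sketch of key-value parsing, assuming one "key: value" pair per line. The separator, the helper name parse_key_value_text, and the sample keys and values are assumptions; the actual run_config.txt layout may differ.

```python
def parse_key_value_text(text, separator=":"):
    """Hypothetical stand-in: parse 'key<separator> value' lines into a dict."""
    entries = {}
    for line in text.splitlines():
        key, found, value = line.partition(separator)
        if not found or not key.strip():
            continue  # skip blank or malformed lines
        # Split on the first separator only, so values may contain it.
        entries[key.strip()] = value.strip()
    return entries


sample = (
    "dataset: example\n"
    "zonal_path: C:\\zones\\config.json\n"
    "\n"
    "pricing_algorithm: ip\n"
)
parsed = parse_key_value_text(sample)
```

Splitting on the first separator only matters for values such as Windows paths, which contain a colon of their own. All values stay strings, matching the dict[str, str] return type.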

plot_average_prices_by_node(
prices,
output_file,
algorithm_order=None,
statistic_fn=<function mean>,
)

Plot one aggregated price statistic by node for each algorithm.

Parameters:
  • prices (DataFrame) – price table containing at least node, algorithm, and price

  • output_file (str | Path) – output image path

  • algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

  • statistic_fn (Callable[[ndarray], float]) – aggregation function applied to price values

Returns:

None

Return type:

None

plot_average_prices_by_period(
prices,
output_file,
algorithm_order=None,
statistic_fn=<function mean>,
)

Plot one aggregated price statistic by period for each algorithm.

Parameters:
  • prices (DataFrame) – price table containing at least period, algorithm, and price

  • output_file (str | Path) – output image path

  • algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

  • statistic_fn (Callable[[ndarray], float]) – aggregation function applied to price values (for example np.mean or np.median)

Returns:

None

Return type:

None

plot_lost_opp_cost_by_component(
lost_opp_costs,
output_file,
*,
lost_opp_cost_type,
algorithm_order=None,
)

Plot one lost-opportunity-cost type across components for each algorithm.

Parameters:
  • lost_opp_costs (DataFrame) – table containing at least lost_opp_cost, component, algorithm, value

  • output_file (str | Path) – output image path

  • lost_opp_cost_type (str) – selected type to plot (for example glocs)

  • algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

Returns:

None

Raises:

ValueError – if filtered data is empty or duplicates exist per component and algorithm

Return type:

None

plot_price_boxplot_by_node(
prices,
output_file,
algorithm_order=None,
statistic_fn=<function mean>,
)

Create a boxplot across algorithms using one aggregated value per node.

Parameters:
  • prices (DataFrame) – price table containing node, algorithm, price

  • output_file (str | Path) – output image path

  • algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

  • statistic_fn (Callable[[ndarray], float]) – aggregation function applied within each node

Returns:

None

Return type:

None

plot_price_boxplot_by_period(
prices,
output_file,
algorithm_order=None,
statistic_fn=<function mean>,
)

Create a boxplot across algorithms using one aggregated value per period.

Parameters:
  • prices (DataFrame) – price table containing period, algorithm, price

  • output_file (str | Path) – output image path

  • algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

  • statistic_fn (Callable[[ndarray], float]) – aggregation function applied within each period

Returns:

None

Return type:

None

plot_redispatch_metric_by_algorithm(
redispatch_table,
output_file,
*,
metric,
redispatch_algorithm_order=None,
)

Plot one redispatch metric as a bar chart across redispatch algorithms.

Parameters:
  • redispatch_table (DataFrame) – redispatch table containing metric, redispatch_algorithm, and value

  • output_file (str | Path) – output image path

  • metric (str) – selected metric label (for example costs)

  • redispatch_algorithm_order (Sequence[str] | None) – optional plotting order for algorithms

Returns:

None

Raises:

ValueError – if filtered data is empty or duplicates exist per redispatch algorithm

Return type:

None

plot_redispatch_metric_by_power_flow_model(
redispatch_table,
output_file,
*,
metric,
power_flow_model_order=None,
)

Plot one redispatch metric as a bar chart across power-flow models.

Parameters:
  • redispatch_table (DataFrame) – redispatch table containing metric, power_flow_model, and value

  • output_file (str | Path) – output image path

  • metric (str) – selected metric label (for example costs)

  • power_flow_model_order (Sequence[str] | None) – optional plotting order for models

Returns:

None

Raises:

ValueError – if filtered data is empty or duplicates exist per power-flow model

Return type:

None

plot_total_welfare_by_power_flow_model(
welfare_table,
output_file,
power_flow_model_order=None,
)

Plot total welfare for each power-flow model as a bar chart.

Parameters:
  • welfare_table (DataFrame) – welfare table containing welfare_scope, power_flow_model, and welfare

  • output_file (str | Path) – output image path

  • power_flow_model_order (Sequence[str] | None) – optional plotting order for models

Returns:

None

Raises:

ValueError – if no total rows exist or duplicates exist per model

Return type:

None

plot_value_by_period_and_power_flow_model(
table,
output_file,
*,
period_column,
model_column,
value_column,
ylabel,
title,
power_flow_model_order=None,
)

Plot a generic value by period for each power-flow model.

Parameters:
  • table (DataFrame) – input table

  • output_file (str | Path) – output image path

  • period_column (str) – column name used as x-axis period

  • model_column (str) – column name used for model grouping

  • value_column (str) – numeric value column to plot

  • ylabel (str) – y-axis label

  • title (str) – chart title

  • power_flow_model_order (Sequence[str] | None) – optional plotting order for models

Returns:

None

Raises:

ValueError – if required columns are missing

Return type:

None

plot_value_by_power_flow_model(
table,
output_file,
*,
value_column,
ylabel,
title,
power_flow_model_order=None,
)

Plot one value column as a bar chart across power-flow models.

Parameters:
  • table (DataFrame) – input table containing power_flow_model and one value column

  • output_file (str | Path) – output image path

  • value_column (str) – column name to plot on the y-axis

  • ylabel (str) – y-axis label

  • title (str) – chart title

  • power_flow_model_order (Sequence[str] | None) – optional plotting order for models

Returns:

None

Raises:

ValueError – if required columns are missing or duplicates exist per power-flow model

Return type:

None

plot_welfare_by_period(welfare_table, output_file, power_flow_model_order=None)

Plot period welfare trajectories for each power-flow model.

Parameters:
  • welfare_table (DataFrame) – welfare table containing welfare_scope, period, power_flow_model, and welfare

  • output_file (str | Path) – output image path

  • power_flow_model_order (Sequence[str] | None) – optional plotting order for models

Returns:

None

Raises:

ValueError – if no period-level welfare rows are found

Return type:

None

round_numeric_columns(df, digits=2)

Round all numeric columns to a fixed number of decimal places.

Parameters:
  • df (DataFrame) – input table

  • digits (int) – number of decimals used for rounding numeric columns

Returns:

copy of df with rounded numeric columns

Return type:

DataFrame

statistic_name(statistic_fn)

Return a lowercase filename-safe name for the selected statistic function.

Parameters:

statistic_fn (Callable[[ndarray], float]) – aggregation callable (for example np.mean)

Returns:

lowercase snake-case statistic label

Return type:

str
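One plausible way to derive such a label is from the callable's __name__. The sketch below is an assumption about the approach, not the library's code; callable_label is a hypothetical name, and functions from the standard statistics module stand in for np.mean so the example has no third-party dependencies.

```python
import re
import statistics


def callable_label(fn):
    """Hypothetical stand-in: derive a lowercase, filename-safe label."""
    raw = getattr(fn, "__name__", type(fn).__name__)
    # Insert underscores at camelCase boundaries, then replace unsafe characters.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", raw)
    return re.sub(r"[^a-z0-9]+", "_", snake.lower()).strip("_")


labels = [
    callable_label(statistics.mean),
    callable_label(statistics.median),
    callable_label(lambda values: max(values)),
]
```

A label like this can be embedded in output file names so that plots produced with different statistics do not overwrite each other.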

summarize_prices(df, *, group_by=('algorithm',))

Compute descriptive statistics for prices grouped by one or more columns.

Parameters:
  • df (DataFrame) – input price table

  • group_by (Sequence[str]) – grouping columns to summarize by; defaults to ("algorithm",)

Returns:

one row per group with counts, central moments, quantiles, spread, and additional quality metrics

Return type:

DataFrame
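The grouping step can be illustrated with a standard-library sketch over plain dict rows. The helper summarize_by_group, the chosen statistics, and the sample data are assumptions; the library's output contains more columns than shown here.

```python
import statistics
from collections import defaultdict


def summarize_by_group(rows, group_by=("algorithm",)):
    """Hypothetical stand-in: descriptive price statistics per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in group_by)].append(row["price"])
    summary = []
    for key in sorted(groups):
        prices = groups[key]
        summary.append({
            **dict(zip(group_by, key)),
            "count": len(prices),
            "mean": statistics.fmean(prices),
            "median": statistics.median(prices),
            "min": min(prices),
            "max": max(prices),
            # Sample standard deviation needs at least two observations.
            "std": statistics.stdev(prices) if len(prices) > 1 else 0.0,
        })
    return summary


rows = [
    {"algorithm": "a", "price": 10.0},
    {"algorithm": "a", "price": 20.0},
    {"algorithm": "a", "price": 30.0},
    {"algorithm": "b", "price": 5.0},
    {"algorithm": "b", "price": 15.0},
]
summary = summarize_by_group(rows)
```

Passing several column names in group_by, for example ("algorithm", "node"), yields one summary row per combination of values.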

validate_lost_opp_cost_table(df)

Validate and normalize a generic lost-opportunity-cost input table.

Parameters:

df (DataFrame) – input table expected to contain algorithm, lost_opp_cost, component, and value

Returns:

normalized copy with lowercase categorical values and numeric value

Raises:

ValueError – if required columns are missing, unsupported categories are present, labels are empty, or no numeric values are available

Return type:

DataFrame

validate_price_table(df)

Validate and normalize a generic price-analysis input table.

Parameters:

df (DataFrame) – input table containing at least algorithm and price columns; additional columns are preserved

Returns:

normalized copy with trimmed column names, normalized algorithm labels, and numeric price values

Raises:

ValueError – if required columns are missing, algorithm labels are empty, or no numeric prices are available

Return type:

DataFrame
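The validation steps listed above can be sketched on plain dict rows. This is an illustration under assumptions: validate_price_rows is a hypothetical helper, label normalization is reduced to trimming whitespace, and rows with non-numeric prices are dropped, which may not match the library's exact behaviour.

```python
def validate_price_rows(rows):
    """Hypothetical stand-in for the validation steps on a list of dict rows."""
    cleaned = []
    for row in rows:
        # Trim whitespace around column names.
        row = {str(col).strip(): val for col, val in row.items()}
        missing = {"algorithm", "price"} - row.keys()
        if missing:
            raise ValueError(f"missing required columns: {sorted(missing)}")
        label = str(row["algorithm"]).strip()
        if not label:
            raise ValueError("algorithm labels must not be empty")
        try:
            price = float(row["price"])
        except (TypeError, ValueError):
            continue  # assumption: rows with non-numeric prices are dropped
        cleaned.append({**row, "algorithm": label, "price": price})
    if not cleaned:
        raise ValueError("no numeric prices are available")
    return cleaned


raw_rows = [
    {" algorithm ": " ip ", "price": "12.5", "node": "n1"},
    {"algorithm": "chp", "price": "n/a"},
]
valid_rows = validate_price_rows(raw_rows)
```

The extra node column passes through untouched, in line with the note that additional columns are preserved.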

validate_redispatch_table(df)

Validate and normalize a generic redispatch-analysis input table.

Parameters:

df (DataFrame) – input table expected to contain redispatch_algorithm, metric, and value

Returns:

normalized copy with lowercase metric labels and numeric value

Raises:

ValueError – if required columns are missing, metric values are unsupported, algorithm labels are empty, or no numeric values are available

Return type:

DataFrame

validate_welfare_table(df)

Validate and normalize a generic welfare-analysis input table.

Parameters:

df (DataFrame) – input table expected to contain power_flow_model, welfare_scope, period, and welfare

Returns:

normalized copy with lowercase scope labels, integer-like periods, and numeric welfare values

Raises:

ValueError – if required columns are missing, scope values are unsupported, model labels are empty, or period/scope combinations are inconsistent

Return type:

DataFrame