Evaluation walkthrough¶
Once a model is trained, the right question isn't just "what's the MSE?" but "do the predictions satisfy power-flow physics?" This notebook:
- Loads a saved checkpoint
- Runs inference on a held-out batch
- Computes voltage, generation, and power-balance violations with ACOPFConstraintEvaluator
- Compares network-level violation statistics to the solver targets
import numpy as np
import torch
from lumina.dataset.opf.opf_dataset import OPFDataset
from lumina.dataset.opf.transforms import to_float32
from lumina.loader.opf.opf_loader import DataLoader
from lumina.evaluator.opf.evaluator import ACOPFConstraintEvaluator
from lumina.evaluator.opf.utils import Modeler
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
DATA_ROOT = '/path/to/datasets'
CKPT_PATH = 'case14_quickstart.pt' # produced by 01_quickstart_case14.ipynb
1. Load data and checkpoint¶
Prerequisite: run 01_quickstart_case14.ipynb first. Its last cell saves case14_quickstart.pt in the working directory, which Modeler.load_model_from_training_checkpoint loads below.
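Before loading anything, a quick existence check avoids a confusing stack trace later. Inspecting the top-level keys assumes the checkpoint is a plain torch.save dict; the exact format is whatever the quickstart notebook wrote.
import os
assert os.path.exists(CKPT_PATH), f'{CKPT_PATH} missing; run 01_quickstart_case14.ipynb first'
ckpt_keys = sorted(torch.load(CKPT_PATH, map_location='cpu'))  # assumes a plain dict checkpoint
print(ckpt_keys)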
ds = OPFDataset(
root=DATA_ROOT,
case_name='pglib_opf_case14_ieee',
group_id=0,
transform=to_float32, # cast features to float32 to match model weights
)
loader = DataLoader(ds[-128:], batch_size=32) # last 128 as held-out
# Reconstruct the trained model from NB01's checkpoint via Modeler.
modeler = Modeler(device=torch.device(DEVICE))
model = modeler.load_model_from_training_checkpoint(CKPT_PATH)
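A quick sanity print confirms the restore worked; this uses only standard torch APIs.
n_params = sum(p.numel() for p in model.parameters())  # count all model parameters
print(f'restored model with {n_params:,} parameters on {DEVICE}')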
2. Run inference¶
model.eval()  # disable dropout / batch-norm updates for deterministic inference
preds, gts, batches = [], [], []
with torch.no_grad():
for batch in loader:
batch = batch.to(DEVICE)
out = model(batch.x_dict, batch.edge_index_dict, batch.edge_attr_dict)
preds.append({k: v.detach().cpu() for k, v in out.items()})
gts.append({k: v.detach().cpu() for k, v in batch.y_dict.items()})
batches.append(batch.cpu())
print(f'evaluated {len(loader.dataset)} samples in {len(batches)} batches')
3. Constraint evaluation¶
ACOPFConstraintEvaluator checks bound, balance, and thermal-limit constraints. It returns per-sample violations as well as RMS-aggregated metrics.
evaluator = ACOPFConstraintEvaluator()
metrics = []
for batch, pred in zip(batches, preds):
m = evaluator.evaluate(batch, pred)
metrics.append(m)
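The exact metric names depend on the ACOPFConstraintEvaluator version; a peek at one result dict shows what is available (the keys used in step 4, such as p_balance_rms, should appear here).
print(sorted(metrics[0]))  # list the metric names the evaluator reports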
# Average each metric across batches (all batches are the same size here,
# so this matches the per-sample mean).
agg = {k: float(np.mean([m[k] for m in metrics])) for k in metrics[0]}
for k, v in agg.items():
print(f'{k:30s} {v:.4e}')
4. Compare against solver targets¶
The OPFData targets come from a numerical solver, so they should satisfy the constraints up to solver tolerance. Running the evaluator on the targets therefore gives a practical floor on the violation levels any model can aim for.
metrics_solver = []
for batch, gt in zip(batches, gts):
metrics_solver.append(evaluator.evaluate(batch, gt))
agg_solver = {k: float(np.mean([m[k] for m in metrics_solver])) for k in metrics_solver[0]}
print(f'{"metric":30s} {"model":>12s} {"solver":>12s} ratio')
for k in agg:
s = agg_solver[k]
ratio = (agg[k] / s) if s > 0 else float('inf')
print(f'{k:30s} {agg[k]:12.4e} {s:12.4e} {ratio:6.1f}x')
Ratios near 1 mean the model is hitting solver-quality feasibility. Large ratios for p_balance_rms or q_balance_rms signal that the model leaves substantial real- or reactive-power imbalance uncorrected; that is usually fixable with a physics-informed loss term (see notebook 4, sketched below).
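As a preview of that fix, here is a minimal sketch of a physics-informed loss: supervised MSE plus a quadratic penalty on the power-balance residual. The names physics_informed_loss, p_residual, and lam are illustrative, not the API used in notebook 4.
def physics_informed_loss(pred, target, p_residual, lam=0.1):
    """Supervised MSE plus a penalty pushing per-bus power imbalance toward zero (sketch)."""
    mse = torch.nn.functional.mse_loss(pred, target)
    return mse + lam * p_residual.pow(2).mean()
Tuning lam trades raw prediction accuracy against feasibility; see notebook 4 for the full treatment.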