VIO Survey Benchmark

GitHub Pages reproducible benchmark

Lesson 06 / Evaluate

Read benchmark results

Interpret a dashboard full of TBD cells without treating missing values as scores.

Learning outcomes

  • Read TBD, N/A, Partial, and Unresolved correctly
  • Compare only validated numeric rows
  • Check the artifact paths behind a metric row

TBD is a public state

TBD means the benchmark has defined the case but has not validated the numeric output yet.

A table with TBD cells is useful because it shows the benchmark matrix, expected artifacts, and missing-data policy.

Do not average missing cases

Only validated numeric rows can enter an average.

Rows marked TBD, N/A, Partial, or Unresolved stay visible in the table but are excluded from every average, so a missing value is never silently counted as a score.
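The averaging rule above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's own tooling: the function names `parse_cell` and `average_validated` are hypothetical, and it assumes each cell is a string holding either a number or one of the four state labels.

```python
import math
from statistics import mean
from typing import List, Optional, Tuple

# State labels from this lesson; a cell holding one of these is not a score.
NON_NUMERIC_STATES = {"TBD", "N/A", "Partial", "Unresolved"}


def parse_cell(cell: str) -> Optional[float]:
    """Return the numeric value of a cell, or None if it is not a validated number."""
    text = cell.strip()
    if text in NON_NUMERIC_STATES:
        return None
    try:
        value = float(text)
    except ValueError:
        # Anything unparseable is treated as missing, never as zero.
        return None
    # Reject "nan" and "inf", which float() accepts but which are not scores.
    return value if math.isfinite(value) else None


def average_validated(cells: List[str]) -> Tuple[Optional[float], int, int]:
    """Average only numeric cells; report how many were used and how many skipped."""
    values = [v for v in (parse_cell(c) for c in cells) if v is not None]
    skipped = len(cells) - len(values)
    average = mean(values) if values else None
    return average, len(values), skipped


if __name__ == "__main__":
    print(average_validated(["0.5", "TBD", "1.5", "N/A"]))
```

Returning the used and skipped counts alongside the average keeps the excluded rows visible to whoever reads the result, instead of hiding them inside a single number.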

Checklist

  • System is in the final public list
  • Dataset has a defined truth path
  • Pose and timing artifacts exist
  • EPA summary exists before a numeric value is shown
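One way to picture the checklist is as a gate in front of the displayed value. The sketch below is an assumption-laden illustration: the `RowChecks` fields and the `display_value` function are hypothetical names, and falling back to the label TBD when a check fails is a simplification of whatever state the dashboard would actually show.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowChecks:
    """Hypothetical flags, one per checklist item in this lesson."""
    in_public_list: bool
    has_truth_path: bool
    has_pose_artifact: bool
    has_timing_artifact: bool
    has_epa_summary: bool


def display_value(checks: RowChecks, value: float) -> Tuple[str, List[str]]:
    """Return the text to show and the names of any failed checks."""
    missing = [name for name, ok in vars(checks).items() if not ok]
    if missing:
        # Assumption: an incomplete row is shown as TBD rather than a number.
        return "TBD", missing
    return f"{value:.3f}", []


if __name__ == "__main__":
    complete = RowChecks(True, True, True, True, True)
    incomplete = RowChecks(True, True, True, True, False)
    print(display_value(complete, 0.1234))
    print(display_value(incomplete, 0.1234))
```

Listing the failed checks next to the label makes it easy to see which artifact is still missing for a given row.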

Advanced notes

A table can carry both a run status and a metric status. The two are separate, and the public dashboard should make that distinction visible.
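Keeping the two statuses as separate fields is one way to make the distinction explicit. In the sketch below, the metric states come from this lesson, while the run states (`not run`, `completed`, `failed`), the `Validated` label, and the `status_label` helper are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from enum import Enum


class RunStatus(Enum):
    # Hypothetical run states; the benchmark may define different ones.
    NOT_RUN = "not run"
    COMPLETED = "completed"
    FAILED = "failed"


class MetricStatus(Enum):
    # "Validated" is an assumed label for a row with a checked numeric value.
    VALIDATED = "Validated"
    TBD = "TBD"
    NA = "N/A"
    PARTIAL = "Partial"
    UNRESOLVED = "Unresolved"


@dataclass
class RowStatus:
    run: RunStatus
    metric: MetricStatus


def status_label(row: RowStatus) -> str:
    """Render both statuses side by side so neither hides the other."""
    return f"run: {row.run.value} | metric: {row.metric.value}"


if __name__ == "__main__":
    # A run can finish while its metric has not been validated yet.
    print(status_label(RowStatus(RunStatus.COMPLETED, MetricStatus.TBD)))
```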

Practice task

On the Results page, pick one row and identify the pose, timing, EPA, and resource artifacts.