Progress summary

This page shows performance metrics for recent evals.
Eval | Dataset | Provider | Prompt | Pass Rate % | Pass Count | Fail Count | Raw score
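
The page does not say how the Pass Rate % column is derived. As a minimal sketch, assuming it is the share of passing cases among all graded cases (pass count plus fail count), it could be computed as below. The function name `pass_rate_percent` and the zero-total behavior are hypothetical and not taken from the page.

```python
def pass_rate_percent(pass_count: int, fail_count: int) -> float:
    """Return passes as a percentage of all graded cases.

    Assumption: every graded case is counted as either a pass or a fail,
    so the total is pass_count + fail_count.
    """
    total = pass_count + fail_count
    if total == 0:
        # Assumption: report 0.0 when no cases were graded,
        # rather than dividing by zero.
        return 0.0
    return 100.0 * pass_count / total


if __name__ == "__main__":
    # Illustrative counts only, not data from the page.
    print(pass_rate_percent(3, 1))
```

Under this assumption, a row with 3 passes and 1 failure would show 75.0 in the Pass Rate % column. The Raw score column is left out because the page gives no hint about how it is calculated.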