An eval framework with a playground, scoring, and dataset management for LLM apps, without Braintrust's $249/seat/month enterprise pricing. Pay per eval run, with a multi-model judge included, 99% cheaper than the platform.
This module is your starting point. Describe what you want to layer on top: an interface, extra fields, a workflow, or a whole app. Watch it build in real time.
Your module's ready — tell us what you need
Use it, host it, give it a home, or keep building. You pick.