A replacement for Together AI Dedicated endpoints: reserved-capacity Llama and Mixtral serving without the $2+ per-GPU-hour burn on idle H100 pods. Billing is metered strictly per generated token, with zero charges while your endpoint is silent. Built on MeterCall.
This module is your starting point. Describe what you want to layer on top: an interface, extra fields, a workflow, or a whole app. Watch it build in real time. Press ⌘/Ctrl + Enter to run.
Your module's ready — tell us what you need
Use it, host it, give it a home, or keep building. You pick.