AutoMix Selection
AutoMix optimizes the trade-off between response quality and cost using a cascading approach with self-verification. It queries the cheapest model first, checks that model's answer with a self-verification step, and escalates to a more expensive model only when the verification confidence is low.
The authors report that this approach can reduce cost by more than 50% while maintaining comparable performance (AutoMix, Madaan et al., NeurIPS 2024).
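
The cascade described above can be illustrated with a minimal sketch. It is not the AutoMix implementation: the `Model` type, the `cascade` function, the `verify` callable, the per-call `cost` values, and the `0.7` threshold are all hypothetical names and numbers chosen for illustration. A fixed confidence threshold stands in for whatever routing rule is actually used, and the cost of running the verifier itself is ignored.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Model:
    name: str
    cost: float                     # hypothetical per-call cost, arbitrary units
    generate: Callable[[str], str]  # prompt -> answer


def cascade(
    prompt: str,
    models: List[Model],
    verify: Callable[[str, str], float],
    threshold: float = 0.7,         # assumed cutoff, not a value from the source
) -> Tuple[str, str, float]:
    """Try models from cheapest to most expensive.

    Returns (answer, name of the model that produced it, total cost spent).
    """
    if not models:
        raise ValueError("at least one model is required")

    ordered = sorted(models, key=lambda m: m.cost)
    total_cost = 0.0
    for i, model in enumerate(ordered):
        answer = model.generate(prompt)
        total_cost += model.cost
        is_last = i == len(ordered) - 1
        # Accept the answer if the verifier is confident enough,
        # or if there is no more expensive model left to escalate to.
        if is_last or verify(prompt, answer) >= threshold:
            return answer, model.name, total_cost


# Stub models and verifier, for demonstration only.
small = Model("small", 1.0, lambda p: "4" if "2+2" in p else "unsure")
large = Model("large", 10.0, lambda p: "a careful answer")


def stub_verify(prompt: str, answer: str) -> float:
    # Pretend the verifier distrusts the literal answer "unsure".
    return 0.1 if answer == "unsure" else 0.9


easy = cascade("What is 2+2?", [small, large], stub_verify)
hard = cascade("Explain entropy", [small, large], stub_verify)
```

With these stubs, the easy prompt is answered by the small model alone, while the hard prompt pays for both calls. Savings in a cascade of this kind come from the fraction of queries that never reach the expensive model.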