Run Benchmark
Test LLM vision models on hot dog classification and compare accuracy
Free vision models from OpenRouter — pick up to 4
OpenRouter API key (optional). Leave blank to use the server default from .env. Get a key at openrouter.ai/keys
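The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function name `resolve_api_key` and the environment variable name `OPENROUTER_API_KEY` are assumptions, and it presumes the server has already loaded .env into its environment.

```python
import os


def resolve_api_key(user_key, env=None):
    """Return the key typed into the form, or the server default if it is blank.

    `env` defaults to os.environ; the variable name OPENROUTER_API_KEY is an
    assumption made for this sketch, not something the page confirms.
    """
    env = os.environ if env is None else env
    key = (user_key or "").strip()
    if key:
        return key  # a non-blank form value wins
    return env.get("OPENROUTER_API_KEY")  # None if no server default is set
```

A whitespace-only value is treated as blank, so an accidental space in the field still falls back to the server default.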
Images per category (N). Each model is shown N hot dog images and N not-hot-dog images, 2N images in total.
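As a rough sketch of what a balanced run might look like, the snippet below draws N images from each category and scores a model's answers. The helper names `build_test_set` and `accuracy`, the fixed seed, and the boolean labels are illustrative assumptions, not the benchmark's real implementation.

```python
import random


def build_test_set(hot_dogs, not_hot_dogs, n, seed=0):
    """Pick n images from each category and return shuffled (image, is_hot_dog) pairs."""
    rng = random.Random(seed)  # fixed seed so each model could see the same images
    sample = [(img, True) for img in rng.sample(hot_dogs, n)]
    sample += [(img, False) for img in rng.sample(not_hot_dogs, n)]
    rng.shuffle(sample)
    return sample


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels (0.0 for an empty set)."""
    if not labels:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Because both categories contribute the same number of images, a model that always answers "hot dog" scores exactly 0.5 under this scheme, which makes accuracy figures easy to compare across models.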