LLM Model Ranking: Real-time Usage-based Model Rankings and Statistics
Compare and analyze actual usage and performance of LLM models from various perspectives
Weekly Model Efficiency Ranking TOP 10 (excluding free models)
Efficiency Ranking Indicator Guide
Efficiency rankings are calculated from the ratio of output tokens to input tokens. The lower this ratio, the more efficiently the model operates.
This metric is particularly meaningful for tasks such as document editing, code refactoring, and data analysis. Highly efficient models tend to extract only the necessary parts of the information a user provides and respond concisely, which reduces unnecessary token consumption and keeps AI usage cost-effective. A low ratio does not automatically mean better performance, however: some complex tasks genuinely require more output tokens, and when detailed explanations or extensive information are needed, a higher ratio may actually be preferable. Interpret this metric in light of the nature and purpose of the task at hand.
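As a minimal worked example of the formula, the sketch below recomputes the ratio for the top-ranked model using the rounded token counts published in the table that follows:

```python
# Worked example of the efficiency ratio, using the rounded figures shown for
# the #1 model in the table below (meta-llama/llama-guard-4-12b).
input_tokens = 646.82e6   # 646.82M input tokens
output_tokens = 838.34e3  # 838.34K output tokens

efficiency_ratio = output_tokens / input_tokens
print(f"{efficiency_ratio:.4f}")  # 0.0013
```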
| Rank | Model Name | Input Tokens | Output Tokens | Efficiency Ratio (Output / Input) |
|---|---|---|---|---|
| 1 | meta-llama/llama-guard-4-12b | 646.82M | 838.34K | 0.0013 |
| 2 | perplexity/sonar-deep-research | 8.00M | 6.85M | 0.0152 |
| 3 | openai/codex-mini | 151.46M | 2.47M | 0.0158 |
| 4 | anthropic/claude-4-opus-20250522 | 16.8B | 327.79M | 0.0195 |
| 5 | neversleep/llama-3.1-lumimaid-8b | 1.1B | 22.05M | 0.0205 |
| 6 | anthropic/claude-4-sonnet-20250522 | 201.7B | 4.2B | 0.0210 |
| 7 | qwen/qwen-plus-2025-01-25 | 195.42M | 4.11M | 0.0210 |
| 8 | openai/o4-mini-high-2025-04-16 | 1.4B | 36.17M | 0.0224 |
| 9 | arcee-ai/spotlight | 12.24M | 287.41K | 0.0235 |
| 10 | meta-llama/llama-3.2-1b-instruct | 5.4B | 130.27M | 0.0242 |
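For readers who want to reproduce the ranking from raw usage counts, here is a minimal sketch. The row layout and the `parse_tokens` helper are assumptions made for illustration; only the formula itself (output tokens divided by input tokens, sorted ascending) comes from the guide above.

```python
# Sketch of how the efficiency ranking can be reproduced from usage counts.
# The row format and parse_tokens helper are illustrative assumptions; only
# the formula (output tokens / input tokens, sorted ascending) comes from
# the guide above.

SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9}

def parse_tokens(value: str) -> float:
    """Convert a human-readable count such as '646.82M' into a plain float."""
    if value[-1] in SUFFIXES:
        return float(value[:-1]) * SUFFIXES[value[-1]]
    return float(value)

# A few rows taken from the table above: (model, input tokens, output tokens).
usage = [
    ("anthropic/claude-4-sonnet-20250522", "201.7B", "4.2B"),
    ("meta-llama/llama-guard-4-12b", "646.82M", "838.34K"),
    ("openai/codex-mini", "151.46M", "2.47M"),
]

# Lower ratio = terser output relative to the context supplied, i.e. "more
# efficient" under this metric.
ranked = sorted(
    ((model, parse_tokens(out) / parse_tokens(inp)) for model, inp, out in usage),
    key=lambda item: item[1],
)

for rank, (model, ratio) in enumerate(ranked, start=1):
    print(f"{rank}. {model}: {ratio:.4f}")
```

Because the table displays rounded token counts, ratios recomputed this way may not match the published values exactly (for example, 2.47M / 151.46M gives roughly 0.0163 for openai/codex-mini versus the listed 0.0158), but the relative ordering is preserved.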