Top Model Benchmark: NormScore - LiveBench
Rank | Model Name | NormScore - LiveBench | Agentic Coding | Coding | Data Analysis | Instruction Following | Language | Mathematics | Reasoning |
---|---|---|---|---|---|---|---|---|---|
1 | o3 High | 74.606 | 74.949 | 75.962 | 69.490 | 76.516 | 73.063 | 74.015 | 76.356 |
2 | Gemini 2.5 Pro Preview | 72.032 | 64.243 | 72.894 | 74.085 | 74.120 | 70.713 | 77.248 | 71.197 |
3 | o3 Pro High | 71.355 | 48.182 | 76.056 | 71.753 | 76.250 | 77.183 | 73.792 | 76.339 |
4 | Claude 4 Opus Thinking | 70.959 | 57.104 | 72.595 | 74.724 | 71.696 | 72.186 | 76.972 | 72.991 |
5 | o3 Medium | 70.108 | 55.320 | 77.142 | 70.590 | 74.900 | 70.085 | 70.246 | 73.384 |
6 | Claude 4 Sonnet Thinking | 69.424 | 49.966 | 72.894 | 73.268 | 71.434 | 69.050 | 74.289 | 76.836 |
7 | o4-Mini High | 68.944 | 51.751 | 79.218 | 71.618 | 75.421 | 61.905 | 73.802 | 71.050 |
8 | Grok 4 | 68.111 | 35.690 | 70.630 | 72.766 | 69.365 | 73.992 | 77.438 | 78.871 |
9 | Gemini 2.5 Pro (Max Thinking) | 68.100 | 39.259 | 73.193 | 75.977 | 68.702 | 74.277 | 73.347 | 76.063 |
10 | DeepSeek R1 | 66.758 | 42.828 | 70.723 | 75.230 | 70.991 | 62.535 | 74.242 | 73.490 |
11 | Qwen 3 235B A22B Thinking 2507 | 66.338 | 24.983 | 66.570 | 80.303 | 79.824 | 68.257 | 70.803 | 73.861 |
12 | Claude 3.7 Sonnet Thinking | 65.837 | 48.182 | 72.501 | 72.887 | 72.161 | 67.337 | 68.852 | 61.399 |
13 | Gemini 2.5 Pro | 65.035 | 21.414 | 70.032 | 76.133 | 69.721 | 73.411 | 72.719 | 75.619 |
14 | Claude 4 Opus | 64.195 | 51.751 | 72.894 | 68.658 | 69.624 | 74.555 | 68.961 | 45.501 |
15 | o4-Mini Medium | 62.843 | 33.906 | 73.493 | 71.043 | 72.655 | 58.169 | 70.387 | 63.260 |
16 | Gemini 2.5 Flash | 61.392 | 30.337 | 62.922 | 73.283 | 70.648 | 57.294 | 73.286 | 63.362 |
17 | GLM 4.5 | 60.921 | 35.690 | 59.760 | 69.110 | 72.426 | 60.456 | 71.370 | 56.143 |
18 | DeepSeek R1 | 60.782 | 28.553 | 75.364 | 72.113 | 71.488 | 52.983 | 67.783 | 62.257 |
19 | Qwen 3 235B A22B Thinking | 60.487 | 17.845 | 65.784 | 71.587 | 77.895 | 57.466 | 69.657 | 62.839 |
20 | Qwen 3 235B A22B Instruct 2507 | 60.169 | 21.414 | 65.784 | 66.592 | 67.232 | 62.866 | 68.853 | 70.099 |
21 | Grok 3 Mini Beta (High) | 59.881 | 30.337 | 53.941 | 66.322 | 69.878 | 58.058 | 66.778 | 70.692 |
22 | Gemini 2.5 Flash Preview | 59.482 | 28.553 | 59.760 | 67.062 | 70.155 | 59.460 | 71.178 | 59.267 |
23 | Claude 4 Sonnet | 59.292 | 30.337 | 77.534 | 66.494 | 68.558 | 66.481 | 66.812 | 44.223 |
24 | Qwen 3 32B | 59.216 | 14.276 | 63.615 | 70.746 | 75.619 | 53.499 | 69.577 | 67.023 |
25 | Kimi K2 Instruct | 58.610 | 24.983 | 71.117 | 64.411 | 73.251 | 62.187 | 64.764 | 50.781 |
26 | Qwen 3 Coder 480B A35B Instruct | 57.812 | 41.044 | 72.501 | 65.683 | 65.841 | 61.874 | 58.574 | 44.013 |
27 | Claude 3.7 Sonnet | 56.644 | 37.475 | 73.587 | 62.046 | 67.911 | 62.992 | 56.595 | 39.594 |
28 | Qwen 3 30B A3B | 55.341 | 16.061 | 47.018 | 68.800 | 73.891 | 53.267 | 66.416 | 57.462 |
29 | GLM 4.5 Air | 55.195 | 16.061 | 57.196 | 67.766 | 69.944 | 43.315 | 68.884 | 63.138 |
30 | GPT-4.5 Preview | 54.953 | 23.199 | 75.364 | 61.407 | 64.219 | 63.265 | 59.226 | 43.858 |
31 | Gemini 2.5 Flash Lite Preview (Thinking) | 53.159 | 7.138 | 58.674 | 67.015 | 74.725 | 51.017 | 61.691 | 51.121 |
32 | Grok 3 Beta | 52.917 | 17.845 | 72.894 | 58.222 | 75.234 | 53.312 | 54.803 | 39.097 |
33 | DeepSeek V3.1 | 52.242 | 19.629 | 68.254 | 62.983 | 72.342 | 46.848 | 62.172 | 35.684 |
34 | GPT-4.1 | 51.923 | 17.845 | 72.501 | 67.676 | 68.421 | 53.717 | 54.107 | 35.807 |
35 | ChatGPT-4o | 50.466 | 17.845 | 76.748 | 67.865 | 63.885 | 49.388 | 48.278 | 39.343 |
36 | Claude 3.5 Sonnet | 49.049 | 23.199 | 73.193 | 58.321 | 61.513 | 54.939 | 44.353 | 34.809 |
37 | Qwen2.5 Max | 48.500 | 7.138 | 66.177 | 66.631 | 66.859 | 57.701 | 49.767 | 31.045 |
38 | Mistral Medium 3 | 47.116 | 19.629 | 60.938 | 58.386 | 63.403 | 44.578 | 51.975 | 33.837 |
39 | GPT-4.1 Mini | 47.108 | 10.707 | 71.416 | 61.045 | 62.427 | 37.392 | 51.032 | 43.378 |
40 | Llama 4 Maverick 17B 128E Instruct | 45.446 | 7.138 | 53.641 | 50.828 | 67.270 | 48.300 | 52.852 | 35.344 |
41 | Phi-4 Reasoning Plus | 44.732 | 5.354 | 59.966 | 54.352 | 64.955 | 29.470 | 53.944 | 46.626 |
42 | DeepSeek R1 Distill Llama 70B | 44.566 | 7.138 | 46.139 | 61.803 | 62.102 | 36.210 | 50.716 | 48.282 |
43 | GPT-4o | 43.517 | 12.492 | 68.646 | 63.805 | 57.647 | 44.349 | 36.100 | 32.007 |
44 | Gemini 2.0 Flash Lite | 43.079 | 5.354 | 58.768 | 66.827 | 68.037 | 33.537 | 47.874 | 25.985 |
45 | Command A | 42.209 | 8.922 | 53.735 | 48.160 | 73.620 | 37.328 | 39.749 | 29.299 |
46 | Hunyuan Turbos | 41.092 | 3.569 | 49.880 | 48.203 | 67.605 | 33.920 | 50.073 | 30.855 |
47 | Gemma 3 27B | 40.195 | 7.138 | 48.496 | 38.902 | 66.474 | 40.789 | 45.365 | 27.775 |
48 | Mistral Large | 39.777 | 1.784 | 62.323 | 54.269 | 60.323 | 40.829 | 37.114 | 27.296 |
49 | Qwen2.5 72B Instruct Turbo | 39.206 | 3.569 | 56.785 | 51.742 | 57.155 | 36.503 | 45.256 | 27.485 |
50 | Mistral Small | 38.119 | 12.492 | 49.189 | 53.349 | 56.529 | 34.461 | 33.536 | 29.906 |
51 | DeepSeek R1 Distill Qwen 32B | 38.043 | 5.354 | 46.532 | 50.546 | 49.431 | 29.806 | 52.060 | 35.777 |
52 | GPT-4.1 Nano | 36.338 | 7.138 | 63.314 | 44.714 | 51.123 | 29.355 | 36.876 | 28.697 |
53 | Claude 3.5 Haiku | 36.283 | 7.138 | 52.650 | 54.144 | 54.963 | 38.917 | 30.254 | 21.094 |
54 | Gemma 3 12B | 35.066 | 1.784 | 41.779 | 31.679 | 65.508 | 30.915 | 41.636 | 23.118 |
55 | Command R Plus | 28.077 | 1.784 | 26.868 | 46.999 | 51.158 | 30.622 | 19.872 | 17.444 |
56 | Gemma 3n E4B IT | 27.139 | 1.784 | 31.208 | 16.253 | 57.497 | 25.122 | 27.990 | 17.728 |
57 | Command R | 25.600 | 1.784 | 25.876 | 38.054 | 49.437 | 27.650 | 16.115 | 16.579 |
58 | Gemma 3 4B | 23.385 | 0.000 | 15.511 | 17.818 | 56.508 | 14.927 | 27.432 | 15.977 |
59 | Gemma 3n E2B IT | 21.502 | 1.784 | 16.296 | 13.375 | 51.005 | 15.083 | 22.793 | 15.898 |