AI Model Directory

Compare 9 AI models by benchmarks, pricing, and capabilities.

9 models · 6 new this month · 3 open source

#1

GPT-5

NEW HOT

OpenAI · Released 2025-05-20

Closed API

OpenAI's most capable model with expert-level reasoning across all domains.

Text · Code · Vision · Audio · Reasoning · Agents

MMLU 92.3 · HumanEval 97.1 · MATH 91.5 · GPQA 88

1M tokens context · ~1.5T parameters (est.) · $2.50/1M input tokens

#2

Claude 4 Opus

NEW HOT

Anthropic · Released 2025-05-15

Closed API

Anthropic's flagship model, ideal for complex analysis and agentic tasks.

Text · Code · Vision · Reasoning · Agents

MMLU 90.1 · HumanEval 95.2 · MATH 89.3 · GPQA 86.5

500K tokens context · ~200B parameters (est.) · $15.00/1M input tokens

#3

Gemini Ultra 2

NEW HOT

Google DeepMind · Released 2025-05-10

Closed API

Google's most capable multimodal model with 2M token context and native video.

Text · Code · Vision · Video · Audio · Reasoning

MMLU 91.8 · HumanEval 94.7 · MATH 90.2 · GPQA 87.1

2M tokens context · ~1T parameters (est.) · $3.50/1M input tokens

#4

Llama 4 Scout

NEW HOT

Meta AI · Released 2025-05-05

Open Source API

Meta's largest open-weight model, fully free for commercial use.

Text · Code · Vision · Reasoning

MMLU 85.4 · HumanEval 88.9 · MATH 82.1 · GPQA 74.3

128K tokens context · 109B parameters · Free (input)

#5

GPT-4o

OpenAI · Released 2024-05-13

Closed API

OpenAI's flagship multimodal model, the most widely used AI in production.

Text · Code · Vision · Audio

MMLU 88.7 · HumanEval 90.2 · MATH 76.6 · GPQA 53.6

128K tokens context · ~200B parameters (est.) · $2.50/1M input tokens

#6

Claude 3.5 Sonnet

Anthropic · Released 2024-10-22

Closed API

Anthropic's best cost-performance model, excelling at coding and reasoning.

Text · Code · Vision · Agents

MMLU 88.3 · HumanEval 93.7 · MATH 78.3 · GPQA 65

200K tokens context · ~70B parameters (est.) · $3.00/1M input tokens

#7

DeepSeek R2

NEW HOT

DeepSeek · Released 2025-04-28

Open Source API

DeepSeek's reasoning powerhouse: GPT-4 quality at a fraction of the cost.

Text · Code · Math · Reasoning

MMLU 87.9 · HumanEval 92.1 · MATH 91 · GPQA 71.5

128K tokens context · 671B parameters (MoE) · $0.14/1M input tokens

#8

Mistral Large 3

NEW

Mistral AI · Released 2025-04-15

Closed API

Europe's frontier model, built for enterprise with strong multilingual support.

Text · Code · Vision · Function Calling

MMLU 84 · HumanEval 89.2 · MATH 80.1 · GPQA 59.3

128K tokens context · 123B parameters · $2.00/1M input tokens

#9

Qwen 2.5 72B

Alibaba Cloud · Released 2024-09-19

Open Source API

Alibaba's open-weight model, top among open-source models on multiple benchmarks.

Text · Code · Math · Reasoning

MMLU 86.1 · HumanEval 86.7 · MATH 85.4 · GPQA 56.4

128K tokens context · 72B parameters · Free (input)

Want detailed model comparisons?

Visit our Benchmark Center for side-by-side comparisons across all key metrics.
