Mercury
Detailed information and pricing for the Mercury model by inception.
Context Tokens: 128,000
Prompt Price: $0.25 / 1M tokens
Output Price: $1.00 / 1M tokens
Feature Support: 6/16
Model Overview
Mercury is the first diffusion large language model (dLLM). Using a breakthrough discrete-diffusion approach, it runs 5-10x faster than even speed-optimized models such as GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed lets developers build responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in Inception's blog post.
Basic Information
Developer: inception
Model Series: Other
Release Date: 2025-06-26
Context Length: 128,000 tokens
Max Completion Tokens: 16,384 tokens
Variant: standard
Pricing Information
Prompt Tokens: $0.25 / 1M tokens
Completion Tokens: $1.00 / 1M tokens
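The listed rates make per-request cost a simple calculation. A minimal sketch, assuming only the prices above (the function name and example token counts are illustrative, not from an official SDK):

```python
PROMPT_PRICE_PER_M = 0.25      # USD per 1M prompt tokens (listed rate)
COMPLETION_PRICE_PER_M = 1.00  # USD per 1M completion tokens (listed rate)

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request at Mercury's listed rates."""
    return (prompt_tokens * PROMPT_PRICE_PER_M
            + completion_tokens * COMPLETION_PRICE_PER_M) / 1_000_000

# e.g. a 10,000-token prompt with a 2,000-token completion:
print(f"${request_cost(10_000, 2_000):.4f}")  # → $0.0045
```

Note that completion tokens cost 4x more than prompt tokens, so long generations dominate the bill even with large prompts.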
Supported Features
Supported (6)
Top K
Frequency Penalty
Presence Penalty
Response Format
Tool Usage
Structured Outputs
Unsupported (10)
Image Input
Seed
Repetition Penalty
Min P
Logit Bias
Logprobs
Top Logprobs
Reasoning
Web Search Options
Top A
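The supported features above map onto a familiar chat-completion request shape. Below is a sketch of a request body exercising tool usage, structured outputs, and the penalty parameters; the model identifier, field names, and tool schema are assumptions based on the common OpenAI-compatible format, not confirmed details of Inception's API:

```python
import json

# Hypothetical request body; consult Inception's API docs for exact fields.
payload = {
    "model": "mercury",                # assumed model identifier
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "frequency_penalty": 0.1,          # supported per the feature list
    "presence_penalty": 0.1,           # supported per the feature list
    "response_format": {"type": "json_object"},  # structured outputs
    "tools": [                         # tool usage
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool name
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "max_tokens": 16_384,              # model's max completion tokens
}

print(json.dumps(payload, indent=2)[:80])
```

Parameters in the unsupported list (seed, logit_bias, logprobs, min_p, and so on) should be omitted from requests, since the model does not honor them.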
Actual Usage Statistics
Rank: #161 out of 353 total models
Total Tokens (Last 30 Days): 1.3B
Daily Average Usage: 43.89M
Weekly Usage Change: 41%
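As a sanity check, the daily average follows from the 30-day total (the small gap from the listed 43.89M is presumably rounding in the 1.3B figure):

```python
# 1.3B tokens over 30 days implies roughly 43.3M tokens/day,
# consistent with the reported 43.89M daily average.
total_tokens = 1.3e9
daily_average = total_tokens / 30
print(f"{daily_average / 1e6:.1f}M tokens/day")  # → 43.3M tokens/day
```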