# DeepSeek V4 Series
DeepSeek V4 is a large language model family designed for strong reasoning,
long-context understanding, and cost-efficient inference.
The series includes two models: DeepSeek V4 Flash for fast, low-cost workloads,
and DeepSeek V4 Pro for maximum capability. Both models support thinking and non-thinking modes,
tool calling, structured JSON output, and context windows up to 1M tokens.
- 1M token context length
- Supports thinking & non-thinking modes
- Tool calls and JSON structured output
- Chat prefix completion (beta)
- FIM (fill-in-the-middle) completion (non-thinking mode only)
- Optimized for coding, reasoning, and agent workflows
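The tool-calling, JSON-output, and thinking-mode features above are the kind typically exposed through an OpenAI-compatible chat API. The sketch below assembles a request payload exercising all three; the model id `deepseek-v4-flash`, the `thinking` field, and the `get_weather` tool are illustrative assumptions, not confirmed parts of the DeepSeek V4 API.

```python
import json

def build_request(prompt: str, thinking: bool = False) -> dict:
    """Assemble a chat request that toggles thinking mode, asks for
    JSON output, and registers one callable tool.
    Field names beyond the common OpenAI-compatible shape are assumed."""
    return {
        "model": "deepseek-v4-flash",  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        # Assumed switch between thinking and non-thinking modes.
        "thinking": {"type": "enabled" if thinking else "disabled"},
        # Request structured JSON output from the model.
        "response_format": {"type": "json_object"},
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical example tool
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_request("What's the weather in Oslo?", thinking=True)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's chat-completions endpoint with an API key; only the structure is shown here.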
## Pricing
| Price (USD per 1M tokens) | DeepSeek V4 Flash | DeepSeek V4 Pro |
|---|---|---|
| Input (cache hit) | $0.0028 | $0.003625 / $0.0145 |
| Input (cache miss) | $0.14 | $0.435 / $1.74 |
| Output | $0.28 | $0.87 / $3.48 |
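Since the table quotes prices per million tokens, the cost of a single request follows from the token counts. A minimal sketch using the Flash prices above (the cache hit/miss split is supplied by the caller):

```python
# DeepSeek V4 Flash prices from the table above, USD per 1M tokens.
FLASH_PRICES = {
    "input_cache_hit": 0.0028,
    "input_cache_miss": 0.14,
    "output": 0.28,
}

def estimate_cost(hit_tokens: int, miss_tokens: int, output_tokens: int,
                  prices: dict = FLASH_PRICES) -> float:
    """Return the USD cost of one request at the given token counts."""
    per_million = 1_000_000  # prices are quoted per million tokens
    return (hit_tokens * prices["input_cache_hit"]
            + miss_tokens * prices["input_cache_miss"]
            + output_tokens * prices["output"]) / per_million

# Example: 100k cached input, 20k uncached input, 5k output tokens.
cost = estimate_cost(100_000, 20_000, 5_000)
print(f"${cost:.6f}")  # → $0.004480
```

Note how the cache-hit rate dominates the bill: the same 100k input tokens cost 50× more on a cache miss than on a hit.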
Overall, DeepSeek V4 pairs low per-token cost with strong performance,
making it one of the more cost-effective frontier model families available.