DeepSeek's fast, economical model. Handles both chat and reasoning (thinking) modes at very low cost.
Performance
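Time to first token (ms): not available; change vs prior 24h: not available
Total response time (ms): not available; change vs prior 24h: not available
Throughput (tok/s): not available; change vs prior 24h: not available
Inter-token latency (ms): not available; change vs prior 24h: not available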
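
The page does not say how these four metrics are calculated. As a rough illustration only, the sketch below derives them from a request start time and the arrival time of each streamed token. The function name `compute_metrics`, the `StreamMetrics` container, and the exact definitions (throughput as tokens divided by total response time, inter-token latency as the mean gap between consecutive tokens after the first) are assumptions made for this example, not the dashboard's documented method.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class StreamMetrics:
    ttft_ms: float            # time to first token, milliseconds
    total_ms: float           # total response time, milliseconds
    throughput_tok_s: float   # tokens per second over the whole response
    inter_token_ms: float     # mean gap between consecutive tokens, milliseconds


def compute_metrics(request_start: float, token_times: Sequence[float]) -> StreamMetrics:
    """Derive latency metrics from timestamps given in seconds.

    Hypothetical helper: the definitions used here are common conventions,
    not necessarily the ones behind the figures on this page.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")

    first, last = token_times[0], token_times[-1]
    n = len(token_times)
    total_s = last - request_start

    ttft_ms = (first - request_start) * 1000.0
    total_ms = total_s * 1000.0
    # Assumed definition: tokens received divided by total response time.
    throughput = n / total_s if total_s > 0 else 0.0
    # Assumed definition: average spacing between tokens after the first one.
    inter_token_ms = (last - first) / (n - 1) * 1000.0 if n > 1 else 0.0

    return StreamMetrics(ttft_ms, total_ms, throughput, inter_token_ms)


if __name__ == "__main__":
    # Five tokens arriving 0.25 s apart, the first one 0.5 s after the request.
    metrics = compute_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
    print(metrics)
```

Under these definitions, a slow first token raises time to first token and total response time but leaves inter-token latency unchanged, which is one reason to track the metrics separately.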