DeepSeek has launched its V4 API models, V4-Pro and V4-Flash, introducing significant price reductions and an eightfold increase in context length.

The V4-Flash model replaces the previous V3.2 version without a price hike. Cached input is offered at 0.2 RMB per million tokens, uncached input drops from 2 RMB to 1 RMB, and output drops from 3 RMB to 2 RMB. The context length has expanded from 128K to 1M tokens.

The V4-Pro model, a new premium tier, is priced at 1 RMB for cached input, 12 RMB for uncached input, and 24 RMB for output per million tokens, reflecting a higher cost due to limited high-end compute capacity. However, prices are expected to decrease following the release of Ascend 950 super nodes later this year.

Both models support non-reasoning and reasoning modes, with the latter offering high and max intensity levels. The legacy models, deepseek-chat and deepseek-reasoner, will be discontinued by July 24, 2026.
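To make the pricing concrete, here is a minimal sketch of a per-request cost calculator using the per-million-token rates quoted above. The function name and rate tuples are illustrative, not part of any official DeepSeek SDK.

```python
# Per-million-token rates in RMB, as quoted in the announcement:
# (cached input, uncached input, output)
V4_FLASH = (0.2, 1.0, 2.0)
V4_PRO = (1.0, 12.0, 24.0)

def request_cost_rmb(cached_in: int, uncached_in: int, output: int,
                     rates: tuple[float, float, float]) -> float:
    """Estimate the RMB cost of one request from its token counts.

    Token counts are raw tokens; rates are RMB per million tokens.
    """
    cached_rate, uncached_rate, output_rate = rates
    return (cached_in * cached_rate
            + uncached_in * uncached_rate
            + output * output_rate) / 1_000_000

# A request with 1M uncached input tokens and 1M output tokens:
# costs 3 RMB on V4-Flash versus 36 RMB on V4-Pro.
print(request_cost_rmb(0, 1_000_000, 1_000_000, V4_FLASH))  # 3.0
print(request_cost_rmb(0, 1_000_000, 1_000_000, V4_PRO))    # 36.0
```

At these rates, an identical workload costs 12x more on V4-Pro than on V4-Flash, which illustrates why the Pro tier's pricing is tied to scarce high-end compute.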