NVIDIA has explained why Together Compute chose the Blackwell architecture to run its DeepSeek-V4 model. According to NVIDIA, Blackwell is optimized for the main bottlenecks of long-context inference, such as KV-cache pressure during the decoding phase and mixture-of-experts (MoE) weight bandwidth during the prefill phase. The announcement highlighted the capabilities of a single NVIDIA HGX B200 system but included no performance figures or comparative data.
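
To give a rough sense of what KV-cache pressure means at long context, the sketch below estimates cache size with the generic formula for standard multi-head or grouped-query attention. Everything in it is an assumption for illustration: the function name `kv_cache_bytes` is hypothetical, the layer and head counts are made-up placeholders rather than DeepSeek-V4's actual configuration, and models that compress the cache (for example with latent attention) would need a different formula.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size, bytes_per_elem=2):
    """Estimate KV-cache size for standard multi-head / grouped-query attention.

    Generic textbook formula, not any specific model's or vendor's method.
    """
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical dimensions chosen only for illustration (NOT DeepSeek-V4's).
cache = kv_cache_bytes(
    num_layers=60,
    num_kv_heads=8,
    head_dim=128,
    seq_len=128_000,   # long-context request
    batch_size=1,
    bytes_per_elem=2,  # 16-bit cache entries
)
print(f"{cache / 2**30:.1f} GiB per sequence")
```

Under these placeholder numbers the cache grows linearly with context length and batch size, which is why long-context decoding tends to strain GPU memory capacity and bandwidth.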