Alibaba's Qwen team has launched automatic implicit caching for its Qwen3.7-Max model on Alibaba Cloud's Bailian platform, cutting input costs by up to 80%. Developers get the savings without changing code or adding parameters: the system detects repeated context prefixes across requests and bills the matched input tokens at 20% of the standard rate, while unmatched tokens are still billed in full. Implicit caching is most useful for long-text and Agent workloads, where Qwen3.7-Max repeatedly processes large codebases or documents. The launch comes amid competitive pricing pressure, notably from DeepSeek V4-Pro, which recently cut its cache-hit billing to $0.003625 per million tokens. Qwen3.7-Max also offers an explicit caching mode, which costs even less but requires manual configuration.
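
To make the billing rule concrete, here is a minimal sketch of the arithmetic, assuming matched prefix tokens are billed at 20% of the standard input rate and all other input tokens at the full rate. The function names and the price of 1.00 per million tokens are hypothetical placeholders for illustration, not Qwen3.7-Max's published pricing or any Bailian API.

```python
def input_cost(total_tokens: int, cached_tokens: int,
               price_per_million: float, cached_rate: float = 0.20) -> float:
    """Estimate input cost when cached (prefix-matched) tokens are discounted.

    cached_rate is the fraction of the standard price charged for matched
    tokens (0.20 per the article); price_per_million is a placeholder value.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    # Matched tokens count at the discounted rate, the rest at full price.
    billable_tokens = uncached_tokens + cached_tokens * cached_rate
    return billable_tokens * price_per_million / 1_000_000


def savings_fraction(total_tokens: int, cached_tokens: int,
                     cached_rate: float = 0.20) -> float:
    """Fraction of the input bill saved relative to no caching."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    full = input_cost(total_tokens, 0, 1.0, cached_rate)
    discounted = input_cost(total_tokens, cached_tokens, 1.0, cached_rate)
    return 1 - discounted / full


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens, 90,000 of them a repeated prefix.
    print(input_cost(100_000, 90_000, price_per_million=1.00))
    print(savings_fraction(100_000, 90_000))
```

Under these assumptions, a request whose input is 90% repeated prefix saves about 72% on input, and the "up to 80%" headline figure is reached only when the entire input matches the cache.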