Alibaba's Qwen team has introduced FlashQLA, a high-performance linear attention kernel designed to accelerate AI processing on personal devices. Released on April 29, FlashQLA is built on TileLang and reportedly delivers a 2–3 times faster forward pass and a roughly 2 times faster backward pass. The kernel incorporates gate-driven intra-card computation and hardware-friendly algebraic optimizations, though specific technical details and limitations remain undisclosed.
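For readers unfamiliar with the underlying idea, the appeal of linear attention is that it reorders the attention computation so cost scales linearly rather than quadratically with sequence length. The sketch below is a generic, illustrative NumPy implementation of (non-causal, ungated) linear attention, not FlashQLA's actual kernel; the feature map and function names are assumptions for demonstration only.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a common positive feature map used in linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    # Reassociated form: q @ (k^T v), costing O(n * d^2)
    # instead of the O(n^2 * d) of standard softmax attention.
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v                 # (d, d_v) summary of all keys/values
    z = q @ k.sum(axis=0)        # per-query normalizer
    return (q @ kv) / z[:, None]

def naive_attention(q, k, v):
    # Mathematically identical O(n^2) reference for comparison.
    q, k = feature_map(q), feature_map(k)
    scores = q @ k.T
    return (scores @ v) / scores.sum(axis=1, keepdims=True)
```

Both functions compute the same result; the linear form simply exploits associativity of matrix multiplication, which is the kind of algebraic restructuring that kernels in this family build on.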