Sakana AI, in collaboration with NVIDIA, has launched TwELL, an open-source sparse data format with accompanying acceleration kernels that improve GPU efficiency by skipping computation that does not affect the output. The technique speeds up inference on the H100 by up to 30% and training by up to 24%, without compromising model accuracy.

TwELL targets the feedforward network (FFN) layers of large models, where over 80% of neurons remain inactive during text generation. It reorganizes the data involved into small blocks that GPUs can process efficiently, eliminating costly global memory operations. In tests on a 1.5-billion-parameter model, only about 2% of neurons needed to be computed, while accuracy was maintained across multiple tasks.

Because larger models tend to show even lower ratios of active neurons, the optimization could deliver still greater gains as models scale.
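The core observation behind this kind of speedup is simple: if a neuron's activation is zero for a given token, its column in the down-projection contributes nothing, so it can be skipped entirely. The sketch below is not TwELL's implementation (which relies on a custom block format and GPU kernels); it is a minimal NumPy illustration with assumed toy dimensions and a ReLU-style FFN, showing that dropping inactive neurons changes the result only at floating-point rounding level.

```python
# Minimal sketch of activation-sparsity skipping in an FFN layer.
# All names and dimensions (d_model, d_ff, ReLU gating) are illustrative
# assumptions, not details of TwELL itself.
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 1024, 4096                                     # toy FFN sizes
x  = rng.standard_normal(d_model)                              # one token's hidden state
W1 = rng.standard_normal((d_ff, d_model)) / np.sqrt(d_model)   # up-projection
W2 = rng.standard_normal((d_model, d_ff)) / np.sqrt(d_ff)      # down-projection

# Dense path: every neuron participates in the down-projection.
h = np.maximum(W1 @ x, 0.0)          # ReLU zeroes out many activations
y_dense = W2 @ h

# Sparse path: keep only neurons with non-zero activation and multiply
# just the corresponding columns of W2.
active = np.nonzero(h)[0]
y_sparse = W2[:, active] @ h[active]

print(f"active neurons: {active.size}/{d_ff} "
      f"({100 * active.size / d_ff:.1f}%)")
print("max abs difference vs dense:", np.max(np.abs(y_dense - y_sparse)))
```

With random weights roughly half of the neurons fire, so this toy shows only a modest saving; the reported numbers for trained models (over 80% inactive, and about 2% active on the 1.5B-parameter model tested) are what make the skipping worthwhile in practice, provided the sparse layout avoids the scattered global memory accesses that a naive index-based gather would incur.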