Tether Launches BitNet LoRA Framework for Consumer Devices

Tether has introduced a cross-platform BitNet LoRA fine-tuning framework within QVAC Fabric, designed to optimize training and inference for Microsoft BitNet (1-bit LLM) models. The framework makes it possible to train and fine-tune billion-parameter models on consumer devices such as laptops, consumer-grade GPUs, and smartphones. Notably, BitNet models can be fine-tuned on mobile GPUs, including Adreno, Mali, and Apple Bionic. The framework supports heterogeneous hardware, including Intel, AMD, and Apple Silicon, and is the first to enable 1-bit LLM LoRA fine-tuning on non-NVIDIA devices. In performance tests, BitNet inference on mobile GPUs ran 2 to 11 times faster than on CPUs, and VRAM usage fell by up to 77.8% compared with traditional 16-bit models. Tether says the technology could reduce dependence on high-end compute and cloud infrastructure, promoting decentralized and local AI training.
