PrismML has launched the Ternary Bonsai series of language models, built on a 1.58-bit ternary weight technique that cuts GPU memory usage to roughly one-ninth that of a 16-bit model while maintaining strong performance. The series spans 8B, 4B, and 1.7B parameter models, is open-sourced on Hugging Face, and runs natively on Apple devices. The 1.58-bit scheme constrains each neural network weight to one of three values, {-1, 0, +1}; according to PrismML, pruning redundant connections this way also enhances reasoning capabilities.
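The article does not describe PrismML's exact quantization recipe, but published 1.58-bit work typically uses absmean scaling: divide the weights by their mean absolute value, then round each one to -1, 0, or +1. The sketch below illustrates that idea; the function names and the per-tensor (rather than per-channel) scale are assumptions for illustration.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to {-1, 0, +1} plus one float scale.

    Absmean scaling is used here as a representative 1.58-bit scheme;
    PrismML's actual method is not specified in the article.
    """
    scale = np.abs(w).mean() + eps            # per-tensor scaling factor
    q = np.clip(np.round(w / scale), -1, 1)   # snap each weight to -1, 0, or +1
    return q.astype(np.int8), scale

def ternary_matmul(x: np.ndarray, q: np.ndarray, scale: float) -> np.ndarray:
    """With ternary weights, the matmul reduces to additions and subtractions,
    followed by one multiplication to reapply the scale."""
    return (x @ q.astype(x.dtype)) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternary_quantize(w)
y = ternary_matmul(w, q, s)   # approximate reconstruction of w @ w
```

Because every quantized weight is -1, 0, or +1, the inner products need no weight multiplications at all, which is where the energy and speed gains on mobile hardware come from.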
The Ternary Bonsai 8B model, with a weight file of just 1.75 GB, achieves an average benchmark score of 75.5, outperforming its 1-bit predecessor and comparably sized dense models in intelligence density. It also delivers better energy efficiency and inference speed, reaching 27 tokens per second on the iPhone 17 Pro Max with 3 to 4 times the energy efficiency. The models are distributed under the Apache 2.0 license, giving developers high-performance AI options for edge devices.
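The headline numbers can be sanity-checked with back-of-the-envelope arithmetic: encoding one of three values takes log2(3) ≈ 1.585 bits per weight, so 8B ternary weights alone come to about 1.58 GB. The reported 1.75 GB file plausibly reflects some layers (e.g. embeddings) kept at higher precision, and against a 16 GB fp16 baseline that file size is consistent with the one-ninth memory claim. This is an illustrative calculation, not a breakdown published by PrismML.

```python
import math

params = 8e9                        # Ternary Bonsai 8B parameter count
bits_per_weight = math.log2(3)      # ~1.585 bits to encode {-1, 0, +1}

ternary_gb = params * bits_per_weight / 8 / 1e9   # pure ternary payload
fp16_gb = params * 16 / 8 / 1e9                   # 16-bit baseline

print(round(ternary_gb, 2))       # 1.58 GB for the ternary weights alone
print(round(fp16_gb, 1))          # 16.0 GB at 16 bits per weight
print(round(fp16_gb / 1.75, 1))   # 9.1x, using the reported 1.75 GB file
```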
PrismML Unveils Ternary Bonsai Models with a 9x Smaller Memory Footprint
Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.
