Sapient Intelligence has open-sourced HRM-Text, a 1-billion-parameter text generation model based on the Hierarchical Reasoning Model (HRM) architecture. The model cuts pre-training costs by a factor of 130 to 600 compared with traditional models and is trained on only 40 billion structured tokens. It can be trained from scratch in about 46 hours on two 8-GPU H100 servers, at a cost of approximately $1,472.
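The training figures quoted above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers reported in this article; the per-GPU-hour rate it derives is an implied value, not a price stated by Sapient, and the function and variable names are illustrative.

```python
# Figures as reported in the article.
SERVERS = 2
GPUS_PER_SERVER = 8
TRAINING_HOURS = 46
REPORTED_COST_USD = 1472


def total_gpu_hours(servers: int, gpus_per_server: int, hours: float) -> float:
    """Total GPU-hours consumed when every GPU runs for the full duration."""
    return servers * gpus_per_server * hours


def implied_hourly_rate(cost_usd: float, gpu_hours: float) -> float:
    """Price per GPU-hour implied by a total cost (a derived value, not a quoted one)."""
    return cost_usd / gpu_hours


gpu_hours = total_gpu_hours(SERVERS, GPUS_PER_SERVER, TRAINING_HOURS)
rate = implied_hourly_rate(REPORTED_COST_USD, gpu_hours)

print(f"GPU-hours: {gpu_hours}")                  # 16 GPUs x 46 h = 736
print(f"Implied rate: ${rate:.2f} per GPU-hour")  # 1472 / 736 = 2.00
```

In other words, the quoted cost is consistent with 16 H100 GPUs running for 46 hours at roughly $2 per GPU-hour.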
HRM-Text features a dual-timescale recurrent design: two sets of Transformer modules alternate on the same input batch, which allows computational depth to be extended dynamically. This design makes it cheap to validate model theories that high computational costs previously put out of reach. The open-source release includes the complete engineering framework, although the model's weights are pre-trained and unaligned, which limits its use to prefix continuation tasks.
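To make the alternating, dual-timescale idea concrete, here is a minimal Python sketch. It is not Sapient's code: the function names, the scalar toy "modules", and the choice to run several fast steps for each slow step are all assumptions made for illustration, standing in for the two sets of Transformer modules described above.

```python
from typing import Callable, List


def dual_timescale_forward(
    x: float,
    fast_module: Callable[[float, float, float], float],
    slow_module: Callable[[float, float], float],
    n_cycles: int,
    fast_steps: int,
) -> float:
    """Alternate two modules on the same input (hypothetical sketch).

    The fast module updates its state several times per cycle; the slow
    module updates once per cycle. Raising n_cycles adds effective depth
    without adding any new parameters.
    """
    z_fast, z_slow = 0.0, 0.0
    for _ in range(n_cycles):
        for _ in range(fast_steps):
            # Fast timescale: sees the input and the current slow state.
            z_fast = fast_module(z_fast, z_slow, x)
        # Slow timescale: updates once from the result of the fast steps.
        z_slow = slow_module(z_slow, z_fast)
    return z_slow


# Toy stand-ins for the two Transformer stacks (assumed, scalar-valued).
calls: List[str] = []


def toy_fast(z_fast: float, z_slow: float, x: float) -> float:
    calls.append("fast")
    return 0.5 * (z_fast + z_slow) + x


def toy_slow(z_slow: float, z_fast: float) -> float:
    calls.append("slow")
    return 0.5 * (z_slow + z_fast)


output = dual_timescale_forward(1.0, toy_fast, toy_slow, n_cycles=2, fast_steps=3)
print(calls)  # three fast calls, then one slow call, repeated for each cycle
```

With `n_cycles=2` and `fast_steps=3`, the two modules are applied eight times in total while the number of distinct modules stays at two, which is one way to read the "dynamic computational depth extension" the article mentions.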
Sapient Open-Sources Cost-Effective 1B-Parameter HRM-Text Model
Disclaimer: The content provided on Phemex News is for informational purposes only. We do not guarantee the quality, accuracy, or completeness of the information sourced from third-party articles. The content on this page does not constitute financial or investment advice. We strongly encourage you to conduct your own research and consult with a qualified financial advisor before making any investment decisions.
