The LightSeek Foundation has introduced the Shepherd Model Gateway (SMG) to tackle CPU bottlenecks in large language model (LLM) services. Launched on May 1, the SMG aims to optimize production efficiency by offloading non-GPU tasks to a Rust-based gateway. This approach minimizes CPU blocking during the inference process by establishing minimal gRPC boundaries, thereby enhancing overall service performance.