
Zhipu.AI General Language Models advancements
The AI landscape is witnessing transformative changes, highlighted by Zhipu.AI’s open-sourcing of its advanced General Language Models (GLM) and Kuaishou’s innovative approach to reinforcement learning with its SRPO framework. These developments not only demonstrate technological prowess but also set the stage for enhanced AI capabilities across various domains.
Zhipu.AI GLM models and IPO ambitions
Zhipu.AI, based in Beijing, has announced the open-sourcing of its next-generation GLM models, including the GLM-4 series and the GLM-Z1 inference models. This strategic move positions Zhipu.AI as a competitive player in the global AI market and underscores its ambition for a potential initial public offering (IPO).
The GLM-Z1 model reportedly achieves inference speeds up to eight times faster than DeepSeek-R1 by utilizing optimized parameters and advanced sampling techniques. It can process 200 tokens per second on consumer-grade GPUs, far outpacing the typical human reading speed of approximately 300 words per minute (Wikipedia, 2023).
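As a quick sanity check on that comparison, the arithmetic below converts the reported token throughput into an approximate words-per-minute figure; the 0.75 words-per-token ratio is a common rule of thumb for English text, not a number from the announcement.

```python
# Back-of-the-envelope comparison of the reported GLM-Z1 decoding speed
# with a typical human reading speed. The words-per-token ratio is a
# rough English-text heuristic (assumption), not a figure from the source.

tokens_per_second = 200   # reported GLM-Z1 throughput on consumer GPUs
words_per_token = 0.75    # rough English average (assumption)
human_wpm = 300           # approximate human reading speed

model_wpm = tokens_per_second * words_per_token * 60
print(f"Model output: ~{model_wpm:,.0f} words/minute")
print(f"Speed-up over human reading: ~{model_wpm / human_wpm:.0f}x")
# -> ~9,000 words/minute, roughly 30x faster than a human reader
```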

Zhipu.AI autonomous AI agents
Zhipu.AI’s focus on the GLM-Z1-Rumination model further illustrates its commitment to creating more autonomous AI agents. The model can actively search the internet, employ tools, and conduct in-depth analyses, an evolution from traditional reactive models that simply answer and stop (sketched below).
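To illustrate that distinction, here is a minimal, hypothetical agent loop in Python: the model either returns a final answer or requests a tool call, and the loop feeds tool results back in. The `call_model` function and the tool registry are placeholders for the sake of the sketch, not Zhipu.AI’s actual API.

```python
def call_model(messages):
    """Placeholder for a chat-completion call to a GLM-style model.

    Assumed to return either {"content": ...} for a final answer
    or {"tool": name, "arguments": ...} to request a tool call.
    """
    raise NotImplementedError

TOOLS = {
    "web_search": lambda query: f"<search results for {query!r}>",  # stub tool
}

def run_agent(task, max_steps=5):
    """Loop between the model and external tools until a final answer."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)          # model chooses: answer or act
        if reply.get("tool") in TOOLS:
            result = TOOLS[reply["tool"]](reply["arguments"])
            messages.append({"role": "tool", "content": result})
        else:
            return reply.get("content")       # final, in-depth answer
    return None                               # step budget exhausted
```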
The open-source initiative also includes smaller, 9-billion-parameter (9B) versions of GLM-4 and GLM-Z1, catering to resource-constrained environments while maintaining strong performance in mathematical reasoning and general tasks.
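For readers who want to try the smaller checkpoints, the sketch below shows one plausible way to load a 9B model with Hugging Face transformers; the repository ID is an assumption and should be checked against Zhipu.AI’s official release.

```python
# Minimal sketch of loading a 9B GLM checkpoint with Hugging Face
# transformers. The repo ID below is illustrative (assumption); verify it
# against Zhipu.AI's release. bfloat16 keeps ~9B weights near 18 GB,
# within reach of a single high-end consumer GPU.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THUDM/GLM-Z1-9B-0414"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Prove that the square root of 2 is irrational."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```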
SRPO reinforcement learning framework
Meanwhile, the Kwaipilot team at Kuaishou has introduced a novel reinforcement learning framework, Two-Staged history-Resampling Policy Optimization (SRPO). The approach addresses challenges faced by traditional GRPO training, which often struggles with performance bottlenecks and inefficient sample utilization.
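To make the sample-utilization point concrete, the toy sketch below combines the two ideas the framework’s name suggests: GRPO-style group-relative advantages, plus a history-based resampling step that drops prompts whose entire rollout group succeeds or fails uniformly and therefore carries no learning signal. This is an illustrative reconstruction under those assumptions, not Kuaishou’s implementation.

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each rollout's reward against the
    mean/std of its own group (all rollouts for the same prompt)."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

def resample_history(prompt_rewards):
    """History resampling (illustrative): keep only prompts with mixed
    rollout outcomes, since all-correct or all-incorrect groups produce
    zero group-relative advantage and waste training samples."""
    return {
        prompt: rewards
        for prompt, rewards in prompt_rewards.items()
        if 0.0 < np.mean(rewards) < 1.0
    }

# Example: prompt "b" is solved by every rollout, so it is filtered out.
history = {"a": [1, 0, 0, 1], "b": [1, 1, 1, 1], "c": [0, 0, 1, 0]}
kept = resample_history(history)
print(list(kept))                            # ['a', 'c']
print(group_relative_advantages(kept["a"]))  # [ 1. -1. -1.  1.]
```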
The SRPO framework has achieved remarkable results, such as surpassing DeepSeek-R1-Zero-level performance in both mathematical reasoning and coding tasks while requiring only one-tenth of the training steps necessary for traditional methods.