GitHub Trending
Source: GitHub Trending Daily All
THUDM/slime
By GitHub Trending Daily All, February 15, 2026
**THUDM/slime**
slime is an LLM post-training framework for RL scaling, providing two core capabilities:

- High-Performance Training: supports efficient training in various modes by connecting Megatron with SGLang.
- Flexible Data Generation: enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

slime is the RL framework behind GLM-4.7, GLM-4.6, and GLM-4.5. Apart from models from Z.ai, we also support the following models:

- Qwen3 series (Qwen3Next, Qwen3MoE, Qwen3) and Qwen2.5 series
- DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1)
- Llama 3

**Blogs**

- Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling
- Our ideas on agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL
- v0.1.0 release note: v0.1.0: Redefining High-Performance RL Training Frameworks

Table of Contents: Architecture Overview, Quick Start, Projects Built with slime, Arguments Walkthrough, Developer Guide, FAQ & Acknowledgements

**Architecture Overview**

Module descriptions:

- training (Megatron): responsible for the main training process; reads data from the Data Buffer and synchronizes parameters to the rollout module after training.
- rollout (SGLang + router): generates new data (including rewards/verifier outputs) and stores it in the Data Buffer...
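
The module descriptions above suggest a loop in which rollout fills a buffer, training drains it, and updated parameters flow back to rollout. The following is a minimal toy sketch of that data flow in plain Python, not slime's actual API: the names `Sample`, `DataBuffer`, `Rollout`, `Trainer`, and `run` are hypothetical, the "generation" and "reward" are placeholders, and a version counter stands in for model parameters.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float
    policy_version: int  # which parameter version produced this sample


class DataBuffer:
    """Hypothetical FIFO store sitting between rollout and training."""

    def __init__(self):
        self._samples = deque()

    def put(self, samples):
        self._samples.extend(samples)

    def get(self, n):
        n = min(n, len(self._samples))
        return [self._samples.popleft() for _ in range(n)]


class Rollout:
    """Stand-in for the rollout module (an inference engine in a real system)."""

    def __init__(self):
        self.weights_version = 0

    def update_weights(self, version):
        # Placeholder for parameter synchronization from the trainer.
        self.weights_version = version

    def generate(self, prompts):
        samples = []
        for prompt in prompts:
            response = prompt[::-1]  # toy "generation": reverse the prompt
            reward = 1.0 if len(response) % 2 == 0 else 0.0  # toy verifier
            samples.append(Sample(prompt, response, reward, self.weights_version))
        return samples


class Trainer:
    """Stand-in for the training module; only tracks a version counter."""

    def __init__(self):
        self.version = 0

    def train_step(self, batch):
        if not batch:
            raise ValueError("empty batch")
        # A real trainer would compute an RL update here.
        self.version += 1
        return sum(s.reward for s in batch) / len(batch)


def run(prompts, num_steps, batch_size):
    buffer, rollout, trainer = DataBuffer(), Rollout(), Trainer()
    history = []
    for _ in range(num_steps):
        buffer.put(rollout.generate(prompts))    # rollout -> Data Buffer
        batch = buffer.get(batch_size)           # Data Buffer -> training
        mean_reward = trainer.train_step(batch)
        rollout.update_weights(trainer.version)  # training -> rollout sync
        history.append((trainer.version, rollout.weights_version, mean_reward))
    return history
```

Calling `run(["ab", "abc"], num_steps=3, batch_size=2)` returns one tuple per step; in this synchronous sketch the trainer and rollout versions match after every step. An asynchronous design would let rollout keep generating while training runs, which is why each sample records the `policy_version` that produced it.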
---
**[devsupporter commentary]**
This article is a report on the latest development trends provided by GitHub Trending Daily All. To learn more about the related tools and technologies, please refer to the original link.