GitHub Trending (source: GitHub Trending Daily All)

THUDM/slime

By GitHub Trending Daily All

February 15, 2026

**THUDM/slime**

slime is an LLM post-training framework for RL scaling. It provides two core capabilities:

- High-Performance Training: supports efficient training in various modes by connecting Megatron with SGLang.
- Flexible Data Generation: enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

slime is the RL framework behind GLM-4.7, GLM-4.6, and GLM-4.5. Apart from models from Z.ai, it also supports the following models:

- Qwen3 series (Qwen3Next, Qwen3MoE, Qwen3), Qwen2.5 series
- DeepSeek V3 series (DeepSeek V3, V3.1, DeepSeek R1)
- Llama 3

**Blogs**

- Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling
- Our ideas on agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL
- v0.1.0 release note: v0.1.0: Redefining High-Performance RL Training Frameworks

**Table of Contents**

- Architecture Overview
- Quick Start
- Projects Built with slime
- Arguments Walkthrough
- Developer Guide
- FAQ & Acknowledgements

**Architecture Overview**

Module descriptions:

- training (Megatron): Responsible for the main training process. It reads data from the Data Buffer and synchronizes parameters to the rollout module after training.
- rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer...
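The data flow in the module descriptions above (rollout writes samples to the Data Buffer, training reads them and then pushes updated parameters back to rollout) can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions: `DataBuffer`, `ToyRollout`, `ToyTrainer`, `Sample`, and `run` are hypothetical names invented for illustration and are not slime's API; the "reward" is a placeholder (prompt length), and the "parameters" are reduced to a version counter.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


@dataclass
class DataBuffer:
    """Hypothetical stand-in for the Data Buffer: a simple FIFO queue."""
    _queue: deque = field(default_factory=deque)

    def put(self, samples):
        self._queue.extend(samples)

    def get(self, n):
        # Return up to n of the oldest samples.
        n = min(n, len(self._queue))
        return [self._queue.popleft() for _ in range(n)]


class ToyRollout:
    """Hypothetical stand-in for the rollout module (SGLang + router in slime)."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompts):
        # Placeholder generation and reward: tag the response with the current
        # weights version and use the prompt length as a fake reward.
        return [
            Sample(p, f"{p}-v{self.weights_version}", float(len(p)))
            for p in prompts
        ]

    def update_weights(self, version):
        self.weights_version = version


class ToyTrainer:
    """Hypothetical stand-in for the training module (Megatron in slime)."""

    def __init__(self):
        self.version = 0

    def train_step(self, batch):
        # A real trainer would compute an RL update here; this sketch only
        # bumps a version counter and reports the mean reward of the batch.
        self.version += 1
        return sum(s.reward for s in batch) / len(batch)


def run(prompts, num_iterations):
    buffer, rollout, trainer = DataBuffer(), ToyRollout(), ToyTrainer()
    mean_rewards = []
    for _ in range(num_iterations):
        buffer.put(rollout.generate(prompts))      # rollout -> Data Buffer
        batch = buffer.get(len(prompts))           # Data Buffer -> training
        mean_rewards.append(trainer.train_step(batch))
        rollout.update_weights(trainer.version)    # training -> rollout sync
    return mean_rewards, rollout.weights_version
```

Calling `run(["ab", "abcd"], 3)` walks through three generate, train, and sync rounds. The sketch runs the steps strictly in sequence; it does not attempt to model the asynchronous or decoupled modes mentioned in the blog titles.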

---

**[devsupporter commentary]**

This article covers a recent development trend surfaced by GitHub Trending Daily All. To learn more about the related tools and technologies, see the original link.
