
Designing Hardware-Aware Algorithms with Kimi Linear: Kimi Delta Attention

By Melani Maheswaran
2026๋…„ 2์›” 13์ผ

**Introduction**

Moonshot AI has done it again. We were impressed with their release of Kimi-K2 and their post-training approach. Now, in addition to Kimi-K2-Thinking (which we encourage you to check out), they have also released Kimi Linear, a hybrid linear attention architecture that introduces a new attention mechanism, Kimi Delta Attention (KDA). The release features an open-source KDA kernel (written in Triton), vLLM implementations, and the pre-trained and instruction-tuned model checkpoints (48B total parameters, 3B activated parameters, 1 million token context length). In this article, we are going to discuss key findings from the Kimi Linear paper and show how you can run the model with DigitalOcean...
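Since the release ships with vLLM support, serving the instruction-tuned checkpoint can be sketched as below. This is a minimal, unverified sketch: the model identifier follows the Hugging Face naming pattern used in the release announcement, and the parallelism and context-length flags are assumptions you should adjust to your GPU setup.

```shell
# Sketch only: model id and flag values are assumptions, not verified settings.
pip install -U vllm

# Serve the instruction-tuned checkpoint behind an OpenAI-compatible API.
# --trust-remote-code is needed for custom architectures like KDA;
# --max-model-len is capped here well below the advertised 1M-token window.
vllm serve moonshotai/Kimi-Linear-48B-A3B-Instruct \
    --trust-remote-code \
    --tensor-parallel-size 4 \
    --max-model-len 131072
```

Once the server is up, any OpenAI-compatible client can send chat completions to it.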
