
Show HN: oMLX – Native Mac inference server that persists KV cache to SSD

By jundot
February 20, 2026

I built an open-source LLM inference server optimized for Apple Silicon. The main motivation was coding agents: tools like Claude Code send requests whose context prefix keeps shifting, which invalidates the KV cache. A few turns later the agent circles back to an earlier prefix, and your Mac has to re-prefill the entire context from scratch.

oMLX solves this with paged SSD caching. Every KV cache block is persisted to disk. When a previous prefix returns, it's restored from disk instead of being recomputed...
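
To make the idea concrete, here is a minimal Python sketch of prefix-keyed block caching on disk. It is not oMLX's code. The block size, the chained SHA-256 keys, the `DiskBlockCache` class, and the `fake_prefill` stand-in are all hypothetical names and choices made for illustration, and the "KV data" is just placeholder bytes, not real tensors.

```python
import hashlib
import tempfile
from pathlib import Path

BLOCK_SIZE = 4  # tokens per block (hypothetical; chosen small for the demo)


def block_keys(tokens, block_size=BLOCK_SIZE):
    """One chained hash per full block: each key identifies the whole prefix up to that block."""
    keys, parent = [], b""
    full = len(tokens) - len(tokens) % block_size  # ignore the trailing partial block
    for start in range(0, full, block_size):
        block = tokens[start:start + block_size]
        digest = hashlib.sha256(parent + repr(block).encode()).digest()
        keys.append(digest.hex())
        parent = digest
    return keys


class DiskBlockCache:
    """Stores one file per block under cache_dir (assumed layout, for illustration)."""

    def __init__(self, cache_dir):
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)

    def put(self, key, kv_bytes):
        (self.dir / f"{key}.bin").write_bytes(kv_bytes)

    def get(self, key):
        path = self.dir / f"{key}.bin"
        return path.read_bytes() if path.exists() else None

    def longest_cached_prefix(self, tokens):
        """Return (tokens covered by cached blocks, restored block payloads)."""
        restored = []
        for key in block_keys(tokens):
            data = self.get(key)
            if data is None:
                break  # first miss ends the reusable prefix
            restored.append(data)
        return len(restored) * BLOCK_SIZE, restored


def fake_prefill(block):
    # Placeholder for the real KV tensors a model would compute for this block.
    return repr(block).encode()


def serve(cache, tokens):
    """Restore cached blocks, 'compute' the rest, persist new full blocks.

    Returns (tokens restored from disk, tokens that had to be prefilled).
    """
    covered, _ = cache.longest_cached_prefix(tokens)
    keys = block_keys(tokens)
    for i in range(covered // BLOCK_SIZE, len(keys)):
        block = tokens[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        cache.put(keys[i], fake_prefill(block))
    return covered, len(tokens) - covered


if __name__ == "__main__":
    cache = DiskBlockCache(tempfile.mkdtemp())
    prompt = list(range(10))
    print(serve(cache, prompt))         # (0, 10): cold cache, everything is prefilled
    print(serve(cache, [99] + prompt))  # (0, 11): shifted prefix, no block matches
    print(serve(cache, prompt))         # (8, 2): the original prefix comes back from disk
```

Because the blocks live on disk and not in a bounded in-memory cache, the shifted request in the middle does not evict anything, so the third call only has to prefill the two tokens in the trailing partial block.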

---

**[devsupporter commentary]**

This article covers a recent development trend featured on Show HN. To learn more about the related tools or technologies, see the original link.