Show HN: oMLX – Native Mac inference server that persists KV cache to SSD
By jundot, February 20, 2026
I built an open-source LLM inference server optimized for Apple Silicon. The main motivation was coding agents: tools like Claude Code send requests where the context prefix keeps shifting, which invalidates the KV cache. A few turns later the agent circles back, and your Mac has to re-prefill the entire context from scratch.

oMLX solves this with paged SSD caching. Every KV cache block is persisted to disk. When a previous prefix returns, it's restored instantly instead of being recomputed...
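
To make the idea concrete, here is a minimal sketch of prefix-keyed block caching on disk. It is not oMLX's actual code: `block_keys`, `DiskBlockCache`, `fake_prefill`, and `prefill_with_cache` are hypothetical names, the block size is tiny for readability, and the "KV block" is a placeholder dictionary rather than real attention tensors. The only point it illustrates is that chaining each block's hash to the previous one lets a returning prefix be found on disk and restored rather than recomputed.

```python
import hashlib
import json
import tempfile
from pathlib import Path

BLOCK_SIZE = 4  # tokens per block; deliberately tiny for illustration


def block_keys(tokens, block_size=BLOCK_SIZE):
    """Return one chained hash per full block of tokens.

    Each key covers the block's tokens plus the previous key, so a key
    identifies the entire prefix up to and including that block.
    """
    keys, prev = [], ""
    full = len(tokens) - len(tokens) % block_size
    for start in range(0, full, block_size):
        block = tokens[start:start + block_size]
        payload = prev + "|" + ",".join(map(str, block))
        prev = hashlib.sha256(payload.encode()).hexdigest()
        keys.append(prev)
    return keys


class DiskBlockCache:
    """Toy stand-in for an SSD-backed block store (one file per block)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key):
        return self.root / f"{key}.json"

    def put(self, key, kv_block):
        self._path(key).write_text(json.dumps(kv_block))

    def get(self, key):
        path = self._path(key)
        return json.loads(path.read_text()) if path.exists() else None


def fake_prefill(block_tokens):
    """Placeholder for the expensive prefill computation of one block."""
    return {"kv": [t * 2 for t in block_tokens]}


def prefill_with_cache(tokens, cache, block_size=BLOCK_SIZE):
    """Restore the longest cached prefix, then compute and persist the rest.

    Returns (blocks_restored, blocks_computed).
    """
    restored = computed = 0
    hit_streak = True  # only a contiguous run of hits from the start counts
    for i, key in enumerate(block_keys(tokens, block_size)):
        if hit_streak and cache.get(key) is not None:
            restored += 1
            continue
        hit_streak = False
        block = tokens[i * block_size:(i + 1) * block_size]
        cache.put(key, fake_prefill(block))
        computed += 1
    return restored, computed


cache = DiskBlockCache(tempfile.mkdtemp())
context_a = list(range(12))                   # three full blocks
context_b = context_a[:8] + [99, 98, 97, 96]  # same first two blocks, new tail

print(prefill_with_cache(context_a, cache))  # (0, 3): cold start
print(prefill_with_cache(context_b, cache))  # (2, 1): shared prefix restored
print(prefill_with_cache(context_a, cache))  # (3, 0): agent circles back
```

The last call is the case described above: the agent returns to an earlier context, and every block is found on disk, so nothing is recomputed. A real server would also need eviction, memory-mapped tensor storage, and handling for partial trailing blocks, none of which this sketch attempts.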
---
**[devsupporter commentary]**
This article is a recent development-trend item sourced from Show HN. To learn more about the related tools and technologies, see the original link.