Where randomness meets reason
AMD 780M iGPU + 64GB DDR5 runs Gemma 4 28B at 19.5 tok/s. Setup guide, benchmarks, and a cost breakdown vs. the Mac mini for local LLM inference under $600.
omlx: a macOS-native LLM server for Apple Silicon with SSD KV caching that cuts cold-start prefill from 90s to under 5s. Complete RAG customer-support chatbot tutorial included.