Where randomness meets reason
2 posts
AMD 780M iGPU + 64GB DDR5 runs Gemma 4 28B at 19.5 tok/s. Setup guide, benchmarks, and cost breakdown vs. Mac Mini for local LLM inference under $600.
Ollama, LM Studio, omlx, llama.cpp, MLX-LM, vMLX — compared against the specific requirements of local agent workloads on Apple Silicon.