<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Technical Analysis on 梦兽编程</title><link>https://rexai.top/categories/%E6%8A%80%E6%9C%AF%E8%A7%A3%E6%9E%90/</link><description>Recent content in Technical Analysis on 梦兽编程</description><generator>Hugo -- 0.161.1</generator><language>zh-cn</language><copyright>梦兽编程</copyright><lastBuildDate>Tue, 24 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://rexai.top/categories/%E6%8A%80%E6%9C%AF%E8%A7%A3%E6%9E%90/index.xml" rel="self" type="application/rss+xml"/><item><title>Fitting a 400B Model into 48 GB of Memory: The Magic Behind LLM in a Flash</title><link>https://rexai.top/ai/llm/2026-03-24-apple-llm-in-flash-moe-local-inference/</link><pubDate>Tue, 24 Mar 2026 00:00:00 +0000</pubDate><guid>https://rexai.top/ai/llm/2026-03-24-apple-llm-in-flash-moe-local-inference/</guid><description>A 2023 Apple paper let a 400-billion-parameter model run on an ordinary MacBook. Behind the core techniques, MoE + quantization, lies an engineering philosophy of &amp;#39;on-demand loading&amp;#39;.</description></item></channel></rss>