<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Technical Analysis on 梦兽编程</title><link>https://rexai.top/categories/%E6%8A%80%E6%9C%AF%E8%A7%A3%E6%9E%90/</link><description>Recent content in Technical Analysis on 梦兽编程</description><generator>Hugo -- 0.161.1</generator><language>zh-cn</language><copyright>梦兽编程</copyright><lastBuildDate>Tue, 24 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://rexai.top/categories/%E6%8A%80%E6%9C%AF%E8%A7%A3%E6%9E%90/index.xml" rel="self" type="application/rss+xml"/><item><title>Fitting a 400B Model into 48 GB of Memory: The Magic Behind LLM in a Flash</title><link>https://rexai.top/ai/llm/2026-03-24-apple-llm-in-flash-moe-local-inference/</link><pubDate>Tue, 24 Mar 2026 00:00:00 +0000</pubDate><guid>https://rexai.top/ai/llm/2026-03-24-apple-llm-in-flash-moe-local-inference/</guid><description>A 2023 Apple paper let a 400-billion-parameter model run on an ordinary MacBook. Behind the core techniques, MoE + quantization, lies an engineering philosophy of &amp;#39;on-demand loading&amp;#39;.</description></item></channel></rss>