AI Agent memory infrastructure toolkit - Ghost + Memory Engine + PostgreSQL

Your AI Agent Can Think, But It Can't Remember

AI agents can reason, plan, and converse—but forget everything once the session ends. The Ghost project solves this with a pure PostgreSQL-based infrastructure, turning the database into the agent’s memory palace.

March 26, 2026 · 6 min · 1196 words · Dream Beast Programming
Mac mini connected to SSD freezer and DRAM fridge, illustrating the layered architecture of LLM in a Flash

Cramming a 400B Model into 48GB: The Magic Behind LLM in a Flash

An Apple paper from 2023 made it possible to run a 400-billion-parameter model on an ordinary MacBook. The core technologies, MoE and quantization, hide an engineering philosophy built around on-demand loading.

March 24, 2026 · 5 min · 857 words · Dream Beast Programming
oMLX runs local LLMs on Mac Apple Silicon, dramatically outperforming Ollama with TTFT dropping from 90s to 1-3s

90 Seconds of Waiting, Gone: How oMLX Buries Ollama on Mac

oMLX is built for Apple Silicon, using the MLX framework, SSD-backed KV cache, and continuous batching to cut TTFT from 90 seconds to 1-3 seconds in long-context scenarios, comprehensively outperforming Ollama.

March 23, 2026 · 6 min · 1133 words · Dream Beast Programming
Ramp AI Agent Enterprise Finance Automation: One Agent + A Thousand Skills

Don't Build a Thousand Agents: How Ramp Automates Finance with One Agent

Ramp, America’s fastest-growing enterprise finance platform, is valued at $32B, with 50,000+ customers and $100B+ in annual transaction volume. It chose a ‘one Agent + a thousand skills’ architecture over building many agents. This is a deep dive into Ramp’s hands-on AI experience.

March 19, 2026 · 17 min · 3428 words · Dream Beast Programming