Let’s be blunt: this hack pays for itself. You can get the “world’s fastest” coding model behind Claude Code without paying subscription fees. The idea is simple—deploy a gateway via run.claw.cloud, back it with Cerebras’ free credits, and expose it as an OpenAI-compatible endpoint so your tools work out of the box.
Why do this? Speed and cost. Cerebras brings the inference horsepower—it’s like plugging into an ultra-fast supercharger. Claude Code provides the driving experience. Give it a clean, fast fuel line, and it flies. What used to be expensive or slow can now be both quick and affordable.
No deep Node knowledge required. If you can copy-paste in a terminal, you’re good.
Environment setup (Claude Code + Cerebras + Cloudflare Workers)
On macOS, the easiest way to install Node is Homebrew:
brew install node
npm install -g @anthropic-ai/claude-code
On Windows, install Ubuntu via WSL, then install Node and the CLI inside Ubuntu (note: apt's default Node can be old; if it's older than Node 18, install a newer version via nvm or NodeSource first):
sudo apt update && sudo apt install -y nodejs npm
npm install -g @anthropic-ai/claude-code
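Before moving on, it's worth confirming the toolchain actually landed on your PATH. A quick sanity check in plain POSIX shell (`claude` is the binary the npm package installs):

```shell
# Sanity check: confirm node, npm, and the claude CLI are on PATH.
# Prints one "found"/"missing" line per tool.
check_tools() {
  for tool in node npm claude; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}
check_tools
```

If `claude` shows up as missing, re-run the `npm install -g` step and check that npm's global bin directory is on your PATH.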
Three steps to an OpenAI-compatible route (run.claw.cloud + Cerebras)
Step one: sign up at cerebras.ai and generate a free API key from the console. It’s OpenAI-compatible and includes the Qwen-3-Coder-480B family for high-speed coding.
Rate limits: up to 30 requests/min and 900 requests/hour.
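Those limits work out to one request every two seconds. If you script against the API, a tiny client-side throttle keeps you safely under the cap. A minimal sketch; the wrapped command stands in for your real curl call:

```shell
# Client-side throttle: space calls at least 2s apart, since
# 60s / 30 requests per minute = 2s per request on the free tier.
RATE_INTERVAL=2
last_call=0
throttled_call() {
  now=$(date +%s)
  elapsed=$((now - last_call))
  if [ "$elapsed" -lt "$RATE_INTERVAL" ]; then
    sleep $((RATE_INTERVAL - elapsed))
  fi
  last_call=$(date +%s)
  "$@"    # the real work, e.g. a curl request to the API
}
throttled_call echo "request 1"
throttled_call echo "request 2"    # runs ~2s after the first
```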
Step two: open https://console.run.claw.cloud/signin?link=IY4OLFYXS3WY, log in, search the app store for new-api, and deploy it with one click. You'll get a domain like your-app.run.claw.cloud. On first visit, set the admin username and password (some environments cap username length; shorter is safer).
Step three: in your new admin panel, go to Channel Management. Add a channel with any name you like, type = OpenAI, Base URL = https://api.cerebras.ai, and Key = the API key you just created. Save and enable it. From now on, any OpenAI-style request through your gateway will be securely forwarded to Cerebras.
Optional: deploy claude-worker-proxy (extra speed and stability)
- Register Cloudflare and install Wrangler:
https://developers.cloudflare.com/workers/wrangler
- Clone the open-source project:
https://github.com/glidea/claude-worker-proxy
git clone https://github.com/glidea/claude-worker-proxy
cd claude-worker-proxy
npm install
wrangler login
- Run npm run deploycf and note your Worker URL: https://claude-worker-proxy.your-subdomain.workers.dev
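The Worker routes by path: the upstream's base URL is appended after a provider prefix, so the final base URL Claude Code uses is just string concatenation. A sketch, assuming the /openai/ prefix used in the settings example below and placeholder domains:

```shell
# Compose the base URL for Claude Code. The Worker routes by path:
# /openai/<upstream-base-url> tells it to translate Anthropic-style
# requests into OpenAI-style calls against that upstream.
WORKER="https://claude-worker-proxy.your-subdomain.workers.dev"   # your Worker URL (placeholder)
UPSTREAM="https://your-app.run.claw.cloud/v1"                     # your new-api gateway (placeholder)
BASE_URL="${WORKER}/openai/${UPSTREAM}"
echo "$BASE_URL"
# prints: https://claude-worker-proxy.your-subdomain.workers.dev/openai/https://your-app.run.claw.cloud/v1
```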
Wire up Claude Code (custom OpenAI-compatible endpoint)
Claude Code can point to any OpenAI-compatible API. Just set the base URL to your gateway and keep working as usual—it’s like adding a high-grade water filter: Claude Code doesn’t care where the water comes from, as long as it’s clean and pressurized.
~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://claude-worker-proxy.your-subdomain.workers.dev/openai/https://your-app.run.claw.cloud/v1",
    "ANTHROPIC_API_KEY": "sk-your-new-api-token",
    "ANTHROPIC_MODEL": "qwen-3-coder-480b",
    "ANTHROPIC_SMALL_FAST_MODEL": "gpt-oss-120b",
    "API_TIMEOUT_MS": "300000"
  }
}
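If you'd rather not edit settings.json, or want to try the route in a single shell session first, the same values can be exported as environment variables before launching claude; Claude Code reads both. A sketch using the placeholder values from above:

```shell
# Per-session alternative to ~/.claude/settings.json: export the same
# variables, then run `claude` from this shell. All values are the
# placeholders from the settings example; substitute your own.
export ANTHROPIC_BASE_URL="https://claude-worker-proxy.your-subdomain.workers.dev/openai/https://your-app.run.claw.cloud/v1"
export ANTHROPIC_API_KEY="sk-your-new-api-token"
export ANTHROPIC_MODEL="qwen-3-coder-480b"
export ANTHROPIC_SMALL_FAST_MODEL="gpt-oss-120b"
export API_TIMEOUT_MS="300000"
```

This is handy for comparing providers: two terminals, two sets of exports, no file edits.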
If you want to smoke-test the route from the terminal, try a standard Chat Completions call. Replace domain, key, and model as needed:
curl https://your-app.run.claw.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_PROXY_KEY_OR_CEREBRAS_KEY" \
  -d '{
    "model": "qwen-3-coder-480b",
    "messages": [
      {"role": "user", "content": "Write a Node script to list a directory with proper error handling"}
    ]
  }'
If you get a valid JSON response, your route is good. Use the same base URL and key in your editor or the Claude Code CLI and you’re set.
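For scripting, you usually want just the assistant text out of that JSON, and jq does the job. A sketch on a canned payload shaped like a Chat Completions response; in practice, pipe your real curl output in instead:

```shell
# Extract the assistant's reply from a Chat Completions response.
# The response here is canned for illustration; pipe curl output instead.
response='{"choices":[{"message":{"role":"assistant","content":"hello from the model"}}]}'
echo "$response" | jq -r '.choices[0].message.content'
# prints: hello from the model
```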
Troubleshooting and performance tips
401/403 typically means a wrong key, a missing /v1 in the URL, or a misspelled model name. Fix those first. For long generations, raise client timeouts a bit. Ramp up concurrency gradually; treat it like warming up an engine, not flooring it from a cold start.
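For occasional 429s at the rate-limit boundary, a small retry wrapper with exponential backoff goes a long way. A sketch; the wrapped command stands in for your real request:

```shell
# Retry a command up to 3 times with exponential backoff (1s, then 2s).
# Useful when a burst trips the 30 req/min limit and the API returns 429.
with_backoff() {
  delay=1
  attempt=1
  while [ "$attempt" -le 3 ]; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -lt 3 ]; then
      echo "attempt $attempt failed; retrying in ${delay}s" >&2
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done
  return 1    # all attempts exhausted
}
with_backoff echo "request succeeded"
```

For a real call, replace the echo with your curl command; curl's --fail flag makes it return non-zero on HTTP errors, which is what triggers the retry.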
Why this setup is worth it
You get Claude Code’s ergonomics plus Cerebras’s inference speed. Cost-wise, you shift “pay per call” into sensible free-credit usage—plenty for day-to-day coding. The bigger win: it’s a swappable fuel line. Want OpenRouter later? Another OpenAI-compatible service? Same wiring, no tooling rewrite.
Follow the 梦兽编程 WeChat official account to unlock more tricks like this.