Overview

Use Claude Code CLI and VSCode for free, with no Anthropic API key required, by routing Anthropic API calls to free or low-cost LLM providers.

free-claude-code is a lightweight proxy that intercepts Claude Code’s API calls and routes them to alternative LLM providers. You keep the same Claude Code interface, commands, and workflow—just with different backends that cost less or nothing at all.

How it works

┌─────────────────┐        ┌──────────────────────┐        ┌──────────────────┐
│  Claude Code    │───────>│  free-claude-code    │───────>│  LLM Provider    │
│  CLI / VSCode   │<───────│  Proxy (:8082)       │<───────│  NIM / OR / LMS  │
└─────────────────┘        └──────────────────────┘        └──────────────────┘
   Anthropic API                                             Native Anthropic
   format (SSE)                                             or OpenAI chat SSE

Claude Code sends standard Anthropic API requests. The proxy forwards them to your configured provider, translates responses back into Anthropic format, and streams results to Claude Code in real time.
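As a rough illustration of the translation step, here is a minimal sketch (not the project's actual code) of mapping one OpenAI-style chat-completion SSE chunk onto an Anthropic-style `content_block_delta` event; the function name and the simplified single-block handling are assumptions:

```python
def to_anthropic_delta(openai_chunk: dict):
    """Map one OpenAI chat-completion streaming chunk onto an
    Anthropic-format content_block_delta event (simplified sketch:
    a single text block, no tool-use or stop-reason handling)."""
    choices = openai_chunk.get("choices", [])
    if not choices:
        return None
    text = choices[0].get("delta", {}).get("content")
    if text is None:
        return None
    return {
        "type": "content_block_delta",
        "index": 0,
        "delta": {"type": "text_delta", "text": text},
    }

# An upstream OpenAI-style chunk becomes an Anthropic-style event:
chunk = {"choices": [{"delta": {"content": "Hello"}}]}
event = to_anthropic_delta(chunk)
```

A real proxy would also emit the surrounding `message_start` / `content_block_start` / `message_stop` events and re-serialize each event as an SSE frame before streaming it back.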

What you can do with it

  • Run Claude Code for free: NVIDIA NIM gives you 40 requests per minute at no cost. OpenRouter has hundreds of free models. LM Studio and llama.cpp run locally with zero API costs.
  • Mix providers per model: Route Opus requests to one provider, Sonnet to another, Haiku to a third. Use the best model for each task without switching tools.
  • Use local models for privacy: Run everything offline with LM Studio or llama.cpp—no data leaves your machine.
  • Control costs explicitly: Set rate limits, concurrency caps, and per-provider timeouts to avoid surprise bills.
  • Deploy as a Discord or Telegram bot: Remote autonomous coding with tree-based threading, session persistence, and live progress streaming.
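The per-model routing above can be pictured as a small lookup table. The sketch below is illustrative only; the provider names, model IDs, and matching logic are assumptions, not the project's actual configuration format:

```python
# Hypothetical mapping: route each Claude model family to a different
# backend (provider, model) pair. All names here are illustrative.
MODEL_MAP = {
    "opus":   ("nvidia_nim", "deepseek-ai/deepseek-r1"),
    "sonnet": ("openrouter", "qwen/qwen-2.5-coder-32b-instruct:free"),
    "haiku":  ("lm_studio",  "local-small-model"),
}

def route(requested_model: str):
    """Pick a (provider, backend_model) pair by substring match on the
    Anthropic model name, e.g. 'claude-3-5-sonnet-20241022' -> sonnet."""
    for family, target in MODEL_MAP.items():
        if family in requested_model:
            return target
    return MODEL_MAP["sonnet"]  # fallback for unrecognized names

provider, model = route("claude-3-5-haiku-20241022")
```

Claude Code keeps sending its usual model names; only the proxy's table decides which backend actually serves each request.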

Supported providers

Provider     Cost           Rate Limit   Best For
-----------  -------------  -----------  ---------------------------------------
NVIDIA NIM   Free           40 req/min   Daily driver, generous free tier
OpenRouter   Free / Paid    Varies       Model variety, fallback options
DeepSeek     Usage-based    Varies       Direct access to DeepSeek chat/reasoner
LM Studio    Free (local)   Unlimited    Privacy, offline use, no rate limits
llama.cpp    Free (local)   Unlimited    Lightweight local inference engine
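Whichever provider you pick, pointing Claude Code at the proxy is typically just a matter of overriding its API endpoint. The variable names below are assumptions based on Claude Code's standard environment overrides; check the install guide for the exact names this project expects:

```shell
# Assumed variable names; the proxy listens on :8082 per the diagram above.
export ANTHROPIC_BASE_URL="http://localhost:8082"  # send Claude Code traffic to the proxy
export ANTHROPIC_API_KEY="not-used"                # placeholder; the proxy holds the real provider keys
```

After exporting these, launch Claude Code as usual; no changes to the CLI or the VSCode extension are needed.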

Key features

Feature                 What it does
----------------------  --------------------------------------------------------------------------------
Zero Cost               40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio.
Drop-in Replacement     Set 2 environment variables. No modifications to Claude Code CLI or VSCode extension needed.
Per-Model Mapping       Route Opus / Sonnet / Haiku to different models and providers. Mix freely.
Thinking Token Support  Parses <thinking> tags and reasoning_content into native Claude thinking blocks.
Request Optimization    5 categories of trivial API calls intercepted locally, saving quota and latency.
Smart Rate Limiting     Proactive rolling-window throttle plus reactive 429 exponential backoff.
Voice Notes             Send voice messages on Discord or Telegram; transcribed and processed as prompts.
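To make the "Smart Rate Limiting" row concrete, here is a minimal sketch of how a proactive rolling-window throttle combined with reactive 429 backoff might work. This is not the project's implementation; the class name, the 60 s cap, and the 1 s initial backoff are assumptions:

```python
from collections import deque

class RollingWindowLimiter:
    """Sketch: allow at most `limit` requests per `window` seconds
    (proactive), and back off exponentially after a 429 (reactive)."""

    def __init__(self, limit: int = 40, window: float = 60.0):
        self.limit, self.window = limit, window
        self.sent = deque()   # timestamps of recent requests
        self.backoff = 0.0    # extra wait imposed by the last 429

    def delay_before_send(self, now: float) -> float:
        # Drop timestamps that have aged out of the rolling window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        wait = self.backoff
        if len(self.sent) >= self.limit:
            # Wait until the oldest request leaves the window.
            wait = max(wait, self.window - (now - self.sent[0]))
        return wait

    def record_send(self, now: float):
        self.sent.append(now)
        self.backoff = 0.0  # a successful send resets the backoff

    def record_429(self):
        # Reactive exponential backoff: 1s, 2s, 4s, ... capped at 60s.
        self.backoff = min(max(1.0, self.backoff * 2), 60.0)
```

The proactive window keeps you under the provider's quota (e.g. NIM's 40 req/min) before a 429 ever happens; the reactive backoff handles the provider's signal when you hit the limit anyway.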

Start here

If you are new to free-claude-code, start with the shortest useful path:

  1. Quick start: Get the proxy running with NVIDIA NIM in under 5 minutes.
  2. Install guide: Set up uv, clone the repo, and configure your first provider.
  3. Provider guides: Learn the specific setup for NVIDIA NIM, OpenRouter, or local models.
  4. Environment variables: Reference for all configuration options.
  5. Troubleshooting: Fix common connection, authentication, and model mapping issues.