NVIDIA NIM
Configure free-claude-code to use NVIDIA NIM's free tier. Get 40 requests per minute at no cost.
NVIDIA NIM offers a generous free tier: 40 requests per minute with no credit card required. This is the recommended starting provider for new free-claude-code users.
Get an API key
- Visit build.nvidia.com/settings/api-keys
- Sign in or create an NVIDIA account
- Generate a new API key
- Copy the key (starts with nvapi-)
Configure the proxy
Edit your .env file:
NVIDIA_NIM_API_KEY="nvapi-your-key-here"
MODEL_OPUS=
MODEL_SONNET=
MODEL_HAIKU=
MODEL="nvidia_nim/z-ai/glm4.7"
ENABLE_THINKING=true
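Before starting the proxy, it can help to sanity-check the file. The following is a minimal sketch, not part of free-claude-code: `parse_env` and `check_nim_config` are hypothetical helpers, and the check on `MODEL` assumes a NIM-only setup.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so "nvapi-..." and nvapi-... compare equal.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_nim_config(env: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not env.get("NVIDIA_NIM_API_KEY", "").startswith("nvapi-"):
        problems.append("NVIDIA_NIM_API_KEY should start with 'nvapi-'")
    # Assumption: every request should go to NIM, so MODEL carries the NIM prefix.
    if not env.get("MODEL", "").startswith("nvidia_nim/"):
        problems.append("MODEL should start with 'nvidia_nim/'")
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as fh:
        for problem in check_nim_config(parse_env(fh.read())):
            print("warning:", problem)
```

Empty entries such as `MODEL_OPUS=` parse to an empty string, which the sketch treats as "not set".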
Model format
NVIDIA NIM models use the format:
nvidia_nim/organization/model-name
Examples:
| Model | Identifier |
|---|---|
| GLM-4.7 | nvidia_nim/z-ai/glm4.7 |
| MiniMax-M2.5 | nvidia_nim/minimaxai/minimax-m2.5 |
| Qwen3.5-397B | nvidia_nim/qwen/qwen3.5-397b-a17b |
| Kimi K2.5 | nvidia_nim/moonshotai/kimi-k2.5 |
| Step-3.5 Flash | nvidia_nim/stepfun-ai/step-3.5-flash |
Browse all available models at build.nvidia.com/explore/discover.
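To make the three-part format concrete, here is a small sketch that splits an identifier into its organization and model name. `NimModel` and `parse_nim_identifier` are illustrative names only; the proxy's own parsing may differ.

```python
from typing import NamedTuple


class NimModel(NamedTuple):
    organization: str
    name: str


def parse_nim_identifier(identifier: str) -> NimModel:
    """Split 'nvidia_nim/organization/model-name' into its parts."""
    parts = identifier.split("/")
    if len(parts) != 3 or parts[0] != "nvidia_nim" or not all(parts):
        raise ValueError(
            f"expected nvidia_nim/organization/model-name, got {identifier!r}"
        )
    return NimModel(organization=parts[1], name=parts[2])


if __name__ == "__main__":
    print(parse_nim_identifier("nvidia_nim/moonshotai/kimi-k2.5"))
```

An identifier without the `nvidia_nim/` prefix, such as `z-ai/glm4.7`, raises `ValueError` in this sketch.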
Update the model list
To refresh the local model list with the latest NIM offerings:
curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
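Once the file is saved, the identifiers can be listed in the format the proxy expects. This sketch assumes the response follows the OpenAI-style list shape (a top-level `data` array whose entries carry an `id` such as `z-ai/glm4.7`); verify that against your downloaded file. `nim_identifiers` and `load_identifiers` are hypothetical helper names.

```python
import json


def nim_identifiers(payload: dict) -> list[str]:
    """Prefix every model id in the payload with 'nvidia_nim/' and sort."""
    return sorted(
        f"nvidia_nim/{entry['id']}"
        for entry in payload.get("data", [])
        if "id" in entry
    )


def load_identifiers(path: str = "nvidia_nim_models.json") -> list[str]:
    """Read the file written by the curl command and list its identifiers."""
    with open(path, encoding="utf-8") as fh:
        return nim_identifiers(json.load(fh))


if __name__ == "__main__":
    for identifier in load_identifiers():
        print(identifier)
```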
Per-model routing
Route different Claude model tiers to different NIM models:
MODEL_OPUS="nvidia_nim/moonshotai/kimi-k2.5"
MODEL_SONNET="nvidia_nim/qwen/qwen3.5-397b-a17b"
MODEL_HAIKU="nvidia_nim/z-ai/glm4.7"
MODEL="nvidia_nim/z-ai/glm4.7"
When Claude Code requests:
- Claude 3 Opus: Routes to MODEL_OPUS
- Claude 3.5 Sonnet: Routes to MODEL_SONNET
- Claude 3 Haiku: Routes to MODEL_HAIKU
- Any other model: Falls back to MODEL
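The routing rules above can be sketched as a single lookup. This is an illustration only, not the proxy's actual code: it assumes the tier is detected by a substring match on the requested model name, and that an empty `MODEL_*` value falls back to `MODEL`. `resolve_model` is a hypothetical name.

```python
def resolve_model(requested: str, env: dict[str, str]) -> str:
    """Pick the NIM model for a requested Claude model name."""
    name = requested.lower()
    for tier in ("opus", "sonnet", "haiku"):
        if tier in name:
            override = env.get(f"MODEL_{tier.upper()}", "")
            if override:
                return override
            break  # tier matched but is unset: use the default
    return env["MODEL"]


if __name__ == "__main__":
    env = {
        "MODEL_OPUS": "nvidia_nim/moonshotai/kimi-k2.5",
        "MODEL_SONNET": "nvidia_nim/qwen/qwen3.5-397b-a17b",
        "MODEL_HAIKU": "nvidia_nim/z-ai/glm4.7",
        "MODEL": "nvidia_nim/z-ai/glm4.7",
    }
    print(resolve_model("claude-3-5-sonnet-20241022", env))
```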
Rate limits
The free tier provides:
- 40 requests per minute
- Resets every 60 seconds
If you exceed the limit, NIM returns HTTP 429. The proxy handles this with exponential backoff and retries.
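For readers unfamiliar with the pattern, here is a minimal sketch of retrying on HTTP 429 with exponential backoff. The base delay, cap, and retry count are illustrative assumptions, not the proxy's actual settings, and `send` stands in for whatever function performs the request and returns its status code.

```python
import time
from typing import Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(
    send: Callable[[], int],
    max_retries: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` and retry while it returns 429, up to `max_retries` times."""
    status = send()
    for attempt in range(max_retries):
        if status != 429:
            break
        sleep(backoff_delay(attempt))
        status = send()
    return status
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Production retry loops often add random jitter to the delay so that concurrent clients do not retry in lockstep.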
To stay under the NIM limit, configure stricter proxy-side limits in .env:
PROVIDER_RATE_LIMIT=35
PROVIDER_RATE_WINDOW=60
PROVIDER_MAX_CONCURRENCY=3
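One way to picture these settings is a sliding-window limiter that allows at most `PROVIDER_RATE_LIMIT` requests in any `PROVIDER_RATE_WINDOW` seconds. The class below is a sketch of that idea under this assumption; the proxy may implement its limits differently, and `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` requests within any `window` seconds."""

    def __init__(
        self,
        limit: int = 35,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def try_acquire(self) -> bool:
        """Record a request and return True if it fits in the window."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            self._sent.append(now)
            return True
        return False
```

Setting the limit to 35 rather than 40 leaves headroom for clock skew and retries. A concurrency cap such as `PROVIDER_MAX_CONCURRENCY=3` would typically be enforced separately, for example with a semaphore around in-flight requests.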
Proxy configuration
If you need to route NIM requests through a proxy:
NVIDIA_NIM_PROXY="http://username:password@host:port"
Supports http:// and socks5:// protocols.
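A malformed proxy URL is easy to miss until requests start failing. The sketch below checks the scheme, host, and port using Python's standard `urllib.parse`; `check_proxy_url` is a hypothetical helper, and the accepted schemes are taken from the line above.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"http", "socks5"}


def check_proxy_url(url: str) -> tuple[str, str, int]:
    """Return (scheme, host, port) or raise ValueError for an unusable URL."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    # Accessing .port raises ValueError itself if the port is not numeric.
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy URL needs both a host and a port")
    return parts.scheme, parts.hostname, parts.port


if __name__ == "__main__":
    print(check_proxy_url("socks5://username:password@127.0.0.1:1080"))
```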
Troubleshooting
“Invalid API key” errors: Verify your key starts with nvapi- and has not expired. Generate a new key if needed.
“Model not found” errors: Check the exact model name in the NIM catalog. Organization and model names are case-sensitive.
Frequent 429 errors: You are hitting the rate limit. The proxy will retry with backoff, but you may want to lower PROVIDER_RATE_LIMIT or set PROVIDER_MAX_CONCURRENCY to throttle requests.
Slow responses: NIM free tier may have higher latency during peak hours. Consider OpenRouter paid models or local inference for time-sensitive work.
Recommended models
| Use Case | Model | Notes |
|---|---|---|
| General coding | GLM-4.7 | Good balance of speed and capability |
| Complex reasoning | Kimi K2.5 | Strong on long-context tasks |
| Fast responses | Step-3.5 Flash | Lower latency for simple queries |
| Creative writing | Qwen3.5-397B | Strong narrative capabilities |