NVIDIA NIM

Configure free-claude-code to use NVIDIA NIM's free tier. Get 40 requests per minute at no cost.

The NVIDIA NIM free tier allows 40 requests per minute and requires no credit card, which makes it the recommended starting provider for new free-claude-code users.

Get an API key

Visit build.nvidia.com/settings/api-keys
Sign in or create an NVIDIA account
Generate a new API key
Copy the key (starts with nvapi-)
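
Before pasting the key into your configuration, a quick format check can catch copy-and-paste mistakes. The sketch below only tests for the nvapi- prefix mentioned above; the helper name looks_like_nim_key is hypothetical and is not part of free-claude-code, and passing the check does not prove the key is valid or unexpired.

```python
def looks_like_nim_key(key: str) -> bool:
    """Return True if the key has the nvapi- prefix and some content after it.

    This is a format check only (hypothetical helper); it does not contact NVIDIA.
    """
    key = key.strip()  # tolerate stray whitespace from copy and paste
    prefix = "nvapi-"
    return key.startswith(prefix) and len(key) > len(prefix)
```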

Configure the proxy

Edit your .env file:

NVIDIA_NIM_API_KEY="nvapi-your-key-here"

MODEL_OPUS=
MODEL_SONNET=
MODEL_HAIKU=
MODEL="nvidia_nim/z-ai/glm4.7"

ENABLE_THINKING=true

Model format

NVIDIA NIM models use the format:

nvidia_nim/organization/model-name

Examples:

Model	Identifier
GLM-4.7	`nvidia_nim/z-ai/glm4.7`
MiniMax-M2.5	`nvidia_nim/minimaxai/minimax-m2.5`
Qwen3.5-397B	`nvidia_nim/qwen/qwen3.5-397b-a17b`
Kimi K2.5	`nvidia_nim/moonshotai/kimi-k2.5`
Step-3.5 Flash	`nvidia_nim/stepfun-ai/step-3.5-flash`

Browse all available models at build.nvidia.com/explore/discover.
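
To make the identifier format concrete, here is a minimal sketch that splits an identifier into its three parts. The names NimModel and parse_model_id are hypothetical and chosen for illustration; the proxy's own parsing may differ.

```python
from typing import NamedTuple


class NimModel(NamedTuple):
    provider: str
    organization: str
    model_name: str


def parse_model_id(identifier: str) -> NimModel:
    """Split 'nvidia_nim/organization/model-name' into its three parts."""
    parts = identifier.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected provider/organization/model-name, got {identifier!r}")
    provider, organization, model_name = parts
    if provider != "nvidia_nim":
        raise ValueError(f"not an NVIDIA NIM identifier: {identifier!r}")
    # Names are kept exactly as written because they are case-sensitive.
    return NimModel(provider, organization, model_name)
```

For example, parse_model_id("nvidia_nim/moonshotai/kimi-k2.5") yields the organization moonshotai and the model name kimi-k2.5.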

Update the model list

To refresh the local model list with the latest NIM offerings:

curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
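
Once the file is saved, a short script can list the model names it contains. This sketch assumes the response follows the OpenAI-style list shape (a top-level "data" array whose entries carry an "id" such as organization/model-name); inspect the file first if that assumption does not hold. The function names are hypothetical.

```python
import json
from pathlib import Path


def model_ids(payload: dict) -> list[str]:
    """Return sorted model ids from an OpenAI-style model list (assumed shape)."""
    return sorted(entry["id"] for entry in payload.get("data", []) if "id" in entry)


def to_proxy_identifier(model_id: str) -> str:
    """Prefix a catalog id with the provider name used in .env values."""
    return f"nvidia_nim/{model_id}"


def load_model_ids(path: str = "nvidia_nim_models.json") -> list[str]:
    """Read the file written by the curl command and list its model ids."""
    return model_ids(json.loads(Path(path).read_text(encoding="utf-8")))
```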

Per-model routing

Route different Claude model tiers to different NIM models:

MODEL_OPUS="nvidia_nim/moonshotai/kimi-k2.5"
MODEL_SONNET="nvidia_nim/qwen/qwen3.5-397b-a17b"
MODEL_HAIKU="nvidia_nim/z-ai/glm4.7"
MODEL="nvidia_nim/z-ai/glm4.7"

When Claude Code requests:

Claude 3 Opus: Routes to MODEL_OPUS
Claude 3.5 Sonnet: Routes to MODEL_SONNET
Claude 3 Haiku: Routes to MODEL_HAIKU
Any other model: Falls back to MODEL
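
The routing rules above can be sketched as a small lookup. This is an illustration, not the proxy's actual code: it assumes the tier is detected by looking for "opus", "sonnet", or "haiku" in the requested model name, and that an empty tier variable falls back to MODEL. The function name resolve_model is hypothetical.

```python
def resolve_model(requested: str, env: dict[str, str]) -> str:
    """Pick the NIM model for a requested Claude model name (illustrative only)."""
    name = requested.lower()
    for tier in ("opus", "sonnet", "haiku"):
        override = env.get(f"MODEL_{tier.upper()}", "")
        # Assumption: an unset or empty tier variable means "use MODEL".
        if tier in name and override:
            return override
    return env["MODEL"]


env = {
    "MODEL_OPUS": "nvidia_nim/moonshotai/kimi-k2.5",
    "MODEL_SONNET": "nvidia_nim/qwen/qwen3.5-397b-a17b",
    "MODEL_HAIKU": "nvidia_nim/z-ai/glm4.7",
    "MODEL": "nvidia_nim/z-ai/glm4.7",
}
```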

Rate limits

The free tier provides:

40 requests per minute
Resets every 60 seconds

If you exceed the limit, NIM returns HTTP 429. The proxy handles this with exponential backoff and retries.
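
For readers unfamiliar with exponential backoff, the following sketch shows how retry delays typically grow after each 429. The base delay, cap, retry count, and jitter strategy here are illustrative assumptions; the proxy's actual values are not documented on this page.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=30.0, jitter=True, rng=None):
    """Return the wait (in seconds) before each retry, doubling up to a cap.

    All parameter defaults are assumptions chosen for illustration.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * 2 ** attempt)  # 1, 2, 4, 8, ... capped at `cap`
        if jitter:
            # "Full jitter": pick a random wait up to the computed delay so
            # that many clients do not retry at the same instant.
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays
```

With jitter disabled and the defaults above, the waits are 1, 2, 4, 8, and 16 seconds.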

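To stay safely below the free-tier limit, set stricter client-side limits in .env. The values below allow 35 requests per 60-second window and at most 3 concurrent requests: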

PROVIDER_RATE_LIMIT=35
PROVIDER_RATE_WINDOW=60
PROVIDER_MAX_CONCURRENCY=3
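
One common way to enforce a limit like "35 requests per 60 seconds" is a sliding-window counter. The sketch below is an assumption about how such a limit could work, not a description of the proxy's implementation; the class name is hypothetical, and PROVIDER_MAX_CONCURRENCY is not modelled here.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window`-second span (illustrative)."""

    def __init__(self, limit: int = 35, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now: float) -> bool:
        """Record a request at time `now` if the window has room; else refuse."""
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            self._sent.append(now)
            return True
        return False
```

In real use, `now` would come from a monotonic clock such as time.monotonic(); passing it in explicitly keeps the example easy to test.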

Proxy configuration

If you need to route NIM requests through a proxy:

NVIDIA_NIM_PROXY="http://username:password@host:port"

Supports http:// and socks5:// protocols.
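
A malformed proxy URL is easy to miss, so it can help to check the value before starting the proxy. This sketch uses Python's standard urllib.parse and accepts only the two schemes listed above; requiring an explicit port mirrors the host:port template and is an assumption, as is the helper name check_proxy_url.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"http", "socks5"}


def check_proxy_url(url: str) -> bool:
    """Return True if the URL has a supported scheme, a host, and a numeric port."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        return False
    return parts.scheme in SUPPORTED_SCHEMES and bool(parts.hostname) and port is not None
```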

Troubleshooting

“Invalid API key” errors: Verify your key starts with nvapi- and has not expired. Generate a new key if needed.

“Model not found” errors: Check the exact model name in the NIM catalog. Organization and model names are case-sensitive.

Frequent 429 errors: You are hitting the rate limit. The proxy retries with backoff, but you can reduce the number of retries needed by lowering PROVIDER_RATE_LIMIT or setting PROVIDER_MAX_CONCURRENCY to throttle requests.

Slow responses: The NIM free tier may have higher latency during peak hours. For time-sensitive work, consider OpenRouter paid models or local inference.

Recommended models

Use Case	Model	Notes
General coding	GLM-4.7	Good balance of speed and capability
Complex reasoning	Kimi K2.5	Strong on long-context tasks
Fast responses	Step-3.5 Flash	Lower latency for simple queries
Creative writing	Qwen3.5-397B	Strong narrative capabilities