Updated May 18, 2026

Free Coding Agent with NVIDIA Nemotron: Self-Host on a Cheap VPS

Run a free cloud coding agent using NVIDIA's Nemotron 3 Super (120B) and an open-source agent like OpenCode. Step-by-step setup on Hetzner, Netcup, or Contabo for under $5 a month.

Cost: Under $5/mo
Time: ~25 min
Steps: 6
Stack: Ubuntu · OpenCode · NVIDIA NIM

What you'll get
  • Free 120B coding model via NVIDIA Nemotron
  • OpenCode (open-source Claude Code alternative) on a VPS you control
  • Always-on agent reachable from your laptop or phone via SSH tunnel

A Reddit post in r/SideProject made the rounds recently with a neat trick: NVIDIA's free Nemotron 3 Super endpoint plus a hosted coding agent gets you a working "free Cursor in the cloud" in about ten minutes. The catch is the agent in that post is a paid SaaS. The model is free, the wrapper is not.

You can get the same result without the SaaS. NVIDIA's Nemotron endpoint speaks the OpenAI-compatible protocol, so any open-source coding agent that lets you set a base URL and a model name will work. This guide wires up Nemotron 3 Super to OpenCode, a popular open-source Claude Code alternative, running on a $5 VPS you control.

If you already read my Hermes Agent on Hetzner guide, the VPS hardening section will look familiar. This post focuses on the bits that are different: the Nemotron free tier, agent choice, and a cheap-VPS comparison so you can pick a host that fits.

Step 1 of 6
2 min read

What you actually get

At a glance
Model: Free 120B Nemotron via NVIDIA
Agent: OpenCode (open source)
Server: $5 VPS handles the rest

Two free things stacked on top of one cheap thing.

  • NVIDIA Nemotron 3 Super 120B-a12b. A 120B hybrid Mamba/Transformer MoE that activates 12B parameters per token. 1M context window. Released March 2026 under NVIDIA's open license. Free tier on build.nvidia.com with rate limits in the ~40 RPM range, no per-token cost. The same model is also mirrored as a free tier on OpenRouter if you prefer to go through them.
  • OpenCode. Open-source coding agent, TUI plus IDE extensions, client/server architecture so you can run the server on the VPS and connect from your laptop or phone. Works with any OpenAI-compatible endpoint.
  • A $4 to $7 VPS. 2 vCPU and 4 GB RAM is plenty. The agent process is light. Inference is remote.

Net cost: the VPS only. The model is free, the agent is free, the API calls are free.

Why not just use the SaaS in the original post?

  • Lock-in. Your prompts, history, repos, and connectors live in a third party. If they pivot or raise prices, you migrate.
  • Cost path. Free tiers on hosted agent SaaS tend to shrink once usage picks up. The model being free does not mean the wrapper stays free.
  • Privacy. A self-hosted agent on your VPS sees your code on disk. Inference still leaves the box to NVIDIA, but the rest of the pipeline is yours.
  • It is not harder. This whole setup is fewer steps than the SaaS flow once you have an SSH key.
Step 2 of 6
3 min read

Pick a VPS

At a glance
Default: Hetzner CX22
More RAM: Contabo VPS S
OS: Ubuntu 24.04

All four of these can run the agent, though the 1 vCPU OVHcloud plan leaves the least headroom. Prices are per month, as of May 2026.

Provider | Spec | Price (per month) | Notes
Hetzner CX22 (recommended) | 2 vCPU · 4 GB · 40 GB NVMe | $4.59 | Best price/performance. EU and US locations. 20 TB traffic.
Netcup VPS 500 G12 | 2 vCPU · 4 GB · 128 GB NVMe | $4.69 | DE only. Bigger disk, DDR5 ECC RAM.
OVHcloud VPS Starter | 1 vCPU · 2 GB | $4.20 | Cheapest entry but only 1 vCPU. Tight for anything else on the box.
Contabo VPS S | 4 vCPU · 8 GB · 100 GB NVMe | $6.99 | Most RAM and disk for the money. Shared vCPU can throttle.

For a single-tenant coding agent, Hetzner CX22 is what I would default to. If you want more headroom to run side projects on the same box, Contabo VPS S gives you double the RAM for a couple of dollars more, with the caveat that the vCPUs are shared and noisier under load.
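To put a number on "more RAM for the money", the table can be reduced to dollars per GB of RAM. This is a minimal sketch using the prices and specs listed above; price_per_gb is an illustrative helper name, not part of any provider tooling.

```shell
#!/usr/bin/env bash
# Dollars per GB of RAM, from the May 2026 figures in the comparison table.
# price_per_gb is a hypothetical helper written for this guide.
price_per_gb() {
  # $1 = monthly price in USD, $2 = RAM in GB
  LC_ALL=C awk -v price="$1" -v ram="$2" 'BEGIN { printf "%.2f\n", price / ram }'
}

# Print one line per plan. Call this function to see the comparison.
print_comparison() {
  printf '%-22s $%s per GB\n' "Hetzner CX22"         "$(price_per_gb 4.59 4)"
  printf '%-22s $%s per GB\n' "Netcup VPS 500 G12"   "$(price_per_gb 4.69 4)"
  printf '%-22s $%s per GB\n' "OVHcloud VPS Starter" "$(price_per_gb 4.20 2)"
  printf '%-22s $%s per GB\n' "Contabo VPS S"        "$(price_per_gb 6.99 8)"
}
```

By this measure Contabo comes out around $0.87 per GB against roughly $1.15 for Hetzner, which is the trade-off described above: cheaper memory, shared and noisier vCPUs.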

The rest of this guide uses Ubuntu 24.04 commands, but they map cleanly to any of the four.

Step 3 of 6
2 min

Get a free NVIDIA API key

At a glance
Account: NVIDIA developer (free)
Key prefix: nvapi-
Verify: Smoke-test with curl

  1. Sign in at build.nvidia.com. A regular NVIDIA developer account is fine.
  2. Open build.nvidia.com/settings/api-keys and generate a key. It is prefixed nvapi-.
  3. Find the Nemotron 3 Super 120B-a12b entry and confirm the model ID. At time of writing it is nvidia/nemotron-3-super-120b-a12b. NVIDIA occasionally renames variants, so always read the page rather than copy-pasting from a tutorial.

The endpoint is https://integrate.api.nvidia.com/v1. It speaks the standard OpenAI chat-completions protocol.

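A quick smoke test from your laptop, before touching the VPS. Export the key in that shell first (export NVIDIA_API_KEY="nvapi-your-key-here"), since the curl command reads it from the environment: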

Terminal · laptop
curl https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Authorization: Bearer $NVIDIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-super-120b-a12b",
    "messages": [{"role": "user", "content": "Reply with a single word: ready"}]
  }'

If that returns a JSON response with ready in the content, the key works.
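If the response is not what you expect, the HTTP status code usually tells you why. The sketch below wraps the same request in a function that prints only the status and maps it to a likely cause. The function names and the messages are my own shorthand for the troubleshooting notes at the end of this guide, not official NVIDIA error text.

```shell
#!/usr/bin/env bash
# Map an HTTP status code to a likely cause. Rules of thumb from this guide only.
explain_status() {
  case "$1" in
    200) echo "ok: key and model ID accepted" ;;
    401) echo "unauthorized: check the key and the Authorization header" ;;
    404) echo "not found: check the model ID on build.nvidia.com" ;;
    429) echo "rate limited: slow down and retry" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Send the same request as the smoke test, keeping only the status code.
# Nothing touches the network until you call smoke_test yourself.
smoke_test() {
  local status
  if [ -z "${NVIDIA_API_KEY:-}" ]; then
    echo "NVIDIA_API_KEY is not set" >&2
    return 1
  fi
  status=$(curl -sS -o /dev/null -w '%{http_code}' \
    https://integrate.api.nvidia.com/v1/chat/completions \
    -H "Authorization: Bearer $NVIDIA_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "nvidia/nemotron-3-super-120b-a12b", "messages": [{"role": "user", "content": "Reply with a single word: ready"}]}')
  explain_status "$status"
}
```

Paste both functions into a shell and run smoke_test. If curl cannot connect at all, it reports status 000, which falls through to the "unexpected status" branch.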

Step 4 of 6
8 min

Provision and harden the VPS

At a glance
OS: Ubuntu 24.04
User: Non-root with sudo
Firewall: ufw default-deny + ssh allow

Pick a provider, deploy Ubuntu 24.04, paste your SSH public key during creation, and grab the public IP.

Terminal · vps
ssh root@YOUR-VPS-IP
apt update && apt upgrade -y
apt install -y curl git ufw

Make a non-root user and put SSH and the firewall in a reasonable shape.

Terminal · vps
adduser agent --disabled-password --gecos ""
usermod -aG sudo agent
echo "agent ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/agent
chmod 440 /etc/sudoers.d/agent

mkdir -p /home/agent/.ssh
cp ~/.ssh/authorized_keys /home/agent/.ssh/ 2>/dev/null || true
chown -R agent:agent /home/agent/.ssh
chmod 700 /home/agent/.ssh
chmod 600 /home/agent/.ssh/authorized_keys

# Still root here, so sudo is not needed. Allow SSH before enabling the firewall.
ufw default deny incoming
ufw default allow outgoing
ufw allow ssh
ufw enable

Test the new user before closing this session
From a second terminal on your laptop, run ssh agent@YOUR-VPS-IP and confirm it works. If ~/.ssh/authorized_keys was not on the root account at create time, the copy above silently does nothing and the new user has no keys.
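Because the copy step fails silently, it can help to check the result on the server as well. This is a rough sketch, assuming GNU stat as shipped with Ubuntu; check_ssh_dir is a hypothetical helper, and it only checks for the 700 and 600 modes set in the commands above.

```shell
#!/usr/bin/env bash
# Report whether a .ssh directory is ready for key-based login.
# check_ssh_dir is an illustrative helper; pass the directory to inspect.
check_ssh_dir() {
  local dir="$1" keys="$1/authorized_keys"
  # -s is true only if the file exists and is not empty.
  if [ ! -s "$keys" ]; then
    echo "missing or empty: $keys"
    return 1
  fi
  # stat -c '%a' prints the octal mode (GNU coreutils syntax).
  if [ "$(stat -c '%a' "$dir")" != "700" ] || [ "$(stat -c '%a' "$keys")" != "600" ]; then
    echo "unexpected permissions on $dir or $keys"
    return 1
  fi
  echo "ok: $keys"
}
```

Run it as root with check_ssh_dir /home/agent/.ssh. It does not replace the real test, which is still logging in as agent from a second terminal.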

Switch over and stay on the agent account from here on.

Terminal · vps
su - agent

Optional but recommended
Install Tailscale, then drop the public SSH allow and rely on the tailnet. This cuts most of the brute-force noise.

Step 5 of 6
5 min

Install and point OpenCode at Nemotron

At a glance
Install: Single-binary script
Provider: Custom OpenAI-compatible
Verify: TUI model picker

OpenCode ships a single-binary installer.

Terminal · vps
curl -fsSL https://opencode.ai/install | bash
source ~/.bashrc
opencode --version

If you would rather run it under Node, npm i -g opencode-ai works too. Either way, confirm it is on PATH before continuing.

OpenCode reads provider config from ~/.config/opencode/opencode.json. Create it with a custom OpenAI-compatible provider entry:

~/.config/opencode/opencode.json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "nv": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "NVIDIA NIM",
      "options": {
        "baseURL": "https://integrate.api.nvidia.com/v1",
        "apiKey": "{env:NVIDIA_API_KEY}"
      },
      "models": {
        "nvidia/nemotron-3-super-120b-a12b": {
          "name": "Nemotron 3 Super 120B"
        }
      }
    }
  },
  "model": "nv/nvidia/nemotron-3-super-120b-a12b"
}

Why the provider key is short
OpenCode joins the provider key and the model ID with a slash. The model ID already starts with nvidia/, so a short provider key like nv keeps the final model string readable. You can name the provider anything; the model ID has to match what NVIDIA publishes.
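To see how the "model" string in the config breaks down, here is a small sketch that splits it at the first slash. That split point is my assumption based on the note above, not OpenCode's actual parsing code, and both function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Split a "provider/model" reference at the FIRST slash (assumed behaviour).
# provider_of and model_of are hypothetical helpers for this guide.
provider_of() {
  # Strip the longest suffix starting at the first slash.
  printf '%s\n' "${1%%/*}"
}

model_of() {
  # Strip the shortest prefix ending at the first slash.
  printf '%s\n' "${1#*/}"
}
```

For nv/nvidia/nemotron-3-super-120b-a12b, provider_of gives nv and model_of gives nvidia/nemotron-3-super-120b-a12b. The first has to match the key under "provider", the second a key under "models".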

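Export the key so OpenCode can read it. Adding it to ~/.bashrc makes it available in every new shell, including after a reboot.

Terminal · vps
echo 'export NVIDIA_API_KEY="nvapi-your-key-here"' >> ~/.bashrc
source ~/.bashrc
# The key now lives in ~/.bashrc, so restrict that file as well as the config.
chmod 600 ~/.bashrc ~/.config/opencode/opencode.json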
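
A common slip here is leaving the placeholder in place or pasting a key from a different provider. The following sketch checks the variable in the current shell. check_nvidia_key is a hypothetical helper, and it only checks the nvapi- prefix mentioned in Step 3, not whether NVIDIA accepts the key.

```shell
#!/usr/bin/env bash
# Sanity-check NVIDIA_API_KEY in the current shell (prefix check only).
# check_nvidia_key is an illustrative helper, not an NVIDIA or OpenCode tool.
check_nvidia_key() {
  case "${NVIDIA_API_KEY:-}" in
    "")
      echo "NVIDIA_API_KEY is not set"
      return 1 ;;
    nvapi-your-key-here)
      echo "NVIDIA_API_KEY still holds the placeholder from this guide"
      return 1 ;;
    nvapi-?*)
      echo "NVIDIA_API_KEY has the expected nvapi- prefix" ;;
    *)
      echo "NVIDIA_API_KEY does not start with nvapi-"
      return 1 ;;
  esac
}
```

The messages never print the key itself, so the output is safe to paste into a bug report.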

Start it and confirm the model is selected.

Terminal · vps
mkdir -p ~/projects && cd ~/projects
opencode

Inside the TUI, the model picker should list NVIDIA NIM › Nemotron 3 Super 120B. Pick it, ask it to write a small file, and confirm it actually performs the action. If the call errors out, the cause is almost always a bad key or a wrong model ID.

Step 6 of 6
4 min

Run OpenCode as a persistent service

At a glance
Unit type: User-level systemd
Binding: 127.0.0.1 only
Access: SSH port-forward from laptop

The point of putting this on a VPS is so you do not have to keep an SSH session open. A user-level systemd unit handles that.

Put the API key in a dedicated env file so it never lands in a world-readable unit file or journal line.

Terminal · vps
mkdir -p ~/.config/opencode ~/.config/systemd/user
echo 'NVIDIA_API_KEY=nvapi-your-key-here' > ~/.config/opencode/opencode.env
chmod 600 ~/.config/opencode/opencode.env

~/.config/systemd/user/opencode.service
[Unit]
Description=OpenCode agent server
After=network.target

[Service]
Type=simple
WorkingDirectory=/home/agent/projects
EnvironmentFile=%h/.config/opencode/opencode.env
ExecStart=/home/agent/.local/bin/opencode serve --port 4096 --hostname 127.0.0.1
Restart=on-failure

[Install]
WantedBy=default.target
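
One thing that trips people up: systemd's EnvironmentFile expects plain KEY=VALUE lines, so a line copied from ~/.bashrc with a leading export will not be read the way you intend. A minimal sketch of a pre-flight check follows. check_env_file is a hypothetical helper, and the pattern match is deliberately loose.

```shell
#!/usr/bin/env bash
# Loose format check for a systemd EnvironmentFile.
# check_env_file is an illustrative helper; it never prints values, only line numbers.
check_env_file() {
  local file="$1" line n=0
  if [ ! -r "$file" ]; then
    echo "cannot read $file"
    return 1
  fi
  # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    case "$line" in
      '' | '#'* | ';'*) ;;   # blank lines and comments are fine
      'export '*)
        echo "line $n: remove the leading 'export'"
        return 1 ;;
      [A-Za-z_]*=*) ;;       # looks like KEY=VALUE
      *)
        echo "line $n: not in KEY=VALUE form"
        return 1 ;;
    esac
  done < "$file"
  echo "ok: $file"
}
```

Run it as check_env_file ~/.config/opencode/opencode.env before starting the service.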

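Enable it and tail the logs. Before you do, confirm that the ExecStart path matches the output of command -v opencode; the install script may place the binary somewhere other than ~/.local/bin. Run these commands from a direct SSH login as agent rather than a su - shell, which normally has no user systemd session. Lingering is what keeps a user-level service running after you log out and starts it at boot.

Terminal · vps
chmod 600 ~/.config/systemd/user/opencode.service
# Keep the user's systemd instance alive without an open login session.
sudo loginctl enable-linger agent
systemctl --user daemon-reload
systemctl --user enable --now opencode
journalctl --user -u opencode -f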

The server now listens on 127.0.0.1:4096. Do not open that port in ufw. To reach it from your laptop, use SSH port-forwarding.
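To double-check the binding, you can filter the output of ss -ltn. The sketch below assumes the usual iproute2 column layout, with the local address in the fourth column. loopback_only is a hypothetical helper that succeeds only if the port is listening and every listener is on a loopback address.

```shell
#!/usr/bin/env bash
# Read `ss -ltn` output on stdin and check that PORT is bound to loopback only.
# loopback_only is an illustrative helper; column positions are an assumption.
loopback_only() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      n = split($4, parts, ":")            # last piece of addr:port is the port
      if (parts[n] != port) next
      found = 1
      addr = substr($4, 1, length($4) - length(parts[n]) - 1)
      if (addr != "127.0.0.1" && addr != "[::1]") exposed = 1
    }
    END {
      if (found && !exposed) exit 0
      exit 1
    }
  '
}
```

On the VPS, run ss -ltn | loopback_only 4096 && echo "loopback only". A failure means either nothing is listening on the port or it is bound to a non-loopback address such as 0.0.0.0.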

Terminal · laptop
ssh -L 4096:127.0.0.1:4096 agent@YOUR-VPS-IP

Now http://localhost:4096 on your laptop is the agent on the VPS. The OpenCode desktop app and IDE extensions both accept a remote server URL. Point them at that and they will drive the remote agent.

Never expose the raw port
If you want public access, put it behind a reverse proxy with TLS and basic auth. The agent has shell access on the box.

Sanity-check the free tier

Two things worth confirming early so you do not get a surprise mid-task.

  • Rate limits. NVIDIA's free tier is around 40 requests per minute. Coding agents can burst higher than that during a multi-file refactor. If you see 429s, slow the agent down or queue the long edits.
  • Context cost. Nemotron 3 Super has a 1M context window. The endpoint will happily accept huge prompts, but free-tier latency scales with prompt size. For agent loops, lean on smaller, scoped prompts.
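
For a feel of what "slow the agent down" means in numbers, here is a small sketch of the arithmetic. It assumes the roughly 40 requests per minute quoted above; check your own limits on build.nvidia.com. Both helper names are made up for this guide.

```shell
#!/usr/bin/env bash
# Minimum gap between requests, in milliseconds, to stay under a per-minute cap.
# Integer division, so the result is rounded down.
min_interval_ms() {
  echo $(( 60000 / $1 ))
}

# Exponential backoff after a 429: 1, 2, 4, ... seconds, capped at 60.
# $1 is the retry attempt, starting at 1.
backoff_seconds() {
  local delay
  delay=$(( 1 << ($1 - 1) ))
  if [ "$delay" -gt 60 ]; then
    delay=60
  fi
  echo "$delay"
}
```

At 40 RPM, min_interval_ms 40 works out to 1500, so anything that fires requests more often than every 1.5 seconds on average will eventually see 429s. A wrapper script could sleep for backoff_seconds after each 429 before retrying.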

If you outgrow the free tier, the same config works against the OpenRouter free Nemotron mirror (different limits, same model), or you can swap the baseURL to any other OpenAI-compatible endpoint without touching the rest of the setup.

Alternatives if OpenCode does not fit

The same NVIDIA endpoint slots into other open-source agents the same way: set baseURL, paste the key, pick the model.

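  • Aider. Best if you want a git-aware, multi-file editor in your terminal rather than a full agent loop. Configure with --openai-api-base https://integrate.api.nvidia.com/v1 and --model openai/nvidia/nemotron-3-super-120b-a12b (Aider expects the openai/ prefix for OpenAI-compatible endpoints), and pass the key via --openai-api-key or the OPENAI_API_KEY environment variable.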
  • Cline. VS Code extension. In the provider settings, choose "OpenAI Compatible", paste the base URL and key, and enter the model ID exactly.
  • OpenHands. Heavier agent with a Kubernetes mode. Same Custom Model / Base URL fields in its setup wizard.
  • Hermes Agent. If you want messaging-app integration and a learning loop rather than a coding-only agent, swap the provider in its config to NVIDIA and you get the same free model behind a Telegram bot.

What it actually costs

  • VPS: $4 to $7 per month.
  • Model inference: $0.
  • Agent: $0.

Under $7 a month for an always-on coding agent backed by a free 120B model. If you want a paid fallback for when the free tier is rate-limited, an OpenRouter top-up of $5 covers a lot of overflow.

For reference, comparable hosted setups (Cursor Pro, Claude Code Max) run $20 to $200 a month and own the workflow end to end.

Troubleshooting

  • 401 Unauthorized from NVIDIA. Key is wrong, or the Authorization: Bearer header is missing. Re-test with the curl from Step 3.
  • 404 model not found. Model ID has drifted. Open the model page on build.nvidia.com and copy the exact slug.
  • Agent hangs on long edits. You are probably hitting the roughly 40 RPM ceiling. Either drop concurrency or split the task.
  • Systemd service exits immediately. Check journalctl --user -u opencode --since "5 min ago". The usual causes are a missing NVIDIA_API_KEY in the unit's environment or an ExecStart path that does not match where opencode is installed.
  • Cannot reach the TUI from the laptop. Confirm the SSH tunnel is up and that you are pointing the desktop app at http://localhost:4096, not the VPS IP.

Disclaimer: I have no affiliation with NVIDIA, OpenCode, Hetzner, Netcup, Contabo, or OVHcloud. This is informational. Free tiers and prices change; double-check before you commit.
