A Reddit post in r/SideProject made the rounds recently with a neat trick: NVIDIA's free Nemotron 3 Super endpoint plus a hosted coding agent gets you a working "free Cursor in the cloud" in about ten minutes. The catch is the agent in that post is a paid SaaS. The model is free, the wrapper is not.
You can get the same result without the SaaS. NVIDIA's Nemotron endpoint speaks the OpenAI-compatible protocol, so any open-source coding agent that lets you set a base URL and a model name will work. This guide wires up Nemotron 3 Super to OpenCode, the most popular open-source Claude Code alternative, running on a $5 VPS you control.
If you already read my Hermes Agent on Hetzner guide, the VPS hardening section will look familiar. This post focuses on the bits that are different: the Nemotron free tier, agent choice, and a cheap-VPS comparison so you can pick a host that fits.
What you actually get
- Model: free 120B Nemotron via NVIDIA
- Agent: OpenCode (open source)
- Server: a $5 VPS handles the rest
Two free things stacked on top of one cheap thing.
- NVIDIA Nemotron 3 Super 120B-a12b. A 120B hybrid Mamba/Transformer MoE that activates 12B parameters per token. 1M context window. Released March 2026 under NVIDIA's open license. Free tier on build.nvidia.com with rate limits in the ~40 RPM range, no per-token cost. The same model is also mirrored as a free tier on OpenRouter if you prefer to go through them.
- OpenCode. Open-source coding agent, TUI plus IDE extensions, client/server architecture so you can run the server on the VPS and connect from your laptop or phone. Works with any OpenAI-compatible endpoint.
- A $4 to $7 VPS. 2 vCPU and 4 GB RAM is plenty. The agent process is light. Inference is remote.
Why not just use the SaaS in the original post?
- Lock-in. Your prompts, history, repos, and connectors live in a third party. If they pivot or raise prices, you migrate.
- Cost path. Free tiers on hosted agent SaaS tend to shrink once usage picks up. The model being free does not mean the wrapper stays free.
- Privacy. A self-hosted agent on your VPS sees your code on disk. Inference still leaves the box to NVIDIA, but the rest of the pipeline is yours.
- It is not harder. This whole setup is fewer steps than the SaaS flow once you have an SSH key.
Pick a VPS
- Default: Hetzner CX22
- More RAM: Contabo VPS S
- OS: Ubuntu 24.04
All four of these run the agent comfortably. Prices as of May 2026.
| Provider | Spec | Price (USD/month) | Notes |
|---|---|---|---|
| Hetzner CX22 (recommended) | 2 vCPU · 4 GB · 40 GB NVMe | $4.59 | Best price/performance. EU and US locations. 20 TB traffic. |
| Netcup VPS 500 G12 | 2 vCPU · 4 GB · 128 GB NVMe | $4.69 | DE only. Bigger disk, DDR5 ECC RAM. |
| OVHcloud VPS Starter | 1 vCPU · 2 GB | $4.20 | Cheapest entry but only 1 vCPU. Tight for anything else on the box. |
| Contabo VPS S | 4 vCPU · 8 GB · 100 GB NVMe | $6.99 | Most RAM and disk for the money. Shared vCPU can throttle. |
For a single-tenant coding agent, Hetzner CX22 is what I would default to. If you want more headroom to run side projects on the same box, Contabo VPS S gives you double the RAM for a couple of dollars more, with the caveat that the vCPUs are shared and noisier under load.
The rest of this guide uses Ubuntu 24.04 commands, but they map cleanly to any of the four.
Get a free NVIDIA API key
- Account: NVIDIA developer (free)
- Key prefix: nvapi-
- Verify: smoke-test with curl
- Sign in at build.nvidia.com. A regular NVIDIA developer account is fine.
- Open build.nvidia.com/settings/api-keys and generate a key. It is prefixed nvapi-.
- Find the Nemotron 3 Super 120B-a12b entry and confirm the model ID. At time of writing it is nvidia/nemotron-3-super-120b-a12b. NVIDIA occasionally renames variants, so always read the page rather than copy-pasting from a tutorial.
The endpoint is https://integrate.api.nvidia.com/v1. It speaks the standard OpenAI chat-completions protocol.
A quick smoke test from your laptop, before touching the VPS. Export NVIDIA_API_KEY in that shell first so the Authorization header is not empty:
curl https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Authorization: Bearer $NVIDIA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/nemotron-3-super-120b-a12b",
    "messages": [{"role": "user", "content": "Reply with a single word: ready"}]
  }'
If that returns a JSON response with ready in the content, the key works.
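The eyeball check can also be scripted. This is a minimal sketch, assuming the response follows the usual OpenAI chat-completions shape with the reply in a "content" field. The function name reply_is_ready is made up for this example, and the grep is a rough text match, not a JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any NVIDIA or OpenCode tooling).
# Reads a chat-completions response on stdin and succeeds if a "content"
# field contains the word "ready" (case-insensitive).
# This is a plain-text match, not real JSON parsing.
reply_is_ready() {
  grep -Eiq '"content"[[:space:]]*:[[:space:]]*"[^"]*ready'
}

# Usage (makes a real API call, so it is left commented out):
# curl -s https://integrate.api.nvidia.com/v1/chat/completions \
#   -H "Authorization: Bearer $NVIDIA_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d '{"model": "nvidia/nemotron-3-super-120b-a12b", "messages": [{"role": "user", "content": "Reply with a single word: ready"}]}' \
#   | reply_is_ready && echo "key works" || echo "check key and model ID"
```

If the model wraps its answer in extra text, the match still passes as long as the word appears somewhere in the content field.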
Provision and harden the VPS
- OS: Ubuntu 24.04
- User: non-root with sudo
- Firewall: ufw default-deny + ssh allow
Pick a provider, deploy Ubuntu 24.04, paste your SSH public key during creation, and grab the public IP.
ssh root@YOUR-VPS-IP
apt update && apt upgrade -y
apt install -y curl git ufw
Make a non-root user and put SSH and the firewall in a reasonable shape.
adduser agent --disabled-password --gecos ""
usermod -aG sudo agent
echo "agent ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/agent
chmod 440 /etc/sudoers.d/agent
mkdir -p /home/agent/.ssh
cp ~/.ssh/authorized_keys /home/agent/.ssh/ 2>/dev/null || true
chown -R agent:agent /home/agent/.ssh
chmod 700 /home/agent/.ssh
chmod 600 /home/agent/.ssh/authorized_keys
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw enable
Before you close the root session, run ssh agent@YOUR-VPS-IP from a second terminal and confirm it works. If ~/.ssh/authorized_keys was not on the root account at create time, the copy above silently does nothing and the new user has no keys.
Switch over and stay on the agent account from here on.
su - agent
Install and point OpenCode at Nemotron
- Install: single-binary script
- Provider: custom OpenAI-compatible
- Verify: TUI model picker
OpenCode ships a single-binary installer.
curl -fsSL https://opencode.ai/install | bash
source ~/.bashrc
opencode --version
If you would rather run it under Node, npm i -g opencode-ai works too. Either way, confirm it is on PATH before continuing.
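The PATH check can be made explicit. This is a small sketch built on the shell's own command -v; the function name require_on_path is hypothetical. For an installed binary it prints the resolved path, which is worth noting down because service files need an absolute path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print where the shell resolves a command, or fail
# with a hint on stderr if the command cannot be found on PATH.
require_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    command -v "$1"
  else
    echo "$1 is not on PATH; re-open the shell or check the installer output" >&2
    return 1
  fi
}

# Usage:
#   require_on_path opencode
```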
OpenCode reads provider config from ~/.config/opencode/opencode.json. Create it (and the directory, with mkdir -p ~/.config/opencode, if it is missing) with a custom OpenAI-compatible provider entry:
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "nv": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "NVIDIA NIM",
      "options": {
        "baseURL": "https://integrate.api.nvidia.com/v1",
        "apiKey": "{env:NVIDIA_API_KEY}"
      },
      "models": {
        "nvidia/nemotron-3-super-120b-a12b": {
          "name": "Nemotron 3 Super 120B"
        }
      }
    }
  },
  "model": "nv/nvidia/nemotron-3-super-120b-a12b"
}
NVIDIA's model ID already starts with nvidia/, so a short provider key like nv keeps the final model string readable. You can name the provider anything; the model ID has to match what NVIDIA publishes.
Export the key so OpenCode can read it. Drop it into ~/.bashrc so it survives reboots.
echo 'export NVIDIA_API_KEY="nvapi-your-key-here"' >> ~/.bashrc
source ~/.bashrc
chmod 600 ~/.config/opencode/opencode.json
Start it and confirm the model is selected.
mkdir -p ~/projects && cd ~/projects
opencode
Inside the TUI, the model picker should list NVIDIA NIM › Nemotron 3 Super 120B. Pick it, ask it to write a small file, and confirm it actually performs the action. If it errors out on the call, the message is almost always a bad key or a wrong model ID.
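Since the two usual culprits are the key and the model ID, both can be checked before launching. This is a rough pre-flight sketch, not something OpenCode ships: the function name preflight is hypothetical, and it only checks that the key has the nvapi- prefix and that the model ID appears as a quoted string in the config file.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check. Arguments: config file path, model ID.
# Reads the key from the NVIDIA_API_KEY environment variable.
preflight() {
  local config model
  config=$1
  model=$2
  case "${NVIDIA_API_KEY:-}" in
    nvapi-?*) ;;  # key is set and has the expected prefix
    *) echo "NVIDIA_API_KEY is unset or does not start with nvapi-" >&2
       return 1 ;;
  esac
  if [ ! -r "$config" ]; then
    echo "cannot read $config" >&2
    return 1
  fi
  # Fixed-string match on the quoted model ID, as it appears under "models"
  if ! grep -qF "\"$model\"" "$config"; then
    echo "model ID $model not found in $config" >&2
    return 1
  fi
  echo "ok"
}

# Usage:
#   preflight ~/.config/opencode/opencode.json nvidia/nemotron-3-super-120b-a12b
```

A passing check does not prove the key is valid, only that it is present and well-formed; the curl smoke test remains the real test.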
Run OpenCode as a persistent service
- Unit type: user-level systemd
- Binding: 127.0.0.1 only
- Access: SSH port-forward from laptop
The point of putting this on a VPS is so you do not have to keep an SSH session open. A user-level systemd unit handles that.
Put the API key in a dedicated env file so it never lands in a world-readable unit file or journal line.
mkdir -p ~/.config/opencode
echo 'NVIDIA_API_KEY=nvapi-your-key-here' > ~/.config/opencode/opencode.env
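chmod 600 ~/.config/opencode/opencode.env
Create the unit file at ~/.config/systemd/user/opencode.service (run mkdir -p ~/.config/systemd/user first if the directory is missing). If command -v opencode prints a different path than the one in ExecStart, use that path instead.
[Unit]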
Description=OpenCode agent server
After=network.target
[Service]
Type=simple
WorkingDirectory=/home/agent/projects
EnvironmentFile=%h/.config/opencode/opencode.env
ExecStart=/home/agent/.local/bin/opencode serve --port 4096 --hostname 127.0.0.1
Restart=on-failure
[Install]
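WantedBy=default.target
Enable it and tail the logs. The enable-linger line matters: without lingering, systemd stops user services when the agent user's last session ends. If systemctl --user complains that it cannot connect to the bus, you are probably in a su - session; log in over SSH as agent directly.
chmod 600 ~/.config/systemd/user/opencode.service
sudo loginctl enable-linger agent
systemctl --user daemon-reload
systemctl --user enable --now opencode
journalctl --user -u opencode -f
The server now listens on 127.0.0.1:4096. Do not open that port in ufw. To reach it from your laptop, use SSH port-forwarding.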
ssh -L 4096:127.0.0.1:4096 agent@YOUR-VPS-IP
Now http://localhost:4096 on your laptop is the agent on the VPS. The OpenCode desktop app and IDE extensions both accept a remote server URL. Point them at that and they will drive the remote agent.
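If you open this tunnel often, the forward can live in your SSH client config instead of on the command line. This sketch prints a stanza using the standard Host, HostName, User, and LocalForward options; the function name print_tunnel_stanza and the alias opencode-vps are placeholders chosen for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ssh_config stanza that forwards local port
# 4096 to 127.0.0.1:4096 on the VPS. Arguments: host alias, VPS address.
print_tunnel_stanza() {
  local alias_name vps_ip
  alias_name=$1
  vps_ip=$2
  cat <<EOF
Host $alias_name
    HostName $vps_ip
    User agent
    LocalForward 4096 127.0.0.1:4096
EOF
}

# Usage (run on the laptop, not the VPS):
#   print_tunnel_stanza opencode-vps YOUR-VPS-IP >> ~/.ssh/config
#   ssh -N opencode-vps   # -N: forward the port without opening a remote shell
```

Review the output before appending it, since an existing Host entry with the same alias would take precedence.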
Sanity-check the free tier
Two things worth confirming early so you do not get a surprise mid-task.
- Rate limits. NVIDIA's free tier is around 40 requests per minute. Coding agents can burst higher than that during a multi-file refactor. If you see 429s, slow the agent down or queue the long edits.
- Context cost. Nemotron 3 Super has a 1M context window. The endpoint will happily accept huge prompts, but free-tier latency scales with prompt size. For agent loops, lean on smaller, scoped prompts.
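A limit of roughly 40 requests per minute works out to about one request every 1.5 seconds. For your own scripts against the endpoint, a retry wrapper with exponential backoff is one way to ride out 429s. This is a sketch under assumptions: retry_on_429 is a hypothetical name, and it expects the wrapped command to print only an HTTP status code, which curl can do with -o /dev/null -w '%{http_code}'. It does not change how OpenCode itself retries.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command that prints an HTTP status code and
# retry with exponential backoff while that code is 429.
# Arguments: max tries, initial delay in seconds, then the command to run.
retry_on_429() {
  local max_tries delay attempt status
  max_tries=$1
  delay=$2
  shift 2
  attempt=1
  status=""
  while [ "$attempt" -le "$max_tries" ]; do
    status=$("$@")
    if [ "$status" != "429" ]; then
      echo "$status"
      return 0
    fi
    # Wait before the next try, doubling the delay each time
    if [ "$attempt" -lt "$max_tries" ]; then
      sleep "$delay"
      delay=$((delay * 2))
    fi
    attempt=$((attempt + 1))
  done
  echo "$status"
  return 1
}

# Usage (post_chat is a placeholder for your own curl call):
#   post_chat() { curl -s -o /dev/null -w '%{http_code}' ... ; }
#   retry_on_429 5 2 post_chat
```

With 5 tries and a 2 second initial delay, the worst case waits 2 + 4 + 8 + 16 = 30 seconds before giving up.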
If you outgrow the free tier, the same config works against the OpenRouter free Nemotron mirror (different limits, same model), or you can swap the baseURL to any other OpenAI-compatible endpoint without touching the rest of the setup.
Alternatives if OpenCode does not fit
The same NVIDIA endpoint slots into other open-source agents the same way: set baseURL, paste the key, pick the model.
- Aider. Best if you want a git-aware, multi-file editor in your terminal rather than a full agent loop. Configure with --openai-api-base https://integrate.api.nvidia.com/v1 and --model openai/nvidia/nemotron-3-super-120b-a12b (Aider expects the openai/ prefix on model names for OpenAI-compatible endpoints).
- Cline. VS Code extension. In the provider settings, choose "OpenAI Compatible", paste the base URL and key, and enter the model ID exactly.
- OpenHands. Heavier agent with a Kubernetes mode. Same Custom Model / Base URL fields in its setup wizard.
- Hermes Agent. If you want messaging-app integration and a learning loop rather than a coding-only agent, swap the provider in its config to NVIDIA and you get the same free model behind a Telegram bot.
What it actually costs
- VPS: $4 to $7 per month.
- Model inference: $0.
- Agent: $0.
Under $7 a month for an always-on, OpenAI-compatible coding agent backed by a 120B model. If you want a hosted-model fallback for when the free tier is rate-limited, an OpenRouter top-up of $5 covers a lot of overflow.
For reference, comparable hosted setups (Cursor Pro, Claude Code Max) run $20 to $200 a month and own the workflow end to end.
Troubleshooting
- 401 Unauthorized from NVIDIA. Key is wrong, or the Authorization: Bearer header is missing. Re-test with the curl smoke test from the API key section.
- 404 model not found. Model ID has drifted. Open the model page on build.nvidia.com and copy the exact slug.
- Agent hangs on long edits. You are hitting the 40 RPM ceiling. Either drop concurrency or split the task.
- Systemd service exits immediately. Check journalctl --user -u opencode --since "5 min ago". Nine times out of ten it is a missing NVIDIA_API_KEY in the unit's environment.
- Cannot reach the TUI from the laptop. Confirm the SSH tunnel is up and that you are pointing the desktop app at http://localhost:4096, not the VPS IP.
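The HTTP-level entries above can be folded into a small lookup for scripted checks. This is only a sketch: explain_status is a made-up name, and the hints simply restate the list above for the status codes it mentions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code from the NVIDIA endpoint to
# the matching troubleshooting hint.
explain_status() {
  case "$1" in
    200) echo "ok" ;;
    401) echo "key is wrong or the Authorization: Bearer header is missing" ;;
    404) echo "model ID has drifted; copy the exact slug from build.nvidia.com" ;;
    429) echo "rate limited; drop concurrency or split the task" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage (makes a real API call, so it is left commented out):
#   code=$(curl -s -o /dev/null -w '%{http_code}' \
#     https://integrate.api.nvidia.com/v1/chat/completions \
#     -H "Authorization: Bearer $NVIDIA_API_KEY" \
#     -H "Content-Type: application/json" \
#     -d '{"model": "nvidia/nemotron-3-super-120b-a12b", "messages": [{"role": "user", "content": "ping"}]}')
#   explain_status "$code"
```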
Sources and further reading
- NVIDIA build platform and API keys
- Nemotron 3 Super on OpenRouter (free mirror)
- OpenCode providers documentation
Disclaimer: I have no affiliation with NVIDIA, OpenCode, Hetzner, Netcup, Contabo, or OVHcloud. This is informational. Free tiers and prices change; double-check before you commit.