You are an expert at deploying and managing infrastructure on Nebius AI Cloud using the nebius CLI.
Always use for:
Do NOT use when:
Before running ANY nebius commands, verify the CLI is installed and authenticated:
bash ${CLAUDE_SKILL_DIR}/scripts/check-nebius-cli.sh
If the check fails, follow the three setup steps below.
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash
exec -l $SHELL # reload shell so 'nebius' is on PATH
nebius version # confirm installation
nebius profile create
# Follow the interactive prompts:
# - Profile name (press Enter for "default")
# - Accept defaults for endpoint and federation
# - A browser window opens — log in with your Nebius account
If your token expires later, re-authenticate by re-running nebius profile create (reopens browser login).
For non-interactive / headless environments (CI/CD, containers, Claude Code on the web),
the user must provide their IAM token. They can obtain it on their local machine:
nebius iam get-access-token
Then pass it to the headless environment:
export NEBIUS_IAM_TOKEN="<paste-token-here>"
> Important: The IAM token (from nebius iam get-access-token) is for Nebius Cloud CLI operations
> (creating endpoints, VMs, etc.). This is different from the Token Factory API key (starts with v1.),
> which is used for model inference inside your deployed container.
If your profile does not have a parent-id set:
# List projects (requires tenant-id in your profile, or pass it explicitly):
nebius iam project list --format json | jq -r '.items[] | {id: .metadata.id, name: .metadata.name, region: .status.region}'
# Set the project:
nebius config set parent-id <PROJECT_ID>
nebius iam whoami --format json
nebius ai endpoint list --format json # Should return empty list or endpoints
--format json when running nebius commands programmatically. Parse output with jq.list commands to see if a resource already exists before creating a duplicate.eu-west1 uses cpu-d3 (not cpu-e2). Always confirm which region the user is targeting.network_ssd, NOT network-ssd. Also network_hdd, network_ssd_io_m3.--source-image-family-image-family (double "image-family"), NOT --source-image-family.--container-port 8080 --container-port 18789 to expose both health + dashboard.nebius — not root, ubuntu, admin, or user. This is the default for Nebius VMs and endpoints./32 suffix — strip it: jq -r '.status.network_interfaces[0].public_ip_address.address' | cut -d/ -f1.Deploy ML models or agent containers as auto-scaling serverless endpoints.
For detailed commands and parameters, see references/ai-endpoints-reference.md.
Quick deploy (CPU-only, uses Token Factory for inference):
WEB_PASSWORD=$(openssl rand -base64 24)
nebius ai endpoint create \
--name my-endpoint \
--image <REGISTRY_IMAGE> \
--platform cpu-e2 \
--container-port 8080 \
--container-port 18789 \
--env "TOKEN_FACTORY_API_KEY=<key>" \
--env "INFERENCE_MODEL=<model>" \
--env "OPENCLAW_WEB_PASSWORD=${WEB_PASSWORD}" \
--public
# SSH: nebius ai endpoint ssh <ENDPOINT_ID>
# Tunnel: ssh -f -N -L 28789:<PUBLIC_IP>:18789 nebius@<PUBLIC_IP>
# Dashboard: http://localhost:28789/#token=${WEB_PASSWORD}&gatewayUrl=ws://localhost:28789
GPU deploy (self-hosted inference):
nebius ai endpoint create \
--name my-gpu-endpoint \
--image <REGISTRY_IMAGE> \
--platform gpu-h100-sxm \
--preset 1gpu-16vcpu-200gb \
--container-port 8080 \
--disk-size 100Gi \
--shm-size 16Gi \
--public \
--auth token \
--token "<auth-token>"
Create GPU or CPU virtual machines for training, development, or hosting.
For detailed commands, see references/compute-reference.md.
Create managed Kubernetes clusters with GPU node groups.
For detailed commands, see references/kubernetes-reference.md.
Build and push container images to Nebius Container Registry.
For detailed commands, see references/registry-reference.md.
Create networks and subnets required by VMs and endpoints.
For detailed commands, see references/networking-reference.md.
Manage service accounts, access keys, and project access.
For detailed commands, see references/iam-reference.md.
For programmatic access beyond the CLI — application code, CI/CD pipelines, or infrastructure-as-code:
| Tool | Install | Best For |
|---|---|---|
| --- | --- | --- |
| Go SDK | go get github.com/nebius/gosdk | Go applications, server-side automation |
| Python SDK | pip install nebius | Scripts, ML pipelines, async apps |
| Terraform | Provider: nebius/nebius | Infrastructure-as-code |
| Raw gRPC | grpcurl | Debugging, one-off queries |
Quick Python example:
from nebius.sdk import SDK
from nebius.api.nebius.ai.v1 import EndpointServiceClient, ListEndpointsRequest
sdk = SDK() # reads NEBIUS_IAM_TOKEN env var
svc = EndpointServiceClient(sdk)
endpoints = await svc.list(ListEndpointsRequest(parent_id=project_id))
For full SDK reference, authentication patterns, and all gRPC endpoints, see references/api-reference.md.
| Platform | GPU | VRAM | Best For |
|---|---|---|---|
| --- | --- | --- | --- |
gpu-h100-sxm | H100 | 80 GB | General inference, training |
gpu-h200-sxm | H200 | 141 GB | Large model inference |
gpu-b200-sxm | B200 | 180 GB | Next-gen workloads |
gpu-b300-sxm | B300 | 288 GB | Largest models |
gpu-l40s-pcie | L40S | 48 GB | Cost-effective inference |
cpu-e2 | None | N/A | CPU-only (eu-north1, us-central1) |
cpu-d3 | None | N/A | CPU-only (eu-west1 only) |
| Region | Location | CPU Platform | Notes |
|---|---|---|---|
| --- | --- | --- | --- |
eu-north1 | Finland | cpu-e2 | Primary region |
eu-west1 | Paris | cpu-d3 | Use cpu-d3 NOT cpu-e2 |
us-central1 | US | cpu-e2 | Requires separate project |
| Preset | vCPU | RAM | GPU |
|---|---|---|---|
| --- | --- | --- | --- |
1gpu-16vcpu-200gb | 16 | 200 GB | 1x GPU |
2gpu-32vcpu-400gb | 32 | 400 GB | 2x GPU |
4gpu-64vcpu-800gb | 64 | 800 GB | 4x GPU |
8gpu-128vcpu-1600gb | 128 | 1600 GB | 8x GPU |
For complete step-by-step deployment workflows, see:
OpenClaw is an open-source AI agent platform. NemoClaw is an NVIDIA security container that wraps OpenClaw, adding sandboxed execution and enhanced planning — ideal for GPU endpoints with local models.
When the user asks to deploy OpenClaw (or an AI agent), follow these steps:
Step 1. Choose image and model:
# OpenClaw (CPU, lightweight, Token Factory inference):
IMAGE="ghcr.io/colygon/openclaw-serverless:latest"
MODEL="zai-org/GLM-5" # or: deepseek-ai/DeepSeek-R1-0528, MiniMaxAI/MiniMax-M2.5, zai-org/GLM-4.5
# NemoClaw (GPU, local model, NVIDIA security container):
IMAGE="ghcr.io/colygon/nemoclaw-serverless:latest"
Step 2. Get a Token Factory API key (if using Token Factory):
# Check if user has a key in MysteryBox:
nebius mysterybox secret list --format json | jq '.items[] | {name: .metadata.name, id: .metadata.id}'
# Retrieve key from MysteryBox:
nebius mysterybox payload get --secret-id <SECRET_ID> --format json | jq -r '.data[0].string_value'
# Or ask user for their key from https://tokenfactory.nebius.com
Step 3. Determine region, CPU platform, and Token Factory URL:
# eu-north1 (Finland) → cpu-e2 eu-west1 (Paris) → cpu-d3 us-central1 (US) → cpu-e2
REGION="eu-north1"
PLATFORM="cpu-e2"
PRESET="2vcpu-8gb"
# Token Factory URL — US region uses a different endpoint:
if [[ "$REGION" == "us-central1" ]]; then
TOKEN_FACTORY_URL="https://api.tokenfactory.us-central1.nebius.com/v1"
else
TOKEN_FACTORY_URL="https://api.tokenfactory.nebius.com/v1"
fi
Step 4. Generate a gateway password:
PASSWORD=$(openssl rand -hex 16)
echo "Dashboard password: $PASSWORD" # Save this — needed to connect
Step 5. Create the endpoint (private IP by default):
By default, create endpoints without --public to avoid hitting the public IP quota (default: 3 per tenant).
The endpoint gets a private IP on the VPC subnet. You access it via nebius ai endpoint ssh.
nebius ai endpoint create \
--name openclaw-agent \
--image "$IMAGE" \
--platform "$PLATFORM" \
--preset "$PRESET" \
--container-port 8080 \
--container-port 18789 \
--disk-size 250Gi \
--env "TOKEN_FACTORY_API_KEY={user's key}" \
--env "TOKEN_FACTORY_URL=${TOKEN_FACTORY_URL}" \
--env "INFERENCE_MODEL=$MODEL" \
--env "OPENCLAW_WEB_PASSWORD=$PASSWORD" \
--public \
--ssh-key "$(cat ~/.ssh/id_ed25519.pub 2>/dev/null || echo '')" \
--format json
> Public IP quota: The default limit is 3 public IPs per tenant. Stopped endpoints with public IPs
> still consume quota. Delete unused endpoints with nebius ai endpoint delete to free IPs.
> You can omit --public for private-IP-only endpoints, but nebius ai endpoint ssh and direct SSH
> both require a public IP.
Step 6. Wait for RUNNING state:
# Poll until ready (typically 1-3 minutes):
nebius ai endpoint get <ENDPOINT_ID> --format yaml | grep -A5 'state\|private_ip\|public_ip'
Step 7. Access the endpoint terminal (SSH):
nebius ai endpoint ssh <ENDPOINT_ID>
# or: ssh -o StrictHostKeyChecking=no nebius@<PUBLIC_IP>
> Note: nebius ai endpoint ssh requires a public IP on the endpoint.
Once inside the endpoint, verify the container is healthy:
curl http://localhost:8080
# Expected: {"status":"healthy","service":"openclaw-serverless","model":"zai-org/GLM-5",...}
Step 8. Set up SSH tunnel for the dashboard:
From your local machine (not the endpoint), tunnel port 18789 through the endpoint:
ssh -f -N -o StrictHostKeyChecking=no -L 28789:<PUBLIC_IP>:18789 nebius@<PUBLIC_IP>
Step 9. Auto-approve device pairing (the gateway token must be passed or the approve command fails with "unauthorized"):
ssh -o StrictHostKeyChecking=no nebius@<PUBLIC_IP> \
"sudo docker exec \$(sudo docker ps -q | head -1) \
env OPENCLAW_GATEWAY_TOKEN=$PASSWORD openclaw devices approve --latest"
Step 10. Tell the user how to connect (always use localhost URLs, never direct IP):
http://localhost:28789/#token=&gatewayUrl=ws://localhost:28789 openclaw tui --url ws://localhost:28789 --token ssh -f -N -L 28789::18789 nebius@ Pre-built images are available on GitHub Container Registry (no build required):
# OpenClaw only (lightweight, ~400 MB, CPU)
ghcr.io/colygon/openclaw-serverless:latest
# NemoClaw (OpenClaw in NVIDIA security container, ~1.1 GB, GPU-ready)
ghcr.io/colygon/nemoclaw-serverless:latest
After deploying OpenClaw, you can install the Nebius provider plugin to give your agent access to 44+ open-source models (Qwen, DeepSeek, Llama, GLM, FLUX, and more) via Nebius Token Factory.
Install the plugin:
openclaw plugins install clawhub:@colygon/openclaw-nebius
Configure the API key (both locations needed):
# 1. Auth profile (for the agent) — edit ~/.openclaw/agents/main/agent/auth-profiles.json:
# Add "nebius:default": { "type": "api_key", "provider": "nebius", "key": "v1.YOUR_KEY" }
# Add "nebius": "nebius:default" under "lastGood"
# 2. Environment variable (for the gateway LaunchAgent):
launchctl setenv NEBIUS_API_KEY "v1.YOUR_KEY_HERE"
Enable and restart:
openclaw config set plugins.allow '["nebius"]'
openclaw gateway restart
Verify:
openclaw plugins inspect nebius
openclaw models list --provider nebius
Set default model (optional):
openclaw config set agents.defaults.model.primary "nebius/deepseek-ai/DeepSeek-V3.2"
Model naming: Always use the nebius/ prefix — nebius/zai-org/GLM-5 (correct), not zai-org/GLM-5 (will fail with "Unknown model").
Popular models: nebius/Qwen/Qwen3.5-397B-A17B, nebius/deepseek-ai/DeepSeek-V3.2, nebius/zai-org/GLM-5, nebius/openai/gpt-oss-120b. See the tokenfactory-plugin package for the full catalog and pricing.
The entrypoint script generates ~/.openclaw/openclaw.json at container startup. Critical rules:
```json
"gateway": {
"auth": { "mode": "token", "token": "${OPENCLAW_GATEWAY_TOKEN}" }
}
```
OPENCLAW_WEB_PASSWORD → OPENCLAW_GATEWAY_TOKEN: The deploy UI sets OPENCLAW_WEB_PASSWORD, but the gateway reads OPENCLAW_GATEWAY_TOKEN. The entrypoint must map them:```bash
export OPENCLAW_GATEWAY_TOKEN="${OPENCLAW_WEB_PASSWORD:-${GATEWAY_TOKEN:-openclaw-$(hostname)}}"
```
plugins key: "plugins": { "nemoclaw": { "enabled": true } } is NOT a valid OpenClaw config key and crashes the gateway with "Config invalid - plugins: Unrecognized key". NemoClaw loads automatically when present in the container.allowedOrigins: ["*"]: Required for dashboard access through reverse proxies:```json
"gateway": {
"controlUi": { "allowedOrigins": ["*"] }
}
```
zai-org/GLM-5, NOT THUDM/GLM-4-9B-0414. Check available models: curl -s $TOKEN_FACTORY_URL/models -H "Authorization: Bearer $KEY" | jq '.data[].id'--container-port 18789 alongside --container-port 8080--env "OPENCLAW_WEB_PASSWORD=" ```bash
ssh -f -N -o StrictHostKeyChecking=no -L 28789:
```
http://localhost:28789/#token=
#token=), NOT query string (?token=)gatewayUrl MUST accompany token or it becomes "pending" and is ignored```bash
ssh nebius@
env OPENCLAW_GATEWAY_TOKEN=
```
# Option A: Via Nebius CLI (requires public IP):
nebius ai endpoint ssh <ENDPOINT_ID>
# Option B: Direct SSH (requires public IP):
ssh -o StrictHostKeyChecking=no nebius@<PUBLIC_IP>
# Once inside, check the container:
sudo docker ps
sudo docker logs $(sudo docker ps -q | head -1) --tail 50
curl http://localhost:8080 # health check
# 1. SSH tunnel (if not already set up for dashboard):
ssh -f -N -o StrictHostKeyChecking=no -L 28789:<PUBLIC_IP>:18789 nebius@<PUBLIC_IP>
# 2. Approve device pairing (first time only — must include gateway token):
ssh nebius@<PUBLIC_IP> "sudo docker exec \$(sudo docker ps -q | head -1) \
env OPENCLAW_GATEWAY_TOKEN=<password> openclaw devices approve --latest"
# 3. Connect:
openclaw tui --url ws://localhost:28789 --token <password>
Note: Port 18789 on localhost may conflict with a local OpenClaw gateway. Use a different port (e.g., 28789).
openclaw config get gateway.auth.token returns __OPENCLAW_REDACTED__. Read raw JSON: cat ~/.openclaw/openclaw.json--auth none with --bind lan: Must use --auth token or --auth password--force needs fuser/lsof: Minimal containers lack these. Start without --force if port is free.node:22 (not node:22-slim): NemoClaw's @whiskeysockets/baileys dependency needs build tools (python3, make, gcc) unavailable in slim.--ignore-scripts: npm install -g git+https://github.com/NVIDIA/NemoClaw.git --ignore-scripts — the baileys post-install script fails inside Docker BuildKit.Token Factory is the default, but OpenRouter and HuggingFace also work (routed through Nebius GPUs):
| Provider | Env Vars | Notes |
|---|---|---|
| --- | --- | --- |
| Token Factory | TOKEN_FACTORY_API_KEY, TOKEN_FACTORY_URL (see below) | Nebius native API |
| OpenRouter | OPENROUTER_API_KEY, INFERENCE_URL=https://openrouter.ai/api/v1, OPENROUTER_PROVIDER_ONLY=nebius | Restricted to Nebius provider |
| HuggingFace | HUGGINGFACE_API_KEY, HF_TOKEN=, HUGGINGFACE_PROVIDER=nebius | Provider set to Nebius |
Token Factory URLs by region:
eu-north1, eu-west1): https://api.tokenfactory.nebius.com/v1us-central1): https://api.tokenfactory.us-central1.nebius.com/v1| Gotcha | Details |
|---|---|
| --- | --- |
| CLI install URL changed | Old storage.ai.nebius.cloud doesn't resolve. Use storage.eu-north1.nebius.cloud/cli/install.sh |
| macOS binary on Linux | Copying ~/.nebius/bin/ from Mac to Linux gives Exec format error. Only copy config files, reinstall CLI on Linux. |
--source-image-family wrong | Correct flag is --source-image-family-image-family (double "image-family") |
| Disk too small | ubuntu22.04-cuda12 needs minimum 50 GiB. 30 GiB fails with validation error. |
network-ssd wrong | Use underscores: network_ssd. Hyphens are rejected. |
cpu-e2 in eu-west1 | Fails. eu-west1 only has cpu-d3. Check nebius compute platform list. |
Public IP has /32 | public_ip_address.address returns 1.2.3.4/32. Strip suffix with cut -d/ -f1. |
| SSH user | Always nebius on endpoints/VMs, NOT root/ubuntu/admin. |
nebius iam whoami user name | Name is at user_profile.attributes.name, NOT identity.display_name. |
| Session/profile scoping | nebius iam project list is scoped to active profile's parent. Use --parent-id for cross-region. |
nebius profile create | Requires interactive input. Write directly to ~/.nebius/config.yaml in scripts. |
--force in containers | Needs fuser/lsof (missing in minimal containers). Start without --force if port is free. |
| Federation auth on headless VMs | nebius iam get-access-token opens a browser on headless VMs. Use service accounts instead. |
| Service account key size | Must be 4096-bit RSA (openssl genrsa 4096). 2048-bit is rejected. |
| SA registry push denied | Service account needs editors group membership: nebius iam group-membership create --parent-id |
| SA config on VM | Set auth-type: service account, service-account-id, public-key-id, and private-key-file-path in ~/.nebius/config.yaml. Remove Mac paths when copying config to Linux. |
| Public IP quota | Default limit is 3 public IPs per tenant. Delete unused endpoints to free IPs. |
| SSH keys at creation only | --ssh-key must be passed during nebius ai endpoint create. Cannot add SSH keys after creation. Generate .pub from private key: ssh-keygen -y -f key > key.pub |
| Registry image list | nebius registry image list --parent-id lists images. Use nebius registry list to find registry IDs. |
| Token Factory model IDs | Use zai-org/GLM-5, NOT THUDM/GLM-4-9B-0414. List models: curl -s $TOKEN_FACTORY_URL/models -H "Authorization: Bearer $KEY" |
| Token Factory URL wrong for region | EU uses https://api.tokenfactory.nebius.com/v1. US (us-central1) uses https://api.tokenfactory.us-central1.nebius.com/v1. Wrong URL → silent 401 or model not found. |
plugins key in openclaw.json | Invalid and crashes gateway. NemoClaw is auto-loaded via npm global install. |
| Problem | Solution | |
|---|---|---|
| --- | --- | |
nebius: command not found | Install: `curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh \ | bash && exec -l $SHELL` |
UNAUTHENTICATED (exit code 7) | Re-run nebius profile create (user auth) or check service account key | |
PERMISSION_DENIED (exit code 15) | Check IAM group membership. User needs editors group. | |
NOT_FOUND (exit code 13) | Verify resource ID and that you're in the correct project/profile | |
RESOURCE_EXHAUSTED / QuotaFailure (exit code 24) | Request quota increase in Nebius console | |
NOT_ENOUGH_RESOURCES (exit code 25) | Try a different region or smaller preset | |
nebius profile create hangs | Requires interactive terminal. Write ~/.nebius/config.yaml directly in scripts. | |
nebius init not found | Use nebius profile create instead — init does not exist. | |
| Can't authenticate in container/CI | Use service account auth. See references/iam-reference.md. | |
| Wrong region platform | eu-west1 uses cpu-d3, not cpu-e2. Run nebius compute platform list --format json. | |
| Federation auth expires on VM | Token expires, nebius iam get-access-token opens browser. Create a service account with 4096-bit RSA key, add to editors group, update ~/.nebius/config.yaml with auth-type: service account. | |
docker push denied | Service account needs editors group. Find group: nebius iam group list --parent-id . Add SA: nebius iam group-membership create --parent-id . | |
| Public IP quota exceeded | Default is 3 IPs per tenant. Delete unused endpoints: nebius ai endpoint delete . | |
| OpenClaw gateway token mismatch | Token must be in config file (gateway.auth.token), not just env var. After manual restart, env may be lost. | |
| OpenClaw "Config invalid" | Remove plugins key from openclaw.json. Only recognized top-level keys: agents, models, gateway, channels. | |
| OpenClaw "device identity" error | Browser requires HTTPS or localhost. Set up SSH tunnel: ssh -f -N -L 28789:, then use http://localhost:28789/... | |
| OpenClaw "pairing required" | Must pass gateway token: ssh nebius@. Each new client needs approval. | |
| OpenClaw 404 on inference | Wrong model ID format. Token Factory uses zai-org/GLM-5, not HuggingFace format THUDM/GLM-4-9B-0414. Also check TOKEN_FACTORY_URL matches the region (US needs api.tokenfactory.us-central1.nebius.com). |
For full authentication options, see Nebius CLI docs and references/iam-reference.md.
--format json - Parse output programmatically to avoid errors.list commands.cpu-e2) for agent/orchestration workloads that don't need local GPU. Suggest stopping endpoints/VMs when not in use.共 1 个版本