Ubuntu Ollama

Run Ollama on Ubuntu with fleet routing across multiple Ubuntu machines. Set up Ubuntu Ollama with apt, systemd, and NVIDIA CUDA, and route Ollama requests across the fleet.
Author: twinsgeeks
Source: clawhub, v1.0.0 (latest), 1 version; API key: not required
Stars: 0 · Downloads: 325 · Installs: 1

Overview

Ubuntu Ollama — Fleet Routing for Ollama on Ubuntu

Run Ollama on Ubuntu with load balancing across multiple machines. Ubuntu Ollama Herd turns your Ubuntu servers and desktops into a single Ollama endpoint that routes each request to a machine in the fleet. Install with apt and pip, manage with systemd, and monitor with the web dashboard.

Ubuntu Ollama setup

Step 1: Install Ollama on Ubuntu

# Install Ollama on Ubuntu (official install script)
curl -fsSL https://ollama.com/install.sh | sh

# Verify Ollama is running on Ubuntu
ollama --version
systemctl status ollama

Step 2: Install Ubuntu Ollama Herd

# Ubuntu prerequisites
sudo apt update && sudo apt install python3-pip curl -y

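# Install Ubuntu Ollama fleet router
# On Ubuntu 23.04 and later, pip rejects system-wide installs
# ("externally-managed-environment"); use a virtual environment or pipx there.
pip install ollama-herd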

Step 3: Start Ubuntu Ollama router

On one Ubuntu machine (the router):

herd          # start Ubuntu Ollama router on port 11435 (runs in the foreground)
herd-node     # in a second terminal: register this Ubuntu Ollama node

On every other Ubuntu machine:

herd-node     # auto-discovers the Ubuntu Ollama router via mDNS

> No mDNS? Connect Ubuntu Ollama nodes directly: herd-node --router-url http://router-ip:11435

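When one launch script has to work on networks with and without mDNS, the choice between the two invocations can be wrapped in a small helper. The following is a minimal sketch: the function name node_command and its parameters are illustrative, the address is an example, and only the herd-node command and its --router-url flag come from this guide.

```python
from typing import List, Optional


def node_command(router_host: Optional[str] = None, port: int = 11435) -> List[str]:
    """Build the herd-node argument list.

    With no host, rely on mDNS auto-discovery; otherwise pass the
    router address explicitly via --router-url.
    """
    if router_host is None:
        return ["herd-node"]
    return ["herd-node", "--router-url", f"http://{router_host}:{port}"]


print(node_command())                # mDNS discovery
print(node_command("192.168.1.10"))  # explicit router (example address)
```

The returned list can be passed to subprocess.run as-is, which avoids shell quoting issues.
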
Step 4: Verify Ubuntu Ollama fleet

curl -s http://localhost:11435/fleet/status | python3 -m json.tool

Ubuntu Ollama systemd services

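Run the Ubuntu Ollama router and node as systemd services so they start automatically at boot. The units below assume the binaries are in /usr/local/bin; check with command -v herd and adjust ExecStart if they are installed elsewhere.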

# Ubuntu Ollama router service
sudo tee /etc/systemd/system/herd-router.service << 'EOF'
[Unit]
Description=Ubuntu Ollama Router
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# Ubuntu Ollama node service
sudo tee /etc/systemd/system/herd-node.service << 'EOF'
[Unit]
Description=Ubuntu Ollama Node
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd-node
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable --now herd-router   # router machine only
sudo systemctl enable --now herd-node     # every machine, including the router
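Because pip may place the herd and herd-node binaries somewhere other than /usr/local/bin, it can help to generate the unit text from whatever path is actually on PATH. The sketch below reuses the unit layout shown above; the function name render_unit and the template constant are illustrative, not part of Ollama Herd.

```python
import shutil
from typing import Optional

UNIT_TEMPLATE = """[Unit]
Description={description}
After=network.target ollama.service

[Service]
Type=simple
ExecStart={exec_path}
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
"""


def render_unit(description: str, command: str, exec_path: Optional[str] = None) -> str:
    """Fill the unit template, resolving `command` on PATH unless a path is given."""
    path = exec_path or shutil.which(command)
    if path is None:
        raise FileNotFoundError(f"{command} not found on PATH")
    return UNIT_TEMPLATE.format(description=description, exec_path=path)


# Explicit path here so the demo runs even where herd is not installed.
print(render_unit("Ubuntu Ollama Router", "herd", exec_path="/usr/local/bin/herd"))
```

Omitting exec_path makes the function look the command up with shutil.which, so the generated ExecStart matches the real install location. The output still has to be written to /etc/systemd/system/ with root privileges.
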

Use Ubuntu Ollama

OpenAI SDK

from openai import OpenAI

# Your Ubuntu Ollama fleet
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write an Ubuntu cron job for log rotation"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

curl (Ollama format)

# Ubuntu Ollama inference
curl http://localhost:11435/api/chat -d '{
  "model": "qwen3.5:32b",
  "messages": [{"role": "user", "content": "Explain Ubuntu apt package management"}],
  "stream": false
}'
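The same request can be sent from Python with only the standard library. This sketch assumes the router passes Ollama's non-streaming /api/chat response through unchanged, in which case the reply text sits under message.content; the function names are illustrative.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str,
                       base: str = "http://localhost:11435") -> urllib.request.Request:
    """Build the same POST as the curl command, without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        base + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str) -> str:
    """Send the request and return the reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt), timeout=300) as resp:
        body = json.load(resp)
    # Assumption: the router returns Ollama's body unchanged,
    # where the reply lives under message.content.
    return body["message"]["content"]
```

Calling chat("qwen3.5:32b", "Explain Ubuntu apt package management") mirrors the curl example; the long timeout allows for a model that still has to load.
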

curl (OpenAI format)

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4", "messages": [{"role": "user", "content": "Hello from Ubuntu Ollama"}]}'

Ubuntu Ollama NVIDIA CUDA setup

# Install NVIDIA drivers on Ubuntu for Ollama CUDA
sudo apt install nvidia-driver-550 -y
sudo reboot

# Verify Ubuntu NVIDIA CUDA
nvidia-smi

# Ubuntu Ollama automatically uses CUDA when NVIDIA drivers are installed
ollama ps    # with a model loaded, the PROCESSOR column should report GPU
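To check GPU memory from a script, for example before deciding which models to pull, nvidia-smi can emit CSV. The sketch below uses nvidia-smi's --query-gpu and --format options; the function names are illustrative, and the sample line in the demo is an example rather than captured output.

```python
import subprocess
from typing import List, Tuple

QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpus(csv_text: str) -> List[Tuple[str, int]]:
    """Parse 'name, MiB' lines into (name, total_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def list_gpus() -> List[Tuple[str, int]]:
    """Run nvidia-smi and parse its output; raises if the command is missing or fails."""
    out = subprocess.run(QUERY, check=True, capture_output=True, text=True).stdout
    return parse_gpus(out)


# Offline demo with an example line in the format nvidia-smi produces.
print(parse_gpus("NVIDIA GeForce RTX 4090, 24564\n"))
```
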

Ubuntu Ollama environment

# Optimize Ollama on Ubuntu via systemd
sudo systemctl edit ollama
# Add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=-1"
#   Environment="OLLAMA_NUM_PARALLEL=2"
sudo systemctl restart ollama

# Verify Ubuntu Ollama settings
systemctl show ollama | grep Environment
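For unattended setups where opening an editor is not practical, the same override can be generated as text. systemctl edit ollama stores its result in a drop-in file at /etc/systemd/system/ollama.service.d/override.conf; the sketch below renders equivalent content from a dictionary. The function name render_override is illustrative.

```python
from typing import Dict

# Drop-in file that `systemctl edit ollama` creates.
OVERRIDE_PATH = "/etc/systemd/system/ollama.service.d/override.conf"


def render_override(env: Dict[str, str]) -> str:
    """Render a [Service] drop-in with one Environment= line per variable."""
    lines = ["[Service]"]
    for key, value in env.items():
        lines.append(f'Environment="{key}={value}"')
    return "\n".join(lines) + "\n"


# Values taken from the settings listed above.
settings = {
    "OLLAMA_KEEP_ALIVE": "-1",
    "OLLAMA_MAX_LOADED_MODELS": "-1",
    "OLLAMA_NUM_PARALLEL": "2",
}
print(render_override(settings))
```

When the file is written by hand rather than through systemctl edit, run sudo systemctl daemon-reload before restarting ollama so systemd picks up the change.
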

Ubuntu Ollama model recommendations

Ubuntu machine              GPU memory   Best Ubuntu Ollama models
--------------------------  -----------  ------------------------------------------
Ubuntu desktop (RTX 4090)   24GB         llama3.3:70b, qwen3.5:32b, deepseek-r1:32b
Ubuntu desktop (RTX 4080)   16GB         phi4, codestral, qwen3.5:14b
Ubuntu Server (A100)        80GB         deepseek-v3, qwen3.5:72b
Ubuntu Server (no GPU)      CPU only     phi4-mini, gemma3:4b
Ubuntu on Raspberry Pi 5    CPU only     gemma3:1b, phi4-mini
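The recommendations above can be turned into a lookup for provisioning scripts. This is a sketch under two assumptions that the table itself does not state: each GPU row is treated as a minimum memory threshold, and GPUs smaller than any row fall back to the no-GPU list. The names GPU_TIERS, CPU_MODELS, and recommend are illustrative.

```python
from typing import List, Optional

# Rows copied from the table above: (GPU memory in GB, models).
# Treating the memory figure as a minimum threshold is an assumption.
GPU_TIERS = [
    (80, ["deepseek-v3", "qwen3.5:72b"]),
    (24, ["llama3.3:70b", "qwen3.5:32b", "deepseek-r1:32b"]),
    (16, ["phi4", "codestral", "qwen3.5:14b"]),
]
CPU_MODELS = ["phi4-mini", "gemma3:4b"]  # "Ubuntu Server (no GPU)" row


def recommend(gpu_memory_gb: Optional[float]) -> List[str]:
    """Return the models from the highest tier the GPU memory satisfies."""
    if gpu_memory_gb is not None:
        for min_gb, models in GPU_TIERS:
            if gpu_memory_gb >= min_gb:
                return models
    return CPU_MODELS


print(recommend(24))
print(recommend(None))
```
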

Ubuntu Ollama firewall

# Ubuntu UFW
sudo ufw allow 11435/tcp
sudo ufw reload

Monitor Ubuntu Ollama

# Ubuntu Ollama fleet status
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Ubuntu Ollama health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Ubuntu Ollama models loaded
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Ubuntu Ollama logs
journalctl -u herd-router -f
tail -f ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d)

Dashboard at http://localhost:11435/dashboard — live Ubuntu Ollama monitoring.

Also available on Ubuntu Ollama

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "Ubuntu penguin in space", "width": 1024, "height": 1024}'

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Ubuntu Ollama local inference routing"}'
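A common next step with embeddings is comparing two texts by cosine similarity. The sketch below assumes the router returns Ollama's /api/embed response unchanged, where the vectors sit under the embeddings key; the function names embed and cosine are illustrative.

```python
import json
import math
import urllib.request
from typing import List, Sequence


def embed(texts: List[str], model: str = "nomic-embed-text",
          base: str = "http://localhost:11435") -> List[List[float]]:
    """Request one embedding per input text."""
    req = urllib.request.Request(
        base + "/api/embed",
        data=json.dumps({"model": model, "input": texts}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        # Assumption: Ollama's response body is passed through as-is.
        return json.load(resp)["embeddings"]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Offline demo with toy vectors.
print(cosine([1.0, 0.0], [1.0, 0.0]))
print(cosine([1.0, 0.0], [0.0, 1.0]))
```

With a running fleet, vectors = embed(["first text", "second text"]) followed by cosine(vectors[0], vectors[1]) gives the similarity of the two inputs.
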

Contribute

Ollama Herd is open source (MIT). Contributions from Ubuntu Ollama users are welcome.

Guardrails

  • Ubuntu Ollama model downloads require explicit user confirmation.
  • Ubuntu Ollama model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 15:29, security status: safe

Security checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
