
Linux Ollama

Linux Ollama — run Ollama on Linux with fleet routing across multiple Linux machines. Linux Ollama setup for Llama, Qwen, DeepSeek, Phi, Mistral. Route Ollam...
twinsgeeks
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required

Overview

Linux Ollama — Fleet Routing for Ollama on Linux

Run Ollama on Linux with multi-machine load balancing. Linux Ollama Herd turns multiple Linux machines into one smart Ollama endpoint. Your server rack, your desktop, your edge device — all serving AI through one Linux Ollama URL.

Linux Ollama setup

Step 1: Install Ollama on Linux

curl -fsSL https://ollama.ai/install.sh | sh

Step 2: Install Linux Ollama Herd

pip install ollama-herd

Step 3: Start the Linux Ollama router

On one Linux machine (your router):

herd          # starts Linux Ollama router on port 11435
herd-node     # registers this Linux machine

On every other Linux machine:

herd-node     # auto-discovers the Linux Ollama router via mDNS

> No mDNS? Connect Linux nodes directly: herd-node --router-url http://router-ip:11435

Linux Ollama systemd integration

Run Linux Ollama Herd as a systemd service for automatic startup:

# /etc/systemd/system/ollama-herd.service
[Unit]
Description=Linux Ollama Herd Router
After=network.target ollama.service

[Service]
Type=simple
# Adjust this path if "which herd" reports a different location (e.g. a virtualenv)
ExecStart=/usr/local/bin/herd
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
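
Reload systemd so it picks up the new unit, then enable and start the router:

sudo systemctl daemon-reload
sudo systemctl enable ollama-herd
sudo systemctl start ollama-herd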

Node agent as a Linux systemd service:

# /etc/systemd/system/ollama-herd-node.service
[Unit]
Description=Linux Ollama Herd Node Agent
After=network.target ollama.service

[Service]
Type=simple
# Adjust this path if "which herd-node" reports a different location (e.g. a virtualenv)
ExecStart=/usr/local/bin/herd-node
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
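
The unit files above hard-code /usr/local/bin, but pip can place the herd and herd-node entry points elsewhere (a virtualenv or ~/.local/bin, for instance). The following is a minimal sketch, assuming Python 3, that fills the same unit layout with whatever path is found on PATH. The helper names render_unit and resolve are hypothetical and not part of ollama-herd.

```python
import shutil

# Same layout as the unit files shown above, with two placeholders.
UNIT_TEMPLATE = """[Unit]
Description={description}
After=network.target ollama.service

[Service]
Type=simple
ExecStart={exec_path}
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
"""


def render_unit(description: str, exec_path: str) -> str:
    """Fill the unit template with a description and an absolute binary path."""
    return UNIT_TEMPLATE.format(description=description, exec_path=exec_path)


def resolve(command: str, fallback: str) -> str:
    """Return the absolute path of `command` on PATH, or `fallback` if absent."""
    return shutil.which(command) or fallback


if __name__ == "__main__":
    path = resolve("herd-node", "/usr/local/bin/herd-node")
    print(render_unit("Linux Ollama Herd Node Agent", path))
```

The printed text can be reviewed and then saved as /etc/systemd/system/ollama-herd-node.service.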

Use Linux Ollama

OpenAI SDK

from openai import OpenAI

# Your Linux Ollama fleet
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write a systemd service file for a Python API"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
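
When the full reply is needed as one string (to write a generated file to disk, say), the streamed deltas can be accumulated instead of printed. This is a minimal sketch: collect_stream and fake_chunk are hypothetical helpers, and the fake chunks only imitate the choices[0].delta.content shape used in the loop above so the example runs without a router.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (offline demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


if __name__ == "__main__":
    demo = [fake_chunk("[Unit]"), fake_chunk(None), fake_chunk("\nDescription=demo")]
    print(collect_stream(demo))
```

With the client above, the call would be text = collect_stream(response).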

curl (Ollama format)

# Linux Ollama inference
curl http://localhost:11435/api/chat -d '{
  "model": "qwen3.5:32b",
  "messages": [{"role": "user", "content": "Explain Linux process scheduling"}],
  "stream": false
}'
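
The same request can be sent from Python with only the standard library. This is a sketch under two assumptions: the router forwards Ollama's non-streaming /api/chat reply unchanged, and that reply carries the text under message.content as in Ollama's own API. The function names are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to <base_url>/api/chat with one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the request and return the assistant text from the JSON reply."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply nests the text under message.content (Ollama format).
    return body["message"]["content"]


if __name__ == "__main__":
    req = build_chat_request("http://localhost:11435", "qwen3.5:32b",
                             "Explain Linux process scheduling")
    print(req.get_method(), req.full_url)
```

Calling chat("http://localhost:11435", "qwen3.5:32b", "...") performs the actual request. The demo at the bottom only builds it, so it runs without a router.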

Linux Ollama environment setup

# Optimize Linux Ollama performance via systemd
sudo systemctl edit ollama
# Add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=-1"
#   Environment="OLLAMA_NUM_PARALLEL=2"
sudo systemctl restart ollama

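Or via shell profile (this applies only when ollama serve is started from that shell; the systemd service does not read ~/.bashrc):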

echo 'export OLLAMA_KEEP_ALIVE=-1' >> ~/.bashrc
echo 'export OLLAMA_MAX_LOADED_MODELS=-1' >> ~/.bashrc
source ~/.bashrc

Linux Ollama GPU support

Linux GPU               | VRAM       | Best Linux Ollama models
------------------------|------------|-----------------------------------------------------------------
NVIDIA RTX 4090         | 24GB       | llama3.3:70b, qwen3.5:32b
NVIDIA A100             | 40/80GB    | deepseek-v3, qwen3.5:72b
NVIDIA L40S             | 48GB       | llama3.3:70b (4-bit quantized; full precision needs about 140GB)
AMD ROCm (experimental) | varies     | Ollama ROCm support on Linux
CPU only                | system RAM | phi4-mini, gemma3:1b (slower but works)

> Linux Ollama supports NVIDIA CUDA, experimental AMD ROCm, and CPU-only inference.
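
To judge whether a model is likely to fit a given card, a rough rule of thumb is parameters times bits per weight divided by eight. The sketch below covers the weights only. It ignores the KV cache and runtime overhead, and real quantized files tend to be somewhat larger than the nominal bit width suggests, so treat the result as a lower bound. The function names are hypothetical.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the weights alone fit in VRAM; ignores KV cache and overhead."""
    return weights_gb(params_billion, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weights_gb(70, bits)
        print(f"70B at {bits}-bit: ~{size:.0f} GB, fits in 48 GB: {fits(70, bits, 48)}")
```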

Linux Ollama firewall

# UFW (Ubuntu/Debian)
sudo ufw allow 11435/tcp

# firewalld (RHEL/Fedora)
sudo firewall-cmd --add-port=11435/tcp --permanent
sudo firewall-cmd --reload

# iptables
sudo iptables -A INPUT -p tcp --dport 11435 -j ACCEPT
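
After opening the port, it helps to confirm from another machine that the router is reachable before starting herd-node there. This is a minimal sketch using a plain TCP connect. port_open is a hypothetical helper, and the host in the demo should be replaced with the router's address.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or unresolved host
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the router's IP when checking from a node.
    print(port_open("127.0.0.1", 11435))
```

A False result from a node while the same check succeeds on the router itself usually points at a firewall rule or a listener bound only to localhost.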

Monitor Linux Ollama

# Linux Ollama fleet status
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Linux Ollama health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Models on Linux Ollama nodes
curl -s http://localhost:11435/api/ps | python3 -m json.tool

Dashboard at http://localhost:11435/dashboard — live Linux Ollama monitoring.
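
For scripted checks, the three endpoints above can be polled from Python and collected into one report. This sketch makes no assumption about the response fields. It stores whatever JSON each endpoint returns, or the error text if the call fails. The names fetch_json and snapshot are hypothetical.

```python
import json
import urllib.request

ROUTER = "http://localhost:11435"
ENDPOINTS = ("/fleet/status", "/dashboard/api/health", "/api/ps")


def fetch_json(url: str, timeout: float = 5.0):
    """GET a URL and decode the body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def snapshot(base_url: str = ROUTER) -> dict:
    """Query each monitoring endpoint; store the parsed JSON or the error text."""
    result = {}
    for path in ENDPOINTS:
        try:
            result[path] = fetch_json(base_url + path)
        except (OSError, ValueError) as exc:  # connection failure or invalid JSON
            result[path] = {"error": str(exc)}
    return result


if __name__ == "__main__":
    print(json.dumps(snapshot(), indent=2))
```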

Linux Ollama logs

# JSONL structured logs
tail -f ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d) | python3 -m json.tool --json-lines

# Check for Linux Ollama errors
grep '"level":"ERROR"' ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d)
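
The grep pattern only matches when the JSON is written without spaces around the colon. A small Python filter that parses each line avoids that dependency. This is a sketch that assumes only what the commands above show: one JSON object per line with a "level" field, in a file named herd.jsonl.<date>. The "msg" field in the usage note is a made-up example.

```python
import json
from datetime import date
from pathlib import Path
from typing import Iterable, Iterator


def todays_log(log_dir: Path = Path.home() / ".fleet-manager" / "logs") -> Path:
    """Path of today's log file, matching herd.jsonl.$(date +%Y-%m-%d)."""
    return log_dir / f"herd.jsonl.{date.today():%Y-%m-%d}"


def errors(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed records whose "level" is ERROR; skip blank or malformed lines."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("level") == "ERROR":
            yield record


if __name__ == "__main__":
    path = todays_log()
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            for record in errors(handle):
                print(json.dumps(record, indent=2))
```

For example, errors(['{"level": "ERROR", "msg": "x"}', '{"level": "INFO"}']) yields only the first record. The script only reads the log file, in line with the guardrail against modifying ~/.fleet-manager/.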

Also available on Linux Ollama

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "Linux penguin in cyberspace", "width": 1024, "height": 1024}'

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Linux Ollama local inference"}'
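
A common next step with embeddings is comparing two texts by cosine similarity. The sketch below assumes the reply follows Ollama's /api/embed format, where the vectors sit in an "embeddings" array, and that the router passes it through unchanged. The toy vectors stand in for two entries of that array.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors; real ones would come from the "embeddings" field of the reply.
    print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```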

Full documentation

Contribute

Ollama Herd is open source (MIT). Linux Ollama users are welcome to contribute.

Guardrails

  • Linux Ollama model downloads require explicit user confirmation.
  • Linux Ollama model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.
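
One way a script could honor the confirmation guardrails is to gate every pull behind an explicit yes. This is only an illustrative sketch and not how Ollama Herd implements it. The names confirm and guarded_pull are hypothetical, and pull stands for whatever function actually starts the download.

```python
from typing import Callable


def confirm(question: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only for an explicit 'y' or 'yes'; anything else means no."""
    answer = ask(f"{question} [y/N] ").strip().lower()
    return answer in ("y", "yes")


def guarded_pull(model: str, pull: Callable[[str], None],
                 ask: Callable[[str], str] = input) -> bool:
    """Call pull(model) only after the user confirms; report whether it ran."""
    if not confirm(f"Download model {model}?", ask):
        return False
    pull(model)
    return True
```

Defaulting to "no" on an empty answer keeps an accidental Enter from triggering a large download.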

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-03 11:10 Safe Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
