Linux AI Server

Linux AI Server — turn Linux servers into a local AI inference cluster. Headless Linux AI with systemd, NVIDIA CUDA, and zero GUI overhead. Linux AI server f...
Author: twinsgeeks
Uncategorized · Source: clawhub · v1.0.0 · 1 version · API key: not required

Overview

Linux AI Server — Headless AI Inference Cluster

Turn your Linux servers into a distributed AI inference cluster. No GUI, no Docker, no Kubernetes — just Linux + pip install. Your rack-mounted servers, cloud VMs, and spare Linux boxes all serve AI through one endpoint.

Why Linux AI server

  • Zero GUI overhead — headless Linux AI uses all resources for inference, not desktops
  • systemd native — Linux AI server starts on boot, restarts on failure, logs to journald
  • SSH management — manage your Linux AI server cluster entirely over SSH
  • Any Linux distro — Ubuntu, Debian, RHEL, Fedora, Arch, Alpine — if it runs Ollama, it joins the fleet
  • NVIDIA CUDA — Linux AI server uses NVIDIA GPUs natively. No compatibility issues.
  • Fleet routing — multiple Linux AI servers share the load. 7-signal scoring picks the best one.
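
To make the idea of score-based routing concrete, here is a minimal sketch of picking a node by a weighted score. The source does not list the seven signals the router uses, so the three signals, the weights, and every name below (`NodeSignals`, `score`, `pick_node`) are invented for illustration and are not taken from ollama-herd.

```python
from dataclasses import dataclass


@dataclass
class NodeSignals:
    """Hypothetical per-node signals; not the router's real inputs."""
    name: str
    model_loaded: bool      # is the requested model already in memory?
    queue_depth: int        # requests currently waiting on this node
    free_memory_gb: float   # memory still available for models


# Illustrative weights only; chosen for this example.
WEIGHTS = {"model_loaded": 50.0, "queue_depth": -10.0, "free_memory_gb": 0.5}


def score(node: NodeSignals) -> float:
    """Combine the signals into one number; higher is better."""
    return (
        WEIGHTS["model_loaded"] * (1.0 if node.model_loaded else 0.0)
        + WEIGHTS["queue_depth"] * node.queue_depth
        + WEIGHTS["free_memory_gb"] * node.free_memory_gb
    )


def pick_node(nodes: list[NodeSignals]) -> NodeSignals:
    """Return the highest-scoring node."""
    if not nodes:
        raise ValueError("no nodes registered")
    return max(nodes, key=score)
```

With these made-up weights, a busy node that already has the model loaded can still beat an idle node that would have to load it first, which is the kind of trade-off a multi-signal score is meant to capture.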

Linux AI server setup

Quick install on each Linux server

# Install Ollama on Linux (official install script)
curl -fsSL https://ollama.com/install.sh | sh

# Install the Linux AI router
pip install ollama-herd

Linux AI server router (pick one server)

herd          # start Linux AI server router on port 11435
herd-node     # register this Linux AI server

Linux AI server nodes (all other servers)

herd-node     # auto-discovers the Linux AI server router
# Or explicit: herd-node --router-url http://router-ip:11435
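
Before pointing a node at an explicit router URL, it can help to confirm the router answers at all. The sketch below probes the `/fleet/status` endpoint listed in the monitoring section, using only the standard library. The helper names `fleet_status_url` and `router_is_up` are hypothetical, and the check assumes only that the endpoint returns JSON, not any particular fields.

```python
import http.client
import json
import urllib.request


def fleet_status_url(router_host: str, port: int = 11435) -> str:
    """Build the fleet status URL for a router host (default port 11435)."""
    return f"http://{router_host}:{port}/fleet/status"


def router_is_up(router_host: str, port: int = 11435, timeout: float = 3.0) -> bool:
    """Return True if the router answers the status endpoint with valid JSON."""
    url = fleet_status_url(router_host, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or malformed reply
        return False
```

For example, `router_is_up("router-ip")` returning `False` suggests checking the firewall or the router service before starting `herd-node --router-url http://router-ip:11435`.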

Linux AI server systemd services

# /etc/systemd/system/herd-router.service
[Unit]
Description=Linux AI Server Router
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd
Restart=always
RestartSec=5
User=ollama

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/herd-node.service
[Unit]
Description=Linux AI Server Node
After=network.target ollama.service

[Service]
Type=simple
ExecStart=/usr/local/bin/herd-node
Restart=always
RestartSec=5
User=ollama

[Install]
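WantedBy=multi-user.target

# Reload systemd so it picks up the new unit files, then enable and start them
sudo systemctl daemon-reload
sudo systemctl enable --now herd-router    # on the Linux AI router
sudo systemctl enable --now herd-node      # on all Linux AI nodes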
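
The two unit files differ only in their description and `ExecStart` line, so they can be generated from one template. This is a minimal sketch, not part of ollama-herd: `UNIT_TEMPLATE`, `render_unit`, and `UNITS` are hypothetical names, and the `/usr/local/bin` paths are copied from the units above on the assumption that pip installed the commands there.

```python
# Template mirroring the unit files above; only two fields vary.
UNIT_TEMPLATE = """\
[Unit]
Description={description}
After=network.target ollama.service

[Service]
Type=simple
ExecStart={exec_start}
Restart=always
RestartSec=5
User=ollama

[Install]
WantedBy=multi-user.target
"""


def render_unit(description: str, exec_start: str) -> str:
    """Fill the template for one service."""
    return UNIT_TEMPLATE.format(description=description, exec_start=exec_start)


# File name -> unit text; the paths are assumptions taken from the units above.
UNITS = {
    "herd-router.service": render_unit("Linux AI Server Router", "/usr/local/bin/herd"),
    "herd-node.service": render_unit("Linux AI Server Node", "/usr/local/bin/herd-node"),
}
```

Each value in `UNITS` can be written to `/etc/systemd/system/` under its key as root. Depending on how pip was run, the executables may live elsewhere; `command -v herd` shows the actual path to put in `ExecStart`.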

Linux AI server hardware guide

| Linux AI server | GPU memory | System RAM | Best Linux AI models |
|---|---|---|---|
| Rack server (NVIDIA A100) | 80GB | 256GB | deepseek-v3, qwen3.5:72b (frontier) |
| Rack server (NVIDIA L40S) | 48GB | 128GB | llama3.3:70b, qwen3.5:32b |
| Desktop server (RTX 4090) | 24GB | 64GB | llama3.3:70b (Q4), deepseek-r1:32b |
| Mini PC / NUC (no GPU) | none (CPU) | 32GB | phi4, gemma3:12b (CPU inference) |
| Cloud VM (no GPU) | none (CPU) | 16GB | phi4-mini, gemma3:4b |
| Raspberry Pi 5 | none (CPU) | 8GB | gemma3:1b, phi4-mini (edge AI) |

> Linux AI server works with NVIDIA CUDA GPUs, AMD ROCm (experimental), and CPU-only inference.
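
A rough way to judge whether a model suits a given machine is to estimate the memory its weights need: parameters times bits per weight, divided by eight. The sketch below does that arithmetic. The 20% overhead for context and runtime buffers is an assumed round figure, not a measured one, and real quantized files usually use somewhat more than their nominal bits per weight. The function names are hypothetical.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits_in_memory(
    params_billion: float,
    bits_per_weight: float,
    memory_gb: float,
    overhead_fraction: float = 0.2,  # assumed headroom for KV cache and buffers
) -> bool:
    """Return True if weights plus the assumed overhead fit in memory_gb."""
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= memory_gb
```

By this estimate a 70B model at 4 bits needs about 35 GB for weights, so it fits a 48 GB card but not a 24 GB one. On the 24 GB card some layers would have to run from system RAM, which Ollama supports at reduced speed.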

Use your Linux AI server

OpenAI SDK

from openai import OpenAI

# Your Linux AI server endpoint
client = OpenAI(base_url="http://linux-ai-server:11435/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write a Terraform module for AWS ECS"}],
    stream=True,
)
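for chunk in response:
    # Some chunks carry no choices or no text; print only actual content
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)
print()  # finish the streamed output with a newline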

curl from any machine

# Hit your Linux AI server from anywhere on the network
curl http://linux-ai-server:11435/api/chat -d '{
  "model": "codestral",
  "messages": [{"role": "user", "content": "Write a Dockerfile for a FastAPI app"}],
  "stream": false
}'
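
The same request can be sent from Python without extra packages. This is a minimal sketch using only the standard library; `build_chat_request` and `chat` are hypothetical helper names. The reply shape (`message.content`) is what Ollama's own `/api/chat` returns for non-streaming calls; that the router passes it through unchanged is an assumption.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to /api/chat, mirroring the curl call above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the request and return the assistant's text."""
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # Assumed reply shape: {"message": {"role": "assistant", "content": "..."}}
    return body["message"]["content"]
```

A call such as `chat("http://linux-ai-server:11435", "codestral", "Write a Dockerfile for a FastAPI app")` would then return the generated text as a string. The generous timeout allows for a model that has to be loaded first.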

Linux AI server environment

# Optimize Linux AI server Ollama
sudo systemctl edit ollama
# Add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
#   Environment="OLLAMA_MAX_LOADED_MODELS=-1"
#   Environment="OLLAMA_NUM_PARALLEL=2"
sudo systemctl restart ollama

Linux AI server firewall

# UFW (Ubuntu/Debian)
sudo ufw allow 11435/tcp

# firewalld (RHEL/Fedora)
sudo firewall-cmd --add-port=11435/tcp --permanent && sudo firewall-cmd --reload
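
After opening the port, a quick TCP check from another machine confirms the rule took effect. This is a minimal sketch with the standard library; `port_open` is a hypothetical helper. It only tests that something accepts connections on the port, not that the router is healthy.

```python
import socket


def port_open(host: str, port: int = 11435, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve
        return False
```

Running `port_open("linux-ai-server")` from a client machine should return `True` once the firewall rule is active and the router is listening.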

Linux AI server monitoring

# Linux AI server fleet status
curl -s http://localhost:11435/fleet/status | python3 -m json.tool

# Linux AI server health — 15 automated checks
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Linux AI server traces — recent requests
curl -s "http://localhost:11435/dashboard/api/traces?limit=10" | python3 -m json.tool

# Linux AI server logs
journalctl -u herd-router -f
tail -f ~/.fleet-manager/logs/herd.jsonl.$(date +%Y-%m-%d)
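
The dated JSONL log can also be read from Python for ad hoc analysis. The sketch below builds the file name in the same `herd.jsonl.YYYY-MM-DD` form as the `tail` command and parses it line by line. It makes no assumption about which fields each record holds, and it only reads the file. `log_path_for` and `read_records` are hypothetical helper names.

```python
import json
from datetime import date
from pathlib import Path

# Log directory taken from the tail command above
DEFAULT_LOG_DIR = Path.home() / ".fleet-manager" / "logs"


def log_path_for(day: date, log_dir: Path = DEFAULT_LOG_DIR) -> Path:
    """Return the log file path for a given day (herd.jsonl.YYYY-MM-DD)."""
    return log_dir / f"herd.jsonl.{day.isoformat()}"


def read_records(path: Path) -> list[dict]:
    """Parse one JSON object per line, skipping blank or malformed lines."""
    records = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # e.g. a line that is still being written
            if isinstance(record, dict):
                records.append(record)
    return records
```

For example, `len(read_records(log_path_for(date.today())))` gives the number of log entries written so far today.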

Dashboard at http://linux-ai-server:11435/dashboard — access from any browser on the network.

Also available on Linux AI server

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "server rack visualization", "width": 1024, "height": 1024}'

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Linux AI server headless inference"}'
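
Embeddings are usually compared with cosine similarity. The sketch below implements it in plain Python so two vectors returned by the endpoint can be compared. Ollama's own `/api/embed` returns vectors under an `embeddings` key; whether the router's reply has the same shape is an assumption, so the function takes plain lists of floats. `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Values near 1 mean the two texts are close in the embedding space, and values near 0 mean they are unrelated.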

Full documentation

Contribute

Ollama Herd is open source (MIT). Contributions from Linux server admins are welcome.

Guardrails

  • Linux AI server model downloads require explicit user confirmation.
  • Linux AI server model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.
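
The confirmation rules above can be enforced in a wrapper script by gating the action behind an explicit yes. This is a minimal sketch of that pattern, not how ollama-herd implements it. `confirm` and `pull_model` are hypothetical names, and `do_pull` stands in for whatever command actually downloads the model.

```python
from typing import Callable


def confirm(prompt: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only on an explicit 'y' or 'yes'; anything else is a no."""
    answer = ask(f"{prompt} [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


def pull_model(
    model: str,
    do_pull: Callable[[str], object],
    ask: Callable[[str], str] = input,
) -> bool:
    """Run do_pull(model) only after the user confirms; report whether it ran."""
    if not confirm(f"Download model {model!r}?", ask):
        return False
    do_pull(model)
    return True
```

Defaulting to "no" on an empty or unrecognized answer keeps an accidental Enter from starting a large download.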

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-03 08:54, security status: safe

Security scans

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk