
Windows AI

Windows AI — run local AI on Windows with LLM inference, image generation, and embeddings. Windows AI server for Llama, Qwen, DeepSeek, Phi, Mistral. Turn Wi...
twinsgeeks
Uncategorized clawhub v1.0.0 1 version 100000 Key: not required

Overview

Windows AI — Local AI on Your Windows PCs

Run AI entirely on Windows. No cloud APIs, no subscriptions, no data leaving your network. Windows AI via Ollama Herd routes LLM requests across your Windows machines — your gaming PC, your work desktop, your laptop. One Windows AI endpoint serves them all.

Why Windows AI locally

  • Zero cost — no per-token charges. Your Windows PC runs unlimited AI inference.
  • Privacy — prompts and responses never leave your Windows network.
  • No rate limits — cloud APIs throttle. Your Windows AI hardware doesn't.
  • NVIDIA GPU support — Windows AI uses your RTX GPU via CUDA for fast inference.
  • Fleet routing — multiple Windows PCs share the AI workload automatically.

Windows AI quick start

# Install Windows AI router
pip install ollama-herd

# Start Windows AI on your main PC
herd          # Windows AI router on port 11435
herd-node     # register this Windows AI node

# On other Windows PCs
herd-node     # joins the Windows AI cluster automatically

> Windows Firewall: Allow port 11435 — netsh advfirewall firewall add rule name="Windows AI" dir=in action=allow protocol=tcp localport=11435
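Once the firewall rule is in place, it can be useful to confirm from a second PC that the router port actually accepts connections. The following is a minimal sketch using only the Python standard library; `is_port_open` is a hypothetical helper written for this example, not part of Ollama Herd, and the default port is the 11435 mentioned above.

```python
import socket


def is_port_open(host: str, port: int = 11435, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open("192.168.1.50")` from another machine (the address is only an example) and getting `False` suggests that either the router is not running or the firewall rule has not taken effect.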

Use Windows AI

OpenAI SDK

from openai import OpenAI

# Your Windows AI endpoint
client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Windows AI routes to the best available GPU
response = client.chat.completions.create(
    model="qwen3.5:32b",
    messages=[{"role": "user", "content": "Explain local AI vs cloud AI for Windows users"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

Windows AI for coding

# Windows AI code generation
response = client.chat.completions.create(
    model="codestral",
    messages=[{"role": "user", "content": "Write a C# Windows service that monitors GPU temperature"}],
)
print(response.choices[0].message.content)

curl (PowerShell)

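# Windows AI chat
# Call curl.exe explicitly: in Windows PowerShell, plain `curl` is an alias for Invoke-WebRequest.
# Older PowerShell versions may strip the embedded double quotes; if the JSON is rejected, escape them as \".
curl.exe http://localhost:11435/api/chat -d '{
  "model": "llama3.3:70b",
  "messages": [{"role": "user", "content": "Hello from Windows AI"}],
  "stream": false
}'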
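The same chat request can be sent from Python without extra packages, which avoids shell quoting altogether. This is a sketch under assumptions: `build_chat_request` and `chat` are hypothetical helper names, and the response is assumed to follow Ollama's non-streaming shape (`{"message": {"content": ...}}`), which this page does not confirm for the router.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # router address from the quick start


def build_chat_request(model: str, prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /api/chat with a single user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """Send the request and return the reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt, base_url), timeout=120) as resp:
        body = json.load(resp)
    # Assumption: Ollama-style reply {"message": {"role": ..., "content": ...}}
    return body["message"]["content"]
```

With the router running, `print(chat("llama3.3:70b", "Hello from Windows AI"))` would mirror the curl call above.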

Windows AI hardware guide

Windows PC     | GPU                | RAM   | Best Windows AI models
---------------|--------------------|-------|------------------------
Gaming desktop | RTX 4090 (24GB)    | 32GB+ | llama3.3:70b, qwen3.5:32b (full quality Windows AI)
Gaming desktop | RTX 4080 (16GB)    | 16GB+ | phi4, codestral, qwen3.5:14b
Work laptop    | RTX 4060 (8GB)     | 16GB  | phi4-mini, gemma3:4b (fast Windows AI)
Office desktop | Intel/AMD (no GPU) | 16GB  | phi4-mini, gemma3:1b (CPU Windows AI)

> Windows AI works with or without a GPU. NVIDIA GPUs dramatically accelerate inference.
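The hardware guide can be read as a simple lookup from available VRAM to a model tier. The sketch below encodes the table rows as data; `suggest_models` is a hypothetical helper, and treating each row's VRAM figure as a minimum threshold (so a 12GB card falls into the 8GB tier) is an assumption of this example rather than something the guide states.

```python
# (minimum VRAM in GB, models) copied from the hardware guide, largest tier first
TIERS = [
    (24, ["llama3.3:70b", "qwen3.5:32b"]),
    (16, ["phi4", "codestral", "qwen3.5:14b"]),
    (8, ["phi4-mini", "gemma3:4b"]),
    (0, ["phi4-mini", "gemma3:1b"]),  # CPU-only machines
]


def suggest_models(vram_gb: float) -> list[str]:
    """Return the models listed for the largest tier that fits in vram_gb."""
    for min_vram, models in TIERS:
        if vram_gb >= min_vram:
            return models
    # Negative or nonsensical input: fall back to the CPU-only row
    return TIERS[-1][1]
```

For example, `suggest_models(16)` returns the RTX 4080 row and `suggest_models(0)` returns the CPU-only row.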

Windows AI environment setup

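# Optimize Windows AI performance (PowerShell)
# OLLAMA_KEEP_ALIVE=-1 keeps loaded models in memory instead of unloading them after an idle timeout
[System.Environment]::SetEnvironmentVariable("OLLAMA_KEEP_ALIVE", "-1", "User")
# OLLAMA_MAX_LOADED_MODELS controls how many models Ollama may keep loaded at once
[System.Environment]::SetEnvironmentVariable("OLLAMA_MAX_LOADED_MODELS", "-1", "User")
# Restart Ollama from the Windows system tray so the new values take effect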

Windows AI features

  • 7-signal scoring — picks the best Windows PC for every AI request
  • 15 health checks — monitors all Windows AI nodes in real-time
  • Auto-retry — transparent failover between Windows AI machines
  • VRAM-aware routing — knows which Windows GPU has room for the model
  • Request tagging — track per-project Windows AI usage
  • Web dashboard: http://localhost:11435/dashboard
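To make the ideas of VRAM-aware routing and auto-retry concrete, here is an illustrative sketch. It is not Ollama Herd's actual algorithm: the real router is described as using seven signals, which this page does not list. The `Node` class, `rank_nodes`, `route`, and the single free-VRAM signal are all hypothetical names and simplifications chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_vram_gb: float
    healthy: bool = True


def rank_nodes(nodes, model_vram_gb):
    """Healthy nodes with room for the model, most free VRAM first."""
    fits = [n for n in nodes if n.healthy and n.free_vram_gb >= model_vram_gb]
    return sorted(fits, key=lambda n: n.free_vram_gb, reverse=True)


def route(nodes, model_vram_gb, send):
    """Try candidates in ranked order, falling through to the next on failure."""
    last_error = None
    for node in rank_nodes(nodes, model_vram_gb):
        try:
            return send(node)  # send is a caller-supplied function
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next node
    raise RuntimeError("no node could serve the request") from last_error
```

The point of the design is that the caller sees a single call: a failed node is skipped silently, and an error surfaces only when every candidate has been tried.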

Windows AI integrations

Works with any OpenAI-compatible tool on Windows:

  • Continue.dev (VS Code) — set endpoint to http://localhost:11435/v1
  • Cursor — Windows AI as local backend
  • LangChain — drop-in OpenAI replacement
  • CrewAI — multi-agent workflows on Windows AI
  • Open WebUI — chat interface for Windows AI

Also available on Windows AI

Image generation

# curl.exe avoids the Windows PowerShell `curl` alias for Invoke-WebRequest
curl.exe http://localhost:11435/api/generate-image `
  -d '{"model": "z-image-turbo", "prompt": "futuristic Windows desktop", "width": 1024, "height": 1024}'

Embeddings

# curl.exe avoids the Windows PowerShell `curl` alias for Invoke-WebRequest
curl.exe http://localhost:11435/api/embed `
  -d '{"model": "nomic-embed-text", "input": "Windows AI local inference embeddings"}'
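Embeddings are typically used by comparing vectors, so a short sketch of that step may help. The `embed` and `cosine_similarity` helpers below are hypothetical names for this example; the response shape `{"embeddings": [[...], ...]}` is assumed from Ollama's `/api/embed` endpoint and is not confirmed on this page for the router.

```python
import json
import math
import urllib.request


def embed(texts, model="nomic-embed-text", base_url="http://localhost:11435"):
    """Request embeddings for a string or a list of strings."""
    req = urllib.request.Request(
        f"{base_url}/api/embed",
        data=json.dumps({"model": model, "input": texts}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        # Assumption: Ollama-style reply {"embeddings": [[...], ...]}
        return json.load(resp)["embeddings"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

With the router running, `v1, v2 = embed(["local inference", "cloud API"])` followed by `cosine_similarity(v1, v2)` gives a rough measure of how related the two texts are.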

Full documentation

Contribute

Ollama Herd is open source (MIT). Windows AI enthusiasts welcome.

Guardrails

  • Windows AI model downloads require explicit user confirmation.
  • Windows AI model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.
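A script or agent that automates file operations around this setup could enforce the `~/.fleet-manager/` rule with a path check before any delete or write. This is a minimal sketch; `is_protected` is a hypothetical helper, not something Ollama Herd provides.

```python
from pathlib import Path

# Directory named in the guardrails above
PROTECTED = Path.home() / ".fleet-manager"


def is_protected(path) -> bool:
    """Return True if path is the protected directory or anything inside it."""
    resolved = Path(path).expanduser().resolve()
    root = PROTECTED.resolve()
    return resolved == root or root in resolved.parents
```

A caller would refuse to proceed whenever `is_protected(target)` is true, and would ask for explicit confirmation before any model pull or deletion, in line with the other guardrails.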

Version history

1 version in total

  • v1.0.0 (current)
    2026-05-07 20:29 Safe Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
