
Phi Phi4

Phi 4 by Microsoft — small but powerful LLMs that run on minimal hardware. Phi-4 (14B), Phi-4-mini (3.8B), and Phi-3.5 across your device fleet. Perfect for...
Author: twinsgeeks
Category: Uncategorized | Source: clawhub | Version: v1.0.1 | API key: not required
Tags: #apple-silicon #efficient #fleet-routing #latest #local-llm #low-ram #mac-mini #macbook-air #microsoft-phi #ollama #phi #phi-4 #phi4 #small-llm

Overview

Phi 4 — Microsoft's Small Models, Big Results

Phi models show that you don't need 70B parameters for strong results. Phi-4 (14B) matches much larger models on reasoning benchmarks while needing about 10GB of RAM, and Phi-4-mini runs on hardware as modest as an 8GB MacBook Air. Route them across your fleet for higher throughput.

Supported Phi models

| Model | Parameters | Ollama name | RAM needed | Best for |
|-------|------------|-------------|------------|----------|
| Phi-4 | 14B | phi4 | 10GB | Reasoning, math, code; punches way above its weight |
| Phi-4-mini | 3.8B | phi4-mini | 4GB | Ultra-fast on any device, even 8GB Macs |
| Phi-3.5-mini | 3.8B | phi3.5 | 4GB | Proven lightweight model |
| Phi-3-medium | 14B | phi3:14b | 10GB | Balanced quality and speed |

Quick start

pip install ollama-herd    # PyPI: https://pypi.org/project/ollama-herd/
herd                       # start the router (port 11435)
herd-node                  # run on each device — finds the router automatically

No models are downloaded during installation. All pulls require user confirmation.
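
Before sending requests, it can help to confirm that the router from the quick start is actually listening. The following is a minimal sketch using only the Python standard library; the function name `router_is_listening` is hypothetical, and the host is assumed to be the local machine. It only checks that the TCP port accepts connections, not that the service behind it is healthy.

```python
import socket

ROUTER_HOST = "localhost"  # assumption: the router runs on this machine
ROUTER_PORT = 11435        # port stated in the quick start


def router_is_listening(host: str = ROUTER_HOST, port: int = ROUTER_PORT,
                        timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `router_is_listening()` right after starting `herd` gives a quick yes/no answer before you debug client code.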

Why Phi for small devices

A Mac Mini with 16GB RAM can run Phi-4 (14B) with room to spare. A MacBook Air with 8GB runs Phi-4-mini comfortably. These models start in seconds and respond fast — ideal for devices that can't load a 70B model.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")

# Phi-4 for reasoning
response = client.chat.completions.create(
    model="phi4",
    messages=[{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}],
)
print(response.choices[0].message.content)

Phi-4-mini — fastest response times

curl http://localhost:11435/api/chat -d '{
  "model": "phi4-mini",
  "messages": [{"role": "user", "content": "Summarize this in 3 bullet points: ..."}],
  "stream": false
}'
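
The same request can be made from Python without extra dependencies. This is a sketch rather than an official client: the helper names `build_chat_request` and `chat` are hypothetical, and the response shape (`{"message": {"content": ...}}`) is assumed to match Ollama's usual non-streaming `/api/chat` reply.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # router address from the quick start


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to /api/chat, mirroring the curl call."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the reply text.

    Assumes the reply looks like {"message": {"content": "..."}};
    adjust the lookup if your router version returns something else.
    """
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["message"]["content"]
```

For example, `chat("phi4-mini", "Summarize this in 3 bullet points: ...")` would return the model's reply as a string once the router is running.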

OpenAI-compatible API

curl http://localhost:11435/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "phi4", "messages": [{"role": "user", "content": "Write a unit test for a login function"}]}'

Ideal hardware pairings

> Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

| Your device | RAM | Best Phi model | Why |
|-------------|-----|----------------|-----|
| MacBook Air (8GB) | 8GB | phi4-mini | Fits with room for other apps |
| Mac Mini (16GB) | 16GB | phi4 | Full Phi-4 with headroom |
| Mac Mini (24GB) | 24GB | phi4 | Can run Phi-4 + an embedding model simultaneously |
| MacBook Pro (36GB) | 36GB | phi4 + phi4-mini | Both loaded, router picks based on task |
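
The pairings above follow a simple rule of thumb: pick the largest Phi model whose RAM requirement fits, leaving some memory for the OS and other apps. The sketch below encodes that rule using the RAM figures from the supported-models table. The function name and the 2GB default headroom are illustrative assumptions, not values from the router; it returns a single model and does not try to model the "both loaded" case.

```python
from typing import Optional

# (Ollama name, RAM needed in GB), largest first.
# Figures come from the "Supported Phi models" table.
PHI_BY_RAM_GB = [("phi4", 10), ("phi4-mini", 4)]


def largest_phi_that_fits(device_ram_gb: float,
                          headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest Phi model that fits, or None if none does.

    headroom_gb is a hypothetical reserve for the OS and other apps.
    """
    usable = device_ram_gb - headroom_gb
    for name, needed_gb in PHI_BY_RAM_GB:
        if needed_gb <= usable:
            return name
    return None
```

With the defaults, an 8GB device maps to `phi4-mini` and a 16GB device to `phi4`, which matches the first two rows of the table.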

Monitor your fleet

# What's loaded and where
curl -s http://localhost:11435/api/ps | python3 -m json.tool

# Fleet health overview
curl -s http://localhost:11435/dashboard/api/health | python3 -m json.tool

# Model recommendations based on your hardware
curl -s http://localhost:11435/dashboard/api/recommendations | python3 -m json.tool

Web dashboard at http://localhost:11435/dashboard — live view of nodes, queues, and performance.
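
For scripted monitoring, the `/api/ps` output can be condensed into one line per loaded model. This is a sketch under a stated assumption: the payload is taken to follow Ollama's native `/api/ps` shape (`{"models": [{"name": ..., "size": ...}]}`, with `size` in bytes). The router may add per-node fields or use a different layout, so check the real output first. Both function names are hypothetical.

```python
import json
import urllib.request
from typing import Any


def summarize_loaded(payload: dict[str, Any]) -> list[str]:
    """Turn a /api/ps-style payload into 'name (X.X GB)' lines."""
    lines = []
    for model in payload.get("models", []):
        size_gb = model.get("size", 0) / 1e9  # bytes -> decimal GB
        lines.append(f"{model.get('name', '?')} ({size_gb:.1f} GB)")
    return lines


def fetch_loaded(base_url: str = "http://localhost:11435",
                 timeout: float = 5.0) -> list[str]:
    """Fetch /api/ps from the router and summarize it."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=timeout) as resp:
        return summarize_loaded(json.load(resp))
```

Keeping the parsing in `summarize_loaded` separate from the network call makes it easy to adapt if the router's payload turns out to differ.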

Also available on this fleet

Larger LLMs (when you need more power)

Llama 3.3 (70B), Qwen 3.5, DeepSeek-R1, Mistral Large — route to a bigger machine in the fleet.

Image generation

curl http://localhost:11435/api/generate-image \
  -d '{"model": "z-image-turbo", "prompt": "minimalist circuit board art", "width": 512, "height": 512}'

Speech-to-text

curl http://localhost:11435/api/transcribe -F "file=@meeting.wav" -F "model=qwen3-asr"

Embeddings

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "Microsoft Phi small language model"}'
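
Embeddings are usually compared with cosine similarity. The sketch below shows that comparison in plain Python; it assumes you have already extracted two vectors from the endpoint's response (in Ollama's native `/api/embed` reply these sit under an `"embeddings"` key, which may or may not hold for the router). The function name is illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)
```

Values near 1 indicate texts the embedding model considers close in meaning; values near 0 indicate unrelated texts.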

Full documentation

Guardrails

  • Model downloads require explicit user confirmation. Phi models are small (roughly 2-9GB per download) but still require confirmation.
  • Model deletion requires explicit user confirmation.
  • Never delete or modify files in ~/.fleet-manager/.
  • No models are downloaded automatically — all pulls are user-initiated or require opt-in.
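
If you script around the fleet, the confirmation guardrail can be kept with a small gate in front of any download. This is a sketch, not part of ollama-herd: `confirm_then_pull` and both of its callables are hypothetical. `pull` stands in for whatever actually triggers the download, and `ask` defaults to the built-in `input` so the user has to type an answer.

```python
from typing import Callable


def confirm_then_pull(model: str,
                      pull: Callable[[str], None],
                      ask: Callable[[str], str] = input) -> bool:
    """Ask before downloading; call pull(model) only on an explicit yes.

    Returns True if the pull was started, False if the user declined.
    """
    answer = ask(f"Download model '{model}'? [y/N] ").strip().lower()
    if answer not in ("y", "yes"):
        return False  # anything but an explicit yes means no download
    pull(model)
    return True
```

Defaulting to "no" on empty or unexpected input keeps the behaviour aligned with the rule that no model is downloaded without opt-in.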

Version history

1 version in total

  • v1.0.1 (current)
    2026-05-03 11:05 (security status: safe)

