
OllamaDiffuser Image generation

Local AI image generation using OllamaDiffuser. Use this skill when Claude needs to generate, edit (img2img/inpaint), or control (ControlNet) images locally...
Author: 1tsnakers
Uncategorized · clawhub · v1.0.0 · 1 version · 100000 · Key: required

Overview

OllamaDiffuser

OllamaDiffuser is a local AI image generation tool that provides an Ollama-like experience for Stable Diffusion and FLUX models. It can be used through its CLI, its REST API, or MCP.

Setup & Installation

If the tool is not yet installed or needs specific hardware support, use these commands:

  • Standard Installation: pip install ollamadiffuser
  • Full Suite (Recommended): pip install "ollamadiffuser[full]"
  • Low-VRAM/GGUF Support: pip install "ollamadiffuser[gguf]"
  • MCP/Agent Integration: pip install "ollamadiffuser[mcp]"
  • Apple Silicon (Metal): CMAKE_ARGS="-DSD_METAL=ON" pip install stable-diffusion-cpp-python

Authentication: Gated models (e.g., FLUX.1-dev, SD 3.5) require a Hugging Face token.

  • export HF_TOKEN=your_token_here (Add to .bashrc or .zshrc for persistence).
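
Because a missing token tends to surface only once a gated download fails, an agent may want to check for it up front. The following is a minimal sketch of such a check; the helper name get_hf_token is hypothetical and is not part of OllamaDiffuser.

```python
import os
from typing import Mapping, Optional


def get_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the token stored in HF_TOKEN, or raise a descriptive error.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Gated models such as FLUX.1-dev need it; "
            "run: export HF_TOKEN=your_token_here"
        )
    return token
```

Calling get_hf_token() before pulling a gated model turns a late download failure into an early, readable error.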

Core Workflows

1. Text-to-Image Generation

Generate an image from a text prompt.

  • Tool/Command: Use the generate_image MCP tool or the REST API /api/generate.
  • Key Parameters:
    • prompt: Detailed description of the image.
    • width / height: Default is usually 1024x1024 for SDXL/FLUX, 512x512 for SD1.5.
    • seed: Optional for reproducibility.
    • response_format: Set to b64_json for agent-friendly base64 responses.
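
As an illustration of the REST route, the sketch below builds a request for /api/generate from the parameters listed above. It assumes the endpoint accepts a JSON body via POST, which this page does not state; the function name build_generate_request is hypothetical, and the base URL is the one given under Technical Notes.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "http://localhost:8000"  # base URL from the Technical Notes


def build_generate_request(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    seed: Optional[int] = None,
    response_format: str = "b64_json",
) -> urllib.request.Request:
    """Build (but do not send) a text-to-image request.

    Assumption: the server expects a JSON body sent with POST.
    """
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "response_format": response_format,
    }
    if seed is not None:
        payload["seed"] = seed  # only include a seed when reproducibility is wanted
    return urllib.request.Request(
        f"{API_BASE}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, the request can be sent with urllib.request.urlopen(build_generate_request("a lighthouse at dusk", seed=42)). The layout of the response is not described here, so inspect it before decoding any base64 field.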

2. Model Management

Manage which models are downloaded and active in VRAM.

  • Listing Models: Use list_models to see installed versions.
  • Pulling Models: Run ollamadiffuser pull followed by the model name in a shell (e.g., ollamadiffuser pull controlnet-canny-sd15).
  • Loading Models: Use load_model to switch active models in memory.
  • Recommendations: Use ollamadiffuser recommend to find models that fit the available GPU VRAM.
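
For agents that drive the CLI from code, the two shell commands above can be wrapped as follows. This is a sketch only: the helper names are hypothetical, and nothing is assumed about what the commands print.

```python
import shutil
import subprocess
from typing import List


def pull_command(model_name: str) -> List[str]:
    """Argument list for `ollamadiffuser pull <model_name>`."""
    return ["ollamadiffuser", "pull", model_name]


def recommend_command() -> List[str]:
    """Argument list for `ollamadiffuser recommend`."""
    return ["ollamadiffuser", "recommend"]


def run_cli(command: List[str]) -> str:
    """Run a command and return its stdout; fail early if the CLI is missing."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(
            f"{command[0]} not found on PATH; install it with: pip install ollamadiffuser"
        )
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, run_cli(pull_command("controlnet-canny-sd15")) downloads that model, and run_cli(recommend_command()) returns whatever text the recommend command prints.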

3. Image-to-Image & Inpainting

Modify existing images.

  • Img2Img: Use /api/generate/img2img. Requires image (file/base64) and strength (0.0-1.0; lower = closer to original).
  • Inpainting: Use /api/generate/inpaint. Requires image and a mask image.
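
The sketch below shows one way the inputs for these two endpoints could be assembled as base64 payloads. Only image and strength are named on this page; the prompt and mask keys and the JSON transport are assumptions, so check them against the running API before relying on them.

```python
import base64
from typing import Dict, Union

Payload = Dict[str, Union[str, float]]


def encode_image(data: bytes) -> str:
    """Base64-encode raw image bytes as an ASCII string."""
    return base64.b64encode(data).decode("ascii")


def img2img_payload(prompt: str, image_bytes: bytes, strength: float = 0.5) -> Payload:
    """Payload for /api/generate/img2img; lower strength stays closer to the original."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return {
        "prompt": prompt,  # assumed key name
        "image": encode_image(image_bytes),
        "strength": strength,
    }


def inpaint_payload(prompt: str, image_bytes: bytes, mask_bytes: bytes) -> Payload:
    """Payload for /api/generate/inpaint; the 'mask' key name is an assumption."""
    return {
        "prompt": prompt,  # assumed key name
        "image": encode_image(image_bytes),
        "mask": encode_image(mask_bytes),
    }
```

Validating strength on the client gives a clear error for out-of-range values instead of depending on how the server handles them.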

4. Advanced Control (ControlNet)

Use structural guides (Canny, Depth, OpenPose) for precise control.

  • Workflow:
  1. Ensure a ControlNet model is pulled (e.g., ollamadiffuser pull controlnet-canny-sd15).
  2. Use /api/generate/controlnet.
  3. Provide a control_image and specify the preprocessor (e.g., "canny").
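
Steps 2 and 3 of this workflow might look like the following sketch. The control_image name comes from the text above; the preprocessor and prompt key names, the base64 encoding, and the JSON POST body are assumptions rather than documented behavior.

```python
import base64
import json
import urllib.request

API_BASE = "http://localhost:8000"  # base URL from the Technical Notes


def build_controlnet_request(
    prompt: str,
    control_image_bytes: bytes,
    preprocessor: str = "canny",
) -> urllib.request.Request:
    """Build (but do not send) a ControlNet generation request."""
    payload = {
        "prompt": prompt,  # assumed key name
        "control_image": base64.b64encode(control_image_bytes).decode("ascii"),
        "preprocessor": preprocessor,  # assumed key name
    }
    return urllib.request.Request(
        f"{API_BASE}/api/generate/controlnet",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request only succeeds if a matching ControlNet model has been pulled first, as described in step 1.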

Model Selection Guide

| Use Case | Recommended Model | VRAM | Note |
| :--- | :--- | :--- | :--- |
| Highest Quality | flux.1-dev | 20GB+ | Requires HF Token |
| Fast & High Quality | flux.1-schnell | 16GB+ | No token needed |
| Budget GPU (6GB) | flux.1-dev-gguf-q4ks | 6GB | GGUF Quantized |
| Ultra Low VRAM | flux.1-dev-gguf-q2k | 3GB | Entry-level |
| Classic/Fast | stable-diffusion-1.5 | 4GB+ | Great for img2img |
| Photorealistic | realvisxl-v4 | 6GB+ | SDXL based |
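
The guide can also be encoded as data so that an agent can filter it by available VRAM. This is a static sketch of the table only; it treats each VRAM figure as a minimum in GB and is no substitute for ollamadiffuser recommend, which checks the actual hardware.

```python
from typing import List, NamedTuple


class ModelOption(NamedTuple):
    use_case: str
    name: str
    min_vram_gb: int
    note: str


# Rows copied from the Model Selection Guide, in the same order.
MODEL_TABLE = [
    ModelOption("Highest Quality", "flux.1-dev", 20, "Requires HF Token"),
    ModelOption("Fast & High Quality", "flux.1-schnell", 16, "No token needed"),
    ModelOption("Budget GPU (6GB)", "flux.1-dev-gguf-q4ks", 6, "GGUF Quantized"),
    ModelOption("Ultra Low VRAM", "flux.1-dev-gguf-q2k", 3, "Entry-level"),
    ModelOption("Classic/Fast", "stable-diffusion-1.5", 4, "Great for img2img"),
    ModelOption("Photorealistic", "realvisxl-v4", 6, "SDXL based"),
]


def models_fitting(vram_gb: float) -> List[str]:
    """Names of models whose listed VRAM requirement is within vram_gb."""
    return [m.name for m in MODEL_TABLE if m.min_vram_gb <= vram_gb]
```

For instance, models_fitting(6) returns the two GGUF variants, stable-diffusion-1.5, and realvisxl-v4, in table order.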

Technical Notes

  • API Base URL: http://localhost:8000
  • Web UI: http://localhost:8001 (Start with ollamadiffuser --mode ui)
  • HF Tokens: Gated models (FLUX.1-dev, SD 3.5) require export HF_TOKEN=your_token.
  • GGUF Support: Install with pip install "ollamadiffuser[gguf]" for memory-efficient runs.

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-12 05:32, security status: safe

Security Scan

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk
