
Qwen Ollama No-Think

Create and verify no-thinking variants of local Qwen/Qwen3-series Ollama models. Use when a user asks to disable thinking, hide or remove think-tag output, m...
Author: patmenciu · Source: clawhub · Version: v1.0.0 (latest) · Key: not required


Ollama Qwen Nothink

Overview

Create a new Ollama tag that reuses an existing Qwen model's weights but defaults to direct answers. Prefer a reversible derived tag such as qwen3.6:35b-mlx-nothink; do not modify or delete the source model.

This skill exists because prompt-only approaches usually fail for Qwen thinking models in Ollama. The reliable path is to combine a no-thinking chat template with a local manifest/config patch that removes thinking renderer metadata from the derived tag.

Quick Workflow

  1. Inspect the source model:

```bash
ollama show SOURCE_MODEL
ollama show --parameters SOURCE_MODEL
ollama show --template SOURCE_MODEL
```

  2. Verify the runtime switch works:

```bash
ollama run --think=false SOURCE_MODEL "用一句话回答:1+1等于几?"
```

If this still emits thinking content, stop and report that the local Ollama/model build does not honor the runtime switch.

  3. Create a derived no-thinking tag with the bundled script:

```bash
python3 scripts/create_qwen_nothink_ollama.py SOURCE_MODEL --target TARGET_MODEL
```

  4. Confirm that ollama show TARGET_MODEL lists completion and, if relevant, vision, but not thinking.
  5. Verify that both CLI and API output contain no Thinking... banner, no <think> tags, and no reasoning prose:

```bash
ollama run TARGET_MODEL "用一句话回答:1+1等于几?"

curl -s http://127.0.0.1:11434/api/chat -d '{"model":"TARGET_MODEL","messages":[{"role":"user","content":"用一句话回答:1+1等于几?"}],"stream":false}'
```

In Codex sandboxes, local Ollama calls or writes under ~/.ollama may require user approval. Request escalation plainly when needed.
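
The marker check in the last workflow step can be automated. The following is a minimal sketch, not the bundled script's logic: the marker list and the function name `find_thinking_markers` are assumptions chosen for illustration. Free-form reasoning prose cannot be caught by substring matching and still needs a manual read.

```python
# Markers that suggest thinking output leaked into a response.
# This list is an assumption based on the workflow text above.
THINKING_MARKERS = ("Thinking...", "<think>", "</think>")


def find_thinking_markers(text: str) -> list[str]:
    """Return every known marker that appears in text, in list order."""
    return [marker for marker in THINKING_MARKERS if marker in text]


if __name__ == "__main__":
    leaked = "<think>\nThe user asks 1+1.\n</think>\n1+1 等于 2。"
    clean = "1+1 等于 2。"
    print(find_thinking_markers(leaked))
    print(find_thinking_markers(clean))
```

An empty list means no known marker was found. A non-empty list names the exact failing markers, which is what a failure report should quote.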

Using The Script

Run the script from the skill directory or pass an absolute path:

```bash
python3 /path/to/ollama-qwen-nothink/scripts/create_qwen_nothink_ollama.py qwen3.6:35b-mlx
```

Default target naming appends -nothink to the source tag. When the tag is latest, -nothink is appended to the model name instead:

```
qwen3.6:35b-mlx -> qwen3.6:35b-mlx-nothink
qwen3:latest -> qwen3-nothink:latest
```
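
The two naming examples above can be reproduced with a few lines of Python. This is a sketch inferred from those examples only, not the script's actual code; the function name `default_target` is hypothetical, and treating an untagged name as `latest` is an assumption.

```python
def default_target(source: str) -> str:
    """Derive a default no-think name from a source model reference."""
    name, sep, tag = source.rpartition(":")
    if not sep or "/" in tag:
        # No tag given, or the last colon belongs to a registry host:port.
        # Assumption: treat the reference as having the implicit tag "latest".
        name, tag = source, "latest"
    if tag == "latest":
        # "latest" stays the tag; the suffix moves to the model name.
        return f"{name}-nothink:latest"
    return f"{name}:{tag}-nothink"


if __name__ == "__main__":
    for ref in ("qwen3.6:35b-mlx", "qwen3:latest"):
        print(ref, "->", default_target(ref))
```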

Useful options:

```bash
python3 scripts/create_qwen_nothink_ollama.py qwen3.6:35b-mlx --target qwen3.6:35b-mlx-nothink
python3 scripts/create_qwen_nothink_ollama.py qwen3.6:35b-mlx --dry-run
python3 scripts/create_qwen_nothink_ollama.py qwen3.6:35b-mlx --skip-verify
python3 scripts/create_qwen_nothink_ollama.py custom-model:latest --allow-non-qwen
```

The script:

  • Builds a temporary Modelfile from the source model.
  • Preserves existing generation parameters where possible.
  • Uses a Qwen chat template that pre-fills an empty <think></think> block at the assistant prefix.
  • Runs ollama create for the derived target.
  • Patches only the target manifest/config in the local Ollama model store.
  • Removes thinking from capabilities and clears renderer/parser so Ollama CLI does not re-enable thinking mode.
  • Verifies the target through the local Ollama chat API unless --skip-verify is set.
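
The config patch described in the list above can be pictured as a pure transformation on the parsed config JSON. This is a minimal sketch under stated assumptions: the key names `capabilities`, `renderer`, and `parser` are taken from the wording of this document, not verified against a specific Ollama version, and `strip_thinking` is a hypothetical helper. Writing the result back to the model store is covered by references/manifest-patch.md and is not shown.

```python
import copy


def strip_thinking(config: dict) -> dict:
    """Return a copy of config without thinking metadata.

    Assumed keys: "capabilities" (list of strings), "renderer", "parser".
    The input dict is left untouched so the source model's data is never mutated.
    """
    patched = copy.deepcopy(config)
    caps = patched.get("capabilities")
    if isinstance(caps, list):
        patched["capabilities"] = [c for c in caps if c != "thinking"]
    # "Clearing" is modelled here as removing the keys; setting them to an
    # empty string would be another reasonable reading.
    for key in ("renderer", "parser"):
        patched.pop(key, None)
    return patched


if __name__ == "__main__":
    example = {
        "capabilities": ["completion", "vision", "thinking"],
        "renderer": "example-thinking-renderer",  # placeholder value
        "parser": "example-thinking-parser",      # placeholder value
    }
    print(strip_thinking(example))
```

Working on a copy keeps the transformation easy to test and mirrors the rule that only the derived target is ever changed.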

Tradeoffs

This workflow is optimized for direct text answers. Clearing the thinking-aware renderer/parser can also remove Ollama's automatic tools capability for that derived tag, and vision behavior should be verified separately with a real image prompt if the user needs multimodal use. If tool calling or advanced renderer behavior matters more than a default no-think tag, prefer keeping the source model and calling it with --think=false or the API equivalent for each request.
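
For the per-request route, the API equivalent of --think=false is the `think` field in the chat request body. The sketch below assumes a local Ollama build recent enough to accept that field and the same endpoint used in the curl check; the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

# Same local endpoint as the curl verification step.
OLLAMA_CHAT_URL = "http://127.0.0.1:11434/api/chat"


def build_chat_request(model: str, prompt: str, think: bool = False) -> dict:
    """Build a non-streaming chat payload with thinking disabled by default."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "think": think,
    }


def chat(model: str, prompt: str, think: bool = False, url: str = OLLAMA_CHAT_URL) -> str:
    """Send one chat request and return the assistant's text."""
    body = json.dumps(build_chat_request(model, prompt, think)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        reply = json.load(response)
    return reply["message"]["content"]


if __name__ == "__main__":
    # Requires a running Ollama server and a pulled model.
    print(chat("qwen3.6:35b-mlx", "用一句话回答:1+1等于几?"))
```

This keeps the source model's renderer, parser, and tools capability intact, at the cost of setting the switch on every call.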

Manual Fallback

If the script cannot run, create a Modelfile like this:

```
FROM SOURCE_MODEL
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>

</think>

"""
PARAMETER stop "<|im_end|>"
SYSTEM """
你是一个直接回答的助手。
默认关闭思考模式;不要输出思考过程、推理草稿、<think>、thinking 或 reasoning 内容。
直接给出最终答案。
"""
```

Then run:

```bash
ollama create TARGET_MODEL -f Modelfile
```

After creation, patch the target's local config blob as described in references/manifest-patch.md. This second step is important; without it, Ollama may still treat the target as a thinking model.
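
When the fallback has to be repeated for several source models, the Modelfile can be generated instead of hand-edited. The following is a sketch only: it reuses the template from this section with the SOURCE_MODEL placeholder, and the helper names `render_modelfile` and `write_modelfile` are hypothetical. It does not perform the config patch.

```python
from pathlib import Path

# Fallback Modelfile from this section, with SOURCE_MODEL as the placeholder.
MODELFILE = '''FROM SOURCE_MODEL
TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
<think>

</think>

"""
PARAMETER stop "<|im_end|>"
SYSTEM """
你是一个直接回答的助手。
默认关闭思考模式;不要输出思考过程、推理草稿、<think>、thinking 或 reasoning 内容。
直接给出最终答案。
"""
'''


def render_modelfile(source: str) -> str:
    """Return the fallback Modelfile text for the given source model."""
    if not source.strip():
        raise ValueError("source model name must not be empty")
    return MODELFILE.replace("SOURCE_MODEL", source)


def write_modelfile(source: str, path: str = "Modelfile") -> Path:
    """Write the Modelfile to disk and return its path."""
    target = Path(path)
    target.write_text(render_modelfile(source), encoding="utf-8")
    return target


if __name__ == "__main__":
    print(render_modelfile("qwen3.6:35b-mlx"))
```

After writing the file, run ollama create as shown above and then apply the config patch.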

Safety Rules

  • Never edit the source model's manifest or blobs.
  • Never set the target name equal to the source name.
  • Prefer creating a new tag over overwriting a user's existing no-think tag unless the user asked for that exact tag.
  • Keep a copy of the generated Modelfile in the working directory when useful; it documents how the tag was made.
  • If verification fails, report the exact failing marker and suggest using --think=false at runtime as the reliable fallback.
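
The name-collision rule above is easy to enforce before anything is created. This is a hypothetical pre-flight check, not part of the bundled script: `normalize` and `check_target` are illustrative names, and the only behavior relied on is that Ollama treats an untagged reference as `:latest`.

```python
def normalize(ref: str) -> str:
    """Add the implicit ":latest" tag so "qwen3" and "qwen3:latest" compare equal."""
    last_segment = ref.rsplit("/", 1)[-1]
    return ref if ":" in last_segment else f"{ref}:latest"


def check_target(source: str, target: str) -> None:
    """Raise ValueError if the target would overwrite the source model."""
    if normalize(source) == normalize(target):
        raise ValueError(
            f"target {target!r} resolves to the source model {source!r}; "
            "choose a different tag"
        )


if __name__ == "__main__":
    check_target("qwen3.6:35b-mlx", "qwen3.6:35b-mlx-nothink")  # passes silently
    try:
        check_target("qwen3", "qwen3:latest")
    except ValueError as error:
        print("refused:", error)
```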

Notes

  • PARAMETER think false is not accepted by many Ollama versions, even though ollama run --think=false works.
  • /no_think in the system prompt often fails because Qwen thinking models may treat it as ordinary text after Ollama has already selected thinking rendering.
  • Removing thinking from the target config is not enough by itself if renderer/parser still point to a thinking-aware Qwen renderer.

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-31 13:51
