
Model Deploy Skill

Use this skill when users ask to deploy LLMs (Qwen, DeepSeek, etc.) on specified GPU servers and start the model service. This skill can download models...
wangwei1237

Overview

Model Deploy

Deploy large language models on GPU servers using vLLM. NOTE: only the ModelScope platform and the vLLM inference engine are currently supported.

Ensure that the server running OpenClaw has passwordless SSH access to the GPU servers. You can set this up by running the ssh-copy-id command on the OpenClaw server.

This skill assumes that Miniconda is already installed on the GPU server and is used to manage Python environments. You can create the vllm environment with Miniconda using the following commands:

conda create -n vllm python=3.10 -y
conda activate vllm
pip install vllm

Quick Start

On the ModelScope platform, models are uniquely identified by MODEL_ORG/MODEL_NAME. For example, for Qwen/Qwen3.5-0.8B, MODEL_ORG is Qwen and MODEL_NAME is Qwen3.5-0.8B.
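As a minimal sketch of that naming rule, the two parts can be separated with plain shell parameter expansion. MODEL_ID is a hypothetical variable name introduced only for this illustration; it is not something the skill or deploy.sh is known to use.

```shell
#!/usr/bin/env bash
# Split a ModelScope model ID of the form MODEL_ORG/MODEL_NAME.
# MODEL_ID is a hypothetical variable name used only for this illustration.
MODEL_ID="Qwen/Qwen3.5-0.8B"

MODEL_ORG="${MODEL_ID%%/*}"   # everything before the first "/"
MODEL_NAME="${MODEL_ID#*/}"   # everything after the first "/"

echo "MODEL_ORG=${MODEL_ORG}"
echo "MODEL_NAME=${MODEL_NAME}"
```

Run as-is, this prints MODEL_ORG=Qwen and MODEL_NAME=Qwen3.5-0.8B.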

Deploying Qwen Family Models

To deploy Qwen family models, use the deployment script scripts/deploy.sh. Its usage is as follows:

Usage: [ENV_VARS] deploy.sh <model_name>

Example:
  PORT=8001 \
  GPU_COUNT=4 \
  ./deploy.sh Qwen3.5-0.8B

Environment Variables:
  ENV_NAME        conda environment name (default: vllm)
  PORT            service port (default: 8000)
  GPU_COUNT       number of GPUs for tensor parallelism (default: 1)
  PROXY           proxy address (default: http://{proxyaddress}:{port})
  MODEL_BASE_PATH local path to store models (default: /home/work/models)
Variable         Description                              Default
---------------  ---------------------------------------  -----------------------------
MODEL_ORG        model organization                       Qwen
MODEL_NAME       model name                               Qwen3.5-0.8B
ENV_NAME         conda environment                        vllm
PORT             model service port                       8000
GPU_COUNT        number of GPUs for tensor parallelism    1
PROXY            proxy address                            http://{proxyaddress}:{port}
MODEL_BASE_PATH  local storage path for models            /home/work/models
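The contents of deploy.sh are not shown here, so the following is only a sketch of how defaults like those listed above are commonly resolved in a shell script, using the `${VAR:-default}` expansion. The function name resolve_config is hypothetical. PROXY is left out because its documented default is a placeholder rather than a usable address.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_config is a hypothetical helper, not part of deploy.sh.
# Each setting falls back to the documented default when unset or empty.
resolve_config() {
  printf 'MODEL_ORG=%s\n'       "${MODEL_ORG:-Qwen}"
  printf 'MODEL_NAME=%s\n'      "${MODEL_NAME:-Qwen3.5-0.8B}"
  printf 'ENV_NAME=%s\n'        "${ENV_NAME:-vllm}"
  printf 'PORT=%s\n'            "${PORT:-8000}"
  printf 'GPU_COUNT=%s\n'       "${GPU_COUNT:-1}"
  printf 'MODEL_BASE_PATH=%s\n' "${MODEL_BASE_PATH:-/home/work/models}"
}

# Override two settings for one invocation; the rest keep their defaults.
# The subshell keeps the overrides from leaking into the calling shell.
( PORT=8001; GPU_COUNT=4; resolve_config )
```

Any variable that is not overridden keeps the default from the table, which mirrors how the usage example sets only PORT and GPU_COUNT.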

Deployment Steps

  • Extract required information from the user request: model name (MODEL_NAME), model organization (MODEL_ORG), target server address (TARGET_HOST), deployment user (TARGET_USER), and other necessary parameters.
  • Copy ./skills/model-deploy/scripts/deploy.sh to the specified path on the target server, e.g., $HOME/wangwei1237.
  • Grant execute permission to the deployment script on the target server.
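  • Run the deployment script on the target server using the following format. PORT=8001 is written directly before ./deploy.sh so that it is placed in the script's environment, and \$HOME is escaped so that it expands on the target server rather than locally:
      ssh ${TARGET_USER}@${TARGET_HOST} "cd \$HOME/wangwei1237 && PORT=8001 ./deploy.sh Qwen3.5-0.8B"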
    
  • After deployment, test whether the model service has started successfully on the target server by running:
      curl -X POST http://127.0.0.1:8001/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
            "messages": [{"role": "user", "content": "你好"}],
            "max_tokens": 512
        }'
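The copy, permission, and run steps above can be sketched as a small dry-run script. This is an illustration under stated assumptions, not the skill's actual implementation: run, deploy_steps, TARGET_DIR, and DRY_RUN are hypothetical names, the host gpu01 and user work are made-up values, and the mkdir step is an added assumption in case the target directory does not exist yet. Paths are given relative to the remote home directory, which is where both ssh and scp start by default.

```shell
#!/usr/bin/env bash
# Sketch only: run, deploy_steps, TARGET_DIR and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # for inspection only; original quoting is not shown
  else
    "$@"
  fi
}

deploy_steps() {
  local remote="${TARGET_USER:?set TARGET_USER}@${TARGET_HOST:?set TARGET_HOST}"
  local dir="${TARGET_DIR:-wangwei1237}"   # relative to the remote home directory
  local model="${MODEL_NAME:-Qwen3.5-0.8B}"
  local port="${PORT:-8001}"

  # Assumption: the target directory may not exist yet, so create it first.
  run ssh "$remote" "mkdir -p ${dir}"
  run scp ./skills/model-deploy/scripts/deploy.sh "${remote}:${dir}/"
  run ssh "$remote" "chmod +x ${dir}/deploy.sh"
  # PORT=... directly before the command places it in deploy.sh's environment.
  run ssh "$remote" "cd ${dir} && PORT=${port} ./deploy.sh ${model}"
}

# Dry run with made-up connection details.
( TARGET_USER=work; TARGET_HOST=gpu01; DRY_RUN=1; deploy_steps )
```

Setting DRY_RUN=0 would execute the commands instead of printing them; that path requires the passwordless SSH access described in the overview.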
    

Constraints

  • Commands on the target server must be executed in this format:

ssh ${TARGET_USER}@${TARGET_HOST} "${CMD}"

Troubleshooting

  • Port occupied: Check which process is using the port with netstat -tlnp | grep ${PORT}
  • Version issues: Run pip install vllm --upgrade
  • Network issues: Set proxy with export https_proxy="http://{proxyaddress}:{port}"
  • Insufficient GPU memory: Check GPU usage with nvidia-smi, pick the index of a GPU with enough free memory (referred to here as GPU_FAN), set export CUDA_VISIBLE_DEVICES=$GPU_FAN to select that GPU, then rerun the deployment script.
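For the GPU memory case, choosing the index can be automated. The sketch below assumes the input has the "index, free MiB" layout that nvidia-smi prints for --query-gpu=index,memory.free with --format=csv,noheader,nounits; pick_gpu is a hypothetical helper name, and the sample numbers are made up so the snippet runs without a GPU.

```shell
#!/usr/bin/env bash
# Sketch only: pick_gpu is a hypothetical helper.
# Reads "index, free_MiB" lines on stdin and prints the index of the GPU
# with the most free memory (the first one wins on ties).
pick_gpu() {
  awk -F', *' '
    BEGIN { max = -1 }
    ($2 + 0) > max { max = $2 + 0; idx = $1 }
    END { if (max >= 0) print idx }
  '
}

# Made-up sample in the format produced by:
#   nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits
GPU_FAN="$(printf '0, 1024\n1, 40000\n2, 16000\n' | pick_gpu)"
echo "export CUDA_VISIBLE_DEVICES=${GPU_FAN}"
```

On a real GPU server, the printf sample would be replaced by the nvidia-smi query shown in the comment, run on the target server in the ssh format required under Constraints.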

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-19 21:18 Safe Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
