DiePre Embodied Bridge

DiePre embodied bridge layer: bridges 2D visual detection to 3D spatial understanding and robot action planning. It is the concrete implementation of vision-action-evolution-loop.

Overview

DiePre Embodied Bridge Skill

Metadata

Field          Value
-------------------------------------------
Name           diepre-embodied-bridge
Version        1.0.0
Author         KingOfZhao
Release date   2026-03-31
Confidence     96%

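Core Philosophy

vision-action-evolution-loop defines an abstract five-stage closed loop.

This Skill is its concrete implementation layer, focused on how to turn 2D lines into 3D actions.

Cognitive node relationships:

vision-action-evolution-loop (parent: abstract closed loop)
    └── diepre-embodied-bridge (this Skill: concrete implementation)
            ├── diepre-vision-cognition (upstream: 2D detection)
            └── diepre-action-memory (downstream: action memory, planned)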

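Three Core Ideas

1. Known-Geometry Estimation (Not General 3D Reconstruction)

A packaging box is not a complex scene; it is a known geometric solid. NeRF, Gaussian Splatting, and SfM are not needed.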

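Input: 2D DXF + FEFCO type + board thickness
Algorithm: FEFCO rule engine + 2D dimensions → 3D unfolded coordinates → fold matrices
Output: 3D coordinates (x, y, z) + fold order + face normals
Hardware: runs comfortably on an M1 Max (CPU only, <100 ms)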
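
The "fold matrix" step can be illustrated with a minimal sketch. It assumes each panel starts flat in the z = 0 plane and is folded by rotating it about its crease line with Rodrigues' rotation formula. The function names, the example panel, and the sign convention for the fold angle are assumptions made for illustration; they are not taken from the Skill's code.

```python
import math

def rotate_about_line(point, line_point, line_dir, angle_deg):
    """Rotate a 3D point about an arbitrary line using Rodrigues' formula."""
    norm = math.sqrt(sum(c * c for c in line_dir))
    kx, ky, kz = (c / norm for c in line_dir)
    px, py, pz = (point[i] - line_point[i] for i in range(3))
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Cross product k x p and dot product k . p
    cx, cy, cz = ky * pz - kz * py, kz * px - kx * pz, kx * py - ky * px
    dot = kx * px + ky * py + kz * pz
    rotated = (
        px * cos_t + cx * sin_t + kx * dot * (1 - cos_t),
        py * cos_t + cy * sin_t + ky * dot * (1 - cos_t),
        pz * cos_t + cz * sin_t + kz * dot * (1 - cos_t),
    )
    return tuple(rotated[i] + line_point[i] for i in range(3))

def fold_panel(corners_2d, crease_start, crease_end, angle_deg):
    """Fold a flat panel (lying in z = 0) about a crease line in the same plane."""
    origin = (crease_start[0], crease_start[1], 0.0)
    direction = (crease_end[0] - crease_start[0], crease_end[1] - crease_start[1], 0.0)
    return [rotate_about_line((x, y, 0.0), origin, direction, angle_deg)
            for x, y in corners_2d]

def folded_normal(crease_start, crease_end, angle_deg):
    """Normal of the folded panel: the flat normal (0, 0, 1) rotated by the same fold."""
    direction = (crease_end[0] - crease_start[0], crease_end[1] - crease_start[1], 0.0)
    return rotate_about_line((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), direction, angle_deg)

# Hypothetical side panel, 100 mm x 200 mm, attached along the crease x = 300
side_panel = [(300, 0), (400, 0), (400, 200), (300, 200)]
folded = fold_panel(side_panel, (300, 0), (300, 200), -90)
normal = folded_normal((300, 0), (300, 200), -90)
```

With this convention, a fold of -90 degrees about the crease x = 300 lifts the panel's outer edge from (400, y, 0) to (300, y, 100), and the panel normal turns from (0, 0, 1) to (-1, 0, 0). A full box would apply one such rotation per crease, in the fold order given by the FEFCO rules.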

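Why exclude NeRF / Gaussian Splatting?

  • A packaging box is a planar folded structure, not a complex 3D scene
  • NeRF-style reconstruction typically needs many posed photos and GPU training, which is impractical on an M1 Max
  • Gaussian Splatting needs dense viewpoints, which is unrealistic in a production environment
  • Known-geometry estimation: 1 photo + FEFCO rules → 3D, completed within seconds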

2. MCP Toolchain (Tool-Augmented)

The OpenCV pipeline is wrapped as callable tools, so the VLA model calls tools instead of processing raw images:

tools = {
    "detect_dieline": {
        "input": "image_path: str",
        "output": "dxf_path: str, confidence: float",
        "impl": "diepre_vision.analyze"
    },
    "estimate_dimensions": {
        "input": "dxf_path: str",
        "output": "length, width, height, thickness_mm",
        "impl": "dimension_estimator.from_dxf"
    },
    "identify_fefco_type": {
        "input": "dxf_path: str, layout_features: dict",
        "output": "fefco_type: str (e.g. 0201, 0427)",
        "impl": "fefco_classifier.classify"
    },
    "calculate_fold_sequence": {
        "input": "fefco_type: str, dimensions: dict, material: str",
        "output": "ordered_steps: list[FoldStep]",
        "impl": "fold_planner.plan"
    },
    "compute_grasp_points": {
        "input": "fold_sequence: list[FoldStep], material_thickness: float",
        "output": "grasp_points: list[GraspPoint] (xyz + force + angle)",
        "impl": "grasp_calculator.compute"
    },
    "estimate_quality": {
        "input": "image_path: str, expected_dimensions: dict",
        "output": "quality_score: float, defects: list",
        "impl": "quality_checker.evaluate"
    }
}
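
One way to make such a tool table callable is a small name-to-function registry. The sketch below is plain Python dispatch, not the MCP SDK. The decorator, the `call_tool` helper, and the placeholder tool bodies are assumptions for illustration; only the tool names and their input/output fields come from the table above.

```python
from typing import Any, Callable, Dict

# Registry mapping tool names to Python callables
TOOL_REGISTRY: Dict[str, Callable[..., Dict[str, Any]]] = {}

def register_tool(name: str):
    """Decorator that registers a function under a tool name."""
    def decorator(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return decorator

@register_tool("detect_dieline")
def detect_dieline(image_path: str) -> Dict[str, Any]:
    # Placeholder: a real tool would run the OpenCV pipeline here.
    dxf_path = image_path.rsplit(".", 1)[0] + ".dxf"
    return {"dxf_path": dxf_path, "confidence": 0.0}

@register_tool("estimate_dimensions")
def estimate_dimensions(dxf_path: str) -> Dict[str, Any]:
    # Placeholder values: a real tool would measure the DXF geometry.
    return {"length": 0.0, "width": 0.0, "height": 0.0, "thickness_mm": 0.0}

def call_tool(name: str, **kwargs: Any) -> Dict[str, Any]:
    """Dispatch a tool call by name, as a model-facing entry point might."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return TOOL_REGISTRY[name](**kwargs)

# Chain two tools: the output of one feeds the next
detected = call_tool("detect_dieline", image_path="box_photo.jpg")
dims = call_tool("estimate_dimensions", dxf_path=detected["dxf_path"])
```

Keeping every tool behind one `call_tool` entry point means the model only ever sees names and structured results, which matches the stated goal of calling tools rather than handling raw images.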

3. Self-Iterating Evolution Mechanism

Execute task → record result → extract failure modes → adjust parameters → run better next time

Detailed flow:
1. After each task finishes, write evolution_log/{task_id}.json:
   {
     "task_id": "diepre_20260331_001",
     "input": {"image": "...", "fefco": "0201", "material": "B flute"},
     "execution": {"steps": [...], "timing_ms": 3400},
     "result": {"success": false, "fail_step": 3, "error": "grasp_slip"},
     "params_used": {"grasp_force": 2.5, "approach_angle": 45}
   }

2. Periodically scan evolution_log/ and extract failure modes:
   - grasp_slip frequency on B flute is 73% → increase the grasp force
   - fold_sequence error frequency on FEFCO 0427 is 40% → correct the fold rules

3. Update the parameter file params/evolved_params.json:
   {"B_flute_grasp_force": 3.2, "0427_fold_override": [...]}

4. The next task loads evolved_params.json and runs with the optimized parameters
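
Steps 2 and 3 above can be sketched as follows. The sketch assumes log records shaped like the JSON example in step 1. The helper names, the 0.5 rate threshold, and the fixed 0.1 force increment are hypothetical choices for illustration; the source does not state how the new parameter values are derived.

```python
import json
from collections import Counter
from pathlib import Path

def load_records(log_dir):
    """Read every task record from a directory such as evolution_log/."""
    return [json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(Path(log_dir).glob("*.json"))]

def failure_rates(records):
    """Return {(material, error): failures / tasks on that material}."""
    totals, failures = Counter(), Counter()
    for rec in records:
        material = rec["input"]["material"]
        totals[material] += 1
        if not rec["result"]["success"]:
            failures[(material, rec["result"]["error"])] += 1
    return {key: count / totals[key[0]] for key, count in failures.items()}

def evolve_params(params, rates, threshold=0.5, force_step=0.1):
    """Raise the grasp force for materials where grasp_slip is frequent.

    threshold and force_step are assumed tuning knobs, not source values.
    """
    updated = dict(params)
    for (material, error), rate in rates.items():
        if error == "grasp_slip" and rate >= threshold:
            key = material.replace(" ", "_") + "_grasp_force"
            updated[key] = round(updated.get(key, 2.5) + force_step, 2)
    return updated

# In-memory records standing in for files in evolution_log/
records = [
    {"input": {"material": "B flute"}, "result": {"success": False, "error": "grasp_slip"}},
    {"input": {"material": "B flute"}, "result": {"success": False, "error": "grasp_slip"}},
    {"input": {"material": "B flute"}, "result": {"success": False, "error": "grasp_slip"}},
    {"input": {"material": "B flute"}, "result": {"success": True}},
]
rates = failure_rates(records)
new_params = evolve_params({"B_flute_grasp_force": 2.5}, rates)
```

Returning a new dictionary instead of mutating the input keeps the previous parameter set available, which makes it easy to roll back if an adjustment makes results worse.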

Installation

clawhub install diepre-embodied-bridge
# or install manually
cp -r skills/diepre-embodied-bridge ~/.openclaw/skills/

Usage

from skills.diepre_embodied_bridge import DiePreEmbodiedBridge

bridge = DiePreEmbodiedBridge(workspace=".")

# Single run
result = bridge.execute(
    image_path="path/to/box_photo.jpg",
    material="B flute",
    thickness_mm=3.0
)

print(result.fefco_type)          # "0201"
print(result.dimensions)          # {"L": 300, "W": 200, "H": 100}
print(result.fold_sequence)       # [FoldStep(...), ...]
print(result.grasp_points)        # [GraspPoint(x=150,y=0,z=50,force=3.2), ...]
print(result.quality_score)       # 0.92
print(result.confidence)          # 0.96

# Self-iteration: inject failure feedback
bridge.record_failure(
    task_id="diepre_20260331_001",
    fail_step=3,
    error_type="grasp_slip",
    context={"material": "B flute", "grasp_force": 2.5}
)

# Inspect evolution status
stats = bridge.evolution_stats()
print(stats.total_tasks)          # 47
print(stats.failure_rate)         # 0.12
print(stats.top_failure_modes)    # [("grasp_slip", 8), ("fold_error", 4)]
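
The result fields printed above suggest a small set of record types. The following is a hedged sketch of what they might look like as dataclasses. The `GraspPoint` fields follow the printed example and the "xyz + force + angle" note in the tool table; the `FoldStep` fields and the `BridgeResult` class name are hypothetical, since the source does not show their definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FoldStep:
    # All three fields are hypothetical; the source shows only FoldStep(...)
    order: int
    crease_id: str
    angle_deg: float

@dataclass
class GraspPoint:
    x: float
    y: float
    z: float
    force: float
    angle: float = 0.0  # approach angle; the default is an assumption

@dataclass
class BridgeResult:
    # Hypothetical container mirroring the attributes printed in the usage example
    fefco_type: str
    dimensions: Dict[str, float]
    fold_sequence: List[FoldStep] = field(default_factory=list)
    grasp_points: List[GraspPoint] = field(default_factory=list)
    quality_score: float = 0.0
    confidence: float = 0.0

example = BridgeResult(
    fefco_type="0201",
    dimensions={"L": 300, "W": 200, "H": 100},
    grasp_points=[GraspPoint(x=150, y=0, z=50, force=3.2)],
)
```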

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-07 20:03, security status: safe
