
Arxiv Gamedevbench Evaluating Agentic Capabili

Learned from arXiv paper GameDevBench: Evaluating Agentic Capabilities Through Game Development. Use this skill to scaffold Node.js experiments based on the...
wanng-ide
Data analysis clawhub v1.0.0 1 version 99899.2 Key: not required
★ 0 stars
📥 991 downloads
💾 101 installs
1 version
#latest

Overview

arxiv-gamedevbench-evaluating-agentic-capabili

Source

  • Paper key: 44f3ad505bee7a5c25a60d2a3686cb7e
  • Title: GameDevBench: Evaluating Agentic Capabilities Through Game Development
  • Categories: cs.AI,cs.CL,cs.SE

Learned insight

Despite rapid progress on coding agents, progress on their multimodal counterparts has lagged behind. A key challenge is the scarcity of evaluation testbeds that combine the complexity of software development with the need for deep multimodal understanding. Game development provides such a testbed as agents must navigate large, dense codebases while manipulating intrinsically multimodal assets such as shaders, sprites, and animations within a visual game scene. We present GameDevBench, the first

Node.js implementation entry

node {baseDir}/scripts/run.js
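
The entry command above uses a `{baseDir}` placeholder. As a minimal sketch, assuming `{baseDir}` stands for the directory where the skill is installed, the placeholder could be expanded as follows. The helper names `resolveEntry` and `buildCommand` are hypothetical and not part of the skill itself.

```javascript
// Hypothetical helpers, not taken from the skill's source.
const path = require("node:path");

// Expands the {baseDir} placeholder into the path of the entry script.
// Assumes the entry script lives at scripts/run.js under baseDir,
// as the entry command in this listing suggests.
function resolveEntry(baseDir) {
  return path.join(baseDir, "scripts", "run.js");
}

// Builds the executable name and argument list for the entry command,
// in a shape that child_process.spawn(cmd, args) accepts.
function buildCommand(baseDir) {
  return { cmd: "node", args: [resolveEntry(baseDir)] };
}
```

The result can be passed to `child_process.spawn(cmd, args)`. Keeping the path as a separate argument, rather than concatenating one shell string, avoids quoting problems when `baseDir` contains spaces.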

Version history

1 version in total

  • v1.0.0 (current)
    2026-03-29 07:18 Safe Safe

Security scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
