Anime Character Loader

Load and validate anime character data from AniList and Jikan, generating semantically checked SOUL.generated.md with forced disambiguation and idempotent me...
colinchen4
Data analysis · clawhub v2.4.2 · 1 version · 99908.6 · Key: not required

Anime Character Loader v2.3

Structured Architecture (Skill + CLI)

This repository is organized as a skill + CLI hybrid:

  • load_character.py: legacy-compatible CLI command (wrapper)
  • src/anime_character_loader/cli.py: structured CLI entrypoint
  • src/anime_character_loader/legacy.py: preserved legacy behavior implementation
  • src/anime_character_loader/{sources,disambiguation,generator,validator,storage}: module boundaries for maintainability
  • tests/: minimal regression coverage for compatibility-critical paths

Overview

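A multi-source anime character data loader that generates a validated SOUL.generated.md persona file.

Key improvements in v2.3:

  1. Idempotent merge: merging the same character repeatedly does not create duplicates
  2. Cross-source consistency scoring: AniList and Jikan results are cross-validated
  3. Forced selection: when scores are close, the user must choose explicitly
  4. Standardized exit codes: calling scripts can tell error types apart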

Key Improvements (v2.3)

1. Exit Code System

0   # Success
10  # Network error
20  # Data error (no match, disambiguation failed)
30  # Validation failed
40  # File error
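
A calling script can branch on these codes. The following is a minimal sketch, not the project's own code: the `ExitCode` enum, `describe_exit`, and `run_loader` are hypothetical names introduced here, and only the numeric values come from the list above.

```python
import subprocess
import sys
from enum import IntEnum


class ExitCode(IntEnum):
    """Numeric values taken from the exit code list above; the names are assumptions."""
    SUCCESS = 0
    NETWORK_ERROR = 10
    DATA_ERROR = 20
    VALIDATION_FAILED = 30
    FILE_ERROR = 40


def describe_exit(code: int) -> str:
    """Map a raw exit status to a short label; unknown codes are reported as such."""
    try:
        return ExitCode(code).name
    except ValueError:
        return f"UNKNOWN({code})"


def run_loader(args: list[str]) -> str:
    """Run load_character.py with the given arguments and label its exit status."""
    result = subprocess.run([sys.executable, "load_character.py", *args])
    return describe_exit(result.returncode)
```

For example, `run_loader(["Sakura"])` would be expected to return `"DATA_ERROR"` if the loader rejects the ambiguous name as described in this document.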

2. Cross-Source Consistency Scoring

Confidence = (AniList * 0.5 + Jikan * 0.3) + (Consistency * 0.2)

If the gap between top1 and top2 is < 0.15:
    → --select is required for manual selection
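
The scoring rule above can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, and the three inputs are assumed to be scores in the range 0 to 1. Only the weights and the 0.15 gap come from this document.

```python
FORCE_SELECTION_THRESHOLD = 0.15  # gap below which --select is required


def confidence(anilist: float, jikan: float, consistency: float) -> float:
    """Weighted confidence: 0.5 AniList, 0.3 Jikan, 0.2 cross-source consistency."""
    return (anilist * 0.5 + jikan * 0.3) + (consistency * 0.2)


def needs_manual_selection(scores: list[float],
                           threshold: float = FORCE_SELECTION_THRESHOLD) -> bool:
    """Return True when the two best scores are closer than the threshold."""
    if len(scores) < 2:
        return False  # a single candidate cannot be "close" to another
    top1, top2 = sorted(scores, reverse=True)[:2]
    return (top1 - top2) < threshold
```

With candidate scores of 0.9 and 0.8 the gap is about 0.1, so manual selection would be required; with 0.9 and 0.5 it would not.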

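3. Idempotent Merge

# First merge
python load_character.py "Megumi" --anime "Saekano"
# Choose MERGE → the character is added

# Second merge (same character)
python load_character.py "Megumi" --anime "Saekano"
# Choose MERGE → duplicate detected, skipped

# Third merge (content updated)
# If the generated content has changed → the entry is updated rather than appended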
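
The three outcomes above (add, skip, update) can be modeled as follows. This is a sketch under stated assumptions, not the loader's actual storage logic: it assumes entries are keyed by character name and compared by a SHA-256 digest of their content, and `merge_character` is a hypothetical helper.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable digest used to detect whether generated content has changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merge_character(store: dict[str, str], name: str, content: str) -> str:
    """Merge one character into the store and report what happened.

    Returns "added", "skipped", or "updated". Running it twice with the
    same input leaves the store unchanged, which is what makes it idempotent.
    """
    existing = store.get(name)
    if existing is None:
        store[name] = content
        return "added"
    if content_hash(existing) == content_hash(content):
        return "skipped"  # same character, same content: nothing to do
    store[name] = content  # same character, new content: replace in place
    return "updated"
```

Keying by name means a changed entry overwrites the old one instead of being appended next to it.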

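4. Enhanced Forced Disambiguation

# ❌ Fails: several characters are named Sakura
python load_character.py "Sakura"

# ❌ Even with a work specified, the sources may return similar results
python load_character.py "Sakura" --anime "Fate"
# May still require --select if the AniList and Jikan results disagree

# ✅ Manual selection is required
python load_character.py "Sakura" --anime "Fate" --select 1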

Usage Examples

Basic Usage (Unique Names)

# A unique name can be generated directly
python load_character.py "Kasumigaoka Utaha"
python load_character.py "霞之丘诗羽"

Disambiguation Required

# Characters that share a name require the work to be specified
python load_character.py "Sakura" --anime "Fate"
python load_character.py "Rin" --anime "Fate"
python load_character.py "Miku" --anime "Quintessential"

Manual Selection

# List all matches and select one manually
python load_character.py "Sakura" --select 2

Preview Mode

# Show the information only, without generating a file
python load_character.py "加藤惠" --info

Workflow

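1. Name translation (Chinese → English/Japanese)
        ↓
2. Parallel multi-source query (AniList + Jikan)
        ↓
3. Cross-source consistency scoring
   - Compute name similarity
   - Compute work (title) similarity
   - Rank by combined confidence
        ↓
4. Forced disambiguation check
   - Multiple matches? → --anime required
   - Close scores? → --select required
   - Low confidence? → --anime required
        ↓
5. Generate SOUL.generated.md
        ↓
6. Semantic validation (9 checks)
        ↓
7. Prompt for a load option (REPLACE/MERGE/KEEP)
        ↓
8. Idempotent merge (if MERGE is selected)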
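
Step 4 of the workflow can be sketched as a single decision function. The `Candidate` type, the `required_flag` name, and the order in which the three checks run are assumptions made for illustration; the document only lists the three conditions and the flag each one calls for.

```python
from dataclasses import dataclass
from typing import Optional

FORCE_SELECTION_THRESHOLD = 0.15
CONFIDENCE_THRESHOLD_LOW = 0.5


@dataclass
class Candidate:
    name: str
    anime: str
    confidence: float


def required_flag(candidates: list[Candidate], anime_given: bool,
                  selected: Optional[int]) -> Optional[str]:
    """Return the flag the user must add, or None if generation can proceed."""
    if not candidates:
        raise ValueError("no match")  # would map to the data error exit code
    if selected is not None:
        return None  # an explicit --select settles any ambiguity
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if len(ranked) > 1 and not anime_given:
        return "--anime"  # multiple matches and no work specified
    if (len(ranked) > 1 and
            ranked[0].confidence - ranked[1].confidence < FORCE_SELECTION_THRESHOLD):
        return "--select"  # scores too close to pick automatically
    if ranked[0].confidence < CONFIDENCE_THRESHOLD_LOW:
        return "--anime"  # best match is below the lowest acceptable score
    return None
```

The source does not say what happens when confidence stays low even though `--anime` was already given; this sketch simply follows the workflow literally in that case.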

Configuration

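Forced disambiguation switch

# At the top of load_character.py
FORCE_DISAMBIGUATION = True  # Set to False to restore lenient mode

Forced selection threshold

FORCE_SELECTION_THRESHOLD = 0.15  # Selection is forced when the score gap is below this value

Confidence thresholds

CONFIDENCE_THRESHOLD_HIGH = 0.8    # High confidence
CONFIDENCE_THRESHOLD_MEDIUM = 0.6  # Medium confidence
CONFIDENCE_THRESHOLD_LOW = 0.5     # Minimum acceptable score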
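
One way these three thresholds could be applied is to bucket a score into a named level. The `confidence_level` function and the choice of inclusive comparisons (`>=`) are assumptions; the document defines only the constant values.

```python
CONFIDENCE_THRESHOLD_HIGH = 0.8
CONFIDENCE_THRESHOLD_MEDIUM = 0.6
CONFIDENCE_THRESHOLD_LOW = 0.5


def confidence_level(score: float) -> str:
    """Bucket a confidence score; anything below the low threshold is rejected."""
    if score >= CONFIDENCE_THRESHOLD_HIGH:
        return "high"
    if score >= CONFIDENCE_THRESHOLD_MEDIUM:
        return "medium"
    if score >= CONFIDENCE_THRESHOLD_LOW:
        return "low"
    return "rejected"
```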

Error Handling

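| Scenario | Exit code | Handling |
|---|---|---|
| API failure | 10 | Exit after 3 retries |
| Ambiguous name with no hint | 20 | Fail and prompt the user to pass --anime |
| Close scores | 20 | Fail and prompt the user to pass --select |
| Validation failure | 30 | Roll back; can be overridden with --force |
| File write failure | 40 | Clean up temporary files, then exit |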
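
The API failure row can be illustrated with a small retry helper. This is a sketch under assumptions: `with_retries` and `NetworkFailure` are hypothetical names, "3 retries" is read here as three attempts in total, and the 0.5-second base delay with doubling is an illustrative choice for exponential backoff.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
EXIT_NETWORK_ERROR = 10


class NetworkFailure(Exception):
    """Raised once all attempts fail; carries the exit code a CLI could use."""
    exit_code = EXIT_NETWORK_ERROR


def with_retries(fetch: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], object] = time.sleep) -> T:
    """Call fetch() up to `attempts` times, doubling the delay between tries."""
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as exc:  # ConnectionError and TimeoutError are OSError subclasses
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise NetworkFailure(f"gave up after {attempts} attempts") from last_error
```

A command-line entry point could catch `NetworkFailure` and call `sys.exit(err.exit_code)`. Passing `sleep` as a parameter keeps the helper testable without real waiting.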

Cache & Performance

  • SQLite cache (~/.cache/anime-character-loader/)
  • Entries expire after 24 hours
  • Automatic rate limiting (0.5 s between requests)
  • Retry on failure (exponential backoff)
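
A cache with a 24-hour expiry can be built on the standard `sqlite3` module. The sketch below is illustrative only: the table layout and the function names are assumptions and do not describe the loader's actual schema. It uses an in-memory database by default so it can be run anywhere.

```python
import sqlite3
import time
from typing import Optional

TTL_SECONDS = 24 * 60 * 60  # 24-hour expiry, as listed above


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the cache database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
    )
    return conn


def cache_put(conn: sqlite3.Connection, key: str, value: str,
              now: Optional[float] = None) -> None:
    """Store a value, replacing any earlier entry for the same key."""
    stored_at = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
        (key, value, stored_at),
    )
    conn.commit()


def cache_get(conn: sqlite3.Connection, key: str,
              now: Optional[float] = None) -> Optional[str]:
    """Return the cached value, or None if it is missing or older than the TTL."""
    current = time.time() if now is None else now
    row = conn.execute(
        "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
    ).fetchone()
    if row is None or current - row[1] > TTL_SECONDS:
        return None
    return row[0]
```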

🔒 Privacy Notice

Data Sent to External Services

When you query a character name, the following data may be sent to external APIs:

| Service | URL | Data Sent | Purpose |
|---|---|---|---|
| AniList | anilist.co | Character name | Primary character lookup |
| Jikan | jikan.moe | Character name | MyAnimeList backup source |
| Fandom Wiki | *.fandom.com | Character name + Anime name | Quotes and descriptions |
| 萌娘百科 (Moegirlpedia) | zh.moegirl.org.cn | Character name | Chinese character database |
| yurippe API | yurippe.vercel.app | Character name | Anime quotes database |

Privacy Protection

  • No personal data is collected or transmitted
  • Only character names and anime titles are sent to external services
  • All external requests use HTTPS encryption
  • Local caching minimizes repeated external calls

Opt-Out Options

# Disable external quotes fetching (use local database only)
export DISABLE_EXTERNAL_QUOTES=1
python load_character.py "Character Name"
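
Inside a Python program, an opt-out flag like this is typically read from the environment. The helper below is a hypothetical sketch; in particular, treating only the exact value `1` as "disabled" is an assumption, since the document does not say which values the loader accepts.

```python
import os
from typing import Mapping, Optional


def external_quotes_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DISABLE_EXTERNAL_QUOTES is set to "1"."""
    source = os.environ if env is None else env
    return source.get("DISABLE_EXTERNAL_QUOTES") == "1"
```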

⚖️ Legal & Copyright Notice

Quotes Database

The local quotes database (data/quotes_database.json) contains:

  • Fan-collected quotes from anime/manga for educational/research purposes
  • Fair use doctrine: Limited excerpts for character study and AI personality modeling
  • All characters and works belong to their respective copyright holders

Wiki Content

Descriptions and excerpts are sourced from:

  • Fandom Wiki (CC-BY-SA license)
  • 萌娘百科 Moegirlpedia (CC BY-NC-SA 3.0)

Usage Restrictions

  • ✅ Personal use and research
  • ✅ OpenClaw agent personality configuration
  • ❌ Commercial redistribution of quotes database
  • ❌ Creating competing content databases

Copyright Holders

Characters referenced in this tool belong to their respective creators and publishers including but not limited to:

  • Saekano: © Fumiaki Maruto, Kurehito Misaki, KADOKAWA
  • Rascal Does Not Dream: © Hajime Kamoshida, Keiji Mizoguchi, KADOKAWA
  • And other respective copyright holders

For DMCA or copyright concerns, please contact through GitHub Issues.


🛡️ File Safety Notice

⚠️ Warning About File Operations

  • REPLACE mode will overwrite existing SOUL.md (automatic backup created at SOUL.md.backup.YYYYMMDD_HHMMSS)
  • MERGE mode adds content without removing existing characters (idempotent - no duplicates)
  • All write operations use atomic writes (temp file + rename)

Recommendation: Back up important SOUL.md files before using REPLACE mode.
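
The backup and atomic write behavior described above follows a common pattern: copy the existing file to a timestamped name, write the new content to a temporary file in the same directory, then rename it over the target. The sketch below shows that pattern with hypothetical helper names; it is not the loader's own implementation.

```python
import os
import shutil
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_file(path: Path, now: Optional[datetime] = None) -> Optional[Path]:
    """Copy path to path.backup.YYYYMMDD_HHMMSS; return None if path does not exist."""
    if not path.exists():
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup = path.with_name(f"{path.name}.backup.{stamp}")
    shutil.copy2(path, backup)
    return backup


def atomic_write(path: Path, text: str) -> None:
    """Write text to a temp file next to path, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the data is on disk before the rename
        os.replace(tmp_name, path)  # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise
```

Creating the temporary file in the target's own directory keeps the rename on one filesystem, which is what allows `os.replace` to swap the file in a single step.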

Version History

1 version in total

  • v2.4.2 (current)
    2026-03-30 02:31

Security Checks

Tencent Cloud Security (Keen): safe, no risk

Tencent Cloud Security (Sanbu): safe, no risk