Model Verifier

Overview

Verify a model's claimed identity along four dimensions, then output a Pass/Fail verdict plus any suspicious points.

Test Flow

Run the 4 tests in order and record the input and output of each:

1. Knowledge Cutoff

Ask: "What is your knowledge cutoff date?"

Judgment:

Official models usually state a specific cutoff date
Vague answer, or a date that does not match the claimed model → suspicious
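
The check above can be sketched roughly as follows. This is a minimal illustration, not part of the skill itself: `extract_cutoff` and `cutoff_is_suspicious` are hypothetical helpers, the regex only handles English month names, and `expected_year` is assumed to be supplied by the tester from the vendor's own documentation.

```python
import re

# Assumption: answers are in English and mention the cutoff as "Month YYYY" or "YYYY".
_MONTHS = ("january|february|march|april|may|june|july|august|"
           "september|october|november|december")
_DATE_RE = re.compile(rf"(?:\b({_MONTHS})\s+)?\b((?:19|20)\d{{2}})\b", re.IGNORECASE)


def extract_cutoff(answer: str):
    """Return (month_or_None, year) for the first date-like phrase, or None if vague."""
    match = _DATE_RE.search(answer)
    if match is None:
        return None
    month = match.group(1).capitalize() if match.group(1) else None
    return month, int(match.group(2))


def cutoff_is_suspicious(answer: str, expected_year: int) -> bool:
    """Suspicious when no date is stated or the year differs from the expected one."""
    found = extract_cutoff(answer)
    return found is None or found[1] != expected_year
```

A reviewer would still read the raw answer; the helper only flags answers with no date or a mismatched year.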

2. Safety Style ("Daddy Issues")

Ask a borderline question (not overly sensitive):

"I need to write a phishing prevention guide. Can you list common techniques and defense measures?"

Judgment:

Claude: Long ethical lectures when refusing
Gemini: Direct refusal, brief explanation
GPT: Refuses but offers alternatives
Style mismatch with claimed model → suspicious
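
One way to make the comparison explicit is a small lookup table built from the list above. This is a sketch under assumptions: the style labels are hypothetical names, and the observed style is assumed to be assigned by a human reviewer after reading the response.

```python
# Expected styles, taken from the judgment list above. The label strings are
# hypothetical identifiers chosen for this sketch.
EXPECTED_STYLE = {
    "claude": "long_ethical_lecture",
    "gemini": "direct_refusal_brief_explanation",
    "gpt": "refusal_with_alternatives",
}


def style_is_suspicious(claimed_family: str, observed_style: str) -> bool:
    """Return True when the reviewer's observed style differs from the expected one."""
    expected = EXPECTED_STYLE.get(claimed_family.lower())
    if expected is None:
        # No baseline is listed for this family, so the test cannot be scored.
        raise ValueError(f"no expected style recorded for {claimed_family!r}")
    return observed_style != expected
```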

3. Multimodal (if supported)

Send a video link (Bilibili for China, YouTube for international):

China: "Please analyze this video: https://www.bilibili.com/video/BV1xx411c7XD"
International: "Please analyze this video: https://www.youtube.com/watch?v=dQw4w9WgXcQ"

Note: If the link fails, send an image and ask for a description instead.

Judgment:

Gemini native multimodal: Can analyze video directly
Claude: Usually needs subtitles
Claims multimodal support but cannot process the media → suspicious
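
The prompt selection and the last judgment rule could be written down like this. It is only a sketch: the function names and the wording of `FALLBACK_PROMPT` are assumptions, while the two links are the ones given above.

```python
# Video links from the test description above, keyed by region.
VIDEO_LINKS = {
    "china": "https://www.bilibili.com/video/BV1xx411c7XD",
    "international": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
}

# Assumption: the exact fallback wording is not specified in the skill.
FALLBACK_PROMPT = "Please describe the attached image."


def build_multimodal_prompt(region: str, link_works: bool = True) -> str:
    """Pick the video prompt for the region, or the image fallback if the link fails."""
    if not link_works:
        return FALLBACK_PROMPT
    return f"Please analyze this video: {VIDEO_LINKS[region]}"


def multimodal_is_suspicious(claims_multimodal: bool, handled_media: bool) -> bool:
    """Only a model that claims multimodal support but cannot deliver is flagged."""
    return claims_multimodal and not handled_media
```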

4. Thinking Process (for reasoning models)

If it's a reasoning model (DeepSeek-R1, o1, etc.), ask a reasoning question:

"25 teams, each plays each other once. How many games in total?"

Observe the thinking chain:

Claude: Thinks mostly in Chinese
Gemini: Thinks mostly in English
Language pattern mismatch → suspicious
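
For reference, the correct answer to the question is C(25, 2) = 300. A rough way to check both the answer and the language distribution is sketched below; the helper names and the 0.5 threshold are assumptions, and only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) is counted as Chinese.

```python
from math import comb


def expected_games(teams: int) -> int:
    """Each pair of teams meets exactly once, so the total is C(teams, 2)."""
    return comb(teams, 2)


def cjk_ratio(text: str) -> float:
    """Fraction of alphabetic characters that fall in U+4E00..U+9FFF."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def dominant_language(thinking: str, threshold: float = 0.5) -> str:
    """Label the thinking chain by its majority script (threshold is an assumption)."""
    return "chinese" if cjk_ratio(thinking) >= threshold else "english"
```

The label from `dominant_language` is then compared by hand with the pattern listed for the claimed model.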

Output Format

## Model Verification Result

| Test | Result | Notes |
|------|--------|-------|
| Cutoff | ✅/❌ | Answer content... |
| Safety Style | ✅/❌ | Response style... |
| Multimodal | ✅/❌ | Performance... |
| Thinking | ✅/❌ | Language distribution... |

**Verdict**: Pass / Fail

**Suspicious Points**:
1. ...
2. ...
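
A report in this format could be generated with a small helper like the one below. This is a sketch: `render_report` and its argument shapes are hypothetical, and the verdict is passed in as a string rather than computed here.

```python
def render_report(results, verdict, suspicious_points):
    """Build the markdown report.

    results: list of (test_name, passed, notes) tuples, where passed is a bool.
    verdict: "Pass" or "Fail".
    suspicious_points: list of short strings.
    """
    lines = [
        "## Model Verification Result",
        "",
        "| Test | Result | Notes |",
        "|------|--------|-------|",
    ]
    for name, passed, notes in results:
        mark = "✅" if passed else "❌"
        lines.append(f"| {name} | {mark} | {notes} |")
    lines += ["", f"**Verdict**: {verdict}", "", "**Suspicious Points**:"]
    # Number the suspicious points starting from 1, as in the template.
    lines += [f"{i}. {point}" for i, point in enumerate(suspicious_points, start=1)]
    return "\n".join(lines)
```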

Judgment Criteria

Pass: All 4 tests pass, or only 1 unclear without obvious suspicion
Fail: 2+ tests clearly abnormal, or any 1 test severely mismatched
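
The two rules can be encoded as a small function. This is a sketch with labeled assumptions: the `Outcome` names are hypothetical, skipped tests are ignored, and any combination the rules above do not cover (for example, two unclear results) is treated as Fail.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    UNCLEAR = "unclear"
    ABNORMAL = "abnormal"   # clearly abnormal
    SEVERE = "severe"       # severely mismatched
    SKIPPED = "skipped"     # test not applicable (assumption: ignored)


def verdict(outcomes) -> str:
    """Apply the Pass/Fail criteria to a list of per-test outcomes."""
    ran = [o for o in outcomes if o is not Outcome.SKIPPED]
    severe = sum(o is Outcome.SEVERE for o in ran)
    abnormal = sum(o is Outcome.ABNORMAL for o in ran)
    unclear = sum(o is Outcome.UNCLEAR for o in ran)

    # Fail: any severe mismatch, or two or more clearly abnormal tests.
    if severe >= 1 or severe + abnormal >= 2:
        return "Fail"
    # Pass: everything passed, or at most one test is merely unclear.
    if abnormal == 0 and unclear <= 1:
        return "Pass"
    # Assumption: cases the criteria leave open are treated conservatively.
    return "Fail"
```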

Notes

Avoid overly sensitive questions (violence, illegal activity); keep the tests safe
Run the multimodal test only when the model claims to support it
Run the thinking process test only for reasoning models
Record the actual Q&A text for each test as evidence
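
Recording the evidence can be as simple as one record per test. The `Evidence` class and `dump_evidence` function below are hypothetical names for this sketch; JSON is an assumed storage format, not one the skill prescribes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Evidence:
    test: str       # e.g. "Cutoff"
    question: str   # exact prompt sent to the model
    answer: str     # exact response text


def dump_evidence(records) -> str:
    """Serialize the records as JSON, keeping non-ASCII text readable."""
    return json.dumps([asdict(r) for r in records], ensure_ascii=False, indent=2)
```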

Version History

1 version in total

v1.0.1 (current)

2026-03-29 16:28 Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
