Infrared Camera Video Processing

Automated wildlife detection pipeline for infrared trail camera footage. Scans video folders, extracts metadata (filename, capture time, duration), performs frame extraction and AI vision analysis to identify humans and wildlife species with individual counts, applies location-based correction, and exports structured results to Excel. Ignores audio. Designed for batch processing of trail camera footage.

AI Wildlife Camera

Automated wildlife detection pipeline for infrared trail camera footage. Version 2.2 — pre-execution interaction + configurable frame density.

What's New in v2.2

| Optimization | Problem Addressed | Implementation |
|---|---|---|
| Pre-execution interaction | Agent extracted frames before asking user preferences | All questions moved to Phase 0 before any file operations |
| Configurable frame density | High density caused excessive token usage | User selects High/Medium/Low before extraction starts |
| Three-zone frame extraction | Missed detections (animals entering mid-video were missed) | Front 25% (50% frames) + Middle 50% (35% frames) + Last 25% (15% frames) |
| Increased frame density | Small/fast animals missed in sparse sampling | 8-28 frames per video (up from 5-20), 800px resolution (up from 640px) |
| Similar species contrast reasoning | Species misjudgment | Prompt model to compare distinguishing features dynamically |
| Human-priority detection rules | False positives in human activity scenes | Prompt: "detected humans → lower wildlife threshold, flag separately" |
| "Suspected wildlife" category | Direct denial of unclear cases | Output: has_wildlife: True/False/疑似 (suspected) instead of binary |
| Cross-frame deduplication prompt | Count errors (same animal counted multiple times) | Prompt instructs model to check "跨帧一致性" (cross-frame consistency) |
| Dual-model API support | Single model bias | analyze_api.py supports Qwen-VL with checkpointing and retry |

Workflow Overview

5-phase pipeline for batch-processing trail camera videos:

| Phase | Task | Script | Agent Role |
|---|---|---|---|
| 0 | Pre-execution interaction and confirmation (mandatory) | (none) | Agent asks user |
| 1 | Scan videos, extract metadata | inventory.py | Auto |
| 2 | Extract frames from each video (three-zone, user-selected density) | extract_frames.py | Auto |
| 3 | Vision analysis + location correction + write results | analyze_api.py / Agent review | API batch or Agent reviews |
| 4 | Export results to Excel | export_excel.py | Auto |

Key change (v2.2): all interactive confirmation is completed in Phase 0, and no file operation runs until the user has confirmed.

Prerequisites

  • FFmpeg with ffprobe (ffmpeg.exe / ffprobe.exe on Windows)
  • Python 3.8+ with openpyxl
  • NVIDIA GPU (optional, speeds up if vision model runs locally — currently uses agent vision)

Configuration

Edit the CONFIG section at the top of scripts/inventory.py and scripts/extract_frames.py:

FFMPEG_BIN = r"C:\path\to\ffmpeg\bin"      # Windows
# FFMPEG_BIN = "/usr/bin"                    # Linux/macOS

INPUT_DIR = r"C:\TrailCamera\Videos"        # Your footage folder
OUTPUT_DIR = r"C:\TrailCamera\Output"        # Results folder

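Phase 0: Pre-Execution Interaction and Confirmation (Mandatory)

⚠️ Before performing any file operation, the Agent must complete the following interaction and obtain the user's explicit confirmation.

Step 0a: Ask for the Camera Installation Location

Prompt the user:

> "🦌 Wildlife visual recognition is about to start. Please specify where the infrared camera is installed (at least to the province/region level, e.g. '中国云南省' (Yunnan Province, China)); this is used to correct species identification results. Default: '中国' (China)."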

Store location in output/location.txt (single line, e.g. 中国云南省高黎贡山).

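Step 0b: Describe the Available Models and Ask the User to Choose

Prompt the user with full transparency:

> "🧠 Currently available vision models:
>
> - A) Qwen-VL-Plus API (default; called through the Alibaba Cloud Bailian (百炼) API; batch processing with species contrast reasoning, cross-frame deduplication, and checkpoint resume)
> - B) Kimi built-in vision (I review the frames one by one; suited to small batches or when the API is unavailable)
> - C) Another API (GPT-4o / Claude / Gemini; you need to provide your own API key)
>
> Choose A/B/C, or tell me your preference. If you choose C, please provide the API key and model name."

Configure the corresponding script according to the user's choice:

  • A → confirm that analyze_api.py already contains an Alibaba Cloud key; it can then be used directly
  • B → enter Agent manual review mode
  • C → once the user provides a key, write it into the analyze_api.py CONFIG, or create a new analysis script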

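Step 0c: Ask for Frame Density

Prompt the user:

> "📐 Frame density (affects recognition accuracy and token / API consumption):
>
> - High (高): current setting (8-28 frames per video across the three zones; highest accuracy, highest token consumption)
> - Medium (中): reduced by 50% (4-14 frames per video across the three zones; balances accuracy and cost)
> - Low (低): reduced by 75% (2-7 frames per video across the three zones; lowest cost, suited to quick initial screening)
>
> Choose 高/中/低 (High/Medium/Low)."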

Store density selection in output/frame_density.txt (single line: high / medium / low).

Frame density scaling rules:

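| User Choice | Scaling | <30s | 30-60s | 60-120s | 120-300s | 300-600s | >600s |
|---|---|---|---|---|---|---|---|
| High | 100% | 8 | 12 | 15 | 18 | 22 | 28 |
| Medium | 50% | 4 | 6 | 8 | 9 | 11 | 14 |
| Low | 25% | 2 | 3 | 4 | 5 | 6 | 7 |

> Minimum guarantee: at least 1 frame is extracted from each zone, so the start, middle, and end of every video are covered.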
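
The scaling table above can be reproduced by rounding the scaled base count up to the next whole frame. The sketch below is one possible reading, not the logic in extract_frames.py: the function names are hypothetical, and the handling of boundary durations (for example exactly 60 s) is an assumption. It also does not decide how the per-zone minimum interacts with totals below three frames (the Low / <30s cell lists 2 frames for three zones).

```python
# Base (high-density) totals from the table: (upper bound in seconds, frames).
BASE_FRAMES = [(30, 8), (60, 12), (120, 15), (300, 18)]
SCALE_PERCENT = {"high": 100, "medium": 50, "low": 25}


def base_frame_count(duration_s: float) -> int:
    """Look up the high-density frame total for a video duration."""
    for upper, frames in BASE_FRAMES:
        # Assumption: a boundary value such as 60 s goes to the longer bucket.
        if duration_s < upper:
            return frames
    # The table labels the last bucket ">600s", so 600 s itself stays at 22.
    return 22 if duration_s <= 600 else 28


def scaled_frame_count(duration_s: float, density: str) -> int:
    """Scale the base total by the chosen density, rounding up."""
    percent = SCALE_PERCENT[density]
    base = base_frame_count(duration_s)
    # Integer ceiling of base * percent / 100, e.g. 15 * 50% = 7.5 -> 8.
    return -(-base * percent // 100)
```

For example, `scaled_frame_count(90, "medium")` looks up the 60-120s base of 15 and rounds 7.5 up to 8, matching the Medium row.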

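Step 0d: Final Confirmation

Prompt the user:

> "📋 Please confirm:
>
> - Location: [location provided by the user]
> - Model: [model chosen by the user]
> - Frame density: [High/Medium/Low]
> - Videos to process: [INPUT_DIR path]
>
> If everything is correct, reply '开始' (start) and I will run scan → frame extraction (at the chosen density) → vision recognition → report export."

Proceed to Phase 1 only after the user gives an explicit go-ahead (e.g. "开始" (start), "确认" (confirm), or "跑吧" (run it)).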
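
As a rough illustration of how the Phase 0 answers could be persisted before any other file operation, the following sketch writes the two single-line files named above and checks for a go-ahead reply. The helper names are hypothetical and the set of accepted replies is only the three examples quoted in the text; a real agent would likely accept more phrasings.

```python
from pathlib import Path

VALID_DENSITIES = {"high", "medium", "low"}
# Example go-ahead replies quoted in the text (assumed to be matched exactly).
CONFIRM_REPLIES = {"开始", "确认", "跑吧"}


def is_confirmed(reply: str) -> bool:
    """Return True only for an explicit go-ahead reply."""
    return reply.strip() in CONFIRM_REPLIES


def save_phase0_choices(output_dir, location: str, density: str) -> None:
    """Write location.txt and frame_density.txt as single-line files."""
    if density not in VALID_DENSITIES:
        raise ValueError(f"density must be one of {sorted(VALID_DENSITIES)}")
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # The prompt in Step 0a states that the default location is '中国'.
    location = location.strip() or "中国"
    (out / "location.txt").write_text(location + "\n", encoding="utf-8")
    (out / "frame_density.txt").write_text(density + "\n", encoding="utf-8")
```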


Phase 1: Inventory (Auto)

python scripts/inventory.py

Scans INPUT_DIR recursively for video files (.mp4, .mov, .avi, .mkv, .m4v, .mpg, .mpeg).
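
A recursive scan of this kind can be sketched with pathlib. This is an illustration rather than the code in inventory.py; `find_videos` is a hypothetical name, and matching extensions case-insensitively is an assumption (trail cameras often write uppercase names such as RCNX0001.AVI).

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".m4v", ".mpg", ".mpeg"}


def find_videos(input_dir):
    """Return all video files under input_dir, sorted by path."""
    root = Path(input_dir)
    return sorted(
        p for p in root.rglob("*")
        # Compare the lowercased suffix so .AVI and .avi both match.
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )
```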

Extracts per video:

  • Original filename (原始文件名): filename
  • Capture time (拍摄时间): parsed from filename patterns, then ffprobe creation_time, falling back to file mtime
  • Duration (视频时长): in seconds
  • Resolution (分辨率): width × height
  • File size (文件大小): in MB
  • Codec (编码格式): video codec
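
Most of these fields can be read from ffprobe's JSON output. The sketch below is a minimal example under stated assumptions: the function names and the keys of the returned dict are hypothetical (the actual layout of inventory.json is not documented here), and ffprobe is assumed to be on PATH unless a full path is passed.

```python
import json
import subprocess


def ffprobe_command(video_path, ffprobe="ffprobe"):
    """Build an ffprobe call that prints stream and container info as JSON."""
    return [
        ffprobe, "-v", "error",
        "-select_streams", "v:0",  # first video stream only; audio is ignored
        "-show_entries", "stream=codec_name,width,height:format=duration,size",
        "-of", "json", str(video_path),
    ]


def parse_ffprobe_output(raw_json: str) -> dict:
    """Convert ffprobe JSON text into a flat metadata dict."""
    data = json.loads(raw_json)
    stream = (data.get("streams") or [{}])[0]
    fmt = data.get("format", {})
    return {
        "duration_s": float(fmt.get("duration", 0.0)),
        "width": stream.get("width"),
        "height": stream.get("height"),
        "codec": stream.get("codec_name"),
        "size_mb": round(int(fmt.get("size", 0)) / (1024 * 1024), 2),
    }


def probe(video_path, ffprobe="ffprobe") -> dict:
    """Run ffprobe on one file and return the parsed metadata."""
    result = subprocess.run(
        ffprobe_command(video_path, ffprobe),
        capture_output=True, text=True, check=True,
    )
    return parse_ffprobe_output(result.stdout)
```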

Outputs:

  • output/inventory.json — Raw data
  • output/inventory.xlsx — Excel preview (Phase 1 data only)

Date Parsing Priority

  1. Filename patterns: IMG_YYYYMMDD_HHMMSS, YYYY-MM-DD_HH-MM-SS, YYYYMMDD_HHMMSS, YYYY_MM_DD_HH_MM_SS
  2. ffprobe creation_time / date tag
  3. File modification time (fallback)
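
The priority order above can be sketched as a small fallback chain. This is an illustrative example, not the parser in inventory.py: the function names are hypothetical, and a single regular expression is assumed to be enough to cover the four listed filename patterns.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Covers IMG_YYYYMMDD_HHMMSS, YYYY-MM-DD_HH-MM-SS, YYYYMMDD_HHMMSS
# and YYYY_MM_DD_HH_MM_SS (optional "-" or "_" inside the date and time parts).
_TIMESTAMP = re.compile(
    r"(\d{4})[-_]?(\d{2})[-_]?(\d{2})_(\d{2})[-_]?(\d{2})[-_]?(\d{2})"
)


def parse_time_from_filename(name: str) -> Optional[datetime]:
    """Return the timestamp embedded in a filename, or None if absent/invalid."""
    match = _TIMESTAMP.search(name)
    if not match:
        return None
    try:
        return datetime(*map(int, match.groups()))
    except ValueError:  # e.g. month 13 or hour 25
        return None


def resolve_capture_time(path: Path, ffprobe_time: Optional[datetime] = None) -> datetime:
    """Apply the priority: filename, then ffprobe tag, then file mtime."""
    return (
        parse_time_from_filename(path.name)
        or ffprobe_time
        or datetime.fromtimestamp(path.stat().st_mtime)
    )
```

A name such as PIRT0001.AVI carries no timestamp, so it falls through to the ffprobe tag or, failing that, the modification time.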

Phase 2: Frame Extraction (Auto) — Three-Zone + User Density

python scripts/extract_frames.py

Reads output/inventory.json and output/frame_density.txt, extracts frames per video using three-zone strategy with user-selected density.

Frame naming: Flat structure — output/frames/PIRT0001_frame_001.jpg, PIRT0001_frame_002.jpg, etc.

Frame extraction reads frame_density.txt to determine scaling factor:

  • high → 100% of base frame count (8-28 frames)
  • medium → 50% of base frame count (4-14 frames)
  • low → 25% of base frame count (2-7 frames)

Base frame count (high density):

| Video Duration | Total Frames | First 25% (trigger zone) | Middle 50% (activity zone) | Last 25% (exit zone) |
|---|---|---|---|---|
| < 30 sec | 8 | 4 | 3 | 1 |
| 30–60 sec | 12 | 6 | 4 | 2 |
| 60–120 sec | 15 | 7 | 5 | 3 |
| 120–300 sec | 18 | 9 | 6 | 3 |
| 300–600 sec | 22 | 11 | 8 | 3 |
| > 600 sec | 28 | 14 | 10 | 4 |

> Rationale for three-zone split: Trail camera videos are triggered by motion, but animals may enter at start, linger mid-video, or exit at the end. Three-zone coverage maximizes detection probability across the full clip.
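
One way to turn per-zone frame counts into timestamps is to space the frames evenly inside each zone. The sketch below assumes that spacing (each frame taken at the midpoint of an equal slice of its zone); how extract_frames.py actually places frames within a zone is not stated, and `zone_timestamps` is a hypothetical name.

```python
from typing import List, Sequence

# Zone boundaries as fractions of the clip: first 25%, middle 50%, last 25%.
ZONES = [(0.0, 0.25), (0.25, 0.75), (0.75, 1.0)]


def zone_timestamps(duration_s: float, zone_counts: Sequence[int]) -> List[float]:
    """Return sample times in seconds, given frame counts for the three zones."""
    timestamps = []
    for (start, end), count in zip(ZONES, zone_counts):
        if count <= 0:
            continue
        zone_start = start * duration_s
        step = (end - start) * duration_s / count
        for i in range(count):
            # Midpoint of the i-th slice, so no frame sits on a zone edge.
            timestamps.append(round(zone_start + (i + 0.5) * step, 3))
    return timestamps
```

For a 20-second clip with the 4/3/1 split from the first table row, this yields four timestamps in the first 5 seconds, three between 5 and 15 seconds, and one at 17.5 seconds.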

Frame resolution: 800px width (up from 640px) for better small animal detail.

Frame quality: JPEG quality=2 (high).
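
These settings map naturally onto FFmpeg options. The sketch below builds (but does not run) a single-frame extraction command; it assumes that "JPEG quality=2" refers to FFmpeg's `-q:v 2` quantizer setting, where lower values mean higher quality, and that the height is scaled to keep the aspect ratio. The function name and argument layout are hypothetical.

```python
from pathlib import Path


def frame_command(video, timestamp_s: float, index: int, frames_dir, ffmpeg="ffmpeg"):
    """Build the FFmpeg argument list that grabs one frame as a JPEG."""
    video = Path(video)
    # Flat naming scheme from the text: <video stem>_frame_<NNN>.jpg
    out = Path(frames_dir) / f"{video.stem}_frame_{index:03d}.jpg"
    return [
        ffmpeg, "-y",
        "-ss", f"{timestamp_s:.3f}",  # seek before decoding the input
        "-i", str(video),
        "-frames:v", "1",             # write exactly one video frame
        "-an",                        # drop audio, which the pipeline ignores
        "-vf", "scale=800:-2",        # 800 px wide, even height, aspect kept
        "-q:v", "2",                  # assumed meaning of "JPEG quality=2"
        str(out),
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)` once FFMPEG_BIN is on PATH or the `ffmpeg` argument points at the binary.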

Phase 3: Vision Analysis + Correction + Write Results

Step 3a: Run Vision Analysis

Run the vision analysis that matches the user's choice in Step 0b:

Option A — API Batch Mode (Qwen-VL)

python scripts/analyze_api.py

Sends frames per video to Qwen-VL-Plus API (frame count depends on density selected in Phase 0c).
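
The checkpointing and retry behaviour attributed to analyze_api.py could look roughly like the loop below. It is a generic sketch, not the script's code: `analyze` stands in for whatever function calls the vision API for one video, and the checkpoint file layout, retry count, and backoff are all assumptions.

```python
import json
import time
from pathlib import Path


def analyze_with_checkpoint(videos, analyze, checkpoint_path,
                            max_retries=3, backoff_s=2.0):
    """Analyze each video once, saving progress so a rerun can resume."""
    path = Path(checkpoint_path)
    done = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    for name in videos:
        if name in done:
            continue  # already processed in an earlier run
        for attempt in range(1, max_retries + 1):
            try:
                done[name] = analyze(name)
                break
            except Exception as exc:
                if attempt == max_retries:
                    # Record the failure so it can be surfaced in the report.
                    done[name] = {"error": str(exc)}
                else:
                    time.sleep(backoff_s * attempt)
        # Persist after every video so an interruption loses at most one result.
        path.write_text(json.dumps(done, ensure_ascii=False, indent=2),
                        encoding="utf-8")
    return done
```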

Option B — Agent Manual Review

> Agent views extracted frames using read tool on image files and records per-video summary.

Option C — Other API

> Use user-provided API key and model.

Step 3b: Apply Location-Based Correction

After raw vision results are in, read output/location.txt and apply correction rules:

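  • Species not distributed in the region → exclude or lower confidence
  • Species common in the region → raise confidence
  • Migratory birds / migratory species → annotate seasonality
  • Domestic animals in China → distinguish from wildlife
  • Invasive species → flag specifically
  • Easily confused species pairs → correct using distinguishing features. Consult the regional species reference and morphological descriptions in references/wildlife_guide.md and reason by elimination among similar species. Do not hard-code specific species pairs; instead, dynamically compare the actual detections against the morphological traits of species common in the region (body size, coat color, tail shape, behavior patterns, etc.).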
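
Only the first rule lends itself to a mechanical check; the others depend on the model's or Agent's reasoning against references/wildlife_guide.md. As a hedged illustration of that first rule alone, the sketch below lowers confidence when a detected species is outside a caller-supplied regional list. Both `apply_location_filter` and the `regional_species` set are hypothetical; the document does not say that the pipeline keeps such a list.

```python
def apply_location_filter(record: dict, regional_species: set) -> dict:
    """Return a copy of record with confidence lowered for out-of-range species."""
    corrected = dict(record)
    unexpected = [
        species for species in record.get("species_detected", [])
        if species not in regional_species
    ]
    if unexpected:
        corrected["confidence"] = "low"
        note = "outside expected range for location: " + ", ".join(unexpected)
        # Keep any existing notes and append the correction note.
        corrected["notes"] = "; ".join(
            part for part in (record.get("notes", ""), note) if part
        )
    return corrected
```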

Step 3c: Write Results to vision_analysis.json

After applying location-based correction, write the final results to output/vision_analysis.json:

{
  "location": "中国云南省高黎贡山",
  "model_used": "qwen-vl-plus",
  "frame_density": "medium",
  "correction_applied": true,
  "videos": [
    {
      "filename": "RCNX0001.AVI",
      "has_human": false,
      "has_wildlife": true,
      "species_detected": ["野猪"],
      "individual_count": {"野猪": 2},
      "confidence": "high",
      "notes": "夜间拍摄,成年个体带幼崽,从左侧进入画面"
    }
  ]
}

Field reference:

| Field | Description |
|---|---|
| has_human | True / False / 疑似 (suspected) |
| has_wildlife | True / False / 疑似 (suspected; the "疑似" value was added in v2 for hard-to-identify cases) |
| species_detected | List of species names, or [] |
| individual_count | Dict of {species: count}, or a total int |
| confidence | high / medium / low (low for suspected/unclear cases) |
| notes | Free text: behavior, weather, lighting, API raw response, correction notes |
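
Before writing results, each per-video record can be checked against the field reference above. The validator below is a sketch under the assumption that the table lists every allowed value; `validate_video_record` is a hypothetical helper, not part of the shipped scripts.

```python
SUSPECTED = "疑似"
CONFIDENCE_LEVELS = {"high", "medium", "low"}


def _is_tri_state(value) -> bool:
    """True for a real bool or the 'suspected' marker (plain ints are rejected)."""
    return isinstance(value, bool) or value == SUSPECTED


def validate_video_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    if not isinstance(record.get("filename"), str):
        problems.append("filename must be a string")
    for key in ("has_human", "has_wildlife"):
        if not _is_tri_state(record.get(key)):
            problems.append(f"{key} must be true, false, or {SUSPECTED}")
    if not isinstance(record.get("species_detected"), list):
        problems.append("species_detected must be a list")
    count = record.get("individual_count")
    is_int = isinstance(count, int) and not isinstance(count, bool)
    if not (isinstance(count, dict) or is_int):
        problems.append("individual_count must be a dict or an int")
    if record.get("confidence") not in CONFIDENCE_LEVELS:
        problems.append("confidence must be high, medium, or low")
    return problems
```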

Writing command (for Agent or script):

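import json
from pathlib import Path

output_dir = Path("output")

# Phase 0 choices, each stored as a single line of text
location = (output_dir / "location.txt").read_text(encoding="utf-8").strip()
density = (output_dir / "frame_density.txt").read_text(encoding="utf-8").strip()

model_name = "qwen-vl-plus"  # or "kimi-vision", per the Step 0b choice
corrected_results = []       # per-video dicts produced by Step 3b go here

vision_analysis = {
    "location": location,
    "model_used": model_name,
    "frame_density": density,
    "correction_applied": True,
    "videos": corrected_results,
}

with open(output_dir / "vision_analysis.json", "w", encoding="utf-8") as f:
    json.dump(vision_analysis, f, ensure_ascii=False, indent=2)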

Phase 4: Export to Excel (Auto)

python scripts/export_excel.py

Reads inventory.json + vision_analysis.json, merges data, writes structured Excel:

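| Column | Source |
|---|---|
| 序号 (No.) | auto |
| 原始文件名 (Original filename) | inventory |
| 拍摄时间 (Capture time) | inventory (parsed) |
| 视频时长(秒) (Duration, seconds) | inventory |
| 是否有人类 (Human present) | vision_analysis |
| 是否有野生动物 (Wildlife present) | vision_analysis |
| 识别物种 (Species identified) | vision_analysis (comma-separated) |
| 个体数量 (Individual count) | vision_analysis |
| 置信度 (Confidence) | vision_analysis |
| 备注 (Notes) | vision_analysis |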
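
The merge behind this column mapping can be sketched with the standard library alone; writing the rows to a workbook with openpyxl is left out. The inventory key names (`capture_time`, `duration_s`) and the function name are assumptions, since the layout of inventory.json is not shown here, and matching the two files on `filename` is likewise assumed.

```python
def merge_rows(inventory: list, vision_videos: list) -> list:
    """Join inventory entries with vision results on filename, one row each."""
    by_name = {video["filename"]: video for video in vision_videos}
    rows = []
    for index, item in enumerate(inventory, start=1):
        vision = by_name.get(item["filename"], {})
        count = vision.get("individual_count", "")
        if isinstance(count, dict):
            # Flatten {species: count} into readable text for a single cell.
            count = ", ".join(f"{name}: {n}" for name, n in count.items())
        rows.append({
            "序号": index,
            "原始文件名": item["filename"],
            "拍摄时间": item.get("capture_time", ""),
            "视频时长(秒)": item.get("duration_s", ""),
            "是否有人类": vision.get("has_human", ""),
            "是否有野生动物": vision.get("has_wildlife", ""),
            "识别物种": ", ".join(vision.get("species_detected", [])),
            "个体数量": count,
            "置信度": vision.get("confidence", ""),
            "备注": vision.get("notes", ""),
        })
    return rows
```

Videos with no vision result keep their inventory columns and get empty analysis cells, which makes gaps easy to spot in the report.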

Output: output/wildlife_report.xlsx

Color coding:

  • 🟢 Light green: wildlife present
  • 🟠 Light orange: human present
  • 🔴 Light red: row with a read error

Batch Processing Tips

  • For >50 videos: split into folders of 20–30 videos to manage frame review load
  • Frame extraction can run overnight; vision review can resume per-folder
  • Use --fps 1 if you want 1 frame per second (modify extract_frames.py CONFIG)
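
Splitting a large job into folder-sized batches, as the first tip suggests, amounts to simple chunking. A minimal sketch follows; the default of 25 is just a value inside the suggested 20-30 range, and `split_batches` is a hypothetical helper.

```python
def split_batches(items: list, batch_size: int = 25) -> list:
    """Split a list of videos into consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
```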

Known Limitations

  • Audio is ignored — no acoustic species identification
  • API dependency — Qwen-VL API requires valid key and network; fallback to agent manual review when unavailable
  • Night/IR footage — low-light frames may still reduce accuracy; infrared-trained models TBD for v3
  • Small animals — v2 improved with 800px frames and three-zone sampling, but very distant rodents/birds may still be missed
  • Cross-model validation — v2 uses single API model; v3 planned: dual-model consensus (Kimi + Qwen-VL) with disagreement flagging

Version History

3 versions in total

  • v2.2.1 Updated security settings (current)
    2026-05-20 17:38, safe
  • v2.2.0 Initial release
    2026-05-20 17:07, safe
  • v1.0.0 Initial release
    2026-05-20 15:46, safe
