Extract keyframes from videos, analyze content with vision models, and generate comprehensive reports with 3 representative screenshots. Optimized for token efficiency using I-frame detection.
Video Input → Extract Keyframes → Vision Analysis → Select Top 3 → Generate Report → Send Output
When a user sends a video via Feishu, the file is auto-saved to:
~/.openclaw/media/inbound/<filename>.mp4
ffmpeg -i <video_path> 2>&1 | grep -E "(Duration|Video)"
Returns: duration, resolution, bitrate, codec info.
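If later steps need the duration as a number (for example, to look up the expected keyframe count), the banner line can be parsed. The following is a minimal sketch, assuming ffmpeg prints its usual `Duration: HH:MM:SS.ss` line; `duration_seconds` is a hypothetical helper name and not part of this skill.

```bash
# Hypothetical helper (not part of the skill): reads ffmpeg's banner text on
# stdin and prints the duration in whole seconds (fractions are dropped).
# Prints nothing when no "Duration: HH:MM:SS" line is present.
duration_seconds() {
  sed -n 's/.*Duration: \([0-9][0-9]*\):\([0-9][0-9]\):\([0-9][0-9]\).*/\1 \2 \3/p' |
    head -n 1 |
    awk '{ print $1 * 3600 + $2 * 60 + $3 }'
}
```

Usage: `ffmpeg -i <video_path> 2>&1 | duration_seconds`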
Use the provided script for optimal keyframe extraction:
bash ~/.openclaw/workspace/skills/video-analyzer/scripts/extract_keyframes.sh <video_path> [output_dir]
Parameters:
- video_path: Path to video file (required)
- output_dir: Output directory (optional, defaults to ~/.openclaw/media/keyframes/)

Output: JPEG images at 640px width, named keyframe_XX.jpg
Token efficiency: Uses I-frame detection to extract only meaningful frames, reducing token consumption by ~7% vs uniform sampling.
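For readers without the script at hand, I-frame extraction can be sketched with ffmpeg's `select` filter. This is an illustration under assumptions, not the contents of extract_keyframes.sh: the function names `keyframe_filter` and `extract_iframes` and the JPEG quality setting are hypothetical, and the real script may work differently.

```bash
# Hypothetical stand-in for extract_keyframes.sh; the real script may differ.

# Build the filter string: keep only I-frames, then scale to the given width.
keyframe_filter() {
  width="${1:-640}"
  # select keeps frames whose picture type is I; a height of -1 keeps the aspect ratio
  printf "select='eq(pict_type,I)',scale=%s:-1" "$width"
}

# Write each I-frame of a video as keyframe_XX.jpg into the output directory.
extract_iframes() {
  video_path="$1"
  output_dir="${2:-$HOME/.openclaw/media/keyframes}"
  mkdir -p "$output_dir" || return 1
  # -vsync vfr emits one image per selected frame instead of duplicating frames
  # to fill the gaps (recent ffmpeg builds spell this option -fps_mode vfr)
  ffmpeg -hide_banner -loglevel error -i "$video_path" \
    -vf "$(keyframe_filter 640)" -vsync vfr -q:v 3 \
    "$output_dir/keyframe_%02d.jpg"
}
```

Usage: `extract_iframes <video_path> [output_dir]`. Because encoders place I-frames at scene changes and at fixed intervals, the number of images depends on how the video was encoded, not only on its length.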
Use the image tool with all extracted keyframes:
prompt: "Analyze these keyframes from a video. Please:
1. Describe the video's theme and content
2. Select 3 most representative frames (explain why)"
Structure the analysis report:
## 📌 Video Theme
[Description]
## 🖼️ Representative Screenshots
| Frame | Reason |
|-------|--------|
| frame_XX | [Why representative] |
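The template above can be filled in mechanically once the vision model has returned a theme and its frame choices. The sketch below is one possible way to do that; `render_report` and its `frame|reason` argument format are assumptions made for illustration.

```bash
# Hypothetical helper: prints the report for a theme plus any number of
# "frame|reason" arguments (three, for the three representative frames).
render_report() {
  theme="$1"
  shift
  printf '## 📌 Video Theme\n%s\n' "$theme"
  printf '## 🖼️ Representative Screenshots\n'
  printf '| Frame | Reason |\n|-------|--------|\n'
  for pair in "$@"; do
    # Split each argument at the first "|": frame name before, reason after
    printf '| %s | %s |\n' "${pair%%|*}" "${pair#*|}"
  done
}
```

Usage: `render_report "Cooking demo" "keyframe_01|Opening shot" "keyframe_04|Main step" "keyframe_09|Final result"`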
Send via Feishu:
| Video Length | Keyframes | Estimated Tokens |
|---|---|---|
| 5 seconds | 5-8 | ~8,000-14,000 |
| 15 seconds | 12-16 | ~20,000-28,000 |
| 30 seconds | 20-30 | ~35,000-50,000 |
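
Dividing the token figures in the table by the keyframe counts gives roughly 1,600 to 1,750 tokens per keyframe. Assuming that ratio holds for other videos (it is inferred from the table, not measured), a rough budget can be computed before calling the vision model. `estimate_tokens` is a hypothetical helper name.

```bash
# Hypothetical helper: prints a low-high token estimate for a keyframe count,
# using the per-frame range (1600 to 1750) implied by the table above.
estimate_tokens() {
  frames="$1"
  echo "$(( frames * 1600 ))-$(( frames * 1750 ))"
}
```

Usage: `estimate_tokens 16` prints `25600-28000`, which falls within the table's range for a 15-second video.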
Optimization tips:
- extract_keyframes.sh - Extract keyframes using ffmpeg I-frame detection
- ffmpeg_reference.md - Advanced ffmpeg commands for video processing