
Photo Screener

AI-powered photo pre-screening using MobileCLIP2-S0 model. 18x faster than ViT-L/14 with 80% selection consistency (Top-10 overlap 8/10). Use when the user w...
konanok
Uncategorized · clawhub · v1.0.0 · 1 version · 100000 · Key: not required
★ 0 stars · 📥 254 downloads · 💾 0 installs · #latest

Overview

Photo Screener — MobileCLIP-powered Smart Pre-screening

Intelligently filter, deduplicate, and classify photos using Apple MobileCLIP2-S0, preparing them for efficient multimodal LLM processing.

Why MobileCLIP2-S0?

Based on a four-model comparison test; the table shows MobileCLIP2-S0 against the ViT-L/14 baseline:

| Metric | MobileCLIP2-S0 | ViT-L/14 (baseline) |
|:--------------------|:--------------:|:-------------------:|
| Encoding Speed | 26.7 ms/img | 483.7 ms/img |
| Speed Ratio | 18.1x faster | 1x |
| Pearson Correlation | 0.78 | 1.0 (baseline) |
| Top-10 Overlap | 8/10 | 10/10 |
| Model Size | 74.8M | 427.6M |
| Embed Dim | 512 | 768 |

> 💡 About 1/18 of the encoding time with 80% selection consistency (Top-10 overlap 8/10): the best speed/quality tradeoff in the comparison.
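The two agreement metrics in the table can be computed from per-photo scores roughly as follows. This is a minimal sketch, assuming each model produces one score per photo; the function names and the toy scores are hypothetical and are not taken from the skill's code or its benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_k_overlap(scores_a, scores_b, k=10):
    """Count photos that appear in the top-k of both {name: score} dicts."""
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b)


# Toy scores (hypothetical, for illustration only).
baseline = {"a.jpg": 7.1, "b.jpg": 5.4, "c.jpg": 6.3, "d.jpg": 4.0}
fast = {"a.jpg": 6.8, "b.jpg": 5.9, "c.jpg": 5.5, "d.jpg": 4.2}

names = sorted(baseline)
r = pearson([baseline[n] for n in names], [fast[n] for n in names])
overlap = top_k_overlap(baseline, fast, k=2)
```

With real data, `k=10` over the full photo set would correspond to the "Top-10 Overlap" row, and `r` to the "Pearson Correlation" row.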

Dependencies

Declaration file: requirements.txt

Prefer venv: Before running scripts, activate the project-root virtual environment (e.g. .venv/). If it doesn't exist, create one first:

# Create venv and install dependencies (recommended)
python3 -m venv .venv
source .venv/bin/activate
pip install -r photo-screener/requirements.txt

# Or use the skill's setup script (checks + installs)
bash photo-screener/scripts/setup_deps.sh

# Before each session, activate venv
source .venv/bin/activate

Alternatively, install globally:

pip3 install -r photo-screener/requirements.txt
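A script can check whether a virtual environment is active before importing heavy dependencies. The sketch below is an assumption about how such a check could look, not something the skill's scripts are known to do; `in_virtualenv` is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtualenv():
    print(
        "Warning: no virtual environment is active; "
        "run 'source .venv/bin/activate' first.",
        file=sys.stderr,
    )
```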

Model Download Policy

The model is NOT pre-downloaded. This is by design to avoid:

  • Unexpected large downloads (~300MB)
  • Wasted bandwidth if the user doesn't need this skill

Download Behavior

| Mode | Behavior |
|:------------------------------|:------------------------------------------------------------------|
| Interactive (terminal) | Prompts user: "是否下载模型?[Y/n]" ("Download the model? [Y/n]") |
| Non-interactive (piped/agent) | Exits with manual download instructions |
| --auto-download flag | Downloads without confirmation |
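The three modes in the table amount to a small decision function. The following is a sketch under assumptions: the function names are hypothetical, terminal detection via `sys.stdin.isatty()` is one common approach rather than the skill's confirmed method, and an empty answer is treated as "yes" because the prompt shows `[Y/n]`.

```python
import sys


def decide_download(auto_download, interactive, answer=""):
    """Return 'download' or 'exit' for the modes listed in the table."""
    if auto_download:
        # --auto-download: proceed without asking.
        return "download"
    if not interactive:
        # Piped or agent context: the caller prints manual
        # download instructions and exits.
        return "exit"
    # Interactive terminal: "[Y/n]" conventionally defaults to yes.
    accepted = answer.strip().lower() in ("", "y", "yes")
    return "download" if accepted else "exit"


def stdin_is_interactive():
    """True when standard input is attached to a terminal."""
    return sys.stdin.isatty()
```

A caller would pass `stdin_is_interactive()` as `interactive` and read `answer` with `input()` only in the interactive case.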

Manual Download

# Using China mirror (recommended)
HF_ENDPOINT=https://hf-mirror.com python3 -c \
    "import open_clip; open_clip.create_model_and_transforms('MobileCLIP2-S0', pretrained='dfndr2b')"

# Or run setup script
bash photo-screener/scripts/setup_deps.sh

Configuration

Copy config.example.toml to config.toml and edit. See config.example.toml for all available options.

Usage

# Basic screening
python3 scripts/screen.py ~/data/output/thumbnails

# Custom thresholds
python3 scripts/screen.py ~/data/output/thumbnails \
    --min-score 5.0 --sim-threshold 0.95

# Keep top 50
python3 scripts/screen.py ~/data/output/thumbnails --top-k 50

# Auto-download model (skip confirmation)
python3 scripts/screen.py ~/data/output/thumbnails --auto-download

# Pass specific file paths instead of a directory
python3 scripts/screen.py \
    --paths ~/data/RAW/001/thumbnails/DSC_0001.jpg \
            ~/data/RAW/001/thumbnails/DSC_0002.jpg

# Dry run
python3 scripts/screen.py ~/data/output/thumbnails --dry-run

| Option | Description | Default |
|:----------------|:---------------------------------------------------|:---------|
| input_dir | Directory with photos (optional with --paths) | required |
| --paths | Specific image paths (alternative to input_dir) | |
| --output, -o | Output JSON path | auto |
| --min-score | Min aesthetic score (1-10) | 4.0 |
| --sim-threshold | Dedup threshold (0-1) | 0.97 |
| --batch-size | Max photos per LLM batch | 20 |
| --top-k | Keep only top K | all |
| --recursive | Search subdirectories | off |
| --auto-download | Skip model download prompt | off |
| --dry-run | Preview only | off |
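For readers who want to see how these options fit together, the table maps naturally onto an `argparse` definition. This is an illustrative sketch only: the flags and defaults come from the table, but the parser layout, the `build_parser` and `parse_args` names, and the "one of input_dir or --paths" check are assumptions, not the actual contents of screen.py.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options table (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="screen.py")
    parser.add_argument("input_dir", nargs="?",
                        help="Directory with photos (optional with --paths)")
    parser.add_argument("--paths", nargs="+",
                        help="Specific image paths (alternative to input_dir)")
    parser.add_argument("--output", "-o", help="Output JSON path")
    parser.add_argument("--min-score", type=float, default=4.0,
                        help="Min aesthetic score (1-10)")
    parser.add_argument("--sim-threshold", type=float, default=0.97,
                        help="Dedup threshold (0-1)")
    parser.add_argument("--batch-size", type=int, default=20,
                        help="Max photos per LLM batch")
    parser.add_argument("--top-k", type=int, default=None,
                        help="Keep only top K (default: all)")
    parser.add_argument("--recursive", action="store_true",
                        help="Search subdirectories")
    parser.add_argument("--auto-download", action="store_true",
                        help="Skip model download prompt")
    parser.add_argument("--dry-run", action="store_true",
                        help="Preview only")
    return parser


def parse_args(argv):
    """Parse argv and require either input_dir or --paths."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.input_dir and not args.paths:
        parser.error("provide input_dir or --paths")
    return args
```

Here `--top-k` defaults to `None` to stand in for "all", which is one possible encoding of that default.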

Pipeline

Photos (thumbnails)
    │
    ▼  Stage 1: MobileCLIP Encoding (~27ms/image)
    │  → 512-dim normalized embeddings
    │
    ├── Stage 2: Aesthetic Scoring
    │   └── LAION MLP (zero-padded 512→768 dim)
    │   └── Remove below threshold (default: 4.0)
    │
    ├── Stage 3: Similarity Dedup
    │   └── Cosine similarity + greedy dedup
    │   └── Higher score = higher priority
    │
    ├── Stage 4: Scene Classification
    │   └── Zero-shot text matching (14 categories)
    │
    └── Output: filter_report.json
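Stages 2 and 3 of the diagram can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the skill's implementation: the zeros are appended at the end of the vector, a photo counts as a duplicate when its similarity to an already kept photo is at or above the threshold, and all function names are hypothetical. The aesthetic scores are taken as given; the LAION MLP itself is not reproduced here.

```python
def zero_pad(embedding, target_dim=768):
    """Pad an embedding with trailing zeros up to target_dim."""
    return list(embedding) + [0.0] * (target_dim - len(embedding))


def cosine(u, v):
    """Cosine similarity; equals the dot product for unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


def screen(embeddings, scores, min_score=4.0, sim_threshold=0.97):
    """Return indices kept after score filtering and greedy dedup."""
    # Stage 2: remove photos scoring below the threshold.
    candidates = [i for i, s in enumerate(scores) if s >= min_score]
    # Stage 3: visit photos from best to worst, so the higher-scoring
    # photo of any near-duplicate pair is the one that survives.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        is_duplicate = any(
            cosine(embeddings[i], embeddings[j]) >= sim_threshold
            for j in kept
        )
        if not is_duplicate:
            kept.append(i)
    return kept
```

The greedy pass compares each candidate against every kept photo, so its cost grows with the square of the number of photos; a real implementation would typically batch this as a matrix product.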

Agent Integration

When using this skill from an agent:

  1. Check dependencies: bash scripts/setup_deps.sh
  2. Run with --auto-download: in an agent context, pass --auto-download to skip the interactive prompt.
  3. Or pre-download: run setup_deps.sh first, which handles the model download with user confirmation.

# Agent-friendly command (auto-download)
python3 photo-screener/scripts/screen.py \
    ~/data/output/{session-id}/thumbnails \
    --output ~/data/output/{session-id}/filter_report.json \
    --auto-download
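An agent written in Python could assemble the same command programmatically. The sketch below assumes the directory layout shown in the command above; `build_screen_command`, `run_screening`, and the `data_root` parameter are hypothetical names, and it is an assumption that screen.py signals failure through a non-zero exit code.

```python
import os
import subprocess


def build_screen_command(session_id, data_root="~/data/output"):
    """Build the agent-friendly screen.py command as an argument list."""
    session_dir = os.path.join(os.path.expanduser(data_root), session_id)
    return [
        "python3",
        "photo-screener/scripts/screen.py",
        os.path.join(session_dir, "thumbnails"),
        "--output",
        os.path.join(session_dir, "filter_report.json"),
        "--auto-download",
    ]


def run_screening(session_id):
    """Run the screener; raises CalledProcessError on a non-zero exit."""
    # Passing a list (no shell=True) avoids shell quoting problems
    # with unusual session ids or paths.
    return subprocess.run(build_screen_command(session_id), check=True)
```

Calling `run_screening("my-session")` would then leave the report at the path given to `--output`, ready for the agent to read.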

Version History

1 version in total

  • v1.0.0 (current)
    2026-05-21 15:44

