
Image Duplication Detector

Detect image duplication and tampering in manuscript figures using computer vision algorithms
Author: aipoch-ai
Source: clawhub · Version: v1.0.0 (latest) · Category: Uncategorized · Key: not required


ID: 195

Description

Uses computer vision (CV) algorithms to scan every image in a manuscript and flag potential duplicates or local tampering, such as traces of Photoshop (PS) editing.

Usage

# Scan single PDF file
python scripts/main.py --input paper.pdf --output report.json

# Scan image folder
python scripts/main.py --input ./images/ --output report.json

# Specify similarity threshold (default 0.85)
python scripts/main.py --input paper.pdf --threshold 0.90 --output report.json

# Enable tampering detection
python scripts/main.py --input paper.pdf --detect-tampering --output report.json

# Generate visualization report
python scripts/main.py --input paper.pdf --visualize --output report.json

Parameters

| Parameter | Type | Default | Required | Description |
|-----------|------|---------|----------|-------------|
| --input | string | - | Yes | Input PDF file or image folder path |
| --output | string | report.json | No | Output report path |
| --threshold | float | 0.85 | No | Similarity threshold (0-1), higher is stricter |
| --detect-tampering | flag | false | No | Enable tampering/PS trace detection |
| --visualize | flag | false | No | Generate visualization comparison images |
| --temp-dir | string | ./temp | No | Temporary file directory |
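
The flags in the table could be declared with `argparse` roughly as follows. This is a minimal sketch, not the code shipped in `scripts/main.py`: the helper names `unit_interval` and `build_parser` are hypothetical, and the range check on `--threshold` is an assumption based on the documented 0-1 range.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and require it to lie in [0, 1] (assumed from the table)."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("threshold must be between 0 and 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Image Duplication Detector")
    parser.add_argument("--input", required=True,
                        help="Input PDF file or image folder path")
    parser.add_argument("--output", default="report.json",
                        help="Output report path")
    parser.add_argument("--threshold", type=unit_interval, default=0.85,
                        help="Similarity threshold (0-1), higher is stricter")
    parser.add_argument("--detect-tampering", action="store_true",
                        help="Enable tampering/PS trace detection")
    parser.add_argument("--visualize", action="store_true",
                        help="Generate visualization comparison images")
    parser.add_argument("--temp-dir", default="./temp",
                        help="Temporary file directory")
    return parser
```

Calling `build_parser().parse_args(["--input", "paper.pdf"])` yields a namespace with the defaults from the table; `argparse` exposes the dashed flags as `detect_tampering` and `temp_dir`.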

Output Format

{
  "summary": {
    "total_images": 12,
    "duplicates_found": 2,
    "tampering_detected": 1,
    "processing_time": "3.5s"
  },
  "duplicates": [
    {
      "group_id": 1,
      "similarity": 0.98,
      "images": [
        {"page": 2, "index": 1, "path": "..."},
        {"page": 5, "index": 3, "path": "..."}
      ]
    }
  ],
  "tampering": [
    {
      "image": "page_3_img_2.png",
      "suspicious_regions": [
        {"x": 120, "y": 80, "width": 50, "height": 50, "confidence": 0.92}
      ]
    }
  ]
}
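
A report in this shape can be turned into a short review list with the standard library alone. The sketch below assumes only the field names shown in the sample above; `summarize_report` is a hypothetical helper, not part of the tool.

```python
import json


def summarize_report(report: dict) -> list[str]:
    """Return one human-readable line per duplicate group and suspicious region."""
    lines = []
    for group in report.get("duplicates", []):
        places = ", ".join(
            f"page {img['page']} image {img['index']}" for img in group["images"]
        )
        lines.append(
            f"duplicate group {group['group_id']} "
            f"(similarity {group['similarity']:.2f}): {places}"
        )
    for item in report.get("tampering", []):
        for region in item["suspicious_regions"]:
            lines.append(
                f"{item['image']}: region at ({region['x']}, {region['y']}) "
                f"size {region['width']}x{region['height']}, "
                f"confidence {region['confidence']:.2f}"
            )
    return lines


# Trimmed copy of the sample report above, used only for illustration.
SAMPLE = json.loads("""
{
  "duplicates": [
    {"group_id": 1, "similarity": 0.98,
     "images": [{"page": 2, "index": 1, "path": "..."},
                {"page": 5, "index": 3, "path": "..."}]}
  ],
  "tampering": [
    {"image": "page_3_img_2.png",
     "suspicious_regions": [
       {"x": 120, "y": 80, "width": 50, "height": 50, "confidence": 0.92}]}
  ]
}
""")
```

For a real run, replace `SAMPLE` with `json.load(open("report.json"))` and print each returned line.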

Requirements

opencv-python>=4.8.0
numpy>=1.24.0
Pillow>=10.0.0
PyPDF2>=3.0.0
pdf2image>=1.16.0
imagehash>=4.3.0
scikit-image>=0.21.0
matplotlib>=3.7.0

Algorithm Details

Duplication Detection

  • Perceptual Hashing: Combines pHash, dHash, and aHash to detect visually similar images
  • Feature Matching: Matches ORB feature points to verify similarity
  • SSIM: Uses the structural similarity index as auxiliary verification
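
To illustrate the perceptual hashing idea, here is a minimal pure-Python sketch of a difference hash (dHash) on a small grayscale grid. It is not the tool's implementation, which lists `imagehash` among its requirements. The function names are hypothetical, the input is assumed to be already downscaled, and scoring similarity as 1 minus the normalized Hamming distance is an assumption about how a score could be compared with `--threshold`.

```python
def dhash_bits(gray: list[list[int]]) -> list[int]:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    A bit is 1 when the left pixel is brighter than its right neighbour.
    """
    return [
        1 if left > right else 0
        for row in gray
        for left, right in zip(row, row[1:])
    ]


def hash_similarity(bits_a: list[int], bits_b: list[int]) -> float:
    """Return 1 minus the normalized Hamming distance between two hashes."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    differing = sum(a != b for a, b in zip(bits_a, bits_b))
    return 1.0 - differing / len(bits_a)


# Toy 4x5 grayscale grid standing in for a downscaled figure.
original = [
    [10, 20, 30, 40, 50],
    [50, 40, 30, 20, 10],
    [10, 50, 10, 50, 10],
    [90, 80, 85, 70, 75],
]
# A uniformly brightened copy keeps every left/right ordering intact.
brightened = [[value + 10 for value in row] for row in original]
# A horizontally mirrored copy flips the ordering in the first two rows.
mirrored = [list(reversed(row)) for row in original]
```

Because dHash encodes only the ordering of neighbouring pixels, the brightened copy hashes identically to the original, while the mirrored copy differs in half of its bits and would fall below a 0.85 threshold.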

Tampering Detection

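  • ELA (Error Level Analysis): Detects JPEG compression level inconsistencies
  • Noise Analysis: Flags regions whose noise pattern differs from the rest of the image
  • Copy-Move Detection: Finds regions that were copied and pasted elsewhere within the same image
  • Lighting Inconsistency: Checks whether lighting is consistent across regions of the image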
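
The copy-move idea can be sketched as exact block matching on a grayscale grid. This is an illustrative toy under strong assumptions, not the tool's detector: it only finds pixel-identical blocks, whereas practical detectors tolerate recompression and noise. The name `find_copied_blocks` and its parameters are hypothetical.

```python
from collections import defaultdict


def find_copied_blocks(gray, block=2, min_spread=1):
    """Return pairs of top-left (x, y) positions whose blocks match exactly.

    Blocks whose brightest and darkest pixels differ by less than
    min_spread are skipped, since flat background matches everywhere.
    """
    height, width = len(gray), len(gray[0])
    seen = defaultdict(list)
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            pixels = tuple(
                gray[y + dy][x + dx]
                for dy in range(block)
                for dx in range(block)
            )
            if max(pixels) - min(pixels) < min_spread:
                continue
            seen[pixels].append((x, y))
    pairs = []
    for positions in seen.values():
        for i, first in enumerate(positions):
            for second in positions[i + 1:]:
                # Keep only pairs of blocks that do not overlap each other.
                if (abs(first[0] - second[0]) >= block
                        or abs(first[1] - second[1]) >= block):
                    pairs.append((first, second))
    return pairs


# 6x6 blank image with a 2x2 patch at the top-left corner,
# copied to the bottom-right corner.
image = [[0] * 6 for _ in range(6)]
patch = [[5, 9], [7, 3]]
for dy in range(2):
    for dx in range(2):
        image[dy][dx] = patch[dy][dx]
        image[4 + dy][4 + dx] = patch[dy][dx]
```

Each reported pair corresponds to a source and a destination block, which maps naturally onto the `suspicious_regions` rectangles in the output format.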

Example

from scripts.main import ImageDuplicationDetector

detector = ImageDuplicationDetector(
    threshold=0.85,
    detect_tampering=True
)

results = detector.scan("paper.pdf")
detector.save_report(results, "report.json")

Notes

  • Supports PDF, PNG, JPG, and TIFF formats
  • Batch processing is recommended for large files
  • Tampering detection may produce false positives; manual review is recommended

Risk Assessment

| Risk Indicator | Assessment | Level |
|----------------|------------|-------|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |

Security Checklist

  • [ ] No hardcoded credentials or API keys
  • [ ] No unauthorized file system access (../)
  • [ ] Output does not expose sensitive information
  • [ ] Prompt injection protections in place
  • [ ] Input file paths validated (no ../ traversal)
  • [ ] Output directory restricted to workspace
  • [ ] Script execution in sandboxed environment
  • [ ] Error messages sanitized (no stack traces exposed)
  • [ ] Dependencies audited
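
The path-related items in the checklist (no `../` traversal, output restricted to the workspace) could be enforced with a check along these lines. This is a sketch under the assumption that a single workspace directory bounds all reads and writes; `resolve_inside` is a hypothetical helper, not a function of the tool.

```python
from pathlib import Path


def resolve_inside(workspace: str, user_path: str) -> Path:
    """Resolve user_path against workspace and reject paths that escape it."""
    root = Path(workspace).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes workspace: {user_path}")
    return candidate
```

Absolute paths are covered as well: joining an absolute `user_path` onto `root` discards `root`, so the containment check then fails unless the path already lies inside the workspace.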

Prerequisites

# Python dependencies
pip install -r requirements.txt

Evaluation Criteria

Success Metrics

  • [ ] Successfully executes main functionality
  • [ ] Output meets quality standards
  • [ ] Handles edge cases gracefully
  • [ ] Performance is acceptable

Test Cases

  1. Basic Functionality: Standard input → Expected output
  2. Edge Case: Invalid input → Graceful error handling
  3. Performance: Large dataset → Acceptable processing time

Lifecycle Status

  • Current Stage: Draft
  • Next Review Date: 2026-03-06
  • Known Issues: None
  • Planned Improvements:
      • Performance optimization
      • Additional feature support

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-31 09:43, security status: safe

Security Scan

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
