Detect File Type - Local

Local, offline AI-powered file type detection — no network, no API keys
Author: pgeraghty
Category: Developer Tools · Source: clawhub · Version: v0.2.0 (latest, 1 version) · API key: not required
Stars: 1 · Downloads: 693 · Installs: 4

Overview

Local-only, offline file type detection. Uses an embedded ML model (Google Magika) to identify 214 file types by content — no network calls, no API keys, no data leaves the machine. All inference runs on-device via ONNX Runtime.

When to Use

  • Identify unknown files by their content (not just extension) — locally, without sending data anywhere
  • Verify that a file's extension matches its actual content
  • Check MIME types before processing uploads or downloads
  • Triage files in a directory by type
  • Detect extension mismatches and masquerading (e.g., .pdf.exe, .xlsx.lnk)
  • Flag suspicious polyglot-style payloads (for example PDF/ZIP or PDF/HTA-style chains)
  • When privacy matters — file bytes never leave the local machine
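The extension checks in the list above can be sketched in Python once a detection result is available. This is a minimal sketch, not part of the tool: the helper names `extension_mismatch` and `has_double_extension` are hypothetical, and the comparison relies on the standard-library `mimetypes` table, which may disagree with the detector's MIME strings for some formats.

```python
import mimetypes
from pathlib import PurePath


def extension_mismatch(result: dict) -> bool:
    """Return True when the extension-implied MIME type differs from the detected one.

    `result` is one detection object with at least "path" and "mime_type".
    Unknown extensions return False because there is nothing to compare against.
    """
    expected, _encoding = mimetypes.guess_type(result["path"])
    if expected is None:
        return False
    return expected != result["mime_type"]


def has_double_extension(path: str) -> bool:
    """Flag names such as report.pdf.exe that carry more than one suffix."""
    return len(PurePath(path).suffixes) > 1
```

Both helpers are heuristics. Legitimate names such as archive.tar.gz also carry two suffixes, so a hit is a reason to look closer rather than a verdict.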

Installation

pip install detect-file-type-local

From source:

pip install -e /path/to/detect-file-type-skill

Usage

Single file

detect_file_type path/to/file

Multiple files

detect_file_type file1.pdf file2.png file3.zip

Recursive directory scan

detect_file_type --recursive ./uploads/

From stdin

cat mystery_file | detect_file_type -

# Optional best-effort fast path (head only)
cat mystery_file | detect_file_type --stdin-mode head --stdin-max-bytes 1048576 -

Output formats

detect_file_type --json file.pdf    # JSON (default)
detect_file_type --human file.pdf   # Human-readable
detect_file_type --mime file.pdf    # Bare MIME type

Programmatic (Python)

python -m detect_file_type path/to/file
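The command above runs the package as a module rather than exposing a Python API. To call it from Python code, one option is to wrap it with `subprocess`. The sketch below assumes the module entry point accepts the same `--json` and `--recursive` flags as the `detect_file_type` command; the function names `build_command` and `detect` are hypothetical.

```python
import json
import subprocess
import sys


def build_command(paths, recursive=False):
    """Build the argv list for `python -m detect_file_type` with JSON output.

    Assumption: the module accepts the same flags as the CLI command.
    """
    cmd = [sys.executable, "-m", "detect_file_type", "--json"]
    if recursive:
        cmd.append("--recursive")
    cmd.extend(str(p) for p in paths)
    return cmd


def detect(paths, recursive=False):
    """Run the CLI and return (exit_code, parsed_json_or_None, stderr_text)."""
    proc = subprocess.run(
        build_command(paths, recursive), capture_output=True, text=True
    )
    # stdout may be empty when nothing could be detected (fatal error).
    parsed = json.loads(proc.stdout) if proc.stdout.strip() else None
    return proc.returncode, parsed, proc.stderr
```

Using `sys.executable` keeps the call inside the interpreter (and virtual environment) where the package is installed.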

Output Schema (JSON)

Single file returns an object; multiple files return an array.

{
  "path": "document.pdf",
  "label": "pdf",
  "mime_type": "application/pdf",
  "score": 0.99,
  "group": "document",
  "description": "PDF document",
  "is_text": false
}

Fields

Field        Type    Description
-----------  ------  --------------------------------------------------
path         string  Input path (or - for stdin)
label        string  Detected file type label (e.g., pdf, png, python)
mime_type    string  MIME type (e.g., application/pdf)
score        float   Confidence score (0.0–1.0)
group        string  Category (e.g., document, image, code)
description  string  Human-readable description
is_text      bool    Whether the file is text-based
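Because a single file yields an object and several files yield an array, callers may want to normalize both shapes into one list. The following is a minimal sketch using the seven fields documented above; the `Detection` class and `parse_detections` function are hypothetical names, not part of the package.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Detection:
    path: str
    label: str
    mime_type: str
    score: float
    group: str
    description: str
    is_text: bool


_FIELD_NAMES = tuple(f.name for f in fields(Detection))


def parse_detections(stdout_text: str) -> list[Detection]:
    """Parse CLI JSON output; a lone object is wrapped so callers always get a list."""
    data = json.loads(stdout_text)
    items = data if isinstance(data, list) else [data]
    # Pick only the documented fields so unexpected extra keys do not break parsing.
    return [Detection(**{name: item[name] for name in _FIELD_NAMES}) for item in items]
```

A missing field raises `KeyError`, which surfaces schema drift early instead of silently producing partial records.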

Exit Codes

Code  Meaning
----  ---------------------------------------------------
0     All files detected successfully
1     Fatal error (no results produced)
2     Partial failure (some files failed, some succeeded)
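A caller that wraps the CLI can map these codes to named values. This is a small sketch; the `ExitStatus` enum and `has_usable_results` helper are hypothetical names chosen for illustration.

```python
from enum import IntEnum


class ExitStatus(IntEnum):
    OK = 0       # all files detected successfully
    FATAL = 1    # no results produced
    PARTIAL = 2  # some files failed, some succeeded


def has_usable_results(returncode: int) -> bool:
    """Codes 0 and 2 both come with results; code 1 comes with none."""
    return ExitStatus(returncode) in (ExitStatus.OK, ExitStatus.PARTIAL)
```

An exit code outside the table raises `ValueError` from the enum lookup, which keeps undocumented codes from being treated as success.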

Error Handling

Errors are printed to stderr. Common cases:

  • File not found: error: path/to/file: No such file or directory
  • Permission denied: error: path/to/file: Permission denied
  • Not a regular file: error: path/to/dir: Not a regular file

When processing multiple files, detection continues for remaining files even if some fail.
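When a run ends in partial failure, the stderr lines can be matched back to the paths that failed. The sketch below assumes every error line follows the `error: <path>: <message>` shape shown above and that the message itself contains no `": "` sequence; `CliError` and `parse_error_line` are hypothetical names.

```python
from typing import NamedTuple, Optional


class CliError(NamedTuple):
    path: str
    message: str


def parse_error_line(line: str) -> Optional[CliError]:
    """Split one stderr line into (path, message); return None if it does not match."""
    prefix = "error: "
    line = line.strip()
    if not line.startswith(prefix):
        return None
    # Split on the last ": " so that paths containing colons stay intact.
    path, sep, message = line[len(prefix):].rpartition(": ")
    if not sep:
        return None
    return CliError(path, message)
```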

Limitations

  • Default stdin mode (spool) writes stdin to a temporary file and uses Magika path detection.
  • --stdin-mode head is best effort and may miss trailing-byte signatures.
  • Very small files (< ~16 bytes) may produce low-confidence results.
  • Empty files are detected as empty.
  • Detection is content-based; file extensions are ignored.
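Since very small inputs and head-only stdin reads can yield low scores, a caller may want to separate confident results from ones that deserve a second look. This is a minimal sketch; the function name `split_by_confidence` and the 0.8 threshold are illustrative assumptions, not values recommended by the tool.

```python
def split_by_confidence(results, min_score=0.8):
    """Partition detection dicts into (confident, uncertain) by their "score" field."""
    confident, uncertain = [], []
    for result in results:
        bucket = confident if result["score"] >= min_score else uncertain
        bucket.append(result)
    return confident, uncertain
```

The right threshold depends on how costly a wrong label is in the surrounding workflow, so it is worth tuning against real samples.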

Security Context

Version History

1 version in total

  • v0.2.0 (current)
    2026-03-29 22:32 · Safe · Safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
