
Skill: PDF Organizer

Uses AI analysis to extract metadata from PDFs, classify them by topic, rename them, and sort them into topic-based folders.
Author: yxl184 (source)
Uncategorized · clawhub · v1.0.0 · 1 version · 99740.3 · Key: required
★ 0 stars · 📥 384 downloads · 💾 5 installs · 1 version · #latest

Overview

PDF Organizer Skill

Description

AI-powered tool that automatically classifies PDF files by topic and sorts them into folders, using a language model accessed through the OpenAI or Kimi API.

Features

  • AI-Powered Content Analysis: Uses OpenAI/Kimi API to extract titles, authors, and journal names
  • Automatic Topic Classification: Classifies PDFs into 10 standard categories (Technology, Finance, Health, Science, Education, Business, Entertainment, Politics, Sports, Other)
  • Smart File Naming: Renames files in Title_Author_Journal.pdf format, with underscores in place of spaces
  • Hierarchical Organization: Creates topic-based folders with subtopic subfolders
  • Batch Processing: Efficiently processes multiple PDFs at once
  • Incremental Mode: Only processes new/modified files
  • Error Handling: Gracefully handles corrupted PDFs and API failures
  • Detailed Logging: Tracks all operations with comprehensive statistics
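
The incremental mode listed above could be implemented in several ways; the source does not say which one the skill uses. The following is a minimal sketch, assuming a JSON state file that records each PDF's size and modification time. The names `fingerprint`, `load_state`, `pending_pdfs`, and `mark_processed` are hypothetical and are not taken from the skill's code.

```python
import json
from pathlib import Path


def fingerprint(path):
    """Cheap change detector: file size plus modification time (ns)."""
    st = Path(path).stat()
    return [st.st_size, st.st_mtime_ns]


def load_state(state_file):
    """Load saved fingerprints, or start empty on the first run."""
    p = Path(state_file)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}


def pending_pdfs(input_dir, state):
    """Return PDFs that are new or whose fingerprint has changed."""
    return [
        pdf
        for pdf in sorted(Path(input_dir).glob("*.pdf"))
        if state.get(pdf.name) != fingerprint(pdf)
    ]


def mark_processed(path, state, state_file):
    """Record a PDF as done. Call this before the file is moved away."""
    path = Path(path)
    state[path.name] = fingerprint(path)
    Path(state_file).write_text(json.dumps(state), encoding="utf-8")
```

Saving the state only after a file has been handled means that a run interrupted by an API failure retries the remaining files next time.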

Requirements

  • Python 3.8+
  • OpenAI API key or Kimi API key
  • PDF files to organize

Usage

  1. Configure API key in config.json
  2. Place PDF files in input_pdfs/ folder
  3. Run: python pdf_organizer.py
  4. Organized files appear in organized_pdfs/ folder

Project Structure

pdf_organizer.py          # Main entry point
modules/
  ├── pdf_reader.py       # PDF text extraction
  ├── content_analyzer.py  # OpenAI/Kimi API integration
  ├── folder_manager.py    # Folder creation and management
  └── file_mover.py        # File operations
config.json              # Configuration file
requirements.txt          # Python dependencies
README.md                # Documentation
setup.py                # Setup/initialization script
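
To illustrate the kind of work that folder_manager.py and file_mover.py are described as doing, here is a rough sketch of placing one file into a topic/subtopic folder. The function `place_pdf`, its parameters, and the collision-suffix behaviour are assumptions made for illustration; the actual modules may work differently.

```python
import shutil
from pathlib import Path


def place_pdf(src, output_dir, topic, subtopic, new_name, dry_run=False):
    """Move src to output_dir/topic/subtopic/new_name; return the target path."""
    target_dir = Path(output_dir) / topic / subtopic
    target = target_dir / new_name
    stem, suffix = Path(new_name).stem, Path(new_name).suffix

    # Assumption: never overwrite; append _1, _2, ... when the name is taken.
    counter = 1
    while target.exists():
        target = target_dir / f"{stem}_{counter}{suffix}"
        counter += 1

    if dry_run:
        # Preview only: report the planned move and touch nothing on disk.
        print(f"[dry run] {src} -> {target}")
        return target

    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(target))
    return target
```

For example, `place_pdf("input_pdfs/a.pdf", "organized_pdfs", "Technology", "Robotics", "Paper.pdf", dry_run=True)` prints the planned destination without creating folders or moving the file.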

Configuration Options

  • openai_api_key: Your API key (required)
  • input_folder: Folder containing PDFs to organize
  • output_folder: Destination folder for organized PDFs
  • model: Model to use (gpt-3.5-turbo for OpenAI or moonshot-v1-8k for Kimi)
  • api_provider: API provider ("openai" or "kimi")
  • dry_run: Preview mode without moving files (true/false)
  • incremental: Only process new/modified files (true/false)
  • max_chars: Maximum characters to analyze from each PDF
  • custom_topics: Custom topic mappings
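
As an illustration of how these options might fit together, the sketch below merges a config.json with fallback values and checks the two fields the list above constrains. The option names come from the list; the default values (in particular `max_chars` of 4000, `incremental` enabled, and `custom_topics` as an empty mapping) and the `load_config` helper are assumptions, not documented behaviour.

```python
import json
from pathlib import Path

# Assumed fallback values; only the option names come from the documentation.
DEFAULTS = {
    "openai_api_key": "",
    "input_folder": "input_pdfs",
    "output_folder": "organized_pdfs",
    "model": "gpt-3.5-turbo",
    "api_provider": "openai",
    "dry_run": False,
    "incremental": True,
    "max_chars": 4000,
    "custom_topics": {},
}


def load_config(path="config.json"):
    """Read config.json, fill in missing options, and validate the basics."""
    config = dict(DEFAULTS)
    config.update(json.loads(Path(path).read_text(encoding="utf-8")))

    if not config["openai_api_key"]:
        raise ValueError("openai_api_key is required")
    if config["api_provider"] not in ("openai", "kimi"):
        raise ValueError('api_provider must be "openai" or "kimi"')
    return config
```

Setting `dry_run` to true for a first pass is a reasonable way to review the proposed names and folders before any file is moved.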

Example Output

Organized files are named: Title_Author_Journal.pdf

Example: Visual_SLAM_What_Are_the_Current_Trends_and_What_to_Expect_Ali_Tourani,_Hriday_Bavle,_Jose_Luis_Sanchez-Lopez,_and_Holger_Voos_Sensors.pdf
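
A name like the one above can be produced by cleaning each metadata field and joining the results with underscores. This is a minimal sketch of that idea, not the skill's actual code: `build_filename` is a hypothetical helper, and the set of stripped characters is an assumption chosen so that commas and hyphens survive, as they do in the example.

```python
import re


def build_filename(title, author, journal):
    """Build a Title_Author_Journal.pdf name from extracted metadata."""
    parts = []
    for field in (title, author, journal):
        # Drop characters that are not allowed in filenames on common systems.
        cleaned = re.sub(r'[\\/:*?"<>|]', "", field)
        # Replace each run of whitespace with a single underscore.
        cleaned = re.sub(r"\s+", "_", cleaned.strip())
        if cleaned:  # skip fields the analysis could not extract
            parts.append(cleaned)
    return "_".join(parts) + ".pdf"
```

Called with the title "Visual SLAM: What Are the Current Trends and What to Expect?", the author string "Ali Tourani, Hriday Bavle, Jose Luis Sanchez-Lopez, and Holger Voos", and the journal "Sensors", this sketch yields the example filename shown above.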

Version History

1 version in total

  • v1.0.0 (current)
    2026-03-30 21:38 · Safe · Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
