---
name: patent-software-ip
description: "Generate CN patent docs (claims, specification, abstract) and software copyright materials from AI/big-data project code or docs. Covers 7 AI domains + big data, 11 claim templates, auto domain detection, desensitization, prior-art search, and self-check."
version: "2.0.0"
author: jaccen
tags: ["patent", "software-copyright", "ip", "ai", "big-data", "3d-vision", "generative-ai", "embodied-ai", "nlp", "rag", "ai-engineering", "ai-safety"]
---
Generate CNIPA invention patent documents or CPCC software copyright materials from AI / big-data project code, design docs, and research papers.
Covers 7 AI domains + Big Data (23 sub-directions), 11 claim templates.
Full version (Chinese, with Word/PPT output): see AI-Copyright-Skill project.
patent / claims / specification / software copyright / disclosure / IP application / paper-to-patent / /patent-software-ip
Phase A Requirement Diagnosis -> path + domain classification + risk level
Phase B Project Analysis -> auto-detect domain + extract key technical points
Phase C Generation (branch by path)
C1 Patent: prior art search -> claims (11 templates) -> specification -> abstract -> self-check
C2 Software Copyright: manual -> source code doc -> self-check
Phase D Iterative Correction
Confirm: path (patent/copyright/both), tech topic, applicant/inventor info, existing materials.
Auto domain classification (see Section "AI Domain Taxonomy" below).
Gate: 3-5 line diagnosis summary including domain + risk level.
AI Domain Taxonomy:
| Domain | Sub-directions | High-Risk Flags |
|---|---|---|
| D1 Perceptual Intelligence | 2D vision, 3D vision, multi-sensor fusion | 3D vision: bind 4-stage pipeline |
| D2 Cognition & Language | NLP, multimodal LLM, RAG, knowledge graph | RAG: show full 5-stage chain |
| D3 Generative AI | Diffusion, LLM text gen, cross-modal gen, AIGC watermark | Must bind condition injection method; pure content gen = rejected |
| D4 Decision & Interaction | Embodied AI, reinforcement learning, multi-agent | Must bind sensor + actuator; RL: bind reward to concrete task |
| D5 AI Engineering | Training/fine-tuning, inference deployment, data engineering, edge IoT | Training: bind to specific model architecture; inference: bind to hardware |
| D6 AI Safety & Governance | Adversarial robustness, watermark/tracing, privacy, alignment | Need concrete technical measure, not policy-level description |
| D7 Industry Applications | Autonomous driving, industrial, medical, financial, AI4Science | Must bind data processing means; financial: bind to data analysis |
| D8 Big Data | Distributed computing, data pipeline, stream processing, data quality, real-time analytics | Must bind to specific application scenario; pure platform = rejected |
Source files -> domain mapping:
| Key file | Detected domain |
|---|---|
| model.py, unet.py, vae.py | D3 Generative AI |
| train.py, finetune.py | D5 AI Engineering (Training) |
| inference.py, triton_serve.py, onnx_export.py | D5 AI Engineering (Inference) |
| render.py, gaussian.py, splat.py | D1 3D Vision |
| llm.py, chat.py, rag_chain.py | D2 NLP / RAG |
| robot.py, vla.py, env.py | D4 Embodied AI |
| reward.py, ppo.py | D4 Reinforcement Learning |
| watermark.py, embed_watermark.py | D6 AI Safety / Watermark |
| spark_job.py, flink_job.py, kafka_consumer.py | D8 Big Data |
| etl.py, data_pipeline.py, feature_store.py | D8 Big Data (Data Engineering) |
| stream.py, realtime_analytics.py | D8 Big Data (Streaming) |
| dataset.py, dataloader.py | D5 AI Engineering (Data) |
| privacy.py, dp_train.py | D6 AI Safety (Privacy) |
| config.yaml, pipeline.py + langchain | D2 RAG / Agent |
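The file-to-domain mapping above can be sketched as a simple lookup. This is a minimal illustration, not the skill's actual detector: the names `FILE_DOMAIN_MAP` and `detect_domains` are hypothetical, and ranking domains by the number of matching files is an assumption the source does not state.

```python
from pathlib import PurePosixPath

# File name -> domain, transcribed from the mapping table.
# The "config.yaml, pipeline.py + langchain" row is omitted because it
# depends on a dependency check, not on file names alone.
FILE_DOMAIN_MAP = {
    **dict.fromkeys(["model.py", "unet.py", "vae.py"], "D3 Generative AI"),
    **dict.fromkeys(["train.py", "finetune.py"], "D5 AI Engineering (Training)"),
    **dict.fromkeys(["inference.py", "triton_serve.py", "onnx_export.py"],
                    "D5 AI Engineering (Inference)"),
    **dict.fromkeys(["render.py", "gaussian.py", "splat.py"], "D1 3D Vision"),
    **dict.fromkeys(["llm.py", "chat.py", "rag_chain.py"], "D2 NLP / RAG"),
    **dict.fromkeys(["robot.py", "vla.py", "env.py"], "D4 Embodied AI"),
    **dict.fromkeys(["reward.py", "ppo.py"], "D4 Reinforcement Learning"),
    **dict.fromkeys(["watermark.py", "embed_watermark.py"], "D6 AI Safety / Watermark"),
    **dict.fromkeys(["spark_job.py", "flink_job.py", "kafka_consumer.py"], "D8 Big Data"),
    **dict.fromkeys(["etl.py", "data_pipeline.py", "feature_store.py"],
                    "D8 Big Data (Data Engineering)"),
    **dict.fromkeys(["stream.py", "realtime_analytics.py"], "D8 Big Data (Streaming)"),
    **dict.fromkeys(["dataset.py", "dataloader.py"], "D5 AI Engineering (Data)"),
    **dict.fromkeys(["privacy.py", "dp_train.py"], "D6 AI Safety (Privacy)"),
}


def detect_domains(paths):
    """Return matched domains, most matching files first (assumed ranking rule)."""
    counts = {}
    for path in paths:
        # Only the base file name is compared; paths use forward slashes.
        domain = FILE_DOMAIN_MAP.get(PurePosixPath(path).name)
        if domain is not None:
            counts[domain] = counts.get(domain, 0) + 1
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts, key=lambda d: (-counts[d], d))
```

For example, `detect_domains(["src/render.py", "src/gaussian.py", "train.py"])` ranks D1 3D Vision ahead of D5 AI Engineering (Training), because two files match the former and one the latter.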
Also detect 6 industry contexts: medical, financial, autonomous driving, industrial, smart city, education.
Priority: model definition -> training/inference -> domain-specific core -> papers/design docs -> README.
Output: Key Points List (innovations, scheme skeleton, key params, distinctions, quantifiable effects, domain classification).
Gate: Present key points list for user confirmation.
Prior-art search: run 2-3 rounds of online search across the CNIPA patent database, Google Patents, and arXiv. For each result, record the source ID, a scheme summary, and its limitations.
CPC suggestions by domain:
Claim structure: Method (1 independent + 3-8 dependent) + System (1 independent + 3-8 dependent) + Storage Medium (1 independent).
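The claim counts above lend themselves to a mechanical check. The sketch below assumes a hypothetical `Claim` record and a `check_claim_structure` helper; neither name comes from the skill itself, and whether a storage-medium claim may have dependents is left unchecked because the source does not say.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    number: int
    category: str                      # "method", "system", or "medium"
    depends_on: Optional[int] = None   # None marks an independent claim


def check_claim_structure(claims):
    """Return a list of problems; an empty list means the counts are in range."""
    problems = []
    # (category, allowed dependent-claim range); None means "not checked".
    rules = (("method", (3, 8)), ("system", (3, 8)), ("medium", None))
    for category, dep_range in rules:
        group = [c for c in claims if c.category == category]
        independent = [c for c in group if c.depends_on is None]
        dependent = [c for c in group if c.depends_on is not None]
        if len(independent) != 1:
            problems.append(
                f"{category}: expected 1 independent claim, found {len(independent)}"
            )
        if dep_range is not None:
            low, high = dep_range
            if not low <= len(dependent) <= high:
                problems.append(
                    f"{category}: expected {low}-{high} dependent claims, "
                    f"found {len(dependent)}"
                )
    return problems
```

A check like this only counts claims; it says nothing about whether the dependency chain or the claim wording is sound.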
Template selection by domain:
| Template | Domain | Independent claim skeleton |
|---|---|---|
| T1 Model Architecture | D1/D2/D5 | Predefined network -> layer composition -> feature extraction -> output |
| T2 3D Vision | D1 3D | Capture -> sparse reconstruction -> dense optimization -> rendering (expand formula) |
| T3 Training Strategy | D5 | Data construction -> model initialization -> loss design -> optimization -> convergence |
| T4 Multimodal Fusion | D1/D2 | Multi-modal input -> modality-specific encoding -> cross-modal alignment -> fused output |
| T5 RAG Pipeline | D2 | Parse -> retrieve -> rerank -> reconstruct -> generate |
| T6 Diffusion Model | D3 | Noise scheduling -> condition injection (specify: cross-attention/adapter/ControlNet) -> denoising -> decode |
| T7 Agent | D2/D4 | Environment perception -> task decomposition -> tool selection -> execution -> feedback |
| T8 Embodied Intelligence | D4 | Sensor input -> perception -> planning -> actuator output + safety constraint (dependent) |
| T9 Inference Optimization | D5 | Model loading -> computation graph optimization -> kernel fusion -> output |
| T10 Big Data Processing | D8 | Data ingestion -> distributed processing (specify: Spark/Flink/MapReduce) -> aggregation -> storage/output |
| T11 Data Engineering & Quality | D8 | Data collection -> quality assessment -> anomaly detection -> cleaning -> feature extraction -> storage |
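Template selection from the table above can be expressed as a reverse lookup from domain to candidate templates. This is a sketch under the assumption that the Domain column is the only selection criterion; `TEMPLATE_DOMAINS` and `candidate_templates` are illustrative names, and the "D1 3D" entry for T2 is simplified to D1.

```python
# Template -> domains it applies to, transcribed from the table.
TEMPLATE_DOMAINS = {
    "T1 Model Architecture": {"D1", "D2", "D5"},
    "T2 3D Vision": {"D1"},
    "T3 Training Strategy": {"D5"},
    "T4 Multimodal Fusion": {"D1", "D2"},
    "T5 RAG Pipeline": {"D2"},
    "T6 Diffusion Model": {"D3"},
    "T7 Agent": {"D2", "D4"},
    "T8 Embodied Intelligence": {"D4"},
    "T9 Inference Optimization": {"D5"},
    "T10 Big Data Processing": {"D8"},
    "T11 Data Engineering & Quality": {"D8"},
}


def candidate_templates(domain):
    """Return templates listed for a domain code such as "D2", in table order."""
    return [name for name, domains in TEMPLATE_DOMAINS.items() if domain in domains]


print(candidate_templates("D8"))
```

The table lists no template for D6 or D7, so the lookup returns an empty list for those domains and the choice has to be made by hand.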
Drafting rules (all domains):
Specification (5 chapters): Technical Field -> Background (prior art + defects) -> Invention Content (problem + scheme + effects, quantified) -> Figure Description -> Specific Embodiments.
Desensitization:
Figures (mermaid flowchart TB/LR): System architecture + method flow + domain-specific pipeline (training/rendering/data pipeline/stream topology/etc.).
Abstract: at most 300 characters, covering the technical domain, the core scheme, and the main effect. No commercial terms.
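The length limit and the commercial-terms rule can be checked automatically. In the sketch below, the term list is a hypothetical placeholder (the source does not enumerate commercial terms), and counting characters with `len` on the stripped text is an assumption about how the limit is measured.

```python
MAX_ABSTRACT_CHARS = 300

# Hypothetical examples of commercial wording; replace with a real list.
COMMERCIAL_TERMS = ("market-leading", "best-selling", "lowest price")


def check_abstract(text):
    """Return a list of problems found in an abstract draft."""
    problems = []
    length = len(text.strip())
    if length > MAX_ABSTRACT_CHARS:
        problems.append(f"abstract has {length} characters, limit is {MAX_ABSTRACT_CHARS}")
    lowered = text.lower()
    for term in COMMERCIAL_TERMS:
        if term in lowered:
            problems.append(f"commercial term found: {term!r}")
    return problems
```

An empty result means only that these two mechanical rules pass; coverage of domain, scheme, and effect still needs a human read.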
Domain-specific self-check:
| Domain | Extra checks |
|---|---|
| D1 3D Vision | Rendering formula in claim? 4-stage pipeline? |
| D2 NLP/RAG | Full 5-stage RAG chain? Specific embedding model? |
| D3 Generative AI | Condition injection method specified? Not pure content gen? |
| D4 Embodied | Sensor + actuator bound in every step? Safety dependent claim? |
| D5 AI Engineering | Specific model architecture? Hardware binding for inference? |
| D6 AI Safety | Concrete technical measure? Not policy-level? |
| D7 Financial/Medical | Data processing means bound? Not pure business method? |
| D8 Big Data | Specific application scenario bound? Not pure platform? Distributed topology described? |
Manual structure: Introduction (environment + capabilities) -> Installation (environment + weights + config) -> Functions (core + data + API + monitoring) -> Non-functional -> FAQ.
Templates by domain:
File priority by domain:
| Domain | Required files | Domain-specific required |
|---|---|---|
| D1 3D Vision | model.py, train.py, inference.py, render.py | render.py |
| D2 NLP/RAG | model.py, train.py, inference.py, retriever.py | retriever.py |
| D3 Generative AI | model.py, train.py, inference.py, generate.py | generate.py |
| D4 Embodied | model.py, train.py, inference.py, control.py, env.py | control.py |
| D5 AI Engineering | model.py, finetune.py, export.py, deploy.py | finetune.py |
| D6 AI Safety | model.py, watermark.py, adv_train.py | watermark.py |
| D8 Big Data | pipeline.py, etl.py, stream.py, config.yaml | pipeline.py |
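Source code document: under 3,000 lines, submit everything; over 3,000 lines, submit the first 1,500 and the last 1,500 lines, with files ordered by the priority above.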
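The line-count rule above can be sketched as follows. The function name `select_source_lines` is illustrative, and it assumes the files are concatenated in priority order before the first and last 1,500 lines are taken; the source does not spell out how the two halves are assembled.

```python
def select_source_lines(files, limit=3000, half=1500):
    """Pick the lines to submit.

    files: list of (file_name, list_of_lines), already sorted by priority.
    """
    # Concatenate all files in priority order (assumed assembly rule).
    merged = [line for _, lines in files for line in lines]
    if len(merged) <= limit:
        return merged
    # Over the limit: keep the front and back portions only.
    return merged[:half] + merged[-half:]
```

At exactly 3,000 lines both branches of the rule give the same result, so the boundary case needs no special handling.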
Desensitization: Remove API keys, absolute paths, internal addresses, personal info, hardware models, cloud URLs, DB passwords. Retain algorithm comments.
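A few of the desensitization targets above (keys and passwords, internal addresses, absolute paths) can be masked with regular expressions. The patterns and placeholder strings below are heuristics chosen for illustration, not the skill's actual rules, and they will miss cases such as Windows paths or secrets stored without quotes.

```python
import re

REDACTIONS = [
    # Quoted values assigned to names containing key/secret/password.
    (
        re.compile(
            r"""\b(\w*(?:api_?key|secret|password|passwd)\w*\s*[=:]\s*)(["']).*?\2""",
            re.IGNORECASE,
        ),
        r"\1\2<REDACTED>\2",
    ),
    # Private IPv4 ranges: 10.x.x.x, 192.168.x.x, 172.16-31.x.x.
    (
        re.compile(
            r"\b(?:10(?:\.\d{1,3}){3}"
            r"|192\.168(?:\.\d{1,3}){2}"
            r"|172\.(?:1[6-9]|2\d|3[01])(?:\.\d{1,3}){2})\b"
        ),
        "<INTERNAL_ADDR>",
    ),
    # Absolute paths under a user's home directory (Unix-style only).
    (re.compile(r"""/(?:home|Users)/[^\s"']+"""), "<PATH>"),
]


def desensitize(source):
    """Apply each redaction in order; comments are left untouched."""
    for pattern, replacement in REDACTIONS:
        source = pattern.sub(replacement, source)
    return source


print(desensitize('API_KEY = "sk-12345"  # attention scaling factor'))
```

Because the patterns never touch comment text, algorithm comments survive the pass, which matches the "retain algorithm comments" requirement. A manual review is still needed for personal information and hardware models.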
Deep-dive reference files for domain-specific patent writing rules, claim templates, and software copyright guides.
| File | Sections | Key Content |
|---|---|---|
| references/ai-patent-claims-guide.md | 11 claim templates (T1-T14) | Full legal claim text per template: method/system/medium triples with dependent claims; Big Data T10-T14 included |
| references/ai-patent-special.md | Patentability framework, 8 risk domains, CPC codes, desensitization rules | AI+Big Data patentability risk assessment; domain mapping; figure requirements; industry desensitization; CPC classification (7.1-7.7); 9-domain quick reference |
| references/ai-software-copyright-guide.md | Type detection, source file priority, 5 domain templates, FAQ | Decision tree for 10+ project types; source code priority by domain; Big Data dedicated template (section 3.5); desensitization checklist; common pitfalls |
Identify -> Locate -> Targeted fix -> Save as v{N} -> Re-run affected self-check items only. Do NOT re-run full pipeline.
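The "Save as v{N}" step needs the next free version number. The sketch below assumes a hypothetical naming scheme of `<stem>_v<N>.md`; the source only says "v{N}" and does not fix where the suffix goes.

```python
import re


def next_version_name(existing, stem="claims", ext=".md"):
    """Return the next versioned file name given the names already present."""
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(ext)}$")
    # Collect version numbers of files that match this stem only.
    versions = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{stem}_v{max(versions, default=0) + 1}{ext}"


print(next_version_name(["claims_v1.md", "claims_v2.md", "abstract_v5.md"]))
```

Keeping every version on disk makes it possible to re-run only the affected self-check items against the new file while leaving earlier versions intact.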
outputs/{case-id}/
patent/ claims.md + specification.md + abstract.md + full.md
software-copyright/ manual.md + source_code.md
Prohibitions: No skill name/repo path/disclaimers in deliverables. No self-check section in body. No fabricated patent numbers/links. No "approximately" in claims. No commercial terms in abstract.
| Pattern | Why rejected | Fix |
|---|---|---|
| Pure content generation (no condition injection) | "Intellectual activity rules" | Specify cross-attention/adapter/ControlNet in claims |
| Financial AI without data processing means | "Business method" | Bind to specific feature engineering + model architecture |
| Embodied AI without sensor/actuator binding | "Pure algorithm" | Add "executed via LiDAR module" + "motor controller" |
| RAG without full pipeline | "Insufficient disclosure" | Show all 5 stages in method claim |
| Big Data platform without application | "Abstract idea" | Bind to specific scenario (e.g., real-time traffic analytics) |
| RL without reward function | "Insufficient disclosure" | Include reward computation formula |
| AI watermark without robustness test | "Insufficient technical effect" | Add adversarial/noise/compression robustness claim |
| Medical AI without clinical validation | "Insufficient enablement" | Add evaluation on specific dataset with clinical metrics |