
ZeeLin Patent Retriever

Team ZeeLin's production-grade patent evidence retrieval skill for Google Patents BigQuery. Converts natural-language research intent into auditable multi-path retrieval queries.
Author: yangyuwen-bri
Security & Compliance · clawhub v0.1.2 · 1 version · 100000 · Key: required

Overview

ZeeLin Patent Retriever

Team ZeeLin skill for Google Patents retrieval via BigQuery.

This skill performs patent retrieval and structured output generation only. It does not provide legal conclusions.

30-Second Quickstart Card

Purpose:

  • Fetch, deduplicate, and structure patent evidence from Google Patents BigQuery for downstream analysis.

Required env:

  • GOOGLE_APPLICATION_CREDENTIALS
  • GOOGLE_CLOUD_PROJECT

Run this:

python3 -m pip install -r requirements.txt
RUN_ID="quick_$(date +%Y%m%d_%H%M%S)"; RUN_DIR="results/${RUN_ID}"; mkdir -p "$RUN_DIR"
python3 scripts/patent_search.py --keywords "ai sentiment analysis" --limit 80 --output "$RUN_DIR/seed_raw.json"
python3 scripts/build_query_plan.py --topic "Public Opinion + AI" --keywords "public opinion ai sentiment" --task-id "$RUN_ID" --seed-raw "$RUN_DIR/seed_raw.json" --concept-output "$RUN_DIR/concept_scan.json" --plan-output "$RUN_DIR/query_plan.json"
python3 scripts/patent_search_plan.py --plan "$RUN_DIR/query_plan.json" --output-raw "$RUN_DIR/retriever_raw.json" --output-retriever "$RUN_DIR/retriever_result.json" --min-results 20

Expected outputs:

  • $RUN_DIR/concept_scan.json
  • $RUN_DIR/query_plan.json
  • $RUN_DIR/retriever_raw.json
  • $RUN_DIR/retriever_result.json

If it fails:

  • Missing env vars: configure Google credentials first.
  • Too few results: keep filters and increase limits/expansion rounds before relaxing constraints.

1. Execution Rules

  1. Use the three-stage flow by default: seed -> build_plan -> execute_plan.
  2. Default minimum result count is 20 unless the user explicitly requests another value.
  3. If the user specifies hard constraints (year, country, assignee, inventor, IPC/CPC), they must be applied in query_plan.json (filters) before execution.
  4. Before execution, echo planned filters. After execution, echo effective filters, result size, and output file paths.
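
Rule 4 can be illustrated with a small helper that echoes the planned filters before execution. This is a minimal sketch, not part of the skill: the function name describe_planned_filters is hypothetical, and the plan layout (a query_rounds list whose entries carry a filters object) is assumed from the patch script in Step 3.

```python
import json


def describe_planned_filters(plan: dict) -> list[str]:
    """Return one readable line per query round listing its filters."""
    lines = []
    for index, query_round in enumerate(plan.get("query_rounds", []), start=1):
        filters = query_round.get("filters", {})
        # Sort keys so the echoed output is stable between runs.
        rendered = json.dumps(filters, ensure_ascii=False, sort_keys=True)
        lines.append(f"round {index}: {rendered}")
    return lines


# In-memory stand-in for query_plan.json (assumed shape; real plans carry more fields).
example_plan = {
    "query_rounds": [
        {"filters": {"country_in": ["US"], "pub_date_from": 20210101, "pub_date_to": 20231231}},
    ]
}

for line in describe_planned_filters(example_plan):
    print(line)
```

The same helper could be called again after execution on the plan that was actually run, which keeps the "planned" and "effective" echoes in the same format.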

2. Pre-Run Checks

Required environment variables:

  • GOOGLE_APPLICATION_CREDENTIALS
  • GOOGLE_CLOUD_PROJECT

Install dependencies:

python3 -m pip install -r requirements.txt

Optional environment check:

python3 - <<'PY'
import os
required = ["GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_CLOUD_PROJECT"]
missing = [k for k in required if not os.getenv(k)]
print({"ok": not missing, "missing": missing})
PY

3. Capability Boundary and Parameter Sources

3.1 Supported filter dimensions

  • Text: keywords_all / keywords_any / keywords_anchor_any / keywords_not
  • Taxonomy: ipc_prefix_any / cpc_prefix_any
  • Entities: assignee_any / inventor_any
  • Geography: country_in
  • Date ranges: pub_date_from / pub_date_to / filing_date_from / filing_date_to

Field source: query_plan.json (schema: schemas/query_plan.schema.json).
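
As an illustration of the filter dimensions listed above, the sketch below shows one possible filters object and a check for unsupported keys. The key names come from section 3.1; the value shapes (lists of strings, integer YYYYMMDD dates) follow the Step 3 patch script where it shows them and are otherwise assumptions, and the helper unknown_filter_keys is hypothetical. The authoritative definition remains schemas/query_plan.schema.json.

```python
# Filter keys taken from section 3.1 of this document.
SUPPORTED_FILTER_KEYS = {
    "keywords_all", "keywords_any", "keywords_anchor_any", "keywords_not",
    "ipc_prefix_any", "cpc_prefix_any",
    "assignee_any", "inventor_any",
    "country_in",
    "pub_date_from", "pub_date_to", "filing_date_from", "filing_date_to",
}


def unknown_filter_keys(filters: dict) -> list[str]:
    """Return filter keys that are not in the supported set, sorted."""
    return sorted(set(filters) - SUPPORTED_FILTER_KEYS)


# Illustrative values only; the CPC prefix is an example, not a recommendation.
example_filters = {
    "keywords_any": ["sentiment", "public opinion"],
    "cpc_prefix_any": ["G06F40"],
    "country_in": ["US", "CN"],
    "pub_date_from": 20210101,
    "pub_date_to": 20231231,
}

print(unknown_filter_keys(example_filters))
print(unknown_filter_keys({"country": ["US"]}))
```

Running a check like this before execution catches misspelled keys such as country (instead of country_in), which could otherwise go unnoticed.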

3.2 Default behavior for missing inputs

  • min_results: default 20
  • Country unspecified: default US,CN,WO,EP,JP,KR
  • Date range unspecified: default years_back=8
  • Keywords missing: ask for clarification and do not run
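
The defaults above can be sketched as a single resolution step. The function resolve_inputs and its argument names are hypothetical; only the default values (20 results, six country codes, 8 years back) and the "no keywords, no run" rule come from this section.

```python
DEFAULT_MIN_RESULTS = 20
DEFAULT_COUNTRIES = ["US", "CN", "WO", "EP", "JP", "KR"]
DEFAULT_YEARS_BACK = 8


def resolve_inputs(keywords=None, min_results=None, countries=None, years_back=None) -> dict:
    """Fill in documented defaults; refuse to proceed without keywords."""
    if not keywords or not keywords.strip():
        # Section 3.2: missing keywords means asking the user, not running.
        raise ValueError("keywords are required; ask the user to clarify before running")
    return {
        "keywords": keywords.strip(),
        "min_results": min_results if min_results is not None else DEFAULT_MIN_RESULTS,
        "countries": list(countries) if countries else list(DEFAULT_COUNTRIES),
        "years_back": years_back if years_back is not None else DEFAULT_YEARS_BACK,
    }


resolved = resolve_inputs("ai sentiment analysis")
print(resolved)
```

This sketch only covers the relative-window default; an explicit date range supplied by the user would replace years_back rather than combine with it.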

3.3 Year-to-date mapping rules

  • Single year (e.g. 2021) => from=20210101, to=20211231
  • Year range (e.g. 2021-2023) => from=20210101, to=20231231
  • Relative window (e.g. “last N years”) => use --years-back N
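
The first two mapping rules can be written as a short helper. This is a sketch under the assumption that dates are stored as YYYYMMDD integers, as in the Step 3 patch script; the function name year_expression_to_dates is hypothetical. Relative windows are deliberately not handled here because they go through --years-back instead.

```python
import re


def year_expression_to_dates(expression: str) -> tuple[int, int]:
    """Map "2021" or "2021-2023" to (from, to) as YYYYMMDD integers."""
    match = re.fullmatch(r"\s*(\d{4})\s*(?:-\s*(\d{4}))?\s*", expression)
    if match is None:
        raise ValueError(f"unsupported year expression: {expression!r}")
    start_year = int(match.group(1))
    # A single year is treated as a range that starts and ends in that year.
    end_year = int(match.group(2) or match.group(1))
    if end_year < start_year:
        raise ValueError("end year precedes start year")
    # January 1 of the first year through December 31 of the last year.
    return start_year * 10000 + 101, end_year * 10000 + 1231


print(year_expression_to_dates("2021"))
print(year_expression_to_dates("2021-2023"))
```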

4. Standard Flow (Command Templates)

Create a run directory first:

RUN_ID="run_$(date +%Y%m%d_%H%M%S)"
RUN_DIR="results/${RUN_ID}"
mkdir -p "$RUN_DIR"

Step 1: Seed retrieval

python3 scripts/patent_search.py \
  --keywords "<keywords>" \
  --limit 80 \
  --output "$RUN_DIR/seed_raw.json"

Step 2: Build query plan

python3 scripts/build_query_plan.py \
  --topic "<topic>" \
  --keywords "<keywords>" \
  --task-id "$RUN_ID" \
  --years-back 8 \
  --country-in "US,CN,WO,EP,JP,KR" \
  --seed-raw "$RUN_DIR/seed_raw.json" \
  --concept-output "$RUN_DIR/concept_scan.json" \
  --plan-output "$RUN_DIR/query_plan.json"

Step 3: Apply explicit user constraints (critical)

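When the user explicitly requests country/year/assignee filters, patch query_plan.json before execution. RUN_DIR is passed on the command line because the shell variable set above is not exported, so the script could not otherwise read it from os.environ.

RUN_DIR="$RUN_DIR" python3 - <<'PY'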
import json
import os
from pathlib import Path

plan_path = Path(os.environ["RUN_DIR"]) / "query_plan.json"
plan = json.loads(plan_path.read_text(encoding="utf-8"))

# Example override: 2021-2023 + US + keyword constraints
for r in plan.get("query_rounds", []):
    f = r.setdefault("filters", {})
    f["country_in"] = ["US"]
    f["pub_date_from"] = 20210101
    f["pub_date_to"] = 20231231
    f.setdefault("keywords_any", [])
    f["keywords_any"] = list(dict.fromkeys(f["keywords_any"] + ["sentiment", "public opinion", "risk"]))

plan_path.write_text(json.dumps(plan, ensure_ascii=False, indent=2), encoding="utf-8")
print({"updated": str(plan_path)})
PY

Step 4: Execute planned retrieval

python3 scripts/patent_search_plan.py \
  --plan "$RUN_DIR/query_plan.json" \
  --output-raw "$RUN_DIR/retriever_raw.json" \
  --output-retriever "$RUN_DIR/retriever_result.json" \
  --min-results 20

Step 5: Validate outputs

python3 scripts/schema_check.py --input "$RUN_DIR/concept_scan.json" --schema schemas/concept_scan.schema.json
python3 scripts/schema_check.py --input "$RUN_DIR/query_plan.json" --schema schemas/query_plan.schema.json
python3 scripts/schema_check.py --input "$RUN_DIR/retriever_result.json" --schema schemas/retriever_result.schema.json

5. Natural Language to Parameter Mapping Examples

Example A:

  • User input: Find US patents on AI public-opinion early warning from 2021 to 2023, at least 30 results
  • Mapping:
    • topic="AI public opinion early warning"
    • keywords="ai public opinion early warning sentiment"
    • Plan override: country_in=["US"], pub_date_from=20210101, pub_date_to=20231231
    • Execution arg: --min-results 30

Example B:

  • User input: Search multimodal emotion recognition patents in CN/JP/KR over the last 5 years, focus on Tencent and ByteDance
  • Mapping:
    • --years-back 5
    • country_in=["CN","JP","KR"]
    • assignee_any=["Tencent","ByteDance"]
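
Example B can be sketched as a command line for build_query_plan.py plus a set of plan overrides. The flags are the ones shown in Step 2. The helper build_plan_command, the run directory name, and the topic and keyword strings are illustrative assumptions, since the example above does not state them.

```python
import shlex


def build_plan_command(topic, keywords, run_dir, task_id, years_back=8,
                       countries=("US", "CN", "WO", "EP", "JP", "KR")):
    """Assemble the Step 2 command as an argument list."""
    return [
        "python3", "scripts/build_query_plan.py",
        "--topic", topic,
        "--keywords", keywords,
        "--task-id", task_id,
        "--years-back", str(years_back),
        "--country-in", ",".join(countries),
        "--seed-raw", f"{run_dir}/seed_raw.json",
        "--concept-output", f"{run_dir}/concept_scan.json",
        "--plan-output", f"{run_dir}/query_plan.json",
    ]


# Example B: last 5 years, CN/JP/KR (topic and keywords are assumed wording).
command = build_plan_command(
    topic="multimodal emotion recognition",
    keywords="multimodal emotion recognition",
    run_dir="results/run_example",
    task_id="run_example",
    years_back=5,
    countries=["CN", "JP", "KR"],
)

# Constraints to patch into each round's filters in Step 3.
plan_overrides = {
    "country_in": ["CN", "JP", "KR"],
    "assignee_any": ["Tencent", "ByteDance"],
}

print(shlex.join(command))
print(plan_overrides)
```

Keeping the command as a list avoids quoting problems with multi-word topics; shlex.join is used only to print a copy-pasteable form.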

6. Post-Execution Response Template (required)

Retrieval completed.
Effective filters:
- Countries: ...
- Publication date range: ...
- Filing date range: ...
- Keywords (any/all/not): ...
- Assignee/Inventor filters: ...

Results:
- Patent count: ...
- Country distribution: ...
- Latest publication date: ...

Files:
- concept_scan: ...
- query_plan: ...
- retriever_raw: ...
- retriever_result: ...
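
The "Results" part of the template could be filled from the retrieved items along these lines. This is a sketch with several assumptions: items are dictionaries, the country code is the prefix of publication_number before the first hyphen, and a publication_date field holding a YYYYMMDD integer is available. Only publication_number and title are guaranteed by the output contract, so the date field is hypothetical, and the sample records are placeholders rather than real patents.

```python
from collections import Counter


def summarize_patents(patents: list[dict]) -> dict:
    """Compute count, per-country distribution, and latest publication date."""
    # Assumes numbers look like "US-0000001-A1", with the country code first.
    countries = Counter(p["publication_number"].split("-", 1)[0] for p in patents)
    # publication_date is an assumed field; items without it are skipped.
    dates = [p["publication_date"] for p in patents if p.get("publication_date")]
    return {
        "patent_count": len(patents),
        "country_distribution": dict(sorted(countries.items())),
        "latest_publication_date": max(dates) if dates else None,
    }


# Placeholder records for illustration only.
sample_patents = [
    {"publication_number": "US-0000001-A1", "title": "Sample A", "publication_date": 20220315},
    {"publication_number": "US-0000002-B2", "title": "Sample B", "publication_date": 20230110},
    {"publication_number": "CN-0000003-A", "title": "Sample C", "publication_date": 20211201},
]

summary = summarize_patents(sample_patents)
print(summary)
```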

7. Common Failures and Recovery

  • Missing environment variables: instruct user to configure Google credentials first.
  • Insufficient retrieval volume:
    1. Keep constraints, increase per-round limits.
    2. Increase expansion rounds.
    3. If still insufficient, ask whether to relax country/date constraints.
  • Cost risk: prioritize narrower date windows and country scopes before broad scans.
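
The recovery order for insufficient retrieval volume can be sketched as a loop that raises limits first, then adds rounds, and only then hands the decision back to the user. Everything concrete here is an assumption: run_retrieval is a hypothetical callable standing in for a real execution, and the doubling step and the caps are illustrative values, not settings taken from the skill.

```python
def recover_until_enough(run_retrieval, min_results=20, start_limit=80,
                         start_rounds=1, max_limit=320, max_rounds=4):
    """Escalate limits, then rounds, without touching user constraints."""
    limit, rounds = start_limit, start_rounds
    count = run_retrieval(limit, rounds)
    # Step 1: keep constraints and raise the per-round limit.
    while count < min_results and limit < max_limit:
        limit = min(limit * 2, max_limit)
        count = run_retrieval(limit, rounds)
    # Step 2: keep constraints and add expansion rounds.
    while count < min_results and rounds < max_rounds:
        rounds += 1
        count = run_retrieval(limit, rounds)
    # Step 3: relaxing country/date constraints is the user's decision.
    action = "done" if count >= min_results else "ask_user_to_relax_constraints"
    return {"limit": limit, "rounds": rounds, "count": count, "action": action}


def fake_retrieval(limit, rounds):
    """Toy stand-in that returns a result count for given settings."""
    return min(limit // 20, 10) + rounds * 3


outcome = recover_until_enough(fake_retrieval)
print(outcome)
```

The caps also serve the cost concern: the loop stops escalating at a known ceiling instead of widening scans indefinitely.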

8. Output Contract

Required output files:

  • concept_scan.json
  • query_plan.json
  • retriever_raw.json
  • retriever_result.json

retriever_result.json minimum requirements:

  • patents count >= min_results (default 20)
  • each item includes publication_number and title
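
The minimum requirements above can be checked with a few lines of Python. This sketch assumes retriever_result.json has a top-level patents list of dictionaries; the helper name contract_violations is hypothetical, and the check is a lightweight complement to scripts/schema_check.py rather than a replacement for it.

```python
def contract_violations(result: dict, min_results: int = 20) -> list[str]:
    """List every way the result misses the minimum output contract."""
    patents = result.get("patents", [])  # assumed top-level key
    problems = []
    if len(patents) < min_results:
        problems.append(f"expected at least {min_results} patents, found {len(patents)}")
    for index, patent in enumerate(patents):
        for field in ("publication_number", "title"):
            # Empty strings count as missing.
            if not patent.get(field):
                problems.append(f"patents[{index}] is missing {field}")
    return problems


# Placeholder data: one complete item and one without a title.
sample_result = {
    "patents": [
        {"publication_number": "US-0000001-A1", "title": "Sample A"},
        {"publication_number": "CN-0000002-A", "title": ""},
    ]
}

print(contract_violations(sample_result, min_results=2))
```

An empty list means the contract is met; otherwise the messages can be included in the post-execution response.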

9. References

  • Methodology: references/methodology.md
  • Quick examples: examples/quickstart.md

Version history

1 version in total

  • v0.1.2 (current)
    2026-03-30 04:05 · Safe · Safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
