
semantic-scholar

Search, retrieve, and organize scholarly metadata with the Semantic Scholar APIs. Use when Codex needs to find papers or authors, build paper sets from compl...
grenzlinie
Category: Data Analysis · Source: clawhub · v1.0.0 (#latest) · 1 version · API key: required
★ 0 stars · 📥 792 downloads · 💾 36 installs

Semantic Scholar

Overview

Choose the right Semantic Scholar API workflow before writing code or issuing requests. Prefer small, field-scoped online calls for interactive search, paper/search/bulk for large retrieval jobs, recommendations when the user already has seed papers, and datasets only for offline or release-based data pulls.

Workflow Decision Tree

Start by classifying the task:

  • Use the Graph API for live paper or author lookup, metadata retrieval, query refinement, and batch fetches by known IDs.
  • Use the Recommendations API when the user already has one or more relevant papers and wants similar or related work.
  • Use the Datasets API when the user needs offline snapshots, release-to-release diffs, or corpus-scale ingestion rather than interactive search.

Then choose the endpoint pattern:

  • Use paper/search for normal interactive search, smaller result sets, ranking, and iterative query tuning.
  • Use paper/search/bulk for large result collection; it uses continuation-token pagination and is the default for broad literature harvesting.
  • Use paper/batch or author/batch when the user already has IDs and wants metadata efficiently.
  • Use author/search for author discovery by name or affiliation-like clues.
  • Use recommendations for "papers like this one" workflows.
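The decision tree above can be sketched as a small lookup table. This is a minimal illustration, not part of the skill itself: the task labels and the choose_endpoint helper are hypothetical names, and the base URLs are the public Semantic Scholar hosts as I understand them, so verify them against the current API documentation before relying on them.

```python
# Base URLs for the three API families (assumed from the public docs).
GRAPH = "https://api.semanticscholar.org/graph/v1"
RECS = "https://api.semanticscholar.org/recommendations/v1"
DATASETS = "https://api.semanticscholar.org/datasets/v1"

# Hypothetical task labels mapped to (HTTP method, URL).
ENDPOINTS = {
    "interactive_search": ("GET", f"{GRAPH}/paper/search"),
    "bulk_harvest": ("GET", f"{GRAPH}/paper/search/bulk"),
    "papers_by_id": ("POST", f"{GRAPH}/paper/batch"),
    "authors_by_id": ("POST", f"{GRAPH}/author/batch"),
    "author_search": ("GET", f"{GRAPH}/author/search"),
    "similar_papers": ("POST", f"{RECS}/papers/"),
    "offline_release": ("GET", f"{DATASETS}/release/"),
}


def choose_endpoint(task: str) -> tuple[str, str]:
    """Return the (method, URL) pair for a classified task label."""
    try:
        return ENDPOINTS[task]
    except KeyError:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(ENDPOINTS)}"
        ) from None
```

Classifying the task first and only then resolving a URL keeps routing mistakes (for example, sending a normal search to the Datasets API) in one place.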

Operating Rules

  • Request only the fields needed for the task. Semantic Scholar explicitly supports field projection; smaller field lists are faster and less error-prone.
  • Prefer API key authentication via SEMANTIC_SCHOLAR_API_KEY when available, especially for repeated or larger jobs.
  • Handle pagination explicitly. paper/search and author/search page through results with offset and limit; paper/search/bulk returns a continuation token that must be passed back on the next request.
  • Add retry and backoff for 429 and transient 5xx responses.
  • Preserve raw results before flattening or post-processing them.
  • For broad discovery, write Boolean-rich queries instead of a single brittle phrase. Use exact phrases only when the user asks for them.
  • Do not route normal search tasks to the Datasets API. Use datasets only when the user truly needs offline release files or diffs.
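Several of these rules (field projection, optional API key, retry with backoff) can be combined in one small request helper. The sketch below uses only the standard library. The get_json name, the retry limits, and the injectable opener and sleep parameters are illustrative assumptions, and the x-api-key header name should be checked against the current Semantic Scholar documentation.

```python
import json
import os
import time
import urllib.error
import urllib.parse
import urllib.request

# Status codes treated as retryable: rate limiting plus transient server errors.
RETRYABLE = {429, 500, 502, 503, 504}


def get_json(url, params, max_retries=5, base_delay=1.0,
             opener=urllib.request.urlopen, sleep=time.sleep):
    """GET url with query params, retrying retryable HTTP errors with backoff."""
    headers = {}
    api_key = os.environ.get("SEMANTIC_SCHOLAR_API_KEY")
    if api_key:
        # Header name assumed from the public API docs.
        headers["x-api-key"] = api_key
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
    request = urllib.request.Request(full_url, headers=headers)
    for attempt in range(max_retries + 1):
        try:
            with opener(request) as response:
                return json.load(response)
        except urllib.error.HTTPError as err:
            # Give up on non-retryable codes or once retries are exhausted.
            if err.code not in RETRYABLE or attempt == max_retries:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
```

A call might look like get_json("https://api.semanticscholar.org/graph/v1/paper/search", {"query": "graph neural networks", "fields": "paperId,title,year"}). Only HTTP status errors are retried here; connection-level failures and any Retry-After header are left to the caller.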

Typical Workflows

Search papers interactively

Use this for "find papers about X", "search by title keywords", or "filter by year/citations/open access".

  • Start with paper/search if the user expects inspection and refinement.
  • Keep fields minimal.
  • If the search must collect many records, switch to paper/search/bulk.
  • Read references/query-recipes.md for query patterns.
  • Read references/graph-api.md for endpoint details.
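As a rough illustration of an interactive search request, the sketch below builds a paper/search URL with a minimal field list and optional filters. The build_search_url helper and its defaults are hypothetical. The parameter names (year, minCitationCount, openAccessPdf, limit, offset) follow the Graph API as I understand it and should be confirmed in references/graph-api.md.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(query, fields=("paperId", "title", "year"), year=None,
                     min_citations=None, open_access_only=False,
                     limit=20, offset=0):
    """Build a paper/search URL with a small field list and optional filters."""
    params = {
        "query": query,
        "fields": ",".join(fields),  # keep the projection minimal
        "limit": limit,
        "offset": offset,
    }
    if year:
        params["year"] = year  # e.g. "2020-2023" or "2019"
    if min_citations is not None:
        params["minCitationCount"] = min_citations
    url = f"{SEARCH_URL}?{urlencode(params)}"
    if open_access_only:
        # Assumed to be a value-less flag parameter.
        url += "&openAccessPdf"
    return url
```

Raising offset by limit on each call walks through further pages while the user refines the query.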

Harvest a broad paper set

Use this for literature review corpora, screening spreadsheets, or downstream ranking.

  • Prefer scripts/semantic_scholar_bulk_search.py.
  • Save raw output to JSONL and only then export CSV if the user needs tabular review.
  • Expose query, year filter, sort, and field selection as parameters instead of hardcoding them.
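The contents of scripts/semantic_scholar_bulk_search.py are not shown here, so the following is only a sketch of the pattern these bullets describe, not the script itself. It assumes a caller-supplied fetch_page(params) function that returns one decoded paper/search/bulk response, with a data list and an optional token. The harvest and jsonl_to_csv names are hypothetical.

```python
import csv
import json


def harvest(fetch_page, query, fields, jsonl_path,
            year=None, sort=None, max_records=10000):
    """Follow continuation tokens and write each raw record as one JSONL line."""
    params = {"query": query, "fields": ",".join(fields)}
    if year:
        params["year"] = year
    if sort:
        params["sort"] = sort
    written = 0
    with open(jsonl_path, "w", encoding="utf-8") as out:
        # Whole pages are written, so the total may slightly exceed max_records.
        while written < max_records:
            page = fetch_page(params)
            for record in page.get("data") or []:
                out.write(json.dumps(record, ensure_ascii=False) + "\n")
                written += 1
            token = page.get("token")
            if not token:
                break  # no token means there are no further pages
            params = {**params, "token": token}
    return written


def jsonl_to_csv(jsonl_path, csv_path, columns):
    """Flatten the saved raw records into a CSV with the chosen columns."""
    with open(jsonl_path, encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
        writer.writeheader()
        for line in src:
            writer.writerow(json.loads(line))
```

Because the CSV is derived from the JSONL file rather than from live responses, the export can be regenerated with different columns without repeating the harvest. Nested values such as author lists would need their own flattening step.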

Fetch by known IDs

Use paper/batch or author/batch when IDs are already known from previous steps or user input.

  • Batch fetch is usually better than repeated single-record lookups.
  • Ask for only the fields required for the analysis or export.
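A batch fetch can be sketched as splitting the known IDs into chunks and preparing one POST per chunk. The batch_requests helper is hypothetical. The default chunk size of 500 reflects the paper/batch limit as I understand it and should be confirmed in references/graph-api.md.

```python
import json
from urllib.parse import urlencode

PAPER_BATCH_URL = "https://api.semanticscholar.org/graph/v1/paper/batch"


def batch_requests(ids, fields, chunk_size=500):
    """Yield (url, body) pairs, one POST request per chunk of IDs."""
    # Fields go in the query string; the IDs go in the JSON body.
    query = urlencode({"fields": ",".join(fields)})
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        body = json.dumps({"ids": chunk}).encode("utf-8")
        yield f"{PAPER_BATCH_URL}?{query}", body
```

Each pair can be sent as a POST with a Content-Type of application/json. For 1,200 IDs this produces three requests instead of 1,200 single-record lookups.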

Expand from seed papers

Use recommendations when the user says things like "find papers similar to this", "expand from these seed papers", or "build a related-work set".

  • Use the Recommendations API instead of trying to approximate similarity with a new keyword query.
  • Keep the seed-paper IDs and result set separate from keyword-search results so provenance stays clear.
  • Read references/recommendations-api.md.
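One way to keep provenance clear is to tag every record with how it was found before mixing result sets. The sketch below prepares a multi-seed recommendations request body and wraps results with their source. The helper names and the source, seedPaperIds, and paper keys are hypothetical. The positivePaperIds and negativePaperIds body fields follow the Recommendations API as I understand it, so check references/recommendations-api.md.

```python
import json

RECS_URL = "https://api.semanticscholar.org/recommendations/v1/papers/"


def recommendation_body(seed_ids, negative_ids=()):
    """Build the JSON body for a multi-seed recommendations POST."""
    payload = {
        "positivePaperIds": list(seed_ids),
        "negativePaperIds": list(negative_ids),
    }
    return json.dumps(payload).encode("utf-8")


def tag_provenance(records, source, seeds=None):
    """Wrap each record with its origin so result sets stay distinguishable."""
    return [
        {"source": source, "seedPaperIds": list(seeds or []), "paper": record}
        for record in records
    ]
```

Recommendation results would be tagged with source="recommendations" and the seed IDs, while keyword-search results would carry source="keyword_search" and an empty seed list.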

Pull datasets or release diffs

Use the Datasets API only for offline ingestion or change tracking between releases.

  • Read references/datasets-api.md.
  • Confirm storage expectations before downloading large files.
  • Document the exact release identifiers used in the workflow.
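Documenting release identifiers can be as simple as writing a small manifest next to the downloaded files. The sketch below builds release and diff URLs and a manifest record. The helper names and manifest keys are hypothetical, and the URL shapes follow the Datasets API as I understand it, so confirm them in references/datasets-api.md.

```python
from urllib.parse import quote

DATASETS = "https://api.semanticscholar.org/datasets/v1"


def release_url(release_id, dataset=None):
    """URL for a release listing, or for one dataset within that release."""
    url = f"{DATASETS}/release/{quote(release_id, safe='')}"
    if dataset:
        url += f"/dataset/{quote(dataset, safe='')}"
    return url


def diff_url(start_release, end_release, dataset):
    """URL for the diffs of one dataset between two releases."""
    return (
        f"{DATASETS}/diffs/{quote(start_release, safe='')}"
        f"/to/{quote(end_release, safe='')}/{quote(dataset, safe='')}"
    )


def manifest(dataset, start_release, end_release=None):
    """Record exactly which releases a workflow used, for reproducibility."""
    record = {"dataset": dataset, "release": start_release}
    if end_release:
        record["diff_to"] = end_release
        record["diff_url"] = diff_url(start_release, end_release, dataset)
    else:
        record["release_url"] = release_url(start_release, dataset)
    return record
```

Saving the manifest as JSON alongside the data makes it possible to tell later which snapshot or diff range a corpus was built from.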

References

Script

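Version history

1 version in total

  • v1.0.0 (current)
    2026-03-30 01:43, security scans: safe, safe

Security checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk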
