Overview

Authenticated Paper Fetcher

Use this skill to fetch individual academic PDFs through access the user already has: open access, institutional library proxy, an authenticated local browser profile, or a user-provided remote browser/CDP session.

Boundaries

Proceed only for content the user is authorized to access or that is open access.
Never ask for or store passwords, SSO secrets, 2FA codes, session cookies, or publisher API keys in chat.
Do not bypass paywalls, CAPTCHAs, rate limits, DRM, robots controls, or account restrictions.
Do not bulk-download unless the user confirms that the library or publisher license permits it. Prefer official text-and-data-mining (TDM) APIs for mining-scale requests.
Treat browser profiles, CDP endpoints, and downloaded PDFs as sensitive. Do not print tokens or cookie values.
Before using a cloud browser for university login, tell the user to confirm their school permits entering SSO credentials into that provider.
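
To illustrate the rule about not printing tokens, here is a minimal sketch of redacting an endpoint before it is logged. `redactEndpoint` is a hypothetical name, not part of the helper script, and the sketch assumes the host name alone is safe to show; drop that too if it is not.

```javascript
// Hypothetical helper (not part of fetch-paper.mjs): redact a CDP/WebSocket
// or proxy URL before logging it. Keeps only scheme and host; credentials,
// path, query string, and fragment are dropped because they may carry tokens.
function redactEndpoint(raw) {
  let parsed;
  try {
    parsed = new URL(raw);
  } catch {
    // Never echo input that could not be parsed; it may still contain a secret.
    return "<unparseable-endpoint>";
  }
  return `${parsed.protocol}//${parsed.host}/<redacted>`;
}

console.log(redactEndpoint("wss://user:secret@browser.example.com/session/abc?token=xyz"));
// prints "wss://browser.example.com/<redacted>"
```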

Preferred Workflow

Normalize the request to a DOI, publisher URL, or library permalink.
If the user gives an open-access URL or DOI, try normal direct retrieval first.
If institutional access is needed, prefer a local persistent browser profile:

node <skill-dir>/scripts/fetch-paper.mjs --url "<url>" --out papers --pause-for-login

If the user has a school proxy prefix, pass it explicitly:

node <skill-dir>/scripts/fetch-paper.mjs --doi "<doi>" --proxy-prefix "https://ezproxy.example.edu/login?url=" --out papers --pause-for-login

If the user provides a cloud browser or remote Chrome CDP endpoint, set PAPER_FETCH_CDP_ENDPOINT or use --cdp. Read references/cloud-browser-options.md first.
For SpringerLink URLs, the helper will try the PDF links on the page and then the usual link.springer.com/content/pdf/<doi>.pdf pattern after the authenticated page loads.
Save the PDF and sidecar metadata JSON. Report the saved path, final article URL, and any entitlement/login problem.
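
The URL handling in the workflow above can be sketched as follows. All three function names are hypothetical, and the helper script may do this differently. The sketch assumes a simplified DOI pattern, an EZproxy-style prefix ending in `url=` with the target appended unencoded, and a SpringerLink PDF path built from the DOI.

```javascript
// Simplified DOI pattern (assumption): "10.", a 4-9 digit registrant, "/", a suffix.
const DOI_PATTERN = /^10\.\d{4,9}\/\S+$/;

// Hypothetical: normalize user input to { doi, url }.
function normalizeRequest(input) {
  const trimmed = input.trim();
  // Strip common DOI prefixes: "doi:", "https://doi.org/", "http://dx.doi.org/".
  const bare = trimmed.replace(/^(doi:\s*|https?:\/\/(dx\.)?doi\.org\/)/i, "");
  if (DOI_PATTERN.test(bare)) {
    return { doi: bare, url: `https://doi.org/${bare}` };
  }
  // Otherwise expect a publisher URL or library permalink.
  const parsed = new URL(trimmed); // throws on malformed input
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
    throw new Error(`Unsupported URL scheme: ${parsed.protocol}`);
  }
  return { doi: null, url: parsed.href };
}

// Hypothetical: prepend the school's proxy prefix when one is given.
function applyProxyPrefix(targetUrl, proxyPrefix) {
  return proxyPrefix ? `${proxyPrefix}${targetUrl}` : targetUrl;
}

// Hypothetical: candidate SpringerLink PDF URL for a DOI.
function springerPdfUrl(doi) {
  return `https://link.springer.com/content/pdf/${doi}.pdf`;
}

const request = normalizeRequest("doi:10.1007/s00134-020-06033-2");
console.log(applyProxyPrefix(request.url, "https://ezproxy.example.edu/login?url="));
console.log(springerPdfUrl(request.doi));
```

Keeping normalization separate from proxying means the same DOI can be retried directly first and through the proxy only if institutional access turns out to be needed.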

Helper Script

Use scripts/fetch-paper.mjs for repeatable retrieval.

Examples:

node <skill-dir>/scripts/fetch-paper.mjs --doi "10.1007/s00134-020-06033-2" --out papers --pause-for-login
node <skill-dir>/scripts/fetch-paper.mjs --url "https://link.springer.com/article/10.1007/s00134-020-06033-2" --out papers --headless
$env:PAPER_FETCH_CDP_ENDPOINT="wss://<redacted-remote-browser-endpoint>"
node <skill-dir>/scripts/fetch-paper.mjs --url "https://link.springer.com/article/<doi>" --out papers
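
The examples above imply two connection modes: a remote CDP session or a local browser profile. Below is a minimal sketch of how that choice could be resolved. `resolveBrowserTarget` and the returned field names are hypothetical, and the precedence (the `--cdp` flag over the `PAPER_FETCH_CDP_ENDPOINT` variable) is an assumption, not confirmed behavior of the helper.

```javascript
// Hypothetical: decide between a remote CDP session and a local profile.
// `options.cdp` stands for the --cdp flag value; `env` is typically process.env.
function resolveBrowserTarget(options, env) {
  // Assumption: an explicit flag wins over the environment variable.
  const endpoint = options.cdp || env.PAPER_FETCH_CDP_ENDPOINT || "";
  if (!endpoint) {
    return { mode: "local-profile", endpoint: null };
  }
  // Accept WebSocket or HTTP(S) endpoints only.
  if (!/^(wss?|https?):\/\//i.test(endpoint)) {
    throw new Error("CDP endpoint must start with ws://, wss://, http://, or https://");
  }
  return { mode: "cdp", endpoint };
}

const target = resolveBrowserTarget({}, { PAPER_FETCH_CDP_ENDPOINT: "wss://browser.example.com/session" });
console.log(target.mode); // prints "cdp"
```

With Playwright, the `cdp` mode would typically map to `chromium.connectOverCDP(endpoint)` and the `local-profile` mode to `chromium.launchPersistentContext(userDataDir, options)`. Avoid logging the endpoint itself, since it may embed a session token.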

If Node reports that Playwright is missing, ask permission before installing dependencies. Typical local setup:

npm install --save-dev playwright
npx playwright install chromium

For cloud-only CDP usage, playwright-core may be sufficient if the provider supplies the browser:

npm install --save-dev playwright-core

Handling Login

If the script says access is unavailable, ask the user to log in through the opened local or cloud browser session, then rerun the same command.
Use --pause-for-login only when the user is ready to complete SSO in the browser.
Use --login-only to warm the profile/session without downloading.
Do not automate 2FA, CAPTCHA solving, hidden proxy rotation, or anti-bot evasion.

When Retrieval Fails

Report the exact non-sensitive cause:

no PDF link found on the authenticated page
HTTP status such as 403, 401, or 404
publisher says the article is not included in the user's entitlement
Playwright/browser dependency is unavailable
cloud browser endpoint is expired or not connected

Then suggest a lawful next path: user reauthenticates, provides an EZproxy/OpenAthens URL, uses a library permalink, uses an official publisher API/TDM route, or manually supplies the PDF.
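
The reporting step above can be sketched as a small lookup from an observed outcome to a non-sensitive cause and a next step. `describeFailure` and its input fields are hypothetical, and the pairing of each cause with a particular next step is an illustrative assumption; the helper script may report failures in a different shape.

```javascript
// Hypothetical: map an observed outcome to a reportable cause and next step.
// Returns null when nothing in the outcome indicates a failure.
function describeFailure(outcome) {
  const {
    status = null,            // HTTP status of the PDF request, if any
    pdfLinkFound = true,      // whether the authenticated page exposed a PDF link
    entitled = true,          // false if the publisher reports no entitlement
    browserAvailable = true,  // false if Playwright or the browser is missing
    endpointConnected = true, // false if the cloud browser endpoint is dead
  } = outcome;

  if (!browserAvailable) {
    return { cause: "Playwright/browser dependency is unavailable",
             next: "ask permission before installing dependencies" };
  }
  if (!endpointConnected) {
    return { cause: "cloud browser endpoint is expired or not connected",
             next: "ask the user to provide a fresh remote browser session" };
  }
  if (status === 401 || status === 403) {
    return { cause: `HTTP ${status}`,
             next: "ask the user to reauthenticate or provide an EZproxy/OpenAthens URL" };
  }
  if (status === 404) {
    return { cause: "HTTP 404",
             next: "recheck the DOI or use a library permalink" };
  }
  if (!entitled) {
    return { cause: "article is not included in the user's entitlement",
             next: "use an official publisher API/TDM route or ask the user to supply the PDF" };
  }
  if (!pdfLinkFound) {
    return { cause: "no PDF link found on the authenticated page",
             next: "use a library permalink or ask the user to supply the PDF" };
  }
  return null;
}

console.log(describeFailure({ status: 403 }).cause); // prints "HTTP 403"
```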

Version History

1 version in total

v1.0.0 Initial release (current)

2026-04-22 10:51

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
