Convert raw literature inputs into standardized records safe for project memory, paper databases, and downstream synthesis pipelines.
One of the following is required:
- pdf_path: local path to a PDF file
- url: link to the paper or article
- raw_text: extracted or pasted text
- metadata_blob: existing metadata dict

Plus:
- project_id: required for any writeback
- source_type: one of pdf, doi, url, text, metadata
- tags (optional): list of strings for categorization

Return a structured object:
title: string
authors: string[] | null
year: number | null
source: string # journal, conference, preprint, etc.
doi_or_url: string | null
project_id: string
paper_type: string # experimental, theoretical, review, etc.
material_system: string | null # e.g. "perovskite solar cell", "graphene FET"
device_type: string | null # e.g. "FTO/glass", "flexible substrate"
key_variables: string[] | null # independent variables studied
key_metrics: string[] | null # measured outcomes (PCE, mobility, etc.)
core_findings: string # 2-3 sentence neutral summary
claimed_mechanism: string | null
limitations: string | null
normalized_summary: string # 1-2 paragraph structured summary
uncertain_fields: string[] | null # fields that could not be verified
writeback_ready: boolean # true only if key identity fields present
writeback_payload: object # the record to write into project memory
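
As an illustration only, the object above could be modeled as a Python dataclass. The class name `PaperRecord`, the defaults, and the sample values are assumptions made for this sketch, not part of the skill's interface.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PaperRecord:
    """Hypothetical container mirroring the fields of the structured object."""

    # Fields the schema declares as non-null.
    title: str
    project_id: str
    source: str
    paper_type: str
    core_findings: str
    normalized_summary: str
    # Fields the schema allows to be null default to None.
    authors: Optional[list[str]] = None
    year: Optional[int] = None
    doi_or_url: Optional[str] = None
    material_system: Optional[str] = None
    device_type: Optional[str] = None
    key_variables: Optional[list[str]] = None
    key_metrics: Optional[list[str]] = None
    claimed_mechanism: Optional[str] = None
    limitations: Optional[str] = None
    uncertain_fields: Optional[list[str]] = None
    # Assumed default: a record is not ready until explicitly marked so.
    writeback_ready: bool = False
    writeback_payload: dict = field(default_factory=dict)


# Placeholder values for illustration; not taken from any real paper.
record = PaperRecord(
    title="Example title",
    project_id="demo-project",
    source="preprint",
    paper_type="experimental",
    core_findings="Placeholder summary of findings.",
    normalized_summary="Placeholder structured summary.",
    year=2021,
    uncertain_fields=["authors"],
)
as_dict = asdict(record)
```

Defaulting every nullable field to `None` keeps a partially parsed record constructible, so missing values stay explicit rather than being silently filled in.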
- Use null for missing fields, and list them in uncertain_fields.
- Keep core_findings and normalized_summary grounded in what the text actually says.
- If writeback_ready = false, list explicitly which fields are missing and why.

For PDFs, use the summarize skill or pdfplumber/PyMuPDF to extract text before processing.
Set writeback_ready based on the presence of the key identity fields.

If parsing is incomplete:
- Populate uncertain_fields with the list of fields that could not be determined
- Set writeback_ready = false when title, authors, or year are missing

For synthesis after normalization, see the research skill for paper synthesis workflows.
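
The rules for missing fields and `writeback_ready` can be sketched as a small post-processing step. This is a minimal sketch under stated assumptions: the function name `finalize_record`, the choice to treat empty strings and empty lists as missing, and the wording of the reason strings are all hypothetical.

```python
from __future__ import annotations

from typing import Any

# Identity fields named in the rules above.
IDENTITY_FIELDS = ("title", "authors", "year")
# Fields the schema marks as nullable.
NULLABLE_FIELDS = (
    "authors", "year", "doi_or_url", "material_system", "device_type",
    "key_variables", "key_metrics", "claimed_mechanism", "limitations",
)


def is_missing(value: Any) -> bool:
    """Assumption: None, empty strings, and empty lists all count as missing."""
    return value is None or value == "" or value == []


def finalize_record(record: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Null out missing fields, collect them, and set writeback_ready."""
    out = dict(record)  # work on a copy; the caller's dict is left untouched
    uncertain = list(out.get("uncertain_fields") or [])
    # dict.fromkeys removes duplicates while preserving order.
    for name in dict.fromkeys(IDENTITY_FIELDS + NULLABLE_FIELDS):
        if is_missing(out.get(name)):
            out[name] = None
            if name not in uncertain:
                uncertain.append(name)
    missing_identity = [n for n in IDENTITY_FIELDS if out[n] is None]
    out["uncertain_fields"] = uncertain or None
    out["writeback_ready"] = not missing_identity
    # Placeholder reason text; a real parser would report the actual cause.
    reasons = [f"{n}: not found in the parsed input" for n in missing_identity]
    return out, reasons


# Placeholder draft for illustration only.
draft = {"title": "Example title", "authors": [], "year": 2021,
         "project_id": "demo-project"}
record, reasons = finalize_record(draft)
```

Here `record["writeback_ready"]` is `False` because `authors` is empty, and `reasons` names that field so the caller can report what is missing and why.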