Use this skill when the user wants to generate research ideas from a research domain, optionally with seed ideas. The skill retrieves recent top-conference papers, builds a domain development tree, discovers open problems through multi-agent brainstorming, ranks top questions, iteratively generates and evaluates ideas, and outputs final research questions and ideas with a concise process record.
This skill is designed for research ideation, not for claiming definitive literature coverage. When evidence is incomplete, explicitly say so and record the limitation.
The user must provide:
domain: <research domain>
The user may optionally provide:
seed_ideas:
- <initial thought, hypothesis, direction, or constraint>
constraints:
target_year_range: <default: recent 3 complete publication years>
min_papers: <default: 50>
venues: <default: ICLR, ICML, NeurIPS, KDD, WWW, SIGIR, ACL, EMNLP, NAACL, CVPR, ICCV, ECCV, AAAI, IJCAI>
output_language: <default: zh-CN>
max_tree_nodes: <default: 60>
min_tree_nodes: <default: 15>
final_question_count: <default: 5>
idea_retry_limit: <default: 5>
replacement_limit: <default: 3>
If the domain is missing, ask the user for it. If seed ideas are missing, proceed without asking.
At the start of each run, create or verify a dated run directory under outputs/.
Resolve:
RUN_DATE = current local date formatted as YYYY-M-D, for example 2026-5-26
RUN_OUTPUT_DIR = outputs/<RUN_DATE>
All run artifacts must be written under RUN_OUTPUT_DIR. Do not write directly under the top-level outputs/ directory.
Directory structure:
outputs/
2026-5-26/
corpus/
tree_structure/
questions/
ideas/
logs/
final/
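The run directory setup above can be sketched as follows. This is a minimal illustration, assuming the six subdirectories listed in the structure; the helper names `run_date_string` and `ensure_run_dirs` are hypothetical and not part of the skill.

```python
from datetime import date
from pathlib import Path

# Subdirectories taken from the directory structure listed above.
SUBDIRS = ("corpus", "tree_structure", "questions", "ideas", "logs", "final")


def run_date_string(today: date) -> str:
    """Format a date as YYYY-M-D (no zero padding), e.g. 2026-5-26."""
    return f"{today.year}-{today.month}-{today.day}"


def ensure_run_dirs(root: Path, today: date) -> Path:
    """Create (or verify) outputs/<RUN_DATE>/ and its subdirectories under root."""
    run_dir = root / "outputs" / run_date_string(today)
    for name in SUBDIRS:
        # exist_ok=True makes the call safe when the directory already exists.
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir
```

Calling `ensure_run_dirs(Path("."), date.today())` at the start of a run would return the path to use as RUN_OUTPUT_DIR.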
Required final outputs:
outputs/<RUN_DATE>/tree_structure/domain_tree.md
outputs/<RUN_DATE>/tree_structure/domain_tree.json
outputs/<RUN_DATE>/tree_structure/node_paper_mapping.csv
outputs/<RUN_DATE>/final/final_questions_and_ideas.md
outputs/<RUN_DATE>/final/process_summary.md
outputs/<RUN_DATE>/logs/process_record.md
Recommended intermediate outputs:
outputs/<RUN_DATE>/corpus/paper_index.csv
outputs/<RUN_DATE>/corpus/paper_cards.jsonl
outputs/<RUN_DATE>/corpus/retrieval_report.md
outputs/<RUN_DATE>/questions/all_node_questions.json
outputs/<RUN_DATE>/questions/all_node_questions.md
outputs/<RUN_DATE>/questions/top_questions.json
outputs/<RUN_DATE>/questions/top_questions.md
outputs/<RUN_DATE>/ideas/candidate_ideas_round_<N>.json
outputs/<RUN_DATE>/ideas/novelty_checks_round_<N>.json
outputs/<RUN_DATE>/ideas/evaluation_round_<N>.json
outputs/<RUN_DATE>/ideas/rejection_memos_round_<N>.json
Before analysis, read and operationalize this file:
references/evaluation_of_good_ideas.md
Convert it into internal checklists. Do not merely summarize it. Use evaluation_of_good_ideas.md to score, reject, and iterate ideas.
If a reference file is missing, record the error in the process record under outputs/<RUN_DATE>/logs/, use the fallback criteria embedded in this skill, and explicitly mention the fallback in the final process summary.
Convert the user's request into a run configuration:
{
"domain": "...",
"seed_ideas": ["..."],
"constraints": {
"target_year_range": "recent_3_complete_publication_years",
"min_papers": 50,
"venues": ["ICLR", "ICML", "NeurIPS", "KDD", "WWW", "SIGIR", "ACL", "EMNLP", "NAACL", "CVPR", "ICCV", "ECCV", "AAAI", "IJCAI"],
"output_language": "zh-CN",
"max_tree_nodes": 60,
"min_tree_nodes": 15,
"final_question_count": 5,
"idea_retry_limit": 5,
"replacement_limit": 3
}
}
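One way to assemble this configuration is to merge user-supplied constraints over the documented defaults. The sketch below assumes that a missing domain should raise an error so the caller can ask the user for it; `build_run_config` is a hypothetical helper name.

```python
from copy import deepcopy

# Defaults copied from the constraint list in this skill.
DEFAULT_CONSTRAINTS = {
    "target_year_range": "recent_3_complete_publication_years",
    "min_papers": 50,
    "venues": ["ICLR", "ICML", "NeurIPS", "KDD", "WWW", "SIGIR", "ACL",
               "EMNLP", "NAACL", "CVPR", "ICCV", "ECCV", "AAAI", "IJCAI"],
    "output_language": "zh-CN",
    "max_tree_nodes": 60,
    "min_tree_nodes": 15,
    "final_question_count": 5,
    "idea_retry_limit": 5,
    "replacement_limit": 3,
}


def build_run_config(domain, seed_ideas=None, constraints=None):
    """Return a run configuration; the domain is required, the rest is optional."""
    if not domain or not domain.strip():
        # Assumption: the caller reacts to this by asking the user for a domain.
        raise ValueError("domain is required; ask the user for it")
    merged = deepcopy(DEFAULT_CONSTRAINTS)
    merged.update(constraints or {})  # user values override the defaults
    return {
        "domain": domain.strip(),
        "seed_ideas": list(seed_ideas or []),  # missing seed ideas are fine
        "constraints": merged,
    }
```

Copying the defaults before merging keeps one run's overrides from leaking into the next.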
Resolve “recent 3 years” as the most recent three complete publication years. If the current year's conference proceedings are incomplete, prefer the latest complete three years and document the decision.
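The year resolution can be sketched as a small function. Whether the current year's proceedings are complete is a judgment the agent must make; here it is passed in as the flag `current_year_complete`, which, like the function name, is a hypothetical choice for illustration.

```python
def recent_complete_years(current_year: int,
                          current_year_complete: bool = False,
                          span: int = 3) -> list[int]:
    """Return the most recent `span` complete publication years, oldest first."""
    # If the current year's proceedings are incomplete, end at the previous year.
    last = current_year if current_year_complete else current_year - 1
    return list(range(last - span + 1, last + 1))
```

For a run in 2026 with incomplete 2026 proceedings, this yields 2023, 2024, and 2025. The choice should still be documented in the process record.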
Create the output directories. Start an append-only process record at:
outputs/<RUN_DATE>/logs/process_record.md
The process record must include:
Retrieve at least min_papers valid recent papers from top venues relevant to the domain.
Recommended authoritative sources:
Run multi-pass retrieval:
For every candidate paper, collect:
{
"paper_id": "P-001",
"title": "...",
"authors": ["..."],
"year": 2024,
"venue": "...",
"abstract": "...",
"url": "...",
"source_url": "...",
"venue_verified_by": "...",
"evidence_status": "verified_or_uncertain_or_auxiliary",
"keywords": ["..."],
"relevance_score": 0.0,
"assigned_topics": []
}
Filtering rules:
If fewer than min_papers valid papers remain, apply recovery steps in order:
If still fewer than min_papers papers are available, stop full analysis and write:
outputs/<RUN_DATE>/logs/retrieval_failure_report.md
Include searched queries, venues, years, valid paper count, likely reason, and advice for adjusting the domain.
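The validity check and the stop condition can be sketched as below. The detailed filtering rules are not reproduced in this section, so the sketch assumes only the criteria stated here: year range, venue list, and evidence status. Treating only `verified` records as valid is an assumption exposed through the `accepted_statuses` parameter; both function names are hypothetical.

```python
def filter_valid_papers(papers, years, venues, accepted_statuses=("verified",)):
    """Keep papers inside the year range, from listed venues, with accepted status."""
    allowed_years = set(years)
    allowed_venues = {v.lower() for v in venues}  # case-insensitive venue match
    return [
        p for p in papers
        if p.get("year") in allowed_years
        and str(p.get("venue", "")).lower() in allowed_venues
        and p.get("evidence_status") in accepted_statuses
    ]


def needs_failure_report(valid_papers, min_papers=50):
    """True when the valid corpus is too small to continue with full analysis."""
    return len(valid_papers) < min_papers
```

When `needs_failure_report` still returns true after the recovery steps, the run would write retrieval_failure_report.md instead of continuing.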
For each valid paper, create a structured card:
{
"paper_id": "P-001",
"title": "...",
"problem": "...",
"core_method": "...",
"main_contribution": "...",
"technical_assumptions": ["..."],
"datasets_or_benchmarks": ["..."],
"claimed_improvements": ["..."],
"limitations": ["..."],
"future_work_signals": ["..."],
"relevance_to_domain": "...",
"possible_idea_hooks": ["..."]
}
Rules:
Mark unsupported claims as uncertain.
Write the results to:
outputs/<RUN_DATE>/corpus/paper_cards.jsonl
outputs/<RUN_DATE>/corpus/paper_index.csv
outputs/<RUN_DATE>/corpus/retrieval_report.md
Organize the literature into a tree-like pyramid.
Default levels:
Level 0: user domain
Level 1: major research directions
Level 2: subdirections
Level 3: specific tasks, method families, problem types, or application scenarios
Level 4+: deeper breakdown only when evidence supports it
Each tree node must follow:
{
"node_id": "N-001",
"level": 0,
"title": "...",
"definition": "...",
"parent_id": null,
"child_ids": ["N-002"],
"representative_papers": ["P-001", "P-002"],
"dominant_methods": ["..."],
"common_assumptions": ["..."],
"known_limitations": ["..."],
"open_problem_hints": ["..."]
}
Tree constraints:
Keep the node count between min_tree_nodes and max_tree_nodes when possible.
Output:
outputs/<RUN_DATE>/tree_structure/domain_tree.md
outputs/<RUN_DATE>/tree_structure/domain_tree.json
outputs/<RUN_DATE>/tree_structure/node_paper_mapping.csv
The markdown tree must show node path, representative papers, common assumptions, limitations, and open-problem hints.
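The node path shown in the markdown tree can be derived by walking `parent_id` links up to the root. The sketch below assumes nodes are stored in a dict keyed by `node_id`; the ` > ` separator, the cycle check, and the function name `node_path` are illustrative choices, not requirements of the skill.

```python
def node_path(node_id, nodes_by_id, separator=" > "):
    """Return the titles from the root down to node_id, joined by separator."""
    titles = []
    seen = set()
    current = node_id
    while current is not None:
        if current in seen:
            # A parent_id loop would make the tree invalid.
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        node = nodes_by_id[current]
        titles.append(node["title"])
        current = node["parent_id"]  # None at the Level 0 root
    return separator.join(reversed(titles))
```

The same walk also gives a cheap consistency check: the number of titles minus one should equal the node's `level`.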
Simulate or instantiate three sub-agents with distinct roles.
Focus on technical assumptions, failure modes, datasets, metrics, reproducibility, computational cost, and hidden experimental weaknesses.
Borrow methods, concepts, tasks, or evaluation protocols from a different field.
Pick a source field different from the user's domain:
Focus on non-obvious combinations, new problem definitions, new benchmarks, new mechanisms, new theoretical framings, and high-risk high-reward ideas.
Define active node:
A node with enough representative papers, known limitations, or open-problem hints to support at least one concrete research problem.
Usually include Level 1 and below. Include Level 0 only if it supports meaningful field-level problems.
For each candidate active node, decide evidence density:
high: at least 5 representative papers and multiple concrete limitations or open-problem hints
medium: 2-4 representative papers and at least one concrete limitation or open-problem hint
low: fewer than 2 representative papers or only vague limitations
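These thresholds can be expressed as a small classifier. The sketch makes two assumptions that the text leaves open: "multiple" is read as at least two concrete limitations or hints, and a node with five or more papers but only one concrete hint falls back to medium. The function name is hypothetical.

```python
def evidence_density(num_papers: int, num_concrete_hints: int) -> str:
    """Classify a node as 'high', 'medium', or 'low' evidence density.

    num_concrete_hints counts concrete limitations plus open-problem hints;
    vague limitations are not counted.
    """
    if num_papers < 2 or num_concrete_hints < 1:
        return "low"
    if num_papers >= 5 and num_concrete_hints >= 2:
        return "high"
    # Assumption: everything else with at least 2 papers and 1 hint is medium.
    return "medium"
```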
Generate 0-3 final problems per candidate active node:
For each active node, run three rounds:
Each sub-agent proposes candidate problems with:
{
"problem_statement": "...",
"why_it_matters": "...",
"evidence_from_papers": ["P-001", "P-002"],
"current_limitation": "...",
"affected_node_id": "N-010",
"level": 2
}
Each sub-agent critiques the others' problems using:
The main agent synthesizes the final problems for the node according to its evidence density.
Each final node problem must be:
If a problem duplicates an earlier one, merge it. Generate a replacement only when the node still has enough evidence for another independent problem.
Output:
outputs/<RUN_DATE>/questions/all_node_questions.json
outputs/<RUN_DATE>/questions/all_node_questions.md
All generated questions are scored independently by the three sub-agents.
Use 1-5 scoring:
Significance
Novelty
Evidence
Tractability
Idea Potential
Fit to Good-Idea Checklist
Each score entry:
{
"question_id": "Q-001",
"scores": {
"significance": 5,
"novelty": 4,
"evidence": 5,
"tractability": 4,
"idea_potential": 5,
"fit_to_good_idea": 4
},
"rationale": "..."
}
The main agent computes:
final_score = mean(all_agent_dimension_scores)
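A minimal sketch of this aggregation, assuming each sub-agent returns one score entry in the format above and that all dimension scores are pooled with equal weight:

```python
from statistics import mean


def final_score(agent_entries):
    """Average every dimension score from every agent for one question."""
    values = [
        score
        for entry in agent_entries          # one entry per sub-agent
        for score in entry["scores"].values()
    ]
    return mean(values)
```

With the example entry above (5, 4, 5, 4, 5, 4) as the only input, the result is 4.5. Pooling all scores is equivalent to averaging per-agent means only when every agent scores the same set of dimensions.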
Apply diversity constraints:
Select the top final_question_count questions (default 5).
Output:
outputs/<RUN_DATE>/questions/top_questions.json
outputs/<RUN_DATE>/questions/top_questions.md
For each selected top question, run three rounds.
The sub-agents should cover different solution origins:
Each proposal must include:
{
"idea_title": "...",
"target_question_id": "Q-001",
"core_hypothesis": "...",
"method_overview": "...",
"what_is_new": "...",
"why_it_might_work": "...",
"required_resources": "...",
"possible_experiments": ["..."],
"risks": ["..."]
}
Agents critique each proposal:
The main agent synthesizes one candidate idea per top question with:
Output each round to:
outputs/<RUN_DATE>/ideas/candidate_ideas_round_<N>.json
Before scoring a candidate idea, run an idea-specific novelty check. Do not rely only on the initial domain corpus.
For each candidate idea:
Label the novelty evidence status as verified, uncertain, or insufficient.
Each novelty check must produce:
{
"idea_id": "I-001",
"queries": ["..."],
"closest_prior_work": [
{
"paper_id_or_url": "...",
"title": "...",
"year": 2024,
"venue": "...",
"why_close": "...",
"delta": "..."
}
],
"novelty_evidence_status": "verified_or_uncertain_or_insufficient",
"novelty_risk": "..."
}
If no credible closest prior work can be found, do not automatically treat the idea as novel. Mark the novelty evidence as uncertain and record the search limitation.
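This guard can be sketched as a post-processing step on a novelty check record. Storing the search limitation in the existing `novelty_risk` field is an assumption made for illustration, as is the function name `finalize_novelty_check`.

```python
def finalize_novelty_check(check: dict) -> dict:
    """Downgrade the status when no closest prior work was found."""
    result = dict(check)  # shallow copy; the input record is left unchanged
    if not result.get("closest_prior_work"):
        # An empty result is not evidence of novelty.
        result["novelty_evidence_status"] = "uncertain"
        result["novelty_risk"] = (
            "No credible closest prior work found; "
            "search coverage may be incomplete."
        )
    return result
```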
Save novelty checks to:
outputs/<RUN_DATE>/ideas/novelty_checks_round_<N>.json
Use references/evaluation_of_good_ideas.md as the primary rubric. If unavailable, use the fallback rubric in this skill.
Default evaluation record:
{
"idea_id": "I-001",
"scores": {
"novelty": 0,
"importance": 0,
"feasibility": 0,
"clarity": 0,
"technical_depth": 0,
"evaluation_plan": 0,
"difference_from_existing_work": 0,
"risk_awareness": 0
},
"score_scale": "0-100_per_dimension",
"weights": {
"novelty": 0.15,
"importance": 0.15,
"feasibility": 0.12,
"clarity": 0.10,
"technical_depth": 0.15,
"evaluation_plan": 0.13,
"difference_from_existing_work": 0.10,
"risk_awareness": 0.10
},
"total_score": 0,
"decision": "pass",
"novelty_evidence_status": "verified_or_uncertain_or_insufficient",
"rejection_reasons": ["..."],
"must_fix": ["..."]
}
All dimensions are scored from 0 to 100. Compute total_score as the weighted sum defined in references/evaluation_of_good_ideas.md, rounded to the nearest integer.
Default pass rule:
total_score >= 75
AND novelty >= 80
AND importance >= 80
AND feasibility >= 70
AND evaluation_plan >= 60
AND the idea is not a shallow combination of existing work
AND the idea has a clear experimental validation path
AND the idea can articulate contribution beyond prior work
AND novelty_evidence_status is not insufficient
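The default pass rule can be sketched as below, using the weights from the default evaluation record. The authoritative weighted sum lives in references/evaluation_of_good_ideas.md, which is not reproduced here, so these weights are an assumption if that file differs. The three qualitative conditions are passed in as booleans because they require judgment; rounding halves upward is also an assumption.

```python
import math

# Weights copied from the default evaluation record; they sum to 1.0.
WEIGHTS = {
    "novelty": 0.15, "importance": 0.15, "feasibility": 0.12, "clarity": 0.10,
    "technical_depth": 0.15, "evaluation_plan": 0.13,
    "difference_from_existing_work": 0.10, "risk_awareness": 0.10,
}
# Per-dimension minimums from the default pass rule.
MINIMUMS = {"novelty": 80, "importance": 80, "feasibility": 70, "evaluation_plan": 60}


def total_score(scores: dict) -> int:
    """Weighted sum of 0-100 dimension scores, rounded to the nearest integer."""
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return math.floor(weighted + 0.5)  # round half up (assumption)


def passes(scores, novelty_evidence_status, shallow_combination,
           has_validation_path, articulates_contribution) -> bool:
    """Apply every clause of the default pass rule."""
    return (
        total_score(scores) >= 75
        and all(scores[name] >= floor for name, floor in MINIMUMS.items())
        and not shallow_combination
        and has_validation_path
        and articulates_contribution
        and novelty_evidence_status != "insufficient"
    )
```

Note that an idea with `uncertain` novelty evidence can still pass under this rule; only `insufficient` blocks it.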
If an idea fails, generate a rejection memo:
{
"idea_id": "I-001",
"target_question_id": "Q-001",
"failed_criteria": ["..."],
"rejection_reasons": ["..."],
"must_fix": ["..."],
"forbidden_retry_patterns": ["..."]
}
Then rerun the three-round idea brainstorming for that question.
Retry rules:
Retry each question at most idea_retry_limit times (default 5).
Replacement rules:
If a question fails to produce a passing idea after maximum retries:
Mark the question as retired.
Replace retired questions at most replacement_limit times (default 3).
If fewer than 5 passing ideas are obtained after all retries and replacements, final outputs must state the true number of passing ideas. Never present failed ideas as accepted final ideas.
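The retry loop for a single question can be sketched as follows. The `generate_idea` and `evaluate` callables stand in for the brainstorming and evaluation stages and are hypothetical, as is `run_question`. The sketch treats idea_retry_limit as the total number of attempts, which is one possible reading of the limit.

```python
def run_question(question_id, generate_idea, evaluate, idea_retry_limit=5):
    """Try to obtain a passing idea; retire the question when attempts run out."""
    memos = []  # rejection memos are fed back into later attempts
    for attempt in range(1, idea_retry_limit + 1):
        idea = generate_idea(question_id, memos)
        evaluation = evaluate(idea)
        if evaluation.get("decision") == "pass":
            return {"status": "passed", "idea": idea,
                    "attempts": attempt, "memos": memos}
        memos.append({
            "idea_id": idea.get("idea_id"),
            "target_question_id": question_id,
            "rejection_reasons": evaluation.get("rejection_reasons", []),
            "must_fix": evaluation.get("must_fix", []),
        })
    # No passing idea: the question is retired and may be replaced by the caller.
    return {"status": "retired", "idea": None,
            "attempts": idea_retry_limit, "memos": memos}
```

A caller would count `retired` results against replacement_limit and report only `passed` ideas in the final outputs.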
Maintain the process record at outputs/<RUN_DATE>/logs/process_record.md throughout the run.
Use this structure:
# Process Record
## 1. Input
- Domain:
- Seed ideas:
- Constraints:
## 2. Reference Loading
- evaluation_of_good_ideas:
- Fallback used:
## 3. Retrieval
- Year range:
- Venues:
- Search queries:
- Retrieved papers:
- Valid papers after filtering:
- Recovery actions if any:
## 4. Paper Analysis
- Paper cards generated:
- Uncertain claims:
## 5. Tree Construction
- Number of nodes:
- Number of levels:
- Main directions:
- Merge/split decisions:
## 6. Question Generation
- Active nodes:
- Expected question range:
- Actual questions:
- Skipped low-evidence nodes:
- Deduplication notes:
## 7. Voting
- Top questions:
- Diversity constraints applied:
## 8. Idea Generation and Evaluation
- Question:
- Iteration count:
- Novelty check status:
- Rejections:
- Final decision:
## 9. Final Results
- Qualified ideas:
- Retired questions:
- Known limitations:
Keep the process record concise. Do not include full private deliberation. Include decisions, evidence, and outcomes.
Write:
outputs/<RUN_DATE>/final/final_questions_and_ideas.md
Required sections:
# Final Questions and Ideas
## 1. Run Summary
- Domain:
- Number of papers analyzed:
- Number of tree nodes:
- Number of candidate questions:
- Number of final qualified ideas:
## 2. Selected Top Questions
| Question ID | Node Path | Level | Question | Score |
|---|---|---:|---|---:|
## 3. Final Ideas
### Idea 1: <title>
#### Target Question
#### Node Path
#### Motivation
#### Core Hypothesis
#### Proposed Method
#### Novelty
#### Difference from Existing Work
#### Targeted Novelty Check
#### Experimental Plan
#### Expected Contribution
#### Risks and Mitigation
#### Evaluation Score
Write:
outputs/<RUN_DATE>/final/process_summary.md
Required sections:
# Process Summary
## Input
## Corpus
## Tree Structure Summary
## Question Discovery Summary
## Voting Summary
## Idea Iteration Summary
## Final Outcome
## Known Limitations
Use this only if references/evaluation_of_good_ideas.md cannot be read.
A good research idea should be:
Use this only if references/evaluation_of_good_ideas.md cannot be read.
Score out of 100:
Novelty: 20
Importance: 15
Feasibility: 15
Clarity: 10
Technical Depth: 15
Evaluation Plan: 10
Difference from Existing Work: 10
Risk Awareness: 5
Pass if:
total >= 75
novelty >= 14/20
importance >= 10/15
feasibility >= 10/15
evaluation_plan >= 7/10
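The fallback rubric's numeric conditions can be checked as in the sketch below. It covers only the point caps and pass thresholds listed above; any separate rejection conditions would need to be applied on top. The function name `fallback_pass` is hypothetical.

```python
# Maximum points per dimension; they sum to 100.
MAX_POINTS = {
    "novelty": 20, "importance": 15, "feasibility": 15, "clarity": 10,
    "technical_depth": 15, "evaluation_plan": 10,
    "difference_from_existing_work": 10, "risk_awareness": 5,
}
# Minimum points required on the gated dimensions.
MIN_POINTS = {"novelty": 14, "importance": 10, "feasibility": 10, "evaluation_plan": 7}


def fallback_pass(points: dict) -> bool:
    """Apply the fallback thresholds to per-dimension points."""
    for name, cap in MAX_POINTS.items():
        if not 0 <= points[name] <= cap:
            raise ValueError(f"{name} must be between 0 and {cap}")
    total = sum(points[name] for name in MAX_POINTS)
    return total >= 75 and all(points[name] >= low for name, low in MIN_POINTS.items())
```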
Reject if any of these are true: