
Exa Search

Semantic search, web scraping, and content extraction optimized for AI agents and LLMs. Use when you need highly relevant web search, clean Markdown extraction, and similar tasks.
simonpierreboucher02
Uncategorized · clawhub · v1.0.0 (#latest) · 1 version · 100000 · Key: required
Stars: 0 · Downloads: 161 · Installs: 0

Overview

Exa Search Skill

This skill extends Manus's capabilities with a semantic search engine built for AI agents and LLMs. It enables neural search, clean Markdown extraction, query-relevant extractive highlights, and conceptual similarity searches.

When to Use

Use this skill when:

  1. You need to search the web using complex natural language prompts rather than raw keywords.
  2. You need clean, boilerplate-free Markdown text from web pages for LLM context windows.
  3. You need query-relevant extractive snippets (highlights) to reduce token consumption.
  4. You need to perform similarity searches using an existing URL as a conceptual query.
  5. You need to perform deep, multi-step web research or build structured lists using schema validation.

Core Capabilities & Commands

1. Neural Search (/search)

Query Exa's index using semantic embeddings. Unlike keyword matching, this understands the conceptual meaning of your prompt.

from exa_py import Exa
exa = Exa(api_key="YOUR_EXA_API_KEY")

results = exa.search(
    query="companies building innovative fusion energy reactors",
    type="auto",
    num_results=5,
    contents={"highlights": True}
)
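Once a search returns, the results usually need to be condensed before they are logged or passed to a model. The following is a minimal sketch, assuming each result exposes `url` and `highlights` attributes (as the RAG pattern in this skill implies) and a `title` attribute (an assumption). The helper name `summarize_results` and the stub objects are hypothetical; the stubs stand in for `results.results` so the snippet runs without an API key.

```python
from types import SimpleNamespace

def summarize_results(results, max_highlights=2):
    """Return one plain-text line per result: title, URL, and a few highlights."""
    lines = []
    for r in results:
        # highlights may be missing or None when contents were not requested
        highlights = (getattr(r, "highlights", None) or [])[:max_highlights]
        snippet = " | ".join(h.strip() for h in highlights)
        title = getattr(r, "title", None) or "(untitled)"
        line = f"{title} <{r.url}>"
        lines.append(f"{line}: {snippet}" if snippet else line)
    return lines

# Hypothetical stub data standing in for `results.results` from exa.search(...)
stub_results = [
    SimpleNamespace(
        title="Fusion startup A",
        url="https://example.com/a",
        highlights=["  Builds compact tokamaks. ", "Raised funding.", "Third."],
    ),
    SimpleNamespace(title=None, url="https://example.com/b", highlights=None),
]

for line in summarize_results(stub_results):
    print(line)
```

Capping the number of highlights per result keeps the summary short, which fits the token-saving purpose of highlights described under "When to Use".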

2. Clean Web Extraction (/contents)

Retrieve webpage content stripped of navigation menus, sidebars, advertisements, and other boilerplate, returned as clean Markdown.

contents = exa.get_contents(
    urls=["https://example.com/target-article"],
    text=True,
    max_age_hours=24
)

3. Similar Link Discovery (/findSimilar)

Find conceptually similar pages in Exa's index using a starting URL as your query.

similar = exa.find_similar(
    url="https://arxiv.org/abs/2307.06435",
    num_results=5
)
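Similarity results can include other pages from the same site as the starting URL, which is rarely useful when the goal is to discover new sources. The sketch below filters those out on the client side using only the standard library. It assumes each result has a `url` attribute; `drop_same_domain` and the stub results are hypothetical names used for illustration.

```python
from types import SimpleNamespace
from urllib.parse import urlparse

def drop_same_domain(seed_url, results):
    """Keep only results whose host differs from the seed URL's host."""
    seed_host = urlparse(seed_url).netloc.lower()
    return [r for r in results if urlparse(r.url).netloc.lower() != seed_host]

# Hypothetical stub data standing in for `similar.results`
stub_similar = [
    SimpleNamespace(url="https://arxiv.org/abs/0000.00000"),
    SimpleNamespace(url="https://example.org/llm-survey"),
]

kept = drop_same_domain("https://arxiv.org/abs/2307.06435", stub_similar)
print([r.url for r in kept])
```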

Advanced Workflows & Best Practices

Cache Freshness & Live Crawling

By default, Exa serves cached pages to optimize speed. To control cache freshness, use max_age_hours instead of the deprecated livecrawl parameters:

  • max_age_hours=0: Forces a live crawl of the URL.
  • max_age_hours=1: Uses cache if it's less than 1 hour old, otherwise performs a live crawl.
  • max_age_hours=-1: Cache-only lookup (never crawl).
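The three cases above can be captured in a small helper that makes the chosen policy explicit in logs or tests. This is a sketch of the rules exactly as listed here, not SDK code; `describe_cache_policy` is a hypothetical name, and rejecting other negative values is an assumption of this sketch.

```python
def describe_cache_policy(max_age_hours):
    """Map a max_age_hours value to the behavior listed above."""
    if max_age_hours == 0:
        return "live crawl"
    if max_age_hours == -1:
        return "cache only"
    if max_age_hours > 0:
        return f"cache if newer than {max_age_hours}h, else live crawl"
    # Assumption: any other negative value is treated as a caller error
    raise ValueError(f"unsupported max_age_hours: {max_age_hours}")

for value in (0, 1, 24, -1):
    print(value, "->", describe_cache_policy(value))
```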

Subpage Crawling

Automatically discover and extract content from linked subpages on a target site. Highly effective for documentation or news archives:

results = exa.get_contents(
    ["https://docs.exa.ai"],
    subpages=10,
    subpage_target=["api", "reference"],
    max_age_hours=24
)

RAG Integration Pattern

Always format extracted contents cleanly into XML blocks for downstream LLM generation:

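# Join highlights into plain text; interpolating the list directly would embed its Python repr
context = "\n".join(
    f"<source><url>{r.url}</url><highlights>{' '.join(r.highlights or [])}</highlights></source>"
    for r in results.results
)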
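Page text and URLs can contain `<`, `>`, or `&`, which would break the XML-style blocks if inserted verbatim. The sketch below escapes those characters with the standard library's `xml.sax.saxutils.escape` and wraps each highlight in its own element. The `build_context` helper, the `<highlight>` element, and the stub result are hypothetical and only illustrate the pattern.

```python
from types import SimpleNamespace
from xml.sax.saxutils import escape

def build_context(results):
    """Format results as one escaped <source> block per line."""
    blocks = []
    for r in results:
        # One <highlight> element per snippet; tolerate missing highlights
        highlights = "".join(
            f"<highlight>{escape(h)}</highlight>" for h in (r.highlights or [])
        )
        blocks.append(
            f"<source><url>{escape(r.url)}</url>"
            f"<highlights>{highlights}</highlights></source>"
        )
    return "\n".join(blocks)

# Hypothetical stub standing in for `results.results`
stub = [SimpleNamespace(url="https://example.com/?a=1&b=2", highlights=["x < y"])]
print(build_context(stub))
```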

References & Resources

  • Detailed API endpoints and SDK configurations: API Reference
  • Command-line search utility: Execute /home/ubuntu/skills/exa-search/scripts/exa_search.py --help

Version History

1 version in total

  • v1.0.0 (current)
    2026-06-01 21:24

Security Scan

Tencent Cloud Security (Keen): queued

Tencent Cloud Security (Sanbu): queued
