
yula-web-search

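Yula's custom web search - no API key required. It uses several fallback search methods built on public services that allow anonymous access. It sends curl requests directly over the local network, parses the HTML to extract search results, then extracts and summarizes content from the top-ranked results. It works out of the box with no configuration and provides full web search capability for Chinese-language queries. Use cases: when the user asks to search the web, find the latest information, look for news, get current prices, or read the content of a web page. Trigger phrases include "搜索" (search), "查找" (look up), "寻找最新" (find the latest), "当前是什么" (what is the current), and "查看网页" (view a web page).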
Yula
Uncategorized | community | v1.0.1 | 1 version | 100000 | Key: not required

Overview

Yula Web Search Skill

Custom web search skill by Yula — NO API KEY REQUIRED.

Uses multiple public anonymous search services that don't require API keys. Works via direct curl requests from local network:

  1. Parse Bing search to get top results (title + URL)
  2. Extract full content from the top 2-3 most relevant URLs
  3. Summarize all information into a comprehensive answer
  4. If one method fails, automatically try next fallback

Just works, no configuration needed.

When to Use

USE this skill when:

  • Search for latest news, information, or products
  • Look up current prices, availability, or status
  • Find answers to questions that require up-to-date data
  • Extract content from a specific URL
  • Research topics that need current web information
  • Chinese language searches (optimized for China region)

DON'T use when:

  • Weather forecast → use weather skill
  • Local file search → use file tools
  • Already have the information in context

Search Workflow

Complete Workflow (with content extraction)

  1. Search Bing → get top 5-8 results with title + URL
  2. Filter results → select top 2-3 most relevant URLs based on title matching query
  3. Extract content from each selected URL using curl + html-to-text
  4. Combine all extracted content
  5. Summarize into a coherent answer for the user

Search Methods (Tried in Order)

Method 1: Direct Bing Search (cn.bing.com, Primary)

Direct request to Chinese Bing, parse HTML to extract result titles and URLs:

QUERY="your search query"
# URL-encode the query; passing it as an argument avoids quoting problems inside the Python snippet
QUERY_ENCODED=$(python3 -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))" "$QUERY")
curl -s -m 20 -L -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "https://cn.bing.com/search?q=$QUERY_ENCODED" | python3 -c "
import sys
from html.parser import HTMLParser

class BingParser(HTMLParser):
    # Collects (title, url) pairs from links found inside <h2> headings
    def __init__(self):
        super().__init__()
        self.results = []
        self.in_h2 = False
        self.current_url = None
        self.current_title = []
    def handle_starttag(self, tag, attrs):
        attrs_dict = dict(attrs)
        if tag == 'h2':
            self.in_h2 = True
            self.current_title = []
        if tag == 'a' and self.in_h2 and 'href' in attrs_dict:
            url = attrs_dict['href']
            # Keep external links only; skip Bing internal links
            if url and 'bing.com' not in url and url.startswith('http'):
                self.current_url = url
    def handle_endtag(self, tag):
        if tag == 'h2':
            if self.current_url:
                title = ''.join(self.current_title).strip()
                self.results.append((title, self.current_url))
                self.current_url = None
            self.in_h2 = False
    def handle_data(self, data):
        # Title text is the link text inside the heading
        if self.in_h2 and self.current_url:
            self.current_title.append(data)

parser = BingParser()
parser.feed(sys.stdin.read())
# One result per line: rank, title and URL separated by tabs
for i, (title, url) in enumerate(parser.results[:8]):
    print(f'{i+1}\\t{title}\\t{url}')
"

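The script above prints one result per line as rank, title and URL separated by tabs. A minimal sketch of reading that output back into structured records for the filtering step, assuming exactly that tab-separated format. `SearchResult` and `parse_results` are hypothetical names used for illustration and are not part of the skill:

```python
from typing import List, NamedTuple


class SearchResult(NamedTuple):
    rank: int
    title: str
    url: str


def parse_results(tsv_text: str) -> List[SearchResult]:
    """Parse lines of the form rank<TAB>title<TAB>url, skipping malformed lines."""
    results = []
    for line in tsv_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # ignore blank or unexpected lines
        rank, title, url = parts
        results.append(SearchResult(int(rank), title.strip(), url.strip()))
    return results
```

A title that itself contains a tab would be skipped by this sketch, which is acceptable for a best-effort scraper but worth knowing.
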
Method 2: Direct Google Search (google.com, Fallback 1)

If Bing fails, try Google:

QUERY="your search query"
QUERY_ENCODED=$(python3 -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))" "$QUERY")
curl -s -m 20 -L -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" "https://www.google.com/search?q=$QUERY_ENCODED" | python3 -c "
import re, sys
html = sys.stdin.read()
# Simplified extraction: collect the first 8 result URLs only.
# Titles are not captured by this pattern and need to be extracted separately.
# Google changes its markup often, so this class name may stop matching.
pattern = r'<h3 class=\"zBAuLc\"><a href=\"([^\"]+)\"'
matches = re.findall(pattern, html)
for i, url in enumerate(matches[:8]):
    print(f'{i+1}\\t{url}')
"

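The "tried in order" behaviour of the search methods can be sketched as a small fallback loop. This is an illustration under assumptions, not the skill's actual code: each search method is modelled as a callable that takes the query and returns a list of (title, url) pairs, and `search_with_fallback` is a hypothetical name.

```python
from typing import Callable, List, Sequence, Tuple

Result = Tuple[str, str]  # (title, url)


def search_with_fallback(
    query: str,
    methods: Sequence[Callable[[str], List[Result]]],
) -> List[Result]:
    """Try each search method in order and return the first non-empty result list."""
    for method in methods:
        try:
            results = method(query)
        except Exception:
            continue  # network error, timeout or parse failure: try the next method
        if results:
            return results
    return []  # every method failed or returned nothing
```

An empty result list is treated the same as an error, since a blocked request often returns a page that parses cleanly but contains no results.
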
Extract Content from URL (after search)

After getting search results, extract text content from top relevant URLs:

def extract_url_content(url):
    # Outline only; a full shell version follows below
    # 1. Use curl to fetch the HTML
    # 2. Use Python to extract the text, dropping script/style/noscript blocks
    # 3. Return the cleaned text, limited to ~2000 chars per URL
    ...

Full workflow example:

# After getting search results, select the top 2-3 relevant URLs
USER_AGENT="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
SELECTED_URLS=("https://example.com/page-1" "https://example.com/page-2")  # placeholder URLs
for url in "${SELECTED_URLS[@]}"; do
    curl -s -m 20 -L -A "$USER_AGENT" "$url" | python3 -c "
import sys
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    # Collects visible text and skips script, style and noscript blocks
    SKIP_TAGS = ('script', 'style', 'noscript')
    def __init__(self):
        super().__init__()
        self.text = []
        self.skip_depth = 0
    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
    def handle_data(self, data):
        if self.skip_depth == 0:
            words = data.strip()
            if words:
                self.text.append(words)

parser = TextExtractor()
parser.feed(sys.stdin.read())
content = ' '.join(parser.text)
# Clean up whitespace and limit length
content = ' '.join(content.split())[:2000]
print(content)
"
done

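For callers that prefer to stay in Python, the same extraction step could be sketched with the standard library alone. This is an assumption-laden alternative to the curl pipeline, not the skill's own implementation: it swaps curl for `urllib.request`, and `html_to_text` is a hypothetical helper split out so the text cleaning can be exercised without network access.

```python
import urllib.request
from html.parser import HTMLParser

# Same User-Agent string the skill recommends
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping script, style and noscript blocks."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str, limit: int = 2000) -> str:
    """Return whitespace-normalized visible text, cut to `limit` characters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())[:limit]


def extract_url_content(url: str, timeout: float = 20.0, limit: int = 2000) -> str:
    """Fetch a page with a browser User-Agent and return its cleaned text."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return html_to_text(html, limit)
```

The 20-second timeout and 2000-character limit mirror the values used elsewhere in this document.
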
User-Agent

Always use a modern browser User-Agent to avoid being blocked immediately:

"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"

Complete Example Workflow

> "Search 2026腾势Z9GT介绍"

Step 1: Search Bing → get top 8 results

# Output gives title + url:
1.  2026款腾势Z9GT-腾势官网
    URL: https://www.tengshiauto.com/product-detail/26-z9gt.html
2.  【腾势Z9GT】腾势_腾势Z9GT报价_腾势Z9GT图片_汽车之家
    URL: https://www.autohome.com.cn/7659
...

Step 2: Select most relevant results

  • Pick 2-3 results that best match the query
  1. Official website
  2. Autohome article

Step 3: Extract content from each URL

  • Get cleaned text from each page
  • Limit to 2000 characters per page

Step 4: Combine and summarize

  • Read all extracted content
  • Summarize into a coherent answer with key information: price, specs, release date

Best Practices

  1. Content extraction → Extract only the top 2-3 results; more produces too much text
  2. Relevance filtering → Prefer results whose titles contain more of the query keywords
  3. Character limit → Keep the combined extracted text to ~5000 chars to avoid token overflow
  4. Timeout → 20 seconds max per request to avoid hanging
  5. Fallback → If one search engine fails, automatically try the next
  6. Chinese optimized → Prefer Chinese keywords and Chinese websites
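
The per-page and total character limits interact: three pages at 2000 characters each would exceed a 5000-character total. One way to reconcile them, sketched here as an assumption rather than the skill's documented behaviour, is to cap each page first and then trim whichever page crosses the total budget. `apply_text_budget` is a hypothetical helper name.

```python
from typing import List


def apply_text_budget(texts: List[str], per_page: int = 2000, total: int = 5000) -> List[str]:
    """Cap each text at per_page chars and stop once the total budget is spent."""
    kept = []
    remaining = total
    for text in texts:
        if remaining <= 0:
            break  # budget exhausted: drop the remaining pages
        piece = text[: min(per_page, remaining)]
        kept.append(piece)
        remaining -= len(piece)
    return kept
```

With the default limits, three long pages are kept as 2000, 2000 and 1000 characters, and any further pages are dropped.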

How to select relevant results

  • Count how many query keywords appear in the title
  • Sort results by relevance
  • Pick top N (2-3) for content extraction
  • Official sites and major portal sites have higher priority
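
The selection rules above can be sketched as a small ranking function. This is a minimal illustration under stated assumptions: keywords are taken to be whitespace-separated (Chinese queries without spaces would need a word segmenter, which is not shown), and "official sites and major portals" are modelled as a caller-supplied `preferred_domains` tuple used only as a tie-breaker. Both function names and that parameter are hypothetical.

```python
from typing import List, Sequence, Tuple
from urllib.parse import urlparse

Result = Tuple[str, str]  # (title, url)


def keyword_score(query: str, title: str) -> int:
    """Count how many whitespace-separated query keywords occur in the title."""
    title_lower = title.lower()
    return sum(1 for keyword in query.lower().split() if keyword in title_lower)


def select_relevant(
    query: str,
    results: List[Result],
    top_n: int = 3,
    preferred_domains: Sequence[str] = (),
) -> List[Result]:
    """Rank by keyword matches, then preferred domain, then original order."""

    def sort_key(item):
        index, (title, url) = item
        host = urlparse(url).hostname or ""
        preferred = any(host == d or host.endswith("." + d) for d in preferred_domains)
        # Higher score first; preferred hosts win ties; earlier results win after that
        return (-keyword_score(query, title), not preferred, index)

    ranked = sorted(enumerate(results), key=sort_key)
    return [result for _, result in ranked[:top_n]]
```

Keeping the original index as the last sort key preserves the search engine's own ordering among results that are otherwise tied.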

Notes

  • NO API KEY REQUIRED — works out of the box
  • Multiple fallback methods → if one gets blocked, try the next
  • Direct curl via local network → uses your existing network/proxy
  • Python HTML parsing → extracts title/url reliably
  • Automatic content extraction and summarization → gives complete answer without user having to click links
  • Modern browser User-Agent → reduces chance of being blocked
  • Free for non-commercial use
  • Rate limits: be respectful, don't spam too many requests quickly
  • If all methods fail, fall back to general knowledge
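
The rate-limit note can be made concrete with a simple throttle that enforces a minimum gap between requests. This is only a sketch: the 2-second default is an assumed value, not one the skill specifies, and `Throttle` is a hypothetical helper. The clock and sleep functions are injectable so the logic can be checked without real delays.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # assumed default, tune as needed
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the time slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `wait()` immediately before each curl or HTTP request is enough to space out a burst of searches and page fetches.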

Full Workflow Summary

1. Search Bing → get (title + url)
   ↓
2. Filter by relevance → pick top 2-3
   ↓
3. Extract content from each URL
   ↓
4. Combine all text
   ↓
5. Summarize into final answer
   ↓
6. Present to user with sources
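
The final step, presenting the answer with its sources, might look like the following. The output layout is an assumption chosen for illustration, since the skill does not prescribe one, and `format_answer` is a hypothetical helper name.

```python
from typing import List, Tuple


def format_answer(summary: str, sources: List[Tuple[str, str]]) -> str:
    """Append a numbered source list (title and URL) to the summary text."""
    lines = [summary.strip(), "", "Sources:"]
    for i, (title, url) in enumerate(sources, start=1):
        lines.append(f"{i}. {title} - {url}")
    return "\n".join(lines)
```

Listing the pages that were actually extracted, rather than every search hit, keeps the citations aligned with the content the summary was built from.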

Author

Created by Yula

GitHub: https://github.com/wjzhb/yula-web-search

License

Copyright (c) 2026 Yula

Licensed under the MIT License

If you find this skill useful, please ⭐ star it on GitHub!

Version History

1 version in total

  • v1.0.1 Created by Yula GitHub: https://github.com/wjzhb/yula-web-search (current)
    2026-04-08 11:18 Safe, Safe

Security Check

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
