
finviz-crawler

Continuous financial news crawler for finviz.com with SQLite storage, article extraction, and query tool. Use when monitoring financial markets, building new...
camopel

Overview

finviz-crawler

Why This Skill?

📰 Your own financial news database — most finance skills just wrap an API for one-shot queries. This skill runs continuously, building a local archive of every headline and article from Finviz. Query your history anytime — no API limits, no missing data.

🆓 No API key, no subscription — scrapes finviz.com directly using Crawl4AI + RSS. Bloomberg, Reuters, Yahoo Finance, CNBC articles extracted automatically. Zero cost.

🤖 Built for AI summarization — the query tool outputs clean text/JSON optimized for LLM digests. Pair with an OpenClaw cron job for automated morning briefings, evening wrap-ups, or weekly investment summaries.

💾 Auto-cleanup — configurable expiry automatically deletes old articles from both the database and disk. Set --expiry-days 30 to keep a month of history, or 0 to keep everything forever.

🔄 Daemon architecture — runs as a background service that starts/stops with OpenClaw. No manual intervention needed after setup. Works with systemd (Linux) and launchd (macOS).

Install

python3 scripts/install.py

Works on macOS, Linux, and Windows. Installs the Python packages (crawl4ai, feedparser), sets up Playwright browsers, creates the data directories, and verifies the installation.

Manual install

pip install crawl4ai feedparser
crawl4ai-setup  # or: python -m playwright install chromium

Usage

Run the crawler

# Default: ~/workspace/finviz/, 7-day expiry
python3 scripts/finviz_crawler.py

# Custom paths and settings
python3 scripts/finviz_crawler.py --db /path/to/finviz.db --articles-dir /path/to/articles/

# Keep 30 days of articles
python3 scripts/finviz_crawler.py --expiry-days 30

# Never auto-delete (keep everything)
python3 scripts/finviz_crawler.py --expiry-days 0

# Custom crawl interval (default: 300s)
python3 scripts/finviz_crawler.py --sleep 600
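
The interval set by --sleep and the clean shutdown on SIGTERM/SIGINT can be pictured as one loop. The following is a minimal sketch, not the crawler's actual code: `run_loop`, `request_stop`, `install_signal_handlers`, and the `max_cycles` parameter are hypothetical names introduced here for illustration.

```python
import signal
import threading

# Set by the signal handler; the loop checks it between cycles.
stop_event = threading.Event()


def request_stop(signum, frame):
    """Signal handler: ask the loop to exit once the current cycle is done."""
    stop_event.set()


def install_signal_handlers():
    """Route SIGTERM (service stop) and SIGINT (Ctrl+C) to request_stop."""
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)


def run_loop(crawl_once, sleep_seconds=300, max_cycles=None):
    """Call crawl_once() repeatedly, pausing sleep_seconds between cycles.

    max_cycles is a hypothetical knob for testing; None means run until
    stop_event is set. Returns the number of completed cycles.
    """
    cycles = 0
    while not stop_event.is_set():
        crawl_once()
        cycles += 1
        if max_cycles is not None and cycles >= max_cycles:
            break
        # Unlike time.sleep, Event.wait returns early when a signal arrives.
        stop_event.wait(timeout=sleep_seconds)
    return cycles
```

Under these assumptions, a caller would run `install_signal_handlers()` once and then `run_loop(crawl_once, sleep_seconds=600)` to mirror `--sleep 600`.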

Query articles

# Last 24 hours of headlines
python3 scripts/finviz_query.py --hours 24

# Titles only (compact, good for LLM summarization)
python3 scripts/finviz_query.py --hours 12 --titles-only

# With full article content
python3 scripts/finviz_query.py --hours 12 --with-content

# List downloaded articles with content status
python3 scripts/finviz_query.py --list-articles --hours 24

# Database stats
python3 scripts/finviz_query.py --stats

Manage tickers

# List all tracked tickers
python3 scripts/finviz_query.py --list-tickers

# Add single ticker (auto-generates keywords from symbol)
python3 scripts/finviz_query.py --add-ticker NVDA

# Add with custom keywords
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen huang"

# Add multiple tickers (batch)
python3 scripts/finviz_query.py --add-ticker NVDA TSLA AAPL
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen" "TSLA:tesla,elon musk"

# Remove tickers (batch)
python3 scripts/finviz_query.py --remove-ticker NVDA TSLA

# Custom DB path
python3 scripts/finviz_query.py --list-tickers --db /path/to/finviz.db

Tickers are stored in the tickers table inside finviz.db alongside articles. The crawler reads this table each cycle to know which ticker pages to scrape.
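
To make the `SYMBOL:keyword,keyword` argument format concrete, here is a small sketch of how such a spec might be parsed and stored. It is an illustration under assumptions: the function names, the two-column `tickers` schema, and the fallback of using the lowercased symbol as the only keyword are all hypothetical, since the real schema and keyword generation are not shown above.

```python
import sqlite3


def parse_ticker_spec(spec):
    """Split 'SYMBOL' or 'SYMBOL:kw1,kw2' into (symbol, [keywords])."""
    symbol, _, raw_keywords = spec.partition(":")
    symbol = symbol.strip().upper()
    if not symbol:
        raise ValueError(f"missing ticker symbol in {spec!r}")
    keywords = [kw.strip().lower() for kw in raw_keywords.split(",") if kw.strip()]
    if not keywords:
        # Assumption: with no custom keywords, fall back to the symbol itself.
        keywords = [symbol.lower()]
    return symbol, keywords


def add_tickers(conn, specs):
    """Insert or update one row per spec in a hypothetical tickers table."""
    for spec in specs:
        symbol, keywords = parse_ticker_spec(spec)
        conn.execute(
            "INSERT OR REPLACE INTO tickers (symbol, keywords) VALUES (?, ?)",
            (symbol, ",".join(keywords)),
        )
    conn.commit()


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickers (symbol TEXT PRIMARY KEY, keywords TEXT NOT NULL)")
add_tickers(conn, ["NVDA:nvidia,jensen huang", "tsla"])
```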

Configuration

| Setting | CLI flag | Env var | Default |
|---|---|---|---|
| Database path | --db | | ~/workspace/finviz/finviz.db |
| Articles directory | --articles-dir | | ~/workspace/finviz/articles/ |
| Crawl interval | --sleep | | 300 (5 min) |
| Article expiry | --expiry-days | FINVIZ_EXPIRY_DAYS | 7 days |
| Timezone | | FINVIZ_TZ or TZ | System default |
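
The settings above could be resolved roughly as follows. This is a sketch, not the crawler's real argument handling: the precedence (CLI flag, then environment variable, then default) and the choice to check FINVIZ_TZ before TZ are assumptions, and `resolve_settings` is a hypothetical helper.

```python
import argparse
import os


def resolve_settings(argv=None, env=None):
    """Merge CLI flags, environment variables, and defaults into one dict."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument("--db", default="~/workspace/finviz/finviz.db")
    parser.add_argument("--articles-dir", default="~/workspace/finviz/articles/")
    parser.add_argument("--sleep", type=int, default=300)
    # Default None so an explicit "--expiry-days 0" can be told apart from "not given".
    parser.add_argument("--expiry-days", type=int, default=None)
    args = parser.parse_args(argv)

    expiry_days = args.expiry_days
    if expiry_days is None:
        # Assumed precedence: the env var applies only when the flag is absent.
        expiry_days = int(env.get("FINVIZ_EXPIRY_DAYS", 7))

    return {
        "db": os.path.expanduser(args.db),
        "articles_dir": os.path.expanduser(args.articles_dir),
        "sleep": args.sleep,
        "expiry_days": expiry_days,
        # None here stands for "use the system default timezone".
        "timezone": env.get("FINVIZ_TZ") or env.get("TZ"),
    }
```

For example, `resolve_settings(["--expiry-days", "0"], env={"FINVIZ_EXPIRY_DAYS": "30"})` yields `expiry_days == 0` under this precedence, because the flag wins over the environment.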

💬 Chat Commands (OpenClaw Agent)

When this skill is installed, the agent recognizes /finviz as a shortcut:

| Command | Action |
|---|---|
| /finviz list | Show tracked tickers |
| /finviz add NVDA, TSLA | Add tickers to track |
| /finviz remove NVDA | Remove a ticker |
| /finviz stats | Show article/ticker counts |
| /finviz help | Show available commands |

The agent runs these via the finviz_query.py CLI internally.

📱 PrivateApp Dashboard

A companion mobile dashboard is available in PrivateApp — a personal PWA dashboard for your home server.

The Finviz app provides:

  • Headlines browser with time-range filters (12h / 24h / Week)
  • Ticker-specific news filtering
  • LLM-powered summaries on demand

Install PrivateApp, and the Finviz dashboard is built-in — no extra setup needed.

Architecture

Crawler daemon (finviz_crawler.py):

  • Crawls finviz.com/news.ashx headlines every 5 minutes
  • Fetches article content via Crawl4AI (Playwright), or via RSS for paywalled sites
  • Detects bot-challenge and paywall pages and rejects their content instead of storing it
  • Per-domain rate limiting, user-agent rotation
  • Deduplicates via SHA-256 title hash
  • Auto-expires old articles (configurable)
  • Clean shutdown on SIGTERM/SIGINT
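
The SHA-256 title-hash deduplication listed above can be sketched with a unique key and `INSERT OR IGNORE`. This is one plausible way to do it, not necessarily how finviz_crawler.py does: the `articles` schema, the function names, and the normalization step (collapsing whitespace and lowercasing before hashing) are assumptions.

```python
import hashlib
import sqlite3


def title_hash(title):
    """Return the SHA-256 hex digest of a normalized headline."""
    # Assumption: ignore case and repeated whitespace when comparing titles.
    normalized = " ".join(title.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def insert_if_new(conn, title, url):
    """Insert the headline unless its hash already exists; return True if inserted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO articles (title_hash, title, url) VALUES (?, ?, ?)",
        (title_hash(title), title, url),
    )
    # rowcount is 1 for a new row and 0 when the primary key already existed.
    return cur.rowcount == 1


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "title_hash TEXT PRIMARY KEY, title TEXT NOT NULL, url TEXT NOT NULL)"
)
```

Letting the primary key reject duplicates keeps the check and the insert in a single statement, so there is no window between "look up" and "write".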

Query tool (finviz_query.py):

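  • SQLite queries against the local database (no HTTP, stdlib only); article queries are read-only, while --add-ticker and --remove-ticker write to the tickers table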
  • Filter by time window, export titles or full content
  • Designed for LLM summarization pipelines
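
A time-window query like `--hours 24` might look like the sketch below. The table and column names (`articles`, `title`, `crawled_at` as Unix seconds) are assumptions about the schema, and both functions are hypothetical; only the stdlib `sqlite3` calls are real.

```python
import sqlite3
import time
from pathlib import Path


def open_read_only(db_path):
    """Open the database via a file: URI with mode=ro so writes fail."""
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def recent_titles(conn, hours, now=None):
    """Return titles crawled within the last `hours`, newest first."""
    now = time.time() if now is None else now
    cutoff = now - hours * 3600
    rows = conn.execute(
        "SELECT title FROM articles WHERE crawled_at >= ? ORDER BY crawled_at DESC",
        (cutoff,),
    )
    return [title for (title,) in rows]
```

Opening with `mode=ro` means a reporting script cannot modify the archive by accident, even while the crawler daemon is writing to the same file.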

Run as a service (optional)

systemd (Linux)

[Unit]
Description=Finviz News Crawler

[Service]
ExecStart=python3 /path/to/scripts/finviz_crawler.py --expiry-days 30
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target

launchd (macOS)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key><string>com.finviz.crawler</string>
    <key>ProgramArguments</key>
    <array>
        <string>python3</string>
        <string>/path/to/scripts/finviz_crawler.py</string>
        <string>--expiry-days</string>
        <string>30</string>
    </array>
    <key>RunAtLoad</key><true/>
    <key>KeepAlive</key><true/>
</dict>
</plist>

Data layout

~/workspace/finviz/
├── finviz.db          # SQLite: articles + tickers (single DB)
├── articles/          # Full article content as .md files
│   ├── market/        # General market headlines
│   ├── nvda/          # Per-ticker articles
│   └── tsla/
└── summaries/         # LLM summary cache (.json)
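
Given this layout, the auto-expiry step (removing old articles from both the database and disk) could be sketched as follows. The column names `file_path` and `crawled_at`, the use of Unix timestamps, and the function itself are assumptions for illustration; the treatment of 0 as "keep everything" follows the --expiry-days description above.

```python
import sqlite3
import time
from pathlib import Path


def expire_articles(conn, articles_dir, expiry_days, now=None):
    """Delete rows and .md files older than expiry_days; return rows deleted."""
    if expiry_days <= 0:
        # 0 means keep everything forever.
        return 0
    now = time.time() if now is None else now
    cutoff = now - expiry_days * 86400
    root = Path(articles_dir).expanduser().resolve()

    rows = conn.execute(
        "SELECT id, file_path FROM articles WHERE crawled_at < ?", (cutoff,)
    ).fetchall()
    for _article_id, rel_path in rows:
        if not rel_path:
            continue  # headline without downloaded content
        target = (root / rel_path).resolve()
        # Safety check: only delete files that live under the articles directory.
        if root in target.parents:
            target.unlink(missing_ok=True)

    conn.execute("DELETE FROM articles WHERE crawled_at < ?", (cutoff,))
    conn.commit()
    return len(rows)
```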

Cron integration

Pair with an OpenClaw cron job for automated digests:

Schedule: 0 6 * * * (6 AM daily)
Task: Query last 24h → LLM summarize → deliver to Matrix/Telegram/Discord
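
The "query, then summarize" half of that task could be glued together as in the sketch below. The `--hours` and `--titles-only` flags come from the usage section above; everything else is an assumption, including the helper names, the prompt wording, and the guess that `--titles-only` prints one headline per line. Delivery to Matrix/Telegram/Discord and the LLM call itself are left out.

```python
import subprocess
import sys


def query_command(hours=24, script="scripts/finviz_query.py"):
    """Build the argv for a titles-only query over the last `hours`."""
    return [sys.executable, script, "--hours", str(hours), "--titles-only"]


def collect_headlines(hours=24):
    """Run the query tool and return its stdout; raises if it exits non-zero."""
    result = subprocess.run(
        query_command(hours), capture_output=True, text=True, check=True
    )
    return result.stdout


def build_prompt(headlines_text, hours=24):
    """Turn raw query output into a summarization prompt for an LLM."""
    # Assumption: one headline per non-empty line of output.
    lines = [ln.strip() for ln in headlines_text.splitlines() if ln.strip()]
    header = (
        f"Summarize the key market themes from these {len(lines)} "
        f"headlines (last {hours}h):"
    )
    return "\n".join([header, *(f"- {ln}" for ln in lines)])
```

A 6 AM job would then call `build_prompt(collect_headlines(24))` and hand the result to whichever model and delivery channel the cron task is configured with.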

Version history

1 version in total

  • v3.0.0 (current)
    2026-03-29 09:17 (security scan: safe)

