Save web articles to local storage, including articles with images. For articles with images, automatically downloads the images, generates Markdown, and converts the result to PDF.
Simple articles:
1. Fetch article content (Jina Reader or browser)
2. Save to saved-articles/{title}-{date}.md
3. Send file to Feishu
Articles with images:
1. Create directory reports/{article-name}/
2. Create images/ subdirectory
3. Download all images to images/
4. Generate Markdown (relative path references)
5. Convert to PDF
6. Send PDF to Feishu
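Image detection methods:
- Markdown image format: ![]()
- HTML <img> tags
Decision: if the fetched content contains either, follow the articles-with-images steps; otherwise follow the simple-article steps.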
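The detection step can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function names and the workflow labels "with-images" and "simple" are hypothetical, and the regular expressions only cover the two image forms listed here.

```python
import re

# Markdown image syntax with a non-empty target: ![alt](url)
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]+\)")
# HTML image tag: <img ...>
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def has_images(content: str) -> bool:
    """Return True if the content contains a Markdown image or an <img> tag."""
    return bool(MD_IMAGE.search(content) or HTML_IMAGE.search(content))


def choose_workflow(content: str) -> str:
    """Hypothetical helper: pick a workflow label based on image detection."""
    return "with-images" if has_images(content) else "simple"
```

Plain links such as [text](url) are not counted, because the pattern requires the leading exclamation mark.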
mkdir -p ~/.openclaw/workspace/reports/{article-name}/images/
Directory Structure:
reports/{article-name}/
├── {article-name}.md # Markdown file
├── {article-name}.html # HTML intermediate (optional)
├── {article-name}.pdf # Final output (optional)
└── images/ # Image directory
    ├── image1.jpg
    ├── image2.png
    └── ...
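For scripts that prepare this layout from Python rather than the shell, the following sketch creates the same directories. The function name and the base_dir parameter are assumptions for illustration; the source only shows the mkdir -p command.

```python
from pathlib import Path


def create_article_dirs(base_dir, article_name):
    """Create reports/{article-name}/images/ under base_dir and return both paths."""
    article_dir = Path(base_dir) / "reports" / article_name
    images_dir = article_dir / "images"
    # parents=True mirrors `mkdir -p`; exist_ok=True makes re-runs harmless
    images_dir.mkdir(parents=True, exist_ok=True)
    return article_dir, images_dir
```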
Jina Reader:
curl -s "https://r.jina.ai/URL"
Pros: Automatically converts the page to Markdown and extracts image links
Cons: Some sites block it
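A rough Python equivalent of the curl call, plus a helper that pulls image links out of the returned Markdown, might look like this. Both function names are hypothetical, the User-Agent value is an arbitrary choice, and the sketch assumes Jina Reader returns images in standard Markdown syntax.

```python
import re
import urllib.request

JINA_PREFIX = "https://r.jina.ai/"


def fetch_markdown(url: str, timeout: int = 30) -> str:
    """Fetch the Jina Reader rendering of url (same request as the curl command)."""
    req = urllib.request.Request(JINA_PREFIX + url,
                                 headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_image_urls(markdown: str) -> list[str]:
    """Return absolute image URLs from Markdown image syntax, in order, deduplicated."""
    urls = re.findall(r"!\[[^\]]*\]\((https?://[^)\s]+)", markdown)
    # dict.fromkeys keeps first-seen order while dropping repeats
    return list(dict.fromkeys(urls))
```

The resulting list can be passed directly to a batch downloader.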
Browser:
# Open webpage
browser action=open url=URL
# Get content
browser action=act kind=evaluate fn='() => document.body.innerText'
# Get images
browser action=act kind=evaluate fn='() => {
  const imgs = document.querySelectorAll("img");
  return JSON.stringify(Array.from(imgs).map(img => ({
    src: img.src,
    alt: img.alt
  })));
}'
Single Image:
curl -o "images/image1.jpg" "https://example.com/image.jpg"
Batch Download (Python):
import requests
from pathlib import Path

def download_images(image_urls, output_dir):
    """Download each URL into output_dir as image1.ext, image2.ext, ..."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for i, url in enumerate(image_urls, 1):
        try:
            # Get extension: drop the query string first, then take the last dot segment
            ext = url.split('?')[0].split('.')[-1].lower()
            if ext not in ['jpg', 'jpeg', 'png', 'gif', 'webp']:
                ext = 'jpg'
            # Download with a browser-like User-Agent
            resp = requests.get(url, timeout=30, headers={
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
            })
            if resp.status_code == 200:
                filename = f"image{i}.{ext}"
                (output_dir / filename).write_bytes(resp.content)
                print(f"✅ {filename}")
            else:
                print(f"❌ HTTP {resp.status_code}: {url}")
        except Exception as e:
            print(f"❌ {e}: {url}")

# Usage
# download_images(['url1', 'url2'], 'images/')
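Guessing the extension from the URL alone fails for URLs without a file suffix, such as pbs.twimg.com media links. One possible refinement, offered as an assumption rather than as part of the original skill, is to prefer the response's Content-Type header. The helper name pick_extension and the mapping table are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

CONTENT_TYPE_EXT = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/gif": "gif",
    "image/webp": "webp",
}
ALLOWED_EXT = {"jpg", "jpeg", "png", "gif", "webp"}


def pick_extension(url, content_type=None):
    """Prefer the Content-Type header; fall back to the URL path, then to jpg."""
    if content_type:
        # Strip parameters such as "; charset=utf-8"
        mime = content_type.split(";")[0].strip().lower()
        if mime in CONTENT_TYPE_EXT:
            return CONTENT_TYPE_EXT[mime]
    suffix = PurePosixPath(urlparse(url).path).suffix.lstrip(".").lower()
    return suffix if suffix in ALLOWED_EXT else "jpg"
```

Inside a requests-based downloader it could be called as pick_extension(url, resp.headers.get("Content-Type")).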
Image Naming:
- Sequential: image1.jpg, image2.png, ...
- Descriptive: cover.jpg, screenshot.png, ...
Template:
# {Article Title}
> Source: {URL}
> Author: {author}
> Published: {date}
---

{Content}
---
## Images

![](images/image1.jpg)

---
*Saved: {timestamp}*
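Filling the template can be sketched with a plain format string. The function name, parameter names, and timestamp format below are assumptions, and the Images section is omitted for brevity.

```python
from datetime import datetime

TEMPLATE = """# {title}

> Source: {url}
> Author: {author}
> Published: {date}

---

{content}

---

*Saved: {timestamp}*
"""


def render_article(title, url, author, date, content, timestamp=None):
    """Fill the template fields; timestamp defaults to the current local time."""
    if timestamp is None:
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    return TEMPLATE.format(title=title, url=url, author=author, date=date,
                           content=content, timestamp=timestamp)
```

Because str.format only interprets braces in the template itself, article content that contains braces passes through unchanged.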
Image Reference Format:
![](images/xxx.jpg)
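Once images are downloaded, remote URLs in the Markdown need to be swapped for relative paths. A minimal sketch follows, assuming a hypothetical url_to_file mapping from each remote URL to the filename it was saved under in images/.

```python
import re


def localize_image_refs(markdown: str, url_to_file: dict) -> str:
    """Replace remote image URLs with images/<file> when a local copy exists."""
    # Group 1 is the "![alt](" prefix, group 2 is the image target
    pattern = re.compile(r"(!\[[^\]]*\]\()([^)\s]+)")

    def swap(match):
        url = match.group(2)
        local = url_to_file.get(url)
        # Leave the reference untouched if the image was not downloaded
        return match.group(1) + (f"images/{local}" if local else url)

    return pattern.sub(swap, markdown)
```

References whose download failed keep their original URL, so they remain visible when the file is previewed online.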
Using Preset Styles:
# CSS file
CSS_FILE=~/.openclaw/workspace/templates/mobile-friendly.css
# Convert to HTML
pandoc {article-name}.md -o {article-name}.html --standalone --css=$CSS_FILE
# Generate PDF
weasyprint {article-name}.html {article-name}.pdf
PDF Configuration:
Problem: Large images (e.g., 1200px wide) overflow the PDF page width (~432pt, or 6 inches)
Solution: Create a CSS file that limits image max-width
Required CSS:
/* Prevent image overflow */
img {
  max-width: 100%;
  height: auto;
  display: block;
  margin: 1em auto;
}
/* Images in images/ directory - 90% width */
img[src^="images/"] {
  max-width: 90%;
  margin: 0.5em auto;
}
/* Body styles */
body {
  max-width: 100%;
  padding: 1cm;
}
Correct PDF Generation Flow:
# 1. Create CSS file (in article directory)
cat > style.css << 'EOF'
img { max-width: 100%; height: auto; }
img[src^="images/"] { max-width: 90%; }
EOF
# 2. Generate HTML with CSS
pandoc {article-name}.md -o {article-name}.html --standalone --css=style.css
# 3. Generate PDF
weasyprint {article-name}.html {article-name}.pdf
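The same two-step conversion can be driven from Python with subprocess. This is a sketch only: the function names are hypothetical, it assumes pandoc and weasyprint are on PATH, and it assumes the CSS file already exists in the article directory.

```python
import subprocess
from pathlib import Path


def build_pdf_commands(article_name, css="style.css"):
    """Return the pandoc and weasyprint argument lists for one article."""
    md, html, pdf = (f"{article_name}.{ext}" for ext in ("md", "html", "pdf"))
    pandoc = ["pandoc", md, "-o", html, "--standalone", f"--css={css}"]
    weasyprint = ["weasyprint", html, pdf]
    return pandoc, weasyprint


def build_pdf(article_dir, article_name, css="style.css"):
    """Run both commands inside article_dir so relative image paths resolve."""
    for cmd in build_pdf_commands(article_name, css):
        # check=True raises CalledProcessError if a tool exits non-zero
        subprocess.run(cmd, cwd=Path(article_dir), check=True)
```

Running with cwd set to the article directory keeps images/ references and the CSS path relative, matching the shell commands.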
Key Points:
- Limit image width with max-width: 100% or max-width: 90%
- Reference images by relative path: images/xxx.jpg
Send Markdown:
message action=send channel=feishu target="user:ou_xxx" filePath="path/to/file.md"
Send PDF:
message action=send channel=feishu target="user:ou_xxx" filePath="path/to/file.pdf"
| Source | Fetch Method | Image Handling |
|---|---|---|
| Twitter/X | Jina Reader | Download pbs.twimg.com images |
| WeChat Official Account | browser + Camoufox | Download mmbiz.qpic.cn images |
| General Webpages | Jina Reader | Download all img tags |
| Login Required Sites | browser | Manual screenshots by the user |
Image URL Format:
https://pbs.twimg.com/media/XXXXX?format=jpg&name=small
Download Command:
# Request the large variant for better quality
curl -o "images/image1.jpg" "https://pbs.twimg.com/media/XXXXX?format=jpg&name=large"
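Switching the size variant amounts to rewriting the name query parameter. A small sketch using the standard library follows; the function name with_size is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_size(url: str, size: str = "large") -> str:
    """Return url with its name= query parameter set to size."""
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    # Overwrite an existing name= value or append one if it is absent
    query["name"] = size
    return urlunparse(parts._replace(query=urlencode(query)))
```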
Problem: WeChat uses anti-hotlinking protection, so direct image downloads fail
Solutions:
# Use tool from agent-reach
cd ~/.agent-reach/tools/wechat-article-for-ai
python3 main.py "https://mp.weixin.qq.com/s/ARTICLE_ID"
After saving, verify:
□ Markdown file generated
□ All images downloaded successfully
□ Image relative paths correct
□ Images display correctly (local preview)
□ PDF generated successfully (optional)
□ File sent to Feishu
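The image-related checks in this list can be partly automated. The sketch below, with a hypothetical function name, reports relative image references in a Markdown file that have no matching file on disk. It does not check whether the images render.

```python
import re
from pathlib import Path


def missing_images(markdown_path):
    """Return relative image references in the Markdown file that do not exist on disk."""
    markdown_path = Path(markdown_path)
    text = markdown_path.read_text(encoding="utf-8")
    refs = re.findall(r"!\[[^\]]*\]\(([^)\s]+)", text)
    missing = []
    for ref in refs:
        if re.match(r"https?://", ref):
            continue  # remote references are not checked here
        # Relative paths resolve against the Markdown file's own directory
        if not (markdown_path.parent / ref).is_file():
            missing.append(ref)
    return missing
```

An empty list means every relative reference resolves to a downloaded file.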
| Error | Cause | Solution |
|---|---|---|
| Image download failed | Anti-hotlinking/Network | Use browser or lower quality |
| PDF generation failed | Missing fonts/dependencies | Check weasyprint installation |
| Markdown images not showing | Path error | Check relative paths |
| Jina Reader blocked | Site restriction | Use browser fetch |
| Type | Directory |
|---|---|
| Simple articles | saved-articles/{title}-{date}.md |
| Articles with images | reports/{article-name}/ |
| Temporary files | /tmp/article-{id}/ |
Skill Version: 1.0.0
Created: 2026-03-17