Overview

xiaoyuzhou-asr

Transcribe 小宇宙 podcast episodes to text using local Qwen3-ASR (Metal/CUDA accelerated).

> Required service: this skill does not call 小宇宙 directly. It requires a compatible
> ultrazg/xyz API server to be installed and running.
> Default base URL is http://localhost:23020; override it with XYZ_BASE_URL. Without
> this service, login, search, episode lookup, and audio URL retrieval will not work.
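Before running the skill, it can help to confirm that the service answers at the configured base URL. The following is a minimal sketch, not part of the skill itself: `resolve_base_url` and `is_reachable` are hypothetical helper names, and probing the bare base URL (rather than a specific xyz endpoint) is an assumption.

```python
import os
import urllib.error
import urllib.request

DEFAULT_BASE_URL = "http://localhost:23020"  # documented default


def resolve_base_url(env=None):
    """Return XYZ_BASE_URL if set and non-empty, else the documented default."""
    env = os.environ if env is None else env
    return (env.get("XYZ_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")


def is_reachable(base_url, timeout=3.0):
    """True if anything answers HTTP at base_url (any status code counts)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status; it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

Calling `is_reachable(resolve_base_url())` returns `False` when nothing is listening, which is the situation the note above warns about. The built-in `--check-env` flag remains the supported way to verify the full setup.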

Prerequisites

xyz API server running — fetches episode data and audio URLs from 小宇宙

```bash
git clone https://github.com/ultrazg/xyz.git && cd xyz && go run .
# Default port: 23020, change with -p
```

ffmpeg — audio format conversion (brew install ffmpeg)
Qwen3-ASR model — download (HF Hub does NOT ship tokenizer.json):

```bash
python3 -c "
from huggingface_hub import snapshot_download
snapshot_download('Qwen/Qwen3-ASR-0.6B', local_dir='models/0.6B')
"
```

qwen3-asr-rs — build from source:

```bash
git clone https://github.com/alan890104/qwen3-asr-rs.git && cd qwen3-asr-rs
cargo build --release --example local_transcribe
```

tokenizer.json — auto-generated by the transcription script on first run (from vocab.json + merges.txt). No manual step needed.

Quick Start

```bash
# 1. Login (first time only, saves to ~/.xiaoyuzhou-asr.json)
python3 scripts/transcribe_podcast.py --login

# 2. Check all dependencies
python3 scripts/transcribe_podcast.py --check-env

# 3. Transcribe a single episode
python3 scripts/transcribe_podcast.py --keyword "早咖啡" -o output.md

# Or transcribe a shared episode URL
python3 scripts/transcribe_podcast.py --url "https://www.xiaoyuzhoufm.com/episode/EPISODE_ID" -o output.md
```

CLI Commands

Authentication

```bash
# Interactive login: sends a verification code to the phone, saves tokens
python3 scripts/transcribe_podcast.py --login
```

Discovery

```bash
# Search podcasts and show PID (for batch mode)
python3 scripts/transcribe_podcast.py --podcast-info --keyword "声动早咖啡"

# List recent episodes of a podcast
python3 scripts/transcribe_podcast.py --list-episodes --pid PODCAST_ID --count 10
```

Transcription

```bash
# Single episode by keyword (picks first result)
python3 scripts/transcribe_podcast.py --keyword "关键词" -o output.md

# Single episode by EID
python3 scripts/transcribe_podcast.py --eid EPISODE_ID -o output.md

# Single episode by Xiaoyuzhou URL
python3 scripts/transcribe_podcast.py --url "https://www.xiaoyuzhoufm.com/episode/EPISODE_ID" -o output.md

# Batch: transcribe 5 latest episodes of a podcast
python3 scripts/transcribe_podcast.py --pid PODCAST_ID --count 5 -o ./transcripts/

# With specific format
python3 scripts/transcribe_podcast.py --eid EPISODE_ID --format srt -o output.srt
```

Diagnostics

```bash
# Check all dependencies (ffmpeg, xyz API, token, ASR binary, model)
python3 scripts/transcribe_podcast.py --check-env
```

Output Formats

| Format | Flag | Description |
|--------|------|-------------|
| Markdown | `--format markdown` (default) | Metadata header + transcript |
| SRT | `--format srt` | Subtitles with estimated timestamps |
| Plain text | `--format txt` | Minimal header + transcript |
| JSON | `--format json` | Metadata + transcript as JSON |
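The SRT timestamps are described as estimated, but the estimation method is not documented here. One plausible heuristic, shown below purely as an illustration, is to give each transcript line a share of the episode duration proportional to its character count. The function names `srt_timestamp` and `estimate_srt` are hypothetical and are not taken from the script.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def estimate_srt(lines, duration_seconds):
    """Build SRT text, allotting time to each line in proportion to its
    character count (an assumed heuristic, not the script's method)."""
    lines = [line.strip() for line in lines if line.strip()]
    total_chars = sum(len(line) for line in lines)
    cues = []
    elapsed_chars = 0
    for index, line in enumerate(lines, start=1):
        start = duration_seconds * elapsed_chars / total_chars
        elapsed_chars += len(line)
        end = duration_seconds * elapsed_chars / total_chars
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{line}\n"
        )
    # Cues are separated by a blank line, as SRT requires.
    return "\n".join(cues)
```

Timestamps produced this way only approximate when each line was spoken, so they suit skimming rather than frame-accurate subtitling.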

Batch Mode

Transcribes the N most recent episodes of a podcast (--pid --count N)
Saves each episode as a separate file in the output directory
Checkpoint/resume: skips episodes that already exist in the output directory
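The checkpoint/resume behavior amounts to filtering out episodes whose output file is already present. The sketch below illustrates that idea under one loud assumption: that each output file is named after the episode ID (for example `EPISODE_ID.md`). The actual naming scheme is not documented here, and `pending_episodes` is a hypothetical helper.

```python
from pathlib import Path


def pending_episodes(episode_ids, output_dir, extension=".md"):
    """Return the episode IDs that have no output file yet.

    Assumes one file per episode named "<eid><extension>" (hypothetical).
    """
    out = Path(output_dir)
    return [
        eid for eid in episode_ids
        if not (out / f"{eid}{extension}").exists()
    ]
```

Because the check relies only on file existence, rerunning an interrupted batch resumes at the first missing episode. A partially written file would also be treated as finished, so deleting it before a rerun is the safe choice.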

Configuration

Settings are resolved in priority order: CLI argument > Environment variable > Config file.

Config File (`~/.xiaoyuzhou-asr.json`)

Auto-created by --login. Can also store paths:

```json
{
  "token": "x-jike-access-token",
  "refresh_token": "x-jike-refresh-token",
  "model_dir": "/path/to/models/0.6B",
  "asr_bin": "/path/to/local_transcribe"
}
```

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `XYZ_ACCESS_TOKEN` | x-jike access token | none (required) |
| `XYZ_REFRESH_TOKEN` | Refresh token for auto-renewal | none (optional) |
| `XYZ_BASE_URL` | xyz API base URL | `http://localhost:23020` |
| `XYZ_HTTP_TIMEOUT` | xyz API request timeout in seconds | `15` |
| `XYZ_DOWNLOAD_TIMEOUT` | Audio download timeout in seconds | `120` |
| `QWEN3_ASR_MODEL_DIR` | Qwen3-ASR model directory | auto-detect |
| `QWEN3_ASR_BIN` | local_transcribe binary path | auto-detect |
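The priority order (CLI argument, then environment variable, then config file) can be expressed as a small lookup chain. The sketch below is an illustration of that order only; `load_config` and `resolve_setting` are hypothetical names, and the script's real implementation may differ.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".xiaoyuzhou-asr.json"


def load_config(path=CONFIG_PATH):
    """Read the JSON config file; a missing file yields an empty dict."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def resolve_setting(cli_value, env_name, config_key, config, env=None, default=None):
    """Return the first value that is set: CLI, then environment, then config."""
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value
    if env.get(env_name):
        return env[env_name]
    if config.get(config_key):
        return config[config_key]
    return default
```

For example, `resolve_setting(None, "QWEN3_ASR_MODEL_DIR", "model_dir", load_config())` returns the environment value when it is set and falls back to the `model_dir` entry in the config file otherwise.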

Token Management

--login saves tokens to the config file automatically.
If the API returns 401, the access token is refreshed automatically using the refresh token.
Prompt the user to log in if no valid token is available.
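The refresh-on-401 flow described above can be sketched as a single retry wrapper. This is an illustration with injected callables rather than the script's actual code: `TokenExpired` is a stand-in exception defined here (the script exports its own `TokenExpiredError`), and `call_with_refresh` is a hypothetical name.

```python
class TokenExpired(Exception):
    """Stand-in for the error raised when the API answers 401."""


def call_with_refresh(request, refresh, tokens):
    """Run request(access_token); on expiry, refresh once and retry.

    request: callable taking an access token; raises TokenExpired on 401.
    refresh: callable taking a refresh token; returns a dict of new tokens.
    tokens:  dict with "token" and optionally "refresh_token"; updated in place.
    """
    try:
        return request(tokens["token"])
    except TokenExpired:
        if not tokens.get("refresh_token"):
            # Nothing to refresh with; the caller should prompt for --login.
            raise
        tokens.update(refresh(tokens["refresh_token"]))
        # A second TokenExpired here propagates, avoiding an endless loop.
        return request(tokens["token"])
```

Retrying only once keeps the failure mode simple: if the refreshed token is also rejected, the error surfaces and the user is asked to log in again.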

References

xyz API endpoints and auth: references/xyz-api.md
Qwen3-ASR usage and performance: references/qwen3-asr.md

Constraints

MUST split audio into ≤3-minute segments for Metal GPU stability (auto-handled by script)
Audio must be WAV 16kHz mono (auto-converted by script)
tokenizer.json auto-generated on first run (from vocab.json + merges.txt)
xyz API requires Chinese phone number (+86) login
All processing is local — audio never leaves the machine
Download retries up to 3 times on network failure
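The script handles conversion and segmentation automatically. For readers who want to see what those two constraints involve, the sketch below builds an ffmpeg command that resamples to 16 kHz mono WAV and cuts the audio into 180-second pieces using ffmpeg's segment muxer. It is an assumption that one ffmpeg pass is used this way; the script may convert and split differently, and `build_segment_command` and `split_audio` are hypothetical names.

```python
import subprocess

SEGMENT_SECONDS = 180  # 3 minutes, the documented upper bound per segment


def build_segment_command(src, out_pattern, segment_seconds=SEGMENT_SECONDS):
    """Build an ffmpeg command that converts src to 16 kHz mono WAV
    and cuts it into fixed-length pieces in a single pass."""
    return [
        "ffmpeg", "-y",
        "-i", str(src),
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to mono
        "-c:a", "pcm_s16le",   # 16-bit PCM audio
        "-f", "segment",       # ffmpeg's segment muxer
        "-segment_time", str(segment_seconds),
        str(out_pattern),      # e.g. "chunks/part_%03d.wav"
    ]


def split_audio(src, out_pattern):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_segment_command(src, out_pattern), check=True)
```

Building the argument list separately from running it keeps the command easy to inspect or log before any audio is touched.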

Script Reuse

This is a skill project, not a packaged Python library. Prefer the CLI above. Other scripts in this repository can still import scripts/transcribe_podcast.py directly:

```python
from transcribe_podcast import (
    search_episodes, transcribe_episode, format_output,
    TranscriptionError, ApiError, TokenExpiredError,
)

# token, eid, model_dir, and asr_bin must be supplied by the caller
# (for example, read from ~/.xiaoyuzhou-asr.json).
try:
    episodes, _ = search_episodes(token, "早咖啡")
    episode, transcript, timings = transcribe_episode(token, eid, model_dir, asr_bin)
    output = format_output(episode, transcript)
except TranscriptionError as e:
    print(f"Error: {e}")
```

Version History

2 versions in total

v2.0.0 (current): 2026-05-29 13:40
v1.0.0: 2026-05-08 02:08 (security status: safe)

