Volcengine Ata Subtitle

Generate subtitles with automatic time alignment using Volcengine ATA API. Use when the user wants to: (1) add time-aligned subtitles to videos, (2) convert...
blackeight4752
Content creation clawhub v0.1.0 1 version 99812.7 Key: required

Overview

Volcengine ATA Subtitle (automatic time alignment)

Generate subtitles with automatic time alignment using Volcengine's ATA (Automatic Time Alignment) API.

Prerequisites

Set the following environment variables or create a config file:

Option A: Environment Variables

export VOLC_ATA_APP_ID="your-app-id"
export VOLC_ATA_TOKEN="your-access-token"
export VOLC_ATA_API_BASE="https://openspeech.bytedance.com"

Option B: Config File

Create ~/.volcengine_ata.conf:

[credentials]
appid = your-app-id
access_token = your-access-token
secret_key = your-secret-key

[api]
base_url = https://openspeech.bytedance.com
submit_path = /api/v1/vc/ata/submit
query_path = /api/v1/vc/ata/query
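
The two options above can be combined in one loader. The following is a minimal sketch, not the tool's actual loading code: the function name load_ata_settings and the rule that environment variables take precedence over the file are assumptions made for illustration.

```python
import configparser
import os
from pathlib import Path


def load_ata_settings(conf_path="~/.volcengine_ata.conf", env=None):
    """Collect ATA settings, preferring environment variables over the config file.

    Hypothetical helper: the precedence rule is an assumption, not documented behavior.
    """
    env = os.environ if env is None else env
    # interpolation=None keeps '%' characters in tokens from being interpreted
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(Path(conf_path).expanduser(), encoding="utf-8")  # a missing file is skipped

    def from_file(section, option):
        return parser.get(section, option, fallback=None)

    settings = {
        "appid": env.get("VOLC_ATA_APP_ID") or from_file("credentials", "appid"),
        "access_token": env.get("VOLC_ATA_TOKEN") or from_file("credentials", "access_token"),
        "base_url": env.get("VOLC_ATA_API_BASE") or from_file("api", "base_url"),
        "submit_path": from_file("api", "submit_path"),
        "query_path": from_file("api", "query_path"),
    }
    missing = [key for key in ("appid", "access_token", "base_url") if not settings[key]]
    if missing:
        raise RuntimeError("Missing ATA settings: " + ", ".join(missing))
    return settings
```

Raising before any network call keeps a missing credential from surfacing later as an opaque authorization error.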

Execution (Python CLI Tool)

A Python CLI tool is provided at ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py.

Quick Examples

# Basic usage: audio + text → SRT subtitle
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.srt

# Specify output format (srt or ass)
python3 ~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py \
  --audio storage/audio.wav \
  --text storage/subtitle.txt \
  --output storage/subtitles/final.ass \
  --format ass
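
When the tool is driven from another Python script, the same invocations can be assembled as an argument list. This is a small sketch: build_command is a hypothetical helper, and only the flags shown in the examples above (--audio, --text, --output, --format) are used.

```python
import os
import sys

# Path taken from this document; adjust if the skill is installed elsewhere.
TOOL = "~/.openclaw/workspace/skills/volcengine-ata-subtitle/volc_ata.py"


def build_command(audio, text, output, fmt=None):
    """Return the argv list for one volc_ata.py run (hypothetical helper)."""
    cmd = [
        sys.executable,            # current interpreter, standing in for python3
        os.path.expanduser(TOOL),
        "--audio", audio,
        "--text", text,
        "--output", output,
    ]
    if fmt is not None:
        if fmt not in ("srt", "ass"):
            raise ValueError("format must be 'srt' or 'ass'")
        cmd += ["--format", fmt]
    return cmd
```

The list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with file names that contain spaces.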

Input Requirements

Audio File

  • Format: WAV (PCM)
  • Sample Rate: 16000 Hz (16kHz)
  • Channels: 1 (mono)
  • Encoding: 16-bit PCM (pcm_s16le)

Extract from video:

ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
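
Before submitting, the audio requirements listed above can be checked locally with Python's standard wave module. This is a sketch of such a pre-flight check; check_wav is a hypothetical name and is not part of the provided tool.

```python
import wave


def check_wav(path):
    """Return a list of problems; an empty list means the file matches the stated format."""
    problems = []
    # wave.open raises wave.Error for files that are not PCM WAV
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000 Hz")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1 (mono)")
        if wav.getsampwidth() != 2:
            problems.append(f"{8 * wav.getsampwidth()}-bit samples, expected 16-bit")
    return problems
```

If the list is not empty, re-running the ffmpeg command above on the original source produces a conforming file.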

Text File

  • Format: Plain text (UTF-8)
  • Structure: One sentence per line
  • No punctuation: ATA handles punctuation automatically
  • No timestamps: Pure text only

Example:

主人闹钟没响睡过头了
我们俩轮流用鼻子拱他脸
他以为地震了抱着枕头就跑
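
A transcript that still contains punctuation or blank lines can be normalized into this shape first. The sketch below is one possible approach; prepare_lines is a hypothetical helper, and removing every Unicode punctuation character is an assumption about what "no punctuation" requires.

```python
import unicodedata


def prepare_lines(raw_text):
    """Return one cleaned sentence per list item, dropping punctuation and empty lines."""
    lines = []
    for line in raw_text.splitlines():
        # Unicode general categories starting with "P" cover ASCII and CJK punctuation
        cleaned = "".join(ch for ch in line if not unicodedata.category(ch).startswith("P"))
        cleaned = " ".join(cleaned.split())  # collapse runs of whitespace
        if cleaned:
            lines.append(cleaned)
    return lines
```

Note that this also removes apostrophes and hyphens, so English contractions such as "don't" become "dont"; adjust the filter if that matters for the text being aligned.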

Output Formats

SRT (SubRip)

1
00:00:00,000 --> 00:00:02,500
第一句字幕

2
00:00:02,500 --> 00:00:05,000
第二句字幕
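
The SRT layout above can be produced from aligned cues with a few lines of Python. This is a sketch under the assumption that each cue is available as a (start_ms, end_ms, text) tuple; the actual shape of the ATA response is not described here, and both function names are hypothetical.

```python
def srt_timestamp(ms):
    """Format milliseconds as HH:MM:SS,mmm."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Render (start_ms, end_ms, text) tuples as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"
        )
    return "\n".join(blocks)
```

Calling to_srt([(0, 2500, "第一句字幕"), (2500, 5000, "第二句字幕")]) reproduces the two blocks shown above.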

ASS (Advanced SubStation Alpha)

[Script Info]
Title: ATA Subtitles
ScriptType: v4.00+

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:00.00,0:00:02.50,Default,,0,0,0,,第一句字幕
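
ASS timestamps differ from SRT: the hour is a single digit and the fraction is in centiseconds. The sketch below builds one Dialogue line in the field order of the Format line above, again assuming cue times in milliseconds; the function names are hypothetical.

```python
def ass_timestamp(ms):
    """Format milliseconds as H:MM:SS.cc, truncating to centiseconds."""
    centis = int(ms) // 10
    hours, rem = divmod(centis, 360_000)
    minutes, rem = divmod(rem, 6_000)
    seconds, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{cs:02d}"


def ass_dialogue(start_ms, end_ms, text, style="Default"):
    """Build one Dialogue event: layer 0, empty Name and Effect, zero margins."""
    start = ass_timestamp(start_ms)
    end = ass_timestamp(end_ms)
    return f"Dialogue: 0,{start},{end},{style},,0,0,0,,{text}"
```

Text is the last field, so commas inside the subtitle text do not need escaping.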

Rules

  1. Always check that credentials are configured before making API calls.
  2. Audio must be 16kHz mono PCM - convert if necessary with ffmpeg.
  3. Text should be plain - no timestamps, no punctuation.
  4. Default format: SRT (most compatible).
  5. Handle errors gracefully - display clear error messages.

Troubleshooting

Invalid Sample Rate

Error: Invalid sample rate, expected 16000Hz

Fix:

ffmpeg -i input.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav

Authorization Failed

Error: Authorization failed

Fix: Check the token format. The value should be Bearer; {token}, with a semicolon between Bearer and the token.
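
The expected value can be built in one place so the semicolon is never dropped. This is a sketch only: the header name Authorization is an assumption inferred from the error message, and auth_header is a hypothetical helper.

```python
def auth_header(token):
    """Return a header dict using the 'Bearer; {token}' format described above."""
    token = token.strip()
    if not token:
        raise ValueError("access token is empty")
    if token.lower().startswith("bearer"):
        # Guard against pasting a full header value where the raw token belongs
        raise ValueError("pass the raw token without a 'Bearer' prefix")
    # Header name is assumed; the semicolon after Bearer is intentional
    return {"Authorization": f"Bearer; {token}"}
```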

Version History

1 version in total

  • v0.1.0 (current)
    2026-03-31 04:57, security status: safe

Security Checks

Tencent Cloud Security (Keen)

Safe, no risk

Tencent Cloud Security (Sanbu)

Safe, no risk
