Gladia Pre Recorded Transcription

Transcribe pre-recorded audio files or URLs with Gladia. Use when the user needs batch/async transcription, speaker diarization, subtitles (SRT/VTT), PII red...
Transcribe pre-recorded audio or URLs with Gladia, with support for batch/async transcription, speaker diarization, subtitles (SRT/VTT), and PII redaction.
gladiaio · clawhub · v1.0.1 (1 version) · Key: not required

Overview

Pre-Recorded Transcription

Gladia's pre-recorded API transcribes audio and video files asynchronously.

> SDK-first: always use the official SDK — see gladia-sdk-integration for policy, setup, and fallback criteria.

When to Use

  • Existing audio/video files or URLs (including social/video links)
  • Batch or asynchronous transcription workflows
  • Pre-recorded-only features: diarization, PII redaction, subtitles

When NOT to use: If the user needs real-time / live transcription of a stream, microphone, or ongoing audio feed, use the gladia-live-transcription skill instead. Live transcription uses WebSocket sessions, not the pre-recorded API.

References

Consult these resources as needed:

  • ./references/transcription-options.md -- Full options (JS + Python)
  • ./references/managing-jobs.md -- get, list, getFile, delete
  • ./references/delivery-and-response.md -- Response shape and events
  • ../gladia-audio-intelligence/SKILL.md -- Feature availability and config
  • ../gladia-sdk-integration/SKILL.md -- Setup, config, SDK vs raw API
  • ../gladia-sdk-integration/references/sdk-versions.md -- Current SDK versions
  • ../gladia-troubleshooting/SKILL.md -- Errors and diagnostics

API Endpoints (reference — prefer SDK methods instead)

Endpoint                  | Method | SDK equivalent
--------------------------|--------|--------------------------------------
/v2/upload                | POST   | transcribe() auto-uploads local files
/v2/pre-recorded          | POST   | create() / transcribe()
/v2/pre-recorded          | GET    | list()
/v2/pre-recorded/:id      | GET    | get() / poll() / transcribe()
/v2/pre-recorded/:id      | DELETE | delete()
/v2/pre-recorded/:id/file | GET    | getFile()

Workflow

Recommended (SDK)

The SDK transcribe() method handles upload, job creation, and polling in one call. Use this by default.

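JavaScript:

// `client` is an already-configured SDK client (see gladia-sdk-integration).
const result = await client.preRecorded().transcribe("./audio.mp3", {
  language_config: { languages: ["en"] },
  diarization: true,
});

// Optional chaining guards against a missing result or transcription.
console.log(result.result?.transcription?.full_transcript);

Python:

# `client` is an already-configured SDK client (see gladia-sdk-integration).
result = client.prerecorded().transcribe(
    "audio.mp3",
    {"language_config": {"languages": ["en"]}, "diarization": True},
)

print(result.result.transcription.full_transcript)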

Audio input can be a local file path, HTTP(S) URL, social/video URL, or binary file object. For full input types, see gladia-sdk-integration.

Fallback (raw REST — only when SDK is not feasible)

Use raw REST only when SDK use is not possible.

  1. Upload (if local file): POST /v2/upload with multipart form data → get audio_url
  2. Create job: POST /v2/pre-recorded with audio_url and config → get id
  3. Poll: GET /v2/pre-recorded/:id until status: "done" (or use webhooks/callbacks)
  4. Parse results: Extract transcription, diarization, translation, etc. from response
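
Steps 2 to 4 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client: the base URL `https://api.gladia.io`, the `x-gladia-key` header, the `"error"` status value, the top-level `result` key, and the helper names `gladia_call` and `run_job` are assumptions that do not appear in this document. Only the paths, `audio_url`, `id`, and `status: "done"` come from the steps above. The upload step is omitted, so `audio_url` must already be reachable.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.gladia.io"  # assumption: not stated in this document


def gladia_call(api_key, method, path, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    req.add_header("x-gladia-key", api_key)  # assumption: auth header name
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_job(call, audio_url, config, interval=3.0, max_polls=200, sleep=time.sleep):
    """Create a job (step 2), poll until done (step 3), return its result (step 4).

    `call(method, path, body)` performs the HTTP request. It is injected so the
    flow can be exercised without network access.
    """
    created = call("POST", "/v2/pre-recorded", {"audio_url": audio_url, **config})
    job_id = created["id"]
    for _ in range(max_polls):
        job = call("GET", f"/v2/pre-recorded/{job_id}", None)
        if job["status"] == "done":
            return job["result"]  # assumption: results live under "result"
        if job["status"] == "error":  # assumption: failure status value
            raise RuntimeError(f"job {job_id} failed")
        sleep(interval)
    raise TimeoutError(f"job {job_id} not done after {max_polls} polls")
```

With a real key, the transport can be bound with a lambda, for example `run_job(lambda m, p, b: gladia_call(key, m, p, b), url, {"diarization": True})`. Injecting `call` and `sleep` keeps the control flow separate from the HTTP details, which is also why the fixed `interval` here is easy to swap for a backoff schedule.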

Managing Jobs

Use SDK methods for post-processing operations:

  • JavaScript: client.preRecorded().get(id), .list(filters), .getFile(id), .delete(id)
  • Python: client.prerecorded().get(id), .list(filters), .get_file(id), .delete(id)

For full JS/Python examples, pagination filters, and REST equivalents, see ./references/managing-jobs.md.

Transcription Options

All options are passed as the second argument to transcribe(). Key options:

Option          | Description
----------------|--------------------------------------------
language_config | Expected languages, code switching
diarization     | Speaker identification (pre-recorded only)
translation     | Translate to target languages
summarization   | Generate bullet points or paragraph summary
subtitles       | Generate SRT/VTT files
pii_redaction   | Redact PII (pre-recorded only)
audio_to_llm    | Run custom LLM prompts on transcript
callback_url    | Async webhook delivery

For full option details, see ./references/transcription-options.md. For audio intelligence config, see gladia-audio-intelligence. For client-level retry/timeouts, see gladia-sdk-integration.

Response and Delivery

For full response JSON and event names, see ./references/delivery-and-response.md.

Limits and Specifications

Constraint              | Value
------------------------|----------------------------------
Max file size           | 1000 MB
Max duration            | 135 minutes (120 min for YouTube)
Enterprise max duration | 4 h 15 min
Concurrency (paid)      | 25 concurrent jobs
Concurrency (free)      | 3 concurrent jobs
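
The size and duration limits in the table can be checked locally before a job is submitted. The sketch below is illustrative only: the helper name `check_limits` is hypothetical, 1 MB is assumed to mean 10**6 bytes, and the enterprise limit is left out. The caller is expected to supply the file size and duration.

```python
MAX_FILE_BYTES = 1000 * 1000 * 1000  # 1000 MB; assumption: 1 MB = 10**6 bytes
MAX_DURATION_MIN = 135               # standard limit from the table
MAX_DURATION_MIN_YOUTUBE = 120       # YouTube limit from the table


def check_limits(size_bytes, duration_min, is_youtube=False):
    """Return a list of limit violations for one input (empty if it fits)."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 1000 MB")
    limit = MAX_DURATION_MIN_YOUTUBE if is_youtube else MAX_DURATION_MIN
    if duration_min > limit:
        problems.append(f"duration over {limit} minutes")
    return problems
```

An empty list means the input fits the standard limits. Anything else is worth splitting or trimming before upload.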

Polling Best Practices

The SDK handles polling automatically — transcribe() polls until the job completes with configurable interval and timeout:

const result = await client.preRecorded().transcribe(audio, options, {
  interval: 5000, // Poll every 5s
  timeout: 600000, // Timeout after 10 minutes
});

If using raw REST instead of the SDK:

  • Use webhooks or callbacks instead of polling when possible
  • If polling, implement exponential backoff (start at 3s, max 30s)
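
The backoff rule above can be sketched as a delay generator. The doubling factor and the name `backoff_delays` are assumptions. Only the 3 s start and the 30 s cap come from the text.

```python
import itertools


def backoff_delays(start=3.0, cap=30.0, factor=2.0):
    """Yield poll delays in seconds: start, start*factor, ... capped at `cap`."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First six delays with the defaults.
print(list(itertools.islice(backoff_delays(), 6)))  # [3.0, 6.0, 12.0, 24.0, 30.0, 30.0]
```

A polling loop would sleep for the next value from this generator between successive GET /v2/pre-recorded/:id requests.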

Common Mistakes

  • Code switching without a language list: enabling code_switching: true with an empty languages list makes the API evaluate 100+ languages. Always provide 3-5 expected languages.
  • Polling without backoff: rapid polling wastes requests and may trigger 429s. The SDK handles this; for raw REST, use webhooks or exponential backoff.
  • Expecting pre-recorded-only features in live mode: diarization, PII redaction, and subtitles are pre-recorded only and are not available in live mode.
  • Wrong audio file path: the audio download endpoint is /v2/pre-recorded/:id/file, not /v2/pre-recorded/:id/audio.
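
The code-switching mistake above can be caught before a job is submitted. The following is a minimal sketch: the helper name `check_language_config` is hypothetical, and the 3-5 range is the recommendation from this section rather than a limit known to be enforced by the API.

```python
def check_language_config(language_config):
    """Return a list of warnings for a language_config dict (empty if none)."""
    warnings = []
    languages = language_config.get("languages") or []
    if language_config.get("code_switching"):
        if not languages:
            warnings.append("code_switching is on but languages is empty")
        elif not 3 <= len(languages) <= 5:
            warnings.append("provide 3-5 expected languages for code_switching")
    return warnings
```

For example, `check_language_config({"code_switching": True, "languages": []})` returns one warning, while a config without code switching returns none.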

For the full list of gotchas and diagnostics, see the gladia-troubleshooting skill.

Version History

1 version in total

  • v1.0.1 (current)
    2026-06-09 19:31
