Local transcription using NVIDIA Parakeet TDT 0.6B v3 with ONNX Runtime.
Runs on CPU — no GPU required. ~30x faster than realtime.
# Clone the repo
git clone https://github.com/groxaxo/parakeet-tdt-0.6b-v3-fastapi-openai.git
cd parakeet-tdt-0.6b-v3-fastapi-openai
# Run with Docker (recommended)
docker compose up -d parakeet-cpu
# Or run directly with Python
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 5000
Default port is 5000. If the server is reachable at a different address, set PARAKEET_URL to that base URL (e.g., http://localhost:5092) so the examples below point at it.
OpenAI-compatible API at $PARAKEET_URL (default: http://localhost:5000).
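The URL resolution described above can be sketched in Python. This is an illustrative helper, not part of the project: the name `transcription_endpoint` is hypothetical, and the sketch assumes only that PARAKEET_URL holds the base URL without the `/v1` suffix, as in the examples in this document.

```python
import os

DEFAULT_URL = "http://localhost:5000"  # default stated in this document


def transcription_endpoint(env=None):
    """Return the full transcription endpoint URL (hypothetical helper)."""
    env = os.environ if env is None else env
    # Fall back to the default when the variable is unset or empty.
    base = env.get("PARAKEET_URL") or DEFAULT_URL
    # Strip trailing slashes so the joined path has exactly one separator.
    return base.rstrip("/") + "/v1/audio/transcriptions"


if __name__ == "__main__":
    print(transcription_endpoint())
```

Passing a plain dict as `env` makes the helper easy to check without touching the real environment.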
# Transcribe audio file (plain text)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
-F "file=@/path/to/audio.mp3" \
-F "response_format=text"
# Get timestamps and segments
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
-F "file=@/path/to/audio.mp3" \
-F "response_format=verbose_json"
# Generate subtitles (SRT)
curl -X POST $PARAKEET_URL/v1/audio/transcriptions \
-F "file=@/path/to/audio.mp3" \
-F "response_format=srt"
import os

from openai import OpenAI

# Point the OpenAI client at the local server; the API key is not checked.
client = OpenAI(
    base_url=os.getenv("PARAKEET_URL", "http://localhost:5000") + "/v1",
    api_key="not-needed",
)

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="parakeet-tdt-0.6b-v3",
        file=f,
        response_format="text",
    )

print(transcript)
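The same call can be wrapped to save subtitles or plain text next to the audio file. The following is a minimal sketch, not part of the project: `transcribe_to_file` and `TEXT_FORMATS` are hypothetical names, the client is passed in rather than created, and the sketch assumes the client returns string-like content for the `text`, `srt`, and `vtt` formats.

```python
from pathlib import Path

# Formats assumed to come back as plain strings from the client.
TEXT_FORMATS = {"text": ".txt", "srt": ".srt", "vtt": ".vtt"}


def transcribe_to_file(client, audio_path, response_format="srt",
                       model="parakeet-tdt-0.6b-v3"):
    """Transcribe audio_path and write the result beside it (hypothetical helper)."""
    if response_format not in TEXT_FORMATS:
        raise ValueError(f"unsupported format: {response_format}")
    audio_path = Path(audio_path)
    with audio_path.open("rb") as f:
        result = client.audio.transcriptions.create(
            model=model, file=f, response_format=response_format
        )
    # Swap the audio extension for the one matching the chosen format.
    out_path = audio_path.with_suffix(TEXT_FORMATS[response_format])
    out_path.write_text(str(result), encoding="utf-8")
    return out_path
```

With the `client` object from the example above, `transcribe_to_file(client, "audio.mp3")` would write `audio.srt` in the same directory.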
| Format | Output |
|---|---|
| text | Plain text |
| json | {"text": "..."} |
| verbose_json | Segments with timestamps and words |
| srt | SRT subtitles |
| vtt | WebVTT subtitles |
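To illustrate working with verbose_json output, here is a small sketch that turns segments into timestamped lines. The field names `segments`, `start`, `end`, and `text` are assumptions based on the OpenAI verbose_json shape and are not confirmed for this server; the sample payload is made up for illustration, and both function names are hypothetical.

```python
def format_timestamp(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(payload):
    """Yield one '[start --> end] text' line per segment."""
    # Assumed keys: "segments", each with "start", "end" (seconds) and "text".
    for seg in payload.get("segments", []):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        yield f"[{start} --> {end}] {seg['text'].strip()}"


# Made-up payload standing in for a parsed verbose_json response.
sample = {
    "text": "Hello world. Second line.",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello world."},
        {"start": 1.5, "end": 3.25, "text": " Second line."},
    ],
}

for line in segment_lines(sample):
    print(line)
```

If the server's actual field names differ, only the key lookups in `segment_lines` need to change.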
English, Spanish, French, German, Italian, Portuguese, Polish, Russian,
Ukrainian, Dutch, Swedish, Danish, Finnish, Norwegian, Greek, Czech,
Romanian, Hungarian, Bulgarian, Slovak, Croatian, Lithuanian, Latvian,
Estonian, Slovenian
Language is auto-detected — no configuration needed.
Open $PARAKEET_URL in a browser for a drag-and-drop transcription UI.
# Check status
docker ps --filter "name=parakeet"
# View logs
docker logs -f <container-name>
# Restart
docker compose restart
# Stop
docker compose down