Generate full songs and instrumental tracks from a text description — studio-quality 44.1 kHz stereo, 5 seconds to 5 minutes, with section-level structure control. ElevenLabs Music on the RunComfy Model API, called through the runcomfy CLI.
```shell
# 1. Install (one of; see runcomfy-cli skill for details)
npm i -g @runcomfy/cli            # global install
npx -y @runcomfy/cli --version    # zero-install

# 2. Sign in
runcomfy login                    # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate music
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{"prompt": "..."}' \
  --output-dir ./out
```
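
When a script cannot assume the global install, one option is to fall back to the zero-install form. The sketch below is only an illustration: `rc` is a hypothetical wrapper name, and it assumes that `npx -y @runcomfy/cli` accepts the same subcommands as the installed `runcomfy` binary.

```shell
# rc: hypothetical wrapper, not part of the CLI.
# Uses the global binary when it is on PATH, otherwise the npx form.
# Assumption: both forms accept the same arguments.
rc() {
  if command -v runcomfy >/dev/null 2>&1; then
    runcomfy "$@"
  else
    npx -y @runcomfy/cli "$@"
  fi
}

# Usage (not executed here):
# rc run elevenlabs/elevenlabs/music-generation --input '{"prompt": "..."}' --output-dir ./out
```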
CLI deep dive: runcomfy-cli skill.
ElevenLabs Music's strength is structured songs with real vocals — it takes a style brief plus lyrics with section markers and returns a coherent, mixed track. Pick it for:
- force_instrumental: true for background music, podcast intros, game loops

If the user just wants ambient sound or a one-off SFX (thunder, footsteps), that's a sound-effects task, not music. ElevenLabs Music is for songs and tracks.
Model: elevenlabs/elevenlabs/music-generation
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| prompt | string | yes | (none) | Style description and lyrics with section markers. See prompting tips |
| music_length_ms | int | no | 40000 | Output duration in ms. 5000–300000 (5 s – 5 min) |
| force_instrumental | bool | no | false | true = instrumental only, no vocals |
| output_format | string | no | mp3_standard | mp3_standard (default), or WAV; see the model page API tab for the full format list |

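
Because music_length_ms must fall within 5000–300000, a pre-flight check can catch an out-of-range value before a request is sent. A minimal sketch, assuming a POSIX shell; `check_length` is a hypothetical helper, not something the CLI provides.

```shell
# check_length: hypothetical pre-flight helper, not part of the CLI.
# Succeeds only for whole numbers inside the documented 5000-300000 ms range.
check_length() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input
  esac
  [ "$1" -ge 5000 ] && [ "$1" -le 300000 ]
}

check_length 40000 && echo "40000 ok"
check_length 4000  || echo "4000 rejected"
```
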
Output: 44.1 kHz stereo audio. The result JSON contains the generated audio URL — the CLI downloads it into --output-dir.
Pricing: ~$0.0083 per second of generated audio (30 s ≈ $0.25, 60 s ≈ $0.50, 5 min ≈ $2.49). Cost scales with music_length_ms, so draft short and finalize long.
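
The arithmetic behind those figures is just seconds times the per-second rate. The helper below is a rough sketch that assumes the approximate $0.0083 rate quoted above; `estimate_cost` is a hypothetical name, and actual billing may round differently.

```shell
# estimate_cost: hypothetical helper, prints an approximate cost in USD.
# Assumes the ~$0.0083 per second rate quoted above.
estimate_cost() {
  LC_ALL=C awk -v ms="$1" 'BEGIN { printf "%.2f\n", ms / 1000 * 0.0083 }'
}

estimate_cost 30000    # prints 0.25
estimate_cost 300000   # prints 2.49
```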
Full vocal song with structure:

```shell
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Upbeat indie-pop anthem, bright electric guitars, driving drums, 120 BPM, female lead vocal. [Intro 8 bars] instrumental build. [Verse] Chalk on the palms, laces double-knotted, morning on the ridge. [Chorus] We rise, we strike, we never fade out. [Bridge] soft breakdown, just piano and voice. [Outro] full band, fade.",
    "music_length_ms": 60000
  }' \
  --output-dir ./out
```
Instrumental background bed:

```shell
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Calm lo-fi hip-hop instrumental for a study playlist. Warm Rhodes piano, soft vinyl crackle, mellow boom-bap drums, 75 BPM. No vocals. Consistent loop-friendly groove throughout.",
    "music_length_ms": 90000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```
Short brand jingle:

```shell
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "5-second cheerful brand stinger, bright marimba and a single uplifting chord resolve, no vocals.",
    "music_length_ms": 5000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```
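
Draft, then final render: since cost scales with music_length_ms, the same prompt can be rendered short first and at full length afterwards. The following is a sketch under assumptions: `STYLE`, `build_input`, `render`, the 180000 ms final length, and the output directories are all hypothetical, and `build_input` does no JSON escaping, so it only suits prompts without double quotes or backslashes.

```shell
# Hypothetical two-pass workflow; names and directories are placeholders.
STYLE='Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal. [Verse] ... [Chorus] ...'

# build_input: wraps a prompt and a length in the JSON body used above.
# Assumes the prompt has no double quotes or backslashes (no escaping is done).
build_input() {
  printf '{"prompt": "%s", "music_length_ms": %s}' "$1" "$2"
}

# render: $1 = length in ms, $2 = output directory
render() {
  runcomfy run elevenlabs/elevenlabs/music-generation \
    --input "$(build_input "$STYLE" "$1")" \
    --output-dir "$2"
}

# Uncomment to run:
# render 35000  ./out/draft   # cheap pass to settle genre, tempo, structure
# render 180000 ./out/final   # full-length render once the draft sounds right
```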
ElevenLabs Music reads one prompt field that carries both the style brief and the lyrics. Structure it well:

- Lead with the style brief, for example "Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal."
- Mark sections with [Intro], [Verse], [Chorus], [Bridge], [Outro]. Add approximate durations or bar counts, such as [Intro 8 bars], [Verse 16 bars].
- Describe the arrangement per section, for example "electric guitar carries the chorus, drums sit back in the verse."
- For instrumentals, set force_instrumental: true AND say "no vocals" in the prompt; belt and suspenders.
- State the sung language inside the section (e.g. [Verse] (sung in Brazilian Portuguese) ...).
- Draft short (e.g. music_length_ms: 35000) before paying for a 5-minute render.

Common setups:

- Video soundtrack: [Intro]/[Verse]/[Chorus] structure, music_length_ms matched to the video length
- Podcast intro: force_instrumental: true, 10–20 s, "loop-friendly, clean ending"
- Game loop: force_instrumental: true, describe "seamless loop", 60–120 s, consistent groove
- Draft then final: music_length_ms: 35000 to lock genre/tempo/structure → final render at full length

Limitations:

- The prompt field carries everything (style + lyrics). There is no separate "lyrics" parameter.
- Length is capped at 5 minutes (music_length_ms 5000–300000). For longer pieces, generate sections and stitch externally.
- force_instrumental is the only vocal toggle; you can't request specific voice identities or clone a singer through this endpoint.

Exit codes:

| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
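
A caller can branch on these exit codes rather than parsing output. The sketch below is one possible mapping, not behavior built into the CLI: `next_step` and `run_with_retry` are hypothetical helpers, only code 75 is treated as retryable (as the table labels it), and the attempt limit and backoff values are arbitrary.

```shell
# next_step: hypothetical helper mapping an exit code to an action,
# following the table above.
next_step() {
  case "$1" in
    0)     echo "done" ;;
    75)    echo "retry" ;;       # timeout / 429
    77)    echo "login" ;;       # not signed in or token rejected
    64|65) echo "fix-input" ;;   # bad CLI args or bad input JSON
    *)     echo "fail" ;;        # includes 69 (upstream 5xx)
  esac
}

# run_with_retry: runs the given command, retrying only on code 75,
# for at most 3 attempts. Returns the last exit code.
run_with_retry() {
  attempt=1
  while :; do
    "$@"; code=$?
    [ "$(next_step "$code")" = "retry" ] && [ "$attempt" -lt 3 ] || return "$code"
    attempt=$((attempt + 1))
    sleep $((attempt * 5))       # arbitrary backoff: 10 s, then 15 s
  done
}

# Usage (not executed here):
# run_with_retry runcomfy run elevenlabs/elevenlabs/music-generation \
#   --input '{"prompt": "..."}' --output-dir ./out
```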
The skill invokes runcomfy run elevenlabs/elevenlabs/music-generation with the JSON body. The CLI POSTs to the RunComfy Model API, polls request status, fetches the result, and downloads the generated audio file into --output-dir. Ctrl-C cancels the remote request before exit.
- Install with npm i -g @runcomfy/cli or npx -y @runcomfy/cli. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf. If the operator wants the curl-pipe path documented at docs.runcomfy.com/cli/install, they should review the script first.
- runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600. Set the RUNCOMFY_TOKEN env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
- Prompt content is passed through --input. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content, even with backticks, quotes, or $(...) patterns.
- Network endpoints: model-api.runcomfy.net (request submission) and .runcomfy.net / .runcomfy.com (download whitelist for generated audio). No telemetry, no callbacks.
- The skill executes runcomfy; the npm / npx lines are one-time operator setup, not commands the skill executes per call.
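
The CLI leaves prompt content alone, but the calling shell still parses the command line first, and the single-quoted --input form breaks as soon as lyrics contain an apostrophe. One way around that is to write the body with a quoted heredoc and pass the file content as the --input string. This is a sketch; the file name input.json and the prompt text are placeholders.

```shell
# Write the JSON body with a quoted heredoc so the calling shell leaves
# apostrophes, backticks, and $(...) untouched. The file name is arbitrary.
cat > input.json <<'JSON'
{
  "prompt": "Folk ballad, acoustic guitar, male vocal. [Verse] Don't look back, the river's `cold` and $(deep).",
  "music_length_ms": 35000
}
JSON

# Pass the file content as the --input string (not executed here):
# runcomfy run elevenlabs/elevenlabs/music-generation \
#   --input "$(cat input.json)" \
#   --output-dir ./out
```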