Process multiple AI inference requests asynchronously using the Doubleword batch API with high throughput and low cost.
Before submitting batches, you need:
Batches are ideal for:
Pricing is per 1 million tokens (input / output):
Qwen3-VL-30B-A3B-Instruct-FP8 (mid-size):
Qwen3-VL-235B-A22B-Instruct-FP8 (flagship):
Cost estimation: Upload files to the Doubleword Console to preview expenses before submitting.
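Because prices are quoted per 1 million tokens, a rough estimate can also be worked out by hand. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and the prices in the usage line are placeholders rather than Doubleword's actual rates (use the Console preview or references/pricing.md for real figures).

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate cost when prices are quoted per 1 million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Placeholder prices for illustration only; substitute the real rates.
example_cost = estimate_cost(2_000_000, 500_000,
                             input_price_per_m=0.10, output_price_per_m=0.40)
print(f"Estimated cost: ${example_cost:.2f}")
```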
Two ways to submit batches:
Via API:
Via Web Console:
Create a .jsonl file where each line contains a complete, valid JSON object with no line breaks within the object:
{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is 2+2?"}]}}
{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "anthropic/claude-3-5-sonnet", "messages": [{"role": "user", "content": "What is the capital of France?"}]}}
Required fields per line:
custom_id: Unique identifier (max 64 chars) - use descriptive IDs like "user-123-question-5" for easier result mapping
method: Always "POST"
url: API endpoint - "/v1/chat/completions" or "/v1/embeddings"
body: Standard API request with model and messages
Optional body parameters:
temperature: 0-2 (default: 1.0)
max_tokens: Maximum response tokens
top_p: Nucleus sampling parameter
stop: Stop sequences
tools: Tool definitions for tool calling (see Tool Calling section)
response_format: JSON schema for structured outputs (see Structured Outputs section)
File requirements:
custom_id values
Common pitfalls:
custom_id values
Helper script:
Use scripts/create_batch_file.py to generate JSONL files programmatically:
python scripts/create_batch_file.py output.jsonl
Modify the script's requests list to generate your specific batch requests.
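The field rules described above can be checked locally before uploading. The following is a minimal sketch, not part of scripts/create_batch_file.py: `validate_batch_lines` is a hypothetical helper that checks only the rules stated in this document (valid JSON per line, required fields, custom_id uniqueness and length, method, and url).

```python
import json

ALLOWED_URLS = {"/v1/chat/completions", "/v1/embeddings"}
REQUIRED_FIELDS = ("custom_id", "method", "url", "body")


def validate_batch_lines(lines):
    """Return a list of human-readable problems found in JSONL batch lines."""
    problems = []
    seen_ids = set()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            # Blank lines are skipped here; how the API treats them is not stated.
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(request, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in request]
        if missing:
            problems.append(f"line {number}: missing fields {missing}")
            continue
        custom_id = request["custom_id"]
        if not isinstance(custom_id, str) or len(custom_id) > 64:
            problems.append(f"line {number}: custom_id must be a string of at most 64 chars")
        elif custom_id in seen_ids:
            problems.append(f"line {number}: duplicate custom_id {custom_id!r}")
        else:
            seen_ids.add(custom_id)
        if request["method"] != "POST":
            problems.append(f"line {number}: method must be POST")
        if request["url"] not in ALLOWED_URLS:
            problems.append(f"line {number}: unsupported url {request['url']!r}")
    return problems


sample = [
    '{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
    '{"custom_id": "req-1", "method": "GET", "url": "/v1/chat/completions", "body": {}}',
]
for problem in validate_batch_lines(sample):
    print(problem)
```

To check a real file, pass the open file object directly, for example `validate_batch_lines(open("batch_requests.jsonl"))`.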
Via API:
curl https://api.doubleword.ai/v1/files \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
-F purpose="batch" \
-F file="@batch_requests.jsonl"
Via Console:
Upload through the Batches section at https://app.doubleword.ai/
The response contains an id field. Save this file ID for the next step.
Via API:
curl https://api.doubleword.ai/v1/batches \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'
Via Console:
Configure batch settings in the web interface.
Parameters:
input_file_id: File ID from upload step
endpoint: API endpoint ("/v1/chat/completions" or "/v1/embeddings")
completion_window: Choose based on urgency and budget:
  "24h": Best pricing, results within 24 hours (typically faster)
  "1h": 50% price premium, results within 1 hour (typically faster)
Response contains batch id - save this for status polling.
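The same request body can be assembled in Python before it is sent. This is a minimal sketch: `build_batch_payload` is a hypothetical helper, and it accepts only the endpoint and completion_window values listed above.

```python
import json

VALID_ENDPOINTS = {"/v1/chat/completions", "/v1/embeddings"}
VALID_WINDOWS = {"24h", "1h"}


def build_batch_payload(input_file_id: str,
                        endpoint: str = "/v1/chat/completions",
                        completion_window: str = "24h") -> dict:
    """Build the JSON body for a batch creation request."""
    if not input_file_id:
        raise ValueError("input_file_id is required")
    if endpoint not in VALID_ENDPOINTS:
        raise ValueError(f"unsupported endpoint: {endpoint!r}")
    if completion_window not in VALID_WINDOWS:
        raise ValueError(f"unsupported completion_window: {completion_window!r}")
    return {
        "input_file_id": input_file_id,
        "endpoint": endpoint,
        "completion_window": completion_window,
    }


payload = build_batch_payload("file-abc123")
print(json.dumps(payload, indent=2))
```

With the requests library, the payload could then be posted as `requests.post("https://api.doubleword.ai/v1/batches", headers={"Authorization": f"Bearer {api_key}"}, json=payload)`, where `api_key` holds your key.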
Before submitting, verify:
Via API:
curl https://api.doubleword.ai/v1/batches/batch-xyz789 \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
Monitor real-time progress in the Batches dashboard.
Status progression:
validating - Checking input file format
in_progress - Processing requests
completed - All requests finished
Other statuses:
failed - Batch failed (check error_file_id)
expired - Batch timed out
cancelling/cancelled - Batch cancelled
Response includes:
output_file_id - Download results here
error_file_id - Failed requests (if any)
request_counts - Total/completed/failed counts
Polling frequency: Check every 30-60 seconds during processing.
Early access: Results are available via output_file_id before the batch fully completes. Check the X-Incomplete response header to see whether more results are still coming.
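A polling loop over these statuses might look like the sketch below. It is an illustration under assumptions: `poll_batch` and `fetch_batch` are hypothetical names, the request_counts keys (`total`, `completed`) are assumed from the OpenAI-compatible format, and the demo uses canned responses instead of real API calls.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def poll_batch(fetch_batch, interval_seconds=30, max_checks=None, sleep=time.sleep):
    """Call fetch_batch() until the batch reaches a terminal status.

    fetch_batch is any zero-argument callable that returns the batch as a dict.
    """
    checks = 0
    while True:
        batch = fetch_batch()
        checks += 1
        counts = batch.get("request_counts") or {}
        total = counts.get("total", 0)
        if total:
            completed = counts.get("completed", 0)
            print(f"{batch['status']}: {completed}/{total} requests completed")
        if batch["status"] in TERMINAL_STATUSES:
            return batch
        if max_checks is not None and checks >= max_checks:
            raise TimeoutError(f"batch still {batch['status']} after {checks} checks")
        sleep(interval_seconds)


# Demo with canned responses and a no-op sleep, so nothing touches the network.
responses = iter([
    {"status": "validating"},
    {"status": "in_progress", "request_counts": {"total": 2, "completed": 1, "failed": 0}},
    {"status": "completed", "request_counts": {"total": 2, "completed": 2, "failed": 0}},
])
final = poll_batch(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

In real use, `fetch_batch` could wrap a GET request to the batch status endpoint and return the parsed JSON body.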
Via API:
curl https://api.doubleword.ai/v1/files/file-output123/content \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
> results.jsonl
Via Console:
Download results directly from the Batches dashboard.
Response headers:
X-Incomplete: true - Batch still processing, more results coming
X-Last-Line: 45 - Resume point for partial downloads
Output format (each line):
{
  "id": "batch-req-abc",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "body": {
      "id": "chatcmpl-xyz",
      "choices": [{
        "message": {
          "role": "assistant",
          "content": "The answer is 4."
        }
      }]
    }
  }
}
Download errors (if any):
curl https://api.doubleword.ai/v1/files/file-error123/content \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY" \
> errors.jsonl
Error format (each line):
{
  "id": "batch-req-def",
  "custom_id": "request-2",
  "error": {
    "code": "invalid_request",
    "message": "Missing required parameter"
  }
}
Tool calling (function calling) enables models to intelligently select and use external tools. Doubleword maintains full OpenAI compatibility.
Example batch request with tools:
{
  "custom_id": "tool-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string"}
          },
          "required": ["location"]
        }
      }
    }]
  }
}
Use cases:
Structured outputs guarantee that model responses conform to your JSON Schema, eliminating issues with missing fields or invalid enum values.
Example batch request with structured output:
{
  "custom_id": "structured-req-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Extract key info from: John Doe, 30 years old, lives in NYC"}],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_info",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "city": {"type": "string"}
          },
          "required": ["name", "age", "city"]
        }
      }
    }
  }
}
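Once the batch completes, a structured response can be decoded from the results file. The sketch below assumes the structured output arrives as a JSON-encoded string in the message content, as in OpenAI-compatible APIs; `parse_structured_result` is a hypothetical helper, and the example line is made up to match the output format described in this document.

```python
import json


def parse_structured_result(result_line: str, required_fields):
    """Return (custom_id, parsed_content) for one line of a results file."""
    result = json.loads(result_line)
    content = result["response"]["body"]["choices"][0]["message"]["content"]
    data = json.loads(content)  # assumed: content is a JSON-encoded string
    missing = [field for field in required_fields if field not in data]
    if missing:
        raise ValueError(f"{result['custom_id']}: missing fields {missing}")
    return result["custom_id"], data


# Made-up result line for the person_info request, for illustration only.
example_line = json.dumps({
    "custom_id": "structured-req-1",
    "response": {"status_code": 200, "body": {"choices": [{"message": {
        "role": "assistant",
        "content": json.dumps({"name": "John Doe", "age": 30, "city": "NYC"}),
    }}]}},
})
custom_id, person = parse_structured_result(example_line, ["name", "age", "city"])
print(custom_id, person["name"], person["age"], person["city"])
```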
Benefits:
autobatcher is a Python client that automatically converts individual API calls into batched requests, reducing costs without code changes.
Installation:
pip install autobatcher
How it works:
Key benefit: Significant cost reduction through automatic batching while writing normal async code using the familiar OpenAI interface.
Documentation: https://github.com/doublewordai/autobatcher
Via API:
curl "https://api.doubleword.ai/v1/batches?limit=10" \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
View all batches in the dashboard.
Via API:
curl https://api.doubleword.ai/v1/batches/batch-xyz789/cancel \
-X POST \
-H "Authorization: Bearer $DOUBLEWORD_API_KEY"
Via Console:
Click cancel in the batch details view.
Notes:
Parse JSONL output line-by-line:
import json

with open('results.jsonl') as f:
    for line in f:
        # Each line is one JSON object describing a single request's result
        result = json.loads(line)
        custom_id = result['custom_id']
        content = result['response']['body']['choices'][0]['message']['content']
        print(f"{custom_id}: {content}")
Check for incomplete batches and resume:
import os
import requests

# Read the API key from the environment, matching the curl examples
api_key = os.environ['DOUBLEWORD_API_KEY']

response = requests.get(
    'https://api.doubleword.ai/v1/files/file-output123/content',
    headers={'Authorization': f'Bearer {api_key}'}
)
response.raise_for_status()

if response.headers.get('X-Incomplete') == 'true':
    last_line = int(response.headers.get('X-Last-Line', 0))
    print(f"Batch incomplete. Processed {last_line} requests so far.")
    # Continue polling and download again later
Extract failed requests from error file and resubmit:
import json

failed_ids = []
with open('errors.jsonl') as f:
    for line in f:
        error = json.loads(line)
        failed_ids.append(error['custom_id'])

print(f"Failed requests: {failed_ids}")
# Create new batch with only failed requests
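The final step, building a new batch from only the failed requests, could be sketched as follows. `build_retry_lines` is a hypothetical helper; it assumes you still have the original request lines and matches them to the error file by custom_id.

```python
import json


def build_retry_lines(original_lines, error_lines):
    """Return the original request lines whose custom_id appears in the error lines."""
    failed_ids = {
        json.loads(line)["custom_id"] for line in error_lines if line.strip()
    }
    return [
        line for line in original_lines
        if line.strip() and json.loads(line)["custom_id"] in failed_ids
    ]


# In-memory sample data; in practice read these from the two JSONL files.
original = [
    '{"custom_id": "req-1", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
    '{"custom_id": "req-2", "method": "POST", "url": "/v1/chat/completions", "body": {}}',
]
errors = [
    '{"id": "batch-req-def", "custom_id": "req-2", "error": {"code": "invalid_request"}}',
]
retry_lines = build_retry_lines(original, errors)
print(retry_lines)
```

Writing `retry_lines` to a new .jsonl file gives an input file that can be uploaded and submitted like any other batch. Fix the underlying cause of each error first, or the retry will likely fail the same way.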
Handle tool call responses:
import json

with open('results.jsonl') as f:
    for line in f:
        result = json.loads(line)
        message = result['response']['body']['choices'][0]['message']
        # tool_calls is present only when the model chose to call a tool
        if message.get('tool_calls'):
            for tool_call in message['tool_calls']:
                print(f"Tool: {tool_call['function']['name']}")
                print(f"Args: {tool_call['function']['arguments']}")
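To act on a tool call rather than just print it, the arguments need decoding and routing to local code. The sketch below assumes `arguments` is a JSON-encoded string, as in the OpenAI format; `get_weather`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names, and the weather function is a placeholder.

```python
import json


def get_weather(location: str) -> str:
    """Hypothetical local implementation of the get_weather tool."""
    return f"(placeholder) weather for {location}"


# Maps tool names from the batch request to local Python functions.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message: dict) -> list:
    """Execute every tool call in an assistant message and collect the outputs."""
    outputs = []
    for tool_call in message.get("tool_calls") or []:
        function = tool_call["function"]
        handler = TOOL_REGISTRY.get(function["name"])
        if handler is None:
            raise KeyError(f"no handler registered for tool {function['name']!r}")
        arguments = json.loads(function["arguments"])  # assumed JSON string
        outputs.append(handler(**arguments))
    return outputs


example_message = {
    "role": "assistant",
    "tool_calls": [{
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "Paris"}'},
    }],
}
print(run_tool_calls(example_message))
```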
"user-123-question-5", "dataset-A-row-42""1", "req1"custom_id must be unique within the batch24h for cost savings (50-83% cheaper), 1h only when time-sensitiveerror_file_id and retry failed requestscompleted/total ratioFor complete API details, see:
references/api_reference.md - Full endpoint documentation and schemas
references/getting_started.md - Detailed setup and account management
references/pricing.md - Model costs and SLA comparison