Gemini on RunAPI exposes two protocols:
| Protocol | Endpoint | Use when |
|---|---|---|
| OpenAI-compatible | POST /v1beta/openai/chat/completions | You already use the OpenAI SDK or any OpenAI client |
| Native Gemini | POST /v1beta/models/ | You use Google's @google/generative-ai SDK (currently gemini-3-flash-preview only) |
Both accept the same RunAPI API Key.
RUNAPI_TOKEN=YOUR_RUNAPI_TOKEN
Get a RunAPI API Key at
from openai import OpenAI
client = OpenAI(
api_key="YOUR_RUNAPI_TOKEN",
base_url="https://runapi.ai/v1beta/openai",
)
import OpenAI from "openai";
const client = new OpenAI({
apiKey: "YOUR_RUNAPI_TOKEN",
baseURL: "https://runapi.ai/v1beta/openai",
});
export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai
response = client.chat.completions.create(
model="gemini-2.5-flash",
messages=[{"role": "user", "content": "Explain quantum computing simply."}],
)
print(response.choices[0].message.content)
print(response.usage)
const response = await client.chat.completions.create({
model: "gemini-2.5-flash",
messages: [{ role: "user", content: "Explain quantum computing simply." }],
});
curl -X POST "https://runapi.ai/v1beta/openai/chat/completions" \
-H "x-api-key: YOUR_RUNAPI_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.5-flash",
"messages": [{"role": "user", "content": "Explain quantum computing simply."}]
}'
curl -X POST \
"https://runapi.ai/v1beta/models/gemini-3-flash-preview:streamGenerateContent" \
-H "x-goog-api-key: YOUR_RUNAPI_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"contents": [
{ "role": "user", "parts": [{ "text": "Hello!" }] }
]
}'
The native protocol returns SSE chunks in Google's streamGenerateContent
format — use the official @google/generative-ai SDK or Google's
google-genai Python package to consume it.
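The SDKs remain the recommended way to consume the native stream. Purely to illustrate what they handle, here is a minimal sketch of pulling text out of SSE lines. It assumes each event is a single `data:` line carrying one JSON chunk in Google's `candidates[].content.parts[].text` shape; the helper name `iter_text_from_sse` and the sample chunks are hypothetical and were not captured from a real RunAPI response.

```python
import json
from typing import Iterable, Iterator


def iter_text_from_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (hypothetical helper, assumed chunk shape)."""
    for raw in lines:
        line = raw.strip()
        # Only "data:" lines carry a payload; skip blank separators and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        chunk = json.loads(payload)
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                text = part.get("text")
                if text:
                    yield text


# Hand-written sample chunks for illustration only.
sample_lines = [
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "Hel"}]}}]}',
    "",
    'data: {"candidates": [{"content": {"role": "model", "parts": [{"text": "lo!"}]}}]}',
    "",
]

print("".join(iter_text_from_sse(sample_lines)))
```

Chunks without a text part (for example, ones that only carry metadata) are skipped rather than treated as errors.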
stream = client.chat.completions.create(
model="gemini-2.5-flash",
messages=[{"role": "user", "content": "Write a haiku about coding."}],
stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
Streaming runs through a regional edge proxy so the request does not hold a
Rails/Puma thread. Long generations should always stream.
{
"model": "gemini-2.5-flash",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "What is in this image?" },
{ "type": "image_url", "image_url": { "url": "https://example.com/img.jpg" } }
]
}
]
}
Standard OpenAI multimodal block for the OpenAI-compatible endpoint. For the
native endpoint, embed image data as parts[].inlineData or parts[].fileData.
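For the native endpoint, a request body with an inline image might be assembled as in the sketch below. The `inlineData` field names (`mimeType`, `data`) follow Google's REST format; the helper name `native_image_request` and the placeholder bytes are hypothetical, and the sketch only builds the body without sending it.

```python
import base64
import json


def native_image_request(prompt: str, image_bytes: bytes,
                         mime_type: str = "image/jpeg") -> dict:
    """Build a native-style body with one text part and one inline image part."""
    return {
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": prompt},
                    {
                        "inlineData": {
                            "mimeType": mime_type,
                            # inlineData carries the raw bytes as base64 text.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ],
            }
        ]
    }


# Placeholder bytes stand in for the contents of a real image file.
body = native_image_request("What is in this image?", b"\xff\xd8\xff")
print(json.dumps(body, indent=2))
```

Inline data travels inside the request itself, so it suits small images; `parts[].fileData` references a file by URI instead.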
{
"model": "gemini-2.5-pro",
"messages": [
{ "role": "user", "content": "Latest news on Gemini 3." }
],
"tools": [
{ "type": "function", "function": { "name": "googleSearch" } }
]
}
The googleSearch tool is available on gemini-2.5-flash, gemini-2.5-pro, gemini-3.1-pro-preview,
and gemini-3-pro-preview.
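One way to avoid sending the search tool to a model that does not list it is to build the request through a small guard. This is a sketch only: `SEARCH_MODELS` mirrors the list above and will go stale if the list changes, and `search_request` is a hypothetical helper, not part of any SDK.

```python
# Models that the list above names as supporting googleSearch (copied by hand).
SEARCH_MODELS = {
    "gemini-2.5-flash",
    "gemini-2.5-pro",
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
}


def search_request(model: str, question: str) -> dict:
    """Return chat-completions parameters with the googleSearch tool attached."""
    if model not in SEARCH_MODELS:
        raise ValueError(f"{model} is not listed as supporting googleSearch")
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [{"type": "function", "function": {"name": "googleSearch"}}],
    }


params = search_request("gemini-2.5-pro", "Latest news on Gemini 3.")
print(params["tools"])
```

With a configured OpenAI client, the resulting dictionary could be passed as `client.chat.completions.create(**params)`.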
{
"model": "gemini-2.5-flash",
"messages": [{ "role": "user", "content": "Give me one person object." }],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "person",
"schema": {
"type": "object",
"properties": { "name": { "type": "string" }, "age": { "type": "integer" } },
"required": ["name", "age"]
}
}
}
}
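On the client side, the structured reply still needs to be parsed and checked. The sketch below assumes the JSON arrives as a string in `response.choices[0].message.content`, as it does with OpenAI's structured output; that behaviour is an assumption for RunAPI, and `parse_person` and the sample string are hypothetical.

```python
import json


def parse_person(content: str) -> dict:
    """Parse a reply and check it against the "person" schema's required fields."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in ("name", "age") if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(data["name"], str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(data["age"], bool) or not isinstance(data["age"], int):
        raise ValueError("age must be an integer")
    return data


# Hand-written sample standing in for response.choices[0].message.content.
sample_content = '{"name": "Ada", "age": 36}'
person = parse_person(sample_content)
print(person["name"], person["age"])
```

Validating locally keeps a malformed reply from propagating into downstream code even when the schema is enforced by the API.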
Reasoning effort is supported on gemini-2.5-pro, gemini-3.1-pro-preview, gemini-3-pro-preview,
and gemini-3-flash-preview. Pass reasoning_effort: "low" | "medium" | "high".
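A small guard can keep `reasoning_effort` off requests to models outside that list. The sketch below is illustrative: `REASONING_MODELS` is copied by hand from the sentence above, `with_reasoning_effort` is a hypothetical helper, and silently dropping the parameter for other models is a design choice made here, not documented RunAPI behaviour.

```python
# Models named above as supporting reasoning_effort (copied by hand).
REASONING_MODELS = {
    "gemini-2.5-pro",
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
    "gemini-3-flash-preview",
}
EFFORTS = {"low", "medium", "high"}


def with_reasoning_effort(params: dict, effort: str) -> dict:
    """Return a copy of params, adding reasoning_effort only for listed models."""
    if effort not in EFFORTS:
        raise ValueError(f"effort must be one of {sorted(EFFORTS)}")
    if params.get("model") not in REASONING_MODELS:
        return dict(params)  # unlisted model: leave the request unchanged
    return {**params, "reasoning_effort": effort}


base = {"model": "gemini-2.5-pro", "messages": [{"role": "user", "content": "Hi"}]}
print(with_reasoning_effort(base, "high")["reasoning_effort"])
```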
curl https://runapi.ai/v1beta/models -H "x-api-key: YOUR_RUNAPI_TOKEN"
Or via the OpenAI-style path:
curl https://runapi.ai/v1beta/openai/models \
-H "Authorization: Bearer YOUR_RUNAPI_TOKEN"
| Model ID | OpenAI endpoint | Native endpoint | Capabilities |
|---|---|---|---|
| gemini-2.5-flash | yes | no | Chat, multimodal, Google Search, structured output, thoughts |
| gemini-2.5-pro | yes | no | + reasoning effort |
| gemini-3.1-pro-preview | yes | no | + reasoning effort |
| gemini-3-pro-preview | yes | no | + reasoning effort |
| gemini-3-flash-preview | yes | :streamGenerateContent | Chat, multimodal, function calling, structured output, reasoning effort |
gemini-flash-latest resolves to gemini-3-flash-preview.
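The table and the alias rule can be combined into a small routing helper. This is a sketch under assumptions: the alias is resolved on the client purely for illustration (whether the native path accepts the alias directly is not stated here), and `resolve_model` and `native_stream_url` are hypothetical names.

```python
from typing import Optional

BASE_URL = "https://runapi.ai"
# Alias and native support as documented above (copied by hand).
ALIASES = {"gemini-flash-latest": "gemini-3-flash-preview"}
NATIVE_MODELS = {"gemini-3-flash-preview"}


def resolve_model(model: str) -> str:
    """Map a documented alias to its concrete model ID; pass other IDs through."""
    return ALIASES.get(model, model)


def native_stream_url(model: str) -> Optional[str]:
    """Return the native streaming URL, or None if only the OpenAI path applies."""
    resolved = resolve_model(model)
    if resolved not in NATIVE_MODELS:
        return None
    return f"{BASE_URL}/v1beta/models/{resolved}:streamGenerateContent"


print(native_stream_url("gemini-flash-latest"))
print(native_stream_url("gemini-2.5-pro"))
```

A `None` result signals that the caller should fall back to the OpenAI-compatible endpoint.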
export GOOGLE_API_KEY=YOUR_RUNAPI_TOKEN
export GOOGLE_GENAI_BASE_URL=https://runapi.ai
gemini
The :streamGenerateContent path is currently wired only for gemini-3-flash-preview; use the OpenAI-compatible endpoint for every
other Gemini model.
hold the agent on a long blocking request.
googleSearch function tool.
not this skill file.