Gemini Native (Text) - MixRoute API

curl --request POST \
  --url 'https://api.mixroute.ai/v1beta/models/gemini-2.5-pro:generateContent' \
  --header 'Authorization: Bearer sk-xxxxxxxxxx' \
  --header 'Content-Type: application/json' \
  --data '{
    "contents": [
      {"role": "user", "parts": [{"text": "Explain artificial intelligence in one sentence"}]}
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1024
    }
  }'

{
  "candidates": [
    {
      "content": {
        "parts": [{"text": "Artificial intelligence is a discipline that studies how to make computers simulate and implement human intelligence."}],
        "role": "model"
      },
      "finishReason": "STOP",
      "index": 0,
      "safetyRatings": []
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 20,
    "totalTokenCount": 30
  },
  "modelVersion": "gemini-2.5-pro"
}

POST /v1beta/models/{model}:generateContent

Introduction

The Gemini Native API uses Google Gemini’s request and response format. It suits Google’s official clients (such as the google-generativeai SDK) and scenarios that need to work with Gemini data structures directly. If you use an OpenAI-compatible client (such as the OpenAI SDK), use the /v1/chat/completions endpoint instead.

Differences from OpenAI Format

Feature	Gemini Native	OpenAI Compatible
Message Structure	`contents[].parts[]`	`messages[].content`
Role Names	`user` / `model`	`user` / `assistant`
Streaming Parameter	URL param `?alt=sse`	Body param `stream: true`
System Prompt	`systemInstruction`	`messages[0].role: "system"`
Multimodal	Mixed `parts[]` array	Mixed `content[]` array
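
The mapping in the table above can be sketched as a small converter from OpenAI-style messages to a Gemini-native request body. This is a minimal illustration, not an official helper: the function name `openai_to_gemini` is hypothetical, it only handles plain string `content`, and the `{"parts": [{"text": ...}]}` shape used for `systemInstruction` is an assumption based on Google's Gemini format.

```python
from typing import Any


def openai_to_gemini(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Convert OpenAI-style chat messages into a Gemini-native request body."""
    role_map = {"user": "user", "assistant": "model"}
    body: dict[str, Any] = {"contents": []}
    system_texts: list[str] = []

    for message in messages:
        role, text = message["role"], message["content"]
        if role == "system":
            # System prompts leave the message list and go to systemInstruction.
            system_texts.append(text)
            continue
        body["contents"].append(
            {"role": role_map[role], "parts": [{"text": text}]}
        )

    if system_texts:
        # Assumed shape: an object holding a list of text parts.
        body["systemInstruction"] = {"parts": [{"text": "\n".join(system_texts)}]}
    return body
```

Roles other than `system`, `user`, and `assistant` raise a `KeyError` here, which keeps unsupported input from being silently dropped.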

API Endpoints

Function	Method	Path
Text Generation (Non-streaming)	POST	`/v1beta/models/{model}:generateContent`
Text Generation (Streaming)	POST	`/v1beta/models/{model}:streamGenerateContent?alt=sse`
Single Embedding	POST	`/v1beta/models/{model}:embedContent`
Batch Embedding	POST	`/v1beta/models/{model}:batchEmbedContents`
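
As a rough sketch, the four paths in the table can be assembled from a model name and an action. The helper name `endpoint_url` and the short action keys are hypothetical; only the base URL and path patterns come from this page.

```python
BASE_URL = "https://api.mixroute.ai"

# Short action key (hypothetical) -> method suffix from the endpoint table.
ACTIONS = {
    "generate": "generateContent",
    "stream": "streamGenerateContent",
    "embed": "embedContent",
    "batch_embed": "batchEmbedContents",
}


def endpoint_url(model: str, action: str) -> str:
    """Build the full URL for a model and one of the documented actions."""
    url = f"{BASE_URL}/v1beta/models/{model}:{ACTIONS[action]}"
    if action == "stream":
        # Streaming is selected with a URL parameter, not a body field.
        url += "?alt=sse"
    return url
```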

Authentication

Two authentication methods are supported:

Method	Header	Example
Bearer Token (Recommended)	`Authorization`	`Bearer sk-xxxxxxxxxx`
Google Style	`x-goog-api-key`	`sk-xxxxxxxxxx`
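
The two header styles can be produced by one small function. This is a sketch under the assumption that both styles are otherwise interchangeable; the function name `auth_headers` and the `style` values are hypothetical.

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict[str, str]:
    """Return request headers for either documented authentication style."""
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        # Recommended style: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "google":
        # Google style: the raw key in x-goog-api-key
        headers["x-goog-api-key"] = api_key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers
```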

Request Parameters

Parameter	Type	Required	Description
`contents`	array	Yes	Conversation content array
`generationConfig`	object	No	Generation configuration
`safetySettings`	array	No	Safety filter settings
`systemInstruction`	object	No	System instruction
`tools`	array	No	Tool definitions (function calling, search, etc.)
`cachedContent`	string	No	Cached content name

generationConfig Parameters

Parameter	Type	Description
`temperature`	number	Randomness (0-2)
`topP`	number	Nucleus sampling (0-1)
`topK`	integer	Top-K sampling
`maxOutputTokens`	integer	Maximum output tokens
`stopSequences`	array	Stop sequences
`candidateCount`	integer	Number of candidate responses
`thinkingConfig`	object	Thinking mode configuration
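
A minimal sketch of building `generationConfig` on the client side, checking the two ranges listed in the table before sending. The function name `make_generation_config` is hypothetical, and the server may enforce additional limits that are not covered here.

```python
from typing import Any, Optional


def make_generation_config(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    max_output_tokens: Optional[int] = None,
    stop_sequences: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build a generationConfig object, omitting parameters left unset."""
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("topP must be between 0 and 1")

    fields = {
        "temperature": temperature,
        "topP": top_p,
        "topK": top_k,
        "maxOutputTokens": max_output_tokens,
        "stopSequences": stop_sequences,
    }
    # Send only the fields the caller actually set.
    return {name: value for name, value in fields.items() if value is not None}
```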

Basic Examples

The examples below appear in this order: cURL (non-streaming), cURL (streaming), Python, and Node.js.

curl -X POST "https://api.mixroute.ai/v1beta/models/gemini-2.5-pro:generateContent" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Explain artificial intelligence in one sentence"}]}
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 1024
    }
  }'

curl -X POST "https://api.mixroute.ai/v1beta/models/gemini-2.5-pro:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Write a poem about spring"}]}
    ],
    "generationConfig": {
      "temperature": 0.8,
      "maxOutputTokens": 2048
    }
  }'

import google.generativeai as genai

genai.configure(
    api_key="sk-xxxxxxxxxx",
    transport="rest",
    client_options={"api_endpoint": "https://api.mixroute.ai"}
)

model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content("Explain artificial intelligence in one sentence")
print(response.text)

import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("sk-xxxxxxxxxx");

// Custom endpoint: pass the host only, because the SDK appends the API version (v1beta by default) itself
const model = genAI.getGenerativeModel(
  { model: "gemini-2.5-pro" },
  { baseUrl: "https://api.mixroute.ai" }
);

const result = await model.generateContent("Explain artificial intelligence in one sentence");
console.log(result.response.text());

Advanced Features

Thinking Mode

Gemini 2.5 Pro and Gemini 3 Pro support thinking mode, which lets the model reason in depth before answering.

Gemini 2.5 Pro - Using thinkingBudget:

{
  "contents": [{"role": "user", "parts": [{"text": "Solve this geometry problem step by step"}]}],
  "generationConfig": {
    "maxOutputTokens": 16384,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": 8192
    }
  }
}

Gemini 3 Pro - Using thinkingLevel:

{
  "contents": [{"role": "user", "parts": [{"text": "Explain the principles of quantum entanglement"}]}],
  "generationConfig": {
    "maxOutputTokens": 16384,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingLevel": "MEDIUM"
    }
  }
}

Parameter	Applicable Model	Options
`thinkingBudget`	Gemini 2.5 Pro	1-24576 (token count)
`thinkingLevel`	Gemini 3 Pro	`LOW` / `MEDIUM` / `HIGH`
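
Because the two model families take different parameters, a client may want to pick the right one automatically. The sketch below does this by model name prefix. The function name `thinking_config` is hypothetical, and matching Gemini 3 Pro on the prefix `gemini-3` is an assumption, since this page does not list that model's ID.

```python
from typing import Any


def thinking_config(
    model: str,
    level: str = "MEDIUM",
    budget: int = 8192,
    include_thoughts: bool = True,
) -> dict[str, Any]:
    """Return a thinkingConfig object suited to the given model."""
    config: dict[str, Any] = {"includeThoughts": include_thoughts}
    if model.startswith("gemini-3"):  # assumed prefix for Gemini 3 Pro
        if level not in {"LOW", "MEDIUM", "HIGH"}:
            raise ValueError("thinkingLevel must be LOW, MEDIUM, or HIGH")
        config["thinkingLevel"] = level
    elif model.startswith("gemini-2.5-pro"):
        if not 1 <= budget <= 24576:
            raise ValueError("thinkingBudget must be between 1 and 24576")
        config["thinkingBudget"] = budget
    else:
        raise ValueError(f"no thinking parameters documented for {model!r}")
    return config
```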

Multimodal Input

Multiple input formats are supported, including images, audio, and video.

Image Input (Base64):

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "inlineData": {
            "mimeType": "image/jpeg",
            "data": "BASE64_ENCODED_IMAGE"
          }
        },
        {"text": "Describe the content of this image"}
      ]
    }
  ]
}

Image Input (URL):

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "image/jpeg",
            "fileUri": "https://example.com/image.jpg"
          }
        },
        {"text": "What's in this image?"}
      ]
    }
  ]
}

Supported MIME Types:

Images: image/jpeg, image/png, image/gif, image/webp
Audio: audio/mp3, audio/wav, audio/aac
Video: video/mp4, video/webm
Documents: application/pdf
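
A minimal sketch of turning raw bytes into an `inlineData` part and pairing it with a text prompt. The helper names are hypothetical; the MIME type set is copied from the list above, and the check is only a client-side convenience.

```python
import base64
from typing import Any

SUPPORTED_MIME_TYPES = {
    "image/jpeg", "image/png", "image/gif", "image/webp",
    "audio/mp3", "audio/wav", "audio/aac",
    "video/mp4", "video/webm",
    "application/pdf",
}


def inline_part(data: bytes, mime_type: str) -> dict[str, Any]:
    """Encode raw bytes as an inlineData part."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def multimodal_contents(data: bytes, mime_type: str, prompt: str) -> list[dict[str, Any]]:
    """Build a contents array with one media part followed by a text part."""
    return [{"role": "user", "parts": [inline_part(data, mime_type), {"text": prompt}]}]
```

Typical usage would read a file with `open(path, "rb").read()` and pass the bytes in, with the result placed under the `contents` key of the request body.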

Function Calling

{
  "contents": [{"role": "user", "parts": [{"text": "What's the weather like in Shanghai today?"}]}],
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": "get_weather",
          "description": "Get weather information for a specified city",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name"
              },
              "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit"
              }
            },
            "required": ["location"]
          }
        }
      ]
    }
  ],
  "toolConfig": {
    "functionCallingConfig": {
      "mode": "AUTO"
    }
  }
}

Function Calling Modes:

AUTO: The model decides whether to call a function
ANY: Force tool invocation
NONE: Disable tool invocation
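
The request above only declares the tool; the client still has to run it when the model asks. The sketch below shows one way to do that. It assumes the response follows Google's Gemini format, where a part carries `functionCall` with `name` and `args` and the result is sent back as a `functionResponse` part; this page does not show that shape, so treat it as an assumption. `get_weather` is a hypothetical stub.

```python
from typing import Any, Callable


def get_weather(location: str, unit: str = "celsius") -> dict[str, Any]:
    """Hypothetical stub standing in for a real weather lookup."""
    return {"location": location, "unit": unit, "temperature": 22}


REGISTRY: dict[str, Callable[..., dict[str, Any]]] = {"get_weather": get_weather}


def run_function_calls(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Execute every functionCall part and return functionResponse parts."""
    parts = response["candidates"][0]["content"].get("parts", [])
    results = []
    for part in parts:
        call = part.get("functionCall")  # assumed response shape
        if call is None:
            continue
        output = REGISTRY[call["name"]](**call.get("args", {}))
        results.append(
            {"functionResponse": {"name": call["name"], "response": output}}
        )
    return results
```

The returned parts would then be appended to `contents` as a new turn, after the model's own turn, and sent in a follow-up request so the model can produce the final answer.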

Google Search (Grounding)

Enable Google Search for real-time information:

{
  "contents": [{"role": "user", "parts": [{"text": "What's the weather like in Beijing today?"}]}],
  "tools": [
    {
      "googleSearch": {}
    }
  ]
}

Dynamic Retrieval Configuration:

{
  "contents": [{"role": "user", "parts": [{"text": "Latest AI news"}]}],
  "tools": [
    {
      "googleSearch": {
        "dynamicRetrievalConfig": {
          "mode": "MODE_DYNAMIC",
          "dynamicThreshold": 0.5
        }
      }
    }
  ]
}

Streaming Output

Python Streaming:

import google.generativeai as genai

genai.configure(
    api_key="sk-xxxxxxxxxx",
    transport="rest",
    client_options={"api_endpoint": "https://api.mixroute.ai"}
)

model = genai.GenerativeModel("gemini-2.5-pro")
response = model.generate_content(
    "Write an article about artificial intelligence",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)

cURL Streaming:

curl -X POST "https://api.mixroute.ai/v1beta/models/gemini-2.5-pro:streamGenerateContent?alt=sse" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "Tell me a story"}]}]
  }'
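
When calling the streaming endpoint without an SDK, the client has to parse the server-sent events itself. The following is a sketch that assumes each event arrives as a single `data:` line holding one JSON chunk with the same `candidates` structure as a non-streaming response. The function name `iter_sse_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from an iterable of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            # Skip blank separators, comments, and other SSE fields.
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        chunk = json.loads(payload)
        for candidate in chunk.get("candidates", []):
            for part in candidate.get("content", {}).get("parts", []):
                if "text" in part:
                    yield part["text"]
```

Any line source works as input, for example the decoded lines of an HTTP response body read incrementally.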

Context Caching

For long texts or multi-turn conversations, caching can reduce token consumption.

Create Cache:

curl -X POST "https://api.mixroute.ai/v1beta/cachedContents" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "model": "models/gemini-2.5-pro",
    "displayName": "my-cache",
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "This is a long document content..."}]
      }
    ],
    "ttl": "3600s"
  }'

Use Cache:

{
  "cachedContent": "cachedContents/abc123",
  "contents": [
    {"role": "user", "parts": [{"text": "Based on the above document, summarize the key points"}]}
  ]
}
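
The two request bodies above can be built with a pair of small helpers. This is a sketch only: the function names are hypothetical, and the cache name passed to `request_with_cache` is whatever identifier the create call returned (shown on this page as `cachedContents/abc123`).

```python
from typing import Any


def cache_payload(model: str, document: str, ttl_seconds: int, display_name: str) -> dict[str, Any]:
    """Build the body for creating a cache from one long document."""
    return {
        "model": f"models/{model}",
        "displayName": display_name,
        "contents": [{"role": "user", "parts": [{"text": document}]}],
        # The ttl is a string of seconds with an "s" suffix, as in "3600s".
        "ttl": f"{ttl_seconds}s",
    }


def request_with_cache(cache_name: str, prompt: str) -> dict[str, Any]:
    """Build a generateContent body that refers to an existing cache."""
    return {
        "cachedContent": cache_name,
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
    }
```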

Image Generation

Generate images using Gemini 2.0 Flash or Imagen models:

{
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "Generate an image of a sunset at the beach"}]
    }
  ],
  "generationConfig": {
    "responseModalities": ["TEXT", "IMAGE"]
  }
}
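
This page does not show the response to the request above. The sketch below assumes generated images come back as `inlineData` parts with Base64 data, mirroring the input format in the Multimodal Input section; verify this against a real response before relying on it. The function name `extract_images` is hypothetical.

```python
import base64
from typing import Any


def extract_images(response: dict[str, Any]) -> list[tuple[str, bytes]]:
    """Collect (mime_type, raw_bytes) pairs from inlineData parts."""
    images = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            inline = part.get("inlineData")  # assumed response shape
            if inline:
                images.append((inline["mimeType"], base64.b64decode(inline["data"])))
    return images
```

Each byte string can then be written to disk with `open(path, "wb")`.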

Imagen Model:

curl -X POST "https://api.mixroute.ai/v1beta/models/imagen-3.0-generate-002:predict" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "instances": [
      {"prompt": "A cute cat in the sunshine"}
    ],
    "parameters": {
      "sampleCount": 1
    }
  }'

Embedding API

Single Embedding

curl -X POST "https://api.mixroute.ai/v1beta/models/text-embedding-004:embedContent" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "content": {
      "parts": [{"text": "This is text to be vectorized"}]
    }
  }'

Response Example:

{
  "embedding": {
    "values": [0.0123, -0.0456, 0.0789, ...]
  }
}

Batch Embedding

curl -X POST "https://api.mixroute.ai/v1beta/models/text-embedding-004:batchEmbedContents" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxxxxxxxxx" \
  -d '{
    "requests": [
      {
        "model": "models/text-embedding-004",
        "content": {"parts": [{"text": "First text"}]}
      },
      {
        "model": "models/text-embedding-004",
        "content": {"parts": [{"text": "Second text"}]}
      }
    ]
  }'

Response Example:

{
  "embeddings": [
    {"values": [0.0123, -0.0456, ...]},
    {"values": [0.0234, -0.0567, ...]}
  ]
}
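
A common next step is comparing the returned vectors. The sketch below pulls the vectors out of a batch response and computes cosine similarity with the standard library only. The function names are hypothetical.

```python
import math
from typing import Any, Sequence


def vectors_from_batch(response: dict[str, Any]) -> list[list[float]]:
    """Extract the embedding vectors from a batchEmbedContents response."""
    return [item["values"] for item in response["embeddings"]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```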

Response Format

{
  "candidates": [
    {
      "content": {
        "parts": [{"text": "Response text"}],
        "role": "model"
      },
      "finishReason": "STOP",
      "safetyRatings": [
        {
          "category": "HARM_CATEGORY_HARASSMENT",
          "probability": "NEGLIGIBLE"
        }
      ]
    }
  ],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 20,
    "totalTokenCount": 30
  }
}
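
A minimal sketch of reading the response structure above: it joins the text parts of the first candidate and summarizes token usage. The function names are hypothetical, and missing fields fall back to empty values rather than raising.

```python
from typing import Any


def extract_text(response: dict[str, Any]) -> str:
    """Join all text parts of the first candidate, or return an empty string."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def usage_summary(response: dict[str, Any]) -> dict[str, int]:
    """Return prompt, output, and total token counts from usageMetadata."""
    usage = response.get("usageMetadata", {})
    return {
        "prompt": usage.get("promptTokenCount", 0),
        "output": usage.get("candidatesTokenCount", 0),
        "total": usage.get("totalTokenCount", 0),
    }
```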

Error Handling

HTTP Status	Error Type	Description
400	INVALID_ARGUMENT	Invalid request parameter
401	UNAUTHENTICATED	Invalid or missing API key
403	PERMISSION_DENIED	No access to this model
404	NOT_FOUND	Model not found
429	RESOURCE_EXHAUSTED	Rate limit exceeded
500	INTERNAL	Internal server error

Error Response Example:

{
  "error": {
    "code": 400,
    "message": "Invalid value at 'contents[0].parts[0]'",
    "status": "INVALID_ARGUMENT"
  }
}
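
The error body above can be wrapped in an exception that also records whether a retry is worthwhile. This is a sketch: the class and function names are hypothetical, and treating 429 and 500 as retryable is a design assumption rather than something this page states.

```python
import json

# Assumption: rate limits and internal errors may succeed on a later attempt.
RETRYABLE_STATUSES = {429, 500}


class GeminiAPIError(Exception):
    """An API error with the HTTP status, error type, and message."""

    def __init__(self, http_status: int, status: str, message: str) -> None:
        super().__init__(f"{http_status} {status}: {message}")
        self.http_status = http_status
        self.status = status
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.http_status in RETRYABLE_STATUSES


def parse_error(http_status: int, body: str) -> GeminiAPIError:
    """Build a GeminiAPIError from a response status and raw body text."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = None
    error = data.get("error", {}) if isinstance(data, dict) else {}
    return GeminiAPIError(
        http_status,
        error.get("status", "UNKNOWN"),
        error.get("message", body),
    )
```

A caller would raise the result of `parse_error` for any non-2xx response and check `retryable` before sleeping and retrying.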

Comparison with OpenAI Format

Feature	Gemini Native	OpenAI Compatible
Base URL	`https://api.mixroute.ai/v1beta`	`https://api.mixroute.ai/v1`
Message Structure	`contents[].parts[]`	`messages[].content`
Role Names	`user` / `model`	`user` / `assistant`
System Prompt	`systemInstruction`	`messages[0].role: "system"`
Streaming Request	URL param `?alt=sse`	Body param `stream: true`
Temperature Range	0-2	0-2
Function Calling	`tools[].functionDeclarations`	`tools[].function`
Search Grounding	`tools[].googleSearch`	Not supported
Thinking Mode	`thinkingConfig`	Not supported
