API Blibs — Developer Integration Guide

EARLY ACCESS: Features, prices, and parameters may change in the future.

Powered by an optimized Blibs-based architecture, our OpenAI-compatible API delivers high-performance large language model inference without requiring dedicated GPU resources on your end. Hosted exclusively in EU data centers, it integrates seamlessly with your applications while adhering to stringent GDPR requirements. Built on Trooper.AI’s scalable infrastructure, this solution eliminates hardware constraints, enabling cost-efficient deployment for startups and enterprises alike. Whether you’re building chatbots, fine-tuning models, or deploying generative AI workflows, our API provides low-latency responses with enterprise-grade reliability.

Designed for flexibility, the service integrates effortlessly into existing pipelines via straightforward endpoints, supporting batch processing and real-time queries. Backed by persistent storage and advanced security protocols—including firewalls and DDoS protection—the platform guarantees uninterrupted uptime for mission-critical AI tasks. Explore our tailored configurations below to match your project’s demands.

Order API Blib now


Models and Pricing

Our base pricing is as follows. Additional country-specific charges may apply when you order an API Blib. You can also benefit from our regular Extra Credits promotions!

Route | Model | Context | Input/1M | Output/1M | +33% Promo** | Strengths
liv | Google Gemma 4 2.3B + 0.15B (vision) + 0.3B (audio) | 20,380 | €0.029 | €0.049 | €0.022 in, €0.037 out | Text, Vision, Audio, Reasoning, Tools, JSON
clara | Mistral Ministral 3 13.5B + 0.4B (vision) | 9,248 | €0.139 | €0.249 | €0.105 in, €0.187 out | EU Model, Vision, Tools, Text, JSON
nikola | NVIDIA Nemotron 3 Nano 30B MoE (128 experts, 6 active @ 3.5B) | 20,480 | €0.159 | €0.319 | €0.120 in, €0.240 out | Text, Reasoning, Tools, Coding, JSON

** The current promotion gives you +33% Extra Credits on payments of €150 or more this month. Pay €150 and get €200 credited to your account!
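As a rough sanity check, the promo prices in the table are consistent with dividing each base price by 1.33, since every euro paid buys 33% more credit. The sketch below reproduces the promo column; the divide-by-1.33 rule is inferred from the table rather than an official formula, and `BASE_PRICES` and `effective_price` are illustrative names.

```python
# Base prices per 1M tokens in EUR, copied from the pricing table.
BASE_PRICES = {
    "liv": {"input": 0.029, "output": 0.049},
    "clara": {"input": 0.139, "output": 0.249},
    "nikola": {"input": 0.159, "output": 0.319},
}

PROMO_BONUS = 0.33  # +33% extra credits (assumption: applies to every route)


def effective_price(base_price: float, bonus: float = PROMO_BONUS) -> float:
    """Price per 1M tokens when each euro paid buys (1 + bonus) euros of credit."""
    return round(base_price / (1 + bonus), 3)


promo_prices = {
    route: {kind: effective_price(price) for kind, price in prices.items()}
    for route, prices in BASE_PRICES.items()
}

for route, prices in promo_prices.items():
    print(f"{route}: €{prices['input']:.3f} in, €{prices['output']:.3f} out")
```

The promotion itself is applied as extra credit on your payment, so the per-token rate charged against your balance stays at the base price.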


How to order your API Blib

Before using our API you need to order your API Blib on our website: Order API Blib

You can choose your model and your region. After deployment, your model should appear in your Management Dashboard like this:

[Image: API Blib in your Dashboard]


Free Static System Prompt

We offer a free system prompt of up to 1,024 characters. The free system prompt is only available via the dashboard; dynamic system messages sent via the API are charged as usual. Click Actions to open the system prompt edit dialog:

[Image: API Blib Static System Prompt]

Let’s go, order your API Blib now:

Order API Blib now


LLM Benchmark 🧪

To benchmark our LLM endpoints against other LLM platforms, you can use our free LLM Quality Benchmark tool: Free LLM Quality Benchmark. This is what it looks like:

[Image: LLM Quality Benchmark Interface]

Test any OpenAI-compatible LLM endpoint with 25 automated quality checks — reasoning, coding, multilingual, structured output, tool calling and more.


Base URL

Code
https://eu.router.trooper.ai/v1

Regional endpoints for country-level data residency:

Domain | Region
eu.router.trooper.ai | EU (all EU data centers)
de.router.trooper.ai | Germany only
nl.router.trooper.ai | Netherlands only
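If you switch regions between environments, it can help to keep the choice in one place. This is a minimal sketch using the hosts from the table above; `REGION_HOSTS` and `base_url_for` are illustrative names, and it assumes the `/v1` path is the same on every regional domain.

```python
# Regional hosts as listed in the table above.
REGION_HOSTS = {
    "eu": "eu.router.trooper.ai",
    "de": "de.router.trooper.ai",
    "nl": "nl.router.trooper.ai",
}


def base_url_for(region: str = "eu") -> str:
    """Return the API base URL for a region code (assumes /v1 on every host)."""
    try:
        host = REGION_HOSTS[region.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; expected one of {sorted(REGION_HOSTS)}"
        ) from None
    return f"https://{host}/v1"


print(base_url_for("de"))
```

The returned string can be passed as the base URL to any OpenAI-compatible client.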

Authentication

All requests require a Bearer token. Get your API key from the management dashboard.

Code
Authorization: Bearer YOUR_TROOPER_KEY

Available Models

Activate routes at trooper.ai/order-apiblib. Each route gives you a model name you use in API calls.

Route | Base Model | Strengths
liv | Google Gemma 4 | Cheapest. Text, images, audio, reasoning. High-volume workloads.
clara | Mistral Ministral 3 | Vision-first. Fast throughput, strong EU language support.
nikola | NVIDIA Nemotron 3 Nano | Reasoning powerhouse. Code generation, function calling, agentic workflows.

Endpoints

POST /v1/chat/completions

Standard OpenAI-compatible chat completions endpoint.

GET /v1/models

Lists your activated models. Requires authentication. Returns only models matching the region of the domain you’re calling.

Each model object includes:

Field | Type | Description
id | string | Your route name (used as model in requests).
object | string | Always "model".
owned_by | string | Owner identifier.
created | integer | Unix timestamp of creation.
base_models | string[] | Underlying model name(s).
context_length | integer | Maximum context window (tokens).
max_tokens | integer | Maximum output tokens.
capabilities | object | Feature flags for this model (see below).
supported_parameters | string[] | Parameters accepted by this model.

Capabilities object:

Flag | Type | Description
thinking | boolean | Supports reasoning_effort and chain-of-thought reasoning.
tools | boolean | Supports function calling / tools.
vision | boolean | Supports image and PDF inputs.
audio | boolean | Supports audio inputs.
json_mode | boolean | Supports response_format (JSON mode / structured outputs).
token_budget | boolean | Supports explicit thinking token budget control.

Use capabilities.thinking to determine whether a model accepts reasoning parameters before sending them.

Example response:

json
{
  "object": "list",
  "data": [
    {
      "id": "clara",
      "object": "model",
      "owned_by": "trooper_42",
      "created": 1700000000,
      "base_models": ["Ministral-3"],
      "context_length": 131072,
      "max_tokens": 131072,
      "capabilities": {
        "tools": true,
        "vision": true,
        "audio": false,
        "thinking": false,
        "json_mode": true,
        "token_budget": false
      },
      "supported_parameters": [
        "temperature", "top_p", "max_tokens", "stream",
        "response_format", "tools", "tool_choice"
      ]
    }
  ]
}

GET /health

Returns endpoint availability and region info.



Request Parameters

All parameters follow the OpenAI Chat Completions API format.

Required

Parameter | Type | Description
model | string | Your route name (e.g. "clara", "nikola", "liv")
messages | array | Array of message objects (role + content)

Optional

Parameter | Type | Default | Description
max_tokens | integer | auto (32–4096) | Maximum output tokens. Auto-clamped to route’s context window.
max_completion_tokens | integer | | Alias for max_tokens.
stream | boolean | false | Enable SSE streaming.
temperature | number | model default | Sampling temperature (0–2).
top_p | number | model default | Nucleus sampling.
response_format | object | | {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}}
tools | array | | Function calling tool definitions (OpenAI format).
tool_choice | string/object | | Controls tool selection ("auto", "none", or specific tool).
reasoning | object | | Reasoning control object, e.g. {"effort": "high"}. See Thinking / Reasoning.
reasoning_effort | string | | Shorthand: "none", "medium", "high".
reasoning.exclude | boolean | false | Strip reasoning content from the response.

Features

Streaming

Standard SSE streaming, fully compatible with the OpenAI SDK.

json
{ "stream": true }

Response format: data: {...}\n\n lines, terminated by data: [DONE]\n\n.
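If you are not using an SDK, the stream can be read line by line. The sketch below parses `data:` lines and stops at `[DONE]`; it assumes each payload follows the OpenAI chunk shape (`choices[].delta.content`), and the sample lines are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip the blank separator lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Invented sample stream, already split into lines.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    "",
]

text = "".join(iter_sse_content(sample_lines))
print(text)  # Hello
```

In a real client, `lines` would come from the HTTP response body, decoded as UTF-8 and iterated line by line.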

JSON Mode

Request structured JSON output. Asking for JSON in the message text alone is not enough; you must also set response_format, at minimum to json_object.

If the model fails to produce valid JSON, you are not charged.

json
{ "response_format": { "type": "json_object" } }

With a schema:

json
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "my_schema",
      "schema": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "age": { "type": "integer" }
        },
        "required": ["name", "age"]
      }
    }
  }
}
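Even with a schema in the request, a client-side check of the returned content is a cheap safeguard. The sketch below validates a response against the example schema using only the standard library; `parse_person` and the sample content are illustrative, and a full JSON Schema validator would cover far more than this.

```python
import json

# The "schema" object from the request example above.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Only the JSON types used by this schema are mapped here.
_PYTHON_TYPES = {"string": str, "integer": int}


def parse_person(content: str) -> dict:
    """Parse message content and check required keys and basic types."""
    data = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in SCHEMA["required"]:
        if key not in data:
            raise ValueError(f"missing required field: {key}")
    for key, spec in SCHEMA["properties"].items():
        if key in data:
            value = data[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, _PYTHON_TYPES[spec["type"]]):
                raise ValueError(f"field {key!r} should be of type {spec['type']}")
    return data


person = parse_person('{"name": "Ada", "age": 36}')
print(person["name"], person["age"])
```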

Vision (Images & PDFs)

Send images via URL or base64. PDFs are auto-converted to page images server-side.

json
{
  "messages": [{
    "role": "user",
    "content": [
      { "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } },
      { "type": "text", "text": "Describe this image." }
    ]
  }]
}

Base64:

json
{
  "type": "image_url",
  "image_url": { "url": "data:image/png;base64,iVBOR..." }
}

  • Images are auto-resized and metadata-stripped (SSRF-safe).
  • Max image count and size depend on your route configuration.
  • Supported on liv and clara routes.
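For local files, the base64 variant means wrapping the raw bytes in a data URL. A minimal sketch follows; `to_data_url` and `image_message` are illustrative helpers, and the eight sample bytes are just the PNG file signature, not a complete image.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL for the image_url field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(image_bytes: bytes, prompt: str, mime_type: str = "image/png") -> dict:
    """Build a user message with one image part followed by a text part."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": to_data_url(image_bytes, mime_type)}},
            {"type": "text", "text": prompt},
        ],
    }


# Placeholder bytes: the PNG signature only. In practice, read a real file,
# e.g. open("photo.png", "rb").read().
png_signature = b"\x89PNG\r\n\x1a\n"
message = image_message(png_signature, "Describe this image.")
print(message["content"][0]["image_url"]["url"])
```

Set `mime_type` to match the actual file, for example "image/jpeg" for JPEG input.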

Audio

Send audio files in multimodal messages. Supported on liv.

Thinking / Reasoning

Control whether and how deeply the model reasons (chain-of-thought) before answering.

Value | Effect
"none" | Thinking off: fastest responses, lowest token usage.
"low" | Thinking off: same as none.
"medium" | Thinking on: the model reasons step-by-step before answering. Good balance of quality and speed.
"high" | Thinking on + deep: the model is instructed to think very carefully and in great detail. Best for complex math, logic, and code. If max_tokens is set, it must be at least 4,096 or the request will be rejected.

Enable standard thinking:

json
{ "reasoning_effort": "medium" }

Enable deep thinking for maximum quality:

json
{ "reasoning_effort": "high" }

Or via the reasoning object:

json
{ "reasoning": { "effort": "high" } }

Disable thinking explicitly:

json
{ "reasoning_effort": "none" }

To strip reasoning from the response (thinking still happens, but the reasoning tokens are not returned):

json
{ "reasoning": { "effort": "high", "exclude": true } }

Thinking Behaviour can also be configured per route in the management dashboard. The dashboard setting controls the default behaviour and how reasoning is returned:

[Image: Thinking Modes under Actions on supported Models]

  • Disabled — Thinking off by default. Can still be enabled per-request via "reasoning_effort": "medium" or "high".
  • Strip — Thinking on, but reasoning tokens are removed from the response.
  • Reasoning Content — Thinking on, reasoning returned in a separate reasoning_content field.
  • Think Tag — Thinking on, reasoning returned as <think> tags inside the content.

Thinking rescue: if the model enters a reasoning loop, the router auto-recovers and returns a usable response.
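Since reasoning can arrive either in a separate reasoning_content field or inline as think tags, client code may want to handle both. The sketch below assumes inline reasoning is wrapped as `<think>...</think>` with a closing tag; `split_reasoning` and the sample messages are illustrative.

```python
import re

# Assumption: inline reasoning is wrapped exactly as <think>...</think>.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(message: dict) -> tuple[str, str]:
    """Return (reasoning, answer) for either dashboard return mode."""
    content = message.get("content") or ""
    reasoning = message.get("reasoning_content")
    if reasoning is not None:
        # Reasoning Content mode: the answer is already clean.
        return reasoning, content
    # Think Tag mode: pull the tagged spans out of the content.
    parts = _THINK_RE.findall(content)
    answer = _THINK_RE.sub("", content).strip()
    return "\n".join(part.strip() for part in parts), answer


tagged = {"role": "assistant", "content": "<think>2 + 2 is 4.</think>\nThe answer is 4."}
separate = {"role": "assistant", "content": "The answer is 4.", "reasoning_content": "2 + 2 is 4."}

print(split_reasoning(tagged))
print(split_reasoning(separate))
```

In Strip mode or with thinking disabled, the function returns an empty reasoning string and the content unchanged.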

Function Calling / Tools

Standard OpenAI tools format. Works with all routes.

json
{
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }],
  "tool_choice": "auto"
}
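When the model answers with a tool call, your code runs the function and sends the result back as a `tool` message, following the standard OpenAI convention. The sketch below shows that dispatch step offline; `get_weather` is a hypothetical stand-in returning fixed data, and `sample_call` is an invented tool call in the OpenAI shape.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical local implementation; a real one would query a weather service."""
    return {"location": location, "temperature_c": 18, "condition": "cloudy"}


# Maps tool names declared in the request to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and wrap the result as a tool message."""
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"location\": \"Berlin\"}"},
}

tool_message = run_tool_call(sample_call)
print(tool_message)
```

To complete the round trip, append the assistant message containing the tool calls and then each tool message to `messages`, and call the chat completions endpoint again.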

Auto Context Compression

If your input exceeds the context window, the router automatically compresses the middle of the conversation to fit — no manual truncation needed. You always get a response.

System Prompt

A free system prompt can be configured per route in the management dashboard. It’s prepended to every request automatically and not billed.


Response Format

Standard OpenAI response envelope:

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "clara",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 8,
    "total_tokens": 20
  }
}

With reasoning enabled, the response may include reasoning_content alongside content.

Transaction ID

Every response includes an x-transaction-id header for billing reference and debugging.


Error Handling

Errors follow the OpenAI error envelope format:

json
{
  "error": {
    "message": "The model 'nonexistent' does not exist.",
    "type": "invalid_request_error",
    "param": "model",
    "code": "model_not_found"
  }
}

Error Codes

HTTP Status | Code | Description
400 | invalid_value | Missing model, API key, input too short, or invalid max_tokens.
403 | invalid_api_key | Invalid API key or insufficient budget.
404 | model_not_found | Model doesn’t exist or not activated.
404 | region_mismatch | Model not available in the requested region.
500 | | Internal router error.
503 | | No endpoints available in the requested region.
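One way to act on these codes is to retry only the server-side failures (500, 503) with exponential backoff and to surface the 4xx errors immediately, since those need a fix in the request or account. This policy is an assumption on our part, not documented router behaviour, and the names below are illustrative.

```python
# Assumption: only server-side errors are worth retrying.
RETRYABLE_STATUSES = {500, 503}


def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Decide whether to retry; attempt is the zero-based index of the try that just failed."""
    return status in RETRYABLE_STATUSES and attempt + 1 < max_attempts


def backoff_seconds(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff delay: base * 2^attempt, capped at cap seconds."""
    return min(cap, base * (2 ** attempt))


for attempt in range(3):
    print(attempt, should_retry(503, attempt), backoff_seconds(attempt))
```

A 403 caused by insufficient budget, for example, will not resolve on retry and is better reported to the caller straight away.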

Billing

  • Per-token pricing — input and output tokens billed separately per million tokens.
  • No idle costs — you only pay when you make API calls.
  • Prepaid credits — top up your balance and draw down with usage.
  • System prompt tokens are free (not billed).
  • Failed JSON requests are not charged when using JSON mode.
  • Vision inputs are billed per unit (images processed).
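To estimate what a single call costs, multiply the token counts from the usage object by the per-million prices. The sketch below uses the base prices from the pricing table and the usage values from the response example; `estimate_cost` is an illustrative helper that ignores per-unit vision billing and promotions, and it assumes the reported prompt_tokens are the billable ones.

```python
# Base prices per 1M tokens in EUR, copied from the pricing table.
PRICES_PER_MILLION = {
    "liv": {"input": 0.029, "output": 0.049},
    "clara": {"input": 0.139, "output": 0.249},
    "nikola": {"input": 0.159, "output": 0.319},
}


def estimate_cost(route: str, usage: dict) -> float:
    """Estimated cost in EUR for one response, from its usage object."""
    price = PRICES_PER_MILLION[route]
    return (
        usage["prompt_tokens"] * price["input"]
        + usage["completion_tokens"] * price["output"]
    ) / 1_000_000


usage = {"prompt_tokens": 12, "completion_tokens": 8, "total_tokens": 20}
cost = estimate_cost("clara", usage)
print(f"€{cost:.8f}")
```

Summing this value per x-transaction-id gives a simple client-side ledger to compare against your dashboard balance.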



Integration Examples

The examples below show common ways to use an OpenAI-compatible API for LLM inference. Replace router.trooper.ai with the endpoint URL shown in your API Blib order.

Python (OpenAI SDK)

python
from openai import OpenAI

client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

response = client.chat.completions.create(
    model="clara",
    messages=[{"role": "user", "content": "Summarize this document."}],
    max_tokens=1024
)

print(response.choices[0].message.content)

Node.js (OpenAI SDK)

javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://router.trooper.ai/v1",
  apiKey: "YOUR_TROOPER_KEY",
});

const response = await client.chat.completions.create({
  model: "nikola",
  messages: [{ role: "user", content: "Write a unit test for this function." }],
  max_tokens: 2048,
});

console.log(response.choices[0].message.content);

Python with Streaming

python
from openai import OpenAI

client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

stream = client.chat.completions.create(
    model="liv",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    max_tokens=2048,
    stream=True
)

for chunk in stream:
    # Some chunks may carry no choices; skip them.
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

JSON Mode

python
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

response = client.chat.completions.create(
    model="clara",
    messages=[{"role": "user", "content": "List the 3 largest EU countries as JSON with name and population."}],
    max_tokens=512,
    response_format={"type": "json_object"}
)

data = json.loads(response.choices[0].message.content)
print(data)

Vision (Image Analysis)

python
from openai import OpenAI

client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

response = client.chat.completions.create(
    model="clara",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},
            {"type": "text", "text": "Extract all line items from this invoice as JSON."}
        ]
    }],
    max_tokens=2048,
    response_format={"type": "json_object"}
)

print(response.choices[0].message.content)

Function Calling

python
from openai import OpenAI

client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="nikola",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
    tool_choice="auto",
    max_tokens=512
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)

LangChain

python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY",
    model="clara",
    max_tokens=1024
)

response = llm.invoke("Extract all dates from the following text: ...")
print(response.content)

LlamaIndex

python
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    api_base="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY",
    model="nikola",
    max_tokens=2048
)

response = llm.complete("Explain the EU AI Act in simple terms.")
print(response.text)

cURL

bash
curl https://router.trooper.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TROOPER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "clara",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 512
  }'

Migration from OpenAI

One-line change — update the base URL and API key:

python
from openai import OpenAI

# Before (OpenAI)
client = OpenAI(api_key="sk-...")

# After (Trooper.AI)
client = OpenAI(
    base_url="https://router.trooper.ai/v1",
    api_key="YOUR_TROOPER_KEY"
)

Everything else stays the same: request format, response schema, streaming, tools, JSON mode.


Data Residency & Compliance

  • All processing takes place in ISO/IEC 27001 certified data centers in Germany and the EU.
  • Zero data retention — prompts and responses processed in RAM only, never stored.
  • GDPR compliant — no cross-border data transfers, no model training on your data.
  • Country-level routing available (DE, NL, or broader EU).
