Choose a plan that fits your needs. Scale as you grow.
- Basic: Perfect for testing and personal projects
- Pro: Ideal for startups and growing businesses
- Mega: Enterprise-grade for unlimited scale
- Enterprise: Maximum power with premium AI models
- Custom: Tailored solutions for your scale
Zephyr API is a RESTful interface for accessing a wide range of state-of-the-art language models. Designed to deliver Pro-level reasoning at Flash speeds and low cost, it is ideal for agentic workflows, multi-turn chat, and coding assistance.
All API requests should be made to the following base URL:
https://zephyr-api.matiasvergara.workers.dev/
Zephyr API offers a wide range of state-of-the-art models tailored to different performance and cost needs.
Available with regular API usage limits based on your plan.
| Model ID | Description | Plan |
|---|---|---|
| openai | GPT-5 Nano | Basic+ |
| gemini-3 | Gemini 3.0 Flash | Basic+ |
| gemini-2.5 | Gemini 2.5 Flash Lite | Basic+ |
| gpt-oss | GPT-OSS 120B | Mega+ |
| grok | xAI Grok 4 Fast | Pro+ |
| minimax | MiniMax M2.1 | Pro+ |
| glm4.7 | GLM 4.7 | Mega+ |
| deepseek-v3.2 | DeepSeek V3.2 with Thinking | Mega+ |
Enterprise Exclusive — These state-of-the-art models are only available with the Enterprise plan.
Daily Limit: Premium models have a limit of 100 requests/day. Usage resets at midnight UTC.
| Model ID | Description | Plan |
|---|---|---|
| gpt-5.2 | OpenAI GPT 5.2 | Enterprise |
| gemini-3-pro | Gemini 3 Pro | Enterprise |
| opus-4.5 | Claude Opus 4.5 | Enterprise |
| sonnet-4.5 | Claude Sonnet 4.5 | Enterprise |
| haiku-4.5 | Claude Haiku 4.5 | Enterprise |
| opus-4.5-reasoning | Claude Opus 4.5 with Reasoning Content | Enterprise |
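You select a model by passing its Model ID from the tables above in the model field of a /generate request. A minimal sketch, assuming your plan includes the chosen model (premium models count toward the 100 requests/day limit):

```python
import requests

# Select a model by its Model ID. "opus-4.5" requires an Enterprise plan;
# swap in e.g. "gemini-3" on lower tiers.
response = requests.post(
    "https://zephyr-api.matiasvergara.workers.dev/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Summarize the theory of relativity", "model": "opus-4.5"},
)
print(response.json()["message"])
```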
Check which features are available in your plan. Availability varies across the Basic, Pro, Mega, Enterprise, and Custom plans for:
- Visual Input (Images)
- Function Calling
- URL Context & Web Search (Gemini models)
The Zephyr API uses API Keys to authenticate requests and track usage against your plan limits. All users must create an account and verify their email before accessing the API.
To get started, create an account and verify your email address. After verification, you can sign in and access your API key.
Include your API key in the Authorization header of every request:
Authorization: Bearer YOUR_API_KEY
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"text": "Hello, world!"}'
Your API key is linked to your subscription plan. Usage limits include a per-minute rate limit, a daily request quota, and a daily token quota, all of which depend on your plan. Usage resets at midnight UTC. Check the Console for your current usage statistics.
Endpoint: GET /
Check if the API is operational. Requires a valid API key.
{
"message": "Zephyr API is operational",
"version": "3.2.0",
"status": "ready"
}
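A quick way to verify both connectivity and that your key is accepted is to call this endpoint from a script. A minimal sketch:

```python
import requests

# Verify connectivity and that the API key is accepted.
r = requests.get(
    "https://zephyr-api.matiasvergara.workers.dev/",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
print(r.status_code, r.json().get("status"))  # expect 200 and "ready"
```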
Endpoint: GET /stats
Get your current usage statistics and plan details.
{
"plan": "BASIC",
"planLabel": "Free Tier",
"stats": {
"tokens": { "date": "2026-01-15", "count": 1024 },
"requests": { "date": "2026-01-15", "count": 15 },
"tokenLimit": 4096,
"rateLimit": "1 req/min"
}
}
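For example, you can check your remaining daily quota from a script. A minimal sketch, using the field names from the response above:

```python
import requests

# Fetch current usage for your API key and print daily token consumption.
stats = requests.get(
    "https://zephyr-api.matiasvergara.workers.dev/stats",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
).json()

used = stats["stats"]["tokens"]["count"]
limit = stats["stats"]["tokenLimit"]
print(f"Plan: {stats['planLabel']} - tokens used today: {used}/{limit}")
```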
Endpoint: POST /generate
Generates text using the selected language model.
POST /generate
Content-Type: application/json
{
"text": "Your prompt here",
"temperature": 0.7,
"system": "You are a helpful assistant"
}
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | - | Your question or prompt |
| temperature | number | No | 0.7 | Controls creativity (0.0-1.0) |
| system | string | No | null | System prompt |
| image | string | No | null | Image URL for analysis |
| url | string | No | null | Web page URL for context |
| stream | boolean | No | false | Enable streaming |
| history | array | No | [] | Conversation history (see the example below) |
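The history parameter is how you carry multi-turn context between calls. A minimal sketch of a follow-up question; the role/content message shape is an assumption based on the tool-result history example later in this guide:

```python
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# First turn
first = requests.post(url, headers=headers, json={
    "text": "Give me three facts about octopuses"
}).json()

# Second turn: pass the previous exchange back via "history".
followup = requests.post(url, headers=headers, json={
    "text": "Expand on the second fact",
    "history": [
        {"role": "user", "content": "Give me three facts about octopuses"},
        {"role": "assistant", "content": first["message"]},
    ],
}).json()
print(followup["message"])
```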
Endpoint: GET /health
Checks the API status.
GET /health
Success Response (for POST /generate):
{
"success": true,
"model": "gemini-3-flash",
"message": "AI response here...",
"latency": 1.45,
"finish_reason": "STOP",
"usage": {
"prompt_tokens": 38,
"completion_tokens": 175,
"total_tokens": 213
},
"metadata": {
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2026-01-12T15:30:00.000Z",
"duration_ms": 1450,
"temperature_used": 0.7
}
}
Error Response:
{
"error": "API Error",
"details": "Error message here",
"status": 500,
"request_id": "550e8400-e29b-41d4-a716-446655440000"
}
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid parameters |
| 500 | Server error |
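In practice, branch on the HTTP status code and read the documented error fields when a call fails. A minimal sketch:

```python
import requests

response = requests.post(
    "https://zephyr-api.matiasvergara.workers.dev/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Hello, world!"},
)

if response.status_code == 200:
    print(response.json()["message"])
else:
    # Error responses carry "error", "details", and "request_id" fields.
    err = response.json()
    print(f"{response.status_code}: {err.get('error')} - {err.get('details')}")
    print("Request ID for support:", err.get("request_id"))
```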
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# Basic request
response = requests.post(url, headers=headers, json={
    "text": "What is artificial intelligence?",
    "temperature": 0.7
})
print(response.json()["message"])

# With a system prompt
response = requests.post(url, headers=headers, json={
    "text": "Explain quantum physics",
    "system": "You are a physics professor. Use simple language.",
    "temperature": 0.5
})
print(response.json()["message"])

# Image analysis
response = requests.post(url, headers=headers, json={
    "text": "Describe this image",
    "image": "https://example.com/photo.jpg"
})
print(response.json()["message"])

# Streaming
response = requests.post(url, headers=headers, json={
    "text": "Write a short story",
    "stream": True
}, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
body: JSON.stringify({
text: "What is artificial intelligence?",
temperature: 0.7
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
body: JSON.stringify({
text: "Explain quantum physics",
system: "You are a physics professor.",
temperature: 0.5
})
});
console.log((await response.json()).message);
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
  body: JSON.stringify({ text: "Write a story", stream: true })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value));
}
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "What is artificial intelligence?",
"temperature": 0.7
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Explain quantum physics",
"system": "You are a physics professor.",
"temperature": 0.5
}'
curl https://zephyr-api.matiasvergara.workers.dev/health
Common use cases:
- Simple factual queries and information retrieval.
- Stories, poems, and creative content generation.
- Analyze and explain code snippets.
- Describe and analyze images via URL.
- Summarize web pages and articles.
- Conversational AI with history context.
Extend model capabilities with external tools and function calling. Available features depend on the model and plan.
Gemini Models | Basic+ Plans
Pass a URL to provide web page context for your query. The model will fetch and analyze the page content.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "Summarize this article",
"url": "https://example.com/article",
"model": "gemini-3"
}
Gemini Models | Basic+ Plans
Enable the model to write and execute Python code to solve problems that require computation.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
]
}
Gemini Models | Basic+ Plans
Allow the model to search Google for real-time information and current events.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "What are the latest news about AI?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
]
}
All Models
Define custom functions that the model can invoke. You handle the execution and return results.
{
"text": "What's the weather in New York?",
"model": "gemini-3",
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name"
}
},
"required": ["location"]
}
}
}
]
}
{
"success": true,
"message": null,
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"New York\"}"
}
}
]
}
{
"text": "Process the weather data",
"history": [
{"role": "assistant", "tool_calls": [...]},
{"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp\": 72, \"condition\": \"sunny\"}"}
]
}
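Putting the request, response, and tool-result pieces together, a typical round trip looks roughly like this. A sketch only: the get_weather execution is a placeholder you implement yourself.

```python
import json
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

# 1) Ask the model; it may answer directly or request a tool call.
first = requests.post(url, headers=headers, json={
    "text": "What's the weather in New York?",
    "model": "gemini-3",
    "tools": tools,
}).json()

tool_call = first["tool_calls"][0]
args = json.loads(tool_call["function"]["arguments"])

# 2) Execute the function yourself (placeholder result here).
weather = {"temp": 72, "condition": "sunny"}

# 3) Return the result via history so the model can finish the answer.
final = requests.post(url, headers=headers, json={
    "text": "Process the weather data",
    "model": "gemini-3",
    "history": [
        {"role": "assistant", "tool_calls": first["tool_calls"]},
        {"role": "tool", "tool_call_id": tool_call["id"],
         "content": json.dumps(weather)},
    ],
}).json()
print(final["message"])
```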
| Feature | gemini-3 | gemini-2.5 | openai | gpt-oss |
|---|---|---|---|---|
| URL Context | Yes | Yes | No | No |
| Code Execution | Yes | Yes | No | No |
| Google Search | Yes | Yes | No | No |
| Function Calling | Yes | Yes | Yes | Yes |

URL Context, Code Execution, and Google Search are Gemini-only features (per the "Gemini Models" labels above); Function Calling is available on all models.
Controls randomness in responses:
| Range | Behavior |
|---|---|
| 0.0 - 0.3 | Deterministic, factual, consistent |
| 0.4 - 0.7 | Balanced (default: 0.7) |
| 0.8 - 1.0 | Creative, varied, unpredictable |
Customize AI behavior with different personas:
# Professional tone
system = "You are a university professor"
# Casual tone
system = "You are talking to a 10-year-old"
# Technical tone
system = "You are a technical documentation writer"
- Use 0.3 for facts, 0.7 for balanced, 0.9 for creative tasks.
- Be specific: "Provide 5 facts about cats focusing on behavior" rather than "Tell me about cats".
- Define the AI's role and tone for consistent responses.
- Always wrap API calls in try/catch with proper timeout handling (see the sketch below).
- Enable streaming for stories, articles, and long content.
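As a concrete illustration of the error-handling tip above, a minimal sketch of a call wrapped with a timeout and basic exception handling:

```python
import requests

def ask_zephyr(prompt: str, timeout: float = 30.0) -> str | None:
    """Call /generate with a timeout and catch network/HTTP failures."""
    try:
        response = requests.post(
            "https://zephyr-api.matiasvergara.workers.dev/generate",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json={"text": prompt, "temperature": 0.7},
            timeout=timeout,
        )
        response.raise_for_status()
        return response.json()["message"]
    except requests.Timeout:
        print("Request timed out - consider streaming for long outputs.")
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
    return None

print(ask_zephyr("Summarize the benefits of streaming responses"))
```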
Complete Python examples with all parameters and tools.
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Explain quantum computing in simple terms",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 1024,
"top_p": 0.9,
"stream": False
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Summarize the main points from this article",
"url": "https://example.com/article",
"model": "gemini-3",
"temperature": 0.5,
"max_tokens": 500
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
],
"temperature": 0.3,
"max_tokens": 2048
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "What are the latest developments in AI this week?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
],
"temperature": 0.7,
"max_tokens": 1500
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Write a short story about space exploration",
"model": "gemini-3",
"temperature": 0.9,
"max_tokens": 2048,
"stream": True
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers, stream=True)
for line in response.iter_lines():
if line:
print(line.decode('utf-8'))
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Analyze this webpage and write code to extract the data",
"url": "https://example.com/data",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 2048,
"top_p": 0.9,
"stream": False,
"tools": [
{
"code_execution": {}
},
{
"google_search": {}
}
]
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
result = response.json()
print("Response:", result["message"])
print("Tokens used:", result["usage"]["total_tokens"])
print("Model:", result["model"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/chat/completions"
payload = {
"model": "gemini-3",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly."
},
{
"role": "user",
"content": "How do I read a CSV file in Python?"
}
],
"temperature": 0.7,
"max_tokens": 1024
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["choices"][0]["message"]["content"])
import requests
import json
url = "https://zephyr-api.matiasvergara.workers.dev/chat/completions"
# Define custom functions the model can call
payload = {
"model": "gemini-3",
"messages": [
{
"role": "user",
"content": "What's the weather like in San Francisco?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit"
}
},
"required": ["location"]
}
}
}
],
"temperature": 0.3
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
result = response.json()
# Check if model wants to call a function
if result["choices"][0]["message"].get("tool_calls"):
tool_call = result["choices"][0]["message"]["tool_calls"][0]
function_name = tool_call["function"]["name"]
function_args = json.loads(tool_call["function"]["arguments"])
print(f"Model wants to call: {function_name}")
print(f"With arguments: {function_args}")
Complete JavaScript/Node.js examples with all parameters and tools.
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Explain quantum computing in simple terms',
model: 'gemini-3',
temperature: 0.7,
max_tokens: 1024,
top_p: 0.9,
stream: false
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Summarize the main points from this article',
url: 'https://example.com/article',
model: 'gemini-3',
temperature: 0.5,
max_tokens: 500
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Calculate the first 20 Fibonacci numbers',
model: 'gemini-3',
tools: [
{ code_execution: {} }
],
temperature: 0.3,
max_tokens: 2048
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'What are the latest developments in AI this week?',
model: 'gemini-3',
tools: [
{ google_search: {} }
],
temperature: 0.7,
max_tokens: 1500
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Write a short story about space exploration',
model: 'gemini-3',
temperature: 0.9,
max_tokens: 2048,
stream: true
})
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
console.log(chunk);
}
async function callZephyrAPI() {
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Analyze this webpage and write code to extract the data',
url: 'https://example.com/data',
model: 'gemini-3',
temperature: 0.7,
max_tokens: 2048,
top_p: 0.9,
stream: false,
tools: [
{ code_execution: {} },
{ google_search: {} }
]
})
});
const data = await response.json();
console.log('Response:', data.message);
console.log('Tokens used:', data.usage.total_tokens);
console.log('Model:', data.model);
}
callZephyrAPI();
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gemini-3',
messages: [
{
role: 'system',
content: 'You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly.'
},
{
role: 'user',
content: 'How do I read a CSV file in Python?'
}
],
temperature: 0.7,
max_tokens: 1024
})
});
const data = await response.json();
console.log(data.choices[0].message.content);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gemini-3',
messages: [
{
role: 'user',
content: "What's the weather like in San Francisco?"
}
],
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get the current weather in a given location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'The city and state, e.g. San Francisco, CA'
},
unit: {
type: 'string',
enum: ['celsius', 'fahrenheit'],
description: 'The temperature unit'
}
},
required: ['location']
}
}
}
],
temperature: 0.3
})
});
const result = await response.json();
// Check if model wants to call a function
if (result.choices[0].message.tool_calls) {
const toolCall = result.choices[0].message.tool_calls[0];
const functionName = toolCall.function.name;
const functionArgs = JSON.parse(toolCall.function.arguments);
console.log(`Model wants to call: ${functionName}`);
console.log(`With arguments:`, functionArgs);
}
Complete cURL examples with all parameters and tools.
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Explain quantum computing in simple terms",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 1024,
"top_p": 0.9,
"stream": false
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Summarize the main points from this article",
"url": "https://example.com/article",
"model": "gemini-3",
"temperature": 0.5,
"max_tokens": 500
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
],
"temperature": 0.3,
"max_tokens": 2048
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "What are the latest developments in AI this week?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
],
"temperature": 0.7,
"max_tokens": 1500
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Write a short story about space exploration",
"model": "gemini-3",
"temperature": 0.9,
"max_tokens": 2048,
"stream": true
}' \
--no-buffer
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Analyze this webpage and write code to extract the data",
"url": "https://example.com/data",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 2048,
"top_p": 0.9,
"stream": false,
"tools": [
{
"code_execution": {}
},
{
"google_search": {}
}
]
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gemini-3",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly."
},
{
"role": "user",
"content": "How do I read a CSV file in Python?"
}
],
"temperature": 0.7,
"max_tokens": 1024
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gemini-3",
"messages": [
{
"role": "user",
"content": "What'\''s the weather like in San Francisco?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit"
}
},
"required": ["location"]
}
}
}
],
"temperature": 0.3
}'
curl -X GET https://zephyr-api.matiasvergara.workers.dev/stats \
-H "Authorization: Bearer YOUR_API_KEY"