Choose a plan that fits your needs. Scale as you grow.
- Basic: Perfect for testing and personal projects
- Pro: Ideal for startups and growing businesses
- Mega: Enterprise-grade for unlimited scale
- Enterprise: Maximum power with premium AI models
- Custom: Tailored solutions for your scale
Zephyr API is a RESTful interface for accessing a wide range of state-of-the-art language models. Designed to deliver Pro-level reasoning at Flash speeds and low cost, it is ideal for agentic workflows, multi-turn chat, and coding assistance.
All API requests should be made to the following base URL:
https://zephyr-api.matiasvergara.workers.dev/
Zephyr API offers a wide range of state-of-the-art models tailored to different performance and cost needs.
Available with regular API usage limits based on your plan.
| Model ID | Description | Plan |
|---|---|---|
| openai | GPT-5 Nano | Basic+ |
| gemini-3 | Gemini 3.0 Flash | Basic+ |
| gemini-2.5 | Gemini 2.5 Flash Lite | Basic+ |
| gpt-oss | GPT-OSS 120B | Mega+ |
| grok | xAI Grok 4 Fast | Pro+ |
| minimax | MiniMax M2.1 | Pro+ |
| glm4.7 | GLM 4.7 | Mega+ |
| deepseek-v3.2 | DeepSeek V3.2 with Thinking | Mega+ |
Enterprise Exclusive — These state-of-the-art models are only available with the Enterprise plan.
Daily Limit: Premium models have a limit of 100 requests/day. Usage resets at midnight UTC.
| Model ID | Description | Plan |
|---|---|---|
| gpt-5.2 | OpenAI GPT 5.2 | Enterprise |
| gemini-3-pro | Gemini 3 Pro | Enterprise |
| opus-4.5 | Claude Opus 4.5 | Enterprise |
| sonnet-4.5 | Claude Sonnet 4.5 | Enterprise |
| haiku-4.5 | Claude Haiku 4.5 | Enterprise |
| opus-4.5-reasoning | Claude Opus 4.5 with Reasoning Content | Enterprise |
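You select a model by passing its Model ID from the tables above in the model field of a /generate request. A minimal sketch, assuming your plan includes the chosen model (premium models count toward the 100 requests/day limit):

```python
import requests

# Select a model by its Model ID. "opus-4.5" requires an Enterprise plan;
# swap in e.g. "gemini-3" on lower tiers.
response = requests.post(
    "https://zephyr-api.matiasvergara.workers.dev/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Summarize the theory of relativity", "model": "opus-4.5"},
)
print(response.json()["message"])
```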
Check which features are available in your plan. Availability varies across the Basic, Pro, Mega, Enterprise, and Custom plans for:
- Visual Input (Images)
- Function Calling
- URL Context & Web Search (Gemini models)
The Zephyr API uses API Keys to authenticate requests and track usage against your plan limits. All users must create an account and verify their email before accessing the API.
To get started, create an account and verify your email address. After verification, you can sign in and access your API key.
Include your API key in the Authorization header of every request:
Authorization: Bearer YOUR_API_KEY
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{"text": "Hello, world!"}'
Your API key is linked to your subscription plan. Usage limits include a per-minute rate limit, a daily request quota, and a daily token quota, all of which depend on your plan. Usage resets at midnight UTC. Check the Console for your current usage statistics.
Endpoint: GET /
Check if the API is operational. Requires a valid API key.
{
"message": "Zephyr API is operational",
"version": "3.2.0",
"status": "ready"
}
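A quick way to verify both connectivity and that your key is accepted is to call this endpoint from a script. A minimal sketch:

```python
import requests

# Verify connectivity and that the API key is accepted.
r = requests.get(
    "https://zephyr-api.matiasvergara.workers.dev/",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
print(r.status_code, r.json().get("status"))  # expect 200 and "ready"
```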
Endpoint: GET /stats
Get your current usage statistics and plan details.
{
"plan": "BASIC",
"planLabel": "Free Tier",
"stats": {
"tokens": { "date": "2026-01-15", "count": 1024 },
"requests": { "date": "2026-01-15", "count": 15 },
"tokenLimit": 4096,
"rateLimit": "1 req/min"
}
}
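For example, you can check your remaining daily quota from a script. A minimal sketch, using the field names from the response above:

```python
import requests

# Fetch current usage for your API key and print daily token consumption.
stats = requests.get(
    "https://zephyr-api.matiasvergara.workers.dev/stats",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
).json()

used = stats["stats"]["tokens"]["count"]
limit = stats["stats"]["tokenLimit"]
print(f"Plan: {stats['planLabel']} - tokens used today: {used}/{limit}")
```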
Endpoint: POST /generate
Generates text using the selected language model.
POST /generate
Content-Type: application/json
{
"text": "Your prompt here",
"temperature": 0.7,
"system": "You are a helpful assistant"
}
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | - | Your question or prompt |
| temperature | number | No | 0.7 | Controls creativity (0.0-1.0) |
| system | string | No | null | System prompt |
| image | string | No | null | Image URL for analysis |
| url | string | No | null | Web page URL for context |
| stream | boolean | No | false | Enable streaming |
| history | array | No | [] | Conversation history (see the example below) |
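The history parameter is how you carry multi-turn context between calls. A minimal sketch of a follow-up question; the role/content message shape is an assumption based on the tool-result history example later in this guide:

```python
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# First turn
first = requests.post(url, headers=headers, json={
    "text": "Give me three facts about octopuses"
}).json()

# Second turn: pass the previous exchange back via "history".
followup = requests.post(url, headers=headers, json={
    "text": "Expand on the second fact",
    "history": [
        {"role": "user", "content": "Give me three facts about octopuses"},
        {"role": "assistant", "content": first["message"]},
    ],
}).json()
print(followup["message"])
```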
Endpoint: GET /health
Checks the API status.
GET /health
Success Response (for POST /generate):
{
"success": true,
"model": "gemini-3-flash",
"message": "AI response here...",
"latency": 1.45,
"finish_reason": "STOP",
"usage": {
"prompt_tokens": 38,
"completion_tokens": 175,
"total_tokens": 213
},
"metadata": {
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2026-01-12T15:30:00.000Z",
"duration_ms": 1450,
"temperature_used": 0.7
}
}
Error Response:
{
"error": "API Error",
"details": "Error message here",
"status": 500,
"request_id": "550e8400-e29b-41d4-a716-446655440000"
}
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Invalid parameters |
| 500 | Server error |
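In practice, branch on the HTTP status code and read the documented error fields when a call fails. A minimal sketch:

```python
import requests

response = requests.post(
    "https://zephyr-api.matiasvergara.workers.dev/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"text": "Hello, world!"},
)

if response.status_code == 200:
    print(response.json()["message"])
else:
    # Error responses carry "error", "details", and "request_id" fields.
    err = response.json()
    print(f"{response.status_code}: {err.get('error')} - {err.get('details')}")
    print("Request ID for support:", err.get("request_id"))
```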
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# Basic request
response = requests.post(url, headers=headers, json={
    "text": "What is artificial intelligence?",
    "temperature": 0.7
})
print(response.json()["message"])

# With a system prompt
response = requests.post(url, headers=headers, json={
    "text": "Explain quantum physics",
    "system": "You are a physics professor. Use simple language.",
    "temperature": 0.5
})
print(response.json()["message"])

# Image analysis
response = requests.post(url, headers=headers, json={
    "text": "Describe this image",
    "image": "https://example.com/photo.jpg"
})
print(response.json()["message"])

# Streaming
response = requests.post(url, headers=headers, json={
    "text": "Write a short story",
    "stream": True
}, stream=True)
for line in response.iter_lines():
    if line:
        print(line.decode('utf-8'))
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
body: JSON.stringify({
text: "What is artificial intelligence?",
temperature: 0.7
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
body: JSON.stringify({
text: "Explain quantum physics",
system: "You are a physics professor.",
temperature: 0.5
})
});
console.log((await response.json()).message);
const response = await fetch("https://zephyr-api.matiasvergara.workers.dev/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json", "Authorization": "Bearer YOUR_API_KEY" },
  body: JSON.stringify({ text: "Write a story", stream: true })
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
console.log(decoder.decode(value));
}
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "What is artificial intelligence?",
"temperature": 0.7
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Explain quantum physics",
"system": "You are a physics professor.",
"temperature": 0.5
}'
curl https://zephyr-api.matiasvergara.workers.dev/health
Common use cases:
- Simple factual queries and information retrieval.
- Stories, poems, and creative content generation.
- Analyze and explain code snippets.
- Describe and analyze images via URL.
- Summarize web pages and articles.
- Conversational AI with history context.
Extend model capabilities with external tools and function calling. Available features depend on the model and plan.
Gemini Models | Basic+ Plans
Pass a URL to provide web page context for your query. The model will fetch and analyze the page content.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "Summarize this article",
"url": "https://example.com/article",
"model": "gemini-3"
}
Gemini Models | Basic+ Plans
Enable the model to write and execute Python code to solve problems that require computation.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
]
}
Gemini Models | Basic+ Plans
Allow the model to search Google for real-time information and current events.
⚠️ Basic Plan: Limited to 50 uses per day.
{
"text": "What are the latest news about AI?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
]
}
All Models
Define custom functions that the model can invoke. You handle the execution and return results.
{
"text": "What's the weather in New York?",
"model": "gemini-3",
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "City name"
}
},
"required": ["location"]
}
}
}
]
}
{
"success": true,
"message": null,
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"New York\"}"
}
}
]
}
{
"text": "Process the weather data",
"history": [
{"role": "assistant", "tool_calls": [...]},
{"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp\": 72, \"condition\": \"sunny\"}"}
]
}
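Putting the request, response, and tool-result pieces together, a typical round trip looks roughly like this. A sketch only: the get_weather execution is a placeholder you implement yourself.

```python
import json
import requests

url = "https://zephyr-api.matiasvergara.workers.dev/generate"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

# 1) Ask the model; it may answer directly or request a tool call.
first = requests.post(url, headers=headers, json={
    "text": "What's the weather in New York?",
    "model": "gemini-3",
    "tools": tools,
}).json()

tool_call = first["tool_calls"][0]
args = json.loads(tool_call["function"]["arguments"])

# 2) Execute the function yourself (placeholder result here).
weather = {"temp": 72, "condition": "sunny"}

# 3) Return the result via history so the model can finish the answer.
final = requests.post(url, headers=headers, json={
    "text": "Process the weather data",
    "model": "gemini-3",
    "history": [
        {"role": "assistant", "tool_calls": first["tool_calls"]},
        {"role": "tool", "tool_call_id": tool_call["id"],
         "content": json.dumps(weather)},
    ],
}).json()
print(final["message"])
```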
| Feature | gemini-3 | gemini-2.5 | openai | gpt-oss |
|---|---|---|---|---|
| URL Context | Yes | Yes | No | No |
| Code Execution | Yes | Yes | No | No |
| Google Search | Yes | Yes | No | No |
| Function Calling | Yes | Yes | Yes | Yes |

URL Context, Code Execution, and Google Search are Gemini-only features (per the "Gemini Models" labels above); Function Calling is available on all models.
Controls randomness in responses:
| Range | Behavior |
|---|---|
| 0.0 - 0.3 | Deterministic, factual, consistent |
| 0.4 - 0.7 | Balanced (default: 0.7) |
| 0.8 - 1.0 | Creative, varied, unpredictable |
Customize AI behavior with different personas:
# Professional tone
system = "You are a university professor"
# Casual tone
system = "You are talking to a 10-year-old"
# Technical tone
system = "You are a technical documentation writer"
- Use 0.3 for facts, 0.7 for balanced, 0.9 for creative tasks.
- Be specific: "Provide 5 facts about cats focusing on behavior" rather than "Tell me about cats".
- Define the AI's role and tone for consistent responses.
- Always wrap API calls in try/catch with proper timeout handling (see the sketch below).
- Enable streaming for stories, articles, and long content.
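As a concrete illustration of the error-handling tip above, a minimal sketch of a call wrapped with a timeout and basic exception handling:

```python
import requests

def ask_zephyr(prompt: str, timeout: float = 30.0) -> str | None:
    """Call /generate with a timeout and catch network/HTTP failures."""
    try:
        response = requests.post(
            "https://zephyr-api.matiasvergara.workers.dev/generate",
            headers={"Authorization": "Bearer YOUR_API_KEY"},
            json={"text": prompt, "temperature": 0.7},
            timeout=timeout,
        )
        response.raise_for_status()
        return response.json()["message"]
    except requests.Timeout:
        print("Request timed out - consider streaming for long outputs.")
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
    return None

print(ask_zephyr("Summarize the benefits of streaming responses"))
```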
Complete Python examples with all parameters and tools.
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Explain quantum computing in simple terms",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 1024,
"top_p": 0.9,
"stream": False
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Summarize the main points from this article",
"url": "https://example.com/article",
"model": "gemini-3",
"temperature": 0.5,
"max_tokens": 500
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
],
"temperature": 0.3,
"max_tokens": 2048
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "What are the latest developments in AI this week?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
],
"temperature": 0.7,
"max_tokens": 1500
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["message"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Write a short story about space exploration",
"model": "gemini-3",
"temperature": 0.9,
"max_tokens": 2048,
"stream": True
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers, stream=True)
for line in response.iter_lines():
if line:
print(line.decode('utf-8'))
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/generate"
payload = {
"text": "Analyze this webpage and write code to extract the data",
"url": "https://example.com/data",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 2048,
"top_p": 0.9,
"stream": False,
"tools": [
{
"code_execution": {}
},
{
"google_search": {}
}
]
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
result = response.json()
print("Response:", result["message"])
print("Tokens used:", result["usage"]["total_tokens"])
print("Model:", result["model"])
import requests
url = "https://zephyr-api.matiasvergara.workers.dev/chat/completions"
payload = {
"model": "gemini-3",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly."
},
{
"role": "user",
"content": "How do I read a CSV file in Python?"
}
],
"temperature": 0.7,
"max_tokens": 1024
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json()["choices"][0]["message"]["content"])
import requests
import json
url = "https://zephyr-api.matiasvergara.workers.dev/chat/completions"
# Define custom functions the model can call
payload = {
"model": "gemini-3",
"messages": [
{
"role": "user",
"content": "What's the weather like in San Francisco?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit"
}
},
"required": ["location"]
}
}
}
],
"temperature": 0.3
}
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer YOUR_API_KEY"
}
response = requests.post(url, json=payload, headers=headers)
result = response.json()
# Check if model wants to call a function
if result["choices"][0]["message"].get("tool_calls"):
tool_call = result["choices"][0]["message"]["tool_calls"][0]
function_name = tool_call["function"]["name"]
function_args = json.loads(tool_call["function"]["arguments"])
print(f"Model wants to call: {function_name}")
print(f"With arguments: {function_args}")
Complete JavaScript/Node.js examples with all parameters and tools.
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Explain quantum computing in simple terms',
model: 'gemini-3',
temperature: 0.7,
max_tokens: 1024,
top_p: 0.9,
stream: false
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Summarize the main points from this article',
url: 'https://example.com/article',
model: 'gemini-3',
temperature: 0.5,
max_tokens: 500
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Calculate the first 20 Fibonacci numbers',
model: 'gemini-3',
tools: [
{ code_execution: {} }
],
temperature: 0.3,
max_tokens: 2048
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'What are the latest developments in AI this week?',
model: 'gemini-3',
tools: [
{ google_search: {} }
],
temperature: 0.7,
max_tokens: 1500
})
});
const data = await response.json();
console.log(data.message);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Write a short story about space exploration',
model: 'gemini-3',
temperature: 0.9,
max_tokens: 2048,
stream: true
})
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
const chunk = decoder.decode(value);
console.log(chunk);
}
async function callZephyrAPI() {
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/generate', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
text: 'Analyze this webpage and write code to extract the data',
url: 'https://example.com/data',
model: 'gemini-3',
temperature: 0.7,
max_tokens: 2048,
top_p: 0.9,
stream: false,
tools: [
{ code_execution: {} },
{ google_search: {} }
]
})
});
const data = await response.json();
console.log('Response:', data.message);
console.log('Tokens used:', data.usage.total_tokens);
console.log('Model:', data.model);
}
callZephyrAPI();
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gemini-3',
messages: [
{
role: 'system',
content: 'You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly.'
},
{
role: 'user',
content: 'How do I read a CSV file in Python?'
}
],
temperature: 0.7,
max_tokens: 1024
})
});
const data = await response.json();
console.log(data.choices[0].message.content);
const response = await fetch('https://zephyr-api.matiasvergara.workers.dev/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gemini-3',
messages: [
{
role: 'user',
content: "What's the weather like in San Francisco?"
}
],
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get the current weather in a given location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'The city and state, e.g. San Francisco, CA'
},
unit: {
type: 'string',
enum: ['celsius', 'fahrenheit'],
description: 'The temperature unit'
}
},
required: ['location']
}
}
}
],
temperature: 0.3
})
});
const result = await response.json();
// Check if model wants to call a function
if (result.choices[0].message.tool_calls) {
const toolCall = result.choices[0].message.tool_calls[0];
const functionName = toolCall.function.name;
const functionArgs = JSON.parse(toolCall.function.arguments);
console.log(`Model wants to call: ${functionName}`);
console.log(`With arguments:`, functionArgs);
}
Complete cURL examples with all parameters and tools.
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Explain quantum computing in simple terms",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 1024,
"top_p": 0.9,
"stream": false
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Summarize the main points from this article",
"url": "https://example.com/article",
"model": "gemini-3",
"temperature": 0.5,
"max_tokens": 500
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Calculate the first 20 Fibonacci numbers",
"model": "gemini-3",
"tools": [
{
"code_execution": {}
}
],
"temperature": 0.3,
"max_tokens": 2048
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "What are the latest developments in AI this week?",
"model": "gemini-3",
"tools": [
{
"google_search": {}
}
],
"temperature": 0.7,
"max_tokens": 1500
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Write a short story about space exploration",
"model": "gemini-3",
"temperature": 0.9,
"max_tokens": 2048,
"stream": true
}' \
--no-buffer
curl -X POST https://zephyr-api.matiasvergara.workers.dev/generate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"text": "Analyze this webpage and write code to extract the data",
"url": "https://example.com/data",
"model": "gemini-3",
"temperature": 0.7,
"max_tokens": 2048,
"top_p": 0.9,
"stream": false,
"tools": [
{
"code_execution": {}
},
{
"google_search": {}
}
]
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gemini-3",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant specialized in Python programming. Always provide code examples and explain concepts clearly."
},
{
"role": "user",
"content": "How do I read a CSV file in Python?"
}
],
"temperature": 0.7,
"max_tokens": 1024
}'
curl -X POST https://zephyr-api.matiasvergara.workers.dev/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "gemini-3",
"messages": [
{
"role": "user",
"content": "What'\''s the weather like in San Francisco?"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit"
}
},
"required": ["location"]
}
}
}
],
"temperature": 0.3
}'
curl -X GET https://zephyr-api.matiasvergara.workers.dev/stats \
-H "Authorization: Bearer YOUR_API_KEY"