
Introduction

HyperGen provides an OpenAI-compatible HTTP API for image generation. This allows you to:
  • Generate images via REST API calls
  • Integrate with OpenAI client libraries
  • Deploy as a scalable service
  • Use familiar OpenAI API patterns
The API follows the request and response formats of OpenAI’s image generation endpoints, so it can act as a drop-in replacement for clients that already call them.

Starting the Server

Start the HyperGen API server using the CLI:
hypergen serve stabilityai/stable-diffusion-xl-base-1.0 \
  --host 0.0.0.0 \
  --port 8000 \
  --device cuda \
  --api-key your-secret-key

Server Options

  • model (string, required): HuggingFace model ID or local path to the model.
  • --host (string, default: "0.0.0.0"): Host address to bind to. Use 0.0.0.0 to accept connections from any IP.
  • --port (integer, default: 8000): Port to listen on.
  • --device (string, default: "cuda"): Device to run the model on: cuda, cpu, or mps.
  • --dtype (string, default: "float16"): Model data type: float16, bfloat16, or float32.
  • --api-key (string): API key for authentication. If not set, authentication is disabled.
  • --lora (string): Path to LoRA weights to load.
  • --max-queue-size (integer, default: 100): Maximum number of requests in the queue.
  • --max-batch-size (integer, default: 1): Maximum batch size for processing.

Example

# Start server with authentication
hypergen serve stabilityai/sdxl-turbo \
  --api-key sk-hypergen-123456 \
  --device cuda \
  --port 8000

# Start server with LoRA
hypergen serve stabilityai/stable-diffusion-xl-base-1.0 \
  --lora ./my_lora_weights \
  --api-key sk-hypergen-123456

# Start server without authentication (development only)
hypergen serve stabilityai/sdxl-turbo \
  --device cuda \
  --port 8000

Authentication

API Key

If you start the server with --api-key, every request must send that key as a Bearer token in the Authorization header:
curl http://localhost:8000/v1/images/generations \
  -H "Authorization: Bearer sk-hypergen-123456" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A cat holding a sign"
  }'

Error Responses

401 Unauthorized is returned when:
  • Authorization header is missing
  • API key is invalid
  • API key format is incorrect
{
  "detail": "Invalid API key"
}
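
The curl request above can be sketched in Python with only the standard library. This is a minimal illustration, not HyperGen code: the helper names `build_request` and `generate` are hypothetical, and the URL and key are the placeholder values used in the examples on this page.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local server from the examples above


def build_request(prompt, api_key=None, base_url=BASE_URL):
    """Build a POST request for /v1/images/generations (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Same header shape as the curl example: "Authorization: Bearer <key>"
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=body,
        headers=headers,
        method="POST",
    )


def generate(prompt, api_key=None):
    """Send the request and turn a 401 response into PermissionError."""
    try:
        with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            raise PermissionError("HyperGen rejected the API key") from err
        raise
```

Calling `generate("A cat holding a sign", api_key="sk-hypergen-123456")` against a running server would return the parsed JSON body; the exact fields of that body are not described in this section.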

Base URL

When running locally, the default base URL is:
http://localhost:8000
The OpenAI-compatible endpoints are prefixed with /v1 to match OpenAI’s API structure (the /health endpoint is not):
http://localhost:8000/v1/images/generations
http://localhost:8000/v1/models

Endpoints

HyperGen provides the following API endpoints:

POST /v1/images/generations

Generate images from text prompts (OpenAI-compatible)

GET /v1/models

List available models (OpenAI-compatible)

GET /health

Health check endpoint for monitoring
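
As a rough sketch of how a client might address these three endpoints, the snippet below maps short names to the paths listed above and issues a GET request. The names `ENDPOINTS`, `endpoint_url`, and `get_json` are illustrative assumptions, and the response bodies are returned as parsed JSON without assuming any particular fields.

```python
import json
import urllib.request

# Paths as listed in this section; /health has no /v1 prefix.
ENDPOINTS = {
    "generate": "/v1/images/generations",
    "models": "/v1/models",
    "health": "/health",
}


def endpoint_url(name, base_url="http://localhost:8000"):
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def get_json(name, api_key=None, base_url="http://localhost:8000", timeout=5.0):
    """GET an endpoint (for example "models" or "health") and parse the JSON body."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    req = urllib.request.Request(endpoint_url(name, base_url), headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `get_json("models", api_key="sk-hypergen-123456")` would fetch the model list from a local server.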

OpenAI Compatibility

HyperGen’s HTTP API is designed to be compatible with OpenAI’s image generation API. This means:
  1. Drop-in Replacement: Use HyperGen as a replacement for OpenAI’s DALL-E API
  2. Same Request Format: Request and response formats match OpenAI’s spec
  3. Client Library Support: Works with OpenAI’s official client libraries

Using OpenAI Python Library

from openai import OpenAI

# Point to your HyperGen server
client = OpenAI(
    api_key="sk-hypergen-123456",
    base_url="http://localhost:8000/v1"
)

# Use exactly like OpenAI's API
response = client.images.generate(
    prompt="A cat holding a sign",
    model="stabilityai/sdxl-turbo",  # Ignored by HyperGen
    n=1,
    size="1024x1024"
)

image_url = response.data[0].url

Using OpenAI Node.js Library

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-hypergen-123456',
  baseURL: 'http://localhost:8000/v1'
});

const response = await client.images.generate({
  prompt: 'A cat holding a sign',
  n: 1,
  size: '1024x1024'
});

console.log(response.data[0].url);

HyperGen Extensions

While maintaining OpenAI compatibility, HyperGen adds several extensions:

Extended Parameters

  • negative_prompt (string): Negative prompt describing what to avoid.
  • num_inference_steps (integer, default: 50): Number of denoising steps (quality vs. speed).
  • guidance_scale (float, default: 7.5): How closely to follow the prompt.
  • seed (integer): Random seed for reproducible results.
  • lora_path (string): Path to LoRA weights (overrides the server default).
  • lora_scale (float, default: 1.0): LoRA influence strength (0.0 to 2.0).
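
One way to assemble a request body from these parameters is sketched below. The `build_payload` helper is hypothetical, and the client-side range check on `lora_scale` is an assumption based on the range documented above; the server may validate differently. Fields left as `None` are omitted so the server defaults apply.

```python
def build_payload(prompt, *, negative_prompt=None, num_inference_steps=None,
                  guidance_scale=None, seed=None, lora_path=None,
                  lora_scale=None, size=None):
    """Build a JSON-serializable body for POST /v1/images/generations."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    if num_inference_steps is not None and num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    # Assumed check: mirrors the documented 0.0 to 2.0 range for lora_scale.
    if lora_scale is not None and not 0.0 <= lora_scale <= 2.0:
        raise ValueError("lora_scale must be between 0.0 and 2.0")

    optional = {
        "negative_prompt": negative_prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "seed": seed,
        "lora_path": lora_path,
        "lora_scale": lora_scale,
        "size": size,
    }
    payload = {"prompt": prompt}
    # Drop unset fields so the server falls back to its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.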

Example with Extensions

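import requests

# Standard OpenAI fields (prompt, size) combined with HyperGen extension fields
response = requests.post(
    "http://localhost:8000/v1/images/generations",
    headers={"Authorization": "Bearer sk-hypergen-123456"},
    json={
        "prompt": "A beautiful landscape",
        "negative_prompt": "blurry, low quality",
        "num_inference_steps": 50,
        "guidance_scale": 7.5,
        "seed": 42,
        "size": "1024x1024"
    }
)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses
result = response.json()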

Error Handling

The API uses standard HTTP status codes:
  • 200 OK: Request succeeded
  • 400 Bad Request: Invalid request parameters
  • 401 Unauthorized: Missing or invalid API key
  • 500 Internal Server Error: Server error during generation

Error Response Format

Errors return a JSON body in one of two shapes. Either a plain detail message:
{
  "detail": "Error message describing what went wrong"
}
Or in OpenAI format:
{
  "error": {
    "message": "Error message",
    "type": "internal_error"
  }
}
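
Because an error body can arrive in either shape, a client may want one helper that reads both. The function below is a small sketch under that assumption; `error_message` is an illustrative name, not part of HyperGen or the OpenAI client.

```python
def error_message(body):
    """Return the error text from either documented shape, or None if absent."""
    if not isinstance(body, dict):
        return None
    # Shape 1: {"detail": "..."}
    detail = body.get("detail")
    if isinstance(detail, str):
        return detail
    # Shape 2 (OpenAI style): {"error": {"message": "...", "type": "..."}}
    error = body.get("error")
    if isinstance(error, dict):
        message = error.get("message")
        if isinstance(message, str):
            return message
    return None
```

With the `requests` example above, this could be called as `error_message(response.json())` whenever the status code is 400 or higher.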

Rate Limiting

HyperGen uses a request queue system:
  • Maximum queue size: Configurable via --max-queue-size (default: 100)
  • Requests beyond the queue size will wait
  • Queue status available via /health endpoint
Monitor the queue size through the /health endpoint to track server load.
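
A client-side readiness check with exponential backoff might look like the sketch below. The fields of the /health response are not documented in this section, so the code only assumes that a healthy server answers with HTTP 200; `health_ok` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def health_ok(url="http://localhost:8000/health", timeout=5.0):
    """Return True if GET /health answers with HTTP 200 (assumed to mean healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_until_ready(check=health_ok, attempts=5, delay=1.0, sleep=time.sleep):
    """Call check() up to `attempts` times, doubling the pause after each failure."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))
    return False
```

Passing `check` and `sleep` as arguments keeps the retry logic testable without a running server.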

Best Practices

Production

  • Always use --api-key for authentication
  • Set an appropriate --max-queue-size based on expected traffic
  • Use a reverse proxy (nginx, Caddy) for HTTPS
  • Monitor the /health endpoint

Performance

  • Run on a GPU-enabled instance
  • Use SDXL Turbo for faster generation (4 steps vs 50)
  • Adjust --max-batch-size to trade throughput against latency
  • Use float16 dtype for 2x speed improvement
  • Keep the queue size reasonable to avoid memory issues

Development

  • Run without --api-key for easier testing
  • Use the --reload flag for auto-restart on code changes
  • Monitor logs for debugging
  • Test with the /health endpoint first

Next Steps