Currently operating as a brainstorming page
Request Routing
Request Processing Flow (both transcoding and AI)
- Request Validation: OpenAPI validation middleware validates request structure
- Session Selection: AISessionManager selects appropriate orchestrator based on model capability
- Payment Processing: Calculates payment based on pixel count for non-live endpoints
- Model Execution: Sends request to AI worker with specified model
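Conceptually, these stages run in sequence inside a single request handler. The sketch below only illustrates that ordering; the function and type names (validateRequest, selectSession, calculatePayment, submitJob, AIRequest) are hypothetical stand-ins, not the actual APIs in ai_mediaserver.go or AISessionManager.

```go
// Illustrative sketch of the request processing flow. All names below are
// hypothetical stand-ins, not the real go-livepeer API.
package main

import (
	"fmt"
	"net/http"
)

type AIRequest struct {
	ModelID string
	Width   int
	Height  int
	Outputs int
}

func handleAIRequest(w http.ResponseWriter, r *http.Request) {
	req, err := validateRequest(r) // 1. OpenAPI-style structural validation
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	sess, err := selectSession(req.ModelID) // 2. orchestrator with the required model capability
	if err != nil {
		http.Error(w, err.Error(), http.StatusServiceUnavailable)
		return
	}

	fee := calculatePayment(req) // 3. pixel-based payment for non-live endpoints

	result, err := submitJob(sess, req, fee) // 4. dispatch to the AI worker
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	fmt.Fprint(w, result)
}

// Stubs standing in for the real validation, selection, payment, and dispatch logic.
func validateRequest(r *http.Request) (AIRequest, error) {
	return AIRequest{ModelID: "example/model", Width: 1024, Height: 1024, Outputs: 1}, nil
}
func selectSession(modelID string) (string, error) { return "orchestrator-0", nil }
func calculatePayment(req AIRequest) int64 {
	return int64(req.Width) * int64(req.Height) * int64(req.Outputs)
}
func submitJob(sess string, req AIRequest, fee int64) (string, error) { return "ok", nil }

func main() {
	http.HandleFunc("/ai", handleAIRequest)
	_ = http.ListenAndServe(":8080", nil)
}
```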
Request Processing Flow (diagram)
Transcoding Requests
Traditional video transcoding requests are handled through:
- RTMP ingest: Port 1935 by default
- HTTP push: /live/{streamKey} endpoint when -httpIngest is enabled
- HLS output: Adaptive bitrate streams for playback
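As a rough illustration of the HTTP push path, a pre-encoded segment can be sent directly to the /live/{streamKey} endpoint when -httpIngest is enabled. The gateway address, port, and segment naming below are assumptions for the example only; check the node's configured HTTP ingest address.

```go
// Hypothetical HTTP push of a single segment to the /live/{streamKey} endpoint.
// The gateway address, port, and path layout are assumptions for illustration.
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	segment, err := os.ReadFile("seg0.ts") // a pre-encoded MPEG-TS segment
	if err != nil {
		log.Fatal(err)
	}

	url := "http://localhost:8935/live/mystream/0.ts" // assumed address and segment naming
	req, err := http.NewRequest(http.MethodPut, url, bytes.NewReader(segment))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "video/mp2t")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("push status:", resp.Status)
}
```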
AI Requests
AI processing requests are routed through dedicated endpoints defined in ai_mediaserver.go (fixme). The OpenAPI spec is at ai/worker/api/openapi.json.
- Generate images from text prompts. Uses jsonDecoder for parsing.
- Transform images with prompts. Uses multipartDecoder for file uploads.
- Create videos from images. Uses multipartDecoder for file uploads.
- Upscale (enhance) images to higher resolution. Uses multipartDecoder for file uploads.
- Apply transformations to a live video streamed to the returned endpoints. The live video endpoint has specialized handling for real-time streaming with MediaMTX integration.
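As an example of the JSON path, a text-to-image style request can be posted with the prompt in the body, while the image-based endpoints expect multipart uploads instead. The endpoint path, port, and field names below are assumptions for illustration; the authoritative request shapes are in ai/worker/api/openapi.json.

```go
// Hypothetical JSON request to a text-to-image style endpoint.
// Path, port, and field names are assumptions; consult ai/worker/api/openapi.json.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	body, _ := json.Marshal(map[string]any{
		"prompt":   "a watercolor painting of a lighthouse at dusk",
		"model_id": "example/model", // assumed field name for model selection
	})

	resp, err := http.Post("http://localhost:8935/text-to-image", "application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(out))
}
```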
Payment Models
The dual setup handles two different payment models:

Transcoding Payments
- Basis: Per video segment processed
- Method: Payment tickets sent with each segment
- Verification: Multi-orchestrator verification for quality assurance
AI Payments
- Basis: Per pixel processed (width × height × outputs)
- Method: Pixel-based payment calculation
- Live Video: Interval-based payments during streaming
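A back-of-the-envelope version of the pixel-based calculation multiplies width × height × outputs by a price per pixel. The helper below is a simplified sketch for intuition (the price value is made up), not the exact accounting the node performs.

```go
// Sketch of pixel-based payment estimation: pixels = width × height × outputs.
// The price per pixel and the overall formula are simplified assumptions.
package main

import "fmt"

func estimateFee(width, height, outputs int, pricePerPixel float64) float64 {
	pixels := int64(width) * int64(height) * int64(outputs)
	return float64(pixels) * pricePerPixel
}

func main() {
	// e.g. one 1024×1024 image at an assumed price of 3e-9 units per pixel
	fee := estimateFee(1024, 1024, 1, 3e-9)
	fmt.Printf("estimated fee: %.6f units\n", fee)
}
```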
Operational Considerations
Resource Allocation
When running a dual setup, consider:
- GPU resources: Shared between transcoding and AI workloads
- Memory: AI models require significant RAM when loaded (“warm”)
- Network: Bandwidth for both stream ingest and AI request/response
Monitoring
Monitor both workload types:
- Transcoding: Segment processing latency, success rates
- AI: Model loading times, inference latency, pixel processing rates
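A minimal way to start collecting per-workload latency at the gateway is a timing wrapper around each handler, as in the generic sketch below. This is not go-livepeer's built-in monitoring, just an illustration of the measurement; the paths and workload labels are assumptions.

```go
// Generic latency-logging middleware sketch; the node's own monitoring
// exposes richer metrics than this.
package main

import (
	"log"
	"net/http"
	"time"
)

func timed(workload string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		log.Printf("workload=%s path=%s latency=%s", workload, r.URL.Path, time.Since(start))
	})
}

func main() {
	// Placeholder handlers; in practice these would wrap the real ingest and AI routes.
	http.Handle("/live/", timed("transcoding", http.NotFoundHandler()))
	http.Handle("/text-to-image", timed("ai", http.NotFoundHandler()))
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```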
Scaling Strategies
- Horizontal: Deploy multiple gateway instances behind a load balancer
- Vertical: Allocate more GPU resources for AI model parallelism
- Specialized: Separate nodes for transcoding vs AI based on workload patterns