Livepeer’s AI subnet launched in Q3 2024 and has grown into a major source of new fee revenue for orchestrators. It turns GPU nodes into open, composable inference infrastructure that serves image generation, live-video effects, and large language model completions. AI workloads reach your node through gateway routing, capability advertisement, and container-based inference. The core operator distinction is between batch inference and live-video inference, because the hardware profile and routing logic differ.
How the network routes AI jobs
Applications never communicate with orchestrators directly. Every request flows through a gateway, which handles authentication, pricing negotiation, and routing to qualified nodes.

How gateway selection actually works
Gateways discover orchestrators through the `OrchestratorInfo` structure, which your node serves to gateways during discovery. The key fields that determine whether you receive AI jobs are the capabilities you advertise and the per-capability prices attached to them.
Gateway pricing is a hard gate. Gateways configure a maximum price they will pay per capability with the `-maxPricePerCapability` flag, which takes a JSON value. A pipeline priced above that maximum receives no jobs from that gateway, regardless of hardware quality.
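To make the gate concrete, here is a sketch of a gateway-side price ceiling, assuming the flag's documented `capabilities_prices` JSON shape. The wei-denominated numbers are illustrative, not recommendations, and `default` stands in for any model of that pipeline:

```json
{
  "capabilities_prices": [
    {
      "pipeline": "text-to-image",
      "model_id": "SG161222/RealVisXL_V4.0_Lightning",
      "price_per_unit": 4768371,
      "pixels_per_unit": 6355992
    },
    {
      "pipeline": "audio-to-text",
      "model_id": "default",
      "price_per_unit": 12882811,
      "pixels_per_unit": 1
    }
  ]
}
```

An orchestrator that prices text-to-image above this ceiling simply never appears in this gateway's selection pool.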
Before setting prices in `aiModels.json`, check what the major gateways are currently paying. See Models and VRAM Reference for a pricing reference table and Gateway Orchestrator Offerings for the full capability-discovery protocol documentation.
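On the orchestrator side, pipelines and prices live in `aiModels.json`. A minimal sketch, assuming the documented entry format; the wei prices are placeholders, and the model IDs are the ones referenced later on this page:

```json
[
  {
    "pipeline": "text-to-image",
    "model_id": "SG161222/RealVisXL_V4.0_Lightning",
    "price_per_unit": 4768371,
    "pixels_per_unit": 6355992,
    "warm": true
  },
  {
    "pipeline": "audio-to-text",
    "model_id": "openai/whisper-large-v3",
    "price_per_unit": 12882811,
    "pixels_per_unit": 1
  }
]
```

Setting `"warm": true` keeps that model loaded in VRAM so requests skip the model-load penalty, at the cost of reserving the card for that pipeline.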
For the complete list of supported pipelines and their model architectures, see AI Model Support in the Developers section.
The two workload types
The most important distinction for operators is between batch AI and live-video AI. These are different job types with different hardware profiles, different runtime architectures, and different operational characteristics.

Batch AI
Cascade live-video AI
Comparison
AI pipeline types
Livepeer’s AI worker supports ten pipeline types. Each pipeline handles a specific class of inference task, with its own model format, VRAM floor, and pricing unit.
text-to-image — Generate images from text prompts
Model: `SG161222/RealVisXL_V4.0_Lightning`
Typical hardware: RTX 3090, RTX 4090, A5000

Diffusion models (Stable Diffusion, SDXL variants) run natively on the managed livepeer/ai-runner container. The Lightning and Turbo variants reduce step count to deliver results in under 2 seconds on an RTX 4090.

Source: SG161222/RealVisXL_V4.0_Lightning on HuggingFace
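From the application side, a text-to-image job is a single HTTP call to a gateway. A sketch, assuming the pipeline's documented request fields (model_id, prompt, width, height); the gateway URL is a placeholder, so substitute one you have access to:

```bash
# Placeholder gateway URL -- substitute a gateway you have access to.
curl -s https://<your-gateway>/text-to-image \
  -H "Content-Type: application/json" \
  -d '{
        "model_id": "SG161222/RealVisXL_V4.0_Lightning",
        "prompt": "a lighthouse at dusk, photorealistic",
        "width": 1024,
        "height": 1024
      }'
```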
image-to-image — Style transfer and transformation
Model: `ByteDance/SDXL-Lightning`
Typical hardware: RTX 3090, RTX 4090
image-to-video — Animate a still image
image-to-text — Vision-language captioning
Model: `Salesforce/blip-image-captioning-large`
Typical hardware: RTX 2060, GTX 1080 (as a secondary pipeline)
audio-to-text — Speech recognition and transcription
Model: `openai/whisper-large-v3`
Typical hardware: RTX 3060 12 GB, RTX 3080 10 GB

Source: openai/whisper-large-v3 on HuggingFace
segment-anything-2 — Promptable segmentation
text-to-speech — Natural speech synthesis
upscale — Resolution enhancement
Model: `stabilityai/stable-diffusion-x4-upscaler`
Pricing unit: Per input pixel
llm — Large language model inference
Model: `meta-llama/Meta-Llama-3.1-8B-Instruct` (via Ollama)
Typical hardware: GTX 1070 Ti, GTX 1080, RTX 2060

The LLM pipeline uses a separate runner architecture from the standard livepeer/ai-runner image. See Batch AI Setup for the Ollama deployment guide.

Source: Cloud SPE Ollama runner blog post
live-video-to-video — Cascade streaming AI
Runner: `livepeer/ai-runner:live-base` + ComfyStream
Typical hardware: RTX 4090, A100, H100

This pipeline powers the Cascade architecture, Livepeer's live-video AI system. It supports live AI effects, live style transfer, and streaming AI agents.

Source: ComfyStream on GitHub

Hardware by workload type

- Batch AI: request/response pipelines (text-to-image, audio-to-text, llm, and the rest) run on the managed livepeer/ai-runner container, with Ollama for llm; typical GPUs range from GTX 1070 Ti for llm up to RTX 4090 or A5000 for diffusion work.
- Live-video AI: live-video-to-video runs on livepeer/ai-runner:live-base with ComfyStream and calls for RTX 4090, A100, or H100 class hardware.
What you build and what the network supplies
The Livepeer protocol handles the hard parts of running an inference marketplace. You don't need to:

- Build a marketplace or API
- Implement authentication or billing
- Handle service discovery
- Build brand recognition

You do need to:

- Run and maintain GPU infrastructure
- Configure `aiModels.json` with your supported pipelines and pricing (a start-command sketch follows this list)
- Keep your primary models warm and your node performant
- Stay competitive on latency and pricing
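Tying the checklist together, a start command for an AI-enabled orchestrator might look like the sketch below. Flag names follow the go-livepeer AI builds; the paths, RPC URL, and service address are assumptions to adapt for your deployment:

```bash
# Sketch: AI-enabled orchestrator startup (adjust every value for your node).
livepeer \
  -orchestrator \
  -aiWorker \
  -aiModels ~/.lpData/aiModels.json \
  -aiModelsDir ~/.lpData/models \
  -network arbitrum-one-mainnet \
  -ethUrl https://arb1.arbitrum.io/rpc \
  -serviceAddr <public-ip>:8935
```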
Network participation
To verify your pipelines are visible to the network and check live capability coverage:

- Network capabilities: tools.livepeer.cloud/ai/network-capabilities
- Orchestrator performance: explorer.livepeer.org