Add a captioning API to your AI video product
Backend-only, webhook-native, per-minute billing
Ship styled captions in your editor without building ASR, timing, render workers, ffmpeg, or a billing layer. ZapCap is the caption-rendering API your users never see: they upload in your editor, and finished MP4s come out the other side.
Captions look simple until they're on your roadmap.
Auto-captioning is the most-asked-for feature in any short-form video app. Building it isn't one feature — it's six. Each has its own edge cases, vendors, and on-call rotation.
- ASR layer — multilingual, retries, vendor sprawl.
- Timing & layout — line-break rules, words-per-cue, safe zones, CJK / Thai.
- Style presets — Beast, Hormozi, karaoke fill, keyword emphasis.
- Render pipeline — ffmpeg workers, queues, retries, output storage.
- Async delivery — webhooks, signature verification, polling fallbacks.
- Usage billing — per-minute metering, retries, credit balances.
Where ZapCap sits in your stack
Drop the API behind your existing backend. Your users never see ZapCap — they see your editor, your export button, your branded waiting state.
Two routes, one env var, one webhook
Server-side only. Your front-end calls your API; your API calls ZapCap; ZapCap calls you back when the render is ready.
{
  "eventId": "evt_8kQ2...",
  "taskId": "tsk_2hP4...",
  "videoId": "vid_9Lm2...",
  "status": "completed",
  "renderUrl": "https://cdn.zapcap.../out.mp4"
}
- Map taskId → your user in your own store when you create the task.
- Verify HMAC on the x-signature header.
- Dedupe webhook deliveries on eventId.
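The three webhook rules above can be sketched as a small handler. This is a minimal sketch, not ZapCap's reference implementation: it assumes the x-signature header carries a hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared webhook secret, and the class name, in-memory stores, and return codes are illustrative only. Check the ZapCap docs for the actual signature encoding.

```python
import hashlib
import hmac
import json


def verify_signature(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    # Assumption: signature = hex(HMAC-SHA256(secret, raw request body)).
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison, so timing does not leak how much of the signature matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


class WebhookHandler:
    """Illustrative handler; swap the in-memory stores for your database."""

    def __init__(self, secret: bytes, task_owners: dict):
        self.secret = secret
        self.task_owners = task_owners  # taskId -> your user id, saved at task creation
        self.seen_events = set()        # eventIds already processed
        self.finished = {}              # user id -> renderUrl

    def handle(self, raw_body: bytes, signature_header: str) -> int:
        # 1. Reject anything that was not signed with the shared secret.
        if not verify_signature(raw_body, signature_header, self.secret):
            return 401
        event = json.loads(raw_body)
        # 2. Dedupe on eventId so a retried delivery is acknowledged but not reprocessed.
        if event["eventId"] in self.seen_events:
            return 200
        self.seen_events.add(event["eventId"])
        # 3. Map taskId back to your own user and record the finished render.
        user_id = self.task_owners.get(event["taskId"])
        if user_id is not None and event["status"] == "completed":
            self.finished[user_id] = event["renderUrl"]
        return 200
```

Verify against the raw bytes of the request, before any JSON parsing, since re-serializing the payload can change the bytes and break the signature check.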
Export lifecycle inside your product
Map each ZapCap status onto a UI state your users already understand.
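One way to keep that mapping in a single place is a lookup table with a safe default. In the sketch below, only the "completed" status comes from the sample payload above; "pending", "transcribing", "rendering", and "failed" are hypothetical placeholders, as are the UI labels, so substitute the status names ZapCap actually returns.

```python
from enum import Enum


class UiState(Enum):
    QUEUED = "Queued"
    PROCESSING = "Adding captions"
    READY = "Ready to download"
    FAILED = "Export failed"


# Only "completed" is taken from the sample webhook payload.
# Every other key here is an assumed status name, for illustration.
STATUS_TO_UI = {
    "pending": UiState.QUEUED,
    "transcribing": UiState.PROCESSING,
    "rendering": UiState.PROCESSING,
    "completed": UiState.READY,
    "failed": UiState.FAILED,
}


def ui_state_for(status: str) -> UiState:
    # Unknown statuses fall back to a generic in-progress state,
    # so a new upstream status never breaks the export screen.
    return STATUS_TO_UI.get(status, UiState.PROCESSING)
```

Collapsing several backend statuses into one "Adding captions" state keeps your branded waiting screen stable even if the pipeline gains intermediate steps.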
Everything you need before you ship
A short list to keep your captioning launch from looking captioning-launch-shaped. Engineering, billing, and growth in one column.
- API key in a secret store: never bundled into client builds.
- Webhook signature verified: HMAC-SHA256 on every payload.
- Dedupe webhook deliveries on eventId: survive retry storms without double-processing.
- Credit-balance check before queueing: surface upgrade prompts before the failure.
- Style preset gallery in your UI: mirror the templates ZapCap exposes.
- Failure-state UI with re-run and contact-support paths.
- Polling fallback if your stack can't accept inbound webhooks.
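For the polling fallback in the last item, a loop with exponential backoff is usually enough. This is a sketch under assumptions: fetch_status stands in for whatever call your backend makes to read a task's status, "failed" is an assumed terminal status name alongside the documented "completed", and the delays and timeout are arbitrary defaults.

```python
import time
from typing import Callable

# "completed" appears in the sample payload; "failed" is an assumed status name.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(
    fetch_status: Callable[[], str],
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task reaches a terminal status, doubling the wait each round."""
    delay = initial_delay
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"task still '{status}' after {waited:.0f}s")
        sleep(delay)
        waited += delay
        # Back off so long renders do not hammer the API, but cap the wait.
        delay = min(delay * 2, max_delay)
```

Passing sleep as a parameter keeps the loop testable without real waiting, and the cap on delay bounds how stale the UI can get for long exports.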
Engineering scope, honestly
Caption rendering stack
1. ASR vendor selection: accuracy, language coverage, retries.
2. Caption layout engine: words-per-cue, line breaks, CJK / Thai.
3. Style presets: font stack, animation, keyword emphasis logic.
4. Render workers: ffmpeg / libass, autoscaling, GPU vs CPU.
5. Job queue & state machine: retries, dead-letter, observability.
6. Output storage: pre-signed URLs, expiry, cleanup.
7. Billing meter: per-minute counters, refunds on failure.
Captioning as a primitive
1. POST /videos: done.
2. POST /videos/:id/task: done.
3. Webhook handler: verify, store, notify.
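The two routes can be wrapped in one backend function. Treat this as a rough sketch: the route paths come from the list above, but the request fields ("url", "templateId"), the response keys ("id", "taskId"), the ZAPCAP_API_KEY variable name, and the injected post transport are all assumptions made for illustration, not ZapCap's documented schema.

```python
import os
from typing import Any, Callable, Mapping, MutableMapping

# Hypothetical transport: takes a route path and a JSON body, returns decoded JSON.
# A real one would prepend the base URL and send the API key in the documented header.
Post = Callable[[str, Mapping[str, Any]], Mapping[str, Any]]


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    # ZAPCAP_API_KEY is an assumed variable name; the key stays on the server.
    return env["ZAPCAP_API_KEY"]


def start_caption_export(
    post: Post,
    source_url: str,
    template_id: str,
    user_id: str,
    task_owners: MutableMapping[str, str],
) -> str:
    # Route 1: register the video. The "url" field and "id" key are assumptions.
    video = post("/videos", {"url": source_url})
    # Route 2: create a captioning task for it. "templateId" is an assumption.
    task = post(f"/videos/{video['id']}/task", {"templateId": template_id})
    # Record the owner now, so the webhook handler can notify the right user later.
    task_owners[task["taskId"]] = user_id
    return task["taskId"]
```

Injecting the transport keeps the API key and HTTP details in one place and lets you exercise the flow in tests with a fake post function.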
Captions shipped in a single sprint
An AI video SaaS platform replaced an internal captioning prototype with the ZapCap API and shipped styled captions to users in the same sprint. Engineering scope shrank from a multi-quarter rendering stack to a webhook handler plus a style picker — caption rendering is now a primitive in the product, not a project.
For SaaS teams
Keep the ZapCap API key off the client: your front-end calls your backend, and your backend holds the key. This is the same pattern as Stripe or OpenAI. Never bundle keys into client builds.
Build the rest of the integration
Agency video captions
The same backend pattern, multi-tenant: a stored caption preset per client, delivered to a white-label portal.
Multilingual subtitle rendering
Add translated, per-script captions to your product so users can localize their exports.
Performance creative localization
The variant-generation angle on the same API — spin one clip into localized cuts at test speed.
E-commerce video localization
One master product clip, a localized captioned cut per market — at catalog scale.
TikTok Shop video localization
The 9:16, safe-zone-aware localization workflow for cross-border TikTok Shop.
Ship captions in your next sprint
Backend-only API, webhook-native, with base usage pricing from $0.10/min. Mark it up, bundle it, or pass it through, with no render workers to babysit.