Upload your video, let Whisper AI generate captions in under a minute, customise every detail, then download a clean MP4 with subtitles permanently burned in. No software install. No watermark on the free plan. 50+ languages supported.
By the numbers
Ready? Takes under 2 minutes
No account required to preview. Free-tier users download at 720p with no watermark. Upgrade for 1080p → 4K.
Open Free Subtitle Generator
Step-by-step
The entire workflow lives in your browser — upload, generate, edit, export. No account setup, no queue, no complicated timeline editors.
Drag-and-drop or select an MP4, MOV, AVI, MKV, or WMV file up to 500 MB. Your video is stored securely and processed on Scenith servers — nothing is shared with third parties.
Click "Generate Subtitles". OpenAI Whisper analyses the audio track, transcribes speech, timestamps every word, and groups captions into readable segments — typically done in under 60 seconds for a 1-minute clip.
Open any caption in the timeline editor to fix text, nudge start/end times, or swap styles. Change font family, colour, size, stroke, background box, letter-spacing, and exact X/Y position. Preview updates live in the video player.
Hit "Process Subtitles" and Scenith renders the final video with captions permanently burned in. Download your MP4 in the quality your plan supports — 720p on the free tier, up to 4K on Studio.
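For readers curious what "burning in" involves, one common approach outside the browser is ffmpeg's `subtitles` filter. The sketch below only builds the command line. It assumes an ffmpeg build with libass support, a hypothetical caption file named `captions.srt`, and simple file names that need no escaping inside a filter string. It illustrates the general technique and is not a description of Scenith's own rendering pipeline.

```python
def burn_in_command(video: str, subtitles: str, output: str, height: int = 720) -> list[str]:
    """Build an ffmpeg command that draws subtitles onto the frames, then scales.

    File names are hypothetical; paths with special characters would need
    extra escaping inside the filter string.
    """
    # The subtitles filter renders caption text into the picture itself;
    # scale=-2:<height> resizes while keeping the aspect ratio and an even width.
    video_filter = f"subtitles={subtitles},scale=-2:{height}"
    return [
        "ffmpeg",
        "-i", video,
        "-vf", video_filter,
        "-c:v", "libx264",  # H.264 video, as in the MP4 exports described above
        "-c:a", "aac",      # re-encode audio so it fits an MP4 container
        output,
    ]


command = burn_in_command("input.mp4", "captions.srt", "output_720p.mp4")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where ffmpeg is installed.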
The case for captions
Studies consistently show that over 85% of social media videos are watched without sound in public settings. Instagram, Facebook, and TikTok autoplay videos muted by default. If your video relies on spoken dialogue to carry meaning, a viewer scrolling in a café or on a commute will simply scroll past it.
Adding subtitles online for free is therefore not optional for modern content — it is the minimum viable accessibility layer that separates videos that convert from videos that get skipped.
Research insight
Facebook's internal data showed that videos with captions earned 12% more watch time on average than uncaptioned equivalents. The effect is even larger on short-form vertical video.
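466 million people worldwide live with disabling hearing loss, according to the WHO. In the United States, the Americans with Disabilities Act (ADA) is commonly interpreted as requiring captions for publicly available video content, and the Web Content Accessibility Guidelines (WCAG 2.1) require captions for prerecorded video under Success Criterion 1.2.2. That criterion is Level A, so it is also part of AA conformance.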
Educational institutions receiving federal funding are legally required to caption all instructional video under Section 508. Major streaming platforms, corporate intranets, and government websites face similar mandates globally. Tools that let you add subtitles to videos online for free remove the cost barrier from compliance.
Search engines cannot watch video — but they can read caption text. When you burn subtitles into a video and also embed a transcript on the page, you give crawlers rich semantic content. YouTube's algorithm ranks captioned videos higher in suggested results and the text is indexed by Google Video Search.
An analysis of 10,000 YouTube videos found those with accurate closed captions received 7.32% more views within 90 days of upload compared to uncaptioned equivalents — purely from algorithmic uplift, independent of engagement metrics.
English is the first language of roughly 400 million people, yet the internet has 5.4 billion users. Adding subtitles in the native language of your target audience — or even providing accurate English captions for ESL audiences — dramatically expands the addressable market for any piece of video content.
AI subtitle generators that support 50+ languages, like Scenith's Whisper-powered tool, let a solo creator add multilingual captions to their videos for free, something that previously required hiring professional translators and caption-formatting agencies.
Creator economy stat
Creators who add subtitles in at least one additional language report 23–45% more international engagement on average, based on data from 2 000+ YouTube channels in 2025.
Educational psychologists refer to dual-coding theory: when learners process information through both visual text and audio simultaneously, retention improves by up to 40% compared to audio alone. This is why every serious e-learning platform — Coursera, Udemy, LinkedIn Learning — mandates subtitles on all instructional video content.
For marketing content, subtitles reinforce key messages: brand names, product features, call-to-action phrases. A viewer who both hears and reads "Free 14-day trial" is measurably more likely to act on that prompt than a viewer who only hears it.
TikTok's algorithm explicitly rewards high completion rate. Captions increase completion rate for muted viewers, who would otherwise drop off as soon as they realise they cannot follow the content. LinkedIn reports that video posts with captions achieve 3× higher engagement than those without. Instagram Reels with on-screen text are boosted by the Explore algorithm in 2026 as the platform doubles down on accessibility signals.
What you get
Scenith is not a stripped-down free tool with a paywall behind every feature. The core subtitle workflow — upload, generate, edit, export — is entirely free up to 720p.
OpenAI's Whisper model is trained on 680 000 hours of multilingual audio. It handles accents, background noise, overlapping speech, and technical vocabulary better than any other open-weight speech recognition model available in 2026. You get a production-quality draft transcript in under a minute for most videos — not the rough approximation you get from browser-native speech APIs.
50+ Languages
Every styling change (colour, font, size, position) updates instantly in the embedded video player. No blind exports to check your work.
Real-Time
Choose from professionally designed subtitle templates: cinematic white-stroke, TikTok bold-pop, documentary yellow, minimal lowercase, and more.
Swap font family, adjust weight, letter-spacing, line height, and border stroke width. Apply changes to one caption or all at once.
Drag or type exact X/Y coordinates. Useful for vertical 9:16 videos where bottom placement must clear lower-third graphics or safe-zone text.
Accidentally deleted a caption segment? The undo stack brings it back — even after a backend save — without regenerating the whole transcript.
Add a semi-transparent background rectangle behind caption text for readability on busy footage. Control fill colour, opacity, padding, corner radius, and border independently — great for the social-card style popular on LinkedIn and Instagram in 2026.
Accessibility-first
Free tier exports at 720p without watermark. Creator Lite unlocks 1080p, Creator unlocks 2K, and Studio plan goes all the way to 4K, always burned into a clean H.264 MP4 compatible with every platform.
No Watermark on Free
Platform-specific guidance
Different platforms have different safe zones, aspect ratios, and viewer behaviours. Here is what to set when you add subtitles to video online for each major channel in 2026.
Use bottom-centre placement with 10% margin from edges. White text + black stroke reads on any thumbnail. Caption text helps YouTube's search index rank your video.
Vertical 9:16 only. Place captions in the middle 60% of the frame — avoid the top (where the username sits) and bottom (where the action buttons overlap).
Large, high-contrast text is rewarded by TikTok's completion-rate signal. Bold sans-serif at 130%+ scale performs best. Many creators add captions at the top 1/3 and bottom 1/3 simultaneously for dual-pane content.
Square (1:1) or landscape. Professional, minimal subtitle styles outperform flashy ones. LinkedIn's algorithm gives a 3× engagement boost to captioned native video vs uncaptioned, so add them every time.
Facebook autoplays muted in the News Feed. Captions are the only way viewers see your message before they choose to unmute. Keep captions large and avoid relying on audio for context.
Accuracy matters more than style here. Use simple sans-serif, 1.4 line-height, and slow reading pace (≤17 chars/sec). Add [sound effect] descriptors for accessibility compliance.
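The placement advice above (a 10% edge margin for bottom-centre captions, or the middle 60% of a vertical frame) can be turned into pixel values with a little arithmetic. This is a minimal sketch. It assumes a top-left coordinate origin with Y increasing downward, which may or may not match how a given editor interprets X/Y, and the function names are hypothetical.

```python
def bottom_centre(frame_width: int, frame_height: int,
                  margin_fraction: float = 0.10) -> tuple[int, int]:
    """Return an (x, y) anchor: horizontally centred, one margin above the bottom edge."""
    x = frame_width // 2
    y = frame_height - round(frame_height * margin_fraction)
    return x, y


def centred_band(frame_height: int, band_fraction: float = 0.60) -> tuple[int, int]:
    """Return (top, bottom) pixel rows of a vertically centred band of the frame."""
    margin = round(frame_height * (1 - band_fraction) / 2)
    return margin, frame_height - margin


# Landscape 1920x1080 with a 10% bottom margin
print(bottom_centre(1920, 1080))   # (960, 972)

# Vertical 1080x1920: keep captions inside the middle 60% of the frame
print(centred_band(1920))          # (384, 1536)
```

For a 1080×1920 vertical video, this keeps captions between rows 384 and 1536, clear of the top and bottom interface areas described above.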
Who this is for
From solo creators to enterprise teams, anyone who publishes video content benefits from adding subtitles — here is how different professionals use the tool.
Growing a channel means making content accessible to international audiences and viewers who watch silently. AI subtitle generation lets solo creators caption every video in minutes — not hours — maintaining the upload frequency the algorithm rewards.
Festival submissions, streaming pitches, and distributor screeners increasingly require accessible captions. Generating a subtitle track from AI and fine-tuning timing and style takes a fraction of the time of manual captioning services — at zero cost on the free tier.
Product demo videos, explainer ads, and brand content all need captions for LinkedIn, YouTube, and OTT distribution. Subtitle generation removes the dependency on agency timelines — caption a video in-house in under 5 minutes.
Section 508, ADA, and institutional accessibility policies mandate captions for instructional video. AI generation achieves the accuracy most educators need for draft review, with manual editing to correct technical terms specific to the discipline.
Nonprofits communicate globally with limited budgets. Being able to add subtitles to videos online for free — across 50+ languages with no watermark — is a genuine force-multiplier for organisations whose audiences span multiple continents and languages.
Long-form interview clips repurposed for social media perform significantly better with captions. AI transcription of a 5-minute clip is accurate enough for a clean social media cut with only minor editing, enabling a production workflow with no assistant required.
Global reach
Whisper AI automatically detects the spoken language in your video. You do not need to select a language before generating. All 50+ languages are included in the free tier — no upsell, no language pack to buy.
How language detection works
Whisper analyses the first 30 seconds of audio to identify the predominant language. If your video switches languages mid-way (code-switching), the model follows the dominant language for the full transcript. For mixed-language content, we recommend segmenting clips before uploading.
Honest comparison
Many tools claim to be free — then lock editing, export, or quality behind a paywall, or slap a watermark on every video. Here is an honest comparison.
| Feature | Scenith (Free) | Kapwing | VEED.IO | Rev.com | YouTube Auto-CC |
|---|---|---|---|---|---|
| Cost to export | ✓ Free at 720p | Watermark on free | Watermark on free | $1.25/min | Free |
| Watermark-free export | ✓ Yes (720p free) | ✗ Paid only | ✗ Paid only | ✓ Yes | ✓ (within YT only) |
| Custom font & colour | ✓ Full control | ⚠ Limited free | ⚠ Limited free | ✗ No | ✗ No |
| Burned-in MP4 export | ✓ Yes | ✓ Yes (paid) | ✓ Yes (paid) | ✓ Yes | ✗ No (captions only) |
| AI accuracy | ✓ Whisper 95–98% | ✓ Whisper | ✓ Proprietary | ✓ Human 99% | ⚠ 85–90% |
| 50+ languages | ✓ All free | ⚠ Limited | ⚠ Limited | ⚠ Extra cost | ✓ Many |
| Manual timing edit | ✓ Yes | ✓ Yes | ✓ Yes | ✓ Yes | ✗ No |
| Works without login | ⚠ Login to process | ⚠ Login required | ⚠ Login required | ✗ Account | ✓ YT account |
| Max free resolution | 720p | 720p (watermarked) | 720p (watermarked) | Any | N/A (caption file) |
The watermark tax
Most "free" video tools export with a watermark unless you pay. Scenith exports 720p videos watermark-free on the free tier — because we believe professional-quality output should not require a subscription for creators just starting out.
Expert guide
The wrong font choice can make accurately transcribed subtitles illegible. The rules professionals have refined over decades of broadcast captioning still hold in 2026:
WCAG 2.1 Success Criterion 1.4.3 requires a contrast ratio of at least 4.5:1 between text and background for normal text. For large text (at least 18pt, or 14pt bold, which is roughly 24px and 18.7px), the minimum drops to 3:1. In practice, the simplest solution that always works is white text with a 2–3px black stroke: the stroke acts as a black background wherever the text sits, which keeps contrast high against any scene.
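The contrast ratio itself is straightforward to compute from two colours. The sketch below follows the relative-luminance and contrast-ratio definitions published in WCAG 2.1. The function names are illustrative, and colours are assumed to be 8-bit sRGB triples.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2.1 definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Weighted sum of the linearized red, green, and blue channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """(lighter + 0.05) / (darker + 0.05); ranges from 1:1 up to 21:1."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text: bool = False) -> bool:
    """Check against the 4.5:1 (normal) or 3:1 (large text) thresholds."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


WHITE, BLACK, YELLOW = (255, 255, 255), (0, 0, 0), (255, 255, 0)
print(round(contrast_ratio(WHITE, BLACK), 2))  # 21.0
print(passes_aa(WHITE, YELLOW))                # False
```

White on black reaches the maximum ratio of 21:1, while white on a bright yellow area falls far short of 4.5:1, which is why a dark stroke or background box matters on bright footage.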
The BBC's subtitle specification, one of the most studied in the world, sets a maximum reading speed of roughly 180 words per minute for subtitle text, which equates to about 17 characters per second. The minimum display duration for any caption segment should be 1.5 seconds, even for very short phrases, to allow viewers time to register the text.
Quick timing rules
Min duration: 1.5 sec · Max duration: 7 sec · Max chars per line: 42 · Max lines: 2 · Reading speed: ≤17 chars/sec
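These limits are easy to check automatically before export. The sketch below is one possible validator. The `Caption` type and the problem labels are hypothetical, and reading speed is measured in characters per second, matching the ≤17 chars/sec pace recommended earlier for accessibility-focused content.

```python
from dataclasses import dataclass

MIN_DURATION = 1.5         # seconds
MAX_DURATION = 7.0         # seconds
MAX_CHARS_PER_LINE = 42
MAX_LINES = 2
MAX_CHARS_PER_SEC = 17.0


@dataclass
class Caption:
    start: float  # seconds
    end: float    # seconds
    text: str     # lines separated by "\n"


def check_caption(cap: Caption) -> list[str]:
    """Return a list of rule violations; an empty list means the caption passes."""
    problems = []
    duration = cap.end - cap.start
    lines = cap.text.split("\n")
    if duration < MIN_DURATION:
        problems.append("too short")
    if duration > MAX_DURATION:
        problems.append("too long")
    if len(lines) > MAX_LINES:
        problems.append("too many lines")
    if any(len(line) > MAX_CHARS_PER_LINE for line in lines):
        problems.append("line too long")
    chars = sum(len(line) for line in lines)
    if duration > 0 and chars / duration > MAX_CHARS_PER_SEC:
        problems.append("reads too fast")
    return problems


print(check_caption(Caption(0.0, 2.0, "Hello there")))  # []
print(check_caption(Caption(0.0, 1.0, "Hi")))           # ['too short']
```

Running a check like this over every segment gives a quick list of captions worth fixing by hand in the editor.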
The bottom-centre position is standard because it sits below the action area in most shot compositions, leaving actors' faces and on-screen graphics unobstructed. However, it is not always optimal:
Poor line breaks are the most common complaint about AI-generated subtitles — not the accuracy, but the way the text wraps. The rules:
For truly accessible captions, transcribe important non-speech audio in brackets. WCAG Success Criterion 1.2.2 (Level A) treats captions as covering both speech and the non-speech audio needed to understand the content, and these descriptors are a practical aid for deaf viewers who would otherwise miss essential context:
What creators say
"I was spending $80/month on a captioning service. Scenith does the same job in literally minutes, for free. The Whisper accuracy on my interview podcast is genuinely impressive."
"I add subtitles to every Reel now — engagement on my cooking videos doubled once the silent-scroll audience could follow along. The style presets saved me so much time."
"We produce compliance training videos at work. Scenith gave us WCAG-compliant captions in under 5 minutes per video — what used to take a contractor a full day."
"No watermark on the free plan was the dealbreaker for me. I showed the output to a client and they asked which agency we used. That says everything."
"I film in Hindi and add English subtitles for international reach. Scenith nailed the Hindi transcription and the editor made adding English translations straightforward."
"The positioning control is what sets this apart. I can subtitle vertical Reels without subtitles disappearing under the Instagram UI — small thing, huge difference."
No account required to preview
Upload a video, generate captions, edit styles, download MP4. Free tier — 720p, no watermark.
Add Subtitles Free: Open Tool
Common questions
Yes — Scenith runs entirely in your browser. Upload your video, generate captions with AI, customise styling, and download the finished MP4 with no desktop application required.
The core workflow — upload, AI generation, editing, 720p export — is genuinely free with no watermark. The "catch" is that higher export resolutions (1080p, 2K, 4K) require a paid plan. You pay to go premium; the free tier is complete.
All major formats including MP4, MOV, AVI, MKV, WMV, and WebM. Maximum file size is 500 MB per upload. For longer or larger files, consider compressing to H.264 MP4 before uploading.
Whisper AI achieves 95–98% word accuracy on clear audio recorded in standard conditions. Accuracy drops for heavy background noise, overlapping speakers, thick regional accents, or highly technical vocabulary. Manual editing is quick for any corrections needed.
Yes. Click any caption segment in the timeline list to open the editor. You can correct text, adjust start/end timestamps, change styling, or delete the segment entirely. All changes auto-save before you export.
Yes — the "Process Subtitles" step renders captions permanently into the MP4 (also called "burning in" or "hard-coding"). This ensures subtitles display on every platform, even those without separate caption track support.
Yes. Scenith's position controls let you set exact X/Y coordinates, so you can place subtitles in the centre of the frame rather than the bottom, clearing TikTok's and Instagram's lower UI chrome.
Whisper AI supports 50+ languages including English, Hindi, Spanish, French, German, Mandarin, Arabic, Japanese, Portuguese, Korean, Italian, Russian, Turkish, and many more. The language is auto-detected from the audio — no manual selection needed.
The current free tier supports videos up to 1 minute. Paid plans support longer content. For full feature details and limits, check the pricing page.
Videos are stored securely on Scenith servers for processing and are automatically deleted after 24–48 hours. Your content is never shared with third parties or used for training data.
Yes. If you make style edits without selecting a specific segment, changes apply globally to all captions. You can also click "Apply to All" for position changes made in the editor panel.
Burned-in (open) subtitles are permanently part of the video image — they always show and cannot be turned off. Closed captions (SRT/VTT files) are separate text tracks that viewers can toggle on or off. Scenith currently produces burned-in subtitles; SRT export is on the roadmap.
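For comparison, a closed-caption file in SRT format is just numbered text blocks with start and end timestamps. The sketch below shows how caption segments could be written out as SRT. It is an illustration of the file format only, with hypothetical function names, and says nothing about how a future Scenith export might work.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before the milliseconds)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Turn (start, end, text) tuples into SRT text, one numbered block per segment."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive blocks
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello"), (1.5, 4.0, "[door slams]")]))
```

Because the text lives in a separate file, a player can show or hide it, which is exactly what burned-in subtitles cannot do.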
More from Scenith
Subtitles are one piece of a complete video production workflow. Scenith's other tools slot in around them.
Generate a professional voiceover for your video before adding subtitles — or create silent captions-only content with AI narration.
Open tool →
Design thumbnail images that match your subtitle aesthetic: same brand colours, same font family, consistent look across the content.
Open tool →
Explore Scenith's full suite of browser-based video tools (trim, filter, resize, compress) before or after adding your captions.
Open tool →
Upload your first video and get AI-generated subtitles in under 2 minutes: free, no software download, no watermark on 720p.