Handytool
Video guide · 5 min read · Updated 11 Feb 2026

AI transcription, browser-only

Turn any video into text and subtitles without uploading it anywhere.

Handytool runs OpenAI's Whisper model directly in your browser to transcribe MP4, MOV, WebM, MKV, M4V, and AVI files into plain text, SRT, and VTT subtitle files. Your video never leaves your device.

Key takeaways

  • Whisper AI runs inside your browser; your video is never uploaded to any server.
  • Supports 99 languages with automatic language detection.
  • Outputs plain text, SRT subtitle files, and WebVTT for direct use in editors and YouTube.
  • After the first run, the model is cached and transcription works offline.

Why Transcribe Video in Your Browser?

Transcribing an interview, lecture, Zoom recording, or YouTube video used to mean either typing it out manually or sending the file to a cloud service. Cloud services are fast but come with a real privacy trade-off — you are uploading potentially sensitive audio to a third-party server. Handytool takes a different approach: it downloads OpenAI's open-source Whisper speech model once and then runs it entirely on your device using WebGPU or WebAssembly.

The result is Whisper-based transcription comparable to what a cloud service running the same model would produce, but your video file and audio stay on your device. The first run downloads the model (around 150 MB); after that, the tool works entirely offline.

How to Transcribe a Video

Drop a video file and get a transcript in minutes.

  1. Open the transcription tool

    Go to the Transcribe Video tool on Handytool. No account is needed.

  2. Add your video file

    Drop in an MP4, MOV, WebM, MKV, M4V, or AVI file up to 500 MB. The audio is extracted locally with FFmpeg.wasm; nothing is uploaded.

  3. Select a language (optional)

    Whisper detects the spoken language automatically for most recordings. If the speakers have a heavy accent or the video is in a less common language, choosing the language manually can improve accuracy.

  4. Click Transcribe

    On first use, the Whisper model (~150 MB) downloads to your browser cache. Subsequent runs use the cached model and work offline. Audio is transcribed in 30-second chunks with a 5-second overlap to keep context coherent across chunk boundaries.

  5. Download your output

    When the transcript appears, download it as plain text, an SRT subtitle file, or a WebVTT file. All three are generated from the same transcription run.
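
To illustrate how one transcription run can yield all three outputs, here is a minimal sketch that renders a list of timed segments as plain text, SRT, and WebVTT. The `Segment` shape and the function names are assumptions made for this example, not Handytool's actual internals. The two subtitle formats differ mainly in the header line, the cue numbering, and the millisecond separator.

```typescript
// A timed piece of transcript. This shape is an assumption for illustration.
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Left-pad a number with zeros to a fixed width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses a comma, WebVTT a period.
function formatTimestamp(seconds: number, sep: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + sep + pad(ms, 3);
}

// Plain text: segment texts joined with single spaces.
function toPlainText(segments: Segment[]): string {
  return segments.map((seg) => seg.text.trim()).join(" ");
}

// SRT: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      String(i + 1) + "\n" +
      formatTimestamp(seg.start, ",") + " --> " + formatTimestamp(seg.end, ",") + "\n" +
      seg.text.trim() + "\n")
    .join("\n");
}

// WebVTT: a "WEBVTT" header line, then unnumbered cues separated by blank lines.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    formatTimestamp(seg.start, ".") + " --> " + formatTimestamp(seg.end, ".") + "\n" +
    seg.text.trim() + "\n");
  return ["WEBVTT\n"].concat(cues).join("\n");
}
```

Because all three functions read the same segment list, the transcript only needs to be computed once; each download is a different rendering of it.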

What You Can Do With a Video Transcript

Transcripts and subtitles unlock many downstream workflows.

  • Add closed captions to YouTube or Vimeo videos to improve accessibility.
  • Create searchable notes from lectures, webinars, or training recordings.
  • Repurpose interview footage into a blog post or article.
  • Add burned-in subtitles in a video editor using the SRT file.
  • Generate a summary or action items from a meeting recording.
  • Translate the transcript into another language after downloading the text.

Whisper Runs on Your Device — Nothing Is Transmitted

Handytool uses FFmpeg.wasm to extract the audio track locally, then passes it to Whisper running in your browser via WebGPU (where available) or pure WebAssembly. At no point is any audio or video data sent over the network.
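
The "WebGPU where available, otherwise WebAssembly" choice can be sketched as a small feature check. This is an illustrative assumption about how such a fallback might look, not Handytool's code; `pickBackend` and `Backend` are hypothetical names. The function takes a navigator-like object so it can also be exercised outside a browser.

```typescript
type Backend = "webgpu" | "wasm";

// Choose WebGPU when the browser exposes navigator.gpu, else fall back to WebAssembly.
// Note: a real page would likely also await navigator.gpu.requestAdapter(), which can
// resolve to null on unsupported hardware; that asynchronous step is left out here.
function pickBackend(nav: { gpu?: unknown } | undefined): Backend {
  const hasGpu = nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
  return hasGpu ? "webgpu" : "wasm";
}
```

In a page this might be called as `pickBackend(navigator as { gpu?: unknown })`. Either way, the model runs locally; only the execution path differs.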

This makes the tool suitable for confidential recordings — medical interviews, legal depositions, internal business meetings, therapy sessions — where uploading to a cloud transcription service is not acceptable.

Video Transcription FAQ

Which video formats are supported?

MP4, MOV, WebM, MKV, M4V, and AVI containers up to 500 MB. Common audio codecs inside those containers (AAC, MP3, Opus, Vorbis) all work.
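
As a rough sketch of the limits above, a page could check a dropped file before any decoding starts. The names below are hypothetical, the check looks only at the file extension rather than the real container, and it assumes "500 MB" means 500 × 1024 × 1024 bytes, which the guide does not specify.

```typescript
// Extensions taken from the supported-format list above.
const SUPPORTED_EXTENSIONS = ["mp4", "mov", "webm", "mkv", "m4v", "avi"];

// Assumption: 500 MB is interpreted as 500 MiB.
const MAX_BYTES = 500 * 1024 * 1024;

// Minimal subset of the browser File interface that this check needs.
interface FileInfo {
  name: string;
  size: number; // bytes
}

function checkVideoFile(file: FileInfo): { ok: boolean; reason?: string } {
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot + 1).toLowerCase() : "";
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: "Unsupported container: ." + ext };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, reason: "File is larger than 500 MB" };
  }
  return { ok: true };
}
```

A browser `File` object has `name` and `size` properties, so it can be passed to `checkVideoFile` directly.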

Which languages can it transcribe?

All 99 languages Whisper supports, including English, Spanish, Mandarin, French, Arabic, Hindi, German, Russian, Portuguese, and Japanese. The transcript stays in the spoken language.

Can I generate subtitles for YouTube?

Yes. After transcribing, download the SRT or VTT file and upload it directly in YouTube Studio's caption editor.

How long can the video be?

The stated limit is on file size: files up to 500 MB are accepted. Long recordings are split into 30-second chunks with a 5-second overlap, so the transcript stays coherent across the whole video.
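
One way to read "30-second chunks with 5-second overlap" is as windows that advance by 25 seconds each. The sketch below computes such window boundaries for a given duration. It is an assumption-based illustration with hypothetical names (`planChunks`, `Chunk`), and it does not cover how overlapping text would be merged afterwards.

```typescript
interface Chunk {
  start: number; // seconds
  end: number;   // seconds
}

// Plan overlapping windows: each chunk is chunkSec long and starts
// (chunkSec - overlapSec) after the previous one. The last chunk is clipped
// to the end of the recording.
function planChunks(durationSec: number, chunkSec: number = 30, overlapSec: number = 5): Chunk[] {
  if (!(overlapSec >= 0 && chunkSec > overlapSec)) {
    throw new Error("chunkSec must be greater than overlapSec, and overlapSec must be >= 0");
  }
  const stride = chunkSec - overlapSec;
  const chunks: Chunk[] = [];
  for (let start = 0; start < durationSec; start += stride) {
    const end = Math.min(start + chunkSec, durationSec);
    chunks.push({ start: start, end: end });
    if (end >= durationSec) break; // the recording is fully covered
  }
  return chunks;
}
```

For a 70-second recording this yields three windows: 0 to 30, 25 to 55, and 50 to 70 seconds, so every chunk boundary is heard twice.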

Is the video uploaded to a server?

No. Both FFmpeg.wasm and the Whisper model run locally in your browser. Nothing is uploaded at any stage.

Does it work offline?

After the first run, the Whisper model is cached in your browser. Subsequent transcriptions work fully offline — you only need an internet connection the first time.
