Handytool
Audio guide · 5 min read · Updated 30 Mar 2026

AI voice separation

Pull clean voice out of noisy recordings — privately, in your browser.

Handytool's voice isolator stacks multi-pass RNNoise denoising with a voice-activity gate to strip music, crowd noise, and room sound from any MP3, WAV, or M4A file.

Key takeaways

  • Two-stage pipeline: multi-pass neural denoising plus a voice-activity gate that silences non-speech frames.
  • Controls for isolation strength and the number of passes let you tune between natural-sounding and hard isolation.
  • Works best when the voice is louder than the background music or crowd noise.
  • Output is a 48 kHz mono WAV; nothing is uploaded to any server.

When You Need More Than Noise Reduction

Standard noise reduction handles steady background hiss and hum. But what about a podcast guest recorded in a busy cafe, an interview done over a music bed, or a speech filmed at a crowded event? When the background is loud, varied, or musical, a single denoise pass isn't enough — you need a system that can also identify which parts of the audio are speech and silence everything else.

Handytool's voice isolator runs a two-stage pipeline: multiple passes of RNNoise neural denoising to tighten the noise floor, followed by a voice-activity-driven gate that suppresses frames the model identifies as non-speech. The result is a track where silence replaces the background between phrases, rather than a quieter version of the original noise. The whole process runs locally in your browser — no upload, no account needed.
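
As a rough illustration of the two stages described above, here is a minimal Python sketch. It is not Handytool's code. `toy_denoise` is a hypothetical stand-in for a real RNNoise frame call (assumed here to return a cleaned frame plus a voice probability), and the 0.5 gate threshold is an arbitrary assumption. The 480-sample frame size matches RNNoise's 10 ms frames at 48 kHz.

```python
from typing import Callable, List, Sequence, Tuple

FRAME_SIZE = 480  # 10 ms at 48 kHz, the frame length RNNoise works on

# Assumed denoiser shape: one frame in, (cleaned frame, voice probability) out.
Denoiser = Callable[[Sequence[float]], Tuple[List[float], float]]


def split_frames(samples: Sequence[float], size: int = FRAME_SIZE) -> List[List[float]]:
    """Split samples into fixed-size frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), size):
        frame = list(samples[start:start + size])
        frame.extend([0.0] * (size - len(frame)))
        frames.append(frame)
    return frames


def isolate(samples: Sequence[float], denoise: Denoiser,
            passes: int = 2, vad_threshold: float = 0.5) -> List[float]:
    """Stage 1: denoise the whole signal `passes` times.
    Stage 2: mute every frame whose voice probability is below the threshold."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    frames = split_frames(samples)
    voice_prob = [0.0] * len(frames)
    for _ in range(passes):
        results = [denoise(frame) for frame in frames]
        frames = [cleaned for cleaned, _ in results]
        voice_prob = [prob for _, prob in results]  # keep the last pass's estimate
    gated: List[float] = []
    for frame, prob in zip(frames, voice_prob):
        gated.extend(frame if prob >= vad_threshold else [0.0] * len(frame))
    return gated[:len(samples)]  # drop the zero padding


def toy_denoise(frame: Sequence[float]) -> Tuple[List[float], float]:
    """Hypothetical stand-in: halves the amplitude and calls loud frames 'voice'."""
    energy = sum(x * x for x in frame) / len(frame)
    return [0.5 * x for x in frame], (1.0 if energy > 0.01 else 0.0)


# One loud frame followed by one quiet frame.
demo = isolate([0.8] * FRAME_SIZE + [0.05] * FRAME_SIZE, toy_denoise, passes=2)
```

In this sketch the loud frame survives (attenuated by the toy denoiser) while the quiet frame is replaced with silence, which is the behaviour the gate stage is meant to show.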

How to Isolate a Voice From Background Noise

  1. Drop your audio file

     Drag an MP3, WAV, M4A, OGG, or FLAC file into the tool. Up to 200 MB is accepted.

  2. Set isolation strength

     Strength controls how aggressively non-voice frames are gated. Start at 70–80 for podcasts or interviews; push to 90–100 to strip a music bed or crowd noise.

  3. Choose the number of passes

     Each additional pass of neural denoising tightens the noise floor. One pass works for lightly noisy recordings; two or three passes improve results when background noise is loud or mixed.

  4. Click Isolate and download

     The pipeline runs locally in your browser. When it finishes, download the isolated voice as a 48 kHz mono WAV.

Recordings That Benefit Most From Voice Isolation

  • Podcast guests recorded in cafes or restaurants
  • Interviews filmed at conferences or events with crowd noise
  • Speeches or presentations with a music bed underneath
  • Field recordings from outdoors with wind and traffic
  • Phone or video call recordings with noisy environments on one end

Your Audio Is Processed Locally, Not on a Server

The isolation pipeline is a 125 KB WebAssembly module loaded once in your browser. When you drop a file in, it is decoded and processed entirely on your own machine. No audio is streamed to a server, no account is created, and nothing is retained after you close the tab.

Processing time depends on the number of passes and file length. Two passes on a 10-minute file take roughly two to three minutes on a modern laptop. Files up to 200 MB are accepted.

Voice Isolator FAQ

How do I remove background music from a voice recording?

Drop your file into the Voice Isolator, set strength to 90–100, choose two or three passes, and click Isolate. The gate silences non-speech frames; the denoiser pulls down music bleeding through during words.

How is this different from the Voice Enhancer?

Voice Enhancer does a single denoise pass for a natural-feeling cleanup of steady noise. Voice Isolator stacks multiple passes and adds a voice-activity gate that silences anything outside speech — better for music, crowds, and varied noise.

What does the isolation strength slider do?

It sets how aggressively non-voice frames are attenuated. At 0 the gate is loose; at 100, any frame the model is not confident contains voice is silenced. 70–80 is a good starting point for podcasts, 90–100 for music or crowd removal.
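
One way to picture the slider is as a mapping from strength to two gate parameters. The sketch below is an assumption for illustration only: the tool's actual curve is not documented here, and the linear formulas, the 0.1 to 0.9 threshold range, and the names `gate_settings` and `gate_frame` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def gate_settings(strength: float) -> Tuple[float, float]:
    """Map a 0-100 slider value to (voice-probability threshold, residual gain).

    Assumed linear mapping: a higher strength demands more confidence that a
    frame is voice and leaves less of the non-voice frames audible.
    """
    s = min(max(strength, 0.0), 100.0) / 100.0
    vad_threshold = 0.1 + 0.8 * s   # 0.1 at strength 0, 0.9 at strength 100
    residual_gain = 1.0 - s         # 1.0 = untouched, 0.0 = full silence
    return vad_threshold, residual_gain


def gate_frame(frame: Sequence[float], voice_prob: float, strength: float) -> List[float]:
    """Pass voice frames through; scale non-voice frames by the residual gain."""
    threshold, residual = gate_settings(strength)
    if voice_prob >= threshold:
        return list(frame)
    return [x * residual for x in frame]


uncertain = [0.5, -0.5]                             # a frame with voice probability 0.6
loose = gate_frame(uncertain, 0.6, strength=0)      # kept as-is
hard = gate_frame(uncertain, 0.6, strength=100)     # fully silenced
```

Under these assumptions, the same uncertain frame passes untouched at strength 0 and is muted at strength 100, which mirrors the loose-to-hard range described above.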

Is my audio uploaded to a server?

No. The pipeline is a WebAssembly module that runs locally on your CPU. Nothing leaves your computer.

What output format do I get?

A mono 48 kHz WAV in 16-bit PCM. Use the Convert audio tool to export as MP3 if you need a smaller file.
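
For a sense of what that format means in practice, the following sketch writes a mono, 48 kHz, 16-bit PCM WAV with Python's standard `wave` module and estimates the file size. It is an illustration of the format, not the tool's own export code; `write_wav` and `wav_size_bytes` are hypothetical helper names, and the 44-byte header is the usual size for a plain PCM WAV.

```python
import io
import struct
import wave
from typing import BinaryIO, Sequence

SAMPLE_RATE = 48_000   # Hz
SAMPLE_WIDTH = 2       # bytes per sample (16-bit PCM)
CHANNELS = 1           # mono
HEADER_BYTES = 44      # standard header size for a plain PCM WAV


def write_wav(samples: Sequence[float], target: BinaryIO) -> None:
    """Write float samples in [-1.0, 1.0] as 16-bit little-endian PCM."""
    ints = [int(max(-1.0, min(1.0, x)) * 32767) for x in samples]
    pcm = struct.pack("<%dh" % len(ints), *ints)
    with wave.open(target, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)


def wav_size_bytes(seconds: float) -> int:
    """Approximate size of the output WAV for a given duration."""
    return HEADER_BYTES + int(seconds * SAMPLE_RATE) * SAMPLE_WIDTH * CHANNELS


buffer = io.BytesIO()
write_wav([0.0] * SAMPLE_RATE, buffer)    # one second of silence
ten_minutes = wav_size_bytes(10 * 60)     # 57,600,044 bytes, about 57.6 MB
```

At 96,000 bytes per second of audio, a 10-minute result comes to about 57.6 MB, which is why converting to MP3 can be worthwhile for sharing.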

How long can the recording be?

The limit is on file size rather than duration: files up to 200 MB are accepted. Two passes run roughly 3–5 times faster than real time on a modern laptop, so a 10-minute recording isolates in about two to three minutes.
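
The estimate above is simple division, sketched here for other recording lengths. The 3x and 5x speed factors are the two-pass figures quoted in this guide; how the factor changes with one or three passes is not stated, so the function takes it as an input rather than guessing. `isolation_time_minutes` is a hypothetical helper name.

```python
from typing import Tuple


def isolation_time_minutes(duration_minutes: float,
                           speed_low: float = 3.0,
                           speed_high: float = 5.0) -> Tuple[float, float]:
    """Return (fastest, slowest) processing time for a recording.

    speed_low and speed_high are multiples of real time; the defaults are the
    two-pass figures quoted for a modern laptop.
    """
    if speed_low <= 0 or speed_high <= 0:
        raise ValueError("speed factors must be positive")
    return duration_minutes / speed_high, duration_minutes / speed_low


fastest, slowest = isolation_time_minutes(10)   # 2.0 and about 3.3 minutes
```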
