Blind Dereverberation — WPE — User Guide

Blind dereverberation using WPE (Weighted Prediction Error). Removes room reverberation without knowing the room acoustics. Powered by nara_wpe (Python package).

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 2.0 (2025)
License: MIT License
Citation: Cohen, S. (2025). Praat AudioTools
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

Contents:

What this does
Quick start
WPE Theory
Preset Strategies
Parameters & Controls
Visualization & Analysis
Applications

What this does

This script implements blind dereverberation using the Weighted Prediction Error (WPE) algorithm. It removes room reverberation from an audio recording without requiring knowledge of the room acoustics or the original dry signal. The result is a "drier" version of the input, with shorter reverb tails and improved clarity.

🎚️ What is WPE Dereverberation?

WPE (Weighted Prediction Error) is a state-of-the-art blind dereverberation algorithm that:

Models late reverberation as a long-term linear prediction
Estimates prediction coefficients from the reverberant signal itself
Subtracts predicted reverberation to recover the dry signal
Works blindly — no knowledge of room impulse response needed

The algorithm operates in the STFT domain and iteratively refines the dereverberated signal.

Key Features:

4 Preset Strategies — Light Room to Maximum, plus Custom
WPE Algorithm — Powered by nara_wpe Python package
Adjustable Parameters — Iterations, delay, filter length
Multi-Channel Support — Processes stereo and multichannel audio
8 kHz Limit — Focuses on reverb-dominated frequencies (adjustable in script)
Peak Normalization — Preserves dynamics, only attenuates if clipping would occur
Comprehensive Visualization — 4-panel display with waveforms, spectrograms, intensity comparison

Technical Implementation:

(1) STFT: Convert audio to the frequency domain with a 512-point FFT and a 256-sample hop.
(2) Frequency Limiting: Process only bins up to 8000 Hz (reverb is concentrated in lower frequencies).
(3) WPE: Apply nara_wpe.wpe with the specified parameters.
(4) ISTFT: Reconstruct the time-domain signal.
(5) Normalization: Scale only if the peak exceeds 0.95.
(6) Visualization: 4-panel display.

Quick start

In Praat, select exactly one Sound object (any duration, mono or stereo).
Run script… → select Dereverberation.praat.
Choose Preset (2-5 for specific strategies, 1 for custom).
Adjust parameters if using custom mode (iterations, delay, filter length).
Enable Draw_visualization for analysis display.
Click OK — script exports audio, runs Python WPE, imports result.

Quick tip: Start with Medium Room preset on a recording with noticeable reverb (e.g., speech recorded in a room). Enable visualization — compare the original and dereverberated spectrograms: you should see reduced reverb tails, especially in higher frequencies. The output appears as "source_dry" in the Objects window.

Important:

PYTHON DEPENDENCIES: Requires Python with numpy, soundfile, and nara_wpe installed. Install with: pip install numpy soundfile nara_wpe
PROCESSING TIME: Scales with audio length and filter length; 30-second files may take 20-40 seconds.
8 kHz LIMIT: The script processes only frequencies up to 8000 Hz, where reverberation is most prominent. Higher frequencies are left unchanged.
NORMALIZATION: Applied only if the peak exceeds 0.95, so dynamics are preserved.

WPE Theory

The Reverberation Model

In the STFT domain, the reverberant signal Y(t,f) can be modeled as:

Y(t,f) = X(t,f) + Σ_{τ=D}^{D+L-1} c(τ,f) · Y(t-τ,f)

where:

X(t,f) = desired dry signal
Y(t-τ,f) = past reverberant frames
D = prediction delay (frames after the direct sound)
L = filter length (number of taps)
c(τ,f) = prediction coefficients per frequency bin

The late reverberation is predicted from past frames and subtracted.

WPE Algorithm

📈 Weighted Prediction Error

The WPE algorithm iteratively estimates the prediction coefficients and the dry signal:

For iteration i:

1. Estimate prediction coefficients ĉ(τ,f) from the current estimate of X(t,f)
2. Compute the predicted late reverberation: R(t,f) = Σ ĉ(τ,f) · Y(t-τ,f)
3. Update the dry signal estimate: X(t,f) = Y(t,f) - R(t,f)
4. Re-weight the error for the next iteration
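The iteration above can be sketched for a single channel and a single frequency bin as follows. This is a minimal illustration, not the nara_wpe implementation: the function name wpe_one_bin is hypothetical, the per-frame power of the current estimate is assumed as the weighting, and a tiny ridge term is added only to keep the linear solve stable.

```python
import numpy as np


def wpe_one_bin(Y, delay=3, taps=15, iterations=10, eps=1e-10):
    """Illustrative single-channel WPE for the frames Y(t, f) of one bin.

    Returns (X, c): the dry-signal estimate and the prediction coefficients,
    where c[k] multiplies the frame delayed by (delay + k).
    """
    Y = np.asarray(Y, dtype=complex)
    n_frames = Y.shape[0]

    # Column k holds Y delayed by (delay + k) frames, zero-padded at the start.
    Y_past = np.zeros((n_frames, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        if shift < n_frames:
            Y_past[shift:, k] = Y[:n_frames - shift]

    X = Y.copy()  # initial dry estimate: the reverberant signal itself
    c = np.zeros(taps, dtype=complex)
    for _ in range(iterations):
        # Step 4 (weighting): inverse power of the current dry estimate.
        power = np.maximum(np.abs(X) ** 2, eps)
        weighted = Y_past / power[:, None]
        # Step 1: weighted least-squares solve for the coefficients.
        R = Y_past.conj().T @ weighted
        p = weighted.conj().T @ Y
        R = R + 1e-12 * np.eye(taps)  # tiny ridge (assumption) for stability
        c = np.linalg.solve(R, p)
        # Steps 2 and 3: predict the late reverberation and subtract it.
        X = Y - Y_past @ c
    return X, c
```

In a full pipeline this would be applied to every STFT bin below the frequency limit. nara_wpe additionally handles multichannel input and its own variance estimation, so results will differ from this sketch.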

Key parameters:

Iterations: More iterations allow better convergence but risk over-subtraction. 5-20 typical.
Delay (D): Frames after direct sound before prediction starts. 2-5 frames typical. Larger = safer, smaller = more aggressive.
Filter length (L): Number of past frames used for prediction. 10-30 typical. Longer captures more reverb but increases computation.

Frequency Limiting

Reverberation is most audible and problematic below about 8 kHz. Higher frequencies are dominated by direct sound and early reflections, and WPE may introduce artifacts there. This script processes only bins up to 8000 Hz, leaving higher frequencies unchanged:

max_bin = min(n_bins, 8000 Hz × FFT_size / sample_rate)
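As a worked example of the bin limit, the sketch below evaluates the formula for a few sample rates. The function name max_processed_bin is hypothetical, and truncating to an integer is an assumption; the actual script may round differently.

```python
def max_processed_bin(sample_rate, fft_size=512, max_freq_hz=8000.0):
    """Highest STFT bin index that falls under the frequency limit."""
    n_bins = fft_size // 2 + 1  # one-sided spectrum of a real signal
    limit = int(max_freq_hz * fft_size / sample_rate)  # truncation assumed
    return min(n_bins, limit)


for sr in (16000, 44100, 48000):
    print(sr, max_processed_bin(sr))
```

At 16 kHz the limit coincides with the Nyquist frequency, so nearly every bin is processed; at 44.1 kHz and 48 kHz only roughly the lowest third of the bins is.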

Parameter Guidelines

Parameter	Low	Medium	High
Iterations	5 — subtle, safe	10 — balanced	20 — aggressive, risk of artifacts
Delay	2 — aggressive (may remove early reflections)	3 — balanced	4-5 — conservative, safer
Filter length	10 — short reverb	15-20 — medium rooms	25-30 — large halls, long reverb tails
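Because delay and filter length are counted in STFT frames, their duration in milliseconds depends on the hop size and the sample rate. The sketch below converts frames to time, assuming the fixed 256-sample hop; the helper name frames_to_ms is hypothetical.

```python
def frames_to_ms(frames, sample_rate, hop_size=256):
    """Duration covered by a number of STFT frames, in milliseconds."""
    return 1000.0 * frames * hop_size / sample_rate


sample_rate = 44100
for label, frames in (("delay", 3), ("filter length", 15)):
    print(f"{label}: {frames} frames = {frames_to_ms(frames, sample_rate):.1f} ms")
```

At 44.1 kHz, a delay of 3 frames is roughly 17 ms and a filter length of 15 frames roughly 87 ms, so the same preset spans a longer time window at lower sample rates.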

Preset Strategies

Preset 2: Light Room (small reverb)

🏠 Small Room

Iterations: 5 | Delay: 2 | Filter length: 10

Character: Subtle dereverberation for lightly reverberant spaces — safe, minimal artifacts

Use on: Speech in small rooms, close-miked recordings

Preset 3: Medium Room

🏢 Typical Room

Iterations: 10 | Delay: 3 | Filter length: 15

Character: Balanced dereverberation for typical recording environments

Use on: General purpose, speech, instrumental recordings

Preset 4: Large Hall (heavy reverb)

🏛️ Concert Hall

Iterations: 15 | Delay: 3 | Filter length: 20

Character: Strong dereverberation for large spaces with long reverb tails

Use on: Church recordings, concert halls, very reverberant rooms

Preset 5: Maximum (very wet signal)

💧 Extreme Dereverb

Iterations: 20 | Delay: 4 | Filter length: 30

Character: Maximum dereverberation — may introduce artifacts, use with caution

Use on: Extremely wet recordings, where clarity is paramount
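For reference, the four presets can be written as a small lookup table. This is only a sketch of how the preset numbers map to parameters; the names PRESETS and resolve_parameters are hypothetical and not taken from the script.

```python
# Preset number -> (name, WPE parameters), values as listed above.
PRESETS = {
    2: ("Light Room", {"iterations": 5, "delay": 2, "filter_length": 10}),
    3: ("Medium Room", {"iterations": 10, "delay": 3, "filter_length": 15}),
    4: ("Large Hall", {"iterations": 15, "delay": 3, "filter_length": 20}),
    5: ("Maximum", {"iterations": 20, "delay": 4, "filter_length": 30}),
}


def resolve_parameters(preset, iterations=10, delay=3, filter_length=15):
    """Return the WPE parameters for a preset; preset 1 uses the custom values."""
    if preset == 1:
        return {"iterations": iterations, "delay": delay,
                "filter_length": filter_length}
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return dict(PRESETS[preset][1])
```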

Parameters & Controls

WPE Parameters

Parameter	Default	Description
Iterations	10	Number of WPE iterations (5-20) — more = stronger dereverberation
Delay	3	Prediction delay in frames (2-5) — larger = more conservative
Filter_length	15	Prediction filter taps (10-30) — longer = more reverb removed

Output

Parameter	Default	Description
Draw_visualization	1	Generate 4-panel analysis display
Play_result	1	Audition after processing

Fixed Parameters (internal)

Parameter	Value	Description
FFT size	512	STFT window length
Hop size	256	STFT hop length
Max frequency	8000 Hz	Frequencies above this are left unchanged
Normalization	0.95	Only scale if peak exceeds this threshold
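The conditional normalization can be illustrated as below. It is a sketch under one assumption: when the peak exceeds the threshold, the signal is scaled so that the peak lands exactly on the threshold. The function name normalize_if_clipping is hypothetical.

```python
def normalize_if_clipping(samples, threshold=0.95):
    """Scale samples down only when the absolute peak exceeds the threshold."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak > threshold:
        gain = threshold / peak  # assumed target: peak equals the threshold
        return [s * gain for s in samples]
    return list(samples)  # quiet signals pass through untouched
```

Signals that stay below the threshold are returned unchanged, which is what keeps the original dynamics intact.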

Visualization & Analysis

4-Panel Display

Blind Dereverberation Visualization:

Panel 1: TITLE
• Script name, source name, preset, iterations, filter length

Panel 2: ORIGINAL WAVEFORM
• Gray waveform
• Title: "Original"
• Duration displayed

Panel 3: DEREVERBERATED WAVEFORM
• Teal waveform
• Title: "Dry"
• X-axis: Time (s)

Panel 4: ORIGINAL SPECTROGRAM
• 0-5000 Hz spectrogram of original
• Title: "Original Spectrogram"

Panel 5: DEREVERBERATED SPECTROGRAM
• 0-5000 Hz spectrogram of processed output
• Title: "Dereverberated Spectrogram"

Panel 6: INTENSITY COMPARISON
• X-axis: Time, Y-axis: dB
• Gray line = original intensity
• Teal line = dereverberated intensity
• Title: "Intensity: Grey = original | Green = dereverberated"

Panel 7: SUMMARY PANEL
• WPE parameters (iterations, delay, filter length)
• RMS original vs processed, ratio
• Duration, sample rate, channels, preset

Reading the Spectrograms

What to look for:

Reverb tails: In the original spectrogram, look for energy decaying after transient events (especially in higher frequencies). In the processed version, these tails should be reduced.
Transient preservation: The attack portions of sounds should remain sharp — if they become smeared, WPE parameters are too aggressive.
High frequencies: Above 8000 Hz, the spectrogram should be identical to original (since they're unchanged).
Low frequencies: Dereverberation is most effective in the 1-5 kHz range where reverb is most problematic.

Interpreting Intensity Comparison

What the lines show:

Gray (original): Full intensity contour including reverb energy
Teal (processed): Should have lower intensity in reverb tails, similar peaks during transients
Ratio: The RMS ratio (processed/original) indicates overall energy reduction — typically 0.8-0.95 for moderate dereverberation
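The RMS ratio is straightforward to reproduce outside the script. The sketch below computes it from two lists of samples and also expresses it in dB; the helper names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_ratio(original, processed):
    """Processed/original RMS ratio; values below 1 mean energy was removed."""
    reference = rms(original)
    if reference == 0.0:
        raise ValueError("original signal is silent")
    return rms(processed) / reference


def ratio_to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20.0 * math.log10(ratio)
```

On this scale, the typical range of 0.8-0.95 corresponds to an overall level drop of roughly 2 dB to 0.5 dB.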

Applications

Speech Enhancement

Use case: Improving intelligibility of speech recorded in reverberant rooms

Technique: Medium Room preset, adjust based on room size

Workflow:

Import speech recording with noticeable reverb
Run with Medium Room preset
Listen for clarity improvement — should sound "closer" and more direct
If artifacts appear, reduce iterations or increase delay

Music Production

Use case: Reducing unwanted room sound in live recordings

Technique: Light Room or Medium Room depending on reverb amount

Applications:

Live instruments: Clean up acoustic recordings made in untreated rooms
Vocal tracks: Reduce ambience for a more studio-like sound
Field recordings: Tame environmental reverb while preserving character

Forensic Audio

Use case: Enhancing intelligibility of recordings made in reverberant spaces

Technique: Maximum preset with caution

Considerations:

Aggressive dereverberation may introduce artifacts that could be mistaken for evidence tampering
Document all processing parameters for transparency
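One simple way to document the processing is to store the parameters alongside the output file. The sketch below builds such a record as JSON; the field names and the source name "interview_01" are hypothetical, while the fixed values are the internal parameters listed earlier in this guide.

```python
import json


def processing_record(source_name, preset, iterations, delay, filter_length):
    """Collect the settings of one dereverberation run in a plain dictionary."""
    return {
        "source": source_name,
        "preset": preset,
        "wpe": {
            "iterations": iterations,
            "delay": delay,
            "filter_length": filter_length,
        },
        "fixed": {
            "fft_size": 512,
            "hop_size": 256,
            "max_freq_hz": 8000,
            "normalization_threshold": 0.95,
        },
    }


record = processing_record("interview_01", "Maximum", 20, 4, 30)
print(json.dumps(record, indent=2))
```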

Research & Education

Use case: Demonstrating reverberation and dereverberation concepts

Technique: Enable visualization, compare presets on test signals

Learning outcomes:

See reverberation in spectrograms as decaying energy after transients
Observe how different WPE parameters affect dereverberation
Understand trade-off between reverb reduction and artifact introduction

Practical Workflow Examples

🎤 Podcast Recording Enhancement

Goal: Clean up podcast recorded in a living room with noticeable reverb

Settings:

Source: 30-minute podcast recording
Preset: Medium Room
Custom: iterations=8 (slightly gentler), filter_length=18

Result: Speech becomes more present and intelligible, room sound reduced

🎸 Live Guitar Recording

Goal: Reduce room ambience from acoustic guitar recorded in a church

Settings:

Source: 5-minute guitar performance
Preset: Large Hall
Custom: delay=4 (conservative), filter_length=25

Result: Long reverb tails reduced, guitar more focused while retaining some natural ambience

🔬 Research: Reverb Analysis

Goal: Study reverb characteristics of different rooms

Settings:

Record impulse responses or speech in various rooms
Run with different presets, compare spectrograms
Measure RMS reduction as proxy for reverb amount

Result: Quantitative comparison of room reverberation

Troubleshooting Common Issues

Problem: Python not found or missing nara_wpe
Cause: Python not installed, or nara_wpe not installed
Solution: Install Python and the required packages: pip install numpy soundfile nara_wpe
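To check which packages are missing, a short test like the one below can be run with the same Python interpreter that the script calls. It is a sketch using only the standard library; the function name missing_modules is hypothetical.

```python
import importlib.util


def missing_modules(names=("numpy", "soundfile", "nara_wpe")):
    """Return the module names that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

If the check passes in a terminal but the script still fails, Praat may be calling a different Python installation than the one the packages were installed into.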

Problem: Output has artifacts (metallic sounds, warbling)
Cause: WPE parameters too aggressive for the material
Solution: Reduce iterations, increase delay, reduce filter_length

Problem: Dereverberation too weak
Cause: Parameters too conservative for the amount of reverb
Solution: Increase iterations, decrease delay, increase filter_length

Problem: High frequencies sound different
Cause: 8 kHz limit — frequencies above are unchanged by design
Solution: Accept as intended, or modify max_freq_hz in the Python script

Problem: Processing very slow
Cause: Long files, high filter length
Solution: Reduce filter_length, or process shorter segments

Advanced Techniques

Adjust frequency limit:

In the Python script, modify max_freq_hz (currently 8000) to process higher frequencies. Be aware that WPE may introduce artifacts in high frequencies.

Multi-channel processing:

The script processes all channels independently. For stereo linked processing, modify to use the same prediction coefficients for both channels.

Batch processing:

Save the form settings and use a Praat script to call this script once per file or per parameter set.

Parallel processing:

For very long files, consider splitting into shorter segments, processing separately, and recombining.
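Segment boundaries for such a split can be computed as in the sketch below. The overlap is an assumption rather than something the script does: WPE predicts from past frames, so the first frames of each segment have less context, and an overlap of about (delay + filter_length) × hop samples may help when recombining. The function name segment_bounds is hypothetical.

```python
def segment_bounds(total_samples, segment_samples, overlap_samples=0):
    """Return (start, end) sample indices of consecutive, overlapping segments."""
    if segment_samples <= overlap_samples:
        raise ValueError("segment must be longer than the overlap")
    step = segment_samples - overlap_samples
    bounds = []
    start = 0
    while start < total_samples:
        end = min(start + segment_samples, total_samples)
        bounds.append((start, end))
        if end == total_samples:
            break  # last segment reached the end of the file
        start += step
    return bounds


# Example: 10 minutes at 44.1 kHz in 60-second segments,
# overlapping by (4 + 30) frames of 256 samples (Maximum preset).
segments = segment_bounds(10 * 60 * 44100, 60 * 44100, (4 + 30) * 256)
print(len(segments), segments[0], segments[-1])
```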