Blind Dereverberation - WPE - User Guide
Blind dereverberation using WPE (Weighted Prediction Error). Removes room reverberation without knowing the room acoustics. Powered by nara_wpe (Python package).
What this does
This script implements blind dereverberation using the Weighted Prediction Error (WPE) algorithm. It removes room reverberation from an audio recording without requiring knowledge of the room acoustics or the original dry signal. The result is a "dryer" version of the input, with reduced reverb tails and improved clarity.
What is WPE Dereverberation?
WPE (Weighted Prediction Error) is a state-of-the-art blind dereverberation algorithm that:
- Models late reverberation as a long-term linear prediction
- Estimates prediction coefficients from the reverberant signal itself
- Subtracts predicted reverberation to recover the dry signal
- Works blindly: no knowledge of the room impulse response is needed
The algorithm operates in the STFT domain and iteratively refines the dereverberated signal.
Key Features:
- 4 Preset Strategies: Light Room to Maximum, plus Custom
- WPE Algorithm: Powered by the nara_wpe Python package
- Adjustable Parameters: Iterations, delay, filter length
- Multi-Channel Support: Processes stereo and multichannel audio
- 8 kHz Limit: Focuses on reverb-dominated frequencies (adjustable in script)
- Peak Normalization: Preserves dynamics, only attenuates if clipping would occur
- Comprehensive Visualization: 4-panel display with waveforms, spectrograms, intensity comparison
Technical Implementation:
- STFT: Convert audio to the frequency domain with a 512-point FFT and a 256-sample hop.
- Frequency limiting: Process only bins up to 8000 Hz (reverb is concentrated in lower frequencies).
- WPE: Apply nara_wpe.wpe with the specified parameters.
- ISTFT: Reconstruct the time-domain signal.
- Normalization: Scale only if the peak exceeds 0.95.
- Visualization: Draw the 4-panel display.
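As a rough illustration, the processing steps above could look like the following in Python. This is a sketch under stated assumptions, not the script's actual code: the function names bins_up_to and dereverberate are hypothetical, the input is assumed to be a NumPy array shaped (channels, samples), channels are processed one at a time, and the normalization is assumed to scale the peak down to 0.95. The nara_wpe calls follow the usage shown in that package's own examples.

```python
def bins_up_to(max_freq_hz, sample_rate, fft_size=512):
    """Count the STFT bins (starting at 0 Hz) whose centre frequency is <= max_freq_hz."""
    bin_width = sample_rate / fft_size      # Hz between neighbouring bins
    n_bins = fft_size // 2 + 1              # size of the one-sided spectrum
    return min(int(max_freq_hz / bin_width) + 1, n_bins)


def dereverberate(y, sample_rate, taps=15, delay=3, iterations=10,
                  fft_size=512, hop=256, max_freq_hz=8000.0):
    """Hypothetical helper. y: float array shaped (channels, samples)."""
    # Imported lazily so the helper above works without these packages.
    import numpy as np
    from nara_wpe.wpe import wpe
    from nara_wpe.utils import stft, istft

    # STFT gives (channels, frames, bins); wpe expects (bins, channels, frames).
    Y = stft(y, size=fft_size, shift=hop).transpose(2, 0, 1)

    # Frequency limiting: only the lowest n bins are passed to WPE.
    n = bins_up_to(max_freq_hz, sample_rate, fft_size)
    Z = Y.copy()                            # bins above the limit stay untouched

    # WPE, one channel at a time (assumption: independent channels).
    for c in range(Y.shape[1]):
        Z[:n, c:c + 1] = wpe(Y[:n, c:c + 1], taps=taps, delay=delay,
                             iterations=iterations)

    # ISTFT back to (channels, samples).
    z = istft(Z.transpose(1, 2, 0), size=fft_size, shift=hop)

    # Normalization: attenuate only if the peak exceeds 0.95.
    peak = np.max(np.abs(z))
    if peak > 0.95:
        z = z * (0.95 / peak)
    return z
```

At 44.1 kHz the bins are about 86 Hz apart, so bins_up_to(8000, 44100) selects the lowest 93 of 257 bins; at 16 kHz the limit coincides with the Nyquist frequency and every bin is processed.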
Quick start
- In Praat, select exactly one Sound object (any duration, mono or stereo).
- Run script… and select Dereverberation.praat.
- Choose Preset (2-5 for specific strategies, 1 for custom).
- Adjust parameters if using custom mode (iterations, delay, filter length).
- Enable Draw_visualization for analysis display.
- Click OK. The script exports the audio, runs WPE in Python, and imports the result.
Requirements: Python with the nara_wpe package installed (pip install nara_wpe).
Processing time: Scales with audio length and filter length; 30-second files may take 20-40 seconds.
8 kHz limit: The script processes only frequencies up to 8000 Hz, where reverberation is most prominent. Higher frequencies are left unchanged.
Normalization: Applied only if the peak exceeds 0.95, so dynamics are preserved.
WPE Theory
The Reverberation Model
WPE Algorithm
Weighted Prediction Error
The WPE algorithm iteratively estimates the prediction coefficients and the dry signal.
Key parameters:
- Iterations: More iterations allow better convergence but risk over-subtraction. 5-20 typical.
- Delay (D): Frames after direct sound before prediction starts. 2-5 frames typical. Larger = safer, smaller = more aggressive.
- Filter length (L): Number of past frames used for prediction. 10-30 typical. Longer captures more reverb but increases computation.
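For reference, the usual single-channel formulation of WPE from the literature can be written as below. This is a sketch of the standard form, not a transcription of what the script computes internally; the symbols y (reverberant STFT coefficient), d-hat (dry estimate), g (prediction coefficients) and lambda (time-varying power of the dry signal) are notation introduced here, while D and L are the delay and filter length described above.

```latex
% Dry-signal estimate for frequency bin f and frame t:
\hat{d}(t,f) = y(t,f) - \sum_{\tau=D}^{D+L-1} g^{*}(\tau,f)\, y(t-\tau,f)

% Each iteration alternates between two updates.
% (a) Power estimate from the current dry-signal estimate:
\lambda(t,f) = \left| \hat{d}(t,f) \right|^{2}

% (b) Weighted least-squares update of the prediction coefficients:
g(\cdot,f) = \arg\min_{g} \sum_{t}
  \frac{\left| y(t,f) - \sum_{\tau=D}^{D+L-1} g^{*}(\tau,f)\, y(t-\tau,f) \right|^{2}}
       {\lambda(t,f)}
```

The delay D keeps the most recent frames out of the prediction, which is why a larger delay protects the direct sound and early reflections, and L sets how far back in time the predicted reverberation can reach.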
Frequency Limiting
Parameter Guidelines
| Parameter | Low | Medium | High |
|---|---|---|---|
| Iterations | 5: subtle, safe | 10: balanced | 20: aggressive, risk of artifacts |
| Delay | 2: aggressive (may remove early reflections) | 3: balanced | 4-5: conservative, safer |
| Filter length | 10: short reverb | 15-20: medium rooms | 25-30: large halls, long reverb tails |
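Because delay and filter length are counted in STFT frames, their duration in time depends on the sample rate. The small helper below (frames_to_ms is a hypothetical name) converts a frame count to milliseconds, assuming the fixed 256-sample hop. It is only an approximation, since each frame also spans a 512-sample window.

```python
def frames_to_ms(frames, sample_rate, hop=256):
    """Approximate duration in milliseconds covered by a number of STFT frames."""
    return 1000.0 * frames * hop / sample_rate


# Medium Room settings (delay 3, filter length 15) at two common sample rates.
for rate in (16000, 44100):
    print(rate, "Hz:",
          "delay ~ %.1f ms," % frames_to_ms(3, rate),
          "filter ~ %.1f ms" % frames_to_ms(15, rate))
```

The same frame counts cover a much shorter stretch of time at 44.1 kHz than at 16 kHz, so recordings at high sample rates may need a longer filter to reach the same reverb tail.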
Preset Strategies
Preset 2: Light Room (small reverb)
Small Room
Iterations: 5 | Delay: 2 | Filter length: 10
Character: Subtle dereverberation for lightly reverberant spaces; safe, minimal artifacts
Use on: Speech in small rooms, close-miked recordings
Preset 3: Medium Room
Typical Room
Iterations: 10 | Delay: 3 | Filter length: 15
Character: Balanced dereverberation for typical recording environments
Use on: General purpose, speech, instrumental recordings
Preset 4: Large Hall (heavy reverb)
Concert Hall
Iterations: 15 | Delay: 3 | Filter length: 20
Character: Strong dereverberation for large spaces with long reverb tails
Use on: Church recordings, concert halls, very reverberant rooms
Preset 5: Maximum (very wet signal)
Extreme Dereverb
Iterations: 20 | Delay: 4 | Filter length: 30
Character: Maximum dereverberation; may introduce artifacts, use with caution
Use on: Extremely wet recordings, where clarity is paramount
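The preset values listed above can be kept in a small lookup table when driving the processing from Python. The sketch below is illustrative only; WpeSettings, PRESETS and resolve_settings are hypothetical names, and only the numbers are taken from this guide.

```python
from typing import NamedTuple, Optional


class WpeSettings(NamedTuple):
    iterations: int
    delay: int          # in STFT frames
    filter_length: int  # in STFT frames


# Preset number -> (name, settings), using the values from this guide.
PRESETS = {
    2: ("Light Room", WpeSettings(5, 2, 10)),
    3: ("Medium Room", WpeSettings(10, 3, 15)),
    4: ("Large Hall", WpeSettings(15, 3, 20)),
    5: ("Maximum", WpeSettings(20, 4, 30)),
}


def resolve_settings(preset: int, custom: Optional[WpeSettings] = None) -> WpeSettings:
    """Return the settings for a preset; preset 1 means 'use the custom values'."""
    if preset == 1:
        if custom is None:
            raise ValueError("preset 1 (custom) needs explicit settings")
        return custom
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset}")
    return PRESETS[preset][1]
```

For example, resolve_settings(3) gives the Medium Room values, while resolve_settings(1, WpeSettings(8, 3, 18)) passes custom values through unchanged.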
Parameters & Controls
WPE Parameters
| Parameter | Default | Description |
|---|---|---|
| Iterations | 10 | Number of WPE iterations (5-20); more = stronger dereverberation |
| Delay | 3 | Prediction delay in frames (2-5); larger = more conservative |
| Filter_length | 15 | Prediction filter taps (10-30); longer = more reverb removed |
Output
| Parameter | Default | Description |
|---|---|---|
| Draw_visualization | 1 | Generate 4-panel analysis display |
| Play_result | 1 | Audition after processing |
Fixed Parameters (internal)
| Parameter | Value | Description |
|---|---|---|
| FFT size | 512 | STFT window length |
| Hop size | 256 | STFT hop length |
| Max frequency | 8000 Hz | Frequencies above this are left unchanged |
| Normalization | 0.95 | Only scale if peak exceeds this threshold |
Visualization & Analysis
4-Panel Display
Reading the Spectrograms
- Reverb tails: In the original spectrogram, look for energy decaying after transient events (especially in higher frequencies). In the processed version, these tails should be reduced.
- Transient preservation: The attack portions of sounds should remain sharp. If they become smeared, the WPE parameters are too aggressive.
- High frequencies: Above 8000 Hz, the spectrogram should be identical to the original, since those frequencies are left unchanged.
- Low frequencies: Dereverberation is most effective in the 1-5 kHz range where reverb is most problematic.
Interpreting Intensity Comparison
- Gray (original): Full intensity contour including reverb energy
- Teal (processed): Should have lower intensity in reverb tails, similar peaks during transients
- Ratio: The RMS ratio (processed/original) indicates overall energy reduction, typically 0.8-0.95 for moderate dereverberation
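The RMS ratio mentioned above is straightforward to reproduce outside Praat. The following sketch uses plain Python lists of samples; the function names rms, rms_ratio and ratio_to_db are hypothetical and not part of the script.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_ratio(processed, original):
    """RMS of the processed signal divided by RMS of the original."""
    reference = rms(original)
    if reference == 0.0:
        raise ValueError("original signal is silent")
    return rms(processed) / reference


def ratio_to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20.0 * math.log10(ratio)
```

On this scale, the typical range of 0.8-0.95 corresponds to an overall level drop of roughly 1.9 dB down to about 0.45 dB.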
Applications
Speech Enhancement
Use case: Improving intelligibility of speech recorded in reverberant rooms
Technique: Medium Room preset, adjust based on room size
Workflow:
- Import speech recording with noticeable reverb
- Run with Medium Room preset
- Listen for clarity improvement: the speech should sound "closer" and more direct
- If artifacts appear, reduce iterations or increase delay
Music Production
Use case: Reducing unwanted room sound in live recordings
Technique: Light Room or Medium Room depending on reverb amount
Applications:
- Live instruments: Clean up acoustic recordings made in untreated rooms
- Vocal tracks: Reduce ambience for a more studio-like sound
- Field recordings: Tame environmental reverb while preserving character
Forensic Audio
Use case: Enhancing intelligibility of recordings made in reverberant spaces
Technique: Maximum preset with caution
Considerations:
- Aggressive dereverberation may introduce artifacts that could be mistaken for evidence tampering
- Document all processing parameters for transparency
Research & Education
Use case: Demonstrating reverberation and dereverberation concepts
Technique: Enable visualization, compare presets on test signals
Learning outcomes:
- See reverberation in spectrograms as decaying energy after transients
- Observe how different WPE parameters affect dereverberation
- Understand trade-off between reverb reduction and artifact introduction
Practical Workflow Examples
Podcast Recording Enhancement
Goal: Clean up podcast recorded in a living room with noticeable reverb
Settings:
- Source: 30-minute podcast recording
- Preset: Medium Room
- Custom: iterations=8 (slightly gentler), filter_length=18
Result: Speech becomes more present and intelligible, room sound reduced
Live Guitar Recording
Goal: Reduce room ambience from acoustic guitar recorded in a church
Settings:
- Source: 5-minute guitar performance
- Preset: Large Hall
- Custom: delay=4 (conservative), filter_length=25
Result: Long reverb tails reduced, guitar more focused while retaining some natural ambience
Research: Reverb Analysis
Goal: Study reverb characteristics of different rooms
Settings:
- Record impulse responses or speech in various rooms
- Run with different presets, compare spectrograms
- Measure RMS reduction as proxy for reverb amount
Result: Quantitative comparison of room reverberation
Troubleshooting Common Issues
Cause: Python not installed, or nara_wpe not installed
Solution: Install Python and required packages: pip install nara_wpe
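To check whether the Python side is ready, a short diagnostic like the one below can help. It is a sketch, and module_available is a hypothetical helper. A common cause of confusion is that the interpreter launched by Praat may not be the one where pip installed the package, so printing the interpreter path is useful.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


print("Python interpreter:", sys.executable)
for module in ("numpy", "nara_wpe"):
    status = "found" if module_available(module) else "MISSING"
    print(f"{module}: {status}")
```

If nara_wpe is reported as missing, installing it with that same interpreter (for example `python -m pip install nara_wpe`) avoids a mismatch between Python installations.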
Cause: WPE parameters too aggressive for the material
Solution: Reduce iterations, increase delay, reduce filter_length
Cause: Parameters too conservative for the amount of reverb
Solution: Increase iterations, decrease delay, increase filter_length
Cause: 8 kHz limit; frequencies above it are unchanged by design
Solution: Accept as intended, or modify max_freq_hz in Python script
Cause: Long files, high filter length
Solution: Reduce filter_length, or process shorter segments
Advanced Techniques
In the Python script, modify max_freq_hz (currently 8000) to process higher frequencies. Be aware that WPE may introduce artifacts in high frequencies.
The script processes all channels independently. For linked stereo processing, modify it to use the same prediction coefficients for both channels.
For batch processing, save form settings and use Praat's scripting to call the script with different parameter sets.
For very long files, consider splitting into shorter segments, processing separately, and recombining.
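One way to plan such a split is to compute overlapping segment boundaries, so that neighbouring segments can be crossfaded when recombined. The sketch below only computes the boundaries; segment_bounds is a hypothetical helper, and the segment length and overlap are choices left to the user rather than values from the script.

```python
def segment_bounds(n_samples, segment_len, overlap):
    """Return (start, end) sample indices of overlapping segments covering the file."""
    if overlap < 0 or segment_len <= overlap:
        raise ValueError("segment_len must be larger than a non-negative overlap")
    if n_samples <= 0:
        return []
    step = segment_len - overlap   # distance between segment starts
    bounds = []
    start = 0
    while True:
        end = min(start + segment_len, n_samples)
        bounds.append((start, end))
        if end == n_samples:       # last segment reaches the end of the file
            break
        start += step
    return bounds


# Example: a 10-minute file at 44.1 kHz in 60-second pieces with 2 seconds of overlap.
rate = 44100
pieces = segment_bounds(10 * 60 * rate, 60 * rate, 2 * rate)
print(len(pieces), "segments; first:", pieces[0], "last:", pieces[-1])
```

Some overlap is worth keeping because WPE has little past context at the start of each segment, so the first frames of a segment tend to be dereverberated less well than the rest.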