Multichannel to Binaural — v1.2 User Guide
Multichannel‑to‑binaural downmix via Spat5 virtual speakers. Takes any multichannel Sound (mono to 24‑channel 22.2) and renders a stereo binaural output using HRTF convolution. Single‑stage pipeline: input channels are treated as speaker feeds at known positions, and spat5.virtualspeakers~ convolves each with the corresponding HRIR pair.
What this does
Multichannel to Binaural v1.2 renders a multichannel audio file to a stereo binaural output using Spat5’s spat5.virtualspeakers~ command‑line tool. The input channels are treated as speaker feeds at known positions (e.g., L/R, quad, 5.1, 7.1, 22.2). Each channel is convolved with the corresponding HRIR (Head‑Related Impulse Response) pair, and the results are summed to produce a binaural stereo signal.
- Praat exports the selected multichannel Sound to a temporary WAV.
- Python bridge calls spat5.virtualspeakers~ with the chosen layout, HRTF, ITD scaling, and room preset.
- Praat imports the resulting stereo binaural WAV and optionally visualises the input (first 2 channels) and output waveforms.
Supported layouts include mono, stereo, quad, 5.0, 5.1, 7.0, 7.1, 7.1.2, 7.1.4, and full 22.2 (24 channels). The HRTF can be KEMAR (neutral, hall, or studio) or a custom SOFA file. Room simulation (early reflections) can be added for presets 1–3.
Quick start
- Install Spat5 (IRCAM) and ensure spat5.virtualspeakers~.exe is in the tools folder.
- Place the Python helper script spat_binaural_bridge.py in the same folder as the Praat script.
- In Praat, select a multichannel Sound object (mono to 24 channels).
- Run script… → Multichannel_to_Binaural.praat.
- Set the Tools_folder to the directory containing spat5.virtualspeakers~.
- Choose a Layout_preset (Auto‑detect from channel count, or select manually).
- Select an Hrtf_preset (KEMAR / neutral, KEMAR / hall, KEMAR / studio, or custom SOFA).
- Optionally adjust Pre_gain_db (to avoid clipping).
- Click OK. Praat exports WAV, calls Python, and imports the result as originalname_binaural.
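Pre_gain_db is given in decibels. To get a feel for how much headroom a value buys, the standard amplitude conversion is gain = 10^(dB/20). The sketch below only illustrates that arithmetic; it is not taken from the bridge script, and the function name `db_to_linear` is invented for this example.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    # Print the amplitude factor for a few candidate pre-gain values.
    for pre_gain_db in (0.0, -6.0, -12.0):
        factor = db_to_linear(pre_gain_db)
        print(f"{pre_gain_db:+.1f} dB -> x{factor:.3f}")
```

A pre-gain of -6 dB roughly halves the amplitude, which is often enough to keep a dense multichannel sum from clipping.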
Supported layouts
- 1 ch → Mono
- 2 ch → 2.0 (L/R at ±30°)
- 4 ch → 4.0 (FL/FR/BL/BR)
- 5 ch → 5.0 (L/R/C/Ls/Rs)
- 6 ch → 5.1 (5.0 + LFE)
- 7 ch → 7.0 (L/R/C/Ls/Rs/Lss/Rss)
- 8 ch → 7.1 (7.0 + LFE)
- 10 ch → 7.1.2 (7.1 + TpFL/TpFR)
- 12 ch → 7.1.4 (7.1 + TpFL/TpFR/TpBL/TpBR)
- 24 ch → 22.2 (NHK full sphere)
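Auto‑detect amounts to a lookup from channel count to layout. A minimal sketch of that lookup follows, built only from the list above. The layout strings and the name `auto_detect_layout` are illustrative assumptions; the actual tokens the bridge passes to Spat5 may differ.

```python
from typing import Optional

# Channel count -> layout name, copied from the list of supported layouts.
# The string values are illustrative labels, not confirmed Spat5 tokens.
LAYOUT_BY_CHANNELS = {
    1: "mono",
    2: "2.0",
    4: "4.0",
    5: "5.0",
    6: "5.1",
    7: "7.0",
    8: "7.1",
    10: "7.1.2",
    12: "7.1.4",
    24: "22.2",
}


def auto_detect_layout(n_channels: int) -> Optional[str]:
    """Return the layout for a channel count, or None if no preset matches."""
    return LAYOUT_BY_CHANNELS.get(n_channels)
```

Counts with no entry (3, 9, 11 and so on) return None, which corresponds to the case where a layout has to be chosen by hand.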
You can also select a layout manually when auto‑detection does not suit your file, for example to make sure a 6‑channel file that carries 5.0 + LFE is rendered as “5.1”. The layout determines the speaker angles used by Spat5 for HRTF convolution.
HRTF & room presets
🎧 KEMAR HRTF (neutral / hall / studio)
KEMAR (Knowles Electronics Manikin for Acoustic Research) is a widely used dummy‑head HRTF dataset. The three presets differ only in the room simulation (early reflections and late reverb):
- KEMAR / neutral – no room added (anechoic).
- KEMAR / hall – concert hall simulation.
- KEMAR / studio – studio control room simulation.
🎧 Custom SOFA + room preset
For custom SOFA HRTF files (e.g., cipic.sofa, ari.sofa), select SOFA custom / none and provide the SOFA filename (without extension). Then choose a Room_preset (none, hall, livingroom, studio) to add room simulation.
Room simulation is applied after the HRTF convolution and adds early reflections and late reverberation based on the selected room model.
ITD_percent scales the interaural time difference (100% = natural, 0% = no ITD). Values above 100%, up to a maximum of 200%, exaggerate the stereo width.
Parameters & defaults
Spat5 tools
| Parameter | Default | Description |
|---|---|---|
| Tools_folder | C:/Users/User/Documents/Max 9/Packages/spat5-x64/media/tools/ | Directory containing the Spat5 command‑line tools (must include spat5.virtualspeakers~.exe). |
Layout
HRTF / room
Output
Visualization (Praat picture)
When Draw_visualization = 1, the script draws a 3‑panel figure with a summary bar:
- Input waveform – first two channels of the multichannel input (grey and light grey).
- Output binaural waveform – left ear (blue) and right ear (orange).
- Info panel – channel count, layout, HRTF, ITD, room, pre‑gain, sample rate.
- Summary bar – output name, layout, duration, HRTF, room.
FAQ / troubleshooting
If the script cannot find Spat5, update the Tools_folder to the directory containing the Spat5 command‑line tools. On Windows, this is typically C:\...\Max 9\Packages\spat5-x64\media\tools\. The folder must contain spat5.virtualspeakers~.exe.
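Before running the Praat script, the folder can be checked from Python. This is a small sketch using only the standard library; the helper name `has_virtualspeakers_tool` is hypothetical and is not part of spat_binaural_bridge.py.

```python
from pathlib import Path

# Windows executable name as given in this guide.
TOOL_NAME = "spat5.virtualspeakers~.exe"


def has_virtualspeakers_tool(tools_folder: str) -> bool:
    """Return True if the folder contains the virtualspeakers executable."""
    return (Path(tools_folder) / TOOL_NAME).is_file()
```

Pass the same string you intend to enter as Tools_folder; a False result means the path is wrong or the executable is missing.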
If the output sounds mono: the Python bridge checks the correlation between the left and right output channels and logs a warning if it exceeds 0.999. Common causes:
- Wrong layout token (e.g., “mono” for a stereo file).
- Input is effectively mono (all channels identical).
- Spat5 rendering with no angular spread (e.g., all speakers at the same angle).
Check the log file (mcbin_log.txt) for the correlation value and warnings.
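To reproduce this kind of check on your own data, the sketch below computes a Pearson correlation between two channels in plain Python. Whether the bridge uses exactly this formula is an assumption; only the 0.999 threshold comes from this guide, and the function names are invented for the example.

```python
import math
from typing import Sequence

CORRELATION_WARN_THRESHOLD = 0.999  # threshold quoted in this guide


def lr_correlation(left: Sequence[float], right: Sequence[float]) -> float:
    """Pearson correlation of two channels; 0.0 if either one is constant."""
    n = min(len(left), len(right))
    if n == 0:
        return 0.0
    mean_l = sum(left[:n]) / n
    mean_r = sum(right[:n]) / n
    cov = sum((left[i] - mean_l) * (right[i] - mean_r) for i in range(n))
    var_l = sum((left[i] - mean_l) ** 2 for i in range(n))
    var_r = sum((right[i] - mean_r) ** 2 for i in range(n))
    denom = math.sqrt(var_l * var_r)
    return cov / denom if denom > 0 else 0.0


def is_effectively_mono(left: Sequence[float], right: Sequence[float]) -> bool:
    """True when the two channels are nearly identical in shape."""
    return lr_correlation(left, right) > CORRELATION_WARN_THRESHOLD
```

Two identical channels give a correlation of 1 and trigger the warning, while a properly spatialised signal stays well below the threshold.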
If the output is silent or very quiet: the Python bridge checks per‑channel RMS and logs a warning if either output channel is near‑silent. Possible causes:
- Pre_gain_db too low (increase to 0 or +6 dB).
- Input file has very low amplitude – normalise before processing.
- HRTF convolution produced silence (unlikely).
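A per‑channel RMS check can be sketched as follows. The -60 dBFS threshold and all names here are assumptions chosen for illustration; this guide does not state the value the bridge actually uses.

```python
import math
from typing import List, Sequence

# Hypothetical threshold; the bridge's real value is not documented here.
SILENCE_THRESHOLD_DBFS = -60.0


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a channel (0.0 for an empty channel)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (1.0); -inf for digital silence."""
    value = rms(samples)
    return 20.0 * math.log10(value) if value > 0 else float("-inf")


def near_silent_channels(left: Sequence[float], right: Sequence[float]) -> List[str]:
    """Return the names of output channels whose RMS is below the threshold."""
    channels = (("left", left), ("right", right))
    return [name for name, data in channels if rms_dbfs(data) < SILENCE_THRESHOLD_DBFS]
```

An empty result means both ears carry signal. A single name in the list points to a one‑sided problem rather than a global gain issue.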
If your file has an unconventional channel count (e.g., 3‑channel L/C/R), Auto‑detect will fail. Select a layout manually. The layout determines how Spat5 interprets the channel order – for example, 5.1 expects L, R, C, LFE, Ls, Rs in that order.
For KEMAR presets (1–3), the room simulation is fixed (neutral = none, hall = hall, studio = studio). For custom SOFA, the room preset is independent – you can add room simulation to any custom HRTF.
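The rule above can be written as a small lookup. This is a sketch under stated assumptions: the preset strings mirror the labels used in this guide, and `effective_room` is a hypothetical helper, not a function of the bridge script.

```python
# Room fixed by each KEMAR preset, as described in this guide.
KEMAR_ROOM = {
    "KEMAR / neutral": "none",
    "KEMAR / hall": "hall",
    "KEMAR / studio": "studio",
}

# Room_preset choices listed in this guide for custom SOFA files.
ROOM_PRESETS = ("none", "hall", "livingroom", "studio")


def effective_room(hrtf_preset: str, room_preset: str = "none") -> str:
    """Return the room that applies for a given HRTF / room combination."""
    if hrtf_preset in KEMAR_ROOM:
        # KEMAR presets carry their own room; Room_preset is ignored.
        return KEMAR_ROOM[hrtf_preset]
    if room_preset not in ROOM_PRESETS:
        raise ValueError(f"unknown room preset: {room_preset!r}")
    # Any other HRTF preset is treated as custom SOFA.
    return room_preset
```

For example, `effective_room("KEMAR / hall", "livingroom")` yields the hall, while the same room choice with a custom SOFA file yields the living room.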
ITD (interaural time difference) is the difference in arrival time between the two ears. Natural ITD depends on head size (≈ 660 µs for KEMAR). Scaling to 200% exaggerates the width; 0% removes ITD entirely, so the time cue no longer indicates direction and the image tends to collapse toward the centre regardless of angle.
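As a worked example of the scaling, the sketch below assumes that ITD_percent acts as a simple linear multiplier on the natural delay. That linearity, the 48 kHz sample rate in the usage note, and the function names are assumptions made for illustration; only the 0 to 200% range and the ≈ 660 µs figure come from this guide.

```python
def scaled_itd_us(natural_itd_us: float, itd_percent: float) -> float:
    """Scale a natural ITD (microseconds) by ITD_percent, assumed linear."""
    if not 0.0 <= itd_percent <= 200.0:
        raise ValueError("ITD_percent is expected to lie between 0 and 200")
    return natural_itd_us * itd_percent / 100.0


def itd_in_samples(itd_us: float, sample_rate_hz: float) -> float:
    """Express an ITD given in microseconds as a (fractional) sample count."""
    return itd_us * 1e-6 * sample_rate_hz
```

With a natural ITD of 660 µs, 200% gives 1320 µs and 0% gives 0 µs. At a sample rate of 48 kHz, 660 µs corresponds to about 31.7 samples.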