Multichannel to Binaural — v1.2 User Guide

Multichannel‑to‑binaural downmix via Spat5 virtual speakers. Takes any multichannel Sound (mono to 24‑channel 22.2) and renders a stereo binaural output using HRTF convolution. Single‑stage pipeline: input channels are treated as speaker feeds at known positions, and spat5.virtualspeakers~ convolves each with the corresponding HRIR pair.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.2 (2026)
License: MIT License
Repo: GitHub

Contents:

What it does
Quick start
Supported layouts
HRTF & room presets
Parameters
Visualization
FAQ / troubleshooting

What this does

Multichannel to Binaural v1.2 renders a multichannel audio file to a stereo binaural output using Spat5’s spat5.virtualspeakers~ command‑line tool. The input channels are treated as speaker feeds at known positions (e.g., L/R, quad, 5.1, 7.1, 22.2). Each channel is convolved with the corresponding HRIR (Head‑Related Impulse Response) pair, and the results are summed to produce a binaural stereo signal.
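The convolve‑and‑sum principle described above can be sketched in plain Python. This is only an illustration of the idea, not the Spat5 implementation: the function names are hypothetical, and the 2‑tap "HRIRs" are made‑up toy values rather than measured responses.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def binaural_downmix(channels, hrirs):
    """Sum each speaker feed convolved with its (left, right) HRIR pair.

    channels: list of per-speaker sample lists
    hrirs:    list of (hrir_left, hrir_right) pairs, one per speaker
    """
    n_out = max(len(ch) + max(len(hl), len(hr)) - 1
                for ch, (hl, hr) in zip(channels, hrirs))
    left = [0.0] * n_out
    right = [0.0] * n_out
    for ch, (hl, hr) in zip(channels, hrirs):
        for n, v in enumerate(convolve(ch, hl)):
            left[n] += v
        for n, v in enumerate(convolve(ch, hr)):
            right[n] += v
    return left, right


# Toy data: two speaker feeds and invented 2-tap HRIRs (not real measurements).
feeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
toy_hrirs = [([1.0, 0.0], [0.0, 0.5]),   # left speaker: direct at left ear, delayed and quieter at right
             ([0.0, 0.5], [1.0, 0.0])]   # right speaker: mirror image
left_ear, right_ear = binaural_downmix(feeds, toy_hrirs)
print(left_ear, right_ear)
```

Real HRIRs are hundreds of samples long, so a production renderer would use FFT‑based convolution rather than the direct form shown here.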

Single‑stage pipeline:

Praat exports the selected multichannel Sound to a temporary WAV.
Python bridge calls spat5.virtualspeakers~ with the chosen layout, HRTF, ITD scaling, and room preset.
Praat imports the resulting stereo binaural WAV and optionally visualises the input (first 2 channels) and output waveforms.

Supported layouts include mono, stereo, quad, 5.0, 5.1, 7.0, 7.1, 7.1.2, 7.1.4, and full 22.2 (24 channels). The HRTF can be KEMAR (neutral, hall, or studio) or a custom SOFA file. The KEMAR hall and studio presets include room simulation; with a custom SOFA file, a room preset can be chosen separately.

Quick start

Install Spat5 (IRCAM) and ensure spat5.virtualspeakers~.exe is in the tools folder.
Place the Python helper script spat_binaural_bridge.py in the same folder as the Praat script.
In Praat, select a multichannel Sound object (mono to 24 channels).
Run script… → Multichannel_to_Binaural.praat.
Set the Tools_folder to the directory containing spat5.virtualspeakers~.
Choose a Layout_preset (Auto‑detect from channel count, or select manually).
Select an Hrtf_preset (KEMAR / neutral, KEMAR / hall, KEMAR / studio, or custom SOFA).
Optionally adjust Pre_gain_db (to avoid clipping).
Click OK. Praat exports WAV, calls Python, and imports the result as originalname_binaural.

Tip: For a 22.2 (24‑channel) file, select layout 22.2. For a standard stereo file, Auto‑detect works or choose 2.0. The visualisation shows the first two input channels (if available) and the binaural output (blue = left ear, orange = right ear).

Important: This script requires Spat5 command‑line tools and a working Python installation. The Python bridge validates the output WAV (2‑channel stereo, non‑silent, reasonable L/R decorrelation) and logs warnings for dual‑mono or silent channels.
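The first of these checks, that the output really is a 2‑channel file, can be reproduced with Python's standard wave module. The sketch below is an assumption about how such a check might look, not the bridge's actual code; is_stereo_wav and write_silence are hypothetical helper names.

```python
import os
import tempfile
import wave


def is_stereo_wav(path):
    """Return True if the WAV header at `path` declares exactly 2 channels."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels() == 2


def write_silence(path, n_channels, n_frames=100, rate=44100):
    """Write a short 16-bit silent WAV, used here only to exercise the check."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(n_channels)
        wav.setsampwidth(2)          # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_channels * n_frames)


demo_path = os.path.join(tempfile.mkdtemp(), "demo_binaural.wav")
write_silence(demo_path, n_channels=2)
print(is_stereo_wav(demo_path))
```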

Supported layouts

Channel count → layout mapping (Auto‑detect):

1 ch → Mono
2 ch → 2.0 (L/R at ±30°)
4 ch → 4.0 (FL/FR/BL/BR)
5 ch → 5.0 (L/R/C/Ls/Rs)
6 ch → 5.1 (5.0 + LFE)
7 ch → 7.0 (L/R/C/Ls/Rs/Lss/Rss)
8 ch → 7.1 (7.0 + LFE)
10 ch → 7.1.2 (7.1 + TpFL/TpFR)
12 ch → 7.1.4 (7.1 + TpFL/TpFR/TpBL/TpBR)
24 ch → 22.2 (NHK full sphere)

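You can also select a layout manually when Auto‑detect would pick the wrong one or when the channel count is not in the table above. The layout determines the speaker angles used by Spat5 for HRTF convolution, so it must match the actual content of the file (a 6‑channel file, for example, is only rendered correctly as 5.1 if it really contains 5.0 + LFE).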
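The Auto‑detect table amounts to a lookup from channel count to layout name. A minimal sketch follows; the dictionary and function names are hypothetical, and the strings are the layout names used in this guide, not necessarily the tokens passed to Spat5.

```python
# Channel count -> layout name, copied from the Auto-detect table in this guide.
AUTO_LAYOUTS = {
    1: "Mono",
    2: "2.0",
    4: "4.0",
    5: "5.0",
    6: "5.1",
    7: "7.0",
    8: "7.1",
    10: "7.1.2",
    12: "7.1.4",
    24: "22.2",
}


def detect_layout(n_channels):
    """Return the layout name for a channel count, or raise if it is not in the table."""
    try:
        return AUTO_LAYOUTS[n_channels]
    except KeyError:
        raise ValueError(
            f"No automatic layout for {n_channels} channels; select one manually."
        ) from None


print(detect_layout(6))
```

Counts such as 3, 9, or 11 are absent from the table, which is why they require a manual choice.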

HRTF & room presets

🎧 KEMAR HRTF (neutral / hall / studio)

KEMAR (Knowles Electronics Manikin for Acoustic Research) is a widely used dummy‑head HRTF dataset. The three presets differ only in the room simulation (early reflections and late reverb):

KEMAR / neutral – no room added (anechoic).
KEMAR / hall – concert hall simulation.
KEMAR / studio – studio control room simulation.

🎧 Custom SOFA + room preset

For custom SOFA HRTF files (e.g., cipic.sofa, ari.sofa), select SOFA custom / none and provide the SOFA filename (without extension). Then choose a Room_preset (none, hall, livingroom, studio) to add room simulation.

Room simulation is applied after the HRTF convolution and adds early reflections and late reverberation based on the selected room model.

Itd_percent scales the interaural time difference (100% = natural, 0% = no ITD). Values above 100%, up to the maximum of 200%, exaggerate the stereo width.

Parameters & defaults

Spat5 tools

Parameter	Default	Description
Tools_folder	C:/Users/User/Documents/Max 9/Packages/spat5-x64/media/tools/	Folder containing spat5.virtualspeakers~.exe.

Layout

Parameter	Options	Default	Description
Layout_preset	Auto‑detect / 2.0 / 4.0 / 5.0 / 5.1 / 7.0 / 7.1 / 7.1.2 / 7.1.4 / 22.2	Auto‑detect	Speaker layout assumed for the input channels; Auto‑detect chooses it from the channel count.

HRTF / room

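Parameter	Options	Default	Description
Hrtf_preset	KEMAR / neutral, KEMAR / hall, KEMAR / studio, SOFA custom / none	KEMAR / neutral	HRTF set used for the convolution; the KEMAR presets also fix the room.
Sofa_file	any	kemar	SOFA filename without extension, used with SOFA custom / none.
Itd_percent	0–200	100	ITD scaling in percent (100 = natural, 0 = no ITD).
Room_preset	none / hall / livingroom / studio	none	Room simulation added to a custom SOFA HRTF.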

Output

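Parameter	Range	Default	Description
Pre_gain_db	-30 to 0	-6	Gain in dB; lower it to avoid clipping.
Draw_visualization	yes/no	yes	Draw the input and output waveforms in the Praat picture window.
Play_result	yes/no	yes	Play the binaural result.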
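For orientation, a gain in dB corresponds to a linear amplitude factor of 10^(dB/20). The sketch below uses that standard conversion to show what the default of -6 dB means; whether the script applies Pre_gain_db in exactly this way is an assumption, and the function names are hypothetical.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear multiplier."""
    return 10.0 ** (gain_db / 20.0)


def apply_pre_gain(samples, gain_db):
    """Scale every sample by the linear factor for `gain_db`."""
    factor = db_to_linear(gain_db)
    return [s * factor for s in samples]


# The default of -6 dB roughly halves the amplitude.
print(round(db_to_linear(-6.0), 4))
# The lower limit of -30 dB scales the signal to about 3% of its amplitude.
print(round(db_to_linear(-30.0), 4))
```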

Visualization (Praat picture)

When Draw_visualization is enabled, the script draws a 3‑panel figure with a summary bar:

Input waveform – first two channels of the multichannel input (grey and light grey).
Output binaural waveform – left ear (blue) and right ear (orange).
Info panel – channel count, layout, HRTF, ITD, room, pre‑gain, sample rate.
Summary bar – output name, layout, duration, HRTF, room.

Tip: The output waveform should show decorrelated L/R channels (different shapes). If they look identical, check that your layout token is correct – a mono source rendered as 2.0 may produce dual‑mono output. The Python log includes a correlation check and warns about this.

FAQ / troubleshooting

“spat5.virtualspeakers~ not found”

Update the Tools_folder to the directory containing the Spat5 command‑line tools. On Windows, this is typically C:\...\Max 9\Packages\spat5-x64\media\tools\. The folder must contain spat5.virtualspeakers~.exe.
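To confirm a folder before running the script, a quick existence check along these lines can help. It is a sketch only: find_tool is a hypothetical helper, and the demonstration uses a temporary folder with an empty stand‑in file instead of a real Spat5 installation.

```python
import tempfile
from pathlib import Path

TOOL_NAME = "spat5.virtualspeakers~.exe"


def find_tool(tools_folder):
    """Return the full path to the tool if it exists in `tools_folder`, else None."""
    candidate = Path(tools_folder) / TOOL_NAME
    return candidate if candidate.is_file() else None


# Demonstration with a temporary folder standing in for Tools_folder.
with tempfile.TemporaryDirectory() as folder:
    print(find_tool(folder))             # the folder is still empty
    (Path(folder) / TOOL_NAME).touch()   # create an empty stand-in file
    print(find_tool(folder) is not None)
```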

Output is mono / L and R identical

The Python bridge checks L/R correlation and logs a warning if correlation > 0.999. Common causes:

Wrong layout token (e.g., “mono” for a stereo file).
Input is effectively mono (all channels identical).
Spat5 rendering with no angular spread (e.g., all speakers at the same angle).

Check the log file (mcbin_log.txt) for the correlation value and warnings.
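To reproduce the check on your own samples, a Pearson correlation between the two ear signals is one plausible measure. The 0.999 threshold comes from this guide; the choice of Pearson correlation and the function names are assumptions about the bridge, not its confirmed implementation.

```python
import math


def pearson(left, right):
    """Pearson correlation coefficient of two equal-length sample lists."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0.0 or var_r == 0.0:
        return 0.0  # correlation is undefined for a constant channel
    return cov / math.sqrt(var_l * var_r)


def looks_dual_mono(left, right, threshold=0.999):
    """True if L and R are so similar that the output is effectively mono."""
    return pearson(left, right) > threshold


same = [0.0, 1.0, 0.0, -1.0]
shifted = [1.0, 0.0, -1.0, 0.0]
print(looks_dual_mono(same, same), looks_dual_mono(same, shifted))
```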

Output is silent / very quiet

The Python bridge checks per‑channel RMS and logs if either channel is near‑silent. Possible causes:

Pre_gain_db too low (raise it toward 0 dB, the upper limit of its range).
Input file has very low amplitude – normalise before processing.
HRTF convolution produced silence (unlikely).
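A per‑channel RMS check of this kind can be sketched as follows. The -60 dBFS floor is a hypothetical threshold chosen for illustration, since the guide does not state the value the bridge uses; samples are assumed to be floats in the range -1 to 1.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for digital silence."""
    value = rms(samples)
    return -math.inf if value == 0.0 else 20.0 * math.log10(value)


def near_silent(samples, floor_db=-60.0):
    """True if the channel's RMS level is below `floor_db` (assumed threshold)."""
    return rms_dbfs(samples) < floor_db


print(near_silent([0.0] * 10), near_silent([0.5, -0.5] * 5))
```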

Auto‑detect layout

If your file has an unconventional channel count (e.g., 3‑channel L/C/R), Auto‑detect will fail. Select a layout manually. The layout determines how Spat5 interprets the channel order – for example, 5.1 expects L, R, C, LFE, Ls, Rs in that order.

Room preset vs. HRTF preset

For KEMAR presets (1–3), the room simulation is fixed (neutral = none, hall = hall, studio = studio). For custom SOFA, the room preset is independent – you can add room simulation to any custom HRTF.
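The rule can be written as a small lookup: KEMAR presets carry their own room, and only the custom SOFA preset honours Room_preset. The names below are hypothetical, and the strings are the option labels from this guide.

```python
# Room implied by each KEMAR preset, as described in this guide.
KEMAR_ROOMS = {
    "KEMAR / neutral": "none",
    "KEMAR / hall": "hall",
    "KEMAR / studio": "studio",
}


def effective_room(hrtf_preset, room_preset="none"):
    """Return the room actually used for a given HRTF preset and Room_preset."""
    if hrtf_preset in KEMAR_ROOMS:
        return KEMAR_ROOMS[hrtf_preset]   # Room_preset is ignored for KEMAR
    return room_preset                    # custom SOFA: Room_preset applies


print(effective_room("KEMAR / hall", "livingroom"))
print(effective_room("SOFA custom / none", "livingroom"))
```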

ITD scaling

ITD (interaural time difference) is the difference in arrival time between the two ears. Natural ITD depends on head size and source angle (up to ≈ 660 µs for KEMAR). Scaling to 200% exaggerates the width; 0% removes ITD entirely, which pulls the image toward the centre because only level and spectral cues remain.
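As a worked example of the arithmetic, the sketch below scales the 660 µs figure quoted above and converts it to samples. It assumes simple linear scaling of the delay, which is how the percentage is described in this guide; the function names and the 48 kHz sample rate are illustrative choices.

```python
def scaled_itd_us(natural_itd_us, itd_percent):
    """Scale an ITD in microseconds by Itd_percent (valid range 0 to 200)."""
    if not 0 <= itd_percent <= 200:
        raise ValueError("Itd_percent must be between 0 and 200")
    return natural_itd_us * itd_percent / 100.0


def itd_in_samples(itd_us, sample_rate):
    """Convert an ITD in microseconds to a (fractional) number of samples."""
    return itd_us * 1e-6 * sample_rate


for percent in (0, 100, 200):
    itd = scaled_itd_us(660.0, percent)
    print(percent, itd, round(itd_in_samples(itd, 48000), 2))
```

At 48 kHz, 660 µs is a little under 32 samples, so 200% corresponds to a delay of roughly 63 samples between the ears.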