IRCAM Pan to Binaural — User Guide

Mono/stereo → HOA encode → HOA decode → binaural (3‑stage pipeline). Uses Spat5 command‑line tools via a Python bridge. Supports static positioning and 10 animated trajectories (Linear, Circular, Figure‑8, Spiral In, Spiral Out, Pendulum, Zigzag, Random Walk, Ellipse, Square).

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 0.9 (2026) – Python Bridge Edition
License: MIT License
Repo: GitHub

Contents:

What it does
Quick start
IRCAM Spat5 framework
3‑stage pipeline
Trajectory types (10)
Parameters
Visualization
FAQ / troubleshooting

What this does

IRCAM Pan to Binaural is a bridge between Praat and the Spat5 suite (IRCAM – Institut de Recherche et Coordination Acoustique/Musique). It performs a 3‑stage spatial audio pipeline:

HOA encode – spatialise a mono or stereo source to a Higher‑Order Ambisonics (HOA) soundfield (order 1–5).
HOA decode – decode the HOA soundfield to a speaker layout (2.0, 4.0, 5.0, 7.0, 7.1).
Binaural – render the speaker feed to stereo binaural using HRTFs (KEMAR / SOFA).

Key features:

Static positioning – place the source at a fixed azimuth/elevation.
Animated trajectories – 10 movement patterns (linear, circular, figure‑8, spiral in, spiral out, pendulum, zigzag, random walk, ellipse, square).
HOA order – 1 (4ch) to 5 (36ch) for increasing spatial resolution.
HRTF presets – KEMAR (neutral / hall), custom SOFA files, ITD scaling.
Room simulation – hall, living room, studio, and legacy presets.
Pre‑gain calibration – compensate for HOA chain gain (+7 dB typical).

Quick start

Install Spat5 (IRCAM) and ensure the command‑line tools (spat5.hoa.encoder~, spat5.hoa.decoder~, spat5.virtualspeakers~) are available.
Place the Python helper script spat_bridge.py in the same folder as this Praat script.
In Praat, select exactly one Sound object (mono or stereo).
Run script… → IRCAM_Pan_to_Binaural.praat.
Choose movement_enabled (static or animated).
If static: set azimuth (‑180…180°) and elevation (‑90…90°).
If animated: select a movement_type (1–10) and adjust parameters (radius, speed, chunk duration).
Set hoa_order (1–5) and decode_layout (2.0, 4.0, 5.0, 7.0, 7.1).
Select an HRTF preset (KEMAR / neutral, KEMAR / hall, or custom SOFA).
Click OK. Praat exports chunks (if moving), calls the Python bridge, and returns a binaural stereo Sound object.

Tip: For a simple static panner, turn off movement and set azimuth to 0° (centre), 90° (right), or ‑90° (left). For a dramatic sweep, enable movement with Linear trajectory from ‑90° to 90°.

Important: This script requires Spat5 command‑line tools and a working Python installation. The Python bridge expects the Spat5 tools folder to be correctly set (tools_folder$). The default path is for a typical Max 9 installation – adjust if needed.

IRCAM Spat5 framework

What is Spat5?

Spat5 is a suite of spatialisation tools developed at IRCAM (Institut de Recherche et Coordination Acoustique/Musique) in Paris. It provides high‑quality spatial audio processing for composition, performance, and research. The tools are available as Max/MSP externals, standalone command‑line applications, and VST/AU plugins.

This script uses the command‑line versions of three Spat5 processors:

spat5.hoa.encoder~ – encodes a mono/stereo source into a Higher‑Order Ambisonics soundfield.
spat5.hoa.decoder~ – decodes an HOA soundfield to a specific speaker layout.
spat5.virtualspeakers~ – renders a speaker‑layout feed to binaural stereo using HRTFs.

Higher‑Order Ambisonics (HOA)

Ambisonics is a full‑sphere surround sound technique that represents a soundfield using spherical harmonics. Higher‑Order Ambisonics (HOA) increases spatial resolution by using more harmonics:

Order 1 → 4 channels (basic horizontal + height).
Order 2 → 9 channels (finer spatial detail).
Order 3 → 16 channels.
Order 4 → 25 channels.
Order 5 → 36 channels.

Higher orders produce a more accurate soundfield but require more processing and channel bandwidth. The script supports orders 1–5.
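The channel counts listed above follow the full‑sphere (3D) Ambisonics rule of (N + 1)² channels for order N. A minimal Python sketch of that arithmetic is given here; the function name hoa_channel_count is illustrative and is not part of the script or of Spat5.

```python
def hoa_channel_count(order: int) -> int:
    """Number of channels in a full-sphere (3D) HOA signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


# Orders supported by the script (1-5).
for order in range(1, 6):
    print(f"order {order}: {hoa_channel_count(order)} channels")
```

This is a quick way to estimate the size of the intermediate HOA file before choosing an order.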

Normalisation schemes

Ambisonics components can be scaled using different normalisation conventions. Spat5 supports:

SN3D (Semi‑Normalised, 3D) – standard for many modern Ambisonics applications.
N3D (Normalised, 3D) – often used in older systems.
FuMa (Furse‑Malham) – legacy format from early Ambisonics.

Use the same normalisation throughout the chain, and match whatever any downstream decoder expects if you reuse the HOA file elsewhere. The script's default is N3D.
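For orientation, SN3D and N3D differ only by a scale factor that depends on the spherical‑harmonic degree l: an N3D component equals the corresponding SN3D component multiplied by √(2l + 1). The sketch below shows that relationship. The function names are illustrative and do not come from Spat5. FuMa is left out because its conversion is not a single per‑degree factor.

```python
import math


def sn3d_to_n3d_gain(degree: int) -> float:
    """Factor that converts an SN3D component of the given degree to N3D."""
    if degree < 0:
        raise ValueError("degree must be non-negative")
    return math.sqrt(2 * degree + 1)


def n3d_to_sn3d_gain(degree: int) -> float:
    """Inverse factor: converts an N3D component of the given degree to SN3D."""
    return 1.0 / sn3d_to_n3d_gain(degree)


# Degree 0 is identical in both conventions; higher degrees are louder in N3D.
for degree in range(0, 6):
    print(f"degree {degree}: SN3D -> N3D gain = {sn3d_to_n3d_gain(degree):.4f}")
```

Mixing the two conventions between encoder and decoder therefore changes the balance between low and high degrees, which is why a mismatch tends to blur or distort the spatial image rather than simply change the level.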

Speaker layouts

The decoder stage converts the HOA soundfield to a specific speaker layout. Available layouts:

2.0 – stereo (L30°, R30°).
4.0 – quadraphonic (L30°, R30°, Ls110°, Rs110°).
5.0 – standard surround (L30°, R30°, C0°, Ls110°, Rs110°).
7.0 – 7‑channel surround (adds L90°, R90°).
7.1 – 7.1 surround (includes LFE channel).

The binaural stage then renders these virtual speakers to headphones using HRTFs.

HRTF / binaural rendering

Binaural rendering simulates the way sound reaches the ears by convolving the speaker feed with Head‑Related Transfer Functions (HRTFs). Spat5 supports:

SOFA (Spatially Oriented Format for Acoustics) – standard file format for HRTF data.
ITD scaling – scales the Interaural Time Difference; 100 % leaves the natural ITD of the HRTF set unchanged.
Room presets – add early reflections and late reverberation (hall, living room, studio, etc.).

The script includes presets for the KEMAR manikin (Knowles Electronics Manikin for Acoustic Research), a widely used HRTF dataset.

Why use Spat5?

Spat5 is a professional‑grade spatialisation toolkit used in countless electroacoustic compositions and research projects. By bridging Praat to Spat5, this script gives Praat users access to:

High‑quality Ambisonics encoding/decoding.
Accurate binaural rendering with validated HRTFs.
Room simulation and spatial effects.
Command‑line automation (no GUI needed).

This makes it ideal for batch processing, algorithmic composition, and research pipelines.

3‑stage pipeline

Stage 1 – HOA encode (spat5.hoa.encoder~)
Mono input → encoded as a point source at (az, el).
Stereo input → encoded as two sources spread by stereo_spread_deg (left = az+spread/2, right = az‑spread/2).
Output: HOA‑order multichannel WAV (e.g., order 1 → 4ch, order 2 → 9ch, etc.).

Stage 2 – HOA decode (spat5.hoa.decoder~)
HOA WAV → decoded to a speaker layout (2.0, 4.0, 5.0, 7.0, 7.1).
Speaker positions are predefined (e.g., 5.0: L30°, R30°, C0°, Ls110°, Rs110°).

Stage 3 – Binaural (spat5.virtualspeakers~)
Speaker‑layout WAV → convolved with HRTFs to produce stereo binaural output.
Supports SOFA HRTF files, ITD scaling, and room simulation presets.

For moving sources, the sound is split into chunks (movement_chunk_dur seconds), each chunk is processed independently with its own azimuth, then all chunks are concatenated with crossfades.
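The chunking step can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: the function name plan_chunks is hypothetical, the trajectory is a Linear sweep from start_az to end_az, and each chunk is assigned the azimuth at its temporal midpoint.

```python
import math


def plan_chunks(total_dur, chunk_dur, start_az, end_az):
    """Return a list of (t_start, t_end, azimuth_deg) tuples, one per chunk.

    Assumption: a linear sweep, sampled at the midpoint of each chunk.
    """
    if total_dur <= 0 or chunk_dur <= 0:
        raise ValueError("durations must be positive")
    # Small tolerance so float rounding does not create a spurious extra chunk.
    n_chunks = max(1, math.ceil(total_dur / chunk_dur - 1e-9))
    chunks = []
    for i in range(n_chunks):
        t_start = i * chunk_dur
        t_end = min(total_dur, (i + 1) * chunk_dur)
        midpoint = (t_start + t_end) / 2.0
        azimuth = start_az + (midpoint / total_dur) * (end_az - start_az)
        chunks.append((t_start, t_end, azimuth))
    return chunks


# A 2 s sound, 0.5 s chunks, sweeping from -90 deg (left) to 90 deg (right).
for t_start, t_end, az in plan_chunks(2.0, 0.5, -90.0, 90.0):
    print(f"{t_start:.2f}-{t_end:.2f} s  az = {az:+.1f} deg")
```

Because every chunk has a single fixed azimuth, the source moves in steps; shorter chunks give finer steps at the cost of more tool invocations.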

Trajectory types (10)

Type	Description / parameters
Linear	start_az to end_az
Circular
Figure‑8
Spiral In
Spiral Out
Pendulum
Zigzag
Random Walk
Ellipse
Square

All trajectories are 2D (elevation fixed to the form value). Coordinates are converted to azimuth using az = arctan2(x, y) (0° = front, +90° = right). The chunk duration determines the time resolution of the animation.
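The coordinate conversion can be illustrated with a short Python sketch. The conversion itself follows the formula above, with x pointing right and y pointing front. The circular parametrisation (start at the front, move clockwise, speed counted in revolutions over the whole sound) and the function names are assumptions made for the example, not taken from the script.

```python
import math


def xy_to_azimuth(x: float, y: float) -> float:
    """Azimuth in degrees for a point (x = right, y = front).

    0 deg = front, +90 deg = right, -90 deg = left.
    """
    return math.degrees(math.atan2(x, y))


def circular_azimuths(n_chunks: int, radius: float = 0.8, speed: float = 1.0):
    """One azimuth per chunk for a circle around the listener (assumed form)."""
    azimuths = []
    for i in range(n_chunks):
        # Sample the path at the midpoint of each chunk.
        phase = 2.0 * math.pi * speed * (i + 0.5) / n_chunks
        x = radius * math.sin(phase)
        y = radius * math.cos(phase)
        azimuths.append(xy_to_azimuth(x, y))
    return azimuths


print([round(az, 1) for az in circular_azimuths(8)])
```

For a circle centred on the listener the radius cancels out of the azimuth; it only matters for paths whose distance to the listener changes, where the same x/y offset maps to a different angle.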

Parameters & defaults

Position / movement

Parameter	Range	Default
movement_enabled	0/1	0
azimuth / elevation	-180…180° / -90…90°	0° / 0°
movement_type	1–10	2 (Circular)
start_az / end_az	-180…180°	-90° / 90°
trajectory_radius	0–1.4	0.8
trajectory_speed	0.2–3.0	1.0
movement_chunk_dur	0.1–2.0 s	0.5 s
xfade_dur	0–0.3 s	0.1 s

Source / HOA

Parameter	Range	Default
stereo_spread_deg	0–180	60
hoa_order	1–5	1
hoa_norm	SN3D / N3D / FuMa	N3D
decode_layout	2.0 / 4.0 / 5.0 / 7.0 / 7.1	2.0
pre_gain_db	-30…0	-9

HRTF / room

Parameter	Options	Default
preset_management	Manual / KEMAR neutral / KEMAR hall / SOFA custom	Manual
sofa_file	base name of a SOFA file	kemar
itd_percent	0–200	100
room_preset	none / hall / living_room / studio / preset1‑4 / legacy1‑3	none

Visualization (Praat picture)

When the script finishes, it draws a multi‑panel visualisation in the Praat Picture window:

Spatial map (top‑down) – unit circle with speaker positions (numbered) and the source trajectory (red line). For static mode, a single red dot with azimuth label.
Parameter panel – source mode (mono/stereo), HOA order/layout, movement parameters (or static position).
Waveforms – original input (grey), binaural left (blue), binaural right (orange).
Summary bar – output name, HRTF preset, room, duration.

Tip: The spatial map shows the trajectory path for animated sources. Speaker numbers correspond to the decode layout (e.g., for 5.0: 1=L30°, 2=R30°, 3=C0°, 4=Ls110°, 5=Rs110°).
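To reproduce the top‑down map outside Praat, an azimuth can be turned back into map coordinates with x = sin(az) and y = cos(az), which is the inverse of the conversion used for the trajectories. The sketch below uses the 5.0 numbering from the tip above. Giving left‑side speakers negative azimuths is an assumption that follows from the 0° = front, +90° = right convention, and the names LAYOUT_5_0 and azimuth_to_xy are illustrative.

```python
import math

# 5.0 layout, numbered as in the guide. Left-side speakers are given
# negative azimuths (assumption: 0 deg = front, +90 deg = right).
LAYOUT_5_0 = {
    1: ("L", -30.0),
    2: ("R", 30.0),
    3: ("C", 0.0),
    4: ("Ls", -110.0),
    5: ("Rs", 110.0),
}


def azimuth_to_xy(az_deg: float, radius: float = 1.0):
    """Map an azimuth to top-down coordinates (x = right, y = front)."""
    az = math.radians(az_deg)
    return radius * math.sin(az), radius * math.cos(az)


for number, (label, az_deg) in LAYOUT_5_0.items():
    x, y = azimuth_to_xy(az_deg)
    print(f"{number} {label:>2}  az = {az_deg:+6.1f}  x = {x:+.3f}  y = {y:+.3f}")
```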

FAQ / troubleshooting

“spat5.hoa.encoder~ not found”

The script expects the Spat5 command‑line tools to be in tools_folder$ (default C:/Users/User/Documents/Max 9/Packages/spat5-x64/media/tools/). Update this path to your Spat5 installation. On macOS/Linux, the path will differ – edit the script accordingly. Spat5 can be installed via the Max package manager or directly from IRCAM.
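Before running the script, the tools folder can be checked with a few lines of Python. This is a sketch rather than part of spat_bridge.py: the function name missing_tools is hypothetical, and the assumption that each tool is a file named after the processor, optionally followed by an extension such as .exe, should be verified against your installation.

```python
from pathlib import Path

# Processor names as listed in this guide.
TOOL_NAMES = ("spat5.hoa.encoder~", "spat5.hoa.decoder~", "spat5.virtualspeakers~")


def missing_tools(tools_folder, names=TOOL_NAMES):
    """Return the tool names for which no matching file exists in tools_folder."""
    folder = Path(tools_folder)
    files = [p.name for p in folder.iterdir() if p.is_file()] if folder.is_dir() else []
    missing = []
    for name in names:
        # Accept the bare name or the name plus an extension (assumption).
        found = any(f == name or f.startswith(name + ".") for f in files)
        if not found:
            missing.append(name)
    return missing


# Example call (replace the argument with your own tools_folder$ value):
# print(missing_tools("C:/Users/User/Documents/Max 9/Packages/spat5-x64/media/tools/"))
```

An empty list means all three tools were found; a list containing every name usually means the folder path itself is wrong.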

Python bridge fails / output not created

Ensure spat_bridge.py is in the same folder as the Praat script. Also check that Python is in your PATH and that the Spat5 tools are executable. The script prints the log file contents on error – examine the Info window for details. On Windows, the bridge adds the Spat5 support folder to PATH automatically.

Movement chunks cause clicks at boundaries

Increase xfade_dur (e.g., to 0.15 s) to smooth transitions. The crossfade is applied to each chunk before concatenation. If the source has strong transients, a longer crossfade may be needed. For very smooth trajectories, reduce movement_chunk_dur (e.g., to 0.2 s) and adjust xfade_dur accordingly, keeping it shorter than the chunk duration.
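To see why a crossfade removes boundary clicks, here is a minimal overlap‑and‑crossfade join in plain Python. It is one common way to do this and is not necessarily how the script implements it. The function name crossfade_concat is hypothetical, samples are plain lists of floats, and the fade is linear.

```python
def crossfade_concat(chunks, xfade_samples):
    """Join chunks (lists of float samples) with a linear crossfade.

    Neighbouring chunks overlap by up to xfade_samples samples, so each
    join shortens the result by the overlap length.
    """
    if not chunks:
        return []
    out = list(chunks[0])
    for chunk in chunks[1:]:
        # Never overlap more samples than either side actually has.
        n = max(0, min(xfade_samples, len(out), len(chunk)))
        for i in range(n):
            w = (i + 1) / (n + 1)          # fade-in weight of the new chunk
            j = len(out) - n + i           # matching sample in the tail of out
            out[j] = out[j] * (1.0 - w) + chunk[i] * w
        out.extend(chunk[n:])
    return out


# Two constant chunks stay constant through the join (no dip, no click).
joined = crossfade_concat([[1.0] * 10, [1.0] * 10], xfade_samples=4)
print(len(joined), min(joined), max(joined))
```

Because the overlap shortens the output, a scheme like this only preserves the total duration if each chunk is exported with an extra xfade_dur of audio at its boundary.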

Pre‑gain calibration

The HOA chain (encode → decode → binaural) adds approximately 7 dB of gain. The default pre‑gain of -9 dB more than offsets this (net about -2 dB), keeping the output level close to the input with a little headroom. If the output is too quiet, raise pre_gain_db (e.g., to -3 dB, for a net of about +4 dB) and watch for clipping. You can also adjust the final output gain in Praat after processing.
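The gain arithmetic is simple addition in decibels. The sketch below takes the approximately 7 dB chain gain quoted in this guide as a constant; treat that figure as a typical value rather than a measurement of your own setup. The function names are illustrative.

```python
CHAIN_GAIN_DB = 7.0  # approximate gain of the HOA chain, as quoted in this guide


def net_gain_db(pre_gain_db: float, chain_gain_db: float = CHAIN_GAIN_DB) -> float:
    """Overall level change from input to binaural output, in dB."""
    return pre_gain_db + chain_gain_db


def db_to_linear(db: float) -> float:
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


for pre_gain in (-9.0, -7.0, -3.0):
    net = net_gain_db(pre_gain)
    print(f"pre_gain_db = {pre_gain:+.1f} -> net {net:+.1f} dB "
          f"(amplitude x {db_to_linear(net):.2f})")
```

A net value above 0 dB means the output can peak higher than the input, so a full‑scale source may clip.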

HOA order vs. decode layout

Higher HOA orders allow better spatial resolution, but the decode layout determines how many speakers are used in the intermediate stage. For binaural output, the speaker layout is virtualised via HRTF – using a 7.1 layout with HOA order 3 gives a more detailed spatial image than order 1. However, higher orders increase channel count and processing time.

SOFA HRTF files

The script expects SOFA files (e.g., kemar.sofa) to be in the Spat5 support folder. The sofa_file parameter should be the base name without extension (e.g., “kemar”). The actual file path is determined by Spat5’s search path. Common HRTF datasets available in SOFA format include KEMAR, CIPIC, ARI, and SADIE II.

Spat5 licensing

Spat5 is free for non‑commercial research and educational use. Commercial users should contact IRCAM for licensing. The command‑line tools are distributed with the Spat5 package and can be used without a Max license.