IRCAM Pan to Binaural — User Guide
Mono/stereo → HOA encode → HOA decode → binaural (3‑stage pipeline). Uses Spat5 command‑line tools via a Python bridge. Supports static positioning and 10 animated trajectories (Linear, Circular, Figure‑8, Spiral In, Spiral Out, Pendulum, Zigzag, Random Walk, Ellipse, Square).
What this does
IRCAM Pan to Binaural is a bridge between Praat and the Spat5 suite (IRCAM – Institut de Recherche et Coordination Acoustique/Musique). It performs a 3‑stage spatial audio pipeline:
- HOA encode – spatialise a mono or stereo source to a Higher‑Order Ambisonics (HOA) soundfield (order 1–5).
- HOA decode – decode the HOA soundfield to a speaker layout (2.0, 4.0, 5.0, 7.0, 7.1).
- Binaural – render the speaker feed to stereo binaural using HRTFs (KEMAR / SOFA).
- Static positioning – place the source at a fixed azimuth/elevation.
- Animated trajectories – 10 movement patterns (linear, circular, figure‑8, spiral in, spiral out, pendulum, zigzag, random walk, ellipse, square).
- HOA order – 1 (4ch) to 5 (36ch) for increasing spatial resolution.
- HRTF presets – KEMAR (neutral / hall), custom SOFA files, ITD scaling.
- Room simulation – hall, living room, studio, and legacy presets.
- Pre‑gain calibration – compensate for HOA chain gain (+7 dB typical).
Quick start
- Install Spat5 (IRCAM) and ensure the command‑line tools (spat5.hoa.encoder~, spat5.hoa.decoder~, spat5.virtualspeakers~) are available.
- Place the Python helper script spat_bridge.py in the same folder as this Praat script.
- In Praat, select exactly one Sound object (mono or stereo).
- Run script… → IRCAM_Pan_to_Binaural.praat.
- Choose movement_enabled (static or animated).
- If static: set azimuth (‑180…180°) and elevation (‑90…90°).
- If animated: select a movement_type (1–10) and adjust parameters (radius, speed, chunk duration).
- Set hoa_order (1–5), decode_layout (2.0, 4.0, 5.0, 7.0, 7.1).
- Select an HRTF preset (KEMAR / neutral, KEMAR / hall, or custom SOFA).
- Click OK. Praat exports chunks (if moving), calls the Python bridge, and returns a binaural stereo Sound object.
Note: the location of the Spat5 tools is set in the script (tools_folder$). The default path is for a typical Max 9 installation – adjust if needed.
IRCAM Spat5 framework
🎛️ What is Spat5?
Spat5 is a suite of spatialisation tools developed at IRCAM (Institut de Recherche et Coordination Acoustique/Musique) in Paris. It provides high‑quality spatial audio processing for composition, performance, and research. The tools are available as Max/MSP externals, standalone command‑line applications, and VST/AU plugins.
This script uses the command‑line versions of three Spat5 processors:
- spat5.hoa.encoder~ – encodes a mono/stereo source into a Higher‑Order Ambisonics soundfield.
- spat5.hoa.decoder~ – decodes an HOA soundfield to a specific speaker layout.
- spat5.virtualspeakers~ – renders a speaker‑layout feed to binaural stereo using HRTFs.
Higher‑Order Ambisonics (HOA)
Ambisonics is a full‑sphere surround sound technique that represents a soundfield using spherical harmonics. Higher‑Order Ambisonics (HOA) increases spatial resolution by using more harmonics:
- Order 1 → 4 channels (basic horizontal + height).
- Order 2 → 9 channels (finer spatial detail).
- Order 3 → 16 channels.
- Order 4 → 25 channels.
- Order 5 → 36 channels.
Higher orders produce a more accurate soundfield but require more processing and channel bandwidth. The script supports orders 1–5.
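The channel counts listed above follow the usual full‑sphere rule, channels = (order + 1)². A minimal sketch of that arithmetic; the function name hoa_channel_count is made up for this illustration and is not part of the script or of Spat5:

```python
def hoa_channel_count(order: int) -> int:
    """Channels in a full-sphere (3D) ambisonic signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


# Orders supported by the script (1 to 5) and their channel counts.
for order in range(1, 6):
    print(f"order {order}: {hoa_channel_count(order)} channels")
```

This is handy for sanity‑checking the channel count of an intermediate HOA WAV file against the chosen hoa_order.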
Normalisation schemes
Ambisonics components can be scaled using different normalisation conventions. Spat5 supports:
- SN3D (Semi‑Normalised, 3D) – standard for many modern Ambisonics applications.
- N3D (Normalised, 3D) – often used in older systems.
- FuMa (Furse‑Malham) – legacy format from early Ambisonics.
Choose the normalisation that your downstream decoder expects. The script defaults to N3D.
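SN3D and N3D differ only by a per‑degree scale factor: a component of degree l is multiplied by √(2l + 1) when going from SN3D to N3D. The sketch below illustrates this, assuming ACN channel ordering (channel index n has degree ⌊√n⌋). Whether the Spat5 tools write their files in ACN order is not stated in this guide, so treat that as an assumption; FuMa uses different weights and ordering and is not covered. The function names are hypothetical.

```python
import math


def acn_degree(acn_index: int) -> int:
    """Ambisonic degree l of a channel, assuming ACN ordering (assumption)."""
    return math.isqrt(acn_index)


def sn3d_to_n3d_gain(acn_index: int) -> float:
    """Scale factor that converts one SN3D channel to N3D: sqrt(2l + 1)."""
    return math.sqrt(2 * acn_degree(acn_index) + 1)


# Gains for a first-order (4-channel) signal: degree 0, then three degree-1 channels.
print([round(sn3d_to_n3d_gain(n), 4) for n in range(4)])
```

Dividing by the same factor converts N3D back to SN3D.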
Speaker layouts
The decoder stage converts the HOA soundfield to a specific speaker layout. Available layouts:
- 2.0 – stereo (L30°, R30°).
- 4.0 – quadraphonic (L30°, R30°, Ls110°, Rs110°).
- 5.0 – standard surround (L30°, R30°, C0°, Ls110°, Rs110°).
- 7.0 – 7‑channel surround (adds L90°, R90°).
- 7.1 – 7.1 surround (includes LFE channel).
The binaural stage then renders these virtual speakers to headphones using HRTFs.
HRTF / binaural rendering
Binaural rendering simulates the way sound reaches the ears by convolving the speaker feed with Head‑Related Transfer Functions (HRTFs). Spat5 supports:
- SOFA (Spatially Oriented Format for Acoustics) – standard file format for HRTF data.
- ITD scaling – scales the Interaural Time Difference (100 % = natural, realistic timing).
- Room presets – add early reflections and late reverberation (hall, living room, studio, etc.).
The script includes presets for the KEMAR manikin (Knowles Electronics Manikin for Acoustic Research), a widely used HRTF dataset.
Why use Spat5?
Spat5 is a professional‑grade spatialisation toolkit used in countless electroacoustic compositions and research projects. By bridging Praat to Spat5, this script gives Praat users access to:
- High‑quality Ambisonics encoding/decoding.
- Accurate binaural rendering with validated HRTFs.
- Room simulation and spatial effects.
- Command‑line automation (no GUI needed).
This makes it ideal for batch processing, algorithmic composition, and research pipelines.
3‑stage pipeline
Stage 1 – HOA encode (spat5.hoa.encoder~)
Mono input → encoded as a point source at (az, el).
Stereo input → encoded as two sources spread by stereo_spread_deg (left = az+spread/2, right = az‑spread/2).
Output: multichannel HOA WAV whose channel count depends on the order (e.g., order 1 → 4ch, order 2 → 9ch, etc.).
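The stereo rule above (left = az + spread/2, right = az − spread/2) can be sketched as follows. The wrap to the ‑180…180° range is an assumption added for this illustration; the guide does not say how the script handles values that leave that range, and both function names are hypothetical.

```python
def wrap_deg(angle: float) -> float:
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def stereo_source_azimuths(az: float, stereo_spread_deg: float) -> tuple:
    """Return (left_az, right_az) for a stereo input centred on az."""
    half = stereo_spread_deg / 2.0
    return wrap_deg(az + half), wrap_deg(az - half)


# Default spread of 60 degrees around the front.
print(stereo_source_azimuths(0.0, 60.0))
```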
Stage 2 – HOA decode (spat5.hoa.decoder~)
HOA WAV → decoded to a speaker layout (2.0, 4.0, 5.0, 7.0, 7.1).
Speaker positions are predefined (e.g., 5.0: L30°, R30°, C0°, Ls110°, Rs110°).
Stage 3 – Binaural (spat5.virtualspeakers~)
Speaker‑layout WAV → convolved with HRTFs to produce stereo binaural output.
Supports SOFA HRTF files, ITD scaling, and room simulation presets.
For moving sources, the sound is split into chunks (movement_chunk_dur seconds), each chunk is processed independently with its own azimuth, then all chunks are concatenated with crossfades.
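To make the chunking concrete, here is a minimal sketch of how a sound could be divided into chunks with one azimuth each, using a linear sweep from start_az to end_az. Sampling the trajectory at each chunk's midpoint is an assumption of this sketch; the guide only states that every chunk gets its own azimuth. The function chunk_plan is hypothetical and is not taken from spat_bridge.py.

```python
import math


def chunk_plan(total_dur, chunk_dur, start_az, end_az):
    """Return a list of (t_start, t_end, azimuth_deg) tuples, one per chunk."""
    # Small epsilon guards against float noise creating an extra empty chunk.
    n_chunks = max(1, math.ceil(total_dur / chunk_dur - 1e-9))
    plan = []
    for i in range(n_chunks):
        t0 = i * chunk_dur
        t1 = min(total_dur, t0 + chunk_dur)
        # Assumption: evaluate the trajectory at the chunk midpoint.
        progress = ((t0 + t1) / 2.0) / total_dur
        plan.append((t0, t1, start_az + (end_az - start_az) * progress))
    return plan


# A 2 s sound, 0.5 s chunks, sweeping from -90 to +90 degrees.
for t0, t1, az in chunk_plan(2.0, 0.5, -90.0, 90.0):
    print(f"{t0:.2f}-{t1:.2f} s -> az {az:+.1f}")
```

Shorter chunks give more azimuth steps per second, at the cost of more calls into the three‑stage pipeline.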
Trajectory types (10)
| Type | Description | Parameters |
|---|---|---|
| Linear | | |
| Circular | | |
| Figure‑8 | | |
| Spiral In | | |
| Spiral Out | | |
| Pendulum | | |
| Zigzag | | |
| Random Walk | | |
| Ellipse | | |
| Square | | |
All trajectories are 2D (elevation fixed to the form value). Coordinates are converted to azimuth using az = arctan2(x, y) (0° = front, +90° = right). The chunk duration determines the time resolution of the animation.
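The coordinate conversion can be sketched in Python, where math.atan2(x, y) with x as the first argument gives 0° at the front and +90° to the right, matching the convention above. The circle parameterisation (starting at the front and moving clockwise) is an assumption made for this illustration, as are the function names.

```python
import math


def xy_to_azimuth_deg(x: float, y: float) -> float:
    """Azimuth in degrees: 0 = front (+y), +90 = right (+x)."""
    return math.degrees(math.atan2(x, y))


def circular_azimuths(n_points: int, radius: float = 0.8) -> list:
    """Azimuths of n_points evenly spaced around a circle (assumed clockwise)."""
    azimuths = []
    for i in range(n_points):
        phase = 2.0 * math.pi * i / n_points
        x = radius * math.sin(phase)
        y = radius * math.cos(phase)
        azimuths.append(xy_to_azimuth_deg(x, y))
    return azimuths


print(xy_to_azimuth_deg(1.0, 0.0))   # source directly to the right
print(xy_to_azimuth_deg(-1.0, 0.0))  # source directly to the left
```

Because azimuth depends only on the direction of (x, y), the radius does not change the azimuth of a circle centred on the listener.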
Parameters & defaults
Position / movement
| Parameter | Range | Default | Description |
|---|---|---|---|
| movement_enabled | 0/1 | 0 | |
| azimuth / elevation | -180…180 / -90…90 | 0 / 0 | |
| movement_type | 1–10 | 2 (Circular) | |
| start_az / end_az | -180…180 | -90 / 90 | |
| trajectory_radius | 0–1.4 | 0.8 | |
| trajectory_speed | 0.2–3.0 | 1.0 | |
| movement_chunk_dur | 0.1–2.0 | 0.5 | |
| xfade_dur | 0–0.3 | 0.1 | |
Source / HOA
| Parameter | Range | Default | Description |
|---|---|---|---|
| stereo_spread_deg | 0–180 | 60 | |
| hoa_order | 1–5 | 1 | |
| hoa_norm | SN3D / N3D / FuMa | N3D | |
| decode_layout | 2.0 / 4.0 / 5.0 / 7.0 / 7.1 | 2.0 | |
| pre_gain_db | -30…0 | -9 | |
HRTF / room
Visualization (Praat picture)
When the script finishes, it draws a 5‑panel visualisation:
- Spatial map (top‑down) – unit circle with speaker positions (numbered) and the source trajectory (red line). For static mode, a single red dot with azimuth label.
- Parameter panel – source mode (mono/stereo), HOA order/layout, movement parameters (or static position).
- Waveforms – original input (grey), binaural left (blue), binaural right (orange).
- Summary bar – output name, HRTF preset, room, duration.
FAQ / troubleshooting
The script expects the Spat5 command‑line tools to be in tools_folder$ (default C:/Users/User/Documents/Max 9/Packages/spat5-x64/media/tools/). Update this path to your Spat5 installation. On macOS/Linux, the path will differ – edit the script accordingly. Spat5 can be installed via the Max package manager or directly from IRCAM.
Ensure spat_bridge.py is in the same folder as the Praat script. Also check that Python is in your PATH and that the Spat5 tools are executable. The script prints the log file contents on error – examine the Info window for details. On Windows, the bridge adds the Spat5 support folder to PATH automatically.
Increase xfade_dur (e.g., to 0.15 s) to smooth transitions. The crossfade is applied to each chunk before concatenation. If the source has strong transients, a longer crossfade may be needed. For very smooth trajectories, reduce movement_chunk_dur (e.g., 0.2 s) and increase xfade_dur accordingly.
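For intuition about what the crossfade does, the sketch below joins chunks with a linear overlap crossfade on plain Python lists of samples. This is one common way to crossfade and may differ from what the script actually does (for example, it might apply fades per chunk without overlapping them); crossfade_concat is a hypothetical helper, not part of the script.

```python
def crossfade_concat(chunks, xfade_samples):
    """Concatenate sample lists, overlapping neighbours with a linear crossfade."""
    out = list(chunks[0])
    for chunk in chunks[1:]:
        # Never overlap more samples than either side provides.
        n = min(xfade_samples, len(out), len(chunk))
        for i in range(n):
            w = (i + 1) / (n + 1)      # fade-in weight of the incoming chunk
            j = len(out) - n + i       # matching sample in the outgoing tail
            out[j] = out[j] * (1.0 - w) + chunk[i] * w
        out.extend(chunk[n:])
    return out


# Two constant chunks stay at constant level through the overlap.
joined = crossfade_concat([[1.0] * 4, [1.0] * 4], xfade_samples=2)
print(len(joined), joined)
```

At a sample rate fs, xfade_dur seconds corresponds to round(xfade_dur * fs) samples. A longer overlap smooths the jump between two azimuths but blurs transients more.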
The HOA chain (encode → decode → binaural) adds approximately +7 dB of gain. The default pre‑gain of -9 dB compensates for this, keeping the output level similar to the input. If the output is too quiet, increase pre_gain_db (e.g., to -3 dB). You can also adjust the final output gain in Praat after processing.
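The level arithmetic above can be checked with a short sketch. It takes the +7 dB chain gain quoted in this guide as a fixed figure, which is only a typical value; the function names are made up for the illustration.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def net_level_db(pre_gain_db: float, chain_gain_db: float = 7.0) -> float:
    """Approximate level change after pre-gain plus the HOA chain gain."""
    return pre_gain_db + chain_gain_db


print(net_level_db(-9.0))             # default pre-gain: -2.0 dB net
print(net_level_db(-3.0))             # louder setting: +4.0 dB net
print(round(db_to_linear(-9.0), 3))   # amplitude factor for -9 dB
```

With the default of -9 dB the output ends up roughly 2 dB below the input, which leaves a little headroom. Pre‑gain values above about -7 dB produce a net boost and are more likely to clip on loud material.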
Higher HOA orders allow better spatial resolution, but the decode layout determines how many speakers are used in the intermediate stage. For binaural output, the speaker layout is virtualised via HRTF – using a 7.1 layout with HOA order 3 gives a more detailed spatial image than order 1. However, higher orders increase channel count and processing time.
The script expects SOFA files (e.g., kemar.sofa) to be in the Spat5 support folder. The sofa_file parameter should be the base name without extension (e.g., “kemar”). The actual file path is determined by Spat5’s search path. Common HRTF datasets available in SOFA format include KEMAR, CIPIC, ARI, and SADIE II.
Spat5 is free for non‑commercial research and educational use. Commercial users should contact IRCAM for licensing. The command‑line tools are distributed with the Spat5 package and can be used without a Max license.