22.2 Synthetic Stem Renderer — v1.0 User Guide
Psychoacoustic approximation of a 22.2‑style synthetic stem renderer. Takes a mono or stereo input and produces a 24‑channel synthetic surround array (Middle, Upper, Lower layers + LFE) and an optional headphone output (clean fold‑down or true binaural using CNMAT KEMAR HRIRs).
What this does
22.2 Synthetic Stem Renderer upmixes a mono or stereo source into a 24‑channel surround array inspired by the 22.2 multichannel format (NHK / cinema). The engine creates a psychoacoustic approximation using subtractive side extraction, band‑limited delays, and matrix‑derived stems – no external plug‑ins or binaries required.
- 24‑channel output – Middle Layer (10 ch), Upper Layer (9 ch), Lower Layer (3 ch), LFE (2 ch).
- Three presets – Cinematic Film (aggressive), Subtle Music (gentle), Wide Mono (optimised for mono sources).
- Two headphone modes – Clean fold‑down (transparent) or True Binaural (CNMAT KEMAR HRIRs).
- Stereo input: front L/R preserved; surround stems derived from side signals (L‑R, R‑L).
- Mono input: real ITD widening (L = original, R = delayed by front_width_ms).
- Surround and height channels are band‑limited, delayed, and gain‑scaled for diffuse placement.
- LFE channels are low‑passed mid signal.
Quick start
- In Praat, select exactly one Sound object (mono or stereo).
- Run script… → 22.2_Synthetic_Stem_Renderer.praat.
- Choose a Preset:
  - Cinematic Film (aggressive surround), Subtle Music (gentle), Wide Mono (optimised for mono).
- For custom mode (preset = Custom), adjust Front_width_ms (mono widening) and other parameters via the form.
- Select Render_22_2_output and/or Render_headphone_output.
- If using headphone output, choose Headphone_mode:
  - Headphone Fold‑down (Clean) – transparent FL/FR + a touch of FC.
  - True Binaural (CNMAT KEMAR) – requires HRIR folder with the specific CNMAT KEMAR files.
- Click OK. The script builds 24 channels, optionally renders the headphone mix, and creates Sound objects named originalname_22_2_Array and originalname_Headphone_Preview or originalname_True_Binaural.
The 3 presets (+ Custom)
| Preset | FC reinforce | Wide gain | Amb gain | Upper gain | Lower gain | Side delay | Description |
|---|---|---|---|---|---|---|---|
| Subtle Music | 0.30 | -10 dB | -8 dB | -12 dB | -18 dB | 25 ms | Gentle |
| Wide Mono | 0.40 | -9 dB | -6 dB | -9 dB | -12 dB | 35 ms | Optimised for mono sources |
| Cinematic Film | 0.50 | -6 dB | -4 dB | -6 dB | -9 dB | 40 ms | Aggressive surround |
Additional fixed parameters: decorr_ms = 7.0, hp_rear = 120 Hz, lp_rear = 6000 Hz, hp_top = 250 Hz, lp_top = 4000 Hz, lfe_hz = 120 Hz.
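As a rough illustration, the preset table can be held as plain data, with the dB values converted to linear amplitude factors. This is a Python sketch rather than the Praat script itself: the dictionary keys and the `db_to_linear` and `linear_gains` helpers are hypothetical names, and it assumes the dB figures are amplitude gains (20·log10 convention).

```python
# Hypothetical Python mirror of the preset table. Key names are illustrative
# and are not identifiers taken from the Praat script.
PRESETS = {
    "Subtle Music": {"c_reinforce": 0.30, "wide_db": -10.0, "amb_db": -8.0,
                     "upper_db": -12.0, "lower_db": -18.0, "side_delay_ms": 25.0},
    "Wide Mono": {"c_reinforce": 0.40, "wide_db": -9.0, "amb_db": -6.0,
                  "upper_db": -9.0, "lower_db": -12.0, "side_delay_ms": 35.0},
    "Cinematic Film": {"c_reinforce": 0.50, "wide_db": -6.0, "amb_db": -4.0,
                       "upper_db": -6.0, "lower_db": -9.0, "side_delay_ms": 40.0},
}


def db_to_linear(gain_db):
    """Convert an amplitude gain in dB to a linear factor (assumes 20*log10)."""
    return 10.0 ** (gain_db / 20.0)


def linear_gains(preset_name):
    """Return the linear gain for every '*_db' entry of a preset."""
    preset = PRESETS[preset_name]
    return {key[:-3]: db_to_linear(value)
            for key, value in preset.items() if key.endswith("_db")}
```

For example, `linear_gains("Cinematic Film")["wide"]` is roughly 0.5, since -6 dB is close to half amplitude.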
24‑channel layout
Middle Layer (10 channels)
1. FL – Front Left
2. FR – Front Right
3. FC – Front Center
4. FWL – Front Wide Left
5. FWR – Front Wide Right
6. SiL – Side Left
7. SiR – Side Right
8. BL – Back Left
9. BR – Back Right
10. BC – Back Center
Upper Layer (9 channels)
11. TpFL – Top Front Left
12. TpFR – Top Front Right
13. TpFC – Top Front Center
14. TpSiL – Top Side Left
15. TpSiR – Top Side Right
16. TpBL – Top Back Left
17. TpBR – Top Back Right
18. TpBC – Top Back Center
19. TpC – Top Center
Lower Layer (3 channels)
20. BFL – Bottom Front Left
21. BFR – Bottom Front Right
22. BFC – Bottom Front Center
LFE (2 channels)
23. LFE1
24. LFE2
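When post-processing the rendered array outside Praat, it can help to keep the channel order in one place. The sketch below lists the labels in the order given above; the `channel_number` helper is a hypothetical convenience and is not part of the script.

```python
# Channel labels in output order, grouped by layer as in the layout above.
MIDDLE = ["FL", "FR", "FC", "FWL", "FWR", "SiL", "SiR", "BL", "BR", "BC"]
UPPER = ["TpFL", "TpFR", "TpFC", "TpSiL", "TpSiR", "TpBL", "TpBR", "TpBC", "TpC"]
LOWER = ["BFL", "BFR", "BFC"]
LFE = ["LFE1", "LFE2"]

CHANNELS = MIDDLE + UPPER + LOWER + LFE


def channel_number(label):
    """Return the 1-based channel number for a label such as 'TpC'."""
    return CHANNELS.index(label) + 1
```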
Processing pipeline
- FL, FR – original (or mono→ITD widened).
- FC – Mid (L+R) × c_reinforce.
- Amb_L, Amb_R – side signals: L − R×0.5 and R − L×0.5.
- FW_L, FW_R – blend of FL/FR and Amb (0.6 each).
- Mid – (L+R)/2 (used for BC, TpBC, TpC, LFE).
- Each channel is built from a base stem, optionally filtered (HP/LP), delayed, and gain‑scaled.
- Surround/height channels use band‑limiting (HP 120 Hz / 250 Hz, LP 6000 Hz / 4000 Hz) to reduce directness.
- LFE channels are low‑passed at 120 Hz (no delay).
- All channels are peak‑normalised to the specified headroom.
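The base stems listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script's code: signals are plain lists of samples, `derive_stems` is a hypothetical name, and FC is read as Mid = (L+R)/2 scaled by `c_reinforce`.

```python
def derive_stems(left, right, c_reinforce):
    """Derive the base stems from a stereo pair given as equal-length lists."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    # Mid as defined in the pipeline list: (L+R)/2.
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    # Subtractive side extraction: each side minus half of the opposite side.
    amb_l = [l - 0.5 * r for l, r in zip(left, right)]
    amb_r = [r - 0.5 * l for l, r in zip(left, right)]
    # Front centre: Mid scaled by the preset's FC reinforce value (assumed reading).
    fc = [c_reinforce * m for m in mid]
    # Front wide: FL/FR blended with the matching Amb stem, 0.6 each.
    fw_l = [0.6 * l + 0.6 * a for l, a in zip(left, amb_l)]
    fw_r = [0.6 * r + 0.6 * a for r, a in zip(right, amb_r)]
    return {"Mid": mid, "Amb_L": amb_l, "Amb_R": amb_r,
            "FC": fc, "FW_L": fw_l, "FW_R": fw_r}
```

Note that a centred source (L = R) leaves half of its level in Amb_L and Amb_R, because only half of the opposite channel is subtracted.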
The decorrelation offset (decorr_ms = 7 ms) is applied symmetrically to many pairs (e.g., FWL/FWR, SiL/SiR, TpFL/TpFR) to create a natural stereo width in the surround field.
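A minimal Python sketch of the pair delay follows. It uses the left = base delay, right = base delay + decorr_ms scheme described in the FAQ; the helper names are hypothetical, and rounding to whole samples and zero-padding at the start are assumptions about how a delay might be realised.

```python
def delay_samples(delay_ms, sample_rate):
    """Convert a delay in milliseconds to a whole number of samples."""
    return int(round(delay_ms * sample_rate / 1000.0))


def delay_signal(signal, n):
    """Delay a list of samples by n samples, keeping the original length."""
    n = max(0, min(n, len(signal)))
    return [0.0] * n + list(signal[:len(signal) - n])


def pair_delays(base_ms, decorr_ms=7.0):
    """Return (left_ms, right_ms) for one left/right channel pair."""
    return base_ms, base_ms + decorr_ms
```

With the Subtle Music side delay of 25 ms, `pair_delays(25.0)` gives 25 ms for the left channel and 32 ms for the right.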
Parameters & defaults
Preset
Select one of the three presets to load pre‑configured gains and delays. For custom mode (preset = Custom), the script uses the fixed internal parameters (as in Cinematic Film) but allows manual adjustment of the form fields.
Front widening (mono input only)
| Parameter | Range | Default | Description |
|---|---|---|---|
| Front_width_ms | 0–2 ms | 0.35 ms | ITD delay applied to the right channel when the input is mono |
Output options
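| Parameter | Options | Default | Description |
|---|---|---|---|
| Render_22_2_output | yes/no | yes | Create the 24‑channel Sound object (originalname_22_2_Array) |
| Render_headphone_output | yes/no | yes | Create the headphone mix in the selected Headphone_mode |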
Headphone mix
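| Parameter | Options | Default | Description |
|---|---|---|---|
| Headphone_mode | Fold‑down / True Binaural | Fold‑down | Clean fold‑down or binaural render with CNMAT KEMAR HRIRs |
| Hrir_folder | folder path | Kemar_HRIR/ | Folder containing the required CNMAT KEMAR HRIR WAV files |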
Master
Headphone modes
🎧 Headphone Fold‑down (Clean)
A deliberately EQ‑transparent reference consisting of:
- FL + FR (full gain).
- FC at 0.10 gain (subtle centre reinforcement).
- No delayed, filtered, or subtractive surround feeds – preserves original timbre.
Use this mode for a clean, neutral headphone preview of the front image.
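The fold-down described above amounts to a two-line mix. The Python sketch below assumes FC is added equally to both ears at the stated 0.10 gain; `fold_down` is a hypothetical name.

```python
def fold_down(fl, fr, fc, fc_gain=0.10):
    """Return (left, right) headphone signals: FL/FR at full gain plus a little FC."""
    left = [l + fc_gain * c for l, c in zip(fl, fc)]
    right = [r + fc_gain * c for r, c in zip(fr, fc)]
    return left, right
```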
🎧 True Binaural (CNMAT KEMAR)
Experimental binaural render using the CNMAT KEMAR HRIR dataset. Only speaker positions with verified HRIR pairs are used:
- FL, FR, FC, SiL, SiR, BL, BR, TpFL, TpFR, TpSiL, TpSiR, TpBL, TpBR, TpC.
- Each stem is convolved with the corresponding left/right HRIR, then summed into the binaural ear signals.
- Gain scaling per role: FC (0.12), SiL/SiR (0.10), BL/BR (0.08), TpFL/TpFR (0.06), TpSiL/TpSiR (0.05), TpBL/TpBR (0.04), TpC (0.03).
If any required HRIR file is missing, the binaural render aborts cleanly rather than producing a corrupted output. The required files are listed in the script (see source).
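The convolve-and-sum step and the abort-on-missing behaviour can be sketched in Python as follows. Several things here are assumptions: the `hrirs` dictionary stands in for HRIR files on disk, the function names are hypothetical, FL and FR are given unity gain because no scaling is listed for them, and the direct-form convolution is only practical for short test signals.

```python
# Per-role gains from the list above. FL/FR are not listed, so this sketch
# assumes unity gain for them.
ROLE_GAIN = {"FC": 0.12, "SiL": 0.10, "SiR": 0.10, "BL": 0.08, "BR": 0.08,
             "TpFL": 0.06, "TpFR": 0.06, "TpSiL": 0.05, "TpSiR": 0.05,
             "TpBL": 0.04, "TpBR": 0.04, "TpC": 0.03}


def convolve(signal, impulse):
    """Direct-form convolution of two lists of samples."""
    if not signal or not impulse:
        return []
    out = [0.0] * (len(signal) + len(impulse) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse):
            out[i + j] += s * h
    return out


def render_binaural(stems, hrirs):
    """Sum HRIR-convolved stems into (ear_left, ear_right).

    stems: {label: samples}; hrirs: {label: (left_ir, right_ir)}.
    Raises before any rendering if an HRIR pair is missing.
    """
    missing = [name for name in stems if name not in hrirs]
    if missing:
        raise FileNotFoundError("missing HRIR for: " + ", ".join(missing))
    if not stems:
        return [], []
    length = max(len(stems[n]) + max(len(hrirs[n][0]), len(hrirs[n][1])) - 1
                 for n in stems)
    ear_l = [0.0] * length
    ear_r = [0.0] * length
    for name, stem in stems.items():
        gain = ROLE_GAIN.get(name, 1.0)
        ir_l, ir_r = hrirs[name]
        for ear, ir in ((ear_l, ir_l), (ear_r, ir_r)):
            for k, value in enumerate(convolve(stem, ir)):
                ear[k] += gain * value
    return ear_l, ear_r
```

Checking for missing pairs before the loop is what keeps a partial render from ever being written.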
FAQ / troubleshooting
Check that Render_22_2_output is enabled. Also ensure that the source sound has reasonable amplitude – the upmixer preserves levels but if the input is very quiet, the output will also be quiet. The script normalises to headroom_dBFS.
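To make the normalisation step concrete, here is a small Python sketch of peak normalisation to a headroom given in dBFS. It assumes one scale factor shared by all channels, which preserves the balance between them; whether the script scales globally or per channel is not stated here, and `normalise_peak` is a hypothetical name.

```python
def normalise_peak(channels, headroom_dbfs):
    """Scale all channels by one factor so the largest peak sits at headroom_dbfs.

    channels: list of sample lists; headroom_dbfs: e.g. -1.0 for 1 dB of headroom.
    """
    peak = max((abs(s) for ch in channels for s in ch), default=0.0)
    if peak == 0.0:
        # Silent input: nothing to scale, return copies unchanged.
        return [list(ch) for ch in channels]
    target = 10.0 ** (headroom_dbfs / 20.0)
    scale = target / peak
    return [[s * scale for s in ch] for ch in channels]
```

Because the factor depends only on the loudest sample, a very quiet input is raised to the same peak level as a loud one.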
Download the CNMAT KEMAR HRIR set (available from CNMAT or via the AudioTools distribution). Place the required WAV files in the folder specified in Hrir_folder. The required filenames are hardcoded in the script – ensure they match exactly.
The script applies an ITD delay to the right channel (front_width_ms). Values between 0.2 ms and 1 ms work well for widening. The Wide Mono preset uses 0.35 ms. Increase this value for a wider image, but be aware that delays above 1 ms exceed the natural interaural range and may pull the image toward the undelayed left channel rather than widen it.
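The mono widening can be sketched in Python as below. It is an illustration only: `widen_mono` is a hypothetical name, the delay is rounded to whole samples, and the right channel is truncated to the input length.

```python
def widen_mono(mono, front_width_ms, sample_rate):
    """Return (left, right): left is the original, right is delayed by front_width_ms."""
    n = int(round(front_width_ms * sample_rate / 1000.0))
    n = max(0, min(n, len(mono)))
    left = list(mono)
    right = [0.0] * n + list(mono[:len(mono) - n])
    return left, right
```

At 48 kHz the default 0.35 ms corresponds to 17 samples (16.8 before rounding), so very small changes to front_width_ms may round to the same delay.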
The 22.2 format (22 full‑range channels + 2 LFE) is a standard for immersive audio (NHK, cinema). This synthetic renderer is a psychoacoustic approximation – it does not claim physical accuracy, but it provides a practical way to upmix any source into a multichannel array for experimentation or playback on 24‑channel systems.
Many channel pairs (e.g., FWL/FWR, SiL/SiR, TpFL/TpFR) receive a small extra delay offset (decorr_ms = 7 ms). This creates a natural stereo width in the surround field without phase cancellation. The offset is applied to one side of each pair: left channel = base delay, right channel = base delay + decorr_ms.
Surround and height channels are high‑passed (120 Hz / 250 Hz) to remove low‑frequency buildup, and low‑passed (6000 Hz / 4000 Hz) to reduce directness and prevent tonal clutter. This makes them sound diffuse and “behind” the listener, while the front channels retain full frequency response.
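The band-limiting idea can be illustrated with a pair of simple filters. The Python sketch below uses one-pole high-pass and low-pass filters as a stand-in; this is not the filter the Praat script uses, and its slopes are gentler than a typical band-pass. The cutoffs come from the fixed parameters (hp_rear/lp_rear, hp_top/lp_top); the function names and the `BANDS_HZ` table are hypothetical.

```python
import math

# (high-pass Hz, low-pass Hz) per role, from the fixed parameters.
BANDS_HZ = {"rear": (120.0, 6000.0), "top": (250.0, 4000.0)}


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """One-pole low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in signal:
        state += a * (x - state)
        out.append(state)
    return out


def one_pole_highpass(signal, cutoff_hz, sample_rate):
    """One-pole high-pass, formed as the input minus its low-passed copy."""
    low = one_pole_lowpass(signal, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(signal, low)]


def band_limit(signal, role, sample_rate):
    """High-pass then low-pass a stem using the cutoffs for 'rear' or 'top'."""
    hp_hz, lp_hz = BANDS_HZ[role]
    high_passed = one_pole_highpass(signal, hp_hz, sample_rate)
    return one_pole_lowpass(high_passed, lp_hz, sample_rate)
```

Feeding a constant (DC) signal through `band_limit` shows the high-pass at work: the output decays toward zero, which is the low-frequency buildup being removed.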