PCA Tone Shaper – User Guide

Applies dynamic spectral shaping by using Principal Component Analysis to derive time-varying equalization curves from acoustic features, processing audio in chunks with adaptive three-band filtering.

Category: Synthesis / Processing
Praat Script: PCA_Tone_Shaper.praat

Contents:

- What this does
- Quick start
- Parameters
- Outputs

What this does

This Praat script uses Principal Component Analysis (PCA) to build an adaptive spectral equalizer that follows the evolving acoustic characteristics of the audio.

The script first extracts eight acoustic features per analysis frame: the first three formants (F1, F2, F3), two formant ratios (F2/F1 and F3/F2), fundamental frequency (F0), intensity, and harmonics-to-noise ratio (HNR). These features are standardized into z-scores and projected into a three-dimensional principal component space, where the components typically capture spectral tilt, presence/harmonicity, and body/low-end characteristics.

For each processing chunk (user-defined duration, default 200 ms), the script computes the mean of each of the first three principal components and maps these means to gain values for three frequency bands: low (0–200 Hz), mid (200–2000 Hz), and high (2000–8000 Hz).

The audio is split into these bands using Hann-windowed spectral filtering, each band is scaled by its computed gain factor (constrained between 0.5× and 1.5×), and the bands are recombined. The result is a dynamically equalized output whose tonal balance shifts from chunk to chunk, emphasizing different spectral regions according to the acoustic character of each moment in the audio.
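The standardization and gain-mapping steps described above can be sketched in a few lines of Python. This is only an illustration, not the script's code: the guide does not give the exact mapping formula, so the linear formula in `band_gain`, the use of the population standard deviation in `zscore`, and the PC1/PC2/PC3 to low/mid/high assignment in `chunk_gains` are all assumptions, and the function names are hypothetical. Only the 0.5× to 1.5× gain limits and the 0.8 default strength come from the guide.

```python
import math


def zscore(values):
    """Standardize one feature track to zero mean and unit variance.

    Uses the population standard deviation (an assumption).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0.0:
        # A constant feature carries no information; map it to zeros.
        return [0.0] * n
    return [(v - mean) / sd for v in values]


def band_gain(pc_mean, pca_strength=0.8, lo=0.5, hi=1.5):
    """Map a chunk's mean PC score to a band gain, clamped to [lo, hi].

    The linear formula is hypothetical; only the clamp range is documented.
    """
    gain = 1.0 + 0.5 * pca_strength * pc_mean
    return max(lo, min(hi, gain))


def chunk_gains(pc_frames, pca_strength=0.8):
    """Average (pc1, pc2, pc3) over a chunk's frames and map to band gains.

    The PC1->low, PC2->mid, PC3->high assignment is an assumption.
    """
    n = len(pc_frames)
    means = [sum(frame[i] for frame in pc_frames) / n for i in range(3)]
    bands = ("low", "mid", "high")
    return {band: band_gain(m, pca_strength) for band, m in zip(bands, means)}
```

Under this assumed formula a mean score of 0 leaves a band untouched (gain 1.0), while large positive or negative scores saturate at the documented limits of 1.5× and 0.5×.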

Quick start

In Praat, select exactly one Sound object.
Run script… → PCA_Tone_Shaper.praat.
Set chunk duration (default 200 ms) to control the temporal resolution of the adaptive equalization.
Adjust PCA strength (default 0.8) to control the intensity of the tone shaping effect.
Customize frequency band crossover points if needed (defaults: 200 Hz and 2000 Hz).
Configure analysis parameters (formant analysis, pitch range) as appropriate for your audio.
Enable play_result if you want to hear the output immediately.
Click OK.
The output object, named [OriginalName]_PCATone, is created alongside the original sound.

Parameters (form fields)

Name (GUI)	Type	Default	Description
chunk_ms	positive	200	Duration in milliseconds of each processing chunk. Smaller values create more temporally detailed tone shaping; larger values produce smoother, more stable equalization curves.
frame_step_seconds	positive	0.01	Time step in seconds between consecutive analysis frames (the default of 0.01 s is 10 ms).
max_formant_hz	positive	5500	Maximum formant frequency in Hz for formant analysis (adjust based on voice type or instrument range).
n_formants	integer	5	Number of formants to track during analysis (first three are used as features).
f0_min	positive	75	Minimum fundamental frequency (Hz) for pitch analysis.
f0_max	positive	600	Maximum fundamental frequency (Hz) for pitch analysis.
pca_strength	positive	0.8	Overall scaling factor (above 0, up to 1.5; a positive field does not accept 0) controlling how strongly PCA-derived parameters influence band gains. Higher values create more dramatic tone shaping.
low_hi_crossover1_hz	positive	200	Crossover frequency in Hz between low and mid bands. Defines the upper boundary of the low-frequency band.
low_hi_crossover2_hz	positive	2000	Crossover frequency in Hz between mid and high bands. Defines the transition from midrange to high frequencies.
high_band_top_hz	positive	8000	Upper frequency limit in Hz for the high band (must be below the Nyquist frequency, i.e. half the sample rate).
headroom	positive	0.97	Peak amplitude scaling factor (above 0, up to 1.0) applied to the final output to prevent clipping.
play_result	boolean	0 (false)	If enabled (1), automatically plays the processed audio after completion.
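
The constraints in the table can be checked before running the script. The Python sketch below is a hypothetical helper, not part of PCA_Tone_Shaper.praat: `check_settings` and its argument names are illustrative. It checks the band-edge ordering, the Nyquist limit and the headroom range, and works out how many samples and analysis frames fall into one chunk.

```python
def check_settings(sample_rate_hz, chunk_ms=200.0, frame_step_seconds=0.01,
                   crossover1_hz=200.0, crossover2_hz=2000.0,
                   high_band_top_hz=8000.0, headroom=0.97):
    """Return (problems, info) for a set of form values.

    problems is a list of human-readable messages (empty if all is well);
    info holds derived quantities for one processing chunk.
    """
    problems = []
    nyquist_hz = sample_rate_hz / 2.0

    # Band edges must increase: low/mid crossover < mid/high crossover < top.
    if not (0.0 < crossover1_hz < crossover2_hz < high_band_top_hz):
        problems.append("band edges must be strictly increasing")

    # The guide requires the high-band top to lie below the Nyquist frequency.
    if high_band_top_hz >= nyquist_hz:
        problems.append("high_band_top_hz must be below Nyquist")

    if not (0.0 < headroom <= 1.0):
        problems.append("headroom must be above 0 and at most 1")

    frames_per_chunk = chunk_ms / (frame_step_seconds * 1000.0)
    if frames_per_chunk < 1.0:
        problems.append("chunk is shorter than one analysis frame step")

    info = {
        "samples_per_chunk": round(sample_rate_hz * chunk_ms / 1000.0),
        "frames_per_chunk": frames_per_chunk,
    }
    return problems, info
```

With the defaults and a 44100 Hz sound, one 200 ms chunk spans 8820 samples and about 20 analysis frames. With a 16000 Hz sound the default high_band_top_hz of 8000 equals the Nyquist frequency, so it would need to be lowered.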

Outputs

Object name: [OriginalName]_PCATone
Type: Sound (mono; stereo input is converted to mono for processing).
Feedback: Processing summary printed to Praat Info window, including:
- Audio properties (duration, sample rate)
- Chunk duration and processing parameters
- Frequency band definitions (Low/Mid/High ranges)
- PCA strength setting
- Variance explained by first three principal components
Peak scaling: Output is automatically scaled to the specified headroom value (default 97%).
Playback: Plays automatically only if play_result is enabled.
Both original and processed sounds remain in the Objects list.
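
Two of the items above reduce to simple arithmetic, sketched here in Python for illustration. The function names are hypothetical, the eigenvalues in the usage note are made-up numbers rather than script output, and the script itself may compute these quantities differently (for example with Praat's built-in peak scaling).

```python
def scale_peak(samples, headroom=0.97):
    """Rescale samples so the largest absolute value equals headroom."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent input: nothing to scale.
        return list(samples)
    factor = headroom / peak
    return [s * factor for s in samples]


def variance_explained(eigenvalues, k=3):
    """Fraction of total variance carried by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)
```

For instance, with the illustrative eigenvalues 4, 2, 1, 0.5, 0.25, 0.125, 0.0625 and 0.0625 (which sum to 8, one unit per standardized feature), the first three components would explain 7/8 = 87.5% of the variance.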