Partial Editing & Resynthesis — User Guide
A sinusoidal modeling tool that deconstructs audio into pure sine waves (partials), allowing for independent pitch shifting, formant manipulation, and textural jittering.
What this does
This script performs analysis and resynthesis in the style of the McAulay-Quatieri (MQ) sinusoidal model. Instead of treating audio as a waveform or a static spectrum, it views sound as a collection of individual sine waves (partials) that evolve over time.
It analyzes the sound in small windows, finds the most prominent frequencies (peaks), and discards the rest (noise). It then reconstructs the sound using a bank of oscillators. This allows you to:
- Purify Audio: Remove breathiness or noise, leaving only the tonal content.
- Shift Pitch/Formants: Change the pitch without changing the speed, or change the vocal tract size (chipmunk/giant) without changing the pitch.
- Texturize: Add random "jitter" to the frequencies and amplitudes to create organic, shimmering, or glassy textures.
Quick start
- Select a Sound object in the Praat Objects window.
- Run script… → Partial Editing & Resynthesis.praat.
- Choose a Preset:
  - Clean Resynth: Faithful reconstruction (removes noise).
  - Formant Shift Down: Turns voices into "giants."
  - Glassy Shimmer: Adds high jitter for a texturizing effect.
- Click OK.
max_partials_per_frame setting.
Theory: Sinusoidal Modeling
Fourier's Theorem
Any periodic waveform can be represented as a sum of sine waves at integer multiples of its fundamental frequency, each with its own amplitude and phase.
[Image of Fourier Series decomposition]
The Process
- Windowing: The sound is sliced into overlapping frames (e.g., every 15ms).
- FFT Analysis: Each frame is converted to a Spectrum.
- Peak Picking: The script scans the spectrum and picks the loudest $N$ peaks (partials).
- Transformation: The frequency and amplitude of these peaks can be mathematically altered (shifted, scaled, jittered).
- Oscillator Bank: The script generates fresh sine waves for these new values.
- Overlap-Add: The generated sine waves are windowed (Hanning) and layered on top of each other to rebuild the continuous sound.
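The overlap-add step can be checked numerically. The sketch below is not taken from the script; it assumes a periodic Hann window and a 1 kHz sample rate (chosen only so that the default 60 ms window and 15 ms hop become 60 and 15 samples), and the helper names `hann` and `overlap_add` are hypothetical.

```python
import math

def hann(n_samples):
    """Periodic Hann window of n_samples points."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n_samples)
            for i in range(n_samples)]

def overlap_add(frames, hop):
    """Sum equal-length frames placed `hop` samples apart."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        start = k * hop
        for i, value in enumerate(frame):
            out[start + i] += value
    return out

# Assumed 1 kHz rate: 60 ms window = 60 samples, 15 ms hop = 15 samples.
window = hann(60)
envelope = overlap_add([window] * 12, hop=15)
steady = envelope[60:-60]  # drop the fade-in and fade-out edges
```

With a hop of one quarter of the window, the overlapping Hann windows sum to a constant (2.0 here), so the frame boundaries add no amplitude ripple. Whether the script compensates for that constant gain is not stated in this guide.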
Parameters & Presets
Analysis Parameters
| Parameter | Default | Description |
|---|---|---|
| window_length | 0.060s | Size of the analysis chunk. Longer = better frequency resolution (bass), worse time resolution (transients). |
| hop_size | 0.015s | How often to analyze. Smaller = smoother changes but slower processing. |
| max_partials... | 10 | Density. How many sine waves to generate per frame. 5 = hollow/sparse. 20 = rich/full. |
| min/max_frequency | 60-8000 Hz | Range of detection. Frequencies outside this range are ignored. |
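To make the window/hop trade-off concrete, the following sketch computes the nominal figures implied by the defaults. It is plain arithmetic, not the script's code: the function name `analysis_grid` is hypothetical, and the real bin spacing may differ if the FFT zero-pads the frame or if the window shape widens each peak.

```python
def analysis_grid(window_length=0.060, hop_size=0.015):
    """Return (nominal bin spacing in Hz, frames per second, overlap fraction)."""
    bin_spacing_hz = 1.0 / window_length      # longer window -> finer spacing
    frames_per_second = 1.0 / hop_size        # smaller hop -> more frames
    overlap = 1.0 - hop_size / window_length  # fraction shared with next frame
    return bin_spacing_hz, frames_per_second, overlap

spacing, rate, overlap = analysis_grid()
```

With the defaults this gives a spacing of about 16.7 Hz, about 66.7 frames per second, and 75% overlap between consecutive frames.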
Transformation Parameters
| Parameter | Description |
|---|---|
| transpose_semitones | Pitch shift. +12 = Octave Up. -12 = Octave Down. |
| formant_shift_ratio | Timbre shift. 1.0 = Normal. 1.2 = Smaller vocal tract (Chipmunk/Child). 0.8 = Larger vocal tract (Giant). |
| amplitude_scale | Global volume control (multiplier). |
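The semitone and amplitude settings map onto simple multipliers. The sketch below shows the standard equal-tempered conversion; the function names are hypothetical. `formant_shift_ratio` is deliberately left out, because this guide does not spell out how the script applies it.

```python
def semitones_to_ratio(semitones):
    """Equal-tempered frequency ratio: +12 semitones doubles the frequency."""
    return 2.0 ** (semitones / 12.0)

def transform_partial(freq_hz, amp, transpose_semitones=0.0, amplitude_scale=1.0):
    """Apply transposition and global gain to one (frequency, amplitude) pair."""
    new_freq = freq_hz * semitones_to_ratio(transpose_semitones)
    new_amp = amp * amplitude_scale
    return new_freq, new_amp

octave_up = transform_partial(440.0, 0.5, transpose_semitones=12.0)
octave_down = transform_partial(440.0, 0.5, transpose_semitones=-12.0)
```

Because every partial is multiplied by the same ratio, the spacing between harmonics scales along with the fundamental.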
Jitter (Texture) Parameters
- freq_jitter_range: (Hz) Adds random $\pm Hz$ to every partial. Creates detuning/chorus effects.
- amp_jitter_range: (dB) Adds random volume fluctuation. Creates tremolo/roughness.
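One way to picture the two jitter settings is the sketch below. It assumes a uniform random distribution and the usual amplitude convention of 20 dB per factor of ten; neither is confirmed by this guide, and `jitter_partial` is a hypothetical name.

```python
import random

def jitter_partial(freq_hz, amp, freq_jitter_range, amp_jitter_range, rng=random):
    """Return a jittered (frequency, amplitude) pair.

    Assumption: offsets are drawn uniformly from [-range, +range].
    """
    freq_offset_hz = rng.uniform(-freq_jitter_range, freq_jitter_range)
    gain_db = rng.uniform(-amp_jitter_range, amp_jitter_range)
    # Convert the dB offset to a linear amplitude multiplier.
    return freq_hz + freq_offset_hz, amp * 10.0 ** (gain_db / 20.0)

rng = random.Random(0)  # seeded only so the example is repeatable
jittered = [jitter_partial(440.0, 1.0, 5.0, 3.0, rng) for _ in range(100)]
```

Since each partial in each frame gets its own random draw, larger ranges push the result from mild detuning towards a diffuse cloud of sine waves.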
Algorithm Deep Dive
1. Peak Picking & Suppression
How does the script find the "important" parts of the sound?
The suppress_bins parameter is crucial here. It prevents the script from picking the "shoulders" of a strong peak as separate partials, ensuring it moves on to different harmonic frequencies.
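The effect of suppression is easiest to see on a toy spectrum. The sketch below uses a greedy strategy (take the loudest bin, blank its neighbours, repeat); this is an assumption about how such a picker could work, not a copy of the script's code, and `pick_peaks` is a hypothetical name. Magnitudes are assumed to be non-negative.

```python
def pick_peaks(magnitudes, max_partials, suppress_bins):
    """Greedy peak picking: take the loudest bin, blank its neighbours, repeat."""
    remaining = list(magnitudes)
    peaks = []
    if not remaining:
        return peaks
    for _ in range(max_partials):
        best = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[best] <= 0.0:
            break  # nothing left above zero
        peaks.append(best)
        # Blank the winner and `suppress_bins` bins on each side of it.
        lo = max(0, best - suppress_bins)
        hi = min(len(remaining), best + suppress_bins + 1)
        for i in range(lo, hi):
            remaining[i] = 0.0
    return sorted(peaks)

# Two true peaks (bins 3 and 8); bins 2 and 4 are shoulders of the first.
spectrum = [0.0, 0.1, 0.6, 1.0, 0.7, 0.1, 0.0, 0.2, 0.5, 0.2, 0.0]
with_suppression = pick_peaks(spectrum, max_partials=2, suppress_bins=1)
without_suppression = pick_peaks(spectrum, max_partials=2, suppress_bins=0)
```

With `suppress_bins=1` the picker returns bins 3 and 8, the two real peaks. With `suppress_bins=0` it returns bins 3 and 4, spending the second partial on a shoulder of the first.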
2. Resynthesis Formula
For every extracted peak, a sine wave is generated using the formula:
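The formula itself is not reproduced in this guide. A standard form for one windowed partial, given here as an assumption rather than as the script's exact expression, is:

$$
s_k(t) = A_k \, w(t) \, \sin\!\left(2\pi f_k t + \varphi_k\right)
$$

where $A_k$ and $f_k$ are the (possibly transformed) amplitude and frequency of peak $k$, $w(t)$ is the frame's window, and $\varphi_k$ is the starting phase. The output of one frame is then the sum over its picked peaks, $y(t) = \sum_{k=1}^{N} s_k(t)$, with $N$ at most the maximum number of partials per frame.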
Formants vs. Pitch
Pitch Shifting: Multiplies all frequencies by a constant factor. The harmonic relationships remain identical.
Formant Shifting: In this script it is implemented much like pitch shifting, but it is intended to be used either to counteract a transposition or on its own to simulate a change in physical size.
Applications
1. De-Noising / "Spectral Cleaning"
Preset: Clean Resynth / Sparse Partials
Because the script only resynthesizes the loud peaks (harmonics) and ignores the low-level valleys (noise floor), the output is often a "cleaner," albeit more synthetic, version of the original. Great for isolating melody from noisy backgrounds.
2. "Ghost" or "Whisper" Textures
Preset: Whisper Ghost
By using a low number of partials (sparse) and high jitter, the sound loses its tonal center and becomes a diffuse cloud of sine waves. This is excellent for horror sound design or ambient pads.
3. Robotic / Sci-Fi Voices
Preset: Robotic
Setting jitter to 0 and max_partials to a moderate number (10) gives a steady, jitter-free reconstruction that is still phase-incoherent. The lack of phase alignment between frames gives speech a distinct "metallic" or "vocoded" character without using a carrier signal.
Limitations
- Phase Coherence: The script generates new sine waves every frame (15ms). It does not "track" partials across frames. This means the phase is reset every frame, leading to a "smearing" of transients. Drum beats will lose their punch; sustained tones work best.
- Processing Time: It is computationally expensive. It is an offline process, not real-time.
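The phase-coherence limitation above can be put into numbers. The sketch below assumes that each frame's oscillators restart at zero phase at the start of the frame, which is one reading of "the phase is reset every frame"; `phase_jump` is a hypothetical helper, not part of the script.

```python
import math

def phase_jump(freq_hz, hop_size=0.015):
    """Phase (radians, in [0, 2*pi)) a continuous sinusoid reaches after one hop.

    A frame that restarts at zero phase is out of step with the previous
    frame by this amount.
    """
    cycles = freq_hz * hop_size
    return 2.0 * math.pi * (cycles % 1.0)

mismatch_100hz = phase_jump(100.0)
```

At the default 15 ms hop, a 100 Hz partial completes one and a half cycles per hop, so consecutive frames are half a cycle (about π radians) out of step and partly cancel where they overlap. This is consistent with the smeared transients and metallic colouration described above.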