Partial Editing & Resynthesis — User Guide

A sinusoidal modeling tool that deconstructs audio into pure sine waves (partials), allowing for independent pitch shifting, formant manipulation, and textural jittering.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 0.2 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script performs Analysis-by-Synthesis based on the McAulay-Quatieri (MQ) paradigm. Instead of treating audio as a waveform or a static spectrum, it views sound as a collection of individual sine waves (partials) that evolve over time.

It analyzes the sound in short overlapping windows, finds the most prominent frequencies (peaks), and discards the rest (noise). It then reconstructs the sound using a bank of oscillators. This allows you to shift pitch, manipulate formants, and add textural jitter independently of one another.

Relation to SPEAR: If you have used the software SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis), this script performs a similar function frame-by-frame within Praat, automating the extraction and transformation process.

Quick start

  1. Select a Sound object in the Praat Objects window.
  2. Run script…Partial Editing & Resynthesis.praat.
  3. Choose a Preset:
    • Clean Resynth: Faithful reconstruction (removes noise).
    • Formant Shift Down: Turns voices into "giants."
    • Glassy Shimmer: Adds high jitter for a texturizing effect.
  4. Click OK.
Performance Warning: This script uses a "brute force" method to synthesize thousands of sine waves. A 10-second file may take 30-60 seconds to process depending on the max_partials_per_frame setting.

Theory: Sinusoidal Modeling

Fourier's Theorem

Any periodic waveform can be represented as a sum of sine waves at different frequencies, amplitudes, and phases.

[Image of Fourier Series decomposition]
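
As a small illustration of the theorem (not part of the script itself), the sketch below approximates a unit square wave by summing its odd harmonics. The function name and the 100 Hz fundamental are chosen only for this example.

```python
import math

def square_wave_partial_sum(t, f0, n_harmonics):
    """Sum the first n_harmonics odd harmonics of a unit square wave at time t."""
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1  # odd harmonic numbers 1, 3, 5, ...
        total += (4.0 / math.pi) * math.sin(2.0 * math.pi * k * f0 * t) / k
    return total

f0 = 100.0                         # fundamental in Hz (illustrative value)
quarter_period = 1.0 / (4.0 * f0)  # the square wave equals +1 here
for n in (1, 5, 50, 500):
    # The partial sum approaches 1.0 as more sine waves are added
    print(n, round(square_wave_partial_sum(quarter_period, f0, n), 4))
```

The more partials are summed, the closer the result gets to the target waveform, which is the same trade-off the max_partials setting controls in this script.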

The Process

  1. Windowing: The sound is sliced into overlapping frames (e.g., every 15ms).
  2. FFT Analysis: Each frame is converted to a Spectrum.
  3. Peak Picking: The script scans the spectrum and picks the loudest $N$ peaks (partials).
  4. Transformation: The frequency and amplitude of these peaks can be mathematically altered (shifted, scaled, jittered).
  5. Oscillator Bank: The script generates fresh sine waves for these new values.
  6. Overlap-Add: The generated sine waves are windowed (Hanning) and layered on top of each other to rebuild the continuous sound.
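
The windowing and overlap-add steps can be sketched as follows. This is a minimal illustration in Python rather than the script's own Praat code; the 8000 Hz sample rate, the periodic form of the window, and the helper names are assumptions made for the example.

```python
import math

def hann(n_samples):
    """Periodic Hann (Hanning) window of n_samples points."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_samples)
            for n in range(n_samples)]

def overlap_add(frames, hop):
    """Layer equal-length frames at multiples of hop samples and sum them."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        start = i * hop
        for n, value in enumerate(frame):
            out[start + n] += value
    return out

sample_rate = 8000                            # assumed for this sketch
window_len = int(round(0.060 * sample_rate))  # 60 ms window
hop = int(round(0.015 * sample_rate))         # 15 ms hop

# Overlap-adding the bare windows shows the gain the layering introduces
window = hann(window_len)
envelope = overlap_add([window] * 20, hop)
steady = envelope[window_len:len(envelope) - window_len]
print(min(steady), max(steady))
```

With a hop of one quarter of the window, the layered windows sum to a constant away from the edges, so the rebuilt sound has no frame-rate amplitude ripple. The constant is greater than 1, so some level compensation is needed; whether the script applies it internally is not stated here.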

Parameters & Presets

Analysis Parameters

Parameter | Default | Description
window_length | 0.060 s | Size of the analysis chunk. Longer = better frequency resolution (bass), worse time resolution (transients).
hop_size | 0.015 s | How often to analyze. Smaller = smoother changes but slower processing.
max_partials... | 10 | Density. How many sine waves to generate per frame. 5 = hollow/sparse. 20 = rich/full.
min/max_frequency | 60-8000 Hz | Range of detection. Frequencies outside this range are ignored.
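
To get a feel for what the defaults imply, the following sketch works through the arithmetic. It assumes frames are placed only where a full window fits and that the spectrum is not zero-padded; the script may handle edges and padding differently, and the function name is hypothetical.

```python
import math

def analysis_budget(duration, window_length, hop_size, max_partials):
    """Rough cost and resolution figures for one analysis pass."""
    bin_spacing_hz = 1.0 / window_length   # nominal spacing of spectrum bins
    n_frames = int(math.floor((duration - window_length) / hop_size)) + 1
    overlap = 1.0 - hop_size / window_length
    n_grains = n_frames * max_partials     # sine grains to synthesize
    return bin_spacing_hz, n_frames, overlap, n_grains

spacing, frames, overlap, grains = analysis_budget(
    duration=10.0, window_length=0.060, hop_size=0.015, max_partials=10)
print(round(spacing, 2), "Hz per bin")
print(frames, "frames,", round(overlap * 100), "% overlap")
print(grains, "grains to synthesize")
```

For a 10-second sound the defaults give a bin spacing of roughly 17 Hz and several thousand grains. Halving hop_size or doubling the partial count doubles the number of grains, which is why those two settings dominate processing time.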

Transformation Parameters

Parameter | Description
transpose_semitones | Pitch shift. +12 = Octave Up. -12 = Octave Down.
formant_shift_ratio | Timbre shift. 1.0 = Normal. 1.2 = Smaller vocal tract (Chipmunk/Child). 0.8 = Larger vocal tract (Giant).
amplitude_scale | Global volume control (multiplier).

Jitter (Texture) Parameters

Jitter adds a random deviation to every partial in every frame. This breaks up the "perfect" digital regularity of the resynthesis and creates "organic" or "shimmering" textures.

Algorithm Deep Dive

1. Peak Picking & Suppression

How does the script find the "important" parts of the sound?

Loop k from 1 to max_partials:
  1. Find absolute maximum in Spectrum.
  2. Record Frequency and Amplitude.
  3. "Suppress" this peak in the Spectrum matrix: set amplitude to 0 for the peak bin AND ±2 neighbor bins.
  4. Repeat.

The suppress_bins parameter is crucial here. It prevents the script from picking the "shoulders" of a strong peak as separate partials, ensuring it moves on to different harmonic frequencies.
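
The loop can be sketched in Python as below. This is an illustration of the idea, not the script's Praat code: the spectrum is modeled as a plain list of magnitudes, and the function name, the bin_width_hz argument, and the test spectrum are assumptions made for the example.

```python
def pick_peaks(magnitudes, bin_width_hz, max_partials, suppress_bins=2,
               min_frequency=60.0, max_frequency=8000.0):
    """Return up to max_partials (frequency_hz, amplitude) pairs, loudest first."""
    work = list(magnitudes)  # work on a copy; the caller's spectrum is untouched
    # Bins outside the detection range are ignored
    for b in range(len(work)):
        freq = b * bin_width_hz
        if freq < min_frequency or freq > max_frequency:
            work[b] = 0.0
    peaks = []
    for _ in range(max_partials):
        if not work:
            break
        best = max(range(len(work)), key=lambda b: work[b])
        if work[best] <= 0.0:
            break  # nothing left to pick
        peaks.append((best * bin_width_hz, work[best]))
        # Zero the peak bin and its neighbors so its shoulders are not re-picked
        lo = max(0, best - suppress_bins)
        hi = min(len(work), best + suppress_bins + 1)
        for b in range(lo, hi):
            work[b] = 0.0
    return peaks

# Toy spectrum: a strong peak with shoulders at bins 9-11, plus two weaker peaks
spectrum = [0.0] * 64
spectrum[9], spectrum[10], spectrum[11] = 0.8, 1.0, 0.8
spectrum[20] = 0.5
spectrum[30] = 0.3

with_suppression = pick_peaks(spectrum, 20.0, max_partials=3)
without_suppression = pick_peaks(spectrum, 20.0, max_partials=3, suppress_bins=0)
print(with_suppression)
print(without_suppression)
```

With suppression, the three picks land on three distinct peaks. With suppress_bins set to 0, the second and third picks are wasted on the shoulders of the first peak.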

2. Resynthesis Formula

For every extracted peak, a sine wave is generated using the formula:

Grain(t) = Amp * sin(2 * π * Freq * t) * HanningWindow(t)

Where:
  Freq = (OriginalFreq * 2^(transpose/12)) * FormantRatio ± Random(Jitter)
  Amp = OriginalAmp * Scale ± Random(Jitter)
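
A minimal Python sketch of this formula follows. Several details are assumptions, since the guide does not specify them: the jitter argument names (freq_jitter_hz, amp_jitter) are hypothetical, jitter is drawn uniformly and treated as an absolute offset, and the window is a periodic Hann.

```python
import math
import random

def transform_partial(freq_hz, amp, transpose_semitones=0.0,
                      formant_shift_ratio=1.0, amplitude_scale=1.0,
                      freq_jitter_hz=0.0, amp_jitter=0.0, rng=random):
    """Apply transposition, formant ratio, scaling, and jitter to one peak."""
    new_freq = freq_hz * 2.0 ** (transpose_semitones / 12.0) * formant_shift_ratio
    new_freq += rng.uniform(-freq_jitter_hz, freq_jitter_hz)  # assumed: uniform, in Hz
    new_amp = amp * amplitude_scale + rng.uniform(-amp_jitter, amp_jitter)
    return new_freq, new_amp

def make_grain(freq_hz, amp, window_length, sample_rate):
    """One windowed sine grain: Amp * sin(2*pi*Freq*t) * window(t)."""
    n = int(round(window_length * sample_rate))
    grain = []
    for i in range(n):
        t = i / sample_rate
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)
        grain.append(amp * math.sin(2.0 * math.pi * freq_hz * t) * window)
    return grain

# A 440 Hz peak transposed up one octave, no jitter
freq, amp = transform_partial(440.0, 0.5, transpose_semitones=12)
grain = make_grain(freq, amp, window_length=0.060, sample_rate=8000)
print(freq, amp, len(grain))
```

Each grain starts at phase zero, so consecutive frames are not phase-aligned with one another. Passing a seeded random.Random instance as rng makes jittered results repeatable.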

🧮 Formants vs. Pitch

Pitch Shifting: Multiplies all frequencies by a constant factor. The harmonic relationships remain identical.

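Formant Shifting: In this script, the formant ratio is a second multiplier applied to every partial frequency, so mathematically it behaves like pitch shifting (see the resynthesis formula). It is intended to be used to offset a transposition, or on its own, to simulate changes in physical size.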
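
Because both settings multiply the partial frequencies, their combined effect is a single factor. The sketch below, with hypothetical helper names, computes that factor from the resynthesis formula and expresses it in semitones.

```python
import math

def net_frequency_factor(transpose_semitones, formant_shift_ratio):
    """Overall multiplier applied to every partial frequency."""
    return 2.0 ** (transpose_semitones / 12.0) * formant_shift_ratio

def equivalent_semitones(factor):
    """Express a frequency multiplier as a shift in semitones."""
    return 12.0 * math.log2(factor)

# An octave up combined with a ratio of 0.5 cancels out
print(net_frequency_factor(12, 0.5))
# A ratio of 0.8 on its own lowers every partial by a fixed interval
print(round(equivalent_semitones(net_frequency_factor(0, 0.8)), 2))
```

A formant_shift_ratio of 0.8 with no transposition lowers all partials by a little under 4 semitones, so with this formula the "Giant" setting also lowers the perceived pitch unless transpose_semitones is raised to compensate.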

Applications

1. De-Noising / "Spectral Cleaning"

Preset: Clean Resynth / Sparse Partials

Because the script only resynthesizes the loud peaks (harmonics) and ignores the low-level valleys (noise floor), the output is often a "cleaner," albeit more synthetic, version of the original. Great for isolating melody from noisy backgrounds.

2. "Ghost" or "Whisper" Textures

Preset: Whisper Ghost

By using a low number of partials (sparse) and high jitter, the sound loses its tonal center and becomes a diffuse cloud of sine waves. This is excellent for horror sound design or ambient pads.

3. Robotic / Sci-Fi Voices

Preset: Robotic

Setting jitter to 0 and max_partials to a moderate number (10) creates a jitter-free but phase-incoherent reconstruction. The lack of phase alignment between frames gives speech a distinct "metallic" or "vocoded" character without using a carrier signal.

Limitations