Formant Swarm Granulator — v1.1 User Guide

Resonance‑organised granular cloud engine. Segments a sound into grains, extracts formant profiles, and generates a swarm where grain placement is guided by formant similarity, temporal repulsion, and local density. Four swarm modes shape the migration of grains through the resonance space.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.1 (2026)
License: MIT License
Repo: GitHub

What this does

Formant Swarm Granulator treats each short grain (30–250 ms) as a particle carrying a resonance fingerprint (F1, F2, F3, bandwidths, pitch, intensity). A Python engine projects these particles into a 2‑D similarity field and walks a path through the swarm, selecting grains according to attraction (formant similarity) and two types of repulsion (temporal – avoid grains from the same source region; density – avoid overusing any grain).

Why “swarm”? Grains are not randomly reordered. They move through a resonance space where similar vowels / timbres cluster together. The walk is governed by a potential field: you can control how strongly the swarm stays in one formant neighbourhood (attraction), how much it tries to move to distant source times (temporal repulsion), and how much it avoids grains already used many times (density repulsion). The result is an evolving cloud organised by hidden vowel anatomy, not raw randomness.

Quick start

  1. In Praat, select exactly one Sound object (mono or stereo).
  2. Choose Run script… and select FormantSwarmGranulator.praat.
  3. Choose a Swarm_mode:
    • vowel_cloud – balanced, forms clusters by vowel quality.
    • resonance_turbulence – chaotic, high stochasticity.
    • migration – drifts between cluster centers.
    • counterpoint – alternates voiced/unvoiced grains.
  4. Set Grain_length_ms (e.g. 55 ms), Grain_jitter_ms (random variation), and Grain_overlap_percent (0% = sequential, 50% = heavy overlap).
  5. Adjust Density_grains_per_sec (overall event density).
  6. Tune Attraction, Temporal_repulsion, Density_repulsion – the three swarm forces.
  7. Optionally set Pan_spread (stereo width) and Pitch_drift_semitones (random pitch variation per grain).
  8. Click OK. Praat analyses each grain (pitch, intensity, formants, bandwidths), exports CSV, and calls the Python engine. The result is imported as originalname_formantSwarm.
Quick tip: Start with vowel_cloud, Attraction=1.0, Temporal=0.9, Density=0.7. For a slowly evolving texture, increase Temporal_repulsion (forces the swarm to explore new source regions). For a static wash, set Temporal_repulsion low and Attraction high.
Important: Python dependencies: numpy, soundfile. Grain analysis uses Praat’s built‑in pitch, intensity, spectrum, and formant objects. Very short grains (<30 ms) may produce unreliable formants – the script enforces a minimum grain length. The output is stereo, with per‑grain panning derived from the 2‑D swarm projection.

The four swarm modes

| Mode | Score formula (conceptual) | Behaviour |
| --- | --- | --- |
| vowel_cloud | score = attraction × similarity + 0.22×cluster_bonus + 0.10×crowd_bonus | Forms tight clusters around similar formant centres; balanced, the default. |
| resonance_turbulence | score = 0.75×attraction×similarity + 0.45×random + 0.12×crowd | High stochasticity, chaotic migrations, unpredictable grain sequences. |
| migration | score = 0.55×attraction×similarity + 0.30×exp(-cluster_distance) + 0.10×crowd | Drifts slowly between cluster centers; smooth timbral evolution. |
| counterpoint | score = 0.70×attraction×similarity + 0.20×voiced_match + 0.10×crowd | Alternates voiced/unvoiced grains; creates rhythmic separation. |
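
The four conceptual formulas can be written as one small function. This is a sketch that only transcribes the weights listed for each mode; the function name and argument names are illustrative, and the real engine may compute or combine the terms differently.

```python
import math


def mode_score(mode, attraction, similarity, crowd, *, cluster_bonus=0.0,
               cluster_distance=0.0, voiced_match=0.0, random_term=0.0):
    """Conceptual per-mode score using the weights from the mode table.

    All inputs are assumed to be precomputed numbers for one candidate grain.
    """
    base = attraction * similarity
    if mode == "vowel_cloud":
        return base + 0.22 * cluster_bonus + 0.10 * crowd
    if mode == "resonance_turbulence":
        return 0.75 * base + 0.45 * random_term + 0.12 * crowd
    if mode == "migration":
        return 0.55 * base + 0.30 * math.exp(-cluster_distance) + 0.10 * crowd
    if mode == "counterpoint":
        return 0.70 * base + 0.20 * voiced_match + 0.10 * crowd
    raise ValueError(f"unknown swarm mode: {mode}")
```

For example, an unused grain (crowd = 1) in the same family as the current one (cluster_bonus = 1) with similarity 0.5 scores 0.5 + 0.22 + 0.10 = 0.82 in vowel_cloud when Attraction is 1.0.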

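Forces:
  • similarity = exp(−distance in formant space): 1 for identical resonance profiles, falling towards 0 as they diverge.
  • cluster_bonus = reward for grains from the same resonance family.
  • crowd_bonus (written crowd in some of the mode formulas) = 1/(1+usage): shrinks each time a grain is reused, which discourages overusing any grain.
  • temporal_penalty = decaying exponential of the source‑time difference: largest for grains adjacent in source time, so the walk favours grains from far‑away source moments.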
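
The force terms are simple enough to sketch directly. The version below is an illustration under assumptions: the Euclidean distance, the 1‑second time scale, and the function names are not taken from the engine.

```python
import math


def similarity(features_a, features_b):
    """exp(-distance) between two normalised feature vectors.

    Euclidean distance is an assumption; the guide only says "distance".
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))
    return math.exp(-dist)


def crowd_bonus(usage_count):
    """1 / (1 + usage): 1.0 for an unused grain, smaller after each reuse."""
    return 1.0 / (1.0 + usage_count)


def temporal_penalty(time_a_s, time_b_s, scale_s=1.0):
    """Assumed form exp(-|dt| / scale): large when source times are close."""
    return math.exp(-abs(time_a_s - time_b_s) / scale_s)
```

With these definitions a grain used three times has crowd_bonus 0.25, and two grains 5 s apart in the source are penalised far less than two grains 1 s apart.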

Pipeline — five stages

Stage 1 – Grain extraction (Praat) – sliding window with overlap, jitter; extract pitch, intensity, spectral centroid, formants (F1‑F3) and bandwidths. Write CSV.
Stage 2 – Build resonance feature space (Python) – robust‑normalised feature matrix (f1,f2,f3,bandwidth composite, intensity, centroid, voiced flag).
Stage 3 – Clustering & projection – k‑means (k=6) into resonance families; principal component projection to 2‑D (x,y for each grain).
Stage 4 – Swarm walk – iterative selection: choose next grain based on score combining attraction, repulsion, and mode‑specific terms. Produces schedule of events.
Stage 5 – Render & balance – extract grains, resample (pitch drift), fade, pan according to grain.x, overlay, per‑channel RMS balance, final gain match to source RMS.
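
Stages 2 and 3 can be illustrated with numpy alone. This is a minimal sketch, assuming that "robust‑normalised" means median/interquartile‑range scaling and that the principal component projection is computed by SVD; both choices and the function names are assumptions, not the engine's confirmed method.

```python
import numpy as np


def robust_normalise(features):
    """Centre each column on its median and divide by its interquartile range."""
    features = np.asarray(features, dtype=float)
    median = np.median(features, axis=0)
    q75, q25 = np.percentile(features, [75, 25], axis=0)
    iqr = q75 - q25
    iqr[iqr == 0] = 1.0  # constant columns: avoid division by zero
    return (features - median) / iqr


def project_2d(normalised):
    """Project each row onto the first two principal components (x, y)."""
    centred = normalised - normalised.mean(axis=0)
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T
```

Each row of the input is one grain and each column one feature (for example f1, f2, f3, bandwidth composite, intensity, centroid, voiced flag); the output has one (x, y) pair per grain.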

Parameters & defaults

Grain generation

| Parameter | Default | Description |
| --- | --- | --- |
| Grain_length_ms | 55 | Target grain duration (ms). Jitter adds random variation. |
| Grain_jitter_ms | 15 | Random ± variation applied to each grain’s length. |
| Grain_overlap_percent | 0 | Overlap between consecutive grains (0–90%). Hop = length × (1‑overlap). |
| Density_grains_per_sec | 18 | Number of scheduled output events per second (not input grains). |
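
The hop relation determines how many input grains a sound yields. The sketch below works through it with jitter ignored; the helper names are illustrative and the script's exact windowing may differ at the edges.

```python
def hop_ms(grain_length_ms, overlap_percent):
    """Hop = length x (1 - overlap), with overlap given in percent."""
    return grain_length_ms * (1.0 - overlap_percent / 100.0)


def grain_starts_ms(sound_duration_ms, grain_length_ms, overlap_percent):
    """Start times of all full-length grains that fit in the sound (no jitter)."""
    hop = hop_ms(grain_length_ms, overlap_percent)
    if hop <= 0:
        raise ValueError("overlap must be below 100%")
    starts = []
    index = 0
    while index * hop + grain_length_ms <= sound_duration_ms:
        starts.append(index * hop)
        index += 1
    return starts
```

With the default 55 ms grains, a 1‑second sound gives 18 grains at 0% overlap (hop 55 ms) and 35 grains at 50% overlap (hop 27.5 ms).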

Swarm forces

| Parameter | Default | Description |
| --- | --- | --- |
| Attraction | 1.0 | Weight for formant similarity (0 = ignore, 2 = strongly attracted). |
| Temporal_repulsion | 0.9 | Penalty for selecting grains close in source time; higher = explore different phrase regions. |
| Density_repulsion | 0.7 | Penalty for over‑used grains; higher = more even usage. |
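
To see how the three forces interact, here is a deliberately simplified swarm walk. It is a sketch under stated assumptions, not the engine's algorithm: the way the penalties are subtracted from the attraction term, the greedy argmax selection (no randomness, no mode‑specific terms), and the 1‑second temporal scale are all illustrative choices.

```python
import numpy as np


def swarm_walk(features, source_times, n_events, attraction=1.0,
               temporal_repulsion=0.9, density_repulsion=0.7, start=0):
    """Greedy walk: at each step pick the grain with the highest score."""
    features = np.asarray(features, dtype=float)
    source_times = np.asarray(source_times, dtype=float)
    usage = np.zeros(len(features))
    current = start
    path = [current]
    usage[current] += 1
    for _ in range(n_events - 1):
        dist = np.linalg.norm(features - features[current], axis=1)
        similarity = np.exp(-dist)
        # Assumed penalty shapes: close in source time -> large penalty,
        # heavily used grain -> penalty approaching 1 (equals 1 - crowd_bonus).
        temporal_penalty = np.exp(-np.abs(source_times - source_times[current]))
        density_penalty = usage / (1.0 + usage)
        score = (attraction * similarity
                 - temporal_repulsion * temporal_penalty
                 - density_repulsion * density_penalty)
        score[current] = -np.inf  # never pick the same grain twice in a row
        current = int(np.argmax(score))
        path.append(current)
        usage[current] += 1
    return path
```

Raising temporal_repulsion in this sketch pushes the path towards grains far from the current source time, while raising density_repulsion spreads selections more evenly across grains.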

Spatial / pitch

| Parameter | Default | Description |
| --- | --- | --- |
| Pan_spread | 1.0 | Stereo width multiplier (0 = centre, 1 = full, >1 = hyper‑wide). |
| Pitch_drift_semitones | 0.5 | Random pitch variation per grain (±, via varispeed resampling). |

Formant analysis

| Parameter | Default | Description |
| --- | --- | --- |
| Max_formant_hz | 5500 | Maximum formant frequency (Praat setting). |
| Number_of_formants | 5 | Number of formants to track (only F1‑F3 are used). |

Visualization (Praat picture)

When Draw_visualization = 1, the script draws a multi‑panel plot in the Praat Picture window; one of the panels is the formant bar panel.

Tip: The formant bar panel gives a quick profile of the vowel space in the original sound. The output will re‑organise grains according to this space, so the formant bars are the “colour palette”.

FAQ / troubleshooting

“Python not found” or missing packages

Install: pip install numpy soundfile. On Windows, the script tries py, python, python3.
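
To check an installation by hand, a short diagnostic along these lines can help. It is a sketch: the launcher names come from the note above, but the search order and the helper names are assumptions, and the Praat script may probe for Python differently.

```python
import importlib.util
import shutil


def find_python(candidates=("py", "python", "python3")):
    """Return the full path of the first launcher found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def missing_packages(names=("numpy", "soundfile")):
    """List the required packages that the current interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python launcher:", find_python())
    print("Missing packages:", missing_packages())
```

Run it with the same interpreter that Praat will call; an empty "Missing packages" list means numpy and soundfile are both importable.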

Grains analysed = 0 or very few

The sound may be too short (need at least one full grain). Reduce Grain_length_ms or increase overlap to generate more grains.

Output is quiet / RMS mismatch

The engine applies per‑channel RMS balance (to avoid one channel being louder), then scales the combined output to match the source RMS. The match target is the whole‑file RMS of the source, not the RMS of its active regions. If the source contains long low‑level passages, the whole‑file RMS is lower than the active‑region RMS, so the output can sound quieter than the loud parts of the source; this is the safe, conservative choice. If the output sounds too quiet, increase the gain manually after import.
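
The two gain steps can be sketched as follows. This is an illustration, assuming that channel balance means scaling both channels to their mean RMS; the engine's exact balancing rule is not documented here, and the function names are hypothetical.

```python
import numpy as np


def rms(signal):
    """Root-mean-square level over all samples (and channels) of a signal."""
    return float(np.sqrt(np.mean(np.square(signal))))


def balance_and_match(stereo, source_rms):
    """Equalise the two channel levels, then scale the mix to source_rms.

    stereo is an array of shape (n_samples, 2).
    """
    out = np.asarray(stereo, dtype=float).copy()
    channel_rms = np.array([rms(out[:, 0]), rms(out[:, 1])])
    target = channel_rms.mean()  # assumed balance target
    for ch in range(2):
        if channel_rms[ch] > 0:
            out[:, ch] *= target / channel_rms[ch]
    total = rms(out)
    if total > 0:
        out *= source_rms / total  # final gain match to whole-file source RMS
    return out
```

Because source_rms is measured over the whole file, silence in the source lowers the target and therefore the output level.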

Pitch_drift_semitones has no audible effect

Pitch drift is implemented as varispeed resampling (duration changes). Because grains are very short, a 0.5 st shift is subtle. Increase to 2–3 st for noticeable variation. Very large shifts may cause unnatural transients – use with care.
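
The size of the effect is easy to quantify: a shift of s semitones corresponds to a speed ratio of 2^(s/12), so 0.5 st gives a ratio of about 1.0293 and a grain that is roughly 3% shorter. The sketch below shows varispeed by linear interpolation; the interpolation method and the function name are assumptions, not the engine's confirmed resampler.

```python
import numpy as np


def varispeed(grain, semitones):
    """Resample a grain so that it plays `semitones` higher (and shorter)."""
    grain = np.asarray(grain, dtype=float)
    ratio = 2.0 ** (semitones / 12.0)  # playback speed ratio
    n_out = max(1, int(round(len(grain) / ratio)))
    read_positions = np.arange(n_out) * ratio
    return np.interp(read_positions, np.arange(len(grain)), grain)
```

A shift of +12 st halves the grain length, while 0 st returns the grain unchanged.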

Pan_spread > 1.0 produces weird stereo

Pan is derived from the grain’s x‑coordinate in the 2‑D swarm projection. pan = tanh(0.9*x + noise). Multiplying by pan_spread can push some grains beyond ±1, which is still legal – it just makes them louder in one channel. Use values 0–1 for normal width, >1 for exaggerated separation.
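
The pan formula can be checked numerically. In this sketch grain_pan follows the formula above, while linear_gains is a hypothetical pan law added only to show why values beyond ±1 make one channel louder; the engine's actual gain mapping is not documented here.

```python
import math


def grain_pan(x, pan_spread=1.0, noise=0.0):
    """pan = pan_spread * tanh(0.9 * x + noise)."""
    return pan_spread * math.tanh(0.9 * x + noise)


def linear_gains(pan):
    """Hypothetical linear pan law (an assumption, not the engine's law).

    pan = 0 gives equal gains; gains are clamped at 0 so they never invert.
    """
    left = max(0.0, 0.5 * (1.0 - pan))
    right = max(0.0, 0.5 * (1.0 + pan))
    return left, right
```

Since tanh stays within ±1, a Pan_spread of 1.0 keeps every grain inside the normal range, whereas 1.5 can produce a pan of nearly 1.5 for grains at the edge of the projection.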

Cluster count (k=6)

The k‑means clustering uses a fixed k=6 (maximum distinct resonance families). This is a design choice – six formant‑based classes are enough to capture most vowel diversity without over‑fragmenting.
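
For readers unfamiliar with k‑means, here is a compact numpy sketch of the standard Lloyd iteration. It assumes random initial centres and caps k at the number of grains (one reading of "maximum distinct resonance families"); the engine's initialisation and stopping rule may differ.

```python
import numpy as np


def kmeans(points, k=6, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centres)."""
    points = np.asarray(points, dtype=float)
    k = min(k, len(points))  # cannot have more families than grains
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assign every point to its nearest centre.
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centre to the mean of its members (skip empty clusters).
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return labels, centres
```

Each label then identifies the resonance family of one grain, which is what the cluster_bonus and cluster_distance terms refer to.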

Command‑line usage of formant_swarm_granulator.py

The Python engine can be run independently (batch processing):

python formant_swarm_granulator.py --grains grains.csv --input in.wav --output out.wav --stats stats.txt [options]

Options: --mode, --density, --attraction, --temporal_repulsion, --density_repulsion, --pan_spread, --pitch_drift, --seed.
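
For batch processing from another Python script, the command line can be assembled programmatically. This is a sketch: it uses only the flags listed above, and it assumes that every option takes a single value, which the guide does not state explicitly. The helper name build_command is illustrative.

```python
import subprocess
import sys

ALLOWED_OPTIONS = {"mode", "density", "attraction", "temporal_repulsion",
                   "density_repulsion", "pan_spread", "pitch_drift", "seed"}


def build_command(grains_csv, input_wav, output_wav, stats_txt, **options):
    """Build the argument list for formant_swarm_granulator.py."""
    cmd = [sys.executable, "formant_swarm_granulator.py",
           "--grains", grains_csv, "--input", input_wav,
           "--output", output_wav, "--stats", stats_txt]
    for name, value in options.items():
        if name not in ALLOWED_OPTIONS:
            raise ValueError(f"unknown option: {name}")
        cmd += [f"--{name}", str(value)]  # assumed form: one value per flag
    return cmd


if __name__ == "__main__":
    command = build_command("grains.csv", "in.wav", "out.wav", "stats.txt",
                            mode="migration", seed=7)
    subprocess.run(command, check=True)
```

Passing a fixed seed on every run is a convenient way to compare parameter changes, provided the engine uses it to seed all of its random choices.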