SSM Morph Composer — v1.0 User Guide

Structure‑driven audio recomposition via Self‑Similarity Matrix transformation. Segment a sound, build an SSM from spectral features, transform the matrix (blur, sharpen, diffuse, amplify motifs, warp structure), and navigate a new path through the modified similarity landscape. Praat reassembles events with crossfade.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.0 (2026)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

SSM Morph Composer extracts the structural blueprint of a sound and lets you reshape it. The sound is segmented into short events. For each event we compute a 5‑dimensional feature vector (spectral centroid, flatness, entropy, flux, RMS). The pairwise similarity between events forms a Self‑Similarity Matrix (SSM) – a map of repetitions and contrasts. The engine then applies one of five geometric transformations to the SSM, and finally a path walker moves through the transformed similarity space, generating a new sequence of events that is reassembled into audio.

Why SSM recomposition? Traditional concatenative synthesis uses surface features. SSM recomposition works on the relational structure of the sound. By transforming the similarity matrix you can: smooth structural boundaries → ambient washes; amplify diagonal bands → hypnotic loops; warp coordinates → folded time / non‑linear narratives. The result is a genuine morph of the original form.

Five transformation modes (the “SSM engines”): Blur, Sharpen, Diffusion, MotifAmplify, and StructureWarp.

Key innovation v1.0: the path navigator includes a teleport probability and a visit penalty. These two parameters turn the walk into a structural dramaturgy engine. A teleport jumps to a low‑visited event, which is heard as a section change. The visit penalty discourages the walk from circling inside a small clique of events, which creates longer‑range form. All in pure Python + Praat, no external ML.

Quick start

  1. In Praat, select exactly one Sound object (mono or stereo).
  2. Run script…SSMComposer.praat.
  3. Choose a Preset (Ambient blur, Motif mirror, Diffuse field, … Spectral labyrinth).
  4. Adjust the SSM_mode (if custom) and navigation parameters (temperature, tabu, teleport, visit penalty).
  5. Set segmentation thresholds (silence threshold, min intervals).
  6. Enable Draw_SSM to see original and modified matrices (requires matplotlib for PNG export).
  7. Click OK – Python segmentation, feature extraction, SSM building, transformation, path walking, and Praat reconstruction run.
Quick tip: Start with “Ambient blur” (Blur mode) to hear a smooth, washed‑out version. For percussive / rhythmic material, “Motif mirror” (Sharpen) reinforces loops. The “Folded time” preset (StructureWarp) creates strange temporal distortions. Enable Draw_SSM to visualise how the matrix changes – the path is overlaid in the main plot.
Important: Python dependencies: numpy, scipy, soundfile. matplotlib is required only for Draw_SSM PNG export. The segmentation uses Praat’s “To TextGrid (silences)” – adjust thresholds if your material is very quiet. Reconstructed duration may differ from original (because events are concatenated with crossfade). Very long outputs (Output_events up to 10000) are possible but may be slow.

Pipeline – six stages

Stage 1 – Segmentation (silence‑based, or fallback 0.25 s grid) → list of events with start/end.
Stage 2 – Feature extraction (from each mono patch): 5 features – centroid, flatness, entropy, flux, RMS.
Stage 3 – Build SSM (cosine or euclidean similarity, normalised 0…1).
Stage 4 – Transform SSM according to chosen mode (Blur/Sharpen/Diffusion/MotifAmplify/StructureWarp).
Stage 5 – Navigate path through modified SSM (temperature, tabu, teleport, visit penalty) → new event sequence.
Stage 6 – Praat reconstruction: extract each event from original, crossfade, concatenate, normalise.
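
Stage 3 can be sketched as follows. This is a minimal illustration, not the engine's actual code: the function name build_ssm is hypothetical, and the way each metric is mapped onto 0…1 (a linear rescale of cosine similarity, and one minus the distance divided by the largest distance) is an assumption, since the guide only states that the result is normalised.

```python
import numpy as np

def build_ssm(features, metric="cosine"):
    """Pairwise similarity of event feature vectors, scaled to 0..1.

    features: array of shape (n_events, n_features).
    The 0..1 mapping used here is an assumption, not the engine's rule.
    """
    X = np.asarray(features, dtype=float)
    if metric == "cosine":
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        unit = X / np.maximum(norms, 1e-12)   # avoid division by zero
        sim = unit @ unit.T                   # cosine similarity in [-1, 1]
        ssm = (sim + 1.0) / 2.0               # assumed linear map to [0, 1]
    elif metric == "euclidean":
        diff = X[:, None, :] - X[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        ssm = 1.0 - dist / max(dist.max(), 1e-12)  # distance 0 -> similarity 1
    else:
        raise ValueError("metric must be 'cosine' or 'euclidean'")
    return np.clip(ssm, 0.0, 1.0)
```

Both branches return a symmetric matrix, so either metric can feed the same transformation and navigation stages.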

Feature details

Feature            Description
Spectral centroid  Mean frequency (normalised by Nyquist)
Spectral flatness  Wiener entropy (tonal/noisy)
Spectral entropy   Normalised Shannon entropy of spectrum
Spectral flux      Mean absolute frame‑to‑frame difference
RMS energy         Root‑mean‑square amplitude

All features are robust‑normalised (median + IQR) before SSM computation.
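
Robust normalisation with median and IQR can be written in a few lines. The sketch below assumes the features are stacked as one row per event; the name robust_normalise and the handling of constant columns (IQR of zero) are illustrative choices, not taken from the engine.

```python
import numpy as np

def robust_normalise(features, eps=1e-9):
    """Centre each feature column on its median and scale by its IQR."""
    X = np.asarray(features, dtype=float)
    median = np.median(X, axis=0)
    q75, q25 = np.percentile(X, [75, 25], axis=0)
    iqr = q75 - q25
    # Assumption: a constant column is left unscaled instead of dividing by 0.
    iqr = np.where(iqr < eps, 1.0, iqr)
    return (X - median) / iqr
```

Compared with mean and standard deviation, this scaling is less affected by a handful of extreme events such as a single very loud hit.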

SSM transformation modes

🌀 Blur

SSM_mod = gaussian_filter(SSM, sigma=2.5)
Smooths the similarity matrix, reducing sharp boundaries. Creates ambient, seamless transitions. Path walks become less sensitive to local contrasts.
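
A minimal sketch of the Blur mode, using scipy.ndimage.gaussian_filter with the sigma given above. The function name blur_ssm, the re‑symmetrisation, and the final clip to 0…1 are assumptions added for safety; the guide does not say whether the engine performs them.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_ssm(ssm, sigma=2.5):
    """Gaussian-smooth an SSM so that block boundaries become gradual."""
    blurred = gaussian_filter(np.asarray(ssm, dtype=float), sigma=sigma)
    blurred = 0.5 * (blurred + blurred.T)   # assumed: enforce symmetry
    return np.clip(blurred, 0.0, 1.0)       # assumed: keep the 0..1 range
```

On a two‑block matrix, the hard 1/0 edge between the blocks becomes a ramp several events wide, which is why the walk crosses section boundaries more easily.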

⚡ Sharpen

SSM_mod = SSM^gamma (gamma=3.0)
Exaggerates high similarities, suppresses low ones. Reinforces motifs and repetitive patterns. The path becomes more attracted to diagonal bands.
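
As a small worked example of the Sharpen mode, the sketch below applies the element‑wise power with gamma = 3.0. The name sharpen_ssm is hypothetical, and the clip before the power is an assumption that keeps the input in 0…1.

```python
import numpy as np

def sharpen_ssm(ssm, gamma=3.0):
    """Element-wise power: values near 1 survive, low values collapse."""
    S = np.clip(np.asarray(ssm, dtype=float), 0.0, 1.0)
    return S ** gamma
```

With gamma = 3.0, a similarity of 0.9 becomes 0.729 while 0.5 becomes 0.125, so the ratio between a strong and a weak match grows from 1.8 to about 5.8.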

🌊 Diffusion

SSM_{t+1} = alpha·SSM_t + (1‑alpha)·(SSM_t @ SSM_t)
Spreads similarity along paths of structural connection. Events that are indirectly similar become more alike. Creates evolving families of events.
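
The update rule above can be iterated as in the following sketch. The values of alpha and n_iter are placeholders, and the rescaling by the maximum after each step is an assumption: without some renormalisation the matrix product grows beyond 1, and the guide does not state how the engine handles this.

```python
import numpy as np

def diffuse_ssm(ssm, alpha=0.7, n_iter=3):
    """Iterate S <- alpha*S + (1 - alpha)*(S @ S), rescaled to peak at 1."""
    S = np.asarray(ssm, dtype=float).copy()
    for _ in range(n_iter):
        S = alpha * S + (1.0 - alpha) * (S @ S)
        peak = S.max()
        if peak > 0:
            S = S / peak   # assumed renormalisation, not from the guide
    return S
```

For a chain in which event A resembles B and B resembles C, but A and C share nothing, the product term gives A and C a non‑zero similarity after the first step.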

🔁 MotifAmplify

Detects diagonal energy bands (repeated patterns) and adds a boost:
SSM += boost * diag_energy
where diag_energy accumulates off‑diagonal values. Well suited to turning a sound into a loop‑based texture.
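
One possible reading of this mode is sketched below: for every lag, the mean value along that off‑diagonal is taken as its energy and added back, scaled by boost, to the cells on the same diagonal. This interpretation of diag_energy, the default boost of 0.5, and the name motif_amplify are all assumptions; the engine may accumulate the energy differently.

```python
import numpy as np

def motif_amplify(ssm, boost=0.5):
    """Boost each off-diagonal in proportion to its mean similarity."""
    S = np.asarray(ssm, dtype=float)
    n = S.shape[0]
    out = S.copy()
    for lag in range(1, n):
        # Assumed definition: energy of a lag = mean along that diagonal.
        energy = np.diagonal(S, offset=lag).mean()
        idx = np.arange(n - lag)
        out[idx, idx + lag] += boost * energy   # upper diagonal
        out[idx + lag, idx] += boost * energy   # mirrored lower diagonal
    return np.clip(out, 0.0, 1.0)
```

A lag at which the material repeats has a high mean, so its cells gain more than those of unrelated lags, and the contrast between repeating and non‑repeating lags widens.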

📐 StructureWarp

Smooth nonlinear coordinate warp: SSM_warp(i,j) = SSM(i+f(i), j+f(j)), where f is a superposition of low‑frequency sines. Distorts phrase positions, folds time, creates non‑linear narrative arcs.
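
The warp formula can be illustrated with a nearest‑neighbour lookup. In this sketch, amplitude is interpreted as a fraction of the number of events, the sine frequencies and random phases are hypothetical, and warped indices are rounded and clamped to the matrix edges. None of these details are confirmed by the guide.

```python
import numpy as np

def structure_warp(ssm, amplitude=0.15, freqs=(1.0, 2.0), seed=1234):
    """Sample SSM(i + f(i), j + f(j)) with f a sum of low-frequency sines."""
    S = np.asarray(ssm, dtype=float)
    n = S.shape[0]
    rng = np.random.default_rng(seed)
    t = np.arange(n) / max(n - 1, 1)          # normalised position 0..1
    f = np.zeros(n)
    for freq in freqs:
        phase = rng.uniform(0.0, 2.0 * np.pi)  # hypothetical random phase
        f += np.sin(2.0 * np.pi * freq * t + phase)
    f *= amplitude * n / len(freqs)            # displacement in events
    # Nearest-neighbour lookup, clamped to valid indices (an assumption).
    src = np.clip(np.rint(np.arange(n) + f), 0, n - 1).astype(int)
    return S[np.ix_(src, src)]
```

Because rows and columns use the same displacement, the warped matrix stays symmetric and its main diagonal keeps the self‑similarity values of the source events.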

Built‑in presets (8)

Preset              Mode           Temp  Tabu  Teleport  Visit λ  Description
Ambient blur        Blur           0.8   5     0.03      0.4      Heavy smoothing, long crossfade (30 ms)
Motif mirror        Sharpen        0.1   3     0.01      0.1      Low temperature, tight tabu → repetitive
Diffuse field       Diffusion      0.5   8     0.02      0.3      Medium temperature, evolving families
Loop engine         MotifAmplify   0.05  20    0.005     0.5      Near‑deterministic, strong diagonal boost
Folded time         StructureWarp  0.4   12    0.02      0.3      Euclidean metric, warp amplitude 0.15
Frozen texture      Blur           0.02  2     0.005     0.2      Near‑frozen path, very long crossfade
Spectral labyrinth  Diffusion      0.9   15    0.05      0.6      Max stochastic, euclidean, preserve durations off

Crossfade_ms, preserve_event_durations and segmentation thresholds vary per preset – see Praat form for full values.

Parameters & defaults

Core parameters

Parameter             Default  Description
Output_events         300      Length of new event sequence (path steps)
Similarity_metric     cosine   cosine or euclidean
Temperature           0.3      0 = greedy (most similar); 1 = fully stochastic
Tabu_length           10       Number of recent events forbidden in next step
Teleport_probability  0.02     Chance to jump to a low‑visited event (section change)
Visit_penalty         0.3      Down‑weights over‑visited events: 0 = off, higher = more exploration
Seed                  1234     Random seed for reproducibility
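
To show how these parameters interact, here is a minimal sketch of a path walker. It is not the engine's navigator: the name walk_path, the softmax over penalised similarities, the penalty form sim / (1 + λ·visits), and the teleport target (a uniformly chosen least‑visited event) are assumptions chosen to match the descriptions above.

```python
import numpy as np
from collections import deque

def walk_path(ssm, n_steps, temperature=0.3, tabu_length=10,
              teleport_prob=0.02, visit_lambda=0.3, seed=1234):
    """Return a list of event indices walked through the SSM."""
    S = np.asarray(ssm, dtype=float)
    n = S.shape[0]
    if n_steps <= 0:
        return []
    rng = np.random.default_rng(seed)
    visits = np.zeros(n)
    # Cap the tabu list so at least one event always remains allowed.
    tabu = deque(maxlen=max(0, min(tabu_length, n - 1)))
    current = int(rng.integers(n))
    path = [current]
    visits[current] += 1
    tabu.append(current)
    for _ in range(n_steps - 1):
        if rng.random() < teleport_prob:
            # Teleport: jump to one of the least-visited events.
            least = np.flatnonzero(visits == visits.min())
            nxt = int(rng.choice(least))
        else:
            # Visit penalty (assumed form): divide by 1 + lambda * visits.
            score = S[current] / (1.0 + visit_lambda * visits)
            for j in tabu:
                score[j] = -np.inf          # recent events are forbidden
            if temperature <= 1e-6:
                nxt = int(np.argmax(score))  # greedy limit
            else:
                finite = np.isfinite(score)
                z = (score - score[finite].max()) / temperature
                p = np.exp(z)
                p /= p.sum()
                nxt = int(rng.choice(n, p=p))
        visits[nxt] += 1
        tabu.append(nxt)
        path.append(nxt)
        current = nxt
    return path
```

In this sketch a low temperature concentrates the probability on the most similar allowed event, while the tabu list guarantees that the same event cannot recur within tabu_length steps unless a teleport lands on it.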

Segmentation

Parameter                Default  Description
Silence_threshold_dB     -25 dB   Below this = silence
Min_silent_interval_s    0.08 s   Minimum silent gap to split
Min_sounding_interval_s  0.03 s   Minimum event duration
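
When silence detection yields fewer than four events, the script falls back to a 0.25 s grid. The sketch below shows one way such a fallback could be expressed; the function names and the treatment of the final, shorter cell are assumptions.

```python
import math

def fallback_grid(duration_s, step_s=0.25):
    """Cut [0, duration_s] into consecutive cells of step_s seconds."""
    n_cells = max(1, math.ceil(duration_s / step_s))
    # The last cell is truncated at the end of the sound (an assumption).
    return [(i * step_s, min((i + 1) * step_s, duration_s))
            for i in range(n_cells)]

def ensure_events(events, duration_s, min_events=4):
    """Keep silence-based events unless there are fewer than min_events."""
    if len(events) >= min_events:
        return events
    return fallback_grid(duration_s)
```

For a 1.1 s sound this produces five events, the last one covering 1.0 s to 1.1 s.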

Reconstruction

Parameter                 Default  Description
Crossfade_ms              10 ms    Fade‑in/out between concatenated events
Preserve_event_durations  yes      Use original duration (yes) or stretch to mean (no)
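
The actual reconstruction runs inside Praat. For readers who want to see the effect of Crossfade_ms in isolation, the following numpy sketch concatenates mono events with a linear overlapping crossfade. The overlap‑add scheme and the function name are assumptions; Praat's own fade shape may differ.

```python
import numpy as np

def concat_crossfade(events, sample_rate, crossfade_ms=10.0):
    """Join 1-D sample arrays, overlapping neighbours by crossfade_ms."""
    if not events:
        return np.zeros(0)
    out = np.asarray(events[0], dtype=float).copy()
    fade_len = int(round(crossfade_ms * 1e-3 * sample_rate))
    for ev in events[1:]:
        ev = np.asarray(ev, dtype=float)
        n = min(fade_len, len(out), len(ev))  # never exceed either event
        if n > 0:
            ramp = np.linspace(0.0, 1.0, n, endpoint=False)
            mixed = out[-n:] * (1.0 - ramp) + ev[:n] * ramp
            out = np.concatenate([out[:-n], mixed, ev[n:]])
        else:
            out = np.concatenate([out, ev])
    return out
```

Each join shortens the output by the fade length, which is one reason the reconstructed duration can differ from the sum of the event durations.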

Output

Flag                Default  Effect
Draw_SSM            0        Export original/modified SSM as PNG (needs matplotlib)
Draw_visualization  1        Praat picture: waveforms, spectrograms, event path, stats
Play_result         1        Auto‑play after processing

Visualization (Praat picture)

When Draw_visualization = 1, the script draws a multi‑panel Praat picture containing waveforms, spectrograms, the event path, and statistics.

Note: If Draw_SSM is 1, two PNG files are also saved in the plugin directory: temp_ssm_original.png and temp_ssm_modified.png. You can open them externally.

Advanced use — custom transformations

The Python engine (ssm_morph_engine.py) can be called independently with your own event CSV and patches. The five transformation functions are modular; you can replace them or add new modes. The path navigator accepts --teleport_prob and --visit_lambda from command line.

Troubleshooting:
  • No events / too few events: lower silence threshold or reduce min sounding interval; fallback grid activates if nEvents < 4.
  • Python not found or import errors: make sure Python is installed and that numpy, scipy, and soundfile are importable.
  • Draw_SSM enabled but no PNGs: matplotlib is missing. Install it or set Draw_SSM = 0.
  • Output silent or clicking: the crossfade may be too short, so increase Crossfade_ms. Some events may be extremely short; check Min_sounding_interval_s.
  • Path gets stuck in one event: increase tabu_length, teleport probability, or temperature.

Algorithmic detail: feature extraction

For each event patch: compute STFT (Hann window, 1024 FFT, hop 512).
centroid = mean(∑ f·M_f / ∑ M_f) over frames, normalised by Nyquist.
flatness = exp(mean(log(power))) / mean(power).
entropy = –∑ p·log(p) / log(n_bins).
flux = mean(abs(diff(magnitude))) across frames.
RMS = √(mean(signal²)).
Features are then robust‑normalised (median, IQR).
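
The formulas above translate to numpy roughly as follows. This is a sketch under stated assumptions: the STFT is framed by hand with np.hanning and np.fft.rfft, patches shorter than one window are zero‑padded, flatness and entropy are averaged over frames, and a small eps guards the logarithms and divisions. The function name event_features is hypothetical.

```python
import numpy as np

def event_features(signal, n_fft=1024, hop=512, eps=1e-12):
    """Return [centroid, flatness, entropy, flux, rms] for one mono patch."""
    raw = np.asarray(signal, dtype=float)
    x = raw
    if len(x) < n_fft:
        x = np.pad(x, (0, n_fft - len(x)))      # assumed: zero-pad short patches
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * window for s in starts])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # shape (n_frames, n_bins)
    power = mag ** 2
    n_bins = mag.shape[1]
    bin_pos = np.arange(n_bins) / (n_bins - 1)  # 0..1, where 1 = Nyquist
    centroid = np.mean((mag * bin_pos).sum(axis=1) / (mag.sum(axis=1) + eps))
    geo_mean = np.exp(np.mean(np.log(power + eps), axis=1))
    flatness = np.mean(geo_mean / (power.mean(axis=1) + eps))
    p = power / (power.sum(axis=1, keepdims=True) + eps)
    entropy = np.mean(-(p * np.log(p + eps)).sum(axis=1) / np.log(n_bins))
    # Flux needs at least two frames; a single frame gives 0.
    flux = float(np.mean(np.abs(np.diff(mag, axis=0)))) if len(mag) > 1 else 0.0
    rms = np.sqrt(np.mean(raw ** 2))
    return np.array([centroid, flatness, entropy, flux, rms])
```

A pure sine at a quarter of the sample rate gives a centroid near 0.5 and a flatness close to zero, while white noise gives a much higher flatness and an entropy approaching 1.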