SSM Morph Composer — v1.0 User Guide
Structure‑driven audio recomposition via Self‑Similarity Matrix transformation. Segment a sound, build an SSM from spectral features, transform the matrix (blur, sharpen, diffuse, amplify motifs, warp structure), and navigate a new path through the modified similarity landscape. Praat reassembles events with crossfade.
What this does
SSM Morph Composer extracts the structural blueprint of a sound and lets you reshape it. The sound is segmented into short events. For each event we compute a 5‑dimensional feature vector (spectral centroid, flatness, entropy, flux, RMS). The pairwise similarity between events forms a Self‑Similarity Matrix (SSM) – a map of repetitions and contrasts. The engine then applies one of five geometric transformations to the SSM, and finally a path walker moves through the transformed similarity space, generating a new sequence of events that is reassembled into audio.
Five transformation modes (the “SSM engines”):
- Blur — Gaussian smoothing, reduces sharp contrasts → ambient, seamless texture
- Sharpen — power mapping (SSM^γ), exaggerates motifs → repetitive / maximalist
- Diffusion — iterative matrix diffusion, spreads similarity → evolving families of events
- MotifAmplify — boosts diagonal energy bands → loop‑like / pattern reinforcement
- StructureWarp — smooth nonlinear coordinate warp → folded time, phrase stretching
Quick start
- In Praat, select exactly one Sound object (mono or stereo).
- Run script… → SSMComposer.praat.
- Choose a Preset (Ambient blur, Motif mirror, Diffuse field, … Spectral labyrinth).
- Adjust the SSM_mode (if custom) and navigation parameters (temperature, tabu, teleport, visit penalty).
- Set segmentation thresholds (silence threshold, min intervals).
- Enable Draw_SSM to see original and modified matrices (requires matplotlib for PNG export).
- Click OK – Python segmentation, feature extraction, SSM building, transformation, path walking, and Praat reconstruction run.
Requires Python with numpy, scipy and soundfile.
matplotlib is required only for Draw_SSM PNG export.
The segmentation uses Praat’s “To TextGrid (silences)” – adjust thresholds if your material is very quiet.
Reconstructed duration may differ from original (because events are concatenated with crossfade).
Very long outputs (Output_events up to 10000) are possible but may be slow.
Pipeline – six stages
Stage 1 – Segmentation: split the sound into events at silences (Praat’s “To TextGrid (silences)”).
Stage 2 – Feature extraction (from each mono patch): 5 features – centroid, flatness, entropy, flux, RMS.
Stage 3 – Build SSM (cosine or euclidean similarity, normalised 0…1).
Stage 4 – Transform SSM according to chosen mode (Blur/Sharpen/Diffusion/MotifAmplify/StructureWarp).
Stage 5 – Navigate path through modified SSM (temperature, tabu, teleport, visit penalty) → new event sequence.
Stage 6 – Praat reconstruction: extract each event from original, crossfade, concatenate, normalise.
Feature details
| Feature | Description |
|---|---|
| Spectral centroid | Mean frequency (normalised by Nyquist) |
| Spectral flatness | Wiener entropy (tonal/noisy) |
| Spectral entropy | Normalised Shannon entropy of spectrum |
| Spectral flux | Mean absolute frame‑to‑frame difference |
| RMS energy | Root‑mean‑square amplitude |
All features are robust‑normalised (median + IQR) before SSM computation.
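As a rough illustration of how the normalised features could become an SSM, here is a minimal NumPy sketch. The function names `robust_normalise` and `build_ssm` are hypothetical, and the mapping to 0…1 (shifting cosine from [-1, 1], dividing Euclidean distance by its maximum) is an assumption; the engine may scale differently.

```python
import numpy as np

def robust_normalise(features):
    """Centre each feature column on its median and scale by its IQR."""
    features = np.asarray(features, dtype=float)
    median = np.median(features, axis=0)
    q75, q25 = np.percentile(features, [75, 25], axis=0)
    iqr = q75 - q25
    iqr[iqr == 0] = 1.0  # assumption: leave constant features unscaled
    return (features - median) / iqr

def build_ssm(features, metric="cosine"):
    """Pairwise similarity between event feature vectors, mapped to 0..1."""
    x = robust_normalise(features)
    if metric == "cosine":
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        unit = x / norms
        ssm = (unit @ unit.T + 1.0) / 2.0  # assumption: map [-1, 1] to [0, 1]
    elif metric == "euclidean":
        dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
        max_dist = dist.max()
        # assumption: similarity = 1 - distance / largest distance
        ssm = 1.0 - dist / max_dist if max_dist > 0 else np.ones_like(dist)
    else:
        raise ValueError("metric must be 'cosine' or 'euclidean'")
    return np.clip(ssm, 0.0, 1.0)
```

Each row of `features` is one event (5 columns in this guide); the result is a square, symmetric matrix with ones on the diagonal.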
SSM transformation modes
🌀 Blur
SSM_mod = gaussian_filter(SSM, sigma=2.5)
Smooths the similarity matrix, reducing sharp boundaries. Creates ambient, seamless transitions.
Path walks become less sensitive to local contrasts.
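The Blur formula above can be tried directly with SciPy. This sketch wraps the same `gaussian_filter` call; the wrapper name `blur_ssm` and the final clipping to 0…1 are assumptions added for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_ssm(ssm, sigma=2.5):
    """Gaussian-smooth the SSM so sharp similarity boundaries soften."""
    blurred = gaussian_filter(np.asarray(ssm, dtype=float), sigma=sigma)
    return np.clip(blurred, 0.0, 1.0)  # assumption: keep values in 0..1
```

Applied to an identity matrix (every event similar only to itself), the diagonal peak drops below 1 and neighbouring cells become non-zero, which is why the path walker sees fewer hard contrasts.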
⚡ Sharpen
SSM_mod = SSM^gamma (gamma=3.0)
Exaggerates high similarities, suppresses low ones. Reinforces motifs and repetitive patterns.
The path becomes more attracted to diagonal bands.
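A small worked example of the power mapping, assuming the power is applied element-wise (not as a matrix power) to values already in 0…1. The function name `sharpen_ssm` is hypothetical.

```python
import numpy as np

def sharpen_ssm(ssm, gamma=3.0):
    """Element-wise power: high similarities survive, low ones collapse."""
    return np.clip(np.asarray(ssm, dtype=float), 0.0, 1.0) ** gamma

example = np.array([[1.0, 0.9],
                    [0.5, 1.0]])
sharpened = sharpen_ssm(example)  # 0.9 -> 0.729, 0.5 -> 0.125
```

With gamma = 3.0 the ratio between a similarity of 0.9 and one of 0.5 grows from 1.8 to about 5.8, so strong matches dominate the walk.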
🌊 Diffusion
SSM_{t+1} = alpha·SSM_t + (1‑alpha)·(SSM_t @ SSM_t)
Spreads similarity along paths of structural connection. Events that are indirectly similar become more alike.
Creates evolving families of events.
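The diffusion update can be sketched as a short loop. The values of `alpha` and `n_iter` below are placeholders, and the per-iteration rescaling by the maximum is an assumption added so that the matrix product does not push values above 1; the guide does not state how the engine handles this.

```python
import numpy as np

def diffuse_ssm(ssm, alpha=0.7, n_iter=5):
    """Iterate SSM <- alpha*SSM + (1-alpha)*(SSM @ SSM)."""
    s = np.asarray(ssm, dtype=float).copy()
    for _ in range(n_iter):
        s = alpha * s + (1.0 - alpha) * (s @ s)
        peak = s.max()
        if peak > 0:
            s = s / peak  # assumption: rescale so values stay in 0..1
    return s

# A is similar to B, B is similar to C, but A and C share nothing directly.
chain = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.9],
                  [0.0, 0.9, 1.0]])
diffused = diffuse_ssm(chain)
```

In the `chain` example, the A–C entry starts at 0 and becomes positive after diffusion because both events are similar to B.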
🔁 MotifAmplify
Detects diagonal energy bands (repeated patterns) and adds boost:
SSM += boost * diag_energy where diag_energy accumulates off‑diagonal values.
Perfect for turning any sound into a loop‑based texture.
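One plausible reading of the boost rule is sketched below: the mean value along each off-diagonal (each time lag) is taken as that diagonal's energy, and `boost * energy` is added back to every cell on it. The per-lag mean, the final clipping, and the name `motif_amplify` are assumptions; the engine's actual band detection may differ.

```python
import numpy as np

def motif_amplify(ssm, boost=0.5):
    """Add boost * (mean of each off-diagonal) to the cells on that diagonal."""
    s = np.asarray(ssm, dtype=float)
    n = s.shape[0]
    out = s.copy()
    for lag in range(1, n):
        energy = np.mean(np.diagonal(s, offset=lag))  # strength of this lag
        idx = np.arange(n - lag)
        out[idx, idx + lag] += boost * energy  # upper diagonal
        out[idx + lag, idx] += boost * energy  # mirrored lower diagonal
    return np.clip(out, 0.0, 1.0)  # assumption: keep values in 0..1
```

For a sequence that alternates A B A B …, the lag-2 diagonal has high energy and is boosted more than the lag-1 diagonal, so the walker is pulled toward the repeating pattern.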
📐 StructureWarp
Smooth nonlinear coordinate warp: SSM_warp(i,j) = SSM(i+f(i), j+f(j)),
where f is a superposition of low‑frequency sines. Distorts phrase positions, folds time,
creates non‑linear narrative arcs.
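The coordinate warp can be illustrated with nearest-neighbour index lookup. In this sketch `amplitude` is interpreted as the peak displacement expressed as a fraction of the number of events, the sines use random phases, and out-of-range indices are clamped; all three choices, and the name `structure_warp`, are assumptions rather than the engine's documented behaviour.

```python
import numpy as np

def structure_warp(ssm, amplitude=0.15, n_sines=3, seed=1234):
    """Resample SSM at (i + f(i), j + f(j)) with f a sum of slow sines."""
    s = np.asarray(ssm, dtype=float)
    n = s.shape[0]
    rng = np.random.default_rng(seed)
    t = np.arange(n) / max(n - 1, 1)
    f = np.zeros(n)
    for k in range(1, n_sines + 1):
        phase = rng.uniform(0.0, 2.0 * np.pi)
        f += np.sin(2.0 * np.pi * k * t + phase) / k  # low-frequency terms
    # Scale so the largest shift equals amplitude * n events.
    f *= amplitude * n / max(np.abs(f).max(), 1e-12)
    # Nearest-neighbour lookup, clamped to valid event indices.
    idx = np.clip(np.rint(np.arange(n) + f).astype(int), 0, n - 1)
    return s[np.ix_(idx, idx)]
```

Because the same index map is used for rows and columns, a symmetric SSM stays symmetric, and `amplitude=0` returns the matrix unchanged.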
Built‑in presets (8)
| Preset | Mode | Temp | Tabu | Teleport | Visit λ | Description |
|---|---|---|---|---|---|---|
| Ambient blur | Blur | 0.8 | 5 | 0.03 | 0.4 | Heavy smoothing, long crossfade (30 ms) |
| Motif mirror | Sharpen | 0.1 | 3 | 0.01 | 0.1 | Low temperature, tight tabu → repetitive |
| Diffuse field | Diffusion | 0.5 | 8 | 0.02 | 0.3 | Medium temperature, evolving families |
| Loop engine | MotifAmplify | 0.05 | 20 | 0.005 | 0.5 | Near‑deterministic, strong diagonal boost |
| Folded time | StructureWarp | 0.4 | 12 | 0.02 | 0.3 | Euclidean metric, warp amplitude 0.15 |
| Frozen texture | Blur | 0.02 | 2 | 0.005 | 0.2 | Near‑frozen path, very long crossfade |
| Spectral labyrinth | Diffusion | 0.9 | 15 | 0.05 | 0.6 | Max stochastic, euclidean, preserve durations off |
Crossfade_ms, preserve_event_durations and segmentation thresholds vary per preset – see Praat form for full values.
Parameters & defaults
Core parameters
| Parameter | Default | Description |
|---|---|---|
| Output_events | 300 | Length of new event sequence (path steps) |
| Similarity_metric | cosine | cosine or euclidean |
| Temperature | 0.3 | 0 = greedy (most similar); 1 = fully stochastic |
| Tabu_length | 10 | Number of recent events forbidden in next step |
| Teleport_probability | 0.02 | Chance to jump to a low‑visited event (section change) |
| Visit_penalty | 0.3 | Down‑weights over‑visited events: 0=off, higher = more exploration |
| Seed | 1234 | Random seed for reproducibility |
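To show how these parameters might interact, here is a minimal path-walker sketch. It assumes a softmax over similarity scores for the temperature, a visit penalty proportional to relative visit counts, and a teleport that picks among the least-visited events. The function name `walk_path` and these exact formulas are illustrative guesses, not the engine's code.

```python
import numpy as np

def walk_path(ssm, n_steps=300, temperature=0.3, tabu_length=10,
              teleport_prob=0.02, visit_lambda=0.3, seed=1234):
    """Generate a sequence of event indices by walking the SSM."""
    rng = np.random.default_rng(seed)
    s = np.asarray(ssm, dtype=float)
    n = s.shape[0]
    visits = np.zeros(n)
    current = int(rng.integers(n))
    path = [current]
    visits[current] += 1
    for _ in range(n_steps - 1):
        if rng.random() < teleport_prob:
            # Teleport: jump to one of the least-visited events.
            nxt = int(rng.choice(np.flatnonzero(visits == visits.min())))
        else:
            # Similarity to the current event, minus a visit penalty.
            score = s[current] - visit_lambda * visits / visits.max()
            allowed = np.ones(n, dtype=bool)
            tabu = path[-tabu_length:] if tabu_length > 0 else []
            if tabu:
                allowed[tabu] = False
            if not allowed.any():
                allowed[:] = True  # everything is tabu: lift the restriction
            score = np.where(allowed, score, -np.inf)
            if temperature <= 0:
                nxt = int(np.argmax(score))  # greedy: most similar event
            else:
                logits = (score - score[allowed].max()) / temperature
                p = np.exp(logits)  # tabu events get probability 0
                p /= p.sum()
                nxt = int(rng.choice(n, p=p))
        path.append(nxt)
        visits[nxt] += 1
        current = nxt
    return path
```

With the same seed the walk is reproducible. Raising `tabu_length` or `visit_lambda` forces the walk to spread over more events, while a temperature near 0 makes it follow the strongest similarity at each step.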
Segmentation
| Parameter | Default | Description |
|---|---|---|
| Silence_threshold_dB | -25 dB | Below this = silence |
| Min_silent_interval_s | 0.08 s | Minimum silent gap to split |
| Min_sounding_interval_s | 0.03 s | Minimum event duration |
Reconstruction
| Parameter | Default | Description |
|---|---|---|
| Crossfade_ms | 10 ms | Fade‑in/out between concatenated events |
| Preserve_event_durations | yes | yes = keep each event's original duration; no = stretch events to the mean duration |
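Reconstruction itself runs in Praat, but the effect of Crossfade_ms can be illustrated in NumPy. This sketch assumes a linear overlap crossfade, in which the tail of one event is mixed with the head of the next. Whether the Praat script overlaps events or only fades their edges is not stated here, so treat the function `crossfade_concat` as a hypothetical illustration.

```python
import numpy as np

def crossfade_concat(events, sample_rate, crossfade_ms=10.0):
    """Join mono events, overlapping each boundary with a linear crossfade."""
    fade = int(round(sample_rate * crossfade_ms / 1000.0))
    out = np.asarray(events[0], dtype=float).copy()
    for ev in events[1:]:
        ev = np.asarray(ev, dtype=float)
        n = min(fade, len(out), len(ev))  # short events limit the fade
        if n > 0:
            ramp = np.linspace(0.0, 1.0, n)
            out[-n:] = out[-n:] * (1.0 - ramp) + ev[:n] * ramp
        out = np.concatenate([out, ev[n:]])
    return out
```

Under this assumption each join shortens the output by the overlap length, which is one reason the reconstructed duration can differ from the simple sum of event durations.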
Output
| Flag | Default | Effect |
|---|---|---|
| Draw_SSM | 0 | Export original/modified SSM as PNG (needs matplotlib) |
| Draw_visualization | 1 | Praat picture: waveforms, spectrograms, event path, stats |
| Play_result | 1 | Auto‑play after processing |
Visualization (Praat picture)
When Draw_visualization = 1, the script draws a multi‑panel plot containing:
- Original + modified SSM (always when Draw_SSM is enabled; otherwise only when there are ≤ 200 events)
- Original waveform with event boundaries (red ticks)
- Output waveform (reconstructed)
- Original and output spectrograms (0–5 kHz)
- Event path – step vs event index, showing navigation
- Summary panel with mode, temperature, unique events, entropy, diag energy, durations
When Draw_SSM = 1, two PNG files are also saved in the plugin directory:
temp_ssm_original.png and temp_ssm_modified.png. You can open them externally.
Advanced use — custom transformations
The Python engine (ssm_morph_engine.py) can be called independently with your own event CSV and patches.
The five transformation functions are modular; you can replace them or add new modes.
The path navigator accepts --teleport_prob and --visit_lambda from command line.
Troubleshooting
- No events / too few events: lower the silence threshold or reduce the minimum sounding interval; a fallback grid activates if nEvents < 4.
- Python or its modules not found: make sure Python is installed, then install numpy, scipy and soundfile and check that they are importable.
- Draw_SSM enabled but no PNGs: matplotlib is missing; install it or set Draw_SSM=0.
- Output silent or clicking: the crossfade may be too short, so increase Crossfade_ms. Some events may also be extremely short; check Min_sounding_interval_s.
- Path gets stuck in one event: increase Tabu_length, Teleport_probability, or Temperature.
Algorithmic detail: feature extraction
• centroid = mean(∑ f·M_f / ∑ M_f) over frames, normalised by Nyquist.
• flatness = exp(mean(log(power))) / mean(power).
• entropy = –∑ p·log(p) / log(n_bins).
• flux = mean(abs(diff(magnitude))) across frames.
• RMS = √(mean(signal²)).
Features are then robust‑normalised (median, IQR).
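The five formulas above can be sketched in NumPy as follows. The frame length, hop size, Hann window and the small `eps` guard are assumptions chosen for illustration, as are the function names; the engine's actual framing may differ.

```python
import numpy as np

def frame_spectra(signal, frame_len=1024, hop=512):
    """Magnitude spectra of Hann-windowed frames (rows = frames)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def event_features(signal, sample_rate, frame_len=1024, hop=512, eps=1e-12):
    """Return [centroid, flatness, entropy, flux, rms] for one mono event."""
    signal = np.asarray(signal, dtype=float)
    mag = frame_spectra(signal, frame_len, hop)
    power = mag ** 2 + eps
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    nyquist = sample_rate / 2.0
    # Centroid: magnitude-weighted mean frequency per frame, then averaged.
    centroid = np.mean((mag @ freqs) / (mag.sum(axis=1) + eps)) / nyquist
    # Flatness: geometric mean over arithmetic mean of the power spectrum.
    flatness = np.mean(np.exp(np.mean(np.log(power), axis=1))
                       / np.mean(power, axis=1))
    # Entropy: Shannon entropy of the normalised spectrum, divided by log(n_bins).
    p = power / power.sum(axis=1, keepdims=True)
    entropy = np.mean(-np.sum(p * np.log(p), axis=1) / np.log(p.shape[1]))
    # Flux: mean absolute frame-to-frame magnitude difference.
    flux = np.mean(np.abs(np.diff(mag, axis=0))) if len(mag) > 1 else 0.0
    rms = np.sqrt(np.mean(signal ** 2))
    return np.array([centroid, flatness, entropy, flux, rms])
```

A pure 1 kHz sine sampled at 8 kHz should give a centroid near 0.25 (1 kHz over a 4 kHz Nyquist) and a flatness far below that of white noise.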