Granular Attention Re-synthesis — User Guide

ReLU + Softmax applied to grain selection. At high temperature this becomes motif extraction / crystallized stutter. At low temperature it becomes a textural self-remix.

Author: Shai Cohen
Version: 1.0 (2025)
Technique: Attention-Based Grain Selection
Category: Granular / Composition / Experimental
Citation: Cohen, S. (2025). Praat AudioTools

Contents:

What this does
Quick start
Attention Selection Theory
Preset Strategies
Parameters & Controls
Visualization & Analysis
Applications

What this does

This script implements a Granular Attention Re-synthesis engine: grains compete to be chosen (selection probability), not for gain. The source re-synthesizes itself from its own most energetic (or most transient) moments using a ReLU + Softmax attention mechanism applied to grain selection.

🎯 What is Attention-Based Grain Selection?

Unlike traditional granular synthesis, where grains are typically taken at regular positions or picked at random with equal probability, this engine uses an attention mechanism:

Candidate grains are extracted from the source at regular intervals
Each grain is scored by RMS energy, transient slope, or a mixed measure
ReLU gate: grains scoring below the floor (the mean score offset by floor_dB) are zeroed out
Softmax with temperature α: converts gated scores into a probability distribution
Sampling: grains are drawn from this distribution for each output hop
Recency penalty: recently chosen grains are less likely to be chosen again

The result: the source "attends" to its own salient moments, repeating, stuttering, or crystallizing them into new textures.

Key Features:

7 Preset Strategies — Self-Remix to Cloud, plus Custom
3 Score Types — RMS (energy), Transient (slope), Mixed (blended)
ReLU Gate — Floor eliminates low-energy grains (in dB above mean)
Softmax Competition — Temperature α controls winner-take-all behavior
Recency Penalty — Reduces probability of recently chosen grains
Time & Pitch Jitter — Optional micro-variation per grain
Efficient OLA — Single Concatenate with overlap after all grains extracted
Comprehensive Visualization — 6-panel display with scores, probabilities, selection sequence, usage histogram

Technical Implementation:

(1) Candidate Extraction: Extract N grains at candidate_hop intervals.
(2) Scoring: Compute RMS, transient slope, or mixed score per grain.
(3) ReLU + Softmax: Apply floor, then softmax with temperature.
(4) Sampling: For each output hop, draw grain from distribution with recency penalty.
(5) Grain Processing: Apply Hanning window, optional pitch jitter (resample), store.
(6) Concatenation: All grains concatenated with overlap in one O(n) operation.
(7) Wet/Dry Mix & Output.

Quick start

In Praat, select exactly one Sound object (at least 50 ms long).
Run script… → select Granular_Attention_Resynth.praat.
Choose Preset (2-8 for specific strategies, 1 for custom).
Set grain parameters (size, synthesis hop, candidate hop).
Configure competition parameters (temperature, floor dB).
Select score type and adjust variation parameters (jitter, recency penalty).
Set wet/dry mix and enable visualization.
Click OK — engine extracts candidates, computes attention, synthesizes output.

Quick tip: Start with Self-Remix preset on a 5-10 second recording with varied dynamics. Enable visualization — you'll see the grain scores (grey bars), probability curve (blue), and usage circles (orange). Listen to how the output is a textural remix of the source, with louder moments recurring more often. The output appears as "source_GAR_SelfRemix" in the Objects window.

Important:

SYNTHESIS HOP must be ≤ grain_size/2 for stable overlap-add (script auto-clamps).
CANDIDATE HOP determines grain density: smaller = more candidates, slower processing.
TEMPERATURE α controls competition: 1-3 = gentle (all grains used), 5-10 = crystallized (dominant grains repeat), 15+ = extreme stutter (few grains dominate).
RECENCY PENALTY (0-1) reduces repetition: higher = more variety.
PITCH JITTER uses resampling, which may cause slight duration changes (grains are trimmed/padded to exact length).
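The hop constraints can be sketched in a few lines of Python. This is only an illustration, not the script's Praat code: the function names are hypothetical, and the candidate-count formula (grains that fit entirely inside the source) is an assumption about how the pool is built.

```python
import math

def clamp_synthesis_hop(grain_ms, hop_ms):
    """Limit the synthesis hop to at most half the grain size."""
    return min(hop_ms, grain_ms / 2.0)

def candidate_count(src_ms, grain_ms, cand_hop_ms):
    """Assumed pool size: candidates that fit entirely inside the source."""
    if src_ms < grain_ms:
        return 0
    return math.floor((src_ms - grain_ms) / cand_hop_ms) + 1

# A 30 ms hop is too large for a 40 ms grain and gets clamped to 20 ms.
print(clamp_synthesis_hop(40, 30))
# Halving the candidate hop roughly doubles the pool, hence slower processing.
print(candidate_count(5000, 60, 20), candidate_count(5000, 60, 10))
```

Under these assumptions, a 5-second source with 60 ms grains yields a few hundred candidates at a 20 ms candidate hop.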

Attention Selection Theory

The Attention Mechanism

🧮 ReLU + Softmax for Grain Selection

For each candidate grain i with raw score sᵢ:

meanScore = (1/N) Σ sᵢ
floorValue = meanScore × 10^(floor_dB / 10)
gatedᵢ = max(0, sᵢ - floorValue)

Softmax with temperature α:

maxGated = maxᵢ gatedᵢ
wᵢ = exp( α × (gatedᵢ - maxGated) / meanScore )
pᵢ = wᵢ / Σⱼ wⱼ

Interpretation:

ReLU gate: Grains below floor are eliminated from competition
meanScore normalization: Makes temperature α independent of score scale
α → 0: uniform distribution (all active grains equally likely)
α → ∞: winner-take-most (highest-scoring grain dominates)
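The gate and softmax can be tried out with a short Python sketch. It follows the formulas in this section, with two assumptions that are not confirmed by the script: grains with a gated score of zero receive zero weight (matching "eliminated from competition"), and a uniform distribution is used as a fallback when nothing survives the gate. The function name is hypothetical.

```python
import math

def attention_probabilities(scores, floor_db=0.0, alpha=8.0):
    """ReLU gate followed by a temperature softmax over grain scores."""
    n = len(scores)
    mean_score = sum(scores) / n
    if mean_score <= 0:
        return [1.0 / n] * n  # assumed fallback for silent input
    floor_value = mean_score * 10 ** (floor_db / 10.0)
    gated = [max(0.0, s - floor_value) for s in scores]
    max_gated = max(gated)
    # Assumption: gated-out grains (gated == 0) get zero weight.
    weights = [
        math.exp(alpha * (g - max_gated) / mean_score) if g > 0 else 0.0
        for g in gated
    ]
    total = sum(weights)
    if total == 0:
        return [1.0 / n] * n  # assumed fallback when every grain is gated out
    return [w / total for w in weights]

scores = [0.1, 0.2, 0.9, 0.4]
for alpha in (0.0, 3.0, 25.0):
    probs = attention_probabilities(scores, floor_db=-10.0, alpha=alpha)
    print(alpha, [round(p, 3) for p in probs])
```

Subtracting maxGated before exponentiating keeps every exponent at or below zero, so large α values cannot overflow.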

Score Types

📊 Three Scoring Methods

Score Type	Formula	Musical Effect
RMS (Energy)	RMS² per grain	Loud moments repeat — emphasizes dynamics, creates rhythmic loops
Transient (Slope)	|RMS²[i] - RMS²[i-1]|	Attacks/onsets repeat — creates stutter, glitch, percussive textures
Mixed	(1-w)×RMS + w×Transient	Blend of both — transient_weight controls balance

All scores are normalized to [0,1] before mixing.
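As a rough Python illustration of the three score types, the sketch below computes them for a list of grains (each a list of samples). The peak normalization and the zero transient score for the first grain are assumptions; the script may normalize differently. Function names are hypothetical.

```python
def mean_square(samples):
    """RMS squared (mean power) of one grain."""
    return sum(x * x for x in samples) / len(samples)

def normalize(values):
    """Scale to [0, 1] by dividing by the peak (assumed normalization)."""
    peak = max(values)
    return [v / peak if peak > 0 else 0.0 for v in values]

def grain_scores(grains, transient_weight=0.5):
    """Return (rms, transient, mixed) score lists, one value per grain."""
    energy = [mean_square(g) for g in grains]
    # Assumption: the first grain has no predecessor, so its slope is 0.
    slope = [0.0] + [abs(energy[i] - energy[i - 1]) for i in range(1, len(energy))]
    rms, transient = normalize(energy), normalize(slope)
    w = transient_weight
    mixed = [(1 - w) * r + w * t for r, t in zip(rms, transient)]
    return rms, transient, mixed

grains = [[0.0, 0.0], [1.0, -1.0], [0.5, 0.5]]
print(grain_scores(grains, transient_weight=0.5))
```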

Recency Penalty

During sampling, probabilities are modified based on the last three chosen grains:

if candidate == lastGrain1: penalty factor = (1 - recency_penalty)
if candidate == lastGrain2: penalty factor = (1 - recency_penalty × 0.6)
if candidate == lastGrain3: penalty factor = (1 - recency_penalty × 0.3)

Penalized probabilities are renormalized to sum to 1.

Effect: Prevents the same grain from repeating too frequently, encouraging variety even in winner-take-all regimes.
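The penalty and the draw can be sketched in Python as follows. The function names are hypothetical, grain indices are 0-based here, and one detail is an assumption: when the same grain appears more than once among the last three choices, the factors are multiplied together.

```python
import random

def penalized_probabilities(probs, last_grains, recency_penalty):
    """Scale down recently chosen grains, then renormalize.

    last_grains lists up to three indices, most recent first.
    """
    age_factors = (1.0, 0.6, 0.3)
    adjusted = list(probs)
    for age, idx in enumerate(last_grains[:3]):
        # Assumption: factors multiply if a grain recurs in the history.
        adjusted[idx] *= 1.0 - recency_penalty * age_factors[age]
    total = sum(adjusted)
    if total <= 0:
        return list(probs)  # assumed fallback: ignore the penalty
    return [p / total for p in adjusted]

def draw_grain(probs, rng):
    """Sample one grain index from the distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

rng = random.Random(1)
probs = [0.5, 0.3, 0.2]
history = []
for _ in range(8):
    idx = draw_grain(penalized_probabilities(probs, history, 0.5), rng)
    history = [idx] + history[:2]
    print(idx, end=" ")
```

With recency_penalty = 1 the most recent grain gets probability zero, so it cannot be chosen twice in a row unless it is the only candidate.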

Musical Effects by Temperature α

α Range	Behavior	Musical Result	Example Preset
1-3	Gentle, near-uniform	Self-remix, all grains roughly equally used	Self-Remix (α=3)
5-10	Moderate competition	Energetic moments dominate, texture crystallizes	Crystallize (α=12, just above this range)
15+	Winner-take-most	Few grains repeat obsessively — motif extraction, stutter loops	MotifExtract (α=25)

Grain Processing Pipeline

For each output hop (position = hop × synthHop):

1. Draw grain index i from probability distribution (with recency penalty)
2. Extract grain from source at time grainStartTime[i] with Hanning window
3. Optional pitch jitter: resample by factor 2^(pitchShift/12), then back to original SR
4. Store grain in array

After all hops: Concatenate all grains with overlap = grainDur - synthHop using Praat's built-in Concatenate with overlap (single O(n) operation)
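To make the windowing and overlap arithmetic concrete, here is a plain overlap-add sketch in Python. It is a stand-in for, not a reproduction of, Praat's Concatenate with overlap, and it omits jitter. The periodic Hann window and all names are assumptions chosen for illustration.

```python
import math

def hann(n):
    """Periodic Hann window; sums to 1 at 50% overlap."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def overlap_add(source, grain_starts, chosen, grain_len, hop_len):
    """Place the chosen grains hop_len samples apart and sum the overlaps.

    grain_starts[i] is the start sample of candidate i in source;
    chosen is the sequence of candidate indices, one per output hop.
    """
    window = hann(grain_len)
    out = [0.0] * ((len(chosen) - 1) * hop_len + grain_len)
    for hop, idx in enumerate(chosen):
        start = grain_starts[idx]
        for k in range(grain_len):
            out[hop * hop_len + k] += source[start + k] * window[k]
    return out

source = [1.0] * 16
out = overlap_add(source, [0, 4, 8], [0, 2, 1, 2], grain_len=8, hop_len=4)
print(len(out), [round(x, 2) for x in out])
```

With a hop of half the grain length, the windows of neighbouring grains add up to a constant, which is why a constant input comes out flat away from the edges.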

Preset Strategies

Preset 2: Self-Remix (gentle, textural)

🌱 Gentle Textural Remix

Grain: 60 ms | Hop: 30 ms | Cand: 20 ms

α: 3.0 | Floor: -3 dB (open) | Score: RMS

Jitter: ±8 ms | Recency: 0.3

Character: Gentle competition — most grains used, textural self-remix

Use on: Ambient, pads, general purpose

Preset 3: Crystallize (high α, dense repeats)

💎 Crystallized Texture

Grain: 40 ms | Hop: 20 ms | Cand: 15 ms

α: 12.0 | Floor: +3 dB | Score: RMS

Jitter: ±3 ms | Recency: 0.4

Character: Energetic moments dominate, texture crystallizes around loud events

Use on: Rhythmic material, percussive sources

Preset 4: Motif Extract (very high α, stutter)

🔁 Stutter / Motif Extraction

Grain: 30 ms | Hop: 15 ms | Cand: 10 ms

α: 25.0 | Floor: +6 dB | Score: RMS

Jitter: ±1 ms | Recency: 0.6

Character: Winner-take-most — few grains repeat obsessively, stutter effect

Use on: Creating loops, stutter effects, motif extraction

Preset 5: Onset Harvest (transient score)

⚡ Attack-Focused

Grain: 50 ms | Hop: 25 ms | Cand: 15 ms

α: 8.0 | Floor: +2 dB | Score: Transient

Jitter: ±5 ms | Recency: 0.4

Character: Attacks and onsets repeat — percussive, glitchy textures

Use on: Drums, percussion, plosives

Preset 6: Shimmer (small grains, light jitter)

✨ Shimmering Texture

Grain: 20 ms | Hop: 10 ms | Cand: 10 ms

α: 2.0 | Floor: -6 dB | Score: RMS

Jitter: ±4 ms | Pitch: ±0.3 st | Recency: 0.2

Mix: 85%

Character: Small grains, light pitch jitter, gentle competition — shimmering, ethereal

Use on: Pads, sustained tones, ambient

Preset 7: Slabs (large grains, slow mosaic)

🧱 Large-Grain Mosaic

Grain: 300 ms | Hop: 150 ms | Cand: 50 ms

α: 10.0 | Floor: +2 dB | Score: RMS

Jitter: ±15 ms | Recency: 0.5

Character: Large slabs of sound, moderate competition — mosaic-like texture

Use on: Speech, instrumental phrases, longer gestures

Preset 8: Cloud (very large grains, drifting)

☁️ Drifting Cloud

Grain: 600 ms | Hop: 300 ms | Cand: 80 ms

α: 4.0 | Floor: -3 dB | Score: Mixed (w=0.4)

Jitter: ±30 ms | Recency: 0.3

Mix: 90%

Character: Very large grains, gentle competition, drifting layers

Use on: Drone, ambient, long-form textures

Parameters & Controls

Grain Parameters

Parameter	Default	Description
Grain_size_ms	50.0	Duration of each grain (milliseconds)
Synthesis_hop_ms	25.0	Hop between output grain starts (auto-clamped to ≤ grain/2)
Candidate_hop_ms	20.0	Hop between candidate grain starts (density of pool)

Competition Parameters

Parameter	Default	Description
Temperature	8.0	Softmax temperature α (1-3 = gentle, 5-10 = crystallized, 15+ = extreme)
Floor_dB	0.0	ReLU gate threshold relative to mean score (dB)

Score Type

Parameter	Default	Description
Score_type	RMS	RMS (energy), Transient (slope), or Mixed
Transient_weight	0.5	Weight of transient score in Mixed mode (0-1)

Variation Parameters

Parameter	Default	Description
Time_jitter_ms	5.0	Random offset in grain start time (± ms)
Pitch_jitter_semitones	0.0	Random pitch shift per grain (± semitones)
Recency_penalty	0.5	Penalty for recently chosen grains (0-1, higher = less repetition)

Mix & Output

Parameter	Default	Description
Wet_percent	100.0	Wet/dry mix (0 = dry, 100 = full wet)
Draw_visualization	1	Generate 6-panel analysis display
Play_result	1	Audition after processing
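The Wet_percent control can be read as a linear crossfade between the source and the synthesized output. The sketch below assumes exactly that (the script's mixing law is not stated beyond "0 = dry, 100 = full wet"); the function name is hypothetical.

```python
def wet_dry_mix(dry, wet, wet_percent):
    """Linear mix of two sample lists; 0 = dry only, 100 = wet only."""
    w = max(0.0, min(100.0, wet_percent)) / 100.0
    n = min(len(dry), len(wet))  # mix only the overlapping part
    return [(1.0 - w) * dry[i] + w * wet[i] for i in range(n)]

print(wet_dry_mix([1.0, 1.0], [0.0, 0.5], 50))
```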

Visualization & Analysis

6-Panel Display

Granular Attention Re-synthesis Visualization:

Panel 1: TITLE
• Script name, preset, source name, α, floor, grain/hop, unique grain count

Panel 2: INPUT WAVEFORM
• X-axis: Time, Y-axis: Amplitude
• Gray waveform
• Title: "Original waveform"

Panel 3: OUTPUT WAVEFORM
• Same axes as input
• Blue waveform = synthesized output
• Title: "Output filename"

Panel 4: GRAIN SCORE / PROBABILITY / USAGE
• X-axis: Time (candidate grain positions)
• Y-axis: Normalized score (0-1.3)
• Grey bars = raw scores (height = score)
• Orange dotted line = ReLU floor
• Blue line = probability curve (scaled to same axis)
• Orange circles = grain usage (size = frequency of selection)
• Legend: score (grey), probability (blue), usage (orange)
• Title: "Grain score / probability / usage (orange dot = chosen, size = frequency)"

Panel 5: GRAIN SELECTION SEQUENCE
• X-axis: Output hop (1 to N, limited to 200)
• Y-axis: Grain number (1 to nCandGrains)
• Color-coded cells: blue = low score, orange = high score
• Shows which grains are chosen at each hop
• Title: "Grain selection sequence (blue=low score orange=high score)"

Panel 6: USAGE HISTOGRAM
• X-axis: Grain number
• Y-axis: Usage count
• Color-coded bars: blue to orange gradient by score
• Dashed line = uniform mean usage
• Title: "Usage histogram (dashed = uniform mean)"

Panel 7: STATS PANEL
• Preset, α, floor, grain/hop sizes
• Candidate count, active grains, unique used
• Top grain info (index, time, usage count)

Reading the Visualization

What to look for:

Grey bars (scores): Shows the raw score of each candidate grain — tall bars = loud/transient moments
Orange dotted line (floor): Grains below this line are eliminated from competition
Blue line (probability): Follows score peaks but is sharpened by temperature α
Orange circles (usage): Size indicates how often each grain was chosen — should correlate with probability
Selection sequence: Scan horizontally to see patterns — repeating colors indicate grain stutter
Usage histogram: Compare to uniform mean (dashed) — tall bars = dominant grains
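The numbers behind the usage histogram and the stats panel can be reproduced from a selection sequence with a few lines of Python. This is a sketch under assumptions: the "uniform mean" is taken to be the number of output hops divided by the number of candidates, and indices are 0-based here, whereas the display numbers grains from 1.

```python
from collections import Counter

def usage_stats(chosen, n_candidates):
    """Return (usage counts, uniform mean, unique grains used, top grain)."""
    counts = Counter(chosen)
    usage = [counts.get(i, 0) for i in range(n_candidates)]
    uniform_mean = len(chosen) / n_candidates  # assumed dashed-line value
    unique_used = sum(1 for u in usage if u > 0)
    top_grain = max(range(n_candidates), key=lambda i: usage[i])
    return usage, uniform_mean, unique_used, top_grain

print(usage_stats([2, 2, 0, 2, 1, 2], n_candidates=4))
```

A top grain whose count sits far above the uniform mean is the numeric signature of a stutter.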

Applications

Generative Composition

Use case: Creating evolving textures from any source material

Technique: Self-Remix or Cloud presets on varied sources

Workflow:

Select source with interesting timbral variety
Apply Self-Remix preset for gentle textural evolution
Listen to how the output "remixes" the source — louder moments recur
Layer multiple outputs with different temperatures for complexity

Rhythmic & Glitch Effects

Use case: Creating stutter, glitch, or rhythmic loops

Technique: Motif Extract or Crystallize presets on percussive material

Settings:

Stutter: Motif Extract (α=25, small grains) — few grains repeat obsessively
Glitch: Onset Harvest (transient score) — attacks repeat, creating glitch patterns
Rhythmic loops: Crystallize (α=12) on drum loops — loud hits dominate

Sound Design for Media

Use case: Creating evolving backgrounds, transitions, impacts

Technique: Slabs or Shimmer presets on appropriate sources

Applications:

Backgrounds: Cloud preset on ambient recordings — drifting, ethereal layers
Transitions: Self-Remix on risers — creates evolving texture
Impacts: Onset Harvest on explosion — repeats the attack for stuttering impact

Research & Education

Use case: Demonstrating attention mechanisms, granular synthesis, probability

Technique: Enable visualization, compare presets on simple test signals

Learning outcomes:

See how temperature affects probability distribution
Understand ReLU gating and floor effects
Observe recency penalty in selection sequence
Correlate score types with auditory results

Practical Workflow Examples

🎬 Film Scene: Tension Buildup

Goal: Create 30-second tension cue from 5-second drone

Settings:

Source: 5-second low drone, looped or extended to 30 seconds before processing (the output is trimmed to the source duration)
Preset: Cloud (large grains, gentle competition)
Custom: grain=800 ms, hop=400 ms, α=3.0
Transient_weight=0.3 (some attack sensitivity)

Result: 30-second evolving drone with subtle grain repetition, creating tension

🎚️ Electronic Music: Stutter Loop

Goal: Create stuttering vocal effect

Settings:

Source: 3-second vocal phrase
Preset: Motif Extract
Custom: α=30 (extreme), recency_penalty=0.8 (variety)
Score: Mixed (transient_weight=0.7) — emphasize attacks

Result: Vocal stutter where consonants repeat obsessively

🎙️ Voice Processing: Textural Voice

Goal: Transform speech into abstract texture

Settings:

Source: 10-second spoken phrase
Preset: Self-Remix
Custom: α=2.0 (gentle), pitch_jitter=0.2 st (subtle shimmer)

Result: Speech becomes an abstract, textural cloud while retaining intelligibility

Troubleshooting Common Issues

Problem: Output has clicks/pops
Cause: Hop too large relative to grain size, or grains not properly windowed
Solution: Ensure synthesis_hop ≤ grain_size/2 (the script auto-clamps this) and that the Hanning window is applied to every grain

Problem: Output much shorter/longer than source
Cause: Number of hops = floor(srcDur / synthHop) + 1; may not exactly match source
Solution: Output is trimmed to source duration; for longer output, increase source length
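The duration arithmetic can be checked with a small Python sketch. The hop count comes from the formula above; the untrimmed length follows from placing grains synthHop apart, so the last grain ends at (nHops - 1) × synthHop + grainDur. Function names are hypothetical and durations are in milliseconds.

```python
import math

def output_hop_count(src_ms, synth_hop_ms):
    """Number of output hops: floor(srcDur / synthHop) + 1."""
    return math.floor(src_ms / synth_hop_ms) + 1

def untrimmed_length_ms(n_hops, grain_ms, synth_hop_ms):
    """Length of the overlapped grain sequence before trimming."""
    return (n_hops - 1) * synth_hop_ms + grain_ms

n_hops = output_hop_count(5000, 30)
print(n_hops, untrimmed_length_ms(n_hops, 60, 30))
```

For a 5000 ms source with 60 ms grains and a 30 ms hop, the raw result overshoots the source by a fraction of a grain, which is what the trimming step removes.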

Problem: Only a few grains used (stutter extreme)
Cause: Temperature too high, floor too high, or recency penalty too low
Solution: Reduce α, lower floor, increase recency_penalty

Problem: Processing very slow
Cause: Many candidate grains (small candidate_hop) and many output hops
Solution: Increase candidate_hop, increase synthesis_hop (up to the grain_size/2 limit), or use a shorter source

Problem: Pitch jitter causes duration mismatch
Cause: Resampling changes grain duration slightly
Solution: Script trims/pads grains back to exact grainDur; for large pitch shifts, consider disabling jitter
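A short Python sketch shows why the lengths drift and how a grain can be forced back to size. The ratio 2^(semitones/12) is the standard equal-tempered factor; the assumption here is that a grain shifted by ratio r comes out roughly 1/r as long, and that padding is done with zeros. Function names are hypothetical.

```python
def pitch_ratio(semitones):
    """Equal-tempered frequency ratio for a shift in semitones."""
    return 2.0 ** (semitones / 12.0)

def shifted_length(n_samples, semitones):
    """Approximate grain length after the shift (assumed n / ratio)."""
    return round(n_samples / pitch_ratio(semitones))

def fit_length(samples, target_len):
    """Trim, or zero-pad (assumption), to exactly target_len samples."""
    if len(samples) >= target_len:
        return list(samples[:target_len])
    return list(samples) + [0.0] * (target_len - len(samples))

# A 0.3-semitone jitter changes a 2205-sample grain by a few dozen samples.
print(shifted_length(2205, 0.3), shifted_length(2205, -0.3))
print(fit_length([1.0, 2.0, 3.0], 5))
```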

Advanced Techniques

Custom score functions:

Edit the scoring section to use other features (spectral centroid, pitch, zero-crossing rate) as attention drivers.
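As one example of an alternative attention driver, zero-crossing rate per grain is easy to compute. The Python sketch below is illustrative only; hooking it into the Praat script would mean porting the idea to the scoring section, and the function name is hypothetical.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

grains = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, 1.0], [0.5, 0.2, -0.1, -0.3]]
print([zero_crossing_rate(g) for g in grains])
```

Used as a score, this would favor noisy or bright grains (fricatives, cymbals) over loud ones.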

Time-varying temperature:

Modify the script to make temperature change over time: start low for exploration (near-uniform selection), end high for crystallization (dominant grains repeat).
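One simple way to vary the temperature is linear interpolation between a start and an end value across the output hops. The Python sketch below assumes that shape; the function and parameter names are hypothetical, and the resulting α would replace the fixed temperature when computing probabilities for each hop.

```python
def alpha_schedule(hop_index, n_hops, alpha_start, alpha_end):
    """Linearly interpolate the temperature over the output hops."""
    if n_hops <= 1:
        return alpha_start
    t = hop_index / (n_hops - 1)
    return alpha_start + t * (alpha_end - alpha_start)

print([alpha_schedule(h, 11, 2.0, 20.0) for h in range(11)])
```

Because the distribution depends on α, the softmax would need to be recomputed per hop (or per block of hops) rather than once up front.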

Grain position jitter:

Time_jitter_ms shifts grain start time; useful for de-correlating repeats, creating thicker textures.

Multi-channel output:

For multi-channel sources, extract and process each channel separately, or convert the mono output to stereo; true multi-channel support requires modifying the script to process each channel independently.