Adaptive Grain Cloud Synthesis — User Guide

Spectral-sorted granular resynthesis: fragments source audio into grains, analyzes spectral centroid (brightness) of each, applies adaptive processing based on frequency content, sorts grains from dark to bright (or reverse), and concatenates with gaps creating perceptually organized sonic trajectories.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 0.1 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script implements adaptive grain cloud synthesis with spectral sorting, a granular technique that reorganizes sound by frequency content. The algorithm fragments the source into grains (50-300ms), measures each grain's spectral centroid (perceptual "brightness", in Hz), applies content-aware processing (pitch shifting, spectral emphasis, amplitude scaling) according to that brightness, sorts the grains by brightness value, and concatenates them with optional gaps. The result replaces temporal structure with spectral organization, creating trajectories from dark (low-frequency) to bright (high-frequency) content or the reverse. Optional spectral exaggeration widens brightness differences for a more dramatic sorting effect. The adaptive processing pushes bright grains brighter and dark grains darker, so the sorted result traces a clear path through timbral space.

Key Features:

What is spectral sorting? In traditional granular synthesis, temporal order is preserved (grains are assembled chronologically). In spectral sorting, temporal order is replaced by spectral organization. This script works in six stages:

  (1) Fragmentation: Source divided into overlapping grains.
  (2) Analysis: Each grain's spectral centroid calculated (frequency center of mass in Hz).
  (3) Adaptive processing: Bright grains (>1500Hz) → boosted highs, louder, upward pitch shift; Dark grains (<800Hz) → boosted lows, quieter, downward pitch shift; Medium grains → balanced treatment.
  (4) Spectral exaggeration (optional): Frequency-selective boost (1.2x to 2x) enhances brightness separation.
  (5) Sorting: Grains reordered by brightness value (ascending or descending).
  (6) Concatenation: Sorted grains assembled with optional gaps.

Result: The sound "plays" from low frequencies to high (or the reverse), creating a perceptual journey through spectral space. The original temporal relationships are destroyed and a new spectral narrative is created.

Applications: Experimental composition, timbral exploration, spectral morphing, teaching frequency perception.

Technical Implementation:

  (1) Grain generation: Calculate the number of grains from duration/overlap/density, extract grains at random source positions, apply a window function (rectangular/triangular/parabolic).
  (2) Spectral analysis per grain: Convert to a Spectrum object, calculate the center of gravity (spectral centroid in Hz), store it as the "brightness" value.
  (3) Adaptive processing per grain:
      IF brightness >1500Hz: Emphasize >1000Hz (multiply by boost factor), apply upward pitch shift, scale amplitude to 0.35 (louder)
      IF brightness <800Hz: Emphasize <800Hz, apply downward pitch shift, scale amplitude to 0.25 (quieter)
      ELSE: Neutral processing
  (4) Optional spectral exaggeration: Subtle (1.2x), Moderate (1.5x), or Strong (2.0x) frequency-selective boost; adjust the brightness value to reflect the processing.
  (5) Sorting algorithm: Bubble sort by brightness value; swap grain IDs, brightness values, and durations in parallel arrays.
  (6) Concatenation: Assemble the sorted grains sequentially, inserting silence gaps between grains if specified.
  (7) Statistics logging: Display grain count, size range, brightness range, sort order.

Key insight: Content-aware (adaptive) processing plus spectral exaggeration create maximum brightness separation for a clear perceptual sorting effect.

Quick start

  1. In Praat, select exactly one Sound object.
  2. Run script… → Sorts grains from dark to bright.praat.
  3. Set grain_size_ms (base duration, 100-200ms recommended).
  4. Choose grain_size_mode: Fixed (uniform) or Random (varied within range).
  5. If Random, set grain_size_variation (±variation in ms).
  6. Adjust grain_overlap (0.3 = 30% overlap) and density_factor (grain count multiplier).
  7. Enable sort_grains and choose sort_direction (Dark_to_bright or Bright_to_dark).
  8. Enable exaggerate_spectral_differences and set sorting_intensity (Subtle/Moderate/Strong).
  9. Set gap_between_grains (0 = continuous, 50-100ms = distinct grains).
  10. Click OK — processing completes, result appears as "originalname_granular_sorted".
Quick tip:
  • Start with grain_size_ms=150, overlap=0.3, density_factor=1.5 for a balanced grain cloud.
  • Use Fixed grain mode first for uniform texture; try Random for organic variation.
  • Enable exaggerate_spectral_differences with Moderate intensity; it dramatically improves how perceptible the sorting is.
  • Set gap_between_grains=50ms for clear grain separation (helps you hear the sorting trajectory).
  • Dark_to_bright creates an ascending spectral journey (dark→bright reads as a natural progression).
  • Watch the Info window for detailed logging: each grain's brightness, duration, and sorting order.
  • Processing time: ~2-5 seconds per sound (depends on grain count).
  • Stereo sources are automatically converted to mono.
  • The result is scaled to 0.9 peak, preserving dynamic range.

Important: TEMPORAL STRUCTURE DESTROYED. Sorting removes the original temporal relationships and creates a new spectrum-based order, so the source material becomes unrecognizable rhythmically and melodically.
  • Works best with: Sustained tones (clear brightness), rich harmonic content (variety for sorting), non-rhythmic material (rhythm is lost anyway).
  • Less effective with: Pure noise (no clear brightness), percussive material (transients dominate, similar brightness), very short sounds (<2s = few grains).
  • Grain size is critical: Too short (<50ms) = artifacts, unclear brightness; too long (>300ms) = sparse texture, less granular.
  • density_factor >2.0 creates very dense clouds (slower processing).
  • gap_between_grains >200ms creates a sparse, disconnected result.
  • Strong spectral exaggeration can create distortion/harshness.
  • Random grain mode with high variation creates uneven texture.
  • Sorting needs at least 10 grains to be perceptible.

Spectral Theory

Spectral Centroid (Brightness)

Center of Gravity in Frequency Domain

What is spectral centroid:

Spectral centroid = weighted average of the frequencies in the spectrum.

Mathematical formula (general form, with weighting power p):
  Centroid = Σ(f × |X(f)|^p) / Σ(|X(f)|^p)
Where:
  f = frequency (Hz)
  |X(f)| = magnitude at frequency f
  p = weighting power (p=1 weights by magnitude, p=2 weights by the power spectrum; the script uses p=2)
  Σ = sum across all frequencies

Physical interpretation: "Center of mass" of the frequency spectrum
  Low centroid (e.g., 300Hz) = dark, bass-heavy timbre
  High centroid (e.g., 3000Hz) = bright, treble-heavy timbre

Example calculations:
  Pure sine 440Hz: Centroid = 440Hz (single frequency)
  Square wave 440Hz (odd harmonics): components at 440Hz (fundamental), 1320Hz, 2200Hz, 3080Hz... Centroid ≈ 800-1000Hz (harmonics shift the center up)
  White noise: Equal energy at all frequencies, so the centroid sits near half the Nyquist frequency (about 11kHz at a 44.1kHz sample rate)

Praat implementation:
  To Spectrum: "yes" (convert grain to frequency domain)
  Get centre of gravity: 2 (calculate centroid, power=2)
  Returns value in Hz
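As a rough illustration of the weighted-average idea, the following Python sketch computes a power-weighted centroid from a list of spectral lines. It is not the Praat implementation: the function name `spectral_centroid`, the idealized line spectra, and the 22.05 kHz band limit are assumptions made for the example.

```python
def spectral_centroid(freqs, mags, power=2.0):
    """Weighted mean frequency, weighting each bin by magnitude**power."""
    weights = [abs(m) ** power for m in mags]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * w for f, w in zip(freqs, weights)) / total


# Pure 440 Hz sine: a single spectral line.
sine_centroid = spectral_centroid([440.0], [1.0])

# Idealized 440 Hz square wave: odd harmonics with 1/n amplitudes,
# kept below an assumed 22.05 kHz Nyquist limit.
harmonics = [n for n in range(1, 51, 2) if 440.0 * n < 22050.0]
square_centroid = spectral_centroid(
    [440.0 * n for n in harmonics],
    [1.0 / n for n in harmonics],
)
```

Under these assumptions the sine centroid equals 440 Hz and the square-wave centroid falls inside the 800-1000 Hz range quoted above. Real grains are windowed and noisy, so measured values will differ.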

Perceptual Correlate of Brightness

Why centroid represents "brightness":

DARK SOUNDS (Low centroid: 200-800Hz):
  Bass instruments: Double bass, tuba, bass drum
  Low male voices: Baritone, bass
  Dark vowels: "oo" as in "boot"
  Energy concentrated in low frequencies

MEDIUM SOUNDS (Mid centroid: 800-1500Hz):
  Mid-range instruments: Guitar, alto voice
  Balanced vowels: "ah" as in "father"
  Even spectral distribution

BRIGHT SOUNDS (High centroid: 1500-4000Hz+):
  Treble instruments: Flute, violin, cymbals
  Female voices: Soprano, children
  Bright vowels: "ee" as in "see"
  Sibilants: "s", "sh"
  Energy concentrated in high frequencies

Perceptual thresholds (used in script):
  < 800Hz: Dark processing (boost lows, pitch down)
  800-1500Hz: Neutral processing
  > 1500Hz: Bright processing (boost highs, pitch up)

Adaptive Processing

Content-Aware Grain Treatment

Processing decision tree:

FOR each grain:
  Calculate brightness = spectral centroid (Hz)
  IF brightness > 1500Hz (BRIGHT):
    Spectral emphasis:
      IF exaggerate_spectral_differences = ON:
        Boost frequencies >1000Hz by spectral_boost factor
        Adjust brightness value: brightness × 1.3
      ENDIF
    Pitch shifting:
      Target shift = randomGauss(0.5, pitch_scatter × 1.5)
      (Upward shift, increased variation)
    Amplitude scaling:
      Scale to 0.35 (relatively loud)
  ELSIF brightness < 800Hz (DARK):
    Spectral emphasis:
      IF exaggerate_spectral_differences = ON:
        Boost frequencies <800Hz by spectral_boost factor
        Adjust brightness value: brightness × 0.7
      ENDIF
    Pitch shifting:
      Target shift = randomGauss(-0.3, pitch_scatter × 0.8)
      (Downward shift, reduced variation)
    Amplitude scaling:
      Scale to 0.25 (relatively quiet)
  ELSE (MEDIUM 800-1500Hz):
    Spectral emphasis:
      No frequency-selective boost
    Pitch shifting:
      Target shift = randomGauss(0, pitch_scatter)
      (Neutral, standard variation)
    Amplitude scaling:
      Scale to 0.3 (moderate level)
  ENDIF
END FOR

Result:
  Bright grains become brighter and louder
  Dark grains become darker and quieter
  Creates maximum spectral separation for sorting
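The decision tree can be paraphrased in Python as a small lookup. This is a sketch of the logic as described here, not the script's code: `adaptive_settings` and `draw_pitch_shift` are hypothetical names, and the thresholds and constants are taken from the pseudocode above.

```python
import random


def adaptive_settings(brightness_hz, pitch_scatter=0.2, exaggerate=True):
    """Return the per-grain treatment for a given centroid (Hz)."""
    if brightness_hz > 1500.0:
        category, mean, spread, amp, factor = "bright", 0.5, 1.5, 0.35, 1.3
    elif brightness_hz < 800.0:
        category, mean, spread, amp, factor = "dark", -0.3, 0.8, 0.25, 0.7
    else:
        category, mean, spread, amp, factor = "medium", 0.0, 1.0, 0.30, 1.0
    # The stored brightness is only adjusted when exaggeration is on.
    adjusted = brightness_hz * factor if exaggerate else brightness_hz
    return {
        "category": category,
        "pitch_mean": mean,                  # semitones
        "pitch_std": pitch_scatter * spread, # semitones
        "amplitude": amp,
        "brightness": adjusted,
    }


def draw_pitch_shift(settings, rng=random):
    """Gaussian pitch shift in semitones for one grain."""
    return rng.gauss(settings["pitch_mean"], settings["pitch_std"])
```

For example, `adaptive_settings(2000.0)` classifies the grain as bright, with an amplitude target of 0.35 and a stored brightness of about 2600 Hz.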

Spectral Exaggeration

Frequency-selective boost:

Exaggeration process:
  Convert grain to Spectrum
  IF bright grain (centroid >1000Hz):
    Formula: "if x > 1000 then self * spectral_boost else self fi"
    (Boost high frequencies)
  IF dark grain (centroid <800Hz):
    Formula: "if x < 800 then self * spectral_boost else self fi"
    (Boost low frequencies)
  Convert back to Sound

Boost factors by intensity:
  Subtle: 1.2 (20% increase, gentle enhancement)
  Moderate: 1.5 (50% increase, clear effect)
  Strong: 2.0 (100% increase, dramatic transformation)

Example: Bright grain, original 2000Hz centroid, Moderate boost
  Frequencies >1000Hz multiplied by 1.5
  Spectral energy shifts upward
  New centroid ≈ 2000 × 1.3 = 2600Hz (script adjustment)
  Grain sounds significantly brighter

Effect on sorting:
  Without exaggeration: Brightness range 500-2500Hz (2000Hz span)
  With Moderate: Range 350-3250Hz (2900Hz span, 45% wider)
  Clearer perceptual separation in sorted result
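A minimal Python sketch of the same idea, operating on a list of (frequency, magnitude) bins rather than a Praat Spectrum. The helper names `exaggerate` and `widened_range` are hypothetical. The ×0.7 and ×1.3 factors are the brightness adjustments quoted above, not values measured from the boosted spectrum.

```python
BOOST = {"Subtle": 1.2, "Moderate": 1.5, "Strong": 2.0}


def exaggerate(freqs, mags, centroid_hz, intensity="Moderate"):
    """Boost highs of bright grains or lows of dark grains."""
    boost = BOOST[intensity]
    if centroid_hz > 1000.0:
        return [m * boost if f > 1000.0 else m for f, m in zip(freqs, mags)]
    if centroid_hz < 800.0:
        return [m * boost if f < 800.0 else m for f, m in zip(freqs, mags)]
    return list(mags)  # 800-1000 Hz: left untouched


def widened_range(low_hz, high_hz):
    """Apply the x0.7 / x1.3 brightness adjustments to a range."""
    new_low, new_high = low_hz * 0.7, high_hz * 1.3
    gain = (new_high - new_low) / (high_hz - low_hz) - 1.0
    return new_low, new_high, gain
```

`widened_range(500, 2500)` reproduces the 350-3250 Hz range and the 45% wider span given in the example.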

Pitch Shifting for Brightness

Frequency Transposition via Spectral Scaling

Pitch shift implementation:

Calculate shift in semitones:
  grain_pitch_shift = randomGauss(mean, standard_deviation)
  Where mean and std depend on brightness:
    Bright (>1500Hz): mean=0.5, std=pitch_scatter×1.5 (upward bias)
    Dark (<800Hz): mean=-0.3, std=pitch_scatter×0.8 (downward bias)
    Medium: mean=0, std=pitch_scatter (neutral)

Apply shift:
  Convert to Spectrum
  shift_factor = 2^(grain_pitch_shift / 12)
  Formula: "if x > 0 then self * shift_factor else self fi"
  Convert back to Sound

Example: Bright grain, pitch_scatter=0.2, standard-normal draw z=0.8
  grain_pitch_shift = 0.5 + (0.2 × 1.5 × 0.8) = 0.74 semitones
  shift_factor = 2^(0.74/12) ≈ 1.044
  All frequencies multiplied by ≈1.044 (4.4% increase)
  2000Hz → ≈2087Hz, 3000Hz → ≈3131Hz
  Grain sounds slightly brighter

Why this enhances sorting:
  Bright grains shifted up → even brighter
  Dark grains shifted down → even darker
  Separates brightness categories further
  Creates more dramatic spectral trajectory
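The semitone arithmetic in the worked example can be checked with a few lines of Python. The function names are illustrative, and `z` stands for the standard-normal draw that randomGauss would produce internally.

```python
def pitch_shift_semitones(mean, std, z):
    """mean + std * z, where z is a standard-normal draw."""
    return mean + std * z


def shift_factor(semitones):
    """Equal-tempered frequency ratio for a shift in semitones."""
    return 2.0 ** (semitones / 12.0)


# Bright grain: mean 0.5, std = pitch_scatter * 1.5, with z = 0.8.
shift = pitch_shift_semitones(0.5, 0.2 * 1.5, 0.8)
factor = shift_factor(shift)
shifted_2000 = 2000.0 * factor
```

Here `shift` is 0.74 semitones and `factor` is roughly 1.044, so a 2000 Hz component would map to a little under 2090 Hz.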

Grain Extraction and Windowing

Grain Generation Process

Extraction parameters:

Grain count calculation:
  hop_time = grain_duration × (1 - grain_overlap)
  num_grains = round((duration / hop_time) × density_factor)

  Example: 10s sound, grain_size=150ms, overlap=0.3, density=1.5
    hop_time = 0.15 × (1 - 0.3) = 0.105s (105ms between grain starts)
    num_grains = round((10 / 0.105) × 1.5) = round(95.2 × 1.5) = 143 grains

Grain duration modes:
  Fixed mode:
    grain_duration = base_grain_duration (constant)
    All grains same length
    Uniform texture
  Random mode:
    variation_seconds = (grain_size_variation / 1000) × randomUniform(-1, 1)
    grain_duration = base + variation_seconds
    Constrained to [base×0.3, base×2.0]

    Example: base=150ms, variation=50ms, random=-0.6
      variation_seconds = 0.05 × (-0.6) = -0.03s
      grain_duration = 0.15 - 0.03 = 0.12s (120ms)
    Variable durations create organic texture

Source position:
  source_time = randomUniform(0, duration - grain_duration)
  Random extraction from valid range
  Temporal scrambling (not sequential)
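The formulas above translate directly into Python. This is a sketch for checking the numbers, with hypothetical function names. The `u` argument lets a specific uniform draw be supplied so the worked example can be reproduced.

```python
import random


def grain_count(duration_s, grain_s, overlap, density):
    """Number of grains from duration, overlap and density."""
    hop = grain_s * (1.0 - overlap)
    if hop <= 0:
        raise ValueError("overlap must be below 1.0")
    return round(duration_s / hop * density)


def random_grain_duration(base_s, variation_ms, u=None):
    """Random-mode duration, clamped to [0.3 * base, 2.0 * base]."""
    if u is None:
        u = random.uniform(-1.0, 1.0)
    duration = base_s + (variation_ms / 1000.0) * u
    return min(max(duration, base_s * 0.3), base_s * 2.0)


def random_source_time(duration_s, grain_s):
    """Random start time such that the grain fits inside the source."""
    return random.uniform(0.0, duration_s - grain_s)
```

`grain_count(10.0, 0.15, 0.3, 1.5)` gives 143 and `random_grain_duration(0.15, 50, u=-0.6)` gives 0.12 s, matching the examples.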

Window Functions

Three window types:

Rectangular window:
  No envelope, abrupt start/end
  Maximum grain energy preserved
  Risk of clicks at boundaries
  Use: When transients important

Triangular window:
  Linear fade in/out
  Attack: 0 → 1 linearly over first half
  Decay: 1 → 0 linearly over second half
  Peak at center
  Use: Balanced, general purpose

Parabolic window (DEFAULT):
  Smooth quadratic envelope
  Similar to Hann/Hamming but simpler
  Smoother than triangular
  Reduces clicks, musical
  Use: Smooth textures, recommended

Praat extraction:
  Extract part: start, end, window_shape$, 1, 0
  window_shape$ = "rectangular", "triangular", or "parabolic"
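To make the three envelopes concrete, here is a Python sketch of the window shapes as they are described above. The formulas are assumptions based on that description (a linear ramp and an inverted parabola peaking at the center); Praat's own window definitions may differ in detail.

```python
def window_value(shape, t, duration):
    """Envelope gain at time t within a grain of the given duration."""
    x = 2.0 * t / duration - 1.0  # map [0, duration] to [-1, 1]
    if shape == "rectangular":
        return 1.0
    if shape == "triangular":
        return 1.0 - abs(x)
    if shape == "parabolic":
        return 1.0 - x * x
    raise ValueError("unknown window shape: " + shape)


def apply_window(samples, shape):
    """Multiply a list of samples by the chosen envelope."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [s * window_value(shape, i, n - 1) for i, s in enumerate(samples)]
```

A quarter of the way into the grain, the triangular window has reached 0.5 while the parabolic window is already at 0.75, so the parabolic shape keeps more of the grain's energy while still fading to zero at both ends.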

Sorting Algorithm

Bubble Sort by Brightness

Sorting implementation:

Parallel arrays:
  grainIDs# = array of Sound object IDs
  grainBrightness# = array of centroid values (Hz)
  grainDurations# = array of grain durations (s)
  grainOriginalBrightness# = array of pre-processing centroids

Bubble sort algorithm:
  FOR i = 1 to grainCount:
    FOR j = i+1 to grainCount:
      Comparison:
        IF sort_direction = Dark_to_bright:
          swap_condition = grainBrightness#[i] > grainBrightness#[j]
        ELSE (Bright_to_dark):
          swap_condition = grainBrightness#[i] < grainBrightness#[j]
        ENDIF
      IF swap_condition:
        Swap all parallel array values:
          Swap grainIDs#[i] <-> grainIDs#[j]
          Swap grainBrightness#[i] <-> grainBrightness#[j]
          Swap grainDurations#[i] <-> grainDurations#[j]
          Swap grainOriginalBrightness#[i] <-> grainOriginalBrightness#[j]
      ENDIF
    ENDFOR
  ENDFOR

Result:
  All arrays sorted by brightness
  grainIDs# in spectral order (dark→bright or reverse)
  Corresponding brightness/duration values aligned

Time complexity: O(n²) (acceptable for typical grain counts <500)
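The same nested-loop sort over parallel arrays can be sketched in Python. The function name `sort_grains` is hypothetical, and the fourth array (original brightness) is omitted to keep the example short.

```python
def sort_grains(ids, brightness, durations, direction="Dark_to_bright"):
    """Sort parallel lists by brightness, swapping all lists together."""
    ids, brightness, durations = list(ids), list(brightness), list(durations)
    n = len(ids)
    for i in range(n):
        for j in range(i + 1, n):
            if direction == "Dark_to_bright":
                swap = brightness[i] > brightness[j]
            else:  # Bright_to_dark
                swap = brightness[i] < brightness[j]
            if swap:
                # Keep the parallel lists aligned by swapping each one.
                for arr in (ids, brightness, durations):
                    arr[i], arr[j] = arr[j], arr[i]
    return ids, brightness, durations


ids, hz, secs = sort_grains([1, 2, 3], [2000.0, 450.0, 1600.0],
                            [0.15, 0.12, 0.18])
```

After the call, `ids` is `[2, 3, 1]` and the brightness and duration lists follow the same order. In ordinary Python code, `sorted(zip(brightness, ids, durations))` would be the more usual choice; the explicit loops are kept here to mirror the pseudocode.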

Sort Direction Effects

Perceptual trajectories:

Dark to bright (ascending):
  Begins: Low frequencies, bass-heavy, dark timbre
  Progresses: Gradual frequency rise
  Ends: High frequencies, treble-heavy, bright timbre
  Perceptual effect: Natural progression (darkness → light metaphor), rising energy/excitement, opening up of sound
  Musical analogy: Crescendo + brightening filter sweep

Bright to dark (descending):
  Begins: High frequencies, bright, airy
  Progresses: Gradual frequency lowering
  Ends: Low frequencies, dark, heavy
  Perceptual effect: Settling, calming trajectory; closing down, darkening; tension release
  Musical analogy: Decrescendo + darkening filter sweep

Statistics example output:
  Brightness range: 450-2800Hz (2350Hz span)
  Sort order: Dark to bright
  Grain 1: 450Hz, Grain 50: 1600Hz, Grain 100: 2800Hz
  Clear spectral arc

Gap Insertion and Concatenation

Assembling Sorted Grains

Concatenation with gaps:

Create silence:
  gap_duration = gap_between_grains / 1000 (convert ms to s)
  silence = Create Sound from formula: "silence", 1, 0, gap_duration, 44100, "0"

Concatenation loop:
  Initialize: temp_sound = Copy of grainIDs#[1]
  FOR i = 2 to grainCount:
    IF gap_duration > 0:
      Concatenate: temp_sound + silence + grainIDs#[i]
    ELSE:
      Concatenate: temp_sound + grainIDs#[i]
    ENDIF
    Update temp_sound for next iteration
  ENDFOR
  Result: Sorted grains with gaps between

Gap effects:
  0ms (gap=0): Continuous texture, grains blend
  20-50ms: Subtle separation, still cohesive
  50-100ms (RECOMMENDED): Clear grain boundaries, sorting perceptible
  100-200ms: Distinct grains, fragmented
  >200ms: Sparse, disconnected, too obvious

Why gaps help:
  Separates grains perceptually
  Makes sorting trajectory clearer
  Prevents overlapping timbres
  Creates rhythmic articulation
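The concatenation loop can be sketched in Python with grains represented as plain lists of samples. The function name and the list representation are assumptions for illustration; the script itself works on Praat Sound objects.

```python
def concatenate_with_gaps(grains, gap_ms, sample_rate=44100):
    """Join grains in order, inserting silence between neighbours."""
    gap = [0.0] * round(gap_ms / 1000.0 * sample_rate)
    out = []
    for index, grain in enumerate(grains):
        if index > 0:
            out.extend(gap)  # silence goes between grains, not at the ends
        out.extend(grain)
    return out


# Three 100-sample grains with 50 ms gaps at an assumed 1 kHz sample rate.
result = concatenate_with_gaps([[1.0] * 100] * 3, gap_ms=50, sample_rate=1000)
```

With n grains, the output contains n - 1 gaps, so the total length is the sum of the grain lengths plus (n - 1) × gap duration. In the example that is 300 + 2 × 50 = 400 samples.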

Complete Processing Pipeline

INITIALIZATION:
  Check: Exactly one Sound selected
  Convert to mono (if stereo)
  Get: duration, sample_rate, sound_name
  Validate: sound_duration >= grain_size

GRAIN GENERATION LOOP:
  Calculate num_grains from duration/overlap/density
  Initialize arrays: grainIDs#, grainBrightness#, grainDurations#
  FOR each potential grain:
    Determine grain_duration:
      IF Fixed mode: use base_grain_duration
      IF Random mode: base + random_variation, clamped to [min, max]
    Extract grain:
      source_time = random position in valid range
      Extract part with window function
      grain = extracted Sound object
    Analyze brightness:
      To Spectrum
      centroid = Get centre of gravity: 2
      original_brightness = centroid
      Remove Spectrum
    Spectral exaggeration (if enabled):
      To Spectrum
      IF centroid >1000Hz: boost high frequencies
      IF centroid <800Hz: boost low frequencies
      Adjust brightness value accordingly
      To Sound, replace grain
    Adaptive processing:
      IF bright: pitch_shift up, scale 0.35
      IF dark: pitch_shift down, scale 0.25
      IF medium: neutral shift, scale 0.3
      Apply pitch shift via Spectrum scaling
      Scale amplitude based on grain duration
    Optional reversal:
      IF reverse_grains AND random > 0.7:
        Reverse grain
        Adjust brightness × 0.9
    Store grain:
      grainIDs#[count] = grain object ID
      grainBrightness#[count] = processed brightness
      grainOriginalBrightness#[count] = original brightness
      grainDurations#[count] = grain_duration
      Increment count
  ENDFOR

SORTING (if enabled):
  Bubble sort all arrays by grainBrightness#
  Direction: ascending (dark→bright) or descending (bright→dark)

CONCATENATION:
  Create silence object (if gaps needed)
  Concatenate sorted grains with gaps
  Result: "originalname_granular_sorted"

FINALIZATION:
  Scale peak: 0.9 (preserve dynamics)
  Calculate statistics: grain count, brightness range, duration range
  Display detailed log with all grain info
  Clean up: Remove individual grain objects

OUTPUT:
  Sorted grain cloud in Objects window
  Comprehensive Info window log
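The finalization step scales the whole result so that its peak sits at 0.9. A uniform gain of this kind leaves the relative levels between grains unchanged, which is why dynamics are preserved. A Python sketch follows; the function name is hypothetical and samples are represented as a plain list.

```python
def scale_peak(samples, target=0.9):
    """Apply one gain to all samples so the absolute peak equals target."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]


scaled = scale_peak([0.1, -0.5, 0.25])
```

After scaling, the loudest sample has magnitude 0.9 and the ratio between any two samples is the same as before.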

Processing Modes

Grain Size Modes

🎯 Fixed Grain Size

Method: All grains exactly same duration

grain_duration = base_grain_duration (constant)

Character: Uniform, metronomic, mechanical texture

Effect: Regular rhythm, consistent grain density

Best for: Clear sorting perception, controlled experiments, rhythmic applications

Example: grain_size_ms=150, Fixed mode

🎲 Random Grain Size

Method: Variable duration within specified range

grain_duration = base ± variation (random)

Character: Organic, irregular, natural texture

Effect: Varied rhythm, non-uniform density, more musical

Best for: Natural-sounding results, organic textures, avoiding mechanical feel

Example: grain_size_ms=150, variation=50, Random mode

Sort Direction Modes

Direction | Start | End | Character
Dark to bright | Low frequencies | High frequencies | Ascending, opening, brightening
Bright to dark | High frequencies | Low frequencies | Descending, closing, darkening

Spectral Exaggeration Intensity

Intensity | Boost Factor | Effect | Use Case
Subtle | 1.2× | Gentle enhancement, transparent | Preserve naturalness, slight emphasis
Moderate | 1.5× | Clear effect, noticeable separation | General use, recommended default
Strong | 2.0× | Dramatic transformation, obvious | Maximum sorting clarity, experimental

Window Type Effects

Window | Envelope | Character | Best For
Rectangular | No fade | Maximum energy, potential clicks | Transient preservation, percussive
Triangular | Linear fade | Balanced, moderate smoothness | General purpose, neutral
Parabolic | Smooth quadratic | Musical, click-free, smooth | Recommended default, sustained material

Parameters

Grain Generation Parameters

Parameter | Type | Default | Description
grain_size_ms | positive | 150 | Base grain duration in milliseconds
grain_size_variation | positive | 50 | ±variation for Random mode (ms)
grain_size_mode | choice | Fixed | Fixed (uniform) or Random (varied)
grain_overlap | positive | 0.3 | Overlap ratio (0.3 = 30% overlap)
density_factor | positive | 1.5 | Grain count multiplier (1.0 = normal density)
window_type | choice | Parabolic | Grain envelope: Rectangular/Triangular/Parabolic

Processing Parameters

Parameter | Type | Default | Description
pitch_scatter | positive | 0.2 | Random pitch variation (std dev in semitones)
time_scatter | positive | 0.1 | Temporal position randomness (unused in current version)
reverse_grains | boolean | false | Randomly reverse some grains (30% probability)

Sorting Parameters

Parameter | Type | Default | Description
sort_grains | boolean | true | Enable spectral sorting
sort_direction | choice | Dark_to_bright | Sorting order: ascending or descending
exaggerate_spectral_differences | boolean | true | Enable frequency-selective boost
sorting_intensity | choice | Moderate | Boost factor: Subtle/Moderate/Strong
gap_between_grains | positive | 50 | Silence between grains (ms, 0 = continuous)

Parameter Tuning Guide

Grain Size (grain_size_ms)

50-80ms: Very short, dense texture, granular character dominant
  Risk: Difficult to analyze brightness (too short for a clear spectrum)
  Use: Extreme granulation, textural density

100-150ms: Short-medium, balanced granular feel
  Sweet spot for most applications
  Clear brightness analysis possible
  Grain identity preserved

150-200ms: Medium-long, less granular, more phrase-like
  Individual grains more recognizable
  Good for source material preservation
  Recommended default range

200-300ms: Long, minimal granular character
  Grains become "chunks" or "phrases"
  Very clear brightness analysis
  Less abstract, more literal

>300ms: Very long, not really granular anymore
  More like segment rearrangement
  Consider other techniques

Overlap and Density

Overlap (grain_overlap):
  0.0-0.2: Low overlap, sparse texture, gaps between grains
  0.3-0.5: Moderate overlap (RECOMMENDED), balanced density
  0.6-0.8: High overlap, very dense, smooth texture
  >0.8: Extreme overlap, almost continuous

Density factor (density_factor):
  0.5-0.8: Sparse, fewer grains, clear separation
  1.0-1.5: Normal density (RECOMMENDED), balanced
  1.5-2.0: Dense, many grains, rich texture
  >2.0: Very dense, processing slower, potential mud

Combined effect:
  Low overlap + low density = very sparse (10-20 grains)
  High overlap + high density = very dense (200+ grains)
  Moderate both = balanced (roughly 95-145 grains for a 10s sound with 150ms grains, overlap 0.3, density 1.0-1.5, per the grain-count formula)

Spectral Exaggeration

Recommendation: ALWAYS enable exaggerate_spectral_differences
  • Without exaggeration: Brightness differences subtle, sorting barely perceptible
  • With Moderate (1.5×): Clear sorting trajectory, noticeable brightness arc
  • With Strong (2.0×): Dramatic effect, extreme brightness separation
  • Subtle (1.2×) for transparent processing, maintaining naturalness
  • Moderate recommended for most applications (clear without harshness)

Gap Between Grains

Gap effects by duration:

0ms (continuous):
  Grains blend seamlessly
  Sorting trajectory smooth but less obvious
  Dense, flowing texture
  Use: When continuity desired

20-30ms (subtle):
  Slight separation, still cohesive
  Sorting perceptible but not emphasized
  Natural rhythm

50-75ms (RECOMMENDED):
  Clear grain boundaries
  Sorting trajectory obvious
  Good balance of separation and flow
  Use: Default for most cases

100-150ms (distinct):
  Very clear grain separation
  Sorting extremely obvious
  Rhythmic, articulated
  Use: When grain identity important

>200ms (sparse):
  Disconnected grains
  Sorting overly obvious
  Fragmented, unmusical
  Use: Extreme experimental effects

Applications

Spectral Trajectory Creation

Use case: Creating perceptual journeys through frequency space

Technique: Dark→bright sorting with moderate exaggeration

Example workflow:

Timbral Exploration

Use case: Revealing hidden spectral content in sounds

Technique: Random grain size with Strong exaggeration

Benefits:

Reverse Trajectories

Use case: Creating descending spectral motion

Technique: Bright→dark sorting

Expressive effect:

Teaching Spectral Concepts

Use case: Demonstrating frequency analysis to students

Pedagogical value:

Compositional Techniques

🎼 Spectral Morphing

Method: Process two contrasting sources, concatenate results

Example:

  • Source A (bright): Cymbals, sorted Dark→bright
  • Source B (dark): Bass drum, sorted Dark→bright
  • Result: A starts with dark cymbals, B starts with (rare) bright bass elements
  • Both evolve to characteristic timbre, then crossfade
  • Creates spectral continuum between contrasting sources

🌈 Spectral Layering

Method: Create multiple sorted versions, layer with EQ

Example:

  • Version 1: Dark→bright, HPF >500Hz (treble trajectory)
  • Version 2: Bright→dark, LPF <1000Hz (bass trajectory)
  • Layer: Contrary motion in frequency space
  • Result: Complex spectral counterpoint

Sound Design Applications

Film/game audio:

Music production:

Practical Workflow Examples

🎯 Workflow: Voice Spectral Journey

Goal: Transform speech into spectral arc

Settings:

  • Source: 10-second voice recording (varied vowels/consonants)
  • grain_size_ms: 120 (balanced for voice)
  • grain_size_mode: Random, variation: 40ms (organic)
  • overlap: 0.3, density: 1.5 (moderate cloud)
  • exaggerate: ON, Moderate (clear trajectory)
  • sort_direction: Dark_to_bright (ascending)
  • gap_between_grains: 60ms (clear articulation)

Result: Speech becomes spectral sweep, vowels sorted by formant structure, intelligibility lost but frequency narrative clear

🎸 Workflow: Guitar Chord Decomposition

Goal: Separate chord into constituent frequencies

Settings:

  • Source: Sustained guitar chord (rich harmonics)
  • grain_size_ms: 180 (longer to capture pitch)
  • grain_size_mode: Fixed (uniform for comparison)
  • overlap: 0.4, density: 2.0 (dense for smooth arc)
  • exaggerate: ON, Strong (maximum separation)
  • sort_direction: Dark_to_bright
  • gap_between_grains: 0ms (continuous sweep)

Result: Chord "fans out" spectrally, bass notes first, treble last, harmonic structure revealed temporally

Troubleshooting Common Issues

Problem: Sorting not perceptible
Causes: Source lacks spectral variety (e.g., pure noise), exaggeration disabled, grains too short
Solutions: Enable exaggerate_spectral_differences with Moderate/Strong, use grain_size >100ms, choose harmonically rich source, increase gap_between_grains to 75-100ms
Problem: Result sounds harsh/distorted
Causes: Strong exaggeration on already bright source, excessive pitch scatter
Solutions: Reduce to Subtle exaggeration, lower pitch_scatter to 0.1-0.15, check source isn't already spectrally extreme
Problem: Too few grains generated
Causes: Source too short, grain_size too large, density_factor too low
Solutions: Increase density_factor to 2.0+, reduce grain_size_ms, increase overlap, use longer source (>5s recommended)
Problem: Processing very slow
Causes: High grain count (density × overlap high), pitch shifting every grain
Solutions: Reduce density_factor to 1.0, reduce overlap to 0.2, reduce pitch_scatter (less shifting), use shorter source for testing
Problem: Result too sparse/disconnected
Causes: Large gaps, low density, minimal overlap
Solutions: Reduce gap_between_grains to 20-40ms, increase density_factor to 1.5-2.0, increase overlap to 0.4-0.5

Technical Details

Brightness Measurement Accuracy

Spectral centroid calculation:

Praat's "Get centre of gravity: power" method:
  Centre of gravity = Σ(f × |X(f)|^power) / Σ(|X(f)|^power)
  The power is the exponent applied to the spectral magnitude, not to the frequency.

For power=2 (used in script):
  Each frequency is weighted by the power spectrum
  Centroid = Σ(f × magnitude²) / Σ(magnitude²)

Why power=2:
  Linear power (=1) weights by magnitude and is less discriminating
  Quadratic power (=2) weights by energy and better separates bright vs dark
  Higher powers let the strongest spectral peaks dominate the result

Accuracy factors:
  Grain duration: >50ms needed for accurate FFT
  Window function: Parabolic smooths spectrum, improves accuracy
  Source material: Harmonic content better than noise

Brightness ranges observed:
  Male speech (vowels): 400-1200Hz
  Female speech: 500-2000Hz
  Flute: 1500-3500Hz
  Double bass: 200-600Hz
  Cymbals: 2000-6000Hz+

Adaptive Processing Logic

Decision tree for grain treatment:

BRIGHTNESS ANALYSIS: Spectral centroid calculated → brightness value (Hz)

IF brightness > 1500Hz (BRIGHT CATEGORY):
  • Spectral emphasis: Boost >1000Hz by intensity factor
  • Brightness adjustment: × 1.3 (reflects boost)
  • Pitch shift target: +0.5 semitones (upward bias)
  • Pitch randomness: × 1.5 (increased variation)
  • Amplitude scaling: 0.35 (relatively loud)

ELSIF brightness < 800Hz (DARK CATEGORY):
  • Spectral emphasis: Boost <800Hz by intensity factor
  • Brightness adjustment: × 0.7 (reflects processing)
  • Pitch shift target: -0.3 semitones (downward bias)
  • Pitch randomness: × 0.8 (reduced variation)
  • Amplitude scaling: 0.25 (relatively quiet)

ELSE 800-1500Hz (MEDIUM CATEGORY):
  • Spectral emphasis: None (neutral)
  • Brightness adjustment: None
  • Pitch shift target: 0 semitones (neutral)
  • Pitch randomness: × 1.0 (standard variation)
  • Amplitude scaling: 0.3 (moderate level)

ADDITIONAL SCALING:
  • Duration-based: Short grains × 1.1, long grains × 0.9
  • Reverse adjustment: Reversed grains brightness × 0.9

Memory and Performance

Computational considerations:

Memory usage:
  Each grain = temporary Sound object in memory
  100 grains × ~150ms × 44.1kHz ≈ 660,000 samples (about 1.3MB at 16 bits per sample; more if samples are held at higher precision in memory)
  Peak memory: All grains before concatenation
  Typical: 1-2MB for moderate settings

Processing time breakdown:
  Grain extraction: ~10ms per grain
  Spectral analysis: ~15ms per grain (FFT)
  Spectral exaggeration: ~20ms per grain (if enabled)
  Pitch shifting: ~25ms per grain (if applied)
  Sorting: Negligible (< 1s for 500 grains)
  Concatenation: ~50ms total

Total estimate:
  100 grains: 3-5 seconds
  200 grains: 6-10 seconds
  500 grains: 15-25 seconds

Optimization tips:
  Lower density_factor reduces grain count (faster)
  Disable reverse_grains (skips some processing)
  Reduce pitch_scatter (less shifting needed)
  Fixed grain mode slightly faster than Random
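The memory figure can be sanity-checked with a short calculation. The sketch below is illustrative only: `grain_memory_estimate` is a hypothetical helper, and `bytes_per_sample` is an assumption that depends on how samples are stored (2 for 16-bit audio, 8 for 64-bit floating point).

```python
def grain_memory_estimate(num_grains, grain_s, sample_rate=44100,
                          bytes_per_sample=8):
    """Return (total samples, total bytes) for all grains held at once."""
    samples = round(num_grains * grain_s * sample_rate)
    return samples, samples * bytes_per_sample


samples, bytes_64bit = grain_memory_estimate(100, 0.15)
_, bytes_16bit = grain_memory_estimate(100, 0.15, bytes_per_sample=2)
```

For 100 grains of 150 ms at 44.1 kHz this gives 661,500 samples, which is about 1.3 MB at 2 bytes per sample and about 5.3 MB at 8 bytes per sample.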