Dramaturgical Structure Composer: User Guide

Advanced electroacoustic composition tool that reshapes musical form through spectral novelty detection, texture classification, and structural operations. It is NOT an FX processor, but a form architect.

Author: Shai Cohen
Version: 4.0 (2025)
Technique: Structural Analysis + Form Archetypes
Application: Praat scripting language

What this does

This script implements a Dramaturgical Structure Composer: a tool that analyzes the structural properties of an audio recording and reshapes its form according to compositional archetypes. Unlike processors that apply continuous effects, this tool recomposes the material by detecting sections, classifying their textures, and applying structural operations (reordering, looping, silence insertion, stretching, and transformed recalls) to create new musical forms.

🎭 What is Dramaturgical Composition?

Dramaturgy in music refers to the shaping of musical events to create narrative arcs, tension, release, and structural coherence. This tool implements dramaturgical principles through:

  • Spectral novelty detection: finding natural section boundaries where timbre changes
  • Silence detection: identifying structural gaps in the sound
  • Register tracking: measuring pitch centroid to classify texture brightness/darkness
  • Texture classification: categorizing sections as tonal, sparse, bright, dark, or noise
  • Form archetypes: applying classical structural patterns (arch, contrast, rondo, narrative)

The result is not an "effect" but a recomposition: a new musical structure derived from the source material's own characteristics.

Key Features (v4.0):

Quick start

  1. In Praat, select exactly one Sound object (minimum 20 seconds, any material).
  2. Run script… → select Dramaturgical_Structure_Composer.praat.
  3. Choose Strategy (Conservative, Dramatic, or Radical).
  4. Select Reorder_mode (Arch, Contrast, Rondo, Narrative, Random).
  5. Enable/disable options (looping, silences, stretching, recalls, tension arc).
  6. Click OK. The advanced settings dialog appears with additional controls.
  7. Configure advanced parameters (tension arc position, crossfade mode, silence mode, recall transformations).
  8. Click OK. The processor analyzes, reorders, applies operations, and creates the new structure.
Quick tip: Start with Dramatic strategy + Arch form on a recording with varied dynamics (orchestral, electronic, field recording). Enable visualization: you'll see detected sections color-coded by texture, then the reordered output. Listen for how the material has been reshaped into an arch structure (build to peak, then mirror down). For more radical transformation, try Radical strategy with Narrative form.

Important:
  • MINIMUM DURATION: Input must be at least 20 seconds for meaningful structural analysis.
  • MACRO-DYNAMICS PRESERVED: Unlike previous versions, the output maintains the original dynamic relationships between sections; only the tension arc (if enabled) modifies overall contour.
  • THIS IS NOT AN FX PROCESSOR: It reshapes form, not just timbre.
  • COMPUTATIONAL LOAD: Analysis of long files with many sections can take 30-60 seconds; Radical strategy with many operations may take several minutes.

Dramaturgical Structure Theory

The Analysis Pipeline

Analysis Flow:

1. Silence Detection
   └─ Identify gaps > min_silence_duration_s where intensity < silence_threshold_dB
2. Spectral Novelty Detection
   ├─ Compute spectrogram, measure frame-to-frame spectral difference
   ├─ Find peaks > novelty_threshold → section boundaries
   └─ Enforce min_section_duration_s and max_section_duration_s
3. Per-Section Analysis
   ├─ Spectral centroid → register (bright/dark)
   ├─ Harmonicity → tonal vs. noise
   ├─ Energy/RMS → dynamic level
   ├─ Texture classification (1-5):
   │    1 = tonal (high harmonicity)
   │    2 = sparse (low energy)
   │    3 = bright (centroid > high threshold)
   │    4 = dark (centroid < low threshold)
   │    5 = noise (otherwise)
   └─ Store RMS for dynamic preservation

Texture Classification

🎨 The Five Texture Types

| Code | Name | Criteria | Color in Viz |
|------|------|----------|--------------|
| 1 | Tonal | Harmonicity > harmonicity_threshold (0.15) | Light blue |
| 2 | Sparse | Energy < 0.001 (very quiet, isolated events) | Gray |
| 3 | Bright | Spectral centroid > spectral_centroid_high_hz (3000 Hz) | Light yellow |
| 4 | Dark | Spectral centroid < spectral_centroid_low_hz (500 Hz) | Blue-gray |
| 5 | Noise | None of the above (broadband, mixed) | Orange |
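
The table can be read as a simple decision chain. The following Python sketch is illustrative only and is not taken from the Praat script; the function name `classify_texture` and the assumption that the criteria are tested in code order (1, 2, 3, 4, then 5) are mine.

```python
def classify_texture(harmonicity, energy, centroid_hz,
                     harmonicity_threshold=0.15,
                     sparse_energy=0.001,
                     centroid_low_hz=500.0,
                     centroid_high_hz=3000.0):
    """Return a texture code 1-5 using the criteria in the table.

    Assumption: criteria are tested in code order, so a section that is
    both tonal and bright is reported as tonal.
    """
    if harmonicity > harmonicity_threshold:
        return 1  # tonal
    if energy < sparse_energy:
        return 2  # sparse
    if centroid_hz > centroid_high_hz:
        return 3  # bright
    if centroid_hz < centroid_low_hz:
        return 4  # dark
    return 5      # noise: none of the above


TEXTURE_NAMES = {1: "tonal", 2: "sparse", 3: "bright", 4: "dark", 5: "noise"}

# A weakly harmonic, audible section with a 3500 Hz centroid.
print(TEXTURE_NAMES[classify_texture(0.05, 0.02, 3500.0)])
```

If the script tests the criteria in a different order, sections that satisfy several criteria at once would be labeled differently.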

Texture distance formula (used for contrast maximization):

distance = 0.4 × |tex₁ - tex₂| + 0.3 × |cent₁ - cent₂|/5000 + 0.3 × |rms₁ - rms₂|/max(rms)
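
As a worked example of the formula, here is a minimal Python sketch. The dictionary keys (`texture`, `centroid`, `rms`) and the function name are hypothetical; only the weights and the 5000 Hz normalization come from the formula itself.

```python
def texture_distance(a, b, max_rms):
    """Weighted distance between two sections.

    a, b: dicts with 'texture' (code 1-5), 'centroid' (Hz) and 'rms'.
    max_rms: largest section RMS in the recording (must be positive).
    """
    if max_rms <= 0:
        raise ValueError("max_rms must be positive")
    return (0.4 * abs(a["texture"] - b["texture"])
            + 0.3 * abs(a["centroid"] - b["centroid"]) / 5000.0
            + 0.3 * abs(a["rms"] - b["rms"]) / max_rms)


tonal_dark = {"texture": 1, "centroid": 500.0, "rms": 0.1}
noisy_mid = {"texture": 5, "centroid": 3000.0, "rms": 0.3}

# 0.4*4 + 0.3*2500/5000 + 0.3*0.2/0.4 = 1.6 + 0.15 + 0.15
print(round(texture_distance(tonal_dark, noisy_mid, max_rms=0.4), 3))
```

Note that the texture term uses the numeric codes directly, so tonal (1) versus noise (5) contributes the largest possible texture difference.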

Tension Arc Envelope

Tension arc shapes the overall dynamic contour of the output:

f(x) = 0.3 + 0.7 × g(x)^(1/arc_exaggeration)

where:

g(x) = x / peak_position                               for x ≤ peak_position
g(x) = 1 - (x - peak_position) / (1 - peak_position)   for x > peak_position

  • arc_exaggeration > 1 = wider dynamic range
  • arc_exaggeration = 1 = preserves original dynamic range
  • arc_exaggeration < 1 = compresses (not recommended)

Example: peak_position = 0.65, exaggeration = 1.5
  • Builds slowly to peak at 65% of duration
  • Falls more quickly after peak
  • Exaggeration increases the depth of the arc
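
To make the envelope concrete, the following Python sketch evaluates f(x) exactly as written above for a normalized time x between 0 and 1. The function name and the clamping of x are my additions; the script's own implementation may differ.

```python
def tension_arc(x, peak_position=0.65, arc_exaggeration=1.5):
    """Gain multiplier f(x) for normalized time x in [0, 1]."""
    if not 0.0 < peak_position < 1.0:
        raise ValueError("peak_position must lie strictly between 0 and 1")
    if arc_exaggeration <= 0:
        raise ValueError("arc_exaggeration must be positive")
    x = min(max(x, 0.0), 1.0)  # clamp (an assumption, not from the guide)
    if x <= peak_position:
        g = x / peak_position
    else:
        g = 1.0 - (x - peak_position) / (1.0 - peak_position)
    g = max(g, 0.0)
    return 0.3 + 0.7 * g ** (1.0 / arc_exaggeration)


# Sample the default arc at a few points along the output.
for x in (0.0, 0.325, 0.65, 0.825, 1.0):
    print(x, round(tension_arc(x), 3))
```

With this formula the multiplier starts at 0.3, reaches 1.0 at the peak position, and returns to 0.3 at the end, whatever the exaggeration value.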

Texture-Aware Crossfades

Crossfade duration depends on the textures being joined:

| From → To | Duration | Dramaturgical Meaning |
|------------------|-----------|----------------------|
| Tonal → Tonal | 1.5 s | Smooth, connected |
| Tonal → other | 0.8 s | Transition |
| other → Tonal | 0.6 s | Gradual emergence |
| Sparse involved | 0.15 s | Let space breathe |
| Dense/Noise → Sparse | 0.01 s | Hard cut, dramatic |
| Silence → anything | 1.5 s | Slow fade-in (anticipation) |
| anything → Silence | 0.08 s | Quick fade-out |
| Noise → Noise | 0.4 s | Blended texture |
| Dark ↔ Bright | 0.05 s | Sharp contrast |

Fixed mode (30 ms) ignores texture relationships.
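
The table does not say which row wins when several apply (for example Tonal → Sparse). The Python sketch below is one possible reading, not the script's logic: it assumes the more specific rows are checked first, treats "Dense/Noise" as Noise only, uses a hypothetical code 0 for silence, and returns `None` for pairs the table does not list.

```python
TONAL, SPARSE, BRIGHT, DARK, NOISE = 1, 2, 3, 4, 5
SILENCE = 0                # hypothetical code for an inserted silence
FIXED_CROSSFADE_S = 0.03   # fixed mode: 30 ms


def crossfade_duration(src, dst, texture_aware=True):
    """Crossfade length in seconds when joining texture src to texture dst."""
    if not texture_aware:
        return FIXED_CROSSFADE_S
    if src == SILENCE:
        return 1.5    # slow fade-in
    if dst == SILENCE:
        return 0.08   # quick fade-out
    if src == NOISE and dst == SPARSE:
        return 0.01   # hard cut
    if SPARSE in (src, dst):
        return 0.15   # let space breathe
    if {src, dst} == {DARK, BRIGHT}:
        return 0.05   # sharp contrast
    if src == TONAL and dst == TONAL:
        return 1.5
    if src == TONAL:
        return 0.8
    if dst == TONAL:
        return 0.6
    if src == NOISE and dst == NOISE:
        return 0.4
    return None       # pair not covered by the table


print(crossfade_duration(TONAL, NOISE))
print(crossfade_duration(DARK, BRIGHT))
```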

Analysis & Detection

Analysis Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| Min_section_duration_s | 8 | Smallest allowed section (seconds) |
| Max_section_duration_s | 90 | Largest allowed section (clipped) |
| Novelty_threshold | 0.25 | Spectral change threshold for section boundaries |
| Silence_threshold_dB | -45 | Below this = silence |
| Min_silence_duration_s | 0.5 | Minimum silence to register (advanced) |
| Harmonicity_threshold | 0.15 | Above this = tonal |
| Spectral_centroid_low_hz | 500 | Below this = dark |
| Spectral_centroid_high_hz | 3000 | Above this = bright |

Spectral Novelty Algorithm

For each analysis frame (step = 0.1 s):

  1. Extract power spectrum (0-5000 Hz in 50 Hz bins)
  2. Compare with previous frame: novelty = Σ|current_bin - previous_bin| / 100
  3. Find local peaks > novelty_threshold
  4. Ensure at least min_section_duration_s between boundaries

Result: Section boundaries at moments of significant spectral change
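
Steps 2 to 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's code: the spectra are taken as already computed (one list of bin powers per 0.1 s frame), the start of the sound is treated as the first boundary, and the function names are hypothetical.

```python
def novelty_curve(frames):
    """Frame-to-frame novelty: sum of absolute bin differences divided by 100.

    frames: list of equal-length lists of bin powers, one per analysis step.
    """
    curve = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        curve.append(sum(abs(c - p) for c, p in zip(cur, prev)) / 100.0)
    return curve


def pick_boundaries(curve, step_s=0.1, novelty_threshold=0.25,
                    min_section_duration_s=8.0):
    """Times (s) of local novelty peaks above the threshold, kept only if
    they fall at least min_section_duration_s after the previous boundary."""
    boundaries = []
    last = 0.0  # assumption: the start of the sound counts as a boundary
    for i in range(1, len(curve) - 1):
        is_peak = curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]
        t = i * step_s
        if (is_peak and curve[i] > novelty_threshold
                and t - last >= min_section_duration_s):
            boundaries.append(t)
            last = t
    return boundaries


# Two-bin toy spectra: 10 s of one spectrum, then 10 s of another.
frames = [[0.0, 0.0]] * 100 + [[40.0, 0.0]] * 100
print(pick_boundaries(novelty_curve(frames)))
```

The toy input has a single spectral change at the 10 s mark, which produces one novelty peak of 0.4 and therefore one boundary.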

Silence Detection

Intensity analysis at 10 ms resolution:

    if intensity < silence_threshold_dB:
        start silence
    else:
        end silence if duration > min_silence_duration_s

  • Silences are stored separately from sections
  • Used for context-aware placement in operations
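
A runnable version of this loop might look like the Python sketch below. It assumes the intensity contour is already available as a list of dB values, one per 10 ms frame, on the same scale as silence_threshold_dB. The function name and the handling of a silence that runs to the end of the file are my assumptions.

```python
def detect_silences(intensity_db, frame_s=0.01,
                    silence_threshold_db=-45.0,
                    min_silence_duration_s=0.5):
    """Return (start_s, end_s) pairs for runs of frames below the threshold
    that last longer than min_silence_duration_s."""
    silences = []
    start = None  # index of the first frame of the current quiet run
    for i, level in enumerate(intensity_db):
        if level < silence_threshold_db:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) * frame_s > min_silence_duration_s:
                silences.append((start * frame_s, i * frame_s))
            start = None
    # Assumption: a quiet run that reaches the end of the file also counts.
    n = len(intensity_db)
    if start is not None and (n - start) * frame_s > min_silence_duration_s:
        silences.append((start * frame_s, n * frame_s))
    return silences


# 1 s of sound, 1 s of silence, 0.5 s of sound, a 0.2 s dip, 0.1 s of sound.
levels = [-20.0] * 100 + [-60.0] * 100 + [-20.0] * 50 + [-60.0] * 20 + [-20.0] * 10
print(detect_silences(levels))
```

The 0.2 s dip is shorter than min_silence_duration_s and is therefore not registered.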

Form Archetypes

πŸ›οΈ Arch Form (mode 2)

Structure: Build to peak at arc_peak_position, then mirror down

Algorithm:

  • Sort sections by RMS (energy) ascending
  • Place quietest sections at beginning and end
  • Place loudest section at peak position
  • Fill outward alternating left/right from peak

Dramaturgy: Gradual increase in intensity/density to climax, then relaxation

Example: Introduction → Build → Climax → Dissolution → Coda
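
One way to realize the arch placement is sketched below in Python. It is an interpretation, not the script's code: the peak slot is chosen by section index (rounded from arc_peak_position) rather than by elapsed time, and the remaining sections are placed from loudest to quietest, alternating left and right of the peak.

```python
def arch_order(rms, arc_peak_position=0.65):
    """Return section indices arranged so that loudness rises to a peak
    slot and falls away on both sides.

    rms: list of per-section RMS values in original detection order.
    """
    n = len(rms)
    if n == 0:
        return []
    by_loudness = sorted(range(n), key=lambda i: rms[i], reverse=True)
    peak = round(arc_peak_position * (n - 1))  # slot index (assumption)
    slots = [None] * n
    slots[peak] = by_loudness[0]
    left, right = peak - 1, peak + 1
    go_left = True
    for idx in by_loudness[1:]:
        # Alternate sides; fall back to the other side when one is full.
        if (go_left and left >= 0) or right >= n:
            slots[left] = idx
            left -= 1
        else:
            slots[right] = idx
            right += 1
        go_left = not go_left
    return slots


rms = [0.1, 0.5, 0.3, 0.2, 0.4]
order = arch_order(rms)
print(order, [rms[i] for i in order])
```

For the five sections above, the loudest one lands in the fourth slot and the quietest sections end up at the edges.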

⚑ Contrast Form (mode 3)

Structure: Maximize difference between adjacent sections

Algorithm (greedy):

  • Start with darkest section (lowest centroid)
  • For each position, pick unused section with maximum texture distance from previous
  • Continue until all sections placed

Dramaturgy: Juxtaposition, abrupt shifts, dialectical structure

Example: Dark → Bright → Sparse → Tonal → Noise
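
The greedy procedure can be sketched in Python as follows. The section dictionaries and function names are hypothetical; the distance is the texture distance formula given in the Texture Classification section.

```python
def contrast_order(sections):
    """Greedy ordering that maximizes the distance between neighbors.

    sections: list of dicts with 'texture' (1-5), 'centroid' (Hz), 'rms'.
    Returns a list of indices into sections.
    """
    if not sections:
        return []
    max_rms = max(s["rms"] for s in sections) or 1.0  # avoid dividing by 0

    def distance(a, b):
        return (0.4 * abs(a["texture"] - b["texture"])
                + 0.3 * abs(a["centroid"] - b["centroid"]) / 5000.0
                + 0.3 * abs(a["rms"] - b["rms"]) / max_rms)

    remaining = list(range(len(sections)))
    # Start with the darkest section (lowest centroid).
    first = min(remaining, key=lambda i: sections[i]["centroid"])
    order = [first]
    remaining.remove(first)
    while remaining:
        prev = sections[order[-1]]
        nxt = max(remaining, key=lambda i: distance(prev, sections[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


sections = [
    {"texture": 4, "centroid": 300.0, "rms": 0.2},
    {"texture": 3, "centroid": 4000.0, "rms": 0.4},
    {"texture": 1, "centroid": 1200.0, "rms": 0.1},
    {"texture": 5, "centroid": 2000.0, "rms": 0.3},
]
print(contrast_order(sections))
```

Because the choice is greedy, each step is locally optimal; the total contrast over the whole sequence is not guaranteed to be maximal.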

🔄 Rondo Form (mode 4)

Structure: Refrain alternates with contrasting episodes

Algorithm:

  • Find most "distinctive" section (max harmonicity + centroid deviation) as refrain
  • Order: Refrain → Episode₁ → Refrain → Episode₂ → Refrain → ...
  • Final Refrain at end

Dramaturgy: Recurring material provides unity; episodes provide variety

Example: A B A C A D A (classical rondo)
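
The interleaving step is easy to sketch in Python. The refrain index is passed in directly here, because the exact "distinctiveness" score is not spelled out in this guide; the function name is hypothetical.

```python
def rondo_order(num_sections, refrain):
    """Interleave the refrain with every other section, in original order,
    and end on the refrain."""
    if not 0 <= refrain < num_sections:
        raise ValueError("refrain must be a valid section index")
    order = [refrain]
    for i in range(num_sections):
        if i != refrain:
            order += [i, refrain]
    return order


# Four sections labeled A-D with section A (index 0) as the refrain.
labels = "ABCD"
print(" ".join(labels[i] for i in rondo_order(4, refrain=0)))
```

With four sections this yields the A B A C A D A pattern of the example, so the output contains the refrain four times and is correspondingly longer than the input.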

📖 Narrative Form (mode 5)

Structure: Dark → Building → Bright → Recall of opening → Fade

Algorithm:

  • Sort sections by spectral centroid (brightness) ascending
  • Place darkest first, brightest at ~70% of duration
  • Append recall of opening section at end

Dramaturgy: Story arc: exposition → development → climax → recollection → dissolution

Example: Dark introduction → developing middle → bright climax → distant memory → fade
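
The guide does not say how the sections other than the darkest and brightest are distributed, so the Python sketch below fills that gap with assumptions of its own: the 70% point is approximated by section index rather than elapsed time, sections before the climax run in ascending brightness, and those after it run in descending brightness. The recall is represented simply by repeating the opening section's index.

```python
def narrative_order(centroids, peak_fraction=0.7):
    """Return section indices for a dark -> bright -> recall ordering.

    centroids: per-section spectral centroid (Hz) in detection order.
    The last entry of the result repeats the opening section (the recall).
    """
    n = len(centroids)
    if n == 0:
        return []
    ascending = sorted(range(n), key=lambda i: centroids[i])
    peak_slot = round(peak_fraction * (n - 1))   # slot index (assumption)
    before = ascending[:peak_slot]               # darkest first, rising
    after = ascending[peak_slot:n - 1][::-1]     # falling after the climax
    order = before + [ascending[-1]] + after
    return order + [order[0]]                    # recall of the opening


centroids = [2000.0, 400.0, 3500.0, 900.0, 1500.0]
order = narrative_order(centroids)
print(order, [centroids[i] for i in order])
```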

🎲 Random Swap (mode 6)

Structure: Random permutations of section order

Algorithm: Perform N random swaps (N = floor(num_sections/2))

Dramaturgy: Chance operations, unpredictable structure

Use: Experimental work, aleatoric composition
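
A minimal Python sketch of the swap procedure follows. The `seed` parameter is my addition for reproducibility and is not a documented option of the script.

```python
import random


def random_swap_order(num_sections, seed=None):
    """Start from the original order and perform floor(n/2) random swaps."""
    rng = random.Random(seed)
    order = list(range(num_sections))
    if num_sections < 2:
        return order  # nothing to swap
    for _ in range(num_sections // 2):
        i, j = rng.sample(range(num_sections), 2)  # two distinct positions
        order[i], order[j] = order[j], order[i]
    return order


print(random_swap_order(6, seed=1))
```

Because only floor(n/2) swaps are made, part of the original order usually survives, and a later swap can undo an earlier one.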

Structural Operations

Strategy Settings

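| Strategy | Loop Prob | Silence Prob | Stretch Prob | Recall Prob | Max Ops | Silence Range | Stretch Range |
|----------|-----------|--------------|--------------|-------------|---------|---------------|---------------|
| Conservative | 0.2 | 0.15 | 0.2 | 0.25 | 3 | 2 s | 0.7-1.4× |
| Dramatic | 0.4 | 0.35 | 0.4 | 0.4 | 6 | 8 s | 0.4-2.5× |
| Radical | 0.6 | 0.5 | 0.5 | 0.6 | 10 | 20 s | 0.25-4.0× |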

Operation Types

Operation 1: LOOP
  • Insert (N-1) copies of target section after its position
  • N = random(2,3)
  • Only applies to sections 15-60s long
  • Creates repetitive structures, ostinati

Operation 3: SILENCE INSERTION
  • Context-aware: placed after highest-energy section
  • Duration = 3s + (contrast/maxRMS) × silence_range
  • Higher contrast = longer silence (dramatic breath)
  • Can use digital zero or organic noise tail

Operation 4: TIME STRETCH
  • Target: tonal, bright, or noise sections (avoid sparse)
  • Factor: random between min/max for strategy
  • Stretch > 1 = slower, < 1 = faster
  • Preserves pitch (overlap-add resynthesis)

Operation 5: MATERIAL RECALL
  • Insert transformed copy of earlier section
  • Optional transformations:
      - Reverse (35% chance)
      - Low-pass filter (recall_lowpass_hz)
      - Amplitude reduction (recall_amplitude_factor)
  • Creates "echoes," "memories," "distant recalls"
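
The numeric rules for loops, silences, and stretches can be tried out with the Python sketch below. The dictionary layout and function names are hypothetical, the ranges are copied from the Strategy Settings table, and `contrast` is assumed to be an RMS difference on the same scale as `max_rms`, since the guide does not define it further.

```python
import random

# Ranges from the Strategy Settings table.
STRATEGIES = {
    "Conservative": {"silence_range_s": 2.0, "stretch_range": (0.7, 1.4)},
    "Dramatic": {"silence_range_s": 8.0, "stretch_range": (0.4, 2.5)},
    "Radical": {"silence_range_s": 20.0, "stretch_range": (0.25, 4.0)},
}


def silence_duration_s(contrast, max_rms, strategy):
    """Duration = 3 s + (contrast / maxRMS) * silence_range."""
    return 3.0 + (contrast / max_rms) * STRATEGIES[strategy]["silence_range_s"]


def loop_copies_to_insert(section_duration_s, rng):
    """N - 1 extra copies with N drawn from {2, 3}; 0 outside 15-60 s."""
    if not 15.0 <= section_duration_s <= 60.0:
        return 0
    return rng.randint(2, 3) - 1


def stretch_factor(strategy, rng):
    """Random factor within the strategy's range (>1 slower, <1 faster)."""
    low, high = STRATEGIES[strategy]["stretch_range"]
    return rng.uniform(low, high)


rng = random.Random(0)
print(silence_duration_s(contrast=0.2, max_rms=0.4, strategy="Dramatic"))
print(loop_copies_to_insert(30.0, rng), stretch_factor("Radical", rng))
```

A contrast equal to half of the maximum RMS under the Dramatic strategy gives 3 + 0.5 × 8 = 7 seconds of silence.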

Advanced Settings Dialog

| Parameter | Default | Description |
|-----------|---------|-------------|
| Arc_peak_position | 0.65 | Where climax occurs (0-1) |
| Arc_exaggeration | 1.5 | Dynamic range multiplier |
| Crossfade_mode | Texture-aware | Fixed (30 ms) or texture-based durations |
| Silence_mode | Noise tail | Digital zero or organic noise from source |
| Recall_apply_lowpass | 1 | Filter recalled material |
| Recall_reduce_amplitude | 1 | Quieter recalls |
| Recall_allow_reverse | 1 | 35% chance of reversed recall |
| Recall_amplitude_factor | 0.7 | Amplitude multiplier for recalls |
| Recall_lowpass_hz | 2500 | Cutoff for low-pass filter |
| Keep_debug_objects | 0 | Preserve intermediate sounds |

Visualization & Analysis

Visualization Panels

Dramaturgical Structure Composer v4.0 Visualization:

Panel 1: TITLE
  • Script name, input filename, strategy, reorder mode, section count, operation count

Panel 2: ORIGINAL SECTIONS
  • X-axis: Time, Y-axis: Arbitrary (0-1)
  • Color-coded rectangles = detected sections:
      - Light blue = tonal
      - Orange = noise
      - Gray = sparse
      - Light yellow = bright
      - Blue-gray = dark
  • Black vertical lines = section boundaries
  • Small dark blue bars = RMS energy (height = normalized amplitude)
  • Section numbers centered in each rectangle

Panel 3: OUTPUT WAVEFORM
  • Blue waveform of final composition
  • Duration displayed
  • Operation count and arc status shown

Panel 4: TENSION ARC (if enabled)
  • Red curve showing dynamic envelope
  • Peak position marked
  • Y-axis: multiplier (0.3-1.0)

Reading the Section Visualization

What the colors tell you:
  • Light blue (tonal): Pitched, harmonic material; likely melodic or harmonic content
  • Orange (noise): Broadband, unpitched; textures, effects, breath, friction
  • Gray (sparse): Low energy, isolated events; silence, single notes, gaps
  • Light yellow (bright): High spectral centroid; high frequencies prominent
  • Blue-gray (dark): Low spectral centroid; bass, low register, muffled

The dark blue bars show relative RMS energy: taller bars = louder sections. This helps identify climax sections and dynamic structure.

Section numbers correspond to original detection order. In the reordered output, these numbers are rearranged according to the selected archetype.

Diagnostic Info Output

The Info window displays:
  • Number of silences detected
  • Number of sections detected with timestamps
  • Each section's texture classification, centroid, RMS
  • Final section order with arrow notation
  • List of planned operations with parameters
  • Output duration relative to input

Applications

Electroacoustic Composition

Use case: Generating new structural versions of existing recordings

Technique: Dramatic strategy with Narrative or Arch form

Workflow:

Sound Design for Media

Use case: Creating evolving textures with narrative arcs

Technique: Radical strategy with Contrast or Rondo form

Applications:

Research & Education

Use case: Demonstrating structural analysis and recomposition

Technique: Enable visualization, compare archetypes on same source

Learning outcomes:

Experimental Music

Use case: Chance operations and aleatoric structure generation

Technique: Radical strategy with Random swap and high operation probabilities

Approach:

Practical Workflow Examples

🎬 Film Scene: Building Tension

Goal: Create suspenseful cue that builds to climax

Settings:

  • Strategy: Dramatic
  • Reorder mode: Arch
  • Tension arc: ON, peak at 0.7, exaggeration 1.8
  • Operations: Allow silences, stretching, recalls
  • Source: Ambient drone + occasional impacts

Result: Quiet sections at beginning and end, loudest at 70% point, with dramatic silences after peaks

🎚️ Rondo from Field Recording

Goal: Create rondo structure from environmental sounds

Settings:

  • Strategy: Dramatic
  • Reorder mode: Rondo
  • Texture-aware crossfades: ON
  • Operations: Loops and recalls allowed
  • Source: 2-minute field recording (birds, traffic, wind)

Result: Most distinctive sound becomes refrain, alternates with other textures. This creates "theme and variations" from found sound

🌀 Radical Narrative Transformation

Goal: Extreme recomposition of speech into abstract narrative

Settings:

  • Strategy: Radical
  • Reorder mode: Narrative
  • Tension arc: ON, peak at 0.65, exaggeration 2.0
  • All operations allowed, high probabilities
  • Source: 60-second spoken monologue

Result: Speech becomes abstract: dark/filtered recalls of opening, stretched phonemes, compressed silences. This creates a dreamlike narrative arc

Troubleshooting Common Issues

Problem: Too many/few sections detected
Cause: Novelty_threshold too low/high for material
Solution: Adjust novelty_threshold (0.15-0.35 range), check min_section_duration
Problem: Sections misclassified (tonal vs. noise)
Cause: Harmonicity_threshold inappropriate for source
Solution: Adjust harmonicity_threshold (0.1-0.25), or modify centroid thresholds
Problem: Output much longer/shorter than expected
Cause: Many operations (loops, silences) adding/subtracting time
Solution: Reduce max_operations, check stretch factors, or use Conservative strategy
Problem: Abrupt cuts between sections
Cause: Crossfade_mode is set to fixed (30 ms), or texture-aware mode is choosing one of its short crossfades (0.01-0.15 s)
Solution: If in fixed mode, switch to texture-aware mode; otherwise manually increase the crossfade durations in the crossfade procedure
Problem: Recalls inaudible or too quiet
Cause: recall_amplitude_factor too low, or low-pass filter too aggressive
Solution: Increase recall_amplitude_factor (0.8-0.9), raise recall_lowpass_hz (3000-4000)

Advanced Techniques

Customizing texture classification:
  • More tonal-sensitive: Reduce harmonicity_threshold to 0.08
  • More noise-sensitive: Increase harmonicity_threshold to 0.2
  • Narrow brightness range: Set centroid_low=1000, centroid_high=2000
  • Broad classification: Use wider thresholds (300-4000 Hz)
Operation probability tuning (edit script):
  • Loop-heavy: Increase loop_probability, reduce others
  • Silence-heavy: Increase silence_insert_probability, silence_duration_range_s
  • Stretch-heavy: Increase stretch_probability, widen stretch_factor range
  • Recall-heavy: Increase recall_probability, enable all recall transformations
Organic silence generation:

When silence_mode = 2, the script extracts the last 2 seconds of the source, heavily low-passes it (0-400 Hz), and scales to very low amplitude (0.02 peak). This creates a natural ambient floor rather than digital silence. For different "silence colors," modify the filter frequencies or use different source regions.