Musikalisches Würfelspiel Audio Game — User Guide

Algorithmic audio recombination based on harmonic function analysis, creating musical dice games from any audio source through feature‑based classification and expressive reordering.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.0 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

Contents:

What this does
Quick start
Musical Theory
Processing Workflow
Parameters
Applications

What this does

This script implements a Musikalisches Würfelspiel (Musical Dice Game) that analyzes audio segments for harmonic‑functional qualities, then reorders them according to a T‑P‑D‑C (Tonic‑Predominant‑Dominant‑Cadence) pattern to create new musical structures from existing audio material.

Key Features:

Automatic Function Classification — Analyzes intensity, pitch, spectral features
Weighted Feature Analysis — Customizable importance of different audio characteristics
Live Visual Grid — Real‑time processing visualization with color‑coded functions
Expressive Phrasing — Ritardando and diminuendo at phrase endings
Musical Dice Game Logic — Random selection within functional constraints
Progressive Processing — Step‑by‑step visual feedback during recombination

Historical Context: The original Musikalisches Würfelspiel (Musical Dice Game) was an 18th‑century composition method in which musical fragments were selected by dice rolls to create minuets and other forms. This script modernizes the concept: audio segments classified by harmonic function replace the pre‑composed measures, and random selection among segments of the required function replaces the dice roll.

Analyzes audio segments through intensity, pitch, and spectral centroid features to classify them into harmonic functions (Tonic, Predominant, Dominant, Cadence), then reorders them according to a T‑P‑D‑C pattern while applying expressive phrasing (ritardando, diminuendo) at phrase endings, creating algorithmic recompositions from any audio source.

Quick start

In Praat, select exactly one Sound object (mono or stereo).
Run script… → musikalisches_wuerfelspiel.praat.
Set numberOfSegments (typically 16 or 32 for musical phrases).
Adjust feature weights if desired (defaults work well).
Enable/disable expressive effects (ritardando, diminuendo).
Set visualization and playback options.
Click OK — watch real‑time grid visualization as segments are reordered.
Result named "reordered_output" appears in Objects window.

Quick tip: Start with 16 segments for clear four‑phrase structure (4×4). Enable playDuringProcessing to hear each segment as it's placed. Watch the live grid visualization to understand the algorithm's decisions. Stereo files are auto‑converted to mono for analysis. Processing time depends on segment count.

Important: Classification depends on feature variation, so very uniform audio yields less meaningful classes. Ritardando and diminuendo only affect phrase endings (every 4th segment). A high numberOfSegments (>32) can create dense grids. visualizationDelay controls animation speed (lower = faster).

Musical Theory & Algorithm

Harmonic Function Analysis

🎲 The Four Functional Roles

Tonic (T) – Stability: Low tension segments. Function: Establish tonal center, resolution.

Predominant (P) – Preparation: Moderate tension. Function: Prepare for dominant, build energy.

Dominant (D) – Tension: High tension segments. Function: Create harmonic tension, need resolution.

Cadence (C) – Release: Very low tension. Function: Conclude phrases, release energy.

Feature Extraction & Tension Calculation

Three audio features analyzed per segment:

Mean Intensity — Overall loudness/energy (higher = more tension)
Mean Pitch — Fundamental frequency (higher = more tension)
Spectral Centroid — "Brightness" of spectrum (higher = more tension)

Feature normalization: z‑score = (value − mean) / standard deviation

Weighted tension score: Tension = (z₁ × intensityWeight + z₂ × pitchWeight + z₃ × spectralWeight) / (intensityWeight + pitchWeight + spectralWeight), where z₁, z₂, and z₃ are the z‑scores of mean intensity, mean pitch, and spectral centroid.

Function mapping:
Tension > 0.5 → Dominant (D)
0 < Tension ≤ 0.5 → Predominant (P)
−0.5 < Tension ≤ 0 → Tonic (T)
Tension ≤ −0.5 → Cadence (C)
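
The normalization, weighting, and threshold mapping can be sketched in Python as follows. This is an illustration only, not the Praat script itself: the function names are hypothetical, the default weights are taken from the parameter table, and the use of the population standard deviation is an assumption, since the guide does not say which variant the script uses.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Normalize feature values to z-scores (all zeros if there is no variation)."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def tension_scores(intensity, pitch, centroid,
                   intensity_weight=1.0, pitch_weight=0.5, spectral_weight=1.0):
    """Weighted average of the three z-scored features, one value per segment."""
    zi, zp, zc = z_scores(intensity), z_scores(pitch), z_scores(centroid)
    total = intensity_weight + pitch_weight + spectral_weight
    return [(a * intensity_weight + b * pitch_weight + c * spectral_weight) / total
            for a, b, c in zip(zi, zp, zc)]

def classify(tension):
    """Map a tension score to a function label using the thresholds above."""
    if tension > 0.5:
        return "D"
    if tension > 0:
        return "P"
    if tension > -0.5:
        return "T"
    return "C"

# Made-up feature values for four segments, for illustration only.
features = [1.0, 2.0, 3.0, 4.0]
labels = [classify(t) for t in tension_scores(features, features, features)]
print(labels)
```

Because every feature is normalized over the segments of the same file, the labels are relative: a segment is "Dominant" only in comparison with the other segments of that recording.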

Output Pattern

The script imposes a T‑P‑D‑C repeating pattern regardless of input:

With 16 segments the pattern is T‑P‑D‑C | T‑P‑D‑C | T‑P‑D‑C | T‑P‑D‑C (4 complete phrases).
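
A minimal Python sketch of how such a pattern could be generated. The function name is hypothetical, and cutting the cycle short when numberOfSegments is not a multiple of 4 is an assumption; the guide does not describe how the script treats that case.

```python
def output_pattern(number_of_segments):
    """Return the T-P-D-C cycle repeated over number_of_segments positions."""
    cycle = "TPDC"
    # Assumption: a count that is not a multiple of 4 ends with a partial phrase.
    return [cycle[i % 4] for i in range(number_of_segments)]

print("".join(output_pattern(16)))
print("".join(output_pattern(6)))
```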

Random Selection within Constraints

For each position in the output pattern, the script:

Identifies all input segments classified with the required function
Randomly selects one matching segment (dice‑roll equivalent)
If no segments match the function, selects any random segment

This creates constrained randomness — musical structure with stochastic variation.
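
The three selection rules above can be sketched in Python as follows. The names are hypothetical and indices are 0-based here, unlike Praat. Allowing the same input segment to be chosen more than once is an assumption; the guide does not say whether the script draws with or without replacement.

```python
import random

def pick_segments(labels, pattern, rng=None):
    """For each pattern position, pick a random input segment with that function."""
    if rng is None:
        rng = random.Random()
    chosen = []
    for wanted in pattern:
        candidates = [i for i, label in enumerate(labels) if label == wanted]
        if not candidates:
            # Fallback rule: no segment has this function, so any segment may be used.
            candidates = list(range(len(labels)))
        chosen.append(rng.choice(candidates))  # assumption: drawn with replacement
    return chosen

# Illustrative labels with no Predominant segment, to exercise the fallback.
labels = ["T", "T", "D", "C"]
print(pick_segments(labels, "TPDC", random.Random(0)))
```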

Processing Workflow

Segmentation

Audio divided into equal‑length segments (numberOfSegments). Stereo converted to mono for analysis.
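
As a rough illustration of equal-length segmentation, the boundaries can be computed from the total duration alone. The function name is hypothetical and the sketch works on times in seconds rather than on a Praat Sound object.

```python
def segment_bounds(duration, number_of_segments):
    """Return (start, end) times in seconds for equal-length segments."""
    length = duration / number_of_segments
    return [(i * length, (i + 1) * length) for i in range(number_of_segments)]

# An 8-second sound cut into 16 segments gives 0.5-second segments.
bounds = segment_bounds(8.0, 16)
print(bounds[0], bounds[-1])
```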

Feature Extraction

Each segment analyzed for mean intensity, mean pitch, spectral centroid. Features normalized to z‑scores.
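
The spectral centroid is commonly defined as a weighted mean frequency of the spectrum. The Python sketch below uses power weighting (magnitude squared), which is an assumption; the guide does not state how the script computes the centroid. The function name and the input values are hypothetical.

```python
def spectral_centroid(freqs, magnitudes, power=2.0):
    """Weighted mean frequency, with each bin weighted by magnitude ** power."""
    weights = [m ** power for m in magnitudes]
    total = sum(weights)
    if total == 0:
        return 0.0  # silent segment: no meaningful centroid
    return sum(f * w for f, w in zip(freqs, weights)) / total

# Two bins at 100 Hz and 300 Hz; the stronger upper bin pulls the centroid up.
print(spectral_centroid([100.0, 300.0], [1.0, 3.0]))
```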

Function Classification

Weighted tension score calculated. Segments classified as Tonic, Predominant, Dominant, or Cadence.

Pattern Assignment

T‑P‑D‑C repeating pattern created based on numberOfSegments.

Segment Selection

For each pattern position, random segment with matching function selected (dice‑roll).

Progressive Processing

Segments processed one by one with live grid visualization. Expression applied at phrase endings.

Concatenation

All processed segments concatenated into final output. Final visualization displayed.

Visual Grid System

The live visualization shows:

Grid cells — Each represents an output position
Color coding — Purple (T), Cyan (P), Magenta (D), Pink (C)
Text labels — →X = selected input segment number, F = function
Progressive filling — Cells fill left‑to‑right, top‑to‑bottom
Current position — Highlighted with thicker border

Visualization Tips: Set visualizationDelay to control animation speed. Disable playDuringProcessing for faster visual progression. The grid automatically adjusts dimensions based on numberOfSegments (square or nearly‑square layout).
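
One way to obtain a square or nearly square layout is to take the column count from the square root of the segment count. The Python sketch below is an assumption about how such a grid could be sized and filled, not the script's actual layout code; both function names are hypothetical.

```python
import math

def grid_shape(number_of_segments):
    """Return (rows, cols) for a square or nearly square grid."""
    # Smallest column count whose square covers all positions (integer arithmetic).
    cols = math.isqrt(number_of_segments - 1) + 1
    rows = math.ceil(number_of_segments / cols)
    return rows, cols

def cell_for(position, cols):
    """Return (row, col) of a 0-based position, filling left to right, top to bottom."""
    return divmod(position, cols)

for n in (12, 16, 24, 32):
    print(n, grid_shape(n))
```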

Parameters

Form Parameters

Parameter	Type	Default	Description
numberOfSegments	positive	16	Number of segments to create (4‑64)
intensityWeight	positive	1.0	Weight for intensity in classification
spectralWeight	positive	1.0	Weight for spectral centroid
pitchWeight	positive	0.5	Weight for pitch in classification
applyRitardando	boolean	1	Apply slowing at phrase endings
ritardandoModerate	positive	1.15	Slowdown factor at phrase ends
ritardandoFinal	positive	1.3	Slowdown factor at final cadence
applyDiminuendo	boolean	1	Apply fade‑out at phrase endings
diminuendoModerate	positive	0.6	Volume reduction at phrase ends
diminuendoFinal	positive	0.4	Volume reduction at final cadence
playDuringProcessing	boolean	1	Play each segment as placed
playFinalResult	boolean	0	Auto‑play complete output
visualizationDelay	positive	0.05	Delay between visual steps (seconds)

Parameter Details

Phrase Structure (numberOfSegments):
Determines how many segments the audio is divided into. It should be a multiple of 4 for a clear phrase structure (e.g., 16 = 4 phrases of 4 segments each). Minimum 4, maximum 64. Larger values give more granular but potentially fragmented results.

Feature Weighting Triad:
Controls which audio characteristics most influence function classification:
• intensityWeight — Loudness/energy importance
• pitchWeight — Fundamental frequency importance
• spectralWeight — Brightness/timbre importance
Higher weight = greater influence on tension calculation. Defaults emphasize intensity and brightness over pitch.

Musical Expression Pair:
Ritardando — Slows down segments at phrase endings (every 4th segment) and final cadence. Creates natural phrasing.
Diminuendo — Fades out segments at phrase endings. Creates dynamic shaping.
Moderate values affect phrase endings; final values affect last segment (strongest effect).
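
The choice between moderate and final values can be sketched in Python as follows. The function name is hypothetical and the defaults come from the parameter table. Treating the ritardando value as a duration multiplier and the diminuendo value as a gain multiplier is an assumption about how the script applies them.

```python
def expression_for(position, total, apply_ritardando=True, apply_diminuendo=True,
                   ritardando_moderate=1.15, ritardando_final=1.3,
                   diminuendo_moderate=0.6, diminuendo_final=0.4):
    """Return (duration_factor, gain_factor) for a 1-based output position."""
    stretch, gain = 1.0, 1.0
    if position == total:
        # Final cadence: strongest effect.
        if apply_ritardando:
            stretch = ritardando_final
        if apply_diminuendo:
            gain = diminuendo_final
    elif position % 4 == 0:
        # Phrase ending: every 4th segment.
        if apply_ritardando:
            stretch = ritardando_moderate
        if apply_diminuendo:
            gain = diminuendo_moderate
    return stretch, gain

for pos in (3, 4, 16):
    print(pos, expression_for(pos, 16))
```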

Playback & Visualization:
playDuringProcessing — Plays each segment as it's placed (helps understand selection).
playFinalResult — Auto‑plays complete output after processing.
visualizationDelay — Controls animation speed in seconds (smaller = faster, 0.1 = slow). The parameter is listed as a positive field, so the value must be greater than 0.

Applications

Algorithmic Composition

Use case: Create new musical pieces from existing audio fragments

Technique: Use speech, environmental sounds, or instrument samples as source

Example: Field recording → structured sound composition

Musical Analysis Tool

Use case: Analyze existing music for functional qualities

Technique: Process classical/romantic music to see T‑P‑D‑C structure

Example: Beethoven excerpt → functional segmentation visualization

Educational Tool

Use case: Teach harmonic function through audio manipulation

Technique: Students adjust weights, hear how classification changes

Learning outcomes:

Understand tension‑release in audio features
Hear effect of different functional orders
Explore relationship between acoustics and harmony

Sound Design & Film Scoring

Use case: Create structured soundscapes with narrative arc

Technique: Use environmental sounds with expressive phrasing

Example: City sounds → structured urban symphony

Practical Workflow Examples

🎵 Classical Analysis

Goal: Analyze classical piano piece for functional structure

Settings:

numberOfSegments: 32 (8 phrases)
intensityWeight: 1.0, pitchWeight: 0.8, spectralWeight: 0.7
applyRitardando: 1, applyDiminuendo: 1
playDuringProcessing: 0 (focus on visualization)

Result: Visual functional analysis of classical harmonic progression

🗣️ Speech Recomposition

Goal: Create musical structure from spoken word

Settings:

numberOfSegments: 16 (clear phrase structure)
intensityWeight: 1.2, pitchWeight: 0.3, spectralWeight: 0.8
applyRitardando: 1, ritardandoModerate: 1.2
playDuringProcessing: 1 (hear each selection)

Result: Speech transformed into rhythmically structured composition

🌊 Environmental Soundscape

Goal: Create structured nature soundscape

Settings:

numberOfSegments: 24 (6 phrases)
intensityWeight: 0.7, pitchWeight: 0.5, spectralWeight: 1.2
applyDiminuendo: 1, diminuendoFinal: 0.3
visualizationDelay: 0.1 (slow, meditative)

Result: Nature sounds organized into rising/falling tension arcs

Advanced Techniques

Weight Exploration Strategies:

Emphasize intensity: Creates drama‑based structure (loud = tension)
Emphasize pitch: Creates melody‑based structure (high = tension)
Emphasize spectral: Creates timbre‑based structure (bright = tension)
Balance all three: Holistic tension assessment

Segment Count Strategies:

16 segments: Clear 4‑phrase structure (T‑P‑D‑C × 4)
32 segments: Extended development (8 phrases)
12 segments: Unusual structure (3 phrases)
Prime numbers: A‑musical, experimental results

Troubleshooting

Problem: All segments classified as one function
Cause: Audio too uniform, or weights unbalanced
Solution: Adjust weights, use more varied source material

Problem: Output sounds fragmented/jarring
Cause: Segments too short, or transitions abrupt
Solution: Reduce numberOfSegments (longer segments), enable expression

Problem: Processing very slow
Cause: High numberOfSegments, or long audio
Solution: Reduce segment count, or process shorter excerpt

Problem: Visualization unclear/crowded
Cause: Too many segments for grid display
Solution: Reduce numberOfSegments to ≤36 for clear grid

Creative Extensions

Alternative Output Patterns

Modify script to use different functional patterns (e.g., T‑D‑T‑C, or user‑defined patterns).

Multi‑Layer Würfelspiel

Process multiple audio files simultaneously, creating layered recompositions.

Interactive Control

Add manual override options — let user select segments for certain positions.

Export Visualization

Save grid visualization as image file for documentation or presentation.

Pro tip: Run the script multiple times on the same audio with different Random_seed values (modify script) to generate multiple "performances" of the same dice game. Compare results to understand algorithm behavior.
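
The effect of a fixed seed can be illustrated outside Praat with a short Python sketch: the same seed reproduces the same "performance", while other seeds generally give other orderings. The function name, labels, and seed values are made up for illustration, and the selection logic is a simplified stand-in for the script.

```python
import random

def performance(labels, seed, pattern="TPDC" * 4):
    """Return one seeded ordering of 0-based input segment indices."""
    rng = random.Random(seed)  # private generator, so runs do not affect each other
    order = []
    for wanted in pattern:
        pool = [i for i, label in enumerate(labels) if label == wanted]
        order.append(rng.choice(pool or range(len(labels))))
    return order

# Eight illustrative segments, two per function.
labels = list("TTPPDDCC")
for seed in (1, 2, 3):
    print(seed, performance(labels, seed))
```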