LZ-Inspired Audio Variations — User Guide

Lempel-Ziv inspired audio analysis and variation: Automated segmentation, similarity detection, and algorithmic transformation using vectorization, sort-and-sweep, and matrix optimization.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 2.0 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script implements Lempel-Ziv inspired audio variation generation — an algorithmic approach to automatically analyze, segment, and transform audio based on pattern similarity detection. Inspired by LZ data compression algorithms, it identifies repeating patterns in audio and creates variations through transformation rules. The implementation includes three major optimizations: vectorization, sort-and-sweep, and matrix operations for efficient large-scale audio analysis.

Key Features:

What is LZ-inspired audio processing?

Inspired by Lempel-Ziv compression algorithms (LZ77/LZ78):

  1. Dictionary building: Identify repeating patterns in the data stream.
  2. Reference encoding: Replace repetitions with references to the dictionary.
  3. Adaptation for audio: Instead of compression, use pattern detection for creative variation.
  4. Key concepts: Sliding window analysis, similarity matching, pattern substitution.

This script adapts LZ principles:

  a. Segment audio into windows.
  b. Extract features per window.
  c. Build a similarity dictionary.
  d. Create variations by transforming similar segments.
  e. Recombine into a new composition.

Applications: Algorithmic composition, sound design, audio texture generation, pattern discovery in recordings.

Technical Implementation:

  1. Segmentation: Divide audio into overlapping windows (configurable size/overlap).
  2. Feature extraction: Analyze each window using pitch, spectral, or intensity features.
  3. Vectorization: Load features into arrays for fast memory access.
  4. Sort-and-sweep: Sort features to enable early breaking in comparisons.
  5. Matrix operations: For the correlation metric, use matrix multiplication for efficiency.
  6. Dictionary construction: Identify similar segments using distance metrics and a threshold.
  7. Variation generation: Apply one of six transformation methods to selected segments.
  8. Recomposition: Concatenate varied segments to the specified output duration.

Key insight: The brute-force O(n²) similarity comparison is reduced by sorting (O(n log n)) and early breaking. The sweep then only visits pairs whose primary features are close, which approaches O(n log n) when features are well spread but remains O(n²) in the worst case.

Quick start

  1. In Praat Objects window, select a Sound object to analyze.
  2. Open script: LZ_audio_variations.praat
  3. Configure segmentation:
    • Analysis_type: Pitch, Spectrum, or Intensity
    • Window_size: 0.1 seconds (100ms segments)
    • Overlap: 0.5 (50% overlap between windows)
  4. Set similarity detection:
    • Similarity_threshold: 0.8 (higher = more strict matching)
    • Distance_metric: Euclidean, Correlation, or Cosine
  5. Choose variation method:
    • Variation_method: 6 options from pitch shift to granular shuffle
    • Variation_amount: 0.5 (moderate transformation)
  6. Configure output:
    • Output_duration: 10 seconds (final composition length)
    • Randomize_dictionary_order: Yes (shuffle pattern selection)
    • Play_output: Yes (auto-play result)
  7. Click Run — script analyzes, builds dictionary, creates variations
  8. Output appears as "originalname_LZ_variation" in Objects window
Quick tip:
  • Start with Window_size = 0.1s for speech, 0.05s for fast audio, 0.2s for music.
  • Use Overlap = 0.5 for smooth transitions.
  • For pitch-based analysis of voice: analysis_type = Pitch, similarity_threshold = 0.7-0.8.
  • For texture generation: analysis_type = Spectrum, variation_method = Granular shuffle.
  • Check the Info window for performance metrics: the speedup factor shows optimization effectiveness.
  • Output duration controls the final length regardless of input. Set it to 2-3× the input duration for substantial variation.
  • Variation_amount = 0.2-0.3 for subtle changes, 0.7-0.9 for extreme transformation.
Important:
  • Computationally intensive: large files or small windows create many segments, and brute-force comparison grows as n². For example, 5 minutes of audio with 0.1s windows and no overlap gives about 3,000 windows and ~4.5M possible pairs; at the default overlap of 0.5 this roughly doubles to ~6,000 windows (~18M pairs). Sort-and-sweep reduces this dramatically, but the analysis is still memory intensive.
  • Window size affects results: too small = fragmented patterns, too large = coarse analysis.
  • Similarity_threshold is critical: 0.9+ finds near-identical segments, 0.6+ finds broadly similar ones.
  • Variation_method changes character drastically: pitch shift preserves timing, granular shuffle destroys continuity.
  • Randomize_dictionary_order: Yes for variation, No for predictable pattern repetition.
  • Output may be discontinuous: segments are concatenated without crossfade.

LZ Theory & Algorithm

Lempel-Ziv Compression Basis

🔤 Original LZ Algorithm (LZ77)

Data compression concept:

  1. Sliding window: Move through data with fixed look-ahead buffer
  2. Pattern matching: Find longest match between buffer and dictionary
  3. Reference encoding: Encode as (offset, length, next_char)
  4. Dictionary update: Add new patterns as they're discovered

Adaptation for audio:

ORIGINAL LZ77 (text):
  Input: "ABRACADABRA"
  Output (offset, length, next_char): (0,0,'A'), (0,0,'B'), (0,0,'R'), (3,1,'C'), (2,1,'D'), (7,3,'A')

ADAPTED LZ77 (audio):
  Input: Audio signal
  Process:
    1. Segment into windows [W1, W2, ..., Wn]
    2. Extract features [F1, F2, ..., Fn]
    3. Find similar windows: similarity(Fi, Fj) ≥ threshold
    4. Build dictionary: D = {(i,j): similar(i,j)}
    5. Generate variations: Transform windows based on dictionary
    6. Recompose: Concatenate transformed windows
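To make the text example concrete, here is a minimal Python sketch of a greedy LZ77 encoder and decoder. It is an illustration only and not part of the Praat script: the function names are hypothetical, the search window is unbounded, and ties between equally long matches are resolved in favour of the nearest one, which is an assumption of this sketch.

```python
def lz77_encode(text):
    """Greedy LZ77: emit (offset, length, next_char) triples."""
    triples = []
    pos = 0
    while pos < len(text):
        best_offset, best_length = 0, 0
        # Try every earlier start, nearest first, so ties keep the smallest offset.
        for start in range(pos - 1, -1, -1):
            length = 0
            # Stop one character short of the end so a literal next_char remains.
            while (pos + length < len(text) - 1
                   and text[start + length] == text[pos + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = pos - start, length
        triples.append((best_offset, best_length, text[pos + best_length]))
        pos += best_length + 1
    return triples


def lz77_decode(triples):
    """Rebuild the text by copying `length` characters from `offset` back."""
    out = []
    for offset, length, next_char in triples:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])
        out.append(next_char)
    return "".join(out)
```

Encoding "ABRACADABRA" with this sketch yields six triples, the last of which copies "ABR" from seven characters back and appends a literal 'A'. The audio adaptation replaces the exact character match with a similarity test between window features.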

Key differences from compression:

  • No compression goal — creative variation instead
  • Features instead of exact byte matching
  • Similarity threshold instead of exact match
  • Transformation rules instead of reference encoding

Segmentation Strategy

WINDOW CALCULATIONS:

Given:
  total_duration = length of input sound (seconds)
  window_size = analysis window size (seconds)
  overlap = overlap proportion (0-0.99)

Calculations:
  hop_size = window_size × (1 - overlap)
  num_windows = floor((total_duration - window_size) / hop_size) + 1

Window i properties:
  start_time[i] = (i - 1) × hop_size
  end_time[i] = start_time[i] + window_size
  center_time[i] = start_time[i] + (window_size / 2)

Example: 10s audio, window_size=0.1s, overlap=0.5
  hop_size = 0.1 × (1 - 0.5) = 0.05s
  num_windows = floor((10 - 0.1) / 0.05) + 1 = 199 windows
  Window 1: 0.0-0.1s
  Window 2: 0.05-0.15s (50% overlap)
  Window 199: 9.9-10.0s
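The window arithmetic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script's code: the function name is hypothetical, and the small epsilon is an addition of this sketch to guard against floating-point rounding just below a whole number.

```python
import math


def window_bounds(total_duration, window_size, overlap):
    """Return (start_time, end_time) for each analysis window."""
    hop_size = window_size * (1.0 - overlap)
    # Epsilon guards against results such as 197.99999999999997.
    num_windows = math.floor((total_duration - window_size) / hop_size + 1e-9) + 1
    return [(i * hop_size, i * hop_size + window_size) for i in range(num_windows)]


windows = window_bounds(10.0, 0.1, 0.5)
print(len(windows))
```

For the 10s example this reproduces the 199 windows given above, with the last window starting at about 9.9s. The sketch assumes the sound is at least one window long.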

Feature Extraction Methods

📊 Three Analysis Types

1. Pitch Analysis (analysis_type = 1):

  • Features extracted: Mean F0, Standard deviation of F0
  • Praat command: To Pitch: 0, 75, 600
  • Best for: Vocal audio, monophonic instruments, pitch-based patterns
  • Feature range: F0: 75-600 Hz (adjustable via script)

2. Spectral Analysis (analysis_type = 2):

  • Features extracted: Spectral centroid (CoG), Spectral standard deviation
  • Praat commands: To Spectrum → Get centre of gravity, Get standard deviation
  • Best for: Textural sounds, noise, complex timbres, polyphonic music
  • Feature range: CoG: 0-5000 Hz typically

3. Intensity Analysis (analysis_type = 3):

  • Features extracted: Mean intensity, Maximum intensity
  • Praat command: To Intensity: 100, 0, "yes"
  • Best for: Percussive sounds, amplitude envelopes, dynamic patterns
  • Feature range: Intensity: 0-100 dB typically

Distance Metrics

📐 Similarity Calculation Methods

1. Euclidean Distance (distance_metric = 1):

Given two windows i and j with features (f1, f2):
  distance = √[(f1_i - f1_j)² + (f2_i - f2_j)²]

Normalized similarity:
  similarity = 1 - (distance / max_possible_distance)

max_possible_distance depends on feature type:
  • Pitch: ~600 Hz (range 75-600)
  • Spectrum: ~5000 Hz
  • Intensity: ~100 dB
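As a worked illustration of the normalized Euclidean similarity, the following Python sketch uses the three maximum distances listed above. The function and dictionary names are hypothetical, and each window is assumed to be a (primary, secondary) feature pair.

```python
import math

# Normalization constants taken from the list above.
MAX_DISTANCE = {"pitch": 600.0, "spectrum": 5000.0, "intensity": 100.0}


def euclidean_similarity(features_i, features_j, analysis_type):
    """Similarity in (-inf, 1]; 1.0 means identical feature pairs."""
    distance = math.hypot(features_i[0] - features_j[0],
                          features_i[1] - features_j[1])
    return 1.0 - distance / MAX_DISTANCE[analysis_type]


# Two pitch windows: mean F0 200 vs 230 Hz, stdev 10 vs 50 Hz.
# distance = sqrt(30² + 40²) = 50, similarity = 1 - 50/600
print(euclidean_similarity((200.0, 10.0), (230.0, 50.0), "pitch"))
```

Because the secondary feature can only add to the distance, a large difference in the primary feature alone is enough to rule a pair out, which is what makes the early break in sort-and-sweep safe for this metric.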

2. Correlation Distance (distance_metric = 2):

Pearson correlation between feature vectors:
  r = cov(features_i, features_j) / (σ_i × σ_j)
  distance = 1 - r

Matrix optimization:
  Feature matrix M (n_windows × 1, the primary feature column)
  Correlation = (M × Mᵀ) / normalization
  Computed via matrix multiplication for speed

3. Cosine Distance (distance_metric = 3):

Cosine similarity between feature vectors:
  cos_sim = (f1_i·f1_j + f2_i·f2_j) / (||f_i|| × ||f_j||)
  distance = 1 - cos_sim

Simplified in script:
  distance = 1 - [min(|f1_i|,|f1_j|) / max(|f1_i|,|f1_j|)]
  (Uses only primary feature for speed)
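The two formulas above behave differently, which a short Python sketch can show. Both functions are illustrative (hypothetical names, not the script's code) and assume non-zero features, since either formula divides by zero otherwise.

```python
import math


def cosine_distance(features_i, features_j):
    """Full cosine distance over the (primary, secondary) feature pair."""
    dot = features_i[0] * features_j[0] + features_i[1] * features_j[1]
    norms = math.hypot(*features_i) * math.hypot(*features_j)
    return 1.0 - dot / norms


def ratio_distance(f1_i, f1_j):
    """Simplified form: ratio of the smaller to the larger primary feature."""
    smaller, larger = sorted((abs(f1_i), abs(f1_j)))
    return 1.0 - smaller / larger


a, b = (100.0, 10.0), (200.0, 20.0)   # same direction, double the magnitude
print(cosine_distance(a, b), ratio_distance(a[0], b[0]))
```

In this sketch the full cosine distance of the two example windows is essentially zero because they point in the same direction, while the simplified ratio gives 0.5. The simplified form is therefore sensitive to magnitude, which is worth keeping in mind when choosing a threshold.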

When to use each:

  • Euclidean: General purpose, magnitude-sensitive
  • Correlation: Pattern shape matching, magnitude-invariant
  • Cosine: Direction/orientation matching

Optimization Techniques

Optimization 1: Vectorization

⚡ Array-Based Processing

Problem: Table access in Praat is slow for large datasets

Solution: Load features into numeric arrays once

BEFORE (slow: table access on every comparison):
    for i to num_windows
        for j to num_windows
            # Two table lookups per pair
            f1_i = Get value: i, "mean_f0"
            f1_j = Get value: j, "mean_f0"
            # Compare...
        endfor
    endfor

AFTER (fast: array access):
    # Load once at start
    feature1# = zero# (num_windows)
    for i to num_windows
        feature1# [i] = Get value: i, "mean_f0"
    endfor
    # Fast comparisons: array lookups only
    for i to num_windows
        for j to num_windows
            f1_i = feature1# [i]
            f1_j = feature1# [j]
            # Compare...
        endfor
    endfor

Performance gain: 10-100× faster for large datasets

Arrays used:

  • start_times# — Window start times
  • end_times# — Window end times
  • feature1# — Primary feature (mean F0, CoG, mean intensity)
  • feature2# — Secondary feature (stdev F0, spectral stdev, max intensity)
  • original_indices# — Original window indices

Optimization 2: Sort-and-Sweep

🔍 Early Breaking Algorithm

Problem: O(n²) comparisons for n windows (e.g., 3000 windows = n(n-1)/2 ≈ 4.5M unique pairs)

Solution: Sort by primary feature, break inner loop when difference too large

ALGORITHM:
  1. Sort windows by primary feature (mean F0, CoG, or mean intensity)
  2. For each window i:
       f1_i = primary feature of window i
       For each window j > i:
         f1_j = primary feature of window j
         diff = |f1_j - f1_i|
         # Since sorted, if diff > threshold, all subsequent j will also have diff > threshold
         if diff > max_acceptable_diff:
           break   # Skip remaining comparisons for this i
         else:
           # Calculate full distance (including secondary feature)
           compute similarity...

max_acceptable_diff calculation:
  For pitch: 600 × (1 - similarity_threshold)
  For spectrum: 5000 × (1 - similarity_threshold)
  For intensity: 100 × (1 - similarity_threshold)

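Performance gain: Sorting costs O(n log n); the sweep then costs time proportional to the number of pairs whose primary features lie within max_acceptable_diff. This approaches O(n log n) when features are well spread and remains O(n²) in the worst case (all windows within the cutoff).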

Example with numbers:

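  • Windows sorted by mean F0: [100, 105, 110, 200, 205, 210] Hz
  • similarity_threshold = 0.8, max_diff = 600×(1-0.8) = 120 Hz
  • Comparing window 1 (100 Hz) with window 4 (200 Hz): diff = 100 Hz ≤ 120 → compare
  • Comparing window 1 with window 5 (205 Hz): diff = 105 Hz ≤ 120 → compare
  • At this threshold every pair differs by at most 110 Hz, so all 15 comparisons are still made, the same as the original algorithm
  • With similarity_threshold = 0.9, max_diff = 60 Hz: window 1 vs window 4 gives diff = 100 Hz > 60 → break. Only the 6 pairs inside the two clusters are compared; the other 9 are skipped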
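The sweep can be sketched in Python using the six mean-F0 values from the example above. This is a minimal illustration, not the script's Praat code: the function name is hypothetical, and only the primary-feature cutoff is applied (the full distance calculation is left out).

```python
def sort_and_sweep(primary, max_diff):
    """Return (candidate_pairs, comparisons_made, comparisons_skipped).

    candidate_pairs holds original window indices whose primary features
    differ by at most max_diff; only these need a full distance calculation.
    """
    order = sorted(range(len(primary)), key=lambda k: primary[k])
    pairs = []
    for a in range(len(order)):
        for b in range(a + 1, len(order)):
            # Sorted order: once one j is too far, every later j is too.
            if primary[order[b]] - primary[order[a]] > max_diff:
                break
            pairs.append((order[a], order[b]))
    total = len(primary) * (len(primary) - 1) // 2
    return pairs, len(pairs), total - len(pairs)


mean_f0 = [100, 105, 110, 200, 205, 210]
for max_diff in (120, 60):
    _, made, skipped = sort_and_sweep(mean_f0, max_diff)
    print(max_diff, made, skipped)
```

With a cutoff of 120 Hz the sketch still makes all 15 comparisons, because no pair in this data differs by more than 110 Hz. With a cutoff of 60 Hz it makes 6 and skips 9, since the two clusters are 90 Hz apart.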

Optimization 3: Matrix Operations

🧮 Correlation Matrix Calculation

Problem: Correlation calculation O(n² × m) where m = feature dimension

Solution: Use matrix multiplication for batch correlation

STANDARD APPROACH (slow):
    for i to n_windows
        for j to n_windows
            # O(n²) operations
            corr = correlation(features_i, features_j)
        endfor
    endfor

MATRIX APPROACH (fast):
    # Create feature matrix M (n_windows × 1)
    # Transpose: Mᵀ (1 × n_windows)
    # Correlation matrix = M × Mᵀ (n_windows × n_windows)

    # Praat implementation:
    matrix = To Matrix (column): "feature_column"
    transposed = Transpose
    correlation_matrix = Multiply: matrix, transposed

    # Access precomputed values: O(1) access after O(n²) setup
    for i to n_windows
        for j to n_windows
            corr_value = Get value in cell: i, j
        endfor
    endfor

Limitation: Only works for correlation metric with single feature column

Performance gain: One O(n²) batch multiplication up front, then an O(1) lookup per pair, instead of computing each pair's correlation separately inside the comparison loops

Memory tradeoff: Stores n² matrix vs calculating on demand

Performance Metrics

SCRIPT REPORTS OPTIMIZATION EFFECTIVENESS:

  Total possible comparisons (brute force): total = n × (n - 1) / 2
  Comparisons actually made: made = count of distance calculations
  Comparisons skipped: skipped = total - made (early-break savings)
  Speedup factor: speedup = total / made

Example from script output:
  Number of windows: 199
  Total possible comparisons: 199×198/2 = 19,701
  Comparisons made: 1,542
  Comparisons skipped: 18,159
  Speedup factor: 12.78x

Interpretation:
  • Sort-and-sweep skipped 92% of comparisons
  • 12.8× fewer distance calculations than brute force
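The reported figures follow directly from the formulas above, as this small Python check shows. The function name is hypothetical; the inputs are the window count and the number of comparisons made from the example output.

```python
def sweep_statistics(num_windows, comparisons_made):
    """Derive the brute-force total, skipped count, and speedup factor."""
    total = num_windows * (num_windows - 1) // 2
    skipped = total - comparisons_made
    return {
        "total": total,
        "skipped": skipped,
        "speedup": total / comparisons_made,
        "skipped_percent": 100.0 * skipped / total,
    }


stats = sweep_statistics(199, 1542)
print(stats["total"], stats["skipped"], round(stats["speedup"], 2))
```

Note that the speedup factor counts distance calculations avoided. Actual wall-clock gain also depends on feature extraction and synthesis time, which sort-and-sweep does not change.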

Parameters Guide

Segmentation Parameters

Parameter | Default | Range | Description
Analysis_type | Pitch | Pitch/Spectrum/Intensity | Feature extraction method. Pitch for melodic, Spectrum for timbral, Intensity for dynamic patterns
Window_size | 0.1 | 0.01-1.0 | Analysis window in seconds. Smaller = more detail but more windows. Speech: 0.05-0.2s, Music: 0.1-0.5s
Overlap | 0.5 | 0.0-0.99 | Overlap proportion between windows. 0.5 = 50% overlap. Higher = smoother analysis but more windows

Similarity Detection Parameters

Parameter | Default | Range | Description
Similarity_threshold | 0.8 | 0.0-1.0 | Minimum similarity for pattern matching. 0.9 = near-identical, 0.7 = broadly similar, 0.5 = loosely related
Distance_metric | Euclidean | Euclidean/Correlation/Cosine | Distance calculation method. Euclidean for magnitude, Correlation for pattern shape, Cosine for direction

Variation Parameters

Parameter | Default | Range | Description
Variation_method | Pitch shift | 6 methods | Transformation type. See the Variation Methods section below for each method
Variation_amount | 0.5 | 0.0-1.0 | Intensity of variation. 0.1 = subtle, 0.5 = moderate, 0.9 = extreme

Output Parameters

Parameter | Default | Range | Description
Output_duration | 10 | 0.1-3600 | Final output length in seconds. Independent of input duration. Can be shorter or longer
Randomize_dictionary_order | 1 (yes) | 0/1 | Random selection from dictionary vs sequential. Yes for variation, No for predictable patterns
Play_output | 1 (yes) | 0/1 | Auto-play result after processing

Parameter Interactions

Window Size vs Similarity Threshold:
  • Small windows (0.05s) + high threshold (0.9): Finds exact micro-patterns
  • Large windows (0.5s) + low threshold (0.6): Finds broadly similar sections
  • Medium windows (0.1-0.2s): Good balance for most audio
Analysis Type Recommendations:
Audio Type | Analysis Type | Window Size | Similarity
Speech | Pitch | 0.05-0.1s | 0.7-0.8
Singing | Pitch | 0.1-0.2s | 0.8-0.9
Percussion | Intensity | 0.02-0.05s | 0.6-0.7
Ambient texture | Spectrum | 0.2-0.5s | 0.5-0.6
Polyphonic music | Spectrum | 0.1-0.3s | 0.7-0.8

Variation Methods

Method 1: Pitch Shift

🎵 Pitch-Based Transformation

Algorithm:

  1. Convert segment to Manipulation object
  2. Extract PitchTier
  3. Apply shift formula:
       new_f0 = original_f0 × 2^(shift_semitones/12)
     Where:
       shift_semitones = randomGauss(0, 12 × variation_amount)
       variation_amount = parameter (0-1)
     Example: variation_amount = 0.5
       shift_semitones = randomGauss(0, 6)
       Most shifts within ±6 semitones
  4. Resynthesize with overlap-add

Effect: Transposes pitch while preserving timing and formants

Best for: Melodic variation, harmonic exploration

Parameters:

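  • variation_amount = 0.2: Subtle detuning (standard deviation 2.4 semitones)
  • variation_amount = 0.5: Moderate shifts (standard deviation 6 semitones)
  • variation_amount = 0.8: Extreme transposition (standard deviation 9.6 semitones)
  • The shift is Gaussian, so roughly two thirds of segments fall within one standard deviation and occasional shifts are larger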
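The shift formula can be sketched in Python as follows. This covers only the arithmetic of choosing a shift and scaling F0; the Manipulation/PitchTier resynthesis is done by Praat and is not modelled here. The function names are hypothetical, and Python's `random.Random.gauss` stands in for Praat's `randomGauss`.

```python
import random


def semitones_to_ratio(shift_semitones):
    """Frequency ratio for a shift in equal-tempered semitones."""
    return 2.0 ** (shift_semitones / 12.0)


def random_pitch_shift(original_f0, variation_amount, rng):
    """Draw a Gaussian shift (sd = 12 × variation_amount) and apply it."""
    shift_semitones = rng.gauss(0.0, 12.0 * variation_amount)
    return original_f0 * semitones_to_ratio(shift_semitones), shift_semitones


rng = random.Random(42)          # seeded only to make the sketch repeatable
new_f0, shift = random_pitch_shift(220.0, 0.5, rng)
print(round(shift, 2), round(new_f0, 1))
```

A shift of +12 semitones doubles the frequency and -12 halves it, so the spread of resulting pitches is symmetric in semitones but not in Hz.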

Method 2: Time Stretch

⏱️ Duration Transformation

Algorithm:

  1. Convert segment to Manipulation object
  2. Extract DurationTier
  3. Add single point at center:
       Add point: center_time, stretch_factor
     Where:
       stretch_factor = 1 + randomGauss(0, variation_amount)
       Constrained to [0.5, 2.0]
     Example: variation_amount = 0.5
       stretch_factor = 1 + randomGauss(0, 0.5)
       Typically 0.5-1.5× original duration
  4. Resynthesize with overlap-add

Effect: Changes duration while preserving pitch

Best for: Rhythmic variation, tempo changes

Parameters:

  • variation_amount = 0.2: typically 0.8-1.2× duration (subtle)
  • variation_amount = 0.5: typically 0.5-1.5× duration (moderate)
  • variation_amount = 0.8: typically 0.5-1.8× duration (extreme; the lower end is clamped to 0.5 by the [0.5, 2.0] constraint)
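The stretch factor and its constraint can be sketched in Python like this. Only the factor selection is shown; the DurationTier resynthesis itself happens in Praat. The function name is hypothetical, and `random.Random.gauss` stands in for `randomGauss`.

```python
import random


def stretch_factor(variation_amount, rng):
    """Gaussian stretch factor around 1.0, clamped to [0.5, 2.0]."""
    raw = 1.0 + rng.gauss(0.0, variation_amount)
    return min(2.0, max(0.5, raw))


rng = random.Random(7)           # seeded only to make the sketch repeatable
factors = [stretch_factor(0.8, rng) for _ in range(1000)]
print(min(factors), max(factors))
```

At high variation amounts a noticeable share of draws lands on the clamp limits, so segments at exactly 0.5× or 2.0× duration become more common than a plain Gaussian would suggest.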

Method 3: Amplitude Modulation

📈 Dynamic Transformation

Algorithm:

  1. Copy original segment
  2. Apply modulation formula:
       sample[t] = sample[t] × (1 + variation_amount × sin(2πft))
     Where:
       f = 10 × (1 + variation_amount × 10) Hz
       variation_amount = parameter (0-1)
     Example: variation_amount = 0.5
       f = 10 × (1 + 5) = 60 Hz modulation
       Amplitude varies at 60Hz with 50% depth

Effect: Adds tremolo/amplitude modulation

Best for: Adding movement to static sounds, rhythmic effects

Parameters:

  • variation_amount = 0.2: 30 Hz, 20% depth (subtle)
  • variation_amount = 0.5: 60 Hz, 50% depth (moderate)
  • variation_amount = 0.8: 90 Hz, 80% depth (strong)
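A short Python sketch of the modulation formula, assuming the segment is a plain list of samples at a known sample rate. The function names are hypothetical; the rate and depth follow f = 10 × (1 + variation_amount × 10) and depth = variation_amount as given in the algorithm.

```python
import math


def modulation_frequency(variation_amount):
    """Modulation rate in Hz for a given variation amount."""
    return 10.0 * (1.0 + variation_amount * 10.0)


def amplitude_modulate(samples, sample_rate, variation_amount):
    """Multiply each sample by 1 + depth × sin(2πft)."""
    f = modulation_frequency(variation_amount)
    return [
        s * (1.0 + variation_amount * math.sin(2.0 * math.pi * f * n / sample_rate))
        for n, s in enumerate(samples)
    ]


for amount in (0.2, 0.5, 0.8):
    print(amount, modulation_frequency(amount))
```

The gain swings between 1 - variation_amount and 1 + variation_amount, so the modulated segment can exceed the original peak by up to that proportion before the final normalization.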

Method 4: Spectral Filter

🎛️ Timbral Transformation

Algorithm:

  1. Convert segment to Spectrum
  2. Apply filtering formula:
       if frequency < cutoff then
         amplitude = amplitude × (1 - variation_amount)
       endif
     Where:
       cutoff = 1000 + randomUniform(-500, 500) × variation_amount
       variation_amount = parameter (0-1)
     Example: variation_amount = 0.5
       cutoff = 1000 ± 250 Hz (750-1250 Hz)
       Frequencies below cutoff reduced by 50%
  3. Convert back to Sound

Effect: Attenuates frequencies below a random cutoff (a low-cut, high-pass-style filter according to the formula above)

Best for: Timbre variation, thinning effects

Parameters:

  • variation_amount = 0.2: Cutoff 900-1100 Hz, 20% reduction
  • variation_amount = 0.5: Cutoff 750-1250 Hz, 50% reduction
  • variation_amount = 0.8: Cutoff 600-1400 Hz, 80% reduction
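The filtering rule can be sketched in Python on a list of (frequency, amplitude) bins. This is an illustration of the formula as written in the algorithm, not the script's Spectrum code: the function name and the bin representation are assumptions, and `random.Random.uniform` stands in for `randomUniform`.

```python
import random


def attenuate_below_cutoff(bins, variation_amount, rng):
    """Scale bins below a random cutoff by (1 - variation_amount).

    bins is a list of (frequency_hz, amplitude) pairs.
    Returns the filtered bins and the cutoff that was drawn.
    """
    cutoff = 1000.0 + rng.uniform(-500.0, 500.0) * variation_amount
    gain = 1.0 - variation_amount
    filtered = [(f, amp * gain if f < cutoff else amp) for f, amp in bins]
    return filtered, cutoff


rng = random.Random(0)           # seeded only to make the sketch repeatable
bins = [(100.0, 1.0), (500.0, 1.0), (2000.0, 1.0), (4000.0, 1.0)]
filtered, cutoff = attenuate_below_cutoff(bins, 0.5, rng)
print(round(cutoff), filtered)
```

With variation_amount = 0.5 the cutoff always falls between 750 and 1250 Hz, so in this example the 100 Hz and 500 Hz bins are halved and the bins at 2000 Hz and above are untouched.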

Method 5: Reverse

↪️ Temporal Transformation

Algorithm:

  1. With probability = variation_amount:
       Copy segment and reverse it
     Where:
       variation_amount = probability of reversal (0-1)
     Example: variation_amount = 0.5
       50% chance segment is reversed
       Otherwise remains unchanged

Effect: Randomly reverses segments

Best for: Glitch effects, surreal textures

Parameters:

  • variation_amount = 0.2: 20% reversed (sparse)
  • variation_amount = 0.5: 50% reversed (balanced)
  • variation_amount = 0.8: 80% reversed (mostly reversed)
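A minimal Python sketch of the probabilistic reversal, treating a segment as a list of samples. The function name is hypothetical and the random generator is passed in so the behaviour can be made repeatable.

```python
import random


def maybe_reverse(samples, variation_amount, rng):
    """Return a reversed copy with probability variation_amount."""
    if rng.random() < variation_amount:
        return samples[::-1]
    return list(samples)


rng = random.Random(5)           # seeded only to make the sketch repeatable
segment = [0.1, 0.2, 0.3, 0.4]
reversed_count = sum(maybe_reverse(segment, 0.5, rng) != segment for _ in range(1000))
print(reversed_count)
```

Because the decision is made independently per segment, the proportion reversed in a short output can differ noticeably from variation_amount; it converges only over many segments.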

Method 6: Granular Shuffle

🌀 Micro-Segmentation

Algorithm:

  1. Divide segment into grains:
       grain_size = 0.02 seconds (20ms)
       num_grains = floor(window_size / grain_size)
  2. Create array of grain indices
  3. Shuffle grain order randomly
  4. Extract grains in shuffled order
  5. Concatenate shuffled grains

Example: 0.1s segment (100ms)
  grain_size = 0.02s
  num_grains = 5
  Original order: [1,2,3,4,5]
  Shuffled order: [3,1,5,2,4]
  Result: Grains concatenated as 3→1→5→2→4

Effect: Scrambles micro-structure while preserving macro features

Best for: Textural transformation, granular synthesis effects

Parameters:

  • Grain size fixed: 0.02s (20ms), a typical grain length for granular effects
  • variation_amount: Not used for this method
  • Effect strength: Controlled by how many segments use this method
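The grain shuffle can be sketched in Python on a list of samples. The function name is hypothetical, and keeping any leftover tail shorter than one grain in place at the end is an assumption of this sketch; the guide does not say how the script treats it.

```python
import random


def granular_shuffle(samples, sample_rate, rng, grain_size=0.02):
    """Cut the segment into fixed-size grains and concatenate them in random order."""
    grain_len = max(1, round(grain_size * sample_rate))
    num_grains = len(samples) // grain_len
    grains = [samples[k * grain_len:(k + 1) * grain_len] for k in range(num_grains)]
    rng.shuffle(grains)
    shuffled = [s for grain in grains for s in grain]
    # Assumption: samples left over after the last whole grain stay at the end.
    shuffled.extend(samples[num_grains * grain_len:])
    return shuffled


rng = random.Random(3)           # seeded only to make the sketch repeatable
segment = list(range(100))       # 0.1 s at a toy sample rate of 1000 Hz
shuffled = granular_shuffle(segment, 1000, rng)
print([shuffled[k] // 20 + 1 for k in range(0, 100, 20)])   # grain order
```

Every sample survives exactly once, so the overall energy and spectrum of the segment are largely retained while the order within the window is scrambled at the 20ms scale.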

Applications

Algorithmic Composition

Use case: Generate new musical material from existing recordings

Technique: Use pitch analysis with moderate similarity threshold

Example workflow:

Sound Design

Use case: Create complex textures from simple source material

Technique: Use spectral analysis with granular shuffle

Example workflow:

Voice Processing

Use case: Transform speech into musical or textural material

Technique: Pitch analysis with extreme variations

Example workflow:

Audio Restoration

Use case: Fill gaps or damaged sections using similar intact material

Technique: High similarity threshold with time stretching

Example workflow:

Educational Tool

Use case: Demonstrate pattern recognition in audio

Technique: Vary parameters to show different similarity concepts

Learning objectives:

Practical Example Configurations

🎹 Melodic Recomposition

Goal: Create new melody from existing one

Settings:

  • Analysis_type: Pitch
  • Window_size: 0.15s
  • Similarity_threshold: 0.75
  • Distance_metric: Euclidean
  • Variation_method: Pitch shift
  • Variation_amount: 0.4
  • Output_duration: 30s

Result: New melodic line with similar contour but different pitches

🌊 Textural Transformation

Goal: Transform discrete sounds into continuous texture

Settings:

  • Analysis_type: Spectrum
  • Window_size: 0.08s
  • Similarity_threshold: 0.65
  • Distance_metric: Correlation
  • Variation_method: Granular shuffle
  • Output_duration: 60s

Result: Smooth, evolving texture from source sounds

🗣️ Speech Deconstruction

Goal: Deconstruct speech into abstract sound

Settings:

  • Analysis_type: Pitch
  • Window_size: 0.06s
  • Similarity_threshold: 0.85
  • Distance_metric: Cosine
  • Variation_method: Reverse, then Spectral filter (two passes, since each run applies a single method)
  • Variation_amount: 0.6
  • Output_duration: 45s

Result: Abstract, surreal version of original speech

Complete Workflow

Step-by-Step Process

🔧 Script Execution Flow

Phase 1: Setup & Initialization

  1. Validate input: Single Sound object selected
  2. Get sound properties: name, sample_rate, duration
  3. Calculate segmentation parameters:
     • hop_size = window_size × (1 - overlap)
     • num_windows = floor((duration - window_size)/hop_size) + 1
  4. Create features table with window metadata

Phase 2: Feature Extraction

For each window i (1 to num_windows):
  1. Calculate start_time, end_time
  2. Based on analysis_type:
     • Pitch: Extract mean F0 and stdev F0
     • Spectrum: Extract spectral centroid and stdev
     • Intensity: Extract mean and max intensity
  3. Store in features table

Optimization: Batch processing where possible (e.g., entire Pitch object created once for pitch analysis)

Phase 3: Vectorization

  1. Create numeric arrays:
     • start_times#, end_times#
     • feature1#, feature2# (primary/secondary features)
     • original_indices#
  2. Load data from table into arrays (single pass, O(n) operation)
  3. Memory benefit: Array access is faster than table access for subsequent comparisons

Phase 4: Sorting for Optimization

  1. Sort features table by primary feature:
     • Pitch: mean_f0
     • Spectrum: spectral_cog
     • Intensity: mean_intensity
  2. Reload arrays with sorted order (maintains connection between features and original indices)
  3. Enables sort-and-sweep optimization: Comparisons can break early when the feature difference is too large

Phase 5: Matrix Precomputation (Optional)

If distance_metric = Correlation:
  1. Create matrix from primary feature column
  2. Transpose matrix
  3. Multiply: correlation_matrix = matrix × transposed
  4. Result: Precomputed correlation values for all pairs

Tradeoff: O(n²) memory for O(1) access during comparison

Phase 6: Similarity Detection & Dictionary Building

For i = 1 to num_windows-1:
  For j = i+1 to num_windows:
    1. Get features from arrays: f1_i, f2_i, f1_j, f2_j
    2. Early break if |f1_j - f1_i| > max_acceptable_diff
    3. Calculate distance based on metric:
       • Euclidean: √[(f1_i-f1_j)² + (f2_i-f2_j)²]
       • Correlation: 1 - precomputed_value(i,j)
       • Cosine: 1 - min(|f1_i|,|f1_j|)/max(|f1_i|,|f1_j|)
    4. Normalize to similarity: 1 - (distance/max_distance)
    5. If similarity ≥ threshold:
       Add to dictionary: (window_i, window_j, distance)

Report statistics:
  • Comparisons made vs skipped
  • Speedup factor
  • Dictionary size

Phase 7: Variation Generation

num_output_windows = floor(output_duration / window_size)

For out_i = 1 to num_output_windows:
  1. Select source window:
     • If dictionary not empty and randomize_order: random pair from dictionary
     • Else: sequential from original
  2. Extract audio segment at window's time range
  3. Apply variation_method:
     • Method 1-6 as described in the Variation Methods section
  4. Store varied segment ID

Progress reporting every 10 windows

Phase 8: Recomposition & Output

  1. Concatenate all varied segments
  2. Rename: originalname + "_LZ_variation"
  3. Trim to exact output_duration if longer
  4. Scale peak to 0.99 (normalize)
  5. Display final statistics:
     • Output duration
     • Dictionary size
     • Performance metrics
  6. Auto-play if play_output enabled
  7. Clean up temporary objects
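The trim and peak-scaling steps of this phase can be sketched in Python on a list of samples. This is an illustration of the arithmetic only: the function name is hypothetical, and leaving an all-silent signal unscaled is an assumption added here to avoid dividing by zero.

```python
def finalize_output(samples, sample_rate, output_duration, peak=0.99):
    """Trim to output_duration, then scale so the largest |sample| equals peak."""
    max_samples = int(output_duration * sample_rate)
    trimmed = samples[:max_samples]
    loudest = max((abs(s) for s in trimmed), default=0.0)
    if loudest == 0.0:
        # Assumption: silence (or an empty signal) is returned unchanged.
        return list(trimmed)
    return [s * peak / loudest for s in trimmed]


concatenated = [0.5, -0.25, 0.1, 0.9]      # toy signal at 2 samples per second
print(finalize_output(concatenated, 2, 1.0))
```

Trimming before scaling, as in the step order above, means the peak is measured only over the audio that is actually kept. Here the discarded 0.9 sample does not influence the gain.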

Information Window Output

TYPICAL OUTPUT:

=== LZ-Inspired Audio Analysis (OPTIMIZED) ===
Analysis type: Pitch
Window size: 0.1 s
Overlap: 50%
Number of windows: 199

Analyzing...
Extracting pitch contours...
Loading features into memory...
Sorting features for efficient comparison...
Building similarity dictionary...
Found 542 similar pattern pairs
Comparisons made: 1,542
Comparisons skipped: 18,159
Speedup factor: 12.78x

Creating variations...
Processing window 10/100
Processing window 20/100
...
Processing window 100/100
Concatenating segments...

=== Complete ===
Output duration: 10.0 seconds
Dictionary size: 542 similar pairs
Output sound created: originalname_LZ_variation
Playing output...

Troubleshooting Common Issues

Problem: Script runs very slowly
Causes: Too many windows (small window_size), high overlap, large file
Solutions:
  • Increase window_size (0.2s instead of 0.05s)
  • Reduce overlap (0.3 instead of 0.7)
  • Use shorter input file
  • Check speedup factor in output: if < 2x, sort-and-sweep is not helping

Problem: No patterns found (dictionary size = 0)
Causes: Similarity_threshold too high, distance metric inappropriate
Solutions:
  • Lower similarity_threshold (0.6 instead of 0.9)
  • Try different distance_metric
  • Try different analysis_type
  • Check if features are being extracted correctly (undefined values?)

Problem: Output is chaotic/disjointed
Causes: Low similarity_threshold, extreme variation_amount
Solutions:
  • Increase similarity_threshold for more coherent patterns
  • Reduce variation_amount (0.3 instead of 0.8)
  • Use less disruptive variation_method (pitch shift instead of granular shuffle)
  • Increase window_size for longer, more coherent segments

Problem: Memory error or Praat crash
Causes: Too many windows creating large matrices, memory limits
Solutions:
  • Reduce number of windows (increase window_size, reduce overlap)
  • Use shorter input file
  • Avoid correlation metric with very large window counts
  • Increase Praat memory allocation in preferences

Problem: Output has clicks/pops between segments
Causes: No crossfade between concatenated segments
Solutions:
  • Increase overlap parameter (creates overlapping windows)
  • Apply crossfade manually after generation
  • Use smaller variation_amount for smoother transitions
  • This is inherent to the segment concatenation approach