HFD-Driven Time Warping — User Guide

Adaptive time‑stretching based on signal complexity: uses Higuchi Fractal Dimension (HFD) to measure local signal irregularity, then warps time proportionally to create dynamic, content‑aware temporal transformations.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 2.1 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script implements HFD‑driven time warping — an adaptive time‑stretching technique that adjusts stretching factors based on local signal complexity measured by the Higuchi Fractal Dimension (HFD). Unlike uniform time‑stretching, which applies the same factor to the entire signal, this method stretches complex/irregular regions more (or less) than simple/regular regions, creating dynamic temporal transformations that respect the signal's inherent structure.

Key Features:

What is Higuchi Fractal Dimension (HFD)? A measure of signal complexity/irregularity originally developed for time‑series analysis. HFD quantifies how "fractal‑like" a signal is—higher values indicate more complex, irregular patterns; lower values indicate simpler, more regular patterns. For audio: noise has high HFD (~2.0), pure sine waves have low HFD (~1.0), speech and music have intermediate values. By measuring HFD frame‑by‑frame, we can identify which parts of a signal are complex (consonants, transients, texture) vs. simple (vowels, sustained tones).

Technical Implementation:
  1. Preprocessing: Convert to mono, downsample for analysis, apply material‑specific filtering.
  2. HFD Analysis: Calculate Higuchi Fractal Dimension for each overlapping frame.
  3. Voicing Analysis: Optional harmonicity‑based voicing detection to protect voiced segments.
  4. Mapping: Convert HFD values to stretch factors using configurable curves and voicing influence.
  5. Smoothing & Limiting: Apply temporal smoothing and slew‑rate limiting to prevent artifacts.
  6. Time Warping: Create Praat DurationTier and resynthesize using overlap‑add.
  7. Visualization: Display HFD, stretch factors, voicing, and before/after waveforms.

Quick start

  1. In Praat, select exactly one Sound object (mono or stereo).
  2. Run script…hfd_time_warping.praat.
  3. Choose a Preset (start with "Moderate" for balanced results).
  4. Select Material_type (Speech vs. Music/Field Recording).
  5. Enable Draw_visualization to see the analysis curves.
  6. Click OK — the warped sound appears as "originalname_HFDwarp_presetname".
  7. Listen and compare original vs. warped; adjust parameters as needed.
Quick tip: Start with the Moderate preset for balanced results on most material. Use Material_type = Speech for spoken word, Music/Field Recording for music or environmental sounds. Enable visualization to see how HFD values map to stretch factors: complex regions (high HFD) get stretched more if max_stretch_factor > 1. The voicing gate (enabled by default) protects voiced (harmonic) regions from extreme stretching. For experimental effects, try the Glitch preset with the voicing gate disabled.
Important: This is a destructive time‑stretching operation: duration changes permanently. Very short sounds (<1 s) may not analyze properly. Extreme stretching factors (<0.4 or >2.5) can create artifacts. The script converts stereo to mono for analysis and also outputs mono, so stereo input yields mono output. Processing time increases with audio duration and frame density. Visualization can be CPU‑intensive for long files; disable it for faster processing. Output is normalized to ‑0.95 dBFS.

HFD & Time Warping Theory

Higuchi Fractal Dimension

Mathematical Definition

For a time series x(1), x(2), ..., x(N):

1. Construct k new time series, one for each start offset m = 1, 2, ..., k:
   Xₘᵏ = {x(m), x(m+k), x(m+2k), ..., x(m + floor((N‑m)/k)·k)}
2. Calculate the length Lₘ(k) of each series, summing over i = 1, ..., floor((N‑m)/k):
   Lₘ(k) = [∑ |x(m+ik) − x(m+(i‑1)k)| · (N‑1) / (floor((N‑m)/k)·k)] / k
3. Average over m:
   L(k) = ⟨Lₘ(k)⟩
4. If L(k) ∝ k⁻ᴰ, then D is the fractal dimension:
   HFD = slope of ln(L(k)) vs. ln(1/k)

Interpretation:
  • HFD ≈ 1.0: Smooth, predictable signal (sine wave)
  • HFD ≈ 1.5: Brownian noise
  • HFD ≈ 2.0: White noise
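The definition above can be sketched in a few lines of Python. This is an illustration only, not the script's Praat code: the function name higuchi_fd and its defaults are assumptions made for this example, and the slope is found with an ordinary least‑squares fit.

```python
import math


def higuchi_fd(x, k_max=5):
    """Estimate the Higuchi fractal dimension of a 1-D sequence (illustrative sketch)."""
    n = len(x)
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):          # 1-based start offset, as in the definition
            n_steps = (n - m) // k         # floor((N - m) / k)
            if n_steps < 1:
                continue                   # sub-series too short to have a length
            total = sum(abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                        for i in range(1, n_steps + 1))
            norm = (n - 1) / (n_steps * k)
            lengths.append(total * norm / k)
        if not lengths:
            continue
        mean_len = sum(lengths) / len(lengths)   # L(k), averaged over m
        if mean_len > 0:                         # ln(0) is undefined (flat signal)
            log_inv_k.append(math.log(1.0 / k))
            log_len.append(math.log(mean_len))
    if len(log_inv_k) < 2:
        return float("nan")                      # not enough scales for a regression
    # Least-squares slope of ln(L(k)) against ln(1/k)
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_len) / len(log_len)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den
```

A straight ramp gives a value of 1, and uncorrelated random samples give a value close to 2, which matches the interpretation listed above.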

📊 HFD Values for Common Audio

Simple periodic tones (sine, saw, square): 1.0‑1.3

Voiced speech (vowels): 1.2‑1.5

Unvoiced speech (fricatives): 1.5‑1.8

Musical tones (sustained): 1.3‑1.6

Percussion transients: 1.6‑1.9

White/pink noise: 1.8‑2.0

Complex textures (grains, crowds): 1.7‑2.0

Practical range in analysis: Typically 1.0‑2.0, with most audio 1.2‑1.8.

Windowed HFD Implementation

The script calculates HFD per analysis frame:

For each frame (typically 50 ms):
  1. Extract frame samples (optionally Hann‑windowed)
  2. Remove DC offset (subtract mean)
  3. Calculate L(k) for k = 1 to k_max (typically 4‑6)
  4. Linear regression: ln(L(k)) vs. ln(1/k)
  5. Slope = HFD for that frame

Parameters:
  • frame_length_s: 0.03‑0.06 s (30‑60 ms)
  • hop_size_s: Usually equals frame length (50% overlap if set to half the frame length)
  • k_max: Maximum scale factor (4‑6 typical)
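To make the framing concrete, here is a small Python sketch of how frame boundaries and per‑frame preparation could be computed. It is not taken from the Praat script: the helper names frame_bounds and prepare_frame are hypothetical, and dropping a trailing partial frame is an assumption of this example.

```python
import math


def frame_bounds(n_samples, sample_rate, frame_length_s=0.05, hop_size_s=0.05):
    """Return (start, end) sample indices (end exclusive) for every full frame."""
    frame_len = int(round(frame_length_s * sample_rate))
    hop = int(round(hop_size_s * sample_rate))
    if frame_len < 1 or hop < 1:
        raise ValueError("frame length and hop size must cover at least one sample")
    bounds = []
    start = 0
    while start + frame_len <= n_samples:   # assumption: a partial last frame is dropped
        bounds.append((start, start + frame_len))
        start += hop
    return bounds


def prepare_frame(samples, skip_windowing=False):
    """Optionally apply a Hann window, then remove the DC offset."""
    n = len(samples)
    if n == 0:
        return []
    frame = list(samples)
    if not skip_windowing and n > 1:
        frame = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                 for i, s in enumerate(frame)]
    mean = sum(frame) / n
    return [s - mean for s in frame]
```

With hop_size_s equal to frame_length_s the frames tile the signal without overlap; halving the hop roughly doubles the number of frames, which is where the extra processing time comes from.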

Time Warping Principles

From HFD to Stretch Factors

The core mapping logic:

Given:
  • HFD_min, HFD_max: Observed or configured HFD range
  • stretch_min, stretch_max: User‑defined stretch limits
  • HFD_frame: HFD value for current frame

Mapping:
  1. Normalize: norm = (HFD_frame − HFD_min) / (HFD_max − HFD_min)
  2. Clamp: norm ∈ [0, 1]
  3. Apply curve: norm' = f(norm) [linear, square, sqrt, quantized]
  4. Stretch factor: s = stretch_min + norm' × (stretch_max − stretch_min)

Result:
  • Low HFD → near stretch_min
  • High HFD → near stretch_max
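The four mapping steps can be written out as a short Python sketch. The function name hfd_to_stretch, the curve labels, and the number of quantization levels (steps) are assumptions made for this illustration; the script's own curve definitions may differ in detail.

```python
def hfd_to_stretch(hfd, hfd_min, hfd_max, stretch_min, stretch_max,
                   curve="linear", steps=4):
    """Map one HFD value to a stretch factor (illustrative sketch)."""
    if hfd_max <= hfd_min:
        norm = 0.5                      # assumption: degenerate range maps to the midpoint
    else:
        norm = (hfd - hfd_min) / (hfd_max - hfd_min)
    norm = min(1.0, max(0.0, norm))     # clamp to [0, 1]

    if curve == "linear":
        shaped = norm
    elif curve == "square":
        shaped = norm ** 2
    elif curve == "sqrt":
        shaped = norm ** 0.5
    elif curve == "quantized":
        levels = max(2, steps)          # assumption: at least two discrete levels
        shaped = round(norm * (levels - 1)) / (levels - 1)
    else:
        raise ValueError("unknown curve: " + curve)

    return stretch_min + shaped * (stretch_max - stretch_min)
```

For example, with an HFD range of 1.2 to 1.8 and stretch limits of 0.7 and 1.5, a frame with HFD 1.5 sits halfway through the range and maps to about 1.1 on the linear curve, while the square curve pulls the same frame down to about 0.9.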

Voicing‑Aware Adjustment

🎵 Protecting Voiced Content

Problem: HFD alone doesn't distinguish between desirable complexity (textures) and undesirable complexity (noise in voiced regions).

Solution: Use harmonicity (HNR) as voicing measure:

  • High harmonicity → voiced (vowels, musical tones)
  • Low harmonicity → unvoiced (noise, consonants)

Adjustment formula:

effective_stretch = 1.0 + (1 − voicing) × (stretch − 1.0)

Where voicing ∈ [0,1] (0=unvoiced, 1=fully voiced)

Result: Voiced regions resist stretching (fully voiced frames return to a factor of 1.0), unvoiced regions follow the HFD mapping.
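A minimal Python sketch of the voicing adjustment follows. It is an illustration, not the script's Praat code. It assumes the (1 − voicing) weighting, so that fully voiced frames return to a factor of 1.0 as the stated result requires, and it assumes that Voicing_influence acts as a simple linear blend between the voicing‑adjusted value and the raw HFD‑mapped value. The function name apply_voicing is hypothetical.

```python
def apply_voicing(stretch, voicing, voicing_influence=0.7):
    """Pull the stretch factor of voiced frames toward 1.0 (illustrative sketch).

    stretch           -- raw stretch factor from the HFD mapping
    voicing           -- 0.0 (unvoiced) to 1.0 (fully voiced)
    voicing_influence -- 0.0 (gate has no effect) to 1.0 (full protection)
    """
    voicing = min(1.0, max(0.0, voicing))
    # Fully voiced -> 1.0 (no stretching); unvoiced -> unchanged stretch
    adjusted = 1.0 + (1.0 - voicing) * (stretch - 1.0)
    # Assumed blend: influence weights the adjusted value against the raw value
    return voicing_influence * adjusted + (1.0 - voicing_influence) * stretch
```

Under these assumptions, a fully voiced frame with a raw factor of 2.0 and an influence of 0.7 ends up at 0.7 × 1.0 + 0.3 × 2.0 = 1.3, while an unvoiced frame keeps its raw factor of 2.0.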

Artifact Control Mechanisms

Smoothing Strategies

Three‑level smoothing pipeline:

1. HFD smoothing: Moving average over smoothing_window_size frames
Purpose: Reduce frame‑to‑frame HFD fluctuation

2. Stretch factor smoothing: Moving average over final_stretch_smooth frames
Purpose: Create smooth stretch factor trajectory

3. Slew‑rate limiting: max_stretch_change_per_sec constraint
Formula: |s[i] − s[i‑1]| ≤ max_change × hop_size
Purpose: Prevent abrupt stretch changes ("zipper" artifacts)

Percentile mapping: Use 5th‑95th percentile instead of min‑max for robustness against outliers.
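The smoothing, slew‑rate limiting, and percentile steps can be sketched in Python as below. The helper names are hypothetical, and two details are assumptions of this example rather than documented behavior of the script: the moving average shrinks its window at the edges, and the percentile uses linear interpolation between sorted values.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges (assumption)."""
    half = max(0, window // 2)
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def limit_slew(values, max_change_per_sec, hop_size_s):
    """Enforce |s[i] - s[i-1]| <= max_change_per_sec * hop_size_s."""
    max_step = max_change_per_sec * hop_size_s
    out = list(values[:1])
    for v in values[1:]:
        prev = out[-1]
        delta = max(-max_step, min(max_step, v - prev))
        out.append(prev + delta)
    return out


def percentile(values, p):
    """p-th percentile (0-100) with linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])
```

With a 4.0/s limit and a 0.05 s hop, the factor may move by at most 0.2 per frame, so a jump from 1.0 to 2.0 is spread over five frames instead of happening at once.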

Praat DurationTier Implementation

The final time warping uses Praat's built‑in resynthesis:

  1. Create Manipulation object from original sound
  2. Extract/Replace DurationTier with computed stretch factors
  3. Resynthesize using overlap‑add (PSOLA‑like)

Advantages:
  • High‑quality time‑stretching (Praat's algorithm)
  • Non‑uniform stretching (different factors at different times)
  • Preserves pitch (time‑scale modification only)

DurationTier points: (time, stretch_factor)
Example: (0.5 s, 1.2) means "at time 0.5 s, stretch by 1.2×"
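To estimate the output length before resynthesis, the duration curve can be integrated over the sound. The Python sketch below assumes that the factor is interpolated linearly between DurationTier points and held constant before the first and after the last point; treat that interpolation rule, and the function names, as assumptions of this example.

```python
def factor_at(points, t):
    """Piecewise-linear factor with constant extrapolation; points sorted by time."""
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return v1
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]


def warped_duration(points, t_start, t_end):
    """Area under the stretch-factor curve between t_start and t_end."""
    if not points:
        return t_end - t_start            # no points: duration unchanged
    pts = sorted(points)
    # Integrate segment by segment; the trapezoid rule is exact for linear pieces
    knots = [t_start] + [t for t, _ in pts if t_start < t < t_end] + [t_end]
    return sum(0.5 * (factor_at(pts, a) + factor_at(pts, b)) * (b - a)
               for a, b in zip(knots, knots[1:]))
```

A single point (0.5 s, 1.2) on a 1 s sound gives 1.2 s, and a factor rising linearly from 1.0 to 2.0 across a 1 s sound gives 1.5 s, that is, the original duration times the average factor.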

Preset Strategies

Five Preset Configurations

Preset       | Stretch Range | Smoothing      | Voicing          | Slew Limit | Character
1. Subtle    | 0.85‑1.15×    | Heavy (7)      | High (80%)       | 3.0/s      | Gentle, natural‑sounding
2. Moderate  | 0.7‑1.5×      | Medium (5)     | Medium (70%)     | 4.0/s      | Balanced, artistic
3. Dramatic  | 0.5‑2.0×      | Light (4)      | Medium‑low (60%) | 6.0/s      | Expressive, noticeable
4. Extreme   | 0.4‑2.5×      | Very light (3) | Low (50%)        | 8.0/s      | Radical transformation
5. Glitch    | 0.4‑2.5×      | Minimal (2)    | None (0%)        | 20.0/s     | Digital artifacts, choppy

🎚️ Preset Deep Dive: Moderate

Design philosophy: Artistic time‑warping that's clearly audible but still musical.

Parameters: Stretch 0.7‑1.5× (compression and expansion), medium smoothing (5 frames), 70% voicing influence, 4.0/s slew limit.

Result: Complex regions (consonants, transients) are stretched up to 1.5×, simple regions (vowels, sustained tones) compressed to 0.7×. Voicing protection prevents extreme manipulation of harmonic content. Smooth changes maintain natural flow.

⚡ Preset Deep Dive: Extreme

Design philosophy: Push time‑warping to its limits for experimental sound design.

Parameters: Stretch 0.4‑2.5× (severe compression/expansion), light smoothing (3 frames), 50% voicing influence, 8.0/s slew limit, skip_windowing enabled.

Result: Dramatic temporal manipulation where complex sections can stretch to 2.5× original length while simple sections compress to 40%. Reduced smoothing creates more abrupt transitions. The lower voicing influence (50%) allows more manipulation of harmonic content.

🔧 Preset Deep Dive: Glitch

Design philosophy: Embrace digital artifacts for glitch‑art aesthetic.

Parameters: Stretch 0.4‑2.5×, minimal smoothing (2 frames), no voicing gate, 20.0/s slew limit, quantized mapping, skip_windowing enabled.

Result: Abrupt, stepped time changes that create digital‑sounding artifacts. No voicing protection means harmonic content gets wildly stretched. Quantized mapping creates discrete stretch levels (like bit‑crushing for time). High slew limit allows instant stretch changes.

Material‑Type Adaptation

🎵 Speech vs. Music Processing

Speech‑specific processing (Material_type = 1):

  • High‑pass filter: 100 Hz cutoff to remove rumble
  • Voicing gate: Typically enabled (protects vowels)
  • Stretch range: Often asymmetric (compress simple, expand complex)
  • Typical use: Speech rate modification, dramatic effect, glitch art

Music/Field‑Recording processing (Material_type = 2):

  • High‑pass filter: 30 Hz cutoff (preserve bass)
  • Voicing gate: Optional (depends on material)
  • Stretch range: Often symmetric or tailored to genre
  • Typical use: Tempo manipulation, texture stretching, ambient creation

Parameters & Controls

Analysis Parameters (Custom Mode)

Parameter       | Type     | Default | Description
Frame_length_s  | positive | 0.05    | Analysis frame length in seconds (0.03‑0.10 typical)
Hop_size_s      | positive | 0.05    | Time between analysis frames (usually = frame_length)
K_max           | integer  | 5       | Maximum scale factor for HFD calculation (4‑6 typical)

Smoothing & Mapping Parameters

Parameter              | Type    | Default | Description
Smoothing_window_size  | integer | 5       | Moving average window for HFD smoothing (3‑9 typical)
Min_stretch_factor     | real    | 0.5     | Minimum time‑stretch factor (0.4‑0.9 typical)
Max_stretch_factor     | real    | 2.0     | Maximum time‑stretch factor (1.1‑2.5 typical)
Use_percentile_mapping | boolean | 1       | Use 5th‑95th percentile instead of min‑max (robust to outliers)
Mapping_curve          | option  | Linear  | HFD‑to‑stretch mapping: Linear, Emphasize extremes, Emphasize changes, Quantized steps

Voicing Control Parameters

Parameter             | Type    | Default | Description
Use_voicing_gate      | boolean | 1       | Enable harmonicity‑based voicing protection
Voicing_influence     | real    | 0.7     | Strength of voicing protection (0‑1, 0=none, 1=full)
Voicing_smooth_window | integer | 3       | Smoothing window for voicing track (1‑7 typical)

Control Smoothing Parameters

Parameter                  | Type    | Default | Description
Final_stretch_smooth       | integer | 3       | Final smoothing window for stretch factors (1‑7 typical)
Max_stretch_change_per_sec | real    | 5.0     | Maximum stretch factor change per second (prevents "zipper" artifacts)

Pitch Settings

Parameter        | Type     | Default | Description
Minimum_pitch_Hz | positive | 75      | Minimum pitch for harmonicity analysis (Hz)
Maximum_pitch_Hz | positive | 600     | Maximum pitch for harmonicity analysis (Hz)

Speed Optimization Parameters

Parameter         | Type    | Default | Description
Downsample_factor | integer | 6       | Downsampling factor for analysis (higher = faster but less accurate)
Skip_windowing    | boolean | 0       | Skip Hann windowing in HFD calculation (faster but less accurate)

Output Parameters

Parameter          | Type    | Default | Description
Draw_visualization | boolean | 1       | Draw analysis curves and waveforms after processing
Play_result        | boolean | 1       | Play the warped sound automatically
Parameter Interactions & Tips:
  • Frame_length_s vs. Hop_size_s: Equal values = no overlap; half values = 50% overlap (smoother but slower).
  • K_max: Higher values = more accurate HFD but slower; 4‑6 works for most audio.
  • Min/Max_stretch_factor: Values <1 = compression (speed up), >1 = expansion (slow down). Asymmetric ranges (e.g., 0.7‑1.5) create interesting effects.
  • Mapping_curve: "Emphasize extremes" exaggerates differences; "Quantized steps" creates stepped/stutter effects.
  • Voicing_influence: 0.7 means the final stretch factor is blended 70% from the voicing‑adjusted value and 30% from the raw HFD mapping.
  • Max_stretch_change_per_sec: Critical for natural‑sounding results; too high causes abrupt changes, too low limits dynamic range.

Applications

Speech Processing & Effects

Use case: Create dramatic speech effects, alter speech rate dynamically, generate glitch‑art vocals.

Technique: Use Material_type = Speech with Moderate or Dramatic preset. Enable voicing gate to protect vowels while stretching consonants.

Example: Process spoken word with Dramatic preset (0.5‑2.0×) — consonants elongate, vowels compress, creating "dragging" effect on fricatives while maintaining vowel intelligibility.

Musical Time‑Manipulation

Use case: Create dynamic tempo changes, stretch specific musical elements, generate rhythmic variations.

Technique: Use Material_type = Music with Moderate preset. Adjust stretch range based on desired effect (0.8‑1.2× for subtle, 0.5‑2.0× for extreme).

Workflow:

Sound Design & Texture Creation

Use case: Transform field recordings into evolving textures, create time‑based granular effects.

Technique: Use Extreme or Glitch preset with wide stretch range (0.4‑2.5×). Disable voicing gate for non‑harmonic material.

Advantages:

Audio Restoration & Enhancement

Use case: Smooth irregular tempo in recordings, reduce/emphasize noise characteristics.

Technique: Use Subtle preset (0.85‑1.15×) with heavy smoothing and high voicing influence.

Example: Process recording with irregular tempo — HFD detects rhythmic complexity, stretches complex (rushed) sections, compresses simple (dragging) sections towards more uniform tempo.

Experimental Composition

Use case: Generate algorithmic time‑structures, create musique concrète transformations.

Technique: Chain multiple HFD warping operations with different parameters, or combine with other processing.

Example: Process sound → HFD warp → reverse → HFD warp again → layer with original → spatialize.

Practical Workflow Examples

🎤 Dynamic Speech Compression/Expansion

Goal: Make spoken word more dramatic by compressing pauses and stretching consonants.

Settings:

  • Preset: Moderate (0.7‑1.5×)
  • Material_type: Speech
  • Voicing_influence: 0.8
  • Mapping_curve: Emphasize changes
  • Result: Pauses/silence (low HFD) compress to 0.7×, consonants (high HFD) stretch to 1.5×, vowels moderately affected.

🎵 Rhythmic Transformation of Drum Loop

Goal: Create "humanized" or "swinging" rhythm from rigid drum machine loop.

Settings:

  • Preset: Subtle (0.85‑1.15×)
  • Material_type: Music
  • Voicing_influence: 0.0 (drums aren't harmonic)
  • Smoothing: Heavy (7)
  • Result: Kick/snare transients stretch slightly, hi‑hat patterns compress, creating natural‑sounding groove.

🌊 Ambient Texture from Field Recording

Goal: Transform 10‑second water stream into evolving 60‑second ambient texture.

Settings:

  • Preset: Extreme (0.4‑2.5×)
  • Material_type: Music/Field Recording
  • Voicing_influence: 0.0
  • Mapping_curve: Linear
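  • Additional: A single pass lengthens the sound by at most 2.5× (about 25 s from 10 s), so cascade at least two passes with different parameters (2.5 × 2.5 = 6.25×) to approach 60 s, then layer the results.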

Advanced Techniques

Multi‑stage processing:
  • Analysis‑synthesis separation: Run HFD analysis once, save stretch factors, apply with different mapping curves.
  • Cascaded warping: Apply HFD warp → analyze result → apply again with different parameters.
  • Hybrid approaches: Combine HFD warping with pitch‑shifting, filtering, or spatialization.
  • Parameter automation: Modify script to change parameters over time (e.g., gradually increase max_stretch_factor).
Creative parameter explorations:
  • Inverted mapping: Set min_stretch_factor > max_stretch_factor to compress complex regions and expand simple ones.
  • Extreme quantization: Use Mapping_curve = Quantized steps with only 2‑3 steps for robotic/stutter effects.
  • Micro‑analysis: Very short frames (0.01 s) with high k_max for granular‑level control.
  • Spectral‑HFD: Modify script to calculate HFD per frequency band, create multi‑band time warping.

Troubleshooting Common Issues

Problem: "Zipper" or "clicking" artifacts
Cause: Stretch factors changing too rapidly between frames.
Solution: Increase final_stretch_smooth, reduce max_stretch_change_per_sec, or increase smoothing_window_size.
Problem: Output duration not as expected
Cause: Extreme stretch factors or non‑linear duration accumulation.
Solution: Check average stretch factor in visualization; duration ≈ original × average_stretch.
Problem: Voiced regions still get stretched
Cause: Voicing_influence too low, or material not harmonic enough.
Solution: Increase voicing_influence to 0.8‑0.9, check harmonicity in visualization.
Problem: Processing very slow
Cause: Many frames (short hop_size), high k_max, or visualization enabled.
Solution: Increase hop_size_s, reduce k_max to 4, disable visualization, increase downsample_factor.

Technical Deep Dive

HFD Calculation Details

Windowed Implementation

The windowedHFD procedure implements robust HFD:

procedure windowedHFD: .mat, .start, .end, .kmax, .skip_win
  1. Extract frame from matrix columns .start to .end
  2. Optionally apply Hann window (if .skip_win = 0)
  3. Remove DC offset (subtract mean)
  4. For k = 1 to .kmax:
     • Construct k sub‑series starting at m = 1..k
     • Calculate length Lₘ(k) for each
     • Average to get L(k)
  5. Linear regression: ln(L(k)) vs. ln(1/k)
  6. Slope = HFD

Optimizations:
  • Skip windowing for speed (.skip_win = 1)
  • Early exit if frame too short (<20 samples)
  • Use matrix access instead of sound objects

Numerical Considerations

🔢 HFD Calculation Nuances

Hann windowing: Reduces edge effects but alters signal statistics. Skip for noisy/transient‑rich material.

k_max selection: Too low → inaccurate; too high → sensitive to noise. 4‑6 works for audio‑rate signals.

Frame length: Must contain enough samples for multi‑scale analysis. Minimum ~20 samples after downsampling.

Downsampling: Reduces computation but loses high‑frequency information. Factor 6 = 44.1 kHz → ~7.35 kHz (sufficient for HFD).

Outlier handling: Percentile mapping (5th‑95th) ignores extreme values that could skew min‑max.

Voicing Detection Algorithm

Harmonicity‑Based Approach

Praat's Harmonicity (HNR) object provides continuous voicing measure:

To Harmonicity (cc): 0.01, minimum_pitch_Hz, 0.1, 1.0
Get value at time: returns HNR in dB

Mapping to 0‑1 scale:
  voicing = (HNR + 5) / 20, clamped to [0, 1]

Rationale:
  • HNR ≤ −5 dB → unvoiced/noise → voicing = 0
  • HNR = 0 dB → weakly voiced → voicing = 0.25
  • HNR ≈ 10 dB → moderately voiced → voicing ≈ 0.75
  • HNR ≥ 15 dB → strongly voiced → voicing = 1.0

Smoothing: Moving average over voicing_smooth_window frames
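The HNR‑to‑voicing mapping is a one‑line clamp. The Python sketch below restates it for checking values by hand; the function name hnr_to_voicing is an assumption of this example, and the script itself performs this step in Praat.

```python
def hnr_to_voicing(hnr_db):
    """Map a harmonics-to-noise ratio in dB to a voicing strength in [0, 1]."""
    return min(1.0, max(0.0, (hnr_db + 5.0) / 20.0))
```

Because the mapping reaches zero only at −5 dB, a frame at exactly 0 dB still counts as 25% voiced.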

Visualization System

Multi‑Panel Display

Seven display elements:

1. Title: Script name, original filename, preset

2. Original waveform: Gray, for comparison

3. Warped waveform: Purple, shows temporal distortion

4. HFD curve: Blue (raw) and dark blue (smoothed)
Shows signal complexity over time

5. Stretch factors: Pink (raw) and dark red (final)
Unity line at 1.0 for reference

6. Voicing curve: Green (if enabled)
Shows harmonicity‑based voicing strength

7. Stats box: Numerical summary of processing

Auto‑Scaling Logic

Dynamic axis scaling based on data:

For HFD plot:
  minHFD = min(smoothed_hfd#) − margin
  maxHFD = max(smoothed_hfd#) + margin
  margin = (max−min) × 0.1 (minimum 0.05)

For stretch plot:
  Uses user‑defined min_stretch_factor and max_stretch_factor
  margin = (max−min) × 0.1

Ensures all data visible with comfortable margins.
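As a worked example of the HFD axis rule, the margin logic can be written as a small Python helper. The name axis_range and its keyword arguments are hypothetical and only mirror the values quoted above.

```python
def axis_range(values, margin_fraction=0.1, min_margin=0.05):
    """Return (low, high) axis limits padded by a margin on both sides."""
    lo, hi = min(values), max(values)
    margin = max((hi - lo) * margin_fraction, min_margin)
    return lo - margin, hi + margin
```

For smoothed HFD values between 1.2 and 1.8 the margin is 0.06, so the axis runs from 1.14 to 1.86. When the curve is nearly flat, the 0.05 minimum keeps the axis from collapsing to a single value.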

Performance Optimizations

Speed‑Quality Tradeoffs

⚡ Optimization Strategies

Downsampling: Factor 6 reduces the number of samples per frame 6×. HFD cost grows roughly linearly with frame length, so the analysis runs about 6× faster.

Skip windowing: Bypasses Hann window computation and application.

Matrix‑based access: Convert sound to matrix once, access columns directly (faster than Get value at time).

Early exit: Skip HFD calculation for very short frames.

Progress reporting: Update info window every 10 frames rather than every frame.

Visualization optional: Drawing is CPU‑intensive; disable for batch processing.

Object cleanup: Remove temporary objects immediately after use to free memory.