HFD-Driven Time Warping — User Guide
Adaptive time‑stretching based on signal complexity: uses Higuchi Fractal Dimension (HFD) to measure local signal irregularity, then warps time proportionally to create dynamic, content‑aware temporal transformations.
What this does
This script implements HFD‑driven time warping — an adaptive time‑stretching technique that adjusts stretching factors based on local signal complexity measured by the Higuchi Fractal Dimension (HFD). Unlike uniform time‑stretching, which applies the same factor to the entire signal, this method stretches complex/irregular regions more (or less) than simple/regular regions, creating dynamic temporal transformations that respect the signal's inherent structure.
Key Features:
- 5 Built‑in Presets — Subtle, Moderate, Dramatic, Extreme, Glitch
- Material‑Adaptive Processing — Different settings for speech vs. music/field recordings
- Voicing‑Aware Processing — Optional harmonicity‑based gating to protect voiced segments
- Advanced HFD Calculation — Windowed, multi‑scale fractal dimension estimation
- 4 Mapping Curves — Linear, emphasize extremes, emphasize changes, quantized steps
- Comprehensive Visualization — Shows HFD curve, stretch factors, voicing, and waveforms
- Artifact Control — Slew‑rate limiting, smoothing, percentile mapping for robustness
Technical Implementation:
1. Preprocessing: Convert to mono, downsample for analysis, apply material‑specific filtering.
2. HFD Analysis: Calculate the Higuchi Fractal Dimension for each analysis frame (frames overlap when Hop_size_s < Frame_length_s).
3. Voicing Analysis: Optional harmonicity‑based voicing detection to protect voiced segments.
4. Mapping: Convert HFD values to stretch factors using configurable curves and voicing influence.
5. Smoothing & Limiting: Apply temporal smoothing and slew‑rate limiting to prevent artifacts.
6. Time Warping: Create a Praat DurationTier and resynthesize using overlap‑add.
7. Visualization: Display HFD, stretch factors, voicing, and before/after waveforms.
Quick start
- In Praat, select exactly one Sound object (mono or stereo).
- Run script… → hfd_time_warping.praat.
- Choose a Preset (start with "Moderate" for balanced results).
- Select Material_type (Speech vs. Music/Field Recording).
- Enable Draw_visualization to see the analysis curves.
- Click OK — the warped sound appears as "originalname_HFDwarp_presetname".
- Listen and compare original vs. warped; adjust parameters as needed.
HFD & Time Warping Theory
Higuchi Fractal Dimension
Mathematical Definition
For a time series x(1), x(2), ..., x(N):
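For reference, the standard Higuchi construction can be written as follows. This is the textbook definition, not a transcription of the script, so the script's exact normalization may differ in detail. Here k is the scale (delay), m the starting offset, and D the resulting fractal dimension.

```latex
L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\lfloor (N-m)/k \rfloor}
         \bigl| x(m+ik) - x(m+(i-1)k) \bigr|\right)
         \frac{N-1}{\lfloor (N-m)/k \rfloor \, k}\right],
         \qquad m = 1,\dots,k

L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k),
       \qquad L(k) \propto k^{-D}

D = \text{slope of the least-squares line through }
    \bigl(\ln(1/k),\ \ln L(k)\bigr),
    \qquad k = 1,\dots,k_{\max}
```

A smooth curve gives D close to 1 and white noise gives D close to 2, which matches the practical range quoted below.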
📊 HFD Values for Common Audio
Simple periodic waveforms (sine, saw, square): 1.0‑1.3
Voiced speech (vowels): 1.2‑1.5
Unvoiced speech (fricatives): 1.5‑1.8
Musical tones (sustained): 1.3‑1.6
Percussion transients: 1.6‑1.9
White/pink noise: 1.8‑2.0
Complex textures (grains, crowds): 1.7‑2.0
Practical range in analysis: Typically 1.0‑2.0, with most audio 1.2‑1.8.
Windowed HFD Implementation
The script calculates HFD per analysis frame:
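As an illustration only, here is a minimal Python/NumPy sketch of per-frame HFD. It is not the Praat code: the function names `higuchi_fd` and `windowed_hfd` are hypothetical, and only the parameter names (`frame_length_s`, `hop_size_s`, `k_max`, `skip_windowing`) mirror the script's form fields.

```python
import numpy as np

def higuchi_fd(x, k_max=5):
    """Estimate the Higuchi fractal dimension of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):                      # 0-based starting offsets
            idx = np.arange(m, n, k)            # x[m], x[m+k], x[m+2k], ...
            if idx.size < 2:
                continue
            steps = idx.size - 1
            norm = (n - 1) / (steps * k)        # Higuchi's normalisation factor
            lengths.append(np.abs(np.diff(x[idx])).sum() * norm / k)
        mean_len = np.mean(lengths) if lengths else 0.0
        if mean_len > 0:                        # log undefined for zero length
            log_inv_k.append(np.log(1.0 / k))
            log_len.append(np.log(mean_len))
    if len(log_len) < 2:
        return float("nan")                     # e.g. a constant (silent) frame
    slope, _ = np.polyfit(log_inv_k, log_len, 1)
    return float(slope)

def windowed_hfd(signal, sample_rate, frame_length_s=0.05, hop_size_s=0.05,
                 k_max=5, skip_windowing=False):
    """Return (frame centre times, HFD per frame)."""
    signal = np.asarray(signal, dtype=float)
    frame = int(round(frame_length_s * sample_rate))
    hop = max(1, int(round(hop_size_s * sample_rate)))
    times, values = [], []
    for start in range(0, signal.size - frame + 1, hop):
        chunk = signal[start:start + frame]
        if not skip_windowing:
            chunk = chunk * np.hanning(frame)   # Hann window to soften frame edges
        times.append((start + frame / 2) / sample_rate)
        values.append(higuchi_fd(chunk, k_max))
    return np.array(times), np.array(values)
```

Frames that are constant (digital silence) return NaN in this sketch; how the script treats such frames is not stated in this guide.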
Time Warping Principles
From HFD to Stretch Factors
The core mapping logic:
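A rough Python sketch of such a mapping is shown below, assuming that higher HFD maps to a larger stretch factor (as the preset descriptions state). The function name, the `steps` argument, and the fallback to 1.0 for a flat HFD curve are assumptions made for illustration. Only the Linear and Quantized curves are sketched, because the guide does not give formulas for the other two.

```python
import numpy as np

def map_hfd_to_stretch(hfd, min_stretch=0.5, max_stretch=2.0,
                       use_percentile=True, curve="linear", steps=4):
    """Map HFD values to stretch factors in [min_stretch, max_stretch]."""
    hfd = np.asarray(hfd, dtype=float)
    if use_percentile:
        lo, hi = np.percentile(hfd, [5, 95])    # robust to outlier frames
    else:
        lo, hi = hfd.min(), hfd.max()
    if hi <= lo:
        return np.full(hfd.shape, 1.0)          # assumption: flat HFD means no warping
    norm = np.clip((hfd - lo) / (hi - lo), 0.0, 1.0)
    if curve == "quantized":
        # Snap to a small number of discrete levels (steps must be >= 2)
        norm = np.round(norm * (steps - 1)) / (steps - 1)
    return min_stretch + norm * (max_stretch - min_stretch)
```

Passing `min_stretch` greater than `max_stretch` inverts the direction of the mapping in this sketch, which corresponds to the inverted-mapping idea listed under Advanced Techniques.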
Voicing‑Aware Adjustment
🎵 Protecting Voiced Content
Problem: HFD alone doesn't distinguish between desirable complexity (textures) and undesirable complexity (noise in voiced regions).
Solution: Use harmonicity (HNR) as voicing measure:
- High harmonicity → voiced (vowels, musical tones)
- Low harmonicity → unvoiced (noise, consonants)
Adjustment formula:
effective_stretch = 1.0 + (1 − voicing) × (stretch − 1.0)
Where voicing ∈ [0,1] (0=unvoiced, 1=fully voiced)
Result: Voiced regions resist stretching, unvoiced regions follow HFD mapping.
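The behavior described in the Result line (voiced frames pulled toward a stretch of 1.0, unvoiced frames left at their HFD-mapped value) can be sketched in Python as below. The function name is hypothetical, and folding `voicing_influence` in as a simple scale on the protection is an assumption about how the script combines the two.

```python
import numpy as np

def apply_voicing_gate(stretch, voicing, voicing_influence=0.7):
    """Pull stretch factors toward 1.0 in voiced frames.

    voicing is in [0, 1] (0 = unvoiced, 1 = fully voiced).
    """
    stretch = np.asarray(stretch, dtype=float)
    voicing = np.clip(np.asarray(voicing, dtype=float), 0.0, 1.0)
    protection = voicing_influence * voicing    # 0 = no protection, 1 = full
    # Algebraically equal to blending influence * gated + (1 - influence) * raw,
    # where gated = 1 + (1 - voicing) * (stretch - 1)
    return 1.0 + (1.0 - protection) * (stretch - 1.0)
```

With `voicing_influence=0.7`, a fully voiced frame mapped to 2.0× ends up at 1.3×, while an unvoiced frame stays at 2.0×.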
Artifact Control Mechanisms
Smoothing Strategies
1. HFD smoothing: Moving average over smoothing_window_size frames
Purpose: Reduce frame‑to‑frame HFD fluctuation
2. Stretch factor smoothing: Moving average over final_stretch_smooth frames
Purpose: Create smooth stretch factor trajectory
3. Slew‑rate limiting: max_stretch_change_per_sec constraint
Formula: |s[i] − s[i‑1]| ≤ max_change × hop_size
Purpose: Prevent abrupt stretch changes ("zipper" artifacts)
Percentile mapping: Use 5th‑95th percentile instead of min‑max for robustness against outliers.
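The moving average and the slew-rate constraint above can be sketched in Python as follows. Function names are hypothetical, and the edge padding used by the moving average is an assumption, since the guide does not say how the script handles the first and last frames.

```python
import numpy as np

def moving_average(values, window):
    """Same-length moving average; edges are padded with the end values."""
    values = np.asarray(values, dtype=float)
    if window <= 1:
        return values.copy()
    pad_left = window // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(values, (pad_left, pad_right), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def limit_slew(stretch, max_change_per_sec, hop_size_s):
    """Enforce |s[i] - s[i-1]| <= max_change_per_sec * hop_size_s."""
    out = np.asarray(stretch, dtype=float).copy()
    max_step = max_change_per_sec * hop_size_s
    for i in range(1, out.size):
        delta = np.clip(out[i] - out[i - 1], -max_step, max_step)
        out[i] = out[i - 1] + delta             # follow the target as fast as allowed
    return out
```

For example, with a limit of 5.0/s and a 0.05 s hop, the factor can move by at most 0.25 per frame, so a jump from 1.0 to 2.0 is spread over four frames.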
Praat DurationTier Implementation
The final time warping uses Praat's built‑in resynthesis:
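To illustrate what the DurationTier encodes, the Python sketch below builds one (time, factor) point per analysis frame and estimates the resulting duration. It assumes the tier is interpolated linearly between points and held constant outside them, and that the output duration is the integral of the tier over the sound. The helper names are hypothetical and none of this is Praat code.

```python
import numpy as np

def duration_tier_points(frame_times, stretch):
    """Pair each frame centre time with its relative-duration factor."""
    return [(float(t), float(s)) for t, s in zip(frame_times, stretch)]

def target_duration(points, t_start, t_end, resolution=10000):
    """Approximate the warped duration as the integral of the tier."""
    times = np.array([p[0] for p in points])
    factors = np.array([p[1] for p in points])
    grid = np.linspace(t_start, t_end, resolution + 1)
    # np.interp interpolates linearly and holds the end values outside the points
    curve = np.interp(grid, times, factors)
    # Trapezoidal rule written out explicitly
    return float(np.sum((curve[1:] + curve[:-1]) * np.diff(grid)) / 2.0)
```

This is the basis for the rule of thumb that the output duration is roughly the original duration times the average stretch factor.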
Preset Strategies
Five Preset Configurations
| Preset | Stretch Range | Smoothing | Voicing | Slew Limit | Character |
|---|---|---|---|---|---|
| 1. Subtle | 0.85‑1.15× | Heavy (7) | High (80%) | 3.0/s | Gentle, natural‑sounding |
| 2. Moderate | 0.7‑1.5× | Medium (5) | Medium (70%) | 4.0/s | Balanced, artistic |
| 3. Dramatic | 0.5‑2.0× | Light (4) | Medium‑low (60%) | 6.0/s | Expressive, noticeable |
| 4. Extreme | 0.4‑2.5× | Very light (3) | Low (50%) | 8.0/s | Radical transformation |
| 5. Glitch | 0.4‑2.5× | Minimal (2) | None (0%) | 20.0/s | Digital artifacts, choppy |
🎚️ Preset Deep Dive: Moderate
Design philosophy: Artistic time‑warping that's clearly audible but still musical.
Parameters: Stretch 0.7‑1.5× (compression and expansion), medium smoothing (5 frames), 70% voicing influence, 4.0/s slew limit.
Result: Complex regions (consonants, transients) are stretched up to 1.5×, simple regions (vowels, sustained tones) compressed to 0.7×. Voicing protection prevents extreme manipulation of harmonic content. Smooth changes maintain natural flow.
⚡ Preset Deep Dive: Extreme
Design philosophy: Push time‑warping to its limits for experimental sound design.
Parameters: Stretch 0.4‑2.5× (severe compression/expansion), very light smoothing (3 frames), 50% voicing influence, 8.0/s slew limit, skip_windowing enabled.
Result: Dramatic temporal manipulation where complex sections can stretch to 2.5× original length while simple sections compress to 40%. Reduced smoothing creates more abrupt transitions. At 50% voicing influence, harmonic content receives only half the protection, so it can be manipulated more.
🔧 Preset Deep Dive: Glitch
Design philosophy: Embrace digital artifacts for glitch‑art aesthetic.
Parameters: Stretch 0.4‑2.5×, minimal smoothing (2 frames), no voicing gate, 20.0/s slew limit, quantized mapping, skip_windowing enabled.
Result: Abrupt, stepped time changes that create digital‑sounding artifacts. No voicing protection means harmonic content gets wildly stretched. Quantized mapping creates discrete stretch levels (like bit‑crushing for time). The high slew limit (20.0/s, i.e. up to 1.0 per frame at the default 0.05 s hop) allows near‑instant stretch changes.
Material‑Type Adaptation
🎵 Speech vs. Music Processing
Speech‑specific processing (Material_type = 1):
- High‑pass filter: 100 Hz cutoff to remove rumble
- Voicing gate: Typically enabled (protects vowels)
- Stretch range: Often asymmetric (compress simple, expand complex)
- Typical use: Speech rate modification, dramatic effect, glitch art
Music/Field‑Recording processing (Material_type = 2):
- High‑pass filter: 30 Hz cutoff (preserve bass)
- Voicing gate: Optional (depends on material)
- Stretch range: Often symmetric or tailored to genre
- Typical use: Tempo manipulation, texture stretching, ambient creation
Parameters & Controls
Analysis Parameters (Custom Mode)
| Parameter | Type | Default | Description |
|---|---|---|---|
| Frame_length_s | positive | 0.05 | Analysis frame length in seconds (0.03‑0.10 typical) |
| Hop_size_s | positive | 0.05 | Time between analysis frames (usually = frame_length) |
| K_max | integer | 5 | Maximum scale factor for HFD calculation (4‑6 typical) |
Smoothing & Mapping Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Smoothing_window_size | integer | 5 | Moving average window for HFD smoothing (3‑9 typical) |
| Min_stretch_factor | real | 0.5 | Minimum time‑stretch factor (0.4‑0.9 typical) |
| Max_stretch_factor | real | 2.0 | Maximum time‑stretch factor (1.1‑2.5 typical) |
| Use_percentile_mapping | boolean | 1 | Use 5th‑95th percentile instead of min‑max (robust to outliers) |
| Mapping_curve | option | Linear | HFD‑to‑stretch mapping: Linear, Emphasize extremes, Emphasize changes, Quantized steps |
Voicing Control Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Use_voicing_gate | boolean | 1 | Enable harmonicity‑based voicing protection |
| Voicing_influence | real | 0.7 | Strength of voicing protection (0‑1, 0=none, 1=full) |
| Voicing_smooth_window | integer | 3 | Smoothing window for voicing track (1‑7 typical) |
Control Smoothing Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Final_stretch_smooth | integer | 3 | Final smoothing window for stretch factors (1‑7 typical) |
| Max_stretch_change_per_sec | real | 5.0 | Maximum stretch factor change per second (prevents "zipper" artifacts) |
Pitch Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| Minimum_pitch_Hz | positive | 75 | Minimum pitch for harmonicity analysis (Hz) |
| Maximum_pitch_Hz | positive | 600 | Maximum pitch for harmonicity analysis (Hz) |
Speed Optimization Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Downsample_factor | integer | 6 | Downsampling factor for analysis (higher = faster but less accurate) |
| Skip_windowing | boolean | 0 | Skip Hann windowing in HFD calculation (faster but less accurate) |
Output Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| Draw_visualization | boolean | 1 | Draw analysis curves and waveforms after processing |
| Play_result | boolean | 1 | Play the warped sound automatically |
- Frame_length_s vs. Hop_size_s: Equal values = no overlap; a hop of half the frame length = 50% overlap (smoother but slower).
- K_max: Higher values = more accurate HFD but slower; 4‑6 works for most audio.
- Min/Max_stretch_factor: Values <1 = compression (speed up), >1 = expansion (slow down). Asymmetric ranges (e.g., 0.7‑1.5) create interesting effects.
- Mapping_curve: "Emphasize extremes" exaggerates differences; "Quantized steps" creates stepped/stutter effects.
- Voicing_influence: 0.7 means the final stretch is a 70/30 blend of the voicing‑adjusted value and the raw HFD mapping.
- Max_stretch_change_per_sec: Critical for natural‑sounding results; too high causes abrupt changes, too low limits dynamic range.
Applications
Speech Processing & Effects
Use case: Create dramatic speech effects, alter speech rate dynamically, generate glitch‑art vocals.
Technique: Use Material_type = Speech with Moderate or Dramatic preset. Enable voicing gate to protect vowels while stretching consonants.
Example: Process spoken word with Dramatic preset (0.5‑2.0×) — consonants elongate, vowels compress, creating "dragging" effect on fricatives while maintaining vowel intelligibility.
Musical Time‑Manipulation
Use case: Create dynamic tempo changes, stretch specific musical elements, generate rhythmic variations.
Technique: Use Material_type = Music with Moderate preset. Adjust stretch range based on desired effect (0.8‑1.2× for subtle, 0.5‑2.0× for extreme).
Workflow:
- Process drum loop: transients (high HFD) stretch, sustains compress
- Process melodic instrument: fast passages expand, slow passages compress
- Layer original and processed versions
- Export for sampling or further processing
Sound Design & Texture Creation
Use case: Transform field recordings into evolving textures, create time‑based granular effects.
Technique: Use Extreme or Glitch preset with wide stretch range (0.4‑2.5×). Disable voicing gate for non‑harmonic material.
Advantages:
- Content‑aware stretching respects signal structure
- Complex textures (rain, crowds) stretch more than simple elements
- Can create "time‑lapse" or "slow‑motion" effects that vary dynamically
Audio Restoration & Enhancement
Use case: Smooth irregular tempo in recordings, reduce/emphasize noise characteristics.
Technique: Use Subtle preset (0.85‑1.15×) with heavy smoothing and high voicing influence.
Example: Process recording with irregular tempo — HFD detects rhythmic complexity, stretches complex (rushed) sections, compresses simple (dragging) sections towards more uniform tempo.
Experimental Composition
Use case: Generate algorithmic time‑structures, create musique concrète transformations.
Technique: Chain multiple HFD warping operations with different parameters, or combine with other processing.
Example: Process sound → HFD warp → reverse → HFD warp again → layer with original → spatialize.
Practical Workflow Examples
🎤 Dynamic Speech Compression/Expansion
Goal: Make spoken word more dramatic by compressing pauses and stretching consonants.
Settings:
- Preset: Moderate (0.7‑1.5×)
- Material_type: Speech
- Voicing_influence: 0.8
- Mapping_curve: Emphasize changes
Result: Pauses/silence (low HFD) compress to 0.7×, consonants (high HFD) stretch to 1.5×, vowels moderately affected.
🎵 Rhythmic Transformation of Drum Loop
Goal: Create "humanized" or "swinging" rhythm from rigid drum machine loop.
Settings:
- Preset: Subtle (0.85‑1.15×)
- Material_type: Music
- Voicing_influence: 0.0 (drums aren't harmonic)
- Smoothing: Heavy (7)
Result: Kick/snare transients stretch slightly, hi‑hat patterns compress, creating natural‑sounding groove.
🌊 Ambient Texture from Field Recording
Goal: Transform 10‑second water stream into evolving 60‑second ambient texture.
Settings:
- Preset: Extreme (0.4‑2.5×)
- Material_type: Music/Field Recording
- Voicing_influence: 0.0
- Mapping_curve: Linear
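- Additional: A single pass lengthens the sound by at most 2.5× (10 s → 25 s), so reaching 60 s takes at least two cascaded passes. Process multiple times with different parameters and layer the results.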
Advanced Techniques
- Analysis‑synthesis separation: Run HFD analysis once, save stretch factors, apply with different mapping curves.
- Cascaded warping: Apply HFD warp → analyze result → apply again with different parameters.
- Hybrid approaches: Combine HFD warping with pitch‑shifting, filtering, or spatialization.
- Parameter automation: Modify script to change parameters over time (e.g., gradually increase max_stretch_factor).
- Inverted mapping: Set min_stretch_factor > max_stretch_factor to compress complex regions and expand simple ones.
- Extreme quantization: Use Mapping_curve = Quantized steps with only 2‑3 steps for robotic/stutter effects.
- Micro‑analysis: Very short frames (0.01 s) with high k_max for granular‑level control.
- Spectral‑HFD: Modify script to calculate HFD per frequency band, create multi‑band time warping.
Troubleshooting Common Issues
Problem: Audible "zipper" or choppy artifacts.
Cause: Stretch factors changing too rapidly between frames.
Solution: Increase final_stretch_smooth, reduce max_stretch_change_per_sec, or increase smoothing_window_size.
Problem: Output duration differs from what was expected.
Cause: Extreme stretch factors or non‑linear duration accumulation.
Solution: Check average stretch factor in visualization; duration ≈ original × average_stretch.
Problem: Voiced content (vowels, sustained tones) sounds distorted.
Cause: Voicing_influence too low, or material not harmonic enough.
Solution: Increase voicing_influence to 0.8‑0.9, check harmonicity in visualization.
Problem: Processing is slow.
Cause: Many frames (short hop_size), high k_max, or visualization enabled.
Solution: Increase hop_size_s, reduce k_max to 4, disable visualization, increase downsample_factor.
Technical Deep Dive
HFD Calculation Details
Windowed Implementation
The windowedHFD procedure computes the HFD of each analysis frame, applying a Hann window first unless Skip_windowing is enabled.
Numerical Considerations
🔢 HFD Calculation Nuances
Hann windowing: Reduces edge effects but alters signal statistics. Skip for noisy/transient‑rich material.
k_max selection: Too low → inaccurate; too high → sensitive to noise. 4‑6 works for audio‑rate signals.
Frame length: Must contain enough samples for multi‑scale analysis. Minimum ~20 samples after downsampling.
Downsampling: Reduces computation but loses high‑frequency information. Factor 6 = 44.1 kHz → ~7.35 kHz (sufficient for HFD).
Outlier handling: Percentile mapping (5th‑95th) ignores extreme values that could skew min‑max.
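The arithmetic behind these considerations can be checked with a small Python helper. It is a sketch with hypothetical names; the 20-sample minimum is taken from the note above, and the extra requirement that a frame hold more samples than `k_max` is an added assumption.

```python
def analysis_frame_info(sample_rate, downsample_factor=6, frame_length_s=0.05,
                        hop_size_s=0.05, k_max=5, min_samples=20):
    """Summarise the analysis rate, frame size, and overlap for given settings."""
    analysis_rate = sample_rate / downsample_factor
    samples_per_frame = int(frame_length_s * analysis_rate)
    overlap = max(0.0, 1.0 - hop_size_s / frame_length_s)
    return {
        "analysis_rate": analysis_rate,
        "samples_per_frame": samples_per_frame,
        "overlap": overlap,
        # Frame must be long enough for multi-scale analysis
        "ok": samples_per_frame >= min_samples and samples_per_frame > k_max,
    }

info = analysis_frame_info(44100)   # defaults from the parameter tables
```

At 44.1 kHz with the defaults this gives an analysis rate of 7350 Hz and 367 samples per frame. A 0.002 s frame would leave only 14 samples, which is below the suggested minimum.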
Voicing Detection Algorithm
Harmonicity‑Based Approach
Praat's Harmonicity (HNR) object provides continuous voicing measure:
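One plausible way to turn HNR values in dB into a voicing strength between 0 and 1 is a clipped linear ramp, sketched below in Python. The 0 dB and 20 dB thresholds and the function name are assumptions, not values taken from the script. Praat reports very low values (around -200 dB) for frames it treats as silent or unvoiced, and these clip to 0 here.

```python
import numpy as np

def hnr_to_voicing(hnr_db, floor_db=0.0, ceil_db=20.0):
    """Map harmonics-to-noise ratio (dB) to voicing strength in [0, 1]."""
    hnr_db = np.asarray(hnr_db, dtype=float)
    return np.clip((hnr_db - floor_db) / (ceil_db - floor_db), 0.0, 1.0)
```

The resulting track would then be smoothed over Voicing_smooth_window frames before it is used to gate the stretch factors.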
Visualization System
Multi‑Panel Display
1. Title: Script name, original filename, preset
2. Original waveform: Gray, for comparison
3. Warped waveform: Purple, shows temporal distortion
4. HFD curve: Blue (raw) and dark blue (smoothed)
Shows signal complexity over time
5. Stretch factors: Pink (raw) and dark red (final)
Unity line at 1.0 for reference
6. Voicing curve: Green (if enabled)
Shows harmonicity‑based voicing strength
7. Stats box: Numerical summary of processing
Auto‑Scaling Logic
Dynamic axis scaling based on data:
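A minimal Python sketch of data-driven axis limits is given below. The padding fraction, the minimum span, and the option to force a reference value (such as the unity line at 1.0) into view are illustrative assumptions, not the script's actual rules.

```python
import numpy as np

def axis_range(values, padding=0.1, min_span=0.05, include=None):
    """Return (low, high) axis limits that fit the data with some margin."""
    data = np.asarray(values, dtype=float)
    data = data[np.isfinite(data)]              # ignore NaN/inf frames
    if include is not None:
        data = np.append(data, include)         # e.g. keep the 1.0 line visible
    if data.size == 0:
        return 0.0, 1.0                         # fallback for empty input
    lo, hi = float(data.min()), float(data.max())
    span = max(hi - lo, min_span)               # avoid a zero-height axis
    mid = (lo + hi) / 2.0
    half = span * (1.0 + 2.0 * padding) / 2.0
    return mid - half, mid + half
```

The minimum span keeps a flat curve (for example, a constant stretch factor) from collapsing the plot to a single line.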
Performance Optimizations
Speed‑Quality Tradeoffs
⚡ Optimization Strategies
Downsampling: Factor 6 leaves one sixth of the samples for the HFD calculation, cutting its cost roughly 6× (the cost grows about linearly with frame length).
Skip windowing: Bypasses Hann window computation and application.
Matrix‑based access: Convert sound to matrix once, access columns directly (faster than Get value at time).
Early exit: Skip HFD calculation for very short frames.
Progress reporting: Update info window every 10 frames rather than every frame.
Visualization optional: Drawing is CPU‑intensive; disable for batch processing.
Object cleanup: Remove temporary objects immediately after use to free memory.