Temporal Elasticity — Latent Time Warping Engine — User Guide
Segments a Sound into events, extracts acoustic features, and calls a Python engine that learns a latent space and builds a temporal field to warp event durations. Reconstruction is via PSOLA, resampling, or placement.
What this does
This script implements a Temporal Elasticity engine — a latent time warping system that segments audio into events, extracts acoustic features, learns a latent space, builds a temporal field, and warps event durations based on their position in that space. Reconstruction is performed via PSOLA, resampling, or simple placement.
⏱️ What is Temporal Elasticity?
This approach treats time as an elastic medium that can stretch or compress depending on acoustic properties:
- Events are segmented from the source (silence-based)
- Feature extraction (24-dimensional): MFCCs, spectral centroid, flatness, RMS, ZCR, duration, delta-MFCCs
- Latent space learned via autoencoder or PCA
- Temporal field built over latent space — 5 modes: gravitational, inversion, turbulence, gradient, relativistic
- Duration rules apply additional transformations based on event characteristics
- Reconstruction via PSOLA (pitch-preserving), resampling, or placement
Key Features:
- 8 Preset Strategies — Gentle Warp to Relativistic, plus Custom
- 24-Dimensional Features — Comprehensive acoustic description
- 2 Latent Methods — Autoencoder or PCA
- 5 Temporal Field Modes — gravitational, inversion, turbulence, gradient, relativistic
- 5 Extra Duration Rules — none, short_stretch, long_compress, harmonic_compress, noisy_dilate
- 3 Reconstruction Methods — PSOLA, resampling, placement only
- Comprehensive Visualization — 5-panel display with waveforms, spectrograms, scale bar chart, stats
Technical Implementation (in pipeline order):
- Event Segmentation: Silence-based detection.
- Feature Extraction: 24-dim vector per event (MFCC, centroid, flatness, RMS, ZCR, duration, delta-MFCCs).
- Latent Learning: Autoencoder (numpy) or PCA.
- Temporal Field: Maps latent position → time scale factor.
- Duration Rules: Apply additional modifications.
- Reconstruction: PSOLA, resampling, or placement.
Quick start
- In Praat, select exactly one Sound object (any duration, any content).
- Run script… → select TemporalElasticity.praat.
- Choose Preset (2-9 for specific strategies, 1 for custom).
- Set segmentation parameters (silence threshold, min event duration).
- Set latent space parameters (dimensions, iterations, method, seed).
- Choose temporal field mode and adjust amplitude, sigma, clusters.
- Select extra duration rules and reconstruction method.
- Enable Draw_visualization for analysis display.
- Click OK — engine segments, extracts features, learns latent space, builds field, reconstructs.
Temporal Elasticity Theory
Feature Extraction (24 dimensions)
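Four of the scalar descriptors named in this guide (spectral centroid, flatness, RMS, ZCR) plus duration can be sketched with NumPy as below. This is an illustration under stated assumptions, not the script's extract_features_from_patch(): the function name `basic_descriptors`, the use of the power spectrum, and the small floor added before the logarithm are choices made here. The MFCC and delta-MFCC parts of the 24-dimensional vector are left out.

```python
import numpy as np

def basic_descriptors(samples, sr):
    """Scalar descriptors for one event (hypothetical helper, not the script's code)."""
    x = np.asarray(samples, dtype=float)
    # Power spectrum with a tiny floor so the logarithm below is always defined.
    power = np.abs(np.fft.rfft(x)) ** 2 + 1e-12
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return {
        # Power-weighted mean frequency.
        "centroid_hz": float((freqs * power).sum() / power.sum()),
        # Geometric mean over arithmetic mean: near 0 for tones, higher for noise.
        "flatness": float(np.exp(np.log(power).mean()) / power.mean()),
        "rms": float(np.sqrt((x ** 2).mean())),
        # Fraction of adjacent sample pairs whose sign differs.
        "zcr": float(np.mean(np.signbit(x[1:]) != np.signbit(x[:-1]))),
        "duration_s": len(x) / sr,
    }

sr = 1000
t = np.arange(sr) / sr
sine_desc = basic_descriptors(np.sin(2 * np.pi * 100 * t), sr)
noise_desc = basic_descriptors(np.random.default_rng(42).normal(size=sr), sr)
print(sine_desc["centroid_hz"], sine_desc["flatness"], noise_desc["flatness"])
```

The sine lands at a centroid close to 100 Hz with a flatness near zero, while the noise burst has a much higher flatness. The harmonic_compress and noisy_dilate rules rely on that contrast.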
Latent Learning Methods
🧠 Autoencoder
Symmetric autoencoder: input (24) → hidden (h) → latent (z) → hidden (h) → output (24)
h = max(z×2, min(128, √(24×z)))
Training: 20-400 iterations, Adam optimiser, MSE loss
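The hidden-layer formula above can be evaluated directly. The sketch below assumes the result is truncated to an integer (the guide does not say how rounding is done), and the names `hidden_size` and `layer_sizes` are made up for illustration.

```python
import math

N_FEATURES = 24  # feature vector length stated in this guide

def hidden_size(latent_dim, n_features=N_FEATURES):
    """h = max(z*2, min(128, sqrt(n_features*z))), truncated to int (assumption)."""
    geometric = math.sqrt(n_features * latent_dim)
    return int(max(latent_dim * 2, min(128, geometric)))

def layer_sizes(latent_dim):
    """Layer widths of the symmetric autoencoder: input, hidden, latent, hidden, output."""
    h = hidden_size(latent_dim)
    return [N_FEATURES, h, latent_dim, h, N_FEATURES]

for z in (4, 6, 12):
    print(z, layer_sizes(z))
```

For small latent sizes the square-root term sets the hidden width. Above z = 6 the z×2 term takes over.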
📊 PCA
Principal Component Analysis via SVD. Faster but linear.
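A minimal NumPy sketch of PCA via SVD follows. It assumes the features are only mean-centered; whether the engine also standardizes them is not stated in this guide. `pca_latent` is a hypothetical name.

```python
import numpy as np

def pca_latent(features, latent_dim):
    """Project (n_events, n_features) rows onto the top principal components."""
    features = np.asarray(features, dtype=float)
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:latent_dim].T

rng = np.random.default_rng(42)
demo_features = rng.normal(size=(30, 24))  # 30 synthetic events, 24 features each
latent = pca_latent(demo_features, latent_dim=4)
print(latent.shape)
```

Because the projection is linear, events that differ only through nonlinear feature interactions may end up close together. That is the trade-off against the autoencoder noted above.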
Temporal Field Modes
Extra Duration Rules
📏 Additional Modifications
| Rule | Formula | Effect |
|---|---|---|
| short_stretch | s × (1 + (0.1 - dur)/0.1) for dur < 0.1s | Short events get extra stretch |
| long_compress | s × max(0.5, 1 - 0.3 × (dur-0.5)/0.5) for dur > 0.5s | Long events get extra compression |
| harmonic_compress | s × (1 - 0.4 × (1 - flatness)) | Tonal events (low flatness) compress |
| noisy_dilate | s × (1 + 0.6 × flatness) | Noisy events (high flatness) stretch |
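The table translates into a single function, shown here as a sketch. It assumes durations are in seconds, flatness lies in [0, 1], and exactly one rule is active at a time. The name `apply_extra_rule` is hypothetical and is not the script's apply_duration_rules().

```python
def apply_extra_rule(scale, rule, dur, flatness):
    """Return the scale factor after applying one extra duration rule."""
    if rule == "short_stretch" and dur < 0.1:
        # Shorter events get proportionally more stretch, up to 2x as dur nears 0.
        return scale * (1 + (0.1 - dur) / 0.1)
    if rule == "long_compress" and dur > 0.5:
        # Compression grows with duration but is floored at 0.5x.
        return scale * max(0.5, 1 - 0.3 * (dur - 0.5) / 0.5)
    if rule == "harmonic_compress":
        # Low flatness (tonal) gives the strongest compression, down to 0.6x.
        return scale * (1 - 0.4 * (1 - flatness))
    if rule == "noisy_dilate":
        # High flatness (noisy) gives the strongest stretch, up to 1.6x.
        return scale * (1 + 0.6 * flatness)
    return scale  # "none", or a rule whose duration condition is not met

print(apply_extra_rule(1.0, "short_stretch", dur=0.05, flatness=0.2))
print(apply_extra_rule(1.0, "long_compress", dur=2.0, flatness=0.2))
```

A 50 ms event under short_stretch is scaled by 1.5, and a 2 s event under long_compress hits the 0.5 floor.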
Reconstruction Methods
Preset Strategies
Preset 2: Gentle Warp
🌊 Subtle Time Warping
Latent: 4 | Method: AE | Mode: gravitational
Amplitude: 0.3 | Sigma: 1.5 | Clusters: 3
Rules: none | Recon: PSOLA
Character: Gentle gravitational pull, subtle duration changes
Use on: Ambient, gradual transformations
Preset 3: Rhythmic Stretch
🥁 Pulse-Oriented
Latent: 4 | Method: PCA | Mode: gradient
Amplitude: 0.6 | Rules: short_stretch | Recon: resampling
Character: Linear gradient along principal axis, short events stretched
Use on: Rhythmic material, loops
Preset 4: Gravitational Pull
🪐 Strong Attraction
Latent: 6 | Method: AE | Mode: gravitational
Amplitude: 1.2 | Sigma: 0.8 | Clusters: 4
Rules: none | Recon: PSOLA
Character: Strong gravitational wells, significant stretch near cluster centers
Use on: Transformative material, creating emphasis
Preset 5: Turbulent Scatter
🌪️ Chaotic Fluctuations
Latent: 8 | Method: AE | Mode: turbulence
Amplitude: 1.0 | Sigma: 0.6 | Clusters: 5
Rules: noisy_dilate | Recon: placement
Character: Stochastic field, noisy events dilated
Use on: Glitch, chaotic textures
Preset 6: Time Inversion
🔄 Density Inversion
Latent: 4 | Method: AE | Mode: inversion
Amplitude: 1.0 | Rules: long_compress | Recon: PSOLA
Character: Dense regions compress, sparse stretch — inverts density
Use on: Dramatic temporal restructuring
Preset 7: Spectral Drift
🎨 Spectral-Based
Latent: 6 | Method: PCA | Mode: gradient
Amplitude: 0.7 | Sigma: 1.2 | Rules: harmonic_compress
Recon: resampling
Character: Gradient along spectral axis, harmonic events compress
Use on: Spectral transformations
Preset 8: Deep Mutation
🧬 Strong Transformation
Latent: 12 | Method: AE | Mode: gravitational
Amplitude: 1.8 | Sigma: 0.5 | Clusters: 6
Rules: none | Recon: PSOLA
Character: Large latent, high amplitude, strong transformation
Use on: Radical time warping
Preset 9: Relativistic
⚡ Velocity-Based
Latent: 6 | Method: AE | Mode: relativistic
Amplitude: 1.2 | Rules: none | Recon: PSOLA
Character: Lorentz time dilation based on latent velocity
Use on: Where change of identity should slow time
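One way to read "Lorentz time dilation based on latent velocity" is sketched below. Everything beyond the Lorentz factor γ = 1/√(1 − β²) is an assumption: latent velocity is taken as the distance between consecutive events in latent space, β is that velocity normalized to the fastest event and capped by a hypothetical `beta_max`, and Amplitude scales the excess over 1. The engine may define these differently.

```python
import math

def lorentz_scales(latent_points, amplitude=1.2, beta_max=0.9):
    """Time-scale factor per event from its latent 'velocity' (illustrative only)."""
    # Velocity of event i = distance from event i-1; the first event has none.
    velocities = [0.0]
    for prev, cur in zip(latent_points, latent_points[1:]):
        velocities.append(math.dist(prev, cur))
    v_max = max(velocities) or 1.0  # avoid dividing by zero when nothing moves
    scales = []
    for v in velocities:
        beta = beta_max * v / v_max          # always below 1, so gamma is finite
        gamma = 1.0 / math.sqrt(1.0 - beta * beta)
        scales.append(1.0 + amplitude * (gamma - 1.0))
    return scales

print(lorentz_scales([(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)]))
```

Events that stay put keep a scale of 1, and the event with the largest jump in latent space is stretched the most. This matches the preset's description of slowing time where identity changes.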
Parameters & Controls
Segmentation Parameters
| Parameter | Default | Description |
|---|---|---|
| Silence_threshold (dB) | 25.0 | Threshold in dB; signal below this level is treated as silence |
| Min_event_duration (s) | 0.05 | Minimum event length |
| Min_silence_duration (s) | 0.03 | Minimum silence length for segmentation |
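How the three parameters above might interact is sketched here. The sketch assumes the threshold is measured relative to the loudest frame and that levels are computed on 10 ms RMS frames. Neither detail is stated in this guide, and `segment_events` is a hypothetical name.

```python
import numpy as np

def segment_events(signal, sr, silence_threshold_db=25.0,
                   min_event_dur=0.05, min_silence_dur=0.03, frame_dur=0.01):
    """Return (start_s, end_s) spans of frames louder than peak minus threshold."""
    hop = max(1, int(frame_dur * sr))
    n_frames = len(signal) // hop
    if n_frames == 0:
        return []
    frames = np.asarray(signal[: n_frames * hop], dtype=float).reshape(n_frames, hop)
    level_db = 20 * np.log10(np.maximum(np.sqrt((frames ** 2).mean(axis=1)), 1e-12))
    active = level_db > level_db.max() - silence_threshold_db
    min_gap = max(1, round(min_silence_dur * sr / hop))  # silent frames that end an event
    spans, start, gap = [], None, 0
    for i, on in enumerate(active):
        if on:
            start = i if start is None else start
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, i - gap + 1))  # end index is exclusive
                start, gap = None, 0
    if start is not None:
        spans.append((start, n_frames - gap))
    sec = hop / sr
    # Drop events shorter than the minimum event duration.
    return [(s * sec, e * sec) for s, e in spans if (e - s) * sec >= min_event_dur]

sr = 1000
demo = np.sin(2 * np.pi * 100 * np.arange(sr) / sr)
demo[:200] = 0; demo[500:700] = 0; demo[900:] = 0   # two tone bursts
events = segment_events(demo, sr)
print(events)
```

Raising Min_silence_duration merges bursts separated by short pauses, and raising Min_event_duration discards clicks and other very short events.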
Latent Space Parameters
| Parameter | Default | Description |
|---|---|---|
| Latent_dimensions | 4 | Size of the latent space (listed range 2–8; the Deep Mutation preset uses 12) |
| Training_iterations | 120 | AE training iterations (20–400) |
| Latent_method | ae | ae (autoencoder) or pca |
| Random_seed | 42 | Seed for reproducibility |
Temporal Field Parameters
| Parameter | Default | Description |
|---|---|---|
| Field_mode | gravitational | gravitational, inversion, turbulence, gradient, relativistic |
| Amplitude | 0.8 | Field strength (0.1–3.0) |
| Sigma | 1.0 | Well width for gravitational/turbulence (0.1–5.0) |
| Clusters | 3 | Number of clusters for gravitational mode (1–8) |
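For the gravitational mode, the roles of Amplitude and Sigma can be illustrated with Gaussian wells around cluster centers. The Gaussian shape, the use of only the strongest well, and the function name `gravitational_scales` are assumptions of this sketch. Centers are passed in directly rather than derived from a clustering step, and the engine's build_temporal_field() may work differently.

```python
import numpy as np

def gravitational_scales(latent, centers, amplitude=0.8, sigma=1.0):
    """Scale factor per event: 1 + amplitude at a center, decaying to 1 far away."""
    latent = np.asarray(latent, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Squared distance from every event to every center: (n_events, n_centers).
    d2 = ((latent[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Sigma sets the well width; keep only the strongest well per event.
    well = np.exp(-d2 / (2.0 * sigma ** 2)).max(axis=1)
    return 1.0 + amplitude * well

scales = gravitational_scales([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]], centers=[[0.0, 0.0]])
print(scales)
```

Under these assumptions an event sitting on a center is stretched by 1 + Amplitude, a smaller Sigma confines the stretch to a tighter neighborhood, and distant events are left essentially unchanged.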
Duration Rules
| Parameter | Default | Description |
|---|---|---|
| Extra_rules | none | none, short_stretch, long_compress, harmonic_compress, noisy_dilate |
Reconstruction
| Parameter | Default | Description |
|---|---|---|
| Reconstruction_method | PSOLA | PSOLA, resampling, placement_only |
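The resampling option can be pictured as plain interpolation of each event to its new length, as in this sketch. Linear interpolation and the name `resample_event` are assumptions here; the engine may use a different interpolator. Played back at the original sample rate, a resampled event changes pitch along with duration. This is the main practical difference from PSOLA, which this guide describes as pitch-preserving.

```python
import numpy as np

def resample_event(samples, scale):
    """Stretch (scale > 1) or compress (scale < 1) an event by linear interpolation."""
    samples = np.asarray(samples, dtype=float)
    if len(samples) == 0:
        return samples
    n_out = max(1, int(round(len(samples) * scale)))
    # Positions in the source at which the output samples are read.
    src_pos = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(src_pos, np.arange(len(samples)), samples)

ramp = np.arange(100, dtype=float)
stretched = resample_event(ramp, 1.5)
compressed = resample_event(ramp, 0.5)
print(len(stretched), len(compressed))
```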
Output
| Parameter | Default | Description |
|---|---|---|
| Draw_visualization | 1 | Generate 5-panel analysis display |
| Play_result | 1 | Audition after processing |
Visualization & Analysis
5-Panel Display
Reading the Scale Bar Chart
- Blue bars (scale < 1): Event is compressed — shorter than original
- Red bars (scale > 1): Event is stretched — longer than original
- Red line at 1.0: No change
- The height shows the exact scale factor (e.g., 1.5 = 50% longer)
- Distribution reveals the temporal field's effect
Interpreting Metrics
- Scale entropy: Higher = more varied stretching; lower = uniform
- Compression ratio: Total output duration / input duration
- Scale mean: Average stretch factor (close to 1 when total duration is roughly preserved)
- Latent dispersion: How spread out events are in latent space
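Three of these metrics can be approximated from per-event durations and scale factors, as sketched below. The definitions are assumptions: entropy is taken as the Shannon entropy (in bits) of a 10-bin histogram of scale factors, and the compression ratio ignores silence between events. The display's exact formulas are not given in this guide, and `scale_metrics` is a hypothetical name.

```python
import math

def scale_metrics(durations, scales, n_bins=10):
    """Compression ratio, mean scale, and histogram entropy of the scale factors."""
    out_total = sum(d * s for d, s in zip(durations, scales))
    lo, hi = min(scales), max(scales)
    width = (hi - lo) / n_bins or 1.0  # all scales equal: everything falls in one bin
    counts = [0] * n_bins
    for s in scales:
        counts[min(n_bins - 1, int((s - lo) / width))] += 1
    probs = [c / len(scales) for c in counts if c]
    return {
        "compression_ratio": out_total / sum(durations),
        "scale_mean": sum(scales) / len(scales),
        "scale_entropy_bits": -sum(p * math.log2(p) for p in probs),
    }

varied = scale_metrics([1.0, 1.0], [0.5, 1.5])
uniform = scale_metrics([0.2, 0.4, 0.6], [1.0, 1.0, 1.0])
print(varied, uniform)
```

In the first call, one event is halved and one is stretched by half, so total duration is unchanged while the entropy is nonzero. In the second, every event has the same scale and the entropy is zero.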
Applications
Electroacoustic Composition
Use case: Creating time-warped textures where duration reflects acoustic properties
Technique: Gravitational or Inversion presets on varied source material
Workflow:
- Select a 20-60 second recording with diverse acoustic events
- Run with Gravitational Pull preset
- Examine scale bar chart to see which events stretch/compress
- Export and use as movement in larger work
Rhythmic Manipulation
Use case: Creating rhythmic variations where duration depends on spectral characteristics
Technique: Rhythmic Stretch or Turbulent Scatter on percussive material
Applications:
- Drum loops: Short events stretched, creating glitchy effects
- Pulse patterns: Gradient mode creates gradual tempo changes
- Randomized rhythms: Turbulence mode with noisy_dilate rule
Sound Design for Media
Use case: Creating evolving textures, risers, transformations
Technique: Deep Mutation or Relativistic presets
Examples:
- Risers: Gradient mode with increasing amplitude
- Drones: Gravitational mode stretches tonal regions
- Chaotic textures: Turbulence with noisy_dilate
Research & Education
Use case: Studying relationship between acoustic features and perceived duration
Technique: Compare field modes on same source, examine scale distributions
Learning outcomes:
- Understand how latent space organizes acoustic features
- See how different field modes produce different temporal profiles
- Explore extra rules and their effects
- Compare reconstruction methods
Practical Workflow Examples
🎬 Film Scene: Time Dilation
Goal: Create 60-second cue where time slows during moments of change
Settings:
- Source: 30-second ambient with events
- Preset: Relativistic
- Amplitude=1.5, PSOLA reconstruction
Result: Events with high latent velocity (rapid change) are stretched — time slows
🎚️ Electronic Music: Glitch Drums
Goal: Create glitchy drum loop
Settings:
- Source: 8-second drum loop
- Preset: Turbulent Scatter
- Rules: noisy_dilate, placement reconstruction
Result: Noisy hits dilated, placed with gaps — glitchy texture
🎙️ Voice Processing: Expressive Speech
Goal: Stretch expressive moments in speech
Settings:
- Source: 10-second spoken phrase
- Preset: Gravitational Pull
- Rules: harmonic_compress (tonal parts compress)
Result: Tonal vowels compress, expressive consonants stretch
Troubleshooting Common Issues
Cause: Python not installed, or packages missing
Solution: Install Python and required packages: pip install numpy soundfile
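Whether the two packages named above are importable can be checked from the same Python interpreter the script calls. The helper name `missing_packages` is made up for this sketch; `importlib.util.find_spec` is part of the standard library.

```python
import importlib.util

def missing_packages(names=("numpy", "soundfile")):
    """Return the subset of package names that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package is reported missing even after installation, pip and the interpreter used by the script may belong to different Python installations.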
Cause: Silence threshold too high/low, or source has no clear events
Solution: Adjust Silence_threshold and Min_event_duration. If no usable events are found, the script falls back to a fixed 0.25 s grid.
Cause: Extreme time stretching/compression on noisy material
Solution: Use resampling or placement instead, or reduce amplitude
Cause: PSOLA synthesis artifacts or poor concatenation
Solution: Use placement method (adds silence gaps) or adjust PSOLA parameters
Cause: Amplitude too low, or field mode not creating variation
Solution: Increase amplitude, try different field mode
Advanced Techniques
In build_temporal_field(), add new mode definitions to create custom temporal fields.
In extract_features_from_patch(), modify the feature vector to include different descriptors (e.g., MFCC variance, spectral rolloff).
Modify apply_duration_rules() to apply multiple rules in sequence.
The script converts the Sound to mono for analysis; the output preserves the original channel count. For multichannel material, modify the reconstruction step to handle each channel separately.