Tempo Curve (IOI) Estimator — User Guide

Musical tempo analysis: detects onsets, calculates inter-onset intervals, and estimates BPM over time for tempo curve visualization.

Author: Based on Praat AudioTools by Shai Cohen
Version: 2025
Outputs: TextGrid (beats) + Table (time, BPM curve)

What this does

This script implements tempo curve estimation: it analyzes musical audio to detect rhythmic onsets and calculate tempo variations over time. The process:

  1. Onset detection: Identify musical events using spectral flux or intensity slope methods.
  2. IOI calculation: Compute inter-onset intervals between detected beats.
  3. Tempo estimation: Convert IOIs to BPM values with octave correction.
  4. Curve smoothing: Apply temporal smoothing for stable tempo curves.
  5. Output generation: Create a TextGrid with beat positions and a Table with BPM over time.

Result: detailed tempo analysis showing rhythmic variations throughout musical performances.

Key Features:

What is tempo curve analysis? Traditional tempo detection gives a single BPM estimate for an entire piece. Tempo curve analysis tracks BPM variations over time, capturing rubato, accelerando, and ritardando.

Advantages:
  1. Musical accuracy: Captures expressive timing variations.
  2. Detailed analysis: Shows tempo evolution throughout a piece.
  3. Performance analysis: Quantifies interpretive choices.
  4. Musicological insight: Reveals structural tempo relationships.
  5. Practical applications: Music transcription, performance analysis, audio synchronization.

Use cases: Musicology research (performance practice), music education (timing analysis), audio production (tempo mapping), dance/movement studies (rhythm analysis), music information retrieval (feature extraction).

Technical Implementation:
  1. Pre-processing: High-pass filtering, optional resampling for efficiency.
  2. Onset detection: Spectral flux (frequency content changes) or intensity slope (amplitude changes) with adaptive thresholding.
  3. IOI calculation: Time differences between consecutive onsets.
  4. Tempo estimation: Sliding-window median IOI calculation with octave correction.
  5. Smoothing: One-pole low-pass filter for stable curves.
  6. Output: TextGrid with beat markers, Table with time-BPM-confidence data.

Key insight: Onset timing → inter-onset intervals → local tempo estimates → smoothed tempo curve.

Quick start

  1. In Praat, select exactly one Sound object (musical audio).
  2. Run script…tempo_curve_estimator.praat.
  3. Set tempo range (Min_BPM, Max_BPM) appropriate for your audio.
  4. Choose onset detection method: Spectral flux (complex music) or Intensity slope (percussive).
  5. Adjust sensitivity (1.0-2.0, higher = more onsets detected).
  6. Set smoothing (0.1-1.0 Hz, lower = smoother tempo curve).
  7. Click OK — processes audio, creates TextGrid and Table outputs.
Quick tip: Start with the Spectral flux method and sensitivity 1.5 for most music. Use Intensity slope for strongly percussive material. Set the tempo range to the expected BPM ± 40%. For classical music with rubato, use a lower smoothing value (0.2-0.5 Hz). For electronic music, use a higher smoothing value (0.5-1.0 Hz). Check the Info window for detection statistics and mean BPM. Processing time depends on audio length and method choice.
Important: The tempo range is critical: an incorrect range causes octave errors (half/double tempo), and a very wide range reduces accuracy. Onset detection is sensitive to recording quality, so noisy recordings may need a sensitivity adjustment. Spectral flux works better for complex music but is computationally intensive; intensity slope is faster but may miss subtle onsets. Smoothing trades stability against responsiveness: too much smoothing misses tempo changes, too little leaves jitter. Always verify results by listening.

Tempo Analysis Theory

Inter-Onset Interval (IOI) Fundamentals

Basic Concepts

Definitions:

Onset: Point in time where a musical event begins (note, beat, attack)
Inter-Onset Interval (IOI): Time between consecutive onsets
    IOI = tₙ₊₁ - tₙ
Tempo (BPM): Beats per minute
    BPM = 60 / IOI_median
    where IOI_median is the typical beat interval
Example:
    IOI = 0.5 seconds → BPM = 60 / 0.5 = 120
    IOI = 0.75 seconds → BPM = 60 / 0.75 = 80
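The definitions above can be tried out in a few lines of Python. This is a minimal sketch, not the script's own code: the function names and the onset times are made up for illustration.

```python
from statistics import median

def iois(onsets):
    """Inter-onset intervals: t[n+1] - t[n] for consecutive onset times (seconds)."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

def bpm_from_iois(intervals):
    """Tempo in beats per minute: 60 divided by the median IOI."""
    return 60.0 / median(intervals)

# Made-up onset times: one event every 0.5 s
onsets = [0.0, 0.5, 1.0, 1.5, 2.0]
print(bpm_from_iois(iois(onsets)))  # 120.0
```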

Tempo Ambiguity (Octave Errors)

The doubling/halving problem:

Given an IOI, possible tempo interpretations:
    Fast interpretation:   BPM = 60 / IOI
    Medium interpretation: BPM = 60 / (2 × IOI)
    Slow interpretation:   BPM = 60 / (4 × IOI)

Example: IOI = 0.25 seconds
    Fast:   60 / 0.25 = 240 BPM
    Medium: 60 / 0.5 = 120 BPM ← Usually correct
    Slow:   60 / 1.0 = 60 BPM

Solution: Test multiple candidates, choose the most continuous
    Candidates: {raw_BPM/2, raw_BPM, raw_BPM×2}
    Select the candidate closest to the previous tempo
    Constrain to the user-specified range [min_BPM, max_BPM]
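The candidate test described above might look like the following sketch. The function name is hypothetical, and two behaviours are assumptions because the guide does not specify them: with no previous tempo the candidate nearest the raw value is taken, and if no candidate lies inside the range the previous tempo is returned unchanged.

```python
def pick_octave(raw_bpm, prev_bpm, min_bpm, max_bpm):
    """Choose among raw/2, raw, raw*2 the in-range value closest to prev_bpm."""
    candidates = [raw_bpm / 2.0, raw_bpm, raw_bpm * 2.0]
    in_range = [c for c in candidates if min_bpm <= c <= max_bpm]
    if not in_range:
        # Assumption: keep the previous estimate when nothing fits the range
        return prev_bpm
    # Assumption: without a previous tempo, stay as close to the raw value as possible
    target = prev_bpm if prev_bpm is not None else raw_bpm
    return min(in_range, key=lambda c: abs(c - target))

# IOI = 0.25 s gives raw 240 BPM; only 120 BPM lies inside 60-180
print(pick_octave(240.0, 118.0, 60.0, 180.0))  # 120.0
```

A narrow range leaves fewer candidates to choose from, which is why an accurate Min_BPM/Max_BPM matters more than the continuity rule.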

Sliding Window Analysis

Window Configuration

Adaptive window sizing:

Window size: Adaptive, based on minimum tempo
    window_size = max(4.0, 4 × (60 / min_BPM))
Hop size: 1/8 of window for smooth overlapping
    hop_size = window_size / 8

Rationale:
  - Window must contain multiple beats for reliable statistics
  - 4× the longest expected beat period ensures several beats per window
  - A hop of 1/8 window (consecutive windows overlap by 7/8) provides smooth temporal resolution

Example: min_BPM = 60
    Longest beat = 1.0 second
    Window size = max(4.0, 4×1.0) = 4.0 seconds
    Hop size = 4.0 / 8 = 0.5 seconds
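As a rough illustration of how a sliding window turns onsets into local tempo estimates, the sketch below steps a window across a list of onset times. It assumes an IOI belongs to a window only when both of its onsets fall inside it; the guide does not say how the script treats window boundaries, and the function name is invented. Octave correction is left out.

```python
from statistics import median

def window_tempo(onsets, duration, window, hop):
    """Return (center_time, raw_bpm_or_None, confidence) for each window position."""
    n_windows = int((duration - window) / hop) + 1
    rows = []
    for i in range(max(n_windows, 0)):
        start = i * hop
        inside = [t for t in onsets if start <= t < start + window]
        intervals = [b - a for a, b in zip(inside, inside[1:])]
        bpm = 60.0 / median(intervals) if intervals else None
        # Confidence is simply the number of IOIs found in the window
        rows.append((start + window / 2.0, bpm, len(intervals)))
    return rows

# Made-up input: an onset every 0.5 s for 10 s, with a 4 s window and 0.5 s hop
onsets = [i * 0.5 for i in range(21)]
for row in window_tempo(onsets, 10.0, 4.0, 0.5)[:3]:
    print(row)
```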

Temporal Resolution

Trade-off between stability and responsiveness:

Large window (4-8 seconds):
  + Stable tempo estimates
  + Robust to missing onsets
  - Slow response to tempo changes
  - Poor for music with rubato

Small window (2-4 seconds):
  + Fast response to tempo changes
  + Good for expressive timing
  - Sensitive to missing onsets
  - More estimation variance

This script: Adaptive 4× period, minimum 4 seconds
    Good compromise for most musical styles

Statistical Robustness

Median-Based Estimation

Why median instead of mean:

IOI distribution in music:
  - Majority of IOIs close to the beat period
  - Some IOIs at half/double the period (off-beats, syncopation)
  - Occasional outliers (missing detections, extra detections)

Mean IOI: Sensitive to outliers
    Example: IOIs = [0.5, 0.5, 0.5, 0.25, 2.0] seconds
    Mean = (0.5+0.5+0.5+0.25+2.0)/5 = 0.75 seconds → 80 BPM (wrong)

Median IOI: Robust to outliers
    Same data: Median = 0.5 seconds → 120 BPM (correct)

This script: Uses median IOI per window

Confidence Metrics

Quality indicators:

Confidence = number of IOIs in the analysis window

Interpretation:
    Confidence ≥ 4: High reliability
    Confidence = 2-3: Moderate reliability
    Confidence = 1: Low reliability (single interval)
    Confidence = 0: No data (uses previous estimate)

Mean BPM calculation: Uses confident estimates only
    if confidence ≥ 2: include in average
    Excludes low-quality estimates from the final summary
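The confidence-filtered mean can be sketched as follows. The row layout (time, bpm, confidence) mirrors the output Table; the function name and the sample values are made up.

```python
def mean_confident_bpm(rows, min_conf=2):
    """Average BPM over rows whose confidence (IOI count) is at least min_conf."""
    values = [bpm for _time, bpm, conf in rows if conf >= min_conf]
    return sum(values) / len(values) if values else None

rows = [
    (2.0, 118.0, 4),
    (2.5, 120.0, 3),
    (3.0, 60.0, 1),   # single interval: excluded from the mean
    (3.5, 122.0, 2),
]
print(mean_confident_bpm(rows))  # 120.0
```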

Complete Processing Pipeline

INPUT: Sound object (musical audio)

STEP 1: PRE-PROCESSING
  - Resample to 11025 Hz if needed (efficiency)
  - High-pass filter at 30 Hz (remove rumble/DC)
  - Convert to mono if stereo

STEP 2: ONSET DETECTION
  IF Spectral flux method:
    - Create spectrogram (30ms windows, 5ms hop)
    - Calculate spectral flux in log-spaced bands
    - Adaptive thresholding (local median + sensitivity×MAD)
    - Peak picking with refractory period
  ELSE Intensity slope method:
    - Calculate intensity contour
    - Median filter intensity
    - Compute intensity slope (derivative)
    - Adaptive thresholding and peak picking

STEP 3: BEAT ANNOTATION
  - Create TextGrid with "beats" tier
  - Insert point at each detected onset time

STEP 4: TEMPO CURVE ESTIMATION
  FOR each analysis window (sliding):
    - Collect IOIs within window
    - Calculate median IOI
    - Convert to raw BPM = 60 / median_IOI
    - Test octave candidates: {raw/2, raw, raw×2}
    - Select candidate within [min_BPM, max_BPM]
    - Prefer candidate closest to previous BPM
    - Store confidence = number of IOIs in window

STEP 5: TEMPORAL SMOOTHING
  - Apply one-pole low-pass filter
  - Cutoff frequency = smoothing parameter
  - Causal filter (depends only on past)

STEP 6: OUTPUT GENERATION
  - Create Table with columns: time, BPM, confidence
  - Calculate mean BPM (confident regions only)
  - Display summary statistics

OUTPUTS: TextGrid (beat positions), Table (BPM curve)
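Step 5 names a causal one-pole low-pass filter without giving its coefficient. The sketch below uses a common textbook choice, alpha = 1 - exp(-2π · cutoff · step), which is an assumption and may differ from what the script computes; the function name is also hypothetical.

```python
import math

def one_pole_lowpass(values, cutoff_hz, step_s):
    """Causal one-pole smoothing: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    if not values:
        return []
    # Assumed coefficient formula; step_s is the spacing of the samples (the hop size)
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz * step_s)
    y = values[0]
    smoothed = []
    for x in values:
        y += alpha * (x - y)
        smoothed.append(y)
    return smoothed

# Made-up tempo jump from 100 to 140 BPM, sampled every 0.5 s
curve = [100.0, 100.0, 140.0, 140.0, 140.0]
print(one_pole_lowpass(curve, cutoff_hz=0.5, step_s=0.5))
print(one_pole_lowpass(curve, cutoff_hz=0.1, step_s=0.5))
```

With this formula a lower cutoff gives a smaller alpha, so the output follows the jump more slowly and the curve is smoother.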

Onset Detection Methods

Method 1: Spectral Flux

📊 Frequency Content Analysis

Principle: Detect changes in spectral energy distribution

Calculation: Sum of positive spectral differences across frequency bands

Strengths: Works for complex music, polyphonic textures, non-percussive onsets

Weaknesses: Computationally intensive, sensitive to noise

Best for: Classical music, jazz, acoustic ensembles, vocal music

Technical Implementation

Spectral Flux Algorithm:

1. SPECTROGRAM CONSTRUCTION
  - Window: 30ms Gaussian
  - Hop: 5ms (high temporal resolution)
  - Frequency range: 0-4000Hz (most musical energy)

2. LOG-SPACED FREQUENCY BANDS
  - 30 bands with exponential spacing
  - More resolution at low frequencies (perceptual weighting)
  - Band selection: row = 1 + (nRows-1) × (ratio^1.5)

3. SPECTRAL FLUX CALCULATION
  FOR each frame (after first):
    flux = 0
    FOR each frequency band:
      mag_curr = ln(1 + |spectrum[band, frame]|)
      mag_prev = ln(1 + |spectrum[band, frame-1]|)
      flux += max(mag_curr - mag_prev, 0)   // Positive differences only

4. ADAPTIVE THRESHOLDING
  - Local median in 0.5s window
  - Threshold = local_median + sensitivity × MAD
  - MAD = Median Absolute Deviation (robust spread measure)

5. PEAK PICKING
  - Local maxima detection (±2 frames)
  - Refractory period: 60/max_BPM seconds
  - Prevents multiple detections on same event
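Steps 2 and 3 can be sketched in plain Python. Several details are assumptions: `ratio` is taken to be the band index divided by (number of bands - 1), row numbers are rounded to the nearest integer, and the spectrogram is represented as a list of per-frame band magnitudes rather than a Praat Spectrogram object.

```python
import math

def band_rows(n_rows, n_bands=30):
    """1-based spectrogram rows following row = 1 + (nRows-1) * ratio^1.5."""
    rows = []
    for band in range(n_bands):
        ratio = band / (n_bands - 1)          # assumed definition of ratio, 0..1
        rows.append(1 + round((n_rows - 1) * ratio ** 1.5))
    return rows

def spectral_flux(frames):
    """Sum of positive log-magnitude differences between consecutive frames.

    frames: list of frames, each a list of band magnitudes.
    Returns one flux value per frame after the first.
    """
    flux = []
    for prev, curr in zip(frames, frames[1:]):
        total = 0.0
        for p, c in zip(prev, curr):
            # ln(1 + |x|) compression, keeping positive differences only
            total += max(math.log1p(abs(c)) - math.log1p(abs(p)), 0.0)
        flux.append(total)
    return flux

# Made-up three-band frames: energy appears in frame 2, then decays
frames = [[0.0, 0.0, 0.0], [2.0, 1.0, 0.0], [1.0, 0.5, 0.0]]
print(spectral_flux(frames))
print(band_rows(100, 5))
```

Because only increases are summed, the decay from the second to the third frame contributes nothing, which is what makes flux respond to attacks rather than releases.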

Method 2: Intensity Slope

📈 Amplitude Change Analysis

Principle: Detect rapid increases in sound intensity

Calculation: Positive derivative of smoothed intensity contour

Strengths: Fast computation, works well for percussive sounds

Weaknesses: Misses spectral onsets, sensitive to overall level changes

Best for: Percussive music, drum tracks, electronic music, speech rhythm

Technical Implementation

Intensity Slope Algorithm:

1. INTENSITY CONTOUR
  - 50Hz minimum pitch (broadband analysis)
  - 10ms time steps (high resolution)
  - Cubic interpolation for smooth values

2. MEDIAN FILTERING
  - 3-point median filter
  - Removes spikes while preserving edges
  - Values: [intensity[i-1], intensity[i], intensity[i+1]]
  - Output: median of three values

3. INTENSITY SLOPE CALCULATION
  slope[i] = intensity[i] - intensity[i-1]
  (Positive values indicate increasing intensity)

4. ADAPTIVE THRESHOLDING
  - Local median in 0.5s window
  - Threshold = local_median + sensitivity × MAD
  - Same robust statistics as spectral method

5. PEAK PICKING
  - Local maxima detection (±2 frames)
  - Refractory period: 60/max_BPM seconds
  - Output: onset times
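The filtering, thresholding, and peak-picking steps might be sketched as below. All function names are invented. The threshold multiplier is called `k` because the guide does not state how the Sensitivity setting maps onto it; copying the endpoints in the median filter and setting slope[0] to 0 are likewise assumptions.

```python
from statistics import median

def median3(x):
    """3-point median filter; the two endpoints are copied unchanged (assumption)."""
    if len(x) < 3:
        return list(x)
    middle = [median(x[i - 1:i + 2]) for i in range(1, len(x) - 1)]
    return [x[0]] + middle + [x[-1]]

def slope(x):
    """First difference slope[i] = x[i] - x[i-1]; slope[0] is set to 0."""
    return [0.0] + [b - a for a, b in zip(x, x[1:])]

def pick_onsets(curve, step_s, k, refractory_s, half_window_s=0.25, reach=2):
    """Times where curve exceeds local median + k*MAD, is a local maximum
    within +/-reach frames, and lies at least refractory_s after the last onset."""
    half = max(1, round(half_window_s / step_s))
    onsets = []
    for i, value in enumerate(curve):
        local = curve[max(0, i - half):i + half + 1]      # about 0.5 s of context
        med = median(local)
        mad = median(abs(u - med) for u in local)
        if value <= med + k * mad:
            continue
        if value < max(curve[max(0, i - reach):i + reach + 1]):
            continue
        t = i * step_s
        if onsets and t - onsets[-1] < refractory_s:
            continue
        onsets.append(t)
    return onsets

# Made-up slope curve: two clear rises at 0.2 s and 0.7 s, 10 ms steps
curve = [0.0] * 100
curve[20] = curve[70] = 5.0
print(pick_onsets(curve, step_s=0.01, k=1.5, refractory_s=60 / 180))
```

The same `pick_onsets` logic applies to a spectral flux curve, since both methods share the thresholding and peak-picking steps.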

Method Selection Guide

Music Type      Recommended Method   Sensitivity   Notes
Classical       Spectral Flux        1.3-1.7       Handles complex textures, legato
Jazz            Spectral Flux        1.4-1.8       Syncopation, complex rhythms
Rock/Pop        Intensity Slope      1.2-1.6       Strong beats, percussive
Electronic      Intensity Slope      1.1-1.5       Clear attacks, synthetic sounds
Folk/World      Spectral Flux        1.5-2.0       Varied instrumentation
Speech/Rap      Intensity Slope      1.6-2.0       Rhythmic speech, clear syllables
Mixed/Unknown   Spectral Flux        1.5           Default starting point

When to choose each method:
  • Choose Spectral Flux if: Music has complex textures, legato passages, multiple instruments, or you're unsure
  • Choose Intensity Slope if: Music is strongly percussive, has clear attacks, or you need faster processing
  • Try both methods if: Results from one method seem unsatisfactory
  • Higher sensitivity: More detections (good for subtle rhythms, risk of false positives)
  • Lower sensitivity: Fewer detections (good for clear beats, risk of missed onsets)

Parameters & Settings

Tempo Range Parameters

Parameter   Type       Default   Description
Min_BPM     positive   60        Minimum expected tempo (beats per minute)
Max_BPM     positive   180       Maximum expected tempo (beats per minute)

Onset Detection Parameters

Parameter     Type         Default         Description
Method        optionmenu   Spectral flux   Onset detection algorithm
Sensitivity   positive     1.5             Detection sensitivity (1.0-2.0)

Tempo Curve Parameters

Parameter        Type       Default   Description
Smoothing_(Hz)   positive   0.5       Tempo curve smoothing cutoff frequency

Auto-calculated Parameters

Refractory period = 60 / max_BPM
    Prevents multiple detections on the same musical event
Window size = max(4.0, 4 × (60 / min_BPM))
    Analysis window duration in seconds
Hop size = window_size / 8
    Time between consecutive analysis windows

Example: min_BPM=60, max_BPM=180
    Refractory = 60/180 = 0.333 seconds
    Window size = max(4.0, 4×1.0) = 4.0 seconds
    Hop size = 4.0/8 = 0.5 seconds
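The three formulas above are easy to check for other tempo ranges with a small helper. The function name is hypothetical; the arithmetic follows the formulas as stated.

```python
def derived_parameters(min_bpm, max_bpm):
    """Return (refractory_s, window_s, hop_s) from the user's tempo range."""
    refractory = 60.0 / max_bpm
    window = max(4.0, 4.0 * (60.0 / min_bpm))
    hop = window / 8.0
    return refractory, window, hop

print(derived_parameters(60, 180))  # defaults: 4.0 s window, 0.5 s hop
print(derived_parameters(40, 160))  # (0.375, 6.0, 0.75)
```

Note that the window only grows beyond 4 seconds once Min_BPM drops below 60.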

Output Formats

Output 1: TextGrid with Beat Annotations

📝 Temporal Annotation File

Structure: Single-tier TextGrid with a "beats" point tier

Content: Point annotations at each detected onset time

Usage: Visual inspection in Praat, manual correction, further analysis

Name format: Same as input sound

TextGrid Applications

Manual verification: Play sound with beat markers to check detection accuracy
Correction: Manually add/move/delete beat points as needed
Export: Save as text file for external analysis
Integration: Use with other Praat scripts for rhythm analysis

Example TextGrid structure:

    File type = "ooTextFile"
    Object class = "TextGrid"

    xmin = 0
    xmax = 120.5
    tiers? <exists>
    size = 1
    item []:
        item [1]:
            class = "TextTier"
            name = "beats"
            xmin = 0
            xmax = 120.5
            points: size = 184
            points [1]:
                number = 0.125
                mark = "beat"
            points [2]:
                number = 0.567
                mark = "beat"
            ... etc ...

Output 2: Table with Tempo Curve

📊 Time-BPM Data Table

Structure: Three-column table with time, BPM, confidence

Content: Sampled tempo curve with quality indicators

Usage: Quantitative analysis, plotting, statistical processing

Name format: "TempoCurve_originalname"

Table Columns

Column       Type      Description                               Example Values
time         numeric   Center time of analysis window (seconds)  2.25, 2.75, 3.25, ...
bpm          numeric   Estimated tempo at window center (BPM)    118.5, 119.2, 117.8, ...
confidence   numeric   Number of IOIs used in estimate           4, 3, 5, 2, 0, ...

Table Applications

Statistical analysis: Calculate mean, variance, trends in tempo
Visualization: Plot BPM vs time in external software
Export: Save as CSV for spreadsheet analysis
Filtering: Use confidence column to filter reliable estimates
Comparative analysis: Compare tempo curves between performances

Example table usage:
  - Calculate mean BPM: average of bpm column
  - Filter high-confidence: where confidence ≥ 3
  - Find tempo range: min(bpm) to max(bpm)
  - Detect tempo changes: large differences between consecutive bpm values
  - Export to CSV: File → Save as comma-separated file
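Once the Table is exported as CSV, the usage ideas above take only a few lines of Python. This sketch assumes the file has a header row with the column names time, bpm, and confidence as listed under Table Columns; the function name, the 10 BPM jump threshold, and the sample data are made up.

```python
import csv
import io

def summarize(csv_text, min_conf=3, jump_bpm=10.0):
    """Mean, range, and tempo-jump times over rows with confidence >= min_conf."""
    rows = csv.DictReader(io.StringIO(csv_text))
    good = [(float(r["time"]), float(r["bpm"]))
            for r in rows if float(r["confidence"]) >= min_conf]
    if not good:
        return None
    bpms = [bpm for _time, bpm in good]
    # A "jump" is a large difference between consecutive retained estimates
    jumps = [t2 for (_t1, b1), (t2, b2) in zip(good, good[1:])
             if abs(b2 - b1) > jump_bpm]
    return {"mean": sum(bpms) / len(bpms), "min": min(bpms),
            "max": max(bpms), "jumps": jumps}

sample = """time,bpm,confidence
2.0,118.0,4
2.5,120.0,3
3.0,60.0,1
3.5,122.0,5
4.0,140.0,4
"""
print(summarize(sample))
```

To read a real export, pass the file's contents instead of `sample`, for example `summarize(open(path).read())` with `path` pointing at the saved CSV.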

Info Window Summary

Processing summary:

    === Tempo Curve Estimator ===
    Refractory period: 0.333 s
    Window size: 4.00 s
    Hop size: 0.50 s
    Detecting onsets...
    Processing 24000 frames with 30 log-spaced bands...
    Found 184 onsets
    Calculating tempo curve...
    Smoothing tempo curve...

Results summary:

    === RESULTS ===
    TextGrid: 184 beat points
    Table: 235 tempo estimates
    Mean BPM (confident regions): 118.7
    Done!

Applications

Music Performance Analysis

Use case: Analyzing expressive timing in musical performances

Technique: Compare tempo curves across performances

Example: Different interpretations of the same classical piece

Music Education

Use case: Teaching rhythm and tempo concepts

Technique: Visualize student performance timing

Example: Showing tempo inconsistencies in student performances

Music Information Retrieval

Use case: Feature extraction for music classification

Technique: Use tempo curve as input to machine learning

Example: Distinguishing musical genres by tempo variability

Audio Production

Use case: Tempo mapping for audio editing

Technique: Create tempo maps for synchronization

Example: Aligning audio to video with variable tempo

Practical Workflow Examples

🎵 Classical Music Analysis

Goal: Analyze rubato in Chopin performance

Settings:

  • Method: Spectral flux
  • Tempo range: 40-160 BPM
  • Sensitivity: 1.7
  • Smoothing: 0.3 Hz

Result: Detailed tempo curve showing expressive timing

🥁 Drum Track Analysis

Goal: Extract tempo from drum recording

Settings:

  • Method: Intensity slope
  • Tempo range: 80-200 BPM
  • Sensitivity: 1.3
  • Smoothing: 0.8 Hz

Result: Stable tempo curve from clear percussive onsets

📚 Research: Performance Comparison

Goal: Compare tempo curves across multiple performances

Workflow:

  1. Process each performance with identical settings
  2. Export Tables as CSV files
  3. Import to statistical software
  4. Calculate correlation, mean differences, variability

Result: Quantitative comparison of interpretive choices
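Step 4 of this workflow could be done in any statistics package; as a minimal sketch in Python, the helpers below compute a Pearson correlation and a mean difference between two tempo curves. They assume both curves have already been aligned to the same sample points (recordings of different length would need resampling first), and the BPM values are invented.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equally long BPM sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def mean_difference(a, b):
    """Average of a[i] - b[i]: positive when performance a is faster overall."""
    return sum(x - y for x, y in zip(a, b)) / len(a)

# Made-up curves: similar shape, second performance about 6 BPM slower
perf_a = [100.0, 104.0, 110.0, 106.0, 98.0]
perf_b = [94.0, 99.0, 104.0, 99.0, 92.0]
print(pearson(perf_a, perf_b))
print(mean_difference(perf_a, perf_b))
```

A high correlation with a non-zero mean difference would suggest the two performers shape the phrase similarly but at different base tempos.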

Advanced Techniques

Tempo range optimization:
  • Known piece: Use published metronome marking ±40%
  • Unknown piece: Start with 60-180 BPM, adjust based on results
  • Very slow music: 30-120 BPM (adagio, largo)
  • Very fast music: 100-240 BPM (presto, vivace)
  • Mixed tempo: Use wider range 40-200 BPM
Smoothing strategies:
  • Rubato analysis: 0.2-0.4 Hz (preserve expressive timing)
  • Steady tempo: 0.6-1.0 Hz (remove performance noise)
  • Electronic music: 0.8-1.2 Hz (very stable tempo expected)
  • Live performance: 0.3-0.6 Hz (balance stability and expression)

Troubleshooting Common Issues

Problem: Half/double tempo detected
Cause: Tempo range too wide or incorrect
Solution: Adjust min_BPM/max_BPM to expected range, check mean BPM against perception
Problem: Too few onsets detected
Cause: Sensitivity too low or wrong method
Solution: Increase sensitivity, try alternative detection method
Problem: Too many false detections
Cause: Sensitivity too high or noisy recording
Solution: Decrease sensitivity, use spectral flux method for complex music
Problem: Tempo curve too jittery
Cause: Insufficient smoothing or low confidence
Solution: Apply more smoothing (per the Quick start, a lower Smoothing_(Hz) value gives a smoother curve), and check confidence values in the output table
Problem: Processing very slow
Cause: Long audio file with spectral flux method
Solution: Use intensity slope method, or process shorter segments