This script implements tempo curve estimation: it analyzes musical audio to detect rhythmic onsets and calculate tempo variations over time. The process:
(1) Onset detection: Identify musical events using spectral flux or intensity slope methods
(2) IOI calculation: Compute inter-onset intervals between detected beats
(3) Tempo estimation: Convert IOIs to BPM values with octave correction
(4) Curve smoothing: Apply temporal smoothing for stable tempo curves
(5) Output generation: Create TextGrid with beat positions and Table with BPM over time
Result: detailed tempo analysis showing rhythmic variations throughout musical performances.
Adaptive Thresholding — Automatic sensitivity adjustment to signal content
Octave Correction — Handles tempo doubling/halving ambiguities
Temporal Continuity — Smooth tempo transitions between windows
Confidence Metrics — Quality indicators for tempo estimates
Dual Output — TextGrid beats + Table BPM curve
What is tempo curve analysis? Traditional tempo detection gives a single BPM estimate for the entire piece. Tempo curve analysis tracks BPM variations over time, capturing rubato, accelerando, and ritardando.
Advantages:
(1) Musical accuracy: Captures expressive timing variations
(2) Detailed analysis: Shows tempo evolution throughout piece
(3) Performance analysis: Quantifies interpretive choices
(4) Musicological insight: Reveals structural tempo relationships
(5) Practical applications: Music transcription, performance analysis, audio synchronization
Use cases: Musicology research (performance practice), music education (timing analysis), audio production (tempo mapping), dance/movement studies (rhythm analysis), music information retrieval (feature extraction).
Technical Implementation:
(1) Pre-processing: High-pass filtering, optional resampling for efficiency
(2) Onset detection: Spectral flux (frequency content changes) or intensity slope (amplitude changes) with adaptive thresholding
(3) IOI calculation: Time differences between consecutive onsets
(4) Tempo estimation: Sliding window median IOI calculation with octave correction
(5) Smoothing: One-pole low-pass filter for stable curves
(6) Output: TextGrid with beat markers, Table with time-BPM-confidence data
Key insight: Onset timing → inter-onset intervals → local tempo estimates → smoothed tempo curve.
Quick start
In Praat, select exactly one Sound object (musical audio).
Run script… → tempo_curve_estimator.praat.
Set tempo range (Min_BPM, Max_BPM) appropriate for your audio.
Adjust sensitivity (1.0-2.0, higher = more onsets detected).
Set smoothing (0.1-1.0 Hz, lower = smoother tempo curve).
Click OK — processes audio, creates TextGrid and Table outputs.
Quick tip: Start with the Spectral flux method and sensitivity 1.5 for most music. Use Intensity slope for strongly percussive material. Set the tempo range to the expected BPM ± 40%. For classical music with rubato, use a lower smoothing cutoff (0.2-0.5 Hz). For electronic music, use a higher smoothing cutoff (0.5-1.0 Hz). Check the Info window for detection statistics and mean BPM. Processing time depends on audio length and method choice.
Important: The tempo range is critical. An incorrect range causes octave errors (half/double tempo), and very wide ranges reduce accuracy. Onset detection is sensitive to recording quality, so noisy recordings may need a sensitivity adjustment. Spectral flux works better for complex music but is computationally intensive. Intensity slope is faster but may miss subtle onsets. Smoothing trades stability against responsiveness: too much smoothing misses tempo changes, too little creates jitter. Always verify results against auditory perception.
Tempo Analysis Theory
Inter-Onset Interval (IOI) Fundamentals
Basic Concepts
Definitions:
Onset: Point in time where musical event begins (note, beat, attack)
Inter-Onset Interval (IOI): Time between consecutive onsets
IOI = tₙ₊₁ - tₙ
Tempo (BPM): Beats per minute
BPM = 60 / IOI_median
Where IOI_median is typical beat interval
Example:
IOI = 0.5 seconds → BPM = 60 / 0.5 = 120
IOI = 0.75 seconds → BPM = 60 / 0.75 = 80
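The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the script's own code; the function names and the onset list are hypothetical.

```python
def iois_from_onsets(onset_times):
    """Return inter-onset intervals (seconds) between consecutive onsets."""
    return [later - earlier
            for earlier, later in zip(onset_times, onset_times[1:])]


def bpm_from_ioi(ioi_seconds):
    """Convert one beat interval in seconds to beats per minute."""
    if ioi_seconds <= 0:
        raise ValueError("IOI must be positive")
    return 60.0 / ioi_seconds


onsets = [0.0, 0.5, 1.0, 1.5]   # illustrative onset times in seconds
iois = iois_from_onsets(onsets)
print(iois)                     # [0.5, 0.5, 0.5]
print(bpm_from_ioi(0.5))        # 120.0
print(bpm_from_ioi(0.75))       # 80.0
```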
Tempo Ambiguity (Octave Errors)
The doubling/halving problem:
Given IOI, possible tempo interpretations:
Fast interpretation: BPM = 60 / IOI
Medium interpretation: BPM = 60 / (2 × IOI)
Slow interpretation: BPM = 60 / (4 × IOI)
Example: IOI = 0.25 seconds
Fast: 60 / 0.25 = 240 BPM
Medium: 60 / 0.5 = 120 BPM ← Usually correct
Slow: 60 / 1.0 = 60 BPM
Solution: Test multiple candidates, choose most continuous
Candidates: {raw_BPM/2, raw_BPM, raw_BPM×2}
Select candidate closest to previous tempo
Constrain to user-specified range [min_BPM, max_BPM]
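One way to read the candidate selection above is sketched below. The function name is hypothetical, and the two fallback branches (no candidate in range, no previous tempo) are assumptions, since the source does not say how the script handles those cases.

```python
def correct_octave(raw_bpm, previous_bpm, min_bpm, max_bpm):
    """Pick the octave candidate in range that is closest to the previous tempo."""
    candidates = [raw_bpm / 2.0, raw_bpm, raw_bpm * 2.0]
    in_range = [c for c in candidates if min_bpm <= c <= max_bpm]
    if not in_range:
        # Assumption: with no valid candidate, clamp the raw value into range.
        return min(max(raw_bpm, min_bpm), max_bpm)
    if previous_bpm is None:
        # Assumption: without history, stay as close to the raw value as possible.
        return min(in_range, key=lambda c: abs(c - raw_bpm))
    return min(in_range, key=lambda c: abs(c - previous_bpm))


# IOI = 0.25 s gives raw 240 BPM; only 120 BPM falls inside 60-180.
print(correct_octave(240.0, 118.0, 60.0, 180.0))  # 120.0
# Two candidates in range (60 and 120): continuity with 62 BPM picks 60.
print(correct_octave(120.0, 62.0, 40.0, 160.0))   # 60.0
```

Filtering by range first and then by continuity means a wide range leaves more candidates alive, which is why a wide range makes octave errors more likely.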
Sliding Window Analysis
Window Configuration
Adaptive window sizing:
Window size: Adaptive based on minimum tempo
window_size = max(4.0, 4 × (60 / min_BPM))
Hop size: 1/8 of window for smooth overlapping
hop_size = window_size / 8
Rationale:
- Window must contain multiple beats for reliable statistics
- 4× longest expected beat period ensures multiple samples
- A hop of 1/8 window (87.5% overlap) provides smooth temporal resolution
Example: min_BPM = 60
Longest beat = 1.0 second
Window size = max(4.0, 4×1.0) = 4.0 seconds
Hop size = 4.0 / 8 = 0.5 seconds
Temporal Resolution
Trade-off between stability and responsiveness:
Large window (4-8 seconds):
+ Stable tempo estimates
+ Robust to missing onsets
- Slow response to tempo changes
- Poor for music with rubato
Small window (2-4 seconds):
+ Fast response to tempo changes
+ Good for expressive timing
- Sensitive to missing onsets
- More estimation variance
This script: Adaptive 4× period, minimum 4 seconds
Good compromise for most musical styles
Statistical Robustness
Median-Based Estimation
Why median instead of mean:
IOI distribution in music:
- Majority of IOIs close to beat period
- Some IOIs half/double period (off-beats, syncopation)
- Occasional outliers (missing detections, extra detections)
Mean IOI: Sensitive to outliers
Example: IOIs = [0.5, 0.5, 0.5, 0.25, 2.0] seconds
Mean = (0.5+0.5+0.5+0.25+2.0)/5 = 0.75 seconds → 80 BPM (wrong)
Median IOI: Robust to outliers
Same data: Median = 0.5 seconds → 120 BPM (correct)
This script: Uses median IOI per window
Confidence Metrics
Quality indicators:
Confidence = number of IOIs in analysis window
Interpretation:
Confidence ≥ 4: High reliability
Confidence = 2-3: Moderate reliability
Confidence = 1: Low reliability (single interval)
Confidence = 0: No data (uses previous estimate)
Mean BPM calculation: Only uses confident estimates
if confidence ≥ 2: include in average
Excludes low-quality estimates from final summary
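The per-window median and the confidence-filtered mean can be sketched as follows. This is an illustration under stated assumptions: each IOI is assigned to the window containing the onset that ends it, windows start at time 0, and all function names are hypothetical. The script's own bookkeeping may differ.

```python
import statistics


def window_estimates(onset_times, duration, window_size, hop_size):
    """Return (center_time, median_ioi_or_None, confidence) per window."""
    # Assumption: an IOI is time-stamped with the onset that closes it.
    iois = [(later, later - earlier)
            for earlier, later in zip(onset_times, onset_times[1:])]
    rows = []
    start = 0.0
    while start < duration:
        end = start + window_size
        in_window = [ioi for t, ioi in iois if start <= t < end]
        confidence = len(in_window)          # number of IOIs in the window
        median_ioi = statistics.median(in_window) if in_window else None
        rows.append((start + window_size / 2.0, median_ioi, confidence))
        start += hop_size
    return rows


def confident_mean_bpm(rows, min_confidence=2):
    """Average BPM over windows with at least `min_confidence` IOIs."""
    bpms = [60.0 / m for _, m, c in rows
            if m is not None and c >= min_confidence]
    return sum(bpms) / len(bpms) if bpms else None


onsets = [i * 0.5 for i in range(21)]        # steady 120 BPM for 10 s
rows = window_estimates(onsets, duration=10.0, window_size=4.0, hop_size=0.5)
print(rows[0])                               # (2.0, 0.5, 7)
print(confident_mean_bpm(rows))              # 120.0
```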
Complete Processing Pipeline
INPUT: Sound object (musical audio)
STEP 1: PRE-PROCESSING
- Resample to 11025 Hz if needed (efficiency)
- High-pass filter at 30 Hz (remove rumble/DC)
- Convert to mono if stereo
STEP 2: ONSET DETECTION
IF Spectral flux method:
- Create spectrogram (30ms windows, 5ms hop)
- Calculate spectral flux in log-spaced bands
- Adaptive thresholding (local median + sensitivity×MAD)
- Peak picking with refractory period
ELSE Intensity slope method:
- Calculate intensity contour
- Median filter intensity
- Compute intensity slope (derivative)
- Adaptive thresholding and peak picking
STEP 3: BEAT ANNOTATION
- Create TextGrid with "beats" tier
- Insert point at each detected onset time
STEP 4: TEMPO CURVE ESTIMATION
FOR each analysis window (sliding):
- Collect IOIs within window
- Calculate median IOI
- Convert to raw BPM = 60 / median_IOI
- Test octave candidates: {raw/2, raw, raw×2}
- Select candidate within [min_BPM, max_BPM]
- Prefer candidate closest to previous BPM
- Store confidence = number of IOIs in window
STEP 5: TEMPORAL SMOOTHING
- Apply one-pole low-pass filter
- Cutoff frequency = smoothing parameter
- Causal filter (depends only on current and past values)
STEP 6: OUTPUT GENERATION
- Create Table with columns: time, BPM, confidence
- Calculate mean BPM (confident regions only)
- Display summary statistics
OUTPUTS: TextGrid (beat positions), Table (BPM curve)
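Step 5 of the pipeline can be illustrated with a standard causal one-pole low-pass filter. The coefficient formula below is the common textbook form, taken here as an assumption because the source does not give the script's exact formula. The function name and sample values are hypothetical.

```python
import math


def one_pole_lowpass(values, cutoff_hz, hop_seconds):
    """Causal one-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    # Assumed coefficient: alpha = 1 - exp(-2*pi*fc/fs), with fs = 1/hop.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz * hop_seconds)
    smoothed = []
    state = values[0] if values else 0.0     # start from the first estimate
    for x in values:
        state += alpha * (x - state)
        smoothed.append(state)
    return smoothed


raw_bpm = [120.0, 120.0, 60.0, 120.0, 120.0]  # one octave-error glitch
gentle = one_pole_lowpass(raw_bpm, cutoff_hz=0.5, hop_seconds=0.5)
strong = one_pole_lowpass(raw_bpm, cutoff_hz=0.1, hop_seconds=0.5)
print(gentle)
print(strong)
```

With these inputs the 0.1 Hz curve dips less at the glitch than the 0.5 Hz curve, which matches the rule that a lower cutoff gives a smoother result.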
Onset Detection Methods
Method 1: Spectral Flux
📊 Frequency Content Analysis
Principle: Detect changes in spectral energy distribution
Calculation: Sum of positive spectral differences across frequency bands
Strengths: Works for complex music, polyphonic textures, non-percussive onsets
Weaknesses: Computationally intensive, sensitive to noise
Best for: Classical music, jazz, acoustic ensembles, vocal music
Technical Implementation
Spectral Flux Algorithm:
1. SPECTROGRAM CONSTRUCTION
- Window: 30ms Gaussian
- Hop: 5ms (high temporal resolution)
- Frequency range: 0-4000Hz (most musical energy)
2. LOG-SPACED FREQUENCY BANDS
- 30 bands with exponential spacing
- More resolution at low frequencies (perceptual weighting)
- Band selection: row = 1 + (nRows-1) × (ratio^1.5)
3. SPECTRAL FLUX CALCULATION
FOR each frame (after first):
flux = 0
FOR each frequency band:
mag_curr = ln(1 + |spectrum[band, frame]|)
mag_prev = ln(1 + |spectrum[band, frame-1]|)
flux += max(mag_curr - mag_prev, 0) // Positive differences only
4. ADAPTIVE THRESHOLDING
- Local median in 0.5s window
- Threshold = local_median + sensitivity × MAD
- MAD = Median Absolute Deviation (robust spread measure)
5. PEAK PICKING
- Local maxima detection (±2 frames)
- Refractory period: 60/max_BPM seconds
- Prevents multiple detections on same event
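Steps 3 and 4 of the algorithm above can be sketched in plain Python. The sketch assumes the band magnitudes have already been extracted into a list of frames. The function names are hypothetical, and how the script's Sensitivity setting maps onto the multiplier is not shown in the source, so `sensitivity` here is simply the factor in the threshold formula.

```python
import math
import statistics


def spectral_flux(band_magnitudes):
    """band_magnitudes[frame][band] -> flux per frame (first frame is 0)."""
    flux = [0.0]
    for prev, curr in zip(band_magnitudes, band_magnitudes[1:]):
        total = 0.0
        for p, c in zip(prev, curr):
            diff = math.log1p(abs(c)) - math.log1p(abs(p))
            total += max(diff, 0.0)          # positive differences only
        flux.append(total)
    return flux


def adaptive_threshold(signal, half_window, sensitivity):
    """Local median + sensitivity * MAD over +/- half_window frames."""
    thresholds = []
    for i in range(len(signal)):
        local = signal[max(0, i - half_window):i + half_window + 1]
        local_median = statistics.median(local)
        mad = statistics.median([abs(v - local_median) for v in local])
        thresholds.append(local_median + sensitivity * mad)
    return thresholds


# Toy input: energy appears in band 0 at frame 2.
frames = [[1.0, 1.0], [1.0, 1.0], [5.0, 1.0], [5.0, 1.0]]
flux = spectral_flux(frames)
# 50 frames on each side is roughly a 0.5 s window at a 5 ms hop.
thresholds = adaptive_threshold(flux, half_window=50, sensitivity=1.5)
candidates = [i for i, f in enumerate(flux) if f > thresholds[i]]
print(candidates)  # [2]
```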
Method 2: Intensity Slope
📈 Amplitude Change Analysis
Principle: Detect rapid increases in sound intensity
Calculation: Positive derivative of smoothed intensity contour
Strengths: Fast computation, works well for percussive sounds
Weaknesses: Misses spectral onsets, sensitive to overall level changes
Best for: Percussive music, drum tracks, electronic music, speech rhythm
Technical Implementation
Intensity Slope Algorithm:
1. INTENSITY CONTOUR
- 50Hz minimum pitch (broadband analysis)
- 10ms time steps (high resolution)
- Cubic interpolation for smooth values
2. MEDIAN FILTERING
- 3-point median filter
- Removes spikes while preserving edges
- Values: [intensity[i-1], intensity[i], intensity[i+1]]
- Output: median of three values
3. INTENSITY SLOPE CALCULATION
slope[i] = intensity[i] - intensity[i-1]
(Positive values indicate increasing intensity)
4. ADAPTIVE THRESHOLDING
- Local median in 0.5s window
- Threshold = local_median + sensitivity × MAD
- Same robust statistics as spectral method
5. PEAK PICKING
- Local maxima detection (±2 frames)
- Refractory period: 60/max_BPM seconds
- Output: onset times
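The median filter, slope, and peak-picking steps can be sketched as below. This is an illustration only: the intensity values are invented, the endpoint handling of the median filter is an assumption, and a constant threshold list stands in for the adaptive one.

```python
import statistics


def median3(values):
    """3-point median filter; endpoints are copied unchanged (an assumption)."""
    if len(values) < 3:
        return list(values)
    middle = [statistics.median(values[i - 1:i + 2])
              for i in range(1, len(values) - 1)]
    return [values[0]] + middle + [values[-1]]


def positive_slope(values):
    """slope[i] = values[i] - values[i-1], keeping rises only; slope[0] = 0."""
    return [0.0] + [max(b - a, 0.0) for a, b in zip(values, values[1:])]


def pick_peaks(signal, thresholds, time_step, refractory, neighborhood=2):
    """Times of local maxima above threshold, at least `refractory` s apart."""
    onsets = []
    for i, value in enumerate(signal):
        lo = max(0, i - neighborhood)
        hi = min(len(signal), i + neighborhood + 1)
        if value <= thresholds[i] or value < max(signal[lo:hi]):
            continue
        t = i * time_step
        if not onsets or t - onsets[-1] >= refractory:
            onsets.append(t)
    return onsets


# Invented intensity contour (dB) sampled every 10 ms, with two attacks.
intensity = [60, 60, 60, 75, 74, 60, 60, 60, 60, 78, 77, 60]
slope = positive_slope(median3(intensity))
onsets = pick_peaks(slope, [5.0] * len(slope), time_step=0.01,
                    refractory=60.0 / 180.0)
print([round(t, 3) for t in onsets])  # [0.03]
```

The second attack at 0.09 s is dropped because it falls inside the 0.333 s refractory period that follows the first onset.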
Method Selection Guide
Music Type    | Recommended Method | Sensitivity | Notes
Classical     | Spectral Flux      | 1.3-1.7     | Handles complex textures, legato
Jazz          | Spectral Flux      | 1.4-1.8     | Syncopation, complex rhythms
Rock/Pop      | Intensity Slope    | 1.2-1.6     | Strong beats, percussive
Electronic    | Intensity Slope    | 1.1-1.5     | Clear attacks, synthetic sounds
Folk/World    | Spectral Flux      | 1.5-2.0     | Varied instrumentation
Speech/Rap    | Intensity Slope    | 1.6-2.0     | Rhythmic speech, clear syllables
Mixed/Unknown | Spectral Flux      | 1.5         | Default starting point
When to choose each method:
Choose Spectral Flux if: Music has complex textures, legato passages, multiple instruments, or you're unsure
Choose Intensity Slope if: Music is strongly percussive, has clear attacks, or you need faster processing
Try both methods if: Results from one method seem unsatisfactory
Higher sensitivity: More detections (good for subtle rhythms, risk of false positives)
Lower sensitivity: Fewer detections (good for clear beats, risk of missed onsets)
Parameters & Settings
Tempo Range Parameters
Parameter | Type     | Default | Description
Min_BPM   | positive | 60      | Minimum expected tempo (beats per minute)
Max_BPM   | positive | 180     | Maximum expected tempo (beats per minute)
Onset Detection Parameters
Parameter   | Type       | Default       | Description
Method      | optionmenu | Spectral flux | Onset detection algorithm
Sensitivity | positive   | 1.5           | Detection sensitivity (1.0-2.0)
Tempo Curve Parameters
Parameter      | Type     | Default | Description
Smoothing_(Hz) | positive | 0.5     | Tempo curve smoothing cutoff frequency
Auto-calculated Parameters
Refractory period = 60 / max_BPM
Prevents multiple detections on same musical event
Window size = max(4.0, 4 × (60 / min_BPM))
Analysis window duration in seconds
Hop size = window_size / 8
Time between consecutive analysis windows
Example: min_BPM=60, max_BPM=180
Refractory = 60/180 = 0.333 seconds
Window size = max(4.0, 4×1.0) = 4.0 seconds
Hop size = 4.0/8 = 0.5 seconds
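The three derived values follow directly from the formulas above. A minimal sketch, with a hypothetical function name and an added range check that the source does not mention:

```python
def derived_parameters(min_bpm, max_bpm):
    """Return (refractory, window_size, hop_size) in seconds."""
    if not 0 < min_bpm < max_bpm:
        # Assumption: the range check is added here for safety.
        raise ValueError("expected 0 < min_bpm < max_bpm")
    refractory = 60.0 / max_bpm
    window_size = max(4.0, 4.0 * (60.0 / min_bpm))
    hop_size = window_size / 8.0
    return refractory, window_size, hop_size


refractory, window_size, hop_size = derived_parameters(60, 180)
print(round(refractory, 3), window_size, hop_size)   # 0.333 4.0 0.5
# A slower minimum tempo lengthens the window: 4 x 1.5 s = 6.0 s.
print(derived_parameters(40, 160))                   # (0.375, 6.0, 0.75)
```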
Output Formats
Output 1: TextGrid with Beat Annotations
📝 Temporal Annotation File
Structure: Single-tier TextGrid with "beats" point tier
Content: Point annotations at each detected onset time
Usage: Visual inspection in Praat, manual correction, further analysis
Name format: Same as input sound
TextGrid Applications
Manual verification: Play sound with beat markers to check detection accuracy
Correction: Manually add/move/delete beat points as needed
Export: Save as text file for external analysis
Integration: Use with other Praat scripts for rhythm analysis
Example TextGrid structure:
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 120.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "TextTier"
        name = "beats"
        xmin = 0
        xmax = 120.5
        points: size = 184
        points [1]:
            number = 0.125
            mark = "beat"
        points [2]:
            number = 0.567
            mark = "beat"
        ... etc ...
Output 2: Table with Tempo Curve
📊 Time-BPM Data Table
Structure: Three-column table with time, BPM, confidence
Content: Sampled tempo curve with quality indicators
Statistical analysis: Calculate mean, variance, trends in tempo
Visualization: Plot BPM vs time in external software
Export: Save as CSV for spreadsheet analysis
Filtering: Use confidence column to filter reliable estimates
Comparative analysis: Compare tempo curves between performances
Example table usage:
- Calculate mean BPM: average of bpm column
- Filter high-confidence: where confidence ≥ 3
- Find tempo range: min(bpm) to max(bpm)
- Detect tempo changes: large differences between consecutive bpm values
- Export to CSV: File → Save as comma-separated file
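Once the Table is saved as CSV, the operations listed above take only a few lines of Python. The column names `time`, `bpm`, and `confidence` are assumptions, so check the header of your exported file. The sample rows are invented for illustration and are not script output.

```python
import csv
import io

# Invented sample rows standing in for an exported Table.
SAMPLE_CSV = """time,bpm,confidence
2.0,118.0,5
2.5,119.0,4
3.0,121.0,1
3.5,140.0,3
"""


def load_rows(text):
    """Parse CSV text into dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"time": float(r["time"]), "bpm": float(r["bpm"]),
             "confidence": int(r["confidence"])} for r in reader]


def summarize(rows, min_confidence=3, jump_bpm=10.0):
    """Mean, range, and large tempo jumps over high-confidence rows."""
    reliable = [r for r in rows if r["confidence"] >= min_confidence]
    bpms = [r["bpm"] for r in reliable]
    jumps = [(b["time"], b["bpm"] - a["bpm"])
             for a, b in zip(reliable, reliable[1:])
             if abs(b["bpm"] - a["bpm"]) >= jump_bpm]
    return {
        "mean_bpm": sum(bpms) / len(bpms) if bpms else None,
        "min_bpm": min(bpms, default=None),
        "max_bpm": max(bpms, default=None),
        "jumps": jumps,
    }


summary = summarize(load_rows(SAMPLE_CSV))
print(summary["min_bpm"], summary["max_bpm"])  # 118.0 140.0
print(summary["jumps"])                        # [(3.5, 21.0)]
```

To read a real export, replace `io.StringIO(text)` with an opened file handle.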
Info Window Summary
Processing summary:
=== Tempo Curve Estimator ===
Refractory period: 0.333 s
Window size: 4.00 s
Hop size: 0.50 s
Detecting onsets...
Processing 24000 frames with 30 log-spaced bands...
Found 184 onsets
Calculating tempo curve...
Smoothing tempo curve...
Results summary:
=== RESULTS ===
TextGrid: 184 beat points
Table: 235 tempo estimates
Mean BPM (confident regions): 118.7
Done!
Applications
Music Performance Analysis
Use case: Analyzing expressive timing in musical performances
Technique: Compare tempo curves across performances
Example: Different interpretations of the same classical piece
Music Education
Use case: Teaching rhythm and tempo concepts
Technique: Visualize student performance timing
Example: Showing tempo inconsistencies in student performances
Music Information Retrieval
Use case: Feature extraction for music classification
Technique: Use tempo curve as input to machine learning
Example: Distinguishing musical genres by tempo variability
Audio Production
Use case: Tempo mapping for audio editing
Technique: Create tempo maps for synchronization
Example: Aligning audio to video with variable tempo
Practical Workflow Examples
🎵 Classical Music Analysis
Goal: Analyze rubato in Chopin performance
Settings:
Method: Spectral flux
Tempo range: 40-160 BPM
Sensitivity: 1.7
Smoothing: 0.3 Hz
Result: Detailed tempo curve showing expressive timing
🥁 Drum Track Analysis
Goal: Extract tempo from drum recording
Settings:
Method: Intensity slope
Tempo range: 80-200 BPM
Sensitivity: 1.3
Smoothing: 0.8 Hz
Result: Stable tempo curve from clear percussive onsets
📚 Research: Performance Comparison
Goal: Compare tempo curves across multiple performances
Workflow:
Process each performance with identical settings
Export Tables as CSV files
Import to statistical software
Calculate correlation, mean differences, variability
Result: Quantitative comparison of interpretive choices
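The statistics step of this workflow might look like the sketch below. It assumes the two BPM curves are already aligned sample by sample (same settings and the same number of rows); curves of different lengths would need resampling first. The function names and values are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("correlation is undefined for a constant curve")
    return cov / (norm_x * norm_y)


def compare_curves(bpm_a, bpm_b):
    """Correlation, mean difference, and variability of two tempo curves."""
    if len(bpm_a) != len(bpm_b) or len(bpm_a) < 2:
        raise ValueError("curves must be aligned and have at least 2 points")
    return {
        "correlation": pearson(bpm_a, bpm_b),
        "mean_difference": statistics.fmean(bpm_a) - statistics.fmean(bpm_b),
        "stdev_a": statistics.stdev(bpm_a),
        "stdev_b": statistics.stdev(bpm_b),
    }


# Invented curves: performance B follows the same shape at half the tempo.
performance_a = [100.0, 110.0, 120.0]
performance_b = [50.0, 55.0, 60.0]
result = compare_curves(performance_a, performance_b)
print(result["mean_difference"])  # 55.0
```

A high correlation with a large mean difference, as in this toy case, suggests the same tempo shape at a different base tempo.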
Advanced Techniques
Tempo range optimization:
Known piece: Use published metronome marking ±40%
Unknown piece: Start with 60-180 BPM, adjust based on results
Smoothing cutoff selection:
Electronic music: 0.8-1.2 Hz (very stable tempo expected)
Live performance: 0.3-0.6 Hz (balance stability and expression)
Troubleshooting Common Issues
Problem: Half/double tempo detected
Cause: Tempo range too wide or incorrect
Solution: Adjust min_BPM/max_BPM to the expected range, check mean BPM against perception
Problem: Too few onsets detected
Cause: Sensitivity too low or wrong method
Solution: Increase sensitivity, try the alternative detection method
Problem: Too many false detections
Cause: Sensitivity too high or noisy recording
Solution: Decrease sensitivity, use spectral flux method for complex music
Problem: Tempo curve too jittery
Cause: Insufficient smoothing or low confidence
Solution: Increase smoothing (lower the cutoff frequency), check confidence values in output table
Problem: Processing very slow
Cause: Long audio file with spectral flux method
Solution: Use intensity slope method, or process shorter segments