Pitch Loop Finder — User Guide

Turbo‑speed pitch‑based loop detection: finds repeating melodic/harmonic patterns in audio by comparing pitch contours across time offsets using optimized matrix formulas.

Technique: Pitch‑Contour Self‑Similarity
Implementation: Praat Script
Category: Audio Pattern Detection
Version: Turbo (Optimized)
Algorithm: Diagonal SSM Scanning

What this does

This script detects melodic/harmonic loops — repeating pitch patterns in audio — by comparing pitch contours at different time offsets. Unlike beat‑based loop finders that look for rhythmic repetitions, this algorithm focuses specifically on pitch similarity, making it ideal for finding melodic motifs, vocal phrases, or harmonic progressions that repeat at various time distances.


What is a pitch loop? A pitch loop occurs when a melodic/harmonic pattern repeats later in the audio:
  • Exact repetition: Same notes/phrase repeated (e.g., chorus melody)
  • Transposed repetition: Same contour at different pitch level
  • Varied repetition: Similar but not identical (ornamentation, improvisation)
  • Delayed repetition: Pattern repeats after time gap (verse‑chorus structure)
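This script detects these types by comparing pitch values in Hz within a configurable tolerance. Because the comparison uses absolute pitch, a transposed repetition is matched only when the transposition is smaller than tolerance_hz.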

Technical Implementation:
  1. Extract the pitch contour using Praat's pitch tracker.
  2. Convert it to a matrix and compute similarity scores with a fast formula: S(i,j) = 1 - |pitch_i - pitch_j|/tolerance if both frames are pitched and the difference is below tolerance, else 0.
  3. Scan diagonal bands of the similarity matrix for continuous high‑similarity segments.
  4. Filter candidates by duration and score, preventing overlaps.
  5. Annotate source and repeat locations in a TextGrid.
The "turbo" optimization uses Praat's built‑in C++ matrix operations instead of slow Praat‑script loops.

Quick start

  1. In Praat, select exactly one Sound object (mono works best).
  2. Run script…pitch_loop_finder.praat.
  3. Set time_step:
    • 0.05 (fast) — for initial exploration, long files
    • 0.02 (high precision) — for detailed analysis, short phrases
  4. Adjust pitch_floor and pitch_ceiling to match your audio's pitch range.
  5. Set tolerance_hz (default 50 Hz) — higher values allow more pitch variation.
  6. Define min_loop_duration (default 0.4s) — shortest loop to detect.
  7. Set num_loops_to_find (default 5) — top N loops to annotate.
  8. Click OK — script extracts pitch, computes similarity matrix, finds loops.
  9. Output: TextGrid appears with "Loops" and "Repeats" tiers.
  10. Info window shows detected loops with timings.
  11. Select Sound + TextGrid, click View & Edit to see annotations.
Quick tips:
  • Speech/phonetic patterns: use time_step=0.05, pitch_floor=75, pitch_ceiling=600, tolerance_hz=30.
  • Musical melodies: adjust pitch_ceiling to the instrument range (e.g., 1000 for soprano, 400 for cello).
  • Improvisation analysis: increase tolerance_hz to 100‑150 to capture varied repetitions.
  • Start with time_step=0.05 for speed, then refine with 0.02 if needed.
  • Check the Info window output: it lists the detected loops with their timings.
  • The TextGrid shows source (Loop 1) and repeat (Repeat 1) locations. Listen to each loop by selecting its interval in the TextGrid editor.
Important:
  • PITCH‑DEPENDENT: Only works on pitched audio (speech, singing, instruments). Unpitched/percussive sounds will not be detected.
  • TIME‑STEP TRADE‑OFF: Smaller time_step = more accurate timing but slower computation.
  • PITCH TRACKING ERRORS: Octave errors and voicing detection mistakes affect results.
  • OVERLAP PREVENTION: The script avoids overlapping loops, so it may miss nested/simultaneous patterns.
  • TOLERANCE SETTING: Too high = false positives (different patterns matched), too low = false negatives (variations missed).
  • MEMORY USAGE: The similarity matrix requires N×N cells; large files need careful time_step selection.

Loop Detection Algorithm

Step 1: Pitch Extraction

🎵 Praat Pitch Tracking

To Pitch: time_step, pitch_floor, pitch_ceiling

Parameters:
  • time_step: 0.05s = 20 frames/second, 0.02s = 50 frames/second
  • pitch_floor: Minimum F0 to consider (Hz)
  • pitch_ceiling: Maximum F0 to consider (Hz)

Output: Pitch object with:
  • N frames at regular time intervals
  • Each frame: pitch value in Hz, or undefined (unvoiced)

Frame conversion:

Time to frame: frame = round(time / time_step) + 1
Frame to time: time = (frame - 1) × time_step

Example with time_step=0.05s:
  Frame 1 → 0.00s
  Frame 2 → 0.05s
  Frame 3 → 0.10s
  ...
  Frame 20 → 0.95s
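The two conversions above can be sketched in Python for quick checks outside Praat. This is only an illustration: the function names are hypothetical, and rounding halves up is an assumption about the intended behaviour of round().

```python
import math


def time_to_frame(time_s: float, time_step: float) -> int:
    """Return the 1-based frame index nearest to time_s."""
    # floor(x + 0.5) rounds halves up; Python's built-in round() rounds halves to even.
    return math.floor(time_s / time_step + 0.5) + 1


def frame_to_time(frame: int, time_step: float) -> float:
    """Return the time in seconds of a 1-based frame index."""
    return (frame - 1) * time_step


if __name__ == "__main__":
    for frame in (1, 2, 3, 20):
        print(frame, round(frame_to_time(frame, 0.05), 2))
```

Converting a frame to a time and back should return the same frame index, which is a useful sanity check when mapping loop boundaries to seconds.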

Step 2: Similarity Matrix Computation

📊 Pitch Similarity Scoring

Similarity formula (for each cell i,j):

Let:
  p_i = pitch at frame i (Hz)
  p_j = pitch at frame j (Hz)
  T = tolerance_hz (default 50 Hz)

If p_i > 0 AND p_j > 0 AND |p_i - p_j| < T:
  similarity = 1 - (|p_i - p_j| / T)
Else:
  similarity = 0

Properties:
  • Range: 0 to 1
  • 1 = identical pitch (difference = 0)
  • 0.5 = difference of T/2
  • 0 = difference ≥ T, or either frame unvoiced

Why this formula?

  • Linear decay: Similarity decreases linearly with pitch difference
  • Thresholded: Only considers differences within tolerance
  • Unvoiced handling: Unpitched frames score 0 (not compared)
  • Interpretable: Score directly relates to pitch difference
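As a rough illustration of the scoring rule, here is a minimal Python sketch. It is not the script's Praat code: the function names are hypothetical, and unvoiced frames are assumed to be stored as 0.

```python
def similarity(p_i: float, p_j: float, tolerance_hz: float = 50.0) -> float:
    """Linear-decay pitch similarity in [0, 1]; 0 for unvoiced or distant frames."""
    if p_i > 0 and p_j > 0:
        diff = abs(p_i - p_j)
        if diff < tolerance_hz:
            return 1.0 - diff / tolerance_hz
    return 0.0


def similarity_matrix(pitch: list[float], tolerance_hz: float = 50.0) -> list[list[float]]:
    """N x N self-similarity matrix of a pitch contour (unvoiced frames = 0)."""
    return [[similarity(a, b, tolerance_hz) for b in pitch] for a in pitch]


if __name__ == "__main__":
    contour = [200.0, 225.0, 0.0, 250.0]  # Hz, third frame unvoiced
    for row in similarity_matrix(contour):
        print(row)
```

The matrix is symmetric because the rule depends only on the absolute pitch difference, which is why the diagonal scan only needs offsets on one side of the main diagonal.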

Step 3: Diagonal Scanning

🔍 Finding Loops Along Diagonals

Similarity Matrix (S[i,j]):

Time j →
     1 2 3 4 5 6 7 8 9 10
    ┌─────────────────────
  1 │ X . . . . . . . . .
  2 │ . X . . . . . . . .
  3 │ . . X . . ■ . . . .
  4 │ . . . X . . ■ . . .
T 5 │ . . . . X . . ■ . .
i 6 │ . . ■ . . X . . . .
m 7 │ . . . ■ . . X . . .
e 8 │ . . . . ■ . . X . .
  9 │ . . . . . . . . X .
 10 │ . . . . . . . . . X

Diagonal (gap = 3):
  S[3,6] = ■ (similar)
  S[4,7] = ■ (similar)  
  S[5,8] = ■ (similar)
  → Detected loop: frames 3‑5 repeating at frames 6‑8
    

Algorithm: For each possible time gap (offset), scan along that diagonal looking for consecutive high‑similarity cells.

🔄 Loop Candidate Detection

Scanning process:

For gap = min_len_frames to max_gap:
    Initialize path_len = 0, path_start = 0
    For i = 1 to (num_frames - gap):
        j = i + gap
        similarity = S[i,j]
        If similarity > threshold (0.5):
            If path_len = 0: path_start = i
            path_len = path_len + 1
        Else:
            If path_len ≥ min_len_frames:
                Save candidate: start=path_start, length=path_len, gap=gap
            Reset path_len = 0
    # Check path at end of diagonal
    If path_len ≥ min_len_frames:
        Save candidate

Parameters in frames:

  • min_len_frames = min_loop_duration / time_step
  • max_gap = num_frames - min_len_frames
  • threshold = 0.5 (hard‑coded, corresponds to pitch difference = tolerance/2)
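The scanning process can be sketched in Python as follows. This is an illustrative translation of the pseudocode rather than the script itself: scan_diagonals is a hypothetical name, and indices are 0-based, so frame 3 in the diagram is index 2.

```python
def scan_diagonals(ssm, min_len_frames, threshold=0.5, step=1):
    """Return (start, length, gap) tuples for runs of similar cells on each diagonal."""
    n = len(ssm)
    max_gap = n - min_len_frames
    candidates = []
    for gap in range(min_len_frames, max_gap + 1, step):
        path_start, path_len = 0, 0
        for i in range(n - gap):
            if ssm[i][i + gap] > threshold:
                if path_len == 0:
                    path_start = i
                path_len += 1
            else:
                if path_len >= min_len_frames:
                    candidates.append((path_start, path_len, gap))
                path_len = 0
        # A run that reaches the end of the diagonal is still a candidate.
        if path_len >= min_len_frames:
            candidates.append((path_start, path_len, gap))
    return candidates


# Toy 10 x 10 matrix matching the diagram: frames 3-5 repeat at frames 6-8.
demo = [[0.0] * 10 for _ in range(10)]
for i in (2, 3, 4):
    demo[i][i + 3] = 1.0
found = scan_diagonals(demo, min_len_frames=3)
print(found)
```

Each candidate records where the source segment starts, how many frames it spans, and how far ahead the repeat lies, which is all the later scoring step needs.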

Step 4: Candidate Scoring & Filtering

🏆 Scoring & Ranking

Score calculation:

For each candidate:
  score = path_length × average_similarity

Where:
  • path_length = number of consecutive similar frames
  • average_similarity = mean of S[i,j] along the diagonal segment

Interpretation:
  • Longer loops score higher (more evidence)
  • Higher similarity scores higher (better match)
  • The score combines length and accuracy

Filtering process:

  1. Sort by score: Highest score = best loop candidate
  2. Prevent overlaps: Skip candidates overlapping with higher‑scoring loops
  3. Limit output: Keep top num_loops_to_find non‑overlapping loops

Overlap check: Two loops overlap if their source segments (t1 to t2) overlap in time.
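A minimal Python sketch of the scoring and greedy filtering described above. The dictionary keys and function names are hypothetical, and the overlap test treats each source segment as the half-open frame range [start, start + length).

```python
def score_candidate(ssm, start, length, gap):
    """path_length x average similarity along the diagonal segment."""
    values = [ssm[i][i + gap] for i in range(start, start + length)]
    return length * (sum(values) / length)


def select_loops(candidates, num_loops_to_find):
    """Keep the best-scoring candidates whose source segments do not overlap."""
    chosen = []
    for cand in sorted(candidates, key=lambda c: c["score"], reverse=True):
        if len(chosen) >= num_loops_to_find:
            break
        start, end = cand["start"], cand["start"] + cand["length"]
        if all(end <= c["start"] or start >= c["start"] + c["length"] for c in chosen):
            chosen.append(cand)
    return chosen


candidates = [
    {"name": "A", "start": 0, "length": 20, "score": 16.0},
    {"name": "B", "start": 10, "length": 40, "score": 24.0},
    {"name": "C", "start": 60, "length": 10, "score": 9.0},
]
print([c["name"] for c in select_loops(candidates, 5)])
```

In this toy input, candidate A is dropped because its source segment overlaps the higher-scoring B, even though A would otherwise rank second.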

Complete Pipeline

🔄 Full Processing Flow

INPUT: Sound object

1. PITCH EXTRACTION (Praat's To Pitch)
   • time_step: 0.05s or 0.02s
   • Convert to N pitch frames
2. SIMILARITY MATRIX (Turbo Formula)
   • Create N×N matrix
   • Fill with: S(i,j) = similarity(pitch[i], pitch[j])
   • Uses Praat's C++ Formula engine
3. DIAGONAL SCANNING
   • For each gap (time offset)
   • Find continuous high‑similarity segments
   • Save candidates with start, length, gap, score
4. FILTERING & RANKING
   • Sort candidates by score (descending)
   • Remove overlapping loops
   • Keep top K candidates
5. ANNOTATION
   • Create TextGrid with 2 tiers
   • Tier 1: Loop source segments
   • Tier 2: Repeat segments
   • Label with loop numbers

OUTPUT: TextGrid + Info window report

Turbo Optimization Explained

The Speed Problem

🐌 Naïve Approach: O(N²) Interpreted Loops

Without optimization:

N = number of pitch frames (~duration/time_step)

Naïve computation:
  For i = 1 to N:                  # N iterations
    For j = 1 to N:                # N iterations each, N² in total
      Calculate similarity(i,j)    # runs N² times in the interpreter

Example: 30s audio, time_step=0.05s
  N = 30/0.05 = 600 frames
  Comparisons = 600 × 600 = 360,000
  In a Praat script loop: tens of seconds to minutes, depending on the machine

Why slow? Praat script interpreter loops are slow. Each iteration involves function calls, variable lookups, etc.
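To get a feel for how quickly the work grows, the frame and comparison counts can be estimated before running the script. The Python sketch below is illustrative only: comparison_cost is a hypothetical helper, and 8 bytes per matrix cell is an assumption (double precision), not a figure taken from the script.

```python
def comparison_cost(duration_s: float, time_step: float, bytes_per_cell: int = 8):
    """Return (frames, comparisons, approximate matrix size in MB)."""
    frames = round(duration_s / time_step)
    cells = frames * frames  # one similarity value per (i, j) pair
    return frames, cells, cells * bytes_per_cell / 1e6


if __name__ == "__main__":
    for duration, step in ((30, 0.05), (30, 0.02), (180, 0.05)):
        print(duration, step, comparison_cost(duration, step))
```

Halving time_step doubles the frame count but quadruples the number of cells, which is why long files are best explored with a coarse time_step first.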

The Turbo Solution

⚡ Praat's Formula Engine

Key insight: Praat's Formula command for matrices runs in compiled C++ code, not interpreted Praat script.

# SLOW (Praat script loop):
ssmID = Create simple Matrix: "SSM", N, N, "0"
selectObject: ssmID
for i to N
    for j to N
        pitch_i = pitch_vals#[i]
        pitch_j = pitch_vals#[j]
        if pitch_i > 0 and pitch_j > 0
            diff = abs(pitch_i - pitch_j)
            if diff < tolerance_hz
                score = 1 - (diff / tolerance_hz)
                Set value: i, j, score
            endif
        endif
    endfor
endfor

# FAST (C++ Formula):
Formula: "if Matrix_ThePitchData[row, 1] > 0 and Matrix_ThePitchData[col, 1] > 0 and abs(Matrix_ThePitchData[row, 1] - Matrix_ThePitchData[col, 1]) < " + string$(tolerance_hz) + " then 1 - (abs(Matrix_ThePitchData[row, 1] - Matrix_ThePitchData[col, 1]) / " + string$(tolerance_hz) + ") else 0 fi"

Speed comparison:

  • Praat script loops: ~1‑10 milliseconds per comparison
  • C++ Formula: ~0.001‑0.01 milliseconds per comparison
  • Speedup: 100‑1000× faster!

How the Formula Works

🔧 Matrix Formula Internals

Special variables in Praat Formula:

row = current row index (i)
col = current column index (j)
self = value of the current cell in the matrix being computed
Matrix_name = access another matrix by name

Our formula references:
  Matrix_ThePitchData[row, 1] = pitch at frame i
  Matrix_ThePitchData[col, 1] = pitch at frame j

Formula execution:

  1. Praat parses formula string once
  2. Compiles to internal bytecode
  3. Executes in optimized C++ loop
  4. Processes all N×N cells in single pass
  5. No Praat script interpreter overhead

Important: The referenced matrix (ThePitchData) must exist and be named exactly that.

Additional Optimizations

🎯 Smart Scanning Heuristics

1. Variable step size:

# Skip every other gap on long files, otherwise check all gaps
if num_frames > 2000
    step = 2
else
    step = 1
endif

Effect: Halves scanning time for long files

2. Early termination:

# Don't scan gaps longer than file length minus min loop
max_gap = num_frames - min_len_frames
gap = min_len_frames
while gap <= max_gap
    # Process this gap
    gap = gap + step
endwhile
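Combining the two heuristics above, the set of gaps that gets scanned can be sketched in Python. This is an illustration under the stated rules (step of 2 above 2000 frames, gaps from min_len_frames up to num_frames - min_len_frames); gaps_to_scan is a hypothetical name.

```python
def gaps_to_scan(num_frames: int, min_len_frames: int) -> list[int]:
    """List the diagonal offsets (in frames) visited by the scan."""
    step = 2 if num_frames > 2000 else 1
    max_gap = num_frames - min_len_frames
    return list(range(min_len_frames, max_gap + 1, step))


if __name__ == "__main__":
    print(gaps_to_scan(10, 3))
    print(len(gaps_to_scan(3600, 8)))
```

With a step of 2, a repeat whose true offset is an odd number of frames can only be caught on a neighbouring diagonal, so timing for long files is slightly coarser.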

3. One‑pass pitch extraction:

# Instead of calling "Get value in frame" inside the N×N comparison loops,
# read every frame once into a vector and reuse the cached values:
pitch_vals# = zero#(num_frames)
for i to num_frames
    pitch_vals#[i] = Get value in frame: i, "Hertz"
endfor

Benefit: Single pass through the Pitch object (N queries rather than one per comparison)

Performance Benchmarks

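Audio Length | time_step | Frames (N) | Naïve Time | Turbo Time | Speedup
-------------|-----------|------------|------------|------------|--------
10 seconds   | 0.05s     | 200        | 10‑20s     | 0.1‑0.2s   | 100×
30 seconds   | 0.05s     | 600        | 90‑180s    | 0.5‑1s     | 180×
60 seconds   | 0.05s     | 1200       | 360‑720s   | 2‑4s       | 180×
30 seconds   | 0.02s     | 1500       | 560‑1120s  | 3‑6s       | 180×
180s (3min)  | 0.05s     | 3600       | 3240‑6480s | 18‑36s     | 180×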
Note: Times are approximate. The "Naïve" column corresponds to Praat script loops at roughly 0.25‑0.5 ms per comparison over N² comparisons. "Turbo" uses the C++ Formula. Actual speedup depends on the system, but 100‑200× is typical.

Parameters Explained

Speed Settings

⚡ Time Resolution

Parameter | Default | Range      | Description
----------|---------|------------|------------
time_step | 0.05s   | 0.01‑0.10s | Time between pitch analysis frames. Smaller = more precise timing but slower.

Recommendations:

  • 0.05s (20 fps): Default. Good for most music/speech. Fast computation.
  • 0.02s (50 fps): High precision. For detailed melodic analysis, fast passages.
  • 0.10s (10 fps): Very fast. For long files, initial exploration.
  • 0.01s (100 fps): Extremely detailed. For micro‑timing, but very slow.

Frame rate formula: frames/second = 1 / time_step

Pitch Settings

🎵 Pitch Detection Range

Parameter     | Default | Range       | Description
--------------|---------|-------------|------------
pitch_floor   | 75 Hz   | 50‑200 Hz   | Minimum fundamental frequency to detect. Lower = more sensitive but more errors.
pitch_ceiling | 600 Hz  | 200‑2000 Hz | Maximum fundamental frequency. Set to expected range of audio.
tolerance_hz  | 50 Hz   | 10‑200 Hz   | Maximum pitch difference to consider "similar". Accounts for tuning variations, vibrato, expression.

Pitch range recommendations:

  • Male speech: 75‑300 Hz
  • Female speech: 100‑500 Hz
  • Singing (bass‑tenor): 80‑500 Hz
  • Singing (alto‑soprano): 150‑1000 Hz
  • Violin: 200‑2000 Hz
  • Cello: 65‑1000 Hz

Tolerance guidelines:

  • 10‑20 Hz: Very strict. Only near‑identical pitches match.
  • 30‑50 Hz: Default. Allows slight variations, tuning differences.
  • 60‑100 Hz: Lenient. Allows expressive variations, larger intervals.
  • 100‑200 Hz: Very lenient. May match different notes in same region.
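Because tolerance_hz is an absolute difference in Hz, the musical interval it allows depends on the register. The Python sketch below converts a Hz offset to cents using the standard relation cents = 1200 · log2(f2/f1); the function name is hypothetical and the example pitches are arbitrary.

```python
import math


def hz_offset_in_cents(base_hz: float, offset_hz: float) -> float:
    """Size, in cents, of the interval between base_hz and base_hz + offset_hz."""
    return 1200.0 * math.log2((base_hz + offset_hz) / base_hz)


if __name__ == "__main__":
    for base in (100.0, 200.0, 440.0, 1000.0):
        print(base, round(hz_offset_in_cents(base, 50.0), 1))
```

A 50 Hz tolerance spans about a perfect fifth above 100 Hz but less than a semitone above 1000 Hz, so low voices and instruments may need a smaller tolerance_hz than high ones to achieve comparable strictness.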

Loop Timing Settings

⏱️ Loop Duration Constraints

Parameter         | Default | Range     | Description
------------------|---------|-----------|------------
min_loop_duration | 0.4s    | 0.2‑5.0s  | Shortest loop to detect. Shorter = more candidates but more false positives.
max_loop_duration | 10.0s   | 1.0‑60.0s | Maximum time gap between loop start and repeat. Longer = search more offsets.

Duration guidelines:

  • 0.2‑0.5s: Short motifs, rhythmic cells, phonetic units
  • 0.5‑1.5s: Musical phrases, spoken phrases
  • 1.5‑3.0s: Longer phrases, musical themes
  • 3.0‑10.0s: Complete musical sections, spoken sentences
  • >10.0s: Extended passages, verse‑chorus relationships

max_loop_duration tip: Set to slightly longer than expected repetition interval. For verse‑chorus form, 20‑30s. For motif repetition, 5‑10s.

Output Settings

📝 Results Configuration

Parameter         | Default | Range | Description
------------------|---------|-------|------------
num_loops_to_find | 5       | 1‑50  | Number of top‑scoring non‑overlapping loops to annotate in TextGrid.

Selection process:

  1. All candidates sorted by score (highest first)
  2. Starting from top, add to output if not overlapping with already‑selected loops
  3. Continue until num_loops_to_find reached or candidates exhausted

When to increase: Complex audio with many repeating patterns. When to decrease: Simple audio, or to focus on best matches only.

Parameter Interactions

Important relationships:
  • min_loop_duration / time_step = minimum frames needed for detection (e.g., 0.4 / 0.05 = 8 frames)
  • tolerance_hz / pitch_ceiling = relative tolerance (e.g., 50/600 = 8.3% of the ceiling)
  • max_loop_duration / time_step = maximum gap to search (in frames)
  • pitch_ceiling - pitch_floor = detection range (wider = more processing)
  • num_loops_to_find vs overlap:
    • High value → may include shorter/weaker loops
    • Low value → only strongest loops shown
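The frame-based quantities implied by these relationships can be computed up front. The Python sketch below follows the frame conversions used in the scanning step (durations divided by time_step); the function name is hypothetical, and rounding to the nearest frame is an assumption about how the script converts seconds to frames.

```python
def derived_parameters(time_step, min_loop_duration, max_loop_duration,
                       tolerance_hz, pitch_ceiling):
    """Translate user-facing parameters into frame counts and a relative tolerance."""
    return {
        "min_len_frames": round(min_loop_duration / time_step),
        "max_gap_frames": round(max_loop_duration / time_step),
        "relative_tolerance": tolerance_hz / pitch_ceiling,
    }


if __name__ == "__main__":
    defaults = derived_parameters(0.05, 0.4, 10.0, 50.0, 600.0)
    for name, value in defaults.items():
        print(name, value)
```

Checking these values before a run makes it easier to see when a small time_step combined with a long max_loop_duration will produce a very large search.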

Output & Interpretation

Info Window Report

📋 Processing Summary

=== PITCH LOOP FINDER ===
Sound: my_audio
Extracted 1200 frames. Time Step: 0.05s
Calculating Matrix (Turbo Mode)... done!
Scanning diagonals... done.
Found loops:
Loop 1: 12.35s -> 45.20s
Loop 2: 5.60s -> 30.15s
Loop 3: 18.90s -> 52.40s
Loop 4: 2.10s -> 15.80s
Loop 5: 25.30s -> 58.75s
=== COMPLETE ===

Interpretation:

  • "Extracted N frames": Total pitch analysis points
  • "Time Step": Resolution of analysis
  • "Loop X: As -> Bs": The source segment starting at time A (seconds) repeats starting at time B
  • Loops listed in score order (highest first)
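If the report is saved as text, the loop lines can be parsed for further analysis. The Python sketch below assumes the line format shown in the sample report ("Loop 1: 12.35s -> 45.20s"); parse_loops is a hypothetical helper, not part of the script.

```python
import re

# Matches lines such as "Loop 1: 12.35s -> 45.20s" (format assumed from the sample report).
LOOP_LINE = re.compile(r"Loop (\d+): ([\d.]+)s -> ([\d.]+)s")


def parse_loops(report: str):
    """Return a list of (loop number, source start, repeat start) tuples."""
    return [(int(n), float(src), float(rep)) for n, src, rep in LOOP_LINE.findall(report)]


sample = """Found loops:
Loop 1: 12.35s -> 45.20s
Loop 2: 5.60s -> 30.15s
"""
for number, source, repeat in parse_loops(sample):
    print(number, source, repeat, round(repeat - source, 2))
```

The difference between the two times is the repetition interval, which can be compared against max_loop_duration when tuning the search range.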

TextGrid Output

🏷️ Two‑Tier Annotation

Tier 1: "Loops" — Source segments (original occurrences)

Tier 2: "Repeats" — Repeat segments (later occurrences)

Time(s)     | Loops tier | Repeats tier
------------|------------|-------------
0.00-2.00   |            |
2.00-4.50   | [Loop 1]   |
4.50-7.00   |            |
...         | ...        | ...
12.00-14.50 |            | [Repeat 1]
...         | ...        | ...

TextGrid structure:

  • Identical numbering: "Loop 1" corresponds to "Repeat 1"
  • Same duration: Source and repeat segments have equal length
  • Time alignment: Intervals show exact start/end times
  • Non‑overlapping: Loops don't overlap within each tier (but can across tiers)

Using in Praat: Select Sound + TextGrid, click View & Edit. Click on intervals to hear segments.

Understanding Loop Scores

📈 What Makes a "Good" Loop?

Score components:

Score = path_length × average_similarity

Where:
  path_length = number of consecutive similar frames
  average_similarity = mean similarity along diagonal (0‑1)

Example:
  Loop A: 20 frames, average similarity 0.8 → score = 16.0
  Loop B: 40 frames, average similarity 0.6 → score = 24.0
  Loop C: 10 frames, average similarity 0.9 → score = 9.0

Loop B wins despite its lower similarity per frame, because it is longer.

Score interpretation:

  • High score (e.g., >20): Long, accurate repetition
  • Medium score (10‑20): Moderate length with good similarity, or long with moderate similarity
  • Low score (<10): Short loop, or longer loop with low similarity

Note: Scores are relative — absolute values depend on time_step and audio length.

Common Output Patterns

🎵 What to Expect

For speech with repetitions:

  • Loops correspond to repeated words/phrases
  • Similar durations for source and repeat
  • May capture filler words ("um", "ah")

For music with motifs:

  • Melodic motifs appear as loops
  • Transposed motifs may be detected if within tolerance
  • Rhythmic patterns without pitch won't appear

For singing/chanting:

  • Repeated phrases detected
  • Chorus sections show multiple loops
  • May detect vocalise/melisma patterns

For instrumental music:

  • Thematic material appears as loops
  • Basslines/harmonic progressions may be detected
  • Solo passages with repetition show loops

Troubleshooting Output

Problem: No loops found
Possible causes:
  • Audio has no pitched content
  • min_loop_duration too long
  • tolerance_hz too low
  • pitch_floor/pitch_ceiling wrong for audio
  • time_step too coarse
Solutions: Check pitch extraction first (use Praat's View & Edit on Pitch object). Adjust parameters.
Problem: Too many false positives
Possible causes:
  • tolerance_hz too high
  • min_loop_duration too short
  • Audio has steady pitch (e.g., drone)
  • Pitch tracking errors (octave jumps)
Solutions: Decrease tolerance, increase min_loop_duration, check pitch tracking quality.
Problem: Loops don't match perceived repetitions
Possible causes:
  • Algorithm focuses on pitch only (ignores timbre/rhythm)
  • Transposed repetitions not detected (if > tolerance)
  • Varied repetitions not similar enough
Solutions: Increase tolerance for transposed patterns. Remember: the algorithm detects pitch similarity, not perceptual repetition.
Problem: TextGrid intervals don't align exactly
Possible causes:
  • time_step resolution limitations
  • Pitch tracking timing errors
  • Loop boundaries at frame edges
Solutions: Use a smaller time_step for better timing. Boundaries are at frame times, not sample‑accurate.

Practical Applications

Music Analysis

🎵 Motif & Theme Detection

Use case: Identify recurring melodic motifs in classical music

Workflow:

  1. Extract solo instrument track or vocal line
  2. Set parameters: time_step=0.02, pitch_floor/pitch_ceiling set to the instrument range, tolerance_hz=20‑40
  3. Run Pitch Loop Finder
  4. Examine detected loops in TextGrid
  5. Map loops to score positions

Example analysis: Beethoven's 5th Symphony opening motif

Expected: Short loop (0.5‑1.0s) repeating at various points
Parameters: min_loop_duration=0.3s, tolerance_hz=30
Result: Detects "da‑da‑da‑dum" motif occurrences

Speech & Language Analysis

🗣️ Repetition Analysis in Speech

Use cases:

  • Language learning: Detect repeated phrases in practice
  • Speech therapy: Identify repetitive patterns in stuttering
  • Forensic analysis: Find repeated phrases in recordings
  • Conversation analysis: Identify recurring topics/questions

Parameters for speech:

time_step = 0.05s
pitch_floor = 75 (male) or 100 (female)
pitch_ceiling = 300 (male) or 500 (female)
tolerance_hz = 30 (strict) to 50 (lenient)
min_loop_duration = 0.5s (phrase level)

Example: Political speech, where the script detects repeated slogans/catchphrases.

Composition & Sound Design

🎼 Loop‑Based Composition

Use case: Create compositions from found audio loops

Creative workflow:

  1. Record improvisation or free playing
  2. Run Pitch Loop Finder to identify interesting motifs
  3. Extract loop segments using TextGrid boundaries
  4. Rearrange loops to create new composition
  5. Use repetition structure as formal guide

Parameters for improvisation:

tolerance_hz = 50‑100 (allow expressive variations)
min_loop_duration = 0.3‑0.5s (capture short motifs)
num_loops_to_find = 10‑20 (get many options)

Example: Jazz improvisation, where the script detects recurring licks/phrases.
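For the step in the creative workflow that extracts loop segments at TextGrid boundaries, interval times have to be mapped to sample positions when working outside Praat. A minimal Python sketch, assuming the audio is already loaded as a sequence of samples at a known sample rate; interval_to_samples is a hypothetical helper.

```python
def interval_to_samples(t_start: float, t_end: float, sample_rate: int):
    """Convert an interval in seconds to a (first, last) pair of sample indices."""
    first = round(t_start * sample_rate)
    last = round(t_end * sample_rate)
    return first, last


if __name__ == "__main__":
    first, last = interval_to_samples(2.0, 4.5, 44100)
    samples = [0.0] * (5 * 44100)   # placeholder for 5 s of loaded audio
    loop = samples[first:last]      # slice covering the "Loop 1" interval
    print(first, last, len(loop))
```

Since loop boundaries fall on frame times, a cut may land mid-cycle in the waveform; short fades at the edges are a common way to avoid clicks when rearranging the segments.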

Music Education

🏫 Teaching Musical Form

Classroom activity: Visualizing repetition in music

  1. Students analyze favorite songs
  2. Predict where repetitions occur
  3. Run Pitch Loop Finder
  4. Compare predictions to detected loops
  5. Discuss why some repetitions were detected and others were not

Learning outcomes:

  • Understand repetition in musical structure
  • Learn about pitch vs. timbre perception
  • Explore algorithmic music analysis
  • Connect hearing to visual representation

Ethnomusicology & Field Recordings

🌍 Oral Tradition Analysis

Use case: Analyze repetitive structures in oral traditions

Applications:

  • Folk songs: Detect verse/chorus patterns
  • Chanting: Identify mantra/incantation repetitions
  • Storytelling: Find repeated phrases in oral narratives
  • Ceremonial music: Analyze ritual repetition structures

Methodology:

  1. Record traditional performance
  2. Run Pitch Loop Finder with lenient tolerance
  3. Map loops to cultural/functional categories
  4. Compare repetition structures across traditions

Advanced Techniques

For experienced users:
  • Multi‑stage analysis: Run with different parameters, compare results
  • Combined features: Export loops, then analyze with other tools (MFCC, rhythm)
  • Statistical analysis: Collect loop statistics across corpus
  • Iterative tuning: Adjust parameters based on initial results
  • Integration with other scripts: Use output TextGrid as input for segmentation, annotation
  • Custom scoring: Modify script to use different scoring formulas

Limitations & Complementary Tools

⚠️ What This Script Doesn't Do

Pitch‑only limitation:

  • Misses rhythmic repetitions without pitch change
  • Misses timbral repetitions (same pitch, different instrument)
  • May miss transposed repetitions beyond tolerance

Complementary Praat scripts:

  • Rhythmic loop finder: Based on onset detection
  • MFCC similarity: For timbral repetitions
  • Multi‑feature loop detection: Combines pitch, rhythm, timbre
  • Novelty curve analysis: For boundary detection

Best practice: Use pitch‑based detection first, then verify with listening and other analyses.