Continuous Pitch over MIDI Grid Visualizer — User Guide

Advanced pitch visualization: displays continuous pitch contours overlaid on a MIDI note grid with multiple color schemes and visualization options.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 2.0 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script plots continuous pitch contours over a MIDI note grid, which gives pitch analysis a musical frame of reference. It extracts the fundamental frequency (F0) from the audio, converts it to fractional MIDI note values, and draws the resulting curve on a standard MIDI note grid. The display options support detailed analysis of pitch behavior, intonation, vibrato, glissandi, and microtonal variation.

Key Features:

What is continuous pitch visualization?

Traditional pitch analysis relies on discrete pitch estimates, MIDI note displays, or spectral displays. Continuous pitch visualization shows the exact pitch contour as it moves between and around musical notes.

Advantages:

  1. Microtonal precision: shows pitch variations smaller than a semitone.
  2. Glissando tracking: visualizes smooth pitch transitions.
  3. Vibrato analysis: shows the rate and extent of pitch modulation.
  4. Intonation assessment: reveals tuning tendencies relative to equal temperament.
  5. Performance analysis: shows expressive pitch variations in singing and instrumental performance.

Use cases:

  • Music pedagogy (voice/instrument training)
  • Ethnomusicology (non-Western tuning systems)
  • Speech analysis (intonation patterns)
  • Sound design (pitch manipulation effects)
  • Musicology (performance practice studies)

Technical Implementation:

  1. Pitch Extraction: Uses Praat's pitch tracking algorithm with user-defined floor, ceiling, and time step.
  2. Intensity Analysis: Extracts loudness information alongside pitch for visualization encoding.
  3. MIDI Conversion: Converts Hz values to MIDI note numbers using the standard formula MIDI = 69 + 12×log₂(Hz/440).
  4. Smoothing: Applies optional median filtering or a moving average to reduce pitch tracking errors.
  5. Grid Drawing: Creates the MIDI note grid with customizable display options.
  6. Visualization: Plots the pitch contour using the selected color scheme and line style, with intensity mapped to visual properties.
  7. Range Management: Sets the MIDI display range automatically or manually, with padding for readability.

Quick start

  1. In Praat, select exactly one Sound object.
  2. Run script…continuous_pitch_midi_grid_visualizer.praat.
  3. Set pitchFloor and pitchCeiling appropriate for your audio.
  4. Choose MIDI range: Auto-detect or manual specification.
  5. Select smoothing option if needed (recommended for noisy signals).
  6. Choose color scheme based on analysis goals.
  7. Select line style and adjust parameters if using dots.
  8. Configure display options (grid lines, labels, etc.).
  9. Set intensity range for loudness mapping.
  10. Click OK — visualization appears in Picture window.
Quick tip:

  • Start with autoMidiRange = yes for automatic range detection.
  • Use the Pitch+Loudness Rainbow color scheme for general analysis.
  • Enable showNoteLabels for musical context.
  • For vocal analysis, try pitchFloor = 75 Hz (male) or 150 Hz (female).
  • Use Median 3-frame smoothing for cleaner pitch tracks.
  • Set intensityMinDb = 40 and intensityMaxDb = 80 for typical speech or singing.
  • Enable useLogLoudness = yes for better perceptual mapping.
  • The visualization shows exact pitch position relative to equal temperament: points on grid lines are exactly in tune, points above a line are sharp, and points below it are flat.

Important: pitch tracking accuracy depends on signal quality and parameters.

  • Very noisy signals, polyphonic audio, or signals with strong harmonics may produce inaccurate pitch tracks.
  • Adjust pitchFloor and pitchCeiling to match the expected frequency range.
  • Time step affects temporal resolution: smaller values give more detail but slower processing.
  • Smoothing can remove genuine microvariations, so use it judiciously.
  • MIDI conversion assumes equal temperament, so analysis of non-Western music may require interpretation.
  • Intensity mapping depends on accurate intensity extraction; very quiet sections may not show pitch information.
  • The visualization overwrites the current Picture window; save important plots before running.

Pitch Analysis System

Fundamental Frequency Extraction

Praat Pitch Algorithm

Advanced autocorrelation method:

Input: Sound waveform
Process: Short-time autocorrelation analysis
Output: Fundamental frequency (F0) estimates

Key parameters:
    pitchFloor: Minimum expected F0 (Hz)
    pitchCeiling: Maximum expected F0 (Hz)
    timeStep: Analysis frame interval (seconds)

Algorithm steps:
    1. Divide signal into overlapping frames
    2. Compute autocorrelation function for each frame
    3. Find peak in autocorrelation corresponding to F0
    4. Apply voicing decision (voiced/unvoiced)
    5. Interpolate between frames for smooth contour

Default values:
    pitchFloor = 75 Hz (typical male speech)
    pitchCeiling = 600 Hz (typical soprano singing)
    timeStep = 0.01 s (10 ms frames)

Parameter Guidelines

Optimal settings by voice type:

Voice Type        pitchFloor   pitchCeiling   Notes
Bass              65 Hz        300 Hz         Low male singing
Tenor             80 Hz        500 Hz         High male singing
Alto              130 Hz       600 Hz         Low female singing
Soprano           180 Hz       1100 Hz        High female singing
Child             200 Hz       800 Hz         Children's voices
Speech (male)     75 Hz        300 Hz         Conversational speech
Speech (female)   120 Hz       500 Hz         Conversational speech
Violin            190 Hz       3000 Hz        G3 - E7 range
Cello             60 Hz        1000 Hz        C2 - C6 range

MIDI Conversion System

Hz to MIDI Formula

Standard equal temperament conversion:

MIDI = 69 + 12 × log₂(Hz / 440)

Where:
    MIDI = MIDI note number (0-127)
    Hz = fundamental frequency
    440 = reference frequency (A4)
    69 = MIDI note number of A4

Examples:
    440 Hz → 69 + 12 × log₂(1) = 69 (A4)
    220 Hz → 69 + 12 × log₂(0.5) = 69 - 12 = 57 (A3)
    880 Hz → 69 + 12 × log₂(2) = 69 + 12 = 81 (A5)

Continuous values:
    MIDI note 69.5 = halfway between A4 and A#4
    MIDI note 68.9 = slightly flat A4
    MIDI note 69.1 = slightly sharp A4
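As a minimal sketch, the conversion formula can be written as a pair of Python helpers. The function names `hz_to_midi` and `midi_to_hz` and the `ref_hz` parameter are illustrative assumptions and are not taken from the Praat script; only the formula itself comes from this guide.

```python
import math


def hz_to_midi(hz: float, ref_hz: float = 440.0) -> float:
    """Convert a frequency in Hz to a continuous (fractional) MIDI note number."""
    if hz <= 0:
        raise ValueError("frequency must be positive")
    # 69 is the MIDI number of A4; each octave (factor of 2) spans 12 semitones.
    return 69.0 + 12.0 * math.log2(hz / ref_hz)


def midi_to_hz(midi: float, ref_hz: float = 440.0) -> float:
    """Inverse mapping: fractional MIDI note number back to Hz."""
    return ref_hz * 2.0 ** ((midi - 69.0) / 12.0)
```

For example, `hz_to_midi(220.0)` gives 57.0 (A3), and a result of 69.1 would indicate an A4 that is about 10 cents sharp.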

MIDI Note Numbering

Standard MIDI note reference:

C0 = 12, C1 = 24, C2 = 36, C3 = 48, C4 = 60, C5 = 72, C6 = 84, C7 = 96, C8 = 108

Common reference points:
    A4 = 69 (440 Hz, tuning standard)
    Middle C (C4) = 60
    Lowest piano note (A0) = 21
    Highest piano note (C8) = 108

Note naming:
    C, C#, D, D#, E, F, F#, G, G#, A, A#, B
    Numbers indicate octave (C4 = middle C)

In visualization:
    Horizontal lines at integer MIDI values = equal temperament
    Points between lines = microtonal variations

Smoothing Algorithms

Median Filtering

3-frame median filter:

For each frame i:
    Collect values: [midiNote[i-1], midiNote[i], midiNote[i+1]]
    Sort values and take middle (median)
    Replace midiNote[i] with median

Example:
    Original: [67.1, 45.2, 66.9] (45.2 is likely an error)
    Sorted: [45.2, 66.9, 67.1]
    Median: 66.9
    Result: [67.1, 66.9, 66.9]

Advantages:
    Removes outliers (octave errors, noise spikes)
    Preserves sharp transitions
    No temporal smearing
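The 3-frame median described above can be sketched in Python as follows. This is an illustration only: the function name `median3` is hypothetical, and leaving the first and last frames unchanged and ignoring unvoiced frames are assumptions, since the guide does not say how the script handles those cases.

```python
from statistics import median


def median3(values):
    """Apply a 3-frame median filter to a list of MIDI values.

    Assumption: the first and last frames are copied through unchanged.
    """
    out = list(values)
    for i in range(1, len(values) - 1):
        # Read neighbours from the original list so earlier replacements
        # do not influence later frames.
        out[i] = median(values[i - 1:i + 2])
    return out
```

Applied to the example above, `median3([67.1, 45.2, 66.9])` returns `[67.1, 66.9, 66.9]`, replacing the outlier with a neighbouring value.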

Moving Average

3-frame moving average:

For each frame i:
    Calculate average of: midiNote[i-1], midiNote[i], midiNote[i+1]
    Replace midiNote[i] with average

Example:
    Values: [67.0, 67.2, 67.1]
    Average: (67.0 + 67.2 + 67.1) / 3 = 67.1
    Result: [67.0, 67.1, 67.1]

Characteristics:
    Smooths all variations
    Reduces high-frequency fluctuations
    May oversmooth genuine rapid changes
    Causes temporal smearing
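A comparable sketch for the 3-frame moving average is shown here. The function name `moving_average3` and the choice to leave the edge frames unchanged are assumptions made for illustration, not details confirmed by the script.

```python
def moving_average3(values):
    """Apply a 3-frame moving average to a list of MIDI values.

    Assumption: the first and last frames are copied through unchanged.
    """
    out = list(values)
    for i in range(1, len(values) - 1):
        # Average the frame with its two neighbours from the original list.
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out
```

A sudden leap shows the temporal smearing mentioned above: the input `[60, 60, 60, 72, 72, 72]` becomes approximately `[60, 60, 64, 68, 72, 72]`, so an instantaneous octave jump is spread over two extra frames.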

Smoothing Selection Guide

Smoothing Type    Best For                          Preserves             Removes
No smoothing      Clean signals, vibrato analysis   All microvariations   Nothing
Median 3-frame    Noisy signals, pitch errors       Sharp transitions     Outliers, octave jumps
Median 5-frame    Very noisy signals                General contour       More outliers
Moving average    General smoothing                 Slow variations       High-frequency changes

MIDI Grid System

Automatic Range Detection

Smart MIDI Range

Algorithm for automatic range setting:

STEP 1: Extract all valid pitch values
    FOR each analysis frame:
        IF pitch is defined (voiced):
            Convert to MIDI: midi = 69 + 12 × log₂(Hz/440)
            Track min/max: midiMinFound, midiMaxFound

STEP 2: Apply padding
    currentMidiMin = floor(midiMinFound) - midiPadding
    currentMidiMax = ceiling(midiMaxFound) + midiPadding

STEP 3: Clamp to valid range
    currentMidiMin = max(0, currentMidiMin)
    currentMidiMax = min(127, currentMidiMax)

Example:
    Found range: 67.3 - 72.8 (approx G4 - C#5)
    After padding (padding=3): 64 - 76 (E4 - E5)
    Display range: MIDI 64-76

Benefits:
    Automatic focus on relevant pitch range
    Consistent padding for visual clarity
    No manual range guessing
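The three steps can be sketched in Python as shown below. The function name `auto_midi_range` and the use of `None` to mark unvoiced frames are assumptions for illustration; the floor, ceiling, padding, and clamping logic follows the steps listed above.

```python
import math


def auto_midi_range(midi_values, padding=3):
    """Return (midi_min, midi_max) for display, padded and clamped to 0-127.

    Assumption: unvoiced frames are represented as None and are skipped.
    """
    voiced = [m for m in midi_values if m is not None]
    if not voiced:
        raise ValueError("no voiced frames found")
    low = math.floor(min(voiced)) - padding    # step 2: pad below
    high = math.ceil(max(voiced)) + padding    # step 2: pad above
    return max(0, low), min(127, high)         # step 3: clamp
```

With the example values, `auto_midi_range([67.3, None, 70.0, 72.8])` returns `(64, 76)`.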

Grid Display Options

Note Line Styling

Hierarchical grid system:

C notes (every octave):
    Color: {0.65, 0.65, 0.65} (medium gray)
    Line width: 2.0
    Always displayed

Octave boundaries (other C notes):
    Color: {0.75, 0.75, 0.75} (light gray)
    Line width: 1.2
    Always displayed

Other semitones (if showAllSemitones = yes):
    Color: {0.92, 0.92, 0.92} (very light gray)
    Line width: 0.4
    Optional display

Visual hierarchy:
    C notes most prominent → orientation
    Octave boundaries secondary → structure
    Other notes subtle → reference

Note Labeling

Automatic note name generation:

For each MIDI number in range:
    noteClass = midiNumber mod 12
    octave = floor(midiNumber / 12) - 1
    IF noteClass = 0 (C note):
        noteName$ = "C" + string$(octave)
        Draw text at left margin: noteName$

Examples:
    MIDI 60 → C4 (middle C)
    MIDI 72 → C5 (tenor high C)
    MIDI 48 → C3 (cello C)

Positioning:
    Left-aligned at startTime - 2% of duration
    Vertically centered on note line
    Font size: 8 points
    Color: {0.5, 0.5, 0.5} (medium gray)
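The note-name arithmetic can be sketched in Python as follows. The helper names `midi_note_name` and `c_labels` are hypothetical. Naming all twelve pitch classes goes slightly beyond the guide, which describes labels on C lines only.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_note_name(midi_number: int) -> str:
    """Name an integer MIDI note, e.g. 60 -> 'C4'."""
    octave = midi_number // 12 - 1          # floor(midi / 12) - 1
    return f"{NOTE_NAMES[midi_number % 12]}{octave}"


def c_labels(midi_min: int, midi_max: int):
    """List (midi_number, label) pairs for the C notes inside a display range."""
    return [(n, midi_note_name(n))
            for n in range(midi_min, midi_max + 1)
            if n % 12 == 0]
```

For the default manual range, `c_labels(48, 84)` yields the labels C3, C4, C5, and C6.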

Time Grid System

Vertical Time Markers

Optional time grid:

IF showTimeGrid = yes:
    timeMarker = ceiling(startTime / 0.1) × 0.1
    WHILE timeMarker ≤ endTime:
        Draw vertical line at timeMarker
        timeMarker = timeMarker + 0.1

Line properties:
    Color: {0.95, 0.95, 0.95} (very light gray)
    Line width: 0.3
    Full height: currentMidiMin-0.5 to currentMidiMax+0.5

Purpose:
    Visual time reference
    Rhythmic alignment
    Duration estimation
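A possible Python rendering of the marker positions is sketched below. It deliberately differs from the loop above in one respect: it multiplies an integer index by the step instead of adding 0.1 repeatedly, which avoids accumulating floating-point error over long recordings. The function name `time_markers` and the small tolerance are assumptions.

```python
import math


def time_markers(start_time: float, end_time: float, step: float = 0.1):
    """Return grid times that are multiples of `step` within [start_time, end_time]."""
    eps = 1e-9  # tolerance so that boundaries such as 0.5 / 0.1 are not lost to rounding
    first = math.ceil(start_time / step - eps)
    last = math.floor(end_time / step + eps)
    # Multiply an integer index rather than accumulating step by step.
    return [round(k * step, 10) for k in range(first, last + 1)]
```

For a selection from 0.25 s to 0.62 s, this gives markers at 0.3, 0.4, 0.5, and 0.6 s.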

Visualization Options

Color Schemes

🎨 Pitch+Loudness Rainbow

Mapping: Pitch height → Hue, Loudness → Brightness

Visual effect: Rainbow gradient from low (red) to high (blue) pitches

Best for: General pitch contour analysis, overall pattern recognition

🎵 PitchClass+Loudness Wheel

Mapping: Pitch class → Hue, Loudness → Brightness

Visual effect: Color repeats every octave, same note = same color

Best for: Tonal music analysis, key relationships, chord progressions

⚫ Grayscale Loudness

Mapping: Loudness → Grayscale value

Visual effect: Monochromatic, loud sections = dark, quiet = light

Best for: Focus on dynamics, intensity-pitch relationships

🔥 Intensity Heatmap

Mapping: Loudness → Heat colors (blue-green-yellow-red)

Visual effect: Thermal display, loud = hot colors

Best for: Emphasis on loudness variations, intensity hotspots

🌀 Octave Spiral

Mapping: Pitch class → Hue, Octave → Brightness

Visual effect: Colors spiral through brightness with octaves

Best for: Wide-range music, register changes, vocal range analysis

Line Style Options

Thin Continuous Line

Standard continuous plot:

Properties:
    Line width: 1.5 (fixed)
    Style: Continuous connection between points
    Color: Varies by selected color scheme

Advantages:
    Traditional scientific plot
    Clear continuous contour
    Good for detailed analysis

Best for:
    High-quality pitch tracks
    Microtonal analysis
    Research publications

Thickness Varies with Loudness

Dynamic line thickness:

Mapping: intensity → line width (0.5 to 4.5)
    Quiet: thin lines (0.5)
    Loud: thick lines (4.5)

Calculation:
    @mapToRange: intensity[i], intensityMinDb, intensityMaxDb, 0.5, 4.5
    Line width = mapToRange.result

Visual effect:
    Loud sections appear prominent
    Quiet sections subtle
    Combined pitch+loudness information

Best for:
    Expressive performance analysis
    Dynamic shaping visualization

Dots with Size Varies with Loudness

Point-based visualization:

Properties:
    Shape: Circular dots at each analysis point
    Size mapping: intensity → dot diameter
    Range: minDotSize to maxDotSize (user-defined)

Calculation:
    @mapToRange: intensity[i], intensityMinDb, intensityMaxDb, minDotSize, maxDotSize
    dotSize = mapToRange.result
    Paint circle at each point

Visual effect:
    Scatter plot style
    Temporal discretization visible
    Loudness encoded in dot size

Best for:
    Sparse pitch tracks
    Staccato articulation
    Percussive pitch sounds
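The body of the `mapToRange` procedure is not shown in this guide. The sketch below assumes it is a linear rescaling with clamping at both ends, which is consistent with the line-width and dot-size ranges described above. The Python name `map_to_range` is illustrative.

```python
def map_to_range(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max], with clamping.

    Assumption: inputs outside the input range are clamped to the nearest end.
    """
    if in_max == in_min:
        raise ValueError("input range must not be empty")
    t = (value - in_min) / (in_max - in_min)
    t = max(0.0, min(1.0, t))               # clamp to 0..1
    return out_min + t * (out_max - out_min)
```

With the default 40-80 dB intensity range, a 60 dB frame maps to a line width of 2.5, and to a dot size of about 2.15 mm for the default minDotSize = 0.8 and maxDotSize = 3.5.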

Intensity Processing

Loudness Extraction

Praat intensity analysis:

Input: Sound waveform
Process: RMS energy calculation in dB

Parameters:
    Minimum pitch: same as pitchFloor
    Time step: same as timeStep
    Subtract mean: yes

Output: Intensity contour in dB

Mapping to brightness:
    Linear: intensity → [0.3, 1.0] linearly
    Logarithmic: intensity → compressed [0.3, 1.0]

Default range:
    intensityMinDb = 40 dB (quiet speech)
    intensityMaxDb = 80 dB (loud singing)

Logarithmic Compression

Perceptual loudness mapping:

Formula (if useLogLoudness = yes):
    normalized = (dB - minDb) / (maxDb - minDb)
    normalized = max(0, min(1, normalized))
    IF normalized > 0:
        result = log₁₀(normalized × 9 + 1) / log₁₀(10)
    ELSE:
        result = 0

Effect (assuming result is then scaled into the [0.3, 1.0] brightness range):
    40 dB → brightness ≈ 0.3
    60 dB → brightness ≈ 0.82 (linear mapping would give 0.65)
    80 dB → brightness ≈ 1.0

Rationale:
    Approximates human loudness perception
    Better utilization of visual range
    More natural appearance
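The compression formula can be checked numerically with a short Python sketch. It assumes that the compressed value is scaled linearly into the [0.3, 1.0] brightness range mentioned under Loudness Extraction; the function name and keyword parameters are hypothetical.

```python
import math


def loudness_to_brightness(db, min_db=40.0, max_db=80.0, use_log=True,
                           low=0.3, high=1.0):
    """Map an intensity in dB to a brightness value between low and high."""
    normalized = (db - min_db) / (max_db - min_db)
    normalized = max(0.0, min(1.0, normalized))
    if use_log:
        # log10(9x + 1) maps 0 -> 0 and 1 -> 1 while boosting mid-range values.
        normalized = math.log10(normalized * 9.0 + 1.0)
    # Assumption: the result is scaled linearly into [low, high].
    return low + normalized * (high - low)
```

Under these assumptions a 60 dB frame gets a brightness of about 0.82 with compression and 0.65 without it, so mid-level passages are drawn noticeably brighter when useLogLoudness is enabled.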

Applications

Vocal Pedagogy

Use case: Voice training and intonation practice

Technique: Compare sung pitch to equal temperament reference

Settings: PitchClass+Loudness Wheel, thin continuous line, showNoteLabels=yes

Ethnomusicology

Use case: Analysis of non-Western tuning systems

Technique: Examine microtonal patterns and scale structures

Settings: No smoothing, thin continuous line, showAllSemitones=no

Speech Intonation Analysis

Use case: Study of prosody and intonation patterns

Technique: Analyze pitch contours in relation to linguistic structure

Settings: Pitch+Loudness Rainbow, median smoothing, autoMidiRange=yes

Instrument Performance Analysis

Use case: Study of expressive pitch variations

Technique: Analyze vibrato, portamento, and intonation tendencies

Settings: Octave Spiral, thickness varies with loudness

Sound Design

Use case: Analysis of synthetic pitch contours

Technique: Visualize glissando, pitch envelopes, FM modulation

Settings: Intensity Heatmap, dots with variable size

Practical Workflow Examples

🎤 Voice Lesson Analysis

Goal: Help student improve intonation

Settings:

  • Color: PitchClass+Loudness Wheel
  • Line: Thin continuous
  • Smoothing: Median 3-frame
  • Show note labels: Yes
  • MIDI range: Auto with padding 3

Analysis: Identify consistently sharp/flat notes, vibrato extent, pitch stability

🎻 Violin Vibrato Study

Goal: Analyze vibrato rate and extent

Settings:

  • Color: Pitch+Loudness Rainbow
  • Line: Thin continuous
  • Smoothing: No smoothing
  • Show all semitones: No
  • Time grid: Yes

Measurements: Count cycles per second, measure pitch deviation, analyze regularity
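These measurements could also be estimated numerically from an exported pitch contour. The sketch below is one simple approach and is not part of the script. It assumes a steady sustained note given as a list of fractional MIDI values at a fixed time step. It counts zero crossings of the mean-removed contour to estimate the rate and uses half the peak-to-peak range as the extent. The function name is hypothetical.

```python
def vibrato_rate_and_extent(midi, time_step=0.01):
    """Estimate vibrato rate (Hz) and extent (cents, half peak-to-peak).

    Assumption: `midi` covers a single sustained note without pitch drift.
    """
    mean = sum(midi) / len(midi)
    dev = [m - mean for m in midi]
    crossings = []
    for i in range(1, len(dev)):
        a, b = dev[i - 1], dev[i]
        if a * b < 0:
            # Linearly interpolate the time at which the contour crosses the mean.
            frac = a / (a - b)
            crossings.append((i - 1 + frac) * time_step)
    if len(crossings) < 3:
        raise ValueError("too few zero crossings to estimate a rate")
    half_cycles = len(crossings) - 1
    rate_hz = half_cycles / (2.0 * (crossings[-1] - crossings[0]))
    extent_cents = (max(midi) - min(midi)) / 2.0 * 100.0
    return rate_hz, extent_cents
```

Regularity could then be judged from the spacing of successive crossing times. A contour with drift or a glissando would need detrending first, which this sketch does not attempt.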

🗣️ Speech Prosody Research

Goal: Study question vs statement intonation

Settings:

  • Color: Grayscale Loudness
  • Line: Thickness varies
  • Smoothing: Moving average
  • Log loudness: Yes

Analysis: Compare final pitch direction, pitch range, loudness patterns

Advanced Analysis Techniques

Microtonal analysis:
  • Cent measurement: 1 semitone = 100 cents, visualization shows cent deviations
  • Just intonation: Compare to pure intervals (e.g., 5:4 major third = 386 cents)
  • Regional scales: Identify maqam, raga, or other non-ET pitch collections
  • Blue notes: Analyze pitch bending in blues/jazz performance
Expressive timing analysis:
  • Rubato: Use time grid to analyze tempo variations
  • Articulation: Examine attack and release timing
  • Phrasing: Identify musical phrases through pitch and dynamics
  • Emphasis: Correlate loudness peaks with structural points
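For the cent measurements listed under microtonal analysis, a small Python sketch may help. It converts a fractional MIDI value into the nearest equal-tempered note plus a deviation in cents, and a frequency ratio into cents. Both helper names are illustrative assumptions.

```python
import math


def cents_from_nearest_note(midi: float):
    """Return (nearest integer MIDI note, deviation in cents) for a fractional value."""
    nearest = math.floor(midi + 0.5)
    # One semitone is 100 cents, so the fractional part scales directly.
    return nearest, (midi - nearest) * 100.0


def ratio_to_cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)
```

For example, a value of 68.9 is about 10 cents below A4 (MIDI 69), and `ratio_to_cents(5 / 4)` gives roughly 386.3 cents, about 14 cents narrower than the 400-cent equal-tempered major third.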

Troubleshooting Common Issues

Problem: No pitch track or fragmented contour
Cause: Incorrect pitch range, noisy signal, or unvoiced audio
Solution: Adjust pitchFloor/pitchCeiling, use smoothing, check signal quality
Problem: Octave errors (jumping between registers)
Cause: Pitch tracking confusion with harmonics
Solution: Use median smoothing, narrow pitch range, manual correction
Problem: Visualization too crowded or sparse
Cause: Incorrect MIDI range or time step
Solution: Adjust autoMidiRange padding, change timeStep resolution
Problem: Color scheme not showing expected patterns
Cause: Misunderstanding of color mapping
Solution: Review color scheme descriptions, try different schemes

Technical Reference

Complete Parameter Reference

Parameter          Type         Default   Description
pitchFloor         positive     75        Minimum expected pitch (Hz)
pitchCeiling       positive     600       Maximum expected pitch (Hz)
timeStep           positive     0.01      Analysis frame interval (s)
autoMidiRange      boolean      1         Auto-detect MIDI display range
manualMidiMin      integer      48        Manual MIDI minimum (if auto=no)
manualMidiMax      integer      84        Manual MIDI maximum (if auto=no)
midiPadding        positive     3         Padding for auto range (semitones)
smoothing          optionmenu   1         Pitch smoothing algorithm
colorScheme        optionmenu   1         Visualization color mapping
lineStyle          optionmenu   1         Pitch contour display style
minDotSize         positive     0.8       Minimum dot size (mm)
maxDotSize         positive     3.5       Maximum dot size (mm)
showAllSemitones   boolean      1         Show all semitone grid lines
showNoteLabels     boolean      1         Show note names on C lines
showTimeGrid       boolean      0         Show vertical time grid
intensityMinDb     positive     40        Minimum intensity for mapping (dB)
intensityMaxDb     positive     80        Maximum intensity for mapping (dB)
useLogLoudness     boolean      1         Use logarithmic loudness compression

Performance Considerations

Processing Time Factors

Major time consumers:

1. Pitch extraction: O(n) with audio length
    Smaller timeStep = more frames = longer processing
2. Intensity extraction: O(n) with audio length
    Parallel to pitch extraction
3. Smoothing algorithms:
    No smoothing: O(1) per frame
    Median 3-frame: O(1) per frame
    Median 5-frame: O(1) per frame
    Moving average: O(1) per frame
4. Visualization drawing: O(n) with number of frames
    More complex color schemes = slightly longer

Typical performance:
    1-minute audio: 2-5 seconds
    5-minute audio: 10-25 seconds
    30-minute audio: 1-3 minutes

Memory Usage

Storage requirements:

Arrays stored:
    t[]: time points (numFrames × 8 bytes)
    f0[]: pitch values (numFrames × 8 bytes)
    midiNote[]: MIDI values (numFrames × 8 bytes)
    quantizedMidi[]: rounded MIDI (numFrames × 8 bytes)
    voiced[]: voicing flags (numFrames × 1 byte)
    intensity[]: loudness values (numFrames × 8 bytes)

Total memory ≈ numFrames × 41 bytes

Typical usage (timeStep = 0.01):
    1-minute audio: 6000 frames ≈ 246 KB
    5-minute audio: 30000 frames ≈ 1.2 MB
    30-minute audio: 180000 frames ≈ 7.4 MB

Efficient for most practical applications
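The memory figures follow directly from the per-frame byte counts, as the Python sketch below shows. The function names are hypothetical, and the estimate counts only the six arrays listed above, not Praat's own object overhead.

```python
# Five 8-byte arrays (t, f0, midiNote, quantizedMidi, intensity) plus one 1-byte flag.
BYTES_PER_FRAME = 5 * 8 + 1


def estimate_frames(duration_s: float, time_step: float = 0.01) -> int:
    """Approximate number of analysis frames for a given duration."""
    return round(duration_s / time_step)


def estimate_bytes(duration_s: float, time_step: float = 0.01) -> int:
    """Approximate array storage in bytes for a given duration."""
    return estimate_frames(duration_s, time_step) * BYTES_PER_FRAME
```

For one minute of audio at the default time step, this gives 6000 frames and 246,000 bytes. Halving timeStep doubles both numbers.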