Pitch Contour Transfer — User Guide
Mean-preserving pitch transfer: Apply the average pitch of a source sound to a target sound while preserving the target's original pitch contour shape and timing.
What this does
This script shifts the target sound's pitch by the difference between the source and target mean F0. Because every voiced pitch point moves by the same number of hertz, the target keeps its contour shape, relative pitch movements, micro-intonation, and timing.
Key Features:
- Mean-Based Transfer: Calculates the difference between the source and target average pitch and applies it to the target
- Contour Preservation: Maintains the target's pitch contour shape and relative movements
- Blend Control: Adjustable strength parameter for partial transfer
- Pitch Floor/Ceiling: Separate frequency bounds for source and target
- Voiced Frame Analysis: Detailed pitch tracking statistics
- Overlap-Add Resynthesis: High-quality pitch shifting; large shifts can still produce audible artifacts
Technical Implementation:
1. Pitch analysis: Extract pitch contours using Praat's Pitch object with configurable time step and frequency bounds.
2. Mean calculation: Compute the arithmetic mean of the voiced frames for both sounds.
3. Shift computation: Mean_shift = (mean_A - mean_B) × blend_strength.
4. Contour transfer: Create a new pitch tier with shifted points: F0_new = F0_original + Mean_shift.
5. Boundary enforcement: Clip shifted values to the pitch floor/ceiling range.
6. Resynthesis: Use a Manipulation object with overlap-add synthesis to apply the new pitch tier to the target.
7. Validation: Re-analyze the result to verify the achieved pitch shift.

Key insight: For small to moderate shifts, an additive shift preserves the contour shape (in Hz) better than multiplicative scaling.
Quick start
- In the Praat Objects window, select two Sound objects:
  - First selected: Sound A (source style)
  - Second selected: Sound B (target sound)
- Open script: pitch_contour_transfer.praat
- Adjust analysis parameters (or use defaults):
  - Analysis_time_step: 0.01 seconds (10 ms)
  - Pitch_floor_A: 75 Hz (source lower bound)
  - Pitch_ceiling_A: 300 Hz (source upper bound)
  - Pitch_floor_B: 50 Hz (target lower bound)
  - Pitch_ceiling_B: 300 Hz (target upper bound)
- Set Blend_strength (1.0 = full transfer, 0.5 = half transfer, 0.0 = no change)
- Click Run — script analyzes both sounds, calculates shift, applies to target
- Result appears as "targetname_shifted" in Objects window
- Script displays shift details in Info window and auto-plays result
Pitch Transfer Theory
Mathematical Foundation
📐 The Transfer Equation
Given:
- meanA = mean F0 of source sound A (Hz)
- meanB = mean F0 of target sound B (Hz)
- blend = blend_strength parameter (0.0 to 5.0; values above 1.0 overshoot the source mean)
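Shift calculation: shift = (meanA - meanB) × blend
For each pitch point in target contour: F0_new = F0_original + shift
Boundary enforcement: F0_new is clipped to the pitch floor/ceiling range, i.e. F0_new = min(max(F0_new, floor), ceiling)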
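The shift-and-clip step can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the Praat script itself; the function names and the sample values are hypothetical, and the bounds passed in are assumed to be the target's floor and ceiling.

```python
def mean_shift(mean_a, mean_b, blend=1.0):
    """Shift in Hz: (meanA - meanB) * blend."""
    return (mean_a - mean_b) * blend


def shift_contour(f0_points, shift, floor_hz, ceiling_hz):
    """Add the shift to every pitch point, then clip to [floor_hz, ceiling_hz]."""
    return [min(max(f0 + shift, floor_hz), ceiling_hz) for f0 in f0_points]


# Hypothetical target contour (Hz) and means, for illustration only.
target_f0 = [110.0, 125.0, 140.0]
shift = mean_shift(mean_a=220.0, mean_b=125.0, blend=1.0)
shifted = shift_contour(target_f0, shift, floor_hz=80.0, ceiling_hz=300.0)
print(shift, shifted)   # 95.0 [205.0, 220.0, 235.0]
```

Every point moves by the same 95 Hz, so the 15 Hz steps between neighbouring points are unchanged unless a value hits a bound.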
Additive vs Multiplicative Shifting
🎯 Preserving Contour Shape
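Multiplicative shifting (traditional): F0_new = F0_original × ratio
Effect on contour: Expands/compresses the pitch range in Hz in proportion to the ratio; intervals measured in semitones stay the same.
Problem: Large shifts change the contour's excursions in Hz (a vibrato that is 10 Hz wide becomes 20 Hz wide after doubling).
Additive shifting (this script): F0_new = F0_original + shift
Effect on contour: Preserves the exact shape in Hz; the whole contour is transposed by a constant offset.
Advantage: Maintains relative pitch movements and micro-intonation as measured in Hz. Intervals in semitones do change, which is why additive shifting suits small to moderate shifts.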
When to use each:
- Additive: Small-medium shifts, preserving expressive contours
- Multiplicative: Large shifts, key changes in music
- This script: Uses additive shifting for contour preservation
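To make the difference concrete, the following Python sketch applies both kinds of shift to a two-point contour and compares the span in Hz with the frequency ratio between the points (the ratio is what determines the musical interval). The contour values and shift amounts are hypothetical.

```python
contour = [100.0, 120.0]                       # hypothetical low and high points (Hz)
additive = [f + 100.0 for f in contour]        # shift every point by +100 Hz
multiplicative = [f * 2.0 for f in contour]    # scale every point by 2 (one octave up)

for name, c in [("original", contour),
                ("additive", additive),
                ("multiplicative", multiplicative)]:
    span_hz = c[1] - c[0]      # contour excursion in Hz
    ratio = c[1] / c[0]        # frequency ratio between the two points
    print(f"{name:15s} span = {span_hz:5.1f} Hz, ratio = {ratio:.2f}")
```

The additive version keeps the 20 Hz span while the ratio drops from 1.20 to 1.10; the multiplicative version keeps the ratio at 1.20 while the span doubles to 40 Hz.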
Pitch Analysis Methodology
Semitone Conversion
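A shift in Hz can be expressed in semitones with the standard relation 12 × log2(f2 / f1). The Python sketch below is an illustration under that formula; the helper names are hypothetical, and the size of an additive shift in semitones is measured at the mean pitch because it differs at other points of the contour.

```python
import math


def hz_to_semitones(f_hz, ref_hz):
    """Interval from ref_hz to f_hz in semitones (negative if f_hz is lower)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def shift_in_semitones(mean_before_hz, shift_hz):
    """Semitone size of an additive shift, measured at the mean pitch."""
    return hz_to_semitones(mean_before_hz + shift_hz, mean_before_hz)


print(round(shift_in_semitones(125.0, 95.0), 2))   # 125 Hz -> 220 Hz
print(round(shift_in_semitones(248.0, 11.2), 2))   # 248 Hz -> 259.2 Hz
```

The first call gives roughly 9.8 semitones and the second roughly 0.76 semitones.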
Blend Strength Interpretation
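One way to read blend_strength is through the mean F0 it should produce: the target mean plus the blended difference. The Python sketch below tabulates this for a few values; the means are hypothetical and clipping at the pitch bounds is ignored.

```python
def resulting_mean(mean_a, mean_b, blend):
    """Expected mean F0 of the result, ignoring clipping at the pitch bounds."""
    return mean_b + (mean_a - mean_b) * blend


# Hypothetical means: source 220 Hz, target 125 Hz.
for blend in (0.0, 0.5, 1.0, 2.0):
    print(blend, resulting_mean(220.0, 125.0, blend))
# 0.0 125.0
# 0.5 172.5
# 1.0 220.0
# 2.0 315.0
```

With blend 2.0 the result overshoots the source mean; in this example 315 Hz would exceed a 300 Hz ceiling, so clipping would reduce the achieved shift.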
Parameters & Settings
Core Parameters
| Parameter | Default | Range | Description |
|---|---|---|---|
| Analysis_time_step | 0.01 | 0.001-0.05 | Time between pitch analysis points (seconds). Smaller = more detail, larger = faster |
| Pitch_floor_A | 75 | 50-500 | Minimum expected F0 for source sound (Hz). Set to speaker/instrument lowest note |
| Pitch_ceiling_A | 300 | 100-1000 | Maximum expected F0 for source sound (Hz). Set to speaker/instrument highest note |
| Pitch_floor_B | 50 | 30-400 | Minimum expected F0 for target sound (Hz). Lower than floor_A for flexibility |
| Pitch_ceiling_B | 300 | 100-1000 | Maximum expected F0 for target sound (Hz). Can match or exceed ceiling_A |
| Blend_strength | 1.0 | 0.0-5.0 | Strength of pitch transfer. 1.0 = full difference, 0.5 = half, 2.0 = double |
Parameter Guidelines by Sound Type
| Voice Type | Pitch Floor | Pitch Ceiling | Notes |
|---|---|---|---|
| Bass male | 80 Hz | 200 Hz | Capture fundamental, exclude first harmonic |
| Tenor male | 100 Hz | 300 Hz | Standard male speech range |
| Alto female | 150 Hz | 350 Hz | Female speech, lower singing |
| Soprano female | 200 Hz | 500 Hz | Higher female voice, child voice |
| Child | 250 Hz | 600 Hz | High-pitched voices |

| Instrument | Pitch Floor | Pitch Ceiling | Notes |
|---|---|---|---|
| Bass guitar | 40 Hz | 250 Hz | Very low fundamentals |
| Cello | 65 Hz | 500 Hz | Wide range, expressive |
| Violin | 190 Hz | 1000 Hz | High fundamentals, harmonics (open G string is about 196 Hz) |
| Flute | 250 Hz | 1200 Hz | Pure tones, clear pitch |
| Trumpet | 150 Hz | 800 Hz | Bright, strong fundamentals |
Advanced Parameter Interactions
- Floor too high: Misses low pitches, mean calculation inaccurate
- Ceiling too low: Caps high pitches, distorts mean
- Floor too low: May capture subharmonics or noise
- Ceiling too high: May capture harmonics instead of fundamental
Rule of thumb: Set floor to 0.75× lowest expected pitch, ceiling to 1.5× highest expected pitch.
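The rule of thumb is simple enough to compute directly. A minimal Python sketch follows; the function name is hypothetical and the input range is an example, not a recommendation.

```python
def suggest_bounds(lowest_hz, highest_hz):
    """Rule of thumb: floor = 0.75 x lowest expected pitch, ceiling = 1.5 x highest."""
    if not 0 < lowest_hz <= highest_hz:
        raise ValueError("expected 0 < lowest_hz <= highest_hz")
    return 0.75 * lowest_hz, 1.5 * highest_hz


print(suggest_bounds(100.0, 200.0))   # (75.0, 300.0)
```

For an expected range of 100 to 200 Hz this reproduces the default source bounds of 75 Hz and 300 Hz.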
Analysis_time_step trade-offs:
- 0.01s (default): Good for most speech and music (100 points/second)
- 0.005s: Higher detail for fast pitch movements (vibrato, glissandi)
- 0.02s: Smoother contours, faster processing (50 points/second)
- 0.001s: Extreme detail for research (1000 points/second, slow)
Performance: Smaller time step = more pitch points = longer processing but more accurate contour.
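The cost of a smaller time step can be estimated by counting pitch points. The Python sketch below is an approximation: it divides duration by time step and ignores the frames that the analysis window cannot fit at the edges of the sound. The function name is hypothetical.

```python
def approx_pitch_points(duration_s, time_step_s):
    """Approximate number of pitch points for a sound of the given duration."""
    return round(duration_s / time_step_s)


for step in (0.001, 0.005, 0.01, 0.02):
    print(step, approx_pitch_points(1.0, step), approx_pitch_points(10.0, step))
```

Each row prints the time step, the points per second, and the points in a 10-second sound, so the 0.001 s setting produces ten times as many points as the default.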
Complete Workflow
Step-by-Step Algorithm
🔧 Script Execution Flow
Phase 1: Input Validation
Phase 2: Source Analysis (Sound A)
Phase 3: Target Analysis (Sound B)
Phase 4: Shift Calculation
Phase 5: Contour Transfer
Phase 6: Resynthesis
Phase 7: Verification & Output
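The phases above can be walked through on plain lists of F0 values. The Python sketch below is an assumption-laden illustration, not the script's code: unvoiced frames are represented as None, the clipping bounds are assumed to be the target's, resynthesis (Phase 6) is not modelled, and the only input check shown is for missing voiced frames.

```python
def voiced_mean(f0_frames):
    """Arithmetic mean over voiced frames; unvoiced frames are None."""
    voiced = [f for f in f0_frames if f is not None]
    if not voiced:
        raise ValueError("no voiced frames found; check pitch floor/ceiling")
    return sum(voiced) / len(voiced)


def transfer_mean(f0_a, f0_b, blend, floor_b, ceiling_b):
    """Phases 2-5 and 7 on plain lists of frame values."""
    mean_a = voiced_mean(f0_a)                    # Phase 2: source analysis
    mean_b = voiced_mean(f0_b)                    # Phase 3: target analysis
    shift = (mean_a - mean_b) * blend             # Phase 4: shift calculation
    shifted = [                                   # Phase 5: contour transfer + clipping
        None if f is None else min(max(f + shift, floor_b), ceiling_b)
        for f in f0_b
    ]
    achieved = voiced_mean(shifted) - mean_b      # Phase 7: verification
    return shifted, shift, achieved


f0_a = [210.0, None, 230.0]            # hypothetical source frames (Hz)
f0_b = [120.0, 130.0, None, 125.0]     # hypothetical target frames (Hz)
shifted, shift, achieved = transfer_mean(f0_a, f0_b, 1.0, 50.0, 300.0)
print(shift, achieved, shifted)   # 95.0 95.0 [215.0, 225.0, None, 220.0]
```

Comparing `achieved` with `shift` is the verification idea: if clipping occurred, the achieved shift is smaller in magnitude than the requested one.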
Information Window Output
Object Creation Chain
🔄 Praat Object Pipeline
Input Objects:
- Sound A (source, first selected)
- Sound B (target, second selected)
Analysis Objects (temporary):
- Pitch A (from Sound A)
- Pitch B (from Sound B)
- Manipulation (from Sound B)
- PitchTier (empty, then filled)
Synthesis Objects:
- Sound_result (resynthesized from Manipulation)
- Pitch_result (from Sound_result, for verification)
Final Output:
- "originalname_shifted" Sound object
Cleanup: All temporary objects removed automatically.
Troubleshooting Common Issues
| Cause | Solution |
|---|---|
| Wrong number of sounds selected | Select exactly 2 Sound objects in the Praat Objects window (Ctrl+click) |
| Incorrect pitch floor/ceiling capturing harmonics or missing F0 | Adjust the floor/ceiling parameters and check the pitch track in the View & Edit window |
| Large shift causing phase issues in overlap-add synthesis | Reduce blend_strength, or apply multiple smaller shifts |
| Target has unvoiced regions (consonants, silence) | This is normal: only voiced frames are shifted; unvoiced frames are unchanged |
| Extreme shift pushing values beyond the pitch bounds | Increase ceiling_B or decrease floor_B, or reduce blend_strength |
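Before running a large shift, it can help to estimate how many pitch points would land outside the target bounds and therefore be clipped. The Python sketch below is a hypothetical pre-check on a list of voiced target values, not a feature of the script.

```python
def clipped_fraction(f0_points, shift, floor_hz, ceiling_hz):
    """Fraction of voiced points whose shifted value falls outside [floor_hz, ceiling_hz]."""
    if not f0_points:
        return 0.0
    outside = sum(1 for f in f0_points if not floor_hz <= f + shift <= ceiling_hz)
    return outside / len(f0_points)


target_f0 = [180.0, 200.0, 220.0, 240.0]                  # hypothetical voiced points (Hz)
print(clipped_fraction(target_f0, 95.0, 50.0, 300.0))     # 0.5
print(clipped_fraction(target_f0, 95.0, 50.0, 400.0))     # 0.0
```

A non-zero fraction suggests widening the bounds or lowering blend_strength, since clipped points flatten the contour at the limit.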
Applications
Voice Transformation
Use case: Adjust speaker's average pitch while preserving speaking style
Technique: Use a speech recording at the desired average pitch as the source and the voice to be modified as the target
Example: Make male voice speak at female average pitch while keeping male timbre and prosody
Singing Voice Modification
Use case: Adjust singing key while preserving vocal expression
Technique: Transfer the mean pitch of a reference performance to the target recording
Workflow:
- Source: Well-tuned reference vocal
- Target: Expressive but pitch-inaccurate recording
- Result: Expressive performance at correct average pitch
- blend_strength: 0.7-0.9 (preserve some original pitch character)
Prosody Research
Use case: Study pitch contour patterns independent of absolute frequency
Technique: Normalize multiple speakers to common mean pitch
Research applications:
- Compare intonation patterns across speakers
- Isolate contour from pitch height effects
- Create pitch-normalized stimuli for perception tests
- Study gender differences in prosody separate from F0
Instrumental Sound Design
Use case: Move an instrument recording to the average pitch of a voice
Technique: Use the vocal as source and the instrument recording as target
Example:
- Source: Expressive speech with natural pitch variation
- Target: Sustained violin note
- Result: Violin note at the speech's mean F0; the violin keeps its own contour, so the speech inflections are not copied
- Creative applications: Matching instrument and voice registers for hybrid textures
Audio Restoration
Use case: Correct a constant pitch offset in historical recordings
Technique: Use a stable reference at the correct pitch as source and the offset recording as target
Workflow:
- Source: Modern stable recording at correct pitch
- Target: Historical recording that plays consistently sharp or flat
- Result: Version at the reference's average pitch with the original contour intact
- Note: Works for a consistent offset only; time-varying drift, wow, and flutter are part of the contour and are preserved, not removed
Language Teaching
Use case: Demonstrate intonation patterns at comfortable pitch
Technique: Keep the native speaker's contour and move it to the learner's comfortable pitch range
Example:
- Source: Learner's voice (or a synthetic voice at the learner's range), which supplies the mean pitch
- Target: Native speaker with correct intonation, which supplies the contour
- Result: Correct intonation pattern at comfortable pitch for imitation
Practical Examples
Example 1: Gender Voice Matching
👥 Male-to-Female Pitch Adjustment
Goal: Make male voice speak at typical female pitch range
Source (A): Female speech sample (mean ~220 Hz)
Target (B): Male speech sample (mean ~125 Hz)
Settings:
- Pitch_floor_A: 150 Hz (female lower bound)
- Pitch_ceiling_A: 350 Hz (female upper bound)
- Pitch_floor_B: 80 Hz (male lower bound)
- Pitch_ceiling_B: 300 Hz (male upper bound, extended)
- Blend_strength: 1.0 (full transfer)
Expected shift: ~95 Hz (about +9.8 semitones at the mean, from 125 Hz to 220 Hz)
Result: Male voice with female average pitch, male timbre and prosody preserved
Example 2: Singing Intonation Correction
🎵 Vocal Pitch Stabilization
Goal: Improve singing intonation while preserving expression
Source (A): Well-tuned reference vocal (mean 262 Hz ~ C4)
Target (B): Expressive but flat recording (mean 248 Hz ~ B3)
Settings:
- Pitch_floor_A: 200 Hz (below expected range)
- Pitch_ceiling_A: 400 Hz (above expected range)
- Pitch_floor_B: 180 Hz (allow downward shift)
- Pitch_ceiling_B: 420 Hz (allow upward shift)
- Blend_strength: 0.8 (partial correction)
Expected shift: +11 Hz (~0.75 semitones up)
Result: More in-tune vocal: 80% of the mean pitch difference is corrected, and the contour shape (the expression) is fully preserved
Example 3: Instrument-Voice Hybrid
🎻 Expressive Instrument Creation
Goal: Move a sustained instrument note to the average pitch of a speech sample
Source (A): Emotional speech (high pitch variation)
Target (B): Sustained cello note (constant 196 Hz ~ G3)
Settings:
- Pitch_floor_A: 100 Hz (speech lower bound)
- Pitch_ceiling_A: 350 Hz (speech excited peaks)
- Pitch_floor_B: 65 Hz (cello lowest)
- Pitch_ceiling_B: 500 Hz (cello range)
- Blend_strength: 1.0
- Analysis_time_step: 0.005 (capture fast speech changes)
Result: Cello note transposed from 196 Hz to the speech's mean F0. The cello's own (nearly flat) contour is kept, so the speech inflections are not transferred; only the register changes.
Example 4: Multi-Step Processing
Situation: Need large pitch shift but want to avoid artifacts
Solution: Multiple passes with moderate blend_strength
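The arithmetic of repeated passes can be sketched in Python, assuming each pass reuses the previous result as the new target and the same source and blend_strength are kept. The function name and the means are hypothetical, and clipping is ignored.

```python
def multi_pass_means(mean_a, mean_b, blend, passes):
    """Mean F0 after each pass when the previous result is reused as the target."""
    means = []
    current = mean_b
    for _ in range(passes):
        current += (mean_a - current) * blend   # each pass closes a fraction of the gap
        means.append(current)
    return means


print(multi_pass_means(220.0, 125.0, 0.5, 3))   # [172.5, 196.25, 208.125]
```

Under these assumptions the remaining gap shrinks by a factor of (1 - blend) per pass, so three passes at 0.5 cover 87.5% of the 95 Hz difference in steps of 47.5, 23.75, and 11.875 Hz.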
Example 5: Contour Exaggeration
Goal: Make subtle pitch variations more dramatic