Breathing Pitch Waves Effect — User Guide

Emotional vocal processing: applies breathing-like pitch modulation with configurable rate, depth, and emotional intensity for expressive vocal transformations.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 0.1 (2025)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

What this does

This script implements breathing pitch waves, a psychoacoustic vocal processing technique that simulates the natural pitch variations of human breathing. It creates emotionally expressive vocal effects by applying multi-layered pitch modulation to an audio signal. The script analyzes the source pitch, generates a dense pitch curve with breathing-like oscillations, and resynthesizes the audio with that curve. The result is a vocal with a breathing character (gentle breath, emotional swell, dramatic breath, panic breathing, subtle tremor, and so on) set by configurable parameters rather than manual automation.

Key Features:

What are breathing pitch waves?

Traditional pitch effects: vibrato (regular oscillation), portamento (pitch slides), pitch correction (static tuning).

Breathing pitch waves: complex, emotionally-driven pitch modulation simulating human breathing patterns.

Advantages:
  1. Emotional expression: Creates natural, human-like vocal qualities.
  2. Complexity: Multi-layered modulation (fundamental + harmonics + subharmonics).
  3. Realism: Includes micro-flutter and chaos for natural imperfections.
  4. Configurable: Parameters control rate, depth, and emotional intensity.
  5. Psychoacoustic: Based on human breathing/vocal physiology.

Use cases:
  • Vocal processing (singing, spoken word)
  • Sound design (emotional character voices)
  • Experimental music (expressive transformations)
  • Film/TV (emotional voice effects)
  • Therapeutic applications (breathing awareness)

Technical Implementation:
  1. Pitch Analysis: Extract median pitch from source audio using Praat's pitch detection.
  2. Dense Curve Generation: Create 200-2000 control points across duration for smooth transitions.
  3. Multi-Layered Modulation: Generate breathing waveform with fundamental (sin³), harmonic (sin⁵), and subharmonic (sin²) components.
  4. Emotional Components: Add micro-flutter (high-frequency variations), tremor (mid-frequency oscillation), and gasps (sudden pitch rises).
  5. Intensity Envelope: Apply emotional build-up over time.
  6. Pitch Conversion: Convert semitone shifts to frequency ratios, apply to median pitch.
  7. Resynthesis: Replace original pitch with breathing pitch curve using overlap-add synthesis.
  8. Output Processing: Optional resampling, cleanup, and result selection.

Quick start

  1. In Praat, select exactly one Sound object (preferably vocal content).
  2. Run script…breathing_pitch_waves.praat.
  3. Choose preset from dropdown (Gentle Breath, Emotional Swell, etc.) or select "Manual" for custom parameters.
  4. Adjust breath_rate (cycles per second, typically 0.1-0.8 Hz).
  5. Set pitch_depth_semitones (variation range, typically 4-36 semitones).
  6. Configure micro_flutter and emotional_intensity for realism and expression.
  7. Set pitch analysis parameters (time_step, minimum_pitch, maximum_pitch).
  8. Configure output settings (output_sample_rate, resample_precision).
  9. Click OK — breathing pitch effect applied, result named "originalname_breath_result".
Quick tip: Start with presets for immediate results — "Gentle Breath" for subtle expression, "Emotional Swell" for singing enhancement, "Panic Breathing" for intense drama. Use Manual mode for fine control. Lower breath_rate (0.1-0.3) for calm breathing, higher (0.5-0.8) for excited/panicked breathing. Moderate pitch_depth (8-15 semitones) for natural effect, extreme (24-36) for dramatic transformation. Enable play_after_processing to hear result immediately. Processing time depends on audio length and density of pitch points (typically 10-60 seconds).
Important: WORKS BEST WITH VOCALS. The effect is designed for monophonic pitched content (singing, speech); polyphonic or noisy audio may produce unpredictable results. Pitch detection is critical: make sure minimum_pitch and maximum_pitch bracket the actual pitch range of your audio. Very high emotional_intensity (>3.0) can create extreme pitch variations that may sound artificial. Very high micro_flutter (>6) can introduce audible artifacts. The effect replaces the original pitch completely; the original pitch contour is discarded. Enable keep_intermediate_objects for debugging if needed. The result may require normalization if pitch variations cause amplitude changes.

Breathing Pitch Theory

Human Breathing Physiology

Natural Breathing Patterns

Breathing characteristics in vocal expression:

Natural breathing patterns:
  • Rate: 0.1-0.8 Hz (6-48 breaths per minute)
  • Depth: Subtle to dramatic pitch variations
  • Complexity: Multi-layered oscillations
  • Emotion: Correlated with emotional state

Emotional correlations:
  • Calm: 0.1-0.2 Hz, shallow depth (4-8 semitones)
  • Normal: 0.2-0.3 Hz, moderate depth (8-15 semitones)
  • Excited: 0.3-0.5 Hz, deep variations (15-24 semitones)
  • Panicked: 0.5-0.8 Hz, extreme depth (24-36 semitones)

Why Breathing Pitch Modulation?

Advantages of breathing pitch effects:

Vs other pitch effects:

Multi-Layered Modulation

Breathing Waveform Components

Complex breathing waveform construction:

Fundamental component (primary breathing oscillation):
  breath_fundamental = sin(phase)^3

Harmonic component (higher-frequency breathing harmonics):
  breath_harmonic = 0.6 * sin(phase * 2)^5

Subharmonic component (slower, deeper breathing undertones):
  breath_subharmonic = 0.3 * sin(phase * 0.5)^2

Total breathing curve:
  breath_curve = breath_fundamental + breath_harmonic + breath_subharmonic

Where:
  phase = (time - start_time) * 2 * π * breath_rate
  (normalized phase angle for the breathing cycle)
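The three layers above can be sketched in a few lines of Python. This is an illustration of the formulas as written in this guide, not the script's own Praat code; the function name breath_curve and its argument names are chosen here for the example.

```python
import math

def breath_curve(t, breath_rate, start_time=0.0):
    """Sum of the three breathing layers at time t (seconds).

    Illustrative only: mirrors the formulas in this guide, not the Praat source.
    """
    phase = (t - start_time) * 2 * math.pi * breath_rate
    fundamental = math.sin(phase) ** 3            # primary breathing oscillation
    harmonic = 0.6 * math.sin(phase * 2) ** 5     # faster layer at twice the rate
    subharmonic = 0.3 * math.sin(phase * 0.5) ** 2  # slower layer at half the rate
    return fundamental + harmonic + subharmonic

# A quarter of the way through one breath at 0.25 Hz (t = 1 s, phase = π/2),
# the fundamental is at its peak and the harmonic is at a zero crossing.
quarter_cycle_value = breath_curve(1.0, 0.25)
```

Because the subharmonic term is squared it never goes negative, so the summed curve is biased slightly upward rather than being symmetric around zero.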

Why Multi-Layered Approach?

Psychological and acoustic benefits:

🎭 Emotional Components

Micro-flutter: High-frequency variations (vocal cord instability)

Adds realism and natural vocal imperfections

Emotional tremor: Mid-frequency oscillation (emotional tension)

Conveys vulnerability, excitement, or anxiety

Gasp effects: Sudden pitch rises (surprise, intensity peaks)

Creates dramatic moments and emotional highlights

Phase and Timing

Breathing Cycle Calculation

Phase-based timing system:

Phase calculation:
  phase = (t - xmin) * 2 * π * breath_rate

Where:
  t = current time (seconds)
  xmin = audio start time
  breath_rate = breathing cycles per second (Hz)

Example: breath_rate = 0.3 Hz, duration = 10 seconds
  Total cycles = 0.3 * 10 = 3 complete breathing cycles
  Phase at t=5s: (5-0) * 2 * π * 0.3 = 3π radians

Phase components:
  flutter_phase = phase * 12 (high-frequency flutter)
  chaos_phase = phase * 23.7 (irregular chaos component)
  tremor_phase = phase * 7.3 (emotional tremor frequency)
  gasp_phase = phase * 3 (gasp trigger frequency)
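The worked example above can be checked numerically. The sketch below is a Python illustration; the function name component_phases and the dictionary layout are assumptions made for this example, while the multipliers are the ones listed in this guide.

```python
import math

def component_phases(t, breath_rate, xmin=0.0):
    """Return the base phase and the derived component phases, in radians."""
    phase = (t - xmin) * 2 * math.pi * breath_rate
    return {
        "phase": phase,
        "flutter_phase": phase * 12,    # high-frequency flutter
        "chaos_phase": phase * 23.7,    # irregular chaos component
        "tremor_phase": phase * 7.3,    # emotional tremor
        "gasp_phase": phase * 3,        # gasp trigger
    }

# Example from the text: breath_rate = 0.3 Hz, t = 5 s, audio starting at 0 s.
phases = component_phases(5.0, 0.3)
phase_in_units_of_pi = phases["phase"] / math.pi   # approximately 3
total_cycles = 0.3 * 10                            # cycles in a 10 s file
```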

Why Phase-Based System?

Advantages of phase-based modulation:

Phase-based vs time-based:

Phase-based:
  • Consistent regardless of audio duration
  • Natural breathing rhythm maintained
  • Easy to scale and modify
  • Mathematically elegant

Time-based:
  • Would require duration-specific calculations
  • Harder to maintain consistent breathing character
  • More complex implementation

Phase system ensures breathing character remains consistent across different audio lengths

Emotional Intensity Envelope

Time-Varying Emotional Build-up

Intensity progression algorithm:

Intensity envelope calculation:
  time_factor = (t - xmin) / duration
  intensity_envelope = 1 + emotional_intensity * time_factor^1.5

Where:
  time_factor = normalized time (0 to 1)
  emotional_intensity = user parameter (0.5 to 4.0 across the presets; the form accepts 0.1-5.0)
  exponent 1.5 = gradual build-up curve

Application:
  total_shift = pitch_depth_semitones * (breath_curve + flutter + tremor + gasp) * intensity_envelope

Effect:
  • Beginning: lowest intensity (emotional_intensity * 0^1.5 = 0, so intensity_envelope = 1)
  • Middle: moderate intensity
  • End: full intensity (emotional_intensity * 1^1.5 = emotional_intensity, so intensity_envelope = 1 + emotional_intensity)
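A small Python sketch of the envelope formula may help show how the multiplier grows over the file. The function name and the clamp on time_factor are additions made for this example (the clamp only guards against times outside the file); the formula itself is the one given above.

```python
def intensity_envelope(t, xmin, duration, emotional_intensity):
    """Multiplier applied to the pitch shift at time t.

    Follows intensity_envelope = 1 + emotional_intensity * time_factor^1.5.
    """
    time_factor = (t - xmin) / duration
    # Added guard (not from the guide): keep time_factor inside 0..1.
    time_factor = min(max(time_factor, 0.0), 1.0)
    return 1 + emotional_intensity * time_factor ** 1.5

# With emotional_intensity = 2.0 on a 10 s file the multiplier runs
# from 1.0 at the start to 3.0 at the end.
start = intensity_envelope(0.0, 0.0, 10.0, 2.0)
middle = intensity_envelope(5.0, 0.0, 10.0, 2.0)
end = intensity_envelope(10.0, 0.0, 10.0, 2.0)
```

Because the exponent is greater than 1, the midpoint multiplier sits below the halfway value between start and end, so most of the build-up happens in the second half of the file.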

Why Intensity Envelope?

Purpose of time-varying intensity:

Without intensity envelope:
  • Constant emotional level throughout
  • May sound artificial or monotonous
  • Lacks narrative progression

With intensity envelope:
  • Emotional build-up over time
  • Creates sense of progression or narrative
  • More psychologically engaging
  • Mimics natural emotional arcs in performance

Typical usage:
  emotional_intensity = 0.5 → subtle gradual increase
  emotional_intensity = 2.0 → noticeable emotional arc
  emotional_intensity = 4.0 → dramatic emotional journey

Complete Processing Pipeline

SETUP:
  Select Sound object (vocal content recommended)
  Choose preset or manual parameters
  Set breath_rate, pitch_depth_semitones, micro_flutter, emotional_intensity

PITCH ANALYSIS:
  Copy original sound → "originalname_breath_tmp"
  To Manipulation: time_step, minimum_pitch, maximum_pitch
  Extract median pitch (reference frequency)

DENSE PITCH CURVE GENERATION:
  Create PitchTier with 200-2000 points
  FOR each point (0 to npoints-1):
    Calculate phase = (time - start) * 2π * breath_rate
    Build multi-layer breathing waveform
    Add micro-flutter, tremor, gasp components
    Apply intensity envelope
    Convert to frequency, clamp to min/max range
    Add point to PitchTier

RESYNTHESIS:
  Replace pitch tier in Manipulation
  Get resynthesis (overlap-add)
  Rename: "originalname_breath_result"

OUTPUT PROCESSING:
  Optional resampling to output_sample_rate
  Cleanup intermediate objects
  Select final result
  Optional: Play result

OUTPUT:
  "originalname_breath_result" with breathing pitch effect
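The curve-generation stage of the pipeline can be sketched outside Praat as a plain loop that produces (time, frequency) pairs. The Python below is a simplified illustration under stated assumptions: points are spaced evenly from the start to the end of the file, and the flutter, tremor, and gasp terms are left out because this guide gives their phase multipliers but not their amplitude formulas. The function name and argument order are invented for the example.

```python
import math

def breathing_pitch_points(duration, median_f0, breath_rate,
                           pitch_depth_semitones, emotional_intensity,
                           minimum_pitch, maximum_pitch, xmin=0.0):
    """Return a list of (time, f0) pairs for a breathing pitch curve.

    Simplified sketch: breathing layers and intensity envelope only.
    """
    # 10 ms target spacing, limited to 200..2000 points.
    npoints = max(200, min(2000, round(duration / 0.01)))
    points = []
    for i in range(npoints):
        # Assumption: evenly spaced points that include both endpoints.
        t = xmin + duration * i / (npoints - 1)
        phase = (t - xmin) * 2 * math.pi * breath_rate
        curve = (math.sin(phase) ** 3
                 + 0.6 * math.sin(phase * 2) ** 5
                 + 0.3 * math.sin(phase * 0.5) ** 2)
        envelope = 1 + emotional_intensity * ((t - xmin) / duration) ** 1.5
        total_shift = pitch_depth_semitones * curve * envelope
        f0 = median_f0 * 2 ** (total_shift / 12)
        # Clamp to the pitch analysis range.
        f0 = min(max(f0, minimum_pitch), maximum_pitch)
        points.append((t, f0))
    return points

# 5 s file, 200 Hz median pitch, default-style settings from the parameter tables.
points = breathing_pitch_points(5.0, 200.0, 0.3, 18, 2.5, 50, 900)
```

In Praat, each pair would become one point in the PitchTier that replaces the original pitch in the Manipulation object before resynthesis.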

Emotional Presets

Preset Overview

🎚️ Emotion-Based Configuration

Philosophy: Each preset captures specific emotional state through breathing patterns

Parameters: Breath rate, pitch depth, flutter, intensity optimized for each emotion

Character: Immediate emotional expression without manual tuning

Best for: Quick results, emotional specificity, production workflow

Built-in presets:

Preset           Breath Rate  Pitch Depth  Micro Flutter  Emotional Intensity  Character
Gentle Breath    0.2 Hz       8 st         1.0            1.2                  Subtle, calming, meditative
Emotional Swell  0.25 Hz      15 st        2.0            2.0                  Expressive, singing-like, building
Dramatic Breath  0.35 Hz      24 st        5.0            3.0                  Theatrical, intense, performance
Panic Breathing  0.8 Hz       36 st        8.0            4.0                  Frantic, anxious, extreme
Subtle Tremor    0.15 Hz      6 st         3.0            0.8                  Delicate, vulnerable, intimate
Deep Meditation  0.1 Hz       4 st         0.5            0.5                  Calm, centered, minimal
Intense Gasping  0.5 Hz       30 st        6.0            3.5                  Breathy, excited, passionate
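For batch work or for documenting a session, the preset table can be written down as data. The Python sketch below copies the values from the table; the dictionary layout and the resolve_preset helper are hypothetical and are not part of the Praat script.

```python
# Values copied from the preset table; the structure itself is illustrative.
PRESETS = {
    "Gentle Breath":   {"breath_rate": 0.2,  "pitch_depth_semitones": 8,
                        "micro_flutter": 1.0, "emotional_intensity": 1.2},
    "Emotional Swell": {"breath_rate": 0.25, "pitch_depth_semitones": 15,
                        "micro_flutter": 2.0, "emotional_intensity": 2.0},
    "Dramatic Breath": {"breath_rate": 0.35, "pitch_depth_semitones": 24,
                        "micro_flutter": 5.0, "emotional_intensity": 3.0},
    "Panic Breathing": {"breath_rate": 0.8,  "pitch_depth_semitones": 36,
                        "micro_flutter": 8.0, "emotional_intensity": 4.0},
    "Subtle Tremor":   {"breath_rate": 0.15, "pitch_depth_semitones": 6,
                        "micro_flutter": 3.0, "emotional_intensity": 0.8},
    "Deep Meditation": {"breath_rate": 0.1,  "pitch_depth_semitones": 4,
                        "micro_flutter": 0.5, "emotional_intensity": 0.5},
    "Intense Gasping": {"breath_rate": 0.5,  "pitch_depth_semitones": 30,
                        "micro_flutter": 6.0, "emotional_intensity": 3.5},
}

def resolve_preset(name, manual_values):
    """Return a copy of the preset values, or of manual_values for 'Manual'."""
    if name == "Manual":
        return dict(manual_values)
    return dict(PRESETS[name])

panic = resolve_preset("Panic Breathing", {})
custom = resolve_preset("Manual", {"breath_rate": 0.3, "pitch_depth_semitones": 18,
                                   "micro_flutter": 4, "emotional_intensity": 2.5})
```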

Preset Applications

🎵 Vocal Enhancement Presets

Gentle Breath: Subtle naturalness for spoken word, podcast vocals

Emotional Swell: Singing enhancement, ballad vocals, emotional speech

Subtle Tremor: Intimate vocals, ASMR, whispered content

🎭 Dramatic Effect Presets

Dramatic Breath: Theater, voice acting, dramatic reading

Panic Breathing: Horror, thriller, intense scenes, emergency calls

Intense Gasping: Passionate singing, intense dialogue, climactic moments

🧘 Therapeutic Presets

Deep Meditation: Guided meditation, relaxation, mindfulness content

Gentle Breath: Breathing exercises, calm narration, sleep content

Parameters

Breathing Parameters

Parameter              Type      Default  Range      Description
preset                 option    Manual   8 options  Emotional preset or manual configuration
breath_rate            positive  0.3      0.05-2.0   Breathing cycles per second (Hz)
pitch_depth_semitones  positive  18       1-48       Pitch variation range in semitones
micro_flutter          positive  4        0-10       High-frequency pitch variations
emotional_intensity    positive  2.5      0.1-5.0    Emotional build-up over time

Pitch Analysis Parameters

Parameter      Type      Default  Range       Description
time_step      positive  0.005    0.001-0.05  Pitch analysis time step (seconds)
minimum_pitch  positive  50       30-200      Minimum expected pitch (Hz)
maximum_pitch  positive  900      200-2000    Maximum expected pitch (Hz)

Output Parameters

Parameter                  Type      Default  Range        Description
output_sample_rate         positive  44100    8000-192000  Output sample rate (Hz)
resample_precision         positive  50       10-100       Resampling quality
play_after_processing      boolean   yes      yes/no       Auto-play result
keep_intermediate_objects  boolean   no       yes/no       Keep temporary objects
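When settings are prepared outside Praat, for example in a batch sheet, a quick range check against the tables above can catch typing errors before a run. The Python sketch below is a hypothetical helper: the ranges and defaults are copied from the parameter tables, but whether the script itself enforces these ranges is not stated in this guide.

```python
# Documented ranges (min, max) for the numeric parameters.
RANGES = {
    "breath_rate": (0.05, 2.0),
    "pitch_depth_semitones": (1, 48),
    "micro_flutter": (0, 10),
    "emotional_intensity": (0.1, 5.0),
    "time_step": (0.001, 0.05),
    "minimum_pitch": (30, 200),
    "maximum_pitch": (200, 2000),
    "output_sample_rate": (8000, 192000),
    "resample_precision": (10, 100),
}

# Documented default values.
DEFAULTS = {
    "breath_rate": 0.3,
    "pitch_depth_semitones": 18,
    "micro_flutter": 4,
    "emotional_intensity": 2.5,
    "time_step": 0.005,
    "minimum_pitch": 50,
    "maximum_pitch": 900,
    "output_sample_rate": 44100,
    "resample_precision": 50,
}

def out_of_range(params):
    """Return the names of parameters that fall outside their documented range."""
    problems = []
    for name, value in params.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append(name)
    return problems

bad = out_of_range({"breath_rate": 3.0, "micro_flutter": 4})
```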

Applications

Vocal Production

Use case: Adding emotional expression to singing vocals

Technique: Use Emotional Swell or Gentle Breath presets

Example: Apply to ballad vocals for natural breathing expression

Voice Acting & Character Voices

Use case: Creating specific emotional states for character voices

Technique: Match preset to character emotion

Example: Panic Breathing for frightened character, Dramatic Breath for heroic speech

Therapeutic Audio

Use case: Guided meditation and breathing exercises

Technique: Use Deep Meditation or Gentle Breath presets

Workflow:

Experimental Music

Use case: Creating unusual vocal textures and expressions

Technique: Extreme parameter settings or multiple processing passes

Examples:

Film & Game Audio

Use case: Emotional voice processing for media

Advantages:

Example: Processing dialogue for emotional scenes in films or games

ASMR & Relaxation Content

Use case: Creating calming, intimate vocal qualities

Technique: Subtle Tremor or Gentle Breath with low parameters

Application: Whispered content, relaxation narration, sleep stories

Practical Workflow Examples

🎤 Singing Enhancement (Music Production)

Goal: Add natural breathing expression to vocal performance

Settings:

  • Preset: Emotional Swell
  • breath_rate: 0.25 Hz
  • pitch_depth_semitones: 12-18
  • emotional_intensity: 1.5-2.5

Result: Expressive vocals with natural breathing character

🎭 Character Voice (Voice Acting)

Goal: Create specific emotional state for character

Settings:

  • Preset: Panic Breathing
  • breath_rate: 0.6-0.8 Hz
  • pitch_depth_semitones: 24-36
  • micro_flutter: 6-8

Result: Anxious, frantic character voice

🧘 Guided Meditation (Therapeutic)

Goal: Create calming, centered vocal quality

Settings:

  • Preset: Deep Meditation
  • breath_rate: 0.1 Hz
  • pitch_depth_semitones: 4-6
  • emotional_intensity: 0.5

Result: Calming voice for meditation guidance

Advanced Techniques

Layered processing for complex characters:
  • Multiple passes: Apply different presets to same audio
  • Section-based: Different parameters for different song sections
  • Automation: Vary parameters over time for dynamic effects
  • Combination: Use with other effects (reverb, delay, compression)

Experiment with parameter combinations for unique vocal characters

Parameter interaction strategies:
  • High rate + low depth: Nervous, anxious quality
  • Low rate + high depth: Dramatic, theatrical expression
  • High flutter + low intensity: Natural, imperfect voice
  • Low everything: Subtle, almost imperceptible enhancement

Troubleshooting Common Issues

Problem: No pitch detected/effect doesn't work
Cause: Pitch outside min/max range or non-pitched content
Solution: Adjust minimum_pitch and maximum_pitch so they bracket the source pitch; use pitched vocal content

Problem: Artificial or robotic sounding result
Cause: Parameters too extreme or mismatched with content
Solution: Use more subtle settings, try different presets, ensure natural vocal source

Problem: Extreme pitch variations causing distortion
Cause: Very high pitch_depth_semitones or emotional_intensity
Solution: Reduce pitch depth, use moderate intensity values

Problem: Processing very slow
Cause: Very high density of pitch points or long audio
Solution: This is normal for long files; consider processing shorter segments

Technical Deep Dive

Pitch Manipulation Mathematics

Semitone to Frequency Conversion

Frequency ratio calculation:

Semitone to frequency ratio:
  ratio = 2 ^ (semitones / 12)

Where:
  semitones = pitch shift in semitones
  ratio = frequency multiplication factor

Example: +12 semitones (octave)
  ratio = 2^(12/12) = 2^1 = 2 (double the frequency)

Example: +7 semitones (perfect fifth)
  ratio = 2^(7/12) ≈ 1.498 (approximately 3:2 ratio)

Application in script:
  total_shift = [calculated semitone shift]
  ratio = 2 ^ (total_shift / 12)
  new_f0 = median_f0 * ratio
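The conversion is short enough to verify directly. The Python below is a minimal sketch of the formula above; the function names are chosen for this example.

```python
def semitones_to_ratio(semitones):
    """Frequency multiplication factor for a shift given in semitones."""
    return 2 ** (semitones / 12)

def shifted_f0(median_f0, total_shift):
    """Apply a semitone shift to the median pitch (Hz)."""
    return median_f0 * semitones_to_ratio(total_shift)

octave_up = semitones_to_ratio(12)    # doubles the frequency
fifth_up = semitones_to_ratio(7)      # close to, but not exactly, 3:2
octave_down = semitones_to_ratio(-12) # halves the frequency
new_f0 = shifted_f0(220.0, 12)        # 220 Hz raised by one octave
```

Negative shifts work the same way, so a breathing curve that swings below zero lowers the pitch by the matching ratio.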

Phase-Based Component Frequencies

Multi-frequency modulation system:

Fundamental breathing:
  frequency = breath_rate (0.1-0.8 Hz)

Harmonic component:
  frequency = breath_rate * 2 (0.2-1.6 Hz)

Subharmonic component:
  frequency = breath_rate * 0.5 (0.05-0.4 Hz)

Micro-flutter components:
  flutter_freq = breath_rate * 12 (1.2-9.6 Hz)
  chaos_freq = breath_rate * 23.7 (2.37-18.96 Hz)

Tremor component:
  tremor_freq = breath_rate * 7.3 (0.73-5.84 Hz)

Gasp component:
  gasp_freq = breath_rate * 3 (0.3-2.4 Hz)

All components are phase-locked to the fundamental breathing rate.
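Since every component is a fixed multiple of breath_rate, the frequency ranges above follow from one multiplication each. The Python sketch below reproduces them; the multipliers come from this guide, while the dictionary keys and function name are made up for the example.

```python
# Frequency multipliers relative to breath_rate, as listed in this guide.
MULTIPLIERS = {
    "fundamental": 1.0,
    "harmonic": 2.0,
    "subharmonic": 0.5,
    "flutter": 12.0,
    "chaos": 23.7,
    "tremor": 7.3,
    "gasp": 3.0,
}

def component_frequencies(breath_rate):
    """Return the frequency in Hz of each modulation component."""
    return {name: breath_rate * m for name, m in MULTIPLIERS.items()}

slowest = component_frequencies(0.1)   # lower end of the typical breath_rate range
fastest = component_frequencies(0.8)   # upper end of the typical breath_rate range
```

Changing breath_rate therefore rescales all layers together, which is what keeps them phase-locked.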

Dense Point Sampling

Smooth Curve Generation

Adaptive point density algorithm:

Point calculation:
  npoints = round(duration / 0.01)
  Constrained: 200 ≤ npoints ≤ 2000

Where:
  duration = audio length in seconds
  0.01 = target point spacing (10 ms)
  200 = minimum points (for very short audio)
  2000 = maximum points (for very long audio)

Example: 5 second audio
  npoints = round(5 / 0.01) = 500 points

Example: 30 second audio
  npoints = round(30 / 0.01) = 3000 → clamped to 2000

Advantage:
  • Short audio: sufficient points for smoothness
  • Long audio: reasonable processing time
  • Consistent perceptual quality
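The clamping rule can be expressed in one line. The Python below is a sketch of the calculation described above; the function name and keyword arguments are assumptions for this example.

```python
def dense_point_count(duration, spacing=0.01, minimum=200, maximum=2000):
    """Number of pitch points for a file of the given duration (seconds)."""
    return max(minimum, min(maximum, round(duration / spacing)))

def effective_spacing(duration):
    """Actual average spacing between points after clamping (seconds)."""
    return duration / dense_point_count(duration)

short_file = dense_point_count(1.0)    # below the minimum, raised to 200
medium_file = dense_point_count(5.0)   # 500 points at 10 ms spacing
long_file = dense_point_count(30.0)    # 3000 requested, clamped to 2000
```

One consequence of the upper limit is that files longer than 20 seconds get points spaced more than 10 ms apart; a 30 second file ends up at 15 ms.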