Spectral Morph — User Guide
CDP‑style spectral morphing powered by a Python STFT engine. Interpolates between two sounds in the spectral domain using three distinct morph modes: log‑magnitude (preserve A phase), full complex (blend phase), and formant/envelope (cepstral envelope morph).
What this does
Spectral Morph performs a time‑varying spectral interpolation between two sounds. For each analysis frame, the short‑time Fourier transforms of sound A and sound B are computed. A morph factor m(t), ranging from 0 to 1, determines the blend at that moment according to a user‑selected curve (linear, cosine S‑curve, or fixed mix). The spectral interpolation itself is then carried out in one of three modes:
- Log magnitude – geometric interpolation of magnitude spectra; phase from sound A is preserved.
- Full complex – linear interpolation of magnitude and phase (phase is unwrapped).
- Formant/envelope – cepstral envelope of A and B is interpolated; the fine structure (excitation) comes from A. This is the classic “CDP‑style” formant morph.
The heavy DSP (STFT, morphing, overlap‑add) is offloaded to a Python script (spectral_morph.py), while Praat handles the UI, file export, import, and a comprehensive 8‑panel visualisation.
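The analysis and resynthesis stage can be sketched roughly as follows. This is a minimal illustration, not the contents of spectral_morph.py: the Hann window, the 75 % overlap, and the function names `stft`, `istft`, and `frame_times` are assumptions made for the example.

```python
import numpy as np

def stft(x, n_win, hop):
    """Windowed frames of x -> one-sided spectra, shape (n_frames, n_win // 2 + 1)."""
    win = np.hanning(n_win)  # assumed window type
    n_frames = 1 + (len(x) - n_win) // hop
    frames = np.stack([x[i * hop:i * hop + n_win] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_win, hop):
    """Weighted overlap-add resynthesis, normalised by the summed squared window."""
    win = np.hanning(n_win)
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + n_win)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=n_win, axis=1)
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + n_win)
        out[sl] += frames[i] * win
        norm[sl] += win ** 2
    return out / np.maximum(norm, 1e-8)  # guard against division by zero at the edges

def frame_times(n_frames, n_win, hop, sample_rate):
    """Centre time of each frame in seconds, where m(t) would be evaluated."""
    return (np.arange(n_frames) * hop + n_win / 2) / sample_rate

# Round trip on noise: with no spectral modification the interior is reconstructed.
x = np.random.default_rng(0).standard_normal(4096)
y = istft(stft(x, 512, 128), 512, 128)
```

A morph would modify each row of the `stft` output for A using the matching row for B and the morph factor at that frame's centre time, then pass the result to `istft`.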
Quick start
- In Praat, select exactly two Sound objects. The first selected will be A (source), the second B (target).
- Run script… → Spectral_Morph.praat.
- Choose a Preset:
  - Tonal Sustained, Percussive, Voice / Formant Morph, Texture Blend, Fast Preview
- For custom mode (preset = Custom), adjust parameters as desired:
  - Start_morph_s / End_morph_s – time region over which the morph evolves (0 = full duration).
  - Curve_type – Linear, Cosine (smooth S‑curve), Full mix (fixed blend).
  - Mix_amount – fixed blend ratio (only for Full mix).
  - Window_ms – STFT window size in milliseconds.
  - Morph_mode – Log magnitude, Full complex, Formant/envelope.
- Click OK. Praat exports both sounds, calls the Python engine, and imports the result as A_morph_B.
Python requirements: numpy, soundfile. If the sample rates of A and B differ, scipy is also required for resampling.
The engine automatically matches durations by time‑aligning the two sounds to the longer one.
The 5 presets (+ Custom)
| Preset | Window | Morph mode | Curve | Description |
|---|---|---|---|---|
| Tonal Sustained | 60 ms | Log magnitude | Cosine | Smooth transition between sustained tones, pads. |
| Percussive | 25 ms | Log magnitude | Linear | Short window for transient preservation. |
| Voice / Formant Morph | 50 ms | Formant/envelope | Cosine | Morphs the spectral envelope (formants) while keeping A's excitation. |
| Texture Blend | 80 ms | Full complex | Cosine | Blends both magnitude and phase – creates rich, evolving textures. |
| Fast Preview | 120 ms | Log magnitude | Linear | Large window for quick preview (lower quality, faster). |
Morph modes
1. Log magnitude (geometric interpolation)
mag_out = exp( (1‑m)·log(mag_A) + m·log(mag_B) )
phase_out = phase_A
This mode interpolates the magnitude spectra in the log domain, which is equivalent to interpolating linearly in decibels. The phase of sound A is preserved, so the temporal fine structure (e.g., attack transients) stays aligned with the source. This is often the most natural‑sounding morph for tonal material.
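The two formulas above can be written for a single frame as a short sketch. The function name `morph_log_magnitude` and the `eps` floor (used to avoid log(0) in silent bins) are assumptions for illustration, not taken from the engine.

```python
import numpy as np

def morph_log_magnitude(spec_a, spec_b, m, eps=1e-10):
    """Geometric magnitude interpolation for one complex frame; phase comes from A."""
    mag_a = np.maximum(np.abs(spec_a), eps)  # eps floor is an assumption
    mag_b = np.maximum(np.abs(spec_b), eps)
    mag_out = np.exp((1.0 - m) * np.log(mag_a) + m * np.log(mag_b))
    return mag_out * np.exp(1j * np.angle(spec_a))

frame_a = np.array([1.0 + 0.0j, 0.0 + 4.0j])
frame_b = np.array([4.0 + 0.0j, 1.0 + 0.0j])
halfway = morph_log_magnitude(frame_a, frame_b, 0.5)
```

At m = 0.5 each output magnitude is the geometric mean of the two inputs (here sqrt(1·4) = 2 in both bins), while the phases remain those of `frame_a`.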
2. Full complex (blend magnitude and phase)
mag_out = (1‑m)·mag_A + m·mag_B
phase_out = phase_A + m·Δφ where Δφ is the phase difference phase_B − phase_A, wrapped appropriately.
Both magnitude and phase are interpolated linearly. This can create more dramatic, “phasy” textures, as the phase relationships evolve continuously. Useful for sound design and complex textures.
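As a rough per-frame sketch of this mode: the guide does not spell out how the phase difference is wrapped, so the example assumes wrapping to the principal interval [−π, π), which makes the phase travel along the shorter arc. The names `wrap_phase` and `morph_full_complex` are hypothetical.

```python
import numpy as np

def wrap_phase(phi):
    """Wrap angles to the interval [-pi, pi)."""
    return (phi + np.pi) % (2.0 * np.pi) - np.pi

def morph_full_complex(spec_a, spec_b, m):
    """Linear blend of magnitude and of phase along the wrapped difference (assumed)."""
    phase_a = np.angle(spec_a)
    mag_out = (1.0 - m) * np.abs(spec_a) + m * np.abs(spec_b)
    dphi = wrap_phase(np.angle(spec_b) - phase_a)
    return mag_out * np.exp(1j * (phase_a + m * dphi))

a = np.array([1.0 + 0.0j])
b = np.array([0.0 + 3.0j])
halfway = morph_full_complex(a, b, 0.5)  # magnitude 2, phase pi/4
```

Without the wrapping step, two phases of +170° and −170° would interpolate through 0° (a 340° sweep) instead of through 180° (a 20° sweep).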
3. Formant / envelope (cepstral envelope morph)
This mode implements the classic CDP‑style formant morph:
- Extract the spectral envelope of A and B via cepstral liftering (order 60).
- Separate the fine structure (excitation) of A: fine_A = mag_A / envelope_A.
- Interpolate the envelopes in the log domain.
- Reconstruct: mag_out = fine_A × envelope_out; phase from A.
Result: the formant structure of A gradually shifts toward that of B, while the excitation (pitch, noise) remains from A. Ideal for vocal morphs, instrument timbre shifts, etc.
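The four steps above might look like the following for one frame. This is a sketch under assumptions: the guide names a `spectral_envelope()` function with `order=60`, but its actual body is not shown, so the liftering details here (rectangular lifter, `eps` floor, one-sided spectrum with an even FFT size) and the wrapper `morph_formant` are illustrative only.

```python
import numpy as np

def spectral_envelope(mag, order=60, eps=1e-10):
    """Smooth envelope of a one-sided magnitude spectrum via cepstral liftering."""
    log_mag = np.log(np.maximum(mag, eps))
    n_fft = 2 * (len(mag) - 1)
    cep = np.fft.irfft(log_mag, n=n_fft)      # real cepstrum
    lifter = np.zeros(n_fft)
    lifter[:order + 1] = 1.0                  # keep low quefrencies ...
    lifter[n_fft - order:] = 1.0              # ... and their mirror image
    return np.exp(np.fft.rfft(cep * lifter).real)

def morph_formant(spec_a, spec_b, m, order=60):
    """Envelope morphs from A toward B; fine structure and phase stay with A."""
    mag_a, mag_b = np.abs(spec_a), np.abs(spec_b)
    env_a = spectral_envelope(mag_a, order)
    env_b = spectral_envelope(mag_b, order)
    fine_a = mag_a / env_a
    env_out = np.exp((1.0 - m) * np.log(env_a) + m * np.log(env_b))
    return fine_a * env_out * np.exp(1j * np.angle(spec_a))
```

A lower order gives a smoother envelope; a higher order follows the spectrum more closely and starts to capture individual harmonics.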
Morph curves
The morph factor m(t) controls how much of B is present at time t. It varies from 0 (pure A) to 1 (pure B) according to one of three curves:
| Curve | Formula (inside morph region) | Behaviour |
|---|---|---|
| Linear | m(t) = (t‑t0)/(t1‑t0) | Constant rate of change. |
| Cosine (S‑curve) | m(t) = 0.5‑0.5·cos(π·(t‑t0)/(t1‑t0)) | Slow at start and end, fastest in the middle – natural, smooth transition. |
| Full mix | m(t) = mix_amount (constant) | Fixed blend throughout the region – no transition, just a static mix. |
Before the morph region (t < start_morph_s) the output is pure A; after end_morph_s it is pure B. The curve is drawn in the visualisation panel.
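The three curves and the behaviour outside the morph region can be combined into one function. This is a minimal sketch: the function name `morph_factor`, the curve identifiers `"linear"`, `"cosine"`, and `"full_mix"`, and the handling of a zero-length region are assumptions, not the engine's actual interface.

```python
import math

def morph_factor(t, t0, t1, curve="cosine", mix_amount=0.5):
    """Morph factor m(t): 0 before the region [t0, t1], 1 after it."""
    if t < t0:
        return 0.0
    if t > t1:
        return 1.0
    if curve == "full_mix":
        return mix_amount
    if t1 == t0:            # zero-length region: treat as an instant switch (assumed)
        return 1.0
    x = (t - t0) / (t1 - t0)
    if curve == "linear":
        return x
    if curve == "cosine":
        return 0.5 - 0.5 * math.cos(math.pi * x)
    raise ValueError(f"unknown curve: {curve}")

# Halfway through a 2 s region, both transition curves give 0.5;
# a quarter of the way in, the cosine curve lags behind the linear one.
quarter_linear = morph_factor(0.5, 0.0, 2.0, "linear")
quarter_cosine = morph_factor(0.5, 0.0, 2.0, "cosine")
```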
Parameters & defaults
Morph region
| Parameter | Range | Default | Description |
|---|---|---|---|
| Start_morph_s | 0 … end_morph_s | 0 | Time at which morph begins (0 = start of sound). |
| End_morph_s | ≥ start_morph_s | 0 (max duration) | Time at which morph ends. 0 means the end of the longer sound. |
Curve & mix
| Parameter | Options | Default | Description |
|---|---|---|---|
| Curve_type | Linear / Cosine / Full mix | Cosine | Shape of the morph factor over time. |
| Mix_amount | 0–1 | 0.5 | Fixed blend ratio (only used for Full mix). |
Analysis & mode
| Parameter | Range | Default | Description |
|---|---|---|---|
| Window_ms | ≥ 10 ms | 60 ms | STFT analysis window length. Larger = better frequency resolution, worse time resolution. |
| Morph_mode | Log mag / Full complex / Formant | Log mag | Which spectral components to interpolate. |
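The Window_ms trade-off can be made concrete with a little arithmetic. The sketch below assumes the FFT size equals the window length (no zero-padding), which the guide does not confirm; the helper names are hypothetical.

```python
def window_samples(window_ms, sample_rate):
    """Window length in samples for a given duration in milliseconds."""
    return int(round(window_ms * sample_rate / 1000.0))

def bin_spacing_hz(n_samples, sample_rate):
    """Spacing between FFT bins when the FFT size equals the window length."""
    return sample_rate / n_samples

sr = 48000
n_tonal = window_samples(60, sr)       # 2880 samples
n_perc = window_samples(25, sr)        # 1200 samples
df_tonal = bin_spacing_hz(n_tonal, sr) # about 16.7 Hz
df_perc = bin_spacing_hz(n_perc, sr)   # 40 Hz
```

Under this assumption the bin spacing is simply 1000 / Window_ms in hertz, so the 60 ms default separates partials about 17 Hz apart while each frame spans 60 ms of signal.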
Output
| Parameter | Default | Description |
|---|---|---|
| Draw_visualization | yes | Show 8‑panel visualisation (waveforms, spectrograms, morph curve, summary). |
| Play_output | yes | Auto‑play after processing. |
Visualization (8‑panel Praat picture)
When Draw_visualization is enabled, the script draws a comprehensive multi‑panel plot:
- Waveforms A and B – side‑by‑side, colour‑coded (blue for A, red for B).
- Morph curve – a large central panel showing the morph factor m(t) over time, with the morph region highlighted. The curve is drawn according to the selected type, and labels show A (left) and B (right).
- Spectrograms A and B – 0–8 kHz, side‑by‑side.
- Output waveform – green, with duration displayed.
- Output spectrogram – 0–8 kHz.
- Summary panel – source names, durations, mode, curve, window, morph region, preset.
FAQ / troubleshooting
Install: pip install numpy soundfile. If sample rates differ, also pip install scipy.
On Windows, the script uses py (Python launcher).
The engine time‑aligns the two sounds to the longer of the two. If A is 3 s and B is 5 s, the output will be 5 s. The extra time after A’s end is pure B (since morph factor = 1).
The full‑complex mode (phase interpolation) can produce phase artefacts. Try the log‑magnitude mode, which preserves A’s phase and is generally cleaner. If you need phase morphing, reduce the window size to improve time resolution.
The cepstral lifter order is fixed at 60. This is suitable for speech and most musical sounds. For extremely low‑pitched sounds, you may want to increase the order – edit the order=60 argument in spectral_envelope() inside the Python script.
If A and B have different numbers of channels, the engine duplicates the last channel to match the higher count. For example, if A is mono and B is stereo, A is treated as two identical channels. This ensures a consistent output channel count.
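A channel-matching step of this kind could be sketched as below, assuming audio is held in the (frames, channels) layout that soundfile returns, with mono as a 1‑D array. The function name `match_channels` is hypothetical.

```python
import numpy as np

def match_channels(x, n_channels):
    """Pad x to n_channels columns by repeating its last channel."""
    if x.ndim == 1:                 # mono arrives as a 1-D array
        x = x[:, np.newaxis]
    missing = n_channels - x.shape[1]
    if missing <= 0:
        return x
    extra = np.repeat(x[:, -1:], missing, axis=1)
    return np.concatenate([x, extra], axis=1)

mono = np.array([0.1, -0.2, 0.3, 0.0])
stereo = match_channels(mono, 2)    # two identical columns
```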