Bimodal Contour Grammar — v0.2 User Guide
A generative contour method: builds melodic shape from phrase‑level gesture primitives (Onset → Nucleus → Coda). The same abstract contour is rendered in two modalities: as sound (via pitch resynthesis) and as image (coloured pitch contour).
What this does
Bimodal Contour Grammar generates a musical contour by recursively expanding a simple formal grammar. The grammar is applied in a loop, building a sequence of phrases (each composed of an onset, nucleus, and coda). For every generated “point” (time, pitch), two actions occur simultaneously:
- Audio mode: the point is added to a PitchTier, which is later used to resynthesize the original sound via PSOLA (pitch‑synchronous overlap‑add).
- Visual mode: the point is drawn in the Praat picture window as a line segment or dot, with colour and thickness/size modulated by intensity and pitch class.
All grammar procedures (phrase, onset, nucleus, coda) call the addPoint procedure, which simultaneously updates the PitchTier and draws on the screen.
This ensures perfect synchronisation between the visual contour and the resulting audio.
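The core idea can be sketched in a few lines of Python (the actual script is written in Praat's scripting language; `pitch_tier`, `canvas`, and the `size` scaling here are illustrative stand‑ins, not the script's real data structures):

```python
# Minimal sketch of the bimodal principle: one addPoint call
# feeds both the audio representation and the visual rendering.
pitch_tier = []   # (time, Hz) pairs, later used for resynthesis
canvas = []       # drawing commands for the picture window

def add_point(time_s, freq_hz, intensity_db):
    # Audio modality: record the pitch point for resynthesis.
    pitch_tier.append((time_s, freq_hz))
    # Visual modality: record a mark whose size tracks intensity
    # (the /10 scaling is an arbitrary illustrative choice).
    canvas.append({"t": time_s, "hz": freq_hz, "size": intensity_db / 10})

add_point(0.0, 120.0, 70.0)
add_point(0.1, 150.0, 65.0)
```

Because both modalities are updated inside the same call, neither can drift out of step with the other.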
The method – generative contour grammar
📐 What the method is
This is a generative contour method: instead of starting from notes, scales, or a fixed melody, it starts from a small set of pitch gestures and combines them into larger units using a grammar.
The basic idea:
- a contour is built from phrases
- each phrase has a beginning, middle, and ending
- each of those parts is chosen from a small inventory of gesture types
- the same generated contour is then used in two modalities: as sound (via pitch resynthesis) and as image (via a drawn contour)
That is why it is bimodal: one abstract contour structure is realized both audibly and visually.
The underlying model
The method treats melodic shape as a kind of syntax. Instead of saying “compose note by note,” it says:
- a whole contour can be recursively expanded into phrases
- a phrase can be expanded into functional components
- each component can be realized by one of several gesture types
So the contour is not arbitrary. It is rule‑governed, but still variable. In this sense, it is close to:
- generative linguistics, where sentences are built from phrase rules
- gesture‑based music theory, where shape matters more than exact pitch content
- procedural composition, where structure is generated from reusable primitives
Quick start
- In Praat, select exactly one Sound object (any duration). This sound provides the base timbre – the grammar will impose a new pitch contour on it.
- Run script… → Bimodal_Contour_Grammar.praat.
- Adjust parameters:
- randomSeed – 0 = auto‑seed from current time; any other number = fixed seed for reproducible results.
- Image_width / height – dimensions of the visual output in the Praat picture window.
- colorScheme – Pitch+Loudness Rainbow, PitchClass+Loudness Wheel, Intensity Heatmap, Octave Spiral.
- lineStyle – Thin continuous line, Thickness varies with loudness, Dots with size varies with loudness.
- baseLoudness / loudnessVariation – controls intensity modulation of the generated contour.
- Click OK. The script runs the grammar, draws the contour, resynthesizes the sound via PSOLA, and creates a new Sound object named originalname_grammar.
The resynthesis relies on Praat's Manipulation object and PSOLA. The source sound must have a reasonable duration (≥ 0.5 s) and contain some voiced content for the pitch‑shifting to work naturally. The grammar generates pitch values in Hz (typically 50–200 Hz); these are converted to MIDI note numbers for visualisation.
Grammar rules
📐 Formal grammar
S → Phrase + S (recursive, terminates when near target duration)
Phrase → Onset + Nucleus + Coda
Each phrase generates a small melodic gesture, then adds a short breath pause (0.05–0.1 s) before the next phrase.
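The recursive rule S → Phrase + S is naturally implemented as a loop that keeps appending phrases until the target duration is nearly reached. A Python sketch under the durations given in the terminal‑rule table (the real script is in Praat; `random.Random` stands in for its seeded LCG):

```python
import random

def generate_contour(target_s, seed=1):
    rng = random.Random(seed)          # stand-in for the script's seeded LCG
    events, t = [], 0.0
    # S -> Phrase + S: emit phrases until near the target duration
    while t < target_s - 0.2:
        # Phrase -> Onset + Nucleus + Coda, with durations drawn from
        # the ranges in the terminal-rule table
        for part, lo, hi in (("Onset", 0.05, 0.15),
                             ("Nucleus", 0.2, 0.5),
                             ("Coda", 0.1, 0.2)):
            d = rng.uniform(lo, hi)
            events.append((part, t, d))   # (label, start time, duration)
            t += d
        t += rng.uniform(0.05, 0.1)       # breath pause between phrases
    return events

contour = generate_contour(2.0, seed=42)
```

The same seed always reproduces the same phrase sequence, which is the property the script's reproducible‑seed option relies on.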
Terminal rules
| Rule | Description | Parameters (random) |
|---|---|---|
| Onset | Initial rise into the phrase | duration 0.05–0.15 s; chooses either JumpUp (step) or GlissandoUp (continuous rise). |
| Nucleus | Sustained / wobbling core | duration 0.2–0.5 s; chooses either Plateau (flat) or Wobble (small random steps). |
| Coda | Final fall | duration 0.1–0.2 s; chooses either Fall (20–40 Hz drop) or DeepDrop (50–80 Hz drop). |
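As an illustration of how a terminal rule is instantiated, here is a Python sketch of the Coda rule using the drop ranges from the table. The 50/50 split between Fall and DeepDrop is an assumption; the guide does not state the rule‑selection probabilities:

```python
import random

def realize_coda(rng, pitch_hz):
    # Coda: always downward. Fall (20-40 Hz drop) or DeepDrop (50-80 Hz
    # drop), per the terminal-rule table; equal odds assumed here.
    if rng.random() < 0.5:
        drop = rng.uniform(20, 40)     # Fall
    else:
        drop = rng.uniform(50, 80)     # DeepDrop
    return pitch_hz - drop

rng = random.Random(3)
end_pitch = realize_coda(rng, 150.0)   # somewhere between 70 and 130 Hz
```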
The grammar uses a seeded random number generator (LCG) to ensure reproducibility. All random choices (duration, pitch offsets, rule selection) are derived from this seed.
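A linear congruential generator of the kind the script uses can be sketched as follows. The multiplier and increment here are the standard Numerical‑Recipes constants, chosen for illustration; the constants in the actual Praat script are not documented in this guide:

```python
class LCG:
    """Minimal linear congruential generator. The constants are an
    assumption (Numerical Recipes values), not the script's own."""
    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def next_uniform(self):
        # state := (a * state + c) mod 2^32, then scale to [0, 1)
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state / 2**32

    def uniform(self, lo, hi):
        return lo + (hi - lo) * self.next_uniform()

# Two generators with the same seed produce identical sequences,
# which is what makes a fixed randomSeed reproducible.
a, b = LCG(7), LCG(7)
same = [a.uniform(50, 200) for _ in range(5)] == [b.uniform(50, 200) for _ in range(5)]
```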
What counts as a phrase
A phrase has three functional zones, forming a classic arch of activation → sustain → release:
jump or glide | plateau or wobble | fall or deep drop
1. Onset – the entry gesture
Its role is to establish motion and direction. In this method, the onset tends to rise – either by a jump upward or by a gliding ascent. So the phrase begins with activation or lift.
2. Nucleus – the main body
Its role is to stabilize or elaborate the pitch region reached in the onset. The nucleus is either a plateau (sustained pitch) or a wobble (local oscillation around a centre). This gives the phrase its central identity: steadiness or internal activity.
3. Coda – the release gesture
Its role is to close the phrase. In this method, the coda is always downward – either a moderate fall or a deep drop. Thus each phrase follows a broad shape of upward activation → sustain/fluctuation → downward closure. This is the real compositional core of the method.
Bimodal primitives – the addPoint procedure
- Convert freq_Hz to a MIDI note number for the visual Y‑axis.
- Add the point to the PitchTier (for audio resynthesis).
- Look up the intensity at that time from the source's Intensity object.
- Map intensity to brightness / size.
- Choose a colour based on colorScheme.
- Draw a line from the previous point to the current point (or a dot) with the appropriate colour, thickness, and style.
This single procedure ensures that every grammar event has a dual representation: a sound‑modifying pitch point and a visual mark. The visual styles are applied live as the grammar runs.
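The first step, converting Hz to a MIDI note number, uses the standard equal‑temperament formula (MIDI 69 = A4 = 440 Hz, 12 semitones per octave):

```python
import math

def hz_to_midi(freq_hz):
    # Standard conversion: MIDI 69 is A4 = 440 Hz; 12 semitones per octave.
    return 69 + 12 * math.log2(freq_hz / 440.0)

# Each halving of frequency lowers the MIDI number by exactly 12 (one octave),
# e.g. 440 Hz -> 69.0 (A4) and 220 Hz -> 57.0 (A3).
```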
Theoretical claim – what the method embodies
The strongest claim behind this method is:
“Contour can be treated as a medium‑independent formal structure.”
That is, a contour is not just something heard in melody or seen in graphics; it is an abstract organizational pattern that can be generated once and rendered in more than one modality.
What is being composed: notes or shapes?
Primarily shapes. The method is not about choosing exact pitches first, but about choosing contour archetypes and then instantiating them numerically. The important musical object is a profile like “rise quickly, hover, descend strongly” – privileging direction, curvature, registral behavior, and phrase morphology over discrete note succession.
Why randomness is used
Randomness here is not the method itself; it is a way to produce variation within constraints. The grammar specifies allowable kinds of motion; random choice determines which variant occurs, how large the rise or fall is, how long each segment lasts, and where the local fluctuations go. The result is outputs that are recognizably from the same family, but never exactly the same – balancing coherence with novelty.
What “bimodal” adds conceptually
The most distinctive part is that the contour is treated as a shared abstract object across two sensory domains. The same form – height over time, intensity over time, segmentation into phrase units – is rendered as pitch movement in audio and as line/color/point behavior in graphics. This makes the method a kind of cross‑modal mapping, where pitch height maps to vertical position or color, intensity maps to brightness/thickness/dot size, and phrase segmentation appears both as heard articulation and visible shape.
What it is not
It is not primarily harmonic composition, tonal voice‑leading, motivic development in the classical sense, or analysis of an existing melody. It does not infer deep structure from the source sound; instead, it imposes a newly generated contour structure on the sound. Methodologically, it is closer to contour synthesis, gesture grammar, and procedural phrase generation than to transcription or conventional composition.
The method in one sentence
This method treats melodic contour as a grammar of phrase‑level gestures and realizes the resulting structure simultaneously as sound and image, making contour itself the central compositional object.
Parameters & defaults
Random seed
| Parameter | Range | Default | Description |
|---|---|---|---|
| randomSeed | integer (0 = auto) | 0 | 0 uses current time for randomness; any other number gives a fixed seed for reproducible results. |
Visual display
| Parameter | Range | Default | Description |
|---|---|---|---|
| Image_width | any positive | 1000 | Width of the Praat picture in hundredths of an inch (so 1000 = 10 inches). |
| Image_height | any positive | 500 | Height in hundredths of an inch. |
Color & style
| Parameter | Options | Default | Description |
|---|---|---|---|
| colorScheme | Pitch+Loudness Rainbow / PitchClass+Loudness Wheel / Intensity Heatmap / Octave Spiral | Rainbow | How to map pitch and intensity to RGB colour. |
| lineStyle | Thin continuous line / Thickness varies with loudness / Dots with size varies | Thin continuous | Visual representation of the contour. |
| minDotSize / maxDotSize | positive | 1.0 / 4.0 | Range of dot sizes when lineStyle = Dots. |
Audio synthesis
| Parameter | Range | Default | Description |
|---|---|---|---|
| baseLoudness | any (dB scale) | 70 | Reference intensity for the source sound; used to modulate brightness/size. |
| loudnessVariation | any | 10 | Variation around baseLoudness for the grammar’s internal intensity. |
Display options
| Parameter | Default | Description |
|---|---|---|
| showGrid | yes | Draw horizontal lines for each MIDI note (with octave lines emphasised). |
| showNoteLabels | yes | Label the leftmost side with note names (C4, etc.). |
| playAfterGeneration | yes | Auto‑play the resynthesised sound. |
Visual styles – colour schemes
| Scheme | Description |
|---|---|
| Pitch+Loudness Rainbow | Hue cycles from low (violet) to high (red); brightness modulated by intensity. |
| PitchClass+Loudness Wheel | Pitch class (C, C#, …) determines hue; octave ignored. Intensity modulates brightness. |
| Intensity Heatmap | Colour maps intensity to a blue‑green‑red heat scale (blue = quiet, red = loud). |
| Octave Spiral | Pitch class determines hue, but higher octaves have increased brightness, creating a spiral effect. |
The line styles affect how the contour is drawn: continuous lines (with optional thickness variation) or discrete dots. The dot size maps intensity linearly between minDotSize and maxDotSize.
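The linear intensity‑to‑size mapping can be sketched as below. The normalization window of baseLoudness ± loudnessVariation is an assumption about how the script scales intensity; the guide only states that the mapping is linear between minDotSize and maxDotSize:

```python
def dot_size(intensity_db, min_size=1.0, max_size=4.0,
             base_db=70.0, variation_db=10.0):
    # Normalize intensity to [0, 1] across baseLoudness +/- loudnessVariation
    # (this normalization window is an assumption), then map linearly.
    x = (intensity_db - (base_db - variation_db)) / (2 * variation_db)
    x = min(max(x, 0.0), 1.0)          # clamp so out-of-range dB stays valid
    return min_size + x * (max_size - min_size)
```

With the defaults, an intensity at baseLoudness (70 dB) lands in the middle of the size range, and anything outside the ±10 dB window is clamped to the minimum or maximum dot size.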
FAQ / troubleshooting
The script uses Praat’s Manipulation and Replace pitch tier. If the source sound is very short or has no pitch (e.g., noise), PSOLA may not work well. Try using a sound with clear pitched content.
Check that Image_width and Image_height are not too small. Also ensure that the grammar actually generated points – the Info window shows “Total points generated”. If 0, the grammar may have terminated immediately (possible if source sound is too short).
The script uses a custom LCG (randomUniform) seeded with the user‑supplied value. If you change any parameter (e.g., colour scheme, line style), the grammar itself is unaffected – only visual mapping changes. The pitch contour remains the same. If you want exact audio reproduction, keep the same seed and do not change the source sound (the grammar uses the source’s duration to set the target time).
The main generate_contour procedure runs a while loop: while current_time < targetTime - 0.2: @phrase. This means the grammar stops 0.2 s before the end of the source. The final part of the source is left unchanged (or may be silent in the resynthesis if no pitch points are added).
The grammar generates pitch values roughly between 50 and 200 Hz, but these can occasionally exceed the MIDI range (40–100) used for visualisation. The visual axes are fixed to 40–100 MIDI; points outside this range are drawn but may be clipped. You can adjust the visual range by editing currentMidiMin/Max in the script.