Bimodal Contour Grammar — v0.2 User Guide
A generative contour method: builds melodic shape from phrase‑level gesture primitives (Onset → Nucleus → Coda). The same abstract contour is rendered in two modalities: as sound (via pitch resynthesis) and as image (coloured pitch contour).
What this does
Bimodal Contour Grammar generates a musical contour by recursively expanding a simple formal grammar. The grammar is applied in a loop, building a sequence of phrases (each composed of an onset, nucleus, and coda). For every generated “point” (time, pitch), two actions occur simultaneously:
- Audio mode: the point is added to a PitchTier, which is later used to resynthesize the original sound via PSOLA (pitch‑synchronous overlap‑add).
- Visual mode: the point is drawn in the Praat picture window as a line segment or dot, with colour and thickness/size modulated by intensity and pitch class.
All grammar procedures (phrase, onset, nucleus, coda) call the addPoint procedure, which simultaneously updates the PitchTier and draws on the screen.
This ensures perfect synchronisation between the visual contour and the resulting audio.
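The core idea can be sketched in a few lines of Python (the actual script is written in Praat's scripting language; `pitch_tier`, `canvas`, and the `size` scaling here are illustrative stand‑ins, not the script's real data structures):

```python
# Minimal sketch of the bimodal principle: one addPoint call
# feeds both the audio representation and the visual rendering.
pitch_tier = []   # (time, Hz) pairs, later used for resynthesis
canvas = []       # drawing commands for the picture window

def add_point(time_s, freq_hz, intensity_db):
    # Audio modality: record the pitch point for resynthesis.
    pitch_tier.append((time_s, freq_hz))
    # Visual modality: record a mark whose size tracks intensity
    # (the /10 scaling is an arbitrary illustrative choice).
    canvas.append({"t": time_s, "hz": freq_hz, "size": intensity_db / 10})

add_point(0.0, 120.0, 70.0)
add_point(0.1, 150.0, 65.0)
```

Because both modalities are updated inside the same call, neither can drift out of step with the other.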
The method – generative contour grammar
📐 What the method is
This is a generative contour method: instead of starting from notes, scales, or a fixed melody, it starts from a small set of pitch gestures and combines them into larger units using a grammar.
The basic idea:
- a contour is built from phrases
- each phrase has a beginning, middle, and ending
- each of those parts is chosen from a small inventory of gesture types
- the same generated contour is then used in two modalities: as sound (via pitch resynthesis) and as image (via a drawn contour)
That is why it is bimodal: one abstract contour structure is realized both audibly and visually.
The underlying model
The method treats melodic shape as a kind of syntax. Instead of saying “compose note by note,” it says:
- a whole contour can be recursively expanded into phrases
- a phrase can be expanded into functional components
- each component can be realized by one of several gesture types
So the contour is not arbitrary. It is rule‑governed, but still variable. In this sense, it is close to:
- generative linguistics, where sentences are built from phrase rules
- gesture‑based music theory, where shape matters more than exact pitch content
- procedural composition, where structure is generated from reusable primitives
Quick start
- In Praat, select exactly one Sound object (any duration). This sound provides the base timbre – the grammar will impose a new pitch contour on it.
- Run script… → Bimodal_Contour_Grammar.praat.
- Adjust parameters:
- randomSeed – 0 = auto‑seed from current time; any other number = fixed seed for reproducible results.
- Image_width / height – dimensions of the visual output in the Praat picture window.
- colorScheme – Pitch+Loudness Rainbow, PitchClass+Loudness Wheel, Intensity Heatmap, Octave Spiral.
- lineStyle – Thin continuous line, Thickness varies with loudness, Dots with size varies with loudness.
- baseLoudness / loudnessVariation – controls intensity modulation of the generated contour.
- Click OK. The script runs the grammar, draws the contour, resynthesizes the sound via PSOLA, and creates a new Sound object named originalname_grammar.
The resynthesis relies on Praat's Manipulation object and PSOLA. The source sound must have a reasonable duration (≥ 0.5 s) and contain some voiced content for the pitch‑shifting to work naturally. The grammar generates pitch values in Hz (typically 50–200 Hz); these are converted to MIDI note numbers for visualisation.
Grammar rules
📐 Formal grammar
S → Phrase + S (recursive, terminates when near target duration)
Phrase → Onset + Nucleus + Coda
Each phrase generates a small melodic gesture, then adds a short breath pause (0.05–0.1 s) before the next phrase.
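The recursive rule S → Phrase + S is naturally implemented as a loop that keeps appending phrases until the target duration is nearly reached. A Python sketch under the durations given in the terminal‑rule table (the real script is in Praat; `random.Random` stands in for its seeded LCG):

```python
import random

def generate_contour(target_s, seed=1):
    rng = random.Random(seed)          # stand-in for the script's seeded LCG
    events, t = [], 0.0
    # S -> Phrase + S: emit phrases until near the target duration
    while t < target_s - 0.2:
        # Phrase -> Onset + Nucleus + Coda, with durations drawn from
        # the ranges in the terminal-rule table
        for part, lo, hi in (("Onset", 0.05, 0.15),
                             ("Nucleus", 0.2, 0.5),
                             ("Coda", 0.1, 0.2)):
            d = rng.uniform(lo, hi)
            events.append((part, t, d))   # (label, start time, duration)
            t += d
        t += rng.uniform(0.05, 0.1)       # breath pause between phrases
    return events

contour = generate_contour(2.0, seed=42)
```

The same seed always reproduces the same phrase sequence, which is the property the script's reproducible‑seed option relies on.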
Terminal rules
| Rule | Description | Parameters (random) |
|---|---|---|
| Onset | Initial rise into the phrase | duration 0.05–0.15 s; chooses either JumpUp (step) or GlissandoUp (continuous rise). |
| Nucleus | Sustained / wobbling core | duration 0.2–0.5 s; chooses either Plateau (flat) or Wobble (small random steps). |
| Coda | Final fall | duration 0.1–0.2 s; chooses either Fall (20–40 Hz drop) or DeepDrop (50–80 Hz drop). |
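As an illustration of how a terminal rule is instantiated, here is a Python sketch of the Coda rule using the drop ranges from the table. The 50/50 split between Fall and DeepDrop is an assumption; the guide does not state the rule‑selection probabilities:

```python
import random

def realize_coda(rng, pitch_hz):
    # Coda: always downward. Fall (20-40 Hz drop) or DeepDrop (50-80 Hz
    # drop), per the terminal-rule table; equal odds assumed here.
    if rng.random() < 0.5:
        drop = rng.uniform(20, 40)     # Fall
    else:
        drop = rng.uniform(50, 80)     # DeepDrop
    return pitch_hz - drop

rng = random.Random(3)
end_pitch = realize_coda(rng, 150.0)   # somewhere between 70 and 130 Hz
```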
The grammar uses a seeded random number generator (LCG) to ensure reproducibility. All random choices (duration, pitch offsets, rule selection) are derived from this seed.
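A linear congruential generator of the kind the script uses can be sketched as follows. The multiplier and increment here are the standard Numerical‑Recipes constants, chosen for illustration; the constants in the actual Praat script are not documented in this guide:

```python
class LCG:
    """Minimal linear congruential generator. The constants are an
    assumption (Numerical Recipes values), not the script's own."""
    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def next_uniform(self):
        # state := (a * state + c) mod 2^32, then scale to [0, 1)
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state / 2**32

    def uniform(self, lo, hi):
        return lo + (hi - lo) * self.next_uniform()

# Two generators with the same seed produce identical sequences,
# which is what makes a fixed randomSeed reproducible.
a, b = LCG(7), LCG(7)
same = [a.uniform(50, 200) for _ in range(5)] == [b.uniform(50, 200) for _ in range(5)]
```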
What counts as a phrase
A phrase has three functional zones, forming a classic arch of activation → sustain → release:
jump or glide | plateau or wobble | fall or deep drop
1. Onset – the entry gesture
Its role is to establish motion and direction. In this method, the onset tends to rise – either by a jump upward or by a gliding ascent. So the phrase begins with activation or lift.
2. Nucleus – the main body
Its role is to stabilize or elaborate the pitch region reached in the onset. The nucleus is either a plateau (sustained pitch) or a wobble (local oscillation around a centre). This gives the phrase its central identity: steadiness or internal activity.
3. Coda – the release gesture
Its role is to close the phrase. In this method, the coda is always downward – either a moderate fall or a deep drop. Thus each phrase follows a broad shape of upward activation → sustain/fluctuation → downward closure. This is the real compositional core of the method.
Bimodal primitives – the addPoint procedure
- Convert freq_Hz to a MIDI note number for the visual Y‑axis.
- Add the point to the PitchTier (for audio resynthesis).
- Look up the intensity at that time from the source's Intensity object.
- Map intensity to brightness / size.
- Choose a colour based on colorScheme.
- Draw a line from the previous point to the current point (or a dot) with the appropriate colour, thickness, and style.
This single procedure ensures that every grammar event has a dual representation: a sound‑modifying pitch point and a visual mark. The visual styles are applied live as the grammar runs.
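The first step, converting Hz to a MIDI note number, uses the standard equal‑temperament formula (MIDI 69 = A4 = 440 Hz, 12 semitones per octave):

```python
import math

def hz_to_midi(freq_hz):
    # Standard conversion: MIDI 69 is A4 = 440 Hz; 12 semitones per octave.
    return 69 + 12 * math.log2(freq_hz / 440.0)

# Each halving of frequency lowers the MIDI number by exactly 12 (one octave),
# e.g. 440 Hz -> 69.0 (A4) and 220 Hz -> 57.0 (A3).
```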
Theoretical claim – what the method embodies
The strongest claim behind this method is:
“Contour can be treated as a medium‑independent formal structure.”
That is, a contour is not just something heard in melody or seen in graphics; it is an abstract organizational pattern that can be generated once and rendered in more than one modality.
What is being composed: notes or shapes?
Primarily shapes. The method is not about choosing exact pitches first, but about choosing contour archetypes and then instantiating them numerically. The important musical object is a profile like “rise quickly, hover, descend strongly” – privileging direction, curvature, registral behavior, and phrase morphology over discrete note succession.
Why randomness is used
Randomness here is not the method itself; it is a way to produce variation within constraints. The grammar specifies allowable kinds of motion; random choice determines which variant occurs, how large the rise or fall is, how long each segment lasts, and where the local fluctuations go. The result is outputs that are recognizably from the same family, but never exactly the same – balancing coherence with novelty.
What “bimodal” adds conceptually
The most distinctive part is that the contour is treated as a shared abstract object across two sensory domains. The same form – height over time, intensity over time, segmentation into phrase units – is rendered as pitch movement in audio and as line/color/point behavior in graphics. This makes the method a kind of cross‑modal mapping, where pitch height maps to vertical position or color, intensity maps to brightness/thickness/dot size, and phrase segmentation appears both as heard articulation and visible shape.
What it is not
It is not primarily harmonic composition, tonal voice‑leading, motivic development in the classical sense, or analysis of an existing melody. It does not infer deep structure from the source sound; instead, it imposes a newly generated contour structure on the sound. Methodologically, it is closer to contour synthesis, gesture grammar, and procedural phrase generation than to transcription or conventional composition.
The method in one sentence
This method treats melodic contour as a grammar of phrase‑level gestures and realizes the resulting structure simultaneously as sound and image, making contour itself the central compositional object.
Parameters & defaults
Random seed
| Parameter | Range | Default | Description |
|---|---|---|---|
| randomSeed | integer (0 = auto) | 0 | 0 uses current time for randomness; any other number gives a fixed seed for reproducible results. |
Visual display
| Parameter | Range | Default | Description |
|---|---|---|---|
| Image_width | any positive | 1000 | Width of the Praat picture in hundredths of an inch (so 1000 = 10 inches). |
| Image_height | any positive | 500 | Height in hundredths of an inch. |
Color & style
| Parameter | Options | Default | Description |
|---|---|---|---|
| colorScheme | Pitch+Loudness Rainbow / PitchClass+Loudness Wheel / Intensity Heatmap / Octave Spiral | Rainbow | How to map pitch and intensity to RGB colour. |
| lineStyle | Thin continuous line / Thickness varies with loudness / Dots with size varies | Thin continuous | Visual representation of the contour. |
| minDotSize / maxDotSize | positive | 1.0 / 4.0 | Range of dot sizes when lineStyle = Dots. |
Audio synthesis
| Parameter | Range | Default | Description |
|---|---|---|---|
| baseLoudness | any (dB scale) | 70 | Reference intensity for the source sound; used to modulate brightness/size. |
| loudnessVariation | any | 10 | Variation around baseLoudness for the grammar’s internal intensity. |
Display options
| Parameter | Default | Description |
|---|---|---|
| showGrid | yes | Draw horizontal lines for each MIDI note (with octave lines emphasised). |
| showNoteLabels | yes | Label the leftmost side with note names (C4, etc.). |
| playAfterGeneration | yes | Auto‑play the resynthesised sound. |
Visual styles – colour schemes
| Scheme | Description |
|---|---|
| Pitch+Loudness Rainbow | Hue cycles from low (violet) to high (red); brightness modulated by intensity. |
| PitchClass+Loudness Wheel | Pitch class (C, C#, …) determines hue; octave ignored. Intensity modulates brightness. |
| Intensity Heatmap | Colour maps intensity to a blue‑green‑red heat scale (blue = quiet, red = loud). |
| Octave Spiral | Pitch class determines hue, but higher octaves have increased brightness, creating a spiral effect. |
The line styles affect how the contour is drawn: continuous lines (with optional thickness variation) or discrete dots. The dot size maps intensity linearly between minDotSize and maxDotSize.
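The linear intensity‑to‑size mapping can be sketched as below. The normalization window of baseLoudness ± loudnessVariation is an assumption about how the script scales intensity; the guide only states that the mapping is linear between minDotSize and maxDotSize:

```python
def dot_size(intensity_db, min_size=1.0, max_size=4.0,
             base_db=70.0, variation_db=10.0):
    # Normalize intensity to [0, 1] across baseLoudness +/- loudnessVariation
    # (this normalization window is an assumption), then map linearly.
    x = (intensity_db - (base_db - variation_db)) / (2 * variation_db)
    x = min(max(x, 0.0), 1.0)          # clamp so out-of-range dB stays valid
    return min_size + x * (max_size - min_size)
```

With the defaults, an intensity at baseLoudness (70 dB) lands in the middle of the size range, and anything outside the ±10 dB window is clamped to the minimum or maximum dot size.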
FAQ / troubleshooting
The script uses Praat’s Manipulation and Replace pitch tier. If the source sound is very short or has no pitch (e.g., noise), PSOLA may not work well. Try using a sound with clear pitched content.
Check that Image_width and Image_height are not too small. Also ensure that the grammar actually generated points – the Info window shows “Total points generated”. If 0, the grammar may have terminated immediately (possible if source sound is too short).
The script uses a custom LCG (randomUniform) seeded with the user‑supplied value. If you change any parameter (e.g., colour scheme, line style), the grammar itself is unaffected – only visual mapping changes. The pitch contour remains the same. If you want exact audio reproduction, keep the same seed and do not change the source sound (the grammar uses the source’s duration to set the target time).
The main generate_contour procedure runs a while loop: while current_time < targetTime - 0.2: @phrase. This means the grammar stops 0.2 s before the end of the source. The final part of the source is left unchanged (or may be silent in the resynthesis if no pitch points are added).
The grammar generates pitch values roughly between 50 and 200 Hz, but these can occasionally exceed the MIDI range (40–100) used for visualisation. The visual axes are fixed to 40–100 MIDI; points outside this range are drawn but may be clipped. You can adjust the visual range by editing currentMidiMin/Max in the script.