OT Grammar Learning from Audio — v1.0 User Guide

Learns a melodic well‑formedness grammar from audio using error‑driven constraint ranking. Implements two classic OT learning algorithms: the Gradual Learning Algorithm (GLA) and Recursive Constraint Demotion (RCD). GEN (candidate generator) operates in two modes: Neighbor‑GEN (single file) or Pair‑corpus (good/bad folders).

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.0 (2026)
License: MIT License
Repo: GitHub

What this does

OT Grammar Learning from Audio implements two classic error‑driven learning algorithms from Optimality Theory (OT): the Gradual Learning Algorithm (GLA) and Recursive Constraint Demotion (RCD). The system extracts a melody from audio, generates candidate pairs (winner/loser), and learns a ranking of 15 melodic well‑formedness constraints.

Optimality Theory (OT) basics: In OT, a grammar is a ranking of violable constraints. A candidate is the most harmonic (optimal) if it incurs the fewest violations of the highest‑ranked constraint that distinguishes candidates. Learning is error‑driven: when the learner predicts the wrong winner, it promotes constraints that favour the correct winner and demotes constraints that favour the incorrect loser.
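
To make strict domination concrete, here is a minimal Python sketch (for illustration only; the script itself is written in Praat). The function name `prefers` and the violation counts are hypothetical. It walks down the ranking and lets the first constraint on which the two candidates differ decide.

```python
from typing import Dict, List

def prefers(ranking: List[str], a: Dict[str, int], b: Dict[str, int]) -> int:
    """Return -1 if candidate a is more harmonic, 1 if b is, 0 if they tie."""
    for constraint in ranking:  # highest-ranked constraint first
        if a[constraint] != b[constraint]:
            # The first constraint that distinguishes the candidates decides.
            return -1 if a[constraint] < b[constraint] else 1
    return 0

# Hypothetical violation counts for two candidate melodies.
ranking = ["END-ON-TONIC", "*LEAP", "*REPEAT"]
cand_a = {"END-ON-TONIC": 0, "*LEAP": 2, "*REPEAT": 1}
cand_b = {"END-ON-TONIC": 1, "*LEAP": 0, "*REPEAT": 0}

print(prefers(ranking, cand_a, cand_b))  # -1
```

Candidate a wins despite its additional *LEAP and *REPEAT violations, because lower-ranked constraints cannot outvote END-ON-TONIC.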

The constraint set is fixed (15 constraints, see table below). The script does not discover new constraints – it learns which constraints are ranked higher or lower based on the observed melody (Neighbor‑GEN) or a corpus of good/bad examples (Pair‑corpus).

Quick start

  1. In Praat, select exactly one Sound object (mono, voiced melody).
  2. Run script…OT_Grammar_Learning_from_Audio.praat.
  3. Choose Instrument (affects pitch‑tracker settings) and Scale (for quantisation).
  4. Select GEN_mode:
    • Neighbor-GEN – learns from a single file (the winner). Losers are generated by perturbing one note at a time, by ±1 or ±2 semitones with the default Perturbation_max_semitones of 2.
    • Pair-corpus – learns from a folder containing good/ and bad/ subfolders. Every good vs every bad melody becomes a winner/loser pair.
  5. Choose Algorithm:
    • GLA – stochastic, handles variation, outputs continuous ranking values.
    • RCD – deterministic, outputs a stratified ranking (strict domination).
  6. Adjust learning parameters (iterations, plasticity, noise for GLA).
  7. Click OK. The script extracts the melody, generates pairs, runs the learner, and outputs a ranked constraint list in the Info window, plus a Table object and visualisation.
Tip: Start with Neighbor-GEN and GLA (2000 iterations, plasticity 0.5, eval noise 2.0) to see how the grammar explains why the given melody is locally optimal against its neighbours. For a full stylistic grammar, use Pair-corpus with a folder of good and bad examples.
Important: This script is entirely in Praat – no Python required. Pitch extraction uses Praat’s To Pitch (ac). The constraint set is fixed (15 constraints). The output is a ranking (not a generative model).

The 15 melodic constraints

Violation counting: Constraints 1–8, 11, 12, 14 and 15 are markedness constraints (they penalise what they describe). Constraints 9, 10 and 13 are positive well‑formedness constraints: a violation is counted when the condition is NOT met. A lower violation count means a more harmonic candidate.
Index | Name | Type | Description
1 | *LEAP | Markedness (span) | Penalise interval > 5 semitones (excluding tritone).
2 | *TRITONE | Markedness (span) | Penalise interval == 6 semitones.
3 | *NONSTEP | Markedness (span) | Penalise interval > 2 semitones.
4 | *REPEAT | Markedness (span) | Penalise interval == 0 (repeated note).
5 | *SEMITONE | Markedness (span) | Penalise interval == 1.
6 | *WIDE-RANGE | Markedness (global) | Penalise total range > 12 semitones.
7 | *NARROW-RANGE | Markedness (global) | Penalise total range < 7 semitones.
8 | *NON-SCALE | Markedness (global) | Penalise notes outside the selected scale.
9 | CADENCE | Positive (well‑formedness) | Penalise when the final two notes do NOT form a cadential approach to tonic.
10 | END-ON-TONIC | Positive (well‑formedness) | Penalise when the last note is not the tonic.
11 | *PEAK-EARLY | Markedness (contour) | Penalise when the highest note occurs in the first half.
12 | *PEAK-LATE | Markedness (contour) | Penalise when the highest note occurs in the second half.
13 | ARC-SHAPE | Positive (contour) | Penalise when the contour is NOT approximately single‑peaked.
14 | *DIR-CHANGE | Markedness (contour) | Penalise many direction changes (> n/3).
15 | *MONOTONIC | Markedness (contour) | Penalise when >85% of motion is in one direction.
Constraint short names (used in visualisation):
LEAP, TRIT, NSTEP, REPEAT, SEMI, WIDE, NARROW, NSCL, CAD, END, PKEAR, PKLAT, ARC, DIRCH, MONO
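
As an illustration of how the span and range constraints can be counted, here is a minimal Python sketch (not the script's Praat code). It assumes the melody is a list of MIDI note numbers, that intervals are measured as absolute semitone distances, that each offending interval counts as one violation, and that the two range constraints are binary. The function name `interval_violations` is hypothetical.

```python
from typing import Dict, List

def interval_violations(pitches: List[int]) -> Dict[str, int]:
    """Count violations of constraints 1-7 for a melody of MIDI note numbers."""
    # Absolute interval (in semitones) between each pair of adjacent notes.
    steps = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    total_range = max(pitches) - min(pitches)
    return {
        "*LEAP": sum(1 for s in steps if s > 5 and s != 6),  # tritone excluded
        "*TRITONE": sum(1 for s in steps if s == 6),
        "*NONSTEP": sum(1 for s in steps if s > 2),
        "*REPEAT": sum(1 for s in steps if s == 0),
        "*SEMITONE": sum(1 for s in steps if s == 1),
        "*WIDE-RANGE": int(total_range > 12),   # assumed binary
        "*NARROW-RANGE": int(total_range < 7),  # assumed binary
    }

melody = [60, 62, 64, 65, 67, 67, 74, 68, 60]
violations = interval_violations(melody)
print(violations["*LEAP"], violations["*TRITONE"], violations["*NONSTEP"])  # 2 1 3
```

Note that a single interval can violate several constraints at once: a leap of 7 semitones counts against both *LEAP and *NONSTEP.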

GEN – candidate generation

📐 Neighbor-GEN (single file)

The extracted melody is the winner. Losers are generated by perturbing one note at a time by every offset from −perturbation_max_semitones to +perturbation_max_semitones (excluding 0), and each perturbed melody is evaluated against the constraints. This creates up to nWinner × 2 × perturbation_max_semitones pairs, where nWinner is the number of notes in the winner. The inductive bias is that the source melody is already locally optimal; the learner discovers which constraints explain why its neighbours are worse.
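
The perturbation scheme can be sketched in a few lines of Python (illustrative only; the script itself is Praat, and `neighbor_gen` is a hypothetical name). The melody is assumed to be a list of MIDI note numbers.

```python
from typing import List

def neighbor_gen(winner: List[int], max_semitones: int = 2) -> List[List[int]]:
    """Return every melody that differs from the winner in exactly one note."""
    losers = []
    for i in range(len(winner)):
        for delta in range(-max_semitones, max_semitones + 1):
            if delta == 0:
                continue  # an offset of 0 would reproduce the winner
            neighbour = list(winner)
            neighbour[i] += delta
            losers.append(neighbour)
    return losers

winner = [60, 62, 64, 62, 60]
losers = neighbor_gen(winner, max_semitones=2)
print(len(losers))  # 5 notes x 2 x 2 offsets = 20
```

This sketch always returns the maximum number of neighbours. The guide says "up to", so the script presumably discards some candidates; which ones is not specified here.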

📁 Pair-corpus (good/bad folders)

The user selects a folder containing good/ and bad/ subfolders. Every file in good/ is a winner; every file in bad/ is a loser. The script extracts the melody from each file, computes violation vectors, and builds all good × bad pairs. This is the linguistically orthodox setup – it produces a real stylistic grammar that distinguishes well‑formed from ill‑formed melodies.

Informative pairs (where winner and loser have different violation profiles) are the only ones that drive learning. The script reports the number of informative pairs.
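
A minimal Python sketch of the pairing and filtering step follows (illustrative only, not the script's Praat code). It assumes each melody has already been reduced to a violation vector, one count per constraint; the helper names and the example vectors are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Vector = Tuple[int, ...]  # one violation count per constraint

def build_pairs(good: List[Vector], bad: List[Vector]) -> List[Tuple[Vector, Vector]]:
    """Pair every winner (good) with every loser (bad)."""
    return list(product(good, bad))

def informative(pairs: List[Tuple[Vector, Vector]]) -> List[Tuple[Vector, Vector]]:
    """Keep only pairs whose violation profiles differ on some constraint."""
    return [(w, l) for w, l in pairs if w != l]

good = [(0, 1, 0), (0, 0, 0)]
bad = [(0, 1, 0), (2, 0, 1), (1, 1, 0)]

pairs = build_pairs(good, bad)
useful = informative(pairs)
print(len(pairs), len(useful))  # 6 5
```

One of the six pairs is dropped because the winner and loser have the same profile, so no ranking could tell them apart.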

GLA (Gradual Learning Algorithm)

Boersma (1997), Boersma & Hayes (2001)
Each constraint has a continuous ranking value. Evaluation‑time noise (Gaussian) is added to produce a stochastic ranking. On each iteration:
  1. Pick a random pair (winner, loser).
  2. Add noise to each ranking value.
  3. Find the highest‑ranked constraint on which winner and loser differ.
  4. If that constraint prefers the loser (winner has more violations), it’s an error.
  5. On an error, promote the constraints that prefer the winner and demote the constraints that prefer the loser, each by plasticity. If the prediction was correct, leave the ranking values unchanged.
GLA learns continuous ranking values and can handle variation (if the same pair is encountered multiple times, ranking values converge to a range that predicts the observed frequencies). Evaluation noise at test time turns ranking values into a probability distribution over rankings.
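
The iteration steps above can be sketched in Python as follows (an illustration under stated assumptions, not the script's Praat implementation). Each candidate is assumed to be a dict of violation counts; the function names, the `seed` argument and the example pair are hypothetical. `prob_outranks` applies the standard result that the difference of two independent Gaussian noise terms with standard deviation σ has standard deviation σ√2.

```python
import math
import random
from typing import Dict, List, Tuple

Violations = Dict[str, int]
Pair = Tuple[Violations, Violations]  # (winner, loser)

def gla(pairs: List[Pair], constraints: List[str], iterations: int = 2000,
        plasticity: float = 0.5, noise: float = 2.0,
        initial: float = 100.0, seed: int = 0) -> Dict[str, float]:
    rng = random.Random(seed)
    rank = {c: initial for c in constraints}
    for _ in range(iterations):
        winner, loser = rng.choice(pairs)                       # step 1
        noisy = {c: rank[c] + rng.gauss(0.0, noise) for c in constraints}  # step 2
        ordered = sorted(constraints, key=noisy.get, reverse=True)
        decisive = next((c for c in ordered if winner[c] != loser[c]), None)  # step 3
        if decisive is None or winner[decisive] < loser[decisive]:
            continue  # uninformative pair, or the grammar already picks the winner
        for c in constraints:                                   # steps 4-5: error
            if winner[c] < loser[c]:
                rank[c] += plasticity   # prefers the winner: promote
            elif winner[c] > loser[c]:
                rank[c] -= plasticity   # prefers the loser: demote
    return rank

def prob_outranks(r1: float, r2: float, noise: float) -> float:
    """Probability that constraint 1 outranks constraint 2 at evaluation time."""
    return 0.5 * (1.0 + math.erf((r1 - r2) / (2.0 * noise)))

constraints = ["*LEAP", "*REPEAT"]
pairs = [({"*LEAP": 0, "*REPEAT": 1}, {"*LEAP": 1, "*REPEAT": 0})]
rank = gla(pairs, constraints)
print(rank["*LEAP"] > rank["*REPEAT"])  # True
```

Because every update here promotes *LEAP and demotes *REPEAT by the same amount, the two values drift apart until evaluation noise rarely reverses them, at which point errors (and updates) become rare.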

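Parameters: GLA_iterations, GLA_plasticity, GLA_eval_noise and GLA_initial_ranking_value (ranges and defaults are listed under Parameters & defaults).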

RCD (Recursive Constraint Demotion)

Tesar & Smolensky (1993, 2000)
Classical OT: constraints are strictly ordered into strata. RCD builds strata greedily:
  1. Find all constraints that prefer the winner in at least one unresolved pair and never prefer the loser in any unresolved pair.
  2. Place them in the current stratum.
  3. Mark as resolved all pairs that are decided by any constraint in this stratum.
  4. Repeat.
If the data is consistent with a strict domination hierarchy, RCD converges to a stratified ranking. If not, it places as many constraints as possible and leaves the rest in the bottom stratum.

RCD is deterministic, fast, and produces a ranking that is guaranteed to be compatible with all winner/loser pairs if one exists. The output is a stratum number per constraint (higher stratum = higher rank in the hierarchy). The visualisation converts strata to ranking values for display.
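
The stratum-building loop can be sketched in Python like this (illustrative only; the script itself is Praat). Candidates are assumed to be dicts of violation counts, and the sketch returns strata as lists ordered from the top of the hierarchy down, which may differ from the stratum numbering the script reports. The function name `rcd` and the example pairs are hypothetical.

```python
from typing import Dict, List, Tuple

Violations = Dict[str, int]
Pair = Tuple[Violations, Violations]  # (winner, loser)

def rcd(pairs: List[Pair], constraints: List[str]) -> List[List[str]]:
    """Return strata, highest-ranked first, following the steps described above."""
    unresolved = list(pairs)
    remaining = list(constraints)
    strata: List[List[str]] = []
    while remaining and unresolved:
        # Step 1: prefers the winner somewhere, never prefers the loser.
        stratum = [c for c in remaining
                   if any(w[c] < l[c] for w, l in unresolved)
                   and not any(w[c] > l[c] for w, l in unresolved)]
        if not stratum:
            break  # no strict ranking accounts for the remaining pairs
        strata.append(stratum)                                  # step 2
        remaining = [c for c in remaining if c not in stratum]
        # Step 3: a pair is resolved once some placed constraint prefers its winner.
        unresolved = [(w, l) for w, l in unresolved
                      if not any(w[c] < l[c] for c in stratum)]
    if remaining:
        strata.append(remaining)  # everything left goes to the bottom stratum
    return strata

constraints = ["END-ON-TONIC", "*LEAP", "*REPEAT"]
pairs = [
    ({"END-ON-TONIC": 0, "*LEAP": 1, "*REPEAT": 0},
     {"END-ON-TONIC": 1, "*LEAP": 0, "*REPEAT": 0}),
    ({"END-ON-TONIC": 0, "*LEAP": 0, "*REPEAT": 1},
     {"END-ON-TONIC": 0, "*LEAP": 1, "*REPEAT": 0}),
]
print(rcd(pairs, constraints))
# [['END-ON-TONIC'], ['*LEAP'], ['*REPEAT']]
```

In the example, *LEAP cannot be placed in the first round because it prefers the loser of the first pair; once END-ON-TONIC resolves that pair, *LEAP becomes eligible.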

Parameters & defaults

Instrument & scale

Parameter | Options | Default | Description
Instrument | Violin / Vocal / Guitar / Flute / Piano / Other | Violin | Affects pitch‑tracker floor/ceiling, time step, voicing threshold.
Scale | C major / G major / … / A minor (harmonic) / Chromatic | C major | Scale for quantisation (if enabled).
Quantize_to_scale | yes/no | yes | Snap extracted pitches to the selected scale.
Min_note_duration_ms | 20–500 | 80 | Minimum duration for a detected note (ms).

GEN

Parameter | Options | Default | Description
GEN_mode | Neighbor-GEN / Pair-corpus | Neighbor-GEN | How winner/loser pairs are generated.
Perturbation_max_semitones | 1–6 | 2 | For Neighbor-GEN: max semitone change when perturbing a note.

Learning algorithm

Parameter | Range | Default | Description
Algorithm | GLA / RCD | GLA | Which learning algorithm to use.
GLA_iterations | 100–10000 | 2000 | Number of learning steps (GLA only).
GLA_plasticity | 0.1–2.0 | 0.5 | Step size for promotion/demotion.
GLA_eval_noise | 0.5–5.0 | 2.0 | Standard deviation of evaluation noise.
GLA_initial_ranking_value | 10–500 | 100 | Initial ranking value for all constraints.

Output

Parameter | Default | Description
Show_visualization | yes | Draw piano roll, ranking bar chart, learning curve, tableau, and summary.
Create_output_table | yes | Create a Praat Table object with the final ranking.

Visualization (Praat picture)

When Show_visualization is enabled (yes), the script draws a 5‑panel figure: piano roll, ranking bar chart, learning curve, tableau, and summary.

Tip: The tableau panel is the key to understanding why the learner promotes or demotes constraints. If a constraint shows “L” (loser preferred), that constraint is demoted in GLA (or placed in a lower stratum in RCD).

FAQ / troubleshooting

“Could not extract enough notes”

Adjust Min_note_duration_ms (lower for faster melodies) or change Instrument (e.g., Vocal vs. Violin). The pitch tracker may also need different minPitch/maxPitch – these are hardcoded per instrument but can be edited in the script.

No informative pairs / error rate stuck at 1.0

If all generated pairs have identical violation profiles, the learner has nothing to learn. For Neighbor‑GEN, increase Perturbation_max_semitones (e.g., to 4). For Pair‑corpus, ensure that good and bad melodies actually differ in their constraint violations.

GLA error rate does not converge

Try reducing GLA_plasticity (e.g., to 0.2) or increasing GLA_iterations (e.g., to 5000). The learning curve in the visualisation shows the error rate over time – if it’s still noisy, the data may be inconsistent or the noise level may be too high.

RCD cannot resolve all pairs

If the data is inconsistent with a strict domination hierarchy, RCD will place remaining constraints in the bottom stratum. The visualisation reports how many pairs were resolved. This is not an error – it simply means no single strict ranking can account for all pairs.

Scale quantisation

When Quantize_to_scale is on, extracted pitches are snapped to the nearest note in the selected scale (octave‑aware). This ensures that the melody is analysed in the intended tonal framework. For atonal or microtonal material, turn quantisation off.
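
Octave‑aware snapping can be sketched in Python as follows (illustrative only, not the script's Praat code). It assumes pitches are expressed as (possibly fractional) MIDI note numbers and that ties go to the lower note; both the tie rule and the name `snap_to_scale` are assumptions.

```python
from typing import Sequence

C_MAJOR = (0, 2, 4, 5, 7, 9, 11)  # pitch classes of the scale

def snap_to_scale(pitch: float, pitch_classes: Sequence[int] = C_MAJOR) -> int:
    """Return the scale note nearest to pitch, searching neighbouring octaves too."""
    base = int(pitch // 12) * 12  # start of the octave containing the pitch
    candidates = [base + shift + pc
                  for shift in (-12, 0, 12)
                  for pc in pitch_classes]
    # Nearest candidate wins; on a tie the lower note is chosen (assumed rule).
    return min(candidates, key=lambda n: (abs(n - pitch), n))

extracted = [60, 63.4, 66.2, 59.8]
quantised = [snap_to_scale(p) for p in extracted]
print(quantised)  # [60, 64, 67, 60]
```

Searching the octaves above and below matters for pitches near an octave boundary: 59.8 snaps up to 60 rather than down to 59.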