Corpus Mosaic — v1.2 User Guide

Offline corpus‑based mosaic synthesis. Reconstructs a target Sound object by splicing together tiny grains of audio from a folder of corpus sounds. Matching is performed via a Python engine using a 6‑dimensional feature vector (Loudness, Centroid, Flatness, Rolloff, ZCR, Pitch). Includes repetition penalties, continuity bonuses, and a new silence gate to skip low‑energy target grains.

Author: Shai Cohen
Affiliation: Department of Music, Bar‑Ilan University, Israel
Version: 1.2 (2026) – Silence Gate
License: MIT License
Repo: GitHub

What this does

Corpus Mosaic is an offline concatenative synthesizer. It takes a target sound and a folder of corpus sounds, slices both into grains of equal duration (with overlap), extracts a 6‑dimensional feature vector for each grain, and then rebuilds the target by selecting the best‑matching corpus grain for every target grain. The selection process can be influenced by repetition penalties (to avoid reusing the same grain too often), continuity bonuses (to favour consecutive grains from the same source file), and a randomness factor.

v1.2 adds a silence gate: target grains whose RMS energy falls below a threshold (in dB relative to the peak RMS) are left silent in the output instead of being matched and rendered. This saves processing time and prevents noisy matches for silent regions.

Quick start

  1. In Praat, select exactly one Sound object (the target).
  2. Run script…CorpusMosaic.praat.
  3. In the directory chooser, select a folder containing corpus sounds (WAV, FLAC, AIFF). The script will scan recursively.
  4. In the form, set:
    • Grain_size_ms – duration of each grain (e.g., 100 ms).
    • Overlap_percent – overlap between consecutive grains (0–99 %).
    • Pitch_weight, Timbre_weight, Loudness_weight – relative importance of each feature group.
    • Top_k_candidates – number of best‑matching grains to consider.
    • Randomness – 0 = always choose the best match; 1 = uniform random among top‑k.
    • Continuity_preference – bonus for choosing the next grain from the same source file.
    • Repetition_penalty – penalty for reusing a grain recently.
    • Gate_threshold_dB – silence gate level (e.g., ‑40 dB). Grains below this are not rendered.
  5. Click OK. Python scans the corpus, extracts features, computes distances, and renders the mosaic as originalname_mosaic.
Tip: Start with weights all 1.0, top‑k=5, randomness=0.5, continuity=0.3, penalty=1.5, gate=‑40 dB. Adjust weights to emphasise pitch (for melodic targets) or timbre (for texture).
Important: Python dependencies: numpy, scipy, soundfile, librosa. The engine uses librosa for feature extraction (YIN pitch, spectral features). For large corpora, feature extraction may take a minute or two.

Pipeline — five stages

Stage 1 – Source processing (Python) – load target, slice into overlapping grains, extract 6‑dim features.
Stage 2 – Corpus scanning – recursively scan corpus folder, load each file, slice into grains, extract features, store metadata (file path, start sample).
Stage 3 – Normalisation & weighting – robust z‑score normalisation across all corpus grains. Apply user‑supplied weights to the squared‑distance calculation.
Stage 4 – Matching & rendering – for each target grain, compute distances to all corpus grains, apply penalties/bonuses, select grain (deterministic or stochastic), place into output buffer with Hann window.
Stage 5 – Silence gate – checked for each target grain before matching: if the grain's RMS is below the threshold, matching and rendering are skipped (the CSV gets a placeholder row, but no audio is written).

The 6‑dimensional feature vector

| Feature | Description | Associated weight |
|---|---|---|
| RMS | Root‑mean‑square energy (loudness) | Loudness_weight |
| Centroid | Spectral centroid (brightness) | Timbre_weight |
| Flatness | Spectral flatness (tonal vs. noisy) | Timbre_weight |
| Rolloff | Frequency below which 85 % of energy lies | Timbre_weight |
| ZCR | Zero‑crossing rate (noisiness) | Timbre_weight |
| Pitch (F0) | Fundamental frequency from YIN algorithm (0 = unvoiced) | Pitch_weight |
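As a rough illustration of what the five non‑pitch features measure, the sketch below computes them for a single grain with plain NumPy. This is not the engine's code: the guide states that the engine uses librosa, so the function name `grain_features`, the Hann analysis window, and the exact formulas here are assumptions chosen for clarity. Pitch (YIN) is left out.

```python
import numpy as np

def grain_features(grain, sr, rolloff_fraction=0.85):
    """Return [rms, centroid_hz, flatness, rolloff_hz, zcr] for one grain.

    Illustrative only: the real engine relies on librosa, and pitch is omitted.
    """
    grain = np.asarray(grain, dtype=float)
    eps = 1e-12  # guards against division by zero and log(0)

    # RMS: root-mean-square energy (loudness)
    rms = np.sqrt(np.mean(grain ** 2))

    # Magnitude and power spectrum of the Hann-windowed grain (assumed window)
    mag = np.abs(np.fft.rfft(grain * np.hanning(len(grain))))
    freqs = np.fft.rfftfreq(len(grain), d=1.0 / sr)
    power = mag ** 2

    # Centroid: magnitude-weighted mean frequency (brightness)
    centroid = np.sum(freqs * mag) / (np.sum(mag) + eps)

    # Flatness: geometric mean / arithmetic mean of the power spectrum
    flatness = np.exp(np.mean(np.log(power + eps))) / (np.mean(power) + eps)

    # Rolloff: lowest bin frequency at which cumulative energy reaches 85 %
    cumulative = np.cumsum(power)
    idx = np.searchsorted(cumulative, rolloff_fraction * cumulative[-1])
    rolloff = freqs[min(idx, len(freqs) - 1)]

    # ZCR: fraction of adjacent sample pairs whose sign differs
    zcr = np.mean(np.signbit(grain[:-1]) != np.signbit(grain[1:]))

    return np.array([rms, centroid, flatness, rolloff, zcr])
```

A pure tone gives a flatness close to 0 and a centroid near its frequency, while white noise gives a much higher flatness, which is why these features fall under Timbre_weight.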

Features are normalised using robust z‑score (mean and standard deviation) across the entire corpus. The weighted squared Euclidean distance is then computed between each target grain and all corpus grains.
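The normalisation and distance step can be sketched as follows. The column order, the helper names, and the choice to multiply each squared difference by its weight are assumptions for illustration. The sketch uses the corpus mean and standard deviation as stated above, and its only safeguard is against zero variance; any further robustness measures in the engine are not documented here.

```python
import numpy as np

# Assumed column order: RMS, Centroid, Flatness, Rolloff, ZCR, Pitch
def feature_weights(loudness_w=1.0, timbre_w=1.0, pitch_w=1.0):
    """Expand the three user weights to one weight per feature column."""
    return np.array([loudness_w, timbre_w, timbre_w, timbre_w, timbre_w, pitch_w])

def normalise(corpus_feats, target_feats):
    """Z-score both sets using statistics taken from the corpus only."""
    mean = corpus_feats.mean(axis=0)
    std = corpus_feats.std(axis=0)
    std = np.where(std < 1e-9, 1.0, std)  # avoid dividing by zero for constant features
    return (corpus_feats - mean) / std, (target_feats - mean) / std

def weighted_sq_distances(target_vec, corpus_norm, weights):
    """Weighted squared Euclidean distance from one target grain to every corpus grain."""
    diff = corpus_norm - target_vec
    return np.sum(weights * diff ** 2, axis=1)
```

Setting a weight to 0 removes that feature group from the match entirely; for example, `feature_weights(pitch_w=0.0)` makes two grains that differ only in F0 equally good candidates.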

Parameters & defaults

Grain & overlap

| Parameter | Range | Default | Description |
|---|---|---|---|
| Grain_size_ms | ≥10 ms | 100 ms | Duration of each analysis/synthesis grain. |
| Overlap_percent | 0–99 % | 50 % | Overlap between consecutive grains (hop = grain × (1 − overlap/100)). |
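To make the relation between grain size, overlap, and hop concrete, here is a minimal slicing sketch. The function name `slice_grains` and the decision to drop a trailing partial grain are assumptions; the engine may pad the last grain instead.

```python
import numpy as np

def slice_grains(signal, sr, grain_size_ms=100.0, overlap_percent=50.0):
    """Return (grains, starts): equal-length grains and their start samples."""
    grain_len = int(round(sr * grain_size_ms / 1000.0))
    # hop = grain x (1 - overlap/100), never smaller than one sample
    hop = max(1, int(round(grain_len * (1.0 - overlap_percent / 100.0))))
    if len(signal) < grain_len:
        # Shorter than one grain: nothing can be extracted
        return np.empty((0, grain_len)), np.empty(0, dtype=int)
    # Assumption: a trailing partial grain is dropped rather than zero-padded
    starts = np.arange(0, len(signal) - grain_len + 1, hop)
    grains = np.stack([signal[s:s + grain_len] for s in starts])
    return grains, starts
```

With the defaults at a 44.1 kHz sample rate, a grain is 4410 samples and the hop is 2205 samples, so roughly 20 grains start in every second of audio.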

Feature weights

| Parameter | Range | Default | Description |
|---|---|---|---|
| Pitch_weight | ≥0 | 1.0 | Weight for F0 feature. |
| Timbre_weight | ≥0 | 1.0 | Weight for centroid, flatness, rolloff, ZCR. |
| Loudness_weight | ≥0 | 1.0 | Weight for RMS. |

Selection control

| Parameter | Range | Default | Description |
|---|---|---|---|
| Top_k_candidates | ≥1 | 5 | Number of best‑matching grains to consider (for stochastic selection). |
| Randomness | 0–1 | 0.5 | 0 = always pick the best; 1 = uniform random among top‑k. |
| Continuity_preference | ≥0 | 0.3 | Bonus subtracted from distance when the next grain in the same file follows the previous chosen grain. |
| Repetition_penalty | ≥0 | 1.5 | Penalty added to distance for grains used recently (memory ≈ 2 s). |
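The guide defines only the two endpoints of Randomness (0 = best match, 1 = uniform among the top‑k). One simple way to interpolate between them is sketched below: with probability `randomness` the grain is drawn uniformly from the top‑k, otherwise the best match is taken. This mixing rule and the name `select_grain` are assumptions; the engine may use a different scheme for intermediate values.

```python
import numpy as np

def select_grain(distances, top_k=5, randomness=0.5, rng=None):
    """Pick a corpus grain index from the top_k smallest distances."""
    if rng is None:
        rng = np.random.default_rng()
    distances = np.asarray(distances, dtype=float)
    k = max(1, min(top_k, len(distances)))
    candidates = np.argsort(distances)[:k]  # indices of the k best matches, best first
    # Assumed interpolation: uniform draw with probability `randomness`, else best match
    if rng.random() < randomness:
        return int(rng.choice(candidates))
    return int(candidates[0])
```

Penalties and bonuses would be applied to `distances` before this call, so they influence both which grains enter the top‑k and which one counts as the best.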

Silence gate (v1.2)

| Parameter | Range | Default | Description |
|---|---|---|---|
| Gate_threshold_dB | ≤ -10 dB | -40 dB | Grains in the target whose RMS is below this level (relative to peak) are replaced by silence and not rendered. This speeds up processing and prevents noisy matches for silent regions. |

Output

| Parameter | Default | Description |
|---|---|---|
| Normalize_output | yes | Scale output peak to 0.95 after rendering. |
| Draw_visualization | yes | Show waveforms, spectrograms, corpus usage strip, distance trace, and summary. |
| Play_result | yes | Auto‑play after processing. |

Silence gate (v1.2)

🎚️ How the gate works

The engine first computes the RMS of every target grain and takes the largest value as peak_RMS. The linear threshold is then gate_linear = peak_RMS × 10^(gate_db/20); at -40 dB, for example, the threshold is 1 % of the peak RMS. Any grain whose RMS falls below gate_linear is not rendered, and the output is silent at that position.

  • This speeds up processing because no matching is performed for silent grains (they are skipped early).
  • It also prevents the engine from trying to match very quiet (often noisy) regions with inappropriate corpus grains.
  • The CSV path file still contains a row for each gated grain, with corpus_file = "__SILENCE__" and distance 0.0.
  • The stats report includes a line Silenced grains (Gated): X.

Recommended setting: -40 dB is a safe starting point. For very dynamic material, you may want a higher threshold (e.g., -30 dB) to silence more of the quiet parts.
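The gate test itself is small enough to sketch directly from the formula above. The function name `silence_mask` and the 2‑D grain array layout (one grain per row) are assumptions for illustration.

```python
import numpy as np

def silence_mask(target_grains, gate_db=-40.0):
    """Return a boolean array: True where a target grain should be gated (left silent)."""
    rms = np.sqrt(np.mean(np.square(target_grains), axis=1))
    peak_rms = rms.max()
    # gate_linear = peak_RMS x 10^(gate_db / 20)
    gate_linear = peak_rms * 10.0 ** (gate_db / 20.0)
    return rms < gate_linear
```

Raising the threshold from -40 dB to -30 dB moves the cut‑off from 1 % to about 3.2 % of the peak RMS, so more of the quiet grains are silenced.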

Visualization (Praat picture)

When Draw_visualization = 1, the script draws a multi‑panel plot with waveforms, spectrograms, a corpus usage strip, a distance trace, and a summary.

Tip: The corpus usage strip gives an immediate overview of which source files contributed most. The distance trace shows how well the mosaic matches the target – spikes indicate poor matches (high distance).

FAQ / troubleshooting

“Python not found” or missing packages

Install: pip install numpy scipy soundfile librosa. On Windows, the script calls Python through the py launcher.
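Before running the script, a quick way to see which of the four packages are missing is a check like the one below, run with the same Python interpreter that Praat will call. The helper name `missing_packages` is hypothetical and not part of Corpus Mosaic.

```python
import importlib.util

# Package names taken from the dependency list in this guide
REQUIRED = ("numpy", "scipy", "soundfile", "librosa")

def missing_packages(names=REQUIRED):
    """Return the names of packages that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```

If a package is reported missing even after installation, pip probably installed it for a different interpreter than the one being checked.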

No corpus grains extracted / output silent

Check the Info window for “Corpus grains available”. If 0, the engine could not extract features from any corpus file. Ensure files are readable and longer than one grain. Also check the Gate_threshold_dB – if it's too high, many target grains may be silenced. Reduce it (e.g., to -50 dB) to include more grains.

Mosaic sounds choppy / has artefacts

The overlap‑add synthesis uses a Hann window; if grains are too short or overlap too low, artefacts may appear. Increase Overlap_percent (e.g., to 75 %) for smoother transitions. Also ensure Grain_size_ms is not too small (≥50 ms recommended).
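The effect of overlap on smoothness can be checked with a small overlap‑add sketch that also returns the summed window envelope. The function name `overlap_add` and the use of a periodic Hann window are assumptions; whether the engine compensates for the window sum is not stated in this guide.

```python
import numpy as np

def overlap_add(grains, hop):
    """Overlap-add Hann-windowed grains; return (output, summed window envelope)."""
    n_grains, grain_len = grains.shape
    # Periodic Hann window (assumed), which sums to a constant at 50 % and 75 % overlap
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(grain_len) / grain_len)
    out = np.zeros(hop * (n_grains - 1) + grain_len)
    envelope = np.zeros_like(out)
    for i, grain in enumerate(grains):
        start = i * hop
        out[start:start + grain_len] += grain * window
        envelope[start:start + grain_len] += window
    return out, envelope
```

Away from the edges, the envelope is constant at 1.0 for 50 % overlap and 2.0 for 75 % overlap. At 0 % overlap the windows do not overlap at all, so the envelope dips to zero at every grain boundary, which is heard as amplitude modulation.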

Repetition penalty memory

The penalty is applied to grains that have been used within the last ~2 seconds (calculated as penalty_memory = max(1, SR/hop * 2)). This prevents the same grain from being reused too frequently, encouraging variety.
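A minimal sketch of the memory length and the penalty step is shown below. The memory formula follows the one quoted above, with hop measured in samples; the helper names, the use of a fixed‑length deque, and the flat additive penalty applied once per remembered grain are assumptions.

```python
from collections import deque
import numpy as np

def penalty_memory_length(sr, hop):
    """Number of recent selections to remember: about 2 seconds' worth of grains."""
    return max(1, int(sr / hop * 2))

def apply_repetition_penalty(distances, recent, penalty=1.5):
    """Return a copy of distances with `penalty` added to recently used grains."""
    out = np.array(distances, dtype=float)
    for idx in set(recent):  # assumption: penalise each remembered grain once
        out[idx] += penalty
    return out

# Usage: a deque with maxlen forgets the oldest selection automatically
recent = deque(maxlen=penalty_memory_length(44100, 2205))
recent.append(7)  # index of the corpus grain chosen for the previous target grain
```

At 44.1 kHz with a 2205‑sample hop (100 ms grains, 50 % overlap) the memory holds the last 40 selections.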

Continuity bonus

If the grain that immediately follows the previously chosen grain in its source file is among the candidates, its distance is reduced by continuity_preference × max_dist_estimate. This encourages long runs from the same source file, reducing “choppiness”.
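The bonus can be sketched as below. The metadata arrays `file_ids` and `grain_pos` (source file and position within that file for every corpus grain) are hypothetical, and because this guide does not say how max_dist_estimate is obtained, the sketch falls back to the largest current distance.

```python
import numpy as np

def apply_continuity_bonus(distances, file_ids, grain_pos, prev_idx,
                           continuity_preference=0.3, max_dist_estimate=None):
    """Return a copy of distances with the bonus applied to the follow-on grain."""
    out = np.array(distances, dtype=float)
    if prev_idx is None:  # first target grain: nothing to continue from
        return out
    if max_dist_estimate is None:
        max_dist_estimate = out.max()  # assumption: stand-in for the engine's estimate
    # The follow-on grain: same source file, next consecutive position
    follows = (file_ids == file_ids[prev_idx]) & (grain_pos == grain_pos[prev_idx] + 1)
    out[follows] -= continuity_preference * max_dist_estimate
    return out
```

Scaling the bonus by a distance estimate keeps Continuity_preference meaningful regardless of how large the raw distances happen to be for a given corpus.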

CSV format

The CSV written by the engine (temp_mosaic_path.csv) contains columns: source_grain, source_time_sec, corpus_file, corpus_time_sec, distance. Silenced grains have corpus_file = "__SILENCE__" and distance 0.
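For post‑hoc inspection, the path file can be summarised with the standard csv module. The sketch assumes the file starts with a header row containing exactly the column names listed above; the function name `summarise_path` is hypothetical.

```python
import csv
from collections import Counter

def summarise_path(lines):
    """Count grains used per corpus file and the number of silenced grains.

    `lines` is any iterable of CSV text lines, such as an open file handle.
    """
    usage = Counter()
    silenced = 0
    for row in csv.DictReader(lines):
        if row["corpus_file"] == "__SILENCE__":
            silenced += 1
        else:
            usage[row["corpus_file"]] += 1
    return usage, silenced

# Usage (file name as given in this guide):
# with open("temp_mosaic_path.csv", newline="") as fh:
#     usage, silenced = summarise_path(fh)
#     print(usage.most_common(5), "silenced:", silenced)
```

The `silenced` count should agree with the Silenced grains (Gated) line in the stats report.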