Corpus Map — CataRT-Style Interactive Player

Picks a corpus folder, extracts acoustic descriptors for every sound file, projects the corpus to 2-D via PCA, and opens an interactive scatter-plot where clicking or hovering triggers real-time playback. Includes performance recording and auto-import back to Praat.

Author: Shai Cohen
Affiliation: Department of Music, Bar-Ilan University, Israel
Version: 2.5 (2026)
License: MIT License
Repo: https://github.com/ShaiCohen-ops/Praat-plugin_AudioTools

Contents:

What this does
Quick start
Acoustic Features
GUI & Workflow
Performance Recording
Applications

What this does

This script implements a CataRT-style corpus-based interactive player. Select a folder containing audio files (WAV, FLAC, AIFF, MP3, OGG), and the script extracts 20 acoustic descriptors per file: RMS, ZCR, spectral centroid, bandwidth, flatness, rolloff, 13 MFCCs, and median pitch. The corpus is then projected to 2D using PCA and visualised as a colour-coded scatter plot. Clicking or hovering over any point triggers playback of the corresponding sound in real time. The GUI also includes a performance recorder that captures your interactions and renders them to a stereo WAV, which is automatically imported back into Praat as a Sound object named "performance".

What is CataRT? CataRT is a classic real-time corpus-based concatenative synthesis system, developed by Diemo Schwarz and colleagues at IRCAM, for navigating sound corpora by similarity. This script recreates that experience with a PCA map in which points close together are acoustically similar: explore a folder of sounds by moving through a 2D map, trigger sounds by clicking, and record your exploration as a new composition.

Key Features:

20 acoustic descriptors — RMS, ZCR, Centroid, Bandwidth, Flatness, Rolloff, Pitch Median, 13 MFCCs
PCA projection — reduces 20D feature space to 2D for visualisation
Interactive scatter plot — click or hover to trigger sounds (real-time playback)
Performance recorder — capture every trigger event (time, unit, gain), render to stereo WAV
Auto-import to Praat — after closing the GUI, the latest performance is loaded as "performance" Sound
Filter by name — type to show only files matching a substring
Exclusive mode — stop previous sound when a new one starts
Cross-platform GUI — PyQtGraph + PySide6, dark theme, mouse wheel scroll

Technical implementation (v2.5): The script blocks while the GUI is open: Praat waits for you to close the Python window, which is what makes automatic import of the recorded performance possible. The Python engine uses librosa for feature extraction (v2.3 replaced slow pyin with fast yin for pitch tracking), scikit-learn for PCA, sounddevice for low-latency playback, and pyqtgraph for the interactive scatter plot.

Quick start

In Praat, run CorpusMap.praat.
Select a corpus folder containing audio files (recursive scan).
The script analyses all files — this may take a few minutes for large corpora.
A PyQtGraph GUI window opens with a 2D scatter plot. Each point = one sound file.
Click a point → play the sound. Hover (if enabled) → trigger on mouse over.
Use the sidebar to adjust gain, toggle exclusive mode, filter by name, or select audio device.
Click RECORD, then perform (trigger sounds). Click STOP & SAVE to render a WAV.
Close the GUI window — Praat automatically imports the recorded performance as a Sound named performance.

Quick tip: Use hover mode to quickly audition many sounds by moving your mouse across the plot. Use exclusive mode for one-shot triggering (each new sound stops the previous). The colour of each point encodes a hash of the filename, and point size is proportional to duration. Filter by name to focus on specific sounds.

Important: Python dependencies required — install with:
pip install numpy librosa sounddevice scikit-learn pyqtgraph PySide6
The script blocks Praat while the GUI is open. Close the Python window to return to Praat. The corpus folder can contain subfolders (recursive scan). Supported formats: WAV, FLAC, AIFF, MP3, OGG. For very large corpora (>1000 files), analysis may take several minutes.
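
The recursive scan can be pictured with a short sketch. This is an illustration only, not the script's own code: the function name find_audio_files and the exact suffix set (for example accepting both ".aif" and ".aiff") are assumptions.

```python
from pathlib import Path

# Extensions follow the supported-format list above; the exact suffixes the
# real script accepts (e.g. ".aif" as well as ".aiff") are an assumption.
AUDIO_EXTENSIONS = {".wav", ".flac", ".aiff", ".aif", ".mp3", ".ogg"}

def find_audio_files(corpus_dir):
    """Return all audio files under corpus_dir, including subfolders, sorted by path."""
    root = Path(corpus_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Comparing the lower-cased suffix means files such as "take.WAV" are picked up as well.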

Acoustic Features (20 dimensions)

Feature	Description	PCA weight
RMS	Root mean square amplitude (loudness proxy)	high
ZCR	Zero-crossing rate (noisiness / spectral slope)	medium
Spectral centroid	Centre of gravity of spectrum (brightness)	high
Spectral bandwidth	Spread of spectrum around centroid	medium
Spectral flatness	Tonal vs. noise-like (1 = white noise, 0 = pure tone)	medium
Spectral rolloff	Frequency below which 85% of energy lies	medium
Pitch median	Median fundamental frequency (YIN algorithm, 60–800 Hz)	high
MFCC 1–13	Mel-frequency cepstral coefficients (spectral envelope)	very high
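
To make a few of these descriptors concrete, here is a NumPy-only sketch that computes RMS, ZCR, spectral centroid and spectral flatness over a whole signal. It is not the script's librosa code: librosa works frame by frame and averages over time, and the function name basic_descriptors and the power-weighted centroid are choices made for this illustration.

```python
import numpy as np

def basic_descriptors(y, sr):
    """Whole-signal RMS, ZCR, spectral centroid and flatness for a mono signal."""
    y = np.asarray(y, dtype=float)
    rms = np.sqrt(np.mean(y ** 2))
    # Fraction of adjacent sample pairs whose sign differs.
    zcr = np.mean(np.signbit(y[1:]) != np.signbit(y[:-1]))
    power = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    # Power-weighted mean frequency (an assumption; other weightings exist).
    centroid = np.sum(freqs * power) / (np.sum(power) + 1e-12)
    # Geometric mean divided by arithmetic mean of the power spectrum.
    flatness = np.exp(np.mean(np.log(power + 1e-12))) / (np.mean(power) + 1e-12)
    return {"rms": rms, "zcr": zcr, "centroid": centroid, "flatness": flatness}

# Synthetic test signals: one second of a 440 Hz tone and of white noise.
sr = 22050
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
noise = np.random.default_rng(0).normal(0.0, 0.5, sr)
tone_d = basic_descriptors(tone, sr)
noise_d = basic_descriptors(noise, sr)
```

The tone should come out with a centroid close to 440 Hz and a flatness near 0, while the noise has a much higher flatness and zero-crossing rate, which matches the descriptions in the table.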

Feature extraction pipeline (per file):

Load full audio (multichannel preserved for playback).
Derive mono signal and take first 30 seconds (or full file if shorter).
Compute RMS, ZCR, centroid, bandwidth, flatness, rolloff via librosa.
Compute 13 MFCCs (means over time).
Compute pitch using YIN algorithm (fast, deterministic), median over voiced frames.
Standardise all features (zero mean, unit variance).
PCA reduction to 2 dimensions for visualisation.

PCA interpretation: The first principal component (x-axis) typically captures overall brightness and energy (centroid, RMS, MFCC1). The second component (y-axis) often separates tonal vs. noisy sounds (flatness, pitch stability, high MFCCs). Points close together are acoustically similar; far apart are dissimilar.
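
The standardise-then-project step can be sketched in plain NumPy. The script itself uses scikit-learn for PCA; the version below is an equivalent formulation via SVD written for illustration, with a hypothetical function name project_to_2d and random data standing in for a real corpus. Axis signs may differ from what scikit-learn returns, since the sign of a principal component is arbitrary.

```python
import numpy as np

def project_to_2d(features):
    """Standardise columns, then project rows onto the first two principal components."""
    X = np.asarray(features, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns carry no information; avoid dividing by zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    coords = Z @ Vt[:2].T
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:2]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(50, 20))   # stand-in for 50 files x 20 descriptors
coords, explained = project_to_2d(corpus)
```

The two values in explained give the fraction of total variance shown on each axis, which is a useful sanity check: if they are small, distances on the map only loosely reflect distances in the full 20-dimensional space.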

GUI & Workflow

Left sidebar

Units loaded: Number of audio files in corpus
Output device: Select audio interface, shows channel count
Gain slider: Global playback gain (0–1, linear)
Exclusive mode: Check = stop previous sound on new trigger
Trigger on hover: Check = play sound when mouse enters point
Filter by name: Type substring to show only matching files
Last triggered: Shows metadata of most recently played file
Record button: Start/stop performance recording
Panic (Esc): Stop all playback immediately
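
The name filter amounts to a substring test over the file names. A minimal sketch follows; the function name filter_units is hypothetical, and case-insensitive matching is an assumption rather than documented behaviour.

```python
def filter_units(names, query):
    """Return indices of names containing query; an empty query keeps everything."""
    needle = query.strip().lower()   # case-insensitive matching is an assumption
    return [i for i, name in enumerate(names) if needle in name.lower()]
```

Returning indices rather than names lets the caller show or hide the matching scatter points without rebuilding the plot data.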

Scatter plot

Pan: Click and drag to move view
Zoom: Mouse wheel (or Ctrl+wheel)
Click: Triggers the corresponding sound
Hover: If enabled, triggers sound on mouse over (debounced 150 ms)
Point size: Proportional to sound duration
Colour: Hashed from filename (consistent across sessions)
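
One way to get colours that stay the same across sessions is to derive them from a stable digest of the filename. Python's built-in hash() is salted per process for strings, so it would not be consistent between runs. The sketch below is an assumption about how such a mapping could look, not the script's actual scheme: the function names, the use of MD5, and the size constants are all illustrative.

```python
import colorsys
import hashlib

def colour_for(filename):
    """Map a filename to an RGB tuple (0-255) that is the same on every run."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    hue = digest[0] / 256.0  # first byte of the digest picks the hue
    r, g, b = colorsys.hsv_to_rgb(hue, 0.7, 0.95)
    return (round(r * 255), round(g * 255), round(b * 255))

def size_for(duration_s, base=6.0, per_second=2.0, max_size=30.0):
    """Point size grows linearly with duration, clamped so long files stay readable."""
    return min(base + per_second * duration_s, max_size)
```

Fixing saturation and brightness and varying only the hue keeps every point legible against a dark background.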

Hover debounce: To prevent overwhelming triggers when moving quickly across many points, hover events are debounced to 150 ms. This means you can sweep across the plot without triggering every single point.
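
A simple way to get this behaviour is a time gate that lets a hover trigger through only if at least 150 ms have passed since the last accepted one. The source does not say how the script implements its debounce, so the class below (a hypothetical HoverGate) is one possible approach, sketched under that assumption.

```python
import time

class HoverGate:
    """Allow at most one hover trigger per `interval` seconds."""

    def __init__(self, interval=0.150, clock=time.monotonic):
        self.interval = interval
        self.clock = clock      # injectable so the gate can be tested without sleeping
        self._last = None       # time of the last accepted trigger

    def allow(self):
        now = self.clock()
        if self._last is not None and now - self._last < self.interval:
            return False        # too soon after the previous trigger: ignore
        self._last = now
        return True
```

A hover handler would call gate.allow() before starting playback and skip the point when it returns False. Using a monotonic clock keeps the gate unaffected by system clock adjustments.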

Performance Recording & Auto-Import

Recording workflow (v2.5 blocking mode):

Click RECORD in the GUI — timer starts.
Perform by clicking or hovering on points. Each trigger records:
- Timestamp (seconds from start)
- Unit ID (index of sound file)
- Current gain value (from slider)
Click STOP & SAVE — renders offline mixdown:
- Each sound resampled to 44.1 kHz, converted to stereo
- Placed at its captured timestamp, scaled by gain
- Summed into a single stereo buffer
- Normalised to prevent clipping (peak ≤ 0.97)
WAV saved to <corpus>/_recordings/performance_YYYYMMDD_HHMMSS.wav
Pointer file written to temp directory: corpusmap_last_recording.txt
Close the GUI window — Praat auto-imports the latest take as Sound "performance".
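
The offline mixdown in the steps above can be sketched as follows. This is a simplified illustration, not the script's renderer: it assumes every sound has already been resampled to 44.1 kHz and converted to a stereo array, the function name render_performance is hypothetical, and scaling only when the peak exceeds 0.97 is an assumption about how the normalisation is applied.

```python
import numpy as np

SR = 44100  # render rate stated in the workflow above

def render_performance(events, units, peak=0.97):
    """Mix (time_s, unit_id, gain) events into one stereo float buffer."""
    if not events:
        return np.zeros((0, 2))
    # The buffer must be long enough for the latest-ending sound.
    ends = [int(round(t * SR)) + len(units[u]) for t, u, _ in events]
    mix = np.zeros((max(ends), 2))
    for t, u, gain in events:
        start = int(round(t * SR))
        sound = units[u]                       # shape (n_samples, 2), already at SR
        mix[start:start + len(sound)] += gain * sound
    top = np.max(np.abs(mix), initial=0.0)
    if top > peak:                             # only scale down, never boost
        mix *= peak / top
    return mix
```

Summing in floating point and normalising once at the end means overlapping sounds cannot clip, however many triggers land on the same moment.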

Fallback directory: If the corpus folder is read-only, or the _recordings subfolder cannot be created for any other reason, the rendered WAV is saved to the system temp directory instead. The pointer file always points to the actual location.
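
The fallback and pointer-file logic might look like the sketch below. The file and folder names come from the workflow above; the function names choose_recording_path and write_pointer, the pointer_dir parameter, and the exact failure checks are assumptions made for illustration.

```python
import os
import tempfile
from datetime import datetime

POINTER_NAME = "corpusmap_last_recording.txt"  # file name given in the workflow above

def choose_recording_path(corpus_dir, now=None):
    """Return a WAV path under <corpus>/_recordings, or under the temp dir on failure."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    name = f"performance_{stamp}.wav"
    target = os.path.join(corpus_dir, "_recordings")
    try:
        os.makedirs(target, exist_ok=True)
        if not os.access(target, os.W_OK):
            raise PermissionError(target)
    except OSError:
        # Read-only corpus or any other failure: fall back to the temp directory.
        target = tempfile.gettempdir()
    return os.path.join(target, name)

def write_pointer(wav_path, pointer_dir=None):
    """Record the actual WAV location so the importing side can find it."""
    pointer = os.path.join(pointer_dir or tempfile.gettempdir(), POINTER_NAME)
    with open(pointer, "w", encoding="utf-8") as fh:
        fh.write(wav_path)
    return pointer
```

Because the pointer file stores the full path that was actually used, the importing side never needs to know whether the fallback was taken.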

Why blocking mode? In versions before v2.5, the Python GUI ran as a separate process, and you had to manually run an import script. Now Praat waits for the GUI to close, then automatically imports the recorded performance. This is ideal for live performance capture: perform, close the window, and the sound appears in Praat ready for playback or further processing.

Applications

Corpus exploration and navigation

Use case: Quickly find sounds in a large library by similarity. Instead of scrolling through a list, navigate the 2D map — similar sounds cluster together.

Workflow: Point to a folder of 500+ samples. After analysis, explore by clicking: discover hidden relationships, find timbral neighbours, identify outliers.

Live performance / improvisation

Use case: Use the scatter plot as a performable instrument. Record your navigation as a composition.

Workflow: Load a corpus of your own sounds, enable exclusive mode and hover mode. Sweep across the plot like a Theremin — each sound triggers as you move. Record the performance, then import to Praat for further mixing or effects.

Sound design / sample library quality control

Use case: Visualise the distribution of a sample pack. Identify duplicates or out-of-place sounds.

Workflow: Load a sample library. Points that are far from all others may be mis-categorised or incorrectly recorded. Use filtering to isolate specific name patterns.

Music information retrieval (MIR) demonstration

Use case: Teach PCA, feature extraction, and similarity-based retrieval.

Workflow: Load a small corpus (e.g., 20 sounds from different categories). The PCA plot will visibly separate categories. Clicking each point plays its sound — students hear the correlation between position and timbre.

Workflow: Build a performance from a field recording corpus

Corpus: 100+ field recordings (birds, water, footsteps, wind, traffic).
Action: Explore the plot, click interesting sounds in a rhythmic pattern, record the performance.
Result: A unique electroacoustic composition that captures your interaction with the corpus. Import to Praat, add effects, export to video.

Workflow: Live electronic music with custom samples

Corpus: Your own drum hits, synth stabs, vocal chops.
Action: Arrange them in a folder, run Corpus Map, enable exclusive mode. Use the plot as a launchpad — each click triggers a different sound.
Result: An intuitive performance interface where spatial proximity implies similarity. No MIDI mapping required.

Troubleshooting:
• GUI crashes on hover/click: v2.3.1 fixed a numpy array truth-value bug. Make sure you're running the latest version (2.5).
• No playback / audio device error: Check that your audio interface is selected in the dropdown. If you see "callback with the return type 'void' must return None", you have an older version — update to v2.3 or later.
• Very slow analysis: v2.3 replaced pyin with yin, making pitch tracking 10–100x faster. If still slow, reduce corpus size or use shorter files.
• Performance recording is silent: Check gain slider (should be >0). Ensure you actually triggered sounds while recording (the status shows number of events).
• Praat doesn't import the performance: Make sure you close the GUI window (not just minimise). The script waits for the window to close before importing.

Technical notes on v2.5 changes

Blocking vs. non-blocking: Previously, Praat launched the GUI with runSystem_nocheck in a way that returned immediately (non-blocking), so the script continued while the GUI was still open. Praat and the GUI could run side by side, but importing the result required a separate script. Now the script calls runSystem_nocheck with a command that waits for the Python process to exit, so the script does not proceed past that line until the Python window closes. The trade-off: Praat is occupied while the GUI is open, but auto-import works seamlessly.
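
The difference between the two launch styles is easiest to see in a small Python analogy. This is not the Praat code: it uses Python's subprocess module, with a trivial child process standing in for the GUI.

```python
import subprocess
import sys

child = [sys.executable, "-c", "print('gui closed')"]  # stand-in for the GUI process

# Non-blocking: Popen returns at once; the caller must collect the result later.
proc = subprocess.Popen(child, stdout=subprocess.PIPE, text=True)
later_output, _ = proc.communicate()

# Blocking: run() returns only after the child exits, so follow-up work
# (here, reading its output) can happen on the very next line.
done = subprocess.run(child, capture_output=True, text=True)
```

In the blocking style, the step that depends on the child's result can simply follow the launch, which is the property the v2.5 auto-import relies on.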