Corpus Map — CataRT-Style Interactive Player
Picks a corpus folder, extracts acoustic descriptors for every sound file, projects the corpus to 2-D via PCA, and opens an interactive scatter-plot where clicking or hovering triggers real-time playback. Includes performance recording and auto-import back to Praat.
What this does
This script implements a CataRT-style corpus-based interactive player. Select a folder containing audio files (WAV, FLAC, AIFF, MP3, OGG), and the script extracts 20 acoustic descriptors per file: RMS, ZCR, spectral centroid, bandwidth, flatness, rolloff, 13 MFCCs, and median pitch. The corpus is then projected to 2D using PCA and visualised as a colour-coded scatter plot. Clicking or hovering over any point triggers playback of the corresponding sound in real time. The GUI also includes a performance recorder that captures your interactions and renders them to a stereo WAV, which is automatically imported back into Praat as a Sound object named "performance".
Key Features:
- 20 acoustic descriptors — RMS, ZCR, Centroid, Bandwidth, Flatness, Rolloff, Pitch Median, 13 MFCCs
- PCA projection — reduces 20D feature space to 2D for visualisation
- Interactive scatter plot — click or hover to trigger sounds (real-time playback)
- Performance recorder — capture every trigger event (time, unit, gain), render to stereo WAV
- Auto-import to Praat — after closing the GUI, the latest performance is loaded as "performance" Sound
- Filter by name — type to show only files matching a substring
- Exclusive mode — stop previous sound when a new one starts
- Cross-platform GUI — PyQtGraph + PySide6, dark theme, mouse wheel scroll
The script uses librosa for feature extraction (v2.3 replaced the slow pyin with the faster yin for pitch tracking), scikit-learn for PCA, sounddevice for low-latency playback, and pyqtgraph for the interactive scatter plot.
Quick start
- In Praat, run CorpusMap.praat.
- Select a corpus folder containing audio files (recursive scan).
- The script analyses all files — this may take a few minutes for large corpora.
- A PyQtGraph GUI window opens with a 2D scatter plot. Each point = one sound file.
- Click a point → play the sound. Hover (if enabled) → trigger on mouse over.
- Use the sidebar to adjust gain, toggle exclusive mode, filter by name, or select audio device.
- Click RECORD, then perform (trigger sounds). Click STOP & SAVE to render a WAV.
- Close the GUI window: Praat automatically imports the recorded performance as a Sound named performance.
Python dependencies:
pip install numpy librosa sounddevice scikit-learn pyqtgraph PySide6
The script blocks Praat while the GUI is open. Close the Python window to return to Praat. The corpus folder can contain subfolders (recursive scan). Supported formats: WAV, FLAC, AIFF, MP3, OGG. For very large corpora (>1000 files), analysis may take several minutes.
Acoustic Features (20 dimensions)
| Feature | Description | PCA weight |
|---|---|---|
| RMS | Root mean square amplitude (loudness proxy) | high |
| ZCR | Zero-crossing rate (noisiness / spectral slope) | medium |
| Spectral centroid | Centre of gravity of spectrum (brightness) | high |
| Spectral bandwidth | Spread of spectrum around centroid | medium |
| Spectral flatness | Tonal vs. noise-like (1 = white noise, 0 = pure tone) | medium |
| Spectral rolloff | Frequency below which 85% of energy lies | medium |
| Pitch median | Median fundamental frequency (YIN algorithm, 60–800 Hz) | high |
| MFCC 1–13 | Mel-frequency cepstral coefficients (spectral envelope) | very high |
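To make three of the descriptors in the table concrete, here is a minimal NumPy sketch of RMS, zero-crossing rate and spectral centroid. It is a simplification, not the script's code: librosa computes these per short frame, whereas this sketch treats the whole signal as a single frame. The function name `basic_descriptors` and the test tone are hypothetical.

```python
import numpy as np


def basic_descriptors(signal: np.ndarray, sr: int) -> dict:
    """Whole-signal RMS, zero-crossing rate and spectral centroid.

    Hypothetical helper for illustration; the script itself uses librosa.
    """
    rms = float(np.sqrt(np.mean(signal ** 2)))

    # Fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(signal)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    # Magnitude-weighted mean frequency of the spectrum.
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    total = float(np.sum(magnitude))
    centroid = float(np.sum(freqs * magnitude) / total) if total > 0 else 0.0

    return {"rms": rms, "zcr": zcr, "centroid_hz": centroid}


sr = 22050
t = np.arange(sr) / sr
sine = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # 1 s test tone, not corpus data
result = basic_descriptors(sine, sr)
print(result)
```

For a pure tone the centroid sits at the tone's frequency and the zero-crossing rate is low; a noisy signal would raise both, which is why these descriptors help separate tonal from noise-like sounds on the map.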
- Load full audio (multichannel preserved for playback).
- Derive mono signal and take first 30 seconds (or full file if shorter).
- Compute RMS, ZCR, centroid, bandwidth, flatness, rolloff via librosa.
- Compute 13 MFCCs (means over time).
- Compute pitch using YIN algorithm (fast, deterministic), median over voiced frames.
- Standardise all features (zero mean, unit variance).
- PCA reduction to 2 dimensions for visualisation.
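The last two steps above (standardise, then reduce to two dimensions) can be sketched with NumPy alone. The script is described as using scikit-learn for PCA; this example swaps in a plain SVD to show the mechanics. The function name `project_to_2d` and the random input are hypothetical.

```python
import numpy as np


def project_to_2d(features: np.ndarray) -> np.ndarray:
    """Standardise an (n_files, n_features) matrix and project it onto its
    first two principal components. Hypothetical helper, not the script's code."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant columns would otherwise divide by zero
    z = (features - mean) / std

    # z is already centred, so the rows of vt are the principal axes,
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[:2].T


# Illustrative data: 50 "files" with 20 descriptors each (random, not real audio).
rng = np.random.default_rng(0)
fake_descriptors = rng.normal(size=(50, 20))
coords = project_to_2d(fake_descriptors)
print(coords.shape)
```

Standardising first matters because the descriptors live on very different scales (a centroid in Hz versus a flatness between 0 and 1); without it, the largest-valued feature would dominate the projection.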
GUI & Workflow
Left sidebar
- Units loaded: Number of audio files in corpus
- Output device: Select audio interface, shows channel count
- Gain slider: Global playback gain (0–1, linear)
- Exclusive mode: Check = stop previous sound on new trigger
- Trigger on hover: Check = play sound when mouse enters point
- Filter by name: Type substring to show only matching files
- Last triggered: Shows metadata of most recently played file
- Record button: Start/stop performance recording
- Panic (Esc): Stop all playback immediately
Scatter plot
- Pan: Click and drag to move view
- Zoom: Mouse wheel (or Ctrl+wheel)
- Click: Triggers the corresponding sound
- Hover: If enabled, triggers sound on mouse over (debounced 150 ms)
- Point size: Proportional to sound duration
- Colour: Hashed from filename (consistent across sessions)
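One way to get a colour that stays the same across sessions is to derive it from a stable digest of the filename. Python's built-in `hash()` is randomised per process for strings by default, so it would not give consistent colours. The sketch below is an assumption about how such a mapping could look, not the script's actual code; `colour_for` and the saturation and brightness values are hypothetical.

```python
import colorsys
import hashlib


def colour_for(filename: str) -> tuple[int, int, int]:
    """Map a filename to a stable RGB colour (hypothetical helper)."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    hue = digest[0] / 256.0  # first byte of the digest picks the hue
    r, g, b = colorsys.hsv_to_rgb(hue, 0.65, 0.95)
    return round(r * 255), round(g * 255), round(b * 255)


print(colour_for("kick_01.wav"))
```

Fixing saturation and brightness and varying only the hue keeps every point readable against a dark background.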
Performance Recording & Auto-Import
- Click RECORD in the GUI — timer starts.
- Perform by clicking or hovering on points. Each trigger records:
  - Timestamp (seconds from start)
  - Unit ID (index of sound file)
  - Current gain value (from slider)
- Click STOP & SAVE to render an offline mixdown:
  - Each sound resampled to 44.1 kHz, converted to stereo
  - Placed at its captured timestamp, scaled by gain
  - Summed into a single stereo buffer
  - Normalised to prevent clipping (peak ≤ 0.97)
- WAV saved to <corpus>/_recordings/performance_YYYYMMDD_HHMMSS.wav
- Pointer file written to temp directory: corpusmap_last_recording.txt
- Close the GUI window: Praat auto-imports the latest take as Sound "performance".
If the script cannot write to the _recordings subfolder, the rendered WAV is saved to the system temp directory instead. The pointer file always points to the actual location.
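The offline mixdown described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's implementation: `TriggerEvent` and `render_mixdown` are hypothetical names, the units are assumed to be already resampled to 44.1 kHz stereo, and the buffer is scaled down only when the summed peak exceeds 0.97.

```python
from dataclasses import dataclass

import numpy as np

SR = 44100           # render rate stated in the text
PEAK_CEILING = 0.97  # peak limit stated in the text


@dataclass
class TriggerEvent:
    time_s: float  # seconds from the start of recording
    unit_id: int   # index of the sound file in the corpus
    gain: float    # gain slider value at trigger time


def render_mixdown(events, units):
    """Sum every triggered unit into one stereo buffer.

    `units` maps unit_id -> float array of shape (n_samples, 2), assumed to be
    already resampled to SR and converted to stereo.
    """
    if not events:
        return np.zeros((0, 2), dtype=np.float32)

    starts = [int(round(e.time_s * SR)) for e in events]
    total = max(s + len(units[e.unit_id]) for s, e in zip(starts, events))
    mix = np.zeros((total, 2), dtype=np.float32)

    for start, e in zip(starts, events):
        clip = units[e.unit_id]
        mix[start:start + len(clip)] += e.gain * clip

    peak = float(np.max(np.abs(mix)))
    if peak > PEAK_CEILING:
        mix *= PEAK_CEILING / peak  # scale down only when the sum would clip
    return mix


# Two overlapping triggers of the same 100 ms placeholder unit.
tone = np.full((SR // 10, 2), 0.8, dtype=np.float32)
events = [TriggerEvent(0.0, 0, 1.0), TriggerEvent(0.05, 0, 1.0)]
mix = render_mixdown(events, {0: tone})
print(mix.shape, float(np.abs(mix).max()))
```

Because the events are rendered offline from the stored timestamps rather than captured from the audio output, the result does not depend on playback latency or on exclusive mode having cut sounds short during the live session.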
Applications
Corpus exploration and navigation
Use case: Quickly find sounds in a large library by similarity. Instead of scrolling through a list, navigate the 2D map — similar sounds cluster together.
Workflow: Point to a folder of 500+ samples. After analysis, explore by clicking: discover hidden relationships, find timbral neighbours, identify outliers.
Live performance / improvisation
Use case: Use the scatter plot as a performable instrument. Record your navigation as a composition.
Workflow: Load a corpus of your own sounds, enable exclusive mode and hover mode. Sweep across the plot like a Theremin — each sound triggers as you move. Record the performance, then import to Praat for further mixing or effects.
Sound design / sample library quality control
Use case: Visualise the distribution of a sample pack. Identify duplicates or out-of-place sounds.
Workflow: Load a sample library. Points that are far from all others may be mis-categorised or incorrectly recorded. Use filtering to isolate specific name patterns.
Music information retrieval (MIR) demonstration
Use case: Teach PCA, feature extraction, and similarity-based retrieval.
Workflow: Load a small corpus (e.g., 20 sounds from different categories). The PCA plot will visibly separate categories. Clicking each point plays its sound — students hear the correlation between position and timbre.
Workflow: Build a performance from a field recording corpus
Corpus: 100+ field recordings (birds, water, footsteps, wind, traffic).
Action: Explore the plot, click interesting sounds in a rhythmic pattern, record the performance.
Result: A unique electroacoustic composition that captures your interaction with the corpus. Import to Praat, add effects, export to video.
Workflow: Live electronic music with custom samples
Corpus: Your own drum hits, synth stabs, vocal chops.
Action: Arrange them in a folder, run Corpus Map, enable exclusive mode. Use the plot as a launchpad — each click triggers a different sound.
Result: An intuitive performance interface where spatial proximity implies similarity. No MIDI mapping required.
• GUI crashes on hover/click: v2.3.1 fixed a numpy array truth-value bug. Make sure you're running the latest version (2.5).
• No playback / audio device error: Check that your audio interface is selected in the dropdown. If you see "callback with the return type 'void' must return None", you have an older version — update to v2.3 or later.
• Very slow analysis: v2.3 replaced pyin with yin, making pitch tracking 10–100x faster. If still slow, reduce corpus size or use shorter files.
• Performance recording is silent: Check gain slider (should be >0). Ensure you actually triggered sounds while recording (the status shows number of events).
• Praat doesn't import the performance: Make sure you close the GUI window (not just minimise). The script waits for the window to close before importing.
Technical notes on v2.5 changes
Earlier versions launched the GUI with runSystem_nocheck (non-blocking) and continued. This allowed both Praat and the GUI to run simultaneously, but required a separate import script. The script now uses runSystem_nocheck with a command that waits for the Python process to exit. This is still non-blocking in the sense that Praat's GUI remains responsive, but the script does not proceed past the runSystem_nocheck line until the Python window closes. The trade-off: Praat is occupied while the GUI is open, but auto-import works seamlessly.