MFCC Extractor — User Guide
Calculates Mel-Frequency Cepstral Coefficients (MFCCs) for a selected sound and outputs the results to a Praat Table object for easy data export.
What this does
This script extracts Mel-Frequency Cepstral Coefficients (MFCCs) from the selected Sound object. MFCCs are a standard feature set in automatic speech and music recognition: they give a compact representation of the short-term power spectrum, obtained by applying a linear cosine transform to the log power spectrum on the nonlinear mel frequency scale.
The calculation uses Praat's built-in MFCC analysis with fixed parameters, listed under Fixed Analysis Parameters, that suit standard voice and acoustic analysis.
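To make the definition concrete, here is a minimal Python sketch of the last step of an MFCC computation, the cosine transform of log filter-bank energies. It is an illustration only, not Praat's implementation: the function name `cepstral_coefficients` is hypothetical, the input energies are made up, and Praat's own scaling and filter-bank details differ.

```python
import math

def cepstral_coefficients(filter_energies, n_coeffs):
    """Return C1..Cn as an unnormalized DCT-II of the log filter-bank energies.

    filter_energies: positive energies, one per mel filter (made-up input here).
    C0 is skipped, matching a table that starts at C1.
    """
    n = len(filter_energies)
    log_e = [math.log(e) for e in filter_energies]
    coeffs = []
    for k in range(1, n_coeffs + 1):
        # Project the log energies onto the k-th cosine basis function.
        c_k = sum(log_e[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
        coeffs.append(c_k)
    return coeffs

# Hypothetical energies that rise with filter index (a spectrum tilted upward).
coeffs = cepstral_coefficients([1.0, 2.0, 4.0, 8.0], 2)
```

A perfectly flat set of energies yields coefficients of zero from C1 upward, which is one way to see that these coefficients describe spectral shape rather than overall level.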
Quick start
- Load a Sound object (preferably speech or a music sample) into the Praat Objects window.
- Select the Sound object.
- Run the script: Praat → Run script… → `MFCC.praat`.
- The script will create a new object named "MFCC_Table" in the Objects window.
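- Select "MFCC_Table" and click View & Edit to inspect the data, or use Write → Write to file... to export it (in recent Praat versions this menu is named Save, for example Save → Save as tab-separated file...).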
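Once exported, the table can be read by other tools. As a rough sketch, assuming the table was saved as tab-separated text with a header row (FrameTime, C1, C2, ...), it could be parsed in Python as follows. The function name `read_mfcc_table` and the sample values are hypothetical, and the sample is cut down to two coefficients.

```python
import csv
import io

def read_mfcc_table(handle):
    """Parse a tab-separated export into a list of {column name: float} rows."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [{name: float(value) for name, value in row.items()} for row in reader]

# Stand-in for an exported file: made-up numbers, two coefficients only.
sample = "FrameTime\tC1\tC2\n0.0075\t1.5\t-0.25\n0.0125\t1.0\t0.5\n"
rows = read_mfcc_table(io.StringIO(sample))
first_frame_time = rows[0]["FrameTime"]
```

For a real export, pass an open file handle instead of the `io.StringIO` stand-in. Cells that are not plain numbers would make `float` raise an error, so a production reader would need to handle them.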
Implementation Details
Analysis Steps
The script uses a standardized pipeline to generate the MFCC table:
- MFCC Generation: It calls the Praat command `To MFCC` on the selected Sound object.
- Matrix Conversion: The resulting MFCC object is converted to a Matrix object, which allows for easy, cell-by-cell access to the coefficient values.
- Table Creation: A new Table object is created with columns for "FrameTime" and 12 coefficients (C1 to C12).
- Data Population: It iterates through the time frames, calculating the timestamp for each frame and extracting the 12 coefficient values from the Matrix to populate the Table.
- Cleanup: The script removes the temporary MFCC object and the Matrix object, retaining only the final Table and the original Sound.
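The table-creation and data-population steps can be pictured with a short Python sketch. This is not the script's Praat code. It assumes the Matrix holds one row per coefficient and one column per frame, and that frame times advance by a constant time step from the first frame. The names `build_mfcc_rows` and `first_frame_time` and all sample values are hypothetical.

```python
def build_mfcc_rows(matrix, first_frame_time, time_step):
    """Turn a coefficient matrix into table rows of [FrameTime, C1, C2, ...].

    matrix[c][f] is assumed to hold coefficient c+1 for frame f+1
    (rows = coefficients, columns = frames).
    """
    n_coeffs = len(matrix)
    n_frames = len(matrix[0]) if matrix else 0
    header = ["FrameTime"] + [f"C{c}" for c in range(1, n_coeffs + 1)]
    rows = []
    for f in range(n_frames):
        # The timestamp of each frame follows from the first frame and the step.
        frame_time = first_frame_time + f * time_step
        rows.append([frame_time] + [matrix[c][f] for c in range(n_coeffs)])
    return header, rows

# Made-up matrix: 2 coefficients, 3 frames.
header, rows = build_mfcc_rows([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], 0.0075, 0.005)
```

Each output row corresponds to one analysis frame, so the table has as many rows as the Matrix has columns.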
Fixed Analysis Parameters
The MFCC calculation relies on the following fixed parameters, ensuring a consistent feature set:
| Parameter | Value | Description |
|---|---|---|
| Number of Coefficients | 12 (C1 to C12) | The dimensionality of the cepstral vector, excluding C0. |
| Window Length | 0.015 s (15 ms) | The size of the analysis window. |
| Time Step | 0.005 s (5 ms) | The shift between analysis windows. |
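| First Filter Frequency | 100 mel | Center frequency of the first filter in the mel filter bank. |
| Distance Between Filters | 100 mel | Spacing between the centers of adjacent mel filters. |
| Maximum Frequency | 0.0 mel | Upper limit of the filter bank; in Praat, 0.0 leaves the limit at its default, the Nyquist frequency of the sound. |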
| Output Format | Praat Table object | The final destination for the extracted features. |
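As a quick check on what these values imply, the sketch below works out the frame rate and window overlap from the window length and time step, and converts a value of 100 mel to hertz. The mel formula, mel = 550 ln(1 + f / 550), is assumed here to be the definition Praat uses and should be checked against the Praat manual. The helper names are hypothetical.

```python
import math

WINDOW_LENGTH = 0.015  # seconds, from the table above
TIME_STEP = 0.005      # seconds, from the table above

def hertz_to_mel(hertz):
    """Assumed mel definition: mel = 550 * ln(1 + hertz / 550)."""
    return 550.0 * math.log(1.0 + hertz / 550.0)

def mel_to_hertz(mel):
    """Inverse of hertz_to_mel under the same assumption."""
    return 550.0 * (math.exp(mel / 550.0) - 1.0)

frames_per_second = 1.0 / TIME_STEP                # one frame every 5 ms
overlap_seconds = WINDOW_LENGTH - TIME_STEP        # time shared by consecutive windows
overlap_fraction = overlap_seconds / WINDOW_LENGTH
hz_at_100_mel = mel_to_hertz(100.0)                # roughly 110 Hz under this formula
```

With a 15 ms window and a 5 ms step, consecutive windows share 10 ms, so about two thirds of each window overlaps with the next one.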