Improving the Performance of Voice Lie Detection Using Mel Frequency Cepstral Coefficients and Long Short Term Memory Models
Kusumawati D.
Journal of Advanced Computational Intelligence and Intelligent Informatics
Abstract
The objective of this study is to show that combining Mel frequency cepstral coefficients (MFCC) with long short-term memory (LSTM) can be an effective approach to voice lie detection. To improve detection performance, a modified MFCC was used to extract important features from the voice. The MFCC was modified by adding zero-crossing rate, audio entropy, and energy entropy parameters to detect changes in tone in each voice frame. LSTM was used to detect and classify voice-based lies. The data were obtained from video recordings of the trial of a suspect. A total of 847 voice samples were obtained after applying the time-stretching augmentation technique, in which the audio duration was changed from 28.0 s to 4 s per video. Lie classification was performed using an LSTM equipped with additional dropout and dense layers and optimized using the adaptive moment estimation (Adam) optimizer. The results showed that the combination of the MFCC and LSTM achieved a classification accuracy of 97% and an area under the curve (AUC) of 0.97 with 200 training epochs, the Adam optimizer, and a learning rate of 0.0001. The study concluded that adding zero-crossing rate, audio entropy, and energy entropy parameters to MFCC feature extraction and using the Adam optimizer with the LSTM improved the accuracy of voice lie detection.
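The abstract does not give formulas for the added per-frame parameters, so the following is only a minimal sketch of two of them (zero-crossing rate and energy entropy) using common textbook definitions, which may differ from the ones used in the study. The function names, the frame length of 400 samples, the hop of 160 samples, and the four sub-blocks are all illustrative assumptions; the MFCC computation, the audio entropy term, and the LSTM classifier are not shown.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (trailing partial frame dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def energy_entropy(frame, n_blocks=4):
    """Shannon entropy, in bits, of how the frame energy is spread over equal sub-blocks.

    Assumed definition: the paper's exact formulation is not stated in the abstract.
    """
    block = len(frame) // n_blocks
    energies = [sum(x * x for x in frame[k * block:(k + 1) * block])
                for k in range(n_blocks)]
    total = sum(energies)
    if total == 0:
        return 0.0  # silent frame: no energy distribution to measure
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)

def extra_features(signal, frame_len=400, hop=160):
    """Per-frame (zero-crossing rate, energy entropy) pairs; hypothetical helper."""
    return [(zero_crossing_rate(f), energy_entropy(f))
            for f in frames(signal, frame_len, hop)]
```

In a pipeline like the one described, such per-frame values would presumably be concatenated with the MFCC vector of the same frame before the sequence is passed to the LSTM, although the abstract does not spell out that step.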
Access to Document
10.20965/jaciii.2026.p0113
Other files and links
- Link to publication in Scopus
- Open Access Version Available