The effect of deep learning methods on deepfake audio detection for digital investigation

dc.contributor.author: Mcuba, Mvelo
dc.contributor.author: Singh, Avinash
dc.contributor.author: Ikuesan, Richard Adeyemi
dc.contributor.author: Venter, H.S. (Hein)
dc.contributor.email: asingh@cs.up.ac.za
dc.date.accessioned: 2023-09-12T04:47:10Z
dc.date.available: 2023-09-12T04:47:10Z
dc.date.issued: 2023
dc.description: Paper presented at CENTERIS – International Conference on ENTERprise Information Systems / ProjMAN – International Conference on Project MANagement / HCist – International Conference on Health and Social Care Information Systems and Technologies 2022.
dc.description.abstract: Voice cloning methods have been used in a range of ways, from customized speech interfaces for marketing to video games. Current voice cloning systems can learn speech characteristics from a few samples and produce speech that is perceptually indistinguishable from genuine speech. These systems pose new protection and privacy risks to voice-driven interfaces. Fake audio has been used for malicious purposes, and it is difficult to distinguish real from fake audio during a digital forensic investigation. This paper reviews the issue of deepfake audio classification and evaluates the current methods of deepfake audio detection for forensic investigation. Audio file features were extracted and visually presented using MFCC, Mel-spectrum, Chromagram, and spectrogram representations to further study the differences. Different deep learning techniques from the existing literature were compared using five iterative tests to determine the mean accuracy and their effects. The results showed that a Custom Architecture gave better results for the Chromagram, Spectrogram, and Mel-spectrum images, while the VGG-16 architecture gave the best results for the MFCC image feature. This paper helps forensic investigators differentiate between synthetic and real voices.
dc.description.department: Computer Science
dc.description.uri: https://www.journals.elsevier.com/procedia-computer-science
dc.identifier.citation: Mcuba, M., Singh, A., Ikuesan, R.A. & Venter, H. 2023, 'The effect of deep learning methods on deepfake audio detection for digital investigation', Procedia Computer Science, vol. 219, pp. 211-219, doi: 10.1016/j.procs.2023.01.283.
dc.identifier.issn: 1877-0509 (online)
dc.identifier.other: 10.1016/j.procs.2023.01.283
dc.identifier.uri: http://hdl.handle.net/2263/92265
dc.language.iso: en
dc.publisher: Elsevier
dc.rights: © 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.
dc.subject: Deepfake audio
dc.subject: Digital investigation
dc.subject: CNN
dc.subject: Voice cloning
dc.title: The effect of deep learning methods on deepfake audio detection for digital investigation
dc.type: Article
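
The abstract describes comparing architectures over five iterative tests and reporting the mean accuracy per image feature. A minimal sketch of that bookkeeping follows. It is not the authors' code: the names `model_a`, `model_b`, `feature_x`, `feature_y` and every accuracy value are hypothetical placeholders chosen for illustration, not results from the paper.

```python
from statistics import mean


def mean_accuracy_table(runs):
    """Average the per-run accuracies for each (architecture, feature) pair."""
    return {key: mean(values) for key, values in runs.items()}


def best_architecture_per_feature(table):
    """For each feature, pick the architecture with the highest mean accuracy."""
    best = {}
    for (architecture, feature), accuracy in table.items():
        if feature not in best or accuracy > best[feature][1]:
            best[feature] = (architecture, accuracy)
    return best


# Hypothetical placeholder accuracies from five runs each (NOT from the paper).
runs = {
    ("model_a", "feature_x"): [0.80, 0.82, 0.81, 0.79, 0.83],
    ("model_b", "feature_x"): [0.70, 0.72, 0.71, 0.69, 0.73],
    ("model_a", "feature_y"): [0.60, 0.62, 0.61, 0.59, 0.63],
    ("model_b", "feature_y"): [0.90, 0.92, 0.91, 0.89, 0.93],
}

table = mean_accuracy_table(runs)
best = best_architecture_per_feature(table)

for feature, (architecture, accuracy) in sorted(best.items()):
    print(f"{feature}: {architecture} (mean accuracy {accuracy:.2f})")
```

Averaging over repeated runs reduces the influence of random weight initialisation and data shuffling on the comparison, which is presumably why the study reports a mean rather than a single run.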

Files

Original bundle

Name: Mcuba_Effect_2023.pdf
Size: 694.12 KB
Format: Adobe Portable Document Format
Description: Article
