The Danger of Identity Spoofing Through Audio
Keywords:
Audio spoofing detection, Automatic speaker verification, Countermeasures, Neural networks
Abstract
Biometric authentication has become part of daily life as technological advances have brought it into a wide range of services and everyday devices such as smartphones, laptops, and tablets. Authentication by these means carries risk, because identity theft attacks are a reality. This paper explains the vulnerabilities of automatic speaker verification systems and why they are prone to attacks using audio generated for malicious purposes. It also describes some of the countermeasure approaches needed to detect spoofed audio and thereby protect against identity theft.
Published
2024-12-10
How to Cite
Hernández Nava, C. A., Rincón García, E. A., Lara Velázquez, P., de los Cobos Silva, S. G., Gutiérrez Andrade, M. A., Martínez Licona, F. M., Martínez Licona, A. E., Mora Gutiérrez, R. A., & Montes Orozco, E. (2024). El peligro de la suplantación de la identidad por medio de audio. Contactos, Revista De Educación En Ciencias E Ingeniería, (137), 43 - 52. Retrieved from https://contactos.izt.uam.mx/index.php/contactos/article/view/443
Section
Articles