Speaker voice comparison in Thai using Euclidean distance: a case study of temporal feature
Abstract
The present study investigates the effectiveness of temporal features for speaker voice comparison in Thai using Euclidean distance. The assumption is that the distance between temporal-parameter values is smaller for recordings of the same speaker than for recordings of different speakers. Three parameters were used: ∆C, ∆V, and %V. Acoustic data, all connected speech, were obtained by interviewing ten participants. The results showed that same-speaker distances ranged from 0.005 to 0.063, while different-speaker distances ranged from 0.402 to 8.456. The results support the assumption and indicate that temporal features are effective for speaker voice comparison in Thai.
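The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual definitions of the three parameters (∆C and ∆V as the standard deviations of consonantal and vocalic interval durations, %V as the percentage of total duration that is vocalic). The abstract does not state the duration units, the type of standard deviation, or any rescaling of the parameters, so the choices below (seconds, population standard deviation, no rescaling) are assumptions. The function names and the sample durations are hypothetical.

```python
import math
from statistics import pstdev


def rhythm_metrics(intervals):
    """Return (delta_c, delta_v, percent_v) for one speech sample.

    `intervals` is a list of (label, duration) pairs, where label is
    "C" (consonantal) or "V" (vocalic) and duration is in seconds.
    Assumption: population standard deviation; the abstract does not
    say which variant was used.
    """
    c_durations = [d for label, d in intervals if label == "C"]
    v_durations = [d for label, d in intervals if label == "V"]
    delta_c = pstdev(c_durations)
    delta_v = pstdev(v_durations)
    percent_v = 100.0 * sum(v_durations) / (sum(c_durations) + sum(v_durations))
    return (delta_c, delta_v, percent_v)


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical interval durations, for illustration only (not study data).
sample_a = [("C", 0.08), ("V", 0.12), ("C", 0.10), ("V", 0.15), ("C", 0.06), ("V", 0.11)]
sample_b = [("C", 0.09), ("V", 0.20), ("C", 0.14), ("V", 0.09), ("C", 0.05), ("V", 0.18)]

distance = euclidean_distance(rhythm_metrics(sample_a), rhythm_metrics(sample_b))
print(f"distance between samples: {distance:.3f}")
```

Under the study's assumption, two samples from the same speaker should yield a smaller distance than two samples from different speakers. Because %V is on a percentage scale while ∆C and ∆V are in seconds, %V dominates an unscaled distance; whether the study rescaled the parameters is not stated in the abstract.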
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.