Comparative Study of AI-Synthesized Speech and Natural Speech
LIAO Fangling, CHEN Manqing, CHEN Shengxiang, GUO Yuhang, YANG Yingcang, MU Fan
2026(2): 38-45.
DOI: 10.3969/j.issn.1671-2072.2026.02.005
Objective: With the rapid development of artificial intelligence (AI)-synthesized speech technology, its detectability in forensic appraisal has become a key issue. This study systematically analyzes the features that differentiate AI-synthesized speech from natural speech through a two-dimensional comparison, covering auditory perception and acoustic quantification, in order to provide a reference for the identification, prevention, inspection, and appraisal of synthesized speech in judicial practice.

Methods: In the auditory test, a Likert-type scale was used to rate the consistency between natural speech and synthesized speech. In the acoustic test, feature parameters such as fundamental frequency, formants, sound intensity, and duration were extracted with the Praat speech analysis software, and paired-sample t-tests were run in SPSS 27 to quantify the differences between natural speech and synthesized speech.

Results: Compared with natural speech, AI-synthesized speech performed worse on auditory features such as monosyllabic integrity, retroflex features, stress, speech rate, and fluency. Statistical analysis of the acoustic measurements showed significant differences in fundamental frequency and formants, whereas sound intensity and duration did not differ significantly.

Conclusion: Combining “human ear preliminary screening” with acoustic quantification as a two-dimensional testing approach in forensic appraisal can effectively distinguish AI-synthesized speech from natural speech, providing technical support for the inspection and appraisal of AI-synthesized speech.
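
The paired-sample t-test named in the Methods compares two measurements taken on the same items, here the same utterance in its natural and synthesized versions. The following is a minimal Python sketch of that statistic, not the authors' SPSS procedure. The function name `paired_t` and the fundamental-frequency values are hypothetical placeholders chosen for illustration; they are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired-sample t-test.

    a and b hold two measurements of the same items, in the same order.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    if len(a) < 2:
        raise ValueError("at least two pairs are required")

    # The test works on the per-pair differences, not on the raw values.
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")

    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical mean F0 values in Hz for the same six utterances.
# These numbers are made up for illustration only.
natural_f0 = [212.4, 198.7, 205.1, 220.3, 190.8, 201.5]
synthetic_f0 = [208.9, 197.2, 199.6, 214.0, 189.9, 196.3]

t_stat, df = paired_t(natural_f0, synthetic_f0)
print(f"t = {t_stat:.3f}, df = {df}")
```

The sketch stops at the t statistic and its degrees of freedom. A statistics package such as SPSS additionally reports the p-value obtained from the t distribution with `df` degrees of freedom, which is what determines whether a difference is called significant.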