Supervised by: Ministry of Justice of the People's Republic of China
Sponsored by: Academy of Forensic Science
ISSN 1671-2072  CN 31-1863/N

Chinese Journal of Forensic Sciences, 2022, Issue (4): 54-59. DOI: 10.3969/j.issn.1671-2072.2022.04.007

• Forensic Science •

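A Preliminary Exploration of Acoustic Landmarks of Voice Quality in Voiceprint Identification: A Study Based on Random Forest and Decision Tree Models

GENG Puyang, SHI Shaopei, GUO Hong, et al.

  1. Academy of Forensic Science; Shanghai Forensic Service Platform; Key Laboratory of Forensic Science, Ministry of Justice, Shanghai 200063, China
  • Received: 2021-01-22 Published: 2022-07-15 Online: 2022-08-15
  • Corresponding author: SHI Shaopei (born 1962), male, professor-level senior engineer, mainly engaged in forensic technology research. E-mail: shisp@ssfjd.cn
  • About the author: GENG Puyang (born 1989), male, assistant researcher, Ph.D., mainly engaged in research on voiceprints and physiological phonetics. E-mail: gengpy@ssfjd.cn
  • Funding:
    Youth Project of the National Social Science Fund of China (21CYY011); Public Welfare Projects for Central-Level Research Institutes (GY2021G-9, GY2019G-2, GY2018G-4); Shanghai Forensic Service Platform funded project (19DZ2292700).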

A Preliminary Study on the Acoustic Landmark of Voice Quality in Voiceprint Identification— A Study Based on Random Forest and Decision Tree Model

GENG Puyang, SHI Shaopei, GUO Hong, et al   

  1. Shanghai Forensic Service Platform, Key Laboratory of Forensic Science, Ministry of Justice, Academy of Forensic Science, Shanghai 200063, China
  • Received: 2021-01-22 Published: 2022-07-15 Online: 2022-08-15

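Abstract: Objective Voice quality is one of the important reference features in voiceprint identification, but in current casework the judgment of voice quality categories still lacks objective data support. Methods Based on random forest and decision tree models, 18 acoustic parameters were used to explore the acoustic landmarks of four voice qualities (normal voice, creaky voice, breathy voice, and falsetto). Results The random forest results showed a voice-category classification accuracy of 90.7%; fundamental frequency (F0), whole-syllable duration, harmonics-to-noise ratio (HNR), jitter and shimmer, and the amplitude difference between the first harmonic and the third formant (H1-A3) contributed most to the classification. The decision tree results showed that the four voice categories could be distinguished through three decision nodes (HNR, mean F0, and H1-A3), with classification accuracy above 75%. Conclusion Parameters such as F0, HNR, and harmonic amplitude differences allow good voice quality classification, and the acoustic landmarks separating different voice qualities offer a useful and feasible reference for judging voice categories in voiceprint identification.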

Keywords: voice quality, acoustic landmark, random forest, decision tree model

Abstract: Objective Voice quality serves as one of the most important features in forensic voice comparison. However, the acoustic evidence needed to define voice quality types is still under study. This study aims to establish a method for defining voice quality. Method Based on random forest and decision tree models, this paper investigated the acoustic landmarks of four types of voice quality (normal, creaky, breathy, and falsetto) using 18 acoustic parameters. Results The random forest analysis classified voice quality with 90.7% accuracy; fundamental frequency (F0), duration, HNR, and H1-A3 were salient factors contributing to the classification. The decision tree model showed that the four types of voice quality could be reasonably classified (accuracy above 75%) based on three decision nodes: HNR, F0 mean, and H1-A3. Conclusion Promising voice quality classification can be achieved based on F0, HNR, H1-A3, and related parameters. Applying acoustic landmarks of voice quality could be an effective and valuable method in forensic voice comparison practice.

Key words: voice quality, acoustic landmark, random forest, decision tree model
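
The three-node decision structure described in the abstract can be illustrated with a minimal Python sketch. The abstract names the three decision variables (HNR, F0 mean, H1-A3) but reports neither the fitted cut points nor which node separates which voice types, so the thresholds, the node order, and the branch-to-class assignments below are all hypothetical placeholders chosen only for illustration, not the authors' fitted model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical cut points; the abstract does not report fitted values."""
    hnr_db: float = 10.0       # placeholder HNR split, in dB
    f0_mean_hz: float = 250.0  # placeholder mean-F0 split, in Hz
    h1_a3_db: float = 20.0     # placeholder H1-A3 split, in dB


def classify_voice(hnr_db, f0_mean_hz, h1_a3_db, t=Thresholds()):
    """Three-node decision tree over HNR, mean F0, and H1-A3.

    Assumed layout (not taken from the paper): HNR is the root node,
    mean F0 splits the high-HNR branch, H1-A3 splits the low-HNR branch.
    """
    if hnr_db >= t.hnr_db:
        # Assumed: more periodic voices, separated by pitch.
        return "falsetto" if f0_mean_hz >= t.f0_mean_hz else "normal"
    # Assumed: noisier voices, separated by spectral tilt.
    return "breathy" if h1_a3_db >= t.h1_a3_db else "creaky"
```

With real data, the cut points would be learned from the 18 measured parameters rather than set by hand; the fixed values here only show how three decision nodes can yield four classes.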
