Paper Title
Dialogue Enhancement and Listening Effort in Broadcast Audio: A Multimodal Evaluation
Authors
Abstract
Dialogue enhancement (DE) plays a vital role in broadcasting, enabling the personalization of the relative level between foreground speech and background music and effects. DE has been shown to improve the quality of experience, intelligibility, and self-reported listening effort (LE). A physiological indicator of LE known from audiology studies is pupil size. The relation between pupil size and LE is typically studied using artificial sentences and background noises not encountered in broadcast content. This work evaluates the effect of DE on LE in a multimodal manner that includes pupil size (tracked by a VR headset) and real-world audio excerpts from TV. Under ideal listening conditions, 28 normal-hearing participants listened to 30 audio excerpts presented in random order and processed by conditions varying the relative level between foreground and background audio. One of these conditions employed a recently proposed source separation system to attenuate the background given the original mixture as the sole input. After listening to each excerpt, subjects were asked to repeat the heard sentence and self-report the LE. Mean pupil dilation and peak pupil dilation were analyzed and compared with the self-report and the word recall rate. The multimodal evaluation shows a consistent trend of decreasing LE along with decreasing background level. DE, also when enabled by source separation, significantly reduces the pupil size as well as the self-reported LE. This highlights the benefit of personalization functionalities at the user's end.
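The abstract names mean pupil dilation and peak pupil dilation as the analyzed measures but does not say how they were computed. The following is a minimal sketch of one common way to derive both from a single trial, not the authors' pipeline. The function name `pupil_dilation_metrics`, the pre-stimulus baseline window, the use of `None` to mark blinks or tracking loss, and the millimeter units are all assumptions made for illustration.

```python
from statistics import mean


def pupil_dilation_metrics(trace, sample_rate_hz, baseline_s=1.0):
    """Return (mean_dilation, peak_dilation) for one trial.

    Assumptions (hypothetical, not taken from the paper):
    - `trace` holds pupil diameters in mm, starting `baseline_s` seconds
      before the audio excerpt begins;
    - `None` marks samples lost to blinks or tracking dropouts;
    - dilation is measured relative to the mean of the baseline window.
    """
    n_base = int(round(baseline_s * sample_rate_hz))

    # Drop invalid samples separately in the baseline and response windows.
    baseline = [x for x in trace[:n_base] if x is not None]
    response = [x for x in trace[n_base:] if x is not None]
    if not baseline or not response:
        raise ValueError("need at least one valid sample in each window")

    # Subtractive baseline correction, then summarize the response window.
    base = mean(baseline)
    dilation = [x - base for x in response]
    return mean(dilation), max(dilation)


# Toy trial: 2 Hz sampling, 1 s baseline, one dropped sample.
toy_trace = [3.0, 3.0, None, 3.2, 3.4, 3.3]
mean_pd, peak_pd = pupil_dilation_metrics(toy_trace, sample_rate_hz=2)
print(round(mean_pd, 3), round(peak_pd, 3))
```

Per-trial values of this kind could then be averaged per processing condition and set against the self-reported LE and the word recall rate, as the abstract describes.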