Paper title
An exhaustive variable selection study for linear models of soundscape emotions: rankings and Gibbs analysis
Paper authors
Paper abstract
In the last decade, soundscapes have become one of the most active topics in acoustics, providing a holistic approach to the acoustic environment that involves human perception and context. Soundscape-elicited emotions are central to this approach, yet they are subtle and often go unnoticed (compared with those elicited by speech or music). Currently, soundscape emotion recognition is a very active topic in the literature. We provide an exhaustive variable selection study (i.e., a selection of the soundscape indicators) on a well-known dataset (Emo-Soundscapes). We consider linear soundscape emotion models for two soundscape descriptors: arousal and valence. Several ranking schemes and procedures for selecting the number of variables are applied. We also perform an alternating optimization scheme to obtain the best sequences for a fixed number of features. Furthermore, we design a novel technique based on Gibbs sampling, which provides a more complete and clearer view of the relevance of each variable. Finally, we compare our results with the analysis obtained by classical methods based on p-values. As a result of our study, we suggest two simple and parsimonious linear models of only 7 and 16 variables (out of the 122 possible features) for the two outputs, arousal and valence, respectively. The suggested linear models provide very good and competitive performance, with $R^2>0.86$ for arousal and $R^2>0.63$ for valence (values obtained after a cross-validation procedure).
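
To make the idea of ranking variables and scoring a linear model by cross-validated $R^2$ concrete, the following is a minimal sketch and not the authors' procedure. It assumes a plain greedy forward ranking in place of the ranking schemes, alternating optimization, and Gibbs sampling named in the abstract, and it runs on synthetic data rather than Emo-Soundscapes. The function names `cv_r2` and `forward_ranking`, the fold count, and the data-generating coefficients are all hypothetical choices made for illustration.

```python
import numpy as np


def cv_r2(X, y, cols, k=5):
    """Out-of-fold R^2 of an ordinary least-squares fit on the columns `cols`."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)  # contiguous folds (assumption)
    pred = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Design matrices with an intercept column.
        A_train = np.column_stack([np.ones(len(train)), X[np.ix_(train, cols)]])
        A_test = np.column_stack([np.ones(len(test)), X[np.ix_(test, cols)]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        pred[test] = A_test @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def forward_ranking(X, y, n_select, k=5):
    """Greedily add the variable that most improves cross-validated R^2."""
    selected, scores = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(n_select):
        best = max(remaining, key=lambda j: cv_r2(X, y, selected + [j], k))
        selected.append(best)
        remaining.remove(best)
        scores.append(cv_r2(X, y, selected, k))
    return selected, scores


# Synthetic stand-in data (assumption): only features 2 and 7 drive the output.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=200)

ranking, scores = forward_ranking(X, y, n_select=3)
print("ranking:", ranking)
print("cross-validated R^2 per model size:", np.round(scores, 3))
```

In this toy setting the cross-validated $R^2$ stops improving once the two informative features are included, which is the kind of plateau that motivates choosing a small number of variables.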