Urban Rhapsody: Large-scale exploration of urban soundscapes

Paper title

Urban Rhapsody: Large-scale exploration of urban soundscapes

Authors

Joao Rulff, Fabio Miranda, Maryam Hosseini, Marcos Lage, Mark Cartwright, Graham Dove, Juan Bello, Claudio T. Silva

Abstract

Noise is one of the primary quality-of-life issues in urban environments. In addition to annoyance, noise negatively impacts public health and educational performance. While low-cost sensors can be deployed to monitor ambient noise levels at high temporal resolutions, the amount of data they produce and the complexity of these data pose significant analytical challenges. One way to address these challenges is through machine listening techniques, which are used to extract features in attempts to classify the source of noise and understand temporal patterns of a city's noise situation. However, the overwhelming number of noise sources in the urban environment and the scarcity of labeled data make it nearly impossible to create classification models with large enough vocabularies that capture the true dynamism of urban soundscapes. In this paper, we first identify a set of requirements in the yet unexplored domain of urban soundscape exploration. To satisfy the requirements and tackle the identified challenges, we propose Urban Rhapsody, a framework that combines state-of-the-art audio representation, machine learning, and visual analytics to allow users to interactively create classification models, understand noise patterns of a city, and quickly retrieve and label audio excerpts in order to create a large high-precision annotated database of urban sound recordings. We demonstrate the tool's utility through case studies performed by domain experts using data generated over the five-year deployment of a one-of-a-kind sensor network in New York City.
