Sensor Control for Information Gain in Dynamic, Sparse and Partially Observed Environments

Title

Sensor Control for Information Gain in Dynamic, Sparse and Partially Observed Environments

Authors

J. Brian Burns, Aravind Sundaresan, Pedro Sequeira, Vidyasagar Sadhu

Abstract

We present an approach for autonomous sensor control for information gathering under partially observable, dynamic and sparsely sampled environments that maximizes information about entities present in that space. We describe our approach for the task of Radio-Frequency (RF) spectrum monitoring, where the goal is to search for and track unknown, dynamic signals in the environment. To this end, we extend the Deep Anticipatory Network (DAN) Reinforcement Learning (RL) framework by (1) improving exploration in sparse, non-stationary environments using a novel information gain reward, and (2) scaling up the control space and enabling the monitoring of complex, dynamic activity patterns using hybrid convolutional-recurrent neural layers. We also extend this problem to situations in which sampling from the intended RF spectrum/field is limited and propose a model-based version of the original RL algorithm that fine-tunes the controller via a model that is iteratively improved from the limited field sampling. Results in simulated RF environments of differing complexity show that our system outperforms the standard DAN architecture and is more flexible and robust than baseline expert-designed agents. We also show that it is adaptable to non-stationary emission environments.

Title