Paper Title
Conditional Synthetic Data Generation for Personal Thermal Comfort Models
Authors
Abstract
Personal thermal comfort models aim to predict an individual's thermal comfort response rather than the average response of a large group. Recently, machine learning algorithms have shown enormous potential as candidates for personal thermal comfort models. However, under the normal operating conditions of a building, personal thermal comfort data obtained via experiments are often heavily class-imbalanced: there is a disproportionately high number of samples for the "Prefer No Change" class compared with the "Prefer Warmer" and "Prefer Cooler" classes. Machine learning algorithms trained on such class-imbalanced data perform sub-optimally when deployed in the real world. To develop robust machine learning-based applications from such class-imbalanced data, and to support privacy-preserving data sharing, we propose using a state-of-the-art conditional synthetic data generator to generate synthetic data for the low-frequency classes. Through experiments, we show that the distribution of the generated synthetic data mimics that of the real data. The proposed method can be extended to other smart building datasets and use cases.
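The idea of generating samples conditioned on a class label to rebalance a dataset can be illustrated with a minimal sketch. The abstract does not name the generator used, so the example below substitutes a much simpler stand-in: one Gaussian fitted per class, sampled only for the under-represented classes. The data are made-up toy values, and the two features (air temperature and skin temperature) and all function names are assumptions for illustration, not taken from the paper.

```python
import numpy as np


def fit_class_gaussians(X, y):
    """Estimate a mean vector and covariance matrix for each class label."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[label] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params


def balance_with_synthetic(X, y, rng):
    """Top up each minority class with samples drawn conditionally on its label.

    Stand-in for a learned conditional generator: sampling is conditioned on
    the class by choosing that class's fitted Gaussian.
    """
    params = fit_class_gaussians(X, y)
    labels, counts = np.unique(y, return_counts=True)
    target = int(counts.max())
    X_parts, y_parts = [X], [y]
    for label, count in zip(labels, counts):
        n_new = target - int(count)
        if n_new == 0:
            continue  # majority class needs no synthetic samples
        mean, cov = params[label]
        X_parts.append(rng.multivariate_normal(mean, cov, size=n_new))
        y_parts.append(np.full(n_new, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy, made-up data (not from the paper).
# Labels: 0 = "Prefer Cooler", 1 = "Prefer No Change", 2 = "Prefer Warmer".
# Hypothetical features: air temperature and skin temperature, in deg C.
rng = np.random.default_rng(0)
sizes = {0: 30, 1: 400, 2: 40}
centers = {0: [27.0, 34.5], 1: [24.0, 33.5], 2: [21.0, 32.5]}
X = np.vstack([rng.normal(centers[k], 0.8, size=(n, 2)) for k, n in sizes.items()])
y = np.concatenate([np.full(n, k) for k, n in sizes.items()])

X_bal, y_bal = balance_with_synthetic(X, y, rng)
balanced_labels, balanced_counts = np.unique(y_bal, return_counts=True)
print(dict(zip(balanced_labels.tolist(), balanced_counts.tolist())))
```

A per-class Gaussian cannot capture the multimodal or mixed-type distributions that a learned conditional generator is designed for; it is used here only because it makes the conditioning step explicit in a few lines.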