Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

Paper Title

Ensuring the Robustness and Reliability of Data-Driven Knowledge Discovery Models in Production and Manufacturing

Authors

Tripathi, Shailesh; Muhr, David; Brunner, Manuel; Emmert-Streib, Frank; Jodlbauer, Herbert; Dehmer, Matthias

Abstract

The implementation of robust, stable, and user-centered data analytics and machine learning models is confronted by numerous challenges in production and manufacturing. Therefore, a systematic approach is required to develop, evaluate, and deploy such models. The data-driven knowledge discovery framework provides an orderly partition of the data-mining processes to ensure the practical implementation of data analytics and machine learning models. However, the practical application of robust industry-specific data-driven knowledge discovery models faces multiple data- and model-development-related issues. These issues should be carefully addressed by allowing a flexible, customized, and industry-specific knowledge discovery framework; in our case, this takes the form of the cross-industry standard process for data mining (CRISP-DM). This framework is designed to ensure active cooperation between different phases to adequately address data- and model-related issues. In this paper, we review several extensions of CRISP-DM models and various data-robustness- and model-robustness-related problems in machine learning, which currently lacks proper cooperation between data experts and business experts because of the limitations of data-driven knowledge discovery models.
