Can recurrent neural networks learn process model structure?

Paper title

Can recurrent neural networks learn process model structure?

Authors

Jari Peeperkorn, Seppe vanden Broucke, Jochen De Weerdt

Abstract

Various methods using machine and deep learning have been proposed to tackle different tasks in predictive process monitoring, such as forecasting, for an ongoing case, the most likely next event or suffix, its remaining time, or an outcome-related variable. Recurrent neural networks (RNNs), and more specifically long short-term memory nets (LSTMs), stand out in terms of popularity. In this work, we investigate the capabilities of such an LSTM to actually learn the underlying process model structure of an event log. We introduce an evaluation framework that combines variant-based resampling and custom metrics for fitness, precision and generalization. We evaluate four hypotheses concerning the learning capabilities of LSTMs, the effect of overfitting countermeasures, the level of incompleteness in the training set and the level of parallelism in the underlying process model. We confirm that LSTMs can struggle to learn process model structure, even with simplistic process data and in a very lenient setup. Taking the correct anti-overfitting measures can alleviate the problem. However, these measures did not turn out to be optimal when hyperparameters were selected purely on prediction accuracy. We also found that decreasing the amount of information seen by the LSTM during training causes a sharp drop in generalization and precision scores. In our experiments, we could not identify a relationship between the extent of parallelism in the model and the generalization capability, but they do indicate that the process's complexity might have an impact.
