Paper Title
NSL: Hybrid Interpretable Learning From Noisy Raw Data
Paper Authors
Paper Abstract
Inductive Logic Programming (ILP) systems learn generalised, interpretable rules in a data-efficient manner utilising existing background knowledge. However, current ILP systems require training examples to be specified in a structured logical format. Neural networks learn from unstructured data, although their learned models may be difficult to interpret and are vulnerable to data perturbations at run-time. This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data. NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics. Features extracted by the neural components define the structured context of labelled examples and the confidence of the neural predictions determines the level of noise of the examples. Using the scoring function of FastLAS, NSL searches for short, interpretable rules that generalise over such noisy examples. We evaluate our framework on propositional and first-order classification tasks using the MNIST dataset as raw data. Specifically, we demonstrate that NSL is able to learn robust rules from perturbed MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines whilst being more general and interpretable.
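The abstract's key mechanism, using the confidence of the neural predictions to set the noise level of each symbolic example, can be illustrated with a short sketch. The snippet below builds a weighted example in the `#pos(id@penalty, {inclusions}, {exclusions}, {context})` style used by LAS-family systems such as FastLAS; the confidence-to-penalty mapping and the `digit/2` context atoms are illustrative assumptions, not the paper's exact encoding.

```python
def las_example(example_id, label, context_atoms, confidence, max_penalty=100):
    """Construct a FastLAS-style weighted positive example (sketch).

    The neural network's softmax confidence is mapped to an example
    penalty: low-confidence (noisy) examples are cheaper for the rule
    learner to leave uncovered, so they influence the hypothesis less.
    The penalty scale is an assumed choice, not taken from the paper.
    """
    penalty = max(1, round(confidence * max_penalty))
    ctx = " ".join(f"{a}." for a in context_atoms)
    return f"#pos(eg_{example_id}@{penalty}, {{{label}}}, {{}}, {{{ctx}}})."

# A digit-pair task: a pre-trained classifier (assumed) reads two MNIST
# images as 3 and 5 with confidences 0.98 and 0.91; their product is
# taken here as the joint confidence of the extracted context.
atoms = ["digit(1,3)", "digit(2,5)"]
print(las_example(7, "less_than", atoms, 0.98 * 0.91))
```

A perturbed image that the network reads with low confidence would yield a small penalty, letting FastLAS's scoring function trade off covering that example against keeping the learned rules short and general.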