Paper title

Multiclass classification utilising an estimated algorithmic probability prior

Authors

Kamaludin Dingle, Pau Batlle, Houman Owhadi

Abstract

Methods of pattern recognition and machine learning are applied extensively in science, technology, and society. Hence, any advances in related theory may translate into large-scale impact. Here we explore how algorithmic information theory, especially algorithmic probability, may aid in a machine learning task. We study a multiclass supervised classification problem, namely learning the RNA molecule sequence-to-shape map, where the different possible shapes are taken to be the classes. The primary motivation for this work is a proof of concept example, where a concrete, well-motivated machine learning task can be aided by approximations to algorithmic probability. Our approach is based on directly estimating the class (i.e., shape) probabilities from shape complexities, and using the estimated probabilities as a prior in a Gaussian process learning problem. Naturally, with a large amount of training data, the prior has no significant influence on classification accuracy, but in the very small training data regime, we show that using the prior can substantially improve classification accuracy. To our knowledge, this work is one of the first to demonstrate how algorithmic probability can aid in a concrete, real-world, machine learning problem.
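As a rough illustration of the idea of estimating class probabilities from shape complexities, the sketch below assigns each shape a prior weight proportional to 2^(-a·K), where K is a complexity estimate. Everything specific here is an assumption made for illustration and is not taken from the abstract: the shapes are written as dot-bracket strings, complexity is approximated by zlib-compressed length, the scaling constant `a` is a free parameter, and the function names are hypothetical. The Gaussian process step is not shown.

```python
import zlib


def complexity_bits(shape: str) -> int:
    """Crude complexity proxy: zlib-compressed length in bits.

    Assumption for illustration only; the abstract does not say which
    complexity estimator is used.
    """
    return 8 * len(zlib.compress(shape.encode("ascii"), 9))


def estimated_prior(shapes, a=1.0):
    """Return normalised weights proportional to 2^(-a * K(shape)).

    `a` is a hypothetical scaling constant. The minimum complexity is
    subtracted before exponentiating to avoid floating-point underflow.
    """
    ks = {s: complexity_bits(s) for s in shapes}
    k_min = min(ks.values())
    weights = {s: 2.0 ** (-a * (k - k_min)) for s, k in ks.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Two made-up shapes of equal length: one trivially regular, one less so.
simple_shape = "...................."
complex_shape = "((.(..)).)..((...))."
prior = estimated_prior([simple_shape, complex_shape], a=0.1)
```

Under these assumptions the more compressible shape receives the larger prior weight. In a classifier, such weights could serve as class priors when few labelled sequences are available, which is the regime in which the abstract reports the largest benefit.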
