Title

I Know Therefore I Score: Label-Free Crafting of Scoring Functions using Constraints Based on Domain Expertise

Authors

Palakkadavath, Ragja; Sivaprasad, Sarath; Karande, Shirish; Pedanekar, Niranjan

Abstract

Several real-life applications require crafting concise, quantitative scoring functions (also called rating systems) from measured observations. For example, an effectiveness score needs to be created for advertising campaigns using a number of engagement metrics. Experts often need to create such scoring functions in the absence of labelled data, where the scores need to reflect business insights and rules as understood by the domain experts. Without a way to capture these inputs systematically, this becomes a time-consuming process involving trial and error. In this paper, we introduce a label-free practical approach to learn a scoring function from multi-dimensional numerical data. The approach incorporates insights and business rules from domain experts in the form of easily observable and specifiable constraints, which are used as weak supervision by a machine learning model. We convert such constraints into loss functions that are optimized simultaneously while learning the scoring function. We examine the efficacy of the approach using a synthetic dataset as well as four real-life datasets, and also compare how it performs vis-a-vis supervised learning models.
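
The idea of turning expert constraints into loss terms that are optimized together can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the logistic-linear scoring function, the two constraint types (monotonicity and pairwise ordering), the toy feature values, and the finite-difference optimizer are all hypothetical stand-ins.

```python
import math

def score(weights, bias, x):
    """Hypothetical scoring function: logistic-squashed linear model with output in (0, 1)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def monotonicity_loss(weights, increasing):
    """Penalize negative weights on features the expert says should raise the score."""
    return sum(max(0.0, -weights[i]) for i in increasing)

def pairwise_loss(weights, bias, pairs, margin=0.05):
    """Hinge penalty when the item the expert ranks higher fails to lead by `margin`."""
    return sum(
        max(0.0, margin - (score(weights, bias, hi) - score(weights, bias, lo)))
        for hi, lo in pairs
    )

def total_loss(weights, bias, increasing, pairs):
    """All constraint losses are summed so they are optimized simultaneously."""
    return monotonicity_loss(weights, increasing) + pairwise_loss(weights, bias, pairs)

def fit(weights, bias, increasing, pairs, lr=0.5, steps=500, eps=1e-5):
    """Plain gradient descent using central finite differences (no labels involved)."""
    params = list(weights) + [bias]

    def loss_at(p):
        return total_loss(p[:-1], p[-1], increasing, pairs)

    for _ in range(steps):
        grad = []
        for i in range(len(params)):
            up = params[:i] + [params[i] + eps] + params[i + 1:]
            down = params[:i] + [params[i] - eps] + params[i + 1:]
            grad.append((loss_at(up) - loss_at(down)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params[:-1], params[-1]

# Toy, made-up data: two engagement metrics per campaign (e.g. clicks, shares).
INCREASING = [0, 1]  # expert rule: both metrics should raise the score
PAIRS = [            # expert rule: the first campaign should outscore the second
    ((0.9, 0.8), (0.2, 0.1)),
    ((0.6, 0.7), (0.5, 0.2)),
]
INIT_WEIGHTS, INIT_BIAS = [-0.5, -0.5], 0.0  # deliberately violates both rules

fitted_weights, fitted_bias = fit(INIT_WEIGHTS, INIT_BIAS, INCREASING, PAIRS)
```

In this sketch the only supervision comes from the expert-stated rules; no target scores appear anywhere. A real setup would likely use an automatic-differentiation framework and a richer constraint vocabulary.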
