Ingredient Extraction from Text in the Recipe Domain

Paper Title

Ingredient Extraction from Text in the Recipe Domain

Authors

Dharawat, Arkin; Doan, Chris

Abstract

In recent years, the number of devices with virtual assistants (e.g., Siri, Google Home, Alexa) in our living rooms and kitchens has increased. As a result, these devices receive several queries about recipes. These queries contain terms relating to the "recipe domain", i.e., dish names, ingredients, cooking times, dietary preferences, etc. Extracting these recipe-relevant aspects from a query is therefore important for addressing the user's information need. Our project focuses on extracting ingredients from such plain-text user utterances. Our best-performing model was a fine-tuned BERT, which achieved an F1-score of $95.01$. We have released all our code in a GitHub repository.
