Paper Title
iobes: A Library for Span-Level Processing
Paper Authors
Paper Abstract
Many tasks in natural language processing, such as named entity recognition and slot-filling, involve identifying and labeling specific spans of text. In order to leverage common models, these tasks are often recast as sequence labeling tasks. Each token is given a label, and these labels are prefixed with special tokens such as B- or I-. After a model assigns a label to each token, these prefixes are used to group the tokens into spans. Properly parsing these annotations is critical for producing fair and comparable metrics; however, despite its importance, there is no easy-to-use, standardized, programmatically integrable library to help work with span labeling. To remedy this, we introduce our open-source library, iobes. iobes is used for parsing, converting, and processing spans represented as token-level decisions.
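To make the grouping step concrete, the following is a minimal sketch of how B-/I- prefixed token labels can be collected into spans. It does not use the iobes library and should not be read as its API: the names `Span` and `parse_spans_bio` are hypothetical, and the lenient handling of ill-formed sequences (an I- tag with no matching B- simply opens a new span) is an assumption chosen for illustration.

```python
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    """Hypothetical container for one labeled span (not the iobes API)."""
    label: str
    start: int  # index of the first token in the span
    end: int    # index one past the last token (exclusive)


def parse_spans_bio(tags: List[str]) -> List[Span]:
    """Group token-level B-/I-/O tags into spans.

    Assumption: ill-formed input is handled leniently, so an I- tag that
    does not continue the current span starts a new one.
    """
    spans: List[Span] = []
    current: Optional[str] = None  # label of the span being built, if any
    start = 0
    for i, tag in enumerate(tags):
        # "B-PER" -> ("B", "-", "PER"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")
        continues = prefix == "I" and label == current
        # Close the open span unless this token continues it.
        if current is not None and not continues:
            spans.append(Span(current, start, i))
            current = None
        # Open a new span on B-, or on an I- that continues nothing.
        if prefix in ("B", "I") and not continues:
            current, start = label, i
    # Flush a span that runs to the end of the sequence.
    if current is not None:
        spans.append(Span(current, start, len(tags)))
    return spans


tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(parse_spans_bio(tags))
# [Span(label='PER', start=0, end=2), Span(label='LOC', start=3, end=4)]
```

Different choices for the ill-formed cases (for example, discarding an orphaned I- tag instead of opening a span) produce different span sets and therefore different scores, which is why a shared, standardized parser matters for comparable metrics.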