Paper Title

Technical Report: Match-reference regular expressions and lenses

Authors

Jeanne-Marie Musca, Anders Miltner, Kathleen Fisher, David Walker

Abstract

A lens is a single program that specifies two data transformations at once: one transformation converts data from a source format to a target format, and a second transformation inverts the process. Over the past decade, researchers have developed many kinds of lenses with different properties. One class of such languages operates over regular languages. In other words, these lenses convert strings drawn from one regular language to strings drawn from another regular language (and back again). In this paper, we define a more powerful language of lenses, which we call match-reference lenses, that is capable of translating between non-regular formats that contain repeated substrings, a primitive form of dependency. To define the non-regular formats themselves, we develop a new language, match-reference regular expressions, which are regular expressions that can bind variables to substrings and use those substrings repeatedly. These match-reference regular expressions are closely related to the familiar "back-references" found in traditional regular expression packages, but are redesigned to adhere to conventional programming-language lexical scoping conventions and to interact smoothly with lens language infrastructure. We define the semantics of match-reference regular expressions and match-reference lenses. We also define a new kind of automaton, the match-reference regex automaton system (MRRAS), for deciding string membership in the language of a match-reference regular expression. We illustrate our definitions with a variety of examples.
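To make the idea of a format with repeated substrings concrete, the sketch below uses the ordinary back-references of Python's `re` module, which the abstract names as a close relative of match-reference regular expressions. It is only an analogy: it does not use the paper's syntax, its lexical scoping rules, or MRRAS. The two formats (`<name>body</name>` and `name:body:name`) and the function names `in_source`, `to_target`, and `to_source` are hypothetical, chosen purely for illustration.

```python
import re

# Hypothetical source format: "<name>body</name>". The closing tag must repeat
# the opening tag's name; with unbounded names this repetition cannot be
# captured by a plain regular expression, so a back-reference is used.
SOURCE = re.compile(r"<(?P<name>[a-z]+)>(?P<body>[^<:]*)</(?P=name)>")

# Hypothetical target format: "name:body:name", with the same repetition.
TARGET = re.compile(r"(?P<name>[a-z]+):(?P<body>[^<:]*):(?P=name)")


def in_source(s: str) -> bool:
    """Decide membership of s in the source format."""
    return SOURCE.fullmatch(s) is not None


def to_target(s: str) -> str:
    """Forward direction: source format to target format."""
    m = SOURCE.fullmatch(s)
    if m is None:
        raise ValueError(f"not in source format: {s!r}")
    return f"{m['name']}:{m['body']}:{m['name']}"


def to_source(t: str) -> str:
    """Backward direction: target format to source format."""
    m = TARGET.fullmatch(t)
    if m is None:
        raise ValueError(f"not in target format: {t!r}")
    return f"<{m['name']}>{m['body']}</{m['name']}>"
```

Here the two directions are written by hand as separate functions and happen to be mutual inverses on well-formed inputs. A lens language, as described in the abstract, instead derives both directions from a single program.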
