Text manipulation is one of the most common tasks for everyone using a computer. The increasing number of textual information in electronic format that every computer user collects everyday also increases the need of more powerful tools to interact with texts. Indeed, much work has been done to provide simple and versatile tools that can be useful for the most common text manipulation tasks. Regular Expressions (RE), introduced by Kleene, are well known in the formal language theory. RE have been extended in various ways, depending on the application of interest. In almost all the implementations of RE search algorithms (e.g. the egrep [15] UNIX command, or the Perl [20] language pattern matching constructs) we find backreferences, i.e. expressions that make reference to the string matched by a previous subexpression. Generally speaking, it seems that all kinds of synchronizations between subexpressions in a RE can be very useful when interacting with texts. In this paper we introduce the Synchronized Regular Expressions (SRE) as an extension of the Regular Expressions. We use SRE to present a formal study of the already known backreferences extension, and of a new extension proposed by us, which we call the synchronized exponents. Moreover, since we are dealing with formalisms that should have a practical utility and be used in real applications, we have the problem of how to present SRE to the final users. Therefore, in this paper we also propose a user-friendly syntax for SRE to be used in implementations of SRE-powered search algorithms.

Synchronized Regular Expressions

DELLA PENNA, GIUSEPPE;
2003

Abstract

Text manipulation is one of the most common tasks for everyone using a computer. The increasing number of textual information in electronic format that every computer user collects everyday also increases the need of more powerful tools to interact with texts. Indeed, much work has been done to provide simple and versatile tools that can be useful for the most common text manipulation tasks. Regular Expressions (RE), introduced by Kleene, are well known in the formal language theory. RE have been extended in various ways, depending on the application of interest. In almost all the implementations of RE search algorithms (e.g. the egrep [15] UNIX command, or the Perl [20] language pattern matching constructs) we find backreferences, i.e. expressions that make reference to the string matched by a previous subexpression. Generally speaking, it seems that all kinds of synchronizations between subexpressions in a RE can be very useful when interacting with texts. In this paper we introduce the Synchronized Regular Expressions (SRE) as an extension of the Regular Expressions. We use SRE to present a formal study of the already known backreferences extension, and of a new extension proposed by us, which we call the synchronized exponents. Moreover, since we are dealing with formalisms that should have a practical utility and be used in real applications, we have the problem of how to present SRE to the final users. Therefore, in this paper we also propose a user-friendly syntax for SRE to be used in implementations of SRE-powered search algorithms.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11697/6916
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? 11
social impact