A Brief Guide to Regular Expressions
Regular expressions are a domain-specific language (DSL) for matching patterns in text. They are everywhere: most major programming languages ship with a library that supports some flavor of regular expressions. This article explores the origins of regular expressions and presents their basic syntactic variations.
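To make "matching patterns in text" concrete, here is a minimal sketch using Python's built-in `re` module. The sample text and the date pattern are made up for illustration and are not taken from any particular application.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Order 66 shipped on 2021-03-04; order 7 shipped on 2021-03-09."

# Four digits, a hyphen, two digits, a hyphen, two digits (an ISO-style date).
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# findall returns every non-overlapping match, scanning left to right.
dates = date_pattern.findall(text)
print(dates)  # ['2021-03-04', '2021-03-09']
```

The pattern describes the shape of the text we want rather than the text itself, which is what makes regular expressions a small language of their own. Other languages expose the same idea through their own libraries, with slightly different syntax.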
Origins of Regular Expressions
The theoretical background of regular expressions lies in automata theory and formal languages. Regular expressions describe the regular languages, which are generated by the type-3 grammars of the Chomsky hierarchy. This hierarchy, described by Chomsky in 1956 [1], categorizes the grammars that describe formal languages.
In 1943, Warren McCulloch and Walter Pitts modeled the human nervous system using automata [2]. The mathematician Stephen Kleene described these models with a mathematical notation he called regular sets [3]. Later, Brzozowski [4] provided mathematical definitions for Kleene's regular expression formalism and introduced ways to convert regular expressions into state diagrams. In the late 1960s, Ken Thompson proposed a compiler that translated a regular expression into the assembly language of an IBM 7094 processor [5]. He later implemented regular expressions in his version of the text editor qed, and afterwards in ed, which became part of the UNIX…