A Brief Guide to Regular Expressions

Vassilios Karakoidas
7 min read · Jun 19, 2023

Regular expressions are a domain-specific language (DSL) for matching patterns in text. They are everywhere: most major programming languages ship with a library that supports some flavor of regular expressions. This article explores the origins of regular expressions and presents their basic syntactic variations.
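To make "matching patterns in text" concrete, here is a minimal sketch using Python's standard `re` module. The sample text and the date pattern are illustrative assumptions and do not come from the article; other languages offer similar facilities with slightly different syntax.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Released on 2023-06-19, updated on 2023-07-02."

# \d{4}-\d{2}-\d{2} matches four digits, a hyphen, two digits,
# a hyphen, and two more digits (an ISO-style date).
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# findall returns every non-overlapping match, scanning left to right.
dates = date_pattern.findall(text)
print(dates)  # ['2023-06-19', '2023-07-02']
```

The pattern describes the shape of the text being searched for rather than a literal string, which is what makes regular expressions a small language of their own.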

Origins of Regular Expressions

The theoretical background of regular expressions lies in automata theory and formal languages. Regular expressions describe the regular languages, which are generated by type-3 grammars in the Chomsky hierarchy. This hierarchy, described by Chomsky in 1956 [1], categorizes the grammars that describe formal languages.

In 1943, Warren McCulloch and Walter Pitts modeled the human nervous system using automata [2]. The mathematician Stephen Kleene then described these models with a mathematical notation named regular sets [3]. Later, Brzozowski [4] provided mathematical definitions for Kleene's regular expression formalism and introduced ways to convert regular expressions into state diagrams. In the late 1960s, Ken Thompson proposed a compiler that translated a regular expression into the assembly language of the IBM 7094 [5]. He later implemented regular sets in his text editor qed, and afterwards in ed, which became part of the UNIX…

