\[ \newcommand{\ListType}{{\text{List}}} \newcommand{\ListEmpty}{{\left[\right]}} \newcommand{\ListLength}[1]{{\ell\left(#1\right)}} \newcommand{\ListAt}[2]{{{#1}_{#2}}} \]
Data Definition Language

1 Introduction

This document is the specification of the Data Definition Language. Programs of this language are sequences of Unicode code points and describe structured data for the purpose of exchanging that data between entities (humans and machines alike). A program describes data in terms of typed values. The language provides scalar types (boolean type, number type, string type, and void type) as well as aggregate types (map type and list type). This specification describes the translation from Unicode code points to typed values.

2 Translation

A program of the Data Definition Language is translated into a value. This translation happens in three phases: The lexical translation translates Unicode code points into words. The syntactical translation filters the resulting sequence of words and then translates these words into a sentence. The semantical translation translates the sentence into a typed value.

3 Lexical Translation

The lexical translation translates a sequence of Unicode code points provided as input into a sequence of words.

The lexical translation of the Data Definition Language is based on the Common Lexical Translations (see https://michaelheilmann.com/specifications/common-lexical-translations for more information).

The lexical grammar consists of two rules. Both are defined in terms of the Common Lexical Translations as follows:

\[\begin{array}{ll} \text{DDL.Lexical.Words} &: \text{DDL.Lexical.Word}^*\\ \text{DDL.Lexical.Word} &:\;\text{Lexical.Boolean}\\ &|\;\text{Lexical.Number}\\ &|\;\text{Lexical.String}\\ &|\;\text{Lexical.Void}\\ &|\;\text{Lexical.Name}\\ &|\;\text{Lexical.LeftCurlyBracket}\\ &|\;\text{Lexical.RightCurlyBracket}\\ &|\;\text{Lexical.LeftSquareBracket}\\ &|\;\text{Lexical.RightSquareBracket}\\ &|\;\text{Lexical.Comma}\\ &|\;\text{Lexical.Colon}\\ &|\;\text{Lexical.Whitespace}\\ &|\;\text{Lexical.Newline}\\ &|\;\text{Lexical.Comment}\\ \end{array}\]

The lexical translation translates a sequence of Unicode code points into words. This resulting sequence of words is then consumed by the syntactical translation.
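As an illustration of this phase, the following Python sketch splits a sequence of code points into words by trying one pattern per word kind at the current position. The actual lexemes are defined by the Common Lexical Translations and are not reproduced in this document, so every pattern below is a hypothetical placeholder, as are the names `WORD_PATTERNS` and `lexical_translation`. Only a subset of the word kinds is covered.

```python
import re

# Placeholder patterns (assumptions): the real definitions live in the
# Common Lexical Translations and may differ from these.
WORD_PATTERNS = [
    ("Comment", r"//[^\n]*"),
    ("Newline", r"\r\n|\n|\r"),
    ("Whitespace", r"[ \t]+"),
    ("Number", r"[+-]?\d+(?:\.\d+)?"),
    ("LeftSquareBracket", r"\["),
    ("RightSquareBracket", r"\]"),
    ("LeftCurlyBracket", r"\{"),
    ("RightCurlyBracket", r"\}"),
    ("Comma", r","),
    ("Colon", r":"),
]

# One alternation with a named group per word kind.
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in WORD_PATTERNS))


def lexical_translation(code_points):
    """Translate a string of code points into a list of (kind, code points) words."""
    words = []
    position = 0
    while position < len(code_points):
        match = MASTER.match(code_points, position)
        if match is None:
            raise ValueError(f"no word starts at code point {position}")
        # lastgroup is the name of the alternative that matched.
        words.append((match.lastgroup, match.group()))
        position = match.end()
    return words


words = lexical_translation("// A list\n[ 1, 2 ]")
```

Each word keeps the code points it was recognized from, which is what the semantical translation later reads through the `codePoints` attribute.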

4 Syntactical Translation

The syntactical translation translates a sequence of words provided by the lexical translation into a sentence.

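The syntactical grammar consists of the rule for \(\text{DDL.Syntactical.Sentence}\) and the rules for the sentences defined in the subsections below.

Important: The words \(\text{Lexical.Whitespace}\), \(\text{Lexical.Newline}\), and \(\text{Lexical.Comment}\) are removed from the sequence of words before its translation into a sentence. None of the rules of the syntactical grammar refers to these words.

A sentence is a single value: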

\[\begin{aligned} &\text{DDL.Syntactical.Sentence} : \text{DDL.Syntactical.Value} \end{aligned}\]
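The filtering step that precedes the syntactical translation can be sketched in Python as follows. The `Word` record and the function name `filter_words` are hypothetical, and the sketch assumes that the removed words are exactly the whitespace, newline, and comment words, since no rule of the syntactical grammar refers to them.

```python
from typing import NamedTuple


class Word(NamedTuple):
    kind: str         # e.g. "Number", "Comma", "Whitespace"
    code_points: str  # the code points the word was recognized from


# Assumption: these are the word kinds without syntactic meaning.
IGNORED_KINDS = frozenset({"Whitespace", "Newline", "Comment"})


def filter_words(words):
    """Return the words that take part in the syntactical translation."""
    return [word for word in words if word.kind not in IGNORED_KINDS]


words = [
    Word("LeftSquareBracket", "["),
    Word("Whitespace", " "),
    Word("Number", "1"),
    Word("Comma", ","),
    Word("Whitespace", " "),
    Word("Number", "2"),
    Word("RightSquareBracket", "]"),
]
filtered = filter_words(words)
```

The relative order of the remaining words is preserved, so the sentence is built from the same words in the same order as they appear in the program.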

4.1 DDL.Syntactical.Value

The sentence \(\text{DDL.Syntactical.Value}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.Map}\\ &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.List}\\ &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.String}\\ &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.Number}\\ &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.Boolean}\\ &\text{DDL.Syntactical.Value} : \text{DDL.Syntactical.Void} \end{aligned}\]

4.2 DDL.Syntactical.String

The sentence \(\text{DDL.Syntactical.String}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.String} : \text{Lexical.String} \end{aligned}\]

4.3 DDL.Syntactical.Number

The sentence \(\text{DDL.Syntactical.Number}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.Number} : \text{Lexical.Number} \end{aligned}\]

4.4 DDL.Syntactical.Boolean

The sentence \(\text{DDL.Syntactical.Boolean}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.Boolean} : \text{Lexical.Boolean} \end{aligned}\]

4.5 DDL.Syntactical.Void

The sentence \(\text{DDL.Syntactical.Void}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.Void} : \text{Lexical.Void} \end{aligned}\]

4.6 DDL.Syntactical.Map

The sentence \(\text{DDL.Syntactical.Map}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.Map} : \text{Lexical.LeftCurlyBracket}\;\text{DDL.Syntactical.MapBody}\;\text{Lexical.RightCurlyBracket}\\ \\ &\text{DDL.Syntactical.MapBody} : \text{DDL.Syntactical.MapBodyElement}\;\text{DDL.Syntactical.MapBodyRest}\\ &\text{DDL.Syntactical.MapBody} : \epsilon\\ \\ &\text{DDL.Syntactical.MapBodyRest} : \text{Lexical.Comma}\;\text{DDL.Syntactical.MapBodyElement}\;\text{DDL.Syntactical.MapBodyRest}\\ &\text{DDL.Syntactical.MapBodyRest} : \text{Lexical.Comma}\\ &\text{DDL.Syntactical.MapBodyRest} : \epsilon\\ \\ &\text{DDL.Syntactical.MapBodyElement} : \text{Lexical.Name}\;\text{Lexical.Colon}\;\text{DDL.Syntactical.Value} \end{aligned}\]

4.7 DDL.Syntactical.List

The sentence \(\text{DDL.Syntactical.List}\) is defined by

\[\begin{aligned} &\text{DDL.Syntactical.List} : \text{Lexical.LeftSquareBracket}\; \text{DDL.Syntactical.ListBody}\; \text{Lexical.RightSquareBracket}\\ \\ &\text{DDL.Syntactical.ListBody} : \text{DDL.Syntactical.ListBodyElement}\; \text{DDL.Syntactical.ListBodyRest}\\ &\text{DDL.Syntactical.ListBody} : \epsilon\\ \\ &\text{DDL.Syntactical.ListBodyRest} : \text{Lexical.Comma}\; \text{DDL.Syntactical.ListBodyElement}\; \text{DDL.Syntactical.ListBodyRest}\\ &\text{DDL.Syntactical.ListBodyRest} : \text{Lexical.Comma}\\ &\text{DDL.Syntactical.ListBodyRest} : \epsilon\\ \\ &\text{DDL.Syntactical.ListBodyElement} : \text{DDL.Syntactical.Value} \end{aligned}\]
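The list rules above can be read as a small recursive-descent recognizer. The Python sketch below is one possible reading, not a reference implementation: words are assumed to be `(kind, code points)` pairs that were already filtered, the function names are hypothetical, and maps are left out to keep the example short. It shows that a single trailing comma is accepted while a comma without a preceding element is not.

```python
SCALAR_KINDS = {"String", "Number", "Boolean", "Void"}


def kind_at(words, i):
    """Kind of the word at index i, or None past the end of the input."""
    return words[i][0] if i < len(words) else None


def expect(words, i, kind):
    if kind_at(words, i) != kind:
        raise ValueError(f"expected {kind} at word {i}")
    return i + 1


def parse_value(words, i):
    if kind_at(words, i) in SCALAR_KINDS:
        return i + 1
    if kind_at(words, i) == "LeftSquareBracket":
        return parse_list(words, i)
    raise ValueError(f"expected a value at word {i}")


def parse_list(words, i):
    """Recognize DDL.Syntactical.List at index i and return the next index."""
    i = expect(words, i, "LeftSquareBracket")
    # ListBody : ListBodyElement ListBodyRest | epsilon
    if kind_at(words, i) != "RightSquareBracket":
        i = parse_value(words, i)
        # ListBodyRest : Comma ListBodyElement ListBodyRest | Comma | epsilon
        while kind_at(words, i) == "Comma":
            i += 1
            if kind_at(words, i) == "RightSquareBracket":
                break  # the rule with a single trailing comma
            i = parse_value(words, i)
    return expect(words, i, "RightSquareBracket")


def accepts(words):
    """True if the words form exactly one list sentence."""
    try:
        return parse_list(words, 0) == len(words)
    except ValueError:
        return False


LEFT, RIGHT, COMMA = ("LeftSquareBracket", "["), ("RightSquareBracket", "]"), ("Comma", ",")
ONE = ("Number", "1")
```

With these definitions, `accepts([LEFT, ONE, COMMA, RIGHT])` holds, whereas `accepts([LEFT, COMMA, RIGHT])` and `accepts([LEFT, ONE, COMMA, COMMA, RIGHT])` do not.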

The syntactical translation translates a sequence of words into one sentence. This resulting sentence is then consumed by the semantical translation.

5 Semantical Translation

The semantical translation translates a sentence provided by the syntactical translation into a typed value. The Data Definition Language knows six basic types: \(\textit{List}\) and \(\textit{Map}\), which are the so-called aggregate types, and \(\textit{Boolean}\), \(\textit{Number}\), \(\textit{String}\), and \(\textit{Void}\), which are the so-called scalar types.

The type \(\textit{Value}\) is defined as the union of all the types above. \[\begin{aligned} \textit{Value} =&\;\textit{List}\\ \cup&\;\textit{Map}\\ \cup&\;\textit{Boolean}\\ \cup&\;\textit{Number}\\ \cup&\;\textit{String}\\ \cup&\;\textit{Void} \end{aligned}\]
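One way an implementation might model this union is sketched below in Python. The class and field names are hypothetical and not part of this specification. Numbers and strings keep their code points, and a map is kept as a sequence of pairs rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Boolean:
    value: bool


@dataclass(frozen=True)
class Number:
    code_points: str  # kept as written; not converted to int or float


@dataclass(frozen=True)
class String:
    code_points: str


@dataclass(frozen=True)
class Void:
    pass


@dataclass(frozen=True)
class List:
    elements: "tuple[Value, ...]"


@dataclass(frozen=True)
class Map:
    entries: "tuple[tuple[str, Value], ...]"  # (name, value) pairs in program order


# Value is the union of the two aggregate types and the four scalar types.
Value = Union[List, Map, Boolean, Number, String, Void]


def is_scalar(value):
    """True for values of the four scalar types."""
    return isinstance(value, (Boolean, Number, String, Void))


document = Map((("sizes", List((Number("1"), Number("2")))), ("comment", Void())))
```

Frozen dataclasses make the values immutable and comparable by content, which is convenient when values are exchanged or tested for equality.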

The translation of a sentence into a value is described by syntax-directed translations (see Aho, Sethi, Ullman: Compilers: Principles, Techniques, and Tools; 1st edition; pp. 305 for more information).

At the end of the translation, the input syntactic form \(x\) has an attribute \(x.\text{value}\) which is a value of type \(\textit{Value}\). For each syntactic form, we define two attributes: \(\text{value}\) is the computed value of the syntactic form. \(\textit{codePoints}\) is the sequence of code points associated with the syntactic form. In the rules below, a suffix \(@n\) labels a syntactic form so that its attributes can be referred to as \(n.\text{value}\) and \(n.\textit{codePoints}\).

5.1 Scalar Types

This section defines the semantical translation of the scalar types.

5.1.1 Boolean Type

The type \(\textit{Boolean}\) has two values, \(\textit{true}\) and \(\textit{false}\), which are expressed in the language by the words \(\text{Lexical.True}\) and \(\text{Lexical.False}\), respectively (as defined in the lexical grammar).

The semantical translation is defined as follows:

\(\text{DDL.Syntactical.Boolean}@1 : \text{Lexical.Boolean}@2\) \(\{ 1.\text{value} := 2.\text{value} \}\)
\(\text{Lexical.Boolean}@1 : \text{Lexical.True}\) \(\{ 1.\text{value} := \textit{true} \}\)
\(\text{Lexical.Boolean}@1 : \text{Lexical.False}\) \(\{ 1.\text{value} := \textit{false} \}\)

5.1.2 Number Type

The type \(\textit{Number}\) represents sequences of Unicode code points that adhere to the constraints defined by \(\text{DDL.Syntactical.Number}\). These Unicode code point sequences represent arbitrary-precision integer and arbitrary-precision floating-point numbers.

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.Number}@1 : \text{Lexical.Number}@2 \end{array}\] \[\begin{array}{ll} \{ 1.\text{value} := 2.\textit{codePoints} \} \end{array}\]

Important: Implementations may want to use typical data types like \(\texttt{int}\) or \(\texttt{float}\). The Data Definition Language does not impose restrictions on the value ranges represented by number literals. Consequently, the numeric value represented by a value of type \(\textit{Number}\) might not be representable by typical data types like \(\texttt{int}\) or \(\texttt{float}\).
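The following Python sketch illustrates this caveat. It keeps the code points as the value and converts on demand, returning `None` when the number does not fit a signed 64-bit integer. The function name `to_int64` is hypothetical, and the sketch assumes that the literal happens to be written in a form that `decimal.Decimal` accepts; the actual number syntax is defined by the Common Lexical Translations.

```python
import math
from decimal import Decimal

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def to_int64(code_points):
    """Return the number as an int in the signed 64-bit range, or None if it does not fit."""
    value = Decimal(code_points)  # exact, independent of the context precision
    if value != value.to_integral_value():
        return None  # the number has a fractional part
    number = int(value)
    return number if INT64_MIN <= number <= INT64_MAX else None


fits = to_int64("42")
too_large = to_int64("9223372036854775808")  # 2^63, one past the largest 64-bit value
fractional = to_int64("1.5")
overflowed = float("1e400")  # a binary64 float silently becomes infinity here
```

Keeping the code points and converting late leaves the decision about range errors to the consumer of the value instead of the translation.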

5.1.3 String Type

The \(\textit{String}\) type represents sequences of Unicode code points that have the properties defined by \(\text{DDL.Syntactical.String}\).

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.String}@1 : \text{Lexical.String}@2 \end{array}\] \[\begin{array}{ll} \{ 1.\text{value} := 2.\textit{codePoints} \} \end{array}\]

5.1.4 Void Type

The type \(\textit{Void}\) has a single value \(\textit{void}\), which is expressed in the language by the word \(\text{Lexical.Void}\) (as defined in the lexical grammar).

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.Void}@1 : \text{Lexical.Void}@2 \end{array}\] \[\begin{array}{ll} \{ 1.\text{value} := \textit{void} \} \end{array}\]

5.2 List Type

The type \(\textit{List}\) represents lists of elements. Each element is a value of type \(\textit{Value}\). A value of type \(\textit{List}\) is expressed in the language by the sentence \(\text{DDL.Syntactical.List}\) (as defined in the syntactical grammar).

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.List}@1 : &\text{Lexical.LeftSquareBracket}\\ &\text{DDL.Syntactical.ListBody}@2\\ &\text{Lexical.RightSquareBracket} \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBody}@1 : &\text{DDL.Syntactical.ListBodyElement}@2\\ &\text{DDL.Syntactical.ListBodyRest}@3 \end{array}\] \[\begin{array}{ll} 1.\text{value} := \left[ 2.\text{value} \right] \circ 3.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBody}@1 : \epsilon \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBodyElement}@1 : \text{DDL.Syntactical.Value}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBodyRest}@1 : &\text{Lexical.Comma}\\ &\text{DDL.Syntactical.ListBodyElement}@2\\ &\text{DDL.Syntactical.ListBodyRest}@3 \end{array}\] \[\begin{array}{ll} 1.\text{value} := \left[ 2.\text{value} \right] \circ 3.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBodyRest}@1 : \text{Lexical.Comma}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.ListBodyRest}@1 : \epsilon \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]

Example

// A list with three numbers 1, 2, and 3.
[ 1, 2, 3 ]
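To make the rules concrete, the Python sketch below evaluates the `value` attribute for the example list, with one function per rule of the list grammar. It assumes hypothetical `(kind, code points)` word pairs that were already filtered, and it handles only numbers and nested lists as elements. The function names are illustrative.

```python
def kind_at(words, i):
    """Kind of the word at index i, or None past the end of the input."""
    return words[i][0] if i < len(words) else None


def expect(words, i, kind):
    if kind_at(words, i) != kind:
        raise ValueError(f"expected {kind} at word {i}")


def translate_list(words, i=0):
    """List@1 : LeftSquareBracket ListBody@2 RightSquareBracket; 1.value := 2.value."""
    expect(words, i, "LeftSquareBracket")
    value, i = translate_list_body(words, i + 1)
    expect(words, i, "RightSquareBracket")
    return value, i + 1


def translate_list_body(words, i):
    if kind_at(words, i) == "RightSquareBracket":
        return [], i  # ListBody : epsilon
    head, i = translate_value(words, i)
    rest, i = translate_list_body_rest(words, i)
    return [head] + rest, i  # [2.value] concatenated with 3.value


def translate_list_body_rest(words, i):
    if kind_at(words, i) != "Comma":
        return [], i  # ListBodyRest : epsilon
    i += 1
    if kind_at(words, i) == "RightSquareBracket":
        return [], i  # ListBodyRest : Comma
    head, i = translate_value(words, i)
    rest, i = translate_list_body_rest(words, i)
    return [head] + rest, i


def translate_value(words, i):
    if kind_at(words, i) == "LeftSquareBracket":
        return translate_list(words, i)
    if kind_at(words, i) == "Number":
        return words[i][1], i + 1  # 1.value := 2.codePoints
    raise ValueError(f"expected a value at word {i}")


# The words of "[ 1, 2, 3 ]" after whitespace and comments were removed.
words = [("LeftSquareBracket", "["), ("Number", "1"), ("Comma", ","), ("Number", "2"),
         ("Comma", ","), ("Number", "3"), ("RightSquareBracket", "]")]
value, next_index = translate_list(words)
```

Here `value` is the Python list `["1", "2", "3"]`: the elements are the code points of the number words, not converted numbers. The sketch recurses once per element, so a production implementation would likely use a loop for long lists.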

5.3 Map Type

The type \(\textit{Map}\) represents lists of entries. Each entry is a pair. The first element of a pair is a name, that is, the sequence of code points of a \(\text{Lexical.Name}\) word; the second element is a value of type \(\textit{Value}\). A value of type \(\textit{Map}\) is expressed in the language by the sentence \(\text{DDL.Syntactical.Map}\) (as defined in the syntactical grammar).

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.Map}@1 : &\text{Lexical.LeftCurlyBracket}\\ &\text{DDL.Syntactical.MapBody}@2\\ &\text{Lexical.RightCurlyBracket} \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBody}@1 : &\text{DDL.Syntactical.MapBodyElement}@2\\ &\text{DDL.Syntactical.MapBodyRest}@3 \end{array}\] \[\begin{array}{ll} 1.\text{value} := \left[ 2.\text{value} \right] \circ 3.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBody}@1 : \epsilon \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBodyRest}@1 : &\text{Lexical.Comma}\\ &\text{DDL.Syntactical.MapBodyElement}@2\\ &\text{DDL.Syntactical.MapBodyRest}@3 \end{array}\] \[\begin{array}{ll} 1.\text{value} := \left[ 2.\text{value} \right] \circ 3.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBodyRest}@1 : \text{Lexical.Comma}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBodyRest}@1 : \epsilon \end{array}\] \[\begin{array}{ll} 1.\text{value} := [] \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.MapBodyElement}@1 : &\text{Lexical.Name}@2\\ &\text{Lexical.Colon}\\ &\text{DDL.Syntactical.Value}@3 \end{array}\] \[\begin{array}{ll} 1.\text{value} := \left(2.\textit{codePoints}, 3.\text{value}\right) \end{array}\]
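As a rough illustration of the map rules, the Python sketch below translates the words of one flat map into a list of `(name, value)` pairs. It is a simplification under stated assumptions: words are hypothetical `(kind, code points)` pairs that were already filtered, only number values are handled, nested values are left out, and the function name `translate_map` is illustrative. The rules place no uniqueness constraint on names, so the sketch keeps the entries in a list rather than a dictionary.

```python
def translate_map(words):
    """Translate the words of one flat DDL.Syntactical.Map into a list of (name, value) pairs."""
    if (len(words) < 2 or words[0][0] != "LeftCurlyBracket"
            or words[-1][0] != "RightCurlyBracket"):
        raise ValueError("a map is delimited by curly brackets")
    body = words[1:-1]
    entries = []
    i = 0
    while i < len(body):
        # MapBodyElement : Name Colon Value
        if i + 2 >= len(body):
            raise ValueError(f"incomplete entry at word {i + 1}")
        (name_kind, name), (colon_kind, _), (value_kind, value) = body[i:i + 3]
        if name_kind != "Name" or colon_kind != "Colon" or value_kind != "Number":
            raise ValueError(f"malformed entry at word {i + 1}")
        entries.append((name, value))  # (2.codePoints, 3.value)
        i += 3
        # MapBodyRest : a comma before the next entry, or one trailing comma
        if i < len(body):
            if body[i][0] != "Comma":
                raise ValueError(f"expected Comma at word {i + 1}")
            i += 1
    return entries


# The words of "{ x: 1, y: 2, x: 3, }" after whitespace was removed.
words = [("LeftCurlyBracket", "{"),
         ("Name", "x"), ("Colon", ":"), ("Number", "1"), ("Comma", ","),
         ("Name", "y"), ("Colon", ":"), ("Number", "2"), ("Comma", ","),
         ("Name", "x"), ("Colon", ":"), ("Number", "3"), ("Comma", ","),
         ("RightCurlyBracket", "}")]
entries = translate_map(words)
```

Both entries named `x` survive in program order, which a dictionary-based representation would not preserve.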

5.4 Value Type

The type \(\textit{Value}\) is the union of the types \(\textit{Boolean}\), \(\textit{List}\), \(\textit{Map}\), \(\textit{Number}\), \(\textit{String}\), and \(\textit{Void}\).

The semantical translation is defined as follows:

\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.Boolean}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.List}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.Map}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.Number}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.String}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]
\[\begin{array}{ll} \text{DDL.Syntactical.Value}@1 : \text{DDL.Syntactical.Void}@2 \end{array}\] \[\begin{array}{ll} 1.\text{value} := 2.\text{value} \end{array}\]