Data Definition Language
1 Introduction
This document is the specification of the Data Definition Language. Programs of this language are sequences of Unicode code points and describe structured data for the purpose of exchanging that data between entities (humans and machines alike). Programs describe data in terms of typed values. The language provides scalar types (boolean type, number type, string type, and void type) as well as aggregate types (map type and list type). This specification describes the translation from Unicode code points to typed values.
2 Translation
A program of the Data Definition Language (DDL) is translated into values. This translation happens in three phases: The lexical translation translates Unicode code points into words. The syntactical translation filters the resulting sequence of words and then translates these words into a sentence. The semantical translation translates the sentence into values.
3 Lexical Translation
The lexical translation translates a sequence of Unicode code points provided as input into a sequence of words.
The lexical translation of the Data Definition Language is based on the Common Lexical Translations (see https://michaelheilmann.com/specifications/common-lexical-translations for more information).
The lexical grammar consists of
- a set of non-terminals \(\textit{DDL.Lexical.NonTerminals}\) and a set of terminals \(\textit{DDL.Lexical.Terminals}\), which are disjoint,
- a set of production rules \(\textit{DDL.Lexical.ProductionRules}\), which are laid down in this section, and
- a starting symbol \(\text{DDL.Lexical.Words}\), which is an element of \(\textit{DDL.Lexical.NonTerminals}\).
The two rules, expressed in terms of the Common Lexical Translations, are as follows:
The lexical translation translates a sequence of Unicode code points into words. This resulting sequence of words is then consumed by the syntactical translation.
4 Syntactical Translation
The syntactical translation translates a sequence of words provided by the lexical translation into a sentence.
The syntactical grammar consists of
- a set of non-terminals \(\textit{DDL.Syntactical.NonTerminals}\) and a set of terminals \(\textit{DDL.Syntactical.Terminals}\), which are disjoint,
- a set of production rules \(\textit{DDL.Syntactical.ProductionRules}\), which are laid down in this section, and
- a starting symbol \(\textit{DDL.Syntactical.Sentence}\), which is an element of \(\textit{DDL.Syntactical.NonTerminals}\).
Important: The following words are removed from the sequence of words before its translation into a sentence:
- \(\text{DDL.Lexical.Whitespace}\),
- \(\text{DDL.Lexical.Newline}\), and
- \(\text{DDL.Lexical.Comment}\)
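The filtering step described above can be sketched as follows. This is a minimal illustration, not part of the specification: the `Word` class, its `kind` and `text` fields, and the function name `filter_words` are assumptions made for this example. Only the three word names are taken from the list above.

```python
from dataclasses import dataclass

# Kinds of words that are removed before the syntactical translation.
# The three names come from the list above; storing them as strings is
# an assumption of this sketch.
IGNORED_KINDS = frozenset({
    "DDL.Lexical.Whitespace",
    "DDL.Lexical.Newline",
    "DDL.Lexical.Comment",
})


@dataclass(frozen=True)
class Word:
    """Hypothetical representation of a word produced by the lexical translation."""
    kind: str  # name of the word, e.g. "DDL.Lexical.Whitespace"
    text: str  # the code points the word was built from


def filter_words(words):
    """Return the words that remain visible to the syntactical translation."""
    return [word for word in words if word.kind not in IGNORED_KINDS]
```

The relative order of the remaining words is preserved, so the syntactical translation sees them in the order in which they appeared in the input.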
4.1 DDL.Syntactical.Value
The sentence \(\text{DDL.Syntactical.Value}\) is defined by
4.2 DDL.Syntactical.String
The sentence \(\text{DDL.Syntactical.String}\) is defined by
4.3 DDL.Syntactical.Number
The sentence \(\text{DDL.Syntactical.Number}\) is
4.4 DDL.Syntactical.Boolean
The sentence \(\text{DDL.Syntactical.Boolean}\) is
4.5 DDL.Syntactical.Void
The sentence \(\text{DDL.Syntactical.Void}\) is
4.6 DDL.Syntactical.Map
The sentence \(\text{DDL.Syntactical.Map}\) is
4.7 DDL.Syntactical.List
The sentence \(\text{DDL.Syntactical.List}\) is
The syntactical translation translates a sequence of words into one sentence. The resulting sentence is then consumed by the semantical translation.
5 Semantical Translation
The semantical translation translates a sentence provided by the syntactical translation into a typed value. The Data Definition Language knows six basic types: \(\textit{List}\) and \(\textit{Map}\), which are the so-called aggregate types, and \(\textit{Boolean}\), \(\textit{Number}\), \(\textit{String}\), and \(\textit{Void}\), which are the so-called scalar types.
The type \(\textit{Value}\) is defined as the union of all the types above. \[\begin{aligned} \textit{Value} =&\;\textit{List}\\ \cup&\;\textit{Map}\\ \cup&\;\textit{Boolean}\\ \cup&\;\textit{Number}\\ \cup&\;\textit{String}\\ \cup&\;\textit{Void} \end{aligned}\]
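The union above can be mirrored in a host language as a tagged union. The following is a sketch under assumptions: the class layout, the field names, and the helpers `is_scalar` and `is_aggregate` are invented for illustration; only the six type names and the scalar/aggregate split come from this specification.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Void:
    """Scalar type without a payload (assumed representation)."""


@dataclass(frozen=True)
class Boolean:
    value: bool


@dataclass(frozen=True)
class Number:
    code_points: str  # kept as text; see the Number Type section


@dataclass(frozen=True)
class String:
    code_points: str


@dataclass(frozen=True)
class List:
    elements: tuple  # each element is itself a Value


@dataclass(frozen=True)
class Map:
    entries: tuple  # each entry is a (name, Value) pair


# Value is the union of all six types, as in the definition above.
Value = Union[List, Map, Boolean, Number, String, Void]


def is_scalar(value):
    return isinstance(value, (Boolean, Number, String, Void))


def is_aggregate(value):
    return isinstance(value, (List, Map))
```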
The translation of a sentence into values is described by syntax-directed translations (see Aho, Sethi, Ullman: Compilers, Principles, Techniques, and Tools; 1st edition; pp. 305 for more information).
At the end of a translation, the input syntactic form \(x\) has a variable \(x.\text{value}\) which is a value of type \(\textit{Value}\). Furthermore, for each syntactic form, we define two attributes: \(\text{value}\) is the computed value of the syntactic form, and \(\textit{codePoints}\) is the sequence of code points associated with the syntactic form.
5.1 Scalar Types
This section defines \(\sigma\) for the translation of scalar types.
5.1.1 Boolean Type
The type \(\textit{Boolean}\) has two values, \(\textit{true}\) and \(\textit{false}\), which are expressed in the language by the words \(\texttt{DDL.Lexical.true}\) and \(\texttt{DDL.Lexical.false}\), respectively (as defined in the syntactical grammar).
The semantical translation is defined as follows:
| \(\text{DDL.Syntactical.Boolean}@1 : \text{Lexical.Boolean}@2\) | \(\{ 1.\text{value} := 2.\text{value} \}\) |
| \(\text{Lexical.Boolean}@1 : \text{Lexical.True}\) | \(\{ 1.\text{value} := \textit{true} \}\) |
| \(\text{Lexical.Boolean}@1 : \text{Lexical.False}\) | \(\{ 1.\text{value} := \textit{false} \}\) |
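The three rules above can be read as a small recursive evaluation over syntactic forms. The sketch below is illustrative only: the `Form` class, its fields, and the function `evaluate_boolean` are assumptions of this example, while the rule names and the assignments to \(\text{value}\) follow the table.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Form:
    """Hypothetical syntactic form carrying the two attributes of this section."""
    kind: str                                     # name of the rule, as in the table
    children: list = field(default_factory=list)  # forms on the right-hand side
    code_points: str = ""                         # the codePoints attribute
    value: Optional[bool] = None                  # the value attribute, set on evaluation


def evaluate_boolean(form):
    """Apply the semantic action of the rule matching `form` and return its value."""
    if form.kind == "DDL.Syntactical.Boolean":
        # { 1.value := 2.value }
        (child,) = form.children
        form.value = evaluate_boolean(child)
    elif form.kind == "Lexical.Boolean":
        (child,) = form.children
        if child.kind == "Lexical.True":
            form.value = True    # { 1.value := true }
        elif child.kind == "Lexical.False":
            form.value = False   # { 1.value := false }
        else:
            raise ValueError(f"unexpected word: {child.kind}")
    else:
        raise ValueError(f"not a Boolean form: {form.kind}")
    return form.value
```

Evaluating the outer form also sets the \(\text{value}\) attribute of the inner form, which matches the bottom-up flow of the attribute in the rules.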
5.1.2 Number Type
The type \(\textit{Number}\) represents sequences of Unicode code points that adhere to the constraints defined by \(\textit{DDL.Syntactical.Number}\). These Unicode code point sequences represent arbitrary-precision integers and arbitrary-precision floating-point numbers.
The semantical translation is defined as follows:
Important: Implementations may want to use typical data types like \(\texttt{int}\) or \(\texttt{float}\). The Data Definition Language does not impose restrictions on the value ranges represented by number literals. Consequently, the numeric value represented by a value of type \(\textit{Number}\) might not be representable by typical data types like \(\texttt{int}\) or \(\texttt{float}\).
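One way an implementation might deal with this is to keep the exact value and check separately whether a fixed-size type can hold it. The sketch below assumes that a number literal uses a plain decimal notation accepted by Python's `int` and `decimal.Decimal` constructors. That notation is an assumption of this example, not something this specification states, and all function names are hypothetical.

```python
from decimal import Decimal


def to_exact(code_points):
    """Convert a number literal to an exact Python value (int or Decimal)."""
    try:
        return int(code_points)      # Python integers have arbitrary precision
    except ValueError:
        return Decimal(code_points)  # exact decimal value, no binary rounding


def fits_in_int64(number):
    """True if an integer is representable as a signed 64-bit integer."""
    return -(2 ** 63) <= number < 2 ** 63


def fits_in_float(code_points):
    """True if the literal is exactly representable as a binary double."""
    return Decimal(code_points) == Decimal(float(code_points))
```

With these helpers, a literal such as `18446744073709551616` converts to an exact integer but fails the 64-bit check, and `0.1` fails the double check because its binary approximation differs from the decimal value.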
5.1.3 String Type
The \(\textit{String}\) type represents sequences of Unicode code points that have the properties defined by \(\text{DDL.Syntactical.String}\).
The semantical translation is defined as follows:
5.1.4 Void Type
The type
The semantical translation is defined as follows:
5.2 List Type
The type \(\textit{List}\) represents a list of elements. Each element is a DDL node of type \(\textit{Value}\). A DDL node of type \(\textit{List}\) is expressed in the language by the sentence \(\text{DDL.Syntactical.List}\) (as defined in the syntactical grammar).
The semantical translation is defined as follows:
Example
// A list with three numbers 1, 2, and 3.
[ 1, 2, 3 ]
5.3 Map Type
The type \(\textit{Map}\) represents a list of entries. Each entry is a pair. The first element of a pair is a value of type \(\textit{Name}\); the second element is a value of type \(\textit{Value}\). A value of type \(\textit{Map}\) is expressed in the language by the sentence \(\text{DDL.Syntactical.Map}\) (as defined in the syntactical grammar).
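Because a map is described as a list of entries rather than as a dictionary, a host-language representation can keep the entries in order as pairs. The sketch below is an assumption-laden illustration: names are modelled as plain strings, and the helper `values_for` is hypothetical. This section does not say how repeated names are treated, so the sketch simply returns every matching value.

```python
from typing import Any, List, Tuple

# An entry is a (name, value) pair; a map is an ordered list of entries.
Entry = Tuple[str, Any]


def values_for(entries: List[Entry], name: str) -> List[Any]:
    """Return the values of all entries whose name equals `name`, in order."""
    return [value for entry_name, value in entries if entry_name == name]
```

An implementation that prefers dictionary semantics would have to choose a policy for repeated names, for example keeping the first or the last entry.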
The semantical translation is defined as follows:
5.4 Value Type
The type \(\textit{Value}\) is the union of the types \(\textit{Boolean}\), \(\textit{List}\), \(\textit{Map}\), \(\textit{Number}\), \(\textit{String}\), and \(\textit{Void}\).
The semantical translation is defined as follows: