Data Definition Language Schema (draft, v0.2)
1 Introduction
\( \def\ddl#1{{#1_{\textit{DDL}}}} \def\ddls#1{{#1_{\textit{DDLS}}}} \def\NumberLit#1{{\texttt{#1}}} \def\StringLit#1{{\texttt{#1}}} \def\true{{\text{true}}} \def\false{{\text{false}}} \def\CyclicDefinitionError{{\ddls{\textit{CyclicDefinitionError}}}} \def\SchemaAlreadySpecifiedError{{\ddls{\textit{SchemaAlreadySpecifiedError}}}} \def\SchemaNotFoundError{{\ddls{\textit{SchemaNotFoundError}}}} \def\InvalidDefinitionError{{\ddls{\textit{InvalidDefinitionError}}}} \def\ValidationError{{\ddls{\textit{ValidationError}}}} \)This document is the specification of the Data Definition Language Schema, or DDLS for short. DDLS is a declarative language for defining the structure and constraints for data described by a DDL programs.
DDLS programs and DDL programs are inputs to DDLS validators. DDLS validators are tools which validate an input DDL program against an input DDLS Schema program. These tools can be standalone or integrated into libraries and executables.
Notations
This specification uses the semantical form of the DDL as defined in Data Definition Language specification (see Data Definition Language for more information).
Given a DDL value \(x\), we denote the type of that value by \(\textit{type}(x)\).
The following DDLS types exist: The \(\ddls{\textit{Any}}\) type, the scalar types \(\ddls{\textit{Boolean}}\), \(\ddls{\textit{Number}}\), \(\ddls{\textit{String}}\), and \(\ddls{\textit{Void}}\), the aggregate types \(\ddls{\textit{List}}\), and \(\ddls{\textit{Map}}\), the choice type \(\ddls{\textit{Choice}}\), as well as the schema reference type \(\ddls{\textit{SchemaReference}}\).
The \(\textit{Value}_{\textit{DDLS}}\) type is defined as the union of the above types \[\begin{aligned} \textit{Value}_{\textit{DDLS}} =&\;\ddls{\textit{Any}}\\ \cup&\;\ddls{\textit{Boolean}}\\ \cup&\;\ddls{\textit{Choice}}\\ \cup&\;\ddls{\textit{List}}\\ \cup&\;\ddls{\textit{Map}}\\ \cup&\;\ddls{\textit{Number}}\\ \cup&\;\ddls{\textit{SchemaReference}}\\ \cup&\;\ddls{\textit{Void}}\\ \end{aligned}\]Furthermore, there exist a schema type \(\ddls{\textit{Schema}}\) and a map entry type \(\ddls{\textit{MapEntry}}\).
Given a DDLS value \(x\), we denote the type of that value by \(\textit{type}(x)\).
3 Semantics
The translation of a DDL value into DDLS values is described by syntax-directed translations (see Aho, Seti, Ullman: Compilers, Principles, Techniques, and Tools; 1st; pp. 305 for more information). At the end of a translation, the input DDL value \(x\) has a variable \(x.\text{value}\) which is either a value of type \(\ddls{\textit{Value}}\), if the translation was successful, or an value of type \(\ddls{\textit{Error}}\), if the translation failed.
The translation from DDL value into DDLS value can fail as not every DDL program is a DDLS program. To indicate failures, values of type \(\ddls{\textit{Error}}\) are used. The following values, called errors, of that type are defined:
- \(\CyclicDefinitionError\)
- \(\SchemaAlreadySpecifiedError\)
- \(\SchemaNotFoundError\)
- \(\InvalidDefinitionError\)
- \(\ValidationError\)
We denote the set of lists by \(\mathbb{L}\). The empty list is denoted by \(\left[\right]\). We denote the concatenation of two lists \(a\) and \(b\) by \(a \circ b\). The set of Unicode code point sequences is denoted by \(\mathbb{U}\).
3.1 Any type
The values of type \(\ddls{\textit{Any}}\) are represented by \(\textit{Map}\) values. The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{Any}\) of type \(\textit{String}\).
The translation function is hence given by
|
|
|
where \(\ddls{\textit{any}}\) is the single value of type \(\ddls{\textit{Any}}\).
3.2 Boolean types
The values of type \(\ddls{\textit{Boolean}}\) are represented by a \(\textit{Map}\) values. The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{Boolean}\) of type \(\textit{String}\).
The translation function is hence given by
|
|
|
where \(\ddls{\textit{boolean}}\) is the single value of type \(\ddls{\textit{Boolean}}\).
3.3 Choice type
The values of type \(\ddls{\textit{Choice}}\) are represented by \(\textit{Map}\) values. The map contains two entries.
The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{Choice}\) of type \(\textit{String}\).The translation function is hence given by
|
|
|
where \(\ddls{\textit{any}}\) is the single value of type \(\ddls{\textit{Any}}\).
Auxiliary definition: toChoices
\(\text{toMapEntries} : \ddl{\textit{List}} \rightarrow \mathbb{L} \cup \ddls{\textit{Error}}\) is defined as
|
|
|
|
|
|
|
|
|
3.4 List type
Values of type \(\ddls{\textit{List}}\) are represented by a \(\textit{Map}\) values. The map contains two entries.
The translation function is hence given by
|
|
|
3.5 Map type
A value of type \(\ddls{\textit{Map}}\) are represented by a \(\ddl{\textit{Map}}\) value. The map contains two entries.
The translation function is hence given by
|
|
|
Auxiliary definition: toMapEntries
\(\text{toMapEntries} : \ddl{\textit{List}} \rightarrow \mathbb{L} \cup \ddls{\textit{Error}}\) is defined as
|
|
|
|
|
|
|
|
|
A value of type \(\ddls{\textit{MapEntry}}\) are represented by a \(\textit{Map}\) value. The map contains three entries.
The translation function for a \(\ddls{\textit{MapEntry}}\) is hence given by
|
|
|
3.6 Number types
The values of type \(\ddls{\textit{Number}}\) are represented by a \(\textit{Map}\) values. The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{Number}\) of type \(\textit{String}\).
The translation function is hence given by
|
|
|
where \(\ddls{\textit{number}}\) is the single value of type \(\ddls{\textit{Number}}\).
3.8 Schema type
A value of type \(\ddls{\textit{Schema}}\) are represented by a \(\textit{Map}\) value. The map contains three entries.
The translation function is hence given by
|
|
|
3.7 Schema Reference type
A value of type \(\ddls{\textit{SchemaReference}}\) are represented by a \(\textit{Map}\) value. The map contains two entries.
The translation function is hence given by
|
|
|
3.9 String type
The values of type \(\ddls{\textit{String}}\) are represented by a \(\textit{Map}\) values. The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{String}\) of type \(\textit{String}\).
The translation function is hence given by
|
|
|
where \(\ddls{\textit{string}}\) is the single value of type \(\ddls{\textit{String}}\).
3.10 Void type
The values of type \(\ddls{\textit{Void}}\) are represented by a \(\textit{Map}\) values. The map contains a single map entry of name \(\texttt{kind}\) with the value \(\texttt{Void}\) of type \(\textit{String}\).
The translation function is hence given by
|
|
|
where \(\ddls{\textit{void}}\) is the single value of type \(\ddls{\textit{Void}}\).
\( \def\ValidationContext{{\ddls{\textit{Context}}}} \)4 Validation
Given a value \(x \in \ddls{\textit{Schema}}\), we denote the 2nd element of the tuple, the name of the schema, by \(x.\text{name}\). We denote the 3rd element of the tuple, the definition, by \(x.\text{definition}\). Given a value \(x \in \ddls{\textit{SchemaReference}}\), we denote the 2nd element of the tuple, the name of the schema, by \(x.\text{name}\). Given a value \(x \in \ddls{\textit{List}}\), we denote the 2nd element of the tuple, the element, by \(x.\text{element}\), Given a value \(x \in \ddls{\textit{Map}}\), we denote the 2nd element of the tuple, the entries, by \(x.\text{entries}\), Given a value \(x \in \ddls{\textit{MapEntry}}\), we denote the 2nd element of the tuple, the name, by \(x.\text{name}\) and the 3d element, the value, by \(x.\text{value}\).
A validation context \(c \in \ValidationContext\) is a list of \(\ddls{\textit{Schema}}\) values.
4.1 DDL value validation
The DDL value validation function \[ \text{validate} : \ValidationContext \times \mathbb{U} \times \ddl{\textit{Value}} \rightarrow \mathbb{B} \] maps from a validation context \(c\), the name of a starting schema \(s\), and a DDL value \(v\) to boolean values: It evaluates to \(\text{true}\) if the DDL value was accepted by the validation context with the given starting schema name. Otherwise it returns \(\text{false}\).
The function is defined as follows:
- Return a \(\textit{false}\) if there is no \(a \in c\) with \(a.\text{name} = \text{s}\).
- Return a \(\textit{false}\) if there two schemas \(a, b \in c\) such that \(a \neq b\) but \(a.\text{name} = b.\text{name}\).
- Otherwise let \(a \in c\) with \(a.\text{name} = s\).
- Evaluate to \(\text{validate}'(c, a, v)\)
4.2 DDL value validation'
The DDL value validation' function \[ \text{validate}' : \ValidationContext \times \ddls{\textit{Value}} \times \ddl{\textit{Value}} \rightarrow \mathbb{B} \] maps from a validation context \(c\), a DDLS value \(a\), and a DDL value \(v\) to boolean values: It evaluates to \(\text{true}\) if the DDL value was accepted by the validation context/DDLS value. Otherwise it returns \(\text{false}\).
The function is defined as follows:
- If \(a \in \ddls{\textit{SchemaReference}}\) let \(a = \text{resolve}(c, a)\). if \(a = \false\) return \(\false\).
- If \(v\) is a DDL boolean value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{Boolean}}\).
- Otherwise return \(\true\).
- If \(v\) is a DDL list value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{List}}\).
- Otherwise return \(\text{listValidation}(c, a, v)\).
- If \(v\) is a DDL map value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{Map}}\).
- Otherwise return \(\text{mapValidation}(c, a, v)\).
- If \(v\) is a DDL number value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{Number}}\).
- Otherwise return \(\true\).
- If \(v\) is a DDL string value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{String}}\).
- Otherwise return \(\true\).
- If \(v\) is a DDL void value:
- Return \(\false\) if \(a\) is not a value of type \(\ddls{\textit{Void}}\).
- Otherwise return \(\true\).
4.3 DDL list value validation
The DDL list value validation function \[ \text{validateList} : \ValidationContext \times \ddls{\textit{Value}} \times \ddl{\textit{List}} \rightarrow \mathbb{B} \] maps from a validation context \(c\), a DDLS value \(a\), and a DDL list value \(v\) to boolean values: It evaluates to \(\true\) if the DDL list value was accepted by the validation context/DDLS value. Otherwise it returns \(\false\).
The function is defined as follows:
- If \(a \not\in\ddls{\textit{List}}\), return \(\false\).
- \(\textit{validate}'\left(c, a.\text{element}, w\right)\) is evaluated for each \(w \in v.\textit{elements}\).
- If there is an evaluation \(\false = \textit{validate}\left(c, a.\text{element}.\text{name}, w\right)\) then return \(\false\). Otherwise return \(\true\).
4.4 DDL map value validation
The DDL map value validation function \[ \text{validateMap} : \ValidationContext \times \ddls{\textit{Value}} \times \ddl{\textit{Map}} \rightarrow \mathbb{B} \] maps from a validation context \(c\), a DDLS value \(s\), and a DDL map value \(v\) to boolean values: It evaluates to \(\true\) if the DDL map value was accepted by the validation context/DDLS value. Otherwise it returns \(\false\).
The function is defined as follows:
- If \(a \not\in \ddls{\textit{Map}}\), return \(\text{false}\).
- Return a \(\false\) if there exist \(b', b'' \in a.\textit{entries}\) such that \(b' \neq b''\) and \(b'.\text{name} = b''.\text{name}\). That is, the DDLS map value must not have two map entries with the same name.
- Return a \(\false\) if there exist \(u', u'' \in v\) such that \(u' \neq u''\) and \(u'_0 = u''_0\). That is, the DDL map value must not have two map entries with the same name.
- Return \(\false\) if there exists \(u' \in v\) such that no \(b' \in a.\textit{entries}\) with \(u'_0 = b'.\text{name}\). That is, if a map entry with a certain name was specified in \(v\) then it must be present in \(a\).
- Return \(\false\) if there exists \(b' \in a.\text{entries}\) such that no \(u' \in v\) with \(u'_0 = b'.\text{name}\). That is, if a map entry with a certain name is present in \(a\) then it must be specified in \(v\).
- For each \(b' \in a.\textit{entries}\) and for each \(u' \in v\) \(\textit{validate}'\left(x, b'_1, u'_1\right)\) is invoked if \(u'_0 = b'.\text{name}\). If there is an invocation \(\false = \textit{validate}'\left(c, b'_1, u'_1\right)\) return \(\false\). Otherwise return \(\true\).
4.4 Schema reference resolution
The DDL value validation function \[ \text{resolve} : \ValidationContext \times \ddls{\textit{SchemaReference}} \rightarrow \ddls{\textit{Schema}} \cup \mathbb{B} \] maps from a validation context \(c\) and a DDLS schema reference \(s\) to a DDLS schema or to \(\false\):
The function is defined as follows:
- Let \(v = \emptyset\).
- While \(s\) is a \(\ddls{\textit{SchemaReference}}\) type:
- Find \(s' \in c, s' \in \ddls{\textit{Schema}}\) such that \(s'.\text{name} = s.\text{name}\). Return a \(\textit{false}\) if no such \(s'\) was found.
- Return a \(\textit{false}\) if \(s' \in v\).
- Let \(s = s'\) and \(v = v \cup \{ s' \}\).
- Return \(s\).
4.4 Acceptance and Rejection
\(\textit{validate}(S, s, x)\) always terminates. \(x\) was accepted by \(S\) with \(s\) if the result is \(\ddls{\textit{success}}\). Otherwise - if the result is in \(\ddls{\textit{Error}}\) then \(x\) was rejected by \(S\) with \(s\).