Chapter 3 Language definition
3.1 Preliminary definitions and notations
In this implementation, the data are represented as
lists of objects. This section defines what
we mean here by objects and lists. It then provides definitions
of basic functions that will be useful when defining the language
in the next section.
3.1.1 Objects
An object can be
either nil,atomic or complex.
nil is an object that denotes undefined values. The type
of nil is Tnil. An atomic object can be of type
Tboolean, Tinteger, Tdouble or Tstring, whereas a
complex objects represents a DOM Node.
Obj denotes the set of objects, including nil.
Obj* denotes the set of non-nil objects.
Atomic objects:
Atomic objects are wrappers for primitive data types.
The value of an atomic object is the
corresponding basic atomic value. TRUE (resp. FALSE) is the
boolean object which value is true (resp. false).
Bool (resp
Int,Dbl,Str)
denotes the set of boolean (resp integer,
double, string) objects.
Atom denotes Bool È Int È Dbl È
Str.
Num denotes Int È Dbl.
atomic is the factory function that can be used to build atomic
objects. For instance, atomic(1) returns the integer object i
such that value(i)=1.
For clarity, we will in the sequel use notation shortcuts. For instance:
"a string" is equivalent to atomic("a string");
1 is equivalent to atomic(1);
1.0 + 4.5 is equivalent to atomic(value(atomic(1.0))
+value(atomic(4.5))).
Complex objects:
Complex objects are wrappers for DOM Nodes.
The W3C Recommandation [WG98] defines 12
different types of nodes, but we will consider here only DOM nodes of
type Element, Attr (for attributes) and Text.
The corresponding complex object types are Telement,
Tattr, and Ttext.
Text (resp Attr,Elt) denotes the set of
complex objects of type Ttext (resp Tattr,
Telement).
Cplx denotes Text È Attr È Elt.
Each complex object has a node attribute. It provides
access to the wrapped DOM Node. Each t in Text (resp. a in
Attr, e in Elt) has a textNode (resp. attrNode, elementNode) attribute. It provides access to the
wrapped Text (resp Attr, Element) DOM node.
complex is the factory fonction that can be used to build
complex objects. If n is a DOM node, complex(n) returns a
complex object of type corresponding to the type of n, or nil
otherwise.
3.1.2 Lists
If o1, ..., on are in Obj*,
á o1,...,on ñ
denotes the corresponding ordered list of objects.
á ñ denotes the empty list.
L denotes the set of lists containing zero or more objects
in Obj*.
L* denotes the set of lists containing one or more
objects in Obj*.
Let o1,...,on in Obj. Then
á o1,...,on ñ denotes the list l in L
constructed as follows :
l=cons(o1,cons(o2,...,cons(on,á ñ)...))
where cons is defined on Obj × L as follows:
let o is Obj*,
l'=á o1',...,om' ñ in L*
| cons(nil,á ñ) | = | á ñ |
| cons(nil,l') | = | l' |
| cons(o,á ñ) | = | á o ñ |
| cons(o,l') | = | á o,o1',...,om' ñ | |
|
For instance, á nil,1,nil,"a string" ñ denotes the element
á 1,"a string" ñ of L.
union:
The union operator is denoted ||. It is defined on
L × L as follows:
let l1=á o11,...,o1n ñ in L*
,l2=á o21,...,o2m ñ in L*.
| á ñ FLUnion á ñ = á ñ |
| l1 || á ñ = á ñ || l1 = l1 |
| l1 || l2=á o11,...,o1n,o21,...,o2m ñ | |
3.1.3 Object functions and predicates
A n-ary object function is a function that takes n,
with n>0, objects as arguments and returns an object.
We will denote Fn the set of n-ary object functions.
The set of unary (resp. binary) object functions is
F1 (resp. F2).
A list function is a function that takes one object as
argument and returns a list in L.
Fl denotes the set of list functions.
An n-ary object predicate is a n-ary object function that returns either
TRUE or FALSE.
We will denote Pn the set of n-ary object predicates.
The set of unary (resp. binary) object predicates is
P1 (resp. P2).
Unary predicates
is:
If cst is in Obj, iscst
is a unary object predicate. It is
defined on Obj as follows:
let o in Obj
| iscst(o) = |
ì
í
î
|
|
| TRUE | if o is the object cst |
| FALSE | otherwise | |
|
We will in the sequel make use of isnil,isTRUE and isFalse.
in:
If set is a set of objects, inset is a unary object
predicate. It is defined on Obj as follows:
let o in Obj
| inset(o)= |
ì
í
î
|
|
| TRUE | if o Î set |
| FALSE | otherwise | |
|
We will in the sequel make use of inBool
,inInt ,inDbl ,inNum ,inStr
,inAtom ,inText ,inAttr ,inElt and
inCplx.
Conversion functions
string:
string is a unary object function that returns a string object
or nil. string is defined on Obj as follows:
let o in Obj
| type(o) | string(o) |
| Tnil | nil |
| Tboolean | "true" if isTRUE(o), "false" otherwise. |
| Tstring | o |
| Tinteger | a string object representing this integer. |
| Tdouble | a string object representing this double,
in the style [-]ddd.ddd |
| Ttext | o.textNode.nodeValue() |
| Tattr | o.attrNode.nodeValue() |
| Telement | string(o)=StrElt(o), where StrElt is
defined above. |
|
|
procedure StrElt(TElement e) |
| begin | |
| | result := nil; |
| | l := e.elementNode.getChildNodes(); |
| | for each | n in l |
| | | if | n Î Text |
| | | | result := result + string(n) |
| | | endif |
| | endfor |
| | return result; |
| end
|
text:
text is a unary object function that returns a string object
or nil. It is defined on Obj as follows:
let o in Obj
| text(o) = |
ì
í
î
|
|
| textElt(o) | if inElt(o) |
| string(o) | otherwise | |
|
where TextElt is the following procedure:
|
procedure TextElt(TElement e) |
| begin | |
| | result := nil; |
| | l := e.elementNode.getChildNodes(); |
| | for each | n in l |
| | | if | n Î Text |
| | | | result := result + string(n) |
| | | endif |
| | | if | inElt(n) |
| | | | result := result + TextElt(n) |
| | | endif |
| | endfor |
| | return result; |
| end
|
boolean:
boolean is a unary object function that returns a boolean
object or nil. boolean is defined on
Obj as follows:
let o in Obj
| type(o) | boolean(o) |
| Tnil | FALSE |
| Tboolean | o |
| Tstring | TRUE if the lowercased value of o is "true",
FALSE otherwise |
| Tinteger | FALSE if o=0, TRUE otherwise |
| Tdouble | FALSE |
| other | boolean(string(o)) |
|
integer:
integer is a unary object function that returns an integer
object or nil. integer is defined on
Obj as follows:
let o in Obj
| type(o) | integer(o) |
| Tnil | nil |
| Tboolean | nil |
| Tstring | an integer if o is a string representation of an
integer, nil otherwise |
| Tinteger | o |
| Tdouble | an integer if the decimal part of o equals 0,
nil otherwise. |
| other | integer(string(o)) |
|
double:
double is a unary object function that returns a double
object or nil. double is defined on
Obj as follows:
let o in Obj
| type(o) | double(o) |
| Tnil | nil |
| Tboolean | nil |
| Tstring | a double if o is a string representation od a
double, nil otherwise |
| Tinteger | a double d such that d=o |
| Tdouble | o |
| other | double(string(o)) |
|
num:
num is a unary object function that returns an object in Num
or nil. num is defined on Obj
as follows:
let o in Obj
| num(o)= |
ì
í
î
|
|
| integer(o) | if integer(o) ¹ nil |
| double(o) | otherwise | |
|
Comparison functions
less:
less is a binary object function. If its two arguments are
coercible into numeric objects, it returns the result of the
comparison between them, and nil otherwise.
less is defined on Obj × Obj as follows:
let o1 in Obj, o2 in Obj,
let n1=num(o1), n2=num(o2)
| less(o1,o2)= |
ì
í
î
|
|
| nil | if n1=nil or n2=nil |
| TRUE | if n1<n2 |
| FALSE | otherwise | |
|
greater,lessEqual and greaterEqual are defined in a similar way.
compare:
compare is a binary object function. It performs a comparison
by value between its two arguments.
compare is defined on Obj × Obj as follows:
Let o1 in Obj,o2 in Obj
- if o1=nil or o2=nil,
compare(o1,o2) = nil
- if o1 Î Num or o2 Î Num,
compare(o1,o2) = less( num(o1), num(o2))
- if o1 Î Str È Text È Attr
or o2 Î Str È Text È Att,
compare(o1,o2) = strcomp(o1,o2)
- if o1 Î Elt and o2 Î Elt,
compare(o1,o2) = eltcomp(o1,o2)
where strcomp is defined on Obj × Obj as
follows:
let o1 in Obj,o2 in Obj,
let s1=string(o1), s2=string(o2)
| strcomp(o1,o2)= |
ì
í
î
|
|
| nil | if s1=nil or s2=nil |
| TRUE | if s1=s2 |
| FALSE | otherwise | |
|
and eltcomp is defined on Elt × Elt as
follows:
let e1 in Elt,e2 in Elt
| eltcomp(e1,e2)= |
ì
í
î
|
|
| TRUE | if e1 and e2 have the same structure
(identical names, attributes and children) |
| FALSE | otherwise | |
|
Arithmetic functions
unaryPlus:
unaryPlus is defined on Obj as follows:
let o in Obj
unaryPlus(o)=num(o)
unaryMinus:
unaryPlus is defined on Obj as follows:
let o in Obj
unaryMinus(o)= -num(o)
plus:
plus is defined on Obj × Obj as follows:
let o1 in Obj, o2 in Obj
| plus(o1,o2)= |
ì
í
î
|
|
| nil | if num(o1)=nil or num(o2)=nil |
| num(o1) + num(o2) | otherwise | |
|
minus, mul and div are defined in the same way.
DOM operations
name:
name is an object unary function that returns the name of an
object. name is defined on Obj as follows:
Let o in Obj.
| name(o)= |
ì
í
î
|
|
| o.node.getNodeName() | if o Î Cplx |
| nil | otherwise | |
|
label:
label is an object unary function that returns the label of an
object. label is defined on Obj as follows:
let o in Obj.
| name(o)= |
ì
í
î
|
|
| o.node.getNodeName() | if o Î (Attr È Elt) |
| nil | otherwise | |
|
attributes:
attribute is a list function that returns a list of
the attributes nodes of an object. attributes is defined on
Obj as follows:
let o in Obj
| name(o)= |
ì
í
î
|
|
| Attrs(o) | if o Î Cplx |
| á ñ | otherwise | |
|
|
procedure Attrs(TComplex o) |
| begin | |
| | result := á ñ; |
| | attrs := e.node.getAttributes(); |
| | if | attrs == null |
| | | return result; |
| | for each | attr in attrs |
| | | result := cons(attr,result) |
| | endfor |
| | return result; |
| end
|
childNodes:
childNodes is a list function that returns a list of
the child nodes, at the first level in the DOM tree, of an
object.
childNodes is defined on Obj as follows:
let o in Obj
| childNodes(o)= |
ì
í
î
|
|
| Children(o) | if o Î Cplx |
| á ñ | otherwise | |
|
where Children is the following procedure:
|
procedure Children(TComplex o) |
| begin | |
| | result := á ñ; |
| | children := e.node.getChildNodes(); |
| | if | children == null |
| | | return result; |
| | for each | child in children |
| | | result := result || á child ñ |
| | endfor |
| | return result; |
| end
|
descNodes:
descNodes is a list function that returns a list of
the child nodes, at any depth in the DOM tree, of an object.
descNodes is defined on Obj as follows:
let o in Obj
| descNodes(o)= |
ì
í
î
|
|
| Descendants(o) | if o Î Cplx |
| á ñ | otherwise | |
|
where Descendants is the following procedure:
|
procedure Descendants(TComplex o) |
| begin | |
| | result := á ñ; |
| | children := e.node.getChildNodes(); |
| | if | children == null |
| | | return result; |
| | for each | child in children |
| | | result := result ||
cons(child,Descendants(child)) |
| | endfor |
| | return result; |
| end
|
3.1.4 Aggregate functions and predicates
An aggregate function is a function that takes a list in L
as argument and returns an object in Obj.
An aggregate predicate is a list function that returns either
TRUE or FALSE.
Aggregate functions
first:
Let l=á o1,...,on ñ in L*.
first is defined on L as follows:
first(á ñ)= nil
first(l)=o1
last:
Let l=á o1,...,on ñ in L*.
last is defined on L as follows:
last(á ñ)= nil
last(l)=on
count:
Let l=á o1,...,on ñ in L*.
count is defined on L as follows:
count(á ñ)= 0
count(l)=n
Aggregate predicates
empty:
empty is an aggregate predicate. It is defined on
L as follows:
let l in L
| empty(l)= |
ì
í
î
|
|
| TRUE | if count(l)=0 |
| FALSE | otherwise | |
|
3.1.5 Selection operator
The selection operator s filters a list according to an
object unary predicate.
Let l in L*, p in P1. s is defined on
L × P1 as follows:
| s(á ñ,p) | = | á ñ |
| s(l,p) | = | á o, o Î l, p(o) ñ | | |
3.1.6 Mapping operators
We introduce two mapping operators. Map is the classical
mapping operator on lists. Mapl is a generalized version of
Map, for the case of the mapping function is in Fl.
Let l in L*, f in F1. Map is defined on
L × F1 as follows:
| Map(á ñ,f) | = | á ñ |
| Map(l,f) | = | á f(o1),...,f(on) ñ | | |
Let l in L*, f in Fl. Mapl is defined on
L × Fl as follows:
| Mapl(á ñ,f) | = | á ñ |
| Mapl(l,f) | = | f(o1) || ... || f(on) | | |
3.1.7 Existantial quantifier
Let p in P1, l in L. Exist(l,p) returns TRUE
if there exist o in l such that p(o)=TRUE, and FALSE
otherwise.
Exist is defined on L × P1 as follows:
| Exist(l,p)= |
ì
í
î
|
|
| TRUE | if empty(Map(l,p))=FALSE |
| FALSE | otherwise | |
|
3.1.8 Universal quantifier
Let p in P1, l in L*.
All is defined on L × P1 as follows:
| All(á ñ,p) | = | FALSE |
| All(l,p) | = |
ì
í
î
|
|
| TRUE | if for each o in l, p(o)=TRUE |
| FALSE | otherwise | |
| | | |
3.1.9 Join operator
Let l1 in L*,l2 in L*,
f in F2, p in P2.
Join(l1,á ñ,f,p)=Join(á ñ,l1,f,p)=á ñ
Join(l1,l2,f,p)=á f(o1,o2), o1 Î l1, o2 Î l2,
p(o1,po2) ñ
If the parameter p is omitted, its default value is the
object binary predicate that returns always TRUE.
3.2 X-OQL expressions
X-OQL is a functional language. It is defined by a set of basic
queries and functions and a way of building new ones through composition.
3.2.1 Definitions
An X-OQL expression is a string of the form query ;.
The value of an expression is computed by calling the evaluation function E with the argument query. Given
a query q, E(q) returns a list in L.
We will also consider in the boolean evaluation
functions Eb. Eb takes as argument a query and returns either TRUE
or FALSE.
We will in the sequel provide the syntax of basic queries and
functions. For each of them, the corresponding value of E and
Eb is given. If, given a query q, only E(q) or Eb(q)
is given, then the following rule applies:
-
if only E(q) is given, then
Eb(q)=FALSE if E(q) is an empty collection of
objects, TRUE otherwise;
-
if only Eb(q) is given, then E(q) = á Eb(q) ñ.
3.2.2 Basic queries
Atomic literals
Literals have the usual syntax. For instance, an integer literal
is a sequence of digits, a string literal is a double quoted
sequence of characters.
- true is a query.
Eb(true) = TRUE
- false is a query.
Eb(false) = FALSE
- If i is an atomic literal, then i is a query.
E(i) = á i ñ
List construction
- {} is a query.
E({})= á ñ
- If X and Y are queries, then X union Y is a query.
E(X union Y)= E(X) || E(Y)
- If X1,...,Xn are queries, then
{X1,...,Xn} is a query.
E({X1,...,Xn})=
E(X1) || ... || E(Xn)
Boolean connectors
- If X is a query, then not X is a query.
| Eb(not X) = |
ì
í
î
|
|
| TRUE | if Eb(X)=FALSE |
| FALSE | otherwise | |
|
- If X and Y are queries, then X and Y is a query.
| Eb(X and Y) = |
ì
í
î
|
|
| TRUE | if Eb(X)=TRUE and Eb(Y)=TRUE |
| FALSE | otherwise | |
|
- If X and Y are queries, then X or Y is a query.
| Eb(X and Y) = |
ì
í
î
|
|
| TRUE | if Eb(X)=TRUE or Eb(Y)=TRUE |
| FALSE | otherwise | |
|
Relational queries
- If X and Y are queries, then X=Y is a query.
Eb(X=Y) returns
TRUE if there exist x in E(X), y in E(Y) such that
compare(x,y)=TRUE.
Eb(X=Y)=Exist(Join(E(X),E(Y),compare),isTRUE)
- If X and Y are queries, then X<Y is a query.
Eb(X<Y)
returns TRUE if there exist x in E(X), y in
E(Y) such that x and y can be converted into numerical
values nx, ny, and nx<ny.
Eb(X<Y)=Exist(Join(Map(E(X),num),Map(E(Y),num),less),isTrue)
- If X and Y are queries, then X<=Y is a query.
Eb(X<=Y)=Exist(Join(Map(E(X),num),Map(E(Y),num),lessEqual),isTrue)
- If X and Y are queries, then X>Y is a query.
Eb(X>Y)=Exist(Join(Map(E(X),num),Map(E(Y),num),greater),isTrue)
- If X and Y are queries, then X>=Y is a query.
Eb(X>=Y)=Exist(Join(Map(E(X),num),Map(E(Y),num),greaterEqual),isTrue)
3.2.3 Arithmetic operators
- If X is a query, then +X is query.
E(+X) = Map(X,unaryPlus)
- If X is a query, then -X is query.
E(+X) = Map(X,unaryMinus)
- If X and Y are queries, then X+Y is a query.
E(X+Y)=Join(Map(X,num),Map(Y,num),plus)
- If X and Y are queries, then X-Y is a query.
E(X-Y)=Join(Map(X,num),Map(Y,num),minus)
- If X and Y are queries, then X*Y is a query.
E(X*Y)=Join(Map(X,num),Map(Y,num),mul)
- If X and Y are queries, then X div Y is a query.
E(X div Y)=Join(Map(X,num),Map(Y,num),div)
3.2.4 Path expressions
A path expression is a query of the form [query] [navigation
operator] [path constraint expression].
A navigation operator can be @, / or //.
Starting from a complex object o, @ allows to reach the attributes
of o, whereas / (resp. // provides access to the
child (resp. descendant) nodes of o.
A path constraint expression can be used to restrict the
collection of attributes (resp. child nodes, descendant nodes)
that a path expression defines. A path constraint
expression P is evaluated as regular expression r using
the function R.
Definitions of path constraint expressions and the
corresponding value of R are given below.
Let r a regular expression, o1,o2 in Object, such that o2 is a descendant of o1.
| path(o1,o2)= |
ì
í
î
|
|
| label(o2) | if o2 is a child of o1 |
| path(o1,parent(o2)) + "/" + label(o2) | otherwise | |
|
| rr(o1,o2) = |
ì
í
î
|
|
| TRUE | if r(path(o1,o2),r) |
| FALSE | otherwise | |
|
Accessing attributes
- If X is a query, P is path constraint expression, X@P is a query.
| E(X@P)=Join(E(X)[x],attributes(x)[y],y,r |
| (x,y))
|
Accessing child elements
- If X is a query, P is path constraint expression, X/P is a query.
| E( X/P)=s(Join(E(X)[x],childNodes(x)[y],y,r |
| (x,y)),in | Elt)
|
Accessing descendant elements
- If X is a query, P is path constraint expression, X//P is a query.
| E(X//P)=s(Join(E(X)[x],descNodes(x)[y],y,r |
| (x,y)),in | Elt)
|
3.2.5 Predicates
- If X is a query, and p is the name of a unary object predicate, then
p(X) and X/p() are queries.
E(p(X)) = map(E(X),p)
Eb(p(X)) = exist(E(p(X)),isTrue)
E(X/p()) = E(p(X))
Eb(X/p()) = Eb(p(X))
- If X is a query, and p is the name of a list predicate, then
p(X) and X/p() are queries.
Eb(p(X)) = p(E(X))
Eb(X/p()) = Eb(p(X))
3.2.6 Function calls
- If X is a query, and f is the name of a unary object function, then
f(X) and X/f() are queries.
E(f(X)) = Map(E(X),f)
E(X/f()) = E(f(X))
- If X is a query, and f is the name of a list function, then
f(X) and X/f() are queries.
E(f(X)) = Mapl(E(X),f)
E(X/f()) = E(f(X))
- If X is a query, and f is the name of an aggregate function, then
f(X) and X/f() are queries.
E(f(X)) = á E(X) ñ
E(X/f()) = E(f(X))