CXML/doc/dom.html
2005-12-26 22:04:08 +00:00

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Closure XML</title>
<link rel="stylesheet" type="text/css" href="cxml.css"/>
</head>
<body>
<div class="sidebar">
</div>
<h1>The DOM implementation</h1>
<p>
CXML implements the DOM Level 2 Core interfaces.&nbsp; For details
on DOM, please refer to the <a
href="http://www.w3.org/TR/DOM-Level-2-Core/core.html">specification</a>.
</p>
<a name="parser"/>
<h3>Parsing into DOM</h3>
<p>
To parse an XML document into a DOM tree, use the SAX parser with a
DOM builder as the SAX handler. Example:
</p>
<pre>(cxml:parse-file "test.xml" (dom:make-dom-builder))</pre>
<p>
<div class="def">Function DOM:MAKE-DOM-BUILDER ()</div>
Create a SAX handler which builds a DOM document.
</p>
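<p>
As a minimal sketch of working with the result, the following parses
a file and inspects its root element.&nbsp; It assumes that a
well-formed <tt>test.xml</tt> exists in the current directory; the
variable names are illustrative only.
</p>
<pre>(let* ((document (cxml:parse-file "test.xml" (dom:make-dom-builder)))
       (root (dom:document-element document)))
  ;; Return the root element's node type and tag name as two values.
  (values (dom:node-type root)
          (dom:tag-name root)))</pre>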
<a name="serialization"/>
<h3>Serializing DOM</h3>
<p>
To serialize a DOM document, pass a SAX serialization sink as the
handler argument to <tt>dom:map-document</tt>, which generates SAX
events for the DOM tree.
</p>
<p>
In addition, convenience functions such as
<tt>unparse-document</tt> are provided as thin wrappers around
<tt>map-document</tt>.
</p>
<p>
<div class="def">Function DOM:MAP-DOCUMENT (handler document &amp;key include-xmlns-attributes include-default-values include-doctype)</div>
Traverse a DOM document and call SAX functions as if an XML
representation of the document were being processed by a SAX parser.
</p>
<p>Keyword arguments:</p>
<ul>
<li>
<tt>include-xmlns-attributes</tt> -- defaults to
<tt>sax:*include-xmlns-attributes*</tt>
</li>
<li>
<tt>include-doctype</tt> -- One of <tt>nil</tt> (no doctype
declaration), <tt>:full-internal-subset</tt> (include a doctype
declaration and the full internal subset), or
<tt>:canonical-notations</tt> (write a doctype declaration
with an internal subset including only notations, as required
for canonical serialization).
</li>
<li>
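<tt>include-default-values</tt> -- if true, also report attribute
nodes whose <tt>dom:specified</tt> is <tt>nil</tt>, i.e. attributes
whose value comes from a default declaration rather than from the
document text.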
</li>
</ul>
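<p>
A sketch of serializing through <tt>dom:map-document</tt>
directly.&nbsp; It assumes that <tt>cxml:make-octet-vector-sink</tt>
can be called without arguments, that <tt>test.xml</tt> exists, and
that the call returns whatever the sink yields at the end of the
document (for this sink, presumably a vector of octets).
</p>
<pre>(let ((document (cxml:parse-file "test.xml" (dom:make-dom-builder))))
  ;; The sink is the SAX handler; the DOM tree is replayed into it.
  (dom:map-document (cxml:make-octet-vector-sink)
                    document
                    :include-doctype :canonical-notations))</pre>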
<p>
<div class="def">Function CXML:UNPARSE-DOCUMENT (document stream &amp;rest keys)</div>
<div class="def">Function CXML:UNPARSE-DOCUMENT-TO-OCTETS (document &amp;rest keys) =&gt; vector</div>
</p>
<p>
Serialize a DOM document object. These convenience functions are
wrappers around <tt>dom:map-document</tt>.
</p>
<p>Keyword arguments are passed on to the sink.&nbsp; See <a
href="using.html#serialization">cxml:make-octet-vector-sink</a>.</p>
<p>Notes:</p>
<ul>
<li>
If keyword argument <tt>canonical</tt> is specified as 2, a
doctype declaration will be written that includes notations
declared in the document.
</li>
<li>
If namespace processing is enabled
(<tt>sax:*namespace-processing*</tt>), a <a
href="using.html#misc">namespace normalizer</a> is used.
</li>
</ul>
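<p>
The convenience wrapper can be sketched as follows, again assuming
that <tt>test.xml</tt> exists.&nbsp; The <tt>:canonical</tt> keyword
is handed on to the sink as described in the notes above.
</p>
<pre>(let ((document (cxml:parse-file "test.xml" (dom:make-dom-builder))))
  ;; Returns the serialized document as a vector.
  (cxml:unparse-document-to-octets document :canonical 2))</pre>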
<a name="mapping"/>
<h3>DOM/Lisp mapping</h3>
<p>
Note that there is no "standard" DOM mapping for Lisp.
</p>
<p>
DOM is <a
href="http://www.w3.org/TR/DOM-Level-2-Core/idl-definitions.html">specified
in CORBA IDL</a>, but it refrains from using object-oriented IDL
features, allowing for a much more natural Lisp implementation than
the ordinary IDL/Lisp mapping would.&nbsp;
Differences between CXML's DOM and the direct IDL/Lisp mapping:
</p>
<ul>
<li>
DOM function names are symbols in the <tt>DOM</tt> package (not
the <tt>OP</tt> package).
</li>
<li>
DOM functions have proper required arguments, not a huge
<tt>&amp;rest</tt> lambda list.
</li>
<li>
Although most IDL interfaces are implemented as CLOS classes by
CXML, the Lisp types of DOM objects are not documented and cannot
be relied upon.&nbsp; A node's type can be determined using
<tt>dom:node-type</tt> instead.
</li>
<li>
<tt>DOMString</tt> is mapped to <tt>rod</tt>, which is either
an <tt>(unsigned-byte 16)</tt> array type or a string type.
</li>
<li>
The IDL/Lisp mapping maps CORBA enums to Lisp keywords.&nbsp;
Unfortunately, the DOM IDL does not use enums.&nbsp; Instead,
both exception types and node types are defined as integer
constants.&nbsp; CXML chooses to ignore this definition and uses
keywords instead.
</li>
<li>
DOM uses StudlyCaps.&nbsp; Lisp programmers don't.&nbsp; We
insert <tt>#\-</tt> before every upper case letter preceded by a
lower case letter, and before every upper case letter which is
followed by a lower case letter but preceded by another upper case
letter.&nbsp; This algorithm leads to the natural Lisp spelling
of DOM function names.&nbsp; For example,
<tt>getElementsByTagName</tt> becomes
<tt>dom:get-elements-by-tag-name</tt>.
</li>
<li>
Implementation note: DOM's <tt>NodeList</tt> does not
necessarily map to a native "sequence" type.&nbsp; (For example,
node lists are objects in Java, not arrays.)&nbsp;
<tt>NodeList</tt> is specified to reflect changes made after a
node list was created, so node lists cannot be plain Lisp lists.&nbsp;
(A node list could be implemented as a CLOS object pointing to
such a list, though.)&nbsp; Instead, CXML currently implements node
lists as adjustable vectors.&nbsp; Note that code which relies on
this implementation detail and uses Lisp sequence functions
instead of sticking to <tt>dom:item</tt> and <tt>dom:length</tt>
is not portable.&nbsp; As a compromise, you can use our
extensions <tt>dom:map-node-list</tt> or
<tt>dom:do-node-list</tt>, which can be implemented portably.
</li>
</ul>
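<p>
The naming rule from the list above can be sketched in plain Common
Lisp.&nbsp; The function name <tt>studly-caps-to-lisp</tt> is
hypothetical and not part of CXML, and the final downcasing is an
assumption made for readability; this illustrates the rule, not
CXML's own code.
</p>
<pre>(defun studly-caps-to-lisp (name)
  "Insert hyphens into the StudlyCaps string NAME and downcase it."
  (with-output-to-string (out)
    (loop for i from 0 below (length name)
          for char = (char name i)
          for prev = (if (plusp i) (char name (1- i)) nil)
          for next = (if (> (length name) (1+ i)) (char name (1+ i)) nil)
          do (when (and (upper-case-p char)
                        prev
                        ;; Rule 1: upper case after lower case.
                        (or (lower-case-p prev)
                            ;; Rule 2: upper case after upper case,
                            ;; but followed by lower case.
                            (and (upper-case-p prev)
                                 next
                                 (lower-case-p next))))
               (write-char #\- out))
             (write-char (char-downcase char) out))))

(studly-caps-to-lisp "getElementsByTagName") ; "get-elements-by-tag-name"
(studly-caps-to-lisp "createCDATASection")   ; "create-cdata-section"</pre>
<p>
Similarly, a portable traversal of a node list sticks to
<tt>dom:length</tt> and <tt>dom:item</tt>.&nbsp; The helper
<tt>count-child-nodes-of-type</tt> below is a hypothetical example,
not a CXML function.
</p>
<pre>(defun count-child-nodes-of-type (node type)
  "Count the children of NODE whose dom:node-type is the keyword TYPE."
  (let ((children (dom:child-nodes node))
        (count 0))
    ;; Index into the node list instead of using sequence functions.
    (dotimes (i (dom:length children) count)
      (when (eq (dom:node-type (dom:item children i)) type)
        (incf count)))))</pre>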
</body>
</html>