- Implemented DOM 2 Core.
- (A handler for DOM 3-style namespace normalization is provided and
- used by default for serialization of DOM documents if namespace
- support is enabled.)
-
-
- Error handling overhaul: All syntax errors should now be
- reported as instances of well-formedness-violation. We
- also print line number information.
-
+
Implemented DOM 2 Core.
+
Error handling overhaul.
Support internal subset serialization.
Gilbert Baumann has clarified the license as Lisp-LGPL.
+ Create a SAX handler which builds a DOM document.
+
+
+
+
Serializing DOM
+
+ The technique used to serialize a DOM document is to use a SAX
+ serialization sink as the argument to dom:map-document,
+ which generates SAX events for the DOM tree.
+
+
+ In addition, there are convenience functions like
+ unparse-document as a thin wrapper around
+ map-document.
+
+
+
Function DOM:MAP-DOCUMENT (handler document &key include-xmlns-attributes include-default-values include-doctype)
+ Traverse a DOM document and call SAX functions as if an XML
+ representation of the document were processed by a SAX parser.
+
+
Keyword arguments:
+
+
+ include-xmlns-attributes -- defaults to
+ sax:*include-xmlns-attributes*
+
+
+ include-doctype -- One of nil (no doctype
+ declaration), :full-internal-subset (include a doctype
+ declaration and the full internal subset), or
+ :canonical-notations (write a doctype declaration
+ with an internal subset including only notations, as required
+ for canonical serialization).
+
+
+ include-default-values -- if true, include attribute nodes whose
+ dom:specified is nil.
+
+
+
+
+
Function CXML:UNPARSE-DOCUMENT (document stream &rest keys)
+
Function CXML:UNPARSE-DOCUMENT-TO-OCTETS (document &rest keys) => vector
+
+
+ Serialize a DOM document object. These convenience functions are
+ wrappers around dom:map-document.
+
+ If keyword argument canonical is specified as 2, a
+ doctype declaration will be written that includes notations
+ declared in the document.
+
+
+ If namespace processing is enabled
+ (sax:*namespace-processing*), a namespace normalizer is used.
+
+
+
+
+
DOM/Lisp mapping
+
+ Note that there is no "standard" DOM mapping for Lisp.
+
+
+ DOM is specified
+ in CORBA IDL, but it refrains from using object-oriented IDL
+ features, allowing for a much more natural Lisp implementation than
+ the ordinary IDL/Lisp mapping would.
+ Differences between CXML's DOM and the direct IDL/Lisp mapping:
+
+
+
+ DOM function names are symbols in the DOM package (not
+ the OP package).
+
+
+ DOM functions have proper required arguments, not a huge
+ &rest lambda list.
+
+
+ Although most IDL interfaces are implemented as CLOS classes by
+ CXML, the Lisp types of DOM objects are not documented and cannot
+ be relied upon. A node's type can be determined using
+ dom:node-type instead.
+
+
+ DOMString is mapped to rod, which is either
+ an (unsigned-byte 16) array type or a string type.
+
+
+ The IDL/Lisp mapping maps CORBA enums to Lisp keywords.
+ Unfortunately, the DOM IDL does not use enums. Instead,
+ both exception types and node types are defined as integer
+ constants. CXML chooses to ignore this definition and uses
+ keywords instead.
+
+
+ DOM uses StudlyCaps. Lisp programmers don't. We
+ insert #\- before every upper case letter preceded by a
+ lower case letter and before every upper case letter which is
+ followed by a lower case letter, but preceded by a capital
+ letter. This algorithm leads to the natural Lisp spelling
+ of DOM function names.
+
+
+ Implementation note: DOM's NodeList does not
+ necessarily map to a native "sequence" type. (For example,
+ node lists are objects in Java, not arrays.)
+ NodeList is specified to reflect changes done after a
+ node list was created, so node lists cannot be Lisp lists.
+ (A node list could be implemented as a CLOS object pointing to
+ said list though.) Instead, CXML currently implements node
+ lists as adjustable vectors. Note that code which relies on
+ this implementation and uses Lisp sequence functions
+ instead of sticking to dom:item and dom:length
+ is not portable. As a compromise, you can use our
+ extensions dom:map-node-list or
+ dom:do-node-list, which can be implemented portably.
+
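The StudlyCaps naming rule from the list above can be sketched in plain Common Lisp. This is only an illustration of the rule as worded there: the function name studly-caps-to-lisp-name is hypothetical and is not part of CXML, and lower-casing the result is an assumption made for readability.

```lisp
;; Hypothetical helper, not part of CXML: illustrates the naming rule only.
(defun studly-caps-to-lisp-name (name)
  "Return the StudlyCaps string NAME as a hyphenated, lower-case string."
  (with-output-to-string (out)
    (loop for i from 0 below (length name)
          for ch = (char name i)
          for prev = (and (> i 0) (char name (1- i)))
          for next = (and (< (1+ i) (length name)) (char name (1+ i)))
          do (when (and prev
                        (upper-case-p ch)
                        (or
                         ;; Upper case letter preceded by a lower case letter.
                         (lower-case-p prev)
                         ;; Upper case letter preceded by a capital letter
                         ;; and followed by a lower case letter.
                         (and (upper-case-p prev)
                              next
                              (lower-case-p next))))
               (write-char #\- out))
             ;; Lower-casing is an assumption for display purposes.
             (write-char (char-downcase ch) out))))
```

For example, (studly-caps-to-lisp-name "createElementNS") returns "create-element-ns": the trailing "NS" stays together because the "S" is not followed by a lower case letter.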
+ CXML is implemented as a SAX parser. (Refer to make-dom-builder for information about
+ DOM.)
+
Function CXML:PARSE-FILE (pathname handler &key ...)
Function CXML:PARSE-STREAM (stream handler &key ...)
@@ -175,16 +112,29 @@
Serialization
-
Function CXML:UNPARSE-DOCUMENT (document stream &rest keys)
-
Function CXML:UNPARSE-DOCUMENT-TO-OCTETS (document &rest keys) => vector
- Serialize a DOM document object. These convenience functions are
- wrappers around dom:map-document.
+ Serialization is performed using sink objects. A sink
+ is an output stream for runes. There are different kinds of sinks
+ for output to lisp streams, vectors, etc.
+
+
+ Technically, sinks are SAX handlers that write XML output for SAX
+ events sent to them. In practice, user code would normally not
+ generate those SAX events manually, and instead use a function
+ like dom:map-document or xmls-compat:map-node to serialize an
+ in-memory document.
+
+
+ In addition to map-document, cxml has a set of
+ convenience macros for serialization (see below for
+ with-xml-output, with-element, etc).
+
+
+
+
Function CXML:MAKE-CHARACTER-STREAM-SINK (stream &rest keys) => sink
+
Function CXML:MAKE-OCTET-VECTOR-SINK (&rest keys) => sink
+ Return a handle suitable for event-based XML serialization.
-
-
document -- a DOM document object
-
stream -- a Common Lisp stream with element-type
- character
-
Keyword arguments:
@@ -231,12 +181,6 @@
characters written by unparse-document are really UTF-8
bytes encoded as characters.
-
-
-
Function CXML:MAKE-CHARACTER-STREAM-SINK (stream &rest keys) => sink
-
Function CXML:MAKE-OCTET-VECTOR-SINK (&rest keys) => sink
- Return a handle suitable for event-based XML serialization.
-
These function provide the low-level mechanism used by the DOM
serialization functions. To serialize a document without building
@@ -272,7 +216,7 @@
</foo>
(Note that these functions accept both strings and rods, so we
- could write "foo" instead of #"foo" above.)
+ can write "foo" instead of #"foo" above.)
@@ -303,7 +247,7 @@
(sax:end-document sink))
-
Miscellaneous Utility Functions
+
Miscellaneous SAX handlers
Function CXML:MAKE-VALIDATOR (dtd root)
Create a SAX handler which validates against a DTD instance.
@@ -337,82 +281,11 @@
Return a SAX handler that performs DOM
- 3-style namespace normalization on Attribute lists in
+ 3-style namespace normalization on attribute lists in
start-element events before passing them on the next
handler.
-
-
XMLS Compatibility
-
- Like other XML parsers written in Lisp, CXML can work with
- documents represented as list structures. The specific model
- implemented by cxml is compatible with the xmls parser. Xmls
- list structures are a simpler and faster alternative to full DOM
- document trees. They also serve as an example showing how to
- implement user-defined document models as an independent layer
- over the the base parser (c.f. xml/xmls-compat.lisp in
- the cxml distribution). However, note that the list structures do
- not include all information available in DOM documents and are
- sometimes more difficult to work wth since many DOM functions
- cannot be implemented on them.
-
-
-
Function CXML-XMLS:MAKE-XMLS-BUILDER (&key include-default-values)
- Create a SAX handler which builds XMLS list structures.
- If include-default-values is true, default values for
- attributes declared in a DTD are included as attributes in the
- xmls output. include-default-values is true by default
- and can be set to nil to suppress inclusion of default
- values.
-
-
Function CXML-XMLS:MAKE-NODE (&key name ns attrs
- children) => xmls node
- Build a list node of the form
- (name ((namevalue)*) child*).
-
-
- The node list's car can also be a cons of local name
- and namespace prefix ns.
- fixme: It is unclear to me how namespaces are meant to
- work in xmls, since xmls documentation differs from how xmls
- actually works in current releases. Usually applications need to
- know both the namespace prefix and the namespace URI. We
- currently follow the xmls implementation and use the
- namespace prefix instead of following its documentation which
- shows the URI. We do not follow xmls in munging xmlns attribute
- values. Attributes themselves have namespaces and it is not clear
- to me how that works in xmls.
-
-
-
Accessor CXML-XMLS:NODE-NAME (node)
-
Accessor CXML-XMLS:NODE-NS (node)
-
Accessor CXML-XMLS:NODE-ATTRS (node)
-
Accessor CXML-XMLS:NODE-CHILDREN (node)
- Accessors for xmls node data.
-
-
-
-
Dealing with Rods
@@ -648,102 +521,5 @@ NIL
fixme: For more information on these functions refer to the docstrings.
-
-
-
-
DOM Notes
-
- CXML implements the DOM Level 2 Core interfaces. For details
- on DOM, please refer to the specification.
-
-
- However, note that there is no "standard" DOM mapping for Lisp. DOM
- is specified
- in CORBA IDL, but it refrains from using object-oriented IDL
- features, allowing for a much more natural Lisp implemenation than
- the the ordinary IDL/Lisp mapping would. The mapping chosen for
- cxml is explained below.
-
Function DOM:MAP-DOCUMENT (handler document &key include-xmlns-attributes include-default-values)
- Traverse a DOM document and call SAX functions as if an XML
- representation of the document were processed by a SAX parser.
-
-
- dom:map-document is the low-level building-block used to
- implement the serialization functions
- like unparse-document, but can also be used directly.
-
-
-
DOM/Lisp mapping
-
- Differences between CXML's DOM and the direct IDL/Lisp mapping:
-
-
-
- DOM function names are symbols in the DOM package (not
- the OP package).
-
-
- DOM functions have proper required arguments, not a huge
- &rest lambda list.
-
-
- Although most IDL interfaces are implemented as CLOS classes by
- CXML, the Lisp types of DOM objects is not documented and cannot
- be relied upon. A node's type can be determined using
- dom:node-type instead.
-
-
- DOMString is mapped to rod, which is either
- an (unsigned-byte 16) array type or a string type.
-
-
- The IDL/Lisp mapping maps CORBA enums to Lisp keywords.
- Unfortunately, the DOM IDL does not use enums. Instead,
- both exception types and node types are defined integer
- constants. CXML chooses to ignore this definition and uses
- keywords instead.
-
-
- DOM uses StudlyCaps. Lisp programmers don't. We
- insert #\- before every upper case letter preceded by a
- lower case letter and before every upper case letter which is
- followed by a lower case letter, but preceded by a capital
- letter. This algorithms leads to the natural Lisp spelling
- of DOM function names.
-
-
- Implementation note: DOM's NodeList does not
- necessarily map to a native "sequence" type. (For example,
- node lists are objects in Java, not arrays.)
- NodeList is specified to reflect changes done after a
- node list was created, so node lists cannot be Lisp lists.
- (A node list could be implemented as a CLOS object pointing to
- said list though.) Instead, CXML currently implements node
- lists as adjustable vectors. Note that code which relies on
- this implementation and uses Lisp sequence functions
- instead of sticking to dom:item and dom:length
- is not portable. As a compromise, you can use our
- extensions dom:map-node-list or
- dom:do-node-list, which can be implemented portably.
-