From dd833309bf4887f3923ce6abd59ef156ad720cb2 Mon Sep 17 00:00:00 2001
From: dlichteblau
+ Relax NG validation is available as a separate
+ project: cxml-rng.
+ + -rel-2007-xx-yy
Make sure to install and load cxml first.
-Create a test file called example.xml:
+ To try the following examples, create a test file
+ called example.xml:
* (with-open-file (s "example.xml" :direction :output)
(write-string "<test a='b'><child/></test>" s))
+ Parse example.xml into a DOM tree (read more):
* (cxml:parse-file "example.xml" (cxml-dom:make-dom-builder))
#<DOM-IMPL::DOCUMENT @ #x72206172>
+;; save result for later:
* (defparameter *example* *)
*EXAMPLE*
Inspect the DOM tree (read more):
* (dom:document-element *example*)
#<DOM-IMPL::ELEMENT test @ #x722b6ba2>
-* (dom:tag-name (dom:document-element *example*))
+
+* (dom:tag-name (dom:document-element *example*))
"test"
-* (dom:child-nodes (dom:document-element *example*))
+
+* (dom:child-nodes (dom:document-element *example*))
#(#<DOM-IMPL::ELEMENT child @ #x722b6d8a>)
-* (dom:get-attribute (dom:document-element *example*) "a")
+
+* (dom:get-attribute (dom:document-element *example*) "a")
"b"
Serialize the DOM document back into a file (read more):
-(with-open-file (out "example.out" :direction :output :element-type '(unsigned-byte 8))
-  (dom:map-document (cxml:make-octet-stream-sink out) *example*))
+(with-open-file (out "example.out" :direction :output :element-type '(unsigned-byte 8))
+  (dom:map-document (cxml:make-octet-stream-sink out) *example*))
+ If DOM is not the representation you want to use, parsing into
+ other data structures is possible using the same SAX parser
+ function, while using a different handler.
+ The XMLS builder is included for compatibility with XMLS, and also
+ serves as sample code (see cxml/xml/xmls-compat.lisp) for your own
+ handlers.
As an alternative to DOM, parse into xmls-compatible list structure (read more):
* (cxml:parse-file "example.xml" (cxml-xmls:make-xmls-builder))
("test" (("a" "b")) ("child" NIL))
+ Again, serialization into XML is done using a sink as a SAX
+ handler and a data-structure-specific function to generate SAX
+ events for the document, in this case cxml-xmls:map-node.
+ +* (with-open-file (out "example.out" :direction :output :element-type '(unsigned-byte 8))
+ (cxml-xmls:map-node (cxml:make-octet-stream-sink out)
+ '("test" (("a" "b")) ("child" nil))))
+
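The remark above about using a different handler with the same parser function can be illustrated with a handler of your own. The following is a minimal sketch, not code from cxml itself: it assumes a cxml release that exports sax:default-handler, and the class element-counter, its slot, and the accessor element-count are hypothetical names chosen for this example. It counts the elements in example.xml rather than building a tree.

```lisp
;; Sketch only: assumes a cxml release that exports SAX:DEFAULT-HANDLER.
;; ELEMENT-COUNTER, N-ELEMENTS and ELEMENT-COUNT are hypothetical names.
(defclass element-counter (sax:default-handler)
  ((n-elements :initform 0 :accessor element-count)))

;; Called once per start tag; the arguments are ignored and only the
;; counter is incremented.
(defmethod sax:start-element ((handler element-counter)
                              namespace-uri local-name qname attributes)
  (declare (ignore namespace-uri local-name qname attributes))
  (incf (element-count handler)))

;; Whatever SAX:END-DOCUMENT returns becomes the result of the parse.
(defmethod sax:end-document ((handler element-counter))
  (element-count handler))

;; Same parser entry point as before, with a different handler.
(cxml:parse-file "example.xml" (make-instance 'element-counter))
```

Because the handler is an ordinary CLOS object, it can be combined with the entity resolver options shown further below without any change to the parser call.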
+ Use klacks to read events from the parser incrementally. The following example looks only for :start-element and :end-element events and prints them (read more):
* (klacks:with-open-source
- (s (cxml:make-source #p"example.xml"))
+ (s (cxml:make-source #p"example.xml"))
(loop
- for key = (klacks:peek s)
+ for key = (klacks:peek s)
while key
do
(case key
(:start-element
- (format t "~A {" (klacks:current-qname s)))
+ (format t "~A {" (klacks:current-qname s)))
(:end-element
(format t "}")))
- (klacks:consume s)))
+ (klacks:consume s)))
test {child {}}
+
+ Serialization is always done using sinks, which accept SAX events,
+ but there are convenience functions and macros to make that easier
+ to use:
+(cxml:with-xml-output (cxml:make-octet-stream-sink stream :indentation 2 :canonical nil)
+  (cxml:with-element "foo"
+    (cxml:attribute "xyz" "abc")
+    (cxml:with-element "bar"
+      (cxml:attribute "blub" "bla"))
+    (cxml:text "Hi there.")))
+ Prints this to stream:
+<foo xyz="abc">
+  <bar blub="bla"></bar>
+  Hi there.
+</foo>
+ By default, this error will occur when the DTD (or generally, any
+ entity) has an http:// URL as its system ID. CXML itself
+ understands only file:// URLs, but allows users to customize the
+ behaviour for all URLs.
+
+ There are several solutions to this, covered in detail below:
+
+ Here are the example files for the following solutions to this
+ problem:
+ dtdexample.xml:
+<!DOCTYPE test SYSTEM 'http://www.lichteblau.com/blubba/dtdexample.dtd'>
+<test a='b'>blub<child/></test>
+
+ dtdexample.dtd:
+<!ELEMENT test (#PCDATA|child)*>
+<!ATTLIST test
+  a CDATA #REQUIRED
+  >
+
+<!ELEMENT child EMPTY>
+ Use the :entity-resolver argument to parse-file to
+ specify a function that maps System IDs and Public IDs to local
+ files of your choice:
+(let ((uri "http://www.lichteblau.com/blubba/dtdexample.dtd")
+      (pathname "dtdexample.dtd"))
+  (flet ((resolver (pubid sysid)
+           (declare (ignore pubid))
+           (when (puri:uri= sysid (puri:parse-uri uri))
+             (open pathname :element-type '(unsigned-byte 8)))))
+    (cxml:parse-file "dtdexample.xml" (cxml-dom:make-dom-builder) :entity-resolver #'resolver)))
+ Yes and no.
+
+ Yes, you can force CXML to do this; see the following example.
+
+ But no, skipping the DTD will not actually work if the document
+ references entities declared in the DTD, especially since neither
+ SAX nor DOM is able to report unresolved entity references in
+ attributes.
+
+ The trick to make CXML skip the DTD is to pretend that it is empty
+ by returning a zero-length stream instead:
+(flet ((resolver (pubid sysid)
+         (declare (ignore pubid sysid))
+         (flexi-streams:make-in-memory-input-stream nil)))
+  (cxml:parse-file "dtdexample.xml" (cxml-dom:make-dom-builder) :entity-resolver #'resolver))
+ Rather than writing an entity resolver function yourself, you can
+ have CXML use XML catalogs to find DTDs and entity files on your
+ local system.
+
+ Catalogs are particularly helpful for DTDs that are
+ pre-installed. For example, most Linux distributions include a
+ package for the XHTML DTD. The DTD will reside in a
+ distribution-dependent location, which the central catalog file
+ points to.
+
+ By default, CXML looks for the catalog in /etc/xml/catalog
+ (Linux) and /usr/local/share/xml/catalog.ports (FreeBSD).
+* (setf cxml:*catalog* (cxml:make-catalog))
+* (cxml:parse-file "test.xhtml" (cxml-dom:make-dom-builder))
+ Sure, just use an entity-resolver function that does it.
+
+ Install Drakma and try this:
+(flet ((resolver (pubid sysid)
+         (declare (ignore pubid))
+         (when (eq (puri:uri-scheme sysid) :http)
+           (drakma:http-request sysid :want-stream t))))
+  (cxml:parse-file "dtdexample.xml" (cxml-dom:make-dom-builder) :entity-resolver #'resolver))