find-element, find-event

This commit is contained in:
dlichteblau
2007-02-18 16:46:32 +00:00
parent 2d9a419c5c
commit 90a6079532
5 changed files with 198 additions and 36 deletions

@ -58,7 +58,9 @@
<a href="klacks.html">Klacks parser</a> <a href="klacks.html">Klacks parser</a>
<ul class="sub"> <ul class="sub">
<li><a href="klacks.html#sources">Parsing incrementally</a></li> <li><a href="klacks.html#sources">Parsing incrementally</a></li>
<li><a href="klacks.html#convenience">Convenience functions</a></li>
<li><a href="klacks.html#klacksax">Bridging Klacks and SAX</a></li> <li><a href="klacks.html#klacksax">Bridging Klacks and SAX</a></li>
<li><a href="klacks.html#examples">Examples</a></li>
</ul> </ul>
</li> </li>
<li> <li>

@ -16,34 +16,9 @@
support, entity resolution) while offering a more flexible interface support, entity resolution) while offering a more flexible interface
than SAX. than SAX.
</p> </p>
<h3>Example</h3>
<p> <p>
The following example illustrates creation of a klacks <tt>source</tt>, See below for <a href="#examples">examples</a>.
use of the <tt>consume</tt> function to read individual events,
and shows some of the most common event types.
</p> </p>
<pre>* <b>(defparameter *source* (cxml:make-source "&lt;example>text&lt;/example>"))</b>
*SOURCE*
* <b>(klacks:consume *source*)</b>
:START-DOCUMENT
* <b>(klacks:consume *source*)</b>
:START-ELEMENT
NIL ;namespace URI
"example" ;local name
"example" ;qualified name
* <b>(klacks:consume *source*)</b>
:CHARACTERS
"text"
* <b>(klacks:consume *source*)</b>
:END-ELEMENT
NIL
"example"
"example"
* <b>(klacks:consume *source*)</b>
:END-DOCUMENT
* <b>(klacks:consume *source*)</b>
NIL</pre>
<a name="sources"/> <a name="sources"/>
<h3>Parsing incrementally using sources</h3> <h3>Parsing incrementally using sources</h3>
@ -161,11 +136,11 @@ NIL</pre>
<tt>peek</tt> returns the current event's key and main values. <tt>peek</tt> returns the current event's key and main values.
</p> </p>
<p> <p>
<div class="def">Function KLACKS:CONSUME (source) => key, value*</div> <div class="def">Function KLACKS:PEEK-NEXT (source) => key, value*</div>
</p> </p>
<p> <p>
Return the same values <tt>peek</tt> would, and in addition Advance the source forward to the next event and return it
advance the source forward to the next event. like <tt>peek</tt> would.
</p> </p>
<p> <p>
<div class="def">Function KLACKS:PEEK-VALUE (source) => value*</div> <div class="def">Function KLACKS:PEEK-VALUE (source) => value*</div>
@ -173,6 +148,13 @@ NIL</pre>
<p> <p>
Like <tt>peek</tt>, but return only the values, not the key. Like <tt>peek</tt>, but return only the values, not the key.
</p> </p>
<p>
<div class="def">Function KLACKS:CONSUME (source) => key, value*</div>
</p>
<p>
Return the same values <tt>peek</tt> would, and in addition
advance the source forward to the next event.
</p>
<p> <p>
<div class="def">Function KLACKS:CURRENT-URI (source) => uri</div> <div class="def">Function KLACKS:CURRENT-URI (source) => uri</div>
<div class="def">Function KLACKS:CURRENT-LNAME (source) => string</div> <div class="def">Function KLACKS:CURRENT-LNAME (source) => string</div>
@ -231,11 +213,124 @@ NIL</pre>
exiting <tt>body</tt>, whether normally or abnormally. exiting <tt>body</tt>, whether normally or abnormally.
</p> </p>
<a name="convenience"/>
<h3>Convenience functions</h3>
<p>
<div class="def">Function KLACKS:FIND-EVENT (source key)</div>
Read events from <tt>source</tt> and discard them until an event
of type <i>key</i> is found. Return values like <tt>peek</tt>, or
NIL if no such event was found.
</p>
<p>
<div class="def">Function KLACKS:FIND-ELEMENT (source &amp;optional
lname uri)</div>
Read events from <tt>source</tt> and discard them until an event
of type <tt>:start-element</tt> with matching local name and
namespace URI is found. If <tt>lname</tt> is <tt>nil</tt>, any
tag name matches. If <tt>uri</tt> is <tt>nil</tt>, any
namespace matches. Return values like <tt>peek</tt>, or NIL if no
such event was found.
</p>
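A minimal sketch of find-event in use: skip ahead to the first :characters event and return its text. The helper name first-text is hypothetical, and the code assumes cxml is already loaded and that the first value after the key of a :characters event is the text, as in the transcript in the Examples section.

```lisp
;; Assumes cxml is already loaded, e.g. via (asdf:load-system :cxml).
;; FIRST-TEXT is a hypothetical helper, not part of klacks.
(defun first-text (xml-string)
  "Return the text of the first :characters event in XML-STRING, or NIL."
  (let ((source (cxml:make-source xml-string)))
    ;; find-event returns values like peek: the key, then the event's values.
    (multiple-value-bind (key text)
        (klacks:find-event source :characters)
      (when key
        text))))
```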
<a name="klacksax"/> <a name="klacksax"/>
<h3>Bridging Klacks and SAX</h3> <h3>Bridging Klacks and SAX</h3>
<p>
<div class="def">Function KLACKS:SERIALIZE-EVENT (source handler)</div>
Send the current klacks event from <tt>source</tt> as a SAX
event to the SAX <tt>handler</tt> and consume it.
</p>
<p>
<div class="def">Function KLACKS:SERIALIZE-ELEMENT (source handler
&amp;key document-events)</div>
Read all klacks events from the following <tt>:start-element</tt> to
its <tt>:end-element</tt> and send them as SAX events
to <tt>handler</tt>. When this function is called, the current
event must be <tt>:start-element</tt>, else an error is
signalled. With <tt>document-events</tt> (the default),
<tt>sax:start-document</tt> and <tt>sax:end-document</tt> events
are sent around the element.
</p>
<p> <p>
<div class="def">Function KLACKS:SERIALIZE-SOURCE (source handler)</div> <div class="def">Function KLACKS:SERIALIZE-SOURCE (source handler)</div>
Read all klacks events from <tt>source</tt> and send them as SAX Read all klacks events from <tt>source</tt> and send them as SAX
events to the SAX <tt>handler</tt>. events to the SAX <tt>handler</tt>.
</p> </p>
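To illustrate the bridge, the sketch below feeds a whole source into the xmls builder that the Examples section uses as a SAX handler. The helper name source-to-xmls is hypothetical, and the code assumes cxml is already loaded. It relies on serialize-source returning whatever the handler returns from sax:end-document, as the implementation in this commit does.

```lisp
;; Assumes cxml is already loaded, e.g. via (asdf:load-system :cxml).
;; SOURCE-TO-XMLS is a hypothetical helper, not part of klacks.
(defun source-to-xmls (xml-string)
  "Parse XML-STRING via klacks and return what the xmls builder produces."
  (klacks:serialize-source (cxml:make-source xml-string)
                           (cxml-xmls:make-xmls-builder)))
```

Any other SAX handler could be passed in place of the xmls builder; the klacks side of the call stays the same.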
<a name="examples"/>
<h3>Examples</h3>
<p>
The following example illustrates creation of a klacks <tt>source</tt>,
use of the <tt>peek-next</tt> function to read individual events,
and shows some of the most common event types.
</p>
<pre>* <b>(defparameter *source* (cxml:make-source "&lt;example>text&lt;/example>"))</b>
*SOURCE*
* <b>(klacks:peek-next *source*)</b>
:START-DOCUMENT
* <b>(klacks:peek-next *source*)</b>
:START-ELEMENT
NIL ;namespace URI
"example" ;local name
"example" ;qualified name
* <b>(klacks:peek-next *source*)</b>
:CHARACTERS
"text"
* <b>(klacks:peek-next *source*)</b>
:END-ELEMENT
NIL
"example"
"example"
* <b>(klacks:peek-next *source*)</b>
:END-DOCUMENT
* <b>(klacks:peek-next *source*)</b>
NIL</pre>
<p>
In this example, <tt>find-element</tt> is used to skip over the
uninteresting events until the opening <tt>child1</tt> tag is
found. Then <tt>serialize-element</tt> is used to generate SAX
events for the following element, including its children, and an
xmls-compatible list structure is built from those
events. A second <tt>find-element</tt> call skips over the whitespace
before <tt>child2</tt>, and <tt>find-event</tt> is used to parse up
to <tt>:end-document</tt>, ensuring that the source has been
closed.
</p>
<pre>* <b>(defparameter *source*
(cxml:make-source "&lt;example>
&lt;child1>&lt;p>foo&lt;/p>&lt;/child1>
&lt;child2 bar='baz'/>
&lt;/example>"))</b>
*SOURCE*
* <b>(klacks:find-element *source* "child1")</b>
:START-ELEMENT
NIL
"child1"
"child1"
* <b>(klacks:serialize-element *source* (cxml-xmls:make-xmls-builder))</b>
("child1" NIL ("p" NIL "foo"))
* <b>(klacks:find-element *source*)</b>
:START-ELEMENT
NIL
"child2"
"child2"
* <b>(klacks:serialize-element *source* (cxml-xmls:make-xmls-builder))</b>
("child2" (("bar" "baz")))
* <b>(klacks:find-event *source* :end-document)</b>
:END-DOCUMENT
NIL
NIL
NIL
</pre>
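The steps of the transcript above can be wrapped into one function. This is only a sketch: the name first-element-as-xmls is hypothetical, and the code assumes cxml is already loaded.

```lisp
;; Assumes cxml is already loaded, e.g. via (asdf:load-system :cxml).
;; FIRST-ELEMENT-AS-XMLS is a hypothetical helper, not part of klacks.
(defun first-element-as-xmls (xml-string lname)
  "Return the first element named LNAME in XML-STRING as an xmls list,
or NIL if there is no such element."
  (let ((source (cxml:make-source xml-string)))
    ;; find-element leaves the source positioned on the :start-element.
    (when (klacks:find-element source lname)
      (prog1 (klacks:serialize-element source (cxml-xmls:make-xmls-builder))
        ;; Read on to the end so the source is finished, as in the
        ;; transcript above.
        (klacks:find-event source :end-document)))))
```

If no element matches, find-element has already read to the end of the document, so no extra cleanup step is needed in that branch.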
</documentation> </documentation>

@ -79,6 +79,12 @@
(fill-source source) (fill-source source)
(apply #'values current-values))) (apply #'values current-values)))
(defmethod klacks:peek-next ((source cxml-source))
(with-source (source current-key current-values)
(setf current-key nil)
(fill-source source)
(apply #'values current-key current-values)))
(defmethod klacks:consume ((source cxml-source)) (defmethod klacks:consume ((source cxml-source))
(with-source (source current-key current-values) (with-source (source current-key current-values)
(fill-source source) (fill-source source)

@ -69,9 +69,9 @@
(check-type key (member :characters)) (check-type key (member :characters))
characters)) characters))
(defun klacks:serialize-source (source handler) (defun klacks:serialize-event (source handler)
(loop
(multiple-value-bind (key a b c) (klacks:peek source) (multiple-value-bind (key a b c) (klacks:peek source)
(let ((result nil))
(case key (case key
(:start-document (:start-document
(sax:start-document handler)) (sax:start-document handler))
@ -107,12 +107,66 @@
(:end-element (:end-element
(sax:end-element handler a b c)) (sax:end-element handler a b c))
(:end-document (:end-document
(return (sax:end-document handler))) (setf result (sax:end-document handler)))
((nil)
(error "serialize-event read past end of document"))
(t (t
(error "unexpected klacks key: ~A" key))) (error "unexpected klacks key: ~A" key)))
(klacks:consume source)))) (klacks:consume source)
result)))
(defun serialize-declaration-kludge (list handler) (defun serialize-declaration-kludge (list handler)
(loop (loop
for (fn . args) in list for (fn . args) in list
do (apply fn handler args))) do (apply fn handler args)))
(defun klacks:serialize-source (source handler)
(loop
(let ((document (klacks:serialize-event source handler)))
(when document
(return document)))))
(defun klacks:serialize-element (source handler &key (document-events t))
(unless (eq (klacks:peek source) :start-element)
(error "not at start of element"))
(when document-events
(sax:start-document handler))
(labels ((recurse ()
(klacks:serialize-event source handler)
(loop
(let ((key (klacks:peek source)))
(ecase key
(:start-element (recurse))
(:end-element (return))
((:characters :comment :processing-instruction)
(klacks:serialize-event source handler)))))
(klacks:serialize-event source handler)))
(recurse))
(when document-events
(sax:end-document handler)))
(defun klacks:find-element (source &optional lname uri)
(loop
(multiple-value-bind (key current-uri current-lname current-qname)
(klacks:peek-next source)
(case key
((nil)
(return nil))
(:start-element
(when (and (or (null lname)
(equal lname current-lname))
(or (null uri)
(equal uri current-uri)))
(return
(values key current-uri current-lname current-qname))))))))
(defun klacks:find-event (source key)
(loop
(multiple-value-bind (this a b c)
(klacks:peek-next source)
(cond
((null this)
(return nil))
((eq this key)
(return (values this a b c)))))))

@ -24,6 +24,11 @@
#:peek #:peek
#:peek-value #:peek-value
#:peek-next
#:consume
#:find-element
#:find-event
#:map-attributes #:map-attributes
#:list-attributes #:list-attributes
@ -33,6 +38,6 @@
#:current-characters #:current-characters
#:current-cdata-section-p #:current-cdata-section-p
#:consume #:serialize-event
#:serialize-element
#:serialize-source)) #:serialize-source))