Initial revision

git-svn-id: svn://bknr.net/svn/trunk/thirdparty/cl-ppcre@12 4281704c-cde7-0310-8518-8e2dc76b1ff0
2004-06-23 08:26:55 +00:00
commit 4122284075
26 changed files with 29552 additions and 0 deletions
--- a/156
+++ b/156
@ -0,0 +1,156 @@
+Version 0.7.4
+2004-02-16
+Fixed wrong call to SIGNAL-PPCRE-SIGNAL-ERROR in lexer.lisp (caught by Peter Graves)
+Added :CL-PPCRE to *FEATURES* (for CL-INTERPOL)
+Compiler macro for SPLIT
+
+Version 0.7.3
+2004-01-28
+Fixed bug in CURRENT-MIN-REST for lookaheads (reported by Thomas-Paz Hartman)
+Added tests for this bug
+
+Version 0.7.2
+2004-01-27
+Fixed typo (SUBSEQ/NSUBSEQ) in SPLIT (thanks to Alan Ruttenberg)
+Updated docs with respect to ECL (thanks to Alex Mizrahi)
+Mention FreeBSD port in docs
+
+Version 0.7.1
+2003-10-24
+Fixed version numbers in docs (thanks to S<>bastien Saint-Sevin)
+
+Version 0.7.0
+2003-10-23
+New macros REGISTER-GROUPS-BIND and DO-REGISTER-GROUPS
+Added SHAREP keyword argument to most API functions and macros
+Mention CL-INTERPOL in docs
+Partial code cleanup (using WITH-UNIQUE-NAMES and REBINDING)
+
+Version 0.6.1
+2003-10-11
+Added EXTERNAL-FORMAT keyword args to CL-PPCRE-TEST:TEST for some CLs (thanks to JP Massar and Scott D. Kalter)
+Fixed bug with REGEX-REPLACE and REGEX-REPLACE-ALL when (= START END) was true
+Added doc sections for quoting problems and backslash confusions (thanks to conversations with Peter Seibel)
+Disable quoting in definition of QUOTE-SECTIONS so you can always safely rebuild CL-PPCRE
+
+Version 0.6.0
+2003-10-07
+CL-PPCRE now has its own condition types
+Added support for Perl's \Q and \E (Peter Seibel convinced me to do it) - see QUOTE-META-CHARS and *ALLOW-QUOTING*
+Added tests for this new feature
+Threaded tests are more verbose now and use only keyword args
+
+Version 0.5.9
+2003-10-03
+Changed "^" optimizations with respect to constant end strings with offsets (bug caught by Yexuan Gui)
+Added tests for this bug
+Removed *.dos files from CL-PPCRE-TEST tests (thanks to JP Massar)
+Added threaded tests for SBCL (thanks to Christophe Rhodes)
+
+Version 0.5.8
+2003-09-17
+Optimizations for ".*" were too optimistic when look-behinds were involved
+Added tests for this bug
+Removed *.dos files
+
+Version 0.5.7
+2003-08-20
+Fixed (CL-PPCRE:SCAN "(.)X$" "ABCX" :START 4) bug (spotted by Tibor Simko)
+Forgot to export *REGEX-CHAR-CODE-LIMIT* in Corman version of DEFPACKAGE
+Removed Emacs local variables from source code (finally...)
+Mention Gentoo in docs
+
+Version 0.5.6
+2003-06-30
+Replaced wrong COPY-REGEX code for WORD-BOUNDARY objects (detected by Max Goldberg)
+Added info about possible TRUENAME problems with ACL in README (thanks to Kevin Layer for providing a patch for this)
+
+Version 0.5.5
+2003-06-09
+Patch for SBCL/Debian compatibility by Kevin Rosenberg
+Simpler version of compiler macro
+Availability through asdf-install
+
+Version 0.5.4
+2003-04-09
+Added DESTRUCTIVE keyword to CREATE-SCANNER
+
+Version 0.5.3
+2003-03-31
+Fixed bug in REGEX-REPLACE (replacement string couldn't contain literal backslash)
+Fixed bug in definition of CHAR-CLASS (since 0.5.0 the hash slot may be NIL - CMUCL's new PCL detects this)
+Micro-optimization in INSERT-CHAR-CLASS-TESTER: CHAR-NOT-GREATERP instead of CHAR-DOWNCASE
+
+Version 0.5.2
+2003-03-28
+Better compiler macro (thanks to Kent M. Pitman)
+
+Version 0.5.1
+2003-03-27
+Removed compiler macro
+
+Version 0.5.0
+2003-03-27
+Lexer, parser, and converter mostly re-written to reduce consing and increase speed
+Get rid of FIX-POS in lexer and parser, "ism" flags are handled after parsing now
+Smaller test suite (again) due to literal embedding of line breaks
+Seperate test files for DOS line endings
+Replaced constant +REGEX-CHAR-CODE-LIMIT+ with special variable *REGEX-CHAR-CODE-LIMIT*
+
+Version 0.4.1
+2003-03-19
+Added compiler macro for SCAN
+Changed test suite to be nicer to Corman Lisp and ECL (see docs for new syntax)
+Incorporated visual feedback (dots) in test suite (thanks to JP Massar)
+Added README file
+Replaced STRING-LIST-TO-SIMPLE-STRING with a much improved version by JP Massar
+
+Version 0.4.0
+2003-02-27
+Added *USE-BMH-MATCHER*
+
+Version 0.3.2
+2003-02-21
+Added load.lisp
+Various minor changes for Corman Lisp compatibility (thanks to Karsten Poeck and JP Massar)
+
+Version 0.3.1
+2003-01-18
+Bugfix in CREATE-SCANNER (didn't work if flags were given and arg was a parse-tree)
+
+Version 0.3.0
+2003-01-12
+Added new features to REGEX-REPLACE and REGEX-REPLACE-ALL
+
+Version 0.2.0
+2003-01-11
+Make SPLIT more Perl-compatible, including new keyword parameters
+
+Version 0.1.4
+2003-01-10
+Don't move "^" and "\A" while iterating with DO-SCANS
+Added link to Debian package
+
+Version 0.1.3
+2002-12-25
+More usable MK:DEFSYSTEM files (courtesy of Hannu Koivisto)
+Fixed typo in documentation
+
+Version 0.1.2
+2002-12-22
+Added version numbers for Debian packaging
+Be friendly to case-sensitive ACL images (courtesy of Kevin Rosenberg and Douglas Crosher)
+"Fixed" two cases where declarations came after docstrings (because of bugs in Corman Lisp and older CMUCL versions)
+Added #-cormanlisp to hide (INCF (THE FIXNUM POS)) from Corman Lisp
+Added file doc/benchmarks.2002-12-22.txt
+
+Version 0.1.1
+2002-12-21
+Added asdf system definitions by Marco Baringer
+Small additions to documentation
+Correct (Emacs) local variables list in closures.lisp and api.lisp
+Added this CHANGELOG
+
+Version 0.1.0
+2002-12-20
+Initial release
--- a/51
+++ b/51
@ -0,0 +1,51 @@
+Complete documentation for CL-PPCRE can be found in the 'doc'
+directory.
+
+1. Installation
+
+1.1. Probably the easiest way is
+
+       (load "/path/to/cl-ppcre/load.lisp")
+
+     This should compile and load CL-PPCRE on most Common Lisp
+     implementations.
+
+1.2. With MK:DEFSYSTEM you can make a symbolic link from
+     'cl-ppcre.system' and 'cl-ppcre-test.system' to your central registry
+     (which by default is in '/usr/local/lisp/Registry/') and then issue
+     the command
+
+       (mk:compile-system "cl-ppcre")
+
+     Note that this relies on TRUENAME returning the original file a
+     symbolic link is pointing to. This will only work with AllegroCL
+     6.2 if you've applied all patches with (SYS:UPDATE-ALLEGRO).
+
+1.3. You can also use ASDF instead of MK:DEFSYSTEM in a similar way
+     (use the .asd files instead of the .system files).
+
+2. Test
+
+CL-PPCRE comes with a test suite that can be used to check its
+compatibility with Perl's regex syntax. See the documentation on how
+to use this test suite for benchmarks and on how to write your own
+tests.
+
+2.1. If you've used 'load.lisp' to load CL-PPCRE you already have the
+     test suite loaded and can start the default tests with
+
+       (cl-ppcre-test:test)
+
+2.2. With MK:DEFSYSTEM you need to compile the 'cl-ppcre-test' system
+     as well before you can proceed as in 2.1.
+
+2.3. Same for ASDF.
+
+Depending on your machine and your CL implementation the default test
+will take between a few seconds and a couple of minutes. (It will
+print a dot for every tenth test case while it proceeds to give some
+visual feedback.) It should exactly report three 'errors' (662, 790,
+and 1439) which are explained in the documentation.
+
+MCL might report an error for the ninth test case which is also
+explained in the docs.
--- a/api.lisp
+++ b/api.lisp
--- a/cl-ppcre-test.asd
+++ b/cl-ppcre-test.asd
@ -0,0 +1,40 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/cl-ppcre-test.asd,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; This ASDF system definition was kindly provided by Marco Baringer.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz.  All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(defpackage #:cl-ppcre-test.system
+  (:use #:cl
+        #:asdf))
+
+(in-package #:cl-ppcre-test.system)
+
+(defsystem #:cl-ppcre-test
+    :depends-on (#:cl-ppcre)
+    :components ((:file "ppcre-tests")))
--- a/cl-ppcre-test.system
+++ b/cl-ppcre-test.system
@ -0,0 +1,40 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/cl-ppcre-test.system,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz.  All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-user)
+
+(defparameter *cl-ppcre-test-base-directory*
+  (make-pathname :name nil :type nil :version nil
+                 :defaults (parse-namestring *load-truename*)))
+
+(mk:defsystem #:cl-ppcre-test
+    :source-pathname *cl-ppcre-test-base-directory*
+    :source-extension "lisp"
+    :depends-on (#:cl-ppcre)
+    :components ((:file "ppcre-tests")))
--- a/cl-ppcre.asd
+++ b/cl-ppcre.asd
@ -0,0 +1,51 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/cl-ppcre.asd,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; This ASDF system definition was kindly provided by Marco Baringer.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz.  All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(defpackage #:cl-ppcre.system
+  (:use #:cl
+        #:asdf))
+
+(in-package #:cl-ppcre.system)
+
+(defsystem #:cl-ppcre
+    :components ((:file "packages")
+                 (:file "specials" :depends-on ("packages"))
+                 (:file "util" :depends-on ("packages"))
+                 (:file "errors" :depends-on ("util"))
+                 (:file "lexer" :depends-on ("errors" "specials"))
+                 (:file "parser" :depends-on ("lexer"))
+                 (:file "regex-class" :depends-on ("parser"))
+                 (:file "convert" :depends-on ("regex-class"))
+                 (:file "optimize" :depends-on ("convert"))
+                 (:file "closures" :depends-on ("optimize" "specials"))
+                 (:file "repetition-closures" :depends-on ("closures"))
+                 (:file "scanner" :depends-on ("repetition-closures"))
+                 (:file "api" :depends-on ("scanner"))))
--- a/cl-ppcre.system
+++ b/cl-ppcre.system
@ -0,0 +1,51 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/cl-ppcre.system,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz.  All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-user)
+
+(defparameter *cl-ppcre-base-directory*
+  (make-pathname :name nil :type nil :version nil
+                 :defaults (parse-namestring *load-truename*)))
+
+(mk:defsystem #:cl-ppcre
+    :source-pathname *cl-ppcre-base-directory*
+    :source-extension "lisp"
+    :components ((:file "packages")
+                 (:file "specials" :depends-on ("packages"))
+                 (:file "util" :depends-on ("packages"))
+                 (:file "errors" :depends-on ("util"))
+                 (:file "lexer" :depends-on ("errors" "specials"))
+                 (:file "parser" :depends-on ("lexer"))
+                 (:file "regex-class" :depends-on ("parser"))
+                 (:file "convert" :depends-on ("regex-class"))
+                 (:file "optimize" :depends-on ("convert"))
+                 (:file "closures" :depends-on ("optimize" "specials"))
+                 (:file "repetition-closures" :depends-on ("closures"))
+                 (:file "scanner" :depends-on ("repetition-closures"))
+                 (:file "api" :depends-on ("scanner"))))
--- a/closures.lisp
+++ b/closures.lisp
@ -0,0 +1,576 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/closures.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Here we create the closures which together build the final
+;;; scanner.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(declaim (inline *string*= *string*-equal))
+
+(defun *string*= (string2 start1 end1 start2 end2)
+  "Like STRING=, i.e. compares the special string *STRING* from START1
+to END1 with STRING2 from START2 to END2. Note that there's no
+boundary check - this has to be implemented by the caller."
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum start1 end1 start2 end2))
+  (loop for string1-idx of-type fixnum from start1 below end1
+        for string2-idx of-type fixnum from start2 below end2
+        always (char= (schar *string* string1-idx)
+                      (schar string2 string2-idx))))
+
+(defun *string*-equal (string2 start1 end1 start2 end2)
+  "Like STRING-EQUAL, i.e. compares the special string *STRING* from
+START1 to END1 with STRING2 from START2 to END2. Note that there's no
+boundary check - this has to be implemented by the caller."
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum start1 end1 start2 end2))
+  (loop for string1-idx of-type fixnum from start1 below end1
+        for string2-idx of-type fixnum from start2 below end2
+        always (char-equal (schar *string* string1-idx)
+                           (schar string2 string2-idx))))
+
+(defgeneric create-matcher-aux (regex next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Creates a closure which takes one parameter,
+START-POS, and tests whether REGEX can match *STRING* at START-POS
+such that the call to NEXT-FN after the match would succeed."))
+                
+(defmethod create-matcher-aux ((seq seq) next-fn)
+  ;; the closure for a SEQ is a chain of closures for the elements of
+  ;; this sequence which call each other in turn; the last closure
+  ;; calls NEXT-FN
+  (loop for element in (reverse (elements seq))
+        for curr-matcher = next-fn then next-matcher
+        for next-matcher = (create-matcher-aux element curr-matcher)
+        finally (return next-matcher)))
+
+(defmethod create-matcher-aux ((alternation alternation) next-fn)
+  ;; first create closures for all alternations of ALTERNATION
+  (let ((all-matchers (mapcar #'(lambda (choice)
+                                  (create-matcher-aux choice next-fn))
+                              (choices alternation))))
+    ;; now create a closure which checks if one of the closures
+    ;; created above can succeed
+    (lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (loop for matcher in all-matchers
+            thereis (funcall (the function matcher) start-pos)))))
+
+(defmethod create-matcher-aux ((register register) next-fn)
+  ;; the position of this REGISTER within the whole regex; we start to
+  ;; count at 0
+  (let ((num (num register)))
+    (declare (type fixnum num))
+    ;; STORE-END-OF-REG is a thin wrapper around NEXT-FN which will
+    ;; update the corresponding values of *REGS-START* and *REGS-END*
+    ;; after the inner matcher has succeeded
+    (flet ((store-end-of-reg (start-pos)
+               (declare (type fixnum start-pos)
+                        (type function next-fn))
+               (setf (svref *reg-starts* num) (svref *regs-maybe-start* num)
+                     (svref *reg-ends* num) start-pos)
+           (funcall next-fn start-pos)))
+      ;; the inner matcher is a closure corresponding to the regex
+      ;; wrapped by this REGISTER
+      (let ((inner-matcher (create-matcher-aux (regex register)
+                                               #'store-end-of-reg)))
+        (declare (type function inner-matcher))
+        ;; here comes the actual closure for REGISTER
+        (lambda (start-pos)
+          (declare (type fixnum start-pos))
+          ;; remember the old values of *REGS-START* and friends in
+          ;; case we cannot match
+          (let ((old-*reg-starts* (svref *reg-starts* num))
+                (old-*regs-maybe-start* (svref *regs-maybe-start* num))
+                (old-*reg-ends* (svref *reg-ends* num)))
+            ;; we cannot use *REGS-START* here because Perl allows
+            ;; regular expressions like /(a|\1x)*/
+            (setf (svref *regs-maybe-start* num) start-pos)
+            (let ((next-pos (funcall inner-matcher start-pos)))
+              (unless next-pos
+                ;; restore old values on failure
+                (setf (svref *reg-starts* num) old-*reg-starts*
+                      (svref *regs-maybe-start* num) old-*regs-maybe-start*
+                      (svref *reg-ends* num) old-*reg-ends*))
+              next-pos)))))))
+
+(defmethod create-matcher-aux ((lookahead lookahead) next-fn)
+  ;; create a closure which just checks for the inner regex and
+  ;; doesn't care about NEXT-FN
+  (let ((test-matcher (create-matcher-aux (regex lookahead) #'identity)))
+    (declare (type function next-fn test-matcher))
+    (if (positivep lookahead)
+      ;; positive look-ahead: check success of inner regex, then call
+      ;; NEXT-FN
+      (lambda (start-pos)
+        (and (funcall test-matcher start-pos)
+             (funcall next-fn start-pos)))
+      ;; negative look-ahead: check failure of inner regex, then call
+      ;; NEXT-FN
+      (lambda (start-pos)
+        (and (not (funcall test-matcher start-pos))
+             (funcall next-fn start-pos))))))
+
+(defmethod create-matcher-aux ((lookbehind lookbehind) next-fn)
+  (let ((len (len lookbehind))
+        ;; create a closure which just checks for the inner regex and
+        ;; doesn't care about NEXT-FN
+        (test-matcher (create-matcher-aux (regex lookbehind) #'identity)))
+    (declare (type function next-fn test-matcher)
+             (type fixnum len))
+    (if (positivep lookbehind)
+      ;; positive look-behind: check success of inner regex (if we're
+      ;; far enough from the start of *STRING*), then call NEXT-FN
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (and (>= (- start-pos *start-pos*) len)
+             (funcall test-matcher (- start-pos len))
+             (funcall next-fn start-pos)))
+      ;; negative look-behind: check failure of inner regex (if we're
+      ;; far enough from the start of *STRING*), then call NEXT-FN
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (and (or (< start-pos len)
+                 (not (funcall test-matcher (- start-pos len))))
+             (funcall next-fn start-pos))))))
+
+(defmacro insert-char-class-tester ((char-class chr-expr) &body body)
+  "Utility macro to replace each occurence of '(CHAR-CLASS-TEST)
+within BODY with the correct test (corresponding to CHAR-CLASS)
+against CHR-EXPR."
+  (with-unique-names (%char-class)
+    ;; the actual substitution is done here: replace
+    ;; '(CHAR-CLASS-TEST) with NEW
+    (flet ((substitute-char-class-tester (new)
+               (subst new '(char-class-test) body
+                      :test #'equalp)))
+      `(let* ((,%char-class ,char-class)
+              (hash (hash ,%char-class))
+              (count (if hash
+                       (hash-table-count hash)
+                       most-positive-fixnum))
+              ;; collect a list of "all" characters in the hash if
+              ;; there aren't more than two
+              (key-list (if (<= count 2)
+                          (loop for chr being the hash-keys of hash
+                                collect chr)
+                          nil))
+              downcasedp)
+        (declare (type fixnum count))
+        ;; check if we can partition the hash into three ranges (or
+        ;; less)
+        (multiple-value-bind (min1 max1 min2 max2 min3 max3)
+            (create-ranges-from-hash hash)
+          ;; if that didn't work and CHAR-CLASS is case-insensitive we
+          ;; try it again with every character downcased
+          (when (and (not min1)
+                     (case-insensitive-p ,%char-class))
+            (multiple-value-setq (min1 max1 min2 max2 min3 max3)
+              (create-ranges-from-hash hash :downcasep t))
+            (setq downcasedp t))
+          (cond ((= count 1)
+                  ;; hash contains exactly one character so we just
+                  ;; check for this single character; (note that this
+                  ;; actually can't happen because this case is
+                  ;; optimized away in CONVERT already...)
+                  (let ((chr1 (first key-list)))
+                    ,@(substitute-char-class-tester
+                       `(char= ,chr-expr chr1))))
+                ((= count 2)
+                  ;; hash contains exactly two characters
+                  (let ((chr1 (first key-list))
+                        (chr2 (second key-list)))
+                    ,@(substitute-char-class-tester
+                       `(let ((chr ,chr-expr))
+                         (or (char= chr chr1)
+                             (char= chr chr2))))))
+                ((word-char-class-p ,%char-class)
+                  ;; special-case: hash is \w, \W, [\w], [\W] or
+                  ;; something equivalent
+                  ,@(substitute-char-class-tester
+                     `(word-char-p ,chr-expr)))
+                ((= count *regex-char-code-limit*)
+                  ;; according to the ANSI standard we might have all
+                  ;; possible characters in the hash even if it
+                  ;; doesn't contain CHAR-CODE-LIMIT characters but
+                  ;; this doesn't seem to be the case for current
+                  ;; implementations (also note that this optimization
+                  ;; implies that you must not have characters with
+                  ;; character codes beyond *REGEX-CHAR-CODE-LIMIT* in
+                  ;; your regexes if you've changed this limit); we
+                  ;; expect the compiler to optimize this T "test"
+                  ;; away
+                  ,@(substitute-char-class-tester t))
+                ((and downcasedp min1 min2 min3)
+                  ;; three different ranges, downcased
+                  ,@(substitute-char-class-tester
+                     `(let ((chr ,chr-expr))
+                       (or (char-not-greaterp min1 chr max1)
+                           (char-not-greaterp min2 chr max2)
+                           (char-not-greaterp min3 chr max3)))))
+                ((and downcasedp min1 min2)
+                  ;; two ranges, downcased
+                  ,@(substitute-char-class-tester
+                     `(let ((chr ,chr-expr))
+                       (or (char-not-greaterp min1 chr max1)
+                           (char-not-greaterp min2 chr max2)))))
+                ((and downcasedp min1)
+                  ;; one downcased range
+                  ,@(substitute-char-class-tester
+                     `(char-not-greaterp min1 ,chr-expr max1)))
+                ((and min1 min2 min3)
+                  ;; three ranges
+                  ,@(substitute-char-class-tester
+                     `(let ((chr ,chr-expr))
+                       (or (char<= min1 chr max1)
+                           (char<= min2 chr max2)
+                           (char<= min3 chr max3)))))
+                ((and min1 min2)
+                  ;; two ranges
+                  ,@(substitute-char-class-tester
+                     `(let ((chr ,chr-expr))
+                       (or (char<= min1 chr max1)
+                           (char<= min2 chr max2)))))
+                (min1
+                  ;; one range
+                  ,@(substitute-char-class-tester
+                     `(char<= min1 ,chr-expr max1)))
+                (t
+                  ;; the general case; note that most of the above
+                  ;; "optimizations" are based on experiences and
+                  ;; benchmarks with CMUCL - if you're really
+                  ;; concerned with speed you might find out that the
+                  ;; general case is almost always the best one for
+                  ;; other implementations (because the speed of their
+                  ;; hash-table access in relation to other operations
+                  ;; might be better than in CMUCL)
+                  ,@(substitute-char-class-tester
+                     `(gethash ,chr-expr hash)))))))))
+
+(defmethod create-matcher-aux ((char-class char-class) next-fn)
+  (declare (type function next-fn))
+  ;; insert a test against the current character within *STRING*
+  (insert-char-class-tester (char-class (schar *string* start-pos))
+    (if (invertedp char-class)
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (and (< start-pos *end-pos*)
+             (not (char-class-test))
+             (funcall next-fn (1+ start-pos))))
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (and (< start-pos *end-pos*)
+             (char-class-test)
+             (funcall next-fn (1+ start-pos)))))))
+
+(defmethod create-matcher-aux ((str str) next-fn)
+  (declare (type fixnum *end-string-pos*)
+           (type function next-fn)
+           ;; this special value is set by CREATE-SCANNER when the
+           ;; closures are built
+           (special end-string))
+  (let* ((len (len str))
+         (case-insensitive-p (case-insensitive-p str))
+         (start-of-end-string-p (start-of-end-string-p str))
+         (skip (skip str))
+         (str (str str))
+         (chr (schar str 0))
+         (end-string (and end-string (str end-string)))
+         (end-string-len (if end-string
+                           (length end-string)
+                           nil)))
+    (declare (type fixnum len))
+    (cond ((and start-of-end-string-p case-insensitive-p)
+            ;; closure for the first STR which belongs to the constant
+            ;; string at the end of the regular expression;
+            ;; case-insensitive version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos end-string-len))
+              (let ((test-end-pos (+ start-pos end-string-len)))
+                (declare (type fixnum test-end-pos))
+                ;; either we're at *END-STRING-POS* (which means that
+                ;; it has already been confirmed that end-string
+                ;; starts here) or we really have to test
+                (and (or (= start-pos *end-string-pos*)
+                         (and (<= test-end-pos *end-pos*)
+                              (*string*-equal end-string start-pos test-end-pos
+                                              0 end-string-len)))
+                     (funcall next-fn (+ start-pos len))))))
+          (start-of-end-string-p
+            ;; closure for the first STR which belongs to the constant
+            ;; string at the end of the regular expression;
+            ;; case-sensitive version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos end-string-len))
+              (let ((test-end-pos (+ start-pos end-string-len)))
+                (declare (type fixnum test-end-pos))
+                ;; either we're at *END-STRING-POS* (which means that
+                ;; it has already been confirmed that end-string
+                ;; starts here) or we really have to test
+                (and (or (= start-pos *end-string-pos*)
+                         (and (<= test-end-pos *end-pos*)
+                              (*string*= end-string start-pos test-end-pos
+                                         0 end-string-len)))
+                     (funcall next-fn (+ start-pos len))))))
+          (skip
+            ;; a STR which can be skipped because some other function
+            ;; has already confirmed that it matches
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (funcall next-fn (+ start-pos len))))
+          ((and (= len 1) case-insensitive-p)
+            ;; STR represent exactly one character; case-insensitive
+            ;; version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (< start-pos *end-pos*)
+                   (char-equal (schar *string* start-pos) chr)
+                   (funcall next-fn (1+ start-pos)))))
+          ((= len 1)
+            ;; STR represent exactly one character; case-sensitive
+            ;; version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (< start-pos *end-pos*)
+                   (char= (schar *string* start-pos) chr)
+                   (funcall next-fn (1+ start-pos)))))
+          (case-insensitive-p
+            ;; general case, case-insensitive version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (let ((next-pos (+ start-pos len)))
+                (declare (type fixnum next-pos))
+                (and (<= next-pos *end-pos*)
+                     (*string*-equal str start-pos next-pos 0 len)
+                     (funcall next-fn next-pos)))))
+          (t
+            ;; general case, case-sensitive version
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (let ((next-pos (+ start-pos len)))
+                (declare (type fixnum next-pos))
+                (and (<= next-pos *end-pos*)
+                     (*string*= str start-pos next-pos 0 len)
+                     (funcall next-fn next-pos))))))))
+
+(declaim (inline word-boundary-p))
+
+(defun word-boundary-p (start-pos)
+  "Check whether START-POS is a word-boundary within *STRING*."
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum start-pos))
+  (let ((1-start-pos (1- start-pos)))
+    ;; either the character before START-POS is a word-constituent and
+    ;; the character at START-POS isn't...
+    (or (and (or (= start-pos *end-pos*)
+                 (and (< start-pos *end-pos*)
+                      (not (word-char-p (schar *string* start-pos)))))
+             (and (< 1-start-pos *end-pos*)
+                  (<= *start-pos* 1-start-pos)
+                  (word-char-p (schar *string* 1-start-pos))))
+        ;; ...or vice versa
+        (and (or (= start-pos *start-pos*)
+                 (and (< 1-start-pos *end-pos*)
+                      (<= *start-pos* 1-start-pos)
+                      (not (word-char-p (schar *string* 1-start-pos)))))
+             (and (< start-pos *end-pos*)
+                  (word-char-p (schar *string* start-pos)))))))
+
+(defmethod create-matcher-aux ((word-boundary word-boundary) next-fn)
+  (declare (type function next-fn))
+  (if (negatedp word-boundary)
+    (lambda (start-pos)
+      (and (not (word-boundary-p start-pos))
+           (funcall next-fn start-pos)))
+    (lambda (start-pos)
+      (and (word-boundary-p start-pos)
+           (funcall next-fn start-pos)))))
+
+(defmethod create-matcher-aux ((everything everything) next-fn)
+  (declare (type function next-fn))
+  (if (single-line-p everything)
+    ;; closure for single-line-mode: we really match everything, so we
+    ;; just advance the index into *STRING* by one and carry on
+    (lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (and (< start-pos *end-pos*)
+           (funcall next-fn (1+ start-pos))))
+    ;; not single-line-mode, so we have to make sure we don't match
+    ;; #\Newline
+    (lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (and (< start-pos *end-pos*)
+           (char/= (schar *string* start-pos) #\Newline)
+           (funcall next-fn (1+ start-pos))))))
+
+(defmethod create-matcher-aux ((anchor anchor) next-fn)
+  (declare (type function next-fn))
+  (let ((startp (startp anchor))
+        (multi-line-p (multi-line-p anchor)))
+    (cond ((no-newline-p anchor)
+            ;; this must be and end-anchor and it must be modeless, so
+            ;; we just have to check whether START-POS equals
+            ;; *END-POS*
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (= start-pos *end-pos*)
+                   (funcall next-fn start-pos))))
+          ((and startp multi-line-p)
+            ;; a start-anchor in multi-line-mode: check if we're at
+            ;; *START-POS* or if the last character was #\Newline
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (let ((*start-pos* (or *real-start-pos* *start-pos*)))
+                (and (or (= start-pos *start-pos*)
+                         (and (<= start-pos *end-pos*)
+                              (> start-pos *start-pos*)
+                              (char= #\Newline
+                                     (schar *string* (1- start-pos)))))
+                     (funcall next-fn start-pos)))))
+          (startp
+            ;; a start-anchor which is not in multi-line-mode, so just
+            ;; check whether we're at *START-POS*
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (= start-pos (or *real-start-pos* *start-pos*))
+                   (funcall next-fn start-pos))))
+          (multi-line-p
+            ;; an end-anchor in multi-line-mode: check if we're at
+            ;; *END-POS* or if the character we're looking at is
+            ;; #\Newline
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (or (= start-pos *end-pos*)
+                       (and (< start-pos *end-pos*)
+                            (char= #\Newline
+                                   (schar *string* start-pos))))
+                   (funcall next-fn start-pos))))
+          (t
+            ;; an end-anchor which is not in multi-line-mode, so just
+            ;; check if we're at *END-POS* or if we're looking at
+            ;; #\Newline and there's nothing behind it
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (and (or (= start-pos *end-pos*)
+                       (and (= start-pos (1- *end-pos*))
+                            (char= #\Newline
+                                   (schar *string* start-pos))))
+                   (funcall next-fn start-pos)))))))
+
+(defmethod create-matcher-aux ((back-reference back-reference) next-fn)
+  (declare (type function next-fn))
+  ;; the position of the corresponding REGISTER within the whole
+  ;; regex; we start to count at 0
+  (let ((num (num back-reference)))
+    (if (case-insensitive-p back-reference)
+      ;; the case-insensitive version
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (let ((reg-start (svref *reg-starts* num))
+              (reg-end (svref *reg-ends* num)))
+          ;; only bother to check if the corresponding REGISTER as
+          ;; matched successfully already
+          (and reg-start
+               (let ((next-pos (+ start-pos (- (the fixnum reg-end)
+                                               (the fixnum reg-start)))))
+                 (declare (type fixnum next-pos))
+                 (and
+                   (<= next-pos *end-pos*)
+                   (*string*-equal *string* start-pos next-pos
+                                   reg-start reg-end)
+                   (funcall next-fn next-pos))))))
+      ;; the case-sensitive version
+      (lambda (start-pos)
+        (declare (type fixnum start-pos))
+        (let ((reg-start (svref *reg-starts* num))
+              (reg-end (svref *reg-ends* num)))
+          ;; only bother to check if the corresponding REGISTER as
+          ;; matched successfully already
+          (and reg-start
+               (let ((next-pos (+ start-pos (- (the fixnum reg-end)
+                                               (the fixnum reg-start)))))
+                 (declare (type fixnum next-pos))
+                 (and
+                   (<= next-pos *end-pos*)
+                   (*string*= *string* start-pos next-pos
+                              reg-start reg-end)
+                   (funcall next-fn next-pos)))))))))
+
+(defmethod create-matcher-aux ((branch branch) next-fn)
+  (let* ((test (test branch))
+         (then-matcher (create-matcher-aux (then-regex branch) next-fn))
+         (else-matcher (create-matcher-aux (else-regex branch) next-fn)))
+    (declare (type function then-matcher else-matcher))
+    (cond ((numberp test)
+            (lambda (start-pos)
+              (declare (type fixnum test))
+              (if (and (< test (length *reg-starts*))
+                       (svref *reg-starts* test))
+                (funcall then-matcher start-pos)
+                (funcall else-matcher start-pos))))
+          (t
+            (let ((test-matcher (create-matcher-aux test #'identity)))
+              (declare (type function test-matcher))
+              (lambda (start-pos)
+                (if (funcall test-matcher start-pos)
+                  (funcall then-matcher start-pos)
+                  (funcall else-matcher start-pos))))))))
+
+(defmethod create-matcher-aux ((standalone standalone) next-fn)
+  (let ((inner-matcher (create-matcher-aux (regex standalone) #'identity)))
+    (declare (type function next-fn inner-matcher))
+    (lambda (start-pos)
+      (let ((next-pos (funcall inner-matcher start-pos)))
+        (and next-pos
+             (funcall next-fn next-pos))))))
+
+(defmethod create-matcher-aux ((void void) next-fn)
+  ;; optimize away VOIDs: don't create a closure, just return NEXT-FN
+  next-fn)
--- a/convert.lisp
+++ b/convert.lisp
@ -0,0 +1,775 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/convert.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Here the parse tree is converted into its internal representation
+;;; using REGEX objects.  At the same time some optimizations are
+;;; already applied.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+;;; The flags that represent the "ism" modifiers are always kept
+;;; together in a three-element list. We use the following macros to
+;;; access individual elements.
+
+(defmacro case-insensitive-mode-p (flags)
+  "Accessor macro to extract the first flag out of a three-element flag list."
+  `(first ,flags))
+
+(defmacro multi-line-mode-p (flags)
+  "Accessor macro to extract the second flag out of a three-element flag list."
+  `(second ,flags))
+
+(defmacro single-line-mode-p (flags)
+  "Accessor macro to extract the third flag out of a three-element flag list."
+  `(third ,flags))
+
+(defun set-flag (token)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (special flags))
+  "Reads a flag token and sets or unsets the corresponding entry in
+the special FLAGS list."
+  (case token
+    ((:case-insensitive-p)
+      (setf (case-insensitive-mode-p flags) t))
+    ((:case-sensitive-p)
+      (setf (case-insensitive-mode-p flags) nil))
+    ((:multi-line-mode-p)
+      (setf (multi-line-mode-p flags) t))
+    ((:not-multi-line-mode-p)
+      (setf (multi-line-mode-p flags) nil))
+    ((:single-line-mode-p)
+      (setf (single-line-mode-p flags) t))
+    ((:not-single-line-mode-p)
+      (setf (single-line-mode-p flags) nil))
+    (otherwise
+      (signal-ppcre-syntax-error "Unknown flag token ~A" token))))
+
+(defun add-range-to-hash (hash from to)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (special flags))
+  "Adds all characters from character FROM to character TO (inclusive)
+to the char class hash HASH. Does the right thing with respect to
+case-(in)sensitivity as specified by the special variable FLAGS."
+  (let ((from-code (char-code from))
+        (to-code (char-code to)))
+    (when (> from-code to-code)
+      (signal-ppcre-syntax-error "Invalid range from ~A to ~A in char-class"
+                                 from to))
+    (cond ((case-insensitive-mode-p flags)
+            (loop for code from from-code to to-code
+                  for chr = (code-char code)
+                  do (setf (gethash (char-upcase chr) hash) t
+                           (gethash (char-downcase chr) hash) t)))
+          (t
+            (loop for code from from-code to to-code
+                  do (setf (gethash (code-char code) hash) t))))
+    hash))
+
+(defun convert-char-class-to-hash (list)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Combines all items in LIST into one char class hash and returns it.
+Items can be single characters, character ranges like \(:RANGE #\\A
+#\\E), or special character classes like :DIGIT-CLASS. Does the right
+thing with respect to case-\(in)sensitivity as specified by the
+special variable FLAGS."
+  (loop with hash = (make-hash-table :size (ceiling (expt *regex-char-code-limit* (/ 1 4)))
+                                     :rehash-size (float (expt *regex-char-code-limit* (/ 1 4)))
+                                     :rehash-threshold 1.0)
+        for item in list
+        if (characterp item)
+          ;; treat a single character C like a range (:RANGE C C)
+          do (add-range-to-hash hash item item)
+        else if (symbolp item)
+          ;; special character classes
+          do (setq hash
+                     (case item
+                       ((:digit-class)
+                         (merge-hash hash +digit-hash+))
+                       ((:non-digit-class)
+                         (merge-inverted-hash hash +digit-hash+))
+                       ((:whitespace-char-class)
+                         (merge-hash hash +whitespace-char-hash+))
+                       ((:non-whitespace-char-class)
+                         (merge-inverted-hash hash +whitespace-char-hash+))
+                       ((:word-char-class)
+                         (merge-hash hash +word-char-hash+))
+                       ((:non-word-char-class)
+                         (merge-inverted-hash hash +word-char-hash+))
+                       (otherwise
+                         (signal-ppcre-syntax-error
+                          "Unknown symbol ~A in character class"
+                          item))))
+        else if (and (consp item)
+                     (eq (car item) :range))
+          ;; proper ranges
+          do (add-range-to-hash hash
+                                (second item)
+                                (third item))
+        else do (signal-ppcre-syntax-error "Unknown item ~A in char-class list"
+                                           item)
+        finally (return hash)))
+
+(defun maybe-split-repetition (regex
+                               greedyp
+                               minimum
+                               maximum
+                               min-len
+                               length
+                               reg-seen)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum minimum)
+           (type (or fixnum null) maximum))
+  "Splits a REPETITION object into a constant and a varying part if
+applicable, i.e. something like
+  a{3,} -> a{3}a*
+The arguments to this function correspond to the REPETITION slots of
+the same name."
+  ;; note the usage of COPY-REGEX here; we can't use the same REGEX
+  ;; object in both REPETITIONS because they will have different
+  ;; offsets
+  (when maximum
+    (when (zerop maximum)
+      ;; trivial case: don't repeat at all
+      (return-from maybe-split-repetition
+        (make-instance 'void)))
+    (when (= 1 minimum maximum)
+      ;; another trivial case: "repeat" exactly once
+      (return-from maybe-split-repetition
+        regex)))
+  ;; first set up the constant part of the repetition
+  ;; maybe that's all we need
+  (let ((constant-repetition (if (plusp minimum)
+                               (make-instance 'repetition
+                                              :regex (copy-regex regex)
+                                              :greedyp greedyp
+                                              :minimum minimum
+                                              :maximum minimum
+                                              :min-len min-len
+                                              :len length
+                                              :contains-register-p reg-seen)
+                               ;; don't create garbage if minimum is 0
+                               nil)))
+    (when (and maximum
+               (= maximum minimum))
+      (return-from maybe-split-repetition
+        ;; no varying part needed because min = max
+        constant-repetition))
+    ;; now construct the varying part
+    (let ((varying-repetition
+            (make-instance 'repetition
+                           :regex regex
+                           :greedyp greedyp
+                           :minimum 0
+                           :maximum (if maximum (- maximum minimum) nil)
+                           :min-len min-len
+                           :len length
+                           :contains-register-p reg-seen)))
+      (cond ((zerop minimum)
+              ;; min = 0, no constant part needed
+              varying-repetition)
+            ((= 1 minimum)
+              ;; min = 1, constant part needs no REPETITION wrapped around
+              (make-instance 'seq
+                             :elements (list (copy-regex regex)
+                                             varying-repetition)))
+            (t
+              ;; general case
+              (make-instance 'seq
+                             :elements (list constant-repetition
+                                             varying-repetition)))))))
+
+;; During the conversion of the parse tree we keep track of the start
+;; of the parse tree in the special variable STARTS-WITH which'll
+;; either hold a STR object or an EVERYTHING object. The latter is the
+;; case if the regex starts with ".*" which implicitely anchors the
+;; regex at the start (perhaps modulo #\Newline).
+
+(defmethod maybe-accumulate ((str str))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (special accumulate-start-p starts-with))
+  (declare (ftype (function (t) fixnum) len))
+  "Accumulate STR into the special variable STARTS-WITH if
+ACCUMULATE-START-P (also special) is true and STARTS-WITH is either
+NIL or a STR object of the same case mode. Always returns NIL."
+  (when accumulate-start-p
+    (etypecase starts-with
+      (str
+        ;; STARTS-WITH already holds a STR, so we check if we can
+        ;; concatenate
+        (cond ((eq (case-insensitive-p starts-with)
+                   (case-insensitive-p str))
+                ;; we modify STARTS-WITH in place
+                (setf (len starts-with)
+                        (+ (len starts-with) (len str)))
+                ;; note that we use SLOT-VALUE because the accessor
+                ;; STR has a declared FTYPE which doesn't fit here
+                (adjust-array (slot-value starts-with 'str)
+                              (len starts-with)
+                              :fill-pointer t)
+                (setf (subseq (slot-value starts-with 'str)
+                              (- (len starts-with) (len str)))
+                        (str str)
+                      ;; STR objects that are parts of STARTS-WITH
+                      ;; always have their SKIP slot set to true
+                      ;; because the SCAN function will take care of
+                      ;; them, i.e. the matcher can ignore them
+                      (skip str) t))
+              (t (setq accumulate-start-p nil))))
+      (null
+        ;; STARTS-WITH is still empty, so we create a new STR object
+        (setf starts-with
+                (make-instance 'str
+                               :str ""
+                               :case-insensitive-p (case-insensitive-p str))
+              ;; INITIALIZE-INSTANCE will coerce the STR to a simple
+              ;; string, so we have to fill it afterwards
+              (slot-value starts-with 'str)
+                (make-array (len str)
+                            :initial-contents (str str)
+                            :element-type 'character
+                            :fill-pointer t
+                            :adjustable t)
+              (len starts-with)
+                (len str)
+              ;; see remark about SKIP above
+              (skip str) t))
+      (everything
+        ;; STARTS-WITH already holds an EVERYTHING object - we can't
+        ;; concatenate
+        (setq accumulate-start-p nil))))
+  nil)
+
+(defun convert-aux (parse-tree)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (special flags reg-num accumulate-start-p starts-with max-back-ref))
+  "Converts the parse tree PARSE-TREE into a REGEX object and returns it.
+
+Will also
+  - split and optimize repetitions,
+  - accumulate strings or EVERYTHING objects into the special variable
+    STARTS-WITH,
+  - keep track of all registers seen in the special variable REG-NUM,
+  - keep track of the highest backreference seen in the special
+    variable MAX-BACK-REF,
+  - maintain and adher to the currently applicable modifiers in the special
+    variable FLAGS, and
+  - maybe even wash your car..."
+  (cond ((consp parse-tree)
+          (case (first parse-tree)
+            ;; (:SEQUENCE {<regex>}*)
+            ((:sequence)
+              (cond ((cddr parse-tree)
+                      ;; this is essentially like
+                      ;; (MAPCAR 'CONVERT-AUX (REST PARSE-TREE))
+                      ;; but we don't cons a new list
+                      (loop for parse-tree-rest on (rest parse-tree)
+                            while parse-tree-rest
+                            do (setf (car parse-tree-rest)
+                                       (convert-aux (car parse-tree-rest))))
+                      (make-instance 'seq
+                                     :elements (rest parse-tree)))
+                    (t (convert-aux (second parse-tree)))))
+            ;; (:GROUP {<regex>}*)
+            ;; this is a syntactical construct equivalent to :SEQUENCE
+            ;; intended to keep the effect of modifiers local
+            ((:group)
+              ;; make a local copy of FLAGS and shadow the global
+              ;; value while we descend into the enclosed regexes
+              (let ((flags (copy-list flags)))
+                (declare (special flags))
+                (cond ((cddr parse-tree)
+                        (loop for parse-tree-rest on (rest parse-tree)
+                              while parse-tree-rest
+                              do (setf (car parse-tree-rest)
+                                         (convert-aux (car parse-tree-rest))))
+                        (make-instance 'seq
+                                       :elements (rest parse-tree)))
+                      (t (convert-aux (second parse-tree))))))
+            ;; (:ALTERNATION {<regex>}*)
+            ((:alternation)
+              ;; we must stop accumulating objects into STARTS-WITH
+              ;; once we reach an alternation
+              (setq accumulate-start-p nil)
+              (loop for parse-tree-rest on (rest parse-tree)
+                    while parse-tree-rest
+                    do (setf (car parse-tree-rest)
+                               (convert-aux (car parse-tree-rest))))
+              (make-instance 'alternation
+                             :choices (rest parse-tree)))
+            ;; (:BRANCH <test> <regex>)
+            ;; <test> must be look-ahead, look-behind or number;
+            ;; if <regex> is an alternation it must have one or two
+            ;; choices
+            ((:branch)
+              (setq accumulate-start-p nil)
+              (let* ((test-candidate (second parse-tree))
+                     (test (cond ((numberp test-candidate)
+                                   (when (zerop (the fixnum test-candidate))
+                                     (signal-ppcre-syntax-error
+                                      "Register 0 doesn't exist: ~S"
+                                      parse-tree))
+                                   (1- (the fixnum test-candidate)))
+                                 (t (convert-aux test-candidate))))
+                     (alternations (convert-aux (third parse-tree))))
+                (when (and (not (numberp test))
+                           (not (typep test 'lookahead))
+                           (not (typep test 'lookbehind)))
+                  (signal-ppcre-syntax-error
+                   "Branch test must be look-ahead, look-behind or number: ~S"
+                   parse-tree))
+                (typecase alternations
+                  (alternation
+                    (case (length (choices alternations))
+                      ((0)
+                        (signal-ppcre-syntax-error "No choices in branch: ~S"
+                                                   parse-tree))
+                      ((1)
+                        (make-instance 'branch
+                                       :test test
+                                       :then-regex (first
+                                                    (choices alternations))))
+                      ((2)
+                        (make-instance 'branch
+                                       :test test
+                                       :then-regex (first
+                                                    (choices alternations))
+                                       :else-regex (second
+                                                    (choices alternations))))
+                      (otherwise
+                        (signal-ppcre-syntax-error
+                         "Too much choices in branch: ~S"
+                         parse-tree))))
+                  (t
+                    (make-instance 'branch
+                                   :test test
+                                   :then-regex alternations)))))
+            ;; (:POSITIVE-LOOKAHEAD|:NEGATIVE-LOOKAHEAD <regex>)
+            ((:positive-lookahead :negative-lookahead)
+              ;; keep the effect of modifiers local to the enclosed
+              ;; regex and stop accumulating into STARTS-WITH
+              (setq accumulate-start-p nil)
+              (let ((flags (copy-list flags)))
+                (declare (special flags))
+                (make-instance 'lookahead
+                               :regex (convert-aux (second parse-tree))
+                               :positivep (eq (first parse-tree)
+                                              :positive-lookahead))))
+            ;; (:POSITIVE-LOOKBEHIND|:NEGATIVE-LOOKBEHIND <regex>)
+            ((:positive-lookbehind :negative-lookbehind)
+              ;; keep the effect of modifiers local to the enclosed
+              ;; regex and stop accumulating into STARTS-WITH
+              (setq accumulate-start-p nil)
+              (let* ((flags (copy-list flags))
+                     (regex (convert-aux (second parse-tree)))
+                     (len (regex-length regex)))
+                (declare (special flags))
+                ;; lookbehind assertions must be of fixed length
+                (unless len
+                  (signal-ppcre-syntax-error
+                   "Variable length look-behind not implemented (yet): ~S"
+                   parse-tree))
+                (make-instance 'lookbehind
+                               :regex regex
+                               :positivep (eq (first parse-tree)
+                                              :positive-lookbehind)
+                               :len len)))
+            ;; (:GREEDY-REPETITION|:NON-GREEDY-REPETITION <min> <max> <regex>)
+            ((:greedy-repetition :non-greedy-repetition)
+              ;; remember the value of ACCUMULATE-START-P upon entering
+              (let ((local-accumulate-start-p accumulate-start-p))
+                (let ((minimum (second parse-tree))
+                      (maximum (third parse-tree)))
+                  (declare (type fixnum minimum))
+                  (declare (type (or null fixnum) maximum))
+                  (unless (and maximum
+                               (= 1 minimum maximum))
+                    ;; set ACCUMULATE-START-P to NIL for the rest of
+                    ;; the conversion because we can't continue to
+                    ;; accumulate inside as well as after a proper
+                    ;; repetition
+                    (setq accumulate-start-p nil))
+                  (let* (reg-seen
+                         (regex (convert-aux (fourth parse-tree)))
+                         (min-len (regex-min-length regex))
+                         (greedyp (eq (first parse-tree) :greedy-repetition))
+                         (length (regex-length regex)))
+                    ;; note that this declaration already applies to
+                    ;; the call to CONVERT-AUX above
+                    (declare (special reg-seen))
+                    (when (and local-accumulate-start-p
+                               (not starts-with)
+                               (zerop minimum)
+                               (not maximum))
+                      ;; if this repetition is (equivalent to) ".*"
+                      ;; and if we're at the start of the regex we
+                      ;; remember it for ADVANCE-FN (see the SCAN
+                      ;; function)
+                      (setq starts-with (everythingp regex)))
+                    (if (or (not reg-seen)
+                            (not greedyp)
+                            (not length)
+                            (zerop length)
+                            (and maximum (= minimum maximum)))
+                      ;; the repetition doesn't enclose a register, or
+                      ;; it's not greedy, or we can't determine it's
+                      ;; (inner) length, or the length is zero, or the
+                      ;; number of repetitions is fixed; in all of
+                      ;; these cases we don't bother to optimize
+                      (maybe-split-repetition regex
+                                              greedyp
+                                              minimum
+                                              maximum
+                                              min-len
+                                              length
+                                              reg-seen)
+                      ;; otherwise we make a transformation that looks
+                      ;; roughly like one of
+                      ;;   <regex>* -> (?:<regex'>*<regex>)?
+                      ;;   <regex>+ -> <regex'>*<regex>
+                      ;; where the trick is that as much as possible
+                      ;; registers from <regex> are removed in
+                      ;; <regex'>
+                      (let* (reg-seen   ; new instance for REMOVE-REGISTERS
+                             (remove-registers-p t)
+                             (inner-regex (remove-registers regex))
+                             (inner-repetition
+                               ;; this is the "<regex'>" part
+                               (maybe-split-repetition inner-regex
+                                                       ;; always greedy
+                                                       t
+                                                       ;; reduce minimum by 1
+                                                       ;; unless it's already 0
+                                                       (if (zerop minimum)
+                                                         0
+                                                         (1- minimum))
+                                                       ;; reduce maximum by 1
+                                                       ;; unless it's NIL
+                                                       (and maximum
+                                                            (1- maximum))
+                                                       min-len
+                                                       length
+                                                       reg-seen))
+                             (inner-seq
+                               ;; this is the "<regex'>*<regex>" part
+                               (make-instance 'seq
+                                              :elements (list inner-repetition
+                                                              regex))))
+                        ;; note that this declaration already applies
+                        ;; to the call to REMOVE-REGISTERS above
+                        (declare (special remove-registers-p reg-seen))
+                        ;; wrap INNER-SEQ with a greedy
+                        ;; {0,1}-repetition (i.e. "?") if necessary
+                        (if (plusp minimum)
+                          inner-seq
+                          (maybe-split-repetition inner-seq
+                                                  t
+                                                  0
+                                                  1
+                                                  min-len
+                                                  nil
+                                                  t))))))))
+            ;; (:REGISTER <regex>)
+            ((:register)
+              ;; keep the effect of modifiers local to the enclosed
+              ;; regex; also, assign the current value of REG-NUM to
+              ;; the corresponding slot of the REGISTER object and
+              ;; increase this counter afterwards
+              (let ((flags (copy-list flags))
+                    (stored-reg-num reg-num))
+                (declare (special flags reg-seen))
+                (setq reg-seen t)
+                (incf (the fixnum reg-num))
+                (make-instance 'register
+                               :regex (convert-aux (second parse-tree))
+                               :num stored-reg-num)))
+            ;; (:STANDALONE <regex>)
+            ((:standalone)
+              ;; keep the effect of modifiers local to the enclosed
+              ;; regex
+              (let ((flags (copy-list flags)))
+                (declare (special flags))
+                (make-instance 'standalone
+                               :regex (convert-aux (second parse-tree)))))
+            ;; (:BACK-REFERENCE <number>)
+            ((:back-reference)
+              (let ((backref-number (second parse-tree)))
+                (declare (type fixnum backref-number))
+                (when (or (not (typep backref-number 'fixnum))
+                          (<= backref-number 0))
+                  (signal-ppcre-syntax-error
+                   "Illegal back-reference: ~S"
+                   parse-tree))
+                ;; stop accumulating into STARTS-WITH and increase
+                ;; MAX-BACK-REF if necessary
+                (setq accumulate-start-p nil
+                      max-back-ref (max (the fixnum max-back-ref)
+                                        backref-number))
+                (make-instance 'back-reference
+                               ;; we start counting from 0 internally
+                               :num (1- backref-number)
+                               :case-insensitive-p (case-insensitive-mode-p
+                                                    flags))))
+            ;; (:CHAR-CLASS|:INVERTED-CHAR-CLASS {<item>}*)
+            ;; where item is one of
+            ;;   - a character
+            ;;   - a character range: (:RANGE <char1> <char2>)
+            ;;   - a special char class symbol like :DIGIT-CHAR-CLASS
+            ((:char-class :inverted-char-class)
+              ;; first create the hash-table and some auxiliary values
+              (let* (hash
+                     hash-keys
+                     (count most-positive-fixnum)
+                     (item-list (rest parse-tree))
+                     (invertedp (eq (first parse-tree) :inverted-char-class))
+                     word-char-class-p)
+                (cond ((every (lambda (item) (eq item :word-char-class))
+                              item-list)
+                        ;; treat "[\\w]" like "\\w"
+                        (setq word-char-class-p t))
+                      ((every (lambda (item) (eq item :non-word-char-class))
+                              item-list)
+                        ;; treat "[\\W]" like "\\W"
+                        (setq word-char-class-p t)
+                        (setq invertedp (not invertedp)))
+                      (t
+                        (setq hash (convert-char-class-to-hash item-list)
+                              count (hash-table-count hash))
+                        (when (<= count 2)
+                          ;; collect the hash-table keys into a list if
+                          ;; COUNT is smaller than 3
+                          (setq hash-keys
+                                  (loop for chr being the hash-keys of hash
+                                        collect chr)))))
+                (cond ((and (not invertedp)
+                            (= count 1))
+                        ;; convert one-element hash table into a STR
+                        ;; object and try to accumulate into
+                        ;; STARTS-WITH
+                        (let ((str (make-instance 'str
+                                                  :str (string
+                                                        (first hash-keys))
+                                                  :case-insensitive-p nil)))
+                          (maybe-accumulate str)
+                          str))
+                      ((and (not invertedp)
+                            (= count 2)
+                            (char-equal (first hash-keys) (second hash-keys)))
+                        ;; convert two-element hash table into a
+                        ;; case-insensitive STR object and try to
+                        ;; accumulate into STARTS-WITH if the two
+                        ;; characters are CHAR-EQUAL
+                        (let ((str (make-instance 'str
+                                                  :str (string
+                                                        (first hash-keys))
+                                                  :case-insensitive-p t)))
+                          (maybe-accumulate str)
+                          str))
+                      (t
+                        ;; the general case; stop accumulating into STARTS-WITH
+                        (setq accumulate-start-p nil)
+                        (make-instance 'char-class
+                                       :hash hash
+                                       :case-insensitive-p
+                                         (case-insensitive-mode-p flags)
+                                       :invertedp invertedp
+                                       :word-char-class-p word-char-class-p)))))
+            ;; (:FLAGS {<flag>}*)
+            ;; where flag is a modifier symbol like :CASE-INSENSITIVE-P
+            ((:flags)
+              ;; set/unset the flags corresponding to the symbols
+              ;; following :FLAGS
+              (mapc #'set-flag (rest parse-tree))
+              ;; we're only interested in the side effect of
+              ;; setting/unsetting the flags and turn this syntactical
+              ;; construct into a VOID object which'll be optimized
+              ;; away when creating the matcher
+              (make-instance 'void))
+            (otherwise
+              (signal-ppcre-syntax-error
+               "Unknown token ~A in parse-tree"
+               (first parse-tree)))))
+        ((or (characterp parse-tree) (stringp parse-tree))
+          ;; turn characters or strings into STR objects and try to
+          ;; accumulate into STARTS-WITH
+          (let ((str (make-instance 'str
+                                    :str (string parse-tree)
+                                    :case-insensitive-p
+                                      (case-insensitive-mode-p flags))))
+            (maybe-accumulate str)
+            str))
+        (t
+          ;; and now for the tokens which are symbols
+          (case parse-tree
+            ((:void)
+              (make-instance 'void))
+            ((:word-boundary)
+              (make-instance 'word-boundary :negatedp nil))             
+            ((:non-word-boundary)
+              (make-instance 'word-boundary :negatedp t))             
+            ;; the special character classes
+            ((:digit-class
+              :non-digit-class
+              :word-char-class
+              :non-word-char-class
+              :whitespace-char-class
+              :non-whitespace-char-class)
+              ;; stop accumulating into STARTS-WITH
+              (setq accumulate-start-p nil)
+              (make-instance 'char-class
+                             ;; use the constants defined in util.lisp
+                             :hash (case parse-tree
+                                     ((:digit-class
+                                       :non-digit-class)
+                                       +digit-hash+)
+                                     ((:word-char-class
+                                       :non-word-char-class)
+                                       nil)
+                                     ((:whitespace-char-class
+                                       :non-whitespace-char-class)
+                                       +whitespace-char-hash+))
+                             ;; this value doesn't really matter but
+                             ;; NIL should result in slightly faster
+                             ;; matchers
+                             :case-insensitive-p nil
+                             :invertedp (member parse-tree
+                                                '(:non-digit-class
+                                                  :non-word-char-class
+                                                  :non-whitespace-char-class)
+                                                :test #'eq)
+                             :word-char-class-p (member parse-tree
+                                                        '(:word-char-class
+                                                          :non-word-char-class)
+                                                        :test #'eq)))
+            ((:start-anchor             ; Perl's "^"
+              :end-anchor               ; Perl's "$"
+              :modeless-end-anchor-no-newline
+                                        ; Perl's "\z"
+              :modeless-start-anchor    ; Perl's "\A"
+              :modeless-end-anchor)     ; Perl's "\Z"
+              (make-instance 'anchor
+                             :startp (member parse-tree
+                                             '(:start-anchor
+                                               :modeless-start-anchor)
+                                             :test #'eq)
+                             ;; set this value according to the
+                             ;; current settings of FLAGS (unless it's
+                             ;; a modeless anchor)
+                             :multi-line-p
+                               (and (multi-line-mode-p flags)
+                                    (not (member parse-tree
+                                                 '(:modeless-start-anchor
+                                                   :modeless-end-anchor
+                                                   :modeless-end-anchor-no-newline)
+                                                 :test #'eq)))
+                             :no-newline-p
+                               (eq parse-tree
+                                   :modeless-end-anchor-no-newline)))
+            ((:everything)
+              ;; stop accumulating into STARTS-WITHS
+              (setq accumulate-start-p nil)
+              (make-instance 'everything
+                             :single-line-p (single-line-mode-p flags)))
+            ;; special tokens corresponding to Perl's "ism" modifiers
+            ((:case-insensitive-p
+              :case-sensitive-p
+              :multi-line-mode-p
+              :not-multi-line-mode-p
+              :single-line-mode-p
+              :not-single-line-mode-p)
+              ;; we're only interested in the side effect of
+              ;; setting/unsetting the flags and turn these tokens
+              ;; into VOID objects which'll be optimized away when
+              ;; creating the matcher
+              (set-flag parse-tree)
+              (make-instance 'void))
+            (otherwise
+              (signal-ppcre-syntax-error "Unknown token ~A in parse-tree"
+                                         parse-tree))))))
+
+(defun convert (parse-tree)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Converts the parse tree PARSE-TREE into an equivalent REGEX object
+and returns three values: the REGEX object, the number of registers
+seen and an object the regex starts with which is either a STR object
+or an EVERYTHING object (if the regex starts with something like
+\".*\") or NIL."
+  ;; this function basically just initializes the special variables
+  ;; and then calls CONVERT-AUX to do all the work
+  (let* ((flags (list nil nil nil))
+         (reg-num 0)
+         (accumulate-start-p t)
+         starts-with
+         (max-back-ref 0)
+         (converted-parse-tree (convert-aux parse-tree)))
+    (declare (special flags reg-num accumulate-start-p starts-with max-back-ref))
+    ;; make sure we don't reference registers which aren't there
+    (when (> (the fixnum max-back-ref)
+             (the fixnum reg-num))
+      (signal-ppcre-syntax-error
+       "Backreference to register ~A which has not been defined"
+       max-back-ref))
+    (when (typep starts-with 'str)
+      (setf (slot-value starts-with 'str)
+              (coerce (slot-value starts-with 'str) 'simple-string)))
+    (values converted-parse-tree reg-num starts-with)))
--- a/doc/benchmarks.2002-12-22.txt
+++ b/doc/benchmarks.2002-12-22.txt
--- a/doc/index.html
+++ b/doc/index.html
--- a/errors.lisp
+++ b/errors.lisp
@ -0,0 +1,72 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE-LISP; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/errors.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defvar *syntax-error-string* nil
+  "The string which caused the syntax error.")
+
+(define-condition ppcre-error (simple-error)
+  ()
+  (:documentation "All errors signaled by CL-PPCRE are of
+this type."))
+
+(define-condition ppcre-syntax-error (ppcre-error)
+  ((string :initarg :string
+           :reader ppcre-syntax-error-string)
+   (pos :initarg :pos
+        :reader ppcre-syntax-error-pos))
+  (:default-initargs
+      :pos nil
+      :string *syntax-error-string*)
+  (:report (lambda (condition stream)
+             (format stream "~?~@[ at position ~A~]~@[ in string ~S~]"
+                     (simple-condition-format-control condition)
+                     (simple-condition-format-arguments condition)
+                     (ppcre-syntax-error-pos condition)
+                     (ppcre-syntax-error-string condition)))))
+
+(define-condition ppcre-invocation-error (ppcre-error)
+  ()
+  (:documentation "Signaled when CL-PPCRE functions are
+invoked with wrong arguments."))
+
+(defmacro signal-ppcre-syntax-error* (pos format-control &rest format-arguments)
+  `(error 'ppcre-syntax-error
+          :pos ,pos
+          :format-control ,format-control
+          :format-arguments (list ,@format-arguments)))
+
+(defmacro signal-ppcre-syntax-error (format-control &rest format-arguments)
+  `(signal-ppcre-syntax-error* nil ,format-control ,@format-arguments))
+
+(defmacro signal-ppcre-invocation-error (format-control &rest format-arguments)
+  `(error 'ppcre-invocation-error
+          :format-control ,format-control
+          :format-arguments (list ,@format-arguments)))
--- a/lexer.lisp
+++ b/lexer.lisp
@ -0,0 +1,769 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/lexer.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; The lexer's responsibility is to convert the regex string into a
+;;; sequence of tokens which are in turn consumed by the parser.
+;;;
+;;; The lexer is aware of Perl's 'extended mode' and it also 'knows'
+;;; (with a little help from the parser) how many register groups it
+;;; has opened so far. (The latter is necessary for interpreting
+;;; strings like "\\10" correctly.)
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(declaim (inline map-char-to-special-class))
+(defun map-char-to-special-char-class (chr)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Maps escaped characters like \"\\d\" to the tokens which represent
+their associated character classes."
+  (case chr
+    ((#\d)
+      :digit-class)
+    ((#\D)
+      :non-digit-class)
+    ((#\w)
+      :word-char-class)
+    ((#\W)
+      :non-word-char-class)
+    ((#\s)
+      :whitespace-char-class)
+    ((#\S)
+      :non-whitespace-char-class)))
+
+(locally
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (defstruct (lexer (:constructor make-lexer-internal))
+    "LEXER structures are used to hold the regex string which is
+currently lexed and to keep track of the lexer's state."
+    (str ""
+         :type string
+         :read-only t)
+    (len 0
+         :type fixnum
+         :read-only t)
+    (reg 0
+         :type fixnum)
+    (pos 0
+         :type fixnum)
+    (last-pos nil
+              :type list)))
+
+(defun make-lexer (string)
+  (declare (inline make-lexer-internal)
+           (type string string))
+  (make-lexer-internal :str (maybe-coerce-to-simple-string string)
+                       :len (length string)))
+
+(declaim (inline end-of-string-p))
+(defun end-of-string-p (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Tests whether we're at the end of the regex string."
+  (<= (lexer-len lexer)
+      (lexer-pos lexer)))
+
+(declaim (inline looking-at-p))
+(defun looking-at-p (lexer chr)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Tests whether the next character the lexer would see is CHR.
+Does not respect extended mode."
+  (and (not (end-of-string-p lexer))
+       (char= (schar (lexer-str lexer) (lexer-pos lexer))
+              chr)))
+
+(declaim (inline next-char-non-extended))
+(defun next-char-non-extended (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns the next character which is to be examined and updates the
+POS slot. Does not respect extended mode."
+  (cond ((end-of-string-p lexer)
+          nil)
+        (t
+          (prog1
+            (schar (lexer-str lexer) (lexer-pos lexer))
+            (incf (lexer-pos lexer))))))
+
+(defun next-char (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns the next character which is to be examined and updates the
+POS slot. Respects extended mode, i.e.  whitespace, comments, and also
+nested comments are skipped if applicable."
+  (let ((next-char (next-char-non-extended lexer))
+        last-loop-pos)
+    (loop
+      ;; remember where we started
+      (setq last-loop-pos (lexer-pos lexer))
+      ;; first we look for nested comments like (?#foo)
+      (when (and next-char
+                 (char= next-char #\()
+                 (looking-at-p lexer #\?))
+        (incf (lexer-pos lexer))
+        (cond ((looking-at-p lexer #\#)
+                ;; must be a nested comment - so we have to search for
+                ;; the closing parenthesis
+                (let ((error-pos (- (lexer-pos lexer) 2)))
+                  (unless
+                      ;; loop 'til ')' or end of regex string and
+                      ;; return NIL if ')' wasn't encountered
+                      (loop for skip-char = next-char
+                            then (next-char-non-extended lexer)
+                            while (and skip-char
+                                       (char/= skip-char #\)))
+                            finally (return skip-char))
+                    (signal-ppcre-syntax-error*
+                     error-pos
+                     "Comment group not closed")))
+                (setq next-char (next-char-non-extended lexer)))
+              (t
+                ;; undo effect of previous INCF if we didn't see a #
+                (decf (lexer-pos lexer)))))
+      (when *extended-mode-p*
+        ;; now - if we're in extended mode - we skip whitespace and
+        ;; comments; repeat the following loop while we look at
+        ;; whitespace or #\#
+        (loop while (and next-char
+                         (or (char= next-char #\#)
+                             (whitespacep next-char)))
+              do (setq next-char
+                         (if (char= next-char #\#)
+                           ;; if we saw a comment marker skip until
+                           ;; we're behind #\Newline...
+                           (loop for skip-char = next-char
+                                 then (next-char-non-extended lexer)
+                                 while (and skip-char
+                                            (char/= skip-char #\Newline))
+                                 finally (return (next-char-non-extended lexer)))
+                           ;; ...otherwise (whitespace) skip until we
+                           ;; see the next non-whitespace character
+                           (loop for skip-char = next-char
+                                 then (next-char-non-extended lexer)
+                                 while (and skip-char
+                                            (whitespacep skip-char))
+                                 finally (return skip-char))))))
+      ;; if the position has moved we have to repeat our tests
+      ;; because of cases like /^a (?#xxx) (?#yyy) {3}c/x which
+      ;; would be equivalent to /^a{3}c/ in Perl
+      (unless (> (lexer-pos lexer) last-loop-pos)
+        (return next-char)))))
+
+(declaim (inline fail))
+(defun fail (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Moves (LEXER-POS LEXER) back to the last position stored in
+\(LEXER-LAST-POS LEXER) and pops the LAST-POS stack."
+  (unless (lexer-last-pos lexer)
+    (signal-ppcre-syntax-error "LAST-POS stack of LEXER ~A is empty" lexer))
+  (setf (lexer-pos lexer) (pop (lexer-last-pos lexer)))
+  nil)
+
+(defun get-number (lexer &key (radix 10) max-length no-whitespace-p)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Read and consume the number the lexer is currently looking at and
+return it. Returns NIL if no number could be identified.
+RADIX is used as in PARSE-INTEGER. If MAX-LENGTH is not NIL we'll read
+at most the next MAX-LENGTH characters. If NO-WHITESPACE-P is not NIL
+we don't tolerate whitespace in front of the number."
+  (when (or (end-of-string-p lexer)
+            (and no-whitespace-p
+                 (whitespacep (schar (lexer-str lexer) (lexer-pos lexer)))))
+    (return-from get-number nil))
+  (multiple-value-bind (integer new-pos)
+      (parse-integer (lexer-str lexer)
+                     :start (lexer-pos lexer)
+                     :end (if max-length
+                            (let ((end-pos (+ (lexer-pos lexer)
+                                              (the fixnum max-length)))
+                                  (lexer-len (lexer-len lexer)))
+                              (if (< end-pos lexer-len)
+                                end-pos
+                                lexer-len))
+                            (lexer-len lexer))
+                     :radix radix
+                     :junk-allowed t)
+    (cond ((and integer (>= (the fixnum integer) 0))
+            (setf (lexer-pos lexer) new-pos)
+            integer)
+          (t nil))))
+
+(declaim (inline try-number))
+(defun try-number (lexer &key (radix 10) max-length no-whitespace-p)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Like GET-NUMBER but won't consume anything if no number is seen."
+  ;; remember current position
+  (push (lexer-pos lexer) (lexer-last-pos lexer))
+  (let ((number (get-number lexer
+                            :radix radix
+                            :max-length max-length
+                            :no-whitespace-p no-whitespace-p)))
+    (or number (fail lexer))))
+
+(declaim (inline make-char-from-code))
+(defun make-char-from-code (number error-pos)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Create character from char-code NUMBER. NUMBER can be NIL
+which is interpreted as 0. ERROR-POS is the position where
+the corresponding number started within the regex string."
+  ;; Only look at rightmost eight bits in compliance with Perl
+  (let ((code (logand #o377 (the fixnum (or number 0)))))
+    (or (and (< code char-code-limit)
+             (code-char code))
+        (signal-ppcre-syntax-error*
+         error-pos
+         "No character for hex-code ~X"
+         number))))
+
+(defun unescape-char (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Convert the characters(s) following a backslash into a token
+which is returned. This function is to be called when the backslash
+has already been consumed. Special character classes like \\W are
+handled elsewhere."
+  (when (end-of-string-p lexer)
+    (signal-ppcre-syntax-error "String ends with backslash"))
+  (let ((chr (next-char-non-extended lexer)))
+    (case chr
+      ((#\E)
+        ;; if \Q quoting is on this is ignored, otherwise it's just an
+        ;; #\E
+        (if *allow-quoting*
+          :void
+          #\E))
+      ((#\c)
+        ;; \cx means control-x in Perl
+        (let ((next-char (next-char-non-extended lexer)))
+          (unless next-char
+            (signal-ppcre-syntax-error*
+             (lexer-pos lexer)
+             "Character missing after '\\c' at position ~A"))
+          (code-char (logxor #x40 (char-code (char-upcase next-char))))))
+      ((#\x)
+        ;; \x should be followed by a hexadecimal char code,
+        ;; two digits or less
+        (let* ((error-pos (lexer-pos lexer))
+               (number (get-number lexer :radix 16 :max-length 2 :no-whitespace-p t)))
+          ;; note that it is OK if \x is followed by zero digits
+          (make-char-from-code number error-pos)))
+      ((#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
+        ;; \x should be followed by an octal char code,
+        ;; three digits or less
+        (let* ((error-pos (decf (lexer-pos lexer)))
+               (number (get-number lexer :radix 8 :max-length 3)))
+          (make-char-from-code number error-pos)))
+      ;; the following five character names are 'semi-standard'
+      ;; according to the CLHS but I'm not aware of any implementation
+      ;; that doesn't implement them
+      ((#\t)
+        #\Tab)
+      ((#\n)
+        #\Newline)
+      ((#\r)
+        #\Return)
+      ((#\f)
+        #\Page)
+      ((#\b)
+        #\Backspace)
+      ((#\a)
+        (code-char 7))                  ; ASCII bell
+      ((#\e)
+        (code-char 27))                 ; ASCII escape
+      (otherwise
+        ;; all other characters aren't affected by a backslash
+        chr))))
+
+(defun collect-char-class (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Reads and consumes characters from regex string until a right
+bracket is seen. Assembles them into a list \(which is returned) of
+characters, character ranges, like \(:RANGE #\\A #\\E) for a-e, and
+tokens representing special character classes."
+  (let ((start-pos (lexer-pos lexer))         ; remember start for error message
+        hyphen-seen
+        last-char
+        list)
+    (flet ((handle-char (c)
+             "Do the right thing with character C depending on whether
+we're inside a range or not."
+             (cond ((and hyphen-seen last-char)
+                     (setf (car list) (list :range last-char c)
+                           last-char nil))
+                   (t
+                     (push c list)
+                     (setq last-char c)))
+             (setq hyphen-seen nil)))
+      (loop for first = t then nil
+            for c = (next-char-non-extended lexer)
+            ;; leave loop if at end of string
+            while c
+            do (cond
+                 ((char= c #\\)
+                   ;; we've seen a backslash
+                   (let ((next-char (next-char-non-extended lexer)))
+                     (case next-char
+                       ((#\d #\D #\w #\W #\s #\S)
+                         ;; a special character class
+                         (push (map-char-to-special-char-class next-char) list)
+                         ;; if the last character was a hyphen
+                         ;; just collect it literally
+                         (when hyphen-seen
+                           (push #\- list))
+                         ;; if the next character is a hyphen do the same
+                         (when (looking-at-p lexer #\-)
+                           (push #\- list)
+                           (incf (lexer-pos lexer)))
+                         (setq hyphen-seen nil))
+                       ((#\E)
+                         ;; if \Q quoting is on we ignore \E,
+                         ;; otherwise it's just a plain #\E
+                         (unless *allow-quoting*
+                           (handle-char #\E)))
+                       (otherwise
+                         ;; otherwise unescape the following character(s)
+                         (decf (lexer-pos lexer))
+                         (handle-char (unescape-char lexer))))))
+                 (first
+                   ;; the first character must not be a right bracket
+                   ;; and isn't treated specially if it's a hyphen
+                   (handle-char c))
+                 ((char= c #\])
+                   ;; end of character class
+                   ;; make sure we collect a pending hyphen
+                   (when hyphen-seen
+                     (setq hyphen-seen nil)
+                     (handle-char #\-))
+                   ;; reverse the list to preserve the order intended
+                   ;; by the author of the regex string
+                   (return-from collect-char-class (nreverse list)))
+                 ((and (char= c #\-)
+                       last-char
+                       (not hyphen-seen))
+                   ;; if the last character was 'just a character'
+                   ;; we expect to be in the middle of a range
+                   (setq hyphen-seen t))
+                 ((char= c #\-)
+                   ;; otherwise this is just an ordinary hyphen
+                   (handle-char #\-))
+                 (t
+                   ;; default case - just collect the character
+                   (handle-char c))))
+      ;; we can only exit the loop normally if we've reached the end
+      ;; of the regex string without seeing a right bracket
+      (signal-ppcre-syntax-error*
+       start-pos
+       "Missing right bracket to close character class"))))
+
+(defun maybe-parse-flags (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Reads a sequence of modifiers \(including #\\- to reverse their
+meaning) and returns a corresponding list of \"flag\" tokens.  The
+\"x\" modifier is treated specially in that it dynamically modifies
+the behaviour of the lexer itself via the special variable
+*EXTENDED-MODE-P*."
+  (prog1
+    (loop with set = t
+          for chr = (next-char-non-extended lexer)
+          unless chr
+            do (signal-ppcre-syntax-error "Unexpected end of string")
+          while (find chr "-imsx" :test #'char=)
+          ;; the first #\- will invert the meaning of all modifiers
+          ;; following it
+          if (char= chr #\-)
+            do (setq set nil)
+          else if (char= chr #\x)
+            do (setq *extended-mode-p* set)
+          else collect (if set
+                         (case chr
+                           ((#\i)
+                             :case-insensitive-p)
+                           ((#\m)
+                             :multi-line-mode-p)
+                           ((#\s)
+                             :single-line-mode-p))
+                         (case chr
+                           ((#\i)
+                             :case-sensitive-p)
+                           ((#\m)
+                             :not-multi-line-mode-p)
+                           ((#\s)
+                             :not-single-line-mode-p))))
+    (decf (lexer-pos lexer))))
+
+(defun get-quantifier (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns a list of two values (min max) if what the lexer is looking
+at can be interpreted as a quantifier. Otherwise returns NIL and
+resets the lexer to its old position."
+  ;; remember starting position for FAIL and UNGET-TOKEN functions
+  (push (lexer-pos lexer) (lexer-last-pos lexer))
+  (let ((next-char (next-char lexer)))
+    (case next-char
+      ((#\*)
+        ;; * (Kleene star): match 0 or more times
+        '(0 nil))
+      ((#\+)
+        ;; +: match 1 or more times
+        '(1 nil))
+      ((#\?)
+        ;; ?: match 0 or 1 times
+        '(0 1))
+      ((#\{)
+        ;; one of
+        ;;   {n}:   match exactly n times
+        ;;   {n,}:  match at least n times
+        ;;   {n,m}: match at least n but not more than m times
+        ;; note that anything not matching one of these patterns will
+        ;; be interpreted literally - even whitespace isn't allowed
+        (let ((num1 (get-number lexer :no-whitespace-p t)))
+          (if num1
+            (let ((next-char (next-char-non-extended lexer)))
+              (case next-char
+                ((#\,)
+                  (let* ((num2 (get-number lexer :no-whitespace-p t))
+                         (next-char (next-char-non-extended lexer)))
+                    (case next-char
+                      ((#\})
+                        ;; this is the case {n,} (NUM2 is NIL) or {n,m}
+                        (list num1 num2))
+                      (otherwise
+                        (fail lexer)))))
+                ((#\})
+                  ;; this is the case {n}
+                  (list num1 num1))
+                (otherwise
+                  (fail lexer))))
+            ;; no number following left curly brace, so we treat it
+            ;; like a normal character
+            (fail lexer))))
+      ;; cannot be a quantifier
+      (otherwise
+        (fail lexer)))))
+
+(defun get-token (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns and consumes the next token from the regex string (or NIL)."
+  ;; remember starting position for UNGET-TOKEN function
+  (push (lexer-pos lexer)
+        (lexer-last-pos lexer))
+  (let ((next-char (next-char lexer)))
+    (cond (next-char
+            (case next-char
+              ;; the easy cases first - the following six characters
+              ;; always have a special meaning and get translated
+              ;; into tokens immediately
+              ((#\))
+                :close-paren)
+              ((#\|)
+                :vertical-bar)
+              ((#\?)
+                :question-mark)
+              ((#\.)
+                :everything)
+              ((#\^)
+                :start-anchor)
+              ((#\$)
+                :end-anchor)
+              ((#\+ #\*)
+                ;; quantifiers will always be consumend by
+                ;; GET-QUANTIFIER, they must not appear here
+                (signal-ppcre-syntax-error*
+                 (1- (lexer-pos lexer))
+                 "Quantifier '~A' not allowed"
+                 next-char))
+              ((#\{)
+                ;; left brace isn't a special character in it's own
+                ;; right but we must check if what follows might
+                ;; look like a quantifier
+                (let ((this-pos (lexer-pos lexer))
+                      (this-last-pos (lexer-last-pos lexer)))
+                  (unget-token lexer)
+                  (when (get-quantifier lexer)
+                    (signal-ppcre-syntax-error*
+                     (car this-last-pos)
+                     "Quantifier '~A' not allowed"
+                     (subseq (lexer-str lexer)
+                             (car this-last-pos)
+                             (lexer-pos lexer))))
+                  (setf (lexer-pos lexer) this-pos
+                        (lexer-last-pos lexer) this-last-pos)
+                  next-char))
+              ((#\[)
+                ;; left bracket always starts a character class
+                (cons  (cond ((looking-at-p lexer #\^)
+                               (incf (lexer-pos lexer))
+                               :inverted-char-class)
+                             (t
+                               :char-class))
+                       (collect-char-class lexer)))
+              ((#\\)
+                ;; backslash might mean different things so we have
+                ;; to peek one char ahead:
+                (let ((next-char (next-char-non-extended lexer)))
+                  (case next-char
+                    ((#\A)
+                      :modeless-start-anchor)
+                    ((#\Z)
+                      :modeless-end-anchor)
+                    ((#\z)
+                      :modeless-end-anchor-no-newline)
+                    ((#\b)
+                      :word-boundary)
+                    ((#\B)
+                      :non-word-boundary)
+                    ((#\d #\D #\w #\W #\s #\S)
+                      ;; these will be treated like character classes
+                      (map-char-to-special-char-class next-char))
+                    ((#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)
+                      ;; uh, a digit...
+                      (let* ((old-pos (decf (lexer-pos lexer)))
+                             ;; ...so let's get the whole number first
+                             (backref-number (get-number lexer)))
+                        (declare (type fixnum backref-number))
+                        (cond ((and (> backref-number (lexer-reg lexer))
+                                    (<= 10 backref-number))
+                                ;; \10 and higher are treated as octal
+                                ;; character codes if we haven't
+                                ;; opened that much register groups
+                                ;; yet
+                                (setf (lexer-pos lexer) old-pos)
+                                ;; re-read the number from the old
+                                ;; position and convert it to its
+                                ;; corresponding character
+                                (make-char-from-code (get-number lexer :radix 8 :max-length 3)
+                                                     old-pos))
+                              (t
+                                ;; otherwise this must refer to a
+                                ;; backreference
+                                (list :back-reference backref-number)))))
+                    ((#\0)
+                      ;; this always means an octal character code
+                      ;; (at most three digits)
+                      (let ((old-pos (decf (lexer-pos lexer))))
+                        (make-char-from-code (get-number lexer :radix 8 :max-length 3)
+                                             old-pos)))
+                    (otherwise
+                      ;; in all other cases just unescape the
+                      ;; character
+                      (decf (lexer-pos lexer))
+                      (unescape-char lexer)))))
+              ((#\()
+                ;; an open parenthesis might mean different things
+                ;; depending on what follows...
+                (cond ((looking-at-p lexer #\?)
+                        ;; this is the case '(?' (and probably more behind)
+                        (incf (lexer-pos lexer))
+                        ;; we have to check for modifiers first
+                        ;; because a colon might follow
+                        (let* ((flags (maybe-parse-flags lexer))
+                               (next-char (next-char-non-extended lexer)))
+                          ;; modifiers are only allowed if a colon
+                          ;; or a closing parenthesis are following
+                          (when (and flags
+                                     (not (find next-char ":)" :test #'char=)))
+                            (signal-ppcre-syntax-error*
+                             (car (lexer-last-pos lexer))
+                             "Sequence '~A' not recognized"
+                             (subseq (lexer-str lexer)
+                                     (car (lexer-last-pos lexer))
+                                     (lexer-pos lexer))))
+                          (case next-char
+                            ((nil)
+                              ;; syntax error
+                              (signal-ppcre-syntax-error
+                               "End of string following '(?'"))
+                            ((#\))
+                              ;; an empty group except for the flags
+                              ;; (if there are any)
+                              (or (and flags
+                                       (cons :flags flags))
+                                  :void))
+                            ((#\()
+                              ;; branch
+                              :open-paren-paren)
+                            ((#\>)
+                              ;; standalone
+                              :open-paren-greater)
+                            ((#\=)
+                              ;; positive look-ahead
+                              :open-paren-equal)
+                            ((#\!)
+                              ;; negative look-ahead
+                              :open-paren-exclamation)
+                            ((#\:)
+                              ;; non-capturing group - return flags as
+                              ;; second value
+                              (values :open-paren-colon flags))
+                            ((#\<)
+                              ;; might be a look-behind assertion, so
+                              ;; check next character
+                              (let ((next-char (next-char-non-extended lexer)))
+                                (case next-char
+                                  ((#\=)
+                                    ;; positive look-behind
+                                    :open-paren-less-equal)
+                                  ((#\!)
+                                    ;; negative look-behind
+                                    :open-paren-less-exclamation)
+                                  ((#\))
+                                    ;; Perl allows "(?<)" and treats
+                                    ;; it like a null string
+                                    :void)
+                                  ((nil)
+                                    ;; syntax error
+                                    (signal-ppcre-syntax-error
+                                     "End of string following '(?<'"))
+                                  (t
+                                    ;; also syntax error
+                                    (signal-ppcre-syntax-error*
+                                     (1- (lexer-pos lexer))
+                                     "Character '~A' may not follow '(?<'"
+                                     next-char )))))
+                            (otherwise
+                              (signal-ppcre-syntax-error*
+                               (1- (lexer-pos lexer))
+                               "Character '~A' may not follow '(?'"
+                               next-char)))))
+                      (t
+                        ;; if next-char was not #\? (this is within
+                        ;; the first COND), we've just seen an opening
+                        ;; parenthesis and leave it like that
+                        :open-paren)))
+              (otherwise
+                ;; all other characters are their own tokens
+                next-char)))
+          ;; we didn't get a character (this if the "else" branch from
+          ;; the first IF), so we don't return a token but NIL
+          (t
+            (pop (lexer-last-pos lexer))
+            nil))))
+
+(declaim (inline unget-token))
+(defun unget-token (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Moves the lexer back to the last position stored in the LAST-POS stack."
+  (if (lexer-last-pos lexer)
+    (setf (lexer-pos lexer)
+            (pop (lexer-last-pos lexer)))
+    (error "No token to unget \(this should not happen)")))
+
+(declaim (inline start-of-subexpr-p))
+(defun start-of-subexpr-p (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Tests whether the next token can start a valid sub-expression, i.e.
+a stand-alone regex."
+  (let* ((pos (lexer-pos lexer))
+         (next-char (next-char lexer)))
+    (not (or (null next-char)
+             (prog1
+               (member (the character next-char)
+                       '(#\) #\|)
+                       :test #'char=)
+               (setf (lexer-pos lexer) pos))))))
--- a/load.lisp
+++ b/load.lisp
@ -0,0 +1,62 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/load.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz.  All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-user)
+
+(defparameter *cl-ppcre-base-directory*
+  (make-pathname :name nil :type nil :version nil
+                 :defaults (parse-namestring *load-truename*)))
+
+(loop for file in '("packages"
+                    "specials"
+                    "util"
+                    "errors"
+                    "lexer"
+                    "parser"
+                    "regex-class"
+                    "convert"
+                    "optimize"
+                    "closures"
+                    "repetition-closures"
+                    "scanner"
+                    "api"
+                    "ppcre-tests")
+      do (let ((pathname (make-pathname :name file :type "lisp" :version nil
+                                        :defaults *cl-ppcre-base-directory*)))
+           #-:cormanlisp
+           (let ((compiled-pathname (compile-file-pathname pathname)))
+             (unless (probe-file compiled-pathname)
+               (compile-file pathname))
+             (setq pathname compiled-pathname))
+           (load pathname)))
+
+
+
+
+
--- a/optimize.lisp
+++ b/optimize.lisp
@ -0,0 +1,597 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/optimize.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; This file contains optimizations which can be applied to converted
+;;; parse trees.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defun string-list-to-simple-string (string-list)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Concatenates a list of strings to one simple-string."
+  ;; this function provided by JP Massar; note that we can't use APPLY
+  ;; with CONCATENATE here because of CALL-ARGUMENTS-LIMIT
+  (let ((total-size 0))
+    (declare (type fixnum total-size))
+    (dolist (string string-list)
+      (declare (type string string))
+      (incf total-size (length string)))
+    (let ((result-string (make-sequence 'simple-string total-size))
+          (curr-pos 0))
+      (declare (type fixnum curr-pos))
+      (dolist (string string-list)
+        (declare (type string string))
+        (replace result-string string :start1 curr-pos)
+        (incf curr-pos (length string)))
+      result-string)))
+
+(defgeneric flatten (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Merges adjacent sequences and alternations, i.e. it
+transforms #<SEQ #<STR \"a\"> #<SEQ #<STR \"b\"> #<STR \"c\">>> to
+#<SEQ #<STR \"a\"> #<STR \"b\"> #<STR \"c\">>. This is a destructive
+operation on REGEX."))
+
+(defmethod flatten ((seq seq))
+  ;; this looks more complicated than it is because we modify SEQ in
+  ;; place to avoid unnecessary consing
+  (let ((elements-rest (elements seq)))
+    (loop
+      (unless elements-rest
+        (return))
+      (let ((flattened-element (flatten (car elements-rest)))
+            (next-elements-rest (cdr elements-rest)))
+        (cond ((typep flattened-element 'seq)
+                ;; FLATTENED-ELEMENT is a SEQ object, so we "splice"
+                ;; it into out list of elements
+                (let ((flattened-element-elements
+                        (elements flattened-element)))
+                  (setf (car elements-rest)
+                          (car flattened-element-elements)
+                        (cdr elements-rest)
+                          (nconc (cdr flattened-element-elements)
+                                 (cdr elements-rest)))))
+              (t
+                ;; otherwise we just replace the current element with
+                ;; its flattened counterpart
+                (setf (car elements-rest) flattened-element)))
+        (setq elements-rest next-elements-rest))))
+  (let ((elements (elements seq)))
+    (cond ((cadr elements)
+            seq)
+          ((cdr elements)
+            (first elements))
+          (t (make-instance 'void)))))
+
+(defmethod flatten ((alternation alternation))
+  ;; same algorithm as above
+  (let ((choices-rest (choices alternation)))
+    (loop
+      (unless choices-rest
+        (return))
+      (let ((flattened-choice (flatten (car choices-rest)))
+            (next-choices-rest (cdr choices-rest)))
+        (cond ((typep flattened-choice 'alternation)
+                (let ((flattened-choice-choices
+                        (choices flattened-choice)))
+                  (setf (car choices-rest)
+                          (car flattened-choice-choices)
+                        (cdr choices-rest)
+                          (nconc (cdr flattened-choice-choices)
+                                 (cdr choices-rest)))))
+              (t
+                (setf (car choices-rest) flattened-choice)))
+        (setq choices-rest next-choices-rest))))
+  (let ((choices (choices alternation)))
+    (cond ((cadr choices)
+            alternation)
+          ((cdr choices)
+            (first choices))
+          (t (signal-ppcre-syntax-error
+              "Encountered alternation without choices.")))))
+
+(defmethod flatten ((branch branch))
+  (with-slots ((test test)
+               (then-regex then-regex)
+               (else-regex else-regex))
+      branch
+    (setq test
+            (if (numberp test)
+              test
+              (flatten test))
+          then-regex (flatten then-regex)
+          else-regex (flatten else-regex))
+    branch))
+
+(defmethod flatten ((regex regex))
+  (typecase regex
+    ((or repetition register lookahead lookbehind standalone)
+      ;; if REGEX contains exactly one inner REGEX object flatten it
+      (setf (regex regex)
+              (flatten (regex regex)))
+      regex)
+    (t
+      ;; otherwise (ANCHOR, BACK-REFERENCE, CHAR-CLASS, EVERYTHING,
+      ;; LOOKAHEAD, LOOKBEHIND, STR, VOID, and WORD-BOUNDARY) do
+      ;; nothing
+      regex)))
+
+(defgeneric gather-strings (regex)
+    (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Collects adjacent strings or characters into one
+string provided they have the same case mode. This is a destructive
+operation on REGEX."))
+
+(defmethod gather-strings ((seq seq))
+  ;; note that GATHER-STRINGS is to be applied after FLATTEN, i.e. it
+  ;; expects SEQ to be flattened already; in particular, SEQ cannot be
+  ;; empty and cannot contain embedded SEQ objects
+  (let* ((start-point (cons nil (elements seq)))
+         (curr-point start-point)
+         old-case-mode
+         collector
+         collector-start
+         (collector-length 0)
+         skip)
+    (declare (type fixnum collector-length))
+    (loop
+      (let ((elements-rest (cdr curr-point)))
+        (unless elements-rest
+          (return))
+        (let* ((element (car elements-rest))
+               (case-mode (case-mode element old-case-mode)))
+          (cond ((and case-mode
+                      (eq case-mode old-case-mode))
+                  ;; if ELEMENT is a STR and we have collected a STR of
+                  ;; the same case mode in the last iteration we
+                  ;; concatenate ELEMENT onto COLLECTOR and remember the
+                  ;; value of its SKIP slot
+                  (let ((old-collector-length collector-length))
+                    (unless (and (adjustable-array-p collector)
+                                 (array-has-fill-pointer-p collector))
+                      (setq collector
+                              (make-array collector-length
+                                          :initial-contents collector
+                                          :element-type 'character
+                                          :fill-pointer t
+                                          :adjustable t)
+                            collector-start nil))
+                    (adjust-array collector
+                                  (incf collector-length (len element))
+                                  :fill-pointer t)
+                    (setf (subseq collector
+                                  old-collector-length)
+                            (str element)
+                          ;; it suffices to remember the last SKIP slot
+                          ;; because due to the way MAYBE-ACCUMULATE
+                          ;; works adjacent STR objects have the same
+                          ;; SKIP value
+                          skip (skip element)))
+                  (setf (cdr curr-point) (cdr elements-rest)))
+                (t
+                  (let ((collected-string
+                          (cond (collector-start
+                                  collector-start)
+                                (collector
+                                  ;; if we have collected something already
+                                  ;; we convert it into a STR
+                                  (make-instance 'str
+                                                 :skip skip
+                                                 :str collector
+                                                 :case-insensitive-p
+                                                 (eq old-case-mode
+                                                     :case-insensitive)))
+                                (t nil))))
+                    (cond (case-mode
+                            ;; if ELEMENT is a string with a different case
+                            ;; mode than the last one we have either just
+                            ;; converted COLLECTOR into a STR or COLLECTOR
+                            ;; is still empty; in both cases we can now
+                            ;; begin to fill it anew
+                            (setq collector (str element)
+                                  collector-start element
+                                  ;; and we remember the SKIP value as above
+                                  skip (skip element)
+                                  collector-length (len element))
+                            (cond (collected-string
+                                    (setf (car elements-rest)
+                                            collected-string
+                                          curr-point
+                                            (cdr curr-point)))
+                                  (t
+                                    (setf (cdr curr-point)
+                                            (cdr elements-rest)))))
+                          (t
+                            ;; otherwise this is not a STR so we apply
+                            ;; GATHER-STRINGS to it and collect it directly
+                            ;; into RESULT
+                            (cond (collected-string
+                                    (setf (car elements-rest)
+                                            collected-string
+                                          curr-point
+                                            (cdr curr-point)
+                                          (cdr curr-point)
+                                            (cons (gather-strings element)
+                                                  (cdr curr-point))
+                                          curr-point
+                                            (cdr curr-point)))
+                                  (t
+                                    (setf (car elements-rest)
+                                            (gather-strings element)
+                                          curr-point
+                                            (cdr curr-point))))
+                            ;; we also have to empty COLLECTOR here in case
+                            ;; it was still filled from the last iteration
+                            (setq collector nil
+                                  collector-start nil))))))
+          (setq old-case-mode case-mode))))
+    (when collector
+      (setf (cdr curr-point)
+              (cons
+               (make-instance 'str
+                              :skip skip
+                              :str collector
+                              :case-insensitive-p
+                              (eq old-case-mode
+                                  :case-insensitive))
+               nil)))
+    (setf (elements seq) (cdr start-point))
+    seq))
+
+(defmethod gather-strings ((alternation alternation))
+  ;; loop ON the choices of ALTERNATION so we can modify them directly
+  (loop for choices-rest on (choices alternation)
+        while choices-rest
+        do (setf (car choices-rest)
+                   (gather-strings (car choices-rest))))
+  alternation)
+
+(defmethod gather-strings ((branch branch))
+  (with-slots ((test test)
+               (then-regex then-regex)
+               (else-regex else-regex))
+      branch
+    (setq test
+            (if (numberp test)
+              test
+              (gather-strings test))
+          then-regex (gather-strings then-regex)
+          else-regex (gather-strings else-regex))
+    branch))
+
+(defmethod gather-strings ((regex regex))
+  (typecase regex
+    ((or repetition register lookahead lookbehind standalone)
+      ;; if REGEX contains exactly one inner REGEX object apply
+      ;; GATHER-STRINGS to it
+      (setf (regex regex)
+              (gather-strings (regex regex)))
+      regex)
+    (t
+      ;; otherwise (ANCHOR, BACK-REFERENCE, CHAR-CLASS, EVERYTHING,
+      ;; LOOKAHEAD, LOOKBEHIND, STR, VOID, and WORD-BOUNDARY) do
+      ;; nothing
+      regex)))
+
+;; Note that START-ANCHORED-P will be called after FLATTEN and GATHER-STRINGS.
+
+(defgeneric start-anchored-p (regex &optional in-seq-p)
+    (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns T if REGEX starts with a \"real\" start
+anchor, i.e. one that's not in multi-line mode, NIL otherwise. If
+IN-SEQ-P is true the function will return :ZERO-LENGTH if REGEX is a
+zero-length assertion."))
+
+(defmethod start-anchored-p ((seq seq) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  ;; note that START-ANCHORED-P is to be applied after FLATTEN and
+  ;; GATHER-STRINGS, i.e. SEQ cannot be empty and cannot contain
+  ;; embedded SEQ objects
+  (loop for element in (elements seq)
+        for anchored-p = (start-anchored-p element t)
+        ;; skip zero-length elements because they won't affect the
+        ;; "anchoredness" of the sequence
+        while (eq anchored-p :zero-length)
+        finally (return (and anchored-p (not (eq anchored-p :zero-length))))))
+
+(defmethod start-anchored-p ((alternation alternation) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  ;; clearly an alternation can only be start-anchored if all of its
+  ;; choices are start-anchored
+  (loop for choice in (choices alternation)
+        always (start-anchored-p choice)))
+
+(defmethod start-anchored-p ((branch branch) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  (and (start-anchored-p (then-regex branch))
+       (start-anchored-p (else-regex branch))))
+
+(defmethod start-anchored-p ((repetition repetition) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  ;; well, this wouldn't make much sense, but anyway...
+  (and (plusp (minimum repetition))
+       (start-anchored-p (regex repetition))))
+
+(defmethod start-anchored-p ((register register) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  (start-anchored-p (regex register)))
+
+(defmethod start-anchored-p ((standalone standalone) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  (start-anchored-p (regex standalone)))
+
+(defmethod start-anchored-p ((anchor anchor) &optional in-seq-p)
+  (declare (ignore in-seq-p))
+  (and (startp anchor)
+       (not (multi-line-p anchor))))
+
+(defmethod start-anchored-p ((regex regex) &optional in-seq-p)
+  (typecase regex
+    ((or lookahead lookbehind word-boundary void)
+      ;; zero-length assertions
+      (if in-seq-p
+        :zero-length
+        nil))
+    (t
+      ;; BACK-REFERENCE, CHAR-CLASS, EVERYTHING, and STR
+      nil)))
+
+;; Note that END-STRING-AUX will be called after FLATTEN and GATHER-STRINGS.
+
+(defgeneric end-string-aux (regex &optional old-case-insensitive-p)
+    (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns the constant string (if it exists) REGEX
+ends with wrapped into a STR object, otherwise NIL.
+OLD-CASE-INSENSITIVE-P is the CASE-INSENSITIVE-P slot of the last STR
+collected or :VOID if no STR has been collected yet. (This is a helper
+function called by END-STRIN.)"))
+
+(defmethod end-string-aux ((str str)
+                           &optional (old-case-insensitive-p :void))
+  (declare (special last-str))
+  (cond ((and (not (skip str))          ; avoid constituents of STARTS-WITH
+              ;; only use STR if nothing has been collected yet or if
+              ;; the collected string has the same value for
+              ;; CASE-INSENSITIVE-P
+              (or (eq old-case-insensitive-p :void)
+                  (eq (case-insensitive-p str) old-case-insensitive-p)))
+          (setf last-str str
+                ;; set the SKIP property of this STR
+                (skip str) t)
+          str)
+        (t nil)))
+
+(defmethod end-string-aux ((seq seq)
+                           &optional (old-case-insensitive-p :void))
+  (declare (special continuep))
+  (let (case-insensitive-p
+        concatenated-string
+        concatenated-start
+        (concatenated-length 0))
+    (declare (type fixnum concatenated-length))
+    (loop for element in (reverse (elements seq))
+          ;; remember the case-(in)sensitivity of the last relevant
+          ;; STR object
+          for loop-old-case-insensitive-p = old-case-insensitive-p
+            then (if skip
+                   loop-old-case-insensitive-p
+                   (case-insensitive-p element-end))
+          ;; the end-string of the current element
+          for element-end = (end-string-aux element
+                                            loop-old-case-insensitive-p)
+          ;; whether we encountered a zero-length element
+          for skip = (if element-end
+                       (zerop (len element-end))
+                       nil)
+          ;; set CONTINUEP to NIL if we have to stop collecting to
+          ;; alert END-STRING-AUX methods on enclosing SEQ objects
+          unless element-end
+            do (setq continuep nil)
+          ;; end loop if we neither got a STR nor a zero-length
+          ;; element
+          while element-end
+          ;; only collect if not zero-length
+          unless skip
+            do (cond (concatenated-string
+                       (when concatenated-start
+                         (setf concatenated-string
+                                 (make-array concatenated-length
+                                             :initial-contents (reverse (str concatenated-start))
+                                             :element-type 'character
+                                             :fill-pointer t
+                                             :adjustable t)
+                               concatenated-start nil))
+                       (let ((len (len element-end))
+                             (str (str element-end)))
+                         (declare (type fixnum len))
+                         (incf concatenated-length len)
+                         (loop for i of-type fixnum downfrom (1- len) to 0
+                               do (vector-push-extend (char str i)
+                                                      concatenated-string))))
+                     (t
+                       (setf concatenated-string
+                               t
+                             concatenated-start
+                               element-end
+                             concatenated-length
+                               (len element-end)
+                             case-insensitive-p
+                               (case-insensitive-p element-end))))
+          ;; stop collecting if END-STRING-AUX on inner SEQ has said so
+          while continuep)
+    (cond ((zerop concatenated-length)
+            ;; don't bother to return zero-length strings
+            nil)
+          (concatenated-start
+            concatenated-start)
+          (t
+            (make-instance 'str
+                           :str (nreverse concatenated-string)
+                           :case-insensitive-p case-insensitive-p)))))
+
+(defmethod end-string-aux ((register register)
+                           &optional (old-case-insensitive-p :void))
+  (end-string-aux (regex register) old-case-insensitive-p))
+    
+(defmethod end-string-aux ((standalone standalone)
+                           &optional (old-case-insensitive-p :void))
+  (end-string-aux (regex standalone) old-case-insensitive-p))
+    
+(defmethod end-string-aux ((regex regex)
+                           &optional (old-case-insensitive-p :void))
+  (declare (special last-str end-anchored-p continuep))
+  (typecase regex
+    ((or anchor lookahead lookbehind word-boundary void)
+      ;; a zero-length REGEX object - for the sake of END-STRING-AUX
+      ;; this is a zero-length string
+      (when (and (typep regex 'anchor)
+                 (not (startp regex))
+                 (or (no-newline-p regex)
+                     (not (multi-line-p regex)))
+                 (eq old-case-insensitive-p :void))
+        ;; if this is a "real" end-anchor and we haven't collected
+        ;; anything so far we can set END-ANCHORED-P (where 1 or 0
+        ;; indicate whether we accept a #\Newline at the end or not)
+        (setq end-anchored-p (if (no-newline-p regex) 0 1)))
+      (make-instance 'str
+                     :str ""
+                     :case-insensitive-p :void))
+    (t
+      ;; (ALTERNATION, BACK-REFERENCE, BRANCH, CHAR-CLASS, EVERYTHING,
+      ;; REPETITION)
+      nil)))
+
+(defmethod end-string ((regex regex))
+  (declare (special end-string-offset))
+    (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns the constant string (if it exists) REGEX ends with wrapped
+into a STR object, otherwise NIL."
+  ;; LAST-STR points to the last STR object (seen from the end) that's
+  ;; part of END-STRING; CONTINUEP is set to T if we stop collecting
+  ;; in the middle of a SEQ
+  (let ((continuep t)
+        last-str)
+    (declare (special continuep last-str))
+    (prog1
+      (end-string-aux regex)
+      (when last-str
+        ;; if we've found something set the START-OF-END-STRING-P of
+        ;; the leftmost STR collected accordingly and remember the
+        ;; OFFSET of this STR (in a special variable provided by the
+        ;; caller of this function)
+        (setf (start-of-end-string-p last-str) t
+              end-string-offset (offset last-str))))))
+
+(defgeneric compute-min-rest (regex current-min-rest)
+    (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns the minimal length of REGEX plus
+CURRENT-MIN-REST. This is similar to REGEX-MIN-LENGTH except that it
+recurses down into REGEX and sets the MIN-REST slots of REPETITION
+objects."))
+
+(defmethod compute-min-rest ((seq seq) current-min-rest)
+  (loop for element in (reverse (elements seq))
+        for last-min-rest = current-min-rest then this-min-rest
+        for this-min-rest = (compute-min-rest element last-min-rest)
+        finally (return this-min-rest)))
+    
+(defmethod compute-min-rest ((alternation alternation) current-min-rest)
+  (loop for choice in (choices alternation)
+        minimize (compute-min-rest choice current-min-rest)))
+
+(defmethod compute-min-rest ((branch branch) current-min-rest)
+  (min (compute-min-rest (then-regex branch) current-min-rest)
+       (compute-min-rest (else-regex branch) current-min-rest)))
+
+(defmethod compute-min-rest ((str str) current-min-rest)
+  (+ current-min-rest (len str)))
+    
+(defmethod compute-min-rest ((repetition repetition) current-min-rest)
+  (setf (min-rest repetition) current-min-rest)
+  (compute-min-rest (regex repetition) current-min-rest)
+  (+ current-min-rest (* (minimum repetition) (min-len repetition))))
+
+(defmethod compute-min-rest ((register register) current-min-rest)
+  (compute-min-rest (regex register) current-min-rest))
+    
+(defmethod compute-min-rest ((standalone standalone) current-min-rest)
+  (declare (ignore current-min-rest))
+  (compute-min-rest (regex standalone) 0))
+    
+(defmethod compute-min-rest ((lookahead lookahead) current-min-rest)
+  (compute-min-rest (regex lookahead) 0)
+  current-min-rest)
+    
+(defmethod compute-min-rest ((lookbehind lookbehind) current-min-rest)
+  (compute-min-rest (regex lookbehind) (+ current-min-rest (len lookbehind)))
+  current-min-rest)
+    
+(defmethod compute-min-rest ((regex regex) current-min-rest)
+  (typecase regex
+    ((or char-class everything)
+      (1+ current-min-rest))
+    (t
+      ;; zero min-len and no embedded regexes (ANCHOR,
+      ;; BACK-REFERENCE, VOID, and WORD-BOUNDARY)
+      current-min-rest)))
--- a/packages.lisp
+++ b/packages.lisp
@ -0,0 +1,88 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-USER; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/packages.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-user)
+
+#-:cormanlisp
+(defpackage #:cl-ppcre
+  (:nicknames #:ppcre)
+  (:use #:cl)
+  (:export #:create-scanner
+           #:scan
+           #:scan-to-strings
+           #:do-scans
+           #:do-matches
+           #:do-matches-as-strings
+           #:all-matches
+           #:all-matches-as-strings
+           #:split
+           #:regex-replace
+           #:regex-replace-all
+           #:regex-apropos
+           #:regex-apropos-list
+           #:quote-meta-chars
+           #:*regex-char-code-limit*
+           #:*use-bmh-matchers*
+           #:*allow-quoting*
+           #:ppcre-error
+           #:ppcre-invocation-error
+           #:ppcre-syntax-error
+           #:ppcre-syntax-error-string
+           #:ppcre-syntax-error-pos
+           #:register-groups-bind
+           #:do-register-groups))
+
+#+:cormanlisp
+(defpackage "CL-PPCRE"
+  (:nicknames "PPCRE")
+  (:use "CL")
+  (:export "CREATE-SCANNER"
+           "SCAN"
+           "SCAN-TO-STRINGS"
+           "DO-SCANS"
+           "DO-MATCHES"
+           "DO-MATCHES-AS-STRINGS"
+           "ALL-MATCHES"
+           "ALL-MATCHES-AS-STRINGS"
+           "SPLIT"
+           "REGEX-REPLACE"
+           "REGEX-REPLACE-ALL"
+           "REGEX-APROPOS"
+           "REGEX-APROPOS-LIST"
+           "QUOTE-META-CHARS"
+           "*REGEX-CHAR-CODE-LIMIT*"
+           "*USE-BMH-MATCHERS*"
+           "*ALLOW-QUOTING*"
+           "PPCRE-ERROR"
+           "PPCRE-INVOCATION-ERROR"
+           "PPCRE-SYNTAX-ERROR"
+           "PPCRE-SYNTAX-ERROR-STRING"
+           "PPCRE-SYNTAX-ERROR-POS"
+           "REGISTER-GROUPS-BIND"
+           "DO-REGISTER-GROUPS"))
--- a/parser.lisp
+++ b/parser.lisp
@ -0,0 +1,347 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/parser.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; The parser will - with the help of the lexer - parse a regex
+;;; string and convert it into a "parse tree" (see docs for details
+;;; about the syntax of these trees). Note that the lexer might return
+;;; illegal parse trees. It is assumed that the conversion process
+;;; later on will track them down.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defun group (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Parses and consumes a <group>.
+The productions are: <group> -> \"(\"<regex>\")\"
+                                \"(?:\"<regex>\")\"
+                                \"(?<\"<regex>\")\"
+                                \"(?<flags>:\"<regex>\")\"
+                                \"(?=\"<regex>\")\"
+                                \"(?!\"<regex>\")\"
+                                \"(?<=\"<regex>\")\"
+                                \"(?<!\"<regex>\")\"
+                                \"(?(\"<num>\")\"<regex>\")\"
+                                \"(?(\"<regex>\")\"<regex>\")\"
+                                <legal-token>
+where <flags> is parsed by the lexer function MAYBE-PARSE-FLAGS.
+Will return <parse-tree> or (<grouping-type> <parse-tree>) where
+<grouping-type> is one of six keywords - see source for details."
+  (multiple-value-bind (open-token flags)
+      (get-token lexer)
+    (cond ((eq open-token :open-paren-paren)
+            ;; special case for conditional regular expressions; note
+            ;; that at this point we accept a couple of illegal
+            ;; combinations which'll be sorted out later by the
+            ;; converter
+            (let* ((open-paren-pos (car (lexer-last-pos lexer)))
+                   ;; check if what follows "(?(" is a number
+                   (number (try-number lexer :no-whitespace-p t))
+                   ;; make changes to extended-mode-p local
+                   (*extended-mode-p* *extended-mode-p*))
+              (declare (type fixnum open-paren-pos))
+              (cond (number
+                      ;; condition is a number (i.e. refers to a
+                      ;; back-reference)
+                      (let* ((inner-close-token (get-token lexer))
+                             (reg-expr (reg-expr lexer))
+                             (close-token (get-token lexer)))
+                        (unless (eq inner-close-token :close-paren)
+                          (signal-ppcre-syntax-error*
+                           (+ open-paren-pos 2)
+                           "Opening paren has no matching closing paren"))
+                        (unless (eq close-token :close-paren)
+                          (signal-ppcre-syntax-error*
+                           open-paren-pos
+                           "Opening paren has no matching closing paren"))
+                        (list :branch number reg-expr)))
+                    (t
+                      ;; condition must be a full regex (actually a
+                      ;; look-behind or look-ahead); and here comes a
+                      ;; terrible kludge: instead of being cleanly
+                      ;; separated from the lexer, the parser pushes
+                      ;; back the lexer by one position, thereby
+                      ;; landing in the middle of the 'token' "(?(" -
+                      ;; yuck!!
+                      (decf (lexer-pos lexer))
+                      (let* ((inner-reg-expr (group lexer))
+                             (reg-expr (reg-expr lexer))
+                             (close-token (get-token lexer)))
+                        (unless (eq close-token :close-paren)
+                          (signal-ppcre-syntax-error*
+                           open-paren-pos
+                           "Opening paren has no matching closing paren"))
+                        (list :branch inner-reg-expr reg-expr))))))
+          ((member open-token '(:open-paren
+                                :open-paren-colon
+                                :open-paren-greater
+                                :open-paren-equal
+                                :open-paren-exclamation
+                                :open-paren-less-equal
+                                :open-paren-less-exclamation)
+                   :test #'eq)
+            ;; make changes to extended-mode-p local
+            (let ((*extended-mode-p* *extended-mode-p*))
+              ;; we saw one of the six token representing opening
+              ;; parentheses
+              (let* ((open-paren-pos (car (lexer-last-pos lexer)))
+                     (reg-expr (reg-expr lexer))
+                     (close-token (get-token lexer)))
+                (when (eq open-token :open-paren)
+                  ;; if this is the "("<regex>")" production we have to
+                  ;; increment the register counter of the lexer
+                  (incf (lexer-reg lexer)))
+                (unless (eq close-token :close-paren)
+                  ;; the token following <regex> must be the closing
+                  ;; parenthesis or this is a syntax error
+                  (signal-ppcre-syntax-error*
+                   open-paren-pos
+                   "Opening paren has no matching closing paren"))
+                (if flags
+                  ;; if the lexer has returned a list of flags this must
+                  ;; have been the "(?:"<regex>")" production
+                  (cons :group (nconc flags (list reg-expr)))
+                  (list (case open-token
+                          ((:open-paren)
+                            :register)
+                          ((:open-paren-colon)
+                            :group)
+                          ((:open-paren-greater)
+                            :standalone)
+                          ((:open-paren-equal)
+                            :positive-lookahead)
+                          ((:open-paren-exclamation)
+                            :negative-lookahead)
+                          ((:open-paren-less-equal)
+                            :positive-lookbehind)
+                          ((:open-paren-less-exclamation)
+                            :negative-lookbehind))
+                        reg-expr)))))
+          (t
+            ;; this is the <legal-token> production; <legal-token> is
+            ;; any token which passes START-OF-SUBEXPR-P (otherwise
+            ;; parsing had already stopped in the SEQ method)
+            open-token))))
+
+(defun greedy-quant (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Parses and consumes a <greedy-quant>.
+The productions are: <greedy-quant> -> <group> | <group><quantifier>
+where <quantifier> is parsed by the lexer function GET-QUANTIFIER.
+Will return <parse-tree> or (:GREEDY-REPETITION <min> <max> <parse-tree>)."
+  (let* ((group (group lexer))
+         (token (get-quantifier lexer)))
+    (if token
+      ;; if GET-QUANTIFIER returned a non-NIL value it's the
+      ;; two-element list (<min> <max>)
+      (list :greedy-repetition (first token) (second token) group)
+      group)))
+
+(defun quant (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Parses and consumes a <quant>.
+The productions are: <quant> -> <greedy-quant> | <greedy-quant>\"?\".
+Will return the <parse-tree> returned by GREEDY-QUANT and optionally
+change :GREEDY-REPETITION to :NON-GREEDY-REPETITION."
+  (let* ((greedy-quant (greedy-quant lexer))
+         (pos (lexer-pos lexer))
+         (next-char (next-char lexer)))
+    (when next-char
+      (if (char= next-char #\?)
+        (setf (car greedy-quant) :non-greedy-repetition)
+        (setf (lexer-pos lexer) pos)))
+    greedy-quant))
+
+(defun seq (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Parses and consumes a <seq>.
+The productions are: <seq> -> <quant> | <quant><seq>.
+Will return <parse-tree> or (:SEQUENCE <parse-tree> <parse-tree>)."
+  (flet ((make-array-from-two-chars (char1 char2)
+           (let ((string (make-array 2
+                                     :element-type 'character
+                                     :fill-pointer t
+                                     :adjustable t)))
+             (setf (aref string 0) char1)
+             (setf (aref string 1) char2)
+             string)))
+    ;; Note that we're calling START-OF-SUBEXPR-P before we actually try
+    ;; to parse a <seq> or <quant> in order to catch empty regular
+    ;; expressions
+    (if (start-of-subexpr-p lexer)
+      (let ((quant (quant lexer)))
+        (if (start-of-subexpr-p lexer)
+          (let* ((seq (seq lexer))
+                 (quant-is-char-p (characterp quant))
+                 (seq-is-sequence-p (and (consp seq)
+                                         (eq (first seq) :sequence))))
+            (cond ((and quant-is-char-p
+                        (characterp seq))
+                    (make-array-from-two-chars seq quant))
+                  ((and quant-is-char-p
+                        (stringp seq))
+                    (vector-push-extend quant seq)
+                    seq)
+                  ((and quant-is-char-p
+                        seq-is-sequence-p
+                        (characterp (second seq)))
+                    (cond ((cddr seq)
+                            (setf (cdr seq)
+                                    (cons
+                                     (make-array-from-two-chars (second seq)
+                                                                quant)
+                                     (cddr seq)))
+                            seq)
+                          (t (make-array-from-two-chars (second seq) quant))))
+                  ((and quant-is-char-p
+                        seq-is-sequence-p
+                        (stringp (second seq)))
+                    (cond ((cddr seq)
+                            (setf (cdr seq)
+                                    (cons
+                                     (progn
+                                       (vector-push-extend quant (second seq))
+                                       (second seq))
+                                     (cddr seq)))
+                            seq)
+                          (t 
+                            (vector-push-extend quant (second seq))
+                            (second seq))))
+                  (seq-is-sequence-p
+                    ;; if <seq> is also a :SEQUENCE parse tree we merge
+                    ;; both lists into one to avoid unnecessary consing
+                    (setf (cdr seq)
+                            (cons quant (cdr seq)))
+                    seq)
+                  (t (list :sequence quant seq))))
+          quant))
+      :void)))
+  
+(defun reg-expr (lexer)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Parses and consumes a <regex>, a complete regular expression.
+The productions are: <regex> -> <seq> | <seq>\"|\"<regex>.
+Will return <parse-tree> or (:ALTERNATION <parse-tree> <parse-tree>)."
+  (let ((pos (lexer-pos lexer)))
+    (case (next-char lexer)
+      ((nil)
+        ;; if we didn't get any token we return :VOID which stands for
+        ;; "empty regular expression"
+        :void)
+      ((#\|)
+        ;; now check whether the expression started with a vertical
+        ;; bar, i.e. <seq> - the left alternation - is empty
+        (list :alternation :void (reg-expr lexer)))
+      (otherwise
+        ;; otherwise un-read the character we just saw and parse a
+        ;; <seq> plus the character following it
+        (setf (lexer-pos lexer) pos)
+        (let* ((seq (seq lexer))
+               (pos (lexer-pos lexer)))
+          (case (next-char lexer)
+            ((nil)
+              ;; no further character, just a <seq>
+              seq)
+            ((#\|)
+              ;; if the character was a vertical bar, this is an
+              ;; alternation and we have the second production
+              (let ((reg-expr (reg-expr lexer)))
+                (cond ((and (consp reg-expr)
+                            (eq (first reg-expr) :alternation))
+                        ;; again we try to merge as above in SEQ
+                        (setf (cdr reg-expr)
+                                (cons seq (cdr reg-expr)))
+                        reg-expr)
+                      (t (list :alternation seq reg-expr)))))
+            (otherwise
+              ;; a character which is not a vertical bar - this is
+              ;; either a syntax error or we're inside of a group and
+              ;; the next character is a closing parenthesis; so we
+              ;; just un-read the character and let another function
+              ;; take care of it
+              (setf (lexer-pos lexer) pos)
+              seq)))))))
+
+(defun reverse-strings (parse-tree)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (cond ((stringp parse-tree)
+          (nreverse parse-tree))
+        ((consp parse-tree)
+          (loop for parse-tree-rest on parse-tree
+                while parse-tree-rest
+                do (setf (car parse-tree-rest)
+                           (reverse-strings (car parse-tree-rest))))
+          parse-tree)
+        (t parse-tree)))
+
+(defun parse-string (string)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Translate the regex string STRING into a parse tree."
+  (let* ((lexer (make-lexer string))
+         (parse-tree (reverse-strings (reg-expr lexer))))
+    ;; check whether we've consumed the whole regex string
+    (if (end-of-string-p lexer)
+      parse-tree
+      (signal-ppcre-syntax-error*
+       (lexer-pos lexer)
+       "Expected end of string"))))
--- a/perltest.pl
+++ b/perltest.pl
@ -0,0 +1,174 @@
+#!/usr/bin/perl
+
+# This is a heavily modified version of the file 'perltest' which
+# comes with the PCRE library package, which is open source software,
+# written by Philip Hazel, and copyright by the University of
+# Cambridge, England.
+
+# The PCRE library package is available from
+# <ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/>
+
+use Time::HiRes qw(time);
+
+sub string_for_lisp {
+  my(@a, $t, $in_string, $switch);
+
+  my $string = shift;
+  $string =~ s/\\/\\\\/g;
+  $string =~ s/"/\\"/g;
+
+  return "\"$string\""
+    if $string =~ /^[\n\x20-\x7f]*$/;
+
+  $in_string = 1;
+  foreach $c (split(//, $string)) {
+    if (ord $c >= 32 and ord $c < 127) {
+      if ($in_string) {
+        $t .= $c;
+      } else {
+        $in_string = 1;
+        $t = $c;
+      }
+    } else {
+      if ($in_string) {
+        push @a, "\"$t\"";
+        $in_string = 0;
+        $switch = 1;
+      }
+      push @a, ord $c;
+    }
+  }
+  if ($switch) {
+    if ($in_string) {
+      push @a, "\"$t\"";
+    }
+    '(' . (join ' ', @a) . ')';
+  } else {
+    "\"$t\"";
+  }
+}
+
+$min_time = shift;
+
+NEXT_RE: while (1) {
+  last
+    if !($_ = <>);
+  next
+    if $_ eq "";
+
+  $pattern = $_;
+
+  while ($pattern !~ /^\s*(.).*\1/s) {
+    last
+      if !($_ = <>);
+    $pattern .= $_;
+  }
+
+  chomp($pattern);
+  $pattern =~ s/\s+$//;
+  $pattern =~ s/\+(?=[a-z]*$)//;
+
+  $multi_line_mode = ($pattern =~ /m[a-z]*$/) ? 't' : 'nil';
+  $single_line_mode = ($pattern =~ /s[a-z]*$/) ? 't' : 'nil';
+  $extended_mode = ($pattern =~ /x[a-z]*$/) ? 't' : 'nil';
+  $case_insensitive_mode = ($pattern =~ /i[a-z]*$/) ? 't' : 'nil';
+  $pattern =~ s/^(.*)g([a-z]*)$/\1\2/;
+
+  $pattern_for_lisp = $pattern;
+  $pattern_for_lisp =~ s/[a-z]*$//;
+  $pattern_for_lisp =~ s/^\s*(.)(.*)\1/$2/s;
+  $pattern_for_lisp =~ s/\\/\\\\/g;
+  $pattern_for_lisp =~ s/"/\\"/g;
+
+  $pattern = "/(?#)/$2"
+    if ($pattern =~ /^(.)\1(.*)$/);
+
+  while (1) {
+    last NEXT_RE
+      if !($_ = <>);
+
+    chomp;
+
+    s/\s+$//;
+    s/^\s+//;
+
+    last
+      if ($_ eq "");
+
+    $info_string = string_for_lisp "\"$_\" =~ $pattern";
+    $x = eval "\"$_\"";
+
+    @subs = ();
+
+    eval <<"END";
+if (\$x =~ ${pattern}) {
+  push \@subs,\$&;
+  push \@subs,\$1;
+  push \@subs,\$2;
+  push \@subs,\$3;
+  push \@subs,\$4;
+  push \@subs,\$5;
+  push \@subs,\$6;
+  push \@subs,\$7;
+  push \@subs,\$8;
+  push \@subs,\$9;
+  push \@subs,\$10;
+  push \@subs,\$11;
+  push \@subs,\$12;
+  push \@subs,\$13;
+  push \@subs,\$14;
+  push \@subs,\$15;
+  push \@subs,\$16;
+}
+
+\$test = sub {
+  my \$times = shift;
+
+  my \$start = time;
+  for (my \$i = 0; \$i < \$times; \$i++) {
+    \$x =~ ${pattern};
+  }
+  return time - \$start;
+};
+END
+
+    $times = 1;
+    $used = 0;
+    $counter++;
+    print STDERR "$counter\n";
+
+    if ($@) {
+      $error = 't';
+    } else {
+      $error = 'nil';
+      if ($min_time) {
+        $times = 10;
+        while (1) {
+          $used = &$test($times);
+          last
+            if $used > $min_time;
+          $times *= 10;
+        }
+      }
+    }
+
+    print "($counter $info_string \"$pattern_for_lisp\" $case_insensitive_mode $multi_line_mode $single_line_mode $extended_mode " . string_for_lisp($x) . " $error $times $used ";
+    if (!@subs) {
+      print 'nil nil';
+    } else {
+      print string_for_lisp($subs[0]) . ' (';
+      undef $not_first;
+      for ($i = 1; $i <= 16; $i++) {
+        print ' '
+          unless $i == 1;
+        if (defined $subs[$i]) {
+          print string_for_lisp $subs[$i];
+        } else {
+          print 'nil';
+        }
+      }
+      print ')';
+    }
+    print ")\n";
+  }
+}
--- a/ppcre-tests.lisp
+++ b/ppcre-tests.lisp
@ -0,0 +1,296 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE-TEST; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/ppcre-tests.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-user)
+
+#-:cormanlisp
+(defpackage #:cl-ppcre-test
+  (:use #:cl #:cl-ppcre)
+  (:export #:test))
+
+#+:cormanlisp
+(defpackage "CL-PPCRE-TEST"
+  (:use "CL" "CL-PPCRE")
+  (:export "TEST"))
+
+(in-package #:cl-ppcre-test)
+
+(defparameter *cl-ppcre-test-base-directory*
+  (make-pathname :name nil :type nil :version nil
+                 :defaults (parse-namestring *load-truename*)))
+
+(defun full-gc ()
+  "Start a full garbage collection."
+  ;; what are the corresponding values for MCL and OpenMCL?
+  #+:allegro (excl:gc t)
+  #+(or :cmu :scl) (ext:gc :full t)
+  #+:ecl (si:gc t)
+  #+:clisp (ext:gc)
+  #+:cormanlisp (loop for i from 0 to 3 do (cormanlisp:gc i))
+  #+:lispworks (hcl:mark-and-sweep 3)
+  #+:sbcl (sb-ext:gc :full t))
+
+;; warning: ugly code ahead!!
+;; this is just a quick hack for testing purposes
+
+(defun time-regex (factor regex string
+                          &key case-insensitive-mode
+                               multi-line-mode
+                               single-line-mode
+                               extended-mode)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Auxiliary function used by TEST to benchmark a regex scanner
+against Perl timings."
+  (declare (type string string))
+  (let* ((scanner (create-scanner regex
+                                  :case-insensitive-mode case-insensitive-mode
+                                  :multi-line-mode multi-line-mode
+                                  :single-line-mode single-line-mode
+                                  :extended-mode extended-mode))
+         ;; make sure GC doesn't invalidate our benchmarking
+         (dummy (full-gc))
+         (start (get-internal-real-time)))
+    (declare (ignore dummy))
+    (dotimes (i factor)
+      (funcall scanner string 0 (length string)))
+    (float (/ (- (get-internal-real-time) start) internal-time-units-per-second))))
+
+#+(or scl
+      lispworks
+      (and sbcl sb-thread))
+(defun threaded-scan (scanner target-string &key (threads 10) (repetitions 5000))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Auxiliary function used by TEST to check whether SCANNER is thread-safe."
+  (full-gc)
+  (let ((collector (make-array threads))
+        (counter 0))
+    (loop for i below threads
+          do (let* ((j i)
+                    (fn
+                      (lambda ()
+                        (let ((r (random repetitions)))
+                          (loop for k below repetitions
+                                if (= k r)
+                                  do (setf (aref collector j)
+                                             (let ((result
+                                                     (multiple-value-list
+                                                       (cl-ppcre:scan scanner target-string))))
+                                               (unless (cdr result)
+                                                 (setq result '(nil nil #() #())))
+                                               result))
+                                else
+                                  do (cl-ppcre:scan scanner target-string))
+                          (incf counter)))))
+               #+scl (thread:thread-create fn)
+               #+lispworks (mp:process-run-function "" nil fn)
+               #+(and sbcl sb-thread) (sb-thread:make-thread fn)))
+    (loop while (< counter threads)
+          do (sleep .1))
+    (destructuring-bind (first-start first-end first-reg-starts first-reg-ends)
+        (aref collector 0)
+      (loop for (start end reg-starts reg-ends) across collector
+            if (or (not (eql first-start start))
+                   (not (eql first-end end))
+                   (/= (length first-reg-starts) (length reg-starts))
+                   (/= (length first-reg-ends) (length reg-ends))
+                   (loop for first-reg-start across first-reg-starts
+                         for reg-start across reg-starts
+                         thereis (not (eql first-reg-start reg-start)))
+                   (loop for first-reg-end across first-reg-ends
+                         for reg-end across reg-ends
+                         thereis (not (eql first-reg-end reg-end))))
+            do (return (format nil "~&Inconsistent results during multi-threading"))))))
+
+(defun create-string-from-input (input)
+  (cond ((or (null input)
+             (stringp input))
+          input)
+        (t
+          (cl-ppcre::string-list-to-simple-string
+           (loop for element in input
+                 if (stringp element)
+                 collect element
+                 else
+                 collect (string (code-char element)))))))
+
+(defun test (&key (file-name 
+                   (make-pathname :name "testdata"
+                                  :type nil :version nil
+                                  :defaults *cl-ppcre-test-base-directory*)
+                   file-name-provided-p)
+                  threaded)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (ignorable threaded))
+  "Loop through all test cases in FILE-NAME and print report. Only in
+LispWorks and SCL: If THREADED is true, also test whether the scanners
+work multi-threaded."
+  (with-open-file (stream file-name
+                          #+(or :allegro :clisp :scl)
+                          :external-format
+                          #+(or :allegro :clisp :scl)
+                          (if file-name-provided-p
+                            :default
+                            #+:allegro :iso-8859-1
+                            #+:clisp charset:iso-8859-1
+                            #+:scl :iso-8859-1))
+    (loop with testcount of-type fixnum = 0
+          with *regex-char-code-limit* = (if file-name-provided-p
+                                           *regex-char-code-limit*
+                                           ;; the standard test suite
+                                           ;; doesn't need full
+                                           ;; Unicode support
+                                           255)
+          with *allow-quoting* = (if file-name-provided-p
+                                   *allow-quoting*
+                                   t)
+          for input-line = (read stream nil nil)
+          for (counter info-string regex
+                       case-insensitive-mode multi-line-mode
+                       single-line-mode extended-mode
+                       string perl-error factor
+                       perl-time ex-result ex-subs) = input-line
+          while input-line
+          do (let ((info-string (create-string-from-input info-string))
+                   (regex (create-string-from-input regex))
+                   (string (create-string-from-input string))
+                   (ex-result (create-string-from-input ex-result))
+                   (ex-subs (mapcar #'create-string-from-input ex-subs))
+                   (errors '()))
+               ;; provide some visual feedback for slow CL
+               ;; implementations; suggested by JP Massar
+               (incf testcount)
+               #+(or scl
+                     lispworks
+                     (and sbcl sb-thread))
+               (when threaded
+                 (format t "Test #~A (ID ~A)~%" testcount counter)
+                 (force-output))
+               (unless #-(or scl
+                             lispworks
+                             (and sbcl sb-thread))
+                       nil
+                       #+(or scl
+                             lispworks
+                             (and sbcl sb-thread))
+                       threaded
+                 (when (zerop (mod testcount 10))
+                   (format t ".")
+                   (force-output))
+                 (when (zerop (mod testcount 100))
+                   (terpri)))
+               (handler-case
+                 (let* ((*use-bmh-matchers* (if (and (> factor 1) (plusp perl-time))
+                                              *use-bmh-matchers*
+                                              ;; if we only check for
+                                              ;; correctness we don't
+                                              ;; care about speed that
+                                              ;; match (but rather
+                                              ;; about space
+                                              ;; constraints of the
+                                              ;; trial versions)
+                                              nil))
+                        (scanner (create-scanner regex
+                                                :case-insensitive-mode case-insensitive-mode
+                                                :multi-line-mode multi-line-mode
+                                                :single-line-mode single-line-mode
+                                                :extended-mode extended-mode)))
+                   (multiple-value-bind (result1 result2 sub-starts sub-ends)
+                       (scan scanner string)
+                     (cond (perl-error
+                             (push (format nil
+                                           "~&expected an error but got a result")
+                                   errors))
+                           (t
+                             (when (not (eq result1 ex-result))
+                               (if result1
+                                 (let ((result (subseq string result1 result2)))
+                                   (unless (string= result ex-result)
+                                     (push (format nil
+                                                   "~&expected ~S but got ~S"
+                                                   ex-result result)
+                                           errors))
+                                   (setq sub-starts (coerce sub-starts 'list)
+                                         sub-ends (coerce sub-ends 'list))
+                                   (loop for i from 0
+                                         for ex-sub in ex-subs
+                                         for sub-start = (nth i sub-starts)
+                                         for sub-end = (nth i sub-ends)
+                                         for sub = (if (and sub-start sub-end)
+                                                     (subseq string sub-start sub-end)
+                                                     nil)
+                                         unless (string= ex-sub sub)
+                                           do (push (format nil
+                                                            "~&\\~A: expected ~S but got ~S"
+                                                            (1+ i) ex-sub sub) errors)))
+                                 (push (format nil
+                                               "~&expected ~S but got ~S"
+                                               ex-result result1)
+                                       errors)))))
+                     #+(or scl
+                           lispworks
+                           (and sbcl sb-thread))
+                     (when threaded
+                       (let ((thread-result (threaded-scan scanner string)))
+                         (when thread-result
+                         (push thread-result errors))))))
+                 (condition (msg)
+                   (unless perl-error
+                     (push (format nil "~&got an unexpected error: '~A'" msg)
+                           errors))))
+               (setq errors (nreverse errors))
+               (cond (errors
+                       (when (or (<= factor 1) (zerop perl-time))
+                         (format t "~&~4@A (~A):~{~&   ~A~}~%"
+                                 counter info-string errors)))
+                     ((and (> factor 1) (plusp perl-time))
+                       (let ((result (time-regex factor regex string
+                                                 :case-insensitive-mode case-insensitive-mode
+                                                 :multi-line-mode multi-line-mode
+                                                 :single-line-mode single-line-mode
+                                                 :extended-mode extended-mode)))
+                         (format t "~&~4@A: ~,4F (~A repetitions, Perl: ~,4F seconds, CL-PPCRE: ~,4F seconds)" counter
+                                 (float (/ result perl-time)) factor perl-time result)
+                         #+:cormanlisp (force-output *standard-output*)))
+                     (t nil))))
+    (values)))
--- a/regex-class.lisp
+++ b/regex-class.lisp
@ -0,0 +1,752 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/regex-class.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; This file defines the REGEX class and some utility methods for
+;;; this class. REGEX objects are used to represent the (transformed)
+;;; parse trees internally
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(locally
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (defclass regex ()
+    ()
+    (:documentation "The REGEX base class. All other classes inherit from this one."))
+
+
+  (defclass seq (regex)
+    ((elements :initarg :elements
+               :accessor elements
+               :type cons
+               :documentation "A list of REGEX objects."))
+    (:documentation "SEQ objects represents sequences of
+regexes. (Like \"ab\" is the sequence of \"a\" and \"b\".)"))
+
+  (defclass alternation (regex)
+    ((choices :initarg :choices
+              :accessor choices
+              :type cons
+              :documentation "A list of REGEX objects"))
+    (:documentation "ALTERNATION objects represent alternations of
+regexes. (Like \"a|b\" ist the alternation of \"a\" or \"b\".)"))
+
+  (defclass lookahead (regex)
+    ((regex :initarg :regex
+            :accessor regex
+            :documentation "The REGEX object we're checking.")
+     (positivep :initarg :positivep
+                :reader positivep
+                :documentation "Whether this assertion is positive."))
+    (:documentation "LOOKAHEAD objects represent look-ahead assertions."))
+
+  (defclass lookbehind (regex)
+    ((regex :initarg :regex
+            :accessor regex
+            :documentation "The REGEX object we're checking.")
+     (positivep :initarg :positivep
+                :reader positivep
+                :documentation "Whether this assertion is positive.")
+     (len :initarg :len
+          :accessor len
+          :type fixnum
+          :documentation "The (fixed) length of the enclosed regex."))
+    (:documentation "LOOKBEHIND objects represent look-behind assertions."))
+
+  (defclass repetition (regex)
+    ((regex :initarg :regex
+            :accessor regex
+            :documentation "The REGEX that's repeated.")
+     (greedyp :initarg :greedyp
+              :reader greedyp
+              :documentation "Whether the repetition is greedy.")
+     (minimum :initarg :minimum
+              :accessor minimum
+              :type fixnum
+              :documentation "The minimal number of repetitions.")
+     (maximum :initarg :maximum
+              :accessor maximum
+              :documentation "The maximal number of repetitions.
+Can be NIL for unbounded.")
+     (min-len :initarg :min-len
+              :reader min-len
+              :documentation "The minimal length of the enclosed regex.")
+     (len :initarg :len
+          :reader len
+          :documentation "The length of the enclosed regex. NIL if unknown.")
+     (min-rest :initform 0
+               :accessor min-rest
+               :type fixnum
+               :documentation "The minimal number of characters which must
+appear after this repetition.")
+     (contains-register-p :initarg :contains-register-p
+                          :reader contains-register-p
+                          :documentation "If the regex contains a register."))
+    (:documentation "REPETITION objects represent repetitions of regexes."))
+
+  (defclass register (regex)
+    ((regex :initarg :regex
+            :accessor regex
+            :documentation "The inner regex.")
+     (num :initarg :num
+          :reader num
+          :type fixnum
+          :documentation "The number of this register, starting from 0.
+This is the index into *REGS-START* and *REGS-END*."))
+    (:documentation "REGISTER objects represent register groups."))
+
+  (defclass standalone (regex)
+    ((regex :initarg :regex
+            :accessor regex
+            :documentation "The inner regex."))
+    (:documentation "A standalone regular expression."))
+  
+  (defclass back-reference (regex)
+    ((num :initarg :num
+          :accessor num
+          :type fixnum
+          :documentation "The number of the register this reference refers to.")
+     (case-insensitive-p :initarg :case-insensitive-p
+                         :reader case-insensitive-p
+                         :documentation "Whether we check case-insensitively."))
+    (:documentation "BACK-REFERENCE objects represent backreferences."))
+
+  (defclass char-class (regex)
+    ((hash :initarg :hash
+           :reader hash
+           :type (or hash-table null)
+           :documentation "A hash table the keys of which are the characters;
+the values are always T.")
+     (case-insensitive-p :initarg :case-insensitive-p
+                         :reader case-insensitive-p
+                         :documentation "If the char class case-insensitive.")
+     (invertedp :initarg :invertedp
+                :reader invertedp
+                :documentation "Whether we mean the inverse of the char class.")
+     (word-char-class-p :initarg :word-char-class-p
+                        :reader word-char-class-p
+                        :documentation "Whether this CHAR CLASS
+represents the special class WORD-CHAR-CLASS."))
+    (:documentation "CHAR-CLASS objects represent character classes."))
+
+  (defclass str (regex)
+    ((str :initarg :str
+          :accessor str
+          :type string
+          :documentation "The actual string.")
+     (len :initform 0
+          :accessor len
+          :type fixnum
+          :documentation "The length of the string.")
+     (case-insensitive-p :initarg :case-insensitive-p
+                         :reader case-insensitive-p
+                         :documentation "If we match case-insensitively.")
+     (offset :initform nil
+             :accessor offset
+             :documentation "Offset from the left of the whole parse tree.
+The first regex has offset 0.
+NIL if unknown, i.e. behind a variable-length regex.")
+     (skip :initform nil
+           :initarg :skip
+           :accessor skip
+           :documentation "If we can avoid testing for this string
+because the SCAN function has done this already.")
+     (start-of-end-string-p :initform nil
+                            :accessor start-of-end-string-p
+                            :documentation "If this is the unique STR which
+starts END-STRING (a slot of MATCHER)."))
+    (:documentation "STR objects represent string."))
+
+  (defclass anchor (regex)
+    ((startp :initarg :startp
+             :reader startp
+             :documentation "Whether this is a \"start anchor\".")
+     (multi-line-p :initarg :multi-line-p
+                   :reader multi-line-p
+                   :documentation "Whether we're in multi-line mode,
+i.e. whether each #\\Newline is surrounded by anchors.")
+     (no-newline-p :initarg :no-newline-p
+                   :reader no-newline-p
+                   :documentation "Whether we ignore #\\Newline at the end."))
+    (:documentation "ANCHOR objects represent anchors like \"^\" or \"$\"."))
+
+  (defclass everything (regex)
+    ((single-line-p :initarg :single-line-p
+                    :reader single-line-p
+                    :documentation "Whether we're in single-line mode,
+i.e. whether we also match #\\Newline."))
+    (:documentation "EVERYTHING objects represent regexes matching
+\"everything\", i.e. dots."))
+
+  (defclass word-boundary (regex)
+    ((negatedp :initarg :negatedp
+               :reader negatedp
+               :documentation "Whether we mean the opposite,
+i.e. no word-boundary."))
+    (:documentation "WORD-BOUNDARY objects represent word-boundary assertions."))
+
+  (defclass branch (regex)
+    ((test :initarg :test
+           :accessor test
+           :documentation "The test of this branch, one of LOOKAHEAD,
+LOOKBEHIND, or a number.")
+     (then-regex :initarg :then-regex
+                 :accessor then-regex
+                 :documentation "The regex that's to be matched if the
+test succeeds.")
+     (else-regex :initarg :else-regex
+                 :initform (make-instance 'void)
+                 :accessor else-regex
+                 :documentation "The regex that's to be matched if the
+test fails."))
+    (:documentation "BRANCH objects represent Perl's conditional regular
+expressions."))
+
+  (defclass void (regex)
+    ()
+    (:documentation "VOID objects represent empty regular expressions.")))
+
+(declaim (ftype (function (t) simple-string) str))
+
+;;; The following four methods allow a VOID object to behave like a
+;;; zero-length STR object (only readers needed)
+
+(defmethod initialize-instance :after ((str str) &rest init-args)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (ignore init-args))
+  "Automatically computes the length of a STR after initialization."
+  (let ((str-slot (slot-value str 'str)))
+    (unless (typep str-slot 'simple-string)
+      (setf (slot-value str 'str) (coerce str-slot 'simple-string))))
+  (setf (len str) (length (str str))))
+
+(defmethod len ((void void))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  0)
+
+(defmethod str ((void void))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "")
+
+(defmethod skip ((void void))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  nil)
+
+(defmethod start-of-end-string-p ((void void))
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  nil)
+
+(defgeneric case-mode (regex old-case-mode)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Utility function used by the optimizer (see GATHER-STRINGS).
+Returns a keyword denoting the case-(in)sensitivity of a STR or its
+second argument if the STR has length 0. Returns NIL for REGEX objects
+which are not of type STR."))
+
+(defmethod case-mode ((str str) old-case-mode)
+  (cond ((zerop (len str))
+          old-case-mode)
+        ((case-insensitive-p str)
+          :case-insensitive)
+        (t
+          :case-sensitive)))
+
+(defmethod case-mode ((regex regex) old-case-mode)
+  (declare (ignore old-case-mode))
+  nil)
+
+(defgeneric copy-regex (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Implements a deep copy of a REGEX object."))
+
+(defmethod copy-regex ((anchor anchor))
+  (make-instance 'anchor
+                 :startp (startp anchor)
+                 :multi-line-p (multi-line-p anchor)
+                 :no-newline-p (no-newline-p anchor)))
+
+(defmethod copy-regex ((everything everything))
+  (make-instance 'everything
+                 :single-line-p (single-line-p everything)))
+
+(defmethod copy-regex ((word-boundary word-boundary))
+  (make-instance 'word-boundary
+                 :negatedp (negatedp word-boundary)))
+
+(defmethod copy-regex ((void void))
+  (make-instance 'void))
+
+(defmethod copy-regex ((lookahead lookahead))
+  (make-instance 'lookahead
+                 :regex (copy-regex (regex lookahead))
+                 :positivep (positivep lookahead)))
+
+(defmethod copy-regex ((seq seq))
+  (make-instance 'seq
+                 :elements (mapcar #'copy-regex (elements seq))))
+
+(defmethod copy-regex ((alternation alternation))
+  (make-instance 'alternation
+                 :choices (mapcar #'copy-regex (choices alternation))))
+
+(defmethod copy-regex ((branch branch))
+  (with-slots ((test test))
+      branch
+    (make-instance 'branch
+                   :test (if (typep test 'regex)
+                           (copy-regex test)
+                           test)
+                   :then-regex (copy-regex (then-regex branch))
+                   :else-regex (copy-regex (else-regex branch)))))
+
+(defmethod copy-regex ((lookbehind lookbehind))
+  (make-instance 'lookbehind
+                 :regex (copy-regex (regex lookbehind))
+                 :positivep (positivep lookbehind)
+                 :len (len lookbehind)))
+
+(defmethod copy-regex ((repetition repetition))
+  (make-instance 'repetition
+                 :regex (copy-regex (regex repetition))
+                 :greedyp (greedyp repetition)
+                 :minimum (minimum repetition)
+                 :maximum (maximum repetition)
+                 :min-len (min-len repetition)
+                 :len (len repetition)
+                 :contains-register-p (contains-register-p repetition)))
+
+(defmethod copy-regex ((register register))
+  (make-instance 'register
+                 :regex (copy-regex (regex register))
+                 :num (num register)))
+
+(defmethod copy-regex ((standalone standalone))
+  (make-instance 'standalone
+                 :regex (copy-regex (regex standalone))))
+
+(defmethod copy-regex ((back-reference back-reference))
+  (make-instance 'back-reference
+                 :num (num back-reference)
+                 :case-insensitive-p (case-insensitive-p back-reference)))
+
+(defmethod copy-regex ((char-class char-class))
+  (make-instance 'char-class
+                 :hash (hash char-class)
+                 :case-insensitive-p (case-insensitive-p char-class)
+                 :invertedp (invertedp char-class)
+                 :word-char-class-p (word-char-class-p char-class)))
+
+(defmethod copy-regex ((str str))
+  (make-instance 'str
+                 :str (str str)
+                 :case-insensitive-p (case-insensitive-p str)))
+
+;;; Note that COPY-REGEX and REMOVE-REGISTERS could have easily been
+;;; wrapped into one function. Maybe in the next release...
+
+;;; Further note that this function is used by CONVERT to factor out
+;;; complicated repetitions, i.e. cases like
+;;;   (a)* -> (?:a*(a))?
+;;; This won't work for, say,
+;;;   ((a)|(b))* -> (?:(?:a|b)*((a)|(b)))?
+;;; and therefore we stop REGISTER removal once we see an ALTERNATION.
+
+(defgeneric remove-registers (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns a deep copy of a REGEX (see COPY-REGEX) and
+optionally removes embedded REGISTER objects if possible and if the
+special variable REMOVE-REGISTERS-P is true."))
+
+(defmethod remove-registers ((register register))
+  (declare (special remove-registers-p reg-seen))
+  (cond (remove-registers-p
+          (remove-registers (regex register)))
+        (t
+          ;; mark REG-SEEN as true so enclosing REPETITION objects
+          ;; (see method below) know if they contain a register or not
+          (setq reg-seen t)
+          (copy-regex register))))
+
+(defmethod remove-registers ((repetition repetition))
+  (let* (reg-seen
+         (inner-regex (remove-registers (regex repetition))))
+    ;; REMOVE-REGISTERS will set REG-SEEN (see method above) if
+    ;; (REGEX REPETITION) contains a REGISTER
+    (declare (special reg-seen))
+    (make-instance 'repetition
+                   :regex inner-regex
+                   :greedyp (greedyp repetition)
+                   :minimum (minimum repetition)
+                   :maximum (maximum repetition)
+                   :min-len (min-len repetition)
+                   :len (len repetition)
+                   :contains-register-p reg-seen)))
+
+(defmethod remove-registers ((standalone standalone))
+  (make-instance 'standalone
+                 :regex (remove-registers (regex standalone))))
+
+(defmethod remove-registers ((lookahead lookahead))
+  (make-instance 'lookahead
+                 :regex (remove-registers (regex lookahead))
+                 :positivep (positivep lookahead)))
+
+(defmethod remove-registers ((lookbehind lookbehind))
+  (make-instance 'lookbehind
+                 :regex (remove-registers (regex lookbehind))
+                 :positivep (positivep lookbehind)
+                 :len (len lookbehind)))
+
+(defmethod remove-registers ((branch branch))
+  (with-slots ((test test))
+      branch
+    (make-instance 'branch
+                   :test (if (typep test 'regex)
+                           (remove-registers test)
+                           test)
+                   :then-regex (remove-registers (then-regex branch))
+                   :else-regex (remove-registers (else-regex branch)))))
+
+(defmethod remove-registers ((alternation alternation))
+  (declare (special remove-registers-p))
+  ;; an ALTERNATION, so we can't remove REGISTER objects further down
+  (setq remove-registers-p nil)
+  (copy-regex alternation))
+
+(defmethod remove-registers ((regex regex))
+  (copy-regex regex))
+
+(defmethod remove-registers ((seq seq))
+  (make-instance 'seq
+                 :elements (mapcar #'remove-registers (elements seq))))
+
+(defgeneric everythingp (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns an EVERYTHING object if REGEX is equivalent
+to this object, otherwise NIL. So, \"(.){1}\" would return true
+(i.e. the object corresponding to \".\", for example."))
+
+(defmethod everythingp ((seq seq))
+  ;; we might have degenerate cases like (:SEQUENCE :VOID ...)
+  ;; due to the parsing process
+  (let ((cleaned-elements (remove-if #'(lambda (element)
+                                         (typep element 'void))
+                                     (elements seq))))
+    (and (= 1 (length cleaned-elements))
+         (everythingp (first cleaned-elements)))))
+
+(defmethod everythingp ((alternation alternation))
+  (with-slots ((choices choices))
+      alternation
+    (and (= 1 (length choices))
+         ;; this is unlikely to happen for human-generated regexes,
+         ;; but machine-generated ones might look like this
+         (everythingp (first choices)))))
+
+(defmethod everythingp ((repetition repetition))
+  (with-slots ((maximum maximum)
+               (minimum minimum)
+               (regex regex))
+      repetition
+    (and maximum
+         (= 1 minimum maximum)
+         ;; treat "<regex>{1,1}" like "<regex>"
+         (everythingp regex))))
+
+(defmethod everythingp ((register register))
+  (everythingp (regex register)))
+
+(defmethod everythingp ((standalone standalone))
+  (everythingp (regex standalone)))
+
+(defmethod everythingp ((everything everything))
+  everything)
+
+(defmethod everythingp ((regex regex))
+  ;; the general case for ANCHOR, BACK-REFERENCE, BRANCH, CHAR-CLASS,
+  ;; LOOKAHEAD, LOOKBEHIND, STR, VOID, and WORD-BOUNDARY
+  nil)
+
+(defgeneric regex-length (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Return the length of REGEX if it is fixed, NIL otherwise."))
+
+(defmethod regex-length ((seq seq))
+  ;; simply add all inner lengths unless one of them is NIL
+  (loop for sub-regex in (elements seq)
+        for len = (regex-length sub-regex)
+        if (not len) do (return nil)
+        sum len))
+
+(defmethod regex-length ((alternation alternation))
+  ;; only return a true value if all inner lengths are non-NIL and
+  ;; mutually equal
+  (loop for sub-regex in (choices alternation)
+        for old-len = nil then len
+        for len = (regex-length sub-regex)
+        if (or (not len)
+               (and old-len (/= len old-len))) do (return nil)
+        finally (return len)))
+
+(defmethod regex-length ((branch branch))
+  ;; only return a true value if both alternations have a length and
+  ;; if they're equal
+  (let ((then-length (regex-length (then-regex branch))))
+    (and then-length
+         (eql then-length (regex-length (else-regex branch)))
+         then-length)))
+
+(defmethod regex-length ((repetition repetition))
+  ;; we can only compute the length of a REPETITION object if the
+  ;; number of repetitions is fixed; note that we don't call
+  ;; REGEX-LENGTH for the inner regex, we assume that the LEN slot is
+  ;; always set correctly
+  (with-slots ((len len)
+               (minimum minimum)
+               (maximum maximum))
+      repetition
+    (if (and len
+             (eq minimum maximum))
+      (* minimum len)
+      nil)))
+
+(defmethod regex-length ((register register))
+  (regex-length (regex register)))
+
+(defmethod regex-length ((standalone standalone))
+  (regex-length (regex standalone)))
+
+(defmethod regex-length ((back-reference back-reference))
+  ;; with enough effort we could possibly do better here, but
+  ;; currently we just give up and return NIL
+  nil)
+    
+(defmethod regex-length ((char-class char-class))
+  1)
+
+(defmethod regex-length ((everything everything))
+  1)
+
+(defmethod regex-length ((str str))
+  (len str))
+
+(defmethod regex-length ((regex regex))
+  ;; the general case for ANCHOR, LOOKAHEAD, LOOKBEHIND, VOID, and
+  ;; WORD-BOUNDARY (which all have zero-length)
+  0)
+
+(defgeneric regex-min-length (regex)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns the minimal length of REGEX."))
+
+(defmethod regex-min-length ((seq seq))
+  ;; simply add all inner minimal lengths
+  (loop for sub-regex in (elements seq)
+        for len = (regex-min-length sub-regex)
+        sum len))
+
+(defmethod regex-min-length ((alternation alternation))
+  ;; minimal length of an alternation is the minimal length of the
+  ;; "shortest" element
+  (loop for sub-regex in (choices alternation)
+        for len = (regex-min-length sub-regex)
+        minimize len))
+
+(defmethod regex-min-length ((branch branch))
+  ;; minimal length of both alternations
+  (min (regex-min-length (then-regex branch))
+       (regex-min-length (else-regex branch))))
+
+(defmethod regex-min-length ((repetition repetition))
+  ;; obviously the product of the inner minimal length and the minimal
+  ;; number of repetitions
+  (* (minimum repetition) (min-len repetition)))
+    
+(defmethod regex-min-length ((register register))
+  (regex-min-length (regex register)))
+    
+(defmethod regex-min-length ((standalone standalone))
+  (regex-min-length (regex standalone)))
+    
+(defmethod regex-min-length ((char-class char-class))
+  1)
+
+(defmethod regex-min-length ((everything everything))
+  1)
+
+(defmethod regex-min-length ((str str))
+  (len str))
+    
+(defmethod regex-min-length ((regex regex))
+  ;; the general case for ANCHOR, BACK-REFERENCE, LOOKAHEAD,
+  ;; LOOKBEHIND, VOID, and WORD-BOUNDARY
+  0)
+
+(defgeneric compute-offsets (regex start-pos)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (:documentation "Returns the offset the following regex would have
+relative to START-POS or NIL if we can't compute it. Sets the OFFSET
+slot of REGEX to START-POS if REGEX is a STR. May also affect OFFSET
+slots of STR objects further down the tree."))
+
+;; note that we're actually only interested in the offset of
+;; "top-level" STR objects (see ADVANCE-FN in the SCAN function) so we
+;; can stop at variable-length alternations and don't need to descend
+;; into repetitions
+
+(defmethod compute-offsets ((seq seq) start-pos)
+  (loop for element in (elements seq)
+        ;; advance offset argument for next call while looping through
+        ;; the elements
+        for pos = start-pos then curr-offset
+        for curr-offset = (compute-offsets element pos)
+        while curr-offset
+        finally (return curr-offset)))
+
+(defmethod compute-offsets ((alternation alternation) start-pos)
+  (loop for choice in (choices alternation)
+        for old-offset = nil then curr-offset
+        for curr-offset = (compute-offsets choice start-pos)
+        ;; we stop immediately if two alternations don't result in the
+        ;; same offset
+        if (or (not curr-offset)
+               (and old-offset (/= curr-offset old-offset)))
+          do (return nil)
+        finally (return curr-offset)))
+
+(defmethod compute-offsets ((branch branch) start-pos)
+  ;; only return offset if both alternations have equal value
+  (let ((then-offset (compute-offsets (then-regex branch) start-pos)))
+    (and then-offset
+         (eql then-offset (compute-offsets (else-regex branch) start-pos))
+         then-offset)))
+
+(defmethod compute-offsets ((repetition repetition) start-pos)
+  ;; no need to descend into the inner regex
+  (with-slots ((len len)
+               (minimum minimum)
+               (maximum maximum))
+      repetition
+    (if (and len
+             (eq minimum maximum))
+      ;; fixed number of repetitions, so we know how to proceed
+      (+ start-pos (* minimum len))
+      ;; otherwise return NIL
+      nil)))
+
+(defmethod compute-offsets ((register register) start-pos)
+  (compute-offsets (regex register) start-pos))
+    
+(defmethod compute-offsets ((standalone standalone) start-pos)
+  (compute-offsets (regex standalone) start-pos))
+    
+(defmethod compute-offsets ((char-class char-class) start-pos)
+  (1+ start-pos))
+    
+(defmethod compute-offsets ((everything everything) start-pos)
+  (1+ start-pos))
+    
+(defmethod compute-offsets ((str str) start-pos)
+  (setf (offset str) start-pos)
+  (+ start-pos (len str)))
+
+(defmethod compute-offsets ((back-reference back-reference) start-pos)
+  ;; with enough effort we could possibly do better here, but
+  ;; currently we just give up and return NIL
+  (declare (ignore start-pos))
+  nil)
+
+(defmethod compute-offsets ((regex regex) start-pos)
+  ;; the general case for ANCHOR, LOOKAHEAD, LOOKBEHIND, VOID, and
+  ;; WORD-BOUNDARY (which all have zero-length)
+  start-pos)
--- a/repetition-closures.lisp
+++ b/repetition-closures.lisp
@ -0,0 +1,868 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/repetition-closures.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; This is actually a part of closures.lisp which we put into a
+;;; separate file because it is rather complex. We only deal with
+;;; REPETITIONs here. Note that this part of the code contains some
+;;; rather crazy micro-optimizations which were introduced to be as
+;;; competitive with Perl as possible in tight loops.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defmacro incf-after (place &optional (delta 1) &environment env)
+  "Utility macro inspired by C's \"place++\", i.e. first return the
+value of PLACE and afterwards increment it by DELTA."
+  (with-unique-names (%temp)
+    (multiple-value-bind (vars vals store-vars writer-form reader-form)
+        (get-setf-expansion place env)
+      `(let* (,@(mapcar #'list vars vals)
+              (,%temp ,reader-form)
+              (,(car store-vars) (+ ,%temp ,delta)))
+        ,writer-form
+        ,%temp))))
+
+;; code for greedy repetitions with minimum zero
+
+(defmacro greedy-constant-length-closure (check-curr-pos)
+  "This is the template for simple greedy repetitions (where simple
+means that the minimum number of repetitions is zero, that the inner
+regex to be checked is of fixed length LEN, and that it doesn't
+contain registers, i.e. there's no need for backtracking).
+CHECK-CURR-POS is a form which checks whether the inner regex of the
+repetition matches at CURR-POS."
+  `(if maximum
+    (lambda (start-pos)
+      (declare (type fixnum start-pos maximum))
+      ;; because we know LEN we know in advance where to stop at the
+      ;; latest; we also take into consideration MIN-REST, i.e. the
+      ;; minimal length of the part behind the repetition
+      (let ((target-end-pos (min (1+ (- *end-pos* len min-rest))
+                                 ;; don't go further than MAXIMUM
+                                 ;; repetitions, of course
+                                 (+ start-pos
+                                    (the fixnum (* len maximum)))))
+            (curr-pos start-pos))
+        (declare (type fixnum target-end-pos curr-pos))
+        (block greedy-constant-length-matcher
+          ;; we use an ugly TAGBODY construct because this might be a
+          ;; tight loop and this version is a bit faster than our LOOP
+          ;; version (at least in CMUCL)
+          (tagbody
+            forward-loop
+            ;; first go forward as far as possible, i.e. while
+            ;; the inner regex matches
+            (when (>= curr-pos target-end-pos)
+              (go backward-loop))
+            (when ,check-curr-pos
+              (incf curr-pos len)
+              (go forward-loop))
+            backward-loop
+            ;; now go back LEN steps each until we're able to match
+            ;; the rest of the regex
+            (when (< curr-pos start-pos)
+              (return-from greedy-constant-length-matcher nil))
+            (let ((result (funcall next-fn curr-pos)))
+              (when result
+                (return-from greedy-constant-length-matcher result)))
+            (decf curr-pos len)
+            (go backward-loop)))))
+    ;; basically the same code; it's just a bit easier because we're
+    ;; not bounded by MAXIMUM
+    (lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (let ((target-end-pos (1+ (- *end-pos* len min-rest)))
+            (curr-pos start-pos))
+        (declare (type fixnum target-end-pos curr-pos))
+        (block greedy-constant-length-matcher
+          (tagbody
+            forward-loop
+            (when (>= curr-pos target-end-pos)
+              (go backward-loop))
+            (when ,check-curr-pos
+              (incf curr-pos len)
+              (go forward-loop))
+            backward-loop
+            (when (< curr-pos start-pos)
+              (return-from greedy-constant-length-matcher nil))
+            (let ((result (funcall next-fn curr-pos)))
+              (when result
+                (return-from greedy-constant-length-matcher result)))
+            (decf curr-pos len)
+            (go backward-loop)))))))
+
+(defun create-greedy-everything-matcher (maximum min-rest next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum min-rest)
+           (type function next-fn))
+  "Creates a closure which just matches as far ahead as possible,
+i.e. a closure for a dot in single-line mode."
+  (if maximum
+    (lambda (start-pos)
+      (declare (type fixnum start-pos maximum))
+      ;; because we know LEN we know in advance where to stop at the
+      ;; latest; we also take into consideration MIN-REST, i.e. the
+      ;; minimal length of the part behind the repetition
+      (let ((target-end-pos (min (+ start-pos maximum)
+                                 (- *end-pos* min-rest))))
+        (declare (type fixnum target-end-pos))
+        ;; start from the highest possible position and go backward
+        ;; until we're able to match the rest of the regex
+        (loop for curr-pos of-type fixnum from target-end-pos downto start-pos
+              thereis (funcall next-fn curr-pos))))
+    ;; basically the same code; it's just a bit easier because we're
+    ;; not bounded by MAXIMUM
+    (lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (let ((target-end-pos (- *end-pos* min-rest)))
+        (declare (type fixnum target-end-pos))
+        (loop for curr-pos of-type fixnum from target-end-pos downto start-pos
+              thereis (funcall next-fn curr-pos))))))
+
+(defmethod create-greedy-constant-length-matcher ((repetition repetition)
+                                                  next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is greedy and the minimal number of repetitions is
+zero. It is furthermore assumed that the inner regex of REPETITION is
+of fixed length and doesn't contain registers."
+  (let ((len (len repetition))
+        (maximum (maximum repetition))
+        (regex (regex repetition))
+        (min-rest (min-rest repetition)))
+    (declare (type fixnum len min-rest)
+             (type function next-fn))
+    (cond ((zerop len)
+            ;; inner regex has zero-length, so we can discard it
+            ;; completely
+            next-fn)
+          (t
+            ;; now first try to optimize for a couple of common cases
+            (typecase regex
+              (str
+                (let ((str (str regex)))
+                  (if (= 1 len)
+                    ;; a single character
+                    (let ((chr (schar str 0)))
+                      (if (case-insensitive-p regex)
+                        (greedy-constant-length-closure
+                         (char-equal chr (schar *string* curr-pos)))
+                        (greedy-constant-length-closure
+                         (char= chr (schar *string* curr-pos)))))
+                    ;; a string
+                    (if (case-insensitive-p regex)
+                      (greedy-constant-length-closure
+                       (*string*-equal str curr-pos (+ curr-pos len) 0 len))
+                      (greedy-constant-length-closure
+                       (*string*= str curr-pos (+ curr-pos len) 0 len))))))
+              (char-class
+                ;; a character class
+                (insert-char-class-tester (regex (schar *string* curr-pos))
+                  (if (invertedp regex)
+                    (greedy-constant-length-closure
+                     (not (char-class-test)))
+                    (greedy-constant-length-closure
+                     (char-class-test)))))
+              (everything
+                ;; an EVERYTHING object, i.e. a dot
+                (if (single-line-p regex)
+                  (create-greedy-everything-matcher maximum min-rest next-fn)
+                  (greedy-constant-length-closure
+                   (char/= #\Newline (schar *string* curr-pos)))))
+              (t
+                ;; the general case - we build an inner matcher which
+                ;; just checks for immediate success, i.e. NEXT-FN is
+                ;; #'IDENTITY
+                (let ((inner-matcher (create-matcher-aux regex #'identity)))
+                  (declare (type function inner-matcher))
+                  (greedy-constant-length-closure
+                   (funcall inner-matcher curr-pos)))))))))
+  
+(defmethod create-greedy-no-zero-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is greedy and the minimal number of repetitions is
+zero. It is furthermore assumed that the inner regex of REPETITION can
+never match a zero-length string (or instead the maximal number of
+repetitions is 1)."
+  (let ((maximum (maximum repetition))
+        ;; REPEAT-MATCHER is part of the closure's environment but it
+        ;; can only be defined after GREEDY-AUX is defined
+        repeat-matcher)
+    (declare (type function next-fn))
+    (cond
+      ((eql maximum 1)
+        ;; this is essentially like the next case but with a known
+        ;; MAXIMUM of 1 we can get away without a counter; note that
+        ;; we always arrive here if CONVERT optimizes <regex>* to
+        ;; (?:<regex'>*<regex>)?
+        (setq repeat-matcher
+                (create-matcher-aux (regex repetition) next-fn))
+        (lambda (start-pos)
+          (declare (type function repeat-matcher))
+          (or (funcall repeat-matcher start-pos)
+              (funcall next-fn start-pos))))
+      (maximum
+        ;; we make a reservation for our slot in *REPEAT-COUNTERS*
+        ;; because we need to keep track whether we've reached MAXIMUM
+        ;; repetitions
+        (let ((rep-num (incf-after *rep-num*)))
+          (flet ((greedy-aux (start-pos)
+                   (declare (type fixnum start-pos maximum rep-num)
+                            (type function repeat-matcher))
+                   ;; the actual matcher which first tries to match the
+                   ;; inner regex of REPETITION (if we haven't done so
+                   ;; too often) and on failure calls NEXT-FN
+                   (or (and (< (aref *repeat-counters* rep-num) maximum)
+                            (incf (aref *repeat-counters* rep-num))
+                            ;; note that REPEAT-MATCHER will call
+                            ;; GREEDY-AUX again recursively
+                            (prog1
+                              (funcall repeat-matcher start-pos)
+                              (decf (aref *repeat-counters* rep-num))))
+                       (funcall next-fn start-pos))))
+            ;; create a closure to match the inner regex and to
+            ;; implement backtracking via GREEDY-AUX
+            (setq repeat-matcher
+                    (create-matcher-aux (regex repetition) #'greedy-aux))
+            ;; the closure we return is just a thin wrapper around
+            ;; GREEDY-AUX to initialize the repetition counter
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (setf (aref *repeat-counters* rep-num) 0)
+              (greedy-aux start-pos)))))
+      (t
+        ;; easier code because we're not bounded by MAXIMUM, but
+        ;; basically the same
+        (flet ((greedy-aux (start-pos)
+                 (declare (type fixnum start-pos)
+                          (type function repeat-matcher))
+                 (or (funcall repeat-matcher start-pos)
+                     (funcall next-fn start-pos))))
+          (setq repeat-matcher
+                  (create-matcher-aux (regex repetition) #'greedy-aux))
+          #'greedy-aux)))))
+
+(defmethod create-greedy-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is greedy and the minimal number of repetitions is
+zero."
+  (let ((maximum (maximum repetition))
+        ;; we make a reservation for our slot in *LAST-POS-STORES* because
+        ;; we have to watch out for endless loops as the inner regex might
+        ;; match zero-length strings
+        (zero-length-num (incf-after *zero-length-num*))
+        ;; REPEAT-MATCHER is part of the closure's environment but it
+        ;; can only be defined after GREEDY-AUX is defined
+        repeat-matcher)
+    (declare (type fixnum zero-length-num)
+             (type function next-fn))
+    (cond
+      (maximum
+        ;; we make a reservation for our slot in *REPEAT-COUNTERS*
+        ;; because we need to keep track whether we've reached MAXIMUM
+        ;; repetitions
+        (let ((rep-num (incf-after *rep-num*)))
+          (flet ((greedy-aux (start-pos)
+                   ;; the actual matcher which first tries to match the
+                   ;; inner regex of REPETITION (if we haven't done so
+                   ;; too often) and on failure calls NEXT-FN
+                   (declare (type fixnum start-pos maximum rep-num)
+                            (type function repeat-matcher))
+                   (let ((old-last-pos
+                           (svref *last-pos-stores* zero-length-num)))
+                     (when (and old-last-pos
+                                (= (the fixnum old-last-pos) start-pos))
+                       ;; stop immediately if we've been here before,
+                       ;; i.e. if the last attempt matched a zero-length
+                       ;; string
+                       (return-from greedy-aux (funcall next-fn start-pos)))
+                     ;; otherwise remember this position for the next
+                     ;; repetition
+                     (setf (svref *last-pos-stores* zero-length-num) start-pos)
+                     (or (and (< (aref *repeat-counters* rep-num) maximum)
+                              (incf (aref *repeat-counters* rep-num))
+                              ;; note that REPEAT-MATCHER will call
+                              ;; GREEDY-AUX again recursively
+                              (prog1
+                                (funcall repeat-matcher start-pos)
+                                (decf (aref *repeat-counters* rep-num))
+                                (setf (svref *last-pos-stores* zero-length-num)
+                                        old-last-pos)))
+                         (funcall next-fn start-pos)))))
+            ;; create a closure to match the inner regex and to
+            ;; implement backtracking via GREEDY-AUX
+            (setq repeat-matcher
+                    (create-matcher-aux (regex repetition) #'greedy-aux))
+            ;; the closure we return is just a thin wrapper around
+            ;; GREEDY-AUX to initialize the repetition counter and our
+            ;; slot in *LAST-POS-STORES*
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (setf (aref *repeat-counters* rep-num) 0
+                    (svref *last-pos-stores* zero-length-num) nil)
+              (greedy-aux start-pos)))))
+      (t
+        ;; easier code because we're not bounded by MAXIMUM, but
+        ;; basically the same
+        (flet ((greedy-aux (start-pos)
+                 (declare (type fixnum start-pos)
+                          (type function repeat-matcher))
+                 (let ((old-last-pos
+                         (svref *last-pos-stores* zero-length-num)))
+                   (when (and old-last-pos
+                              (= (the fixnum old-last-pos) start-pos))
+                     (return-from greedy-aux (funcall next-fn start-pos)))
+                   (setf (svref *last-pos-stores* zero-length-num) start-pos)
+                   (or (prog1
+                         (funcall repeat-matcher start-pos)
+                         (setf (svref *last-pos-stores* zero-length-num) old-last-pos))
+                       (funcall next-fn start-pos)))))
+          (setq repeat-matcher
+                  (create-matcher-aux (regex repetition) #'greedy-aux))
+          (lambda (start-pos)
+            (declare (type fixnum start-pos))
+            (setf (svref *last-pos-stores* zero-length-num) nil)
+            (greedy-aux start-pos)))))))
+  
+;; code for non-greedy repetitions with minimum zero
+
+(defmacro non-greedy-constant-length-closure (check-curr-pos)
+  "This is the template for simple non-greedy repetitions (where
+simple means that the minimum number of repetitions is zero, that the
+inner regex to be checked is of fixed length LEN, and that it doesn't
+contain registers, i.e. there's no need for backtracking).
+CHECK-CURR-POS is a form which checks whether the inner regex of the
+repetition matches at CURR-POS."
+  `(if maximum
+    (lambda (start-pos)
+      (declare (type fixnum start-pos maximum))
+      ;; because we know LEN we know in advance where to stop at the
+      ;; latest; we also take into consideration MIN-REST, i.e. the
+      ;; minimal length of the part behind the repetition
+      (let ((target-end-pos (min (1+ (- *end-pos* len min-rest))
+                                 (+ start-pos
+                                    (the fixnum (* len maximum))))))
+        ;; move forward by LEN and always try NEXT-FN first, then
+        ;; CHECK-CUR-POS
+        (loop for curr-pos of-type fixnum from start-pos
+                                          below target-end-pos
+                                          by len
+              thereis (funcall next-fn curr-pos)
+              while ,check-curr-pos
+              finally (return (funcall next-fn curr-pos)))))
+  ;; basically the same code; it's just a bit easier because we're
+  ;; not bounded by MAXIMUM
+  (lambda (start-pos)
+    (declare (type fixnum start-pos))
+    (let ((target-end-pos (1+ (- *end-pos* len min-rest))))
+      (loop for curr-pos of-type fixnum from start-pos
+                                        below target-end-pos
+                                        by len
+            thereis (funcall next-fn curr-pos)
+            while ,check-curr-pos
+            finally (return (funcall next-fn curr-pos)))))))
+
+(defmethod create-non-greedy-constant-length-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is non-greedy and the minimal number of repetitions is
+zero. It is furthermore assumed that the inner regex of REPETITION is
+of fixed length and doesn't contain registers."
+  (let ((len (len repetition))
+        (maximum (maximum repetition))
+        (regex (regex repetition))
+        (min-rest (min-rest repetition)))
+    (declare (type fixnum len min-rest)
+             (type function next-fn))
+    (cond ((zerop len)
+            ;; inner regex has zero-length, so we can discard it
+            ;; completely
+            next-fn)
+          (t
+            ;; now first try to optimize for a couple of common cases
+            (typecase regex
+              (str
+                (let ((str (str regex)))
+                  (if (= 1 len)
+                    ;; a single character
+                    (let ((chr (schar str 0)))
+                      (if (case-insensitive-p regex)
+                        (non-greedy-constant-length-closure
+                         (char-equal chr (schar *string* curr-pos)))
+                        (non-greedy-constant-length-closure
+                         (char= chr (schar *string* curr-pos)))))
+                    ;; a string
+                    (if (case-insensitive-p regex)
+                      (non-greedy-constant-length-closure
+                       (*string*-equal str curr-pos (+ curr-pos len) 0 len))
+                      (non-greedy-constant-length-closure
+                       (*string*= str curr-pos (+ curr-pos len) 0 len))))))
+              (char-class
+                ;; a character class
+                (insert-char-class-tester (regex (schar *string* curr-pos))
+                  (if (invertedp regex)
+                    (non-greedy-constant-length-closure
+                     (not (char-class-test)))
+                    (non-greedy-constant-length-closure
+                     (char-class-test)))))
+              (everything
+                (if (single-line-p regex)
+                  ;; a dot which really can match everything; we rely
+                  ;; on the compiler to optimize this away
+                  (non-greedy-constant-length-closure
+                   t)
+                  ;; a dot which has to watch out for #\Newline
+                  (non-greedy-constant-length-closure
+                   (char/= #\Newline (schar *string* curr-pos)))))
+              (t
+                ;; the general case - we build an inner matcher which
+                ;; just checks for immediate success, i.e. NEXT-FN is
+                ;; #'IDENTITY
+                (let ((inner-matcher (create-matcher-aux regex #'identity)))
+                  (declare (type function inner-matcher))
+                  (non-greedy-constant-length-closure
+                   (funcall inner-matcher curr-pos)))))))))
+
+(defmethod create-non-greedy-no-zero-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is non-greedy and the minimal number of repetitions is
+zero. It is furthermore assumed that the inner regex of REPETITION can
+never match a zero-length string (or instead the maximal number of
+repetitions is 1)."
+  (let ((maximum (maximum repetition))
+        ;; REPEAT-MATCHER is part of the closure's environment but it
+        ;; can only be defined after NON-GREEDY-AUX is defined
+        repeat-matcher)
+    (declare (type function next-fn))
+    (cond
+      ((eql maximum 1)
+        ;; this is essentially like the next case but with a known
+        ;; MAXIMUM of 1 we can get away without a counter
+        (setq repeat-matcher
+                (create-matcher-aux (regex repetition) next-fn))
+        (lambda (start-pos)
+          (declare (type function repeat-matcher))
+          (or (funcall next-fn start-pos)
+              (funcall repeat-matcher start-pos))))
+      (maximum
+        ;; we make a reservation for our slot in *REPEAT-COUNTERS*
+        ;; because we need to keep track whether we've reached MAXIMUM
+        ;; repetitions
+        (let ((rep-num (incf-after *rep-num*)))
+          (flet ((non-greedy-aux (start-pos)
+                   ;; the actual matcher which first calls NEXT-FN and
+                   ;; on failure tries to match the inner regex of
+                   ;; REPETITION (if we haven't done so too often)
+                   (declare (type fixnum start-pos maximum rep-num)
+                            (type function repeat-matcher))
+                   (or (funcall next-fn start-pos)
+                       (and (< (aref *repeat-counters* rep-num) maximum)
+                            (incf (aref *repeat-counters* rep-num))
+                            ;; note that REPEAT-MATCHER will call
+                            ;; NON-GREEDY-AUX again recursively
+                            (prog1
+                              (funcall repeat-matcher start-pos)
+                              (decf (aref *repeat-counters* rep-num)))))))
+            ;; create a closure to match the inner regex and to
+            ;; implement backtracking via NON-GREEDY-AUX
+            (setq repeat-matcher
+                    (create-matcher-aux (regex repetition) #'non-greedy-aux))
+            ;; the closure we return is just a thin wrapper around
+            ;; NON-GREEDY-AUX to initialize the repetition counter
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (setf (aref *repeat-counters* rep-num) 0)
+              (non-greedy-aux start-pos)))))
+      (t
+        ;; easier code because we're not bounded by MAXIMUM, but
+        ;; basically the same
+        (flet ((non-greedy-aux (start-pos)
+                 (declare (type fixnum start-pos)
+                          (type function repeat-matcher))
+                 (or (funcall next-fn start-pos)
+                     (funcall repeat-matcher start-pos))))
+          (setq repeat-matcher
+                  (create-matcher-aux (regex repetition) #'non-greedy-aux))
+          #'non-greedy-aux)))))
+  
+(defmethod create-non-greedy-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION is non-greedy and the minimal number of repetitions is
+zero."
+  ;; we make a reservation for our slot in *LAST-POS-STORES* because
+  ;; we have to watch out for endless loops as the inner regex might
+  ;; match zero-length strings
+  (let ((zero-length-num (incf-after *zero-length-num*))
+        (maximum (maximum repetition))
+        ;; REPEAT-MATCHER is part of the closure's environment but it
+        ;; can only be defined after NON-GREEDY-AUX is defined
+        repeat-matcher)
+    (declare (type fixnum zero-length-num)
+             (type function next-fn))
+    (cond
+      (maximum
+        ;; we make a reservation for our slot in *REPEAT-COUNTERS*
+        ;; because we need to keep track whether we've reached MAXIMUM
+        ;; repetitions
+        (let ((rep-num (incf-after *rep-num*)))
+          (flet ((non-greedy-aux (start-pos)
+                   ;; the actual matcher which first calls NEXT-FN and
+                   ;; on failure tries to match the inner regex of
+                   ;; REPETITION (if we haven't done so too often)
+                   (declare (type fixnum start-pos maximum rep-num)
+                            (type function repeat-matcher))
+                   (let ((old-last-pos
+                           (svref *last-pos-stores* zero-length-num)))
+                     (when (and old-last-pos
+                                (= (the fixnum old-last-pos) start-pos))
+                       ;; stop immediately if we've been here before,
+                       ;; i.e. if the last attempt matched a zero-length
+                       ;; string
+                       (return-from non-greedy-aux (funcall next-fn start-pos)))
+                     ;; otherwise remember this position for the next
+                     ;; repetition
+                     (setf (svref *last-pos-stores* zero-length-num) start-pos)
+                     (or (funcall next-fn start-pos)
+                         (and (< (aref *repeat-counters* rep-num) maximum)
+                              (incf (aref *repeat-counters* rep-num))
+                              ;; note that REPEAT-MATCHER will call
+                              ;; NON-GREEDY-AUX again recursively
+                              (prog1
+                                (funcall repeat-matcher start-pos)
+                                (decf (aref *repeat-counters* rep-num))
+                                (setf (svref *last-pos-stores* zero-length-num)
+                                        old-last-pos)))))))
+            ;; create a closure to match the inner regex and to
+            ;; implement backtracking via NON-GREEDY-AUX
+            (setq repeat-matcher
+                    (create-matcher-aux (regex repetition) #'non-greedy-aux))
+            ;; the closure we return is just a thin wrapper around
+            ;; NON-GREEDY-AUX to initialize the repetition counter and our
+            ;; slot in *LAST-POS-STORES*
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (setf (aref *repeat-counters* rep-num) 0
+                    (svref *last-pos-stores* zero-length-num) nil)
+              (non-greedy-aux start-pos)))))
+      (t
+        ;; easier code because we're not bounded by MAXIMUM, but
+        ;; basically the same
+        (flet ((non-greedy-aux (start-pos)
+                 (declare (type fixnum start-pos)
+                          (type function repeat-matcher))
+                 (let ((old-last-pos
+                         (svref *last-pos-stores* zero-length-num)))
+                   (when (and old-last-pos
+                              (= (the fixnum old-last-pos) start-pos))
+                     (return-from non-greedy-aux (funcall next-fn start-pos)))
+                   (setf (svref *last-pos-stores* zero-length-num) start-pos)
+                   (or (funcall next-fn start-pos)
+                       (prog1
+                         (funcall repeat-matcher start-pos)
+                         (setf (svref *last-pos-stores* zero-length-num)
+                                 old-last-pos))))))
+          (setq repeat-matcher
+                  (create-matcher-aux (regex repetition) #'non-greedy-aux))
+          (lambda (start-pos)
+            (declare (type fixnum start-pos))
+            (setf (svref *last-pos-stores* zero-length-num) nil)
+            (non-greedy-aux start-pos)))))))
+  
+;; code for constant repetitions, i.e. those with a fixed number of repetitions
+                      
+(defmacro constant-repetition-constant-length-closure (check-curr-pos)
+  "This is the template for simple constant repetitions (where simple
+means that the inner regex to be checked is of fixed length LEN, and
+that it doesn't contain registers, i.e. there's no need for
+backtracking) and where constant means that MINIMUM is equal to
+MAXIMUM. CHECK-CURR-POS is a form which checks whether the inner regex
+of the repetition matches at CURR-POS."
+  `(lambda (start-pos)
+    (declare (type fixnum start-pos))
+      (let ((target-end-pos (+ start-pos
+                               (the fixnum (* len repetitions)))))
+        (declare (type fixnum target-end-pos))
+        ;; first check if we won't go beyond the end of the string
+        (and (>= *end-pos* target-end-pos)
+             ;; then loop through all repetitions step by step
+             (loop for curr-pos of-type fixnum from start-pos
+                                               below target-end-pos
+                                               by len
+                   always ,check-curr-pos)
+             ;; finally call NEXT-FN if we made it that far
+             (funcall next-fn target-end-pos)))))
+
+(defmethod create-constant-repetition-constant-length-matcher
+       ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION has a constant number of repetitions. It is
+furthermore assumed that the inner regex of REPETITION is of fixed
+length and doesn't contain registers."
+  (let ((len (len repetition))
+        (repetitions (minimum repetition))
+        (regex (regex repetition)))
+    (declare (type fixnum len repetitions)
+             (type function next-fn))
+    (if (zerop len)
+      ;; if the length is zero it suffices to try once
+      (create-matcher-aux regex next-fn)
+      ;; otherwise try to optimize for a couple of common cases
+      (typecase regex
+        (str
+          (let ((str (str regex)))
+            (if (= 1 len)
+              ;; a single character
+              (let ((chr (schar str 0)))
+                (if (case-insensitive-p regex)
+                  (constant-repetition-constant-length-closure
+                   (and (char-equal chr (schar *string* curr-pos))
+                        (1+ curr-pos)))
+                  (constant-repetition-constant-length-closure
+                   (and (char= chr (schar *string* curr-pos))
+                        (1+ curr-pos)))))
+              ;; a string
+              (if (case-insensitive-p regex)
+                (constant-repetition-constant-length-closure
+                 (let ((next-pos (+ curr-pos len)))
+                   (declare (type fixnum next-pos))
+                   (and (*string*-equal str curr-pos next-pos 0 len)
+                        next-pos)))
+                (constant-repetition-constant-length-closure
+                 (let ((next-pos (+ curr-pos len)))
+                   (declare (type fixnum next-pos))
+                   (and (*string*= str curr-pos next-pos 0 len)
+                        next-pos)))))))
+        (char-class
+          ;; a character class
+          (insert-char-class-tester (regex (schar *string* curr-pos))
+            (if (invertedp regex)
+              (constant-repetition-constant-length-closure
+               (and (not (char-class-test))
+                    (1+ curr-pos)))
+              (constant-repetition-constant-length-closure
+               (and (char-class-test)
+                    (1+ curr-pos))))))
+        (everything
+          (if (single-line-p regex)
+            ;; a dot which really matches everything - we just have to
+            ;; advance the index into *STRING* accordingly and check
+            ;; if we didn't go past the end
+            (lambda (start-pos)
+              (declare (type fixnum start-pos))
+              (let ((next-pos (+ start-pos repetitions)))
+                (declare (type fixnum next-pos))
+                (or (<= next-pos *end-pos*)
+                    (funcall next-fn next-pos))))
+            ;; a dot which is not in single-line-mode - make sure we
+            ;; don't match #\Newline
+            (constant-repetition-constant-length-closure
+             (and (char/= #\Newline (schar *string* curr-pos))
+                  (1+ curr-pos)))))
+        (t
+          ;; the general case - we build an inner matcher which just
+          ;; checks for immediate success, i.e. NEXT-FN is #'IDENTITY
+          (let ((inner-matcher (create-matcher-aux regex #'identity)))
+            (declare (type function inner-matcher))
+            (constant-repetition-constant-length-closure
+             (funcall inner-matcher curr-pos))))))))
+  
+(defmethod create-constant-repetition-matcher ((repetition repetition) next-fn)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Creates a closure which tries to match REPETITION. It is assumed
+that REPETITION has a constant number of repetitions."
+  (let ((repetitions (minimum repetition))
+        ;; we make a reservation for our slot in *REPEAT-COUNTERS*
+        ;; because we need to keep track of the number of repetitions
+        (rep-num (incf-after *rep-num*))
+        ;; REPEAT-MATCHER is part of the closure's environment but it
+        ;; can only be defined after NON-GREEDY-AUX is defined
+        repeat-matcher)
+    (declare (type fixnum repetitions rep-num)
+             (type function next-fn))
+    (if (zerop (min-len repetition))
+      ;; we make a reservation for our slot in *LAST-POS-STORES*
+      ;; because we have to watch out for needless loops as the inner
+      ;; regex might match zero-length strings
+      (let ((zero-length-num (incf-after *zero-length-num*)))
+        (declare (type fixnum zero-length-num))
+        (flet ((constant-aux (start-pos)
+                 ;; the actual matcher which first calls NEXT-FN and
+                 ;; on failure tries to match the inner regex of
+                 ;; REPETITION (if we haven't done so too often)
+                 (declare (type fixnum start-pos)
+                          (type function repeat-matcher))
+                 (let ((old-last-pos
+                         (svref *last-pos-stores* zero-length-num)))
+                   (when (and old-last-pos
+                              (= (the fixnum old-last-pos) start-pos))
+                     ;; if we've been here before we matched a
+                     ;; zero-length string the last time, so we can
+                     ;; just carry on because we will definitely be
+                     ;; able to do this again often enough
+                     (return-from constant-aux (funcall next-fn start-pos)))
+                   ;; otherwise remember this position for the next
+                   ;; repetition
+                   (setf (svref *last-pos-stores* zero-length-num) start-pos)
+                   (cond ((< (aref *repeat-counters* rep-num) repetitions)
+                           ;; not enough repetitions yet, try it again
+                           (incf (aref *repeat-counters* rep-num))
+                           ;; note that REPEAT-MATCHER will call
+                           ;; CONSTANT-AUX again recursively
+                           (prog1
+                             (funcall repeat-matcher start-pos)
+                             (decf (aref *repeat-counters* rep-num))
+                             (setf (svref *last-pos-stores* zero-length-num)
+                                     old-last-pos)))
+                         (t
+                           ;; we're done - call NEXT-FN
+                           (funcall next-fn start-pos))))))
+          ;; create a closure to match the inner regex and to
+          ;; implement backtracking via CONSTANT-AUX
+          (setq repeat-matcher
+                  (create-matcher-aux (regex repetition) #'constant-aux))
+          ;; the closure we return is just a thin wrapper around
+          ;; CONSTANT-AUX to initialize the repetition counter
+          (lambda (start-pos)
+            (declare (type fixnum start-pos))
+            (setf (aref *repeat-counters* rep-num) 0
+                  (aref *last-pos-stores* zero-length-num) nil)
+            (constant-aux start-pos))))
+      ;; easier code because we don't have to care about zero-length
+      ;; matches but basically the same
+      (flet ((constant-aux (start-pos)
+               (declare (type fixnum start-pos)
+                        (type function repeat-matcher))
+               (cond ((< (aref *repeat-counters* rep-num) repetitions)
+                       (incf (aref *repeat-counters* rep-num))
+                       (prog1
+                         (funcall repeat-matcher start-pos)
+                         (decf (aref *repeat-counters* rep-num))))
+                     (t (funcall next-fn start-pos)))))
+        (setq repeat-matcher
+                (create-matcher-aux (regex repetition) #'constant-aux))
+        (lambda (start-pos)
+          (declare (type fixnum start-pos))
+          (setf (aref *repeat-counters* rep-num) 0)
+          (constant-aux start-pos))))))
+  
+;; the actual CREATE-MATCHER-AUX method for REPETITION objects which
+;; utilizes all the functions and macros defined above
+
+(defmethod create-matcher-aux ((repetition repetition) next-fn)
+  (with-slots ((minimum minimum)
+               (maximum maximum)
+               (len len)
+               (min-len min-len)
+               (greedyp greedyp)
+               (contains-register-p contains-register-p))
+      repetition
+    (cond ((and maximum
+                (zerop maximum))
+            ;; this should have been optimized away by CONVERT but just
+            ;; in case...
+            (error "Got REPETITION with MAXIMUM 0 \(should not happen)"))
+          ((and maximum
+                (= minimum maximum 1))
+            ;; this should have been optimized away by CONVERT but just
+            ;; in case...
+            (error "Got REPETITION with MAXIMUM 1 and MINIMUM 1 \(should not happen)"))
+          ((and (eql minimum maximum)
+                len
+                (not contains-register-p))
+            (create-constant-repetition-constant-length-matcher repetition next-fn))
+          ((eql minimum maximum)
+            (create-constant-repetition-matcher repetition next-fn))
+          ((and greedyp
+                len
+                (not contains-register-p))
+            (create-greedy-constant-length-matcher repetition next-fn))
+          ((and greedyp
+                (or (plusp min-len)
+                    (eql maximum 1)))
+            (create-greedy-no-zero-matcher repetition next-fn))
+          (greedyp
+            (create-greedy-matcher repetition next-fn))
+          ((and len
+                (plusp len)
+                (not contains-register-p))
+            (create-non-greedy-constant-length-matcher repetition next-fn))
+          ((or (plusp min-len)
+               (eql maximum 1))
+            (create-non-greedy-no-zero-matcher repetition next-fn))
+          (t
+            (create-non-greedy-matcher repetition next-fn)))))
--- a/scanner.lisp
+++ b/scanner.lisp
@ -0,0 +1,519 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/scanner.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Here the scanner for the actual regex as well as utility scanners
+;;; for the constant start and end strings are created.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defmacro bmh-matcher-aux (&key case-insensitive-p)
+  "Auxiliary macro used by CREATE-BMH-MATCHER."
+  (let ((char-compare (if case-insensitive-p 'char-equal 'char=)))
+    `(lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (if (> (the fixnum (+ start-pos m)) *end-pos*)
+        nil
+        (loop named bmh-matcher
+              for k of-type fixnum = (+ start-pos m -1)
+                then (+ k (max 1 (aref skip (char-code (schar *string* k)))))
+              while (< k *end-pos*)
+              do (loop for j of-type fixnum downfrom (1- m)
+                       for i of-type fixnum downfrom k
+                       while (and (>= j 0)
+                                  (,char-compare (schar *string* i)
+                                                 (schar pattern j)))
+                       finally (if (minusp j)
+                                 (return-from bmh-matcher (1+ i)))))))))
+
+(defun create-bmh-matcher (pattern case-insensitive-p)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns a Boyer-Moore-Horspool matcher which searches the (special)
+simple-string *STRING* for the first occurence of the substring
+PATTERN. The search starts at the position START-POS within *STRING*
+and stops before *END-POS* is reached. Depending on the second
+argument the search is case-insensitive or not. If the special
+variable *USE-BMH-MATCHERS* is NIL, use the standard SEARCH function
+instead. (BMH matchers are faster but need much more space.)"
+  ;; see <http://www-igm.univ-mlv.fr/~lecroq/string/node18.html> for
+  ;; details
+  (unless *use-bmh-matchers*
+    (let ((test (if case-insensitive-p #'char-equal #'char=)))
+      (return-from create-bmh-matcher
+        (lambda (start-pos)
+          (declare (type fixnum start-pos))
+          (search pattern
+                  *string*
+                  :start2 start-pos
+                  :end2 *end-pos*
+                  :test test)))))
+  (let* ((m (length pattern))
+	 (skip (make-array *regex-char-code-limit*
+			  :element-type 'fixnum
+			  :initial-element m)))
+    (declare (type fixnum m))
+    (loop for k of-type fixnum below m
+          if case-insensitive-p
+            do (setf (aref skip (char-code (char-upcase (schar pattern k)))) (- m k 1)
+                     (aref skip (char-code (char-downcase (schar pattern k)))) (- m k 1))
+	  else
+	    do (setf (aref skip (char-code (schar pattern k))) (- m k 1)))
+    (if case-insensitive-p
+      (bmh-matcher-aux :case-insensitive-p t)
+      (bmh-matcher-aux))))
+
+(defmacro char-searcher-aux (&key case-insensitive-p)
+  "Auxiliary macro used by CREATE-CHAR-SEARCHER."
+  (let ((char-compare (if case-insensitive-p 'char-equal 'char=)))
+    `(lambda (start-pos)
+      (declare (type fixnum start-pos))
+      (loop for i of-type fixnum from start-pos below *end-pos*
+            thereis (and (,char-compare (schar *string* i) chr) i)))))
+
+(defun create-char-searcher (chr case-insensitive-p)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns a function which searches the (special) simple-string
+*STRING* for the first occurence of the character CHR. The search
+starts at the position START-POS within *STRING* and stops before
+*END-POS* is reached. Depending on the second argument the search is
+case-insensitive or not."
+  (if case-insensitive-p
+    (char-searcher-aux :case-insensitive-p t)
+    (char-searcher-aux)))
+
+(declaim (inline newline-skipper))
+
+(defun newline-skipper (start-pos)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum start-pos))
+  "Find the next occurence of a character in *STRING* which is behind
+a #\Newline."
+  (loop for i of-type fixnum from start-pos below *end-pos*
+        thereis (and (char= (schar *string* i) #\Newline)
+                     (1+ i))))
+
+(defmacro insert-advance-fn (advance-fn)
+  "Creates the actual closure returned by CREATE-SCANNER-AUX by
+replacing '(ADVANCE-FN-DEFINITION) with a suitable definition for
+ADVANCE-FN. This is a utility macro used by CREATE-SCANNER-AUX."
+  (subst
+   advance-fn '(advance-fn-definition)
+   '(lambda (string start end)
+     (block scan
+       ;; initialize a couple of special variables used by the
+       ;; matchers (see file specials.lisp)
+       (let* ((*string* string)
+              (*start-pos* start)
+              (*end-pos* end)
+              ;; we will search forward for END-STRING if this value
+              ;; isn't at least as big as POS (see ADVANCE-FN), so it
+              ;; is safe to start to the left of *START-POS*; note
+              ;; that this value will _never_ be decremented - this
+              ;; is crucial to the scanning process
+              (*end-string-pos* (1- *start-pos*))
+              ;; the next five will shadow the variables defined by
+              ;; DEFPARAMETER; at this point, we don't know if we'll
+              ;; actually use them, though
+              (*repeat-counters* *repeat-counters*)
+              (*last-pos-stores* *last-pos-stores*)
+              (*reg-starts* *reg-starts*)
+              (*regs-maybe-start* *regs-maybe-start*)
+              (*reg-ends* *reg-ends*)
+              ;; we might be able to optimize the scanning process by
+              ;; (virtually) shifting *START-POS* to the right
+              (scan-start-pos *start-pos*)
+              (starts-with-str (if start-string-test
+                                 (str starts-with)
+                                 nil))
+              ;; we don't need to try further than MAX-END-POS
+              (max-end-pos (- *end-pos* min-len)))
+         (declare (type fixnum scan-start-pos)
+                  (type function match-fn))
+         ;; definition of ADVANCE-FN will be inserted here by macrology
+         (labels ((advance-fn-definition))
+           (declare (inline advance-fn))
+           (when (plusp rep-num)
+             ;; we have at least one REPETITION which needs to count
+             ;; the number of repetitions
+             (setq *repeat-counters* (make-array rep-num
+                                                 :initial-element 0
+                                                 :element-type 'fixnum)))
+           (when (plusp zero-length-num)
+             ;; we have at least one REPETITION which needs to watch
+             ;; out for zero-length repetitions
+             (setq *last-pos-stores* (make-array zero-length-num
+                                                 :initial-element nil)))
+           (when (plusp reg-num)
+             ;; we have registers in our regular expression
+             (setq *reg-starts* (make-array reg-num :initial-element nil)
+                   *regs-maybe-start* (make-array reg-num :initial-element nil)
+                   *reg-ends* (make-array reg-num :initial-element nil)))
+           (when end-anchored-p
+             ;; the regular expression has a constant end string which
+             ;; is anchored at the very end of the target string
+             ;; (perhaps modulo a #\Newline)
+             (let ((end-test-pos (- *end-pos* (the fixnum end-string-len))))
+               (declare (type fixnum end-test-pos)
+                        (type function end-string-test))
+               (unless (setq *end-string-pos* (funcall end-string-test
+                                                       end-test-pos))
+                 (when (and (= 1 (the fixnum end-anchored-p))
+                            (char= #\Newline (schar *string* (1- *end-pos*))))
+                   ;; if we didn't find an end string candidate from
+                   ;; END-TEST-POS and if a #\Newline at the end is
+                   ;; allowed we try it again from one position to the
+                   ;; left
+                   (setq *end-string-pos* (funcall end-string-test
+                                                   (1- end-test-pos))))))
+             (unless (and *end-string-pos*
+                          (<= *start-pos* *end-string-pos*))
+               ;; no end string candidate found, so give up
+               (return-from scan nil))
+             (when end-string-offset
+               ;; if the offset of the constant end string from the
+               ;; left of the regular expression is known we can start
+               ;; scanning further to the right; this is similar to
+               ;; what we might do in ADVANCE-FN
+               (setq scan-start-pos (max scan-start-pos
+                                         (- (the fixnum *end-string-pos*)
+                                            (the fixnum end-string-offset))))))
+             (cond
+               (start-anchored-p
+                 ;; we're anchored at the start of the target string,
+                 ;; so no need to try again after first failure
+                 (when (or (/= *start-pos* scan-start-pos)
+                           (< max-end-pos *start-pos*))
+                   ;; if END-STRING-OFFSET has proven that we don't
+                   ;; need to bother to scan from *START-POS* or if the
+                   ;; minimal length of the regular expression is
+                   ;; longer than the target string we give up
+                   (return-from scan nil))
+                 (when starts-with-str
+                   (locally
+                     (declare (type fixnum starts-with-len))
+                     (cond ((and (case-insensitive-p starts-with)
+                                 (not (*string*-equal starts-with-str
+                                                      *start-pos*
+                                                      (+ *start-pos*
+                                                         starts-with-len)
+                                                      0 starts-with-len)))
+                             ;; the regular expression has a
+                             ;; case-insensitive constant start string
+                             ;; and we didn't find it
+                             (return-from scan nil))
+                           ((and (not (case-insensitive-p starts-with))
+                                 (not (*string*= starts-with-str
+                                            *start-pos*
+                                            (+ *start-pos* starts-with-len)
+                                            0 starts-with-len)))
+                             ;; the regular expression has a
+                             ;; case-sensitive constant start string
+                             ;; and we didn't find it
+                             (return-from scan nil))
+                           (t nil))))
+                 (when (and end-string-test
+                            (not end-anchored-p))
+                   ;; the regular expression has a constant end string
+                   ;; which isn't anchored so we didn't check for it
+                   ;; already
+                   (block end-string-loop
+                     ;; we temporarily use *END-STRING-POS* as our
+                     ;; starting position to look for end string
+                     ;; candidates
+                     (setq *end-string-pos* *start-pos*)
+                     (loop
+                       (unless (setq *end-string-pos*
+                                       (funcall (the function end-string-test)
+                                                *end-string-pos*))
+                         ;; no end string candidate found, so give up
+                         (return-from scan nil))
+                       (unless end-string-offset
+                         ;; end string doesn't have an offset so we
+                         ;; can start scanning now
+                         (return-from end-string-loop))
+                       (let ((maybe-start-pos (- (the fixnum *end-string-pos*)
+                                                 (the fixnum end-string-offset))))
+                         (cond ((= maybe-start-pos *start-pos*)
+                                 ;; offset of end string into regular
+                                 ;; expression matches start anchor -
+                                 ;; fine...
+                                 (return-from end-string-loop))
+                               ((and (< maybe-start-pos *start-pos*)
+                                     (< (+ *end-string-pos* end-string-len) *end-pos*))
+                                 ;; no match but maybe we find another
+                                 ;; one to the right - try again
+                                 (incf *end-string-pos*))
+                               (t
+                                 ;; otherwise give up
+                                 (return-from scan nil)))))))
+                 ;; if we got here we scan exactly once
+                 (let ((next-pos (funcall match-fn *start-pos*)))
+                   (when next-pos
+                     (values (if next-pos *start-pos* nil)
+                             next-pos
+                             *reg-starts*
+                             *reg-ends*))))
+               (t
+                 (loop for pos = (if starts-with-everything
+                                   ;; don't jump to the next
+                                   ;; #\Newline on the first
+                                   ;; iteration
+                                   scan-start-pos
+                                   (advance-fn scan-start-pos))
+                           then (advance-fn pos)
+                       ;; give up if the regular expression can't fit
+                       ;; into the rest of the target string
+                       while (and pos
+                                  (<= (the fixnum pos) max-end-pos))
+                       do (let ((next-pos (funcall match-fn pos)))
+                            (when next-pos
+                              (return-from scan (values pos
+                                                        next-pos
+                                                        *reg-starts*
+                                                        *reg-ends*)))
+                            ;; not yet found, increment POS
+                            #-cormanlisp (incf (the fixnum pos))
+                            #+cormanlisp (incf pos)))))))))
+    :test #'equalp))
+
+(defun create-scanner-aux (match-fn
+                           min-len
+                           start-anchored-p
+                           starts-with
+                           start-string-test
+                           end-anchored-p
+                           end-string-test
+                           end-string-len
+                           end-string-offset
+                           rep-num
+                           zero-length-num
+                           reg-num)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  (declare (type fixnum min-len zero-length-num rep-num reg-num))
+  "Auxiliary function to create and return a scanner \(which is
+actually a closure). Used by CREATE-SCANNER."
+  (let ((starts-with-len (if (typep starts-with 'str)
+                           (len starts-with)))
+        (starts-with-everything (typep starts-with 'everything)))
+    (cond
+      ;; this COND statement dispatches on the different versions we
+      ;; have for ADVANCE-FN and creates different closures for each;
+      ;; note that you see only the bodies of ADVANCE-FN below - the
+      ;; actual scanner is defined in INSERT-ADVANCE-FN above; (we
+      ;; could have done this with closures instead of macrology but
+      ;; would have consed a lot more)
+      ((and start-string-test end-string-test end-string-offset)
+        ;; we know that the regular expression has constant start and
+        ;; end strings and we know the end string's offset (from the
+        ;; left)
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type fixnum end-string-offset starts-with-len)
+                     (type function start-string-test end-string-test))
+            (loop
+              (unless (setq pos (funcall start-string-test pos))
+                ;; give up completely if we can't find a start string
+                ;; candidate
+                (return-from scan nil))
+              (locally
+                ;; from here we know that POS is a FIXNUM
+                (declare (type fixnum pos))
+                (when (= pos (- (the fixnum *end-string-pos*) end-string-offset))
+                  ;; if we already found an end string candidate the
+                  ;; position of which matches the start string
+                  ;; candidate we're done
+                  (return-from advance-fn pos))
+                (let ((try-pos (+ pos starts-with-len)))
+                  ;; otherwise try (again) to find an end string
+                  ;; candidate which starts behind the start string
+                  ;; candidate
+                  (loop
+                    (unless (setq *end-string-pos*
+                                    (funcall end-string-test try-pos))
+                      ;; no end string candidate found, so give up
+                      (return-from scan nil))
+                    ;; NEW-POS is where we should start scanning
+                    ;; according to the end string candidate
+                    (let ((new-pos (- (the fixnum *end-string-pos*)
+                                      end-string-offset)))
+                      (declare (type fixnum new-pos *end-string-pos*))
+                      (cond ((= new-pos pos)
+                              ;; if POS and NEW-POS are equal then the
+                              ;; two candidates agree so we're fine
+                              (return-from advance-fn pos))
+                            ((> new-pos pos)
+                              ;; if NEW-POS is further to the right we
+                              ;; advance POS and try again, i.e. we go
+                              ;; back to the start of the outer LOOP
+                              (setq pos new-pos)
+                              ;; this means "return from inner LOOP"
+                              (return))
+                            (t
+                              ;; otherwise NEW-POS is smaller than POS,
+                              ;; so we have to redo the inner LOOP to
+                              ;; find another end string candidate
+                              ;; further to the right
+                              (setq try-pos (1+ *end-string-pos*))))))))))))
+      ((and starts-with-everything end-string-test end-string-offset)
+        ;; we know that the regular expression starts with ".*" (which
+        ;; is not in single-line-mode, see CREATE-SCANNER-AUX) and ends
+        ;; with a constant end string and we know the end string's
+        ;; offset (from the left)
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type fixnum end-string-offset)
+                     (type function end-string-test))
+            (loop
+              (unless (setq pos (newline-skipper pos))
+                ;; if we can't find a #\Newline we give up immediately
+                (return-from scan nil))
+              (locally
+                ;; from here we know that POS is a FIXNUM
+                (declare (type fixnum pos))
+                (when (= pos (- (the fixnum *end-string-pos*) end-string-offset))
+                  ;; if we already found an end string candidate the
+                  ;; position of which matches the place behind the
+                  ;; #\Newline we're done
+                  (return-from advance-fn pos))
+                (let ((try-pos pos))
+                  ;; otherwise try (again) to find an end string
+                  ;; candidate which starts behind the #\Newline
+                  (loop
+                    (unless (setq *end-string-pos*
+                                    (funcall end-string-test try-pos))
+                      ;; no end string candidate found, so we give up
+                      (return-from scan nil))
+                    ;; NEW-POS is where we should start scanning
+                    ;; according to the end string candidate
+                    (let ((new-pos (- (the fixnum *end-string-pos*)
+                                      end-string-offset)))
+                      (declare (type fixnum new-pos *end-string-pos*))
+                      (cond ((= new-pos pos)
+                              ;; if POS and NEW-POS are equal then the
+                              ;; the end string candidate agrees with
+                              ;; the #\Newline so we're fine
+                              (return-from advance-fn pos))
+                            ((> new-pos pos)
+                              ;; if NEW-POS is further to the right we
+                              ;; advance POS and try again, i.e. we go
+                              ;; back to the start of the outer LOOP
+                              (setq pos new-pos)
+                              ;; this means "return from inner LOOP"
+                              (return))
+                            (t
+                              ;; otherwise NEW-POS is smaller than POS,
+                              ;; so we have to redo the inner LOOP to
+                              ;; find another end string candidate
+                              ;; further to the right
+                              (setq try-pos (1+ *end-string-pos*))))))))))))
+      ((and start-string-test end-string-test)
+        ;; we know that the regular expression has constant start and
+        ;; end strings; similar to the first case but we only need to
+        ;; check for the end string, it doesn't provide enough
+        ;; information to advance POS
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type function start-string-test end-string-test))
+            (unless (setq pos (funcall start-string-test pos))
+              (return-from scan nil))
+            (if (<= (the fixnum pos)
+                    (the fixnum *end-string-pos*))
+              (return-from advance-fn pos))
+            (unless (setq *end-string-pos* (funcall end-string-test pos))
+              (return-from scan nil))
+            pos)))
+      ((and starts-with-everything end-string-test)
+        ;; we know that the regular expression starts with ".*" (which
+        ;; is not in single-line-mode, see CREATE-SCANNER-AUX) and ends
+        ;; with a constant end string; similar to the second case but we
+        ;; only need to check for the end string, it doesn't provide
+        ;; enough information to advance POS
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type function end-string-test))
+            (unless (setq pos (newline-skipper pos))
+              (return-from scan nil))
+            (if (<= (the fixnum pos)
+                    (the fixnum *end-string-pos*))
+              (return-from advance-fn pos))
+            (unless (setq *end-string-pos* (funcall end-string-test pos))
+              (return-from scan nil))
+            pos)))
+      (start-string-test
+        ;; just check for constant start string candidate
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type function start-string-test))
+            (unless (setq pos (funcall start-string-test pos))
+              (return-from scan nil))
+            pos)))
+      (starts-with-everything
+        ;; just advance POS with NEWLINE-SKIPPER
+        (insert-advance-fn
+          (advance-fn (pos)
+            (unless (setq pos (newline-skipper pos))
+              (return-from scan nil))
+            pos)))
+      (end-string-test
+        ;; just check for the next end string candidate if POS has
+        ;; advanced beyond the last one
+        (insert-advance-fn
+          (advance-fn (pos)
+            (declare (type function end-string-test))
+            (if (<= (the fixnum pos)
+                    (the fixnum *end-string-pos*))
+              (return-from advance-fn pos))
+            (unless (setq *end-string-pos* (funcall end-string-test pos))
+              (return-from scan nil))
+            pos)))
+      (t
+        ;; not enough optimization information about the regular
+        ;; expression to optimize so we just return POS
+        (insert-advance-fn
+          (advance-fn (pos)
+            pos))))))
--- a/specials.lisp
+++ b/specials.lisp
@ -0,0 +1,107 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/specials.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; globally declared special variables
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+;;; special variables used by the lexer/parser combo
+
+(defvar *extended-mode-p* nil
+  "Whether the parser will start in extended mode.")
+(declaim (type boolean *extended-mode-p*))
+
+;;; special variables used by the SCAN function and the matchers
+
+(defvar *string* ""
+  "The string which is currently scanned by SCAN.
+Will always be coerced to a SIMPLE-STRING.")
+(declaim (type simple-string *string*))
+
+(defvar *start-pos* 0
+  "Where to start scanning within *STRING*.")
+(declaim (type fixnum *start-pos*))
+
+(defvar *real-start-pos* nil
+  "The real start of *STRING*. This is for repeated scans and is only used internally.")
+(declaim (type (or null fixnum) *real-start-pos*))
+
+(defvar *end-pos* 0
+  "Where to stop scanning within *STRING*.")
+(declaim (type fixnum *end-pos*))
+
+(defvar *reg-starts* (make-array 0)
+  "An array which holds the start positions
+of the current register candidates.")
+(declaim (type simple-vector *reg-starts*))
+  
+(defvar *regs-maybe-start* (make-array 0)
+  "An array which holds the next start positions
+of the current register candidates.")
+(declaim (type simple-vector *regs-maybe-start*))
+
+(defvar *reg-ends* (make-array 0)
+  "An array which holds the end positions
+of the current register candidates.")
+(declaim (type simple-vector *reg-ends*))
+
+(defvar *end-string-pos* nil
+  "Start of the next possible end-string candidate.")
+
+(defvar *rep-num* 0
+  "Counts the number of \"complicated\" repetitions while the matchers
+are built.")
+(declaim (type fixnum *rep-num*))
+
+(defvar *zero-length-num* 0
+  "Counts the number of repetitions the inner regexes of which may
+have zero-length while the matchers are built.")
+(declaim (type fixnum *zero-length-num*))
+
+(defvar *repeat-counters* (make-array 0
+                                      :initial-element 0
+                                      :element-type 'fixnum)
+  "An array to keep track of how often
+repetitive patterns have been tested already.")
+(declaim (type (array fixnum (*)) *repeat-counters*))
+
+(defvar *last-pos-stores* (make-array 0)
+  "An array to keep track of the last positions
+where we saw repetitive patterns.
+Only used for patterns which might have zero length.")
+(declaim (type simple-vector *last-pos-stores*))
+
+(defvar *use-bmh-matchers* t
+  "Whether the scanners created by CREATE-SCANNER should use the \(fast
+but large) Boyer-Moore-Horspool matchers.")
+
+(defvar *allow-quoting* nil
+  "Whether the parser should support Perl's \\Q and \\E.")
+
+(pushnew :cl-ppcre *features*)
--- a/14287
+++ b/14287
--- a/3945
+++ b/3945
--- a/util.lisp
+++ b/util.lisp
@ -0,0 +1,278 @@
+;;; -*- Mode: LISP; Syntax: COMMON-LISP; Package: CL-PPCRE; Base: 10 -*-
+;;; $Header: /home/manuel/bknr-cvs/cvs/thirdparty/cl-ppcre/util.lisp,v 1.1 2004/06/23 08:27:10 hans Exp $
+
+;;; Utility functions and constants dealing with the hash-tables
+;;; we use to encode character classes
+
+;;; Hash-tables are treated like sets, i.e. a character C is a member of the
+;;; hash-table H iff (GETHASH C H) is true.
+
+;;; Copyright (c) 2002-2003, Dr. Edmund Weitz. All rights reserved.
+
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+
+;;;   * Redistributions of source code must retain the above copyright
+;;;     notice, this list of conditions and the following disclaimer.
+
+;;;   * Redistributions in binary form must reproduce the above
+;;;     copyright notice, this list of conditions and the following
+;;;     disclaimer in the documentation and/or other materials
+;;;     provided with the distribution.
+
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR 'AS IS' AND ANY EXPRESSED
+;;; OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+;;; WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+;;; ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
+;;; DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+;;; DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+;;; GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+;;; INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+;;; WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+;;; NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+;;; SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+(in-package #:cl-ppcre)
+
+(defmacro with-unique-names ((&rest bindings) &body body)
+  "Syntax: WITH-UNIQUE-NAMES ( { var | (var x) }* ) declaration* form*
+
+Executes a series of forms with each VAR bound to a fresh,
+uninterned symbol. The uninterned symbol is as if returned by a call
+to GENSYM with the string denoted by X - or, if X is not supplied, the
+string denoted by VAR - as argument.
+
+The variable bindings created are lexical unless special declarations
+are specified. The scopes of the name bindings and declarations do not
+include the Xs.
+
+The forms are evaluated in order, and the values of all but the last
+are discarded \(that is, the body is an implicit PROGN)."
+  ;; reference implementation posted to comp.lang.lisp as
+  ;; <cy3bshuf30f.fsf@ljosa.com> by Vebjorn Ljosa - see also
+  ;; <http://www.cliki.net/Common%20Lisp%20Utilities>
+  `(let ,(mapcar #'(lambda (binding)
+                     (check-type binding (or cons symbol))
+                     (if (consp binding)
+                       (destructuring-bind (var x) binding
+                         (check-type var symbol)
+                         `(,var (gensym ,(etypecase x
+                                          (symbol (symbol-name x))
+                                          (character (string x))
+                                          (string x)))))
+                       `(,binding (gensym ,(symbol-name binding)))))
+                 bindings)
+         ,@body))
+
+(defmacro rebinding (bindings &body body)
+  "REBINDING ( { var | (var prefix) }* ) form*
+
+Evaluates a series of forms in the lexical environment that is
+formed by adding the binding of each VAR to a fresh, uninterned
+symbol, and the binding of that fresh, uninterned symbol to VAR's
+original value, i.e., its value in the current lexical environment.
+
+The uninterned symbol is created as if by a call to GENSYM with the
+string denoted by PREFIX - or, if PREFIX is not supplied, the string
+denoted by VAR - as argument.
+
+The forms are evaluated in order, and the values of all but the last
+are discarded \(that is, the body is an implicit PROGN)."
+  ;; reference implementation posted to comp.lang.lisp as
+  ;; <cy3wv0fya0p.fsf@ljosa.com> by Vebjorn Ljosa - see also
+  ;; <http://www.cliki.net/Common%20Lisp%20Utilities>
+  (loop for binding in bindings
+        for var = (if (consp binding) (car binding) binding)
+        for name = (gensym)
+        collect `(,name ,var) into renames
+        collect ``(,,var ,,name) into temps
+        finally (return `(let ,renames
+                          (with-unique-names ,bindings
+                            `(let (,,@temps)
+                              ,,@body))))))
+
+(eval-when (:compile-toplevel :execute :load-toplevel)
+  (defvar *regex-char-code-limit* char-code-limit
+    "The upper exclusive bound on the char-codes of characters
+which can occur in character classes.
+Change this value BEFORE creating scanners if you don't need
+the full Unicode support of LW, ACL, or CLISP.")
+  (declaim (type fixnum *regex-char-code-limit*))
+  
+  (defun make-char-hash (test)
+    (declare (optimize speed space))
+    "Returns a hash-table of all characters satisfying test."
+    (loop with hash = (make-hash-table)
+          for c of-type fixnum from 0 below char-code-limit
+          for chr = (code-char c)
+          if (and chr (funcall test chr))
+            do (setf (gethash chr hash) t)
+          finally (return hash)))
+
+  (declaim (inline word-char-p))
+  
+  (defun word-char-p (chr)
+    (declare (optimize speed
+                       (safety 0)
+                       (space 0)
+                       (debug 0)
+                       (compilation-speed 0)
+                       #+:lispworks (hcl:fixnum-safety 0)))
+    "Tests whether a character is a \"word\" character.
+In the ASCII charset this is equivalent to a-z, A-Z, 0-9, or _,
+i.e. the same as Perl's [\\w]."
+    (or (alphanumericp chr)
+        (char= chr #\_)))
+
+  (unless (boundp '+whitespace-char-string+)
+    (defconstant +whitespace-char-string+
+      (coerce
+       '(#\Space #\Tab #\Linefeed #\Return #\Page)
+       'string)
+      "A string of all characters which are considered to be whitespace.
+Same as Perl's [\\s]."))
+
+  (defun whitespacep (chr)
+    (declare (optimize speed space))
+    "Tests whether a character is whitespace,
+i.e. whether it would match [\\s] in Perl."
+    (find chr +whitespace-char-string+ :test #'char=)))
+
+;; the following DEFCONSTANT statements are wrapped with
+;; (UNLESS (BOUNDP ...) ...) to make SBCL happy
+
+(unless (boundp '+digit-hash+)
+  (defconstant +digit-hash+
+    (make-char-hash (lambda (chr) (char<= #\0 chr #\9)))
+    "Hash-table containing the digits from 0 to 9."))
+
+(unless (boundp '+word-char-hash+)
+  (defconstant +word-char-hash+
+    (make-char-hash #'word-char-p)
+    "Hash-table containing all \"word\" characters."))
+
+(unless (boundp '+whitespace-char-hash+)
+  (defconstant +whitespace-char-hash+
+    (make-char-hash #'whitespacep)
+    "Hash-table containing all whitespace characters."))
+
+(defun merge-hash (hash1 hash2)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns the \"sum\" of two hashes. This is a destructive operation
+on HASH1."
+  (cond ((> (hash-table-count hash2)
+            *regex-char-code-limit*)
+          ;; don't walk through, e.g., the whole +WORD-CHAR-HASH+ if
+          ;; the user has set *REGEX-CHAR-CODE-LIMIT* to a lower value
+          (loop for c of-type fixnum from 0 below *regex-char-code-limit*
+                for chr = (code-char c)
+                if (and chr (gethash chr hash2))
+                do (setf (gethash chr hash1) t)))
+        (t
+          (loop for chr being the hash-keys of hash2
+                do (setf (gethash chr hash1) t))))
+  hash1)
+
+(defun merge-inverted-hash (hash1 hash2)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Returns the \"sum\" of HASH1 and the \"inverse\" of HASH2. This is
+a destructive operation on HASH1."
+  (loop for c of-type fixnum from 0 below *regex-char-code-limit*
+        for chr = (code-char c)
+        if (and chr (not (gethash chr hash2)))
+          do (setf (gethash chr hash1) t))
+  hash1)
+
+(defun create-ranges-from-hash (hash &key downcasep)
+  (declare (optimize speed
+                     (safety 0)
+                     (space 0)
+                     (debug 0)
+                     (compilation-speed 0)
+                     #+:lispworks (hcl:fixnum-safety 0)))
+  "Tries to identify up to three intervals (with respect to CHAR<)
+which together comprise HASH. Returns NIL if this is not possible.
+If DOWNCASEP is true it will treat the hash-table as if it represents
+both the lower-case and the upper-case variants of its members and
+will only return the respective lower-case intervals."
+  ;; discard empty hash-tables
+  (unless (and hash (plusp (hash-table-count hash)))
+    (return-from create-ranges-from-hash nil))
+  (loop with min1 and min2 and min3
+        and max1 and max2 and max3
+        ;; loop through all characters in HASH, sorted by CHAR<
+        for chr in (sort (the list
+                           (loop for chr being the hash-keys of hash
+                                 collect (if downcasep
+                                           (char-downcase chr)
+                                           chr)))
+		         #'char<)
+        for code = (char-code chr)
+        ;; MIN1, MAX1, etc. are _exclusive_
+        ;; bounds of the intervals identified so far
+        do (cond
+             ((not min1)
+               ;; this will only happen once, for the first character
+               (setq min1 (1- code)
+                     max1 (1+ code)))
+             ((<= (the fixnum min1) code (the fixnum max1))
+               ;; we're here as long as CHR fits into the first interval
+               (setq min1 (min (the fixnum min1) (1- code))
+                     max1 (max (the fixnum max1) (1+ code))))
+             ((not min2)
+               ;; we need to open a second interval
+               ;; this'll also happen only once
+               (setq min2 (1- code)
+                     max2 (1+ code)))
+             ((<= (the fixnum min2) code (the fixnum max2))
+               ;; CHR fits into the second interval
+               (setq min2 (min (the fixnum min2) (1- code))
+                     max2 (max (the fixnum max2) (1+ code))))
+             ((not min3)
+               ;; we need to open the third interval
+               ;; happens only once
+               (setq min3 (1- code)
+                     max3 (1+ code)))
+             ((<= (the fixnum min3) code (the fixnum max3))
+               ;; CHR fits into the third interval
+               (setq min3 (min (the fixnum min3) (1- code))
+                     max3 (max (the fixnum max3) (1+ code))))
+             (t
+               ;; we're out of luck, CHR doesn't fit
+               ;; into one of the three intervals
+               (return nil)))
+        ;; on success return all bounds
+        ;; make them inclusive bounds before returning
+        finally (return (values (code-char (1+ min1))
+                                (code-char (1- max1))
+                                (and min2 (code-char (1+ min2)))
+                                (and max2 (code-char (1- max2)))
+                                (and min3 (code-char (1+ min3)))
+                                (and max3 (code-char (1- max3)))))))
+
+(defmacro maybe-coerce-to-simple-string (string)
+  (with-unique-names (=string=)
+    `(let ((,=string= ,string))
+      (cond ((simple-string-p ,=string=)
+              ,=string=)
+            (t
+              (coerce ,=string= 'simple-string))))))
+
+(declaim (inline nsubseq))
+(defun nsubseq (sequence start &optional (end (length sequence)))
+  "Return a subsequence by pointing to location in original sequence."
+  (make-array (- end start)
+              :element-type (array-element-type sequence)
+              :displaced-to sequence
+              :displaced-index-offset start))