r5331: *** empty log message ***
This commit is contained in:
21
README
21
README
@ -1,19 +1,22 @@
|
||||
PURI - Portable URI Library
|
||||
|
||||
Kevin Rosenberg <kevin@rosenberg.net>
|
||||
Franz, Inc <http://www.franz.com>
|
||||
|
||||
Kevin Rosenberg <kevin@rosenberg.net>
|
||||
|
||||
This is portable Universal Resource Identifier library for Common Lisp
|
||||
programs. It parses URI according to the RFC 2396 specification. It's
|
||||
is based on Franz, Inc's opensource URI package and hash been
|
||||
ported to work other CL implementations. It is licensed with the
|
||||
LLGPL as include in this distribution.
|
||||
is based on Franz, Inc's opensource URI package and has been ported to
|
||||
work other CL implementations. It is licensed under the LLGPL which
|
||||
is included in this distribution.
|
||||
|
||||
A regression package is include which uses Franz's open-source
|
||||
tester library. I've also ported that library for use on other
|
||||
CL implementations. Puri completes 126/126 regression tests
|
||||
successfully.
|
||||
A regression suite is included which uses Franz's open-source tester
|
||||
library. I've ported that library for use on other CL
|
||||
implementations. Puri completes 126/126 regression tests successfully.
|
||||
|
||||
Franz's unmodified documentation file is included in the file
|
||||
uri.html. The only divergence in usage between Puri and Franz's
|
||||
package is that Puri's symbols are located in the package PURI while
|
||||
Franz's original uses the package NET.URI.
|
||||
|
||||
Puri home: http://files.b9.com/puri/
|
||||
Portable tester home: http://files.b9.com/tester/
|
||||
|
||||
6
debian/changelog
vendored
6
debian/changelog
vendored
@ -1,3 +1,9 @@
|
||||
cl-puri (1.2.3-1) unstable; urgency=low
|
||||
|
||||
* Include uri.html documentation file.
|
||||
|
||||
-- Kevin M. Rosenberg <kmr@debian.org> Fri, 18 Jul 2003 19:57:03 -0600
|
||||
|
||||
cl-puri (1.2.2-1) unstable; urgency=low
|
||||
|
||||
* Improve tests.lisp
|
||||
|
||||
2
debian/rules
vendored
2
debian/rules
vendored
@ -51,7 +51,7 @@ binary-indep: build install
|
||||
binary-arch: build install
|
||||
dh_testdir
|
||||
dh_testroot
|
||||
dh_installdocs README
|
||||
dh_installdocs README uri.html
|
||||
dh_installchangelogs
|
||||
dh_strip
|
||||
dh_compress
|
||||
|
||||
408
uri.html
Normal file
408
uri.html
Normal file
@ -0,0 +1,408 @@
|
||||
<html>
|
||||
|
||||
<head>
|
||||
<title>URI support in Allegro CL</title>
|
||||
</head>
|
||||
|
||||
<body>
|
||||
|
||||
<h1>URI support in Allegro CL</h1>
|
||||
|
||||
<p>This document contains the following sections:</p>
|
||||
<a href="#uri-intro-1">
|
||||
|
||||
<p>1.0 Introduction</a><br>
|
||||
<a href="#uri-api-1">2.0 The URI API definition</a><br>
|
||||
<a href="#parsing-decoding-1">3.0 Parsing, escape decoding/encoding and the path</a><br>
|
||||
<a href="#interning-uris-1">4.0 Interning URIs</a><br>
|
||||
<a href="#acl-implementation-1">5.0 Allegro CL implementation notes</a><br>
|
||||
<a href="#examples-1">6.0 Examples</a><br>
|
||||
</p>
|
||||
|
||||
<p>This version of the Allegro CL URI support documentation is for distribution with the
|
||||
Open Source version of the URI code. Links to Allegro CL documentation other than
|
||||
URI-specific files have been supressed. To see Allegro CL documentation, see <a
|
||||
href="http://www.franz.com/support/documentation/">http://www.franz.com/support/documentation/</a>,
|
||||
which is the Allegro CL documentation page of the franz inc. website. Links to Allegro CL
|
||||
documentation can be found on that page. </p>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="uri-intro-1">1.0 Introduction</a></h2>
|
||||
|
||||
<p><em>URI</em> stands for <em>Universal Resource Identifier</em>. For a description of
|
||||
URIs, see RFC2396, which can be found in several places, including the IETF web site (<a
|
||||
href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</a>) and
|
||||
the UCI/ICS web site (<a href="http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt">http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt</a>).
|
||||
We prefer the UCI/ICS one as it has more examples. </p>
|
||||
|
||||
<p>URIs are a superset in functionality and syntax to URLs (Universal Resource Locators)
|
||||
and URNs (Universal Resource Names). That is, RFC2396 updates and merges RFC1738 and
|
||||
RFC1808 into a single syntax, called the URI. It does exclude some portions of RFC1738
|
||||
that define specific syntax of individual URL schemes. </p>
|
||||
|
||||
<p>In URL slang, the <em>scheme</em> is usually called the `protocol', but it is called
|
||||
scheme in RFC1738. A URL `host' corresponds to the URI `authority.' The URL slang
|
||||
`bookmark' or `anchor' is `fragment' in URI lingo. </p>
|
||||
|
||||
<p>The URI facility was available as a patch to Allegro CL 5.0.1 and is included with
|
||||
release 6.0. the URI facility might not be in an Allegro CL image. Evaluate <code>(require
|
||||
:uri)</code> to ensure the facility is loaded (that form returns <code>nil</code> if the
|
||||
URI module is already loaded). </p>
|
||||
|
||||
<p>Broadly, the URI facility creates a Lisp object that represents a URI, and provides
|
||||
setters and accessors to fields in the URI object. The URI object can also be interned,
|
||||
much like symbols in CL are. This document describes the facility and the related
|
||||
operators. </p>
|
||||
|
||||
<p>Aside from the obvious slots which are called out in the RFC, URIs also have a property
|
||||
list. With interning, this is another similarity between URIs and CL symbols. </p>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="uri-api-1">2.0 The URI API definition</a></h2>
|
||||
|
||||
<p>Symbols naming objects (functions, variables, etc.) in the <em>uri</em> module are
|
||||
exported from the <code>net.uri</code> package. </p>
|
||||
|
||||
<p>URIs are represented by CLOS objects. Their slots are: </p>
|
||||
|
||||
<pre>
|
||||
scheme
|
||||
host
|
||||
port
|
||||
path
|
||||
query
|
||||
fragment
|
||||
plist
|
||||
</pre>
|
||||
|
||||
<p>The <code>host</code> and <code>port</code> slots together correspond to the <code>authority</code>
|
||||
(see RFC2396). There is an accessor-like function, <a href="operators/uri-authority.htm"><b>uri-authority</b></a>,
|
||||
that can be used to extract the authority from a URI. See the RFC2396 specifications
|
||||
pointed to at the beginning of the <a href="#uri-intro-1">1.0 Introduction</a> for details
|
||||
of all the slots except <code>plist</code>. The <code>plist</code> slot contains a
|
||||
standard Common Lisp property list. </p>
|
||||
|
||||
<p>All symbols are external in the <code>net.uri</code> package, unless otherwise noted.
|
||||
Brief descriptions are given in this document, with complete descriptions in the
|
||||
individual pages.
|
||||
|
||||
<ul>
|
||||
<li><a href="classes/uri.htm"><code>uri</code></a>: the class of URI objects. </li>
|
||||
<li><a href="classes/urn.htm"><code>urn</code></a>: the class of URN objects. </li>
|
||||
<li><a href="operators/uri-p.htm"><b>uri-p</b></a> <p><b>Arguments: </b><i>object</i></p>
|
||||
<p>Returns true if <i>object</i> is an instance of class <a href="classes/uri.htm"><code>uri</code></a>.
|
||||
</p>
|
||||
</li>
|
||||
<li><a href="operators/copy-uri.htm"><b>copy-uri</b></a> <p><b>Arguments: </b><i>uri </i>&key
|
||||
<i>place scheme host port path query fragment plist </i></p>
|
||||
<p>Copies the specified URI object. See the description page for information on the
|
||||
keyword arguments. </p>
|
||||
</li>
|
||||
<li><a href="operators/uri-scheme.htm"><b>uri-scheme</b></a><br>
|
||||
<a href="operators/uri-host.htm"><b>uri-host</b></a><br>
|
||||
<a href="operators/uri-port.htm"><b>uri-port</b></a><br>
|
||||
<a href="operators/uri-path.htm"><b>uri-path</b></a><br>
|
||||
<a href="operators/uri-query.htm"><b>uri-query</b></a><br>
|
||||
<a href="operators/uri-fragment.htm"><b>uri-fragment</b></a><br>
|
||||
<a href="operators/uri-plist.htm"><b>uri-plist</b></a><br>
|
||||
<p><b>Arguments: </b><i>uri-object </i></p>
|
||||
<p>These accessors return the value of the associated slots of the <i>uri-object</i> </p>
|
||||
</li>
|
||||
<li><a href="operators/uri-authority.htm"><b>uri-authority</b></a> <p><b>Arguments: </b><i>uri-object
|
||||
</i></p>
|
||||
<p>Returns the authority of <i>uri-object</i>. The authority combines the host and port. </p>
|
||||
</li>
|
||||
<li><a href="operators/render-uri.htm"><b>render-uri</b></a> <p><b>Arguments: </b><i>uri
|
||||
stream </i></p>
|
||||
<p>Print to <i>stream</i> the printed representation of <i>uri</i>. </p>
|
||||
</li>
|
||||
<li><a href="operators/parse-uri.htm"><b>parse-uri</b></a> <p><b>Arguments: </b><i>string </i>&key
|
||||
(<i>class</i> 'uri)<i> </i></p>
|
||||
<p>Parse <i>string</i> into a URI object. </p>
|
||||
</li>
|
||||
<li><a href="operators/merge-uris.htm"><b>merge-uris</b></a> <p><b>Arguments: </b><i>uri
|
||||
base-uri </i>&optional <i>place </i></p>
|
||||
<p>Return an absolute URI, based on <i>uri</i>, which can be relative, and <i>base-uri</i>
|
||||
which must be absolute. </p>
|
||||
</li>
|
||||
<li><a href="operators/enough-uri.htm"><b>enough-uri</b></a> <p><b>Arguments: </b><i>uri
|
||||
base </i></p>
|
||||
<p>Converts <i>uri</i> into a relative URI using <i>base</i> as the base URI. </p>
|
||||
</li>
|
||||
<li><a href="operators/uri-parsed-path.htm"><b>uri-parsed-path</b></a> <p><b>Arguments: </b><i>uri
|
||||
</i></p>
|
||||
<p>Return the parsed representation of the path. </p>
|
||||
</li>
|
||||
<li><a href="operators/uri.htm"><b>uri</b></a> <p><b>Arguments: </b><i>object </i></p>
|
||||
<p>Defined methods: if argument is a uri object, return it; create a uri object if
|
||||
possible and return it, or error if not possible. </p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="parsing-decoding-1">3.0 Parsing, escape decoding/encoding and the path</a></h2>
|
||||
|
||||
<p>The method <a href="operators/uri-path.htm"><b>uri-path</b></a> returns the path
|
||||
portion of the URI, in string form. The method <a href="operators/uri-parsed-path.htm"><b>uri-parsed-path</b></a>
|
||||
returns the path portion of the URI, in list form. This list form is discussed below,
|
||||
after a discussion of decoding/encoding. </p>
|
||||
|
||||
<p>RFC2396 lays out a method for inserting into URIs <em>reserved characters</em>. You do
|
||||
this by escaping the character. An <em>escaped</em> character is defined like this: </p>
|
||||
|
||||
<pre>
|
||||
escaped = "%" hex hex
|
||||
|
||||
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"
|
||||
</pre>
|
||||
|
||||
<p>In addition, the RFC defines excluded characters: </p>
|
||||
|
||||
<pre>
|
||||
"<" | ">" | "#" | "%" | <"> | "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
|
||||
</pre>
|
||||
|
||||
<p>The set of reserved characters are: </p>
|
||||
|
||||
<pre>
|
||||
";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
|
||||
</pre>
|
||||
|
||||
<p>with the following exceptions:
|
||||
|
||||
<ul>
|
||||
<li>within the authority component, the characters ";", ":",
|
||||
"@", "?", and "/" are reserved. </li>
|
||||
<li>within a path segment, the characters "/", ";", "=", and
|
||||
"?" are reserved. </li>
|
||||
<li>within a query component, the characters ";", "/", "?",
|
||||
":", "@", "&", "=", "+",
|
||||
",", and "$" are reserved. </li>
|
||||
</ul>
|
||||
|
||||
<p>From the RFC, there are two important rules about escaping and unescaping (encoding and
|
||||
decoding):
|
||||
|
||||
<ul>
|
||||
<li>decoding should only happen when the URI is parsed into component parts;</li>
|
||||
<li>encoding can only occur when a URI is made from component parts (ie, rendered for
|
||||
printing). </li>
|
||||
</ul>
|
||||
|
||||
<p>The implication of this is that to decode the URI, it must be in a parsed state. That
|
||||
is, you can't convert <font face="Courier New">%2f</font> (the escaped form of
|
||||
"/") until the path has been parsed into its component parts. Another important
|
||||
desire is for the application viewing the component parts to see the decoded values of the
|
||||
components. For example, consider: </p>
|
||||
|
||||
<pre>
|
||||
http://www.franz.com/calculator/3%2f2
|
||||
</pre>
|
||||
|
||||
<p>This might be the implementation of a calculator, and how someone would execute 3/2.
|
||||
Clearly, the application that implements this would want to see path components of
|
||||
"calculator" and "3/2". "3%2f2" would not be useful to the
|
||||
calculator application. </p>
|
||||
|
||||
<p>For the reasons given above, a parsed version of the path is available and has the
|
||||
following form: </p>
|
||||
|
||||
<pre>
|
||||
([:absolute | :relative] component1 [component2...])
|
||||
</pre>
|
||||
|
||||
<p>where components are: </p>
|
||||
|
||||
<pre>
|
||||
element | (element param1 [param2 ...])
|
||||
</pre>
|
||||
|
||||
<p>and <em>element</em> is a path element, and the param's are path element parameters.
|
||||
For example, the result of </p>
|
||||
|
||||
<pre>
|
||||
(uri-parsed-path (parse-uri "foo;10/bar:x;y;z/baz.htm"))
|
||||
</pre>
|
||||
|
||||
<p>is </p>
|
||||
|
||||
<pre>
|
||||
(:relative ("foo" "10") ("bar:x" "y" "z") "baz.htm")
|
||||
</pre>
|
||||
|
||||
<p>There is a certain amount of canonicalization that occurs when parsing:
|
||||
|
||||
<ul>
|
||||
<li>A path of <code>(:absolute)</code> or <code>(:absolute "")</code> is
|
||||
equivalent to a <code>nil</code> path. That is, <code>http://a/</code> is parsed with a <code>nil</code>
|
||||
path and printed as <code>http://a</code>. </li>
|
||||
<li>Escaped characters that are not reserved are not escaped upon printing. For example, <code>"foob%61r"</code>
|
||||
is parsed into <code>"foobar"</code> and appears as <code>"foobar"</code>
|
||||
when the URI is printed. </li>
|
||||
</ul>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="interning-uris-1">4.0 Interning URIs</a></h2>
|
||||
|
||||
<p>This section describes how to intern URIs. Interning is not mandatory. URIs can be used
|
||||
perfectly well without interning them. </p>
|
||||
|
||||
<p>Interned URIs in Allegro are like symbols. That is, a string representing a URI, when
|
||||
parsed and interned, will always yield an <strong>eq</strong> object. For example: </p>
|
||||
|
||||
<pre>
|
||||
(eq (intern-uri "http://www.franz.com")
|
||||
(intern-uri "http://www.franz.com"))
|
||||
</pre>
|
||||
|
||||
<p>is always true. (Two strings with identical contents may or may not be <strong>eq</strong>
|
||||
in Common Lisp, note.) </p>
|
||||
|
||||
<p>The functions associated with interning are:
|
||||
|
||||
<ul>
|
||||
<li><a href="operators/make-uri-space.htm"><b>make-uri-space</b></a> <p><b>Arguments: </b>&key
|
||||
<i>size </i></p>
|
||||
<p>Make a new hash-table object to contain interned URIs. </p>
|
||||
</li>
|
||||
<li><a href="operators/uri-space.htm"><b>uri-space</b></a> <p><b>Arguments: </b></p>
|
||||
<p>Return the object into which URIs are currently being interned. </p>
|
||||
</li>
|
||||
<li><a href="operators/uri_eq.htm"><b>uri=</b></a> <p><b>Arguments: </b><i>uri1 uri2 </i></p>
|
||||
<p>Returns true if <i>uri1</i> and <i>uri2</i> are equivalent. </p>
|
||||
</li>
|
||||
<li><a href="operators/intern-uri.htm"><b>intern-uri</b></a> <p><b>Arguments: </b><i>uri-name
|
||||
</i>&optional <i>uri-space </i></p>
|
||||
<p>Intern the uri object specified in the uri-space specified. Methods exist for strings
|
||||
and uri objects. </p>
|
||||
</li>
|
||||
<li><a href="operators/unintern-uri.htm"><b>unintern-uri</b></a> <p><b>Arguments: </b><i>uri
|
||||
</i>&optional <i>uri-space </i></p>
|
||||
<p>Unintern the uri object specified or all uri objects (in <i>uri-space</i> if specified)
|
||||
if <i>uri</i> is <code>t</code>. </p>
|
||||
</li>
|
||||
<li><a href="operators/do-all-uris.htm"><b>do-all-uris</b></a> <p><b>Arguments: </b><i>(var </i>&optional
|
||||
<i>uri-space result) </i>&body <i>body </i></p>
|
||||
<p>Bind <i>var</i> to all currently defined uris (in <i>uri-space</i> if specified) and
|
||||
evaluate <i>body</i>. </p>
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="acl-implementation-1">5.0 Allegro CL implementation notes</a></h2>
|
||||
|
||||
<ol>
|
||||
<li>The following are true: <br>
|
||||
<code>(uri= (parse-uri "http://www.franz.com/")</code> <br>
|
||||
<code>(parse-uri "http://www.franz.com"))</code> <br>
|
||||
<code>(eq (intern-uri "http://www.franz.com/")</code> <br>
|
||||
<code>(intern-uri "http://www.franz.com"))</code><br>
|
||||
</li>
|
||||
<li>The following is true: <br>
|
||||
<code>(eq (intern-uri "http://www.franz.com:80/foo/bar.htm")</code> <br>
|
||||
<code>(intern-uri "http://www.franz.com/foo/bar.htm"))</code><br>
|
||||
(I.e. specifying the default port is the same as specifying no port at all. This is
|
||||
specific in RFC2396.) </li>
|
||||
<li>The <em>scheme</em> and <em>authority</em> are case-insensitive. In Allegro CL, the
|
||||
scheme is a keyword that appears in the normal case for the Lisp in which you are
|
||||
executing. </li>
|
||||
<li><code>#u"..."</code> is shorthand for <code>(parse-uri "...")</code>
|
||||
but if an existing <code>#u</code> dispatch macro definition exists, it will not be
|
||||
overridden. </li>
|
||||
<li>The interaction between setting the scheme, host, port, path, query, and fragment slots
|
||||
of URI objects, in conjunction with interning URIs will have very bad and unpredictable
|
||||
results. </li>
|
||||
<li>The printable representation of URIs is cached, for efficiency. This caching is undone
|
||||
when the above slots are changed. That is, when you create a URI the printed
|
||||
representation is cached. When you change one of the above mentioned slots, the printed
|
||||
representation is cleared and calculated when the URI is next printed. For example: </li>
|
||||
</ol>
|
||||
|
||||
<pre>
|
||||
user(10): (setq u #u"http://foo.bar.com/foo/bar")
|
||||
#<uri http://foo.bar.com/foo/bar>
|
||||
user(11): (setf (net.uri:uri-host u) "foo.com")
|
||||
"foo.com"
|
||||
user(12): u
|
||||
#<uri http://foo.com/foo/bar>
|
||||
user(13):
|
||||
</pre>
|
||||
|
||||
<p>This allows URIs behavior to follow the principle of least surprise. </p>
|
||||
|
||||
<hr>
|
||||
|
||||
<hr>
|
||||
|
||||
<h2><a name="examples-1">6.0 Examples</a></h2>
|
||||
|
||||
<pre>
|
||||
uri(10): (use-package :net.uri)
|
||||
t
|
||||
uri(11): (parse-uri "foo")
|
||||
#<uri foo>
|
||||
uri(12): #u"foo"
|
||||
#<uri foo>
|
||||
uri(13): (setq base (intern-uri "http://www.franz.com/foo/bar/"))
|
||||
#<uri http://www.franz.com/foo/bar/>
|
||||
uri(14): (merge-uris (parse-uri "foo.htm") base)
|
||||
#<uri http://www.franz.com/foo/bar/foo.htm>
|
||||
uri(15): (merge-uris (parse-uri "?foo") base)
|
||||
#<uri http://www.franz.com/foo/bar/?foo>
|
||||
uri(16): (setq base (intern-uri "http://www.franz.com/foo/bar/baz.htm"))
|
||||
#<uri http://www.franz.com/foo/bar/baz.htm>
|
||||
uri(17): (merge-uris (parse-uri "foo.htm") base)
|
||||
#<uri http://www.franz.com/foo/bar/foo.htm>
|
||||
uri(18): (merge-uris #u"?foo" base)
|
||||
#<uri http://www.franz.com/foo/bar/?foo>
|
||||
uri(19): (describe #u"http://www.franz.com")
|
||||
#<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
|
||||
The following slots have :instance allocation:
|
||||
scheme :http
|
||||
host "www.franz.com"
|
||||
port nil
|
||||
path nil
|
||||
query nil
|
||||
fragment nil
|
||||
plist nil
|
||||
escaped nil
|
||||
string "http://www.franz.com"
|
||||
parsed-path nil
|
||||
hashcode nil
|
||||
uri(20): (describe #u"http://www.franz.com/")
|
||||
#<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
|
||||
The following slots have :instance allocation:
|
||||
scheme :http
|
||||
host "www.franz.com"
|
||||
port nil
|
||||
path nil
|
||||
query nil
|
||||
fragment nil
|
||||
plist nil
|
||||
escaped nil
|
||||
string "http://www.franz.com"
|
||||
parsed-path nil
|
||||
hashcode nil
|
||||
uri(21): #u"foobar#baz%23xxx"
|
||||
#<uri foobar#baz#xxx>
|
||||
</pre>
|
||||
|
||||
<p><small>Copyright (c) 1998-2001, Franz Inc. Berkeley, CA., USA. All rights reserved.
|
||||
Created 2001.8.16.</small></p>
|
||||
</body>
|
||||
</html>
|
||||
Reference in New Issue
Block a user