Create 1.3 release series.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.3@590 48356398-32a2-884e-a903-53898d9a118a
Release 1.3.0 (bumped TODO items)
2025-08-04 05:07:55 +02:00 · 2006-11-26 23:30:22 +00:00 · 2006-11-26 23:21:19 +00:00 · 2006-11-26 23:18:32 +00:00 · 2006-11-26 23:14:12 +00:00 · 2006-11-26 00:46:57 +00:00
54 changed files with 2178 additions and 692 deletions
--- a/2
+++ b/2
@@ -4,7 +4,7 @@
 # Project related configuration options
 #---------------------------------------------------------------------------
 PROJECT_NAME           = HTML Purifier
-PROJECT_NUMBER         = 1.2.0
+PROJECT_NUMBER         = 1.3.0
 OUTPUT_DIRECTORY       = "C:/Documents and Settings/Edward/My Documents/My Webs/htmlpurifier/docs/doxygen"
 CREATE_SUBDIRS         = NO
 OUTPUT_LANGUAGE        = English
--- a/31
+++ b/31
@@ -9,6 +9,37 @@ NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
    . Internal change
 ==========================

+1.3.0, released 2006-11-26
+# Invalid images are now removed, rather than replaced with a dud
+  <img src="" alt="Invalid image" />. Previous behavior can be restored
+  with new directive %Core.RemoveInvalidImg set to false.
+! (X)HTML Strict now supported
+  + Transparently handles inline elements in block context (blockquote)
+! Added GET method to demo for easier validation, added 50kb max input size
+! New directive %HTML.BlockWrapper, for block-ifying inline elements
+! New directive %HTML.Parent, allows you to only allow inline content
+! New directives %HTML.AllowedElements and %HTML.AllowedAttributes to let
+  users narrow the set of allowed tags
+! <li value="4"> and <ol start="2"> now allowed in loose mode
+! New directives %URI.DisableExternalResources and %URI.DisableResources
+! New directive %Attr.DisableURI, which eliminates all hyperlinking
+! New directive %URI.Munge, munges URI so you can use some sort of redirector
+  service to avoid PageRank leaks or warn users that they are exiting your site.
+! Added spiffy new smoketest printDefinition.php, which lets you twiddle with
+  the configuration settings and see how the internal rules are affected.
+! New directive %URI.HostBlacklist for blocking links to bad hosts.
+  xssAttacks.php smoketest updated accordingly.
+- Added missing type to ChildDef_Chameleon
+- Remove Tidy option from demo if Tidy is not available
+. ChildDef_Required guards against empty tags
+. Lookup table HTMLDefinition->info_flow_elements added
+. Added peace-of-mind variable initialization to Strategy_FixNesting
+. Added HTMLPurifier->info_parent_def, parent child processing made special
+. Added internal documents briefly summarizing future progression of HTML
+. HTMLPurifier_Config->getBatch($namespace) added
+. More lenient casting to bool from string in HTMLPurifier_ConfigSchema
+. Refactored ChildDef classes into their own files
+
 1.2.0, released 2006-11-19
 # ID attributes now disabled by default. New directives:
  + %HTML.EnableAttrID - restores old behavior by allowing IDs
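
Usage note for the 1.3.0 directives listed above: a configuration fragment, offered as a sketch rather than as documented usage. The three-argument `$config->set(namespace, directive, value)` form, `createDefault()`, `purify()` and the include path are taken from docs/examples/demo.php in this same commit. The directive names and value types come from the NEWS entries and the `HTMLPurifier_ConfigSchema::define()` calls in this diff. The host names and the sample input are made up for illustration.

```php
<?php
// Include path as used by docs/examples/demo.php; adjust to your own layout.
require_once '../../library/HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Restore the pre-1.3.0 behaviour of replacing invalid images with a placeholder.
$config->set('Core', 'RemoveInvalidImg', false);

// Ask for XHTML 1.0 Strict output (the demo toggles this directive the same way).
$config->set('HTML', 'Strict', true);

// Forbid embedded resources (such as images) that live on other hosts.
$config->set('URI', 'DisableExternalResources', true);

// Send browsable links through a redirector; %s receives the url-encoded URI.
$config->set('URI', 'Munge', 'http://www.google.com/url?q=%s');

// Reject URIs whose host contains any of these strings (hypothetical hosts).
$config->set('URI', 'HostBlacklist', array('spam.example', 'bad.example'));

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<p>Hello <a href="http://spam.example/">world</a></p>');
```

Because %URI.DisableExternalResources compares against %URI.Host (see the AttrDef/URI.php hunks), leaving %URI.Host unset makes every absolute embedded URI fail validation.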
--- a/53
+++ b/53
@@ -1,45 +1,61 @@

 TODO List

-1.2 release
- - Make URI validation routines tighter (especially mailto)
- - More extensive URI filtering schemes (see docs/proposal-new-directives.txt)
- - Allow for background-image and list-style-image (see above)
- - Error logging for filtering/cleanup procedures
- - Rich set* methods and config file loaders for HTMLPurifier_Config
+= KEY ====================
+    # Flagship
+    - Regular
+    ? At-risk
+==========================

-1.3 release
- - Add various "levels" of cleaning
-    - Related: Allow strict (X)HTML
+1.4 release
+ # More extensive URI filtering schemes (see docs/proposal-new-directives.txt)
+ # Allow for background-image and list-style-image (intrinsically tied to above)
+ - Aggressive caching
+ ? Rich set* methods and config file loaders for HTMLPurifier_Config
+ ? Configuration profiles: sets of directives that get set with one func call
+ ? ConfigSchema directive aliases (so we can rename some of them)
+ ? URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX)
+
+1.5 release
+ # Error logging for filtering/cleanup procedures
+    - Requires I18N facilities to be created first (COMPLEX)
+
+1.6 release
+ # Add pre-packaged "levels" of cleaning (custom behavior already done)
 - More fine-grained control over escaping behavior
    - Silently drop content inbetween SCRIPT tags (can be generalized to allow
      specification of elements that, when detected as foreign, trigger removal
      of children, although unbalanced tags could wreck havoc (or at least
      delete the rest of the document)).

-1.4 release
- - Additional support for poorly written HTML
-    - Implement all non-essential attribute transforms
-    - Microsoft Word HTML cleaning (i.e. MsoNormal)
+1.7 release
+ # Additional support for poorly written HTML
+    - Implement all non-essential attribute transforms (BIG!)
+    - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!)
+    - Friendly strict handling of <address> (block -> <br>)

 2.0 release
- - Formatters for plaintext
+ # Formatters for plaintext (COMPLEX)
    - Auto-paragraphing (be sure to leverage fact that we know when things
      shouldn't be paragraphed, such as lists and tables).
    - Linkify URLs
    - Smileys
-    - Linkification for HTML Purifier docs: notably configuration and
-      class names
+    - Linkification for HTML Purifier docs: notably configuration and classes

 3.0 release
- - Extended HTML capabilities based on namespacing and tag transforms
+ - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
    - Hooks for adding custom processors to custom namespaced tags and
      attributes, offer default implementation
    - Lots of documentation and samples
+ - XHTML 1.1 support

 Ongoing
 - Lots of profiling, make it faster!
- - Plugins for major CMSes (very tricky issue)
+ - Plugins for major CMSes (COMPLEX)
+    - Drupal
+    - WordPress
+    - eFiction
+    - more! (look for ones that use WYSIWYGs)

 Unknown release (on a scratch-an-itch basis)
 - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
@@ -50,6 +66,7 @@ Unknown release (on a scratch-an-itch basis)
 - Append something to duplicate IDs so they're still usable (impl. note: the
   dupe detector would also need to detect the suffix as well)
 - Have 'lang' attribute be checked against official lists
+ - Docs on how to embed YouTube videos (and friends) without patches

 Encoding workarounds
 - Non-lossy dumb alternate character encoding transformations, achieved by
--- a/docs/dev-code-quality.html
+++ b/docs/dev-code-quality.html
@@ -22,6 +22,8 @@ of code that should be aggressively refactored.  This does not list
 optimization issues, that needs to be done after intense profiling.</p>

 <pre>
+docs/examples/demo.php - ad hoc HTML/PHP soup to the extreme
+
 AttrDef
    Class - doesn't support Unicode characters (fringe); uses regular
        expressions
@@ -32,7 +34,8 @@ AttrDef
    Number - constructor interface inconsistent with Integer
 ConfigSchema - redefinition is a mess
 Strategy
-    FixNesting - cannot bubble nodes out of structures
+    FixNesting - cannot bubble nodes out of structures, duplicated checks
+        for special-case parent node
    MakeWellFormed - insufficient automatic closing definitions (check HTML
        spec for optional end tags, also, closing based on type (block/inline)
        might be efficient).
--- a/docs/dev-progress.html
+++ b/docs/dev-progress.html
@@ -128,19 +128,20 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}

 <tbody>
 <tr><th colspan="2">Absolute positioning, unknown release milestone</th></tr>
-<tr class="danger"><td>bottom</td><td rowspan="4">Dangerous, must be non-negative</td></tr>
-<tr class="danger"><td>left</td></tr>
-<tr class="danger"><td>right</td></tr>
-<tr class="danger"><td>top</td></tr>
-<tr><td>clip</td><td>-</td></tr>
-<tr class="danger"><td>position</td><td>ENUM(static, relative, absolute, fixed), permit
+<tr class="danger impl-no"><td>bottom</td><td rowspan="4">Dangerous, must be non-negative to even be considered,
+    but it's still possible to arbitrarily position by running over.</td></tr>
+<tr class="danger impl-no"><td>left</td></tr>
+<tr class="danger impl-no"><td>right</td></tr>
+<tr class="danger impl-no"><td>top</td></tr>
+<tr class="impl-no"><td>clip</td><td>-</td></tr>
+<tr class="danger impl-no"><td>position</td><td>ENUM(static, relative, absolute, fixed)
    relative not absolute?</td></tr>
-<tr class="danger"><td>z-index</td><td>Dangerous</td></tr>
+<tr class="danger impl-no"><td>z-index</td><td>Dangerous</td></tr>
 </tbody>

 <tbody>
 <tr><th colspan="2">Unknown</th></tr>
-<tr class="danger css1"><td>background-image</td><td>Dangerous, target milestone 1.2</td></tr>
+<tr class="danger css1"><td>background-image</td><td>Dangerous, target milestone 1.3</td></tr>
 <tr class="css1"><td>background-attachment</td><td>ENUM(scroll, fixed),
    Depends on background-image</td></tr>
 <tr class="css1"><td>background-position</td><td>Depends on background-image</td></tr>
@@ -150,7 +151,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
    inline-block has incomplete IE6 support and requires -moz-inline-box
    for Mozilla. Unknown target milestone.</td></tr>
 <tr><td class="css1">height</td><td>Interesting, why use it? Unknown target milestone.</td></tr>
-<tr class="danger css1"><td>list-style-image</td><td>Dangerous? Target milestone 1.2</td></tr>
+<tr class="danger css1"><td>list-style-image</td><td>Dangerous? Target milestone 1.3</td></tr>
 <tr class="impl-no"><td>max-height</td><td rowspan="4">No IE 5/6</td></tr>
 <tr class="impl-no"><td>min-height</td></tr>
 <tr class="impl-no"><td>max-width</td></tr>
@@ -236,7 +237,7 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>
 <tr><th colspan="3">Questionable</th></tr>
 <tr class="impl-no"><td>accesskey</td><td>A</td><td>May interfere with main interface</td></tr>
 <tr class="impl-no"><td>tabindex</td><td>A</td><td>May interfere with main interface</td></tr>
-<tr><td>target</td><td>A</td><td>Config enabled, only useful for frame layouts</td></tr>
+<tr><td>target</td><td>A</td><td>Config enabled, only useful for frame layouts, disallowed in strict</td></tr>
 </tbody>

 <tbody>
@@ -283,11 +284,11 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>
 <tr><td>nowrap</td><td>TD, TH</td><td>Boolean, style 'white-space:nowrap;' (not compat with IE5)</td></tr>
 <tr><td>size</td><td>HR</td><td>Near-equiv 'width', needs px suffix if original was pixels</td></tr>
 <tr class="required impl-yes"><td>src</td><td>IMG</td><td>Required, insert blank or default img if not set</td></tr>
-<tr><td>start</td><td>OL</td><td>Poorly supported 'counter-reset', transform may not be desirable</td></tr>
+<tr class="impl-yes"><td>start</td><td>OL</td><td>Poorly supported 'counter-reset', allowed in loose, dropped in strict</td></tr>
 <tr><td rowspan="3">type</td><td>LI</td><td rowspan="3">Equivalent style 'list-style-type', different allowed values though. (needs testing)</td></tr>
    <tr><td>OL</td></tr>
    <tr><td>UL</td></tr>
-<tr><td>value</td><td>LI</td><td>Poorly supported 'counter-reset', transform may not be desirable, see ol.start. Configurable.</td></tr>
+<tr class="impl-yes"><td>value</td><td>LI</td><td>Poorly supported 'counter-reset', allowed in loose, dropped in strict</td></tr>
 <tr><td>vspace</td><td>IMG</td><td>Near-equiv styles 'margin-left' and 'margin-right', needs px suffix, see hspace</td></tr>
 <tr><td rowspan="2">width</td><td>HR</td><td rowspan="2">Near-equiv style 'width', needs px suffix if original was pixels</td></tr>
    <tr><td>TD, TH</td></tr>
--- a/docs/examples/demo.php
+++ b/docs/examples/demo.php
@@ -1,34 +1,66 @@
 <?php

-header('Content-type:text/html;charset=UTF-8');
+// using _REQUEST because we accept GET and POST requests

-?><!DOCTYPE html 
-     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+$content = empty($_REQUEST['xml']) ? 'text/html' : 'application/xhtml+xml';
+header("Content-type:$content;charset=UTF-8");
+
+// prevent PHP versions with shorttags from barfing
+echo '<?xml version="1.0" encoding="UTF-8" ?>
+';
+
+function getFormMethod() {
+    return (isset($_REQUEST['post'])) ? 'post' : 'get';
+}
+
+if (empty($_REQUEST['strict'])) {
+?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html>
+<?php
+} else {
+?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<?php
+}
+?>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
 <head>
-<title>HTMLPurifier Live Demo</title>
+<title>HTML Purifier Live Demo</title>
 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTMLPurifier Live Demo</h1>
+<h1>HTML Purifier Live Demo</h1>
 <?php

-set_include_path('../../library' . PATH_SEPARATOR . get_include_path());
-require_once 'HTMLPurifier.php';
+require_once '../../library/HTMLPurifier.auto.php';

-if (!empty($_POST['html'])) {
+if (!empty($_REQUEST['html'])) { // start result
    
-    $html = get_magic_quotes_gpc() ? stripslashes($_POST['html']) : $_POST['html'];
+    if (strlen($_REQUEST['html']) > 50000) {
+        ?>
+        <p>Request exceeds maximum allowed text size of 50kb.</p>
+        <?php
+    } else { // start main processing
+    
+    $html = get_magic_quotes_gpc() ? stripslashes($_REQUEST['html']) : $_REQUEST['html'];
    
    $config = HTMLPurifier_Config::createDefault();
-    $config->set('Core', 'TidyFormat', !empty($_POST['tidy']));
+    $config->set('Core', 'TidyFormat', !empty($_REQUEST['tidy']));
+    $config->set('HTML', 'Strict',     !empty($_REQUEST['strict']));
    $purifier = new HTMLPurifier($config);
    $pure_html = $purifier->purify($html);
    
 ?>
 <p>Here is your purified HTML:</p>
 <div style="border:5px solid #CCC;margin:0 10%;padding:1em;">
+<?php if(getFormMethod() == 'get') { ?>
+<div style="float:right;">
+    <a href="http://validator.w3.org/check?uri=referer"><img
+        src="http://www.w3.org/Icons/valid-xhtml10"
+        alt="Valid XHTML 1.0 Transitional" height="31" width="88" style="border:0;" /></a>
+</div>
+<?php } ?>
 <?php

 echo $pure_html;
@@ -43,23 +75,34 @@ echo htmlspecialchars($pure_html, ENT_COMPAT, 'UTF-8');

 ?></pre>
 <?php
-    
+if (getFormMethod() == 'post') { // start POST validation notice
+?>
+<p>If you would like to validate the code with
+<a href="http://validator.w3.org/#validate-by-input">W3C's
+validator</a>, copy and paste the <em>entire</em> demo page's source.</p>
+<?php
+} // end POST validation notice
+
+} // end main processing
+
+// end result
 } else {

 ?>
-<p>Welcome to the live demo.  Enter some HTML and see how HTMLPurifier
+<p>Welcome to the live demo.  Enter some HTML and see how HTML Purifier
 will filter it.</p>
 <?php

 }

 ?>
-<form name="filter" action="demo.php<?php
-if (isset($_GET['profile']) || isset($_GET['XDEBUG_PROFILE'])) {
-    echo '?XDEBUG_PROFILE=1';
-} ?>" method="post">
+<form id="filter" action="demo.php<?php
+echo '?' . getFormMethod();
+if (isset($_REQUEST['profile']) || isset($_REQUEST['XDEBUG_PROFILE'])) {
+    echo '&amp;XDEBUG_PROFILE=1';
+} ?>" method="<?php echo getFormMethod();  ?>">
    <fieldset>
-        <legend>HTML</legend>
+        <legend>HTML Purifier Input (<?php echo getFormMethod(); ?>)</legend>
        <textarea name="html" cols="60" rows="15"><?php

 if (isset($html)) {
@@ -67,13 +110,27 @@ if (isset($html)) {
            HTMLPurifier_Encoder::cleanUTF8($html), ENT_COMPAT, 'UTF-8');
 }
        ?></textarea>
-        <div>Nicely format output with Tidy? <input type="checkbox" value="1"
-        name="tidy"<?php if (!empty($_POST['tidy'])) echo ' checked="checked"'; ?> /></div>
+        <?php if (getFormMethod() == 'get') { ?>
+            <p><strong>Warning:</strong> GET request method can only hold
+                8129 characters (probably less depending on your browser).
+                If you need to test anything
+                larger than that, try the <a href="demo.php?post">POST form</a>.</p>
+        <?php } ?>
+        <?php if (extension_loaded('tidy')) { ?>
+            <div>Nicely format output with Tidy? <input type="checkbox" value="1"
+            name="tidy"<?php if (!empty($_REQUEST['tidy'])) echo ' checked="checked"'; ?> /></div>
+        <?php } ?>
+        <div>XHTML 1.0 Strict output? <input type="checkbox" value="1"
+        name="strict"<?php if (!empty($_REQUEST['strict'])) echo ' checked="checked"'; ?> /></div>
+        <div>Serve as application/xhtml+xml? (not for IE) <input type="checkbox" value="1"
+        name="xml"<?php if (!empty($_REQUEST['xml'])) echo ' checked="checked"'; ?> /></div>
        <div>
            <input type="submit" value="Submit" name="submit" class="button" />
        </div>
    </fieldset>
 </form>
-<p>Return to <a href="http://hp.jpsband.org/">HTMLPurifier's home page</a>.</p>
+<p>Return to <a href="http://hp.jpsband.org/">HTML Purifier's home page</a>.
+Try the form in <a href="demo.php?get">GET</a> and <a href="demo.php?post">POST</a> request
+flavors (GET is easy to validate with W3C, but POST allows larger inputs).</p>
 </body>
 </html>
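
The request handling that the demo.php changes above introduce can be sketched outside the HTML/PHP mix. The function and constant names below are hypothetical; the demo itself reads `$_REQUEST` directly and defines only `getFormMethod()`. The 50000-byte cap and the `post`, `html` and `xml` keys are the ones used in the diff.

```php
<?php
// Size cap the demo applies to submitted HTML, in bytes (strlen counts bytes).
define('DEMO_MAX_INPUT', 50000);

// Mirrors the demo's getFormMethod(): POST only when a 'post' key is present,
// even if its value is empty (as with demo.php?post).
function demoFormMethod($request) {
    return isset($request['post']) ? 'post' : 'get';
}

// True when there is HTML to purify and it does not exceed the size cap.
function demoInputAccepted($request) {
    return !empty($request['html'])
        && strlen($request['html']) <= DEMO_MAX_INPUT;
}

// Content type served by the demo: the XHTML MIME type only when 'xml' is set.
function demoContentType($request) {
    return empty($request['xml']) ? 'text/html' : 'application/xhtml+xml';
}
```

Pulling these decisions into functions that take the request array as a parameter makes them testable without a web server, which is one way to address the "HTML/PHP soup" note this commit adds to dev-code-quality.html.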
--- a/docs/index.html
+++ b/docs/index.html
@@ -65,6 +65,85 @@ that may not directly discuss HTML Purifier.</p>
 <dd>Credits and links to DevNetwork forum topics.</dd>
 </dl>

+<h2>Internal memos</h2>
+
+<p>Plaintext documents that are more for use by active developers of
+the code. They may be upgraded to HTML files or stay as TXT scratchpads.</p>
+
+<table class="table">
+
+<thead><tr>
+    <th width="10%">Type</th>
+    <th width="20%">Name</th>
+    <th>Description</th>
+</tr></thead>
+
+<tbody>
+
+<tr>
+    <td>End-user</td>
+    <td><a href="enduser-overview.txt">Overview</a></td>
+    <td>High level overview of the general control flow (mostly obsolete).</td>
+</tr>
+
+<tr>
+    <td>End-user</td>
+    <td><a href="enduser-security.txt">Security</a></td>
+    <td>Common security issues that may still arise (half-baked).</td>
+</tr>
+
+<tr>
+    <td>Proposal</td>
+    <td><a href="proposal-filter-levels.txt">Filter levels</a></td>
+    <td>Outlines details of projected configurable level of filtering.</td>
+</tr>
+
+<tr>
+    <td>Proposal</td>
+    <td><a href="proposal-language.txt">Language</a></td>
+    <td>Specification of I18N for error messages derived from MediaWiki (half-baked).</td>
+</tr>
+
+<tr>
+    <td>Proposal</td>
+    <td><a href="proposal-new-directives.txt">New directives</a></td>
+    <td>Assorted configuration options that could be implemented.</td>
+</tr>
+
+<tr>
+    <td>Reference</td>
+    <td><a href="ref-loose-vs-strict.txt">Loose vs. Strict</a></td>
+    <td>Differences between HTML Strict and Transitional versions.</td>
+</tr>
+
+<tr>
+    <td>Reference</td>
+    <td><a href="ref-proprietary-tags.txt">Proprietary tags</a></td>
+    <td>List of vendor-specific tags we may want to transform to W3C compliant markup.</td>
+</tr>
+
+<tr>
+    <td>Reference</td>
+    <td><a href="ref-strictness.txt">Strictness</a></td>
+    <td>Short essay on how loose definition isn't really loose.</td>
+</tr>
+
+<tr>
+    <td>Reference</td>
+    <td><a href="ref-xhtml-1.1.txt">XHTML 1.1</a></td>
+    <td>What we'd have to do to support XHTML 1.1.</td>
+</tr>
+
+<tr>
+    <td>Reference</td>
+    <td><a href="ref-whatwg.txt">WHATWG</a></td>
+    <td>How WHATWG plays into what we need to do.</td>
+</tr>
+
+</tbody>
+
+</table>
+
 <div id="version">$Id$</div>
 </body>
 </html>
--- a/docs/proposal-filter-levels.txt
+++ b/docs/proposal-filter-levels.txt
@@ -8,11 +8,11 @@ could go into this definition: the set of HTML good for blog entries is
 definitely too large for HTML that would be allowed in blog comments. Going
 from Transitional to Strict requires changes to the definition.

-However, allowing users to specify their own whitelists was an idea I
-rejected from the start.  Simply put, the typical programmer is too lazy
-to actually go through the trouble of investigating which tags, attributes
-and properties to allow.  HTMLDefinition makes a big part of what HTMLPurifier
-is.
+Allowing users to specify their own whitelists is one step (implemented, btw),
+but I have doubts about doing only this. Simply put, the typical programmer is too
+lazy to actually go through the trouble of investigating which tags, attributes
+and properties to allow. HTMLDefinition makes up a big part of what HTMLPurifier
+is.

 The idea, then, is to setup fundamentally different set of definitions, which
 can further be customized using simpler configuration options.
@@ -28,7 +28,7 @@ Here are some fuzzy levels you could set:
    to be useful)
 3. Pages - As permissive as possible without allowing XSS.  No protection
    against bad design sense, unfortunantely.  Suitable for wiki and page
-    environments.
+    environments. (probably what we have now)
 4. Lint - Accept everything in the spec, a Tidy wannabe. (This probably won't
    get implemented as it would require routines for things like <object>
    and friends to be implemented, which is a lot of work for not a lot of
--- a/docs/proposal-new-directives.txt
+++ b/docs/proposal-new-directives.txt
@@ -21,20 +21,11 @@ time.  Note the naming convention: %Namespace.Directive
 %Attr.MaxHeight - caps for width and height related checks.
    (the hack in Pixels for an image crashing attack could be replaced by this)

-%URI.Munge - will munge all external URIs to a different URI, which redirects
-    the user to the applicable page. A urlencoded version of the URI
-    will replace any instances of %s in the string. One possible
-    string is 'http://www.google.com/url?q=%s'. Useful for preventing
-    pagerank from being sent to other sites, but can also be used to
-    redirect to a splash page notifying user that they are leaving your
-    website.
-
 %URI.AddRelNofollow - will add rel="nofollow" to all links, preventing the
    spread of ill-gotten pagerank

 %URI.RelativeToAbsolute - transforms all relative URIs to absolute form

-%URI.HostBlacklist - strings that if found in the host of a URI are disallowed
 %URI.HostBlacklistRegex - regexes that if matching the host are disallowed
 %URI.HostWhitelist - domain names that are excluded from the host blacklist
 %URI.HostPolicy - determines whether or not its reject all and then whitelist
@@ -53,7 +44,3 @@ time.  Note the naming convention: %Namespace.Directive
    absolute DNS.  While this is actually the preferred method according to
    the RFC, most people opt to use a relative domain name relative to . (root).

-%URI.DisableExternalResources - disallow resource links (i.e. URIs that result
-    in immediate requests, such as src in IMG) to external websites
-
-%HTML.DisableImg - disables all images
--- a/docs/ref-loose-vs-strict.txt
+++ b/docs/ref-loose-vs-strict.txt
@@ -0,0 +1,37 @@
+
+Loose versus Strict
+    Changes from one doctype to another
+
+There are changes.  Wow, how insightful.  Not everything changed is relevant
+to HTML Purifier, though, so let's take a look:
+
+== Major incompatibilities ==
+
+[done] BLOCKQUOTE changes from 'flow' to 'block'
+    current behavior: inline inner contents should not be nuked, block-ify as necessary
+[partially-done] U, S, STRIKE cut
+    current behavior: removed completely
+    projected behavior: replace with appropriate inline span + CSS
+[done] ADDRESS from potpourri to Inline (removes p tags)
+    current behavior: block tags silently dropped
+    ideal behavior: replace tags with something like <br>. (not high priority)
+
+== Things we can loosen up ==
+
+Tags DIR, MENU, CENTER, ISINDEX, FONT, BASEFONT? allowed in loose
+    current behavior: transform to strict-valid forms
+Attributes allowed in loose (see attribute transforms in 'dev-progress.html')
+    current behavior: projected to transform into strict-valid forms
+
+== Periphery issues ==
+
+A tag's attribute 'target' (for selecting frames) cut
+    current behavior: not allowed at all
+    projected behavior: use loose doctype if needed, needs valid values
+[done] OL/LI tag's attribute 'start'/'value' (for renumbering lists) cut
+    current behavior: no substitute, just delete when in strict, allow in loose
+Attribute 'name' deprecated in favor of 'id'
+    current behavior: dropped silently
+    projected behavior: create proper AttrTransform (currently not allowed at all)
+[done] PRE tag allows SUB/SUP? (strict dtd comment vs syntax, loose disallows)
+    current behavior: disallow as usual
--- a/docs/ref-proprietary-tags.txt
+++ b/docs/ref-proprietary-tags.txt
@@ -0,0 +1,22 @@
+
+Proprietary Tags
+    <nobr> and friends
+
+Here are some proprietary tags that W3C does not define but occasionally show
+up in the wild.  We have only included tags that would make sense in an
+HTML Purifier context.
+
+<align>, block element that aligns (extremely rare)
+<blackface>, inline that double-bolds text (extremely rare)
+<comment>, hidden comment for IE and WebTV
+<multicol cols=number gutter=pixels width=pixels>, multiple columns
+<nobr>, no linebreaks
+<spacer align=* type="vertical|horizontal|block">, whitespace in doc,
+    use width/height for block and size for vertical/horizontal (attributes)
+    (extremely rare)
+<wbr>, potential word break point: allows linebreaks. Only works in <nobr>
+
+<listing>, monospace pre-variant (extremely rare)
+<plaintext>, escapes all tags to the end of document
+<ruby> and friends, (more research needed, appears to be XHTML 1.1 markup)
+<xmp>, monospace, replace with pre
--- a/docs/ref-strictness.txt
+++ b/docs/ref-strictness.txt
@@ -22,4 +22,15 @@ whole point about CSS is to seperate styling from content, so inline styling
 doesn't solve that problem.

 It's an icky question, and we'll have to deal with it as more and more 
-transforms get implemented.
+transforms get implemented.  As of right now, however, we currently support
+these loose-only constructs in loose mode:
+
+- <ol start="1">, <li value="1"> attributes
+- <u>, <strike>, <s> tags
+- flow children in <blockquote>
+- mixed children in <address>
+
+The changed child definitions as well as ol.start and li.value are the most
+compelling reasons why loose should be used.  We may want to offer disabling <u>,
+<strike> and <s> by themselves.
+
--- a/docs/ref-whatwg.txt
+++ b/docs/ref-whatwg.txt
@@ -0,0 +1,9 @@
+
+Web Hypertext Application Technology Working Group
+    WHATWG
+
+I don't think we need to worry about them.  Untrusted users shouldn't be
+submitting applications, eh?  But if some interesting attribute pops up in
+their spec, and might be worth supporting, stick it here.
+
+(none so far, as you can see)
--- a/docs/ref-xhtml-1.1.txt
+++ b/docs/ref-xhtml-1.1.txt
@@ -0,0 +1,20 @@
+
+Getting XHTML 1.1 Working
+
+It's quite simple, according to <http://www.w3.org/TR/xhtml11/changes.html>
+
+1. Scratch lang entirely in favor of xml:lang
+2. Scratch name entirely in favor of id (partially-done)
+3. Support Ruby <http://www.w3.org/TR/2001/REC-ruby-20010531/>
+
+...but that's only an informative section. More things to do:
+
+1. Scratch style attribute (it's deprecated)
+2. Be module-aware
+3. Cross-reference minimal content models with existing DTDs and determine
+   changes (todo)
+4. Watch out for the Legacy Module
+<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/abstract_modules.html#s_legacymodule>
+5. Let users specify their own custom modules
+6. Study Modularization document
+<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/>
--- a/library/HTMLPurifier.php
+++ b/library/HTMLPurifier.php
@@ -22,7 +22,7 @@
 */

 /*
-    HTML Purifier 1.2.0 - Standards Compliant HTML Filtering
+    HTML Purifier 1.3.0 - Standards Compliant HTML Filtering
    Copyright (C) 2006 Edward Z. Yang

    This library is free software; you can redistribute it and/or
--- a/library/HTMLPurifier/AttrDef/URI.php
+++ b/library/HTMLPurifier/AttrDef/URI.php
@@ -24,7 +24,7 @@ HTMLPurifier_ConfigSchema::define(
    'This directive has been available since 1.2.0.'
 );

-HTMLPurifier_ConfigSchema::Define(
+HTMLPurifier_ConfigSchema::define(
    'URI', 'DisableExternal', false, 'bool',
    'Disables links to external websites.  This is a highly effective '.
    'anti-spam and anti-pagerank-leech measure, but comes at a hefty price: no'.
@@ -34,6 +34,49 @@ HTMLPurifier_ConfigSchema::Define(
    'This directive has been available since 1.2.0.'
 );

+HTMLPurifier_ConfigSchema::define(
+    'URI', 'DisableExternalResources', false, 'bool',
+    'Disables the embedding of external resources, preventing users from '.
+    'embedding things like images from other hosts. This prevents '.
+    'access tracking (good for email viewers), bandwidth leeching, '.
+    'cross-site request forging, goatse.cx posting, and '.
+    'other nasties, but also results in '.
+    'a loss of end-user functionality (they can\'t directly post a pic '.
+    'they posted from Flickr anymore). Use it if you don\'t have a '.
+    'robust user-content moderation team. This directive has been '.
+    'available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'URI', 'DisableResources', false, 'bool',
+    'Disables embedding resources, essentially meaning no pictures. You can '.
+    'still link to them though. See %URI.DisableExternalResources for why '.
+    'this might be a good idea. This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'URI', 'Munge', null, 'string/null',
+    'Munges all browsable (usually http, https and ftp) URIs into some URL '.
+    'redirection service. Pass this directive a URI, with %s inserted where '.
+    'the url-encoded original URI should be inserted (sample: '.
+    '<code>http://www.google.com/url?q=%s</code>). '.
+    'This prevents PageRank leaks, while being as transparent as possible '.
+    'to users (you may also want to add some client side JavaScript to '.
+    'override the text in the statusbar). Warning: many security experts '.
+    'believe that this form of protection does not deter spam-bots. '.
+    'You can also use this directive to redirect users to a splash page '.
+    'telling them they are leaving your website. '.
+    'This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'URI', 'HostBlacklist', array(), 'list',
+    'List of strings that are forbidden in the host of any URI. Use it to '.
+    'kill domain names of spam, etc. Note that it will catch anything in '.
+    'the domain, so <tt>moo.com</tt> will catch <tt>moo.com.example.com</tt>. '.
+    'This directive has been available since 1.3.0.'
+);
+
 /**
 * Validates a URI as defined by RFC 3986.
 * @note Scheme-specific mechanics deferred to HTMLPurifier_URIScheme
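
The %URI.Munge substitution described in the directive text above can be sketched as a stand-alone function. `mungeUri()` and its parameters are hypothetical; in the library the step runs inside `validate()`, and a later hunk in this diff applies it only when the scheme is browsable and the URI has an authority. `str_replace()` and `rawurlencode()` are the same calls the diff uses.

```php
<?php
// Hypothetical stand-alone version of the munge step.
// $munge is the directive value (a template containing %s) or null when unset.
function mungeUri($uri, $munge, $browsable = true, $hasAuthority = true) {
    // Leave the URI alone when munging is off, the scheme is not browsable,
    // or the URI is relative (no authority part).
    if ($munge === null || !$browsable || !$hasAuthority) {
        return $uri;
    }
    // rawurlencode() percent-encodes reserved characters such as ':' and '/'.
    return str_replace('%s', rawurlencode($uri), $munge);
}

$template = 'http://www.google.com/url?q=%s';
echo mungeUri('http://example.com/?q=1', $template), "\n";
// prints http://www.google.com/url?q=http%3A%2F%2Fexample.com%2F%3Fq%3D1
```

Relative URIs pass through unchanged, so links within the same site are not sent through the redirector.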
@@ -43,15 +86,15 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
    
    var $host;
    var $PercentEncoder;
-    var $embeds;
+    var $embeds_resource;
    
    /**
-     * @param $embeds Does the URI here result in an extra HTTP request?
+     * @param $embeds_resource Does the URI here result in an extra HTTP request?
     */
-    function HTMLPurifier_AttrDef_URI($embeds = false) {
+    function HTMLPurifier_AttrDef_URI($embeds_resource = false) {
        $this->host = new HTMLPurifier_AttrDef_Host();
        $this->PercentEncoder = new HTMLPurifier_PercentEncoder();
-        $this->embeds = (bool) $embeds;
+        $this->embeds_resource = (bool) $embeds_resource;
    }
    
    function validate($uri, $config, &$context) {
@@ -105,18 +148,25 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        }
        
        
-        // the URI we're processing embeds a resource in the page, but the URI
+        // the URI we're processing embeds a resource in the page, but the URI
        // it references cannot be located
-        if ($this->embeds && !$scheme_obj->browsable) {
+        if ($this->embeds_resource && !$scheme_obj->browsable) {
            return false;
        }
        
        
        if ($authority !== null) {
            
-            // remove URI if it's absolute and we disallow externals
+            // remove URI if it's absolute and we disabled externals or
+            // if it's absolute and embedded and we disabled external resources
            unset($our_host);
-            if ($config->get('URI', 'DisableExternal')) {
+            if (
+                $config->get('URI', 'DisableExternal') ||
+                (
+                    $config->get('URI', 'DisableExternalResources') &&
+                    $this->embeds_resource
+                )
+            ) {
                $our_host = $config->get('URI', 'Host');
                if ($our_host === null) return false;
            }
@@ -143,6 +193,8 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
            $host = $this->host->validate($host, $config, $context);
            if ($host === false) $host = null;
            
+            if ($this->checkBlacklist($host, $config, $context)) return false;
+            
            // more lenient absolute checking
            if (isset($our_host)) {
                $host_parts = array_reverse(explode('.', $host));
@@ -198,10 +250,37 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        if ($query !== null) $result .= "?$query";
        if ($fragment !== null) $result .= "#$fragment";
        
+        // munge if necessary
+        $munge = $config->get('URI', 'Munge');
+        if (!empty($scheme_obj->browsable) && $munge !== null) {
+            if ($authority !== null) {
+                $result = str_replace('%s', rawurlencode($result), $munge);
+            }
+        }
+        
        return $result;
        
    }
    
+    /**
+     * Checks a host against an array blacklist
+     * @param $host Host to check
+     * @param $config HTMLPurifier_Config instance
+     * @param $context HTMLPurifier_Context instance
+     * @return bool Is spam?
+     */
+    function checkBlacklist($host, &$config, &$context) {
+        $blacklist = $config->get('URI', 'HostBlacklist');
+        if (!empty($blacklist)) {
+            foreach($blacklist as $blacklisted_host_fragment) {
+                if (strpos($host, $blacklisted_host_fragment) !== false) {
+                    return true;
+                }
+            }
+        }
+        return false;
+    }
+    
 }

 ?>
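
The two behaviors this file gains, substring blacklisting of hosts and munging of the final URI, can be sketched in isolation. This is a minimal illustration, not the library's API: `host_is_blacklisted()`, `munge_uri()` and the `redirect.example` template are hypothetical names chosen for the example.

```php
<?php
// Hypothetical standalone helpers mirroring the logic added in this diff;
// they are not part of HTML Purifier.

/**
 * Substring match, as in checkBlacklist(): a fragment found anywhere
 * in the host counts as a hit.
 */
function host_is_blacklisted($host, array $blacklist) {
    foreach ($blacklist as $fragment) {
        if (strpos((string) $host, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

/**
 * Munging, as at the end of validate(): the whole URI is percent-encoded
 * and substituted for %s in the template.
 */
function munge_uri($uri, $template) {
    return str_replace('%s', rawurlencode($uri), $template);
}

var_dump(host_is_blacklisted('moo.com.example.com', array('moo.com'))); // bool(true)
var_dump(host_is_blacklisted('example.org', array('moo.com')));         // bool(false)

// Assumed redirector template; any string containing %s works the same way.
echo munge_uri('http://example.com/a?b=c', 'http://redirect.example/?to=%s'), "\n";
```

Because the match is a plain substring test, a fragment such as `moo.com` also catches hosts like `notmoo.com`, so blacklist entries are best kept specific.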
--- a/library/HTMLPurifier/ChildDef.php
+++ b/library/HTMLPurifier/ChildDef.php
@@ -20,10 +20,9 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_ChildDef
 {
    /**
-     * Type of child definition, usually right-most part of class name lowercase
-     * 
-     * Used occasionally in terms of context.  Possible values include
-     * custom, required, optional and empty.
+     * Type of child definition, usually right-most part of class name lowercase.
+     * Used occasionally in terms of context.
+     * @public
     */
    var $type;
    
@@ -32,12 +31,15 @@ class HTMLPurifier_ChildDef
     * 
     * This is necessary for redundant checking when changes affecting
     * a child node may cause a parent node to now be disallowed.
+     * 
+     * @public
     */
    var $allow_empty;
    
    /**
     * Validates nodes according to definition and returns modification.
     * 
+     * @public
     * @param $tokens_of_children Array of HTMLPurifier_Token
     * @param $config HTMLPurifier_Config object
     * @param $context HTMLPurifier_Context object
@@ -50,391 +52,4 @@ class HTMLPurifier_ChildDef
    }
 }

-/**
- * Custom validation class, accepts DTD child definitions
- * 
- * @warning Currently this class is an all or nothing proposition, that is,
- *          it will only give a bool return value.
- * @note This class is currently not used by any code, although it is unit
- *       tested.
- */
-class HTMLPurifier_ChildDef_Custom extends HTMLPurifier_ChildDef
-{
-    var $type = 'custom';
-    var $allow_empty = false;
-    /**
-     * Allowed child pattern as defined by the DTD
-     */
-    var $dtd_regex;
-    /**
-     * PCRE regex derived from $dtd_regex
-     * @private
-     */
-    var $_pcre_regex;
-    /**
-     * @param $dtd_regex Allowed child pattern from the DTD
-     */
-    function HTMLPurifier_ChildDef_Custom($dtd_regex) {
-        $this->dtd_regex = $dtd_regex;
-        $this->_compileRegex();
-    }
-    /**
-     * Compiles the PCRE regex from a DTD regex ($dtd_regex to $_pcre_regex)
-     */
-    function _compileRegex() {
-        $raw = str_replace(' ', '', $this->dtd_regex);
-        if ($raw{0} != '(') {
-            $raw = "($raw)";
-        }
-        $reg = str_replace(',', ',?', $raw);
-        $reg = preg_replace('/([#a-zA-Z0-9_.-]+)/', '(,?\\0)', $reg);
-        $this->_pcre_regex = $reg;
-    }
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $list_of_children = '';
-        $nesting = 0; // depth into the nest
-        foreach ($tokens_of_children as $token) {
-            if (!empty($token->is_whitespace)) continue;
-            
-            $is_child = ($nesting == 0); // direct
-            
-            if ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            if ($is_child) {
-                $list_of_children .= $token->name . ',';
-            }
-        }
-        $list_of_children = rtrim($list_of_children, ',');
-        
-        $okay =
-            preg_match(
-                '/^'.$this->_pcre_regex.'$/',
-                $list_of_children
-            );
-        
-        return (bool) $okay;
-    }
-}
-
-/**
- * Definition that allows a set of elements, but disallows empty children.
- */
-class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
-{
-    /**
-     * Lookup table of allowed elements.
-     */
-    var $elements = array();
-    /**
-     * @param $elements List of allowed element names (lowercase).
-     */
-    function HTMLPurifier_ChildDef_Required($elements) {
-        if (is_string($elements)) {
-            $elements = str_replace(' ', '', $elements);
-            $elements = explode('|', $elements);
-        }
-        $elements = array_flip($elements);
-        foreach ($elements as $i => $x) $elements[$i] = true;
-        $this->elements = $elements;
-        $this->gen = new HTMLPurifier_Generator();
-    }
-    var $allow_empty = false;
-    var $type = 'required';
-    function validateChildren($tokens_of_children, $config, &$context) {
-        // if there are no tokens, delete parent node
-        if (empty($tokens_of_children)) return false;
-        
-        // the new set of children
-        $result = array();
-        
-        // current depth into the nest
-        $nesting = 0;
-        
-        // whether or not we're deleting a node
-        $is_deleting = false;
-        
-        // whether or not parsed character data is allowed
-        // this controls whether or not we silently drop a tag
-        // or generate escaped HTML from it
-        $pcdata_allowed = isset($this->elements['#PCDATA']);
-        
-        // a little sanity check to make sure it's not ALL whitespace
-        $all_whitespace = true;
-        
-        // some configuration
-        $escape_invalid_children = $config->get('Core', 'EscapeInvalidChildren');
-        
-        foreach ($tokens_of_children as $token) {
-            if (!empty($token->is_whitespace)) {
-                $result[] = $token;
-                continue;
-            }
-            $all_whitespace = false; // phew, we're not talking about whitespace
-            
-            $is_child = ($nesting == 0);
-            
-            if ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            if ($is_child) {
-                $is_deleting = false;
-                if (!isset($this->elements[$token->name])) {
-                    $is_deleting = true;
-                    if ($pcdata_allowed && $token->type == 'text') {
-                        $result[] = $token;
-                    } elseif ($pcdata_allowed && $escape_invalid_children) {
-                        $result[] = new HTMLPurifier_Token_Text(
-                            $this->gen->generateFromToken($token, $config)
-                        );
-                    }
-                    continue;
-                }
-            }
-            if (!$is_deleting || ($pcdata_allowed && $token->type == 'text')) {
-                $result[] = $token;
-            } elseif ($pcdata_allowed && $escape_invalid_children) {
-                $result[] =
-                    new HTMLPurifier_Token_Text(
-                        $this->gen->generateFromToken( $token, $config )
-                    );
-            } else {
-                // drop silently
-            }
-        }
-        if (empty($result)) return false;
-        if ($all_whitespace) return false;
-        if ($tokens_of_children == $result) return true;
-        return $result;
-    }
-}
-
-/**
- * Definition that allows a set of elements, and allows no children.
- * @note This is a hack to reuse code from HTMLPurifier_ChildDef_Required,
- *       really, one shouldn't inherit from the other.  Only altered behavior
- *       is to overload a returned false with an array.  Thus, it will never
- *       return false.
- */
-class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
-{
-    var $allow_empty = true;
-    var $type = 'optional';
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $result = parent::validateChildren($tokens_of_children, $config, $context);
-        if ($result === false) return array();
-        return $result;
-    }
-}
-
-/**
- * Definition that disallows all elements.
- * @warning validateChildren() in this class is actually never called, because
- *          empty elements are corrected in HTMLPurifier_Strategy_MakeWellFormed
- *          before child definitions are parsed in earnest by
- *          HTMLPurifier_Strategy_FixNesting.
- */
-class HTMLPurifier_ChildDef_Empty extends HTMLPurifier_ChildDef
-{
-    var $allow_empty = true;
-    var $type = 'empty';
-    function HTMLPurifier_ChildDef_Empty() {}
-    function validateChildren($tokens_of_children, $config, &$context) {
-        return array();
-    }
-}
-
-/**
- * Definition that uses different definitions depending on context.
- * 
- * The del and ins tags are notable because they allow different types of
- * elements depending on whether or not they're in a block or inline context.
- * Chameleon allows this behavior to happen by using two different
- * definitions depending on context.  While this somewhat generalized,
- * it is specifically intended for those two tags.
- */
-class HTMLPurifier_ChildDef_Chameleon extends HTMLPurifier_ChildDef
-{
-    
-    /**
-     * Instance of the definition object to use when inline. Usually stricter.
-     */
-    var $inline;
-    /**
-     * Instance of the definition object to use when block.
-     */
-    var $block;
-    
-    /**
-     * @param $inline List of elements to allow when inline.
-     * @param $block List of elements to allow when block.
-     */
-    function HTMLPurifier_ChildDef_Chameleon($inline, $block) {
-        $this->inline = new HTMLPurifier_ChildDef_Optional($inline);
-        $this->block  = new HTMLPurifier_ChildDef_Optional($block);
-    }
-    
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $parent_type = $context->get('ParentType');
-        switch ($parent_type) {
-            case 'unknown':
-            case 'inline':
-                $result = $this->inline->validateChildren(
-                    $tokens_of_children, $config, $context);
-                break;
-            case 'block':
-                $result = $this->block->validateChildren(
-                    $tokens_of_children, $config, $context);
-                break;
-            default:
-                trigger_error('Invalid context', E_USER_ERROR);
-                return false;
-        }
-        return $result;
-    }
-}
-
-/**
- * Definition for tables
- */
-class HTMLPurifier_ChildDef_Table extends HTMLPurifier_ChildDef
-{
-    var $allow_empty = false;
-    var $type = 'table';
-    function HTMLPurifier_ChildDef_Table() {}
-    function validateChildren($tokens_of_children, $config, &$context) {
-        if (empty($tokens_of_children)) return false;
-        
-        // this ensures that the loop gets run one last time before closing
-        // up. It's a little bit of a hack, but it works! Just make sure you
-        // get rid of the token later.
-        $tokens_of_children[] = false;
-        
-        // only one of these elements is allowed in a table
-        $caption = false;
-        $thead   = false;
-        $tfoot   = false;
-        
-        // as many of these as you want
-        $cols    = array();
-        $content = array();
-        
-        $nesting = 0; // current depth so we can determine nodes
-        $is_collecting = false; // are we globbing together tokens to package
-                                // into one of the collectors?
-        $collection = array(); // collected nodes
-        $tag_index = 0; // the first node might be whitespace,
-                            // so this tells us where the start tag is
-        
-        foreach ($tokens_of_children as $token) {
-            $is_child = ($nesting == 0);
-            
-            if ($token === false) {
-                // terminating sequence started
-            } elseif ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            // handle node collection
-            if ($is_collecting) {
-                if ($is_child) {
-                    // okay, let's stash the tokens away
-                    // first token tells us the type of the collection
-                    switch ($collection[$tag_index]->name) {
-                        case 'tr':
-                        case 'tbody':
-                            $content[] = $collection;
-                            break;
-                        case 'caption':
-                            if ($caption !== false) break;
-                            $caption = $collection;
-                            break;
-                        case 'thead':
-                        case 'tfoot':
-                            // access the appropriate variable, $thead or $tfoot
-                            $var = $collection[$tag_index]->name;
-                            if ($$var === false) {
-                                $$var = $collection;
-                            } else {
-                                // transmutate the first and less entries into
-                                // tbody tags, and then put into content
-                                $collection[$tag_index]->name = 'tbody';
-                                $collection[count($collection)-1]->name = 'tbody';
-                                $content[] = $collection;
-                            }
-                            break;
-                         case 'colgroup':
-                            $cols[] = $collection;
-                            break;
-                    }
-                    $collection = array();
-                    $is_collecting = false;
-                    $tag_index = 0;
-                } else {
-                    // add the node to the collection
-                    $collection[] = $token;
-                }
-            }
-            
-            // terminate
-            if ($token === false) break;
-            
-            if ($is_child) {
-                // determine what we're dealing with
-                if ($token->name == 'col') {
-                    // the only empty tag in the possie, we can handle it
-                    // immediately
-                    $cols[] = array_merge($collection, array($token));
-                    $collection = array();
-                    $tag_index = 0;
-                    continue;
-                }
-                switch($token->name) {
-                    case 'caption':
-                    case 'colgroup':
-                    case 'thead':
-                    case 'tfoot':
-                    case 'tbody':
-                    case 'tr':
-                        $is_collecting = true;
-                        $collection[] = $token;
-                        continue;
-                    default:
-                        if ($token->type == 'text' && $token->is_whitespace) {
-                            $collection[] = $token;
-                            $tag_index++;
-                        }
-                        continue;
-                }
-            }
-        }
-        
-        if (empty($content)) return false;
-        
-        $ret = array();
-        if ($caption !== false) $ret = array_merge($ret, $caption);
-        if ($cols !== false)    foreach ($cols as $token_array) $ret = array_merge($ret, $token_array);
-        if ($thead !== false)   $ret = array_merge($ret, $thead);
-        if ($tfoot !== false)   $ret = array_merge($ret, $tfoot);
-        foreach ($content as $token_array) $ret = array_merge($ret, $token_array);
-        if (!empty($collection) && $is_collecting == false){
-            // grab the trailing space
-            $ret = array_merge($ret, $collection);
-        }
-        
-        array_pop($tokens_of_children); // remove phantom token
-        
-        return ($ret === $tokens_of_children) ? true : $ret;
-        
-    }
-}
-
 ?>
--- a/library/HTMLPurifier/ChildDef/Chameleon.php
+++ b/library/HTMLPurifier/ChildDef/Chameleon.php
@@ -0,0 +1,60 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef.php';
+
+/**
+ * Definition that uses different definitions depending on context.
+ * 
+ * The del and ins tags are notable because they allow different types of
+ * elements depending on whether or not they're in a block or inline context.
+ * Chameleon allows this behavior to happen by using two different
+ * definitions depending on context.  While this is somewhat generalized,
+ * it is specifically intended for those two tags.
+ */
+class HTMLPurifier_ChildDef_Chameleon extends HTMLPurifier_ChildDef
+{
+    
+    /**
+     * Instance of the definition object to use when inline. Usually stricter.
+     * @public
+     */
+    var $inline;
+    
+    /**
+     * Instance of the definition object to use when block.
+     * @public
+     */
+    var $block;
+    
+    var $type = 'chameleon';
+    
+    /**
+     * @param $inline List of elements to allow when inline.
+     * @param $block List of elements to allow when block.
+     */
+    function HTMLPurifier_ChildDef_Chameleon($inline, $block) {
+        $this->inline = new HTMLPurifier_ChildDef_Optional($inline);
+        $this->block  = new HTMLPurifier_ChildDef_Optional($block);
+    }
+    
+    function validateChildren($tokens_of_children, $config, &$context) {
+        $parent_type = $context->get('ParentType');
+        switch ($parent_type) {
+            case 'unknown':
+            case 'inline':
+                $result = $this->inline->validateChildren(
+                    $tokens_of_children, $config, $context);
+                break;
+            case 'block':
+                $result = $this->block->validateChildren(
+                    $tokens_of_children, $config, $context);
+                break;
+            default:
+                trigger_error('Invalid context', E_USER_ERROR);
+                return false;
+        }
+        return $result;
+    }
+}
+
+?>
--- a/library/HTMLPurifier/ChildDef/Custom.php
+++ b/library/HTMLPurifier/ChildDef/Custom.php
@@ -0,0 +1,75 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef.php';
+
+/**
+ * Custom validation class, accepts DTD child definitions
+ * 
+ * @warning Currently this class is an all or nothing proposition, that is,
+ *          it will only give a bool return value.
+ * @note This class is currently not used by any code, although it is unit
+ *       tested.
+ */
+class HTMLPurifier_ChildDef_Custom extends HTMLPurifier_ChildDef
+{
+    var $type = 'custom';
+    var $allow_empty = false;
+    /**
+     * Allowed child pattern as defined by the DTD
+     */
+    var $dtd_regex;
+    /**
+     * PCRE regex derived from $dtd_regex
+     * @private
+     */
+    var $_pcre_regex;
+    /**
+     * @param $dtd_regex Allowed child pattern from the DTD
+     */
+    function HTMLPurifier_ChildDef_Custom($dtd_regex) {
+        $this->dtd_regex = $dtd_regex;
+        $this->_compileRegex();
+    }
+    /**
+     * Compiles the PCRE regex from a DTD regex ($dtd_regex to $_pcre_regex)
+     */
+    function _compileRegex() {
+        $raw = str_replace(' ', '', $this->dtd_regex);
+        if ($raw{0} != '(') {
+            $raw = "($raw)";
+        }
+        $reg = str_replace(',', ',?', $raw);
+        $reg = preg_replace('/([#a-zA-Z0-9_.-]+)/', '(,?\\0)', $reg);
+        $this->_pcre_regex = $reg;
+    }
+    function validateChildren($tokens_of_children, $config, &$context) {
+        $list_of_children = '';
+        $nesting = 0; // depth into the nest
+        foreach ($tokens_of_children as $token) {
+            if (!empty($token->is_whitespace)) continue;
+            
+            $is_child = ($nesting == 0); // direct
+            
+            if ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            if ($is_child) {
+                $list_of_children .= $token->name . ',';
+            }
+        }
+        $list_of_children = rtrim($list_of_children, ',');
+        
+        $okay =
+            preg_match(
+                '/^'.$this->_pcre_regex.'$/',
+                $list_of_children
+            );
+        
+        return (bool) $okay;
+    }
+}
+
+?>
--- a/library/HTMLPurifier/ChildDef/Empty.php
+++ b/library/HTMLPurifier/ChildDef/Empty.php
@@ -0,0 +1,22 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef.php';
+
+/**
+ * Definition that disallows all elements.
+ * @warning validateChildren() in this class is actually never called, because
+ *          empty elements are corrected in HTMLPurifier_Strategy_MakeWellFormed
+ *          before child definitions are parsed in earnest by
+ *          HTMLPurifier_Strategy_FixNesting.
+ */
+class HTMLPurifier_ChildDef_Empty extends HTMLPurifier_ChildDef
+{
+    var $allow_empty = true;
+    var $type = 'empty';
+    function HTMLPurifier_ChildDef_Empty() {}
+    function validateChildren($tokens_of_children, $config, &$context) {
+        return array();
+    }
+}
+
+?>
--- a/library/HTMLPurifier/ChildDef/Optional.php
+++ b/library/HTMLPurifier/ChildDef/Optional.php
@@ -0,0 +1,23 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef/Required.php';
+
+/**
+ * Definition that allows a set of elements, and allows no children.
+ * @note This is a hack to reuse code from HTMLPurifier_ChildDef_Required,
+ *       really, one shouldn't inherit from the other.  Only altered behavior
+ *       is to overload a returned false with an array.  Thus, it will never
+ *       return false.
+ */
+class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
+{
+    var $allow_empty = true;
+    var $type = 'optional';
+    function validateChildren($tokens_of_children, $config, &$context) {
+        $result = parent::validateChildren($tokens_of_children, $config, $context);
+        if ($result === false) return array();
+        return $result;
+    }
+}
+
+?>
--- a/library/HTMLPurifier/ChildDef/Required.php
+++ b/library/HTMLPurifier/ChildDef/Required.php
@@ -0,0 +1,104 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef.php';
+
+/**
+ * Definition that allows a set of elements, but disallows empty children.
+ */
+class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
+{
+    /**
+     * Lookup table of allowed elements.
+     * @public
+     */
+    var $elements = array();
+    /**
+     * @param $elements List of allowed element names (lowercase).
+     */
+    function HTMLPurifier_ChildDef_Required($elements) {
+        if (is_string($elements)) {
+            $elements = str_replace(' ', '', $elements);
+            $elements = explode('|', $elements);
+        }
+        $elements = array_flip($elements);
+        foreach ($elements as $i => $x) {
+            $elements[$i] = true;
+            if (empty($i)) unset($elements[$i]);
+        }
+        $this->elements = $elements;
+        $this->gen = new HTMLPurifier_Generator();
+    }
+    var $allow_empty = false;
+    var $type = 'required';
+    function validateChildren($tokens_of_children, $config, &$context) {
+        // if there are no tokens, delete parent node
+        if (empty($tokens_of_children)) return false;
+        
+        // the new set of children
+        $result = array();
+        
+        // current depth into the nest
+        $nesting = 0;
+        
+        // whether or not we're deleting a node
+        $is_deleting = false;
+        
+        // whether or not parsed character data is allowed
+        // this controls whether or not we silently drop a tag
+        // or generate escaped HTML from it
+        $pcdata_allowed = isset($this->elements['#PCDATA']);
+        
+        // a little sanity check to make sure it's not ALL whitespace
+        $all_whitespace = true;
+        
+        // some configuration
+        $escape_invalid_children = $config->get('Core', 'EscapeInvalidChildren');
+        
+        foreach ($tokens_of_children as $token) {
+            if (!empty($token->is_whitespace)) {
+                $result[] = $token;
+                continue;
+            }
+            $all_whitespace = false; // phew, we're not talking about whitespace
+            
+            $is_child = ($nesting == 0);
+            
+            if ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            if ($is_child) {
+                $is_deleting = false;
+                if (!isset($this->elements[$token->name])) {
+                    $is_deleting = true;
+                    if ($pcdata_allowed && $token->type == 'text') {
+                        $result[] = $token;
+                    } elseif ($pcdata_allowed && $escape_invalid_children) {
+                        $result[] = new HTMLPurifier_Token_Text(
+                            $this->gen->generateFromToken($token, $config)
+                        );
+                    }
+                    continue;
+                }
+            }
+            if (!$is_deleting || ($pcdata_allowed && $token->type == 'text')) {
+                $result[] = $token;
+            } elseif ($pcdata_allowed && $escape_invalid_children) {
+                $result[] =
+                    new HTMLPurifier_Token_Text(
+                        $this->gen->generateFromToken( $token, $config )
+                    );
+            } else {
+                // drop silently
+            }
+        }
+        if (empty($result)) return false;
+        if ($all_whitespace) return false;
+        if ($tokens_of_children == $result) return true;
+        return $result;
+    }
+}
+
+?>
--- a/library/HTMLPurifier/ChildDef/StrictBlockquote.php
+++ b/library/HTMLPurifier/ChildDef/StrictBlockquote.php
@@ -0,0 +1,70 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef/Required.php';
+
+/**
+ * Takes the contents of blockquote when in strict and reformats for validation.
+ * 
+ * Unlike Transitional, XHTML 1.0 Strict allows only block-level children here.
+ */
+class   HTMLPurifier_ChildDef_StrictBlockquote
+extends HTMLPurifier_ChildDef_Required
+{
+    var $allow_empty = true;
+    var $type = 'strictblockquote';
+    var $init = false;
+    function HTMLPurifier_ChildDef_StrictBlockquote() {}
+    function validateChildren($tokens_of_children, $config, &$context) {
+        
+        $def = $config->getHTMLDefinition();
+        if (!$this->init) {
+            // allow all inline elements
+            $this->elements = $def->info_flow_elements;
+            $this->elements['#PCDATA'] = true;
+            $this->init = true;
+        }
+        
+        $result = parent::validateChildren($tokens_of_children, $config, $context);
+        if ($result === false) return array();
+        if ($result === true) $result = $tokens_of_children;
+        
+        $block_wrap_start = new HTMLPurifier_Token_Start($def->info_block_wrapper);
+        $block_wrap_end   = new HTMLPurifier_Token_End(  $def->info_block_wrapper);
+        $is_inline = false;
+        $depth = 0;
+        $ret = array();
+        
+        // assuming that there are no comment tokens
+        foreach ($result as $i => $token) {
+            $token = $result[$i];
+            // ifs are nested for readability
+            if (!$is_inline) {
+                if (!$depth) {
+                     if (($token->type == 'text') ||
+                         ($def->info[$token->name]->type == 'inline')) {
+                        $is_inline = true;
+                        $ret[] = $block_wrap_start;
+                     }
+                }
+            } else {
+                if (!$depth) {
+                    // starting tokens have been inline text / empty
+                    if ($token->type == 'start' || $token->type == 'empty') {
+                        if ($def->info[$token->name]->type == 'block') {
+                            // ended
+                            $ret[] = $block_wrap_end;
+                            $is_inline = false;
+                        }
+                    }
+                }
+            }
+            $ret[] = $token;
+            if ($token->type == 'start') $depth++;
+            if ($token->type == 'end')   $depth--;
+        }
+        if ($is_inline) $ret[] = $block_wrap_end;
+        return $ret;
+    }
+}
+
+?>
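
The wrapping pass in `validateChildren()` above can be sketched without the library. This is a simplified illustration under stated assumptions: tokens are plain arrays with hypothetical `type` and `name` keys, `$block_names` is an assumed lookup of block-level element names, and anything that is not a block element is treated as inline (the real code consults the HTML definition instead).

```php
<?php
// Hypothetical token shape: array('type' => 'start'|'end'|'empty'|'text', 'name' => string|null).
// wrap_inline_runs() is an illustrative name, not an HTML Purifier function.

function wrap_inline_runs(array $tokens, array $block_names, $wrapper = 'p') {
    $open   = array('type' => 'start', 'name' => $wrapper);
    $close  = array('type' => 'end',   'name' => $wrapper);
    $in_run = false; // are we inside a wrapper that we opened?
    $depth  = 0;     // nesting depth relative to the blockquote
    $out    = array();
    foreach ($tokens as $token) {
        // Only direct children decide where a wrapper opens or closes.
        if ($depth === 0) {
            $is_block = $token['type'] !== 'text'
                && isset($block_names[$token['name']]);
            if (!$in_run && !$is_block) {
                $out[] = $open;   // an inline run begins
                $in_run = true;
            } elseif ($in_run && $is_block) {
                $out[] = $close;  // a block element ends the run
                $in_run = false;
            }
        }
        $out[] = $token;
        if ($token['type'] === 'start') $depth++;
        if ($token['type'] === 'end')   $depth--;
    }
    if ($in_run) $out[] = $close; // close a run left open at the end
    return $out;
}

$tokens = array(
    array('type' => 'text',  'name' => null),
    array('type' => 'start', 'name' => 'b'),
    array('type' => 'text',  'name' => null),
    array('type' => 'end',   'name' => 'b'),
    array('type' => 'start', 'name' => 'div'),
    array('type' => 'text',  'name' => null),
    array('type' => 'end',   'name' => 'div'),
);
$wrapped = wrap_inline_runs($tokens, array('div' => true, 'p' => true));
echo count($wrapped), "\n"; // 9: the leading text and <b> run gains a <p> ... </p> pair
```

Tracking depth matters here: text nested inside the `div` is not a direct child of the blockquote, so it never opens a wrapper.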
--- a/library/HTMLPurifier/ChildDef/Table.php
+++ b/library/HTMLPurifier/ChildDef/Table.php
@@ -0,0 +1,142 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDef.php';
+
+/**
+ * Definition for tables
+ */
+class HTMLPurifier_ChildDef_Table extends HTMLPurifier_ChildDef
+{
+    var $allow_empty = false;
+    var $type = 'table';
+    function HTMLPurifier_ChildDef_Table() {}
+    function validateChildren($tokens_of_children, $config, &$context) {
+        if (empty($tokens_of_children)) return false;
+        
+        // this ensures that the loop gets run one last time before closing
+        // up. It's a little bit of a hack, but it works! Just make sure you
+        // get rid of the token later.
+        $tokens_of_children[] = false;
+        
+        // only one of these elements is allowed in a table
+        $caption = false;
+        $thead   = false;
+        $tfoot   = false;
+        
+        // as many of these as you want
+        $cols    = array();
+        $content = array();
+        
+        $nesting = 0; // current depth so we can determine nodes
+        $is_collecting = false; // are we globbing together tokens to package
+                                // into one of the collectors?
+        $collection = array(); // collected nodes
+        $tag_index = 0; // the first node might be whitespace,
+                            // so this tells us where the start tag is
+        
+        foreach ($tokens_of_children as $token) {
+            $is_child = ($nesting == 0);
+            
+            if ($token === false) {
+                // terminating sequence started
+            } elseif ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            // handle node collection
+            if ($is_collecting) {
+                if ($is_child) {
+                    // okay, let's stash the tokens away
+                    // first token tells us the type of the collection
+                    switch ($collection[$tag_index]->name) {
+                        case 'tr':
+                        case 'tbody':
+                            $content[] = $collection;
+                            break;
+                        case 'caption':
+                            if ($caption !== false) break;
+                            $caption = $collection;
+                            break;
+                        case 'thead':
+                        case 'tfoot':
+                            // access the appropriate variable, $thead or $tfoot
+                            $var = $collection[$tag_index]->name;
+                            if ($$var === false) {
+                                $$var = $collection;
+                            } else {
+                                // transmute the first and last entries into
+                                // tbody tags, and then put into content
+                                $collection[$tag_index]->name = 'tbody';
+                                $collection[count($collection)-1]->name = 'tbody';
+                                $content[] = $collection;
+                            }
+                            break;
+                        case 'colgroup':
+                            $cols[] = $collection;
+                            break;
+                    }
+                    $collection = array();
+                    $is_collecting = false;
+                    $tag_index = 0;
+                } else {
+                    // add the node to the collection
+                    $collection[] = $token;
+                }
+            }
+            
+            // terminate
+            if ($token === false) break;
+            
+            if ($is_child) {
+                // determine what we're dealing with
+                if ($token->name == 'col') {
+                    // the only empty tag in the posse, we can handle it
+                    // immediately
+                    $cols[] = array_merge($collection, array($token));
+                    $collection = array();
+                    $tag_index = 0;
+                    continue;
+                }
+                switch($token->name) {
+                    case 'caption':
+                    case 'colgroup':
+                    case 'thead':
+                    case 'tfoot':
+                    case 'tbody':
+                    case 'tr':
+                        $is_collecting = true;
+                        $collection[] = $token;
+                        continue;
+                    default:
+                        if ($token->type == 'text' && $token->is_whitespace) {
+                            $collection[] = $token;
+                            $tag_index++;
+                        }
+                        continue;
+                }
+            }
+        }
+        
+        if (empty($content)) return false;
+        
+        $ret = array();
+        if ($caption !== false) $ret = array_merge($ret, $caption);
+        if ($cols !== false)    foreach ($cols as $token_array) $ret = array_merge($ret, $token_array);
+        if ($thead !== false)   $ret = array_merge($ret, $thead);
+        if ($tfoot !== false)   $ret = array_merge($ret, $tfoot);
+        foreach ($content as $token_array) $ret = array_merge($ret, $token_array);
+        if (!empty($collection) && $is_collecting == false){
+            // grab the trailing space
+            $ret = array_merge($ret, $collection);
+        }
+        
+        array_pop($tokens_of_children); // remove phantom token
+        
+        return ($ret === $tokens_of_children) ? true : $ret;
+        
+    }
+}
+
+?>
--- a/library/HTMLPurifier/Config.php
+++ b/library/HTMLPurifier/Config.php
@@ -68,6 +68,19 @@ class HTMLPurifier_Config
        return $this->conf[$namespace][$key];
    }
    
+    /**
+     * Retrieves an array mapping directives to values for a given namespace
+     * @param $namespace String namespace
+     */
+    function getBatch($namespace) {
+        if (!isset($this->def->info[$namespace])) {
+            trigger_error('Cannot retrieve undefined namespace',
+                E_USER_WARNING);
+            return;
+        }
+        return $this->conf[$namespace];
+    }
+    
    /**
     * Sets a value to configuration.
     * @param $namespace String namespace
@@ -134,6 +147,7 @@ class HTMLPurifier_Config
     */
    function loadArray($config_array) {
        foreach ($config_array as $key => $value) {
+            $key = str_replace('_', '.', $key);
            if (strpos($key, '.') !== false) {
                // condensed form
                list($namespace, $directive) = explode('.', $key);
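Reviewer note: the added `str_replace` lets `loadArray()` accept keys written with an underscore in place of the dot. A minimal standalone sketch of that key handling follows. The function name `split_directive_key_sketch` and the `null` return for keys without a separator are assumptions made for illustration, not part of the library.

```php
<?php
// Hypothetical helper: normalizes 'Namespace_Directive' to
// 'Namespace.Directive' and splits it into its two parts.
// Returns null when the key has no separator (assumption of this sketch).
function split_directive_key_sketch($key) {
    $key = str_replace('_', '.', $key);
    if (strpos($key, '.') === false) return null;
    return explode('.', $key);
}
```

Note that every underscore is replaced, so a directive name that itself contained an underscore would be split into more than two parts.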
--- a/library/HTMLPurifier/ConfigSchema.php
+++ b/library/HTMLPurifier/ConfigSchema.php
@@ -247,11 +247,26 @@ class HTMLPurifier_ConfigSchema {
            case 'bool':
                if (is_int($var) && ($var === 0 || $var === 1)) {
                    $var = (bool) $var;
+                } elseif (is_string($var)) {
+                    if ($var == 'on' || $var == 'true' || $var == '1') {
+                        $var = true;
+                    } elseif ($var == 'off' || $var == 'false' || $var == '0') {
+                        $var = false;
+                    } else {
+                        break;
+                    }
                } elseif (!is_bool($var)) break;
                return $var;
            case 'list':
            case 'hash':
            case 'lookup':
+                if (is_string($var)) {
+                    // simplistic string to array method that only works
+                    // for simple lists of tag names or alphanumeric characters
+                    $var = explode(',',$var);
+                    // remove spaces
+                    foreach ($var as $i => $j) $var[$i] = trim($j);
+                }
                if (!is_array($var)) break;
                $keys = array_keys($var);
                if ($keys === array_keys($keys)) {
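Reviewer note: the string handling added to the `bool` and `list`/`hash`/`lookup` cases can be sketched in isolation as follows. This is a minimal standalone sketch, not the library's code. The names `coerce_bool_sketch` and `coerce_list_sketch` are hypothetical, and returning `null` for rejected input is an assumption (the hunk uses `break` to leave the switch instead). The sketch also compares with `===`, whereas the hunk uses `==`.

```php
<?php
// Hypothetical helper mirroring the 'bool' case: accepts real booleans,
// the integers 0/1, and the strings on/true/1 and off/false/0.
// Returns null when the value cannot be interpreted (assumption of this sketch).
function coerce_bool_sketch($var) {
    if (is_bool($var)) return $var;
    if (is_int($var) && ($var === 0 || $var === 1)) return (bool) $var;
    if (is_string($var)) {
        if ($var === 'on' || $var === 'true' || $var === '1') return true;
        if ($var === 'off' || $var === 'false' || $var === '0') return false;
    }
    return null;
}

// Hypothetical helper mirroring the list case: a comma-separated string
// becomes an array of trimmed items; arrays pass through unchanged.
function coerce_list_sketch($var) {
    if (is_string($var)) {
        $var = array_map('trim', explode(',', $var));
    }
    return is_array($var) ? $var : null;
}
```

As the comment in the hunk says, splitting on commas only suits simple lists such as tag names; values that contain commas themselves would be broken apart.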
--- a/library/HTMLPurifier/HTMLDefinition.php
+++ b/library/HTMLPurifier/HTMLDefinition.php
@@ -18,6 +18,12 @@ require_once 'HTMLPurifier/AttrTransform.php';
    require_once 'HTMLPurifier/AttrTransform/BdoDir.php';
    require_once 'HTMLPurifier/AttrTransform/ImgRequired.php';
 require_once 'HTMLPurifier/ChildDef.php';
+    require_once 'HTMLPurifier/ChildDef/Chameleon.php';
+    require_once 'HTMLPurifier/ChildDef/Empty.php';
+    require_once 'HTMLPurifier/ChildDef/Required.php';
+    require_once 'HTMLPurifier/ChildDef/Optional.php';
+    require_once 'HTMLPurifier/ChildDef/Table.php';
+    require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
 require_once 'HTMLPurifier/Generator.php';
 require_once 'HTMLPurifier/Token.php';
 require_once 'HTMLPurifier/TagTransform.php';
@@ -35,6 +41,63 @@ HTMLPurifier_ConfigSchema::define(
    'versions.'
 );

+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'Strict', false, 'bool',
+    'Determines whether to use the Strict ruleset instead of the Transitional (loose) one. '.
+    'This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'BlockWrapper', 'p', 'string',
+    'String name of element to wrap inline elements that are inside a block '.
+    'context.  This only occurs in the children of blockquote in strict mode. '.
+    'Example: with the default value, <code>&lt;blockquote&gt;Foo&lt;/blockquote&gt;</code> '.
+    'would become <code>&lt;blockquote&gt;&lt;p&gt;Foo&lt;/p&gt;&lt;/blockquote&gt;</code>. The '.
+    '<code>&lt;p&gt;</code> tags can be replaced '.
+    'with whatever you desire, as long as it is a block level element. '.
+    'This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'Parent', 'div', 'string',
+    'String name of element that HTML fragment passed to library will be '.
+    'inserted in.  An interesting variation would be using span as the '.
+    'parent element, meaning that only inline tags would be allowed. '.
+    'This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'AllowedElements', null, 'lookup/null',
+    'If HTML Purifier\'s tag set is unsatisfactory for your needs, you '.
+    'can overload it with your own list of tags to allow.  Note that this '.
+    'method is subtractive: it does its job by taking away from HTML Purifier\'s '.
+    'usual feature set, so you cannot add a tag that HTML Purifier never '.
+    'supported in the first place (like embed).  If you change this, you '.
+    'probably also want to change %HTML.AllowedAttributes. '.
+    '<strong>Warning:</strong> If another directive conflicts with the '.
+    'elements here, <em>that</em> directive will win and override. '.
+    'This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'AllowedAttributes', null, 'lookup/null',
+    'If HTML Purifier\'s attribute set is unsatisfactory, overload it! '.
+    'The syntax is \'tag.attr\' or \'*.attr\' for the global attributes '.
+    '(style, id, class, dir, lang, xml:lang). '.
+    '<strong>Warning:</strong> If another directive conflicts with the '.
+    'attributes here, <em>that</em> directive will win and override. For '.
+    'example, %HTML.EnableAttrID will take precedence over *.id in this '.
+    'directive.  You must set that directive to true before you can use '.
+    'IDs at all. This directive has been available since 1.3.0.'
+);
+
+HTMLPurifier_ConfigSchema::define(
+    'Attr', 'DisableURI', false, 'bool',
+    'Disables all URIs in all forms. Not sure why you\'d want to do that '.
+    '(after all, the Internet\'s founded on the notion of a hyperlink). '.
+    'This directive has been available since 1.3.0.'
+);
+
 /**
 * Defines the purified HTML type with large amounts of objects.
 * 
@@ -69,11 +132,24 @@ class HTMLPurifier_HTMLDefinition
    
    /**
     * String name of parent element HTML will be going into.
-     * @todo Allow this to be overloaded by user config
     * @public
     */
    var $info_parent = 'div';
    
+    /**
+     * Definition for parent element, allows parent element to be a
+     * tag that's not allowed inside the HTML fragment.
+     * @public
+     */
+    var $info_parent_def;
+    
+    /**
+     * String name of element used to wrap inline elements in block context
+     * @note This is rarely used except for BLOCKQUOTEs in strict mode
+     * @public
+     */
+    var $info_block_wrapper = 'p';
+    
    /**
     * Associative array of deprecated tag name to HTMLPurifier_TagTransform
     * @public
@@ -92,14 +168,25 @@ class HTMLPurifier_HTMLDefinition
     */
    var $info_attr_transform_post = array();
    
+    /**
+     * Lookup table of flow elements
+     * @public
+     */
+    var $info_flow_elements = array();
+    
+    /**
+     * Boolean: is this a strict definition?
+     * @public
+     */
+    var $strict;
+    
    /**
     * Initializes the definition, the meat of the class.
     */
    function setup($config) {
        
-        // emulates the structure of the DTD
-        // these are condensed, however, with bad stuff taken out
-        // screening process was done by hand
+        // some cached config values
+        $this->strict = $config->get('HTML', 'Strict');
        
        //////////////////////////////////////////////////////////////////////
        // info[] : initializes the definition objects
@@ -111,13 +198,19 @@ class HTMLPurifier_HTMLDefinition
            array(
                'ins', 'del', 'blockquote', 'dd', 'li', 'div', 'em', 'strong',
                'dfn', 'code', 'samp', 'kbd', 'var', 'cite', 'abbr', 'acronym',
-                'q', 'sub', 'tt', 'sup', 'i', 'b', 'big', 'small', 'u', 's',
-                'strike', 'bdo', 'span', 'dt', 'p', 'h1', 'h2', 'h3', 'h4',
+                'q', 'sub', 'tt', 'sup', 'i', 'b', 'big', 'small',
+                'bdo', 'span', 'dt', 'p', 'h1', 'h2', 'h3', 'h4',
                'h5', 'h6', 'ol', 'ul', 'dl', 'address', 'img', 'br', 'hr',
                'pre', 'a', 'table', 'caption', 'thead', 'tfoot', 'tbody',
                'colgroup', 'col', 'td', 'th', 'tr'
            );
        
+        if (!$this->strict) {
+            $allowed_tags[] = 'u';
+            $allowed_tags[] = 's';
+            $allowed_tags[] = 'strike';
+        }
+        
        foreach ($allowed_tags as $tag) {
            $this->info[$tag] = new HTMLPurifier_ElementDef();
        }
@@ -125,6 +218,10 @@ class HTMLPurifier_HTMLDefinition
        //////////////////////////////////////////////////////////////////////
        // info[]->child : defines allowed children for elements
        
+        // emulates the structure of the DTD
+        // however, these are condensed, with bad stuff taken out
+        // screening process was done by hand
+        
        // entities: prefixed with e_ and _ replaces . from DTD
        // double underlines are entities we made up
        
@@ -148,11 +245,9 @@ class HTMLPurifier_HTMLDefinition
        $e_phrase_basic = 'em | strong | dfn | code | q | samp | kbd | var'.
          ' | cite | abbr | acronym';
        $e_phrase = "$e_phrase_basic | $e_phrase_extra";
-        $e_inline_forms = ''; // humor the dtd
        $e_misc_inline = 'ins | del';
        $e_misc = "$e_misc_inline";
-        $e_inline = "a | $e_special | $e_fontstyle | $e_phrase".
-          " | $e_inline_forms";
+        $e_inline = "a | $e_special | $e_fontstyle | $e_phrase";
        // pseudo-property we created for convenience, see later on
        $e__inline = "#PCDATA | $e_inline | $e_misc_inline";
        // note the casing
@@ -161,14 +256,14 @@ class HTMLPurifier_HTMLDefinition
        $e_lists = 'ul | ol | dl';
        $e_blocktext = 'pre | hr | blockquote | address';
        $e_block = "p | $e_heading | div | $e_lists | $e_blocktext | table";
+        $e_Block = new HTMLPurifier_ChildDef_Optional($e_block);
        $e__flow = "#PCDATA | $e_block | $e_inline | $e_misc";
        $e_Flow = new HTMLPurifier_ChildDef_Optional($e__flow);
        $e_a_content = new HTMLPurifier_ChildDef_Optional("#PCDATA".
-          " | $e_special | $e_fontstyle | $e_phrase | $e_inline_forms".
-          " | $e_misc_inline");
+          " | $e_special | $e_fontstyle | $e_phrase | $e_misc_inline");
        $e_pre_content = new HTMLPurifier_ChildDef_Optional("#PCDATA | a".
          " | $e_special_basic | $e_fontstyle_basic | $e_phrase_basic".
-          " | $e_inline_forms | $e_misc_inline");
+          " | $e_misc_inline");
        $e_form_content = new HTMLPurifier_ChildDef_Optional('');//unused
        $e_form_button_content = new HTMLPurifier_ChildDef_Optional('');//unused
        
@@ -176,11 +271,16 @@ class HTMLPurifier_HTMLDefinition
        $this->info['del']->child =
            new HTMLPurifier_ChildDef_Chameleon($e__inline, $e__flow);
        
-        $this->info['blockquote']->child=
        $this->info['dd']->child  =
        $this->info['li']->child  =
        $this->info['div']->child = $e_Flow;
        
+        if ($this->strict) {
+            $this->info['blockquote']->child = new HTMLPurifier_ChildDef_StrictBlockquote();
+        } else {
+            $this->info['blockquote']->child = $e_Flow;
+        }
+        
        $this->info['caption']->child   = 
        $this->info['em']->child   =
        $this->info['strong']->child    =
@@ -220,9 +320,13 @@ class HTMLPurifier_HTMLDefinition
        
        $this->info['dl']->child   = new HTMLPurifier_ChildDef_Required('dt|dd');
        
-        $this->info['address']->child =
-          new HTMLPurifier_ChildDef_Optional("#PCDATA | p | $e_inline".
-              " | $e_misc_inline");
+        if ($this->strict) {
+            $this->info['address']->child = $e_Inline;
+        } else {
+            $this->info['address']->child =
+              new HTMLPurifier_ChildDef_Optional("#PCDATA | p | $e_inline".
+                  " | $e_misc_inline");
+        }
        
        $this->info['img']->child  =
        $this->info['br']->child   =
@@ -250,15 +354,18 @@ class HTMLPurifier_HTMLDefinition
        
        // reuses $e_Inline and $e_Block
        foreach ($e_Inline->elements as $name => $bool) {
-            if ($name == '#PCDATA' || $name == '') continue;
+            if ($name == '#PCDATA') continue;
            $this->info[$name]->type = 'inline';
        }
        
-        $e_Block = new HTMLPurifier_ChildDef_Optional($e_block);
        foreach ($e_Block->elements as $name => $bool) {
            $this->info[$name]->type = 'block';
        }
        
+        foreach ($e_Flow->elements as $name => $bool) {
+            $this->info_flow_elements[$name] = true;
+        }
+        
        //////////////////////////////////////////////////////////////////////
        // info[]->excludes : defines elements that aren't allowed in here
        
@@ -348,16 +455,23 @@ class HTMLPurifier_HTMLDefinition
        $this->info['td']->attr['colspan'] =
        $this->info['th']->attr['colspan'] = $e__NumberSpan;
        
-        $e_URI = new HTMLPurifier_AttrDef_URI();
-        $this->info['a']->attr['href'] =
-        $this->info['img']->attr['longdesc'] =
-        $this->info['del']->attr['cite'] =
-        $this->info['ins']->attr['cite'] =
-        $this->info['blockquote']->attr['cite'] =
-        $this->info['q']->attr['cite'] = $e_URI;
+        if (!$config->get('Attr', 'DisableURI')) {
+            $e_URI = new HTMLPurifier_AttrDef_URI();
+            $this->info['a']->attr['href'] =
+            $this->info['img']->attr['longdesc'] =
+            $this->info['del']->attr['cite'] =
+            $this->info['ins']->attr['cite'] =
+            $this->info['blockquote']->attr['cite'] =
+            $this->info['q']->attr['cite'] = $e_URI;
+            
+            // URI that causes HTTP request
+            $this->info['img']->attr['src'] = new HTMLPurifier_AttrDef_URI(true);
+        }
        
-        // URI that causes HTTP request
-        $this->info['img']->attr['src'] = new HTMLPurifier_AttrDef_URI(true);
+        if (!$this->strict) {
+            $this->info['li']->attr['value'] = new HTMLPurifier_AttrDef_Integer();
+            $this->info['ol']->attr['start'] = new HTMLPurifier_AttrDef_Integer();
+        }
        
        //////////////////////////////////////////////////////////////////////
        // info_tag_transform : transformations of tags
@@ -422,6 +536,53 @@ class HTMLPurifier_HTMLDefinition
            }
        }
        
+        //////////////////////////////////////////////////////////////////////
+        // info_block_wrapper : wraps inline elements in block context
+        
+        $block_wrapper = $config->get('HTML', 'BlockWrapper');
+        if (isset($e_Block->elements[$block_wrapper])) {
+            $this->info_block_wrapper = $block_wrapper;
+        } else {
+            trigger_error('Cannot use non-block element as block wrapper.',
+                E_USER_ERROR);
+        }
+        
+        //////////////////////////////////////////////////////////////////////
+        // info_parent : parent element of the HTML fragment
+        
+        $parent = $config->get('HTML', 'Parent');
+        if (isset($this->info[$parent])) {
+            $this->info_parent = $parent;
+        } else {
+            trigger_error('Cannot use unrecognized element as parent.',
+                E_USER_ERROR);
+        }
+        $this->info_parent_def = $this->info[$this->info_parent];
+        
+        //////////////////////////////////////////////////////////////////////
+        // %HTML.Allowed(Elements|Attributes) : cut non-allowed elements
+        $allowed_elements = $config->get('HTML', 'AllowedElements');
+        if (is_array($allowed_elements)) {
+            // $allowed_elements[$this->info_parent] = true; // allow parent element
+            foreach ($this->info as $name => $d) {
+                if(!isset($allowed_elements[$name])) unset($this->info[$name]);
+            }
+        }
+        $allowed_attributes = $config->get('HTML', 'AllowedAttributes');
+        if (is_array($allowed_attributes)) {
+            foreach ($this->info_global_attr as $attr => $info) {
+                if (!isset($allowed_attributes["*.$attr"])) {
+                    unset($this->info_global_attr[$attr]);
+                }
+            }
+            foreach ($this->info as $tag => $info) {
+                foreach ($info->attr as $attr => $attr_info) {
+                    if (!isset($allowed_attributes["$tag.$attr"])) {
+                        unset($this->info[$tag]->attr[$attr]);
+                    }
+                }
+            }
+        }
    }
    
    function setAttrForTableElements($attr, $def) {
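Reviewer note: the %HTML.AllowedElements and %HTML.AllowedAttributes handling in this hunk is subtractive: it only unsets entries the definition already has. A minimal standalone sketch of that idea on plain arrays follows. `filter_definition_sketch` and the array shapes are hypothetical stand-ins for `$this->info` and its `attr` members, not the library's API, and the `*.attr` handling for global attributes is left out.

```php
<?php
// $info: element name => array of attribute name => definition (any value).
// $allowed_elements / $allowed_attributes: lookup arrays (name => true),
// or null to leave that aspect untouched.
function filter_definition_sketch($info, $allowed_elements, $allowed_attributes) {
    if (is_array($allowed_elements)) {
        foreach (array_keys($info) as $name) {
            if (!isset($allowed_elements[$name])) unset($info[$name]);
        }
    }
    if (is_array($allowed_attributes)) {
        foreach ($info as $tag => $attrs) {
            foreach (array_keys($attrs) as $attr) {
                if (!isset($allowed_attributes["$tag.$attr"])) {
                    unset($info[$tag][$attr]);
                }
            }
        }
    }
    return $info;
}

$info = array(
    'a'   => array('href' => true, 'title' => true),
    'img' => array('src' => true, 'alt' => true),
    'b'   => array(),
);
$result = filter_definition_sketch(
    $info,
    array('a' => true, 'b' => true, 'embed' => true), // 'embed' is never added
    array('a.href' => true)
);
```

Listing `embed` has no effect because the filter can only remove entries, which matches the wording of the %HTML.AllowedElements description.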
--- a/library/HTMLPurifier/Printer.php
+++ b/library/HTMLPurifier/Printer.php
@@ -0,0 +1,149 @@
+<?php
+
+require_once 'HTMLPurifier/Generator.php';
+require_once 'HTMLPurifier/Token.php';
+require_once 'HTMLPurifier/Encoder.php';
+
+class HTMLPurifier_Printer
+{
+    
+    /**
+     * Instance of HTMLPurifier_Generator for HTML generation convenience funcs
+     */
+    var $generator;
+    
+    /**
+     * Instance of HTMLPurifier_Config, for easy access
+     */
+    var $config;
+    
+    /**
+     * Initialize $generator.
+     */
+    function HTMLPurifier_Printer() {
+        $this->generator = new HTMLPurifier_Generator();
+    }
+    
+    /**
+     * Main function that renders object or aspect of that object
+     * @param $config Configuration object
+     */
+    function render($config) {}
+    
+    /**
+     * Returns a start tag
+     * @param $tag Tag name
+     * @param $attr Attribute array
+     */
+    function start($tag, $attr = array()) {
+        return $this->generator->generateFromToken(
+                    new HTMLPurifier_Token_Start($tag, $attr ? $attr : array())
+               );
+    }
+    
+    /**
+     * Returns an end tag
+     * @param $tag Tag name
+     */
+    function end($tag) {
+        return $this->generator->generateFromToken(
+                    new HTMLPurifier_Token_End($tag)
+               );
+    }
+    
+    /**
+     * Prints a complete element with content inside
+     * @param $tag Tag name
+     * @param $contents Element contents
+     * @param $attr Tag attributes
+     * @param $escape Bool whether or not to escape contents
+     */
+    function element($tag, $contents, $attr = array(), $escape = true) {
+        return $this->start($tag, $attr) .
+               ($escape ? $this->escape($contents) : $contents) .
+               $this->end($tag);
+    }
+    
+    /**
+     * Prints a simple key/value row in a table.
+     * @param $name Key
+     * @param $value Value
+     */
+    function row($name, $value) {
+        if (is_bool($value)) $value = $value ? 'On' : 'Off';
+        return
+            $this->start('tr') . "\n" .
+                $this->element('th', $name) . "\n" .
+                $this->element('td', $value) . "\n" .
+            $this->end('tr')
+        ;
+    }
+    
+    /**
+     * Escapes a string for HTML output.
+     * @param $string String to escape
+     */
+    function escape($string) {
+        $string = HTMLPurifier_Encoder::cleanUTF8($string);
+        $string = htmlspecialchars($string, ENT_COMPAT, 'UTF-8');
+        return $string;
+    }
+    
+    /**
+     * Takes a list of strings and joins them into a single comma-separated string
+     * @param $array List of strings
+     * @param $polite Bool whether or not to add an 'and' before the last item
+     */
+    function listify($array, $polite = false) {
+        if (empty($array)) return 'None';
+        $ret = '';
+        $i = count($array);
+        foreach ($array as $value) {
+            $i--;
+            $ret .= $value;
+            if ($i > 0 && !($polite && $i == 1)) $ret .= ', ';
+            if ($polite && $i == 1) $ret .= ' and ';
+        }
+        return $ret;
+    }
+    
+    /**
+     * Retrieves the class of an object without prefixes, as well as metadata
+     * @param $obj Object to determine class of
+     * @param $sec_prefix Further prefix to remove
+     */
+    function getClass($obj, $sec_prefix = '') {
+        static $five = null;
+        if ($five === null) $five = version_compare(PHP_VERSION, '5', '>=');
+        $prefix = 'HTMLPurifier_' . $sec_prefix;
+        if (!$five) $prefix = strtolower($prefix);
+        $class = str_replace($prefix, '', get_class($obj));
+        $lclass = strtolower($class);
+        $class .= '(';
+        switch ($lclass) {
+            case 'enum':
+                $values = array();
+                foreach ($obj->valid_values as $value => $bool) {
+                    $values[] = $value;
+                }
+                $class .= implode(', ', $values);
+                break;
+            case 'composite':
+                $values = array();
+                foreach ($obj->defs as $def) {
+                    $values[] = $this->getClass($def, $sec_prefix);
+                }
+                $class .= implode(', ', $values);
+                break;
+            case 'multiple':
+                $class .= $this->getClass($obj->single, $sec_prefix) . ', ';
+                $class .= $obj->max;
+                break;
+        }
+        $class .= ')';
+        return $class;
+    }
+    
+}
+
+?>
--- a/library/HTMLPurifier/Printer/CSSDefinition.php
+++ b/library/HTMLPurifier/Printer/CSSDefinition.php
@@ -0,0 +1,40 @@
+<?php
+
+require_once 'HTMLPurifier/Printer.php';
+
+class HTMLPurifier_Printer_CSSDefinition extends HTMLPurifier_Printer
+{
+    
+    var $def;
+    
+    function render($config) {
+        $this->def = $config->getCSSDefinition();
+        $ret = '';
+        
+        $ret .= $this->start('div', array('class' => 'HTMLPurifier_Printer'));
+        $ret .= $this->start('table');
+        
+        $ret .= $this->element('caption', 'Properties ($info)');
+        
+        $ret .= $this->start('thead');
+        $ret .= $this->start('tr');
+        $ret .= $this->element('th', 'Property', array('class' => 'heavy'));
+        $ret .= $this->element('th', 'Definition', array('class' => 'heavy', 'style' => 'width:auto;'));
+        $ret .= $this->end('tr');
+        $ret .= $this->end('thead');
+        
+        ksort($this->def->info);
+        foreach ($this->def->info as $property => $obj) {
+            $name = $this->getClass($obj, 'AttrDef_');
+            $ret .= $this->row($property, $name);
+        }
+        
+        $ret .= $this->end('table');
+        $ret .= $this->end('div');
+        
+        return $ret;
+    }
+    
+}
+
+?>
--- a/library/HTMLPurifier/Printer/HTMLDefinition.php
+++ b/library/HTMLPurifier/Printer/HTMLDefinition.php
@@ -0,0 +1,206 @@
+<?php
+
+require_once 'HTMLPurifier/Printer.php';
+
+class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
+{
+    
+    /**
+     * Instance of HTMLPurifier_HTMLDefinition, for easy access
+     */
+    var $def;
+    
+    function render(&$config) {
+        $ret = '';
+        $this->config =& $config;
+        $this->def =& $config->getHTMLDefinition();
+        $def =& $this->def;
+        
+        $ret .= $this->start('div', array('class' => 'HTMLPurifier_Printer'));
+        $ret .= $this->start('table');
+        $ret .= $this->element('caption', 'Environment');
+        
+        $ret .= $this->row('Parent of fragment', $def->info_parent);
+        $ret .= $this->row('Strict mode', $def->strict);
+        if ($def->strict) $ret .= $this->row('Block wrap name', $def->info_block_wrapper);
+        
+        $ret .= $this->start('tr');
+            $ret .= $this->element('th', 'Global attributes');
+            $ret .= $this->element('td', $this->listifyAttr($def->info_global_attr),0,0);
+        $ret .= $this->end('tr');
+        
+        $ret .= $this->renderChildren($def->info_parent_def->child);
+        
+        $ret .= $this->start('tr');
+            $ret .= $this->element('th', 'Tag transforms');
+            $list = array();
+            foreach ($def->info_tag_transform as $old => $new) {
+                $new = $this->getClass($new, 'TagTransform_');
+                $list[] = "<$old> with $new";
+            }
+            $ret .= $this->element('td', $this->listify($list));
+        $ret .= $this->end('tr');
+        
+        $ret .= $this->start('tr');
+            $ret .= $this->element('th', 'Pre-AttrTransform');
+            $ret .= $this->element('td', $this->listifyObjectList($def->info_attr_transform_pre));
+        $ret .= $this->end('tr');
+        
+        $ret .= $this->start('tr');
+            $ret .= $this->element('th', 'Post-AttrTransform');
+            $ret .= $this->element('td', $this->listifyObjectList($def->info_attr_transform_post));
+        $ret .= $this->end('tr');
+        
+        $ret .= $this->end('table');
+        
+        
+        $ret .= $this->renderInfo();
+        
+        
+        $ret .= $this->end('div');
+        
+        return $ret;
+    }
+    
+    /**
+     * Renders the Elements ($info) table
+     */
+    function renderInfo() {
+        $ret = '';
+        $ret .= $this->start('table');
+        $ret .= $this->element('caption', 'Elements ($info)');
+        ksort($this->def->info);
+        $ret .= $this->start('tr');
+        $ret .= $this->element('th', 'Allowed tags', array('colspan' => 2, 'class' => 'heavy'));
+        $ret .= $this->end('tr');
+        $ret .= $this->start('tr');
+        $ret .= $this->element('td', $this->listifyTagLookup($this->def->info), array('colspan' => 2));
+        $ret .= $this->end('tr');
+        foreach ($this->def->info as $name => $def) {
+            $ret .= $this->start('tr');
+                $ret .= $this->element('th', "<$name>", array('class'=>'heavy', 'colspan' => 2));
+            $ret .= $this->end('tr');
+            $ret .= $this->start('tr');
+                $ret .= $this->element('th', 'Type');
+                $ret .= $this->element('td', ucfirst($def->type));
+            $ret .= $this->end('tr');
+            if (!empty($def->excludes)) {
+                $ret .= $this->start('tr');
+                    $ret .= $this->element('th', 'Excludes');
+                    $ret .= $this->element('td', $this->listifyTagLookup($def->excludes));
+                $ret .= $this->end('tr');
+            }
+            if (!empty($def->attr_transform_pre)) {
+                $ret .= $this->start('tr');
+                    $ret .= $this->element('th', 'Pre-AttrTransform');
+                    $ret .= $this->element('td', $this->listifyObjectList($def->attr_transform_pre));
+                $ret .= $this->end('tr');
+            }
+            if (!empty($def->attr_transform_post)) {
+                $ret .= $this->start('tr');
+                    $ret .= $this->element('th', 'Post-AttrTransform');
+                    $ret .= $this->element('td', $this->listifyObjectList($def->attr_transform_post));
+                $ret .= $this->end('tr');
+            }
+            if (!empty($def->auto_close)) {
+                $ret .= $this->start('tr');
+                    $ret .= $this->element('th', 'Auto closed by');
+                    $ret .= $this->element('td', $this->listifyTagLookup($def->auto_close));
+                $ret .= $this->end('tr');
+            }
+            $ret .= $this->start('tr');
+                $ret .= $this->element('th', 'Allowed attributes');
+                $ret .= $this->element('td',$this->listifyAttr($def->attr),0,0);
+            $ret .= $this->end('tr');
+            
+            $ret .= $this->renderChildren($def->child);
+        }
+        $ret .= $this->end('table');
+        return $ret;
+    }
+    
+    /** 
+     * Renders a row describing the allowed children of an element
+     * @param $def HTMLPurifier_ChildDef of pertinent element
+     */
+    function renderChildren($def) {
+        $context = new HTMLPurifier_Context();
+        $ret = '';
+        $ret .= $this->start('tr');
+            $elements = array();
+            $attr = array();
+            if (isset($def->elements)) {
+                if ($def->type == 'strictblockquote') $def->validateChildren(array(), $this->config, $context);
+                $elements = $def->elements;
+            } elseif ($def->type == 'chameleon') {
+                $attr['rowspan'] = 2;
+            } elseif ($def->type == 'empty') {
+                $elements = array();
+            } elseif ($def->type == 'table') {
+                $elements = array('col', 'caption', 'colgroup', 'thead',
+                    'tfoot', 'tbody', 'tr');
+            }
+            $ret .= $this->element('th', 'Allowed children', $attr);
+            
+            if ($def->type == 'chameleon') {
+                
+                $ret .= $this->element('td',
+                    '<em>Block</em>: ' .
+                    $this->escape($this->listifyTagLookup($def->block->elements)),0,0);
+                $ret .= $this->end('tr');
+                $ret .= $this->start('tr');
+                $ret .= $this->element('td',
+                    '<em>Inline</em>: ' .
+                    $this->escape($this->listifyTagLookup($def->inline->elements)),0,0);
+                
+            } else {
+                $ret .= $this->element('td',
+                    '<em>'.ucfirst($def->type).'</em>: ' .
+                    $this->escape($this->listifyTagLookup($elements)),0,0);
+            }
+        $ret .= $this->end('tr');
+        return $ret;
+    }
+    
+    /** 
+     * Listifies a tag lookup table.
+     * @param $array Tag lookup array in form of array('tagname' => true)
+     */
+    function listifyTagLookup($array) {
+        $list = array();
+        foreach ($array as $name => $discard) {
+            if ($name !== '#PCDATA' && !isset($this->def->info[$name])) continue;
+            $list[] = $name;
+        }
+        return $this->listify($list);
+    }
+    
+    /**
+     * Listifies a list of objects by retrieving their class names
+     * @param $array List of objects
+     * @todo Also add information about internal state
+     */
+    function listifyObjectList($array) {
+        $list = array();
+        foreach ($array as $discard => $obj) {
+            $list[] = $this->getClass($obj, 'AttrTransform_');
+        }
+        return $this->listify($list);
+    }
+    
+    /**
+     * Listifies a hash of attributes to AttrDef classes
+     * @param $array Array hash in form of array('attrname' => HTMLPurifier_AttrDef)
+     */
+    function listifyAttr($array) {
+        $list = array();
+        foreach ($array as $name => $obj) {
+            if ($obj === false) continue;
+            $list[] = "$name&nbsp;=&nbsp;<i>" . $this->getClass($obj, 'AttrDef_') . '</i>';
+        }
+        return $this->listify($list);
+    }
+    
+}
+
+?>
--- a/library/HTMLPurifier/Strategy/FixNesting.php
+++ b/library/HTMLPurifier/Strategy/FixNesting.php
@@ -104,7 +104,11 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            if ($count = count($stack)) {
                $parent_index = $stack[$count-1];
                $parent_name  = $tokens[$parent_index]->name;
-                $parent_def   = $definition->info[$parent_name];
+                if ($parent_index == 0) {
+                    $parent_def   = $definition->info_parent_def;
+                } else {
+                    $parent_def   = $definition->info[$parent_name];
+                }
            } else {
                // unknown info, it won't be used anyway
                $parent_index = $parent_name = $parent_def = null;
@@ -141,9 +145,17 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            if ($excluded) {
                // there is an exclusion, remove the entire node
                $result = false;
+                $excludes = array(); // not used, but good to initialize anyway
            } else {
                // DEFINITION CALL
-                $def = $definition->info[$tokens[$i]->name];
+                if ($i === 0) {
+                    // special processing for the first node
+                    $def = $definition->info_parent_def;
+                } else {
+                    $def = $definition->info[$tokens[$i]->name];
+                    
+                }
+                
                $child_def = $def->child;
                
                // have DTD child def validate children
@@ -228,13 +240,20 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            
            // Test if the token indeed is a start tag, if not, move forward
            // and test again.
+            $size = count($tokens);
            while ($i < $size and $tokens[$i]->type != 'start') {
                if ($tokens[$i]->type == 'end') {
                    // pop a token index off the stack if we ended a node
                    array_pop($stack);
                    // pop an exclusion lookup off exclusion stack if
                    // we ended node and that node had exclusions
-                    if ($definition->info[$tokens[$i]->name]->excludes) {
+                    if ($i == 0 || $i == $size - 1) {
+                        // use specialized var if it's the super-parent
+                        $s_excludes = $definition->info_parent_def->excludes;
+                    } else {
+                        $s_excludes = $definition->info[$tokens[$i]->name]->excludes;
+                    }
+                    if ($s_excludes) {
                        array_pop($exclude_stack);
                    }
                }
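Note on the FixNesting hunks above: all three apply the same rule. When the token under inspection is the outermost one (the source comments call it the "first node" or "super-parent"), the definition comes from info_parent_def rather than from the per-tag info table. A minimal standalone sketch of that selection follows, assuming plain arrays and strings in place of the real definition objects; select_def() is a hypothetical name, not a library function.

```php
<?php
// Hypothetical helper, not part of HTML Purifier: picks the definition
// for the token at $index, using the parent definition for index 0.
function select_def($index, $name, array $info, $info_parent_def) {
    if ($index === 0) {
        // special case: the first node uses the configured parent definition
        return $info_parent_def;
    }
    // every other node is looked up by tag name
    return $info[$name];
}

// Placeholder strings stand in for definition objects in this sketch.
$info = array('div' => 'div-def', 'span' => 'span-def');
echo select_def(0, 'div', $info, 'parent-def'), "\n";
echo select_def(3, 'span', $info, 'parent-def'), "\n";
```

Keeping the branch in one helper would avoid repeating the index test at each call site, though the patch itself inlines it.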
--- a/library/HTMLPurifier/Strategy/RemoveForeignElements.php
+++ b/library/HTMLPurifier/Strategy/RemoveForeignElements.php
@@ -5,6 +5,14 @@ require_once 'HTMLPurifier/HTMLDefinition.php';
 require_once 'HTMLPurifier/Generator.php';
 require_once 'HTMLPurifier/TagTransform.php';

+HTMLPurifier_ConfigSchema::define(
+    'Core', 'RemoveInvalidImg', true, 'bool',
+    'This directive enables pre-emptive URI checking in <code>img</code> '.
+    'tags, as the attribute validation strategy is not authorized to '.
+    'remove elements from the document.  This directive has been available '.
+    'since 1.3.0; revert to pre-1.3.0 behavior by setting it to false.'
+);
+
 /**
 * Removes all unrecognized tags from the list of tokens.
 * 
@@ -25,7 +33,23 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
            if (!empty( $token->is_tag )) {
                // DEFINITION CALL
                if (isset($definition->info[$token->name])) {
-                    // leave untouched
+                    // leave untouched, except for a few special cases:
+                    
+                    // hard-coded image special case: pre-emptively drop the
+                    // tag if src is missing or invalid. Probably not abstract-able
+                    if ( $token->name == 'img' ) {
+                        if (!isset($token->attr['src'])) continue;
+                        if (!isset($definition->info['img']->attr['src'])) {
+                            continue;
+                        }
+                        $token->attr['src'] =
+                            $definition->
+                                info['img']->
+                                    attr['src']->
+                                        validate($token->attr['src']);
+                        if ($token->attr['src'] === false) continue;
+                    }
+                    
                } elseif (
                    isset($definition->info_tag_transform[$token->name])
                ) {
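Note on the img special case above: the tag is skipped when src is absent or when the src validator returns false; otherwise the validated value replaces the original. The sketch below reproduces that flow on plain arrays. Both validate_src() and drop_invalid_images() are hypothetical stand-ins written for illustration, and the validation rule is an assumption, far simpler than the library's URI AttrDef.

```php
<?php
// Hypothetical stand-in for the src AttrDef: returns the cleaned URI,
// or false when the value is unusable. Not HTML Purifier's real validator.
function validate_src($uri) {
    $uri = trim($uri);
    if ($uri === '' || stripos($uri, 'javascript:') === 0) return false;
    return $uri;
}

// Drops img tokens whose src is missing or fails validation; every other
// token passes through untouched. Tokens are plain arrays in this sketch.
function drop_invalid_images(array $tokens) {
    $result = array();
    foreach ($tokens as $token) {
        if ($token['name'] === 'img') {
            if (!isset($token['attr']['src'])) continue;
            $src = validate_src($token['attr']['src']);
            if ($src === false) continue;
            $token['attr']['src'] = $src;
        }
        $result[] = $token;
    }
    return $result;
}

$tokens = array(
    array('name' => 'img', 'attr' => array('src' => 'photo.png')),
    array('name' => 'img', 'attr' => array()),
    array('name' => 'img', 'attr' => array('src' => 'javascript:alert(1)')),
    array('name' => 'b',   'attr' => array()),
);
$kept = drop_invalid_images($tokens);
echo count($kept), " tokens kept\n";
```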
--- a/smoketests/common.php
+++ b/smoketests/common.php
@@ -2,8 +2,7 @@

 header('Content-type: text/html; charset=UTF-8');

-set_include_path('../library' . PATH_SEPARATOR . get_include_path());
-require_once 'HTMLPurifier.php';
+require_once '../library/HTMLPurifier.auto.php';

 function escapeHTML($string) {
    $string = HTMLPurifier_Encoder::cleanUTF8($string);
--- a/smoketests/printDefinition.php
+++ b/smoketests/printDefinition.php
@@ -0,0 +1,137 @@
+<?php
+
+require_once 'common.php'; // load library
+
+require_once 'HTMLPurifier/Printer/HTMLDefinition.php';
+require_once 'HTMLPurifier/Printer/CSSDefinition.php';
+
+$config = HTMLPurifier_Config::createDefault();
+
+// you can do custom configuration!
+if (file_exists('printDefinition.settings.php')) {
+    include 'printDefinition.settings.php';
+}
+
+$get = $_GET;
+foreach ($_GET as $key => $value) {
+    if (!strncmp($key, 'Null_', 5) && !empty($value)) {
+        unset($get[substr($key, 5)]);
+        unset($get[$key]);
+    }
+}
+
+@$config->loadArray($get);
+
+$printer_html_definition = new HTMLPurifier_Printer_HTMLDefinition();
+$printer_css_definition  = new HTMLPurifier_Printer_CSSDefinition();
+
+echo '<?xml version="1.0" encoding="UTF-8" ?>';
+?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+<head>
+    <title>HTML Purifier Printer Smoketest</title>
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+    <style type="text/css">
+        form table {margin:1em auto;}
+        form th {text-align:right;padding-right:1em;}
+        .HTMLPurifier_Printer table {border-collapse:collapse;
+            border:1px solid #000; width:600px;
+            margin:1em auto;font-family:sans-serif;font-size:75%;}
+        .HTMLPurifier_Printer td, .HTMLPurifier_Printer th {padding:3px;
+            border:1px solid #000;background:#CCC; vertical-align: baseline;}
+        .HTMLPurifier_Printer th {text-align:left;background:#CCF;width:20%;}
+        .HTMLPurifier_Printer caption {font-size:1.5em; font-weight:bold;
+            width:100%;}
+        .HTMLPurifier_Printer .heavy {background:#99C;text-align:center;}
+    </style>
+    <script type="text/javascript">
+        function toggleWriteability(id_of_patient, checked) {
+            document.getElementById(id_of_patient).disabled = checked;
+        }
+    </script>
+</head>
+<body>
+<h1>HTML Purifier Printer Smoketest</h1>
+<p>This page will allow you to see precisely what HTML Purifier's internal
+whitelist is. You can
+also twiddle with the configuration settings to see how a directive
+influences the internal workings of the definition objects.</p>
+<h2>Modify configuration</h2>
+
+<p>You can specify an array by typing in a comma-separated
+list of items; HTML Purifier will take care of the rest (including
+transformation into a real array list or a lookup table). If a
+directive can be set to null, that usually means that the feature
+is disabled when it is null (not that, say, no tags are allowed).</p>
+
+<form id="edit-config" method="get" action="printDefinition.php">
+<table>
+<?php
+    $directives = $config->getBatch('HTML');
+    // can't handle hashes
+    foreach ($directives as $key => $value) {
+        $directive = "HTML.$key";
+        if (is_array($value)) {
+            $keys = array_keys($value);
+            if ($keys === array_keys($keys)) {
+                $value = implode(',', $keys);
+            } else {
+                $new_value = '';
+                foreach ($value as $name => $bool) {
+                    if ($bool !== true) continue;
+                    $new_value .= "$name,";
+                }
+                $value = rtrim($new_value, ',');
+            }
+        }
+        $allow_null = $config->def->info['HTML'][$key]->allow_null;
+?>
+<tr>
+<th>
+    <a href="http://hp.jpsband.org/live/configdoc/plain.html#<?php echo $directive ?>">
+        %<?php echo $directive; ?>
+    </a>
+</th>
+<td>
+<?php if (is_bool($value)) { ?>
+    Yes <input type="radio" name="<?php echo $directive; ?>" value="1"<?php if ($value) { ?> checked="checked"<?php } ?> /> &nbsp;
+    No <input type="radio" name="<?php echo $directive; ?>" value="0"<?php if (!$value) { ?> checked="checked"<?php } ?> />
+<?php } else { ?>
+    <?php if($allow_null) { ?>
+        Null/Disabled <input
+                type="checkbox"
+                value="1"
+                onclick="toggleWriteability('<?php echo $directive ?>',checked)"
+                name="Null_<?php echo $directive; ?>"
+                <?php if ($value === null) { ?> checked="checked"<?php } ?>
+              /> or <br />
+    <?php } ?>
+    <input
+        type="text"
+        id="<?php echo $directive; ?>"
+        name="<?php echo $directive; ?>"
+        value="<?php echo escapeHTML($value); ?>"
+        <?php if($value === null) {echo 'disabled="disabled"';} ?>
+    />
+<?php } ?>
+</td>
+</tr>
+<?php
+    }
+?>
+<tr>
+    <td colspan="2" style="text-align:right;">
+        [<a href="printDefinition.php">Reset</a>]
+        <input type="submit" value="Submit" />
+    </td>
+</tr>
+</table>
+</form>
+<h2>HTMLDefinition</h2>
+<?php echo $printer_html_definition->render($config) ?>
+<h2>CSSDefinition</h2>
+<?php echo $printer_css_definition->render($config) ?>
+</body>
+</html>
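Note on the Null_ handling in printDefinition.php above: for every non-empty request key that starts with "Null_", the loop removes both that flag and the key it names before the array is handed to loadArray(). A standalone sketch of the same loop on an ordinary array, with hypothetical key names and a hypothetical function name (strip_nulled), might look like this:

```php
<?php
// Hypothetical helper mirroring the loop in printDefinition.php: for each
// non-empty "Null_<name>" key, remove both the flag and "<name>" itself.
function strip_nulled(array $params) {
    $out = $params;
    foreach ($params as $key => $value) {
        // strncmp() returns 0 when the first five characters match
        if (!strncmp($key, 'Null_', 5) && !empty($value)) {
            unset($out[substr($key, 5)]);
            unset($out[$key]);
        }
    }
    return $out;
}

// Illustrative keys only; real requests carry directive names.
$params = array(
    'Null_Alpha' => '1',
    'Alpha'      => 'ignored',
    'Beta'       => 'kept',
);
print_r(strip_nulled($params));
```

Working on a copy, as the smoketest does with $get, keeps the iteration over the original array safe while entries are removed.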
--- a/smoketests/utf8.php
+++ b/smoketests/utf8.php
@@ -2,16 +2,17 @@

 require_once 'common.php';

+echo '<?xml version="1.0" encoding="UTF-8" ?>';
 ?><!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-<title>HTMLPurifier UTF-8 Smoketest</title>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+    <title>HTML Purifier UTF-8 Smoketest</title>
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTMLPurifier UTF-8 Smoketest</h1>
+<h1>HTML Purifier UTF-8 Smoketest</h1>
 <?php

 $purifier = new HTMLPurifier();
--- a/smoketests/variableWidthAttack.php
+++ b/smoketests/variableWidthAttack.php
@@ -2,16 +2,17 @@

 require_once 'common.php';

+echo '<?xml version="1.0" encoding="UTF-8" ?>';
 ?><!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-<title>HTMLPurifier Variable Width Attack Smoketest</title>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+    <title>HTML Purifier Variable Width Attack Smoketest</title>
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTMLPurifier Variable Width Attack Smoketest</h1>
+<h1>HTML Purifier Variable Width Attack Smoketest</h1>
 <p>For more information, see
 <a href="http://applesoup.googlepages.com/bypass_filter.txt">Cheng Peng Su's
 original advisory.</a>  This particular exploit code appears only to work
--- a/smoketests/xssAttacks.php
+++ b/smoketests/xssAttacks.php
@@ -20,7 +20,7 @@ function formatCode($string) {
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-    <title>HTMLPurifier XSS Attacks Smoketest</title>
+    <title>HTML Purifier XSS Attacks Smoketest</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <style type="text/css">
        .scroll {overflow:auto; width:100%;}
@@ -31,13 +31,13 @@ function formatCode($string) {
    </style>
 </head>
 <body>
-<h1>HTMLPurifier XSS Attacks Smoketest</h1>
+<h1>HTML Purifier XSS Attacks Smoketest</h1>
 <p>XSS attacks are from
 <a href="http://ha.ckers.org/xss.html">http://ha.ckers.org/xss.html</a>.</p>
 <p><strong>Caveats:</strong>
-The last segment of tests regarding blacklisted websites is not
-applicable at the moment, but when we add that functionality they'll be
-relevant. Most XSS broadcasts its presence by spawning an alert dialogue.
+<tt>Google.com</tt> has been programmatically disallowed, but as you can
+see, there are ways of getting around that, so coverage in this area
+is not complete. Most XSS broadcasts its presence by spawning an alert dialogue.
 The displayed code is not strictly correct, as linebreaks have been forced for
 readability. Linewraps have been marked with <tt>»</tt>.  Some tests are
 omitted for your convenience. Not all control characters are displayed.</p>
@@ -48,7 +48,12 @@ omitted for your convenience. Not all control characters are displayed.</p>
 if (version_compare(PHP_VERSION, '5', '<')) exit('<p>Requires PHP 5.</p>');

 $xml = simplexml_load_file('xssAttacks.xml');
-$purifier = new HTMLPurifier();
+
+// programmatically disallow google.com for URI evasion tests
+// not complete
+$config = HTMLPurifier_Config::createDefault();
+$config->set('URI', 'HostBlacklist', array('google.com'));
+$purifier = new HTMLPurifier($config);

 ?>
 <table cellspacing="0" cellpadding="2">
--- a/smoketests/xssAttacks.xml
+++ b/smoketests/xssAttacks.xml
@@ -2,7 +2,7 @@
 <xss>
 	<attack>
 		<name>XSS Locator</name>
-		<code>&apos;;alert(String.fromCharCode(88,83,83))//\&apos;;alert(String.fromCharCode(88,83,83))//&quot;;alert(String.fromCharCode(88,83,83))//\&quot;;alert(String.fromCharCode(88,83,83))//&gt;&lt;/SCRIPT&gt;!--&lt;SCRIPT&gt;alert(String.fromCharCode(88,83,83))&lt;/SCRIPT&gt;=&amp;{}</code>
+		<code>&apos;;alert(String.fromCharCode(88,83,83))//\&apos;;alert(String.fromCharCode(88,83,83))//&quot;;alert(String.fromCharCode(88,83,83))//\&quot;;alert(String.fromCharCode(88,83,83))//--&gt;&lt;/SCRIPT&gt;&quot;&gt;&apos;&gt;&lt;SCRIPT&gt;alert(String.fromCharCode(88,83,83))&lt;/SCRIPT&gt;=&amp;{}</code>

 		<desc>Inject this string, and in most cases where a script is vulnerable with no special XSS vector requirements the word &quot;XSS&quot; will pop up.  You&apos;ll need to replace the &quot;&amp;&quot; with &quot;%26&quot; if you are submitting this XSS string via HTTP GET or it will be ignored and everything after it will be interpreted as another variable.  Tip: If you&apos;re in a rush and need to quickly check a page, often times injecting the deprecated &quot;&lt;PLAINTEXT&gt;&quot; tag will be enough to check to see if something is vulnerable to XSS by messing up the output appreciably.</desc>
 		<label>Basic XSS Attacks</label>
--- a/tests/HTMLPurifier/AttrDef/URITest.php
+++ b/tests/HTMLPurifier/AttrDef/URITest.php
@@ -271,6 +271,61 @@ class HTMLPurifier_AttrDef_URITest extends HTMLPurifier_AttrDefHarness
        
    }
    
+    function testDisableExternalResources() {
+        
+        $this->config->set('URI', 'DisableExternalResources', true);
+        
+        $this->def = new HTMLPurifier_AttrDef_URI();
+        $this->assertDef('http://sub.example.com/alas?foo=asd');
+        $this->assertDef('/img.png');
+        
+        $this->def = new HTMLPurifier_AttrDef_URI(true);
+        $this->assertDef('http://sub.example.com/alas?foo=asd', false);
+        $this->assertDef('/img.png');
+        
+    }
+    
+    function testMunge() {
+        
+        $this->config->set('URI', 'Munge', 'http://www.google.com/url?q=%s');
+        $this->def = new HTMLPurifier_AttrDef_URI();
+        
+        $this->assertDef(
+            'http://www.example.com/',
+            'http://www.google.com/url?q=http%3A%2F%2Fwww.example.com%2F'
+        );
+        
+        $this->assertDef('index.html');
+        $this->assertDef('javascript:foobar();', false);
+        
+    }
+    
+    function testBlacklist() {
+        
+        $this->config->set('URI', 'HostBlacklist', array('example.com', 'moo'));
+        
+        $this->assertDef('foo.txt');
+        $this->assertDef('http://www.google.com/example.com/moo');
+        
+        $this->assertDef('http://example.com/#23', false);
+        $this->assertDef('https://sub.domain.example.com/foobar', false);
+        $this->assertDef('http://example.com.example.net/?whoo=foo', false);
+        $this->assertDef('ftp://moo-moo.net/foo/foo/', false);
+        
+    }
+    
+    function testWhitelist() {
+        /*
+        $this->config->set('URI', 'HostPolicy', 'DenyAll');
+        $this->config->set('URI', 'HostWhitelist', array(null, 'google.com'));
+        
+        $this->assertDef('http://example.com/fo/google.com', false);
+        $this->assertDef('server.txt');
+        $this->assertDef('ftp://www.google.com/?t=a');
+        $this->assertDef('http://google.com.tricky.spamsite.net', false);
+        */
+    }
+    
 }

 ?>
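Note on testMunge and testBlacklist above: the expectations suggest that munging substitutes the URL-encoded URI for %s in the template, and that the host blacklist matches fragments anywhere inside the host name but never inside the path. The sketch below reproduces those expectations only. It is not the library's implementation, munge_uri() and host_is_blacklisted() are hypothetical names, and unlike the real validator munge_uri() does not leave relative URIs alone or reject javascript: URIs.

```php
<?php
// Hypothetical sketch, not the library's implementation.

// Substitutes the URL-encoded URI for %s in the munge template.
function munge_uri($uri, $template) {
    return str_replace('%s', urlencode($uri), $template);
}

// Returns true when any blacklist fragment occurs anywhere in the host.
function host_is_blacklisted($uri, array $blacklist) {
    $host = parse_url($uri, PHP_URL_HOST);
    if (!is_string($host)) return false; // relative URI, no host to check
    foreach ($blacklist as $fragment) {
        if (strpos($host, $fragment) !== false) return true;
    }
    return false;
}

$blacklist = array('example.com', 'moo');
echo munge_uri('http://www.example.com/', 'http://www.google.com/url?q=%s'), "\n";
var_dump(host_is_blacklisted('ftp://moo-moo.net/foo/foo/', $blacklist));
var_dump(host_is_blacklisted('http://www.google.com/example.com/moo', $blacklist));
```

Substring matching is deliberately broad: it also rejects hosts such as example.com.example.net, which is what the test asserts.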
--- a/tests/HTMLPurifier/ChildDef/ChameleonTest.php
+++ b/tests/HTMLPurifier/ChildDef/ChameleonTest.php
@@ -0,0 +1,35 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/Chameleon.php';
+
+class HTMLPurifier_ChildDef_ChameleonTest extends HTMLPurifier_ChildDefHarness
+{
+    
+    function test() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Chameleon(
+            'b | i',      // allowed only when in inline context
+            'b | i | div' // allowed only when in block context
+        );
+        
+        $this->assertResult(
+            '<b>Allowed.</b>', true,
+            array(), array('ParentType' => 'inline')
+        );
+        
+        $this->assertResult(
+            '<div>Not allowed.</div>', '',
+            array(), array('ParentType' => 'inline')
+        );
+        
+        $this->assertResult(
+            '<div>Allowed.</div>', true,
+            array(), array('ParentType' => 'block')
+        );
+        
+    }
+    
+}
+
+?>
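Note on ChameleonTest above: the child definition carries two element sets and picks one according to the ParentType context value. A rough standalone sketch of that choice follows, assuming element names as plain strings; chameleon_allowed() and filter_children() are hypothetical, and the real class validates tokens rather than names.

```php
<?php
// Hypothetical sketch: choose the allowed set from the parent's context.
function chameleon_allowed($parent_type, array $inline, array $block) {
    return $parent_type === 'inline' ? $inline : $block;
}

// Keeps only the child names permitted in the given context.
function filter_children(array $children, $parent_type) {
    $allowed = chameleon_allowed(
        $parent_type,
        array('b', 'i'),        // allowed in inline context
        array('b', 'i', 'div')  // allowed in block context
    );
    // array_intersect() preserves keys, so reindex the result
    return array_values(array_intersect($children, $allowed));
}

print_r(filter_children(array('b', 'div'), 'inline'));
print_r(filter_children(array('b', 'div'), 'block'));
```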
--- a/tests/HTMLPurifier/ChildDef/CustomTest.php
+++ b/tests/HTMLPurifier/ChildDef/CustomTest.php
@@ -0,0 +1,24 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/Custom.php';
+
+class HTMLPurifier_ChildDef_CustomTest extends HTMLPurifier_ChildDefHarness
+{
+    
+    function test() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Custom('(a,b?,c*,d+,(a,b)*)');
+        
+        $this->assertResult('', false);
+        $this->assertResult('<a /><a />', false);
+        
+        $this->assertResult('<a /><b /><c /><d /><a /><b />');
+        $this->assertResult('<a /><d>Dob</d><a /><b>foo</b>'.
+          '<a href="moo" /><b>foo</b>');
+        
+    }
+    
+}
+
+?>
--- a/tests/HTMLPurifier/ChildDef/OptionalTest.php
+++ b/tests/HTMLPurifier/ChildDef/OptionalTest.php
@@ -0,0 +1,20 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/Optional.php';
+
+class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness
+{
+    
+    function test() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Optional('b | i');
+        
+        $this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
+        $this->assertResult('Not allowed text', '');
+        
+    }
+    
+}
+
+?>
--- a/tests/HTMLPurifier/ChildDef/RequiredTest.php
+++ b/tests/HTMLPurifier/ChildDef/RequiredTest.php
@@ -0,0 +1,69 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/Required.php';
+
+class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness
+{
+    
+    function testParsing() {
+        
+        $def = new HTMLPurifier_ChildDef_Required('foobar | bang |gizmo');
+        $this->assertEqual($def->elements,
+          array(
+            'foobar' => true
+           ,'bang'   => true
+           ,'gizmo'  => true
+          ));
+        
+        $def = new HTMLPurifier_ChildDef_Required(array('href', 'src'));
+        $this->assertEqual($def->elements,
+          array(
+            'href' => true
+           ,'src'  => true
+          ));
+        
+    }
+    
+    function testPCDATAForbidden() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Required('dt | dd');
+        
+        $this->assertResult('', false);
+        $this->assertResult(
+          '<dt>Term</dt>Text in an illegal location'.
+             '<dd>Definition</dd><b>Illegal tag</b>',
+          '<dt>Term</dt><dd>Definition</dd>');
+        $this->assertResult('How do you do!', false);
+        
+        // whitespace shouldn't trigger it
+        $this->assertResult("\n<dd>Definition</dd>       ");
+        
+        $this->assertResult(
+          '<dd>Definition</dd>       <b></b>       ',
+          '<dd>Definition</dd>              '
+        );
+        $this->assertResult("\t      ", false);
+        
+    }
+    
+    function testPCDATAAllowed() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
+        
+        $this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
+        
+        // with child escaping on
+        $this->assertResult(
+            '<b>Bold text</b><img />',
+            '<b>Bold text</b>&lt;img /&gt;',
+            array(
+              'Core.EscapeInvalidChildren' => true
+            )
+        );
+        
+    }
+    
+}
+
+?>
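Note on testParsing above: the constructor accepts either a pipe-separated string with irregular spacing or a plain list, and in both cases exposes a lookup table of the form array('name' => true). One way to get that result is sketched below; elements_to_lookup() is a hypothetical helper and the library's own parsing may differ.

```php
<?php
// Hypothetical helper: turns 'a | b |c' or array('a', 'b') into a lookup
// table of the form array('a' => true, ...).
function elements_to_lookup($elements) {
    if (is_string($elements)) {
        $elements = explode('|', $elements);
    }
    $lookup = array();
    foreach ($elements as $name) {
        $name = trim($name);           // tolerate uneven spacing around |
        if ($name === '') continue;    // skip empty segments
        $lookup[$name] = true;
    }
    return $lookup;
}

print_r(elements_to_lookup('foobar | bang |gizmo'));
print_r(elements_to_lookup(array('href', 'src')));
```

A lookup table makes membership checks a single isset() call, which is how the printer code elsewhere in this patch consumes it.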
--- a/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
+++ b/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
@@ -0,0 +1,50 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
+
+class   HTMLPurifier_ChildDef_StrictBlockquoteTest
+extends HTMLPurifier_ChildDefHarness
+{
+    
+    function test() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_StrictBlockquote();
+        
+        $this->assertResult('');
+        $this->assertResult('<p>Valid</p>');
+        $this->assertResult('<div>Still valid</div>');
+        $this->assertResult('Needs wrap', '<p>Needs wrap</p>');
+        $this->assertResult(
+               'Wrap'. '<p>Do not wrap</p>',
+            '<p>Wrap</p><p>Do not wrap</p>'
+        );
+        $this->assertResult(
+            '<p>Do not</p>'.'<b>Wrap</b>',
+            '<p>Do not</p><p><b>Wrap</b></p>'
+        );
+        $this->assertResult(
+            '<li>Not allowed</li>Paragraph.<p>Hmm.</p>',
+            '<p>Not allowedParagraph.</p><p>Hmm.</p>'
+        );
+        $this->assertResult(
+            $var = 'He said<br />perhaps<br />we should <b>nuke</b> them.',
+            "<p>$var</p>"
+        );
+        $this->assertResult(
+            '<foo>Bar</foo><bas /><b>People</b>Conniving.'. '<p>Fools!</p>',
+              '<p>Bar'.          '<b>People</b>Conniving.</p><p>Fools!</p>'
+        );
+        $this->assertResult('Needs wrap', '<div>Needs wrap</div>',
+            array('HTML.BlockWrapper' => 'div'));
+        
+        $this->assertResult('Needs wrap', '<p>Needs wrap</p>',
+            array('HTML.BlockWrapper' => 'dav'));
+        $this->assertError('Cannot use non-block element as block wrapper.');
+        $this->assertNoErrors();
+        
+    }
+    
+}
+
+?>
--- a/tests/HTMLPurifier/ChildDef/TableTest.php
+++ b/tests/HTMLPurifier/ChildDef/TableTest.php
@@ -0,0 +1,51 @@
+<?php
+
+require_once 'HTMLPurifier/ChildDefHarness.php';
+require_once 'HTMLPurifier/ChildDef/Table.php';
+
+class HTMLPurifier_ChildDef_TableTest extends HTMLPurifier_ChildDefHarness
+{
+    
+    function test() {
+        
+        $this->obj = new HTMLPurifier_ChildDef_Table();
+        
+        $this->assertResult('', false);
+        
+        // we're using empty tags to compact the tests: under real circumstances
+        // there would be contents in them
+        
+        $this->assertResult('<tr />');
+        $this->assertResult('<caption /><col /><thead /><tfoot /><tbody>'.
+            '<tr><td>asdf</td></tr></tbody>');
+        $this->assertResult('<col /><col /><col /><tr />');
+        
+        // mixed up order
+        $this->assertResult(
+          '<col /><colgroup /><tbody /><tfoot /><thead /><tr>1</tr><caption /><tr />',
+          '<caption /><col /><colgroup /><thead /><tfoot /><tbody /><tr>1</tr><tr />');
+        
+        // duplicates of singles
+        // - first caption serves
+        // - trailing tfoots/theads get turned into tbodys
+        $this->assertResult(
+          '<caption>1</caption><caption /><tbody /><tbody /><tfoot>1</tfoot><tfoot />',
+          '<caption>1</caption><tfoot>1</tfoot><tbody /><tbody /><tbody />'
+        );
+        
+        // errant text dropped (until bubbling is implemented)
+        $this->assertResult('foo', false);
+        
+        // whitespace sticks to the previous element, last whitespace is
+        // stationary
+        $this->assertResult("\n   <tr />\n  <tr />\n ");
+        $this->assertResult(
+          "\n\t<tbody />\n\t\t<tfoot />\n\t\t\t",
+          "\n\t\t<tfoot />\n\t<tbody />\n\t\t\t"
+        );
+        
+    }
+    
+}
+
+?>
--- a/tests/HTMLPurifier/ChildDefHarness.php
+++ b/tests/HTMLPurifier/ChildDefHarness.php
@@ -0,0 +1,18 @@
+<?php
+
+require_once 'HTMLPurifier/Harness.php';
+require_once 'HTMLPurifier/ChildDef.php';
+
+class HTMLPurifier_ChildDefHarness extends HTMLPurifier_Harness
+{
+    
+    function setUp() {
+        $this->obj       = null;
+        $this->func      = 'validateChildren';
+        $this->to_tokens = true;
+        $this->to_html   = true;
+    }
+    
+}
+
+?>
--- a/tests/HTMLPurifier/ChildDefTest.php
+++ b/tests/HTMLPurifier/ChildDefTest.php
@@ -1,168 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/Harness.php';
-
-require_once 'HTMLPurifier/ChildDef.php';
-require_once 'HTMLPurifier/Lexer/DirectLex.php';
-require_once 'HTMLPurifier/Generator.php';
-
-class HTMLPurifier_ChildDefTest extends HTMLPurifier_Harness
-{
-    
-    function setUp() {
-        $this->obj       = null;
-        $this->func      = 'validateChildren';
-        $this->to_tokens = true;
-        $this->to_html   = true;
-    }
-    
-    function test_custom() {
-        
-        $this->obj = new HTMLPurifier_ChildDef_Custom('(a,b?,c*,d+,(a,b)*)');
-        
-        $this->assertResult('', false);
-        $this->assertResult('<a /><a />', false);
-        
-        $this->assertResult('<a /><b /><c /><d /><a /><b />');
-        $this->assertResult('<a /><d>Dob</d><a /><b>foo</b>'.
-          '<a href="moo" /><b>foo</b>');
-        
-    }
-    
-    function test_table() {
-        
-        // the table definition
-        $this->obj = new HTMLPurifier_ChildDef_Table();
-        
-        $inputs = $expect = $config = array();
-        
-        $this->assertResult('', false);
-        
-        // we're using empty tags to compact the tests: under real circumstances
-        // there would be contents in them
-        
-        $this->assertResult('<tr />');
-        $this->assertResult('<caption /><col /><thead /><tfoot /><tbody>'.
-            '<tr><td>asdf</td></tr></tbody>');
-        $this->assertResult('<col /><col /><col /><tr />');
-        
-        // mixed up order
-        $this->assertResult(
-          '<col /><colgroup /><tbody /><tfoot /><thead /><tr>1</tr><caption /><tr />',
-          '<caption /><col /><colgroup /><thead /><tfoot /><tbody /><tr>1</tr><tr />');
-        
-        // duplicates of singles
-        // - first caption serves
-        // - trailing tfoots/theads get turned into tbodys
-        $this->assertResult(
-          '<caption>1</caption><caption /><tbody /><tbody /><tfoot>1</tfoot><tfoot />',
-          '<caption>1</caption><tfoot>1</tfoot><tbody /><tbody /><tbody />'
-        );
-        
-        // errant text dropped (until bubbling is implemented)
-        $this->assertResult('foo', false);
-        
-        // whitespace sticks to the previous element, last whitespace is
-        // stationary
-        $this->assertResult("\n   <tr />\n  <tr />\n ");
-        $this->assertResult(
-          "\n\t<tbody />\n\t\t<tfoot />\n\t\t\t",
-          "\n\t\t<tfoot />\n\t<tbody />\n\t\t\t"
-        );
-        
-    }
-    
-    function testParsing() {
-        
-        $def = new HTMLPurifier_ChildDef_Required('foobar | bang |gizmo');
-        $this->assertEqual($def->elements,
-          array(
-            'foobar' => true
-           ,'bang'   => true
-           ,'gizmo'  => true
-          ));
-        
-        $def = new HTMLPurifier_ChildDef_Required(array('href', 'src'));
-        $this->assertEqual($def->elements,
-          array(
-            'href' => true
-           ,'src'  => true
-          ));
-        
-    }
-    
-    function test_required_pcdata_forbidden() {
-        
-        $this->obj = new HTMLPurifier_ChildDef_Required('dt | dd');
-        
-        $this->assertResult('', false);
-        $this->assertResult(
-          '<dt>Term</dt>Text in an illegal location'.
-             '<dd>Definition</dd><b>Illegal tag</b>',
-          '<dt>Term</dt><dd>Definition</dd>');
-        $this->assertResult('How do you do!', false);
-        
-        // whitespace shouldn't trigger it
-        $this->assertResult("\n<dd>Definition</dd>       ");
-        
-        $this->assertResult(
-          '<dd>Definition</dd>       <b></b>       ',
-          '<dd>Definition</dd>              '
-        );
-        $this->assertResult("\t      ", false);
-        
-    }
-    
-    function test_required_pcdata_allowed() {
-        
-        $this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
-        
-        $this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
-        
-        // with child escaping on
-        $this->assertResult(
-            '<b>Bold text</b><img />',
-            '<b>Bold text</b>&lt;img /&gt;',
-            array(
-              'Core.EscapeInvalidChildren' => true
-            )
-        );
-        
-    }
-    
-    function test_optional() {
-        
-        $this->obj = new HTMLPurifier_ChildDef_Optional('b | i');
-        
-        $this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
-        $this->assertResult('Not allowed text', '');
-        
-    }
-    
-    function test_chameleon() {
-        
-        $this->obj = new HTMLPurifier_ChildDef_Chameleon(
-            'b | i',      // allowed only when in inline context
-            'b | i | div' // allowed only when in block context
-        );
-        
-        $this->assertResult(
-            '<b>Allowed.</b>', true,
-            array(), array('ParentType' => 'inline')
-        );
-        
-        $this->assertResult(
-            '<div>Not allowed.</div>', '',
-            array(), array('ParentType' => 'inline')
-        );
-        
-        $this->assertResult(
-            '<div>Allowed.</div>', true,
-            array(), array('ParentType' => 'block')
-        );
-        
-    }
-    
-}
-
-?>
--- a/tests/HTMLPurifier/ConfigSchemaTest.php
+++ b/tests/HTMLPurifier/ConfigSchemaTest.php
@@ -284,6 +284,11 @@ class HTMLPurifier_ConfigSchemaTest extends UnitTestCase
        $this->assertInvalid(array(0 => 'moo'), 'hash');
        $this->assertValid(array(1 => 'moo'), 'hash');
        $this->assertValid(23, 'mixed');
+        $this->assertValid('foo,bar, cow', 'list', array('foo', 'bar', 'cow'));
+        $this->assertValid('foo,bar', 'lookup', array('foo' => true, 'bar' => true));
+        $this->assertValid('true', 'bool', true);
+        $this->assertValid('false', 'bool', false);
+        $this->assertValid('1', 'bool', true);
        
    }
    
--- a/tests/HTMLPurifier/ConfigTest.php
+++ b/tests/HTMLPurifier/ConfigTest.php
@@ -106,6 +106,20 @@ class HTMLPurifier_ConfigTest extends UnitTestCase
        $this->assertError('Value is of invalid type');
        $this->assertNoErrors();
        
+        // grab a namespace
+        $config->set('Attr', 'Key', 0xBEEF);
+        $this->assertIdentical(
+            $config->getBatch('Attr'),
+            array(
+                'Key' => 0xBEEF
+            )
+        );
+        
+        // grab a nonexistent namespace
+        $config->getBatch('FurnishedGoods');
+        $this->assertError('Cannot retrieve undefined namespace');
+        $this->assertNoErrors();
+        
    }
    
    function test_getDefinition() {
--- a/tests/HTMLPurifier/Strategy/FixNestingTest.php
+++ b/tests/HTMLPurifier/Strategy/FixNestingTest.php
@@ -83,6 +83,20 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
          '<a><span></span></a>'
        );
        
+        // test inline parent
+        $this->assertResult(
+            '<b>Bold</b>', true, array('HTML.Parent' => 'span')
+        );
+        $this->assertResult(
+            '<div>Reject</div>', 'Reject', array('HTML.Parent' => 'span')
+        );
+        
+        $this->assertResult(
+            '<div>Accept</div>', true, array('HTML.Parent' => 'script')
+        );
+        $this->assertError('Cannot use unrecognized element as parent.');
+        $this->assertNoErrors();
+        
    }
    
 }
--- a/tests/HTMLPurifier/Strategy/RemoveForeignElementsTest.php
+++ b/tests/HTMLPurifier/Strategy/RemoveForeignElementsTest.php
@@ -42,6 +42,12 @@ class HTMLPurifier_Strategy_RemoveForeignElementsTest
              ' Warning!</span>'
        );
        
+        // test removal of img tag
+        $this->assertResult(
+            '<img />',
+            ''
+        );
+        
    }
    
 }
--- a/tests/HTMLPurifier/Strategy/ValidateAttributesTest.php
+++ b/tests/HTMLPurifier/Strategy/ValidateAttributesTest.php
@@ -125,6 +125,9 @@ class HTMLPurifier_Strategy_ValidateAttributesTest extends
        );
        
        // test required attributes for img
+        
+        // (this should never happen, as RemoveForeignElements
+        //  should have removed the offending image tag)
        $this->assertResult(
            '<img />',
            '<img src="" alt="Invalid image" />'
--- a/tests/HTMLPurifier/Test.php
+++ b/tests/HTMLPurifier/Test.php
@@ -8,17 +8,67 @@ class HTMLPurifier_Test extends UnitTestCase
 {
    var $purifier;
    
+    function setUp() {
+        $this->purifier = new HTMLPurifier();
+    }
+    
    function assertPurification($input, $expect = null) {
        if ($expect === null) $expect = $input;
        $result = $this->purifier->purify($input);
        $this->assertIdentical($expect, $result);
    }
    
-    function test() {
-        $config = HTMLPurifier_Config::createDefault();
-        $this->purifier = new HTMLPurifier($config);
+    function testNull() {
        $this->assertPurification("Null byte\0", "Null byte");
    }
+    
+    function testStrict() {
+        $config = HTMLPurifier_Config::createDefault();
+        $config->set('HTML', 'Strict', true);
+        $this->purifier = new HTMLPurifier($config);
+        
+        $this->assertPurification(
+            '<u>Illegal underline</u>',
+            'Illegal underline'
+        );
+        
+        $this->assertPurification(
+            '<blockquote>Illegal contents</blockquote>',
+            '<blockquote><p>Illegal contents</p></blockquote>'
+        );
+        
+    }
+    
+    function testDifferentAllowedElements() {
+        $config = HTMLPurifier_Config::createDefault();
+        $config->set('HTML', 'AllowedElements', array('b', 'i', 'p', 'a'));
+        $config->set('HTML', 'AllowedAttributes', array('a.href', '*.id'));
+        $this->purifier = new HTMLPurifier($config);
+        
+        $this->assertPurification(
+            '<p>Par.</p><p>Para<a href="http://google.com/">gr</a>aph</p>Text<b>Bol<i>d</i></b>'
+        );
+        
+        $this->assertPurification(
+            '<span>Not allowed</span><a class="mef" id="foobar">Foobar</a>',
+            'Not allowed<a>Foobar</a>' // no ID!!!
+        );
+        
+    }
+    
+    function testDisableURI() {
+        
+        $config = HTMLPurifier_Config::createDefault();
+        $config->set('Attr', 'DisableURI', true);
+        $this->purifier = new HTMLPurifier($config);
+        
+        $this->assertPurification(
+            '<img src="foobar"/>',
+            ''
+        );
+        
+    }
+    
 }

 ?>
--- a/tests/index.php
+++ b/tests/index.php
@@ -44,7 +44,12 @@ $test_files[] = 'ConfigSchemaTest.php';
 $test_files[] = 'LexerTest.php';
 $test_files[] = 'Lexer/DirectLexTest.php';
 $test_files[] = 'TokenTest.php';
-$test_files[] = 'ChildDefTest.php';
+$test_files[] = 'ChildDef/RequiredTest.php';
+$test_files[] = 'ChildDef/OptionalTest.php';
+$test_files[] = 'ChildDef/ChameleonTest.php';
+$test_files[] = 'ChildDef/CustomTest.php';
+$test_files[] = 'ChildDef/TableTest.php';
+$test_files[] = 'ChildDef/StrictBlockquoteTest.php';
 $test_files[] = 'GeneratorTest.php';
 $test_files[] = 'EntityLookupTest.php';
 $test_files[] = 'Strategy/RemoveForeignElementsTest.php';
Author	SHA1	Message	Date
Edward Z. Yang	d151ffd9e6	Create 1.3 release series. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.3@590 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 23:30:22 +00:00
Edward Z. Yang	2a01cf786e	Release 1.3.0 (bumped TODO items) git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@588 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 23:21:19 +00:00
Edward Z. Yang	825b0671b5	[1.3.0] Bump version numbers. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@587 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 23:18:32 +00:00
Edward Z. Yang	4bdc0446de	[1.3.0] New directive %URI.HostBlacklist for blocking links to bad hosts. xssAttacks.php smoketest updated accordingly. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@586 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 23:14:12 +00:00
Edward Z. Yang	45a70e8ae4	[1.3.0] Update xssAttacks.xml. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@585 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 00:46:57 +00:00
Edward Z. Yang	1fe60c9b9d	[1.3.0] Clarify docs on what printDefinition is for git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@584 48356398-32a2-884e-a903-53898d9a118a	2006-11-26 00:14:03 +00:00
Edward Z. Yang	dc0e2c6b3e	Revise character estimate upwards. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@582 48356398-32a2-884e-a903-53898d9a118a	2006-11-25 21:18:20 +00:00
Edward Z. Yang	9bbbb87ffa	[1.3.0] Add Printer_CSSDefinition. - Added @public identifiers to properties that the Printers are using. - Augmented Printer::getClass() to include meta-info about the object (contained inside parentheses). Currently supports: enum, composite and multiple. - Remove all linebreaks from Printer output - Document Printer_HTMLDefinition's methods. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@581 48356398-32a2-884e-a903-53898d9a118a	2006-11-25 05:05:32 +00:00
Edward Z. Yang	b63b0be21f	[1.3.0] Some housekeeping after the last commit - Add a few missing unit tests - Allow for spaces between comma separated strings to be transformed into arrays - smoketests/printDefinition.php now has documentation, links to more documentation and a friendly user-interface git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@579 48356398-32a2-884e-a903-53898d9a118a	2006-11-24 07:12:16 +00:00
Edward Z. Yang	73a1e31fad	[1.3.0] Added spiffy new smoketest printDefinition.php, which lets you twiddle with the configuration settings and see how the internal rules are affected. (currently only complete for HTMLDefinition). - HTMLPurifier -> HTML Purifier . HTMLPurifier_Config->getBatch($namespace) added . More lenient casting to bool from string in HTMLPurifier_ConfigSchema . <?xml ... tags added to all smoketests git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@578 48356398-32a2-884e-a903-53898d9a118a	2006-11-24 06:26:02 +00:00
Edward Z. Yang	775763c583	[1.3.0] New directive %URI.Munge, munges URI so you can use some sort of redirector service to avoid PageRank leaks or warn users that they are exiting your site. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@576 48356398-32a2-884e-a903-53898d9a118a	2006-11-24 00:29:16 +00:00
Edward Z. Yang	49cb2a4a7c	[1.3.0] More control of URIs granted # Invalid images are now removed, rather than replaced with a dud <img src="" alt="Invalid image" />. Previous behavior can be restored with new directive %Core.RemoveInvalidImg set to false. ! New directives %URI.DisableExternalResources and %URI.DisableResources ! New directive %Attr.DisableURI, which eliminates all hyperlinking - Missing "Available since" documentation added git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@575 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 23:59:20 +00:00
Edward Z. Yang	61b6ee7183	Update filter levels document in light of fact that user can now specify tags. We may want to upgrade this to HTML so users can be helped out in choosing things to allow. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@574 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 22:40:59 +00:00
Edward Z. Yang	d7ce6b4587	Add code quality advisory about demo.php. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@573 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 22:34:41 +00:00
Edward Z. Yang	f67ee19f31	[1.3.0] Add some forward thinking documents. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@572 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 22:33:07 +00:00
Edward Z. Yang	92b3f0e817	[1.3.0] <li value="4"> and <ul start="2"> now allowed in loose mode - Updated progress with some more impl-no decisions - Loose vs. Strict now has better tallying on current behavior - Document what we're not allowing in loose - Strict boolean indicator added to HTMLDefinition - Added XHTML 1.1 to TODO. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@571 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 22:15:35 +00:00
Edward Z. Yang	3c4da9666f	- Update TODO: Caching and Configuration profiles - Added another code-quality issue git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@570 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 21:36:17 +00:00
Edward Z. Yang	925a07b828	[1.3.0] New directives %HTML.AllowedElements and %HTML.AllowedAttributes to let users narrow the set of allowed tags . Added HTMLPurifier->info_parent_def, parent child processing made special git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@565 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 13:51:19 +00:00
Edward Z. Yang	94db380271	[1.3.0] Remove Tidy option from demo if there is not Tidy available git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@563 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 03:49:19 +00:00
Edward Z. Yang	b9e7ba6a2f	[1.3.0] Move valid XHTML 1.0 button link to better spot. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@562 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 03:39:55 +00:00
Edward Z. Yang	b1b3377b9c	[1.3.0] Huge upgrade, (X)HTML Strict now supported + Transparently handles inline elements in block context (blockquote) ! Added GET method to demo for easier validation, added 50kb max input size ! New directive %HTML.BlockWrapper, for block-ifying inline elements ! New directive %HTML.Parent, allows you to only allow inline content - Added missing type to ChildDef_Chameleon . ChildDef_Required guards against empty tags . Lookup table HTMLDefinition->info_flow_elements added . Added peace-of-mind variable initialization to Strategy_FixNesting git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@560 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 03:23:35 +00:00
Edward Z. Yang	d8673539ab	- Add more documentation about proprietary tags - Link to all text memos git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@559 48356398-32a2-884e-a903-53898d9a118a	2006-11-23 00:45:43 +00:00
Edward Z. Yang	3b26e5dc5b	[1.3.0] Refactored ChildDef classes into their own files git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@558 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 18:55:15 +00:00
Edward Z. Yang	c5ea987069	Fix parse error. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@557 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 18:19:44 +00:00
Edward Z. Yang	b152448608	[1.3.0] Implement user-unfriendly implementation of Strict doctype. We will try not to ship this one. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@556 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 18:17:39 +00:00
Edward Z. Yang	b0575cb888	Add more TODO items: - Formatter caveat to strict XHTML - YouTube video embedding git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@555 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 17:46:38 +00:00
Edward Z. Yang	224ef774f7	Commit two new docs: loose-vs-strict and proprietary-tags, both research/reference. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@554 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 04:49:26 +00:00
Edward Z. Yang	18a83acc5d	Re-prioritize (X)HTML strict output TODO. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@553 48356398-32a2-884e-a903-53898d9a118a	2006-11-22 03:00:12 +00:00
Edward Z. Yang	f9090e45c0	[1.3.0] Add items for projected 1.3.0 and 1.2.1 releases. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@552 48356398-32a2-884e-a903-53898d9a118a	2006-11-20 03:58:56 +00:00
Edward Z. Yang	450523a9ca	[1.2.0] [merged] Bump TODO items. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@547 48356398-32a2-884e-a903-53898d9a118a	2006-11-20 03:21:52 +00:00
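
Note on the ConfigSchemaTest.php additions: the new assertions (and the log entries "More lenient casting to bool from string" and "Allow for spaces between comma separated strings to be transformed into arrays") describe string values being coerced to the declared directive type. The following is only a minimal standalone sketch of that behaviour, not the library's implementation. The function name sketch_cast_config_value is hypothetical, and treating '0' as false is an assumption that the tests shown here do not cover.

```php
<?php
// Hypothetical helper illustrating the coercions exercised by the new tests.
// It is NOT HTMLPurifier_ConfigSchema's actual code.
function sketch_cast_config_value($value, $type) {
    // Only strings are coerced; any other value passes through untouched.
    if (!is_string($value)) {
        return $value;
    }
    switch ($type) {
        case 'bool':
            $lower = strtolower(trim($value));
            if ($lower === 'true' || $lower === '1') {
                return true;
            }
            // 'false' is covered by the tests; '0' is an assumption.
            if ($lower === 'false' || $lower === '0') {
                return false;
            }
            return $value; // unrecognised string: leave for the validator
        case 'list':
        case 'lookup':
            // Split on commas and trim each item, so 'bar, cow' works.
            $items = array_map('trim', explode(',', $value));
            if ($type === 'list') {
                return $items;
            }
            // A lookup maps each item to true for fast isset() checks.
            $lookup = array();
            foreach ($items as $item) {
                $lookup[$item] = true;
            }
            return $lookup;
        default:
            return $value;
    }
}
```

Called with the same inputs as the tests, sketch_cast_config_value('foo,bar, cow', 'list') gives array('foo', 'bar', 'cow') and sketch_cast_config_value('foo,bar', 'lookup') gives array('foo' => true, 'bar' => true). Returning unrecognised strings unchanged leaves rejection to a separate validation step, which is a design choice of this sketch only.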