Released 1.1.1.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.1@459 48356398-32a2-884e-a903-53898d9a118a
Merged 438:439, 440:441, and 442:457 from trunk/ to branches/1.1/, mostly major work done for 1.1.1 release.
2025-08-05 13:47:24 +02:00 · 2006-09-24 22:32:26 +00:00 · 2006-09-24 22:22:06 +00:00 · 2006-09-17 22:08:48 +00:00 · 2006-09-17 00:19:47 +00:00
143 changed files with 2577 additions and 6038 deletions
--- a/6
+++ b/6
@@ -2,6 +2,6 @@
 CREDITS

 Almost everything written by Edward Z. Yang (Ambush Commander).  Lots of thanks
-to the DevNetwork Community for their help (see docs/ref-devnetwork.html for
-more details), Feyd especially (namely IPv6 and optimization).  Thanks to RSnake
-for letting me package his fantastic XSS cheatsheet for a smoketest.
+to the DevNetwork Community for their help (see docs/devnetwork.html for more
+details), Feyd especially (namely IPv6 and optimization).  Thanks to RSnake for
+letting me package his fantastic XSS cheatsheet for a smoketest.
--- a/11
+++ b/11
@@ -4,7 +4,7 @@
 # Project related configuration options
 #---------------------------------------------------------------------------
 PROJECT_NAME           = HTML Purifier
-PROJECT_NUMBER         = 1.3.0
+PROJECT_NUMBER         = 1.0.0
 OUTPUT_DIRECTORY       = "C:/Documents and Settings/Edward/My Documents/My Webs/htmlpurifier/docs/doxygen"
 CREATE_SUBDIRS         = NO
 OUTPUT_LANGUAGE        = English
@@ -89,12 +89,9 @@ EXCLUDE                =
 EXCLUDE_SYMLINKS       = NO
 EXCLUDE_PATTERNS       = */tests/* \
                         */benchmarks/* \
-                         */docs/* \
-                         */test-settings.php \
-                         */configdoc/* \
-                         */test-settings.php \
-                         */maintenance/* \
-                         */smoketests/*
+                         */docs/phpdoc/* \
+                         */docs/doxygen/* \
+                         */test-settings.php
 EXAMPLE_PATH           = 
 EXAMPLE_PATTERNS       = *
 EXAMPLE_RECURSIVE      = NO
--- a/182
+++ b/182
@@ -2,183 +2,145 @@
 Install
    How to install HTML Purifier

-HTML Purifier is designed to run out of the box,  so actually using the library
-is extremely easy. (Although, if you were looking for a step-by-step
-installation GUI, you've come to the wrong place!)  The impatient can scroll
-down to the bottom of this INSTALL document to see the code, but you really
-should make sure a few things are properly done.
+Being a library, there's no fancy GUI that will take you step-by-step through
+configuring database credentials and other mumbo-jumbo.  HTML Purifier is
+designed to run "out of the box."  Regardless, there are still a couple of
+things you should be mindful of.



-1.  Compatibility
+0.  Compatibility

-HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.9 and up. It has no
-core dependencies with other libraries. (Whoopee!)
+HTML Purifier works in both PHP 4 and PHP 5.  I have run the test suite on
+these versions:

-Optional extensions are iconv (usually installed) and tidy (also common).
-If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
-not having either of these extensions.
+  - 4.3.9, 4.3.11
+  - 4.4.0, 4.4.4
+  - 5.0.0, 5.0.4
+  - 5.1.0, 5.1.6
+
+And can confidently say that HTML Purifier should work in all versions
+between and afterwards.  HTML Purifier definitely does not support PHP 4.2;
+the PHP 4.3 branch may work in releases earlier than 4.3.9, but I haven't
+tested any of them.
+
+I have been unable to get PHP 5.0.5 working on my computer, so if someone
+wants to test that, be my guest.  All tests were done on Windows XP Home,
+but operating system should not be a major factor in the library.



-2.  Including the library
+1.  Including the proper files

-Simply use:
+The library/ directory must be added to your path: HTML Purifier will not be
+able to find the necessary includes otherwise.  This is as simple as:

-    require_once '/path/to/library/HTMLPurifier.auto.php';
+    set_include_path('/path/to/htmlpurifier/library' . PATH_SEPARATOR .
+        get_include_path() );

-...and you're good to go.  Since HTML Purifier's codebase is fairly
-large, I recommend only including HTML Purifier when you need it.
+...replacing /path/to/htmlpurifier with the actual location of the folder. Don't
+worry, HTML Purifier is namespaced so unless you have another file named
+HTMLPurifier.php, the files won't collide with any of your includes.

-If you don't like your include_path to be fiddled around with, simply set
-HTML Purifier's library/ directory to the include path yourself and then:
+Then, it's a simple matter of including the base file:

    require_once 'HTMLPurifier.php';

-Only the contents in the library/ folder are necessary, so you can remove
-everything else when using HTML Purifier in a production environment.  
+...and you're good to go. The library/ folder contains all the files you need,
+so you can get rid of most of everything else when using the library in a
+production environment.



-3.  Preparing the proper output environment
+2.  Preparing the proper environment

-HTML Purifier is all about web-standards, so accordingly your webpages should
-be standards compliant.  HTML Purifier can deal with these doctypes:
+While no configuration is necessary, you first should take precautions regarding
+the other output HTML that the filtered content will be going along with.  Here
+is a (short) checklist:

-* XHTML 1.0 Transitional (default)
-* HTML 4.01 Transitional
-
-...and these character encodings:
-
-* UTF-8 (default)
-* Any encoding iconv supports (support is crippled for i18n though)
-
-The defaults are there for a reason: they are best-practice choices that
-should not be changed lightly.  For those of you in the dark, you can determine
-the doctype from this code in your HTML documents:
+ * Have I specified XHTML 1.0 Transitional as the doctype?
+ * Have I specified UTF-8 as the character encoding?

+To find out what these are, browse to your website and view its source code.
+You can figure out the doctype from a declaration that looks like
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-
-...and the character encoding from this code:
-
+or no doctype.  You can figure out the character encoding by looking for
    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">

-For legacy codebases these declarations may be missing.  If that is the case,
-STOP, and read up on character encodings and doctypes (in that order).  Here
-are some links:
-
-* http://www.joelonsoftware.com/articles/Unicode.html
-* http://alistapart.com/stories/doctype/
-
-You may currently be vulnerable to XSS and other security threats, and HTML
-Purifier won't be able to fix that.
+I cannot stress the importance of these two bullets enough.  Omitting either
+of them could have dire consequences not only for security but for plain
+old usability.  You can find a more in-depth discussion of why this is needed
+in docs/security.txt; in the meantime, try to change your output so this is
+the case.  If you can't, well, we might be able to accommodate you (read
+section 3).



-4. Configuration
+3. Configuring HTML Purifier

 HTML Purifier is designed to run out-of-the-box, but occasionally HTML
-Purifier needs to be told what to do.  If you answered no to any of these
-questions, read on, otherwise, you can skip to the next section (or, if you're
-into configuring things just for the heck of it, skip to 4.3).
+Purifier needs to be told what to do.

-* Am I using UTF-8?
-* Am I using XHTML 1.0 Transitional?
-
-If you answered yes to any of these questions, instantiate a configuration
-object and read on:
-
-    $config = HTMLPurifier_Config::createDefault();
-
-
-
-4.1. Setting a different character encoding
-
-You really shouldn't use any other encoding except UTF-8, especially if you
-plan to support multilingual websites (read section three for more details).
-However, switching to UTF-8 is not always immediately feasible, so we can
-adapt.
-
-HTML Purifier uses iconv to support other character encodings, as such,
-any encoding that iconv supports <http://www.gnu.org/software/libiconv/>
-HTML Purifier supports with this code:
+If, for some reason, you are unable to switch to UTF-8 immediately, you can
+switch HTML Purifier's encoding.  Note that the availability of encodings is
+dependent on iconv, and you'll be missing characters if the charset you
+choose doesn't have them.

    $config->set('Core', 'Encoding', /* put your encoding here */);

-An example usage for Latin-1 websites (the most common encoding for English
-websites):
+An example usage for Latin-1 websites:

    $config->set('Core', 'Encoding', 'ISO-8859-1');

-Note that HTML Purifier's support for non-Unicode encodings is crippled by the
-fact that any character not supported by that encoding will be silently
-dropped, EVEN if it is ampersand escaped.  This is a current limitation of
-HTML Purifier that we are NOT actively working to fix.  Patches are welcome,
-but there are so many other gotchas and problems in I18N for non-Unicode
-encodings that this functionality is low priority.  See
-<http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html> for a more
-detailed lowdown on the topic.
-
-
-
-4.2. Setting a different doctype
-
 For those of you stuck using HTML 4.01 Transitional, you can disable
 XHTML output like this:

    $config->set('Core', 'XHTML', false);

-I recommend that you use XHTML, although not as much as I recommend UTF-8.  If
-your HTML 4.01 page validates, good for you!
-
-Currently, we can only guarantee transitional-compliant output; future
-versions will also allow strict-compliant output.
+However, I strongly recommend that you use XHTML. Currently, we can only
+guarantee transitional-compliant output; future versions will also allow strict
+output. There are more configuration directives which can be read about
+here: http://hp.jpsband.org/live/configdoc/plain.html



-4.3. Other settings
-
-There are more configuration directives which can be read about
-here: <http://hp.jpsband.org/live/configdoc/plain.html>  They're a bit boring,
-but they can help out for those of you who like to exert maximum control over
-your code.
-
-
-
-5.   Using the code
+3.   Using the code

 The interface is mind-numbingly simple:

    $purifier = new HTMLPurifier();
-    $clean_html = $purifier->purify( $dirty_html );
+    $clean_html = $purifier->purify($dirty_html);

-...or, if you're using the configuration object:
+Or, if you're using the configuration object:

    $purifier = new HTMLPurifier($config);
-    $clean_html = $purifier->purify( $dirty_html );
+    $clean_html = $purifier->purify($dirty_html);

-That's it!  For more examples, check out docs/examples/ (they aren't very
-different though).  Also, SLOW gives advice on what to do if HTML Purifier
-is slowing down your application.
+That's it.  For more examples, check out docs/examples/.  Also, SLOW gives
+advice on what to do if HTML Purifier is slowing down your application.



-6.   Quick install
+4.   Quick install

 If your website is in UTF-8 and XHTML Transitional, use this code:

 <?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
-    
+    set_include_path('/path/to/htmlpurifier/library'
+         . PATH_SEPARATOR . get_include_path() );
+    require_once 'HTMLPurifier.php';
    $purifier = new HTMLPurifier();
+    
    $clean_html = $purifier->purify($dirty_html);
 ?>

 If your website is in a different encoding or doctype, use this code:

 <?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
+    set_include_path('/path/to/htmlpurifier/library'
+         . PATH_SEPARATOR . get_include_path() );
+    require_once 'HTMLPurifier.php';
    
    $config = HTMLPurifier_Config::createDefault();
    $config->set('Core', 'Encoding', 'ISO-8859-1'); //replace with your encoding
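
The include-path step that the revised INSTALL text describes can be tried on its own, without the library present. This is only a sketch: the directory is the same placeholder INSTALL uses, and the variable names ($library_dir, $old_path, $dirs) are made up for illustration.

```php
<?php
// Placeholder location; substitute the real path to HTML Purifier's library/ folder.
$library_dir = '/path/to/htmlpurifier/library';

// Remember the current include_path so it can be restored later if needed.
$old_path = get_include_path();

// Prepend library/ so its files are searched before same-named files elsewhere.
set_include_path($library_dir . PATH_SEPARATOR . $old_path);

// Split the path back into directories to confirm library/ now comes first.
$dirs = explode(PATH_SEPARATOR, get_include_path());
echo $dirs[0], "\n";

// INSTALL's next step would be: require_once 'HTMLPurifier.php';
```

Prepending rather than appending means library/ is searched first, which is probably what you want if another HTMLPurifier.php could be lurking elsewhere on the path.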
--- a/121
+++ b/121
@@ -1,113 +1,24 @@
 NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

-= KEY ====================
-    # Breaks back-compat
-    ! Feature
-    - Bugfix
-      + Sub-comment
-    . Internal change
-==========================
-
-1.3.0, released 2006-11-26
-# Invalid images are now removed, rather than replaced with a dud
-  <img src="" alt="Invalid image" />. Previous behavior can be restored
-  with new directive %Core.RemoveInvalidImg set to false.
-! (X)HTML Strict now supported
-  + Transparently handles inline elements in block context (blockquote)
-! Added GET method to demo for easier validation, added 50kb max input size
-! New directive %HTML.BlockWrapper, for block-ifying inline elements
-! New directive %HTML.Parent, allows you to only allow inline content
-! New directives %HTML.AllowedElements and %HTML.AllowedAttributes to let
-  users narrow the set of allowed tags
-! <li value="4"> and <ul start="2"> now allowed in loose mode
-! New directives %URI.DisableExternalResources and %URI.DisableResources
-! New directive %Attr.DisableURI, which eliminates all hyperlinking
-! New directive %URI.Munge, munges URI so you can use some sort of redirector
-  service to avoid PageRank leaks or warn users that they are exiting your site.
-! Added spiffy new smoketest printDefinition.php, which lets you twiddle with
-  the configuration settings and see how the internal rules are affected.
-! New directive %URI.HostBlacklist for blocking links to bad hosts.
-  xssAttacks.php smoketest updated accordingly.
- Added missing type to ChildDef_Chameleon
- Remove Tidy option from demo if there is not Tidy available
-. ChildDef_Required guards against empty tags
-. Lookup table HTMLDefinition->info_flow_elements added
-. Added peace-of-mind variable initialization to Strategy_FixNesting
-. Added HTMLPurifier->info_parent_def, parent child processing made special
-. Added internal documents briefly summarizing future progression of HTML
-. HTMLPurifier_Config->getBatch($namespace) added
-. More lenient casting to bool from string in HTMLPurifier_ConfigSchema
-. Refactored ChildDef classes into their own files
-
-1.2.0, released 2006-11-19
-# ID attributes now disabled by default. New directives:
-  + %HTML.EnableAttrID - restores old behavior by allowing IDs
-  + %Attr.IDPrefix - %Attr.IDBlacklist alternative that munges all user IDs
-    so that they don't collide with your IDs
-  + %Attr.IDPrefixLocal - Same as above, but for when there are multiple
-    instances of user content on the page
-  + Profuse documentation on how to use these available in docs/enduser-id.txt
-! Added MODx plugin <http://modxcms.com/forums/index.php/topic,6604.0.html>
-! Added percent encoding normalization
-! XSS attacks smoketest given facelift
-! Configuration documentation now has table of contents
-! Added %URI.DisableExternal, which prevents links to external websites.  You
-  can also use %URI.Host to permit absolute linking to subdomains
-! Non-accessible resources (ex. mailto) blocked from embedded URIs (img src)
- Type variable in HTMLDefinition was not being set properly, fixed
- Documentation updated
-  + TODO added request Phalanger
-  + TODO added request Native compression
-  + TODO added request Remove redundant tags
-  + TODO added possible plaintext formatter for HTML Purifier documentation
-  + Updated ConfigDoc TODO
-  + Improved inline comments in AttrDef/Class.php, AttrDef/CSS.php
-    and AttrDef/Host.php
-  + Revamped documentation into HTML, along with misc updates
- HTMLPurifier_Context doesn't throw a variable reference error if you attempt
-  to retrieve a non-existent variable
-. Switched to purify()-wide Context object registry
-. Refactored unit tests to minimize duplication
-. XSS attack sheet updated
-. configdoc.xml now has xml:space attached to default value nodes
-. Allow configuration directives to permit null values
-. Cleaned up test-cases to remove unnecessary swallowErrors()
-
-1.1.2, released 2006-09-30
-! Add HTMLPurifier.auto.php stub file that configures include_path
- Documentation updated
-  + INSTALL document rewritten
-  + TODO added semi-lossy conversion
-  + API Doxygen docs' file exclusions updated
-  + Added notes on HTML versus XML attribute whitespace handling
-  + Noted that HTMLPurifier_ChildDef_Custom isn't being used
-  + Noted that config object's definitions are cached versions
- Fixed lack of attribute parsing in HTMLPurifier_Lexer_PEARSax3
- ftp:// URIs now have their typecodes checked
- Hooked up HTMLPurifier_ChildDef_Custom's unit tests (they weren't being run)
-. Line endings standardized throughout project (svn:eol-style standardized)
-. Refactored parseData() to general Lexer class
-. Tester named "HTML Purifier" not "HTMLPurifier"
-
 1.1.1, released 2006-09-24
-! Configuration option to optionally Tidy up output for indentation to make up
-  for dropped whitespace by DOMLex (pretty-printing for the entire application
-  should be done by a page-wide Tidy)
 - Various documentation updates
 - Fixed parse error in configuration documentation script
 - Fixed fatal error in benchmark scripts, slightly augmented
 - As far as possible, whitespace is preserved in-between table children
+- Configuration option to optionally Tidy up output for indentation to make up
+  for dropped whitespace by DOMLex (pretty-printing for the entire application
+  should be done by a page-wide Tidy)
 - Sample test-settings.php file included

 1.1.0, released 2006-09-16
-! Directive documentation generation using XSLT
-! XHTML can now be turned off, output becomes <br>
 - Made URI validator more forgiving: will ignore leading and trailing
  quotes, apostrophes and less than or greater than signs.
 - Enforce alphanumeric namespace and directive names for configuration.
+- Directive documentation generation using XSLT
 - Table child definition made more flexible, will fix up poorly ordered elements
-. Renamed ConfigDef to ConfigSchema
+- XHTML generation can now be turned off, allowing things like <br>
+- Renamed ConfigDef to ConfigSchema

 1.0.1, released 2006-09-04
 - Fixed slight bug in DOMLex attribute parsing
@@ -117,17 +28,17 @@ NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
  space in them.  This manifested in TinyMCE.

 1.0.0, released 2006-09-01
-! Shorthand CSS properties implemented: font, border, background, list-style
-! Basic color keywords translated into hexadecimal values
-! Table CSS properties implemented
-! Support for charsets other than UTF-8 (defined by iconv)
-! Malformed UTF-8 and non-SGML character detection and cleaning implemented
 - Fixed broken numeric entity conversion
+- Malformed UTF-8 and non-SGML character detection and cleaning implemented
 - API documentation completed
-. (HTML|CSS)Definition de-singleton-ized
+- Shorthand CSS properties implemented: font, border, background, list-style
+- Basic color keywords translated into hexadecimal values
+- Table CSS properties implemented
+- (HTML|CSS)Definition de-singleton-ized
+- Support for charsets other than UTF-8 (defined by iconv)

 1.0.0beta, released 2006-08-16
-! First public release, most functionality implemented. Notable omissions are:
-  + Shorthand CSS properties
-  + Table CSS properties
-  + Deprecated attribute transformations
+- First public release, most functionality implemented. Notable omissions are:
+  . Shorthand CSS properties
+  . Table CSS properties
+  . Deprecated attribute transformations
--- a/9
+++ b/9
@@ -2,13 +2,13 @@
 SLOW
  also known as the HELP ME LIBRARY IS TOO SLOW MY PAGE TAKE TOO LONG LOAD page

-HTML Purifier is a very powerful library.  But with power comes great
+HTMLPurifier is a very powerful library.  But with power comes great
 responsibility, or, at least, longer execution times.  Remember, this
 library isn't lightly grazing over submitted HTML: it's deconstructing
 the whole thing, rigorously checking the parts, and then putting it
 back together.

-So, if it so turns out that HTML Purifier is kinda too slow for outbound
+So, if it so turns out that HTMLPurifier is kinda too slow for outbound
 filtering, you've got a few options:

 1. Inbound filtering - perform filtering of HTML when it's submitted by the
@@ -19,7 +19,7 @@ it directly from your database/filesystem.  The trouble with this method is
 that your user loses the original text, and when doing edits, will be
 handling the filtered text.  While this may be a good thing, especially if
 you're using a WYSIWYG editor, it can also result in data-loss if a user
-makes a typo.
+expects a certain to be available but it doesn't.

 2. Caching the filtered output - accept the submitted text and put it
 unaltered into the database, but then also generate a filtered version and
@@ -36,5 +36,4 @@ it has some drawbacks which cannot be fixed unless you save both the original
 and the filtered versions.

 There is a third option: profile and optimize HTMLPurifier yourself.  Be sure
-to report back your results if you decide to do that!  Especially if you
-port HTML Purifier to C++.  ;-)
+to tell me if you decide to do that!  ;-)
--- a/90
+++ b/90
@@ -1,91 +1,51 @@

 TODO List

-= KEY ====================
-    # Flagship
-    - Regular
-    ? At-risk
-==========================
+Ongoing
+ - Lots of profiling, make it faster!
+ - Plugins for major CMSes (very tricky issue)
+
+1.2 release
+ - Make URI validation routines tighter (especially mailto)
+ - More extensive URI filtering schemes
+ - Allow for background-image and list-style-image (see above)
+ - Distinguish between different types of URIs, for instance, a mailto URI
+   in IMG SRC is nonsensical
+ - Error logging for filtering/cleanup procedures
+
+1.3 release
+ - Add various "levels" of cleaning
+    - Related: Allow strict (X)HTML

 1.4 release
- # More extensive URI filtering schemes (see docs/proposal-new-directives.txt)
- # Allow for background-image and list-style-image (intrinsically tied to above)
- - Aggressive caching
- ? Rich set* methods and config file loaders for HTMLPurifier_Config
- ? Configuration profiles: sets of directives that get set with one func call
- ? ConfigSchema directive aliases (so we can rename some of them)
- ? URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX)
-
-1.5 release
- # Error logging for filtering/cleanup procedures
-    - Requires I18N facilities to be created first (COMPLEX)
-
-1.6 release
- # Add pre-packaged "levels" of cleaning (custom behavior already done)
- - More fine-grained control over escaping behavior
-    - Silently drop content in between SCRIPT tags (can be generalized to allow
-      specification of elements that, when detected as foreign, trigger removal
-      of children, although unbalanced tags could wreak havoc (or at least
-      delete the rest of the document)).
-
-1.7 release
- # Additional support for poorly written HTML
-    - Implement all non-essential attribute transforms (BIG!)
-    - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!)
-    - Friendly strict handling of <address> (block -> <br>)
+ - Additional support for poorly written HTML
+    - Implement all non-essential attribute transforms
+    - Microsoft Word HTML cleaning (i.e. MsoNormal)

 2.0 release
- # Formatters for plaintext (COMPLEX)
+ - Formatters for plaintext
    - Auto-paragraphing (be sure to leverage fact that we know when things
      shouldn't be paragraphed, such as lists and tables).
    - Linkify URLs
    - Smileys
-    - Linkification for HTML Purifier docs: notably configuration and classes

 3.0 release
- - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
+ - Extended HTML capabilities based on namespacing and tag transforms
    - Hooks for adding custom processors to custom namespaced tags and
      attributes, offer default implementation
    - Lots of documentation and samples
- - XHTML 1.1 support
-
-Ongoing
- - Lots of profiling, make it faster!
- - Plugins for major CMSes (COMPLEX)
-    - Drupal
-    - WordPress
-    - eFiction
-    - more! (look for ones that use WYSIWYGs)

 Unknown release (on a scratch-an-itch basis)
+ - Silently drop content in between SCRIPT tags (can be generalized to allow
+   specification of elements that, when detected as foreign, trigger removal
+   of children, although unbalanced tags could wreak havoc (or at least delete
+   the rest of the document)).
 - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
 - Automatically add non-breaking spaces to empty table cells when
   empty-cells:show is applied to have compatibility with Internet Explorer
- - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
-   Also, enable disabling of directionality
- - Append something to duplicate IDs so they're still usable (impl. note: the
-   dupe detector would also need to detect the suffix as well)
- - Have 'lang' attribute be checked against official lists
- - Docs on how to embed YouTube videos (and friends) without patches
-
-Encoding workarounds
 - Non-lossy dumb alternate character encoding transformations, achieved by
   numerically encoding all non-ASCII characters
- - Semi-lossy dumb alternate character encoding transformations, achieved by
-   encoding all characters that have string entity equivalents
-
-Requested
- - Native content compression, whitespace stripping (don't rely on Tidy, make
-   sure we don't remove from <pre> or related tags)
- - Win32 Phalanger C# binaries (?)
- - Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
-    1. Analyzing which tags to remove duplicants
-    2. Ensure attributes are merged into the parent tag
-    3. Extend the tag exclusion system to specify whether or not the
-    contents should be dropped or not (currently, there's code that could do
-    something like this if it didn't drop the inner text too.)

 Wontfix
- - Non-lossy smart alternate character encoding transformations (unless
-   patch provided)
+ - Non-lossy smart alternate character encoding transformations
 - Pretty-printing HTML, users can use Tidy on the output on entire page
--- a/configdoc/generate.php
+++ b/configdoc/generate.php
@@ -12,8 +12,10 @@ TODO:
 - multipage documentation
 - determine how to multilingualize
 - factor out code into classes
+- generate a table of contents
 */

+
 // ---------------------------------------------------------------------------
 // Check and configure environment

@@ -80,6 +82,9 @@ $dom_root->appendChild($dom_document->createElement('title', 'HTML Purifier'));

 /*
 TODO for XML format:
+- namespace descriptions
+- enumerated values
+- default values
 - create a definition (DTD or other) once interface stabilizes
 */

@@ -110,12 +115,9 @@ foreach($schema->info as $namespace_name => $namespace_info) {
        $dom_constraints = $dom_document->createElement('constraints');
        $dom_directive->appendChild($dom_constraints);
        
-        $dom_type = $dom_document->createElement('type', $info->type);
-        if ($info->allow_null) {
-            $dom_type->setAttribute('allow-null', 'yes');
-        }
-        $dom_constraints->appendChild($dom_type);
-        
+        $dom_constraints->appendChild(
+            $dom_document->createElement('type', $info->type)
+        );
        if ($info->allowed !== true) {
            $dom_allowed = $dom_document->createElement('allowed');
            $dom_constraints->appendChild($dom_allowed);
@@ -131,20 +133,14 @@ foreach($schema->info as $namespace_name => $namespace_info) {
            $default = $raw_default ? 'true' : 'false';
        } elseif (is_string($raw_default)) {
            $default = "\"$raw_default\"";
-        } elseif (is_null($raw_default)) {
-            $default = 'null';
        } else {
            $default = print_r(
                    $schema->defaults[$namespace_name][$name], true
                );
        }
-        
-        $dom_default = $dom_document->createElement('default', $default);
-        
-        // remove this once we get a DTD
-        $dom_default->setAttribute('xml:space', 'preserve');
-        
-        $dom_constraints->appendChild($dom_default);
+        $dom_constraints->appendChild(
+            $dom_document->createElement('default', $default)
+        );
        
        $dom_descriptions = $dom_document->createElement('descriptions');
        $dom_directive->appendChild($dom_descriptions);
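
The generate.php hunk above appends a type and a default element directly to the constraints node. A minimal standalone sketch of that pattern with PHP 5's DOM extension follows; the directive values ('string', 'UTF-8') are hypothetical stand-ins for what the script reads from the schema, and only the string branch of the default formatting is reproduced.

```php
<?php
// Build a small document with a <constraints> element.
$dom_document = new DOMDocument('1.0', 'UTF-8');
$dom_constraints = $dom_document->createElement('constraints');
$dom_document->appendChild($dom_constraints);

// Hypothetical stand-ins for $info->type and the schema's default value.
$type = 'string';
$raw_default = 'UTF-8';

// Append the type element inline, as the hunk does.
$dom_constraints->appendChild(
    $dom_document->createElement('type', $type)
);

// String defaults are wrapped in double quotes; anything else goes through print_r().
$default = is_string($raw_default) ? "\"$raw_default\"" : print_r($raw_default, true);
$dom_constraints->appendChild(
    $dom_document->createElement('default', $default)
);

// Serialize just the <constraints> subtree for inspection.
echo $dom_document->saveXML($dom_constraints), "\n";
```

Note that createElement() does not escape ampersands in its value argument, so values that may contain them are safer added through createTextNode().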
--- a/configdoc/styles/plain.css
+++ b/configdoc/styles/plain.css
@@ -5,6 +5,3 @@ table.constraints {margin:0 0 1em;}
 table.constraints th {text-align:left;padding-left:0.4em;}
 table.constraints td {padding-right:0.4em;}
 table.constraints td pre {margin:0;}
-
-#toc {list-style-type:none; font-weight:bold;}
-#toc ul {list-style-type:disc; font-weight:normal;}
--- a/configdoc/styles/plain.xsl
+++ b/configdoc/styles/plain.xsl
@@ -23,41 +23,23 @@
                <link rel="stylesheet" type="text/css" href="styles/plain.css" />
            </head>
            <body>
-                <h1><xsl:value-of select="/configdoc/title" /> Configuration Documentation</h1>
-                <h2>Table of Contents</h2>
-                <ul id="toc">
-                    <xsl:apply-templates mode="toc" />
-                </ul>
                <xsl:apply-templates />
            </body>
        </html>
    </xsl:template>
    
-    <xsl:template match="title" mode="toc" />
-    <xsl:template match="namespace" mode="toc">
-        <xsl:if test="count(directive)&gt;0">
-            <li>
-                <a href="#{@id}"><xsl:value-of select="name" /></a>
-                <ul>
-                    <xsl:apply-templates select="directive" mode="toc" />
-                </ul>
-            </li>
-        </xsl:if>
+    <xsl:template match="title">
+        <h1><xsl:value-of select="/configdoc/title" /> Configuration Documentation</h1>
    </xsl:template>
-    <xsl:template match="directive" mode="toc">
-        <li><a href="#{@id}"><xsl:value-of select="name" /></a></li>
-    </xsl:template>
-    
-    <xsl:template match="title" />
    
    <xsl:template match="namespace">
        <xsl:apply-templates />
-        <xsl:if test="count(directive)=0">
+        <xsl:if test="count(child::directive)=0">
            <p>No configuration directives defined for this namespace.</p>
        </xsl:if>
    </xsl:template>
    <xsl:template match="namespace/name">
-        <h2 id="{../@id}"><xsl:value-of select="." /></h2>
+        <h2 id="{../@id}"><xsl:value-of select="text()" /></h2>
    </xsl:template>
    <xsl:template match="namespace/description">
        <div class="description">
@@ -69,7 +51,7 @@
        <xsl:apply-templates />
    </xsl:template>
    <xsl:template match="directive/name">
-        <h3 id="{../@id}"><xsl:value-of select="../@id" /></h3>
+        <h3 id="{../@id}"><xsl:value-of select="text()" /></h3>
    </xsl:template>
    <xsl:template match="directive/constraints">
        <table class="constraints">
@@ -99,9 +81,6 @@
                <xsl:variable name="type" select="text()" />
                <xsl:attribute name="class">type type-<xsl:value-of select="$type" /></xsl:attribute>
                <xsl:value-of select="$typeLookup/types/type[@id=$type]/text()" />
-                <xsl:if test="@allow-null='yes'">
-                    (or null)
-                </xsl:if>
            </td>
        </tr>
    </xsl:template>
--- a/docs/code-quality.txt
+++ b/docs/code-quality.txt
@@ -0,0 +1,39 @@
+
+Code Quality Issues
+
+Okay, face it.  Programmers can get lazy, cut corners, or make mistakes. They
+also can do quick prototypes, and then forget to rewrite them later.  Well,
+while I can't list mistakes in here, I can list prototype-like segments
+of code that should be aggressively refactored after the beta is released.
+This does not list optimization issues; those need to be addressed after
+intense profiling.
+
+Here we go:
+
+AttrDef
+    Class - doesn't support Unicode characters (fringe); uses regular
+        expressions
+    Lang - code duplication; premature optimization; doesn't consult official
+        lists (fringe)
+    Length - easily mistaken for CSSLength
+    URI - multiple regular expressions; needs host validation routines factored
+        out for mailto scheme; missing validation for query, fragment and path;
+        no percent-encode fixing
+    CSS - parser doesn't accept advanced CSS (fringe)
+    Number - constructor interface is inconsistent with Integer
+AttrTransform - doesn't accept AttrContext
+Config - "load configuration" hooks missing, rich set* accessors missing
+ConfigSchema - redefinition is a mess
+Strategy
+    FixNesting - cannot bubble nodes out of structures
+    MakeWellFormed - insufficient automatic closing definitions (check HTML
+        spec for optional end tags, also, closing based on type (block/inline)
+        might be efficient).
+    RemoveForeignElements - should be run in parallel with MakeWellFormed
+URIScheme - needs to have callable generic checks
+    ftp - missing typecode check
+    mailto - doesn't validate emails
+    news - doesn't validate opaque path
+    nntp - doesn't constrain path
+EOL
+
--- a/docs/config-ideas.txt
+++ b/docs/config-ideas.txt
@@ -0,0 +1,46 @@
+
+Configuration Ideas
+
+Here are some theoretical configuration ideas that we could implement some
+time.  Note the naming convention: %Namespace.Directive
+
+%Attr.IDPrefix - prefix all ids with this
+
+%Attr.RewriteFragments - if there's %Attr.IDPrefix we may want to transparently
+    rewrite the URLs we parse too.  However, we can only do it when it's a pure
+    anchor link, so it's not foolproof
+
+%Attr.ClassBlacklist,
+%Attr.ClassWhitelist,
+%Attr.ClassListMode - determines what classes are allowed. When
+    %Attr.ClassListMode is set to Blacklist, only allow those not in
+    %Attr.ClassBlacklist. When it's Whitelist, only allow those in
+    %Attr.ClassWhitelist.
+
+%Attr.LangAlphaOnly - designate whether or not to allow numerals in language
+    code subtags
+    * RFC 1766, the current standard referenced by XML, does not permit
+        numbers, but,
+    * RFC 3066, the superseding best practice standard since January 2001,
+        permits them.
+    We allow numbers by default, but you generally never see them
+    at all, which makes this a little more sane.
+
+%Attr.MaxWidth, 
+%Attr.MaxHeight - caps for width and height related checks.
+    (a hack in Pixels for an image crashing attack could be replaced by this)
+
+%URI.Munge - will munge all URIs to a different URI, which should redirect
+    the user to the applicable page. A urlencoded version of the URI
+    will replace any instances of %s in the string. One possible
+    string is 'http://www.google.com/url?q=%s'. Useful for preventing
+    pagerank from being sent to other sites
+
+%URI.AddRelNofollow - will add rel="nofollow" to all links, preventing the
+    spread of ill-gotten pagerank
+
+%URI.Host - host of website, for external link checks
+
+%URI.RelativeToAbsolute - transforms all relative URIs to absolute form
+
+%URI.DisableExternal - disable external links
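
To make two of the ideas above concrete, here is a minimal sketch of how
%URI.Munge and %Attr.ClassListMode might behave.  Both function names are
hypothetical and nothing here is part of HTML Purifier itself; the behavior is
only what the notes describe (a urlencoded URI replaces every %s, and class
names are kept or dropped according to the list mode).

```php
<?php
// Hypothetical helpers for illustration; not part of HTML Purifier.

// %URI.Munge idea: substitute a urlencoded copy of the URI for every %s
// found in the redirect template.
function sketch_munge_uri($uri, $template)
{
    return str_replace('%s', urlencode($uri), $template);
}

// %Attr.ClassListMode idea: filter a class attribute value against a list.
// $mode is assumed to be either 'Blacklist' or 'Whitelist'.
function sketch_filter_classes($classAttr, $mode, $list)
{
    $kept = array();
    // Split on runs of whitespace, discarding empty pieces.
    $classes = preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($classes as $class) {
        $listed = in_array($class, $list, true);
        if (($mode === 'Blacklist' && !$listed)
            || ($mode === 'Whitelist' && $listed)) {
            $kept[] = $class;
        }
    }
    return implode(' ', $kept);
}

echo sketch_munge_uri('http://example.com/?a=b', 'http://www.google.com/url?q=%s'), "\n";
echo sketch_filter_classes('note spam  big', 'Blacklist', array('spam')), "\n";
```

A real implementation would presumably live inside the relevant AttrDef
classes rather than in free functions; the sketch only pins down the intended
input and output.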
--- a/docs/proposal-config.txt
+++ b/docs/proposal-config.txt
@@ -10,9 +10,12 @@ Directives are divided into namespaces, indicating the major portion of
 functionality they cover (although there may be overlaps).  Please consult
 the documentation in ConfigDef for more information on these namespaces.

-Since configuration is dependant on context, internal classes require a
-configuration object to be passed as a parameter.  (They also require a
-Context object).
+Since configuration is dependent on context, most of the internal classes
+require a configuration object to be passed as a parameter.  However, a few
+make this optional: they will supply a default configuration object if none
+is passed.  These classes are: HTMLPurifier::*, Generator::generateFromTokens
+and Lexer::tokenizeHTML.  Even so, whenever a valid configuration object
+is defined, that object should be used.

 In relation to HTMLDefinition and CSSDefinition, there is a special class
 of directives that influence the *construction* of the Definition object.
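
The optional-configuration rule in this hunk can be sketched as follows.
SketchConfig and sketch_generate are stand-ins invented for this example so
that it runs without the library; only the pattern (fall back to a default
configuration object when none is passed, otherwise use the one given) comes
from the proposal text.

```php
<?php
// Stand-in for a configuration class; hypothetical, for illustration only.
class SketchConfig
{
    public $values;

    public function __construct($values = array())
    {
        $this->values = $values;
    }

    // Mirrors the idea of a createDefault() factory.
    public static function createDefault()
    {
        return new SketchConfig(array('Core.TidyFormat' => false));
    }
}

// A method that makes the configuration parameter optional.
function sketch_generate($tokens, $config = null)
{
    if ($config === null) {
        // No configuration passed: supply the default one.
        $config = SketchConfig::createDefault();
    }
    // A passed-in configuration is always honoured.
    $separator = $config->values['Core.TidyFormat'] ? "\n" : '';
    return implode($separator, $tokens);
}

echo sketch_generate(array('<b>', 'x', '</b>')), "\n";
```

Callers that already hold a configuration object should pass it along, so
that every stage of a run sees the same settings.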
--- a/docs/dev-code-quality.html
+++ b/docs/dev-code-quality.html
@@ -1,51 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Discusses code quality issues and places that need to be refactored in HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>Code Quality Issues - HTML Purifier</title>
-
-</head><body>
-
-<h1>Code Quality Issues</h1>
-
-<div id="filing">Filed under Development</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
-<p>Okay, face it.  Programmers can get lazy, cut corners, or make mistakes. They
-also can do quick prototypes, and then forget to rewrite them later.  Well,
-while I can't list mistakes in here, I can list prototype-like segments
-of code that should be aggressively refactored.  This does not list
-optimization issues; those need to be addressed after intense profiling.</p>
-
-<pre>
-docs/examples/demo.php - ad hoc HTML/PHP soup to the extreme
-
-AttrDef
-    Class - doesn't support Unicode characters (fringe); uses regular
-        expressions
-    Lang - code duplication; premature optimization
-    Length - easily mistaken for CSSLength
-    URI - multiple regular expressions; missing validation for parts (?)
-    CSS - parser doesn't accept advanced CSS (fringe)
-    Number - constructor interface inconsistent with Integer
-ConfigSchema - redefinition is a mess
-Strategy
-    FixNesting - cannot bubble nodes out of structures, duplicated checks
-        for special-case parent node
-    MakeWellFormed - insufficient automatic closing definitions (check HTML
-        spec for optional end tags, also, closing based on type (block/inline)
-        might be efficient).
-    RemoveForeignElements - should be run in parallel with MakeWellFormed
-URIScheme - needs to have callable generic checks
-    mailto - doesn't validate emails, doesn't validate querystring
-    news - doesn't validate opaque path
-    nntp - doesn't constrain path
-</pre>
-
-<div id="version">$Id$</div>
-
-</body></html>
--- a/docs/dev-naming.html
+++ b/docs/dev-naming.html
@@ -1,80 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Defines class naming conventions in HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>Naming Conventions - HTML Purifier</title>
-
-</head><body>
-
-<h1>Naming Conventions</h1>
-
-<div id="filing">Filed under Development</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
-<p>The classes in this library follow a few naming conventions, which may
-help you find the correct functionality more quickly.  Here they are:</p>
-
-<dl>
-
-<dt>All classes occupy the HTMLPurifier pseudo-namespace.</dt>
-    <dd>This means that all classes are prefixed with HTMLPurifier_.  As such, all
-    names under HTMLPurifier_ are reserved.  I recommend that you use the name
-    HTMLPurifierX_YourName_ClassName, especially if you want to take advantage
-    of HTMLPurifier_ConfigDef.</dd>
-
-<dt>All classes correspond to their path if library/ was in the include path</dt>
-    <dd>HTMLPurifier_AttrDef is located at HTMLPurifier/AttrDef.php; replace
-    underscores with slashes and append .php and you'll have the location of
-    the class.</dd>
-
-<dt>Harness and Test are reserved class names for unit tests</dt>
-    <dd>The suffix <code>Test</code> indicates that the class is a subclass of UnitTestCase
-    (of the Simpletest library) and is testable. "Harness" indicates a subclass
-    of UnitTestCase that is not meant to be run but to be extended into 
-    concrete test cases and contains custom test methods (i.e. assert*())</dd>
-
-<dt>Class names do not necessarily represent inheritance hierarchies</dt>
-    <dd>While we try to reflect inheritance in naming to some extent, it is not
-    guaranteed (for instance, none of the classes inherit from HTMLPurifier,
-    the base class).  However, all class files have the require_once
-    declarations to whichever classes they are tightly coupled to.</dd>
-
-<dt>Strategy has a meaning different from the Gang of Four pattern</dt>
-    <dd>In Design Patterns, the Gang of Four describes a Strategy object as
-    encapsulating an algorithm so that they can be switched at run-time.  While
-    our strategies are indeed algorithms, they are not meant to be substituted:
-    all must be present in order for proper functioning.</dd>
-
-<dt>Abbreviations are avoided</dt>
-    <dd>We try to avoid abbreviations as much as possible, but in some cases, an
-    abbreviated version is more readable than the full version. Here, we
-    list common abbreviations:
-    <ul>
-        <li>Attr(s) to Attribute(s)</li>
-        <li>Def to Definition</li>
-    </ul>
-    </dd>
-
-<dt>Ambiguity concerning the definition of Def/Definition</dt>
-    <dd>While a definition normally defines the structure/acceptable values of
-    an entity, most of the definitions in this application also attempt
-    to validate and fix the value.  I am unsure of a better name, as
-    "Validator" would exclude fixing the value, "Fixer" doesn't invoke
-    the proper image of "fixing" something, and "ValidatorFixer" is too long!
-    Some other suggestions were "Handler", "Reference", "Check", "Fix",
-    "Repair" and "Heal".</dd>
-
-<dt>Transform not Transformer</dt>
-    <dd>Transform is both a noun and a verb, and thus we define a "Transform" as
-    something that "transforms," leaving "Transformer" (which sounds like an
-    electrical device/robot toy).</dd>
-
-</dl>
-
-<div id="version">$Id$</div>
-
-</body></html>
--- a/docs/dev-optimization.html
+++ b/docs/dev-optimization.html
@@ -1,32 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Discusses possible methods of optimizing HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>Optimization - HTML Purifier</title>
-
-</head><body>
-
-<h1>Optimization</h1>
-
-<div id="filing">Filed under Development</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
-<p>Here are some possible optimization techniques we can apply to code sections if
-they turn out to be slow.  Be sure not to prematurely optimize: if you get
-that itch, put it here!</p>
-
-<ul>
-    <li>Make Tokens Flyweights (may prove problematic, probably not worth it)</li>
-    <li>Rewrite regexps into PHP code</li>
-    <li>Serialize the Definition object</li>
-    <li>Batch regexp validation (do as many per function call as possible)</li>
-    <li>Parallelize strategies</li>
-</ul>
-
-<div id="version">$Id$</div>
-
-</body></html>
--- a/docs/ref-devnetwork.html
+++ b/docs/ref-devnetwork.html
@@ -2,20 +2,12 @@
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Credits and links to DevNetwork forum topics on HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />

-<title>DevNetwork Credits - HTML Purifier</title>
+<title>DevNetwork Forums</title>

 </head>
 <body>

-<h1>DevNetwork Credits</h1>
-
-<div id="filing">Filed under Reference</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
 <p>Many thanks to the DevNetwork community for answering questions,
 theorizing about design, and offering encouragement during
 the development of this library in these forum threads:</p>
@@ -29,16 +21,11 @@ the development of this library in these forum threads:</p>
    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53479">IPv6</a></li>
    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53539">http and ftp versus news and mailto</a></li>
    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53579">HTMLPurifier - Take your best shot</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53664">Need help optimizing a block of code</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53861">Non-SGML characters</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54283">Wordpress makes me cry</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54478">Parameter Object vs. Parameter Array vs. Parameter Functions</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54521">Convert encoding where output cannot represent characters</a></li>
-    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=56411">Reporting errors in a document without line numbers</a></li>
+    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53664">Need help optimizing a block of code</a>
+    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=53861">Non-SGML characters</a>
+    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54283">Wordpress makes me cry</a>
+    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54478">Parameter Object vs. Parameter Array vs. Parameter Functions</a>
+    <li><a href="http://forums.devnetwork.net/viewtopic.php?t=54521">Convert encoding where output cannot represent characters</a>
 </ul>
-
-<p>...as well as any I may have forgotten.</p>
-
-<div id="version">$Id$</div>
 </body>
 </html>
--- a/docs/enduser-id.html
+++ b/docs/enduser-id.html
@@ -1,146 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Explains various methods for allowing IDs in documents safely in HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>IDs - HTML Purifier</title>
-
-</head><body>
-
-<h1 class="subtitled">IDs</h1>
-<div class="subtitle">What they are, why you should(n't) wear them, and how to deal with it</div>
-
-<div id="filing">Filed under End-User</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
-<p>Prior to HTML Purifier 1.2.0, this library blithely accepted user input that
-looked like this:</p>
-
-<pre>&lt;a id=&quot;fragment&quot;&gt;Anchor&lt;/a&gt;</pre>
-
-<p>...presenting an attractive vector for those that would destroy standards
-compliance: simply set the ID to one that is already used elsewhere in the
-document and voila: validation breaks.  There was a half-hearted attempt to
-prevent this by allowing users to blacklist IDs, but I suspect that no one
-really bothered, and thus, with the release of 1.2.0, IDs are now <em>removed</em>
-by default.</p>
-
-<p>IDs, however, are quite useful functionality to have, so if users start
-complaining about broken anchors you'll probably want to turn them back on
-with %HTML.EnableAttrID. But before you go mucking around with the config
-object, it's probably worth taking some precautions to keep your page
-validating. Why?</p>
-
-<ol>
-   <li>Standards-compliant pages are good</li>
-   <li>Duplicated IDs interfere with anchors.  If there are two id="foobar"s in a
-   document, which spot does a browser presented with the fragment #foobar go
-   to? Most browsers opt for the first appearing ID, making it impossible
-   to reference the second section. Similarly, duplicated IDs can hijack
-   client-side scripting that relies on the IDs of elements.</li>
-</ol>
-
-<p>You have (currently) four ways of dealing with the problem.</p>
-
-
-
-<h2 class="subtitled">Blacklisting IDs</h2>
-<div class="subsubtitle">Good for pages with single content source and stable templates</div>
-
-<p>Keeping in terms with the
-<acronym title="Keep It Simple, Stupid">KISS</acronym> principle, let us
-deal with the most obvious solution: preventing users from using any IDs that
-appear elsewhere on the document.  The method is simple:</p>
-
-<pre>$config->set('HTML', 'EnableAttrID', true);
-$config->set('Attr', 'IDBlacklist', array(
-    'list', 'of', 'attributes', 'that', 'are', 'forbidden'
-));</pre>
-
-<p>That being said, there are some notable drawbacks.  First of all, you have to
-know precisely which IDs are being used by the HTML surrounding the user code.
-This is easier said than done: quite often the page designer and the system
-coder work separately, so the designer has to constantly be talking with the
-coder whenever he decides to add a new anchor.  Miss one and you open yourself
-to possible standards-compliance issues.</p>
-
-<p>Furthermore, this position becomes untenable when a single web page must hold
-multiple portions of user-submitted content.  Since there's obviously no way
-to find out before-hand what IDs users will use, the blacklist is helpless.
-And since HTML Purifier validates each segment separately, perhaps doing
-so at different times, it would be extremely difficult to dynamically update
-the blacklist in between runs.</p>
-
-<p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after
-all, they might have simply specified a duplicate ID by accident.</p>
-
-<p>Thus, we get to our second method.</p>
-
-
-
-<h2 class="subtitled">Namespacing IDs</h2>
-<div class="subsubtitle">Lazy developer's way, but needs user education</div>
-
-<p>This method, too, is quite simple: add a prefix to all user IDs. With this
-code:</p>
-
-<pre>$config->set('HTML', 'EnableAttrID', true);
-$config->set('Attr', 'IDPrefix', 'user_');</pre>
-
-<p>...this:</p>
-
-<pre>&lt;a id=&quot;foobar&quot;&gt;Anchor!&lt;/a&gt;</pre>
-
-<p>...turns into:</p>
-
-<pre>&lt;a id=&quot;user_foobar&quot;&gt;Anchor!&lt;/a&gt;</pre>
-
-<p>As long as you don't have any IDs that start with user_, collisions are
-guaranteed not to happen.  The drawback is obvious: if a user submits
-id=&quot;foobar&quot;, they probably expect to be able to reference their page with
-#foobar. You'll have to tell them, &quot;No, that doesn't work, you have to add
-user_ to the beginning.&quot;</p>
-
-<p>And yes, things get hairier.  Even with a nice prefix, we still have done
-nothing about multiple HTML Purifier outputs on one page.  Thus, we have
-a second configuration value to piggy-back off of: %Attr.IDPrefixLocal:</p>
-
-<pre>$config->set('Attr', 'IDPrefixLocal', 'comment' . $id . '_');</pre>
-
-<p>This new directive does nothing but append to the regular IDPrefix, but is
-special in that it is volatile: its value is determined at run-time and
-cannot possibly be cordoned into, say, a .ini config file.  What to
-put into the directive is up to you, but I would recommend the ID number
-the text has been assigned in the database.  Whatever you pick, however, it
-has to be unique and stable for the text you are validating.  Note, however,
-that we require that %Attr.IDPrefix be set before you use this directive.</p>
-
-<p>And also remember: the user has to know what this prefix is too!</p>
-
-
-
-<h2>Abstinence</h2>
-
-<p>You may not want to bother. That's okay too, just don't enable IDs.</p>
-
-<p>Personally, I would take this road whenever user-submitted content could
-possibly be shown together on one page.  Why a blog comment would need to use
-anchors is beyond me.</p>
-
-
-
-<h2>Denial</h2>
-
-<p>To revert back to pre-1.2.0 behavior, simply:</p>
-
-<pre>$config->set('HTML', 'EnableAttrID', true);</pre>
-
-<p>Don't come crying to me when your page mysteriously stops validating, though.</p>
-
-<div id="version">$Id$</div>
-
-</body>
-</html>
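
The blacklisting and namespacing approaches described in the removed
enduser-id.html can be summarized in a small sketch.  The function below is
hypothetical and is not how HTML Purifier implements these directives; it
assumes the local prefix follows the regular prefix, that the blacklist is
checked against the final ID, and that a local prefix is ignored when no
regular prefix is set.

```php
<?php
// Hypothetical sketch of ID handling; not HTML Purifier's implementation.
// Returns the rewritten ID, or null if the ID should be removed.
function sketch_rewrite_id($id, $blacklist, $prefix = '', $prefixLocal = '')
{
    // The document requires IDPrefix to be set before IDPrefixLocal is
    // used; this sketch simply ignores the local prefix otherwise.
    if ($prefix === '') {
        $prefixLocal = '';
    }
    $final = $prefix . $prefixLocal . $id;
    // Assumption: the blacklist applies to the ID as it will appear on
    // the page, since that is what collides with template IDs.
    if (in_array($final, $blacklist, true)) {
        return null;
    }
    return $final;
}

// Namespacing: id="foobar" becomes id="user_foobar".
echo sketch_rewrite_id('foobar', array(), 'user_'), "\n";
// Per-comment namespacing, using a database ID as the volatile part.
echo sketch_rewrite_id('foobar', array(), 'user_', 'comment12_'), "\n";
```

The second call shows why the local prefix has to be unique and stable for
each piece of text: two comments that both use id="foobar" end up with
different final IDs.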
--- a/docs/examples/demo.php
+++ b/docs/examples/demo.php
@@ -1,66 +1,34 @@
 <?php

-// using _REQUEST because we accept GET and POST requests
+header('Content-type:text/html;charset=UTF-8');

-$content = empty($_REQUEST['xml']) ? 'text/html' : 'application/xhtml+xml';
-header("Content-type:$content;charset=UTF-8");
-
-// prevent PHP versions with shorttags from barfing
-echo '<?xml version="1.0" encoding="UTF-8" ?>
-';
-
-function getFormMethod() {
-    return (isset($_REQUEST['post'])) ? 'post' : 'get';
-}
-
-if (empty($_REQUEST['strict'])) {
-?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+?><!DOCTYPE html 
+     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<?php
-} else {
-?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<?php
-}
-?>
-<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
+<html>
 <head>
-<title>HTML Purifier Live Demo</title>
+<title>HTMLPurifier Live Demo</title>
 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTML Purifier Live Demo</h1>
+<h1>HTMLPurifier Live Demo</h1>
 <?php

-require_once '../../library/HTMLPurifier.auto.php';
+set_include_path('../../library' . PATH_SEPARATOR . get_include_path());
+require_once 'HTMLPurifier.php';

-if (!empty($_REQUEST['html'])) { // start result
+if (!empty($_POST['html'])) {
    
-    if (strlen($_REQUEST['html']) > 50000) {
-        ?>
-        <p>Request exceeds maximum allowed text size of 50kb.</p>
-        <?php
-    } else { // start main processing
-    
-    $html = get_magic_quotes_gpc() ? stripslashes($_REQUEST['html']) : $_REQUEST['html'];
+    $html = get_magic_quotes_gpc() ? stripslashes($_POST['html']) : $_POST['html'];
    
    $config = HTMLPurifier_Config::createDefault();
-    $config->set('Core', 'TidyFormat', !empty($_REQUEST['tidy']));
-    $config->set('HTML', 'Strict',     !empty($_REQUEST['strict']));
+    $config->set('Core', 'TidyFormat', !empty($_POST['tidy']));
    $purifier = new HTMLPurifier($config);
    $pure_html = $purifier->purify($html);
    
 ?>
 <p>Here is your purified HTML:</p>
 <div style="border:5px solid #CCC;margin:0 10%;padding:1em;">
-<?php if(getFormMethod() == 'get') { ?>
-<div style="float:right;">
-    <a href="http://validator.w3.org/check?uri=referer"><img
-        src="http://www.w3.org/Icons/valid-xhtml10"
-        alt="Valid XHTML 1.0 Transitional" height="31" width="88" style="border:0;" /></a>
-</div>
-<?php } ?>
 <?php

 echo $pure_html;
@@ -75,34 +43,23 @@ echo htmlspecialchars($pure_html, ENT_COMPAT, 'UTF-8');

 ?></pre>
 <?php
-if (getFormMethod() == 'post') { // start POST validation notice
-?>
-<p>If you would like to validate the code with
-<a href="http://validator.w3.org/#validate-by-input">W3C's
-validator</a>, copy and paste the <em>entire</em> demo page's source.</p>
-<?php
-} // end POST validation notice
    
-} // end main processing
-
-// end result
 } else {

 ?>
-<p>Welcome to the live demo.  Enter some HTML and see how HTML Purifier
+<p>Welcome to the live demo.  Enter some HTML and see how HTMLPurifier
 will filter it.</p>
 <?php

 }

 ?>
-<form id="filter" action="demo.php<?php
-echo '?' . getFormMethod();
-if (isset($_REQUEST['profile']) || isset($_REQUEST['XDEBUG_PROFILE'])) {
-    echo '&amp;XDEBUG_PROFILE=1';
-} ?>" method="<?php echo getFormMethod();  ?>">
+<form name="filter" action="demo.php<?php
+if (isset($_GET['profile']) || isset($_GET['XDEBUG_PROFILE'])) {
+    echo '?XDEBUG_PROFILE=1';
+} ?>" method="post">
    <fieldset>
-        <legend>HTML Purifier Input (<?php echo getFormMethod(); ?>)</legend>
+        <legend>HTML</legend>
        <textarea name="html" cols="60" rows="15"><?php

 if (isset($html)) {
@@ -110,27 +67,13 @@ if (isset($html)) {
            HTMLPurifier_Encoder::cleanUTF8($html), ENT_COMPAT, 'UTF-8');
 }
        ?></textarea>
-        <?php if (getFormMethod() == 'get') { ?>
-            <p><strong>Warning:</strong> GET request method can only hold
-                8129 characters (probably less depending on your browser).
-                If you need to test anything
-                larger than that, try the <a href="demo.php?post">POST form</a>.</p>
-        <?php } ?>
-        <?php if (extension_loaded('tidy')) { ?>
        <div>Nicely format output with Tidy? <input type="checkbox" value="1"
-            name="tidy"<?php if (!empty($_REQUEST['tidy'])) echo ' checked="checked"'; ?> /></div>
-        <?php } ?>
-        <div>XHTML 1.0 Strict output? <input type="checkbox" value="1"
-        name="strict"<?php if (!empty($_REQUEST['strict'])) echo ' checked="checked"'; ?> /></div>
-        <div>Serve as application/xhtml+xml? (not for IE) <input type="checkbox" value="1"
-        name="xml"<?php if (!empty($_REQUEST['xml'])) echo ' checked="checked"'; ?> /></div>
+        name="tidy"<?php if (!empty($_POST['tidy'])) echo ' checked="checked"'; ?> /></div>
        <div>
            <input type="submit" value="Submit" name="submit" class="button" />
        </div>
    </fieldset>
 </form>
-<p>Return to <a href="http://hp.jpsband.org/">HTML Purifier's home page</a>.
-Try the form in <a href="demo.php?get">GET</a> and <a href="demo.php?post">POST</a> request
-flavors (GET is easy to validate with W3C, but POST allows larger inputs).</p>
+<p>Return to <a href="http://hp.jpsband.org/">HTMLPurifier's home page</a>.</p>
 </body>
 </html>
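
One difference between the two versions of demo.php above is the input-size
guard, which exists only in the removed version.  A minimal sketch of such a
guard is shown below; the function name and return convention are
assumptions made for this example, and the 50000-byte cap is taken from the
removed lines.  Magic-quotes handling is left out because
get_magic_quotes_gpc() no longer exists in PHP 8.

```php
<?php
// Hypothetical helper; not part of demo.php or HTML Purifier.
// Returns the submitted HTML, null if nothing was submitted, or false
// if the submission exceeds $maxBytes.
function sketch_read_html($request, $maxBytes = 50000)
{
    if (empty($request['html']) || !is_string($request['html'])) {
        return null;
    }
    if (strlen($request['html']) > $maxBytes) {
        return false;
    }
    return $request['html'];
}

var_dump(sketch_read_html(array('html' => '<b>x</b>')));
```

In the demo this would be called with $_POST (or $_REQUEST in the GET/POST
variant) before the input is handed to the purifier.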
--- a/docs/filter-levels.txt
+++ b/docs/filter-levels.txt
@@ -0,0 +1,67 @@
+
+Filter Levels
+    When one size *does not* fit all
+
+The more I think about it, the less sense it makes to maintain one huge
+monolithic HTMLDefinition class.  There's simply so much variation that
+could go into this definition: the set of HTML good for blog entries is
+definitely too large for HTML that would be allowed in blog comments. Going
+from Transitional to Strict requires changes to the definition.
+
+However, allowing users to specify their own whitelists was an idea I
+rejected from the start.  Simply put, the typical programmer is too lazy
+to actually go through the trouble of investigating which tags, attributes
+and properties to allow.  HTMLDefinition makes up a big part of what
+HTMLPurifier is.
+
+The idea, then, is to set up fundamentally different sets of definitions, which
+can further be customized using simpler configuration options.
+
+Here are some fuzzy levels you could set:
+
+1. Comments - Wordpress recommends a, abbr, acronym, b, blockquote, cite,
+    code, em, i, strike, strong; however, you could get away with only a, b and
+    i; also having p and pre tags would be helpful.
+2. Pages - As permissive as possible without allowing XSS.  No protection
+    against bad design sense, unfortunately.  Suitable for wiki and page
+    environments.
+3. Lint - Accept everything in the spec, a Tidy wannabe.
+
+I've also decomposed tags into risk levels.  An asterisk indicates that no one
+really uses that tag, tilde indicates it's deprecated.
+
+1 - blockquote, code, em, i, p, tt / strong, sub, sup
+1* - abbr, acronym, bdo, cite, dfn, kbd, q, samp
+2 - b, br, del, div, pre, span / ins, s, strike ~ u
+3 - h2, h3, h4, h5, h6 ~ center
+4 - h1, big ~ font
+5 - a
+7 - area, map
+
+Lists - dd, dl, dt, li, ol, ul ~ menu, dir
+Tables - caption, table, td, th, tr / col, colgroup, tbody, tfoot, thead
+Forms - fieldset, form, input, label, legend, optgroup, option, select, textarea
+XSS - noscript, object, script ~ applet
+
+Meta - base, basefont, body, head, html, link, meta, style, title
+Frames - frame, frameset, iframe
+
+And tag specific notes:
+
+a - general problems involving linkspam
+b - too much bold is bad, typographically speaking bold is discouraged
+br - often misused
+center - CSS, usually no legit use
+del - only useful in editing context
+div - little meaning in certain contexts i.e. blog comment
+h1 - usually no legit use, as header is already set by application
+h* - not needed in blog comments
+hr - usually not necessary in blog comments
+img - could be extremely undesirable if linking to external pics
+pre - could use formatting, only useful in code contexts
+q - very little support
+s - transform into span with styling or del?
+small - technically presentational
+span - depends on attribute allowances
+sub, sup - specialized
+u - little legit use, prefer class with text-decoration
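
As a rough illustration of the risk levels in filter-levels.txt, the sketch
below stores the numbered tiers as a lookup table and builds a whitelist of
every tag at or below a chosen risk.  The table copies the tag lists from the
proposal (the 1* tier and the usage/deprecation markers are left out for
brevity); the function name and the cumulative interpretation of the levels
are assumptions, not something the proposal specifies.

```php
<?php
// Risk tiers copied from the proposal; the "1*" tier is omitted.
$sketchRiskLevels = array(
    1 => array('blockquote', 'code', 'em', 'i', 'p', 'tt', 'strong', 'sub', 'sup'),
    2 => array('b', 'br', 'del', 'div', 'pre', 'span', 'ins', 's', 'strike', 'u'),
    3 => array('h2', 'h3', 'h4', 'h5', 'h6', 'center'),
    4 => array('h1', 'big', 'font'),
    5 => array('a'),
    7 => array('area', 'map'),
);

// Hypothetical helper: collect every tag whose risk is <= $maxRisk.
function sketch_tags_up_to_risk($riskLevels, $maxRisk)
{
    $allowed = array();
    foreach ($riskLevels as $risk => $tags) {
        if ($risk <= $maxRisk) {
            $allowed = array_merge($allowed, $tags);
        }
    }
    return $allowed;
}

// A conservative blog-comment profile might stop at risk level 2.
echo implode(', ', sketch_tags_up_to_risk($sketchRiskLevels, 2)), "\n";
```

The named groups (Lists, Tables, Forms and so on) could be handled the same
way, as separate switches layered on top of the numeric level.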
--- a/docs/index.html
+++ b/docs/index.html
@@ -1,149 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Index to all HTML Purifier documentation." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>Documentation - HTML Purifier</title>
-
-</head>
-<body>
-
-<h1>Documentation</h1>
-
-<p><strong>HTML Purifier</strong> has documentation for all types of people.
-Here is an index of all of them.</p>
-
-<h2>End-user</h2>
-<p>End-user documentation that contains articles, tutorials and useful
-information for casual developers using HTML Purifier.</p>
-
-<dl>
-
-<dt><a href="enduser-id.html">IDs</a></dt>
-<dd>Explains various methods for allowing IDs in documents safely in HTML Purifier.</dd>
-
-</dl>
-
-<h2>Development</h2>
-<p>Developer documentation detailing code issues, roadmaps and project
-conventions.</p>
-
-<dl>
-
-<dt><a href="dev-code-quality.html">Code Quality Issues</a></dt>
-<dd>Discusses code quality issues and places that need to be refactored.</dd>
-
-<dt><a href="dev-progress.html">Implementation Progress</a></dt>
-<dd>Tables detailing HTML element and CSS property implementation coverage.</dd>
-
-<dt><a href="dev-naming.html">Naming Conventions</a></dt>
-<dd>Defines class naming conventions.</dd>
-
-<dt><a href="dev-optimization.html">Optimization</a></dt>
-<dd>Discusses possible methods of optimizing HTML Purifier.</dd>
-
-</dl>
-
-<h2>Proposals</h2>
-<p>Proposed features, as well as the associated rambling to get a clear
-objective in place before attempted implementation.</p>
-
-<dl>
-<dt><a href="proposal-colors.html">Colors</a></dt>
-<dd>Proposal to allow for color constraints.</dd>
-</dl>
-
-<h2>Reference</h2>
-<p>Miscellaneous essays, research pieces and other reference type material
-that may not directly discuss HTML Purifier.</p>
-
-<dl>
-<dt><a href="ref-devnetwork.html">DevNetwork Credits</a></dt>
-<dd>Credits and links to DevNetwork forum topics.</dd>
-</dl>
-
-<h2>Internal memos</h2>
-
-<p>Plaintext documents that are more for use by active developers of
-the code. They may be upgraded to HTML files or stay as TXT scratchpads.</p>
-
-<table class="table">
-
-<thead><tr>
-    <th width="10%">Type</th>
-    <th width="20%">Name</th>
-    <th>Description</th>
-</tr></thead>
-
-<tbody>
-
-<tr>
-    <td>End-user</td>
-    <td><a href="enduser-overview.txt">Overview</a></td>
-    <td>High level overview of the general control flow (mostly obsolete).</td>
-</tr>
-
-<tr>
-    <td>End-user</td>
-    <td><a href="enduser-security.txt">Security</a></td>
-    <td>Common security issues that may still arise (half-baked).</td>
-</tr>
-
-<tr>
-    <td>Proposal</td>
-    <td><a href="proposal-filter-levels.txt">Filter levels</a></td>
-    <td>Outlines details of projected configurable level of filtering.</td>
-</tr>
-
-<tr>
-    <td>Proposal</td>
-    <td><a href="proposal-language.txt">Language</a></td>
-    <td>Specification of I18N for error messages derived from MediaWiki (half-baked).</td>
-</tr>
-
-<tr>
-    <td>Proposal</td>
-    <td><a href="proposal-new-directives.txt">New directives</a></td>
-    <td>Assorted configuration options that could be implemented.</td>
-</tr>
-
-<tr>
-    <td>Reference</td>
-    <td><a href="ref-loose-vs-strict.txt">Loose vs. Strict</a></td>
-    <td>Differences between HTML Strict and Transitional versions.</td>
-</tr>
-
-<tr>
-    <td>Reference</td>
-    <td><a href="ref-proprietary-tags.txt">Proprietary tags</a></td>
-    <td>List of vendor-specific tags we may want to transform to W3C compliant markup.</td>
-</tr>
-
-<tr>
-    <td>Reference</td>
-    <td><a href="ref-strictness.txt">Strictness</a></td>
-    <td>Short essay on how loose definition isn't really loose.</td>
-</tr>
-
-<tr>
-    <td>Reference</td>
-    <td><a href="ref-xhtml-1.1.txt">XHTML 1.1</a></td>
-    <td>What we'd have to do to support XHTML 1.1.</td>
-</tr>
-
-<tr>
-    <td>Reference</td>
-    <td><a href="ref-whatwg.txt">WHATWG</a></td>
-    <td>How WHATWG plays into what we need to do.</td>
-</tr>
-
-</tbody>
-
-</table>
-
-<div id="version">$Id$</div>
-</body>
-</html>
--- a/docs/naming.txt
+++ b/docs/naming.txt
@@ -0,0 +1,56 @@
+
+Naming
+
+The classes in this library follow a few naming conventions, which may
+help you find the correct functionality more quickly.  Here they are:
+
+All classes occupy the HTMLPurifier pseudo-namespace.
+    This means that all classes are prefixed with HTMLPurifier_.  As such, all
+    names under HTMLPurifier_ are reserved.  I recommend that you use the name
+    HTMLPurifierX_YourName_ClassName, especially if you want to take advantage
+    of HTMLPurifier_ConfigDef.
+
+All classes correspond to their path if library/ was in the include path
+    HTMLPurifier_AttrDef is located at HTMLPurifier/AttrDef.php; replace
+    underscores with slashes and append .php and you'll have the location of
+    the class.
+
+Harness and Test are reserved class names for unit tests
+    The suffix "Test" indicates that the class is a subclass of UnitTestCase
+    (of the Simpletest library) and is testable. "Harness" indicates a subclass
+    of UnitTestCase that is not meant to be run but to be extended into 
+    concrete test cases and contains custom test methods (i.e. assert*())
+
+Class names do not necessarily represent inheritance hierarchies
+    While we try to reflect inheritance in naming to some extent, it is not
+    guaranteed (for instance, none of the classes inherit from HTMLPurifier,
+    the base class).  However, all class files have the require_once
+    declarations to whichever classes they are tightly coupled to.
+
+Strategy has a meaning different from the Gang of Four pattern
+    In Design Patterns, the Gang of Four describes a Strategy object as
+    encapsulating algorithms so that they can be switched at run-time.  While
+    our strategies are indeed algorithms, they are not meant to be substituted:
+    all must be present in order for proper functioning.
+
+Abbreviations are avoided
+    We try to avoid abbreviations as much as possible, but in some cases, 
+    the abbreviated version is more readable than the full version. Here, we
+    list common abbreviations:
+        Attr(s) -> Attribute(s)
+        Def -> Definition
+
+Ambiguity concerning the definition of Def/Definition
+    While a definition normally defines the structure/acceptable values of
+    an entity, most of the definitions in this application also attempt
+    to validate and fix the value.  I am unsure of a better name, as
+    "Validator" would exclude fixing the value, "Fixer" doesn't invoke
+    the proper image of "fixing" something, and "ValidatorFixer" is too long!
+    Some other suggestions were "Handler", "Reference", "Check", "Fix",
+    "Repair" and "Heal".
+
+Transform not Transformer
+    Transform is both a noun and a verb, and thus we define a "Transform" as
+    something that "transforms," leaving "Transformer" (which sounds like an
+    electrical device/robot toy).
+
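The class-to-path rule described in naming.txt above (replace underscores with slashes and append .php) can be sketched as follows. This is only an illustration of the convention: `class_name_to_path` is a hypothetical helper name, not a function provided by HTML Purifier, and the result is assumed to be relative to library/ being on the include path.

```php
<?php

/**
 * Hypothetical helper (not part of HTML Purifier): maps a class name to
 * its file path relative to library/, following the rule in naming.txt.
 */
function class_name_to_path($class_name) {
    // Replace underscores with slashes, then append the .php extension.
    return str_replace('_', '/', $class_name) . '.php';
}

// The example given in naming.txt:
echo class_name_to_path('HTMLPurifier_AttrDef'), "\n"; // HTMLPurifier/AttrDef.php
```

A class name with no underscores, such as HTMLPurifier itself, simply gains the .php suffix under this rule.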
--- a/docs/optimization.txt
+++ b/docs/optimization.txt
@@ -0,0 +1,12 @@
+
+Optimization
+
+Here are some possible optimization techniques we can apply to code sections if
+they turn out to be slow.  Be sure not to prematurely optimize: if you get
+that itch, put it here!
+
+ - Make Tokens Flyweights (may prove problematic, probably not worth it)
+ - Rewrite regexps into PHP code
+ - Serialize the Definition object
+ - Batch regexp validation (do as many per function call as possible)
+ - Parallelize strategies
--- a/docs/dev-progress.html
+++ b/docs/dev-progress.html
@@ -2,11 +2,8 @@
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Tables detailing HTML element and CSS property implementation coverage in HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />

-<title>Implementation Progress - HTML Purifier</title>
+<title>HTMLPurifier Progress</title>

 <style type="text/css">

@@ -28,10 +25,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}

 </head><body>

-<h1>Implementation Progress</h1>
-
-<div id="filing">Filed under Development</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
+<h1>HTMLPurifier Progress</h1>

 <h2>Key</h2>

@@ -44,7 +38,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
 <tr><td class="feature">Feature, requires extra work</td></tr>
 </tbody></table>

-<h2>CSS</h2>
+<h3>CSS</h3>

 <table cellspacing="0">

@@ -128,20 +122,19 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}

 <tbody>
 <tr><th colspan="2">Absolute positioning, unknown release milestone</th></tr>
-<tr class="danger impl-no"><td>bottom</td><td rowspan="4">Dangerous, must be non-negative to even be considered,
-    but it's still possible to arbitrarily position by running over.</td></tr>
-<tr class="danger impl-no"><td>left</td></tr>
-<tr class="danger impl-no"><td>right</td></tr>
-<tr class="danger impl-no"><td>top</td></tr>
-<tr class="impl-no"><td>clip</td><td>-</td></tr>
-<tr class="danger impl-no"><td>position</td><td>ENUM(static, relative, absolute, fixed)
+<tr class="danger"><td>bottom</td><td rowspan="4">Dangerous, must be non-negative</td></tr>
+<tr class="danger"><td>left</td></tr>
+<tr class="danger"><td>right</td></tr>
+<tr class="danger"><td>top</td></tr>
+<tr><td>clip</td><td>-</td></tr>
+<tr class="danger"><td>position</td><td>ENUM(static, relative, absolute, fixed), permit
    relative not absolute?</td></tr>
-<tr class="danger impl-no"><td>z-index</td><td>Dangerous</td></tr>
+<tr class="danger"><td>z-index</td><td>Dangerous</td></tr>
 </tbody>

 <tbody>
 <tr><th colspan="2">Unknown</th></tr>
-<tr class="danger css1"><td>background-image</td><td>Dangerous, target milestone 1.3</td></tr>
+<tr class="danger css1"><td>background-image</td><td>Dangerous, target milestone 1.2</td></tr>
 <tr class="css1"><td>background-attachment</td><td>ENUM(scroll, fixed),
    Depends on background-image</td></tr>
 <tr class="css1"><td>background-position</td><td>Depends on background-image</td></tr>
@@ -151,7 +144,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
    inline-block has incomplete IE6 support and requires -moz-inline-box
    for Mozilla. Unknown target milestone.</td></tr>
 <tr><td class="css1">height</td><td>Interesting, why use it? Unknown target milestone.</td></tr>
-<tr class="danger css1"><td>list-style-image</td><td>Dangerous? Target milestone 1.3</td></tr>
+<tr class="danger css1"><td>list-style-image</td><td>Dangerous? Target milestone 1.2</td></tr>
 <tr class="impl-no"><td>max-height</td><td rowspan="4">No IE 5/6</td></tr>
 <tr class="impl-no"><td>min-height</td></tr>
 <tr class="impl-no"><td>max-width</td></tr>
@@ -237,7 +230,7 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>
 <tr><th colspan="3">Questionable</th></tr>
 <tr class="impl-no"><td>accesskey</td><td>A</td><td>May interfere with main interface</td></tr>
 <tr class="impl-no"><td>tabindex</td><td>A</td><td>May interfere with main interface</td></tr>
-<tr><td>target</td><td>A</td><td>Config enabled, only useful for frame layouts, disallowed in strict</td></tr>
+<tr><td>target</td><td>A</td><td>Config enabled, only useful for frame layouts</td></tr>
 </tbody>

 <tbody>
@@ -284,11 +277,11 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>
 <tr><td>nowrap</td><td>TD, TH</td><td>Boolean, style 'white-space:nowrap;' (not compat with IE5)</td></tr>
 <tr><td>size</td><td>HR</td><td>Near-equiv 'width', needs px suffix if original was pixels</td></tr>
 <tr class="required impl-yes"><td>src</td><td>IMG</td><td>Required, insert blank or default img if not set</td></tr>
-<tr class="impl-yes"><td>start</td><td>OL</td><td>Poorly supported 'counter-reset', allowed in loose, dropped in strict</td></tr>
+<tr><td>start</td><td>OL</td><td>Poorly supported 'counter-reset', transform may not be desirable</td></tr>
 <tr><td rowspan="3">type</td><td>LI</td><td rowspan="3">Equivalent style 'list-style-type', different allowed values though. (needs testing)</td></tr>
    <tr><td>OL</td></tr>
    <tr><td>UL</td></tr>
-<tr class="impl-yes"><td>value</td><td>LI</td><td>Poorly supported 'counter-reset', allowed in loose, dropped in strict</td></tr>
+<tr><td>value</td><td>LI</td><td>Poorly supported 'counter-reset', transform may not be desirable, see ol.start. Configurable.</td></tr>
 <tr><td>vspace</td><td>IMG</td><td>Near-equiv styles 'margin-left' and 'margin-right', needs px suffix, see hspace</td></tr>
 <tr><td rowspan="2">width</td><td>HR</td><td rowspan="2">Near-equiv style 'width', needs px suffix if original was pixels</td></tr>
    <tr><td>TD, TH</td></tr>
@@ -296,6 +289,4 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>

 </table>

-<div id="version">$Id$</div>
-
 </body></html>
--- a/docs/proposal-colors.html
+++ b/docs/proposal-colors.html
@@ -1,47 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
-    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-<meta name="description" content="Proposal to allow for color constraints in HTML Purifier." />
-<link rel="stylesheet" type="text/css" href="./style.css" />
-
-<title>Proposal: Colors - HTML Purifier</title>
-
-</head><body>
-
-<h1 class="subtitled">Colors</h1>
-<div class="subtitle">Hammering some sense into those color-blind newbies</div>
-
-<div id="filing">Filed under Proposals</div>
-<div id="index">Return to the <a href="index.html">index</a>.</div>
-
-<p>Your website probably has a color-scheme.
-<span style="color:#090; background:#FFF;">Green on white</span>,
-<span style="color:#A0F; background:#FF0;">purple on yellow</span>,
-whatever. When you give users the ability to style their content, you may
-want them to keep in line with your styling. If your website is all
-about light colors, you don't want a user to come in and vandalize your
-page with a deep maroon.</p>
-
-<p>This is an extremely silly feature proposal, but I'm writing it down anyway.</p>
-
-<p>What if the user could constrain the colors specified in inline styles? You
-are only allowed to use these shades of dark green for text and these shades
-of light yellow for the background. At the very least, you could ensure
-that we did not have pale yellow text on a white background.</p>
-
-<h2>Implementation issues</h2>
-
-<ol>
-<li>Requires the color attribute definition to know, currently, what the text
-and background colors are. This becomes difficult when classes are thrown
-into the mix.</li>
-<li>The user still has to define the permissible colors, how does one do
-something like that?</li>
-</ol>
-
-<div id="version">$Id$</div>
-
-</body>
-</html>
--- a/docs/proposal-filter-levels.txt
+++ b/docs/proposal-filter-levels.txt
@@ -1,130 +0,0 @@
-
-Filter Levels
-    When one size *does not* fit all
-
-The more I think about it, the less sense it makes for maintaining one huge
-monolithic HTMLDefinition class.  There's simply so much variation that
-could go into this definition: the set of HTML good for blog entries is
-definitely too large for HTML that would be allowed in blog comments. Going
-from Transitional to Strict requires changes to the definition.
-
-Allowing users to specify their own whitelists is one step (implemented, btw), 
-but I have doubts on only doing this. Simply put, the typical programmer is too 
-lazy to actually go through the trouble of investigating which tags, attributes 
-and properties to allow. HTMLDefinition makes a big part of what HTMLPurifier 
-is. 
-
-The idea, then, is to set up fundamentally different sets of definitions, which
-can further be customized using simpler configuration options.
-
-Here are some fuzzy levels you could set:
-
-1. Comments - Wordpress recommends a, abbr, acronym, b, blockquote, cite,
-    code, em, i, strike, strong; however, you could get away with only a, em and
-    p; also having blockquote and pre tags would be helpful.
-2. BBCode - Emulate the usual tagset for forums: b, i, img, a, blockquote,
-    pre, div, span and h[2-6] (the last three are for specially formatted
-    posts, div and span require associated classes or inline styling enabled
-    to be useful)
-3. Pages - As permissive as possible without allowing XSS.  No protection
-    against bad design sense, unfortunately.  Suitable for wiki and page
-    environments. (probably what we have now)
-4. Lint - Accept everything in the spec, a Tidy wannabe. (This probably won't
-    get implemented as it would require routines for things like <object>
-    and friends to be implemented, which is a lot of work for not a lot of
-    benefit)
-
-One final note: when you start axing tags that are more commonly used, you
-run the risk of accidentally destroying user data, especially if the data
-is incoming from a WYSIWYG editor that hasn't been synced accordingly. This may
-make forbidden element to text transformations desirable (for example, images).
-
-
-
-== Element Risk Analysis ==
-
-Legend:
-    [danger level] - regular tags / uncommon tags ~ deprecated tags
-    [danger level]* - rare tags
-
-1 - blockquote, code, em, i, p, tt / strong, sub, sup
-1* - abbr, acronym, bdo, cite, dfn, kbd, q, samp
-2 - b, br, del, div, pre, span / ins, s, strike ~ u
-3 - h2, h3, h4, h5, h6 ~ center
-4 - h1, big ~ font
-5 - a
-7 - area, map
-
-These are special use tags, they should be enabled on a blanket basis.
-
-Lists - dd, dl, dt, li, ol, ul ~ menu, dir
-Tables - caption, table, td, th, tr / col, colgroup, tbody, tfoot, thead
-
-Forms - fieldset, form, input, label, legend, optgroup, option, select, textarea
-XSS - noscript, object, script ~ applet
-Meta - base, basefont, body, head, html, link, meta, style, title
-Frames - frame, frameset, iframe
-
-And tag specific notes:
-
-a   - general problems involving linkspam
-b   - too much bold is bad, typographically speaking bold is discouraged
-br  - often misused
-center - CSS, usually no legit use
-del - only useful in editing context
-div - little meaning in certain contexts i.e. blog comment
-h1  - usually no legit use, as header is already set by application
-h*  - not needed in blog comments
-hr  - usually not necessary in blog comments
-img - could be extremely undesirable if linking to external pics (CSRF, goatse)
-pre - could use formatting, only useful in code contexts
-q   - very little support
-s   - transform into span with styling or del?
-small - technically presentational
-span - depends on attribute allowances
-sub, sup - specialized
-u   - little legit use, prefer class with text-decoration
-
-Based on the riskiness of the items, we may want to offer %HTML.DisableImages
-attribute and put URI filtering higher up on the priority list.
-
-
-== Attribute Risk Analysis ==
-
-We actually have a surprisingly small assortment of allowed attributes (the
-rest are deprecated in strict, and thus we opted not to allow them, even
-though our output is XHTML Transitional by default.)
-
-Required URI - img.alt, img.src, a.href
-Medium risk - *.class, *.dir
-High risk - img.height, img.width, *.id, *.style
-
-Table - colgroup/col.span, td/th.rowspan, td/th.colspan
-Uncommon - *.title, *.lang, *.xml:lang
-Rare - td/th.abbr, table.summary, {table}.charoff
-Rare URI - del.cite, ins.cite, blockquote.cite, q.cite, img.longdesc
-Presentational - {table}.align, {table}.valign, table.frame, table.rules,
-    table.border
-Partially presentational - table.cellpadding, table.cellspacing,
-    table.width, col.width, colgroup.width
-
-
-== CSS Risk Analysis ==
-
-There are certain CSS elements that are extremely useful inline, but then
-as you get to more presentation oriented styling it may not always be
-appropriate to inline them.
-
-Useful - clear, float, border-collapse, caption-side
-
-These CSS properties can break layouts if used improperly. We have excluded
-any CSS properties that are not currently implemented (such as position).
-
-Dangerous, can go outside container - float
-Easy to abuse - font-size, font-family (font), width
-Colored - background-color (background), border-color (border), color
-Dramatic - border, list-style-position (list-style), margin, padding,
-    text-align, text-indent, text-transform, vertical-align, line-height
-
-Dramatic elements substantially change the look of text in ways that should
-probably have been reserved to other areas.
--- a/docs/proposal-language.txt
+++ b/docs/proposal-language.txt
@@ -1,98 +0,0 @@
-We are going to model our I18N/L10N off of MediaWiki's system.  Theirs is
-obviously quite complicated, so we're going to simplify it a bit for our needs.
-
-== Structure ==
-
-First, you have a Language object.  This object contains all the localisable
-message strings, as well as other important language-specific settings and
-custom behavior (uppercasing, lowercasing, printing dates, formatting
-numbers, etc.)
-
-The object is constructed from two sources: subclassed versions of itself
-(classes) and Message files (messages).
-
-== General use ==
-
-You load a language object by calling the Language::factory() function. 
-This function loads the class file for the object (taking into account fallback
-languages by using the fallback language's object but overloading the
-language key) and returns that object. Nothing else happens.
-
-When a message/etc is requested, a lazy load initializer is called.  Now the
-real work starts.  We're first going to take the scenario that the language
-is not cached.  The system loads the Messages file by:
-
-    require( $filename );
-    $cache = compact( self::$mLocalisationKeys );	
-
-...where self::$mLocalisationKeys is the name of variables that could be used
-in the localization file. This lets you use things like:
-
-    $fallback = false;
-    $rtl = false;
-
-...and easily siphon them into arrays.
-
-Then, we load the $fallback language (if not set, English) to fill in the gaps in
-the messages.  There is specialized behavior for certain keys, as they can be
-mergeable maps, lists or alias lists (not sure what the last one is).
-
-== Caching ==
-
-MediaWiki has lots of caching mechanisms built in, which make the code somewhat
-more difficult to understand.  Before doing any loading, MediaWiki will check
-the following places to see if we can be lazy:
-
-1. $mLocalisationCache[$code] -  just a variable where it may have been stashed
-2. serialized/$code.ser -  compiled serialized language file
-3. Memcached version of file (with expiration checking)
-
-Expiration checking consists of ensuring all dependencies have filemtime
-that match the ones bundled with the cached copy. Similar checking could be
-implemented for serialized versions, as it seems that they are not updated
-until manually recompiled.
-
-== Behavior ==
-
-Things that are localizable:
-
-  Weekdays (and abbrev)
-  Months (and abbrev)
-  Bookstores
-  Skin names
-  Date preferences / Custom date format
-  Default date format
-  Default user option overrides
-+ Language names
-  Timezones
-+ Character encoding conversion via iconv
-  UpperLowerCase first (needs casemaps for some)
-  UpperLowerCase
-  Uppercase words
-  Uppercase word breaks
-  Case folding
-  Strip punctuation for MySQL search
-  Get first character
-+ Alternate encoding
-+ Recoding for edit (and then recode input)
-+ RTL
-+ Direction mark character depending on RTL
-? Arrow depending on RTL
-  Languages where italics cannot be used
-+ Number formatting (commafy, transform digits, transform separators)
-  Truncate (multibyte)
-  Grammar conversions for inflected languages
-  Plural transformations
-  Formatting expiry times
-  Segmenting for diffs (Chinese)
-  Convert to variants of language
-  Language specific user preference options
-  Link trails [[foo]]bar
-+ Language code (RFC 3066)
-
-Neat functionality:
-
-  I18N sprintfDate
-  Roman numeral formatting
-
-Items marked with a + likely need to be addressed by HTML Purifier
--- a/docs/proposal-new-directives.txt
+++ b/docs/proposal-new-directives.txt
@@ -1,46 +0,0 @@
-
-Configuration Ideas
-
-Here are some theoretical configuration ideas that we could implement some
-time.  Note the naming convention: %Namespace.Directive
-
-%Attr.IDPrefix - prefix all ids with this
-
-%Attr.RewriteFragments - if there's %Attr.IDPrefix we may want to transparently
-    rewrite the URLs we parse too.  However, we can only do it when it's a pure
-    anchor link, so it's not foolproof
-
-%Attr.ClassBlacklist,
-%Attr.ClassWhitelist,
-%Attr.ClassPolicy - determines what classes are allowed. When
-    %Attr.ClassPolicy is set to Blacklist, only allow those not in
-    %Attr.ClassBlacklist. When it's Whitelist, only allow those in
-    %Attr.ClassWhitelist.
-
-%Attr.MaxWidth, 
-%Attr.MaxHeight - caps for width and height related checks.
-    (the hack in Pixels for an image crashing attack could be replaced by this)
-
-%URI.AddRelNofollow - will add rel="nofollow" to all links, preventing the
-    spread of ill-gotten pagerank
-
-%URI.RelativeToAbsolute - transforms all relative URIs to absolute form
-
-%URI.HostBlacklistRegex - regexes that if matching the host are disallowed
-%URI.HostWhitelist - domain names that are excluded from the host blacklist
-%URI.HostPolicy - determines whether or not it's reject all and then whitelist
-    or allow all in then do specific blacklists with whitelist intervening.
-    'DenyAll' or 'AllowAll' (default)
-
-%URI.DisableIPHosts - URIs that have IP addresses for hosts are disallowed.
-    Be sure to also grab unusual encodings (dword, hex and octal), which may
-    currently be caught by regular DNS
-%URI.DisableIDN - Disallow raw internationalized domain names. Punycode
-    will still be permitted.
-
-%URI.ConvertUnusualIPHosts - transform dword/hex/octal IP addresses to the
-    regular form
-%URI.ConvertAbsoluteDNS - Remove extra dots after host names that trigger
-    absolute DNS.  While this is actually the preferred method according to
-    the RFC, most people opt to use a relative domain name relative to . (root).
-
--- a/docs/ref-loose-vs-strict.txt
+++ b/docs/ref-loose-vs-strict.txt
@@ -1,37 +0,0 @@
-
-Loose versus Strict
-    Changes from one doctype to another
-
-There are changes.  Wow, how insightful.  Not everything changed is relevant
-to HTML Purifier, though, so let's take a look:
-
-== Major incompatibilities ==
-
-[done] BLOCKQUOTE changes from 'flow' to 'block'
-    current behavior: inline inner contents should not be nuked, block-ify as necessary
-[partially-done] U, S, STRIKE cut
-    current behavior: removed completely
-    projected behavior: replace with appropriate inline span + CSS
-[done] ADDRESS from potpourri to Inline (removes p tags)
-    current behavior: block tags silently dropped
-    ideal behavior: replace tags with something like <br>. (not high priority)
-
-== Things we can loosen up ==
-
-Tags DIR, MENU, CENTER, ISINDEX, FONT, BASEFONT? allowed in loose
-    current behavior: transform to strict-valid forms
-Attributes allowed in loose (see attribute transforms in 'dev-progress.html')
-    current behavior: projected to transform into strict-valid forms
-
-== Periphery issues ==
-
-A tag's attribute 'target' (for selecting frames) cut
-    current behavior: not allowed at all
-    projected behavior: use loose doctype if needed, needs valid values
-[done] OL/LI tag's attribute 'start'/'value' (for renumbering lists) cut
-    current behavior: no substitute, just delete when in strict, allow in loose
-Attribute 'name' deprecated in favor of 'id'
-    current behavior: dropped silently
-    projected behavior: create proper AttrTransform (currently not allowed at all)
-[done] PRE tag allows SUB/SUP? (strict dtd comment vs syntax, loose disallows)
-    current behavior: disallow as usual
--- a/docs/ref-proprietary-tags.txt
+++ b/docs/ref-proprietary-tags.txt
@@ -1,22 +0,0 @@
-
-Proprietary Tags
-    <nobr> and friends
-
-Here are some proprietary tags that W3C does not define but occasionally show
-up in the wild.  We have only included tags that would make sense in an
-HTML Purifier context.
-
-<align>, block element that aligns (extremely rare)
-<blackface>, inline that double-bolds text (extremely rare)
-<comment>, hidden comment for IE and WebTV
-<multicol cols=number gutter=pixels width=pixels>, multiple columns
-<nobr>, no linebreaks
-<spacer align=* type="vertical|horizontal|block">, whitespace in doc,
-    use width/height for block and size for vertical/horizontal (attributes)
-    (extremely rare)
-<wbr>, potential word break point: allows linebreaks. Only works in <nobr>
-
-<listing>, monospace pre-variant (extremely rare)
-<plaintext>, escapes all tags to the end of document
-<ruby> and friends, (more research needed, appears to be XHTML 1.1 markup)
-<xmp>, monospace, replace with pre
--- a/docs/ref-strictness.txt
+++ b/docs/ref-strictness.txt
@@ -1,36 +0,0 @@
-
-Is HTML Purifier Strict or Transitional?
-    A little bit of helpful guidance
-
-Despite the fact that HTML Purifier professes only to support transitional
-HTML, it rejects a lot of attributes and elements that are actually, indeed,
-valid. You can investigate progress.html to find out precisely what we
-are doing to these *deprecated* attributes.
-
-However, users have found that Strict HTML imposes some quite unreasonable
-restrictions on certain things. The start and value attributes in ol and
-li (respectively) perhaps are the most contested. There is currently no
-widely supported browser method short of JavaScript that can replace these
-two deprecated attributes. HTML Purifier does not currently support them, but
-it might behoove us to do so while our output is still transitional.
-
-Fortunately, that's the only real bugger case. The others have near-perfect
-CSS equivalents, and were presentational anyway. However, the other question
-pops up: should we always convert these to the CSS forms when 1. the spec
-allows them anyway and 2. older browsers support them better? After all, the
-whole point about CSS is to separate styling from content, so inline styling
-doesn't solve that problem.
-
-It's an icky question, and we'll have to deal with it as more and more 
-transforms get implemented.  As of right now, however, we currently support
-these loose-only constructs in loose mode:
-
- <ol start="1">, <li value="1"> attributes
- <u>, <strike>, <s> tags
- flow children in <blockquote>
- mixed children in <address>
-
-The changed child definitions as well as the ol.start li.value are the most
-compelling reasons why loose should be used.  We may want to offer disabling <u>,
-<strike> and <s> by themselves.
-
--- a/docs/ref-whatwg.txt
+++ b/docs/ref-whatwg.txt
@@ -1,9 +0,0 @@
-
-Web Hypertext Application Technology Working Group
-    WHATWG
-
-I don't think we need to worry about them.  Untrusted users shouldn't be
-submitting applications, eh?  But if some interesting attribute pops up in
-their spec, and might be worth supporting, stick it here.
-
-(none so far, as you can see)
--- a/docs/ref-xhtml-1.1.txt
+++ b/docs/ref-xhtml-1.1.txt
@@ -1,20 +0,0 @@
-
-Getting XHTML 1.1 Working
-
-It's quite simple, according to <http://www.w3.org/TR/xhtml11/changes.html>
-
-1. Scratch lang entirely in favor of xml:lang
-2. Scratch name entirely in favor of id (partially-done)
-3. Support Ruby <http://www.w3.org/TR/2001/REC-ruby-20010531/>
-
-...but that's only an informative section. More things to do:
-
-1. Scratch style attribute (it's deprecated)
-2. Be module-aware
-3. Cross-reference minimal content models with existing DTDs and determine
-   changes (todo)
-4. Watch out for the Legacy Module
-<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/abstract_modules.html#s_legacymodule>
-5. Let users specify their own custom modules
-6. Study Modularization document
-<http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/>
--- a/docs/enduser-security.txt
+++ b/docs/enduser-security.txt
--- a/docs/enduser-overview.txt
+++ b/docs/enduser-overview.txt
--- a/docs/style.css
+++ b/docs/style.css
@@ -1,40 +0,0 @@
-html {font-size:1em; font-family:serif; }
-body {margin-left:4em; margin-right:4em; }
-
-dt {font-weight:bold; }
-pre {margin-left:2em; }
-pre, code, tt {font-family:monospace; font-size:1em; }
-
-h1 {text-align:center; font-family:Garamond, serif;
-  font-variant:small-caps;}
-h2 {border-bottom:1px solid #CCC; font-family:sans-serif; font-weight:normal;
-    font-size:1.3em;}
-h3 {font-family:sans-serif; font-size:1.1em; font-weight:bold; }
-h4 {font-family:sans-serif; font-size:0.9em; font-weight:bold; }
-
-/* For witty quips */
-.subtitled {margin-bottom:0em;}
-.subtitle , .subsubtitle {font-size:.8em; margin-bottom:1em;
-    font-style:italic; margin-top:-.2em;text-align:center;}
-.subsubtitle {text-align:left;margin-left:2em;}
-
-/* Used for special "See also" links. */
-.reference {font-style:italic;margin-left:2em;}
-
-/* Marks off asides, discussions on why something is the way it is */
-.aside {margin-left:2em; font-family:sans-serif; font-size:0.9em; }
-
-/* A regular table */
-.table {border-collapse:collapse; border-bottom:2px solid #888; margin-left:2em; }
-.table thead th {margin:0; background:#888; color:#FFF; }
-.table thead th:first-child {-moz-border-radius-topleft:1em;}
-.table tbody td {border-bottom:1px solid #CCC; padding-right:0.6em;padding-left:0.6em;}
-
-/* Category of the file */
-#filing {font-weight:bold; font-size:smaller; }
-
-/* Contains, without exception, Return to index. */
-#index {font-size:smaller; }
-
-/* Contains, without exception, $Id$, for SVN version info. */
-#version {text-align:right; font-style:italic; margin:2em 0;}
--- a/library/HTMLPurifier.auto.php
+++ b/library/HTMLPurifier.auto.php
@@ -1,10 +0,0 @@
-<?php
-
-/**
- * This is a stub include that automatically configures the include path.
- */
-
-set_include_path(dirname(__FILE__) . PATH_SEPARATOR . get_include_path() );
-require_once 'HTMLPurifier.php';
-
-?>
--- a/library/HTMLPurifier.php
+++ b/library/HTMLPurifier.php
@@ -3,7 +3,7 @@
 /*!
 * @mainpage
 * 
- * HTML Purifier is an HTML filter that will take an arbitrary snippet of
+ * HTMLPurifier is an HTML filter that will take an arbitrary snippet of
 * HTML and rigorously test, validate and filter it into a version that
 * is safe for output onto webpages. It achieves this by:
 * 
@@ -22,7 +22,7 @@
 */

 /*
-    HTML Purifier 1.3.0 - Standards Compliant HTML Filtering
+    HTMLPurifier - Standards Compliant HTML Filtering
    Copyright (C) 2006 Edward Z. Yang

    This library is free software; you can redistribute it and/or
@@ -44,7 +44,6 @@
 // they get included
 require_once 'HTMLPurifier/ConfigSchema.php';
 require_once 'HTMLPurifier/Config.php';
-require_once 'HTMLPurifier/Context.php';

 require_once 'HTMLPurifier/Lexer.php';
 require_once 'HTMLPurifier/Generator.php';
@@ -96,17 +95,16 @@ class HTMLPurifier
     */
    function purify($html, $config = null) {
        $config = $config ? $config : $this->config;
-        $context =& new HTMLPurifier_Context();
-        $html = $this->encoder->convertToUTF8($html, $config, $context);
+        $html = $this->encoder->convertToUTF8($html, $config);
        $html = 
            $this->generator->generateFromTokens(
                $this->strategy->execute(
-                    $this->lexer->tokenizeHTML($html, $config, $context),
-                    $config, $context
+                    $this->lexer->tokenizeHTML($html, $config),
+                    $config
                ),
-                $config, $context
+                $config
            );
-        $html = $this->encoder->convertFromUTF8($html, $config, $context);
+        $html = $this->encoder->convertFromUTF8($html, $config);
        return $html;
    }
    
--- a/library/HTMLPurifier/AttrContext.php
+++ b/library/HTMLPurifier/AttrContext.php
@@ -0,0 +1,26 @@
+<?php
+
+/**
+ * Internal data-structure used in attribute validation to accumulate state.
+ * 
+ * This is a data-structure that holds objects that accumulate state, like
+ * HTMLPurifier_IDAccumulator. It's better than using globals!
+ * 
+ * @note Many functions that accept this object have it as a mandatory
+ *       parameter, even when there is no use for it.  This is partly
+ *       for the same reason that HTMLPurifier_Config is a mandatory
+ *       parameter, but it is also because you cannot assign a default value
+ *       to a parameter passed by reference (passing by reference is essential
+ *       for context to work in PHP 4).
+ */
+
+class HTMLPurifier_AttrContext
+{
+    /**
+     * Contains an HTMLPurifier_IDAccumulator, which keeps track of used IDs.
+     * @public
+     */
+    var $id_accumulator;
+}
+
+?>
--- a/library/HTMLPurifier/AttrDef.php
+++ b/library/HTMLPurifier/AttrDef.php
@@ -1,5 +1,7 @@
 <?php

+require_once 'HTMLPurifier/AttrContext.php';
+
 /**
 * Base class for all validating attribute definitions.
 * 
@@ -20,7 +22,10 @@ class HTMLPurifier_AttrDef
    var $minimized = false;
    
    /**
-     * Validates and cleans passed string according to a definition.
+     * Abstract function defined for functions that validate and clean strings.
+     * 
+     * This function forms the basis for all the subclasses: they must
+     * define this method.
     * 
     * @public
     * @param $string String to be validated and cleaned.
@@ -43,16 +48,7 @@ class HTMLPurifier_AttrDef
     * 
     * @note This method is not entirely standards compliant, as trim() removes
     *       more types of whitespace than specified in the spec. In practice,
-     *       this is rarely a problem, as those extra characters usually have
-     *       already been removed by HTMLPurifier_Encoder.
-     * 
-     * @warning This processing is inconsistent with XML's whitespace handling
-     *          as specified by section 3.3.3 and referenced XHTML 1.0 section
-     *          4.7.  Compliant processing requires all line breaks normalized
-     *          to "\n", so the fix is not as simple as fixing it in this
-     *          function.  Trim and whitespace collapsing are supposed to only
-     *          occur in NMTOKENs.  However, note that we are NOT necessarily
-     *          parsing XML, thus, this behavior may still be correct.
+     *       this is rarely a problem.
     * 
     * @public
     */
--- a/library/HTMLPurifier/AttrDef/CSS.php
+++ b/library/HTMLPurifier/AttrDef/CSS.php
@@ -43,7 +43,6 @@ class HTMLPurifier_AttrDef_CSS extends HTMLPurifier_AttrDef
            $propvalues[$property] = $result;
        }
        
-        // procedure does not write the new CSS simultaneously, so it's
        // slightly inefficient, but it's the only way of getting rid of
        // duplicates. Perhaps config to optimize it, but not now.
        
--- a/library/HTMLPurifier/AttrDef/Class.php
+++ b/library/HTMLPurifier/AttrDef/Class.php
@@ -24,14 +24,13 @@ class HTMLPurifier_AttrDef_Class extends HTMLPurifier_AttrDef
        // and plus it would complicate optimization efforts (you never
        // see that anyway).
        $matches = array();
-        $pattern = '/(?:(?<=\s)|\A)'. // look behind for space or string start
+        $pattern = '/(?:(?<=\s)|\A)'.
                   '((?:--|-?[A-Za-z_])[A-Za-z_\-0-9]*)'.
-                   '(?:(?=\s)|\z)/'; // look ahead for space or string end
+                   '(?:(?=\s)|\z)/';
        preg_match_all($pattern, $string, $matches);
        
        if (empty($matches[1])) return false;
        
-        // reconstruct class string
        $new_string = '';
        foreach ($matches[1] as $class_names) {
            $new_string .= $class_names . ' ';
--- a/library/HTMLPurifier/AttrDef/Email.php
+++ b/library/HTMLPurifier/AttrDef/Email.php
@@ -1,17 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/AttrDef.php';
-
-class HTMLPurifier_AttrDef_Email extends HTMLPurifier_AttrDef
-{
-    
-    /**
-     * Unpacks a mailbox into its display-name and address
-     */
-    function unpack($string) {
-        // needs to be implemented
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/AttrDef/Email/SimpleCheck.php
+++ b/library/HTMLPurifier/AttrDef/Email/SimpleCheck.php
@@ -1,23 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/AttrDef/Email.php';
-
-/**
- * Primitive email validation class based on the regexp found at 
- * http://www.regular-expressions.info/email.html
- */
-class HTMLPurifier_AttrDef_Email_SimpleCheck extends HTMLPurifier_AttrDef_Email
-{
-    
-    function validate($string, $config, &$context) {
-        // no support for named mailboxes i.e. "Bob <bob@example.com>"
-        // that needs more percent encoding to be done
-        if ($string == '') return false;
-        $string = trim($string);
-        $result = preg_match('/^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i', $string);
-        return $result ? $string : false;
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/AttrDef/Host.php
+++ b/library/HTMLPurifier/AttrDef/Host.php
@@ -5,7 +5,7 @@ require_once 'HTMLPurifier/AttrDef/IPv4.php';
 require_once 'HTMLPurifier/AttrDef/IPv6.php';

 /**
- * Validates a host according to the IPv4, IPv6 and DNS (future) specifications.
+ * Validates a host according to the IPv4, IPv6 and DNS specifications.
 */
 class HTMLPurifier_AttrDef_Host extends HTMLPurifier_AttrDef
 {
@@ -35,8 +35,6 @@ class HTMLPurifier_AttrDef_Host extends HTMLPurifier_AttrDef
            if ($valid === false) return false;
            return '['. $valid . ']';
        }
-        
-        // need to do checks on unusual encodings too
        $ipv4 = $this->ipv4->validate($string, $config, $context);
        if ($ipv4 !== false) return $ipv4;
        
--- a/library/HTMLPurifier/AttrDef/ID.php
+++ b/library/HTMLPurifier/AttrDef/ID.php
@@ -3,30 +3,6 @@
 require_once 'HTMLPurifier/AttrDef.php';
 require_once 'HTMLPurifier/IDAccumulator.php';

-HTMLPurifier_ConfigSchema::define(
-    'Attr', 'IDPrefix', '', 'string',
-    'String to prefix to IDs.  If you have no idea what IDs your pages '.
-    'may use, you may opt to simply add a prefix to all user-submitted ID '.
-    'attributes so that they are still usable, but will not conflict with '.
-    'core page IDs. Example: setting the directive to \'user_\' will result in '.
-    'a user submitted \'foo\' to become \'user_foo\'  Be sure to set '.
-    '%HTML.EnableAttrID to true before using '.
-    'this.  This directive was available since 1.2.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'Attr', 'IDPrefixLocal', '', 'string',
-    'Temporary prefix for IDs used in conjunction with %Attr.IDPrefix.  If '.
-    'you need to allow multiple sets of '.
-    'user content on web page, you may need to have a seperate prefix that '.
-    'changes with each iteration.  This way, seperately submitted user content '.
-    'displayed on the same page doesn\'t clobber each other. Ideal values '.
-    'are unique identifiers for the content it represents (i.e. the id of '.
-    'the row in the database). Be sure to add a seperator (like an underscore) '.
-    'at the end.  Warning: this directive will not work unless %Attr.IDPrefix '.
-    'is set to a non-empty value! This directive was available since 1.2.0.'
-);
-
 /**
 * Validates the HTML attribute ID.
 * @warning Even though this is the id processor, it
@@ -44,19 +20,7 @@ class HTMLPurifier_AttrDef_ID extends HTMLPurifier_AttrDef
        $id = trim($id); // trim it first
        
        if ($id === '') return false;
-        
-        $prefix = $config->get('Attr', 'IDPrefix');
-        if ($prefix !== '') {
-            $prefix .= $config->get('Attr', 'IDPrefixLocal');
-            // prevent re-appending the prefix
-            if (strpos($id, $prefix) !== 0) $id = $prefix . $id;
-        } elseif ($config->get('Attr', 'IDPrefixLocal') !== '') {
-            trigger_error('%Attr.IDPrefixLocal cannot be used unless '.
-                '%Attr.IDPrefix is set', E_USER_WARNING);
-        }
-        
-        $id_accumulator =& $context->get('IDAccumulator');
-        if (isset($id_accumulator->ids[$id])) return false;
+        if (isset($context->id_accumulator->ids[$id])) return false;
        
        // we purposely avoid using regex, hopefully this is faster
        
@@ -71,7 +35,7 @@ class HTMLPurifier_AttrDef_ID extends HTMLPurifier_AttrDef
            $result = ($trim === '');
        }
        
-        if ($result) $id_accumulator->add($id);
+        if ($result) $context->id_accumulator->add($id);
        
        // if no change was made to the ID, return the result
        // else, return the new id if stripping whitespace made it
--- a/library/HTMLPurifier/AttrDef/URI.php
+++ b/library/HTMLPurifier/AttrDef/URI.php
@@ -4,7 +4,6 @@ require_once 'HTMLPurifier/AttrDef.php';
 require_once 'HTMLPurifier/URIScheme.php';
 require_once 'HTMLPurifier/URISchemeRegistry.php';
 require_once 'HTMLPurifier/AttrDef/Host.php';
-require_once 'HTMLPurifier/PercentEncoder.php';

 HTMLPurifier_ConfigSchema::define(
    'URI', 'DefaultScheme', 'http', 'string',
@@ -12,71 +11,6 @@ HTMLPurifier_ConfigSchema::define(
    'select the proper object validator when no scheme information is present.'
 );

-HTMLPurifier_ConfigSchema::define(
-    'URI', 'Host', null, 'string/null',
-    'Defines the domain name of the server, so we can determine whether or '.
-    'an absolute URI is from your website or not.  Not strictly necessary, '.
-    'as users should be using relative URIs to reference resources on your '.
-    'website.  It will, however, let you use absolute URIs to link to '.
-    'subdomains of the domain you post here: i.e. example.com will allow '.
-    'sub.example.com.  However, higher up domains will still be excluded: '.
-    'if you set %URI.Host to sub.example.com, example.com will be blocked. '.
-    'This directive has been available since 1.2.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'URI', 'DisableExternal', false, 'bool',
-    'Disables links to external websites.  This is a highly effective '.
-    'anti-spam and anti-pagerank-leech measure, but comes at a hefty price: no'.
-    'links or images outside of your domain will be allowed.  Non-linkified '.
-    'URIs will still be preserved.  If you want to be able to link to '.
-    'subdomains or use absolute URIs, specify %URI.Host for your website. '.
-    'This directive has been available since 1.2.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'URI', 'DisableExternalResources', false, 'bool',
-    'Disables the embedding of external resources, preventing users from '.
-    'embedding things like images from other hosts. This prevents '.
-    'access tracking (good for email viewers), bandwidth leeching, '.
-    'cross-site request forging, goatse.cx posting, and '.
-    'other nasties, but also results in '.
-    'a loss of end-user functionality (they can\'t directly post a pic '.
-    'they posted from Flickr anymore). Use it if you don\'t have a '.
-    'robust user-content moderation team. This directive has been '.
-    'available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'URI', 'DisableResources', false, 'bool',
-    'Disables embedding resources, essentially meaning no pictures. You can '.
-    'still link to them though. See %URI.DisableExternalResources for why '.
-    'this might be a good idea. This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'URI', 'Munge', null, 'string/null',
-    'Munges all browsable (usually http, https and ftp) URI\'s into some URL '.
-    'redirection service. Pass this directive a URI, with %s inserted where '.
-    'the url-encoded original URI should be inserted (sample: '.
-    '<code>http://www.google.com/url?q=%s</code>). '.
-    'This prevents PageRank leaks, while being as transparent as possible '.
-    'to users (you may also want to add some client side JavaScript to '.
-    'override the text in the statusbar). Warning: many security experts '.
-    'believe that this form of protection does not deter spam-bots. '.
-    'You can also use this directive to redirect users to a splash page '.
-    'telling them they are leaving your website. '.
-    'This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'URI', 'HostBlacklist', array(), 'list',
-    'List of strings that are forbidden in the host of any URI. Use it to '.
-    'kill domain names of spam, etc. Note that it will catch anything in '.
-    'the domain, so <tt>moo.com</tt> will catch <tt>moo.com.example.com</tt>. '.
-    'This directive has been available since 1.3.0.'
-);
-
 /**
 * Validates a URI as defined by RFC 3986.
 * @note Scheme-specific mechanics deferred to HTMLPurifier_URIScheme
@@ -85,16 +19,9 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
 {
    
    var $host;
-    var $PercentEncoder;
-    var $embeds_resource;
    
-    /**
-     * @param $embeds_resource_resource Does the URI here result in an extra HTTP request?
-     */
-    function HTMLPurifier_AttrDef_URI($embeds_resource = false) {
+    function HTMLPurifier_AttrDef_URI() {
        $this->host = new HTMLPurifier_AttrDef_Host();
-        $this->PercentEncoder = new HTMLPurifier_PercentEncoder();
-        $this->embeds_resource = (bool) $embeds_resource;
    }
    
    function validate($uri, $config, &$context) {
@@ -105,9 +32,6 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        // parse as CDATA
        $uri = $this->parseCDATA($uri);
        
-        // fix up percent-encoding
-        $uri = $this->PercentEncoder->normalize($uri);
-        
        // while it would be nice to use parse_url(), that's specifically
        // for HTTP and thus won't work for our generic URI parsing
        
@@ -139,38 +63,18 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
            // no need to validate the scheme's fmt since we do that when we
            // retrieve the specific scheme object from the registry
            $scheme = ctype_lower($scheme) ? $scheme : strtolower($scheme);
-            $scheme_obj =& $registry->getScheme($scheme, $config, $context);
+            $scheme_obj =& $registry->getScheme($scheme, $config);
            if (!$scheme_obj) return false; // invalid scheme, clean it out
        } else {
            $scheme_obj =& $registry->getScheme(
-                $config->get('URI', 'DefaultScheme'), $config, $context
+                $config->get('URI', 'DefaultScheme'), $config
            );
        }
        
        
-        // the URI we're processing embeds_resource a resource in the page, but the URI
-        // it references cannot be located
-        if ($this->embeds_resource && !$scheme_obj->browsable) {
-            return false;
-        }
-        
        
        if ($authority !== null) {
            
-            // remove URI if it's absolute and we disabled externals or
-            // if it's absolute and embedded and we disabled external resources
-            unset($our_host);
-            if (
-                $config->get('URI', 'DisableExternal') ||
-                (
-                    $config->get('URI', 'DisableExternalResources') &&
-                    $this->embeds_resource
-                )
-            ) {
-                $our_host = $config->get('URI', 'Host');
-                if ($our_host === null) return false;
-            }
-            
            $HEXDIG = '[A-Fa-f0-9]';
            $unreserved = 'A-Za-z0-9-._~'; // make sure you wrap with []
            $sub_delims = '!$&\'()'; // needs []
@@ -193,19 +97,6 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
            $host = $this->host->validate($host, $config, $context);
            if ($host === false) $host = null;
            
-            if ($this->checkBlacklist($host, $config, $context)) return false;
-            
-            // more lenient absolute checking
-            if (isset($our_host)) {
-                $host_parts = array_reverse(explode('.', $host));
-                // could be cached
-                $our_host_parts = array_reverse(explode('.', $our_host));
-                foreach ($our_host_parts as $i => $discard) {
-                    if (!isset($host_parts[$i])) return false;
-                    if ($host_parts[$i] != $our_host_parts[$i]) return false;
-                }
-            }
-            
            // userinfo and host are validated within the regexp
            
        } else {
@@ -229,7 +120,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        // note that $fragment is omitted
        list($userinfo, $host, $port, $path, $query) = 
            $scheme_obj->validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context
+                $userinfo, $host, $port, $path, $query, $config
            );
        
        
@@ -250,37 +141,10 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        if ($query !== null) $result .= "?$query";
        if ($fragment !== null) $result .= "#$fragment";
        
-        // munge if necessary
-        $munge = $config->get('URI', 'Munge');
-        if (!empty($scheme_obj->browsable) && $munge !== null) {
-            if ($authority !== null) {
-                $result = str_replace('%s', rawurlencode($result), $munge);
-            }
-        }
-        
        return $result;
        
    }
    
-    /**
-     * Checks a host against an array blacklist
-     * @param $host Host to check
-     * @param $config HTMLPurifier_Config instance
-     * @param $context HTMLPurifier_Context instance
-     * @return bool Is spam?
-     */
-    function checkBlacklist($host, &$config, &$context) {
-        $blacklist = $config->get('URI', 'HostBlacklist');
-        if (!empty($blacklist)) {
-            foreach($blacklist as $blacklisted_host_fragment) {
-                if (strpos($host, $blacklisted_host_fragment) !== false) {
-                    return true;
-                }
-            }
-        }
-        return false;
-    }
-    
 }

 ?>
--- a/library/HTMLPurifier/AttrTransform.php
+++ b/library/HTMLPurifier/AttrTransform.php
@@ -23,10 +23,9 @@ class HTMLPurifier_AttrTransform
     * @param $attr Assoc array of attributes, usually from
     *              HTMLPurifier_Token_Tag::$attributes
     * @param $config Mandatory HTMLPurifier_Config object.
-     * @param $context Mandatory HTMLPurifier_Context object
     * @returns Processed attribute array.
     */
-    function transform($attr, $config, &$context) {
+    function transform($attr, $config) {
        trigger_error('Cannot call abstract function', E_USER_ERROR);
    }
 }
--- a/library/HTMLPurifier/AttrTransform/BdoDir.php
+++ b/library/HTMLPurifier/AttrTransform/BdoDir.php
@@ -20,7 +20,7 @@ HTMLPurifier_ConfigSchema::defineAllowedValues(
 class HTMLPurifier_AttrTransform_BdoDir extends HTMLPurifier_AttrTransform
 {
    
-    function transform($attr, $config, $context) {
+    function transform($attr, $config) {
        if (isset($attr['dir'])) return $attr;
        $attr['dir'] = $config->get('Attr', 'DefaultTextDir');
        return $attr;
--- a/library/HTMLPurifier/AttrTransform/ImgRequired.php
+++ b/library/HTMLPurifier/AttrTransform/ImgRequired.php
@@ -25,7 +25,7 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_AttrTransform_ImgRequired extends HTMLPurifier_AttrTransform
 {
    
-    function transform($attr, $config, $context) {
+    function transform($attr, $config) {
        
        $src = true;
        if (!isset($attr['src'])) {
--- a/library/HTMLPurifier/AttrTransform/Lang.php
+++ b/library/HTMLPurifier/AttrTransform/Lang.php
@@ -10,7 +10,7 @@ require_once 'HTMLPurifier/AttrTransform.php';
 class HTMLPurifier_AttrTransform_Lang extends HTMLPurifier_AttrTransform
 {
    
-    function transform($attr, $config, $context) {
+    function transform($attr, $config) {
        
        $lang     = isset($attr['lang']) ? $attr['lang'] : false;
        $xml_lang = isset($attr['xml:lang']) ? $attr['xml:lang'] : false;
--- a/library/HTMLPurifier/AttrTransform/TextAlign.php
+++ b/library/HTMLPurifier/AttrTransform/TextAlign.php
@@ -8,7 +8,7 @@ require_once 'HTMLPurifier/AttrTransform.php';
 class HTMLPurifier_AttrTransform_TextAlign
    extends HTMLPurifier_AttrTransform {

-    function transform($attr, $config, $context) {
+    function transform($attr, $config) {
        
        if (!isset($attr['align'])) return $attr;
        
--- a/library/HTMLPurifier/ChildDef.php
+++ b/library/HTMLPurifier/ChildDef.php
@@ -20,9 +20,10 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_ChildDef
 {
    /**
-     * Type of child definition, usually right-most part of class name lowercase.
-     * Used occasionally in terms of context.
-     * @public
+     * Type of child definition, usually the lowercased right-most part of the class name.
+     * 
+     * Used occasionally in terms of context.  Possible values include
+     * custom, required, optional and empty.
     */
    var $type;
    
@@ -31,25 +32,407 @@ class HTMLPurifier_ChildDef
     * 
     * This is necessary for redundant checking when changes affecting
     * a child node may cause a parent node to now be disallowed.
-     * 
-     * @public
     */
    var $allow_empty;
    
    /**
     * Validates nodes according to definition and returns modification.
     * 
-     * @public
+     * @warning $context is NOT HTMLPurifier_AttrContext
     * @param $tokens_of_children Array of HTMLPurifier_Token
     * @param $config HTMLPurifier_Config object
-     * @param $context HTMLPurifier_Context object
+     * @param $context String context indicating inline, block or unknown
     * @return bool true to leave nodes as is
     * @return bool false to remove parent node
     * @return array of replacement child tokens
     */
-    function validateChildren($tokens_of_children, $config, &$context) {
+    function validateChildren($tokens_of_children, $config, $context) {
        trigger_error('Call to abstract function', E_USER_ERROR);
    }
 }

+/**
+ * Custom validation class, accepts DTD child definitions
+ * 
+ * @warning Currently this class is an all or nothing proposition, that is,
+ *          it will only give a bool return value.
+ */
+class HTMLPurifier_ChildDef_Custom extends HTMLPurifier_ChildDef
+{
+    var $type = 'custom';
+    var $allow_empty = false;
+    /**
+     * Allowed child pattern as defined by the DTD
+     */
+    var $dtd_regex;
+    /**
+     * PCRE regex derived from $dtd_regex
+     * @private
+     */
+    var $_pcre_regex;
+    /**
+     * @param $dtd_regex Allowed child pattern from the DTD
+     */
+    function HTMLPurifier_ChildDef_Custom($dtd_regex) {
+        $this->dtd_regex = $dtd_regex;
+        $this->_compileRegex();
+    }
+    /**
+     * Compiles the PCRE regex from a DTD regex ($dtd_regex to $_pcre_regex)
+     */
+    function _compileRegex() {
+        $raw = str_replace(' ', '', $this->dtd_regex);
+        if ($raw{0} != '(') {
+            $raw = "($raw)";
+        }
+        $reg = str_replace(',', ',?', $raw);
+        $reg = preg_replace('/([#a-zA-Z0-9_.-]+)/', '(,?\\0)', $reg);
+        $this->_pcre_regex = $reg;
+    }
+    function validateChildren($tokens_of_children, $config, $context) {
+        $list_of_children = '';
+        $nesting = 0; // depth into the nest
+        foreach ($tokens_of_children as $token) {
+            if (!empty($token->is_whitespace)) continue;
+            
+            $is_child = ($nesting == 0); // direct
+            
+            if ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            if ($is_child) {
+                $list_of_children .= $token->name . ',';
+            }
+        }
+        $list_of_children = rtrim($list_of_children, ',');
+        
+        $okay =
+            preg_match(
+                '/^'.$this->_pcre_regex.'$/',
+                $list_of_children
+            );
+        
+        return (bool) $okay;
+    }
+}
+
+/**
+ * Definition that allows a set of elements, but disallows empty children.
+ */
+class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
+{
+    /**
+     * Lookup table of allowed elements.
+     */
+    var $elements = array();
+    /**
+     * @param $elements List of allowed element names (lowercase).
+     */
+    function HTMLPurifier_ChildDef_Required($elements) {
+        if (is_string($elements)) {
+            $elements = str_replace(' ', '', $elements);
+            $elements = explode('|', $elements);
+        }
+        $elements = array_flip($elements);
+        foreach ($elements as $i => $x) $elements[$i] = true;
+        $this->elements = $elements;
+        $this->gen = new HTMLPurifier_Generator();
+    }
+    var $allow_empty = false;
+    var $type = 'required';
+    function validateChildren($tokens_of_children, $config, $context) {
+        // if there are no tokens, delete parent node
+        if (empty($tokens_of_children)) return false;
+        
+        // the new set of children
+        $result = array();
+        
+        // current depth into the nest
+        $nesting = 0;
+        
+        // whether or not we're deleting a node
+        $is_deleting = false;
+        
+        // whether or not parsed character data is allowed
+        // this controls whether or not we silently drop a tag
+        // or generate escaped HTML from it
+        $pcdata_allowed = isset($this->elements['#PCDATA']);
+        
+        // a little sanity check to make sure it's not ALL whitespace
+        $all_whitespace = true;
+        
+        // some configuration
+        $escape_invalid_children = $config->get('Core', 'EscapeInvalidChildren');
+        
+        foreach ($tokens_of_children as $token) {
+            if (!empty($token->is_whitespace)) {
+                $result[] = $token;
+                continue;
+            }
+            $all_whitespace = false; // phew, we're not talking about whitespace
+            
+            $is_child = ($nesting == 0);
+            
+            if ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            if ($is_child) {
+                $is_deleting = false;
+                if (!isset($this->elements[$token->name])) {
+                    $is_deleting = true;
+                    if ($pcdata_allowed && $token->type == 'text') {
+                        $result[] = $token;
+                    } elseif ($pcdata_allowed && $escape_invalid_children) {
+                        $result[] = new HTMLPurifier_Token_Text(
+                            $this->gen->generateFromToken($token, $config)
+                        );
+                    }
+                    continue;
+                }
+            }
+            if (!$is_deleting || ($pcdata_allowed && $token->type == 'text')) {
+                $result[] = $token;
+            } elseif ($pcdata_allowed && $escape_invalid_children) {
+                $result[] =
+                    new HTMLPurifier_Token_Text(
+                        $this->gen->generateFromToken( $token, $config )
+                    );
+            } else {
+                // drop silently
+            }
+        }
+        if (empty($result)) return false;
+        if ($all_whitespace) return false;
+        if ($tokens_of_children == $result) return true;
+        return $result;
+    }
+}
+
+/**
+ * Definition that allows a set of elements, and allows no children.
+ * @note This is a hack to reuse code from HTMLPurifier_ChildDef_Required,
+ *       really, one shouldn't inherit from the other.  Only altered behavior
+ *       is to overload a returned false with an array.  Thus, it will never
+ *       return false.
+ */
+class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
+{
+    var $allow_empty = true;
+    var $type = 'optional';
+    function validateChildren($tokens_of_children, $config, $context) {
+        $result = parent::validateChildren($tokens_of_children, $config, $context);
+        if ($result === false) return array();
+        return $result;
+    }
+}
+
+/**
+ * Definition that disallows all elements.
+ * @warning validateChildren() in this class is actually never called, because
+ *          empty elements are corrected in HTMLPurifier_Strategy_MakeWellFormed
+ *          before child definitions are parsed in earnest by
+ *          HTMLPurifier_Strategy_FixNesting.
+ */
+class HTMLPurifier_ChildDef_Empty extends HTMLPurifier_ChildDef
+{
+    var $allow_empty = true;
+    var $type = 'empty';
+    function HTMLPurifier_ChildDef_Empty() {}
+    function validateChildren($tokens_of_children, $config, $context) {
+        return array();
+    }
+}
+
+/**
+ * Definition that uses different definitions depending on context.
+ * 
+ * The del and ins tags are notable because they allow different types of
+ * elements depending on whether or not they're in a block or inline context.
+ * Chameleon allows this behavior to happen by using two different
+ * definitions depending on context.  While this is somewhat generalized,
+ * it is specifically intended for those two tags.
+ */
+class HTMLPurifier_ChildDef_Chameleon extends HTMLPurifier_ChildDef
+{
+    
+    /**
+     * Instance of the definition object to use when inline. Usually stricter.
+     */
+    var $inline;
+    /**
+     * Instance of the definition object to use when block.
+     */
+    var $block;
+    
+    /**
+     * @param $inline List of elements to allow when inline.
+     * @param $block List of elements to allow when block.
+     */
+    function HTMLPurifier_ChildDef_Chameleon($inline, $block) {
+        $this->inline = new HTMLPurifier_ChildDef_Optional($inline);
+        $this->block  = new HTMLPurifier_ChildDef_Optional($block);
+    }
+    
+    function validateChildren($tokens_of_children, $config, $context) {
+        switch ($context) {
+            case 'unknown':
+            case 'inline':
+                $result = $this->inline->validateChildren(
+                    $tokens_of_children, $config, $context);
+                break;
+            case 'block':
+                $result = $this->block->validateChildren(
+                    $tokens_of_children, $config, $context);
+                break;
+            default:
+                trigger_error('Invalid context', E_USER_ERROR);
+                return false;
+        }
+        return $result;
+    }
+}
+
+/**
+ * Definition for tables
+ */
+class HTMLPurifier_ChildDef_Table extends HTMLPurifier_ChildDef
+{
+    var $allow_empty = false;
+    var $type = 'table';
+    function HTMLPurifier_ChildDef_Table() {}
+    function validateChildren($tokens_of_children, $config, $context) {
+        if (empty($tokens_of_children)) return false;
+        
+        // this ensures that the loop gets run one last time before closing
+        // up. It's a little bit of a hack, but it works! Just make sure you
+        // get rid of the token later.
+        $tokens_of_children[] = false;
+        
+        // only one of these elements is allowed in a table
+        $caption = false;
+        $thead   = false;
+        $tfoot   = false;
+        
+        // as many of these as you want
+        $cols    = array();
+        $content = array();
+        
+        $nesting = 0; // current depth so we can determine nodes
+        $is_collecting = false; // are we globbing together tokens to package
+                                // into one of the collectors?
+        $collection = array(); // collected nodes
+        $tag_index = 0; // the first node might be whitespace,
+                        // so this tells us where the start tag is
+        
+        foreach ($tokens_of_children as $token) {
+            $is_child = ($nesting == 0);
+            
+            if ($token === false) {
+                // terminating sequence started
+            } elseif ($token->type == 'start') {
+                $nesting++;
+            } elseif ($token->type == 'end') {
+                $nesting--;
+            }
+            
+            // handle node collection
+            if ($is_collecting) {
+                if ($is_child) {
+                    // okay, let's stash the tokens away
+                    // first token tells us the type of the collection
+                    switch ($collection[$tag_index]->name) {
+                        case 'tr':
+                        case 'tbody':
+                            $content[] = $collection;
+                            break;
+                        case 'caption':
+                            if ($caption !== false) break;
+                            $caption = $collection;
+                            break;
+                        case 'thead':
+                        case 'tfoot':
+                            // access the appropriate variable, $thead or $tfoot
+                            $var = $collection[$tag_index]->name;
+                            if ($$var === false) {
+                                $$var = $collection;
+                            } else {
+                                // transmute the first and last entries into
+                                // tbody tags, and then put into content
+                                $collection[$tag_index]->name = 'tbody';
+                                $collection[count($collection)-1]->name = 'tbody';
+                                $content[] = $collection;
+                            }
+                            break;
+                         case 'colgroup':
+                            $cols[] = $collection;
+                            break;
+                    }
+                    $collection = array();
+                    $is_collecting = false;
+                    $tag_index = 0;
+                } else {
+                    // add the node to the collection
+                    $collection[] = $token;
+                }
+            }
+            
+            // terminate
+            if ($token === false) break;
+            
+            if ($is_child) {
+                // determine what we're dealing with
+                if ($token->name == 'col') {
+                    // the only empty tag in the posse, we can handle it
+                    // immediately
+                    $cols[] = array_merge($collection, array($token));
+                    $collection = array();
+                    $tag_index = 0;
+                    continue;
+                }
+                switch($token->name) {
+                    case 'caption':
+                    case 'colgroup':
+                    case 'thead':
+                    case 'tfoot':
+                    case 'tbody':
+                    case 'tr':
+                        $is_collecting = true;
+                        $collection[] = $token;
+                        continue;
+                    default:
+                        if ($token->type == 'text' && $token->is_whitespace) {
+                            $collection[] = $token;
+                            $tag_index++;
+                        }
+                        continue;
+                }
+            }
+        }
+        
+        if (empty($content)) return false;
+        
+        $ret = array();
+        if ($caption !== false) $ret = array_merge($ret, $caption);
+        if ($cols !== false)    foreach ($cols as $token_array) $ret = array_merge($ret, $token_array);
+        if ($thead !== false)   $ret = array_merge($ret, $thead);
+        if ($tfoot !== false)   $ret = array_merge($ret, $tfoot);
+        foreach ($content as $token_array) $ret = array_merge($ret, $token_array);
+        if (!empty($collection) && $is_collecting == false){
+            // grab the trailing space
+            $ret = array_merge($ret, $collection);
+        }
+        
+        array_pop($tokens_of_children); // remove phantom token
+        
+        return ($ret === $tokens_of_children) ? true : $ret;
+        
+    }
+}
+
 ?>
--- a/library/HTMLPurifier/ChildDef/Chameleon.php
+++ b/library/HTMLPurifier/ChildDef/Chameleon.php
@@ -1,60 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef.php';
-
-/**
- * Definition that uses different definitions depending on context.
- * 
- * The del and ins tags are notable because they allow different types of
- * elements depending on whether or not they're in a block or inline context.
- * Chameleon allows this behavior to happen by using two different
- * definitions depending on context.  While this somewhat generalized,
- * it is specifically intended for those two tags.
- */
-class HTMLPurifier_ChildDef_Chameleon extends HTMLPurifier_ChildDef
-{
-    
-    /**
-     * Instance of the definition object to use when inline. Usually stricter.
-     * @public
-     */
-    var $inline;
-    
-    /**
-     * Instance of the definition object to use when block.
-     * @public
-     */
-    var $block;
-    
-    var $type = 'chameleon';
-    
-    /**
-     * @param $inline List of elements to allow when inline.
-     * @param $block List of elements to allow when block.
-     */
-    function HTMLPurifier_ChildDef_Chameleon($inline, $block) {
-        $this->inline = new HTMLPurifier_ChildDef_Optional($inline);
-        $this->block  = new HTMLPurifier_ChildDef_Optional($block);
-    }
-    
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $parent_type = $context->get('ParentType');
-        switch ($parent_type) {
-            case 'unknown':
-            case 'inline':
-                $result = $this->inline->validateChildren(
-                    $tokens_of_children, $config, $context);
-                break;
-            case 'block':
-                $result = $this->block->validateChildren(
-                    $tokens_of_children, $config, $context);
-                break;
-            default:
-                trigger_error('Invalid context', E_USER_ERROR);
-                return false;
-        }
-        return $result;
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/Custom.php
+++ b/library/HTMLPurifier/ChildDef/Custom.php
@@ -1,75 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef.php';
-
-/**
- * Custom validation class, accepts DTD child definitions
- * 
- * @warning Currently this class is an all or nothing proposition, that is,
- *          it will only give a bool return value.
- * @note This class is currently not used by any code, although it is unit
- *       tested.
- */
-class HTMLPurifier_ChildDef_Custom extends HTMLPurifier_ChildDef
-{
-    var $type = 'custom';
-    var $allow_empty = false;
-    /**
-     * Allowed child pattern as defined by the DTD
-     */
-    var $dtd_regex;
-    /**
-     * PCRE regex derived from $dtd_regex
-     * @private
-     */
-    var $_pcre_regex;
-    /**
-     * @param $dtd_regex Allowed child pattern from the DTD
-     */
-    function HTMLPurifier_ChildDef_Custom($dtd_regex) {
-        $this->dtd_regex = $dtd_regex;
-        $this->_compileRegex();
-    }
-    /**
-     * Compiles the PCRE regex from a DTD regex ($dtd_regex to $_pcre_regex)
-     */
-    function _compileRegex() {
-        $raw = str_replace(' ', '', $this->dtd_regex);
-        if ($raw{0} != '(') {
-            $raw = "($raw)";
-        }
-        $reg = str_replace(',', ',?', $raw);
-        $reg = preg_replace('/([#a-zA-Z0-9_.-]+)/', '(,?\\0)', $reg);
-        $this->_pcre_regex = $reg;
-    }
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $list_of_children = '';
-        $nesting = 0; // depth into the nest
-        foreach ($tokens_of_children as $token) {
-            if (!empty($token->is_whitespace)) continue;
-            
-            $is_child = ($nesting == 0); // direct
-            
-            if ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            if ($is_child) {
-                $list_of_children .= $token->name . ',';
-            }
-        }
-        $list_of_children = rtrim($list_of_children, ',');
-        
-        $okay =
-            preg_match(
-                '/^'.$this->_pcre_regex.'$/',
-                $list_of_children
-            );
-        
-        return (bool) $okay;
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/Empty.php
+++ b/library/HTMLPurifier/ChildDef/Empty.php
@@ -1,22 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef.php';
-
-/**
- * Definition that disallows all elements.
- * @warning validateChildren() in this class is actually never called, because
- *          empty elements are corrected in HTMLPurifier_Strategy_MakeWellFormed
- *          before child definitions are parsed in earnest by
- *          HTMLPurifier_Strategy_FixNesting.
- */
-class HTMLPurifier_ChildDef_Empty extends HTMLPurifier_ChildDef
-{
-    var $allow_empty = true;
-    var $type = 'empty';
-    function HTMLPurifier_ChildDef_Empty() {}
-    function validateChildren($tokens_of_children, $config, &$context) {
-        return array();
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/Optional.php
+++ b/library/HTMLPurifier/ChildDef/Optional.php
@@ -1,23 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef/Required.php';
-
-/**
- * Definition that allows a set of elements, and allows no children.
- * @note This is a hack to reuse code from HTMLPurifier_ChildDef_Required,
- *       really, one shouldn't inherit from the other.  Only altered behavior
- *       is to overload a returned false with an array.  Thus, it will never
- *       return false.
- */
-class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
-{
-    var $allow_empty = true;
-    var $type = 'optional';
-    function validateChildren($tokens_of_children, $config, &$context) {
-        $result = parent::validateChildren($tokens_of_children, $config, $context);
-        if ($result === false) return array();
-        return $result;
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/Required.php
+++ b/library/HTMLPurifier/ChildDef/Required.php
@@ -1,104 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef.php';
-
-/**
- * Definition that allows a set of elements, but disallows empty children.
- */
-class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
-{
-    /**
-     * Lookup table of allowed elements.
-     * @public
-     */
-    var $elements = array();
-    /**
-     * @param $elements List of allowed element names (lowercase).
-     */
-    function HTMLPurifier_ChildDef_Required($elements) {
-        if (is_string($elements)) {
-            $elements = str_replace(' ', '', $elements);
-            $elements = explode('|', $elements);
-        }
-        $elements = array_flip($elements);
-        foreach ($elements as $i => $x) {
-            $elements[$i] = true;
-            if (empty($i)) unset($elements[$i]);
-        }
-        $this->elements = $elements;
-        $this->gen = new HTMLPurifier_Generator();
-    }
-    var $allow_empty = false;
-    var $type = 'required';
-    function validateChildren($tokens_of_children, $config, &$context) {
-        // if there are no tokens, delete parent node
-        if (empty($tokens_of_children)) return false;
-        
-        // the new set of children
-        $result = array();
-        
-        // current depth into the nest
-        $nesting = 0;
-        
-        // whether or not we're deleting a node
-        $is_deleting = false;
-        
-        // whether or not parsed character data is allowed
-        // this controls whether or not we silently drop a tag
-        // or generate escaped HTML from it
-        $pcdata_allowed = isset($this->elements['#PCDATA']);
-        
-        // a little sanity check to make sure it's not ALL whitespace
-        $all_whitespace = true;
-        
-        // some configuration
-        $escape_invalid_children = $config->get('Core', 'EscapeInvalidChildren');
-        
-        foreach ($tokens_of_children as $token) {
-            if (!empty($token->is_whitespace)) {
-                $result[] = $token;
-                continue;
-            }
-            $all_whitespace = false; // phew, we're not talking about whitespace
-            
-            $is_child = ($nesting == 0);
-            
-            if ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            if ($is_child) {
-                $is_deleting = false;
-                if (!isset($this->elements[$token->name])) {
-                    $is_deleting = true;
-                    if ($pcdata_allowed && $token->type == 'text') {
-                        $result[] = $token;
-                    } elseif ($pcdata_allowed && $escape_invalid_children) {
-                        $result[] = new HTMLPurifier_Token_Text(
-                            $this->gen->generateFromToken($token, $config)
-                        );
-                    }
-                    continue;
-                }
-            }
-            if (!$is_deleting || ($pcdata_allowed && $token->type == 'text')) {
-                $result[] = $token;
-            } elseif ($pcdata_allowed && $escape_invalid_children) {
-                $result[] =
-                    new HTMLPurifier_Token_Text(
-                        $this->gen->generateFromToken( $token, $config )
-                    );
-            } else {
-                // drop silently
-            }
-        }
-        if (empty($result)) return false;
-        if ($all_whitespace) return false;
-        if ($tokens_of_children == $result) return true;
-        return $result;
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/StrictBlockquote.php
+++ b/library/HTMLPurifier/ChildDef/StrictBlockquote.php
@@ -1,70 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef/Required.php';
-
-/**
- * Takes the contents of blockquote when in strict and reformats for validation.
- * 
- * From XHTML 1.0 Transitional to Strict, there is a notable change where 
- */
-class   HTMLPurifier_ChildDef_StrictBlockquote
-extends HTMLPurifier_ChildDef_Required
-{
-    var $allow_empty = true;
-    var $type = 'strictblockquote';
-    var $init = false;
-    function HTMLPurifier_ChildDef_StrictBlockquote() {}
-    function validateChildren($tokens_of_children, $config, &$context) {
-        
-        $def = $config->getHTMLDefinition();
-        if (!$this->init) {
-            // allow all inline elements
-            $this->elements = $def->info_flow_elements;
-            $this->elements['#PCDATA'] = true;
-            $this->init = true;
-        }
-        
-        $result = parent::validateChildren($tokens_of_children, $config, $context);
-        if ($result === false) return array();
-        if ($result === true) $result = $tokens_of_children;
-        
-        $block_wrap_start = new HTMLPurifier_Token_Start($def->info_block_wrapper);
-        $block_wrap_end   = new HTMLPurifier_Token_End(  $def->info_block_wrapper);
-        $is_inline = false;
-        $depth = 0;
-        $ret = array();
-        
-        // assuming that there are no comment tokens
-        foreach ($result as $i => $token) {
-            $token = $result[$i];
-            // ifs are nested for readability
-            if (!$is_inline) {
-                if (!$depth) {
-                     if (($token->type == 'text') ||
-                         ($def->info[$token->name]->type == 'inline')) {
-                        $is_inline = true;
-                        $ret[] = $block_wrap_start;
-                     }
-                }
-            } else {
-                if (!$depth) {
-                    // starting tokens have been inline text / empty
-                    if ($token->type == 'start' || $token->type == 'empty') {
-                        if ($def->info[$token->name]->type == 'block') {
-                            // ended
-                            $ret[] = $block_wrap_end;
-                            $is_inline = false;
-                        }
-                    }
-                }
-            }
-            $ret[] = $token;
-            if ($token->type == 'start') $depth++;
-            if ($token->type == 'end')   $depth--;
-        }
-        if ($is_inline) $ret[] = $block_wrap_end;
-        return $ret;
-    }
-}
-
-?>
--- a/library/HTMLPurifier/ChildDef/Table.php
+++ b/library/HTMLPurifier/ChildDef/Table.php
@@ -1,142 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/ChildDef.php';
-
-/**
- * Definition for tables
- */
-class HTMLPurifier_ChildDef_Table extends HTMLPurifier_ChildDef
-{
-    var $allow_empty = false;
-    var $type = 'table';
-    function HTMLPurifier_ChildDef_Table() {}
-    function validateChildren($tokens_of_children, $config, &$context) {
-        if (empty($tokens_of_children)) return false;
-        
-        // this ensures that the loop gets run one last time before closing
-        // up. It's a little bit of a hack, but it works! Just make sure you
-        // get rid of the token later.
-        $tokens_of_children[] = false;
-        
-        // only one of these elements is allowed in a table
-        $caption = false;
-        $thead   = false;
-        $tfoot   = false;
-        
-        // as many of these as you want
-        $cols    = array();
-        $content = array();
-        
-        $nesting = 0; // current depth so we can determine nodes
-        $is_collecting = false; // are we globbing together tokens to package
-                                // into one of the collectors?
-        $collection = array(); // collected nodes
-        $tag_index = 0; // the first node might be whitespace,
-                            // so this tells us where the start tag is
-        
-        foreach ($tokens_of_children as $token) {
-            $is_child = ($nesting == 0);
-            
-            if ($token === false) {
-                // terminating sequence started
-            } elseif ($token->type == 'start') {
-                $nesting++;
-            } elseif ($token->type == 'end') {
-                $nesting--;
-            }
-            
-            // handle node collection
-            if ($is_collecting) {
-                if ($is_child) {
-                    // okay, let's stash the tokens away
-                    // first token tells us the type of the collection
-                    switch ($collection[$tag_index]->name) {
-                        case 'tr':
-                        case 'tbody':
-                            $content[] = $collection;
-                            break;
-                        case 'caption':
-                            if ($caption !== false) break;
-                            $caption = $collection;
-                            break;
-                        case 'thead':
-                        case 'tfoot':
-                            // access the appropriate variable, $thead or $tfoot
-                            $var = $collection[$tag_index]->name;
-                            if ($$var === false) {
-                                $$var = $collection;
-                            } else {
-                                // transmutate the first and less entries into
-                                // tbody tags, and then put into content
-                                $collection[$tag_index]->name = 'tbody';
-                                $collection[count($collection)-1]->name = 'tbody';
-                                $content[] = $collection;
-                            }
-                            break;
-                         case 'colgroup':
-                            $cols[] = $collection;
-                            break;
-                    }
-                    $collection = array();
-                    $is_collecting = false;
-                    $tag_index = 0;
-                } else {
-                    // add the node to the collection
-                    $collection[] = $token;
-                }
-            }
-            
-            // terminate
-            if ($token === false) break;
-            
-            if ($is_child) {
-                // determine what we're dealing with
-                if ($token->name == 'col') {
-                    // the only empty tag in the possie, we can handle it
-                    // immediately
-                    $cols[] = array_merge($collection, array($token));
-                    $collection = array();
-                    $tag_index = 0;
-                    continue;
-                }
-                switch($token->name) {
-                    case 'caption':
-                    case 'colgroup':
-                    case 'thead':
-                    case 'tfoot':
-                    case 'tbody':
-                    case 'tr':
-                        $is_collecting = true;
-                        $collection[] = $token;
-                        continue;
-                    default:
-                        if ($token->type == 'text' && $token->is_whitespace) {
-                            $collection[] = $token;
-                            $tag_index++;
-                        }
-                        continue;
-                }
-            }
-        }
-        
-        if (empty($content)) return false;
-        
-        $ret = array();
-        if ($caption !== false) $ret = array_merge($ret, $caption);
-        if ($cols !== false)    foreach ($cols as $token_array) $ret = array_merge($ret, $token_array);
-        if ($thead !== false)   $ret = array_merge($ret, $thead);
-        if ($tfoot !== false)   $ret = array_merge($ret, $tfoot);
-        foreach ($content as $token_array) $ret = array_merge($ret, $token_array);
-        if (!empty($collection) && $is_collecting == false){
-            // grab the trailing space
-            $ret = array_merge($ret, $collection);
-        }
-        
-        array_pop($tokens_of_children); // remove phantom token
-        
-        return ($ret === $tokens_of_children) ? true : $ret;
-        
-    }
-}
-
-?>
--- a/library/HTMLPurifier/Config.php
+++ b/library/HTMLPurifier/Config.php
@@ -26,12 +26,12 @@ class HTMLPurifier_Config
    var $def;
    
    /**
-     * Cached instance of HTMLPurifier_HTMLDefinition
+     * Instance of HTMLPurifier_HTMLDefinition
     */
    var $html_definition;
    
    /**
-     * Cached instance of HTMLPurifier_CSSDefinition
+     * Instance of HTMLPurifier_CSSDefinition
     */
    var $css_definition;
    
@@ -60,7 +60,7 @@ class HTMLPurifier_Config
     * @param $key String key
     */
    function get($namespace, $key) {
-        if (!isset($this->def->info[$namespace][$key])) {
+        if (!isset($this->conf[$namespace][$key])) {
            trigger_error('Cannot retrieve value of undefined directive',
                E_USER_WARNING);
            return;
@@ -68,19 +68,6 @@ class HTMLPurifier_Config
        return $this->conf[$namespace][$key];
    }
    
-    /**
-     * Retreives an array of directives to values from a given namespace
-     * @param $namespace String namespace
-     */
-    function getBatch($namespace) {
-        if (!isset($this->def->info[$namespace])) {
-            trigger_error('Cannot retrieve undefined namespace',
-                E_USER_WARNING);
-            return;
-        }
-        return $this->conf[$namespace];
-    }
-    
    /**
     * Sets a value to configuration.
     * @param $namespace String namespace
@@ -88,16 +75,13 @@ class HTMLPurifier_Config
     * @param $value Mixed value
     */
    function set($namespace, $key, $value) {
-        if (!isset($this->def->info[$namespace][$key])) {
+        if (!isset($this->conf[$namespace][$key])) {
            trigger_error('Cannot set undefined directive to value',
                E_USER_WARNING);
            return;
        }
-        $value = $this->def->validate(
-                    $value,
-                    $this->def->info[$namespace][$key]->type,
-                    $this->def->info[$namespace][$key]->allow_null
-                 );
+        $value = $this->def->validate($value,
+                                      $this->def->info[$namespace][$key]->type);
        if (is_string($value)) {
            // resolve value alias if defined
            if (isset($this->def->info[$namespace][$key]->aliases[$value])) {
@@ -111,7 +95,7 @@ class HTMLPurifier_Config
                }
            }
        }
-        if ($this->def->isError($value)) {
+        if ($value === null) {
            trigger_error('Value is of invalid type', E_USER_WARNING);
            return;
        }
@@ -140,28 +124,6 @@ class HTMLPurifier_Config
        return $this->css_definition;
    }
    
-    /**
-     * Loads configuration values from an array with the following structure:
-     * Namespace.Directive => Value
-     * @param $config_array Configuration associative array
-     */
-    function loadArray($config_array) {
-        foreach ($config_array as $key => $value) {
-            $key = str_replace('_', '.', $key);
-            if (strpos($key, '.') !== false) {
-                // condensed form
-                list($namespace, $directive) = explode('.', $key);
-                $this->set($namespace, $directive, $value);
-            } else {
-                $namespace = $key;
-                $namespace_values = $value;
-                foreach ($namespace_values as $directive => $value) {
-                    $this->set($namespace, $directive, $value);
-                }
-            }
-        }
-    }
-    
 }

 ?>
--- a/library/HTMLPurifier/ConfigSchema.php
+++ b/library/HTMLPurifier/ConfigSchema.php
@@ -1,7 +1,5 @@
 <?php

-require_once 'HTMLPurifier/Error.php';
-
 /**
 * Configuration definition, defines directives and their defaults.
 * @todo The ability to define things multiple times is confusing and should
@@ -113,19 +111,12 @@ class HTMLPurifier_ConfigSchema {
                return;
            }
        } else {
-            // process modifiers
-            $type_values = explode('/', $type, 2);
-            $type = $type_values[0];
-            $modifier = isset($type_values[1]) ? $type_values[1] : false;
-            $allow_null = ($modifier === 'null');
-            
            if (!isset($def->types[$type])) {
                trigger_error('Invalid type for configuration directive',
                    E_USER_ERROR);
                return;
            }
-            $default = $def->validate($default, $type, $allow_null);
-            if ($def->isError($default)) {
+            if ($def->validate($default, $type) === null) {
                trigger_error('Default value does not match directive type',
                    E_USER_ERROR);
                return;
@@ -133,7 +124,6 @@ class HTMLPurifier_ConfigSchema {
            $def->info[$namespace][$name] =
                new HTMLPurifier_ConfigEntity_Directive();
            $def->info[$namespace][$name]->type = $type;
-            $def->info[$namespace][$name]->allow_null = $allow_null;
            $def->defaults[$namespace][$name]   = $default;
        }
        $backtrace = debug_backtrace();
@@ -222,52 +212,36 @@ class HTMLPurifier_ConfigSchema {
    /**
     * Validate a variable according to type. Return null if invalid.
     */
-    function validate($var, $type, $allow_null = false) {
+    function validate($var, $type) {
        if (!isset($this->types[$type])) {
            trigger_error('Invalid type', E_USER_ERROR);
            return;
        }
-        if ($allow_null && $var === null) return null;
        switch ($type) {
            case 'mixed':
                return $var;
            case 'istring':
            case 'string':
-                if (!is_string($var)) break;
+                if (!is_string($var)) return;
                if ($type === 'istring') $var = strtolower($var);
                return $var;
            case 'int':
                if (is_string($var) && ctype_digit($var)) $var = (int) $var;
-                elseif (!is_int($var)) break;
+                elseif (!is_int($var)) return;
                return $var;
            case 'float':
                if (is_string($var) && is_numeric($var)) $var = (float) $var;
-                elseif (!is_float($var)) break;
+                elseif (!is_float($var)) return;
                return $var;
            case 'bool':
                if (is_int($var) && ($var === 0 || $var === 1)) {
                    $var = (bool) $var;
-                } elseif (is_string($var)) {
-                    if ($var == 'on' || $var == 'true' || $var == '1') {
-                        $var = true;
-                    } elseif ($var == 'off' || $var == 'false' || $var == '0') {
-                        $var = false;
-                    } else {
-                        break;
-                    }
-                } elseif (!is_bool($var)) break;
+                } elseif (!is_bool($var)) return;
                return $var;
            case 'list':
            case 'hash':
            case 'lookup':
-                if (is_string($var)) {
-                    // simplistic string to array method that only works
-                    // for simple lists of tag names or alphanumeric characters
-                    $var = explode(',',$var);
-                    // remove spaces
-                    foreach ($var as $i => $j) $var[$i] = trim($j);
-                }
-                if (!is_array($var)) break;
+                if (!is_array($var)) return;
                $keys = array_keys($var);
                if ($keys === array_keys($keys)) {
                    if ($type == 'list') return $var;
@@ -277,7 +251,7 @@ class HTMLPurifier_ConfigSchema {
                            $new[$key] = true;
                        }
                        return $new;
-                    } else break;
+                    } else return;
                }
                if ($type === 'lookup') {
                    foreach ($var as $key => $value) {
@@ -286,13 +260,8 @@ class HTMLPurifier_ConfigSchema {
                }
                return $var;
        }
-        $error = new HTMLPurifier_Error();
-        return $error;
    }
    
-    /**
-     * Takes an absolute path and munges it into a more manageable relative path
-     */
    function mungeFilename($filename) {
        $offset = strrpos($filename, 'HTMLPurifier');
        $filename = substr($filename, $offset);
@@ -300,14 +269,6 @@ class HTMLPurifier_ConfigSchema {
        return $filename;
    }
    
-    /**
-     * Checks if var is an HTMLPurifier_Error object
-     */
-    function isError($var) {
-        if (!is_object($var)) return false;
-        if (!is_a($var, 'HTMLPurifier_Error')) return false;
-        return true;
-    }
 }

 /**
@@ -357,13 +318,6 @@ class HTMLPurifier_ConfigEntity_Directive extends HTMLPurifier_ConfigEntity
     *      - mixed (anything goes)
     */
    var $type = 'mixed';
-    
-    /**
-     * Is null allowed? Has no affect for mixed type.
-     * @bool
-     */
-    var $allow_null = false;
-    
    /**
     * Plaintext descriptions of the configuration entity is. Organized by
     * file and line number, so multiple descriptions are allowed.
--- a/library/HTMLPurifier/Context.php
+++ b/library/HTMLPurifier/Context.php
@@ -1,76 +0,0 @@
-<?php
-
-/**
- * Registry object that contains information about the current context.
- */
-class HTMLPurifier_Context
-{
-    
-    /**
-     * Private array that stores the references.
-     * @private
-     */
-    var $_storage = array();
-    
-    /**
-     * Registers a variable into the context.
-     * @param $name String name
-     * @param $ref Variable to be registered
-     */
-    function register($name, &$ref) {
-        if (isset($this->_storage[$name])) {
-            trigger_error('Name collision, cannot re-register',
-                          E_USER_ERROR);
-            return;
-        }
-        $this->_storage[$name] =& $ref;
-    }
-    
-    /**
-     * Retrieves a variable reference from the context.
-     * @param $name String name
-     */
-    function &get($name) {
-        if (!isset($this->_storage[$name])) {
-            trigger_error('Attempted to retrieve non-existent variable',
-                          E_USER_ERROR);
-            $var = null; // so we can return by reference
-            return $var;
-        }
-        return $this->_storage[$name];
-    }
-    
-    /**
-     * Destorys a variable in the context.
-     * @param $name String name
-     */
-    function destroy($name) {
-        if (!isset($this->_storage[$name])) {
-            trigger_error('Attempted to destroy non-existent variable',
-                          E_USER_ERROR);
-            return;
-        }
-        unset($this->_storage[$name]);
-    }
-    
-    /**
-     * Checks whether or not the variable exists.
-     * @param $name String name
-     */
-    function exists($name) {
-        return isset($this->_storage[$name]);
-    }
-    
-    /**
-     * Loads a series of variables from an associative array
-     * @param $context_array Assoc array of variables to load
-     */
-    function loadArray(&$context_array) {
-        foreach ($context_array as $key => $discard) {
-            $this->register($key, $context_array[$key]);
-        }
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/Encoder.php
+++ b/library/HTMLPurifier/Encoder.php
@@ -225,30 +225,7 @@ class HTMLPurifier_Encoder
    
    /**
     * Translates a Unicode codepoint into its corresponding UTF-8 character.
-     * @note Based on Feyd's function at
-     *       <http://forums.devnetwork.net/viewtopic.php?p=191404#191404>,
-     *       which is in public domain.
-     * @note While we're going to do code point parsing anyway, a good
-     *       optimization would be to refuse to translate code points that
-     *       are non-SGML characters.  However, this could lead to duplication.
-     * @note This is very similar to the unichr function in
-     *       maintenance/generate-entity-file.php (although this is superior,
-     *       due to its sanity checks).
     */
-    
-    // +----------+----------+----------+----------+
-    // | 33222222 | 22221111 | 111111   |          |
-    // | 10987654 | 32109876 | 54321098 | 76543210 | bit
-    // +----------+----------+----------+----------+
-    // |          |          |          | 0xxxxxxx | 1 byte 0x00000000..0x0000007F
-    // |          |          | 110yyyyy | 10xxxxxx | 2 byte 0x00000080..0x000007FF
-    // |          | 1110zzzz | 10yyyyyy | 10xxxxxx | 3 byte 0x00000800..0x0000FFFF
-    // | 11110www | 10wwzzzz | 10yyyyyy | 10xxxxxx | 4 byte 0x00010000..0x0010FFFF
-    // +----------+----------+----------+----------+
-    // | 00000000 | 00011111 | 11111111 | 11111111 | Theoretical upper limit of legal scalars: 2097151 (0x001FFFFF)
-    // | 00000000 | 00010000 | 11111111 | 11111111 | Defined upper limit of legal scalar codes
-    // +----------+----------+----------+----------+ 
-    
    function unichr($code) {
        if($code > 1114111 or $code < 0 or
          ($code >= 55296 and $code <= 57343) ) {
@@ -289,7 +266,7 @@ class HTMLPurifier_Encoder
    /**
     * Converts a string to UTF-8 based on configuration.
     */
-    function convertToUTF8($str, $config, &$context) {
+    function convertToUTF8($str, $config) {
        static $iconv = null;
        if ($iconv === null) $iconv = function_exists('iconv');
        $encoding = $config->get('Core', 'Encoding');
@@ -306,7 +283,7 @@ class HTMLPurifier_Encoder
     * @note Currently, this is a lossy conversion, with unexpressable
     *       characters being omitted.
     */
-    function convertFromUTF8($str, $config, &$context) {
+    function convertFromUTF8($str, $config) {
        static $iconv = null;
        if ($iconv === null) $iconv = function_exists('iconv');
        $encoding = $config->get('Core', 'Encoding');
--- a/library/HTMLPurifier/EntityLookup.php
+++ b/library/HTMLPurifier/EntityLookup.php
@@ -19,7 +19,7 @@ class HTMLPurifier_EntityLookup {
     */
    function setup($file = false) {
        if (!$file) {
-            $file = dirname(__FILE__) . '/EntityLookup/entities.ser';
+            $file = dirname(__FILE__) . '/EntityLookup/data.txt';
        }
        $this->table = unserialize(file_get_contents($file));
    }
--- a/library/HTMLPurifier/EntityLookup/entities.ser
+++ b/library/HTMLPurifier/EntityLookup/entities.ser
--- a/library/HTMLPurifier/EntityParser.php
+++ b/library/HTMLPurifier/EntityParser.php
@@ -3,10 +3,6 @@
 require_once 'HTMLPurifier/EntityLookup.php';
 require_once 'HTMLPurifier/Encoder.php';

-// if want to implement error collecting here, we'll need to use some sort
-// of global data (probably trigger_error) because it's impossible to pass
-// $config or $context to the callback functions.
-
 /**
 * Handles referencing and derefencing character entities
 */
@@ -76,12 +72,37 @@ class HTMLPurifier_EntityParser
     * 
     * @warning Though this is public in order to let the callback happen,
     *          calling it directly is not recommended.
+     * @note Based on Feyd's function at
+     *       <http://forums.devnetwork.net/viewtopic.php?p=191404#191404>,
+     *       which is in public domain.
+     * @note While we're going to do code point parsing anyway, a good
+     *       optimization would be to refuse to translate code points that
+     *       are non-SGML characters.  However, this could lead to duplication.
+     * @note This function is heavily intimate with the inner workings of
+     *       UTF-8 and would also be well suited in the Encoder class (or at
+     *       least deferring some processing to it).  This is also very
+     *       similar to the unichr function in
+     *       maintenance/generate-entity-file.php (although this is superior,
+     *       due to its sanity checks).
     * @param $matches  PCRE matches array, with 0 the entire match, and
     *                  either index 1, 2 or 3 set with a hex value, dec value,
     *                  or string (respectively).
     * @returns Replacement string.
     */
    
+    // +----------+----------+----------+----------+
+    // | 33222222 | 22221111 | 111111   |          |
+    // | 10987654 | 32109876 | 54321098 | 76543210 | bit
+    // +----------+----------+----------+----------+
+    // |          |          |          | 0xxxxxxx | 1 byte 0x00000000..0x0000007F
+    // |          |          | 110yyyyy | 10xxxxxx | 2 byte 0x00000080..0x000007FF
+    // |          | 1110zzzz | 10yyyyyy | 10xxxxxx | 3 byte 0x00000800..0x0000FFFF
+    // | 11110www | 10wwzzzz | 10yyyyyy | 10xxxxxx | 4 byte 0x00010000..0x0010FFFF
+    // +----------+----------+----------+----------+
+    // | 00000000 | 00011111 | 11111111 | 11111111 | Theoretical upper limit of legal scalars: 2097151 (0x001FFFFF)
+    // | 00000000 | 00010000 | 11111111 | 11111111 | Defined upper limit of legal scalar codes
+    // +----------+----------+----------+----------+ 
+    
    function nonSpecialEntityCallback($matches) {
        // replaces all but big five
        $entity = $matches[0];
--- a/library/HTMLPurifier/Error.php
+++ b/library/HTMLPurifier/Error.php
@@ -1,8 +0,0 @@
-<?php
-
-/**
- * Return object from functions that signifies error when null doesn't cut it
- */
-class HTMLPurifier_Error {}
-
-?>
--- a/library/HTMLPurifier/Generator.php
+++ b/library/HTMLPurifier/Generator.php
@@ -1,5 +1,7 @@
 <?php

+// pretty-printing with indentation would be pretty cool
+
 require_once 'HTMLPurifier/Lexer.php';

 HTMLPurifier_ConfigSchema::define(
@@ -50,7 +52,6 @@ class HTMLPurifier_Generator
    
    /**
     * Bool cache of %Core.XHTML
-     * @private
     */
    var $_xhtml = true;
    
@@ -59,8 +60,9 @@ class HTMLPurifier_Generator
     * @param $tokens Array of HTMLPurifier_Token
     * @param $config HTMLPurifier_Config object
     * @return Generated HTML
+     * @note Only unit tests may omit configuration: internals MUST pass config
     */
-    function generateFromTokens($tokens, $config, &$context) {
+    function generateFromTokens($tokens, $config = null) {
        $html = '';
        if (!$config) $config = HTMLPurifier_Config::createDefault();
        $this->_clean_utf8 = $config->get('Core', 'CleanUTF8DuringGeneration');
--- a/library/HTMLPurifier/HTMLDefinition.php
+++ b/library/HTMLPurifier/HTMLDefinition.php
@@ -18,86 +18,10 @@ require_once 'HTMLPurifier/AttrTransform.php';
    require_once 'HTMLPurifier/AttrTransform/BdoDir.php';
    require_once 'HTMLPurifier/AttrTransform/ImgRequired.php';
 require_once 'HTMLPurifier/ChildDef.php';
-    require_once 'HTMLPurifier/ChildDef/Chameleon.php';
-    require_once 'HTMLPurifier/ChildDef/Empty.php';
-    require_once 'HTMLPurifier/ChildDef/Required.php';
-    require_once 'HTMLPurifier/ChildDef/Optional.php';
-    require_once 'HTMLPurifier/ChildDef/Table.php';
-    require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
 require_once 'HTMLPurifier/Generator.php';
 require_once 'HTMLPurifier/Token.php';
 require_once 'HTMLPurifier/TagTransform.php';

-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'EnableAttrID', false, 'bool',
-    'Allows the ID attribute in HTML.  This is disabled by default '.
-    'due to the fact that without proper configuration user input can '.
-    'easily break the validation of a webpage by specifying an ID that is '.
-    'already on the surrounding HTML.  If you don\'t mind throwing caution to '.
-    'the wind, enable this directive, but I strongly recommend you also '.
-    'consider blacklisting IDs you use (%Attr.IDBlacklist) or prefixing all '.
-    'user supplied IDs (%Attr.IDPrefix).  This directive has been available '.
-    'since 1.2.0, and when set to true reverts to the behavior of pre-1.2.0 '.
-    'versions.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'Strict', false, 'bool',
-    'Determines whether or not to use Transitional (loose) or Strict rulesets. '.
-    'This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'BlockWrapper', 'p', 'string',
-    'String name of element to wrap inline elements that are inside a block '.
-    'context.  This only occurs in the children of blockquote in strict mode. '.
-    'Example: by default value, <code>&lt;blockquote&gt;Foo&lt;/blockquote&gt;</code> '.
-    'would become <code>&lt;blockquote&gt;&lt;p&gt;Foo&lt;/p&gt;&lt;/blockquote&gt;</code>. The '.
-    '<code>&lt;p&gt;</code> tags can be replaced '.
-    'with whatever you desire, as long as it is a block level element. '.
-    'This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'Parent', 'div', 'string',
-    'String name of element that HTML fragment passed to library will be '.
-    'inserted in.  An interesting variation would be using span as the '.
-    'parent element, meaning that only inline tags would be allowed. '.
-    'This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'AllowedElements', null, 'lookup/null',
-    'If HTML Purifier\'s tag set is unsatisfactory for your needs, you '.
-    'can overload it with your own list of tags to allow.  Note that this '.
-    'method is subtractive: it does its job by taking away from HTML Purifier '.
-    'usual feature set, so you cannot add a tag that HTML Purifier never '.
-    'supported in the first place (like embed).  If you change this, you '.
-    'probably also want to change %HTML.AllowedAttributes. '.
-    '<strong>Warning:</strong> If another directive conflicts with the '.
-    'elements here, <em>that</em> directive will win and override. '.
-    'This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'HTML', 'AllowedAttributes', null, 'lookup/null',
-    'IF HTML Purifier\'s attribute set is unsatisfactory, overload it! '.
-    'The syntax is \'tag.attr\' or \'*.attr\' for the global attributes '.
-    '(style, id, class, dir, lang, xml:lang).'.
-    '<strong>Warning:</strong> If another directive conflicts with the '.
-    'elements here, <em>that</em> directive will win and override. For '.
-    'example, %HTML.EnableAttrID will take precedence over *.id in this '.
-    'directive.  You must set that directive to true before you can use '.
-    'IDs at all. This directive has been available since 1.3.0.'
-);
-
-HTMLPurifier_ConfigSchema::define(
-    'Attr', 'DisableURI', false, 'bool',
-    'Disables all URIs in all forms. Not sure why you\'d want to do that '.
-    '(after all, the Internet\'s founded on the notion of a hyperlink). '.
-    'This directive has been available since 1.3.0.'
-);
-
 /**
 * Defines the purified HTML type with large amounts of objects.
 * 
@@ -136,20 +60,6 @@ class HTMLPurifier_HTMLDefinition
     */
    var $info_parent = 'div';
    
-    /**
-     * Definition for parent element, allows parent element to be a
-     * tag that's not allowed inside the HTML fragment.
-     * @public
-     */
-    var $info_parent_def;
-    
-    /**
-     * String name of element used to wrap inline elements in block context
-     * @note This is rarely used except for BLOCKQUOTEs in strict mode
-     * @public
-     */
-    var $info_block_wrapper = 'p';
-    
    /**
     * Associative array of deprecated tag name to HTMLPurifier_TagTransform
     * @public
@@ -168,25 +78,14 @@ class HTMLPurifier_HTMLDefinition
     */
    var $info_attr_transform_post = array();
    
-    /**
-     * Lookup table of flow elements
-     * @public
-     */
-    var $info_flow_elements = array();
-    
-    /**
-     * Boolean is a strict definition?
-     * @public
-     */
-    var $strict;
-    
    /**
     * Initializes the definition, the meat of the class.
     */
    function setup($config) {
        
-        // some cached config values
-        $this->strict = $config->get('HTML', 'Strict');
+        // emulates the structure of the DTD
+        // these are condensed, however, with bad stuff taken out
+        // screening process was done by hand
        
        //////////////////////////////////////////////////////////////////////
        // info[] : initializes the definition objects
@@ -198,19 +97,13 @@ class HTMLPurifier_HTMLDefinition
            array(
                'ins', 'del', 'blockquote', 'dd', 'li', 'div', 'em', 'strong',
                'dfn', 'code', 'samp', 'kbd', 'var', 'cite', 'abbr', 'acronym',
-                'q', 'sub', 'tt', 'sup', 'i', 'b', 'big', 'small',
-                'bdo', 'span', 'dt', 'p', 'h1', 'h2', 'h3', 'h4',
+                'q', 'sub', 'tt', 'sup', 'i', 'b', 'big', 'small', 'u', 's',
+                'strike', 'bdo', 'span', 'dt', 'p', 'h1', 'h2', 'h3', 'h4',
                'h5', 'h6', 'ol', 'ul', 'dl', 'address', 'img', 'br', 'hr',
                'pre', 'a', 'table', 'caption', 'thead', 'tfoot', 'tbody',
                'colgroup', 'col', 'td', 'th', 'tr'
            );
        
-        if (!$this->strict) {
-            $allowed_tags[] = 'u';
-            $allowed_tags[] = 's';
-            $allowed_tags[] = 'strike';
-        }
-        
        foreach ($allowed_tags as $tag) {
            $this->info[$tag] = new HTMLPurifier_ElementDef();
        }
@@ -218,23 +111,12 @@ class HTMLPurifier_HTMLDefinition
        //////////////////////////////////////////////////////////////////////
        // info[]->child : defines allowed children for elements
        
-        // emulates the structure of the DTD
-        // however, these are condensed, with bad stuff taken out
-        // screening process was done by hand
-        
-        // entities: prefixed with e_ and _ replaces . from DTD
-        // double underlines are entities we made up
+        // entities: prefixed with e_ and _ replaces .
        
        // we don't use an array because that complicates interpolation
        // strings are used instead of arrays because if you use arrays,
        // you have to do some hideous manipulation with array_merge()
        
-        // todo: determine whether or not having allowed children
-        //       that aren't allowed globally affects security (it shouldn't)
-        // if above works out, extend children definitions to include all
-        //       possible elements (allowed elements will dictate which ones
-        //       get dropped
-        
        $e_special_extra = 'img';
        $e_special_basic = 'br | span | bdo';
        $e_special = "$e_special_basic | $e_special_extra";
@@ -245,9 +127,11 @@ class HTMLPurifier_HTMLDefinition
        $e_phrase_basic = 'em | strong | dfn | code | q | samp | kbd | var'.
          ' | cite | abbr | acronym';
        $e_phrase = "$e_phrase_basic | $e_phrase_extra";
+        $e_inline_forms = ''; // humor the dtd
        $e_misc_inline = 'ins | del';
        $e_misc = "$e_misc_inline";
-        $e_inline = "a | $e_special | $e_fontstyle | $e_phrase";
+        $e_inline = "a | $e_special | $e_fontstyle | $e_phrase".
+          " | $e_inline_forms";
        // pseudo-property we created for convenience, see later on
        $e__inline = "#PCDATA | $e_inline | $e_misc_inline";
        // note the casing
@@ -256,31 +140,24 @@ class HTMLPurifier_HTMLDefinition
        $e_lists = 'ul | ol | dl';
        $e_blocktext = 'pre | hr | blockquote | address';
        $e_block = "p | $e_heading | div | $e_lists | $e_blocktext | table";
-        $e_Block = new HTMLPurifier_ChildDef_Optional($e_block);
        $e__flow = "#PCDATA | $e_block | $e_inline | $e_misc";
        $e_Flow = new HTMLPurifier_ChildDef_Optional($e__flow);
-        $e_a_content = new HTMLPurifier_ChildDef_Optional("#PCDATA".
-          " | $e_special | $e_fontstyle | $e_phrase | $e_misc_inline");
+        $e_a_content = new HTMLPurifier_ChildDef_Optional("#PCDATA | $e_special".
+          " | $e_fontstyle | $e_phrase | $e_inline_forms | $e_misc_inline");
        $e_pre_content = new HTMLPurifier_ChildDef_Optional("#PCDATA | a".
          " | $e_special_basic | $e_fontstyle_basic | $e_phrase_basic".
-          " | $e_misc_inline");
-        $e_form_content = new HTMLPurifier_ChildDef_Optional('');//unused
-        $e_form_button_content = new HTMLPurifier_ChildDef_Optional('');//unused
+          " | $e_inline_forms | $e_misc_inline");
+        $e_form_content = new HTMLPurifier_ChildDef_Optional(''); //unused
+        $e_form_button_content = new HTMLPurifier_ChildDef_Optional(''); // unused
        
        $this->info['ins']->child =
-        $this->info['del']->child =
-            new HTMLPurifier_ChildDef_Chameleon($e__inline, $e__flow);
+        $this->info['del']->child = new HTMLPurifier_ChildDef_Chameleon($e__inline, $e__flow);
        
+        $this->info['blockquote']->child=
        $this->info['dd']->child  =
        $this->info['li']->child  =
        $this->info['div']->child = $e_Flow;
        
-        if ($this->strict) {
-            $this->info['blockquote']->child = new HTMLPurifier_ChildDef_StrictBlockquote();
-        } else {
-            $this->info['blockquote']->child = $e_Flow;
-        }
-        
        $this->info['caption']->child   = 
        $this->info['em']->child   =
        $this->info['strong']->child    =
@@ -320,13 +197,9 @@ class HTMLPurifier_HTMLDefinition
        
        $this->info['dl']->child   = new HTMLPurifier_ChildDef_Required('dt|dd');
        
-        if ($this->strict) {
-            $this->info['address']->child = $e_Inline;
-        } else {
        $this->info['address']->child =
          new HTMLPurifier_ChildDef_Optional("#PCDATA | p | $e_inline".
              " | $e_misc_inline");
-        }
        
        $this->info['img']->child  =
        $this->info['br']->child   =
@@ -352,20 +225,17 @@ class HTMLPurifier_HTMLDefinition
        //////////////////////////////////////////////////////////////////////
        // info[]->type : defines the type of the element (block or inline)
        
-        // reuses $e_Inline and $e_Block
-        foreach ($e_Inline->elements as $name => $bool) {
-            if ($name == '#PCDATA') continue;
+        // reuses $e_Inline and $e_block
+        
+        foreach ($e_Inline->elements as $name) {
            $this->info[$name]->type = 'inline';
        }
        
-        foreach ($e_Block->elements as $name => $bool) {
+        $e_Block = new HTMLPurifier_ChildDef_Optional($e_block);
+        foreach ($e_Block->elements as $name) {
            $this->info[$name]->type = 'block';
        }
        
-        foreach ($e_Flow->elements as $name => $bool) {
-            $this->info_flow_elements[$name] = true;
-        }
-        
        //////////////////////////////////////////////////////////////////////
        // info[]->excludes : defines elements that aren't allowed in here
        
@@ -373,7 +243,7 @@ class HTMLPurifier_HTMLDefinition
        
        $this->info['a']->excludes = array('a' => true);
        $this->info['pre']->excludes = array_flip(array('img', 'big', 'small',
-            // technically useless, but good to be indepth
+            // technically in spec, but we don't allow em anyway
            'object', 'applet', 'font', 'basefont'));
        
        //////////////////////////////////////////////////////////////////////
@@ -383,14 +253,13 @@ class HTMLPurifier_HTMLDefinition
        // by the transform classes. It will, however, do simple and slightly
        // complex attribute value substitution
        
-        // the question of varying allowed attributes is more entangling.
-        
        $e_Text = new HTMLPurifier_AttrDef_Text();
        
        // attrs, included in almost every single one except for a few,
        // which manually override these in their local definitions
        $this->info_global_attr = array(
            // core attrs
+            'id'    => new HTMLPurifier_AttrDef_ID(),
            'class' => new HTMLPurifier_AttrDef_Class(),
            'title' => $e_Text,
            'style' => new HTMLPurifier_AttrDef_CSS(),
@@ -400,10 +269,6 @@ class HTMLPurifier_HTMLDefinition
            'xml:lang' => new HTMLPurifier_AttrDef_Lang(),
            );
        
-        if ($config->get('HTML', 'EnableAttrID')) {
-            $this->info_global_attr['id'] = new HTMLPurifier_AttrDef_ID();
-        }
-        
        // required attribute stipulation handled in attribute transformation
        $this->info['bdo']->attr = array(); // nothing else
        
@@ -432,8 +297,7 @@ class HTMLPurifier_HTMLDefinition
        
        $this->info['table']->attr['summary'] = $e_Text;
        
-        $this->info['table']->attr['border'] =
-            new HTMLPurifier_AttrDef_Pixels();
+        $this->info['table']->attr['border'] = new HTMLPurifier_AttrDef_Pixels();
        
        $e_Length = new HTMLPurifier_AttrDef_Length();
        $this->info['table']->attr['cellpadding'] =
@@ -455,26 +319,17 @@ class HTMLPurifier_HTMLDefinition
        $this->info['td']->attr['colspan'] =
        $this->info['th']->attr['colspan'] = $e__NumberSpan;
        
-        if (!$config->get('Attr', 'DisableURI')) {
        $e_URI = new HTMLPurifier_AttrDef_URI();
        $this->info['a']->attr['href'] =
        $this->info['img']->attr['longdesc'] =
+        $this->info['img']->attr['src'] =
        $this->info['del']->attr['cite'] =
        $this->info['ins']->attr['cite'] =
        $this->info['blockquote']->attr['cite'] =
        $this->info['q']->attr['cite'] = $e_URI;
        
-            // URI that causes HTTP request
-            $this->info['img']->attr['src'] = new HTMLPurifier_AttrDef_URI(true);
-        }
-        
-        if (!$this->strict) {
-            $this->info['li']->attr['value'] = new HTMLPurifier_AttrDef_Integer();
-            $this->info['ol']->attr['start'] = new HTMLPurifier_AttrDef_Integer();
-        }
-        
        //////////////////////////////////////////////////////////////////////
-        // info_tag_transform : transformations of tags
+        // UNIMP : info_tag_transform : transformations of tags
        
        $this->info_tag_transform['font']   = new HTMLPurifier_TagTransform_Font();
        $this->info_tag_transform['menu']   = new HTMLPurifier_TagTransform_Simple('ul');
@@ -484,9 +339,6 @@ class HTMLPurifier_HTMLDefinition
        //////////////////////////////////////////////////////////////////////
        // info[]->auto_close : tags that automatically close another
        
-        // todo: determine whether or not SGML-like modeling based on
-        // mandatory/optional end tags would be a better policy
-        
        // make sure you test using isset() not !empty()
        
        // these are all block elements: blocks aren't allowed in P
@@ -529,60 +381,6 @@ class HTMLPurifier_HTMLDefinition
        
        $this->info_attr_transform_post[] = new HTMLPurifier_AttrTransform_Lang();
        
-        // protect against stdclasses floating around
-        foreach ($this->info as $key => $obj) {
-            if (is_a($obj, 'stdclass')) {
-                unset($this->info[$key]);
-            }
-        }
-        
-        //////////////////////////////////////////////////////////////////////
-        // info_block_wrapper : wraps inline elements in block context
-        
-        $block_wrapper = $config->get('HTML', 'BlockWrapper');
-        if (isset($e_Block->elements[$block_wrapper])) {
-            $this->info_block_wrapper = $block_wrapper;
-        } else {
-            trigger_error('Cannot use non-block element as block wrapper.',
-                E_USER_ERROR);
-        }
-        
-        //////////////////////////////////////////////////////////////////////
-        // info_parent : parent element of the HTML fragment
-        
-        $parent = $config->get('HTML', 'Parent');
-        if (isset($this->info[$parent])) {
-            $this->info_parent = $parent;
-        } else {
-            trigger_error('Cannot use unrecognized element as parent.',
-                E_USER_ERROR);
-        }
-        $this->info_parent_def = $this->info[$this->info_parent];
-        
-        //////////////////////////////////////////////////////////////////////
-        // %HTML.Allowed(Elements|Attributes) : cut non-allowed elements
-        $allowed_elements = $config->get('HTML', 'AllowedElements');
-        if (is_array($allowed_elements)) {
-            // $allowed_elements[$this->info_parent] = true; // allow parent element
-            foreach ($this->info as $name => $d) {
-                if(!isset($allowed_elements[$name])) unset($this->info[$name]);
-            }
-        }
-        $allowed_attributes = $config->get('HTML', 'AllowedAttributes');
-        if (is_array($allowed_attributes)) {
-            foreach ($this->info_global_attr as $attr => $info) {
-                if (!isset($allowed_attributes["*.$attr"])) {
-                    unset($this->info_global_attr[$attr]);
-                }
-            }
-            foreach ($this->info as $tag => $info) {
-                foreach ($info->attr as $attr => $attr_info) {
-                    if (!isset($allowed_attributes["$tag.$attr"])) {
-                        unset($this->info[$tag]->attr[$attr]);
-                    }
-                }
-            }
-        }
    }
    
    function setAttrForTableElements($attr, $def) {
--- a/library/HTMLPurifier/IDAccumulator.php
+++ b/library/HTMLPurifier/IDAccumulator.php
@@ -3,9 +3,6 @@
 /**
 * Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
 * @note In Slashdot-speak, dupe means duplicate.
- * @note This class does not accept $config or $context, thus, it is the
- *       burden of the callee to register the appropriate errors or
- *       configuration.
 */
 class HTMLPurifier_IDAccumulator
 {
--- a/library/HTMLPurifier/Lexer.php
+++ b/library/HTMLPurifier/Lexer.php
@@ -60,60 +60,6 @@ class HTMLPurifier_Lexer
        $this->_entity_parser = new HTMLPurifier_EntityParser();
    }
    
-    
-    /**
-     * Most common entity to raw value conversion table for special entities.
-     * @protected
-     */
-    var $_special_entity2str =
-            array(
-                    '&quot;' => '"',
-                    '&amp;'  => '&',
-                    '&lt;'   => '<',
-                    '&gt;'   => '>',
-                    '&#39;'  => "'",
-                    '&#039;' => "'",
-                    '&#x27;' => "'"
-            );
-    
-    /**
-     * Parses special entities into the proper characters.
-     * 
-     * This string will translate escaped versions of the special characters
-     * into the correct ones.
-     * 
-     * @warning
-     * You should be able to treat the output of this function as
-     * completely parsed, but that's only because all other entities should
-     * have been handled previously in substituteNonSpecialEntities()
-     * 
-     * @param $string String character data to be parsed.
-     * @returns Parsed character data.
-     */
-    function parseData($string) {
-        
-        // following functions require at least one character
-        if ($string === '') return '';
-        
-        // subtracts amps that cannot possibly be escaped
-        $num_amp = substr_count($string, '&') - substr_count($string, '& ') -
-            ($string[strlen($string)-1] === '&' ? 1 : 0);
-        
-        if (!$num_amp) return $string; // abort if no entities
-        $num_esc_amp = substr_count($string, '&amp;');
-        $string = strtr($string, $this->_special_entity2str);
-        
-        // code duplication for sake of optimization, see above
-        $num_amp_2 = substr_count($string, '&') - substr_count($string, '& ') -
-            ($string[strlen($string)-1] === '&' ? 1 : 0);
-        
-        if ($num_amp_2 <= $num_esc_amp) return $string;
-        
-        // hmm... now we have some uncommon entities. Use the callback.
-        $string = $this->_entity_parser->substituteSpecialEntities($string);
-        return $string;
-    }
-    
    var $_encoder;
    
    /**
@@ -122,7 +68,7 @@ class HTMLPurifier_Lexer
     * @param $string String HTML.
     * @return HTMLPurifier_Token array representation of HTML.
     */
-    function tokenizeHTML($string, $config, &$context) {
+    function tokenizeHTML($string, $config = null) {
        trigger_error('Call to abstract class', E_USER_ERROR);
    }
    
@@ -196,7 +142,7 @@ class HTMLPurifier_Lexer
     * Takes a piece of HTML and normalizes it by converting entities, fixing
     * encoding, extracting bits, and other good stuff.
     */
-    function normalize($html, $config, &$context) {
+    function normalize($html, $config) {
        
        // extract body from document if applicable
        if ($config->get('Core', 'AcceptFullDocuments')) {
--- a/library/HTMLPurifier/Lexer/DOMLex.php
+++ b/library/HTMLPurifier/Lexer/DOMLex.php
@@ -38,9 +38,10 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
        $this->factory = new HTMLPurifier_TokenFactory();
    }
    
-    public function tokenizeHTML($string, $config, &$context) {
+    public function tokenizeHTML($string, $config = null) {
+        if (!$config) $config = HTMLPurifier_Config::createDefault();
        
-        $string = $this->normalize($string, $config, $context);
+        $string = $this->normalize($string, $config);
        
        // preprocess string, essential for UTF-8
        $string =
--- a/library/HTMLPurifier/Lexer/DirectLex.php
+++ b/library/HTMLPurifier/Lexer/DirectLex.php
@@ -12,21 +12,75 @@ require_once 'HTMLPurifier/Lexer.php';
 * completely eventually.
 * 
 * @todo Reread XML spec and document differences.
- * 
- * @todo Determine correct behavior in transforming comment data. (preserve dashes?)
+ * @todo Add support for CDATA sections.
+ * @todo Determine correct behavior in outputting comment data. (preserve dashes?)
+ * @todo Optimize main function tokenizeHTML().
+ * @todo Less than sign (<) being prohibited (even as entity) in attr-values?
 */
 class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
 {
    
+    /**
+     * Most common entity to raw value conversion table for special entities.
+     * @protected
+     */
+    var $_special_entity2str =
+            array(
+                    '&quot;' => '"',
+                    '&amp;'  => '&',
+                    '&lt;'   => '<',
+                    '&gt;'   => '>',
+                    '&#39;'  => "'",
+                    '&#039;' => "'",
+                    '&#x27;' => "'"
+            );
+    
+    /**
+     * Parses special entities into the proper characters.
+     * 
+     * This string will translate escaped versions of the special characters
+     * into the correct ones.
+     * 
+     * @warning
+     * You should be able to treat the output of this function as
+     * completely parsed, but that's only because all other entities should
+     * have been handled previously in substituteNonSpecialEntities()
+     * 
+     * @param $string String character data to be parsed.
+     * @returns Parsed character data.
+     */
+    function parseData($string) {
+        
+        // subtracts amps that cannot possibly be escaped
+        $num_amp = substr_count($string, '&') - substr_count($string, '& ') -
+            ($string[strlen($string)-1] === '&' ? 1 : 0);
+        
+        if (!$num_amp) return $string; // abort if no entities
+        $num_esc_amp = substr_count($string, '&amp;');
+        $string = strtr($string, $this->_special_entity2str);
+        
+        // code duplication for sake of optimization, see above
+        $num_amp_2 = substr_count($string, '&') - substr_count($string, '& ') -
+            ($string[strlen($string)-1] === '&' ? 1 : 0);
+        
+        if ($num_amp_2 <= $num_esc_amp) return $string;
+        
+        // hmm... now we have some uncommon entities. Use the callback.
+        $string = $this->_entity_parser->substituteSpecialEntities($string);
+        return $string;
+    }
+    
    /**
     * Whitespace characters for str(c)spn.
     * @protected
     */
    var $_whitespace = "\x20\x09\x0D\x0A";
    
-    function tokenizeHTML($html, $config, &$context) {
+    function tokenizeHTML($html, $config = null) {
        
-        $html = $this->normalize($html, $config, $context);
+        if (!$config) $config = HTMLPurifier_Config::createDefault();
+        
+        $html = $this->normalize($html, $config);
        
        $cursor = 0; // our location in the text
        $inside_tag = false; // whether or not we're parsing the inside of a tag
@@ -145,7 +199,6 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                if ($attribute_string) {
                    $attributes = $this->parseAttributeString(
                                        $attribute_string
-                                      , $config, $context
                                  );
                } else {
                    $attributes = array();
@@ -180,7 +233,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
     * @param $string Inside of tag excluding name.
     * @returns Assoc array of attributes.
     */
-    function parseAttributeString($string, $config, &$context) {
+    function parseAttributeString($string) {
        $string = (string) $string; // quick typecast
        
        if ($string == '') return array(); // no attributes
--- a/library/HTMLPurifier/Lexer/PEARSax3.php
+++ b/library/HTMLPurifier/Lexer/PEARSax3.php
@@ -18,8 +18,6 @@ require_once 'HTMLPurifier/Lexer.php';
 * whatever it does for poorly formed HTML is up to it.
 * 
 * @todo Generalize so that XML_HTMLSax is also supported.
- * 
- * @warning Entity-resolution inside attributes is broken.
 */

 class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
@@ -31,19 +29,18 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
     */
    var $tokens = array();
    
-    function tokenizeHTML($string, $config, &$context) {
+    function tokenizeHTML($string, $config = null) {
        
        $this->tokens = array();
        
-        $string = $this->normalize($string, $config, $context);
+        if (!$config) $config = HTMLPurifier_Config::createDefault();
+        $string = $this->normalize($string, $config);
        
        $parser=& new XML_HTMLSax3();
        $parser->set_object($this);
        $parser->set_element_handler('openHandler','closeHandler');
        $parser->set_data_handler('dataHandler');
        $parser->set_escape_handler('escapeHandler');
-        
-        // doesn't seem to work correctly for attributes
        $parser->set_option('XML_OPTION_ENTITIES_PARSED', 1);
        
        $parser->parse($string);
@@ -56,10 +53,6 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
     * Open tag event handler, interface is defined by PEAR package.
     */
    function openHandler(&$parser, $name, $attrs, $closed) {
-        // entities are not resolved in attrs
-        foreach ($attrs as $key => $attr) {
-            $attrs[$key] = $this->parseData($attr);
-        }
        if ($closed) {
            $this->tokens[] = new HTMLPurifier_Token_Empty($name, $attrs);
        } else {
--- a/library/HTMLPurifier/PercentEncoder.php
+++ b/library/HTMLPurifier/PercentEncoder.php
@@ -1,47 +0,0 @@
-<?php
-
-/**
- * Class that handles operations involving percent-encoding in URIs.
- */
-class HTMLPurifier_PercentEncoder
-{
-    
-    /**
-     * Fix up percent-encoding by decoding unreserved characters and normalizing
-     * @param $string String to normalize
-     */
-    function normalize($string) {
-        if ($string == '') return '';
-        $parts = explode('%', $string);
-        $ret = array_shift($parts);
-        foreach ($parts as $part) {
-            $length = strlen($part);
-            if ($length < 2) {
-                $ret .= '%25' . $part;
-                continue;
-            }
-            $encoding = substr($part, 0, 2);
-            $text     = substr($part, 2);
-            if (!ctype_xdigit($encoding)) {
-                $ret .= '%25' . $part;
-                continue;
-            }
-            $int = hexdec($encoding);
-            if (
-                ($int >= 48 && $int <= 57) || // digits
-                ($int >= 65 && $int <= 90) || // uppercase letters
-                ($int >= 97 && $int <= 122) || // lowercase letters
-                $int == 126 || $int == 45 || $int == 46 || $int == 95 // ~-._
-            ) {
-                $ret .= chr($int) . $text;
-                continue;
-            }
-            $encoding = strtoupper($encoding);
-            $ret .= '%' . $encoding . $text;
-        }
-        return $ret;
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/Printer.php
+++ b/library/HTMLPurifier/Printer.php
@@ -1,149 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/Generator.php';
-require_once 'HTMLPurifier/Token.php';
-require_once 'HTMLPurifier/Encoder.php';
-
-class HTMLPurifier_Printer
-{
-    
-    /**
-     * Instance of HTMLPurifier_Generator for HTML generation convenience funcs
-     */
-    var $generator;
-    
-    /**
-     * Instance of HTMLPurifier_Config, for easy access
-     */
-    var $config;
-    
-    /**
-     * Initialize $generator.
-     */
-    function HTMLPurifier_Printer() {
-        $this->generator = new HTMLPurifier_Generator();
-    }
-    
-    /**
-     * Main function that renders object or aspect of that object
-     * @param $config Configuration object
-     */
-    function render($config) {}
-    
-    /**
-     * Returns a start tag
-     * @param $tag Tag name
-     * @param $attr Attribute array
-     */
-    function start($tag, $attr = array()) {
-        return $this->generator->generateFromToken(
-                    new HTMLPurifier_Token_Start($tag, $attr ? $attr : array())
-               );
-    }
-    
-    /**
-     * Returns an end tag
-     * @param $tag Tag name
-     */
-    function end($tag) {
-        return $this->generator->generateFromToken(
-                    new HTMLPurifier_Token_End($tag)
-               );
-    }
-    
-    /**
-     * Prints a complete element with content inside
-     * @param $tag Tag name
-     * @param $contents Element contents
-     * @param $attr Tag attributes
-     * @param $escape Bool whether or not to escape contents
-     */
-    function element($tag, $contents, $attr = array(), $escape = true) {
-        return $this->start($tag, $attr) .
-               ($escape ? $this->escape($contents) : $contents) .
-               $this->end($tag);
-    }
-    
-    /**
-     * Prints a simple key/value row in a table.
-     * @param $name Key
-     * @param $value Value
-     */
-    function row($name, $value) {
-        if (is_bool($value)) $value = $value ? 'On' : 'Off';
-        return
-            $this->start('tr') . "\n" .
-                $this->element('th', $name) . "\n" .
-                $this->element('td', $value) . "\n" .
-            $this->end('tr')
-        ;
-    }
-    
-    /**
-     * Escapes a string for HTML output.
-     * @param $string String to escape
-     */
-    function escape($string) {
-        $string = HTMLPurifier_Encoder::cleanUTF8($string);
-        $string = htmlspecialchars($string, ENT_COMPAT, 'UTF-8');
-        return $string;
-    }
-    
-    /**
-     * Takes a list of strings and turns them into a single list
-     * @param $array List of strings
-     * @param $polite Bool whether or not to add an 'and' before the last item
-     */
-    function listify($array, $polite = false) {
-        if (empty($array)) return 'None';
-        $ret = '';
-        $i = count($array);
-        foreach ($array as $value) {
-            $i--;
-            $ret .= $value;
-            if ($i > 0 && !($polite && $i == 1)) $ret .= ', ';
-            if ($polite && $i == 1) $ret .= 'and ';
-        }
-        return $ret;
-    }
-    
-    /**
-     * Retrieves the class of an object without prefixes, as well as metadata
-     * @param $obj Object to determine class of
-     * @param $prefix Further prefix to remove
-     */
-    function getClass($obj, $sec_prefix = '') {
-        static $five = null;
-        if ($five === null) $five = version_compare(PHP_VERSION, '5', '>=');
-        $prefix = 'HTMLPurifier_' . $sec_prefix;
-        if (!$five) $prefix = strtolower($prefix);
-        $class = str_replace($prefix, '', get_class($obj));
-        $lclass = strtolower($class);
-        $class .= '(';
-        switch ($lclass) {
-            case 'enum':
-                $values = array();
-                foreach ($obj->valid_values as $value => $bool) {
-                    $values[] = $value;
-                }
-                $class .= implode(', ', $values);
-                break;
-            case 'composite':
-                $values = array();
-                foreach ($obj->defs as $def) {
-                    $values[] = $this->getClass($def, $sec_prefix);
-                }
-                $class .= implode(', ', $values);
-                break;
-            case 'multiple':
-                $class .= $this->getClass($obj->single, $sec_prefix) . ', ';
-                $class .= $obj->max;
-                break;
-        }
-        $class .= ')';
-        return $class;
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/Printer/CSSDefinition.php
+++ b/library/HTMLPurifier/Printer/CSSDefinition.php
@@ -1,40 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/Printer.php';
-
-class HTMLPurifier_Printer_CSSDefinition extends HTMLPurifier_Printer
-{
-    
-    var $def;
-    
-    function render($config) {
-        $this->def = $config->getCSSDefinition();
-        $ret = '';
-        
-        $ret .= $this->start('div', array('class' => 'HTMLPurifier_Printer'));
-        $ret .= $this->start('table');
-        
-        $ret .= $this->element('caption', 'Properties ($info)');
-        
-        $ret .= $this->start('thead');
-        $ret .= $this->start('tr');
-        $ret .= $this->element('th', 'Property', array('class' => 'heavy'));
-        $ret .= $this->element('th', 'Definition', array('class' => 'heavy', 'style' => 'width:auto;'));
-        $ret .= $this->end('tr');
-        $ret .= $this->end('thead');
-        
-        ksort($this->def->info);
-        foreach ($this->def->info as $property => $obj) {
-            $name = $this->getClass($obj, 'AttrDef_');
-            $ret .= $this->row($property, $name);
-        }
-        
-        $ret .= $this->end('table');
-        $ret .= $this->end('div');
-        
-        return $ret;
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/Printer/HTMLDefinition.php
+++ b/library/HTMLPurifier/Printer/HTMLDefinition.php
@@ -1,206 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/Printer.php';
-
-class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
-{
-    
-    /**
-     * Instance of HTMLPurifier_HTMLDefinition, for easy access
-     */
-    var $def;
-    
-    function render(&$config) {
-        $ret = '';
-        $this->config =& $config;
-        $this->def =& $config->getHTMLDefinition();
-        $def =& $this->def;
-        
-        $ret .= $this->start('div', array('class' => 'HTMLPurifier_Printer'));
-        $ret .= $this->start('table');
-        $ret .= $this->element('caption', 'Environment');
-        
-        $ret .= $this->row('Parent of fragment', $def->info_parent);
-        $ret .= $this->row('Strict mode', $def->strict);
-        if ($def->strict) $ret .= $this->row('Block wrap name', $def->info_block_wrapper);
-        
-        $ret .= $this->start('tr');
-            $ret .= $this->element('th', 'Global attributes');
-            $ret .= $this->element('td', $this->listifyAttr($def->info_global_attr),0,0);
-        $ret .= $this->end('tr');
-        
-        $ret .= $this->renderChildren($def->info_parent_def->child);
-        
-        $ret .= $this->start('tr');
-            $ret .= $this->element('th', 'Tag transforms');
-            $list = array();
-            foreach ($def->info_tag_transform as $old => $new) {
-                $new = $this->getClass($new, 'TagTransform_');
-                $list[] = "<$old> with $new";
-            }
-            $ret .= $this->element('td', $this->listify($list));
-        $ret .= $this->end('tr');
-        
-        $ret .= $this->start('tr');
-            $ret .= $this->element('th', 'Pre-AttrTransform');
-            $ret .= $this->element('td', $this->listifyObjectList($def->info_attr_transform_pre));
-        $ret .= $this->end('tr');
-        
-        $ret .= $this->start('tr');
-            $ret .= $this->element('th', 'Post-AttrTransform');
-            $ret .= $this->element('td', $this->listifyObjectList($def->info_attr_transform_post));
-        $ret .= $this->end('tr');
-        
-        $ret .= $this->end('table');
-        
-        
-        $ret .= $this->renderInfo();
-        
-        
-        $ret .= $this->end('div');
-        
-        return $ret;
-    }
-    
-    /**
-     * Renders the Elements ($info) table
-     */
-    function renderInfo() {
-        $ret = '';
-        $ret .= $this->start('table');
-        $ret .= $this->element('caption', 'Elements ($info)');
-        ksort($this->def->info);
-        $ret .= $this->start('tr');
-        $ret .= $this->element('th', 'Allowed tags', array('colspan' => 2, 'class' => 'heavy'));
-        $ret .= $this->end('tr');
-        $ret .= $this->start('tr');
-        $ret .= $this->element('td', $this->listifyTagLookup($this->def->info), array('colspan' => 2));
-        $ret .= $this->end('tr');
-        foreach ($this->def->info as $name => $def) {
-            $ret .= $this->start('tr');
-                $ret .= $this->element('th', "<$name>", array('class'=>'heavy', 'colspan' => 2));
-            $ret .= $this->end('tr');
-            $ret .= $this->start('tr');
-                $ret .= $this->element('th', 'Type');
-                $ret .= $this->element('td', ucfirst($def->type));
-            $ret .= $this->end('tr');
-            if (!empty($def->excludes)) {
-                $ret .= $this->start('tr');
-                    $ret .= $this->element('th', 'Excludes');
-                    $ret .= $this->element('td', $this->listifyTagLookup($def->excludes));
-                $ret .= $this->end('tr');
-            }
-            if (!empty($def->attr_transform_pre)) {
-                $ret .= $this->start('tr');
-                    $ret .= $this->element('th', 'Pre-AttrTransform');
-                    $ret .= $this->element('td', $this->listifyObjectList($def->attr_transform_pre));
-                $ret .= $this->end('tr');
-            }
-            if (!empty($def->attr_transform_post)) {
-                $ret .= $this->start('tr');
-                    $ret .= $this->element('th', 'Post-AttrTransform');
-                    $ret .= $this->element('td', $this->listifyObjectList($def->attr_transform_post));
-                $ret .= $this->end('tr');
-            }
-            if (!empty($def->auto_close)) {
-                $ret .= $this->start('tr');
-                    $ret .= $this->element('th', 'Auto closed by');
-                    $ret .= $this->element('td', $this->listifyTagLookup($def->auto_close));
-                $ret .= $this->end('tr');
-            }
-            $ret .= $this->start('tr');
-                $ret .= $this->element('th', 'Allowed attributes');
-                $ret .= $this->element('td',$this->listifyAttr($def->attr),0,0);
-            $ret .= $this->end('tr');
-            
-            $ret .= $this->renderChildren($def->child);
-        }
-        $ret .= $this->end('table');
-        return $ret;
-    }
-    
-    /** 
-     * Renders a row describing the allowed children of an element
-     * @param $def HTMLPurifier_ChildDef of pertinent element
-     */
-    function renderChildren($def) {
-        $context = new HTMLPurifier_Context();
-        $ret = '';
-        $ret .= $this->start('tr');
-            $elements = array();
-            $attr = array();
-            if (isset($def->elements)) {
-                if ($def->type == 'strictblockquote') $def->validateChildren(array(), $this->config, $context);
-                $elements = $def->elements;
-            } elseif ($def->type == 'chameleon') {
-                $attr['rowspan'] = 2;
-            } elseif ($def->type == 'empty') {
-                $elements = array();
-            } elseif ($def->type == 'table') {
-                $elements = array('col', 'caption', 'colgroup', 'thead',
-                    'tfoot', 'tbody', 'tr');
-            }
-            $ret .= $this->element('th', 'Allowed children', $attr);
-            
-            if ($def->type == 'chameleon') {
-                
-                $ret .= $this->element('td',
-                    '<em>Block</em>: ' .
-                    $this->escape($this->listifyTagLookup($def->block->elements)),0,0);
-                $ret .= $this->end('tr');
-                $ret .= $this->start('tr');
-                $ret .= $this->element('td',
-                    '<em>Inline</em>: ' .
-                    $this->escape($this->listifyTagLookup($def->inline->elements)),0,0);
-                
-            } else {
-                $ret .= $this->element('td',
-                    '<em>'.ucfirst($def->type).'</em>: ' .
-                    $this->escape($this->listifyTagLookup($elements)),0,0);
-            }
-        $ret .= $this->end('tr');
-        return $ret;
-    }
-    
-    /** 
-     * Listifies a tag lookup table.
-     * @param $array Tag lookup array in form of array('tagname' => true)
-     */
-    function listifyTagLookup($array) {
-        $list = array();
-        foreach ($array as $name => $discard) {
-            if ($name !== '#PCDATA' && !isset($this->def->info[$name])) continue;
-            $list[] = $name;
-        }
-        return $this->listify($list);
-    }
-    
-    /**
-     * Listifies a list of objects by retrieving class names and internal state
-     * @param $array List of objects
-     * @todo Also add information about internal state
-     */
-    function listifyObjectList($array) {
-        $list = array();
-        foreach ($array as $discard => $obj) {
-            $list[] = $this->getClass($obj, 'AttrTransform_');
-        }
-        return $this->listify($list);
-    }
-    
-    /**
-     * Listifies a hash of attributes to AttrDef classes
-     * @param $array Array hash in form of array('attrname' => HTMLPurifier_AttrDef)
-     */
-    function listifyAttr($array) {
-        $list = array();
-        foreach ($array as $name => $obj) {
-            if ($obj === false) continue;
-            $list[] = "$name&nbsp;=&nbsp;<i>" . $this->getClass($obj, 'AttrDef_') . '</i>';
-        }
-        return $this->listify($list);
-    }
-    
-}
-
-?>
--- a/library/HTMLPurifier/Strategy.php
+++ b/library/HTMLPurifier/Strategy.php
@@ -24,7 +24,7 @@ class HTMLPurifier_Strategy
     * @param $config Configuration options
     * @returns Processed array of token objects.
     */
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config = null) {
        trigger_error('Cannot call abstract function', E_USER_ERROR);
    }
    
--- a/library/HTMLPurifier/Strategy/Composite.php
+++ b/library/HTMLPurifier/Strategy/Composite.php
@@ -18,9 +18,9 @@ class HTMLPurifier_Strategy_Composite extends HTMLPurifier_Strategy
        trigger_error('Attempt to instantiate abstract object', E_USER_ERROR);
    }
    
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config) {
        foreach ($this->strategies as $strategy) {
-            $tokens = $strategy->execute($tokens, $config, $context);
+            $tokens = $strategy->execute($tokens, $config);
        }
        return $tokens;
    }
--- a/library/HTMLPurifier/Strategy/FixNesting.php
+++ b/library/HTMLPurifier/Strategy/FixNesting.php
@@ -34,7 +34,8 @@ require_once 'HTMLPurifier/HTMLDefinition.php';
 class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
 {
    
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config) {
+        
        //####################################################################//
        // Pre-processing
        
@@ -48,10 +49,6 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
        array_unshift($tokens, new HTMLPurifier_Token_Start($parent_name));
        $tokens[] = new HTMLPurifier_Token_End($parent_name);
        
-        // setup the context variables
-        $parent_type = 'unknown'; // reference var that we alter
-        $context->register('ParentType', $parent_type);
-        
        //####################################################################//
        // Loop initialization
        
@@ -104,11 +101,7 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            if ($count = count($stack)) {
                $parent_index = $stack[$count-1];
                $parent_name  = $tokens[$parent_index]->name;
-                if ($parent_index == 0) {
-                    $parent_def   = $definition->info_parent_def;
-                } else {
                $parent_def   = $definition->info[$parent_name];
-                }
            } else {
                // unknown info, it won't be used anyway
                $parent_index = $parent_name = $parent_def = null;
@@ -116,10 +109,10 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            
            // calculate context
            if (isset($parent_def)) {
-                $parent_type = $parent_def->type;
+                $context = $parent_def->type;
            } else {
                // generally found in specialized elements like UL
-                $parent_type = 'unknown';
+                $context = 'unknown';
            }
            
            //################################################################//
@@ -145,22 +138,14 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            if ($excluded) {
                // there is an exclusion, remove the entire node
                $result = false;
-                $excludes = array(); // not used, but good to initialize anyway
            } else {
                // DEFINITION CALL
-                if ($i === 0) {
-                    // special processing for the first node
-                    $def = $definition->info_parent_def;
-                } else {
                $def = $definition->info[$tokens[$i]->name];
-                    
-                }
-                
                $child_def = $def->child;
                
                // have DTD child def validate children
                $result = $child_def->validateChildren(
-                    $child_tokens, $config, $context);
+                    $child_tokens, $config,$context);
                
                // determine whether or not this element has any exclusions
                $excludes = $def->excludes;
@@ -240,20 +225,13 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            
            // Test if the token indeed is a start tag, if not, move forward
            // and test again.
-            $size = count($tokens);
            while ($i < $size and $tokens[$i]->type != 'start') {
                if ($tokens[$i]->type == 'end') {
                    // pop a token index off the stack if we ended a node
                    array_pop($stack);
                    // pop an exclusion lookup off exclusion stack if
                    // we ended node and that node had exclusions
-                    if ($i == 0 || $i == $size - 1) {
-                        // use specialized var if it's the super-parent
-                        $s_excludes = $definition->info_parent_def->excludes;
-                    } else {
-                        $s_excludes = $definition->info[$tokens[$i]->name]->excludes;
-                    }
-                    if ($s_excludes) {
+                    if ($definition->info[$tokens[$i]->name]->excludes) {
                        array_pop($exclude_stack);
                    }
                }
@@ -269,9 +247,6 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
        array_shift($tokens);
        array_pop($tokens);
        
-        // remove context variables
-        $context->destroy('ParentType');
-        
        //####################################################################//
        // Return
        
--- a/library/HTMLPurifier/Strategy/MakeWellFormed.php
+++ b/library/HTMLPurifier/Strategy/MakeWellFormed.php
@@ -10,7 +10,7 @@ require_once 'HTMLPurifier/Generator.php';
 class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
 {
    
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config) {
        $definition = $config->getHTMLDefinition();
        $generator = new HTMLPurifier_Generator();
        $result = array();
@@ -86,7 +86,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            if (empty($current_nesting)) {
                if ($escape_invalid_tags) {
                    $result[] = new HTMLPurifier_Token_Text(
-                        $generator->generateFromToken($token, $config, $context)
+                        $generator->generateFromToken($token, $config)
                    );
                }
                continue;
@@ -123,7 +123,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            if ($skipped_tags === false) {
                if ($escape_invalid_tags) {
                    $result[] = new HTMLPurifier_Token_Text(
-                        $generator->generateFromToken($token, $config, $context)
+                        $generator->generateFromToken($token, $config)
                    );
                }
                continue;
--- a/library/HTMLPurifier/Strategy/RemoveForeignElements.php
+++ b/library/HTMLPurifier/Strategy/RemoveForeignElements.php
@@ -5,14 +5,6 @@ require_once 'HTMLPurifier/HTMLDefinition.php';
 require_once 'HTMLPurifier/Generator.php';
 require_once 'HTMLPurifier/TagTransform.php';

-HTMLPurifier_ConfigSchema::define(
-    'Core', 'RemoveInvalidImg', true, 'bool',
-    'This directive enables pre-emptive URI checking in <code>img</code> '.
-    'tags, as the attribute validation strategy is not authorized to '.
-    'remove elements from the document.  This directive has been available '.
-    'since 1.3.0, revert to pre-1.3.0 behavior by setting to false.'
-);
-
 /**
 * Removes all unrecognized tags from the list of tokens.
 * 
@@ -24,7 +16,7 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
 {
    
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config) {
        $definition = $config->getHTMLDefinition();
        $generator = new HTMLPurifier_Generator();
        $result = array();
@@ -33,23 +25,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
            if (!empty( $token->is_tag )) {
                // DEFINITION CALL
                if (isset($definition->info[$token->name])) {
-                    // leave untouched, except for a few special cases:
-                    
-                    // hard-coded image special case, pre-emptively drop
-                    // if not available. Probably not abstract-able
-                    if ( $token->name == 'img' ) {
-                        if (!isset($token->attr['src'])) continue;
-                        if (!isset($definition->info['img']->attr['src'])) {
-                            continue;
-                        }
-                        $token->attr['src'] =
-                            $definition->
-                                info['img']->
-                                    attr['src']->
-                                        validate($token->attr['src']);
-                        if ($token->attr['src'] === false) continue;
-                    }
-                    
+                    // leave untouched
                } elseif (
                    isset($definition->info_tag_transform[$token->name])
                ) {
@@ -57,11 +33,11 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
                    // DEFINITION CALL
                    $token = $definition->
                                info_tag_transform[$token->name]->
-                                    transform($token, $config, $context);
+                                    transform($token);
                } elseif ($escape_invalid_tags) {
                    // invalid tag, generate HTML and insert in
                    $token = new HTMLPurifier_Token_Text(
-                        $generator->generateFromToken($token, $config, $context)
+                        $generator->generateFromToken($token, $config)
                    );
                } else {
                    continue;
--- a/library/HTMLPurifier/Strategy/ValidateAttributes.php
+++ b/library/HTMLPurifier/Strategy/ValidateAttributes.php
@@ -3,6 +3,8 @@
 require_once 'HTMLPurifier/Strategy.php';
 require_once 'HTMLPurifier/HTMLDefinition.php';
 require_once 'HTMLPurifier/IDAccumulator.php';
+require_once 'HTMLPurifier/ConfigSchema.php';
+require_once 'HTMLPurifier/AttrContext.php';

 HTMLPurifier_ConfigSchema::define(
    'Attr', 'IDBlacklist', array(), 'list',
@@ -15,14 +17,18 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
 {
    
-    function execute($tokens, $config, &$context) {
+    function execute($tokens, $config) {
        
        $definition = $config->getHTMLDefinition();
        
-        // setup id_accumulator context
-        $id_accumulator = new HTMLPurifier_IDAccumulator();
-        $id_accumulator->load($config->get('Attr', 'IDBlacklist'));
-        $context->register('IDAccumulator', $id_accumulator);
+        // setup StrategyContext
+        $context = new HTMLPurifier_AttrContext();
+        
+        // setup ID accumulator and load it with blacklisted IDs
+        //     eventually, we'll have a dedicated context object to hold
+        //     all these accumulators and caches. For now, just an IDAccumulator
+        $context->id_accumulator = new HTMLPurifier_IDAccumulator();
+        $context->id_accumulator->load($config->get('Attr', 'IDBlacklist'));
        
        // create alias to global definition array, see also $defs
        // DEFINITION CALL
@@ -38,17 +44,19 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
            $attr = $token->attributes;
            
            // do global transformations (pre)
-            // nothing currently utilizes this
+            // ex. <ELEMENT lang="fr"> to <ELEMENT lang="fr" xml:lang="fr">
+            // DEFINITION CALL
            foreach ($definition->info_attr_transform_pre as $transform) {
-                $attr = $transform->transform($attr, $config, $context);
+                $attr = $transform->transform($attr, $config);
            }
            
            // do local transformations only applicable to this element (pre)
            // ex. <p align="right"> to <p style="text-align:right;">
+            // DEFINITION CALL
            foreach ($definition->info[$token->name]->attr_transform_pre
                as $transform
            ) {
-                $attr = $transform->transform($attr, $config, $context);
+                $attr = $transform->transform($attr, $config);
            }
            
            // create alias to this element's attribute definition array, see
@@ -104,23 +112,17 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
            }
            
            // post transforms
-            
-            // ex. <x lang="fr"> to <x lang="fr" xml:lang="fr">
            foreach ($definition->info_attr_transform_post as $transform) {
-                $attr = $transform->transform($attr, $config, $context);
+                $attr = $transform->transform($attr, $config);
            }
-            
-            // ex. <bdo> to <bdo dir="ltr">
            foreach ($definition->info[$token->name]->attr_transform_post as $transform) {
-                $attr = $transform->transform($attr, $config, $context);
+                $attr = $transform->transform($attr, $config);
            }
            
            // commit changes
            // could interfere with flyweight implementation
            $tokens[$key]->attributes = $attr;
        }
-        $context->destroy('IDAccumulator');
-        
        return $tokens;
    }
    
--- a/library/HTMLPurifier/TagTransform.php
+++ b/library/HTMLPurifier/TagTransform.php
@@ -17,10 +17,8 @@ class HTMLPurifier_TagTransform
    /**
     * Transforms the obsolete tag into the valid tag.
     * @param $tag Tag to be transformed.
-     * @param $config Mandatory HTMLPurifier_Config object
-     * @param $context Mandatory HTMLPurifier_Context object
     */
-    function transform($tag, $config, &$context) {
+    function transform($tag) {
        trigger_error('Call to abstract function', E_USER_ERROR);
    }
    
@@ -39,7 +37,7 @@ class HTMLPurifier_TagTransform_Simple extends HTMLPurifier_TagTransform
        $this->transform_to = $transform_to;
    }
    
-    function transform($tag, $config, &$context) {
+    function transform($tag) {
        $new_tag = $tag->copy();
        $new_tag->name = $this->transform_to;
        return $new_tag;
@@ -57,7 +55,7 @@ class HTMLPurifier_TagTransform_Center extends HTMLPurifier_TagTransform
 {
    var $transform_to = 'div';
    
-    function transform($tag, $config, &$context) {
+    function transform($tag) {
        if ($tag->type == 'end') {
            $new_tag = new HTMLPurifier_Token_End($this->transform_to);
            return $new_tag;
@@ -108,7 +106,7 @@ class HTMLPurifier_TagTransform_Font extends HTMLPurifier_TagTransform
        '+4' => '300%'
    );
    
-    function transform($tag, $config, &$context) {
+    function transform($tag) {
        
        if ($tag->type == 'end') {
            $new_tag = new HTMLPurifier_Token_End($this->transform_to);
--- a/library/HTMLPurifier/URIScheme.php
+++ b/library/HTMLPurifier/URIScheme.php
@@ -12,13 +12,6 @@ class HTMLPurifier_URIScheme
     */
    var $default_port = null;
    
-    /**
-     * Whether or not URIs of this schem are locatable by a browser
-     * http and ftp are accessible, while mailto and news are not.
-     * @public
-     */
-    var $browsable = false;
-    
    /**
     * Validates the components of a URI
     * @note This implementation should be called by children if they define
@@ -30,10 +23,9 @@ class HTMLPurifier_URIScheme
     * @param $path Path of URI
     * @param $query Query of URI, found after question mark
     * @param $config HTMLPurifier_Config object
-     * @param $context HTMLPurifier_Context object
     */
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        if ($this->default_port == $port) $port = null;
        return array($userinfo, $host, $port, $path, $query);
--- a/library/HTMLPurifier/URIScheme/ftp.php
+++ b/library/HTMLPurifier/URIScheme/ftp.php
@@ -4,39 +4,19 @@ require_once 'HTMLPurifier/URIScheme.php';

 /**
 * Validates ftp (File Transfer Protocol) URIs as defined by generic RFC 1738.
+ * @todo Typecode check on path
 */
 class HTMLPurifier_URIScheme_ftp extends HTMLPurifier_URIScheme {
    
    var $default_port = 21;
-    var $browsable = true; // usually
    
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        list($userinfo, $host, $port, $path, $query) = 
            parent::validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context );
-        $semicolon_pos = strrpos($path, ';'); // reverse
-        if ($semicolon_pos !== false) {
-            // typecode check
-            $type = substr($path, $semicolon_pos + 1); // no semicolon
-            $path = substr($path, 0, $semicolon_pos);
-            $type_ret = '';
-            if (strpos($type, '=') !== false) {
-                // figure out whether or not the declaration is correct
-                list($key, $typecode) = explode('=', $type, 2);
-                if ($key !== 'type') {
-                    // invalid key, tack it back on encoded
-                    $path .= '%3B' . $type;
-                } elseif ($typecode === 'a' || $typecode === 'i' || $typecode === 'd') {
-                    $type_ret = ";type=$typecode";
-                }
-            } else {
-                $path .= '%3B' . $type;
-            }
-            $path = str_replace(';', '%3B', $path);
-            $path .= $type_ret;
-        }
+                $userinfo, $host, $port, $path, $query, $config );
+        // typecode check needed on path
        return array($userinfo, $host, $port, $path, null);
    }
    
--- a/library/HTMLPurifier/URIScheme/http.php
+++ b/library/HTMLPurifier/URIScheme/http.php
@@ -8,14 +8,13 @@ require_once 'HTMLPurifier/URIScheme.php';
 class HTMLPurifier_URIScheme_http extends HTMLPurifier_URIScheme {
    
    var $default_port = 80;
-    var $browsable = true;
    
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        list($userinfo, $host, $port, $path, $query) = 
            parent::validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context );
+                $userinfo, $host, $port, $path, $query, $config );
        return array(null, $host, $port, $path, $query);
    }
    
--- a/library/HTMLPurifier/URIScheme/mailto.php
+++ b/library/HTMLPurifier/URIScheme/mailto.php
@@ -13,14 +13,12 @@ require_once 'HTMLPurifier/URIScheme.php';

 class HTMLPurifier_URIScheme_mailto extends HTMLPurifier_URIScheme {
    
-    var $browsable = false;
-    
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        list($userinfo, $host, $port, $path, $query) = 
            parent::validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context );
+                $userinfo, $host, $port, $path, $query, $config );
        // we need to validate path against RFC 2368's addr-spec
        return array(null, null, null, $path, $query);
    }
--- a/library/HTMLPurifier/URIScheme/news.php
+++ b/library/HTMLPurifier/URIScheme/news.php
@@ -7,14 +7,12 @@ require_once 'HTMLPurifier/URIScheme.php';
 */
 class HTMLPurifier_URIScheme_news extends HTMLPurifier_URIScheme {
    
-    var $browsable = false;
-    
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        list($userinfo, $host, $port, $path, $query) = 
            parent::validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context );
+                $userinfo, $host, $port, $path, $query, $config );
        // typecode check needed on path
        return array(null, null, null, $path, null);
    }
--- a/library/HTMLPurifier/URIScheme/nntp.php
+++ b/library/HTMLPurifier/URIScheme/nntp.php
@@ -8,14 +8,13 @@ require_once 'HTMLPurifier/URIScheme.php';
 class HTMLPurifier_URIScheme_nntp extends HTMLPurifier_URIScheme {
    
    var $default_port = 119;
-    var $browsable = false;
    
    function validateComponents(
-        $userinfo, $host, $port, $path, $query, $config, &$context
+        $userinfo, $host, $port, $path, $query, $config
    ) {
        list($userinfo, $host, $port, $path, $query) = 
            parent::validateComponents(
-                $userinfo, $host, $port, $path, $query, $config, $context );
+                $userinfo, $host, $port, $path, $query, $config );
        return array(null, $host, $port, $path, null);
    }
    
--- a/library/HTMLPurifier/URISchemeRegistry.php
+++ b/library/HTMLPurifier/URISchemeRegistry.php
@@ -63,9 +63,8 @@ class HTMLPurifier_URISchemeRegistry
     * Retrieves a scheme validator object
     * @param $scheme String scheme name like http or mailto
     * @param $config HTMLPurifier_Config object
-     * @param $config HTMLPurifier_Context object
     */
-    function &getScheme($scheme, $config, &$context) {
+    function &getScheme($scheme, $config = null) {
        if (!$config) $config = HTMLPurifier_Config::createDefault();
        $null = null; // for the sake of passing by reference
        
--- a/maintenance/generate-entity-file.php
+++ b/maintenance/generate-entity-file.php
@@ -13,7 +13,7 @@ chdir( dirname(__FILE__) );
 $entity_dir = '../docs/entities/';

 // defines the output file for the serialized content.
-$output_file = '../library/HTMLPurifier/EntityLookup/entities.ser';
+$output_file = '../library/HTMLPurifier/EntityLookup/data.txt';

 // courtesy of a PHP manual comment
 function unichr($dec) {
--- a/plugins/modx.txt
+++ b/plugins/modx.txt
@@ -1,91 +0,0 @@
-
-MODx Plugin
-
-MODx <http://www.modxcms.com/> is an open source PHP application framework.  
-I first came across them in my referrer logs when tillda asked if anyone
-could implement an HTML Purifier plugin.  This forum thread
-<http://modxcms.com/forums/index.php/topic,6604.0.html> eventually resulted
-in the fruition of this plugin that davidm says, "is on top of my favorite
-list."  HTML Purifier goes great with WYSIWYG editors!
-
-
-
-1. Credits
-
-PaulGregory wrote the overall structure of the code.  I added the
-slashes hack.
-
-
-
-2. Install
-
-First, you need to place HTML Purifier library somewhere.  The code here
-assumes that you've placed in MODx's assets/plugins/htmlpurifier (no version
-number).
-
-Log into the manager, and navigate:
-
-Resources > Manage Resources > Plugins tab > New Plugin
-
-Type in a name (probably HTML Purifier), and copy paste this code into the
-textarea:
-
--------------------------------------------------------------------------------
-$e = &$modx->Event;
-if ($e->name == 'OnBeforeDocFormSave') {
-    global $content;
-    
-    set_include_path('../assets/plugins/htmlpurifier/library/'
-       . PATH_SEPARATOR . get_include_path());
-    include_once 'HTMLPurifier.php';
-    $purifier = new HTMLPurifier();
-    
-    static $magic_quotes = null;
-    if ($magic_quotes === null) {
-        // this is an ugly hack because this hook hasn't
-        // had the backslashes removed yet when magic_quotes_gpc is on,
-        // but HTMLPurifier must not have the quotes slashed.
-        $magic_quotes = get_magic_quotes_gpc();
-    }
-    
-    if ($magic_quotes) $content = stripslashes($content);
-    $content = $purifier->purify($content);
-    if ($magic_quotes) $content = addslashes($content);
-}
--------------------------------------------------------------------------------
-
-Then navigate to the System Events tab and check "OnBeforeDocFormSave".
-Save the plugin.  HTML Purifier now is integrated!
-
-
-
-3. Making sure it works
-
-You can test HTML Purifier by deliberately putting in crappy HTML and seeing
-whether or not it gets fixed.  A better way is to put in something like this:
-
-<p lang="fr">Il est bon</p>
-
-...and seeing whether or not the content comes out as:
-
-<p lang="fr" xml:lang="fr">Il est bon</p>
-
-(lang to xml:lang synchronization is one of the many features HTML Purifier
-has).
-
-
-
-4. Caveat Emptor
-
-This code does not intercept save requests from the QuickEdit plugin, this may
-be added in a later version.  It also modifies things on save, so there's a
-slight chance that HTML Purifier may make a boo-boo and accidently mess things
-up (the original version is not saved).
-
-Finally, make sure that MODx is using UTF-8.  If you are using, say, a French
-localisation, you may be using Latin-1, if that's the case, configure
-HTML Purifier properly like this:
-
-$config = HTMLPurifier_Config::createDefault();
-$config->set('Core', 'Encoding', 'ISO-8859-1'); // or whatever encoding
-$purifier = new HTMLPurifier($config);
--- a/smoketests/common.php
+++ b/smoketests/common.php
@@ -2,7 +2,8 @@

 header('Content-type: text/html; charset=UTF-8');

-require_once '../library/HTMLPurifier.auto.php';
+set_include_path('../library' . PATH_SEPARATOR . get_include_path());
+require_once 'HTMLPurifier.php';

 function escapeHTML($string) {
    $string = HTMLPurifier_Encoder::cleanUTF8($string);
--- a/smoketests/printDefinition.php
+++ b/smoketests/printDefinition.php
@@ -1,137 +0,0 @@
-<?php
-
-require_once 'common.php'; // load library
-
-require_once 'HTMLPurifier/Printer/HTMLDefinition.php';
-require_once 'HTMLPurifier/Printer/CSSDefinition.php';
-
-$config = HTMLPurifier_Config::createDefault();
-
-// you can do custom configuration!
-if (file_exists('printDefinition.settings.php')) {
-    include 'printDefinition.settings.php';
-}
-
-$get = $_GET;
-foreach ($_GET as $key => $value) {
-    if (!strncmp($key, 'Null_', 5) && !empty($value)) {
-        unset($get[substr($key, 5)]);
-        unset($get[$key]);
-    }
-}
-
-@$config->loadArray($get);
-
-$printer_html_definition = new HTMLPurifier_Printer_HTMLDefinition();
-$printer_css_definition  = new HTMLPurifier_Printer_CSSDefinition();
-
-echo '<?xml version="1.0" encoding="UTF-8" ?>';
-?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
-<head>
-    <title>HTML Purifier Printer Smoketest</title>
-    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-    <style type="text/css">
-        form table {margin:1em auto;}
-        form th {text-align:right;padding-right:1em;}
-        .HTMLPurifier_Printer table {border-collapse:collapse;
-            border:1px solid #000; width:600px;
-            margin:1em auto;font-family:sans-serif;font-size:75%;}
-        .HTMLPurifier_Printer td, .HTMLPurifier_Printer th {padding:3px;
-            border:1px solid #000;background:#CCC; vertical-align: baseline;}
-        .HTMLPurifier_Printer th {text-align:left;background:#CCF;width:20%;}
-        .HTMLPurifier_Printer caption {font-size:1.5em; font-weight:bold;
-            width:100%;}
-        .HTMLPurifier_Printer .heavy {background:#99C;text-align:center;}
-    </style>
-    <script type="text/javascript">
-        function toggleWriteability(id_of_patient, checked) {
-            document.getElementById(id_of_patient).disabled = checked;
-        }
-    </script>
-</head>
-<body>
-<h1>HTML Purifier Printer Smoketest</h1>
-<p>This page will allow you to see precisely what HTML Purifier's internal
-whitelist is. You can
-also twiddle with the configuration settings to see how a directive
-influences the internal workings of the definition objects.</p>
-<h2>Modify configuration</h2>
-
-<p>You can specify an array by typing in a comma-separated
-list of items, HTML Purifier will take care of the rest (including
-transformation into a real array list or a lookup table). If a
-directive can be set to null, that usually means that the feature
-is disabled when it is null (not that, say, no tags are allowed).</p>
-
-<form id="edit-config" method="get" action="printDefinition.php">
-<table>
-<?php
-    $directives = $config->getBatch('HTML');
-    // can't handle hashes
-    foreach ($directives as $key => $value) {
-        $directive = "HTML.$key";
-        if (is_array($value)) {
-            $keys = array_keys($value);
-            if ($keys === array_keys($keys)) {
-                $value = implode(',', $keys);
-            } else {
-                $new_value = '';
-                foreach ($value as $name => $bool) {
-                    if ($bool !== true) continue;
-                    $new_value .= "$name,";
-                }
-                $value = rtrim($new_value, ',');
-            }
-        }
-        $allow_null = $config->def->info['HTML'][$key]->allow_null;
-?>
-<tr>
-<th>
-    <a href="http://hp.jpsband.org/live/configdoc/plain.html#<?php echo $directive ?>">
-        %<?php echo $directive; ?>
-    </a>
-</th>
-<td>
-<?php if (is_bool($value)) { ?>
-    Yes <input type="radio" name="<?php echo $directive; ?>" value="1"<?php if ($value) { ?> checked="checked"<?php } ?> /> &nbsp;
-    No <input type="radio" name="<?php echo $directive; ?>" value="0"<?php if (!$value) { ?> checked="checked"<?php } ?> />
-<?php } else { ?>
-    <?php if($allow_null) { ?>
-        Null/Disabled <input
-                type="checkbox"
-                value="1"
-                onclick="toggleWriteability('<?php echo $directive ?>',checked)"
-                name="Null_<?php echo $directive; ?>"
-                <?php if ($value === null) { ?> checked="checked"<?php } ?>
-              /> or <br />
-    <?php } ?>
-    <input
-        type="text"
-        id="<?php echo $directive; ?>"
-        name="<?php echo $directive; ?>"
-        value="<?php echo escapeHTML($value); ?>"
-        <?php if($value === null) {echo 'disabled="disabled"';} ?>
-    />
-<?php } ?>
-</td>
-</tr>
-<?php
-    }
-?>
-<tr>
-    <td colspan="2" style="text-align:right;">
-        [<a href="printDefinition.php">Reset</a>]
-        <input type="submit" value="Submit" />
-    </td>
-</tr>
-</table>
-</form>
-<h2>HTMLDefinition</h2>
-<?php echo $printer_html_definition->render($config) ?>
-<h2>CSSDefinition</h2>
-<?php echo $printer_css_definition->render($config) ?>
-</body>
-</html>
--- a/smoketests/utf8.php
+++ b/smoketests/utf8.php
@@ -2,17 +2,16 @@

 require_once 'common.php';

-echo '<?xml version="1.0" encoding="UTF-8" ?>';
 ?><!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-    <title>HTML Purifier UTF-8 Smoketest</title>
-    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+<title>HTMLPurifier UTF-8 Smoketest</title>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTML Purifier UTF-8 Smoketest</h1>
+<h1>HTMLPurifier UTF-8 Smoketest</h1>
 <?php

 $purifier = new HTMLPurifier();
--- a/smoketests/variableWidthAttack.php
+++ b/smoketests/variableWidthAttack.php
@@ -2,17 +2,16 @@

 require_once 'common.php';

-echo '<?xml version="1.0" encoding="UTF-8" ?>';
 ?><!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-    <title>HTML Purifier Variable Width Attack Smoketest</title>
-    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+<title>HTMLPurifier Variable Width Attack Smoketest</title>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
 <body>
-<h1>HTML Purifier Variable Width Attack Smoketest</h1>
+<h1>HTMLPurifier Variable Width Attack Smoketest</h1>
 <p>For more information, see
 <a href="http://applesoup.googlepages.com/bypass_filter.txt">Cheng Peng Su's
 original advisory.</a>  This particular exploit code appears only to work
--- a/smoketests/xssAttacks.php
+++ b/smoketests/xssAttacks.php
@@ -2,89 +2,49 @@

 require_once('common.php');

-function formatCode($string) {
-    return 
-        str_replace(
-            array("\t", '»', '\0(null)'),
-            array('<strong>\t</strong>', '<span class="linebreak">»</span>', '<strong>\0</strong>'),
-            escapeHTML(
-                str_replace("\0", '\0(null)',
-                    wordwrap($string, 28, " »\n", true)
-                )
-            )
-        );
-}
-
 ?><!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 <html>
 <head>
-    <title>HTML Purifier XSS Attacks Smoketest</title>
+    <title>HTMLPurifier XSS Attacks Smoketest</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
-    <style type="text/css">
-        .scroll {overflow:auto; width:100%;}
-        .even {background:#EAEAEA;}
-        thead th {border-bottom:1px solid #000;}
-        pre strong {color:#00C;}
-        pre .linebreak {color:#AAA;font-weight:100;}
-    </style>
 </head>
 <body>
-<h1>HTML Purifier XSS Attacks Smoketest</h1>
+<h1>HTMLPurifier XSS Attacks Smoketest</h1>
 <p>XSS attacks are from
 <a href="http://ha.ckers.org/xss.html">http://ha.ckers.org/xss.html</a>.</p>
-<p><strong>Caveats:</strong>
-<tt>Google.com</tt> has been programatically disallowed, but as you can
-see, there are ways of getting around that, so coverage in this area
-is not complete. Most XSS broadcasts its presence by spawning an alert dialogue.
-The displayed code is not strictly correct, as linebreaks have been forced for
-readability. Linewraps have been marked with <tt>»</tt>.  Some tests are
-omitted for your convenience. Not all control characters are displayed.</p>
-
+<p>The last segment of tests regarding blacklisted websites is not
+applicable at the moment, but when we add that functionality they'll be
+relevant.</p>
+<p>Most of the XSS broadcasts its presence by spawning an alert dialogue.</p>
 <h2>Test</h2>
 <?php

 if (version_compare(PHP_VERSION, '5', '<')) exit('<p>Requires PHP 5.</p>');

 $xml = simplexml_load_file('xssAttacks.xml');
-
-// programatically disallow google.com for URI evasion tests
-// not complete
-$config = HTMLPurifier_Config::createDefault();
-$config->set('URI', 'HostBlacklist', array('google.com'));
-$purifier = new HTMLPurifier($config);
+$purifier = new HTMLPurifier();

 ?>
-<table cellspacing="0" cellpadding="2">
+<!-- form is used so that we can use textareas and stay valid -->
+<form method="post" action="xssAttacks.php">
+<table>
 <thead><tr><th>Name</th><th width="30%">Raw</th><th>Output</th><th>Render</th></tr></thead>
 <tbody>
 <?php

-$i = 0;
 foreach ($xml->attack as $attack) {
    $code = $attack->code;
-    
-    // custom code for null byte injection tests
-    if (substr($code, 0, 7) == 'perl -e') {
-        $code = substr($code, $i=strpos($code, '"')+1, strrpos($code, '"') - $i);
-        $code = str_replace('\0', "\0", $code);
-    }
-    
-    // disable vectors we cannot test in any meaningful way
-    if ($code == 'See Below') continue; // event handlers, whitelist defeats
-    if ($attack->name == 'OBJECT w/Flash 2') continue; // requires ActionScript
-    if ($attack->name == 'IMG Embedded commands 2') continue; // is an HTTP response
-    
    // custom code for US-ASCII, which couldn't be expressed in XML without encoding
    if ($attack->name == 'US-ASCII encoding') $code = urldecode($code);
 ?>
-    <tr<?php if ($i++ % 2) {echo ' class="even"';} ?>>
+    <tr>
        <td><?php echo escapeHTML($attack->name); ?></td>
-        <td><pre><?php echo formatCode($code); ?></pre></td>
+        <td><textarea readonly="readonly" cols="20" rows="2"><?php echo escapeHTML($code); ?></textarea></td>
        <?php $pure_html = $purifier->purify($code); ?>
-        <td><pre><?php echo formatCode($pure_html); ?></pre></td>
-        <td><div class="scroll"><?php echo $pure_html ?></div></td>
+        <td><textarea readonly="readonly" cols="20" rows="2"><?php echo escapeHTML($pure_html); ?></textarea></td>
+        <td><?php echo $pure_html ?></td>
    </tr>
 <?php
 }
@@ -92,5 +52,6 @@ foreach ($xml->attack as $attack) {
 ?>
 </tbody>
 </table>
+</form>
 </body>
 </html>
--- a/smoketests/xssAttacks.xml
+++ b/smoketests/xssAttacks.xml
Author	SHA1	Message	Date
Edward Z. Yang	6ef8abd04f	Released 1.1.1. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.1@459 48356398-32a2-884e-a903-53898d9a118a	2006-09-24 22:32:26 +00:00
Edward Z. Yang	bc5871f389	Merged 438:439, 440:441, and 442:457 from trunk/ to branches/1.1/, mostly major work done for 1.1.1 release. - Various documentation updates - Fixed fatal error in benchmark scripts, slightly augmented - As far as possible, whitespace is preserved in-between table children - Configuration option to optionally Tidy up output for indentation to make up for dropped whitespace by DOMLex (pretty-printing for the entire application should be done by a page-wide Tidy) - Sample test-settings.php file included Unrelated unmerged edit: removed irrelevant 1.2.0 release notes, those only exist in the trunk. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.1@458 48356398-32a2-884e-a903-53898d9a118a	2006-09-24 22:22:06 +00:00
Edward Z. Yang	30d75c999d	Merged r434:436 from trunk/ to branches/1.1 - Update documentation. - Update TODO. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.1@437 48356398-32a2-884e-a903-53898d9a118a	2006-09-17 22:08:48 +00:00
Edward Z. Yang	64d8ca9831	Branch out 1.1 release set. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/1.1@426 48356398-32a2-884e-a903-53898d9a118a	2006-09-17 00:19:47 +00:00