Release 2.1.5

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1814 48356398-32a2-884e-a903-53898d9a118a
Fix PHP 4.3.9/10 bug with float handling
2025-08-05 13:47:24 +02:00 · 2008-06-19 22:57:15 +00:00 · 2008-06-19 21:13:56 +00:00 · 2008-06-19 19:58:53 +00:00 · 2008-06-17 03:18:23 +00:00 · 2008-06-11 19:01:22 +00:00
126 changed files with 7326 additions and 674 deletions
--- a/1101
+++ b/1101
--- a/227
+++ b/227
@@ -2,32 +2,82 @@
 Install
    How to install HTML Purifier

-HTML Purifier is designed to run out of the box,  so actually using the library
-is extremely easy. (Although, if you were looking for a step-by-step
-installation GUI, you've come to the wrong place!)  The impatient can scroll
-down to the bottom of this INSTALL document to see the code, but you really
-should make sure a few things are properly done.
+HTML Purifier is designed to run out of the box, so actually using the 
+library is extremely easy.  (Although... if you were looking for a 
+step-by-step installation GUI, you've downloaded the wrong software!)
+
+While the impatient can get going immediately with some of the sample
+code at the bottom of this document, it's well worth performing some
+basic sanity checks to get the most out of this library.


+---------------------------------------------------------------------------
 1.  Compatibility

-HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has no
-core dependencies with other libraries.
+THIS IS A DEPRECATED PHP4 VERSION OF HTML PURIFIER.

-Optional extensions are iconv (usually installed) and tidy (also common).
-If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
-not having either of these extensions.
+If you are running PHP5, please go to http://htmlpurifier.org to download
+the latest version. This version of HTML Purifier is only actively tested
+from PHP 4.3.7 to PHP 5.0.5. Essential security fixes will be issued for
+the PHP 4 version until August 8, 2008.
+
+These optional extensions can enhance the capabilities of HTML Purifier:
+
+    * iconv  : Converts text to and from non-UTF-8 encodings
+    * bcmath : Used for unit conversion and imagecrash protection
+    * tidy   : Used for pretty-printing HTML


+---------------------------------------------------------------------------
+2.  Reconnaissance

-2.  Including the library
+A big plus of HTML Purifier is its inerrant support of standards, so
+your web-pages should be standards-compliant.  (They should also use
+semantic markup, but that's another issue altogether, one HTML Purifier
+cannot fix without reading your mind.)

-Simply use:
+HTML Purifier can process these doctypes:
+
+* XHTML 1.0 Transitional (default)
+* XHTML 1.0 Strict
+* HTML 4.01 Transitional
+* HTML 4.01 Strict
+* XHTML 1.1
+
+...and these character encodings:
+
+* UTF-8 (default)
+* Any encoding iconv supports (with crippled internationalization support)
+
+These defaults reflect what my choices would be if I were authoring an
+HTML document; however, what you choose depends on the nature of your
+codebase.  If you don't know what doctype you are using, you can determine
+the doctype from this identifier at the top of your source code:
+
+    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+
+...and the character encoding from this code:
+
+    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
+
+If the character encoding declaration is missing, STOP NOW, and
+read 'docs/enduser-utf8.html' (web accessible at
+http://htmlpurifier.org/docs/enduser-utf8.html).  In fact, even if it is
+present, read this document anyway, as most websites specify character
+encoding incorrectly.
+
+
+---------------------------------------------------------------------------
+3.  Including the library
+
+The procedure is quite simple:

    require_once '/path/to/library/HTMLPurifier.auto.php';

-...and you're good to go.  Since HTML Purifier's codebase is fairly
-large, I recommend only including HTML Purifier when you need it.
+I recommend only including HTML Purifier when you need it, because that
+call represents the inclusion of a lot of PHP files which constitute
+the bulk of HTML Purifier's memory usage.

 If you don't like your include_path to be fiddled around with, simply set
 HTML Purifier's library/ directory to the include path yourself and then:
@@ -38,42 +88,7 @@ Only the contents in the library/ folder are necessary, so you can remove
 everything else when using HTML Purifier in a production environment. 


-
-3.  Preparing the proper output environment
-
-HTML Purifier is all about web-standards, so accordingly your webpages should
-be standards compliant.  HTML Purifier can deal with these doctypes:
-
-* XHTML 1.0 Transitional (default)
-* XHTML 1.0 Strict
-* HTML 4.01 Transitional
-* HTML 4.01 Strict
-* XHTML 1.1 (sans Ruby)
-
-...and these character encodings:
-
-* UTF-8 (default)
-* Any encoding iconv supports (support is crippled for i18n though)
-
-The defaults are there for a reason: they are best-practice choices that
-should not be changed lightly.  For those of you in the dark, you can determine
-the doctype from this code in your HTML documents:
-
-    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
-        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
-
-...and the character encoding from this code:
-
-    <meta http-equiv="Content-type" content="text/html;charset=ENCODING">
-
-For legacy codebases these declarations may be missing.  If that is the case,
-STOP, and read docs/enduser-utf8.html
-
-You may currently be vulnerable to XSS and other security threats, and HTML
-Purifier won't be able to fix that.
-
-
-
+---------------------------------------------------------------------------
 4. Configuration

 HTML Purifier is designed to run out-of-the-box, but occasionally HTML
@@ -90,7 +105,6 @@ object and read on:
    $config = HTMLPurifier_Config::createDefault();


-
 4.1. Setting a different character encoding

 You really shouldn't use any other encoding except UTF-8, especially if you
@@ -117,7 +131,6 @@ but please be cognizant of the issues the "solution" creates (for this
 reason, I do not include the solution in this document).


-
 4.2. Setting a different doctype

 For those of you using HTML 4.01 Transitional, you can disable
@@ -134,7 +147,6 @@ Other supported doctypes include:
    * XHTML 1.1


-
 4.3. Other settings

 There are more configuration directives which can be read about
@@ -144,55 +156,24 @@ your code.  Some of the more interesting ones are configurable at the
 demo <http://htmlpurifier.org/demo.php> and are well worth looking into
 for your own system.

+For example, you can fine tune allowed elements and attributes, convert
+relative URLs to absolute ones, and even autoparagraph input text! These
+are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
+%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
+translates to:
+
+    $config->set('Namespace', 'Directive', $value);
+
+E.g.
+
+    $config->set('HTML', 'Allowed', 'p,b,a[href],i');
+    $config->set('URI', 'Base', 'http://www.example.com');
+    $config->set('URI', 'MakeAbsolute', true);
+    $config->set('AutoFormat', 'AutoParagraph', true);


-5.   Using the code
-
-The interface is mind-numbingly simple:
-
-    $purifier = new HTMLPurifier();
-    $clean_html = $purifier->purify( $dirty_html );
-
-...or, if you're using the configuration object:
-
-    $purifier = new HTMLPurifier($config);
-    $clean_html = $purifier->purify( $dirty_html );
-
-That's it!  For more examples, check out docs/examples/ (they aren't very
-different though).  Also, docs/enduser-slow.html gives advice on what to
-do if HTML Purifier is slowing down your application.
-
-
-
-6.   Quick install
-
-First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
-writable by the webserver (see Section 7: Caching below for details).
-If your website is in UTF-8 and XHTML Transitional, use this code:
-
-<?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
-    
-    $purifier = new HTMLPurifier();
-    $clean_html = $purifier->purify($dirty_html);
-?>
-
-If your website is in a different encoding or doctype, use this code:
-
-<?php
-    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
-    
-    $config = HTMLPurifier_Config::createDefault();
-    $config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
-    $config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
-    $purifier = new HTMLPurifier($config);
-    
-    $clean_html = $purifier->purify($dirty_html);
-?>
-
-
-
-7. Caching
+---------------------------------------------------------------------------
+5. Caching

 HTML Purifier generates some cache files (generally one or two) to speed up
 its execution. For maximum performance, make sure that
@@ -228,3 +209,49 @@ Or move the cache directory somewhere else (no trailing slash):

    $config->set('Cache', 'SerializerPath', '/home/user/absolute/path');

+
+---------------------------------------------------------------------------
+6.   Using the code
+
+The interface is mind-numbingly simple:
+
+    $purifier = new HTMLPurifier();
+    $clean_html = $purifier->purify( $dirty_html );
+
+...or, if you're using the configuration object:
+
+    $purifier = new HTMLPurifier($config);
+    $clean_html = $purifier->purify( $dirty_html );
+
+That's it!  For more examples, check out docs/examples/ (they aren't very
+different though).  Also, docs/enduser-slow.html gives advice on what to
+do if HTML Purifier is slowing down your application.
+
+
+---------------------------------------------------------------------------
+7.   Quick install
+
+First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
+writable by the webserver (see Section 5: Caching above for details).
+If your website is in UTF-8 and XHTML Transitional, use this code:
+
+<?php
+    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
+    
+    $purifier = new HTMLPurifier();
+    $clean_html = $purifier->purify($dirty_html);
+?>
+
+If your website is in a different encoding or doctype, use this code:
+
+<?php
+    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
+    
+    $config = HTMLPurifier_Config::createDefault();
+    $config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
+    $config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
+    $purifier = new HTMLPurifier($config);
+    
+    $clean_html = $purifier->purify($dirty_html);
+?>
+
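The `%Namespace.Directive` shorthand described in the INSTALL text above maps onto the first two arguments of `$config->set()`. As a rough illustration only, the standalone helper below splits the shorthand into that pair. `split_directive` is a hypothetical name, not part of HTML Purifier, and the sketch targets current PHP rather than PHP 4.

```php
<?php
// Hypothetical helper, NOT part of HTML Purifier: splits the documentation
// shorthand "%Namespace.Directive" into array(namespace, directive).
function split_directive($shorthand)
{
    // Drop the leading percent sign used in the documentation.
    $name = ltrim($shorthand, '%');
    $parts = explode('.', $name, 2);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        return false; // not of the form Namespace.Directive
    }
    return $parts;
}

list($namespace, $directive) = split_directive('%AutoFormat.AutoParagraph');
// With the library loaded, the equivalent call would be:
//     $config->set($namespace, $directive, true);
echo $namespace, ' / ', $directive, "\n";
```

The three-argument `set()` form shown in the comment is the one this INSTALL file documents for the 2.1.x branch.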
--- a/107
+++ b/107
@@ -9,6 +9,113 @@ NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
    . Internal change
 ==========================

+ERRATA
+- PH5P is seriously broken here; it can result in fatal errors and exceptions. 
+  If you desire to use it, please use it with the latest, PHP5-only version of 
+  HTML Purifier.
+
+2.1.5, released 2008-06-19
+! More robust imagecrash protection with height/width CSS with %CSS.MaxImgLength,
+  and height/width HTML with %HTML.MaxImgLength.
+- AttrValidator operations are now atomic; updates to attributes are not
+  manifest in token until end of operations. This prevents naughty internal
+  code from directly modifying CurrentToken when they're not supposed to.
+- Percent encoding checks enabled for URI query and fragment
+- Disable percent height/width attributes for img
+- Fix stray backslashes in font-family; CSS Unicode character escapes are
+  now properly resolved (although *only* in font-family).
+- Improve parseCDATA algorithm to take into account newline normalization
+- Account for browser confusion between Yen character and backslash in
+  Shift_JIS encoding. This fix generalizes to any other encoding which is not
+  a strict superset of printable ASCII.
+- Improved adherence to Unicode by checking for non-character codepoints.
+  Thanks Geoffrey Sneddon for reporting. This may result in degraded
+  performance for extremely large inputs.
+- Allow CSS property-value pair ''text-decoration: none''
+. Added HTMLPurifier_UnitConverter and HTMLPurifier_Length for convenient
+  handling of CSS-style lengths. HTMLPurifier_AttrDef_CSS_Length now uses
+  this class.
+. API of HTMLPurifier_AttrDef_CSS_Length changed from __construct($disable_negative)
+  to __construct($min, $max). __construct(true) is equivalent to
+  __construct('0'). (replace __construct with HTMLPurifier_AttrDef_CSS_Length)
+. Added HTMLPurifier_AttrDef_Switch class
+. Rename HTMLPurifier_HTMLModule_Tidy->construct() to setup() and bubble method
+  up inheritance hierarchy to HTMLPurifier_HTMLModule. All HTMLModules
+  get this called with the configuration object.  All modules now
+  use this rather than __construct(), although legacy code using constructors
+  will still work--the new format, however, lets modules access the
+  configuration object for HTML namespace dependent tweaks.
+. AttrDef_HTML_Pixels now takes a single construction parameter, pixels.
+
+2.1.4, released 2008-05-18
+! DefinitionCacheFactory now can register new implementations
+! CSS properties are now case-insensitive
+! Encoder optimized with valid UTF-8 input
+! HTML Purifier's URI handling is a lot more robust, with much stricter
+  validation checks and better percent encoding handling.
+- Colors missing # but in hex form will be corrected
+- CSS Number algorithm improved
+- Autoclose now operates iteratively, i.e. <span><span><div> now has
+  both span tags closed.
+- Fix bug with trusted script handling in libxml versions later than 2.6.28.
+- Fix bug in comment parsing with DirectLex
+- Fix bug with rgb(0, 1, 2) color syntax with spaces inside shorthand syntax
+- HTMLPurifier_HTMLDefinition->addAttribute can now be called multiple times
+  on the same element without emitting errors.
+- Iconv uses set_error_handler instead of shut-up operator
+- Add protection against imagecrash attack with CSS height/width
+- HTMLPurifier::getInstance() renamed to HTMLPurifier::instance() for consistency
+- Fixed bug with fallback languages in LanguageFactory
+
+2.1.3, released 2007-11-05
+! tests/multitest.php allows you to test multiple versions by running
+  tests/index.php through multiple interpreters using `phpv` shell
+  script (you must provide this script!)
+- Fixed poor include ordering for Email URI AttrDefs, causes fatal errors
+  on some systems.
+- Injector algorithm further refined: off-by-one error regarding skip 
+  counts for dormant injectors fixed
+- Corrective blockquote definition now enabled for HTML 4.01 Strict
+- Fatal error when <img> tag (or any other element with required attributes)
+  has 'id' attribute fixed, thanks NykO18 for reporting
+- Fix warning emitted when a non-supported URI scheme is passed to the
+  MakeAbsolute URIFilter, thanks NykO18 (again)
+- Further refine AutoParagraph injector. Behavior inside of elements
+  allowing paragraph tags clarified: only inline content delimited by
+  double newlines (not block elements) are paragraphed.
+- Buggy treatment of end tags of elements that have required attributes
+  fixed (does not manifest on default tag-set)
+- Spurious internal content reorganization error suppressed
+- HTMLDefinition->addElement now returns a reference to the created
+  element object, as implied by the documentation
+- Phorum mod's HTML Purifier help message expanded (unreleased elsewhere)
+- Fix a theoretical class of infinite loops from DirectLex reported
+  by Nate Abele
+- Work around unnecessary DOMElement type-cast in PH5P that caused errors
+  in PHP 5.1
+- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only
+  HTMLDefinition errors, this may indicate problems with error-collecting
+  facilities in PHP 5
+- Make ErrorCollectorEMock work in both PHP 4 and PHP 5
+- Make PH5P work with PHP 5.0 by removing unnecessary array parameter typedef
+. %Core.AcceptFullDocuments renamed to %Core.ConvertDocumentToFragment 
+  to better communicate its purpose
+. Error unit tests can now specify the expectation of no errors. Future
+  iterations of the harness will be extremely strict about what errors
+  are allowed
+. Extend Injector hooks to allow for more powerful injector routines
+. HTMLDefinition->addBlankElement created, as according to the HTMLModule
+  method
+. Doxygen configuration file updated, with minor improvements
+. Test runner now checks for similarly named files in conf/ directory too.
+. Minor cosmetic change to flush-definition-cache.php: trailing newline is
+  outputted
+. Maintenance script for generating PH5P patch added, original PH5P source
+  file also added under version control
+. Full unit test runner script title made more descriptive with PHP version
+. Updated INSTALL file to state that 4.3.7 is the earliest version we
+  are actively testing
+
 2.1.2, released 2007-09-03
 ! Implemented Object module for trusted users
 ! Implemented experimental HTML5 parsing mode using PH5P. To use, add
--- a/2
+++ b/2
@@ -1 +1 @@
-2.1.2
+2.1.5
--- a/15
+++ b/15
@@ -1,8 +1,7 @@
-Version 2.1.2 is a mix of experimental features and stability updates.
-Among new features: an Object module for trusted users, support for the
-CSS property 'border-spacing', and HTML 5 style parsing using PH5P.
-Bug fixes ihave resolved a few obscure issues including border-collapse:seperate,
-a DirectLex parsing error, broken HTML in printDefinition.php, and problems
-with the experimental standalone distribution. Also, there were large
-amounts of behind-the-scenes refactoring and the removal of URIScheme
-inclusion reflection.
+Security and bugfix release 2.1.5 is a backport that fixes two vulnerabilities
+related to CSS, one of which only occurs under Shift_JIS. It also improves
+imagecrash protection (percent CSS width and height is now disabled for
+images, and you can control the bounds with %CSS.MaxImgLength and
+%HTML.MaxImgLength). Finally, there are a number of bug fixes, most notably 
+support for text-decoration: none, improved adherence to Unicode and increased
+percent encoding checks.
--- a/library/HTMLPurifier.php
+++ b/library/HTMLPurifier.php
@@ -22,8 +22,8 @@
 */

 /*
-    HTML Purifier 2.1.2 - Standards Compliant HTML Filtering
-    Copyright (C) 2006 Edward Z. Yang
+    HTML Purifier 2.1.5 - Standards Compliant HTML Filtering
+    Copyright (C) 2006-2007 Edward Z. Yang

    This library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -43,9 +43,8 @@
 // constants are slow, but we'll make one exception
 define('HTMLPURIFIER_PREFIX', dirname(__FILE__));

-// almost every class has an undocumented dependency to these, so make sure
-// they get included
-require_once 'HTMLPurifier/ConfigSchema.php'; // important
+// every class has an undocumented dependency to these, must be included!
+require_once 'HTMLPurifier/ConfigSchema.php'; // fatal errors if not included
 require_once 'HTMLPurifier/Config.php';
 require_once 'HTMLPurifier/Context.php';

@@ -60,16 +59,23 @@ require_once 'HTMLPurifier/LanguageFactory.php';
 HTMLPurifier_ConfigSchema::define(
    'Core', 'CollectErrors', false, 'bool', '
 Whether or not to collect errors found while filtering the document. This
-is a useful way to give feedback to your users. CURRENTLY NOT IMPLEMENTED.
-This directive has been available since 2.0.0.
+is a useful way to give feedback to your users. <strong>Warning:</strong>
+Currently this feature is very patchy and experimental, with lots of
+possible error messages not yet implemented. It will not cause any problems,
+but it may not help your users either. This directive has been available
+since 2.0.0.
 ');

 /**
- * Main library execution class.
+ * Facade that coordinates HTML Purifier's subsystems in order to purify HTML.
 * 
- * Facade that performs calls to the HTMLPurifier_Lexer,
- * HTMLPurifier_Strategy and HTMLPurifier_Generator subsystems in order to
- * purify HTML.
+ * @note There are several points in which configuration can be specified 
+ *       for HTML Purifier.  The precedence of these (from lowest to
+ *       highest) is as follows:
+ *          -# Instance: new HTMLPurifier($config)
+ *          -# Invocation: purify($html, $config)
+ *       These configurations are entirely independent of each other and
+ *       are *not* merged.
 * 
 * @todo We need an easier way to inject strategies, it'll probably end
 *       up getting done through config though.
@@ -77,15 +83,16 @@ This directive has been available since 2.0.0.
 class HTMLPurifier
 {
    
-    var $version = '2.1.2';
+    var $version = '2.1.5';
    
    var $config;
-    var $filters;
+    var $filters = array();
    
    var $strategy, $generator;
    
    /**
-     * Final HTMLPurifier_Context of last run purification. Might be an array.
+     * Resultant HTMLPurifier_Context of last run purification. Is an array
+     * of contexts if the last called method was purifyArray().
     * @public
     */
    var $context;
@@ -150,6 +157,11 @@ class HTMLPurifier
            $context->register('ErrorCollector', $error_collector);
        }
        
+        // setup id_accumulator context, necessary due to the fact that
+        // AttrValidator can be called from many places
+        $id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
+        $context->register('IDAccumulator', $id_accumulator);
+        
        $html = HTMLPurifier_Encoder::convertToUTF8($html, $config, $context);
        
        for ($i = 0, $size = count($this->filters); $i < $size; $i++) {
@@ -198,8 +210,10 @@ class HTMLPurifier
    
    /**
     * Singleton for enforcing just one HTML Purifier in your system
+     * @param $prototype Optional prototype HTMLPurifier instance to
+     *                   overload singleton with.
     */
-    function &getInstance($prototype = null) {
+    function &instance($prototype = null) {
        static $htmlpurifier;
        if (!$htmlpurifier || $prototype) {
            if (is_a($prototype, 'HTMLPurifier')) {
@@ -213,6 +227,9 @@ class HTMLPurifier
        return $htmlpurifier;
    }
    
+    function &getInstance($prototype = null) {
+        return HTMLPurifier::instance($prototype);
+    }
    
 }
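The rename from `getInstance()` to `instance()` above keeps the old name as a thin alias, so existing callers continue to work. A minimal sketch of that pattern follows, using a stand-in class (`Demo_Purifier` is hypothetical) and modern PHP static methods instead of the PHP 4 reference-returning functions in the diff. It also simplifies the prototype handling: anything that is not an instance of the class is ignored here.

```php
<?php
// Illustrative stand-in class; only the instance()/getInstance() pattern
// mirrors the diff above.
class Demo_Purifier
{
    public $label;

    public function __construct($label = 'default')
    {
        $this->label = $label;
    }

    // Returns the shared instance. Passing a prototype replaces the
    // shared instance (assumption: non-matching prototypes create a fresh one).
    public static function instance($prototype = null)
    {
        static $shared = null;
        if ($shared === null || $prototype !== null) {
            if ($prototype instanceof Demo_Purifier) {
                $shared = $prototype;
            } else {
                $shared = new Demo_Purifier();
            }
        }
        return $shared;
    }

    // Old name kept as an alias so legacy callers keep working.
    public static function getInstance($prototype = null)
    {
        return self::instance($prototype);
    }
}

$first = Demo_Purifier::instance();
$custom = new Demo_Purifier('custom');
Demo_Purifier::getInstance($custom); // overloads the singleton via the old name
echo Demo_Purifier::instance()->label, "\n";
```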

--- a/library/HTMLPurifier/AttrDef.php
+++ b/library/HTMLPurifier/AttrDef.php
@@ -54,18 +54,15 @@ class HTMLPurifier_AttrDef
     * 
     * @warning This processing is inconsistent with XML's whitespace handling
     *          as specified by section 3.3.3 and referenced XHTML 1.0 section
-     *          4.7.  Compliant processing requires all line breaks normalized
-     *          to "\n", so the fix is not as simple as fixing it in this
-     *          function.  Trim and whitespace collapsing are supposed to only
-     *          occur in NMTOKENs.  However, note that we are NOT necessarily
-     *          parsing XML, thus, this behavior may still be correct.
+     *          4.7.  However, note that we are NOT necessarily
+     *          parsing XML, thus, this behavior may still be correct. We
+     *          assume that newlines have been normalized.
     * 
     * @public
     */
    function parseCDATA($string) {
        $string = trim($string);
-        $string = str_replace("\n", '', $string);
-        $string = str_replace(array("\r", "\t"), ' ', $string);
+        $string = str_replace(array("\n", "\t", "\r"), ' ', $string);
        return $string;
    }
    
@@ -82,5 +79,13 @@ class HTMLPurifier_AttrDef
        return $this;
    }
    
+    /**
+     * Removes spaces from rgb(0, 0, 0) so that shorthand CSS properties work
+     * properly. THIS IS A HACK!
+     */
+    function mungeRgb($string) {
+        return preg_replace('/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/', 'rgb(\1,\2,\3)', $string);
+    }
+    
 }
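To see why the two `AttrDef` changes above matter for shorthand properties, the sketch below copies their logic into free functions (the names `parse_cdata` and `munge_rgb` are just for this example). Newlines now become spaces instead of being deleted, and the spaces inside `rgb(...)` are removed so that a later `explode(' ', ...)` keeps the color in one piece.

```php
<?php
// Standalone copies of the logic shown in the diff, for illustration only.
function parse_cdata($string)
{
    $string = trim($string);
    // Newlines, tabs and carriage returns all become single spaces.
    return str_replace(array("\n", "\t", "\r"), ' ', $string);
}

function munge_rgb($string)
{
    // Collapse "rgb(0, 1, 2)" to "rgb(0,1,2)".
    return preg_replace(
        '/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/',
        'rgb($1,$2,$3)',
        $string
    );
}

$value = munge_rgb(parse_cdata(" 1px solid\nrgb(0, 1, 2) "));
$bits = explode(' ', $value);
print_r($bits);
```

Under the previous `parseCDATA`, the newline would have been removed outright, fusing `solid` and `rgb(` into a single token.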

--- a/library/HTMLPurifier/AttrDef/CSS.php
+++ b/library/HTMLPurifier/AttrDef/CSS.php
@@ -38,7 +38,20 @@ class HTMLPurifier_AttrDef_CSS extends HTMLPurifier_AttrDef
            list($property, $value) = explode(':', $declaration, 2);
            $property = trim($property);
            $value    = trim($value);
-            if (!isset($definition->info[$property])) continue;
+            $ok = false;
+            do {
+                if (isset($definition->info[$property])) {
+                    $ok = true;
+                    break;
+                }
+                if (ctype_lower($property)) break;
+                $property = strtolower($property);
+                if (isset($definition->info[$property])) {
+                    $ok = true;
+                    break;
+                }
+            } while(0);
+            if (!$ok) continue;
            // inefficient call, since the validator will do this again
            if (strtolower(trim($value)) !== 'inherit') {
                // inherit works for everything (but only on the base property)
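The `do { ... } while(0)` block above implements a case-insensitive property lookup that tries the exact name first and only lowercases on a miss. A simplified sketch of the same idea follows; `resolve_property` is a hypothetical helper, and it skips the `ctype_lower()` shortcut used in the diff.

```php
<?php
// Hypothetical helper: returns the key under which $property is defined
// in $info, or false if the property is unknown.
function resolve_property(array $info, $property)
{
    if (isset($info[$property])) {
        return $property;              // fast path: already the right case
    }
    $lower = strtolower($property);
    if ($lower !== $property && isset($info[$lower])) {
        return $lower;                 // matched after case folding
    }
    return false;
}

$info = array('color' => true, 'text-decoration' => true);
var_dump(resolve_property($info, 'Text-Decoration'));
var_dump(resolve_property($info, 'bogus'));
```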
--- a/library/HTMLPurifier/AttrDef/CSS/Background.php
+++ b/library/HTMLPurifier/AttrDef/CSS/Background.php
@@ -31,6 +31,9 @@ class HTMLPurifier_AttrDef_CSS_Background extends HTMLPurifier_AttrDef
        $string = $this->parseCDATA($string);
        if ($string === '') return false;
        
+        // munge rgb() decl if necessary
+        $string = $this->mungeRgb($string);
+        
        // assumes URI doesn't have spaces in it
        $bits = explode(' ', strtolower($string)); // bits to process
        
--- a/library/HTMLPurifier/AttrDef/CSS/Border.php
+++ b/library/HTMLPurifier/AttrDef/CSS/Border.php
@@ -22,7 +22,7 @@ class HTMLPurifier_AttrDef_CSS_Border extends HTMLPurifier_AttrDef
    
    function validate($string, $config, &$context) {
        $string = $this->parseCDATA($string);
-        // we specifically will not support rgb() syntax with spaces
+        $string = $this->mungeRgb($string);
        $bits = explode(' ', $string);
        $done = array(); // segments we've finished
        $ret = ''; // return value
--- a/library/HTMLPurifier/AttrDef/CSS/Color.php
+++ b/library/HTMLPurifier/AttrDef/CSS/Color.php
@@ -39,20 +39,13 @@ class HTMLPurifier_AttrDef_CSS_Color extends HTMLPurifier_AttrDef
        if ($colors === null) $colors = $config->get('Core', 'ColorKeywords');
        
        $color = trim($color);
-        if (!$color) return false;
+        if ($color === '') return false;
        
        $lower = strtolower($color);
        if (isset($colors[$lower])) return $colors[$lower];
        
-        if ($color[0] === '#') {
-            // hexadecimal handling
-            $hex = substr($color, 1);
-            $length = strlen($hex);
-            if ($length !== 3 && $length !== 6) return false;
-            if (!ctype_xdigit($hex)) return false;
-        } else {
+        if (strpos($color, 'rgb(') !== false) {
            // rgb literal handling
-            if (strpos($color, 'rgb(')) return false;
            $length = strlen($color);
            if (strpos($color, ')') !== $length - 1) return false;
            $triad = substr($color, 4, $length - 4 - 1);
@@ -90,6 +83,17 @@ class HTMLPurifier_AttrDef_CSS_Color extends HTMLPurifier_AttrDef
            }
            $new_triad = implode(',', $new_parts);
            $color = "rgb($new_triad)";
+        } else {
+            // hexadecimal handling
+            if ($color[0] === '#') {
+                $hex = substr($color, 1);
+            } else {
+                $hex = $color;
+                $color = '#' . $color;
+            }
+            $length = strlen($hex);
+            if ($length !== 3 && $length !== 6) return false;
+            if (!ctype_xdigit($hex)) return false;
        }
        
        return $color;
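The reordered branches above are what let a hex color without its leading `#` be corrected rather than rejected. The sketch below isolates just that hexadecimal path in a free function (`normalize_hex_color` is a name made up for this example; keyword and `rgb()` handling are left out). It assumes the ctype extension is available, as the original code does.

```php
<?php
// Illustration of the hex branch only; returns the normalized color
// or false when the input is not a 3- or 6-digit hex value.
function normalize_hex_color($color)
{
    $color = trim($color);
    if ($color === '') {
        return false;
    }
    if ($color[0] === '#') {
        $hex = substr($color, 1);
    } else {
        // Missing "#": validate the digits, then add the prefix.
        $hex = $color;
        $color = '#' . $color;
    }
    $length = strlen($hex);
    if ($length !== 3 && $length !== 6) {
        return false;
    }
    if (!ctype_xdigit($hex)) {
        return false;
    }
    return $color;
}

var_dump(normalize_hex_color('A0B1C2'));
var_dump(normalize_hex_color('#ffff'));
```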
--- a/library/HTMLPurifier/AttrDef/CSS/DenyElementDecorator.php
+++ b/library/HTMLPurifier/AttrDef/CSS/DenyElementDecorator.php
@@ -0,0 +1,26 @@
+<?php
+
+/**
+ * Decorator which enables CSS properties to be disabled for specific elements.
+ */
+class HTMLPurifier_AttrDef_CSS_DenyElementDecorator extends HTMLPurifier_AttrDef
+{
+    var $def, $element;
+    
+    /**
+     * @param $def Definition to wrap
+     * @param $element Element to deny
+     */
+    function HTMLPurifier_AttrDef_CSS_DenyElementDecorator(&$def, $element) {
+        $this->def =& $def;
+        $this->element = $element;
+    }
+    /**
+     * Checks if CurrentToken is set and equal to $this->element
+     */
+    function validate($string, $config, $context) {
+        $token = $context->get('CurrentToken', true);
+        if ($token && $token->name == $this->element) return false;
+        return $this->def->validate($string, $config, $context);
+    }
+}
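The decorator above wraps an existing definition and refuses to validate when the current element matches. As a rough, self-contained sketch of the same shape, the classes below pass the element name as a plain argument instead of reading `CurrentToken` from a context object; both class names are hypothetical.

```php
<?php
// Hypothetical inner validator: accepts percentages such as "50%".
class Sketch_PercentLength
{
    public function validate($string, $element)
    {
        return preg_match('/^\d+(\.\d+)?%$/', $string) ? $string : false;
    }
}

// Decorator: denies everything for one element, otherwise delegates.
class Sketch_DenyElement
{
    private $inner;
    private $element;

    public function __construct($inner, $element)
    {
        $this->inner = $inner;
        $this->element = $element;
    }

    public function validate($string, $element)
    {
        if ($element === $this->element) {
            return false; // property is disabled for this element
        }
        return $this->inner->validate($string, $element);
    }
}

$width = new Sketch_DenyElement(new Sketch_PercentLength(), 'img');
var_dump($width->validate('50%', 'img'));
var_dump($width->validate('50%', 'div'));
```

This mirrors the changelog entry about disabling percent height/width for `img` while leaving the wrapped definition untouched for every other element.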
--- a/library/HTMLPurifier/AttrDef/CSS/FontFamily.php
+++ b/library/HTMLPurifier/AttrDef/CSS/FontFamily.php
@@ -19,7 +19,6 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
            'cursive' => true
        );
        
-        $string = $this->parseCDATA($string);
        // assume that no font names contain commas in them
        $fonts = explode(',', $string);
        $final = '';
@@ -38,13 +37,40 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
                $quote = $font[0];
                if ($font[$length - 1] !== $quote) continue;
                $font = substr($font, 1, $length - 2);
-                // double-backslash processing is buggy
-                $font = str_replace("\\$quote", $quote, $font); // de-escape quote
-                $font = str_replace("\\\n", "\n", $font);       // de-escape newlines
+                
+                $new_font = '';
+                for ($i = 0, $c = strlen($font); $i < $c; $i++) {
+                    if ($font[$i] === '\\') {
+                        $i++;
+                        if ($i >= $c) {
+                            $new_font .= '\\';
+                            break;
+                        }
+                        if (ctype_xdigit($font[$i])) {
+                            $code = $font[$i];
+                            for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
+                                if (!ctype_xdigit($font[$i])) break;
+                                $code .= $font[$i];
+                            }
+                            // We have to be extremely careful when adding
+                            // new characters, to make sure we're not breaking
+                            // the encoding.
+                            $char = HTMLPurifier_Encoder::unichr(hexdec($code));
+                            if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
+                            $new_font .= $char;
+                            if ($i < $c && trim($font[$i]) !== '') $i--;
+                            continue;
+                        }
+                        if ($font[$i] === "\n") continue;
+                    }
+                    $new_font .= $font[$i];
+                }
+                
+                $font = $new_font;
            }
            // $font is a pure representation of the font name
            
-            if (ctype_alnum($font)) {
+            if (ctype_alnum($font) && $font !== '') {
                // very simple font, allow it in unharmed
                $final .= $font . ', ';
                continue;
@@ -53,8 +79,8 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
            // complicated font, requires quoting
            
            // armor single quotes and new lines
+            $font = str_replace("\\", "\\\\", $font);
            $font = str_replace("'", "\\'", $font);
-            $font = str_replace("\n", "\\\n", $font);
            $final .= "'$font', ";
        }
        $final = rtrim($final, ', ');
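The loop above resolves CSS backslash escapes in a quoted font name and then re-escapes backslashes and single quotes on output. The sketch below expresses the same steps with a regular expression instead of a character loop. All function names are made up for this example, and unlike the diff it does not run the result through `HTMLPurifier_Encoder::cleanUTF8()`; it only rejects zero, surrogates and values beyond U+10FFFF.

```php
<?php
// Encode a code point as UTF-8; returns '' for values that are not
// valid Unicode scalar values.
function codepoint_to_utf8($cp)
{
    if ($cp <= 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        return '';
    }
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Resolve escapes in an already unquoted font name: "\" plus 1-6 hex
// digits (and one optional trailing whitespace) becomes that character,
// backslash-newline disappears, any other "\x" becomes "x".
function decode_css_escapes($font)
{
    return preg_replace_callback(
        '/\\\\(?:([0-9a-fA-F]{1,6})[ \t\r\n\f]?|(.))/s',
        function ($m) {
            if (isset($m[2])) {
                return $m[2] === "\n" ? '' : $m[2];
            }
            return codepoint_to_utf8((int) hexdec($m[1]));
        },
        $font
    );
}

// Re-armor a font name for output: backslashes first, then single quotes.
function quote_font($font)
{
    $font = str_replace('\\', '\\\\', $font);
    $font = str_replace("'", "\\'", $font);
    return "'$font'";
}

echo quote_font(decode_css_escapes('Courier\\20 New')), "\n";
```

Escaping backslashes before quotes matters: doing it in the other order would double the backslash that was just added in front of each quote.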
--- a/library/HTMLPurifier/AttrDef/CSS/Length.php
+++ b/library/HTMLPurifier/AttrDef/CSS/Length.php
@@ -1,7 +1,7 @@
 <?php

-require_once 'HTMLPurifier/AttrDef.php';
-require_once 'HTMLPurifier/AttrDef/CSS/Number.php';
+require_once 'HTMLPurifier/Length.php';
+require_once 'HTMLPurifier/UnitConverter.php';

 /**
 * Represents a Length as defined by CSS.
@@ -9,46 +9,40 @@ require_once 'HTMLPurifier/AttrDef/CSS/Number.php';
 class HTMLPurifier_AttrDef_CSS_Length extends HTMLPurifier_AttrDef
 {
    
-    /**
-     * Valid unit lookup table.
-     * @warning The code assumes all units are two characters long.  Be careful
-     *          if we have to change this behavior!
-     */
-    var $units = array('em' => true, 'ex' => true, 'px' => true, 'in' => true,
-         'cm' => true, 'mm' => true, 'pt' => true, 'pc' => true);
-    /**
-     * Instance of HTMLPurifier_AttrDef_Number to defer number validation to
-     */
-    var $number_def;
+    var $min, $max;
    
    /**
-     * @param $non_negative Bool indication whether or not negative values are
-     *                      allowed.
+     * @param HTMLPurifier_Length $min Minimum length, or null for no bound. String is also acceptable.
+     * @param HTMLPurifier_Length $max Maximum length, or null for no bound. String is also acceptable.
     */
-    function HTMLPurifier_AttrDef_CSS_Length($non_negative = false) {
-        $this->number_def = new HTMLPurifier_AttrDef_CSS_Number($non_negative);
+    function HTMLPurifier_AttrDef_CSS_Length($min = null, $max = null) {
+        $this->min = $min !== null ? HTMLPurifier_Length::make($min) : null;
+        $this->max = $max !== null ? HTMLPurifier_Length::make($max) : null;
    }
    
-    function validate($length, $config, &$context) {
+    function validate($string, $config, $context) {
+        $string = $this->parseCDATA($string);
        
-        $length = $this->parseCDATA($length);
-        if ($length === '') return false;
-        if ($length === '0') return '0';
-        $strlen = strlen($length);
-        if ($strlen === 1) return false; // impossible!
+        // Optimizations
+        if ($string === '') return false;
+        if ($string === '0') return '0';
+        if (strlen($string) === 1) return false;
        
-        // we assume all units are two characters
-        $unit = substr($length, $strlen - 2);
-        if (!ctype_lower($unit)) $unit = strtolower($unit);
-        $number = substr($length, 0, $strlen - 2);
+        $length = HTMLPurifier_Length::make($string);
+        if (!$length->isValid()) return false;
        
-        if (!isset($this->units[$unit])) return false;
-        
-        $number = $this->number_def->validate($number, $config, $context);
-        if ($number === false) return false;
-        
-        return $number . $unit;
+        if ($this->min) {
+            $c = $length->compareTo($this->min);
+            if ($c === false) return false;
+            if ($c < 0) return false;
+        }
+        if ($this->max) {
+            $c = $length->compareTo($this->max);
+            if ($c === false) return false;
+            if ($c > 0) return false;
+        }
        
+        return $length->toString();
    }
    
 }
--- a/library/HTMLPurifier/AttrDef/CSS/Number.php
+++ b/library/HTMLPurifier/AttrDef/CSS/Number.php
@@ -18,6 +18,11 @@ class HTMLPurifier_AttrDef_CSS_Number extends HTMLPurifier_AttrDef
        $this->non_negative = $non_negative;
    }
    
+    /**
+     * @warning Some contexts do not pass $config, $context. These
+     *          variables should not be used without checking HTMLPurifier_Length.
+     *          This might not work properly in PHP4.
+     */
    function validate($number, $config, &$context) {
        
        $number = $this->parseCDATA($number);
--- a/library/HTMLPurifier/AttrDef/CSS/TextDecoration.php
+++ b/library/HTMLPurifier/AttrDef/CSS/TextDecoration.php
@@ -15,10 +15,13 @@ class HTMLPurifier_AttrDef_CSS_TextDecoration extends HTMLPurifier_AttrDef
        static $allowed_values = array(
            'line-through' => true,
            'overline' => true,
-            'underline' => true
+            'underline' => true,
        );
        
        $string = strtolower($this->parseCDATA($string));
+        
+        if ($string === 'none') return $string;
+        
        $parts = explode(' ', $string);
        $final = '';
        foreach ($parts as $part) {
--- a/library/HTMLPurifier/AttrDef/HTML/Pixels.php
+++ b/library/HTMLPurifier/AttrDef/HTML/Pixels.php
@@ -8,6 +8,12 @@ require_once 'HTMLPurifier/AttrDef.php';
 class HTMLPurifier_AttrDef_HTML_Pixels extends HTMLPurifier_AttrDef
 {
    
+    var $max;
+    
+    function HTMLPurifier_AttrDef_HTML_Pixels($max = null) {
+        $this->max = $max;
+    }
+    
    function validate($string, $config, &$context) {
        
        $string = trim($string);
@@ -26,11 +32,18 @@ class HTMLPurifier_AttrDef_HTML_Pixels extends HTMLPurifier_AttrDef
        // crash operating systems, see <http://ha.ckers.org/imagecrash.html>
        // WARNING, above link WILL crash you if you're using Windows
        
-        if ($int > 1200) return '1200';
+        if ($this->max !== null && $int > $this->max) return (string) $this->max;
        
        return (string) $int;
        
    }
    
+    function make($string) {
+        if ($string === '') $max = null;
+        else $max = (int) $string;
+        $class = get_class($this);
+        return new $class($max);
+    }
+    
 }

--- a/library/HTMLPurifier/AttrDef/Switch.php
+++ b/library/HTMLPurifier/AttrDef/Switch.php
@@ -0,0 +1,32 @@
+<?php
+
+/**
+ * Decorator that, depending on a token, switches between two definitions.
+ */
+class HTMLPurifier_AttrDef_Switch
+{
+    
+    var $tag;
+    var $withTag, $withoutTag;
+    
+    /**
+     * @param string $tag Tag name to switch upon
+     * @param HTMLPurifier_AttrDef $with_tag Call if token matches tag
+     * @param HTMLPurifier_AttrDef $without_tag Call if token doesn't match, or there is no token
+     */
+    function HTMLPurifier_AttrDef_Switch($tag, $with_tag, $without_tag) {
+        $this->tag = $tag;
+        $this->withTag = $with_tag;
+        $this->withoutTag = $without_tag;
+    }
+    
+    function validate($string, $config, $context) {
+        $token = $context->get('CurrentToken', true);
+        if (!$token || $token->name !== $this->tag) {
+            return $this->withoutTag->validate($string, $config, $context);
+        } else {
+            return $this->withTag->validate($string, $config, $context);
+        }
+    }
+    
+}
--- a/library/HTMLPurifier/AttrDef/URI.php
+++ b/library/HTMLPurifier/AttrDef/URI.php
@@ -68,7 +68,7 @@ HTMLPurifier_ConfigSchema::define(
 class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
 {
    
-    var $parser, $percentEncoder;
+    var $parser;
    var $embedsResource;
    
    /**
@@ -76,7 +76,6 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
     */
    function HTMLPurifier_AttrDef_URI($embeds_resource = false) {
        $this->parser = new HTMLPurifier_URIParser();
-        $this->percentEncoder = new HTMLPurifier_PercentEncoder();
        $this->embedsResource = (bool) $embeds_resource;
    }
    
@@ -84,9 +83,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        
        if ($config->get('URI', 'Disable')) return false;
        
-        // initial operations
        $uri = $this->parseCDATA($uri);
-        $uri = $this->percentEncoder->normalize($uri);
        
        // parse the URI
        $uri = $this->parser->parse($uri);
@@ -102,7 +99,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
            $result = $uri->validate($config, $context);
            if (!$result) break;
            
-            // chained validation
+            // chained filtering
            $uri_def =& $config->getDefinition('URI');
            $result = $uri_def->filter($uri, $config, $context);
            if (!$result) break;
@@ -122,13 +119,6 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
        $context->destroy('EmbeddedURI');
        if (!$ok) return false;
        
-        // munge scheme off if necessary (this must be last)
-        if (!is_null($uri->scheme) && is_null($uri->host)) {
-            if ($uri_def->defaultScheme == $uri->scheme) {
-                $uri->scheme = null;
-            }
-        }
-        
        // back to string
        $result = $uri->toString();
        
--- a/library/HTMLPurifier/AttrDef/URI/Email.php
+++ b/library/HTMLPurifier/AttrDef/URI/Email.php
@@ -1,7 +1,6 @@
 <?php

 require_once 'HTMLPurifier/AttrDef.php';
-require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';

 class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
 {
@@ -15,3 +14,5 @@ class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
    
 }

+// sub-implementations
+require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';
--- a/library/HTMLPurifier/AttrDef/URI/Host.php
+++ b/library/HTMLPurifier/AttrDef/URI/Host.php
@@ -40,11 +40,23 @@ class HTMLPurifier_AttrDef_URI_Host extends HTMLPurifier_AttrDef
        $ipv4 = $this->ipv4->validate($string, $config, $context);
        if ($ipv4 !== false) return $ipv4;
        
-        // validate a domain name here, do filtering, etc etc etc
+        // A regular domain name.
        
-        // We could use this, but it would break I18N domain names
-        //$match = preg_match('/^[a-z0-9][\w\-\.]*[a-z0-9]$/i', $string);
-        //if (!$match) return false;
+        // This breaks I18N domain names, but we don't have proper IRI support,
+        // so force users to insert Punycode. If there's complaining we'll 
+        // try to fix things into an international friendly form.
+        
+        // The productions describing this are:
+        $a   = '[a-z]';     // alpha
+        $an  = '[a-z0-9]';  // alphanum
+        $and = '[a-z0-9-]'; // alphanum | "-"
+        // domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
+        $domainlabel   = "$an($and*$an)?";
+        // toplabel    = alpha | alpha *( alphanum | "-" ) alphanum
+        $toplabel      = "$a($and*$an)?";
+        // hostname    = *( domainlabel "." ) toplabel [ "." ]
+        $match = preg_match("/^($domainlabel\.)*$toplabel\.?$/i", $string);
+        if (!$match) return false;
        
        return $string;
    }
--- a/library/HTMLPurifier/AttrValidator.php
+++ b/library/HTMLPurifier/AttrValidator.php
@@ -23,6 +23,13 @@ class HTMLPurifier_AttrValidator
        $definition = $config->getHTMLDefinition();
        $e =& $context->get('ErrorCollector', true);
        
+        // initialize IDAccumulator if necessary
+        $ok =& $context->get('IDAccumulator', true);
+        if (!$ok) {
+            $id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
+            $context->register('IDAccumulator', $id_accumulator);
+        }
+        
        // initialize CurrentToken if necessary
        $current_token =& $context->get('CurrentToken', true);
        if (!$current_token) $context->register('CurrentToken', $token);
@@ -33,8 +40,8 @@ class HTMLPurifier_AttrValidator
        // DEFINITION CALL
        $d_defs = $definition->info_global_attr;
        
-        // reference attributes for easy manipulation
-        $attr =& $token->attr;
+        // don't update token until the very end, to ensure an atomic update
+        $attr = $token->attr;
        
        // do global transformations (pre)
        // nothing currently utilizes this
@@ -129,6 +136,8 @@ class HTMLPurifier_AttrValidator
            if ($e && ($attr != $o)) $e->send(E_NOTICE, 'AttrValidator: Attributes transformed', $o, $attr);
        }
        
+        $token->attr = $attr;
+        
        // destroy CurrentToken if we made it ourselves
        if (!$current_token) $context->destroy('CurrentToken');
        
--- a/library/HTMLPurifier/CSSDefinition.php
+++ b/library/HTMLPurifier/CSSDefinition.php
@@ -7,6 +7,7 @@ require_once 'HTMLPurifier/AttrDef/CSS/BackgroundPosition.php';
 require_once 'HTMLPurifier/AttrDef/CSS/Border.php';
 require_once 'HTMLPurifier/AttrDef/CSS/Color.php';
 require_once 'HTMLPurifier/AttrDef/CSS/Composite.php';
+require_once 'HTMLPurifier/AttrDef/CSS/DenyElementDecorator.php';
 require_once 'HTMLPurifier/AttrDef/CSS/Font.php';
 require_once 'HTMLPurifier/AttrDef/CSS/FontFamily.php';
 require_once 'HTMLPurifier/AttrDef/CSS/Length.php';
@@ -16,6 +17,7 @@ require_once 'HTMLPurifier/AttrDef/CSS/Percentage.php';
 require_once 'HTMLPurifier/AttrDef/CSS/TextDecoration.php';
 require_once 'HTMLPurifier/AttrDef/CSS/URI.php';
 require_once 'HTMLPurifier/AttrDef/Enum.php';
+require_once 'HTMLPurifier/AttrDef/Switch.php';

 HTMLPurifier_ConfigSchema::define(
    'CSS', 'DefinitionRev', 1, 'int', '
@@ -26,6 +28,20 @@ HTMLPurifier_ConfigSchema::define(
 </p>
 ');

+HTMLPurifier_ConfigSchema::define(
+    'CSS', 'MaxImgLength', '1200px', 'string/null', '
+<p>
+ This parameter sets the maximum allowed length on <code>img</code> tags,
+ effectively the <code>width</code> and <code>height</code> properties.
+ Only absolute units of measurement (in, pt, pc, mm, cm) and pixels (px) are allowed. This is
+ in place to prevent imagecrash attacks; disable with null at your own risk.
+ This directive is similar to %HTML.MaxImgLength, and both should be
+ concurrently edited, although there are
+ subtle differences in the input format (the CSS max is a number with
+ a unit).
+</p>
+');
+
 /**
 * Defines allowed CSS attributes and what their values are.
 * @see HTMLPurifier_HTMLDefinition
@@ -116,7 +132,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
        $this->info['border-left-width'] = 
        $this->info['border-right-width'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
            new HTMLPurifier_AttrDef_Enum(array('thin', 'medium', 'thick')),
-            new HTMLPurifier_AttrDef_CSS_Length(true) //disallow negative
+            new HTMLPurifier_AttrDef_CSS_Length('0') //disallow negative
        ));
        
        $this->info['border-width'] = new HTMLPurifier_AttrDef_CSS_Multiple($border_width);
@@ -142,7 +158,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
        $this->info['line-height'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
            new HTMLPurifier_AttrDef_Enum(array('normal')),
            new HTMLPurifier_AttrDef_CSS_Number(true), // no negatives
-            new HTMLPurifier_AttrDef_CSS_Length(true),
+            new HTMLPurifier_AttrDef_CSS_Length('0'),
            new HTMLPurifier_AttrDef_CSS_Percentage(true)
        ));
        
@@ -164,7 +180,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
        $this->info['padding-bottom'] = 
        $this->info['padding-left'] = 
        $this->info['padding-right'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
-            new HTMLPurifier_AttrDef_CSS_Length(true),
+            new HTMLPurifier_AttrDef_CSS_Length('0'),
            new HTMLPurifier_AttrDef_CSS_Percentage(true)
        ));
        
@@ -175,13 +191,25 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
            new HTMLPurifier_AttrDef_CSS_Percentage()
        ));
        
-        $this->info['width'] =
-        $this->info['height'] = 
-        new HTMLPurifier_AttrDef_CSS_Composite(array(
-            new HTMLPurifier_AttrDef_CSS_Length(true),
+        $trusted_wh = new HTMLPurifier_AttrDef_CSS_Composite(array(
+            new HTMLPurifier_AttrDef_CSS_Length('0'),
            new HTMLPurifier_AttrDef_CSS_Percentage(true),
            new HTMLPurifier_AttrDef_Enum(array('auto'))
        ));
+        $max = $config->get('CSS', 'MaxImgLength');
+        $this->info['width'] =
+        $this->info['height'] =
+            $max === null ?
+            $trusted_wh : 
+            new HTMLPurifier_AttrDef_Switch('img',
+                // For img tags:
+                new HTMLPurifier_AttrDef_CSS_Composite(array(
+                    new HTMLPurifier_AttrDef_CSS_Length('0', $max),
+                    new HTMLPurifier_AttrDef_Enum(array('auto'))
+                )),
+                // For everyone else:
+                $trusted_wh
+            );
        
        $this->info['text-decoration'] = new HTMLPurifier_AttrDef_CSS_TextDecoration();
        
--- a/library/HTMLPurifier/ChildDef/Optional.php
+++ b/library/HTMLPurifier/ChildDef/Optional.php
@@ -15,7 +15,10 @@ class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
    var $type = 'optional';
    function validateChildren($tokens_of_children, $config, &$context) {
        $result = parent::validateChildren($tokens_of_children, $config, $context);
-        if ($result === false) return array();
+        if ($result === false) {
+            if (empty($tokens_of_children)) return true;
+            else return array();
+        }
        return $result;
    }
 }
--- a/library/HTMLPurifier/Config.php
+++ b/library/HTMLPurifier/Config.php
@@ -42,7 +42,7 @@ class HTMLPurifier_Config
    /**
     * HTML Purifier's version
     */
-    var $version = '2.1.2';
+    var $version = '2.1.5';
    
    /**
     * Two-level associative array of configuration directives
--- a/library/HTMLPurifier/DefinitionCache.php
+++ b/library/HTMLPurifier/DefinitionCache.php
@@ -120,6 +120,9 @@ class HTMLPurifier_DefinitionCache
    
    /**
     * Clears all expired (older version or revision) objects from cache
+     * @note Be careful implementing this method as flush. Flush must
+     *       not interfere with other Definition types, and cleanup()
+     *       should not be repeatedly called by userland code.
     */
    function cleanup($config) {
        trigger_error('Cannot call abstract method', E_USER_ERROR);
--- a/library/HTMLPurifier/DefinitionCacheFactory.php
+++ b/library/HTMLPurifier/DefinitionCacheFactory.php
@@ -1,6 +1,7 @@
 <?php

 require_once 'HTMLPurifier/DefinitionCache.php';
+require_once 'HTMLPurifier/DefinitionCache/Serializer.php';

 HTMLPurifier_ConfigSchema::define(
    'Cache', 'DefinitionImpl', 'Serializer', 'string/null', '
@@ -10,10 +11,6 @@ to disable caching (not recommended, as you will see a definite
 performance degradation). This directive has been available since 2.0.0.
 ');

-HTMLPurifier_ConfigSchema::defineAllowedValues(
-    'Cache', 'DefinitionImpl', array('Serializer')
-);
-
 HTMLPurifier_ConfigSchema::defineAlias(
    'Core', 'DefinitionCache',
    'Cache', 'DefinitionImpl'
@@ -27,6 +24,7 @@ class HTMLPurifier_DefinitionCacheFactory
 {
    
    var $caches = array('Serializer' => array());
+    var $implementations = array();
    var $decorators = array();
    
    /**
@@ -51,14 +49,21 @@ class HTMLPurifier_DefinitionCacheFactory
        return $instance;
    }
    
+    /**
+     * Registers a new definition cache object
+     * @param $short Short name of cache object, for reference
+     * @param $long Full class name of cache object, for construction 
+     */
+    function register($short, $long) {
+        $this->implementations[$short] = $long;
+    }
+    
    /**
     * Factory method that creates a cache object based on configuration
     * @param $name Name of definitions handled by cache
     * @param $config Instance of HTMLPurifier_Config
     */
    function &create($type, $config) {
-        // only one implementation as for right now, $config will
-        // be used to determine implementation
        $method = $config->get('Cache', 'DefinitionImpl');
        if ($method === null) {
            $null = new HTMLPurifier_DefinitionCache_Null($type);
@@ -67,7 +72,17 @@ class HTMLPurifier_DefinitionCacheFactory
        if (!empty($this->caches[$method][$type])) {
            return $this->caches[$method][$type];
        }
-        $cache = new HTMLPurifier_DefinitionCache_Serializer($type);
+        if (
+          isset($this->implementations[$method]) &&
+          class_exists($class = $this->implementations[$method])
+        ) {
+            $cache = new $class($type);
+        } else {
+            if ($method != 'Serializer') {
+                trigger_error("Unrecognized DefinitionCache $method, using Serializer instead", E_USER_WARNING);
+            }
+            $cache = new HTMLPurifier_DefinitionCache_Serializer($type);
+        }
        foreach ($this->decorators as $decorator) {
            $new_cache = $decorator->decorate($cache);
            // prevent infinite recursion in PHP 4
--- a/library/HTMLPurifier/ElementDef.php
+++ b/library/HTMLPurifier/ElementDef.php
@@ -82,7 +82,7 @@ class HTMLPurifier_ElementDef
    
    /**
     * List of the names of required attributes this element has. Dynamically
-     * populated.
+     * populated by HTMLPurifier_HTMLDefinition::getElement
     * @public
     */
    var $required_attr = array();
--- a/library/HTMLPurifier/Encoder.php
+++ b/library/HTMLPurifier/Encoder.php
@@ -62,6 +62,12 @@ class HTMLPurifier_Encoder
        trigger_error('Cannot instantiate encoder, call methods statically', E_USER_ERROR);
    }
    
+    /**
+     * Error-handler that mutes errors, alternative to shut-up operator.
+     */
+    function muteErrorHandler() {}
+    
+    /**
    /**
     * Cleans a UTF-8 string for well-formedness and SGML validity
     * 
@@ -90,26 +96,13 @@ class HTMLPurifier_Encoder
     */
    function cleanUTF8($str, $force_php = false) {
        
-        static $non_sgml_chars = array();
-        if (empty($non_sgml_chars)) {
-            for ($i = 0; $i <= 31; $i++) {
-                // non-SGML ASCII chars
-                // save \r, \t and \n
-                if ($i == 9 || $i == 13 || $i == 10) continue;
-                $non_sgml_chars[chr($i)] = '';
-            }
-            for ($i = 127; $i <= 159; $i++) {
-                $non_sgml_chars[HTMLPurifier_Encoder::unichr($i)] = '';
-            }
-        }
-        
-        static $iconv = null;
-        if ($iconv === null) $iconv = function_exists('iconv');
-        
-        if ($iconv && !$force_php) {
-            // do the shortcut way
-            $str = @iconv('UTF-8', 'UTF-8//IGNORE', $str);
-            return strtr($str, $non_sgml_chars);
+        // UTF-8 validity is checked since PHP 4.3.5
+        // This is an optimization: if the string is already valid UTF-8, no
+        // need to do PHP stuff. 99% of the time, this will be the case.
+        // The regexp matches the XML char production, as well as excluding
+        // non-SGML codepoints U+007F to U+009F
+        if (preg_match('/^[\x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]*$/Du', $str)) {
+            return $str;
        }
        
        $mState = 0; // cached expected number of octets after the current octet
@@ -220,7 +213,17 @@ class HTMLPurifier_Encoder
                        ) {
                            
                        } elseif (0xFEFF != $mUcs4 && // omit BOM
-                            !($mUcs4 >= 128 && $mUcs4 <= 159) // omit non-SGML
+                            // check for valid Char unicode codepoints
+                            (
+                                0x9 == $mUcs4 ||
+                                0xA == $mUcs4 ||
+                                0xD == $mUcs4 ||
+                                (0x20 <= $mUcs4 && 0x7E >= $mUcs4) ||
+                                // 7F-9F is not strictly prohibited by XML,
+                                // but it is non-SGML, and thus we don't allow it
+                                (0xA0 <= $mUcs4 && 0xD7FF >= $mUcs4) ||
+                                (0x10000 <= $mUcs4 && 0x10FFFF >= $mUcs4)
+                            )
                        ) {
                            $out .= $char;
                        }
@@ -313,14 +316,23 @@ class HTMLPurifier_Encoder
     * @static
     */
    function convertToUTF8($str, $config, &$context) {
-        static $iconv = null;
-        if ($iconv === null) $iconv = function_exists('iconv');
        $encoding = $config->get('Core', 'Encoding');
        if ($encoding === 'utf-8') return $str;
+        static $iconv = null;
+        if ($iconv === null) $iconv = function_exists('iconv');
+        set_error_handler(array('HTMLPurifier_Encoder', 'muteErrorHandler'));
        if ($iconv && !$config->get('Test', 'ForceNoIconv')) {
-            return @iconv($encoding, 'utf-8//IGNORE', $str);
+            $str = iconv($encoding, 'utf-8//IGNORE', $str);
+            // If the string is bjorked by Shift_JIS or a similar encoding
+            // that doesn't support all of ASCII, convert the naughty
+            // characters to their true byte-wise ASCII/UTF-8 equivalents.
+            $str = strtr($str, HTMLPurifier_Encoder::testEncodingSupportsASCII($encoding));
+            restore_error_handler();
+            return $str;
        } elseif ($encoding === 'iso-8859-1') {
-            return @utf8_encode($str);
+            $str = utf8_encode($str);
+            restore_error_handler();
+            return $str;
        }
        trigger_error('Encoding not supported', E_USER_ERROR);
    }
@@ -332,17 +344,31 @@ class HTMLPurifier_Encoder
     *       characters being omitted.
     */
    function convertFromUTF8($str, $config, &$context) {
-        static $iconv = null;
-        if ($iconv === null) $iconv = function_exists('iconv');
        $encoding = $config->get('Core', 'Encoding');
        if ($encoding === 'utf-8') return $str;
-        if ($config->get('Core', 'EscapeNonASCIICharacters')) {
+        static $iconv = null;
+        if ($iconv === null) $iconv = function_exists('iconv');
+        if ($escape = $config->get('Core', 'EscapeNonASCIICharacters')) {
            $str = HTMLPurifier_Encoder::convertToASCIIDumbLossless($str);
        }
+        set_error_handler(array('HTMLPurifier_Encoder', 'muteErrorHandler'));
        if ($iconv && !$config->get('Test', 'ForceNoIconv')) {
-            return @iconv('utf-8', $encoding . '//IGNORE', $str);
+            // Undo our previous fix in convertToUTF8, otherwise iconv will barf
+            $ascii_fix = HTMLPurifier_Encoder::testEncodingSupportsASCII($encoding);
+            if (!$escape && !empty($ascii_fix)) {
+                $clear_fix = array();
+                foreach ($ascii_fix as $utf8 => $native) $clear_fix[$utf8] = '';
+                $str = strtr($str, $clear_fix);
+            }
+            $str = strtr($str, array_flip($ascii_fix));
+            // Normal stuff
+            $str = iconv('utf-8', $encoding . '//IGNORE', $str);
+            restore_error_handler();
+            return $str;
        } elseif ($encoding === 'iso-8859-1') {
-            return @utf8_decode($str);
+            $str = utf8_decode($str);
+            restore_error_handler();
+            return $str;
        }
        trigger_error('Encoding not supported', E_USER_ERROR);
    }
@@ -395,6 +421,47 @@ class HTMLPurifier_Encoder
        return $result;
    }
    
+    /**
+     * This expensive function tests whether or not a given character
+     * encoding supports ASCII. 7/8-bit encodings like Shift_JIS will
+     * fail this test, and require special processing. Variable width
+     * encodings shouldn't ever fail.
+     * 
+     * @param string $encoding Encoding name to test, as per iconv format
+     * @param bool $bypass Whether or not to bypass the precompiled arrays.
+     * @return Array of UTF-8 characters to their corresponding ASCII,
+     *      which can be used to "undo" any overzealous iconv action.
+     */
+    function testEncodingSupportsASCII($encoding, $bypass = false) {
+        static $encodings = array();
+        if (!$bypass) {
+            if (isset($encodings[$encoding])) return $encodings[$encoding];
+            $lenc = strtolower($encoding);
+            switch ($lenc) {
+                case 'shift_jis':
+                    return array("\xC2\xA5" => '\\', "\xE2\x80\xBE" => '~');
+                case 'johab':
+                    return array("\xE2\x82\xA9" => '\\');
+            }
+            if (strpos($lenc, 'iso-8859-') === 0) return array();
+        }
+        $ret = array();
+        set_error_handler(array('HTMLPurifier_Encoder', 'muteErrorHandler'));
+        if (iconv('UTF-8', $encoding, 'a') === false) return false;
+        for ($i = 0x20; $i <= 0x7E; $i++) { // all printable ASCII chars
+            $c = chr($i);
+            if (iconv('UTF-8', "$encoding//IGNORE", $c) === '') {
+                // Reverse engineer: what's the UTF-8 equiv of this byte
+                // sequence? This assumes that there's no variable width
+                // encoding that doesn't support ASCII.
+                $ret[iconv($encoding, 'UTF-8//IGNORE', $c)] = $c;
+            }
+        }
+        restore_error_handler();
+        $encodings[$encoding] = $ret;
+        return $ret;
+    }
+    
    
 }

--- a/library/HTMLPurifier/HTMLDefinition.php
+++ b/library/HTMLPurifier/HTMLDefinition.php
@@ -222,6 +222,8 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
    
    /**
     * Adds a custom attribute to a pre-existing element
+     * @note This is strictly convenience, and does not have a corresponding
+     *       method in HTMLPurifier_HTMLModule
     * @param $element_name String element name to add attribute to
     * @param $attr_name String name of attribute
     * @param $def Attribute definition, can be string or object, see
@@ -229,20 +231,37 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
     */
    function addAttribute($element_name, $attr_name, $def) {
        $module =& $this->getAnonymousModule();
-        $element =& $module->addBlankElement($element_name);
+        if (!isset($module->info[$element_name])) {
+            $element =& $module->addBlankElement($element_name);
+        } else {
+            $element =& $module->info[$element_name];
+        }
        $element->attr[$attr_name] = $def;
    }
    
    /**
     * Adds a custom element to your HTML definition
     * @note See HTMLPurifier_HTMLModule::addElement for detailed 
-     *       parameter descriptions.
+     *       parameter and return value descriptions.
     */
-    function addElement($element_name, $type, $contents, $attr_collections, $attributes) {
+    function &addElement($element_name, $type, $contents, $attr_collections, $attributes) {
        $module =& $this->getAnonymousModule();
        // assume that if the user is calling this, the element
        // is safe. This may not be a good idea
-        $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
+        $element =& $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
+        return $element;
+    }
+    
+    /**
+     * Adds a blank element to your HTML definition, for overriding
+     * existing behavior
+     * @note See HTMLPurifier_HTMLModule::addBlankElement for detailed
+     *       parameter and return value descriptions.
+     */
+    function &addBlankElement($element_name) {
+        $module  =& $this->getAnonymousModule();
+        $element =& $module->addBlankElement($element_name);
+        return $element;
    }
    
    /**
--- a/library/HTMLPurifier/HTMLModule.php
+++ b/library/HTMLPurifier/HTMLModule.php
@@ -219,5 +219,14 @@ class HTMLPurifier_HTMLModule
        }
        return $ret;
    }
+    
+    /**
+     * Lazy load construction of the module after determining whether
+     * or not it's needed, and also when a finalized configuration object
+     * is available.
+     * @param $config Instance of HTMLPurifier_Config
+     */
+    function setup($config) {}
+    
 }

--- a/library/HTMLPurifier/HTMLModule/Bdo.php
+++ b/library/HTMLPurifier/HTMLModule/Bdo.php
@@ -15,7 +15,7 @@ class HTMLPurifier_HTMLModule_Bdo extends HTMLPurifier_HTMLModule
        'I18N' => array('dir' => false)
    );
    
-    function HTMLPurifier_HTMLModule_Bdo() {
+    function setup($config) {
        $bdo =& $this->addElement(
            'bdo', true, 'Inline', 'Inline', array('Core', 'Lang'),
            array(
--- a/library/HTMLPurifier/HTMLModule/Edit.php
+++ b/library/HTMLPurifier/HTMLModule/Edit.php
@@ -12,7 +12,7 @@ class HTMLPurifier_HTMLModule_Edit extends HTMLPurifier_HTMLModule
    
    var $name = 'Edit';
    
-    function HTMLPurifier_HTMLModule_Edit() {
+    function setup($config) {
        $contents = 'Chameleon: #PCDATA | Inline ! #PCDATA | Flow';
        $attr = array(
            'cite' => 'URI',
--- a/library/HTMLPurifier/HTMLModule/Hypertext.php
+++ b/library/HTMLPurifier/HTMLModule/Hypertext.php
@@ -11,7 +11,7 @@ class HTMLPurifier_HTMLModule_Hypertext extends HTMLPurifier_HTMLModule
    
    var $name = 'Hypertext';
    
-    function HTMLPurifier_HTMLModule_Hypertext() {
+    function setup($config) {
        $a =& $this->addElement(
            'a', true, 'Inline', 'Inline', 'Common',
            array(
--- a/library/HTMLPurifier/HTMLModule/Image.php
+++ b/library/HTMLPurifier/HTMLModule/Image.php
@@ -5,6 +5,18 @@ require_once 'HTMLPurifier/HTMLModule.php';
 require_once 'HTMLPurifier/AttrDef/URI.php';
 require_once 'HTMLPurifier/AttrTransform/ImgRequired.php';

+HTMLPurifier_ConfigSchema::define(
+    'HTML', 'MaxImgLength', 1200, 'int/null', '
+<p>
+ This directive controls the maximum number of pixels in the width and
+ height attributes in <code>img</code> tags. This is
+ in place to prevent imagecrash attacks; disable with null at your own risk.
+ This directive is similar to %CSS.MaxImgLength, and both should be
+ concurrently edited, although there are
+ subtle differences in the input format (the HTML max is an integer).
+</p>
+');
+
 /**
 * XHTML 1.1 Image Module provides basic image embedding.
 * @note There is specialized code for removing empty images in
@@ -15,17 +27,26 @@ class HTMLPurifier_HTMLModule_Image extends HTMLPurifier_HTMLModule
    
    var $name = 'Image';
    
-    function HTMLPurifier_HTMLModule_Image() {
+    function setup($config) {
+        $max = $config->get('HTML', 'MaxImgLength');
        $img =& $this->addElement(
            'img', true, 'Inline', 'Empty', 'Common',
            array(
                'alt*' => 'Text',
-                'height' => 'Length',
+                // According to the spec, it's Length, but percents can
+                // be abused, so we allow only Pixels. A trusted module
+                // could overload this with the real value.
+                'height' => 'Pixels#' . $max,
+                'width' => 'Pixels#' . $max,
                'longdesc' => 'URI', 
                'src*' => new HTMLPurifier_AttrDef_URI(true), // embedded
-                'width' => 'Length'
            )
        );
+        if ($max === null || $config->get('HTML', 'Trusted')) {
+            $img->attr['height'] =
+            $img->attr['width'] = 'Length';
+        }
+        
        // kind of strange, but splitting things up would be inefficient
        $img->attr_transform_pre[] =
        $img->attr_transform_post[] =
--- a/library/HTMLPurifier/HTMLModule/Legacy.php
+++ b/library/HTMLPurifier/HTMLModule/Legacy.php
@@ -25,7 +25,7 @@ class HTMLPurifier_HTMLModule_Legacy extends HTMLPurifier_HTMLModule
    
    var $name = 'Legacy';
    
-    function HTMLPurifier_HTMLModule_Legacy() {
+    function setup($config) {
        
        $this->addElement('basefont', true, 'Inline', 'Empty', false, array(
            'color' => 'Color',
--- a/library/HTMLPurifier/HTMLModule/List.php
+++ b/library/HTMLPurifier/HTMLModule/List.php
@@ -21,7 +21,7 @@ class HTMLPurifier_HTMLModule_List extends HTMLPurifier_HTMLModule
    
    var $content_sets = array('Flow' => 'List');
    
-    function HTMLPurifier_HTMLModule_List() {
+    function setup($config) {
        $this->addElement('ol', true, 'List', 'Required: li', 'Common');
        $this->addElement('ul', true, 'List', 'Required: li', 'Common');
        $this->addElement('dl', true, 'List', 'Required: dt | dd', 'Common');
--- a/library/HTMLPurifier/HTMLModule/Object.php
+++ b/library/HTMLPurifier/HTMLModule/Object.php
@@ -12,7 +12,7 @@ class HTMLPurifier_HTMLModule_Object extends HTMLPurifier_HTMLModule
    
    var $name = 'Object';
    
-    function HTMLPurifier_HTMLModule_Object() {
+    function setup($config) {
        
        $this->addElement('object', false, 'Inline', 'Optional: #PCDATA | Flow | param', 'Common', 
            array(
--- a/library/HTMLPurifier/HTMLModule/Presentation.php
+++ b/library/HTMLPurifier/HTMLModule/Presentation.php
@@ -17,7 +17,7 @@ class HTMLPurifier_HTMLModule_Presentation extends HTMLPurifier_HTMLModule
    
    var $name = 'Presentation';
    
-    function HTMLPurifier_HTMLModule_Presentation() {
+    function setup($config) {
        $this->addElement('b',      true, 'Inline', 'Inline', 'Common');
        $this->addElement('big',    true, 'Inline', 'Inline', 'Common');
        $this->addElement('hr',     true, 'Block',  'Empty',  'Common');
--- a/library/HTMLPurifier/HTMLModule/Ruby.php
+++ b/library/HTMLPurifier/HTMLModule/Ruby.php
@@ -11,7 +11,7 @@ class HTMLPurifier_HTMLModule_Ruby extends HTMLPurifier_HTMLModule
    
    var $name = 'Ruby';
    
-    function HTMLPurifier_HTMLModule_Ruby() {
+    function setup($config) {
        $this->addElement('ruby', true, 'Inline',
            'Custom: ((rb, (rt | (rp, rt, rp))) | (rbc, rtc, rtc?))',
            'Common');
--- a/library/HTMLPurifier/HTMLModule/Scripting.php
+++ b/library/HTMLPurifier/HTMLModule/Scripting.php
@@ -32,7 +32,7 @@ class HTMLPurifier_HTMLModule_Scripting extends HTMLPurifier_HTMLModule
    var $elements = array('script', 'noscript');
    var $content_sets = array('Block' => 'script | noscript', 'Inline' => 'script | noscript');
    
-    function HTMLPurifier_HTMLModule_Scripting() {
+    function setup($config) {
        // TODO: create custom child-definition for noscript that
        // auto-wraps stray #PCDATA in a similar manner to 
        // blockquote's custom definition (we would use it but
--- a/library/HTMLPurifier/HTMLModule/StyleAttribute.php
+++ b/library/HTMLPurifier/HTMLModule/StyleAttribute.php
@@ -18,7 +18,7 @@ class HTMLPurifier_HTMLModule_StyleAttribute extends HTMLPurifier_HTMLModule
        'Core' => array(0 => array('Style'))
    );
    
-    function HTMLPurifier_HTMLModule_StyleAttribute() {
+    function setup($config) {
        $this->attr_collections['Style']['style'] = new HTMLPurifier_AttrDef_CSS();
    }
    
--- a/library/HTMLPurifier/HTMLModule/Tables.php
+++ b/library/HTMLPurifier/HTMLModule/Tables.php
@@ -11,7 +11,7 @@ class HTMLPurifier_HTMLModule_Tables extends HTMLPurifier_HTMLModule
    
    var $name = 'Tables';
    
-    function HTMLPurifier_HTMLModule_Tables() {
+    function setup($config) {
        
        $this->addElement('caption', true, false, 'Inline', 'Common');
        
--- a/library/HTMLPurifier/HTMLModule/Target.php
+++ b/library/HTMLPurifier/HTMLModule/Target.php
@@ -10,7 +10,7 @@ class HTMLPurifier_HTMLModule_Target extends HTMLPurifier_HTMLModule
    
    var $name = 'Target';
    
-    function HTMLPurifier_HTMLModule_Target() {
+    function setup($config) {
        $elements = array('a');
        foreach ($elements as $name) {
            $e =& $this->addBlankElement($name);
--- a/library/HTMLPurifier/HTMLModule/Text.php
+++ b/library/HTMLPurifier/HTMLModule/Text.php
@@ -22,7 +22,7 @@ class HTMLPurifier_HTMLModule_Text extends HTMLPurifier_HTMLModule
        'Flow' => 'Heading | Block | Inline'
    );
    
-    function HTMLPurifier_HTMLModule_Text() {
+    function setup($config) {
        
        // Inline Phrasal -------------------------------------------------
        $this->addElement('abbr',    true, 'Inline', 'Inline', 'Common');
--- a/library/HTMLPurifier/HTMLModule/Tidy.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy.php
@@ -70,7 +70,7 @@ class HTMLPurifier_HTMLModule_Tidy extends HTMLPurifier_HTMLModule
     * @todo Wildcard matching and error reporting when an added or
     *       subtracted fix has no effect.
     */
-    function construct($config) {
+    function setup($config) {
        
        // create fixes, initialize fixesForLevel
        $fixes = $this->makeFixes();
--- a/library/HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php
@@ -13,6 +13,8 @@ require_once 'HTMLPurifier/AttrTransform/Length.php';
 require_once 'HTMLPurifier/AttrTransform/ImgSpace.php';
 require_once 'HTMLPurifier/AttrTransform/EnumToCSS.php';

+require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
+
 class HTMLPurifier_HTMLModule_Tidy_XHTMLAndHTML4 extends
      HTMLPurifier_HTMLModule_Tidy
 {
@@ -188,5 +190,17 @@ class HTMLPurifier_HTMLModule_Tidy_Strict extends
 {
    var $name = 'Tidy_Strict';
    var $defaultLevel = 'light';
+    
+    function makeFixes() {
+        $r = parent::makeFixes();
+        $r['blockquote#content_model_type'] = 'strictblockquote';
+        return $r;
+    }
+    
+    var $defines_child_def = true;
+    function getChildDef($def) {
+        if ($def->content_model_type != 'strictblockquote') return parent::getChildDef($def);
+        return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
+    }
 }

--- a/library/HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php
@@ -1,26 +0,0 @@
-<?php
-
-require_once 'HTMLPurifier/HTMLModule/Tidy.php';
-require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
-
-class HTMLPurifier_HTMLModule_Tidy_XHTMLStrict extends
-      HTMLPurifier_HTMLModule_Tidy
-{
-    
-    var $name = 'Tidy_XHTMLStrict';
-    var $defaultLevel = 'light';
-    
-    function makeFixes() {
-        $r = array();
-        $r['blockquote#content_model_type'] = 'strictblockquote';
-        return $r;
-    }
-    
-    var $defines_child_def = true;
-    function getChildDef($def) {
-        if ($def->content_model_type != 'strictblockquote') return false;
-        return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
-    }
-    
-}
-
--- a/library/HTMLPurifier/HTMLModuleManager.php
+++ b/library/HTMLPurifier/HTMLModuleManager.php
@@ -35,7 +35,6 @@ require_once 'HTMLPurifier/HTMLModule/Object.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/XHTML.php';
-require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php';
 require_once 'HTMLPurifier/HTMLModule/Tidy/Proprietary.php';

 HTMLPurifier_ConfigSchema::define(
@@ -209,7 +208,7 @@ class HTMLPurifier_HTMLModuleManager
        $this->doctypes->register(
            'XHTML 1.0 Strict', true,
            array_merge($common, $xml, $non_xml),
-            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_XHTMLStrict', 'Tidy_Proprietary'),
+            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Strict', 'Tidy_Proprietary'),
            array(),
            '-//W3C//DTD XHTML 1.0 Strict//EN',
            'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
@@ -218,7 +217,7 @@ class HTMLPurifier_HTMLModuleManager
        $this->doctypes->register(
            'XHTML 1.1', true,
            array_merge($common, $xml, array('Ruby')),
-            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_XHTMLStrict'), // Tidy_XHTML1_1
+            array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_Strict'), // Tidy_XHTML1_1
            array(),
            '-//W3C//DTD XHTML 1.1//EN',
            'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'
@@ -343,13 +342,12 @@ class HTMLPurifier_HTMLModuleManager
        
        foreach ($modules as $module) {
            $this->processModule($module);
+            $this->modules[$module]->setup($config);
        }
        
        foreach ($this->doctype->tidyModules as $module) {
            $this->processModule($module);
-            if (method_exists($this->modules[$module], 'construct')) {
-                $this->modules[$module]->construct($config);
-            }
+            $this->modules[$module]->setup($config);
        }
        
        // setup lookup table based on all valid modules
--- a/library/HTMLPurifier/IDAccumulator.php
+++ b/library/HTMLPurifier/IDAccumulator.php
@@ -1,11 +1,15 @@
 <?php

+HTMLPurifier_ConfigSchema::define(
+    'Attr', 'IDBlacklist', array(), 'list',
+    'Array of IDs not allowed in the document.'
+);
+
 /**
 * Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
 * @note In Slashdot-speak, dupe means duplicate.
- * @note This class does not accept $config or $context, thus, it is the
- *       burden of the callee to register the appropriate errors or
- *       configuration.
+ * @note The default constructor does not accept $config or $context objects:
+ *       you must use the static build() factory method to perform initialization.
 */
 class HTMLPurifier_IDAccumulator
 {
@@ -16,6 +20,19 @@ class HTMLPurifier_IDAccumulator
     */
    var $ids = array();
    
+    /**
+     * Builds an IDAccumulator, also initializing the default blacklist
+     * @param $config Instance of HTMLPurifier_Config
+     * @param $context Instance of HTMLPurifier_Context
+     * @return Fully initialized HTMLPurifier_IDAccumulator
+     * @static
+     */
+    function build($config, &$context) {
+        $acc = new HTMLPurifier_IDAccumulator();
+        $acc->load($config->get('Attr', 'IDBlacklist'));
+        return $acc;
+    }
+    
    /**
     * Add an ID to the lookup table.
     * @param $id ID to be added.
--- a/library/HTMLPurifier/Injector.php
+++ b/library/HTMLPurifier/Injector.php
@@ -4,6 +4,9 @@
 * Injects tokens into the document while parsing for well-formedness.
 * This enables "formatter-like" functionality such as auto-paragraphing,
 * smiley-ification and linkification to take place.
+ * 
+ * @todo Allow injectors to request a re-run on their output. This 
+ *       would help if an operation is recursive.
 */
 class HTMLPurifier_Injector
 {
@@ -107,5 +110,12 @@ class HTMLPurifier_Injector
     */
    function handleElement(&$token) {}
    
+    /**
+     * Notifier that is called when an end token is processed
+     * @note This differs from handlers in that the token is read-only
+     */
+    function notifyEnd($token) {}
+    
+    
 }

--- a/library/HTMLPurifier/Injector/AutoParagraph.php
+++ b/library/HTMLPurifier/Injector/AutoParagraph.php
@@ -6,20 +6,28 @@ HTMLPurifier_ConfigSchema::define(
    'AutoFormat', 'AutoParagraph', false, 'bool', '
 <p>
  This directive turns on auto-paragraphing, where double newlines are
-  converted in to paragraphs whenever possible. Auto-paragraphing
-  applies when:
+  converted into paragraphs whenever possible. Auto-paragraphing:
 </p>
 <ul>
-  <li>There are inline elements or text in the root node</li>
-  <li>There are inline elements or text with double newlines or
-      block elements in nodes that allow paragraph tags</li>
-  <li>There are double newlines in paragraph tags</li>
+  <li>Always applies to inline elements or text in the root node,</li>
+  <li>Applies to inline elements or text with double newlines in nodes
+      that allow paragraph tags,</li>
+  <li>Applies to double newlines in paragraph tags</li>
 </ul>
 <p>
  <code>p</code> tags must be allowed for this directive to take effect.
  We do not use <code>br</code> tags for paragraphing, as that is
  semantically incorrect.
 </p>
+<p>
+  To prevent auto-paragraphing as a content-producer, refrain from using
+  double-newlines except to specify a new paragraph or in contexts where
+  they have special meaning (whitespace usually has no meaning except in
+  tags like <code>pre</code>, so this should not be difficult). To prevent
+  the paragraphing of inline text adjacent to block elements, wrap the text
+  in <code>div</code> tags (the behavior is slightly different outside of
+  the root node).
+</p>
 <p>
  This directive has been available since 2.0.1.
 </p>
@@ -62,19 +70,27 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
                $ok = false;
                // test if up-coming tokens are either block or have
                // a double newline in them
+                $nesting = 0;
                for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
                    if ($this->inputTokens[$i]->type == 'start'){
                        if (!$this->_isInline($this->inputTokens[$i])) {
-                            $ok = true;
+                            // we haven't found a double-newline, and
+                            // we've hit a block element, so don't paragraph
+                            $ok = false;
+                            break;
                        }
-                        break;
+                        $nesting++;
+                    }
+                    if ($this->inputTokens[$i]->type == 'end') {
+                        if ($nesting <= 0) break;
+                        $nesting--;
                    }
-                    if ($this->inputTokens[$i]->type == 'end') break;
                    if ($this->inputTokens[$i]->type == 'text') {
+                        // found it!
                        if (strpos($this->inputTokens[$i]->data, "\n\n") !== false) {
                            $ok = true;
+                            break;
                        }
-                        if (!$this->inputTokens[$i]->is_whitespace) break;
                    }
                }
                if ($ok) {
--- a/library/HTMLPurifier/Language.php
+++ b/library/HTMLPurifier/Language.php
@@ -25,6 +25,13 @@ class HTMLPurifier_Language
     */
    var $errorNames = array();
    
+    /**
+     * True if no message file was found for this language, so English
+     * is being used instead. Check this if you'd like to notify the
+     * user that they've used a non-supported language.
+     */
+    var $error = false;
+    
    /**
     * Has the language object been loaded yet?
     * @private
--- a/library/HTMLPurifier/Language/messages/en-x-testmini.php
+++ b/library/HTMLPurifier/Language/messages/en-x-testmini.php
@@ -0,0 +1,11 @@
+<?php
+
+// private language message file for unit testing purposes
+// this language file has no class associated with it
+
+$fallback = 'en';
+
+$messages = array(
+    'HTMLPurifier' => 'HTML Purifier XNone'
+);
+
--- a/library/HTMLPurifier/LanguageFactory.php
+++ b/library/HTMLPurifier/LanguageFactory.php
@@ -16,6 +16,7 @@ This directive has been available since 2.0.0.
 * caching and fallbacks.
 * @note Thanks to MediaWiki for the general logic, although this version
 *       has been entirely rewritten
+ * @todo Serialized cache for languages
 */
 class HTMLPurifier_LanguageFactory
 {
@@ -89,40 +90,42 @@ class HTMLPurifier_LanguageFactory
     * Creates a language object, handles class fallbacks
     * @param $config Instance of HTMLPurifier_Config
     * @param $context Instance of HTMLPurifier_Context
+     * @param $code Code to override configuration with. Private parameter.
     */
-    function create($config, &$context) {
+    function create($config, &$context, $code = false) {
        
        // validate language code
-        $code = $this->validator->validate(
-          $config->get('Core', 'Language'), $config, $context
-        );
+        if ($code === false) {
+            $code = $this->validator->validate(
+              $config->get('Core', 'Language'), $config, $context
+            );
+        } else {
+            $code = $this->validator->validate($code, $config, $context);
+        }
        if ($code === false) $code = 'en'; // malformed code becomes English
        
        $pcode = str_replace('-', '_', $code); // make valid PHP classname
        static $depth = 0; // recursion protection
        
        if ($code == 'en') {
-            $class = 'HTMLPurifier_Language';
-            $file  = $this->dir . '/Language.php';
+            $lang = new HTMLPurifier_Language($config, $context);
        } else {
            $class = 'HTMLPurifier_Language_' . $pcode;
            $file  = $this->dir . '/Language/classes/' . $code . '.php';
-            // PHP5/APC deps bug workaround can go here
-            // you can bypass the conditional include by loading the
-            // file yourself
-            if (file_exists($file) && !class_exists($class)) {
-                include_once $file;
-         			}
-        }
-        
-        if (!class_exists($class)) {
-            // go fallback
-            $fallback = HTMLPurifier_LanguageFactory::getFallbackFor($code);
-            $depth++;
-            $lang = HTMLPurifier_LanguageFactory::factory( $fallback );
-            $depth--;
-        } else {
-            $lang = new $class($config, $context);
+            if (file_exists($file)) {
+                include $file;
+                $lang = new $class($config, $context);
+            } else {
+                // Go fallback
+                $raw_fallback = $this->getFallbackFor($code);
+                $fallback = $raw_fallback ? $raw_fallback : 'en';
+                $depth++;
+                $lang = $this->create($config, $context, $fallback);
+                if (!$raw_fallback) {
+                    $lang->error = true;
+                }
+                $depth--;
+            }
        }
        $lang->code = $code;
        
--- a/library/HTMLPurifier/Length.php
+++ b/library/HTMLPurifier/Length.php
@@ -0,0 +1,111 @@
+<?php
+
+/**
+ * Represents a measurable length, with a string numeric magnitude
+ * and a unit. This object is immutable.
+ */
+class HTMLPurifier_Length
+{
+    
+    /**
+     * String numeric magnitude.
+     */
+    var $n;
+    
+    /**
+     * String unit. False is permitted if $n = 0.
+     */
+    var $unit;
+    
+    /**
+     * Whether or not this length is valid. Null if not calculated yet.
+     */
+    var $isValid;
+    
+    /**
+     * @param number $n Magnitude
+     * @param string $u Unit
+     */
+    function HTMLPurifier_Length($n = '0', $u = false) {
+        $this->n = (string) $n;
+        $this->unit = $u !== false ? (string) $u : false;
+    }
+    
+    /**
+     * @param string $s Unit string, like '2em' or '3.4in'
+     * @warning Does not perform validation.
+     */
+    function make($s) {
+        if (is_a($s, 'HTMLPurifier_Length')) return $s;
+        $n_length = strspn($s, '1234567890.+-');
+        $n = substr($s, 0, $n_length);
+        $unit = substr($s, $n_length);
+        if ($unit === '') $unit = false;
+        return new HTMLPurifier_Length($n, $unit);
+    }
+    
+    /**
+     * Validates the number and unit.
+     */
+    function validate() {
+        // Special case:
+
+        static $allowedUnits = array(
+            'em' => true, 'ex' => true, 'px' => true, 'in' => true,
+            'cm' => true, 'mm' => true, 'pt' => true, 'pc' => true
+        );
+        if ($this->n === '+0' || $this->n === '-0') $this->n = '0';
+        if ($this->n === '0' && $this->unit === false) return true;
+        if (!ctype_lower($this->unit)) $this->unit = strtolower($this->unit);
+        if (!isset($allowedUnits[$this->unit])) return false;
+        // Hack:
+        $def = new HTMLPurifier_AttrDef_CSS_Number();
+        $a = false; // hack hack
+        $result = $def->validate($this->n, $a, $a);
+        if ($result === false) return false;
+        $this->n = $result;
+        return true;
+    }
+    
+    /**
+     * Returns string representation of number.
+     */
+    function toString() {
+        if (!$this->isValid()) return false;
+        return $this->n . $this->unit;
+    }
+    
+    /**
+     * Retrieves string numeric magnitude.
+     */
+    function getN() {return $this->n;}
+    
+    /**
+     * Retrieves string unit.
+     */
+    function getUnit() {return $this->unit;}
+    
+    /**
+     * Returns true if this length unit is valid.
+     */
+    function isValid() {
+        if ($this->isValid === null) $this->isValid = $this->validate();
+        return $this->isValid;
+    }
+    
+    /**
+     * Compares two lengths: positive if this one is greater, negative if less, 0 if equal.
+     * @warning If both values are too large or small, this calculation will
+     *          not work properly
+     */
+    function compareTo($l) {
+        if ($l === false) return false;
+        if ($l->unit !== $this->unit) {
+            $converter = new HTMLPurifier_UnitConverter();
+            $l = $converter->convert($l, $this->unit);
+            if ($l === false) return false;
+        }
+        return $this->n - $l->n;
+    }
+    
+}
--- a/library/HTMLPurifier/Lexer.php
+++ b/library/HTMLPurifier/Lexer.php
@@ -13,11 +13,14 @@ if (version_compare(PHP_VERSION, "5", ">=")) {
 }

 HTMLPurifier_ConfigSchema::define(
-    'Core', 'AcceptFullDocuments', true, 'bool',
-    'This parameter determines whether or not the filter should accept full '.
-    'HTML documents, not just HTML fragments.  When on, it will '.
-    'drop all sections except the content between body.'
-);
+    'Core', 'ConvertDocumentToFragment', true, 'bool', '
+This parameter determines whether or not the filter should convert
+input that is a full document with html and body tags to a fragment
+of just the contents of a body tag. This parameter is simply something
+HTML Purifier can do during an edge-case: for most inputs, this
+processing is not necessary.
+');
+HTMLPurifier_ConfigSchema::defineAlias('Core', 'AcceptFullDocuments', 'Core', 'ConvertDocumentToFragment');

 HTMLPurifier_ConfigSchema::define(
    'Core', 'LexerImpl', null, 'mixed/null', '
@@ -316,7 +319,7 @@ class HTMLPurifier_Lexer
    function normalize($html, $config, &$context) {
        
        // extract body from document if applicable
-        if ($config->get('Core', 'AcceptFullDocuments')) {
+        if ($config->get('Core', 'ConvertDocumentToFragment')) {
            $html = $this->extractBody($html);
        }
        
--- a/library/HTMLPurifier/Lexer/DOMLex.php
+++ b/library/HTMLPurifier/Lexer/DOMLex.php
@@ -90,10 +90,27 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
            $tokens[] = $this->factory->createText($node->data);
            return;
        } elseif ($node->nodeType === XML_CDATA_SECTION_NODE) {
-            // undo DOM's special treatment of <script> tags
-            $tokens[] = $this->factory->createText($this->parseData($node->data));
+            // undo libxml's special treatment of <script> and <style> tags
+            $last = end($tokens);
+            $data = $node->data;
+            // (note $node->tagname is already normalized)
+            if ($last instanceof HTMLPurifier_Token_Start && $last->name == 'script') {
+                $new_data = trim($data);
+                if (substr($new_data, 0, 4) === '<!--') {
+                    $data = substr($new_data, 4);
+                    if (substr($data, -3) === '-->') {
+                        $data = substr($data, 0, -3);
+                    } else {
+                        // Highly suspicious! Not sure what to do...
+                    }
+                }
+            }
+            $tokens[] = $this->factory->createText($this->parseData($data));
            return;
        } elseif ($node->nodeType === XML_COMMENT_NODE) {
+            // this code is only invoked for comments in script/style in versions
+            // of libxml pre-2.6.28 (regular comments, of course, are still
+            // handled regularly)
            $tokens[] = $this->factory->createComment($node->data);
            return;
        } elseif (
--- a/library/HTMLPurifier/Lexer/DirectLex.php
+++ b/library/HTMLPurifier/Lexer/DirectLex.php
@@ -160,9 +160,15 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                
                $segment = substr($html, $cursor, $strlen_segment);
                
+                if ($segment === false) {
+                    // somehow, we attempted to access beyond the end of
+                    // the string; this is defense-in-depth (reported by Nate Abele)
+                    break;
+                }
+                
                // Check if it's a comment
                if (
-                    substr($segment, 0, 3) == '!--'
+                    strncmp('!--', $segment, 3) === 0
                ) {
                    // re-determine segment length, looking for -->
                    $position_comment_end = strpos($html, '-->', $cursor);
@@ -178,12 +184,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                    }
                    $strlen_segment = $position_comment_end - $cursor;
                    $segment = substr($html, $cursor, $strlen_segment);
-                    $token = new
-                        HTMLPurifier_Token_Comment(
-                            substr(
-                                $segment, 3, $strlen_segment - 3
-                            )
-                        );
+                    $token = new HTMLPurifier_Token_Comment(substr($segment, 3));
                    if ($maintain_line_numbers) {
                        $token->line = $current_line;
                        $current_line += $this->substrCount($html, $nl, $cursor, $strlen_segment);
@@ -237,7 +238,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                // trailing slash. Remember, we could have a tag like <br>, so
                // any later token processing scripts must convert improperly
                // classified EmptyTags from StartTags.
-                $is_self_closing= (strrpos($segment,'/') === $strlen_segment-1);
+                $is_self_closing = (strrpos($segment,'/') === $strlen_segment-1);
                if ($is_self_closing) {
                    $strlen_segment--;
                    $segment = substr($segment, 0, $strlen_segment);
--- a/library/HTMLPurifier/Lexer/PH5P.php
+++ b/library/HTMLPurifier/Lexer/PH5P.php
@@ -26,8 +26,6 @@ class HTMLPurifier_Lexer_PH5P extends HTMLPurifier_Lexer_DOMLex {
    
 }

-// begin PHP5P source code here
-
 /*

 Copyright 2007 Jeroen van der Meer <http://jero.net/> 
@@ -3722,7 +3720,7 @@ class HTML5TreeConstructer {
        }
    }

-    private function generateImpliedEndTags(array $exclude = array()) {
+    private function generateImpliedEndTags($exclude = array()) {
        /* When the steps below require the UA to generate implied end tags,
        then, if the current node is a dd element, a dt element, an li element,
        a p element, a td element, a th  element, or a tr element, the UA must
@@ -3736,7 +3734,8 @@ class HTML5TreeConstructer {
        }
    }

-    private function getElementCategory($name) {
+    private function getElementCategory($node) {
+        $name = $node->tagName;
        if(in_array($name, $this->special))
            return self::SPECIAL;

@@ -3884,3 +3883,4 @@ class HTML5TreeConstructer {
        return $this->dom;
    }
 }
+?>
--- a/library/HTMLPurifier/PercentEncoder.php
+++ b/library/HTMLPurifier/PercentEncoder.php
@@ -2,12 +2,68 @@

 /**
 * Class that handles operations involving percent-encoding in URIs.
+ *
+ * @warning
+ *      Be careful when reusing instances of PercentEncoder. The object
+ *      you use for normalize() SHOULD NOT be used for encode(), or
+ *      vice-versa.
 */
 class HTMLPurifier_PercentEncoder
 {
    
    /**
-     * Fix up percent-encoding by decoding unreserved characters and normalizing
+     * Reserved characters to preserve when using encode().
+     */
+    var $preserve = array();
+    
+    /**
+     * @param $preserve String of characters that should be preserved while using encode().
+     */
+    function HTMLPurifier_PercentEncoder($preserve = false) {
+        // unreserved letters, ought to const-ify
+        for ($i = 48; $i <= 57;  $i++) $this->preserve[$i] = true; // digits
+        for ($i = 65; $i <= 90;  $i++) $this->preserve[$i] = true; // upper-case
+        for ($i = 97; $i <= 122; $i++) $this->preserve[$i] = true; // lower-case
+        $this->preserve[45] = true; // Dash         -
+        $this->preserve[46] = true; // Period       .
+        $this->preserve[95] = true; // Underscore   _
+        $this->preserve[126]= true; // Tilde        ~
+        
+        // extra letters not to escape
+        if ($preserve !== false) {
+            for ($i = 0, $c = strlen($preserve); $i < $c; $i++) {
+                $this->preserve[ord($preserve[$i])] = true;
+            }
+        }
+    }
+    
+    /**
+     * Our replacement for urlencode: it percent-encodes every character
+     * except the unreserved ones and any extra characters in $preserve.
+     * @note
+     *      Assumes that the string has already been normalized, making any
+     *      and all percent escape sequences valid. Percents will not be
+     *      re-escaped, regardless of their status in $preserve
+     * @param $string String to be encoded
+     * @return Encoded string.
+     */
+    function encode($string) {
+        $ret = '';
+        for ($i = 0, $c = strlen($string); $i < $c; $i++) {
+            if ($string[$i] !== '%' && !isset($this->preserve[$int = ord($string[$i])]) ) {
+                $ret .= '%' . sprintf('%02X', $int);
+            } else {
+                $ret .= $string[$i];
+            }
+        }
+        return $ret;
+    }
+    
+    /**
+     * Fix up percent-encoding by decoding unreserved characters and normalizing.
+     * @warning This function is affected by $preserve, even though the
+     *          usual desired behavior is for this not to preserve those
+     *          characters. Be careful when reusing instances of PercentEncoder!
     * @param $string String to normalize
     */
    function normalize($string) {
@@ -27,12 +83,7 @@ class HTMLPurifier_PercentEncoder
                continue;
            }
            $int = hexdec($encoding);
-            if (
-                ($int >= 48 && $int <= 57) || // digits
-                ($int >= 65 && $int <= 90) || // uppercase letters
-                ($int >= 97 && $int <= 122) || // lowercase letters
-                $int == 126 || $int == 45 || $int == 46 || $int == 95 // ~-._
-            ) {
+            if (isset($this->preserve[$int])) {
                $ret .= chr($int) . $text;
                continue;
            }
--- a/library/HTMLPurifier/Strategy/FixNesting.php
+++ b/library/HTMLPurifier/Strategy/FixNesting.php
@@ -195,7 +195,7 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
            //################################################################//
            // Process result by interpreting $result
            
-            if ($result === true) {
+            if ($result === true || $child_tokens === $result) {
                // leave the node as is
                
                // register start token as a parental node start
--- a/library/HTMLPurifier/Strategy/MakeWellFormed.php
+++ b/library/HTMLPurifier/Strategy/MakeWellFormed.php
@@ -36,28 +36,23 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
        
        $definition = $config->getHTMLDefinition();
        
-        // CurrentNesting
-        $this->currentNesting = array();
-        $context->register('CurrentNesting', $this->currentNesting);
-        
-        // InputIndex
-        $this->inputIndex = false;
-        $context->register('InputIndex', $this->inputIndex);
-        
-        // InputTokens
-        $context->register('InputTokens', $tokens);
-        $this->inputTokens =& $tokens;
-        
-        // OutputTokens
+        // local variables
        $result = array();
-        $this->outputTokens =& $result;
-        
-        // %Core.EscapeInvalidTags
-        $escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
        $generator = new HTMLPurifier_Generator();
-        
+        $escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
        $e =& $context->get('ErrorCollector', true);
        
+        // member variables
+        $this->currentNesting = array();
+        $this->inputIndex     = false;
+        $this->inputTokens    =& $tokens;
+        $this->outputTokens   =& $result;
+        
+        // context variables
+        $context->register('CurrentNesting', $this->currentNesting);
+        $context->register('InputIndex', $this->inputIndex);
+        $context->register('InputTokens', $tokens);
+        
        // -- begin INJECTOR --
        
        $this->injectors = array();
@@ -95,6 +90,10 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            trigger_error("Cannot enable $name injector because $error is not allowed", E_USER_WARNING);
        }
        
+        // warning: most foreach loops follow the convention $i => $x.
+        // be sure, for PHP4 compatibility, to only perform write operations
+        // directly referencing the object using $i: $x is only safe for reads
+        
        // -- end INJECTOR --
        
        $token = false;
@@ -105,6 +104,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            // if all goes well, this token will be passed through unharmed
            $token = $tokens[$this->inputIndex];
            
+            //printTokens($tokens, $this->inputIndex);
+            
            foreach ($this->injectors as $i => $x) {
                if ($x->skip > 0) $this->injectors[$i]->skip--;
            }
@@ -114,7 +115,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                if ($token->type === 'text') {
                     // injector handler code; duplicated for performance reasons
                     foreach ($this->injectors as $i => $x) {
-                         if (!$x->skip) $x->handleText($token);
+                         if (!$x->skip) $this->injectors[$i]->handleText($token);
                         if (is_array($token)) {
                             $this->currentInjector = $i;
                             break;
@@ -157,10 +158,9 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                    // the parent
                    if (!isset($parent_info->child->elements[$token->name])) {
                        if ($e) $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag auto closed', $parent);
-                        // close the parent, then append the token
+                        // close the parent, then re-loop to reprocess token
                        $result[] = new HTMLPurifier_Token_End($parent->name);
-                        $result[] = $token;
-                        $this->currentNesting[] = $token;
+                        $this->inputIndex--;
                        continue;
                    }
                    
@@ -172,7 +172,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            // injector handler code; duplicated for performance reasons
            if ($ok) {
                foreach ($this->injectors as $i => $x) {
-                    if (!$x->skip) $x->handleElement($token);
+                    if (!$x->skip) $this->injectors[$i]->handleElement($token);
                    if (is_array($token)) {
                        $this->currentInjector = $i;
                        break;
@@ -202,6 +202,9 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            $current_parent = array_pop($this->currentNesting);
            if ($current_parent->name == $token->name) {
                $result[] = $token;
+                foreach ($this->injectors as $i => $x) {
+                    $this->injectors[$i]->notifyEnd($token);
+                }
                continue;
            }
            
@@ -238,16 +241,16 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            
            // okay, we found it, close all the skipped tags
            // note that skipped tags contains the element we need closed
-            $size = count($skipped_tags);
-            for ($i = $size - 1; $i > 0; $i--) {
-                if ($e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
+            for ($i = count($skipped_tags) - 1; $i >= 0; $i--) {
+                if ($i && $e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$i]);
                }
-                $result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
+                $result[] = $new_token = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
+                foreach ($this->injectors as $j => $x) { // $j, not $i!!!
+                    $this->injectors[$j]->notifyEnd($new_token);
+                }
            }
            
-            $result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
-            
        }
        
        $context->destroy('CurrentNesting');
@@ -255,17 +258,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
        $context->destroy('InputIndex');
        $context->destroy('CurrentToken');
        
-        // we're at the end now, fix all still unclosed tags
-        // not using processToken() because at this point we don't
-        // care about current nesting
+        // we're at the end now, fix all still unclosed tags (this is
+        // duplicated from the end of the loop with some slight modifications);
+        // not using $skipped_tags since it would invariably be all of them
        if (!empty($this->currentNesting)) {
-            $size = count($this->currentNesting);
-            for ($i = $size - 1; $i >= 0; $i--) {
+            for ($i = count($this->currentNesting) - 1; $i >= 0; $i--) {
                if ($e && !isset($this->currentNesting[$i]->armor['MakeWellFormed_TagClosedError'])) {
                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $this->currentNesting[$i]);
                }
-                $result[] =
-                    new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
+                $result[] = $new_token = new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
+                foreach ($this->injectors as $j => $x) { // $j, not $i!!!
+                    $this->injectors[$j]->notifyEnd($new_token);
+                }
            }
        }
        
@@ -286,8 +290,14 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            
            // adjust the injector skips based on the array substitution
            if ($this->injectors) {
-                $offset = count($token) + 1;
+                $offset = count($token);
                for ($i = 0; $i <= $this->currentInjector; $i++) {
+                    // because of the skip back, we need to add one more
+                    // for uninitialized injectors. I'm not exactly
+                    // sure why this is the case, but I think it has to
+                    // do with the fact that we're decrementing skips
+                    // before re-checking text
+                    if (!$this->injectors[$i]->skip) $this->injectors[$i]->skip++;
                    $this->injectors[$i]->skip += $offset;
                }
            }
--- a/library/HTMLPurifier/Strategy/RemoveForeignElements.php
+++ b/library/HTMLPurifier/Strategy/RemoveForeignElements.php
@@ -116,6 +116,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
                    // mostly everything's good, but
                    // we need to make sure required attributes are in order
                    if (
+                        ($token->type === 'start' || $token->type === 'empty') &&
                        $definition->info[$token->name]->required_attr &&
                        ($token->name != 'img' || $remove_invalid_img) // ensure config option still works
                    ) {
@@ -134,7 +135,6 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
                        $token->armor['ValidateAttributes'] = true;
                    }
                    
-                    // CAN BE GENERICIZED
                    if (isset($hidden_elements[$token->name]) && $token->type == 'start') {
                        $textify_comments = $token->name;
                    } elseif ($token->name === $textify_comments && $token->type == 'end') {
--- a/library/HTMLPurifier/Strategy/ValidateAttributes.php
+++ b/library/HTMLPurifier/Strategy/ValidateAttributes.php
@@ -6,10 +6,6 @@ require_once 'HTMLPurifier/IDAccumulator.php';

 require_once 'HTMLPurifier/AttrValidator.php';

-HTMLPurifier_ConfigSchema::define(
-    'Attr', 'IDBlacklist', array(), 'list',
-    'Array of IDs not allowed in the document.');
-
 /**
 * Validate all attributes in the tokens.
 */
@@ -19,11 +15,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
    
    function execute($tokens, $config, &$context) {
        
-        // setup id_accumulator context
-        $id_accumulator = new HTMLPurifier_IDAccumulator();
-        $id_accumulator->load($config->get('Attr', 'IDBlacklist'));
-        $context->register('IDAccumulator', $id_accumulator);
-        
        // setup validator
        $validator = new HTMLPurifier_AttrValidator();
        
@@ -44,8 +35,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
            
            $tokens[$key] = $token; // for PHP 4
        }
-        
-        $context->destroy('IDAccumulator');
        $context->destroy('CurrentToken');
        
        return $tokens;
--- a/library/HTMLPurifier/URI.php
+++ b/library/HTMLPurifier/URI.php
@@ -4,7 +4,12 @@ require_once 'HTMLPurifier/URIParser.php';
 require_once 'HTMLPurifier/URIFilter.php';

 /**
- * HTML Purifier's internal representation of a URI
+ * HTML Purifier's internal representation of a URI.
+ * @note
+ *      Internal data-structures are completely escaped. If the data needs
+ *      to be used in a non-URI context (which is very unlikely), be sure
+ *      to decode it first. The URI may not necessarily be well-formed until
+ *      validate() is called.
 */
 class HTMLPurifier_URI
 {
@@ -52,13 +57,27 @@ class HTMLPurifier_URI
    }
    
    /**
-     * Generic validation method applicable for all schemes
+     * Generic validation method applicable for all schemes. May modify
+     * this URI in order to get it into a compliant form.
     * @param $config Instance of HTMLPurifier_Config
     * @param $context Instance of HTMLPurifier_Context
     * @return True if validation/filtering succeeds, false if failure
     */
    function validate($config, &$context) {
        
+        // ABNF definitions from RFC 3986
+        $chars_sub_delims = '!$&\'()*+,;=';
+        $chars_gen_delims = ':/?#[]@';
+        $chars_pchar = $chars_sub_delims . ':@';
+        
+        // validate scheme (MUST BE FIRST!)
+        if (!is_null($this->scheme) && is_null($this->host)) {
+            $def = $config->getDefinition('URI');
+            if ($def->defaultScheme === $this->scheme) {
+                $this->scheme = null;
+            }
+        }
+        
        // validate host
        if (!is_null($this->host)) {
            $host_def = new HTMLPurifier_AttrDef_URI_Host();
@@ -66,18 +85,62 @@ class HTMLPurifier_URI
            if ($this->host === false) $this->host = null;
        }
        
+        // validate userinfo
+        if (!is_null($this->userinfo)) {
+            $encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . ':');
+            $this->userinfo = $encoder->encode($this->userinfo);
+        }
+        
        // validate port
        if (!is_null($this->port)) {
            if ($this->port < 1 || $this->port > 65535) $this->port = null;
        }
        
-        // query and fragment are quite simple in terms of definition:
-        // *( pchar / "/" / "?" ), so define their validation routines
-        // when we start fixing percent encoding
+        // validate path
+        $path_parts = array();
+        $segments_encoder = new HTMLPurifier_PercentEncoder($chars_pchar . '/');
+        if (!is_null($this->host)) {
+            // path-abempty (hier and relative)
+            $this->path = $segments_encoder->encode($this->path);
+        } elseif ($this->path !== '' && $this->path[0] === '/') {
+            // path-absolute (hier and relative)
+            if (strlen($this->path) >= 2 && $this->path[1] === '/') {
+                // This shouldn't ever happen!
+                $this->path = '';
+            } else {
+                $this->path = $segments_encoder->encode($this->path);
+            }
+        } elseif (!is_null($this->scheme) && $this->path !== '') {
+            // path-rootless (hier)
+            // Short circuit evaluation means we don't need to check nz
+            $this->path = $segments_encoder->encode($this->path);
+        } elseif (is_null($this->scheme) && $this->path !== '') {
+            // path-noscheme (relative)
+            // (once again, not checking nz)
+            $segment_nc_encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . '@');
+            $c = strpos($this->path, '/');
+            if ($c !== false) {
+                $this->path = 
+                    $segment_nc_encoder->encode(substr($this->path, 0, $c)) .
+                    $segments_encoder->encode(substr($this->path, $c));
+            } else {
+                $this->path = $segment_nc_encoder->encode($this->path);
+            }
+        } else {
+            // path-empty (hier and relative)
+            $this->path = ''; // just to be safe
+        }
        
-        // path gets to be validated against a hodge-podge of rules depending
-        // on the status of authority and scheme, but it's not that important,
-        // esp. since it won't be applicable to everyone
+        // qf = query and fragment
+        $qf_encoder = new HTMLPurifier_PercentEncoder($chars_pchar . '/?');
+        
+        if (!is_null($this->query)) {
+            $this->query = $qf_encoder->encode($this->query);
+        }
+        
+        if (!is_null($this->fragment)) {
+            $this->fragment = $qf_encoder->encode($this->fragment);
+        }
        
        return true;
        
--- a/library/HTMLPurifier/URIFilter.php
+++ b/library/HTMLPurifier/URIFilter.php
@@ -1,10 +1,22 @@
 <?php

 /**
- * Chainable filters for custom URI processing 
+ * Chainable filters for custom URI processing.
+ * 
+ * These filters can perform custom actions on a URI filter object,
+ * including transformation or blacklisting.
+ * 
+ * @warning This filter is called before scheme object validation occurs.
+ *          If you require a specific scheme object, make sure you check
+ *          that it exists. This allows filters to convert proprietary
+ *          URI schemes into regular ones.
 */
 class HTMLPurifier_URIFilter
 {
+    
+    /**
+     * Unique identifier of filter
+     */
    var $name;
    
    /**
@@ -17,8 +29,12 @@ class HTMLPurifier_URIFilter
     * @param &$uri Reference to URI object
     * @param $config Instance of HTMLPurifier_Config
     * @param &$context Instance of HTMLPurifier_Context
+     * @return bool Whether or not to continue processing: false indicates
+     *         URL is no good, true indicates continue processing. Note that
+     *         all changes are committed directly on the URI object
     */
    function filter(&$uri, $config, &$context) {
        trigger_error('Cannot call abstract function', E_USER_ERROR);
    }
+    
 }
--- a/library/HTMLPurifier/URIFilter/MakeAbsolute.php
+++ b/library/HTMLPurifier/URIFilter/MakeAbsolute.php
@@ -47,6 +47,10 @@ class HTMLPurifier_URIFilter_MakeAbsolute extends HTMLPurifier_URIFilter
            // absolute URI already: don't change
            if (!is_null($uri->host)) return true;
            $scheme_obj = $uri->getSchemeObj($config, $context);
+            if (!$scheme_obj) {
+                // scheme not recognized
+                return false;
+            }
            if (!$scheme_obj->hierarchical) {
                // non-hierarchal URI with explicit scheme, don't change
                return true;
--- a/library/HTMLPurifier/URIParser.php
+++ b/library/HTMLPurifier/URIParser.php
@@ -4,24 +4,39 @@ require_once 'HTMLPurifier/URI.php';

 /**
 * Parses a URI into the components and fragment identifier as specified
- * by RFC 2396.
- * @todo Replace regexps with a native PHP parser
+ * by RFC 3986.
 */
 class HTMLPurifier_URIParser
 {
    
    /**
-     * Parses a URI
+     * Instance of HTMLPurifier_PercentEncoder to do normalization with.
+     */
+    var $percentEncoder;
+    
+    function HTMLPurifier_URIParser() {
+        $this->percentEncoder = new HTMLPurifier_PercentEncoder();
+    }
+    
+    /**
+     * Parses a URI.
     * @param $uri string URI to parse
-     * @return HTMLPurifier_URI representation of URI
+     * @return HTMLPurifier_URI representation of URI. This representation has
+     *         not been validated yet and may not conform to RFC.
     */
    function parse($uri) {
+        
+        $uri = $this->percentEncoder->normalize($uri);
+        
+        // Regexp is as per Appendix B.
+        // Note that ["<>] are an addition to the RFC's recommended
+        // characters, because they represent external delimiters.
        $r_URI = '!'.
-            '(([^:/?#<>\'"]+):)?'. // 2. Scheme
-            '(//([^/?#<>\'"]*))?'. // 4. Authority
-            '([^?#<>\'"]*)'.       // 5. Path
-            '(\?([^#<>\'"]*))?'.   // 7. Query
-            '(#([^<>\'"]*))?'.     // 8. Fragment
+            '(([^:/?#"<>]+):)?'. // 2. Scheme
+            '(//([^/?#"<>]*))?'. // 4. Authority
+            '([^?#"<>]*)'.       // 5. Path
+            '(\?([^#"<>]*))?'.   // 7. Query
+            '(#([^"<>]*))?'.     // 8. Fragment
            '!';
        
        $matches = array();
@@ -38,13 +53,7 @@ class HTMLPurifier_URIParser
        
        // further parse authority
        if ($authority !== null) {
-            // ridiculously inefficient: it's a stacked regex!
-            $HEXDIG = '[A-Fa-f0-9]';
-            $unreserved = 'A-Za-z0-9-._~'; // make sure you wrap with []
-            $sub_delims = '!$&\'()'; // needs []
-            $pct_encoded = "%$HEXDIG$HEXDIG";
-            $r_userinfo = "(?:[$unreserved$sub_delims:]|$pct_encoded)*";
-            $r_authority = "/^(($r_userinfo)@)?(\[[^\]]+\]|[^:]*)(:(\d*))?/";
+            $r_authority = "/^((.+?)@)?(\[[^\]]+\]|[^:]*)(:(\d*))?/";
            $matches = array();
            preg_match($r_authority, $authority, $matches);
            $userinfo   = !empty($matches[1]) ? $matches[2] : null;
--- a/library/HTMLPurifier/UnitConverter.php
+++ b/library/HTMLPurifier/UnitConverter.php
@@ -0,0 +1,241 @@
+<?php
+
+/**
+ * Class for converting between different unit-lengths as specified by
+ * CSS.
+ */
+class HTMLPurifier_UnitConverter
+{
+    
+    /**
+     * Minimum bcmath precision for output.
+     */
+    var $outputPrecision;
+    
+    /**
+     * Bcmath precision for internal calculations.
+     */
+    var $internalPrecision;
+    
+    /**
+     * Whether or not BCMath is available
+     */
+    var $bcmath;
+    
+    function HTMLPurifier_UnitConverter($output_precision = 4, $internal_precision = 10, $force_no_bcmath = false) {
+        $this->outputPrecision = $output_precision;
+        $this->internalPrecision = $internal_precision;
+        $this->bcmath = !$force_no_bcmath && function_exists('bcmul');
+    }
+    
+    /**
+     * Converts a length object of one unit into another unit.
+     * @param HTMLPurifier_Length $length
+     *      Instance of HTMLPurifier_Length to convert. You must validate()
+     *      it before passing it here!
+     * @param string $to_unit
+     *      Unit to convert to.
+     * @note
+     *      About precision: This conversion function pays very special
+     *      attention to the incoming precision of values and attempts
+     *      to maintain the number of significant figures. Results are
+     *      fairly accurate up to nine digits. Some caveats:
+     *          - If a number is zero-padded as a result of this significant
+     *            figure tracking, the zeroes will be eliminated.
+     *          - If a number contains fewer than four sigfigs ($outputPrecision)
+     *            and this causes some decimals to be excluded, those
+     *            decimals will be added on.
+     */
+    function convert($length, $to_unit) {
+        
+        /**
+         * Units information array. Units are grouped into measuring systems
+         * (English, Metric), and are assigned an integer representing
+         * the conversion factor between that unit and the smallest unit in
+         * the system. Numeric indexes are actually magical constants that
+         * encode conversion data from one system to the next, with an O(n^2)
+         * constraint on memory (this is generally not a problem, since
+         * the number of measuring systems is small.)
+         */
+        static $units = array(
+            1 => array(
+                'px' => 3, // This is as per CSS 2.1 and Firefox. Your mileage may vary
+                'pt' => 4,
+                'pc' => 48,
+                'in' => 288,
+                2 => array('pt', '0.352777778', 'mm'),
+            ),
+            2 => array(
+                'mm' => 1,
+                'cm' => 10,
+                1 => array('mm', '2.83464567', 'pt'),
+            ),
+        );
+        
+        if (!$length->isValid()) return false;
+        
+        $n    = $length->getN();
+        $unit = $length->getUnit();
+        
+        if ($n === '0' || $unit === false) {
+            return new HTMLPurifier_Length('0', false);
+        }
+        
+        $state = $dest_state = false;
+        foreach ($units as $k => $x) {
+            if (isset($x[$unit])) $state = $k;
+            if (isset($x[$to_unit])) $dest_state = $k;
+        }
+        if (!$state || !$dest_state) return false;
+        
+        // Some calculations about the initial precision of the number;
+        // this will be useful when we need to do final rounding.
+        $sigfigs = $this->getSigFigs($n);
+        if ($sigfigs < $this->outputPrecision) $sigfigs = $this->outputPrecision;
+        
+        // Cleanup $n for PHP 4.3.9 and 4.3.10. See http://bugs.php.net/bug.php?id=30726
+        if (strncmp($n, '-.', 2) === 0) {
+            $n = '-0.' . substr($n, 2);
+        }
+        
+        // BCMath's internal precision deals only with decimals. Use
+        // our default if the initial number has no decimals, or increase
+        // it by however many decimals it has, so that the number of guard
+        // digits is always greater than or equal to internalPrecision.
+        $log = (int) floor(log(abs($n), 10));
+        $cp = ($log < 0) ? $this->internalPrecision - $log : $this->internalPrecision; // internal precision
+        
+        for ($i = 0; $i < 2; $i++) {
+            
+            // Determine what unit IN THIS SYSTEM we need to convert to
+            if ($dest_state === $state) {
+                // Simple conversion
+                $dest_unit = $to_unit;
+            } else {
+                // Convert to the smallest unit, pending a system shift
+                $dest_unit = $units[$state][$dest_state][0];
+            }
+            
+            // Do the conversion if necessary
+            if ($dest_unit !== $unit) {
+                $factor = $this->div($units[$state][$unit], $units[$state][$dest_unit], $cp);
+                $n = $this->mul($n, $factor, $cp);
+                $unit = $dest_unit;
+            }
+            
+            // Output was zero, so bail out early. Shouldn't ever happen.
+            if ($n === '') {
+                $n = '0';
+                $unit = $to_unit;
+                break;
+            }
+            
+            // It was a simple conversion, so bail out
+            if ($dest_state === $state) {
+                break;
+            }
+            
+            if ($i !== 0) {
+                // Conversion failed! Apparently, the system we forwarded
+                // to didn't have this unit. This should never happen!
+                return false;
+            }
+            
+            // Pre-condition: $i == 0
+            
+            // Perform conversion to next system of units
+            $n = $this->mul($n, $units[$state][$dest_state][1], $cp);
+            $unit = $units[$state][$dest_state][2];
+            $state = $dest_state;
+            
+            // One more loop around to convert the unit in the new system.
+            
+        }
+        
+        // Post-condition: $unit == $to_unit
+        if ($unit !== $to_unit) return false;
+        
+        // Useful for debugging:
+        //echo "<pre>n";
+        //echo "$n\nsigfigs = $sigfigs\nnew_log = $new_log\nlog = $log\nrp = $rp\n</pre>\n";
+        
+        $n = $this->round($n, $sigfigs);
+        if (strpos($n, '.') !== false) $n = rtrim($n, '0');
+        $n = rtrim($n, '.');
+        
+        return new HTMLPurifier_Length($n, $unit);
+    }
+    
+    /**
+     * Returns the number of significant figures in a string number.
+     * @param string $n Decimal number
+     * @return int number of sigfigs
+     */
+    function getSigFigs($n) {
+        $n = ltrim($n, '0+-');
+        $dp = strpos($n, '.'); // decimal position
+        if ($dp === false) {
+            $sigfigs = strlen(rtrim($n, '0'));
+        } else {
+            $sigfigs = strlen(ltrim($n, '0.')); // eliminate extra decimal character
+            if ($dp !== 0) $sigfigs--;
+        }
+        return $sigfigs;
+    }
+    
+    /**
+     * Adds two numbers, using arbitrary precision when available.
+     */
+    function add($s1, $s2, $scale) {
+        if ($this->bcmath) return bcadd($s1, $s2, $scale);
+        else return $this->scale($s1 + $s2, $scale);
+    }
+    
+    /**
+     * Multiplies two numbers, using arbitrary precision when available.
+     */
+    function mul($s1, $s2, $scale) {
+        if ($this->bcmath) return bcmul($s1, $s2, $scale);
+        else return $this->scale($s1 * $s2, $scale);
+    }
+    
+    /**
+     * Divides two numbers, using arbitrary precision when available.
+     */
+    function div($s1, $s2, $scale) {
+        if ($this->bcmath) return bcdiv($s1, $s2, $scale);
+        else return $this->scale($s1 / $s2, $scale);
+    }
+    
+    /**
+     * Rounds a number according to the number of sigfigs it should have,
+     * using arbitrary precision when available.
+     */
+    function round($n, $sigfigs) {
+        $new_log = (int) floor(log(abs($n), 10)); // Number of digits left of decimal - 1
+        $rp = $sigfigs - $new_log - 1; // Number of decimal places needed
+        $neg = $n < 0 ? '-' : ''; // Negative sign
+        if ($this->bcmath) {
+            if ($rp >= 0) {
+                $n = bcadd($n, $neg . '0.' .  str_repeat('0', $rp) . '5', $rp + 1);
+                $n = bcdiv($n, '1', $rp);
+            } else {
+                // This algorithm partially depends on the standardized
+                // form of numbers that comes out of bcmath.
+                $n = bcadd($n, $neg . '5' . str_repeat('0', $new_log - $sigfigs), 0);
+                $n = substr($n, 0, $sigfigs + strlen($neg)) . str_repeat('0', $new_log - $sigfigs + 1);
+            }
+            return $n;
+        } else {
+            return $this->scale(round($n, $sigfigs - $new_log - 1), $rp + 1);
+        }
+    }
+    
+    /**
+     * Scales a float to $scale digits right of decimal point, like BCMath.
+     */
+    function scale($r, $scale) {
+        return sprintf('%.' . $scale . 'f', (float) $r);
+    }
+    
+}
--- a/maintenance/PH5P.patch
+++ b/maintenance/PH5P.patch
@@ -1,5 +1,5 @@
--- old.php	2007-08-19 14:42:33.640625000 -0400
-+++ new.php	2007-08-19 14:41:51.609375000 -0400
+--- C:\Users\Edward\Webs\htmlpurifier\maintenance\PH5P.php	2007-11-04 23:41:49.074543700 -0500
+++ C:\Users\Edward\Webs\htmlpurifier\maintenance/PH5P.new.php	2007-11-05 00:23:52.839543700 -0500
@@ -211,7 +211,10 @@
         // If nothing is returned, emit a U+0026 AMPERSAND character token.
         // Otherwise, emit the character token that was returned.
@@ -43,3 +43,22 @@
                         $entity = $id;
                         break;
                     }
+@@ -3659,7 +3668,7 @@
+         }
+     }
+ 
+-    private function generateImpliedEndTags(array $exclude = array()) {
++    private function generateImpliedEndTags($exclude = array()) {
+         /* When the steps below require the UA to generate implied end tags,
+         then, if the current node is a dd element, a dt element, an li element,
+         a p element, a td element, a th  element, or a tr element, the UA must
+@@ -3673,7 +3682,8 @@
+         }
+     }
+ 
+-    private function getElementCategory($name) {
++    private function getElementCategory($node) {
++        $name = $node->tagName;
+         if(in_array($name, $this->special))
+             return self::SPECIAL;
+ 
--- a/maintenance/PH5P.php
+++ b/maintenance/PH5P.php
--- a/maintenance/flush-definition-cache.php
+++ b/maintenance/flush-definition-cache.php
@@ -32,5 +32,5 @@ foreach ($names as $name) {
    $cache->flush($config);
 }

-echo 'Cache flushed successfully.';
+echo "Cache flushed successfully.\n";

--- a/maintenance/generate-ph5p-patch.php
+++ b/maintenance/generate-ph5p-patch.php
@@ -0,0 +1,13 @@
+<?php
+
+$orig = realpath(dirname(__FILE__) . '/PH5P.php');
+$new  = realpath(dirname(__FILE__) . '/../library/HTMLPurifier/Lexer/PH5P.php');
+$newt = dirname(__FILE__) . '/PH5P.new.php'; // temporary file
+
+// minor text-processing of new file to get into same format as original
+$new_src = file_get_contents($new);
+$new_src = '<?php' . PHP_EOL . substr($new_src, strpos($new_src, 'class HTML5 {'));
+
+file_put_contents($newt, $new_src);
+shell_exec("diff -u \"$orig\" \"$newt\" > PH5P.patch");
+unlink($newt);
--- a/package.php
+++ b/package.php
@@ -10,11 +10,11 @@ $pkg->setOptions(
    array(
        'baseinstalldir' => '/',
        'packagefile' => 'package2.xml',
-        'packagedirectory' => dirname(__FILE__) . '/library',
+        'packagedirectory' => realpath(dirname(__FILE__) . '/library'),
        'filelistgenerator' => 'file',
        'include' => array('*'),
        'dir_roles' => array('/' => 'php'), // hack to put .ser in the right place
-        'ignore' => array('HTMLPurifier.auto.php'),
+        'ignore' => array('HTMLPurifier.auto.php', 'HTMLPurifier.standalone.php', 'standalone/'),
    )
 );

--- a/phpdoc.ini
+++ b/phpdoc.ini
@@ -71,7 +71,7 @@ readmeinstallchangelog = README, INSTALL, NEWS, WYSIWYG, SLOW, LICENSE, CREDITS
 ;; legal values: directory paths separated by commas
 ;directory = /path1,/path2,.,..,subdirectory
 ;directory = /home/jeichorn/cvs/pear
-directory = ./
+directory = .

 ;; template base directory (the equivalent directory of <installdir>/phpDocumentor)
 ;templatebase = /path/to/my/templates
@@ -82,7 +82,7 @@ directory = ./
 ;; comma-separated list of files, directories or wildcards ? and * (any wildcard) to ignore
 ;; legal values: any wildcard strings separated by commas
 ;ignore = /path/to/ignore*,*list.php,myfile.php,subdirectory/
-ignore = pear-*,templates/,Documentation/,test*.php,Lexer.inc
+ignore = *tests*,*benchmarks*,*docs*,*test-settings.php,*configdoc*,*maintenance*,*smoketests*,*standalone*,*.svn*,*conf*

 sourcecode = on

--- a/plugins/phorum/htmlpurifier.php
+++ b/plugins/phorum/htmlpurifier.php
@@ -261,12 +261,42 @@ function phorum_htmlpurifier_editor_after_subject() {
    // don't show this message if it's a WYSIWYG editor, since it will
    // then be handled automatically
    if (!empty($GLOBALS['PHORUM']['mod_htmlpurifier']['wysiwyg'])) return;
-    ?><tr><td colspan="2" style="padding:1em 0.3em;">
-  HTML input is <strong>on</strong>. Make sure you escape all HTML and
-  angled-brackets with &amp;lt; and &amp;gt; (you can also use CDATA
-  tags, simply wrap the suspect text with
-&lt;![CDATA[<em>text</em>]]&gt;. Paragraphs will only be applied to 
-double-spaces; single-spaces will not generate <tt>&lt;br&gt;</tt> tags.
+    ?><tr><td colspan="2" style="padding:1em 0.3em;" class="htmlpurifier-help">
+    <p>
+        <strong>HTML input</strong> is enabled. Make sure you escape all HTML and
+        angled brackets with <code>&amp;lt;</code> and <code>&amp;gt;</code>.
+    </p><?php
+            $purifier =& HTMLPurifier::getInstance();
+            $config = $purifier->config;
+            if ($config->get('AutoFormat', 'AutoParagraph')) {
+                ?><p>
+                    <strong>Auto-paragraphing</strong> is enabled. Double
+                    newlines will be converted to paragraphs; for single
+                    newlines, use the <code>pre</code> tag.
+                </p><?php
+            }
+            $html_definition = $config->getDefinition('HTML');
+            $allowed = array();
+            foreach ($html_definition->info as $name => $x) $allowed[] = "<code>$name</code>";
+            sort($allowed);
+            $allowed_text = implode(', ', $allowed);
+            ?><p><strong>Allowed tags:</strong> <?php
+            echo $allowed_text;
+            ?>.</p><?php
+        ?>
+    </p>
+    <p>
+        For inputting literal code such as HTML and PHP for display, use
+        CDATA tags to auto-escape your angled brackets, and <code>pre</code>
+        to preserve newlines:
+    </p>
+    <pre>&lt;pre&gt;&lt;![CDATA[
+<em>Place code here</em>
+]]&gt;&lt;/pre&gt;</pre>
+    <p>
+        Power users, you can hide this notice with:
+        <pre>.htmlpurifier-help {display:none;}</pre>
+    </p>
    </td></tr><?php
 }

--- a/plugins/phorum/settings/migrate-sigs.php
+++ b/plugins/phorum/settings/migrate-sigs.php
@@ -20,8 +20,10 @@ function phorum_htmlpurifier_migrate_sigs_check() {
 function phorum_htmlpurifier_migrate_sigs($offset) {
    global $PHORUM;
    
-    if(!$offset) return; // bail out quick of $offset == 0
+    if(!$offset) return; // bail out quick if $offset == 0
    
+    // theoretically, we could get rid of this multi-request
+    // doo-hickery if safe mode is off
    @set_time_limit(0); // attempt to let this run
    $increment = $PHORUM['mod_htmlpurifier']['migrate-sigs-increment'];
    
@@ -52,21 +54,19 @@ function phorum_htmlpurifier_migrate_sigs($offset) {
    
    // query for highest ID in database
    $type = $PHORUM['DBCONFIG']['type'];
+    $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
    if ($type == 'mysql') {
        $conn = phorum_db_mysql_connect();
-        $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
        $res = mysql_query($sql, $conn);
        $row = mysql_fetch_row($res);
-        $top_id = (int) $row[0];
    } elseif ($type == 'mysqli') {
        $conn = phorum_db_mysqli_connect();
-        $sql = "select MAX(user_id) from {$PHORUM['user_table']}";
        $res = mysqli_query($conn, $sql);
        $row = mysqli_fetch_row($res);
-        $top_id = (int) $row[0];
    } else {
        exit('Unrecognized database!');
    }
+    $top_id = (int) $row[0];
    
    $offset += $increment;
    if ($offset > $top_id) { // test for end condition
--- a/release2-strict.php
+++ b/release2-strict.php
@@ -1,30 +0,0 @@
-<?php
-
-// Merges in changes from trunk to strict branch
-// WORKING COPY MUST BE POINTED TO STRICT BRANCH
-
-if (php_sapi_name() != 'cli') {
-    echo 'Release script cannot be called from web-browser.';
-    exit;
-}
-
-require 'svn.php';
-
-$svn_info = svn_info('.');
-
-$last_rev = (int) $svn_info['Last Changed Rev'];
-$trunk_url = $svn_info['Repository Root'] . '/htmlpurifier/trunk';
-echo "Last revision was $last_rev, merging from $last_rev to head.\n";
-
-$merge_cmd = "svn merge -r $last_rev:HEAD $trunk_url .";
-$out = explode("\n", shell_exec($merge_cmd));
-
-echo "Conflicted files:\n";
-foreach ($out as $line) {
-    if (empty($line)) continue;
-    if ($line{0} === 'C' || $line{1} === 'C') echo $line . "\n";
-}
-
-$version = trim(file_get_contents('VERSION'));
-echo "Resolve conflicts and then commit as 'Release $version, merged in $last_rev to HEAD.'";
-
--- a/release2-tag.php
+++ b/release2-tag.php
@@ -0,0 +1,20 @@
+<?php
+
+// Tags releases
+
+if (php_sapi_name() != 'cli') {
+    echo 'Release script cannot be called from web-browser.';
+    exit;
+}
+
+require 'svn.php';
+
+$svn_info = my_svn_info('.');
+
+$version = trim(file_get_contents('VERSION'));
+
+$trunk_url  = $svn_info['Repository Root'] . '/htmlpurifier/branches/php4';
+$trunk_tag_url  = $svn_info['Repository Root'] . '/htmlpurifier/tags/' . $version;
+
+echo "Tagging php4 branch to tags/$version...";
+passthru("svn copy --message \"Tag $version release.\" $trunk_url $trunk_tag_url");
--- a/release3-tag.php
+++ b/release3-tag.php
@@ -1,25 +0,0 @@
-<?php
-
-// Tags releases
-
-if (php_sapi_name() != 'cli') {
-    echo 'Release script cannot be called from web-browser.';
-    exit;
-}
-
-require 'svn.php';
-
-$svn_info = svn_info('.');
-
-$version = trim(file_get_contents('VERSION'));
-
-$trunk_url  = $svn_info['Repository Root'] . '/htmlpurifier/trunk';
-$strict_url = $svn_info['Repository Root'] . '/htmlpurifier/branches/strict';
-$trunk_tag_url  = $svn_info['Repository Root'] . '/htmlpurifier/tags/' . $version;
-$strict_tag_url = $svn_info['Repository Root'] . '/htmlpurifier/tags/' . $version . '-strict';
-
-echo "Tagging trunk to tags/$version...";
-passthru("svn copy --message \"Tag $version release.\" $trunk_url $trunk_tag_url");
-echo "Tagging strict to tags/$version-strict...";
-passthru("svn copy --message \"Tag $version-strict release.\" $strict_url $strict_tag_url");
-
--- a/svn.php
+++ b/svn.php
@@ -1,6 +1,6 @@
 <?php

-function svn_info($dir) {
+function my_svn_info($dir) {
    $raw = explode("\n", shell_exec("svn info $dir"));
    $svn_info = array();
    foreach ($raw as $r) {
--- a/tests/HTMLPurifier/AttrDef/CSS/BackgroundTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/BackgroundTest.php
@@ -14,6 +14,10 @@ class HTMLPurifier_AttrDef_CSS_BackgroundTest extends HTMLPurifier_AttrDefHarnes
        $valid = '#333 url(chess.png) repeat fixed 50% top';
        $this->assertDef($valid);
        $this->assertDef('url("chess.png") #333 50% top repeat fixed', $valid);
+        $this->assertDef(
+            'rgb(34, 56, 33) url(chess.png) repeat fixed top',
+            'rgb(34,56,33) url(chess.png) repeat fixed top'
+        );
        
    }
    
--- a/tests/HTMLPurifier/AttrDef/CSS/BorderTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/BorderTest.php
@@ -14,6 +14,7 @@ class HTMLPurifier_AttrDef_CSS_BorderTest extends HTMLPurifier_AttrDefHarness
        $this->assertDef('thick solid');
        $this->assertDef('solid red', 'solid #FF0000');
        $this->assertDef('1px solid #000');
+        $this->assertDef('1px solid rgb(0, 0, 0)', '1px solid rgb(0,0,0)');
        
    }
    
--- a/tests/HTMLPurifier/AttrDef/CSS/ColorTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/ColorTest.php
@@ -11,6 +11,8 @@ class HTMLPurifier_AttrDef_CSS_ColorTest extends HTMLPurifier_AttrDefHarness
        $this->def = new HTMLPurifier_AttrDef_CSS_Color();
        
        $this->assertDef('#F00');
+        $this->assertDef('#fff');
+        $this->assertDef('#eeeeee');
        $this->assertDef('#808080');
        $this->assertDef('rgb(255, 0, 0)', 'rgb(255,0,0)'); // rm spaces
        $this->assertDef('rgb(100%,0%,0%)');
@@ -27,6 +29,11 @@ class HTMLPurifier_AttrDef_CSS_ColorTest extends HTMLPurifier_AttrDefHarness
        // color keywords, of course
        $this->assertDef('red', '#FF0000');
        
+        // malformed hex declaration
+        $this->assertDef('808080', '#808080');
+        $this->assertDef('000000', '#000000');
+        $this->assertDef('fed', '#fed');
+        
        // maybe hex transformations would be another nice feature
        // at the very least transform rgb percent to rgb integer
        
--- a/tests/HTMLPurifier/AttrDef/CSS/FontFamilyTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/FontFamilyTest.php
@@ -20,7 +20,21 @@ class HTMLPurifier_AttrDef_CSS_FontFamilyTest extends HTMLPurifier_AttrDefHarnes
        $this->assertDef("John's Font", $d);
        $this->assertDef($d = "'\xE5\xAE\x8B\xE4\xBD\x93'");
        $this->assertDef("\xE5\xAE\x8B\xE4\xBD\x93", $d);
-        
+        $this->assertDef("'\\','f'", "'\\\\', f");
+        $this->assertDef("'\\01'", "''");
+        $this->assertDef("'\\20'", "' '");
+        $this->assertDef("\\0020", "'\\\\0020'");
+        $this->assertDef("'\\000045'", "E");
+        $this->assertDef("','", false);
+        $this->assertDef("',' foobar','", "' foobar'");
+        $this->assertDef("'\\27'", "'\''");
+        $this->assertDef('"\\22"', "'\"'");
+        $this->assertDef('"\\""', "'\"'");
+        $this->assertDef('"\'"', "'\\''");
+        $this->assertDef("'\\000045a'", "Ea");
+        $this->assertDef("'\\00045 a'", "Ea");
+        $this->assertDef("'\\00045  a'", "'E a'");
+        $this->assertDef("'\\\nf'", "f");
    }
    
 }
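Review note: several of the new FontFamily assertions exercise CSS escape sequences (a backslash followed by up to six hex digits and one optional whitespace character, plus backslash-newline as a line continuation). The sketch below shows one way such escapes could be decoded. It is an illustration only: `sketch_css_unescape` and its callback are hypothetical names, and dropping control and out-of-range code points is an assumption made here, not a statement about how HTMLPurifier_AttrDef_CSS_FontFamily is implemented.

```php
<?php
// Hypothetical callback: turns one matched hex escape into a UTF-8 character.
function sketch_css_unescape_cb($m) {
    $code = hexdec($m[1]);
    // Assumption: control characters, surrogates and out-of-range values are dropped.
    if ($code < 0x20 || $code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
        return '';
    }
    return html_entity_decode('&#' . $code . ';', ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: decodes CSS hex escapes in an already unquoted string.
function sketch_css_unescape($s) {
    // Backslash followed by a newline is a line continuation: remove both.
    $s = str_replace("\\\n", '', $s);
    // One to six hex digits, then at most one whitespace character that ends the escape.
    return preg_replace_callback(
        '/\\\\([0-9a-fA-F]{1,6})[ \t\r\n\f]?/',
        'sketch_css_unescape_cb',
        $s
    );
}

echo sketch_css_unescape("\\00045  a"), "\n";
```

The single optional whitespace after the digits is why a five-digit escape followed by one space and `a` joins into two adjacent characters, while two spaces leave one space in the output.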
--- a/tests/HTMLPurifier/AttrDef/CSS/LengthTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/LengthTest.php
@@ -31,12 +31,20 @@ class HTMLPurifier_AttrDef_CSS_LengthTest extends HTMLPurifier_AttrDefHarness
    
    function testNonNegative() {
        
-        $this->def = new HTMLPurifier_AttrDef_CSS_Length(true);
+        $this->def = new HTMLPurifier_AttrDef_CSS_Length('0');
        
        $this->assertDef('3cm');
        $this->assertDef('-3mm', false);
        
    }
    
+    function testBounding() {
+        $this->def = new HTMLPurifier_AttrDef_CSS_Length('-1in', '1in');
+        $this->assertDef('1cm');
+        $this->assertDef('-1cm');
+        $this->assertDef('0');
+        $this->assertDef('1em', false);
+    }
+    
 }
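Review note: testBounding() above accepts `1cm`, `-1cm` and `0` but rejects `1em` when the bounds are `-1in` and `1in`. One plausible reading is that only absolute units can be compared against the bounds. The sketch below illustrates that reading under stated assumptions: both function names are hypothetical, points are used as the common unit, and any unit without a fixed ratio to inches (em, ex, px) is treated as not comparable. It is not the code of HTMLPurifier_AttrDef_CSS_Length.

```php
<?php
// Hypothetical sketch: converts an absolute CSS length to points, or returns
// false for units that have no fixed ratio to inches (em, ex, px, ...).
function sketch_length_to_pt($length) {
    $pt_per_unit = array(
        'pt' => 1.0,
        'pc' => 12.0,
        'in' => 72.0,
        'cm' => 72.0 / 2.54,
        'mm' => 72.0 / 25.4,
    );
    if ($length === '0') return 0.0; // unitless zero needs no conversion
    if (!preg_match('/^([+-]?(?:\d+(?:\.\d*)?|\.\d+))([a-z]+)$/D', $length, $m)) return false;
    if (!isset($pt_per_unit[$m[2]])) return false;
    return (float) $m[1] * $pt_per_unit[$m[2]];
}

// Assumes $min and $max are themselves absolute lengths.
function sketch_length_in_bounds($length, $min, $max) {
    $value = sketch_length_to_pt($length);
    if ($value === false) return false;
    return $value >= sketch_length_to_pt($min) && $value <= sketch_length_to_pt($max);
}

var_dump(sketch_length_in_bounds('1cm', '-1in', '1in'));
var_dump(sketch_length_in_bounds('1em', '-1in', '1in'));
```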

--- a/tests/HTMLPurifier/AttrDef/CSS/TextDecorationTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/TextDecorationTest.php
@@ -10,6 +10,9 @@ class HTMLPurifier_AttrDef_CSS_TextDecorationTest extends HTMLPurifier_AttrDefHa
        
        $this->def = new HTMLPurifier_AttrDef_CSS_TextDecoration();
        
+        $this->assertDef('none');
+        $this->assertDef('none underline', 'underline');
+        
        $this->assertDef('underline');
        $this->assertDef('overline');
        $this->assertDef('line-through overline underline');
--- a/tests/HTMLPurifier/AttrDef/CSS/URITest.php
+++ b/tests/HTMLPurifier/AttrDef/CSS/URITest.php
@@ -29,7 +29,6 @@ class HTMLPurifier_AttrDef_CSS_URITest extends HTMLPurifier_AttrDefHarness
        // escaping
        $this->assertDef("url(http://www.example.com/foo,bar\))", 
            "url(http://www.example.com/foo\,bar\))");
-        
    }
    
 }
--- a/tests/HTMLPurifier/AttrDef/CSSTest.php
+++ b/tests/HTMLPurifier/AttrDef/CSSTest.php
@@ -107,6 +107,9 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
        $this->assertDef(' font-weight : bold; color : #ff0000',
                         'font-weight:bold;color:#ff0000;');
        
+        // case-insensitivity
+        $this->assertDef('FLOAT:LEFT;', 'float:left;');
+        
    }
    
 }
--- a/tests/HTMLPurifier/AttrDef/HTML/PixelsTest.php
+++ b/tests/HTMLPurifier/AttrDef/HTML/PixelsTest.php
@@ -36,5 +36,12 @@ class HTMLPurifier_AttrDef_HTML_PixelsTest extends HTMLPurifier_AttrDefHarness
        
    }
    
+    function test_make() {
+        $factory = new HTMLPurifier_AttrDef_HTML_Pixels();
+        $this->def = $factory->make('30');
+        $this->assertDef('25');
+        $this->assertDef('35', '30');
+    }
+    
 }

--- a/tests/HTMLPurifier/AttrDef/SwitchTest.php
+++ b/tests/HTMLPurifier/AttrDef/SwitchTest.php
@@ -0,0 +1,34 @@
+<?php
+
+require_once 'HTMLPurifier/AttrDef/Switch.php';
+
+class HTMLPurifier_AttrDef_SwitchTest extends HTMLPurifier_AttrDefHarness
+{
+    
+    var $with, $without;
+    
+    function setUp() {
+        parent::setUp();
+        generate_mock_once('HTMLPurifier_AttrDef');
+        $this->with = new HTMLPurifier_AttrDefMock();
+        $this->without = new HTMLPurifier_AttrDefMock();
+        $this->def = new HTMLPurifier_AttrDef_Switch('tag', $this->with, $this->without);
+    }
+    
+    function testWith() {
+        $token = new HTMLPurifier_Token_Start('tag');
+        $this->context->register('CurrentToken', $token);
+        $this->with->expectOnce('validate');
+        $this->with->setReturnValue('validate', 'foo');
+        $this->assertDef('bar', 'foo');
+    }
+    
+    function testWithout() {
+        $token = new HTMLPurifier_Token_Start('other-tag');
+        $this->context->register('CurrentToken', $token);
+        $this->without->expectOnce('validate');
+        $this->without->setReturnValue('validate', 'foo');
+        $this->assertDef('bar', 'foo');
+    }
+    
+}
--- a/tests/HTMLPurifier/AttrDef/TextTest.php
+++ b/tests/HTMLPurifier/AttrDef/TextTest.php
@@ -11,7 +11,7 @@ class HTMLPurifier_AttrDef_TextTest extends HTMLPurifier_AttrDefHarness
        $this->def = new HTMLPurifier_AttrDef_Text();
        
        $this->assertDef('This is spiffy text!');
-        $this->assertDef(" Casual\tCDATA parse\ncheck. ", 'Casual CDATA parsecheck.');
+        $this->assertDef(" Casual\tCDATA parse\ncheck. ", 'Casual CDATA parse check.');
        
    }
    
--- a/tests/HTMLPurifier/AttrDef/URI/HostTest.php
+++ b/tests/HTMLPurifier/AttrDef/URI/HostTest.php
@@ -17,6 +17,27 @@ class HTMLPurifier_AttrDef_URI_HostTest extends HTMLPurifier_AttrDefHarness
        $this->assertDef('124.15.6.89'); // IPv4
        $this->assertDef('www.google.com'); // reg-name
        
+        // more domain name tests
+        $this->assertDef('test.');
+        $this->assertDef('sub.test.');
+        $this->assertDef('.test', false);
+        $this->assertDef('ff');
+        $this->assertDef('1f', false);
+        $this->assertDef('-f', false);
+        $this->assertDef('f1');
+        $this->assertDef('f-', false);
+        $this->assertDef('sub.ff');
+        $this->assertDef('sub.1f', false);
+        $this->assertDef('sub.-f', false);
+        $this->assertDef('sub.f1');
+        $this->assertDef('sub.f-', false);
+        $this->assertDef('ff.top');
+        $this->assertDef('1f.top');
+        $this->assertDef('-f.top', false);
+        $this->assertDef('ff.top');
+        $this->assertDef('f1.top');
+        $this->assertDef('f-.top', false);
+        
    }
    
 }
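Review note: the new host assertions match the RFC 2396 hostname grammar, where every label starts and ends with an alphanumeric character, the top label must start with a letter, and one trailing dot is allowed. The sketch below encodes that grammar as a regular expression. `sketch_is_reg_name` is a hypothetical name, and IPv4 literals such as `124.15.6.89` are deliberately out of scope here; the pattern is an illustration, not the expression used by HTMLPurifier_AttrDef_URI_Host.

```php
<?php
// Hypothetical sketch of the RFC 2396 hostname grammar:
//   hostname    = *( domainlabel "." ) toplabel [ "." ]
//   domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
//   toplabel    = alpha    | alpha    *( alphanum | "-" ) alphanum
function sketch_is_reg_name($host) {
    $alnum       = '[a-z0-9]';
    $domainlabel = $alnum . '(?:[a-z0-9-]*' . $alnum . ')?';
    $toplabel    = '[a-z](?:[a-z0-9-]*' . $alnum . ')?';
    $pattern     = '/^(?:' . $domainlabel . '\.)*' . $toplabel . '\.?$/iD';
    return (bool) preg_match($pattern, $host);
}

var_dump(sketch_is_reg_name('1f.top')); // digit-led label is fine below the top label
var_dump(sketch_is_reg_name('sub.1f')); // but not as the top label itself
```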
--- a/tests/HTMLPurifier/AttrDef/URITest.php
+++ b/tests/HTMLPurifier/AttrDef/URITest.php
@@ -33,6 +33,19 @@ class HTMLPurifier_AttrDef_URITest extends HTMLPurifier_AttrDefHarness
        );
    }
    
+    function testPercentEncoding() {
+        $this->assertDef(
+            'http:colon:mercenary',
+            'colon%3Amercenary'
+        );
+    }
+    
+    function testPercentEncodingPreserve() {
+        $this->assertDef(
+            'http://www.example.com/abcABC123-_.!~*()\''
+        );
+    }
+    
    function testEmbeds() {
        $this->def = new HTMLPurifier_AttrDef_URI(true);
        $this->assertDef('http://sub.example.com/alas?foo=asd');
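Review note: testPercentEncoding() and testPercentEncodingPreserve() above pin down two behaviours: a colon inside a path is escaped as `%3A`, while letters, digits and the marks `-_.!~*'()` pass through untouched. The sketch below shows that rule for a single URI component. The function names and the `$preserve` parameter are hypothetical, and scheme handling (dropping the redundant `http:` prefix) is not modelled here.

```php
<?php
// Hypothetical callback: encodes one byte as an uppercase %XX triplet.
function sketch_percent_encode_cb($m) {
    return sprintf('%%%02X', ord($m[0]));
}

// Hypothetical helper: percent-encodes every byte except ASCII letters, digits,
// the RFC 2396 "mark" characters, and anything listed in $preserve.
function sketch_percent_encode($s, $preserve = '') {
    $keep = 'A-Za-z0-9' . preg_quote("-_.!~*'()" . $preserve, '/');
    return preg_replace_callback('/[^' . $keep . ']/', 'sketch_percent_encode_cb', $s);
}

echo sketch_percent_encode('colon:mercenary'), "\n";
```

Passing `'/'` as `$preserve` keeps path separators intact when a whole path, rather than one segment, is encoded.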
--- a/tests/HTMLPurifier/AttrDefTest.php
+++ b/tests/HTMLPurifier/AttrDefTest.php
@@ -12,8 +12,7 @@ class HTMLPurifier_AttrDefTest extends HTMLPurifier_Harness
        $this->assertIdentical('', $def->parseCDATA(''));
        $this->assertIdentical('', $def->parseCDATA("\t\n\r \t\t"));
        $this->assertIdentical('foo', $def->parseCDATA("\t\n\r foo\t\t"));
-        $this->assertIdentical('ignorelinefeeds', $def->parseCDATA("ignore\nline\nfeeds"));
-        $this->assertIdentical('translate to space', $def->parseCDATA("translate\rto\tspace"));
+        $this->assertIdentical('translate to space', $def->parseCDATA("translate\nto\tspace"));
        
    }
    
--- a/tests/HTMLPurifier/ChildDef/OptionalTest.php
+++ b/tests/HTMLPurifier/ChildDef/OptionalTest.php
@@ -19,5 +19,9 @@ class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness
        $this->assertResult('Not allowed text', '');
    }
    
+    function testEmpty() {
+        $this->assertResult('');
+    }
+    
 }

--- a/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
+++ b/tests/HTMLPurifier/ChildDef/StrictBlockquoteTest.php
@@ -74,10 +74,11 @@ extends HTMLPurifier_ChildDefHarness
    }
    
    function testError() {
-        $this->expectError('Cannot use non-block element as block wrapper');
+        // $this->expectError('Cannot use non-block element as block wrapper');
        $this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
        $this->config->set('HTML', 'BlockWrapper', 'dav');
        $this->assertResult('Needs wrap', '<p>Needs wrap</p>');
+        $this->swallowErrors();
    }
    
 }
--- a/tests/HTMLPurifier/DefinitionCacheFactoryTest.php
+++ b/tests/HTMLPurifier/DefinitionCacheFactoryTest.php
@@ -5,13 +5,14 @@ require_once 'HTMLPurifier/DefinitionCacheFactory.php';
 class HTMLPurifier_DefinitionCacheFactoryTest extends HTMLPurifier_Harness
 {
    
-    var $newFactory;
+    var $factory;
    var $oldFactory;
    
    function setup() {
-        $new = new HTMLPurifier_DefinitionCacheFactory();
+        parent::setup();
+        $this->factory = new HTMLPurifier_DefinitionCacheFactory();
        $this->oldFactory = HTMLPurifier_DefinitionCacheFactory::instance();
-        HTMLPurifier_DefinitionCacheFactory::instance($new);
+        HTMLPurifier_DefinitionCacheFactory::instance($this->factory);
    }
    
    function teardown() {
@@ -19,46 +20,52 @@ class HTMLPurifier_DefinitionCacheFactoryTest extends HTMLPurifier_Harness
    }
    
    function test_create() {
-        $config  = HTMLPurifier_Config::createDefault();
-        $factory = HTMLPurifier_DefinitionCacheFactory::instance();
-        $cache   = $factory->create('Test', $config);
+        $cache = $this->factory->create('Test', $this->config);
        $this->assertEqual($cache, new HTMLPurifier_DefinitionCache_Serializer('Test'));
    }
    
    function test_create_withDecorator() {
-        $config  = HTMLPurifier_Config::createDefault();
-        $factory =& HTMLPurifier_DefinitionCacheFactory::instance();
-        $factory->addDecorator('Memory');
-        $cache =& $factory->create('Test', $config);
+        $this->factory->addDecorator('Memory');
+        $cache = $this->factory->create('Test', $this->config);
        $cache_real = new HTMLPurifier_DefinitionCache_Decorator_Memory();
        $cache_real = $cache_real->decorate(new HTMLPurifier_DefinitionCache_Serializer('Test'));
        $this->assertEqual($cache, $cache_real);
    }
    
    function test_create_withDecoratorObject() {
-        $config  = HTMLPurifier_Config::createDefault();
-        $factory =& HTMLPurifier_DefinitionCacheFactory::instance();
-        $factory->addDecorator(new HTMLPurifier_DefinitionCache_Decorator_Memory());
-        $cache =& $factory->create('Test', $config);
+        $this->factory->addDecorator(new HTMLPurifier_DefinitionCache_Decorator_Memory());
+        $cache = $this->factory->create('Test', $this->config);
        $cache_real = new HTMLPurifier_DefinitionCache_Decorator_Memory();
        $cache_real = $cache_real->decorate(new HTMLPurifier_DefinitionCache_Serializer('Test'));
        $this->assertEqual($cache, $cache_real);
    }
    
    function test_create_recycling() {
-        $config  = HTMLPurifier_Config::createDefault();
-        $factory =& HTMLPurifier_DefinitionCacheFactory::instance();
-        $cache =& $factory->create('Test', $config);
-        $cache2 =& $factory->create('Test', $config);
+        $cache  =& $this->factory->create('Test', $this->config);
+        $cache2 =& $this->factory->create('Test', $this->config);
        $this->assertReference($cache, $cache2);
    }
    
+    function test_create_invalid() {
+        $this->config->set('Core', 'DefinitionCache', 'Invalid');
+        $this->expectError('Unrecognized DefinitionCache Invalid, using Serializer instead');
+        $cache = $this->factory->create('Test', $this->config);
+        $this->assertIsA($cache, 'HTMLPurifier_DefinitionCache_Serializer');
+    }
+    
    function test_null() {
-        $config = HTMLPurifier_Config::create(array('Core.DefinitionCache' => null));
-        $factory =& HTMLPurifier_DefinitionCacheFactory::instance();
-        $cache =& $factory->create('Test', $config);
+        $this->config->set('Core', 'DefinitionCache', null);
+        $cache = $this->factory->create('Test', $this->config);
        $this->assertEqual($cache, new HTMLPurifier_DefinitionCache_Null('Test'));
    }
    
+    function test_register() {
+        generate_mock_once('HTMLPurifier_DefinitionCache');
+        $this->config->set('Core', 'DefinitionCache', 'TestCache');
+        $this->factory->register('TestCache', $class = 'HTMLPurifier_DefinitionCacheMock');
+        $cache = $this->factory->create('Test', $this->config);
+        $this->assertIsA($cache, $class);
+    }
+    
 }

--- a/tests/HTMLPurifier/EncoderTest.php
+++ b/tests/HTMLPurifier/EncoderTest.php
@@ -9,6 +9,7 @@ class HTMLPurifier_EncoderTest extends HTMLPurifier_Harness
    
    function setUp() {
        $this->_entity_lookup = HTMLPurifier_EntityLookup::instance();
+        parent::setUp();
    }
    
    function assertCleanUTF8($string, $expect = null) {
@@ -26,93 +27,90 @@ class HTMLPurifier_EncoderTest extends HTMLPurifier_Harness
        $this->assertCleanUTF8("\xC2\x80", ''); // two byte invalid SGML
        $this->assertCleanUTF8("\xF3\xBF\xBF\xBF"); // valid four byte
        $this->assertCleanUTF8("\xDF\xFF", ''); // malformed UTF8
+        // invalid codepoints
+        $this->assertCleanUTF8("\xED\xB0\x80", '');
    }
    
-    function test_convertToUTF8() {
-        $config = HTMLPurifier_Config::createDefault();
-        $context = new HTMLPurifier_Context();
-        
+    function test_convertToUTF8_noConvert() {
        // UTF-8 means that we don't touch it
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertToUTF8("\xF6", $config, $context),
+            HTMLPurifier_Encoder::convertToUTF8("\xF6", $this->config, $this->context),
            "\xF6" // this is invalid
        );
-        $this->assertNoErrors();
+    }
    
-        $config = HTMLPurifier_Config::create(array(
-            'Core.Encoding' => 'ISO-8859-1'
-        ));
-        
-        // Now it gets converted
+    function test_convertToUTF8_iso8859_1() {
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertToUTF8("\xF6", $config, $context),
+            HTMLPurifier_Encoder::convertToUTF8("\xF6", $this->config, $this->context),
            "\xC3\xB6"
        );
+    }
    
-        $config = HTMLPurifier_Config::create(array(
-            'Core.Encoding' => 'ISO-8859-1',
-            'Test.ForceNoIconv' => true
-        ));
+    function test_convertToUTF8_withoutIconv() {
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
+        $this->config->set('Test', 'ForceNoIconv', true);
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertToUTF8("\xF6", $config, $context),
+            HTMLPurifier_Encoder::convertToUTF8("\xF6", $this->config, $this->context),
            "\xC3\xB6"
        );
        
    }
    
-    function test_convertFromUTF8() {
-        $config = HTMLPurifier_Config::createDefault();
-        $context = new HTMLPurifier_Context();
-        
-        // zhong-wen
-        $chinese = "\xE4\xB8\xAD\xE6\x96\x87 (Chinese)";
+    function getZhongWen() {
+        return "\xE4\xB8\xAD\xE6\x96\x87 (Chinese)";
+    }
    
+    function test_convertFromUTF8_utf8() {
        // UTF-8 means that we don't touch it
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $config, $context),
+            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $this->config, $this->context),
            "\xC3\xB6"
        );
+    }
    
-        $config = HTMLPurifier_Config::create(array(
-            'Core.Encoding' => 'ISO-8859-1'
-        ));
-        
-        // Now it gets converted
+    function test_convertFromUTF8_iso8859_1() {
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $config, $context),
+            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $this->config, $this->context),
            "\xF6"
        );
+    }
    
-        if (function_exists('iconv')) {
-            // iconv has it's own way
-            $this->assertIdentical(
-                HTMLPurifier_Encoder::convertFromUTF8($chinese, $config, $context),
-                " (Chinese)"
-            );
-        }
+    function test_convertFromUTF8_iconvNoChars() {
+        if (!function_exists('iconv')) return;
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
+        $this->assertIdentical(
+            HTMLPurifier_Encoder::convertFromUTF8($this->getZhongWen(), $this->config, $this->context),
+            " (Chinese)"
+        );
+    }
    
+    function test_convertFromUTF8_phpNormal() {
        // Plain PHP implementation has slightly different behavior
-        $config = HTMLPurifier_Config::create(array(
-            'Core.Encoding' => 'ISO-8859-1',
-            'Test.ForceNoIconv' => true
-        ));
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
+        $this->config->set('Test', 'ForceNoIconv', true);
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $config, $context),
+            HTMLPurifier_Encoder::convertFromUTF8("\xC3\xB6", $this->config, $this->context),
            "\xF6"
        );
+    }
    
+    function test_convertFromUTF8_phpNoChars() {
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
+        $this->config->set('Test', 'ForceNoIconv', true);
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertFromUTF8($chinese, $config, $context),
+            HTMLPurifier_Encoder::convertFromUTF8($this->getZhongWen(), $this->config, $this->context),
            "?? (Chinese)"
        );
+    }
    
+    function test_convertFromUTF8_withProtection() {
        // Preserve the characters!
-        $config = HTMLPurifier_Config::create(array(
-            'Core.Encoding' => 'ISO-8859-1',
-            'Core.EscapeNonASCIICharacters' => true
-        ));
+        $this->config->set('Core', 'Encoding', 'ISO-8859-1');
+        $this->config->set('Core', 'EscapeNonASCIICharacters', true);
        $this->assertIdentical(
-            HTMLPurifier_Encoder::convertFromUTF8($chinese, $config, $context),
+            HTMLPurifier_Encoder::convertFromUTF8($this->getZhongWen(), $this->config, $this->context),
            "&#20013;&#25991; (Chinese)"
        );
        
@@ -139,5 +137,39 @@ class HTMLPurifier_EncoderTest extends HTMLPurifier_Harness
        
    }
    
+    function assertASCIISupportCheck($enc, $ret) {
+        $test = HTMLPurifier_Encoder::testEncodingSupportsASCII($enc, true);
+        if ($test === false) return;
+        $this->assertIdentical(
+            HTMLPurifier_Encoder::testEncodingSupportsASCII($enc),
+            $ret
+        );
+        $this->assertIdentical(
+            HTMLPurifier_Encoder::testEncodingSupportsASCII($enc, true),
+            $ret
+        );
+    }
+    
+    function test_testEncodingSupportsASCII() {
+        $this->assertASCIISupportCheck('Shift_JIS', array("\xC2\xA5" => '\\', "\xE2\x80\xBE" => '~'));
+        $this->assertASCIISupportCheck('JOHAB', array("\xE2\x82\xA9" => '\\'));
+        $this->assertASCIISupportCheck('ISO-8859-1', array());
+        $this->assertASCIISupportCheck('dontexist', array()); // canary
+    }
+    
+    function testShiftJIS() {
+        if (!function_exists('iconv')) return;
+        $this->config->set('Core', 'Encoding', 'Shift_JIS');
+        // This actually looks like a Yen, but we're going to treat it differently
+        $this->assertIdentical(
+            HTMLPurifier_Encoder::convertFromUTF8('\\~', $this->config, $this->context),
+            '\\~'
+        );
+        $this->assertIdentical(
+            HTMLPurifier_Encoder::convertToUTF8('\\~', $this->config, $this->context),
+            '\\~'
+        );
+    }
+    
 }

Author	SHA1	Message	Date
Edward Z. Yang	f38e81785f	Release 2.1.5 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1814 48356398-32a2-884e-a903-53898d9a118a	2008-06-19 22:57:15 +00:00
Edward Z. Yang	2cc829a8cf	Fix PHP 4.3.9/10 bug with float handling git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1806 48356398-32a2-884e-a903-53898d9a118a	2008-06-19 21:13:56 +00:00
Edward Z. Yang	e80a54a7c9	Add missing include. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1805 48356398-32a2-884e-a903-53898d9a118a	2008-06-19 19:58:53 +00:00
Edward Z. Yang	6f71e65661	[2.1.5] [MFH] Fix text-decoration: none bug git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1800 48356398-32a2-884e-a903-53898d9a118a	2008-06-17 03:18:23 +00:00
Edward Z. Yang	6f25c39c3e	[2.1.5] [MFH] Fix Shift_JIS bug. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1793 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 19:01:22 +00:00
Edward Z. Yang	b8b1ac283d	[2.1.5] [MFH] Fix regression in FontFamily git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1792 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 18:54:19 +00:00
Edward Z. Yang	450fc6649d	[2.1.5] [MFH] Fix Shift_JIS encoding wonkiness with yen symbols and whatnot, as well as other patches git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1791 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 18:49:56 +00:00
Edward Z. Yang	369a69d533	[2.1.5] [MFH] Fix stray backslashes in font-family. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1790 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 17:43:48 +00:00
Edward Z. Yang	72f5819ef6	[2.1.5] [MFH] Round up imagecrash support with HTML.MaxImgLength git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1789 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 17:38:25 +00:00
Edward Z. Yang	3540ea7fce	[2.1.5] [MFH] Make modules use setup($config) instead of constructor git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1788 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 17:10:39 +00:00
Edward Z. Yang	c03953f85e	[2.1.5] [MFH] Percent encode query and hash, and lazy update with attr validator git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1787 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 04:00:06 +00:00
Edward Z. Yang	0d262b3a1d	Add missing bits from previous commit. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1786 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 01:56:22 +00:00
Edward Z. Yang	234cd2196f	[2.1.5] [MFH] Complete the imagecrash added protection fixes git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1785 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 01:53:31 +00:00
Edward Z. Yang	0dbe87bbc7	[2.1.5] [MFH] Disable Tidy tests git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1784 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 01:25:05 +00:00
Edward Z. Yang	245b5bdb27	Merged r1746: Length and UnitConverter implementation. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1783 48356398-32a2-884e-a903-53898d9a118a	2008-06-11 01:21:36 +00:00
Edward Z. Yang	864cb9e136	- Fix tagging script to work off of php4 - Fix svn.php to not clobber svn extension - Update NEWS git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1743 48356398-32a2-884e-a903-53898d9a118a	2008-05-18 20:12:17 +00:00
Edward Z. Yang	487fcd55ea	Release 2.1.4 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1736 48356398-32a2-884e-a903-53898d9a118a	2008-05-18 18:56:27 +00:00
Edward Z. Yang	ec6b6821cf	[2.1.4] Add information about PHP 5.0.5 or earlier. - Fix segfault in 5.0.x with IDAccumulator test. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1726 48356398-32a2-884e-a903-53898d9a118a	2008-05-16 01:25:22 +00:00
Edward Z. Yang	f26eb7551a	[2.1.4] [MFH] Fixed bug with fallback languages in LanguageFactory git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1724 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 23:20:21 +00:00
Edward Z. Yang	a2aca4819d	[2.1.4] [MFH] Revamp URI handling of percent encoding and validation from r1709 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1721 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:30:20 +00:00
Edward Z. Yang	a75e4c6b7c	[2.1.4] [MFH] getInstance -> instance from r1689 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1720 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:24:34 +00:00
Edward Z. Yang	e7fa8cbdd5	[2.1.4] [MFH] Add protection against imagecrash attack with CSS height/width from r1684 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1719 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:21:37 +00:00
Edward Z. Yang	5fa575f8ac	[2.1.4] [MFH] Encoder optimization and shut-up operator bugfix from r1680 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1718 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:16:36 +00:00
Edward Z. Yang	9f23bc005b	[2.1.4] [MFH] addAttribute() can be called multiple times, from r1634 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1717 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:13:11 +00:00
Edward Z. Yang	957a840f54	[2.1.4] [MFH] Fix bug with rgb(0, 1, 2) color syntax with spaces inside shorthand syntax from r1612 - Also, repair botched comment patch git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1716 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 05:04:39 +00:00
Edward Z. Yang	a7762c5137	[2.1.4] [MFH] Fix bug in comment parsing with DirectLex from r1570 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1715 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:43:52 +00:00
Edward Z. Yang	aca9d725ed	[2.1.4] [MFH] Fix bug with trusted script handling in libxml versions later than 2.6.28 from r1553. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1714 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:40:13 +00:00
Edward Z. Yang	4ce3deba26	[2.1.4] [MFH] Recursive auto-close with <span><span><div> from r1492 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1713 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:32:05 +00:00
Edward Z. Yang	d4da02ba95	[2.1.4] [MFH] Case-insensitive CSS from r1461 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1712 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:26:30 +00:00
Edward Z. Yang	97d3c8509c	[2.1.4] [MFH] register() for DefinitionCacheFactory from r1464 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1711 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:21:23 +00:00
Edward Z. Yang	21c6803401	[2.1.4] [MFH] Color and CSS bugfixes from r1473 git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1710 48356398-32a2-884e-a903-53898d9a118a	2008-05-15 04:01:45 +00:00
Edward Z. Yang	36badb06f6	Branch out PHP 4 development: we're going PHP 5! git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/php4@1455 48356398-32a2-884e-a903-53898d9a118a	2007-11-23 21:18:32 +00:00
Edward Z. Yang	4066416160	Slight clarification of where ElementDef's required_attr property gets populated git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1454 48356398-32a2-884e-a903-53898d9a118a	2007-11-13 02:49:47 +00:00
Edward Z. Yang	fad6aa45fa	Make phpdoc more efficient, ignore the conf directory git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1450 48356398-32a2-884e-a903-53898d9a118a	2007-11-06 17:50:30 +00:00
Edward Z. Yang	a7e6d85f6d	Update PEAR packager - Ignore standalone directories - Normalize base directory with realpath git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1447 48356398-32a2-884e-a903-53898d9a118a	2007-11-06 16:37:25 +00:00
Edward Z. Yang	c330860606	Release 2.1.3. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1443 48356398-32a2-884e-a903-53898d9a118a	2007-11-06 03:39:59 +00:00
Edward Z. Yang	0ea53e5a3d	Make multitest.php also manage standalone version testing. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1442 48356398-32a2-884e-a903-53898d9a118a	2007-11-06 03:34:45 +00:00
Edward Z. Yang	68167176dc	[2.1.3] - Officially support 4.3.7 and up - Modify PH5P to remove incompatible parameter type def - Add more versions to multitest git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1441 48356398-32a2-884e-a903-53898d9a118a	2007-11-05 05:25:59 +00:00
Edward Z. Yang	bb08f679f0	[2.1.3] - Work around unnecessary DOMElement type-cast in PH5P that caused errors in PHP 5.1 - Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only HTMLDefinition errors, this may indicate problems with error-collecting facilities in PHP 5 - Make ErrorCollectorEMock work in both PHP 4 and PHP 5 . tests/multitest.php allows you to test multiple versions by running tests/index.php through multiple interpreters using `phpv` shell script (you must provide this script!) . Minor cosmetic change to flush-definition-cache.php: trailing newline is outputted . Maintenance script for generating PH5P patch added, original PH5P source file also added under version control . Full unit test runner script title made more descriptive with PHP version git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1440 48356398-32a2-884e-a903-53898d9a118a	2007-11-05 05:01:51 +00:00
Edward Z. Yang	8cd1806ec8	Update INSTALL file with better instructions. Translation needs updating. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1439 48356398-32a2-884e-a903-53898d9a118a	2007-11-05 03:40:32 +00:00
Edward Z. Yang	1274cfed49	[2.1.3] Fix possible error in DirectLex reported by Nate Abele git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1438 48356398-32a2-884e-a903-53898d9a118a	2007-11-05 03:22:22 +00:00
Edward Z. Yang	1ab47ba949	Update NEWS. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1436 48356398-32a2-884e-a903-53898d9a118a	2007-11-02 03:20:55 +00:00
Edward Z. Yang	da95ee096a	Beef up HTML Purifier help message. Todo: make it hideable. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1435 48356398-32a2-884e-a903-53898d9a118a	2007-11-02 01:55:45 +00:00
Edward Z. Yang	6d7250c309	Update Doxygen file after doxygen -u command git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1429 48356398-32a2-884e-a903-53898d9a118a	2007-10-30 03:08:06 +00:00
Edward Z. Yang	df55df1083	Update Doxyfile with new paths, also exclude standalone directory git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1428 48356398-32a2-884e-a903-53898d9a118a	2007-10-30 02:46:26 +00:00
Edward Z. Yang	1a8d864a42	Have tests also check for test-settings in conf file, this allows for configuration files to be separately versioned git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1427 48356398-32a2-884e-a903-53898d9a118a	2007-10-30 02:26:11 +00:00
Edward Z. Yang	552102f7f2	[2.1.3] - HTMLDefinition->addElement now returns a reference to the created element object, as implied by the documentation . Extend Injector hooks to allow for more powerful injector routines . HTMLDefinition->addBlankElement created, as according to the HTMLModule method git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1425 48356398-32a2-884e-a903-53898d9a118a	2007-10-02 22:50:59 +00:00
Edward Z. Yang	f5371bbad4	[2.1.3] - Buggy treatment of end tags of elements that have required attributes fixed (does not manifest on default tag-set) - Spurious internal content reorganization error suppressed . Error unit tests can now specify the expectation of no errors. Future iterations of the harness will be extremely strict about what errors are allowed git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1424 48356398-32a2-884e-a903-53898d9a118a	2007-10-02 01:19:46 +00:00
Edward Z. Yang	c8b020879d	[2.1.3] Refine injector algorithm regarding behavior inside nodes that allow paragraphs inside them git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1423 48356398-32a2-884e-a903-53898d9a118a	2007-09-27 00:39:05 +00:00
Edward Z. Yang	094b20f58f	[2.1.3] Fix PHP warning from MakeAbsolute, also improve URIFilter documentation git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1422 48356398-32a2-884e-a903-53898d9a118a	2007-09-27 00:07:27 +00:00
Edward Z. Yang	f2df669eec	Refactor IDAccumulator so that unit tests now work, and initialization is inside the class. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1421 48356398-32a2-884e-a903-53898d9a118a	2007-09-26 23:36:37 +00:00
Edward Z. Yang	ca43df9fdd	[2.1.3] Fatal error when <img> tag (or any other element with required attributes) has 'id' attribute fixed, thanks NykO18 for reporting git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1420 48356398-32a2-884e-a903-53898d9a118a	2007-09-26 23:18:24 +00:00
Edward Z. Yang	5f76796e14	Some small doc updates git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1419 48356398-32a2-884e-a903-53898d9a118a	2007-09-25 02:42:35 +00:00
Edward Z. Yang	1f9a6ba30e	[2.1.3] Activate strict blockquote functionality for HTML 4.01 Strict. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1417 48356398-32a2-884e-a903-53898d9a118a	2007-09-09 01:46:59 +00:00
Edward Z. Yang	ccca8cc34f	[2.1.3] Rename configuration directive git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1416 48356398-32a2-884e-a903-53898d9a118a	2007-09-09 01:35:50 +00:00
Edward Z. Yang	28c29656af	[2.1.3] Fix off-by-one bug in injector functionality for dormant injectors git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1415 48356398-32a2-884e-a903-53898d9a118a	2007-09-09 01:27:09 +00:00
Edward Z. Yang	88f4f57a47	[2.1.3] Fix poor include ordering. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1414 48356398-32a2-884e-a903-53898d9a118a	2007-09-06 19:38:12 +00:00
Edward Z. Yang	43a98de909	Fix up some comments, reduce code duplication. git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1409 48356398-32a2-884e-a903-53898d9a118a	2007-09-04 00:15:07 +00:00
@@ -1 +1 @@
-2.1.2
+2.1.5