Release 4.3.0

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Fix CSS URL innerHTML/cssText escaping bug.
2025-08-06 14:16:32 +02:00 · 2011-03-27 23:02:49 +01:00 · 2011-03-27 21:24:32 +01:00 · 2011-03-27 20:35:43 +01:00 · 2011-03-27 11:50:52 +01:00 · 2011-03-24 22:54:39 +00:00
140 changed files with 2784 additions and 888 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -18,3 +18,5 @@ docs/doxygen*
 *.phpt.php
 *.phpt.skip.php
 *.htmlt.ini
+*.patch
+/*.php
--- a/2
+++ b/2
@@ -31,7 +31,7 @@ PROJECT_NAME           = HTMLPurifier
 # This could be handy for archiving the generated documentation or
 # if some version control system is used.

-PROJECT_NUMBER         = 4.0.0
+PROJECT_NUMBER         = 4.3.0

 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
 # base path where the generated documentation will be put.
--- a/2
+++ b/2
@@ -1,4 +1,4 @@
-5 - Major feature enhancements
+9 - Major security fixes

 [ Appendix A: Release focus IDs ]
 0 - N/A
--- a/1
+++ b/1
@@ -18,6 +18,7 @@ with these contents.
 HTML Purifier is PHP 5 only, and is actively tested from PHP 5.0.5 and
 up. It has no core dependencies with other libraries. PHP
 4 support was deprecated on December 31, 2007 with HTML Purifier 3.0.0.
+HTML Purifier is not compatible with zend.ze1_compatibility_mode.

 These optional extensions can enhance the capabilities of HTML Purifier:

--- a/97
+++ b/97
@@ -9,6 +9,103 @@ NEWS ( CHANGELOG and HISTORY )                                     HTMLPurifier
    . Internal change
 ==========================

+4.3.0, released 2011-03-27
+# Fixed broken caching of customized raw definitions, but requires an
+  API change.  The old API still works but will emit a warning,
+  see http://htmlpurifier.org/docs/enduser-customize.html#optimized
+  for how to upgrade your code.
+# Protect against Internet Explorer innerHTML behavior by specially
+  treating attributes with backticks but no angled brackets, quotes or
+  spaces.  This constitutes a slight semantic change, which can be
+  reverted using %Output.FixInnerHTML.  Reported by Neike Taika-Tessaro
+  and Mario Heiderich.
+# Protect against cssText/innerHTML by restricting allowed characters
+  used in fonts further than mandated by the specification and encoding
+  some extra special characters in URLs.  Reported by Neike
+  Taika-Tessaro and Mario Heiderich.
+! Added %HTML.Nofollow to add rel="nofollow" to external links.
+! More types of SPL autoloaders allowed on later versions of PHP.
+! Implementations for position, top, left, right, bottom, z-index
+  when %CSS.Trusted is on.
+! Add %Cache.SerializerPermissions option for custom serializer
+  directory/file permissions
+! Fix longstanding bug in Flash support for non-IE browsers, and
+  allow more wmode attributes.
+! Add %CSS.AllowedFonts to restrict permissible font names.
+- Switch to an iterative traversal of the DOM, which prevents us
+  from running out of stack space for deeply nested documents.
+  Thanks Maxim Krizhanovsky for contributing a patch.
+- Make removal of conditional IE comments ungreedy; thanks Bernd
+  for reporting.
+- Escape CDATA before removing Internet Explorer comments.
+- Fix removal of id attributes under certain conditions by ensuring
+  armor attributes are preserved when recreating tags.
+- Check if schema.ser was corrupted.
+- Check if zend.ze1_compatibility_mode is on, and error out if it is.
+  This safety check is only done for HTMLPurifier.auto.php; if you
+  are using standalone or the specialized includes files, you're
+  expected to know what you're doing.
+- Stop repeatedly writing the cache file after I'm done customizing a
+  raw definition.  Reported by ajh.
+- Switch to using require_once in the Bootstrap to work around bad
+  interaction with Zend Debugger and APC.  Reported by Antonio Parraga.
+- Fix URI handling when hostname is missing but scheme is present.
+  Reported by Neike Taika-Tessaro.
+- Fix missing numeric entities on DirectLex; thanks Neike Taika-Tessaro
+  for reporting.
+- Fix harmless notice from indexing into empty string.  Thanks Matthijs
+  Kooijman <matthijs@stdin.nl> for reporting.
+- Don't autoclose if no parent elements are able to support the element
+  that triggered the autoclose.  In particular fixes strange behavior
+  of stray <li> tags.  Thanks pkuliga@gmail.com for reporting and
+  Neike Taika-Tessaro <pinkgothic@gmail.com> for debugging assistance.
+
+4.2.0, released 2010-09-15
+! Added %Core.RemoveProcessingInstructions, which lets you remove
+  <? ... ?> statements.
+! Added %URI.DisableResources functionality; the directive originally
+  did nothing.  Thanks David Rothstein for reporting.
+! Add documentation about configuration directive types.
+! Add %CSS.ForbiddenProperties configuration directive.
+! Add %HTML.FlashAllowFullScreen to permit embedded Flash objects
+  to utilize full-screen mode.
+! Add optional support for the <code>file</code> URI scheme, enable
+  by explicitly setting %URI.AllowedSchemes.
+! Add %Core.NormalizeNewlines options to allow turning off newline
+  normalization.
+- Fix improper handling of Internet Explorer conditional comments
+  by parser.  Thanks zmonteca for reporting.
+- Fix missing attributes bug when running on Mac Snow Leopard and APC.
+  Thanks sidepodcast for the fix.
+- Warn if an element is allowed, but an attribute it requires is
+  not allowed.
+
+4.1.1, released 2010-05-31
+- Fix undefined index warnings in maintenance scripts.
+- Fix bug in DirectLex for parsing elements with a single attribute
+  with entities.
+- Rewrite CSS output logic for font-family and url().  Thanks Mario
+  Heiderich <mario.heiderich@googlemail.com> for reporting and Takeshi
+  Terada <t-terada@violet.plala.or.jp> for suggesting the fix.
+- Emit an error for CollectErrors if a body is extracted
+- Fix bug in background-position for center keyword handling.
+- Fix infinite loop when a wrapper element is inserted in a context
+  where it's not allowed.  Thanks Lars <lars@renoz.dk> for reporting.
+- Remove +x bit and shebang from index.php; only supported mode is to
+  explicitly call it with php.
+- Make test script less chatty when log_errors is on.
+
+4.1.0, released 2010-04-26
+! Support proprietary height attribute on table element
+! Support YouTube slideshows that contain /cp/ in their URL.
+! Support for data: URI scheme; not enabled by default, add it using
+  %URI.AllowedSchemes
+! Support flashvars when using %HTML.SafeObject and %HTML.SafeEmbed.
+! Support for Internet Explorer compatibility with %HTML.SafeObject
+  using %Output.FlashCompat.
+! Handle <ol><ol> properly, by inserting the necessary <li> tag.
+- Always quote the insides of url(...) in CSS.
+
 4.0.0, released 2009-07-07
 # APIs for ConfigSchema subsystem have substantially changed. See
  docs/dev-config-bcbreaks.txt for details; in essence, anything that
--- a/87
+++ b/87
@@ -11,22 +11,38 @@ If no interest is expressed for a feature that may require a considerable
 amount of effort to implement, it may get endlessly delayed. Do not be
 afraid to cast your vote for the next feature to be implemented!

- Built-in support for target="_blank" on all external links
- Incorporate data: support as implemented here:
-  http://htmlpurifier.org/phorum/read.php?3,3491,3548
- Fix ImgRequired to handle data correctly
- Incorporate download and resize support as implemented here:
-  http://htmlpurifier.org/phorum/read.php?3,2795,3628
- Think about allowing explicit order of operations hooks for transforms
- Add "register" field to config schemas to eliminate dependence on
-  naming conventions
- Add examples to everything (make built-in which also automatically
-  gives output)
+Things to do as soon as possible:
+
+ - Think about allowing explicit order of operations hooks for transforms
+ - Inputs don't do the right thing with submit
+ - Fix "<.<" bug (trailing < is removed if not EOD)
+ - Build in better internal state dumps and debugging tools for remote
+   debugging
+ - Allowed/Allowed* have strange interactions when both set
+ - Transform lone embeds into object tags
+ - Deprecated config options that emit warnings when you set them (with
+   a way of muting the warning if you really want to)
+ - Make HTML.Trusted work with Output.FlashCompat

 FUTURE VERSIONS
 ---------------

-4.1 release [It's All About Trust] (floating)
+4.4 release [OMG CONFIG PONIES]
+ ! Fix Printer. It's from the old days when we didn't have decent XML classes
+ ! Factor demo.php into a set of Printer classes, and then create a stub
+   file for users here (inside the actual HTML Purifier library)
+ - Fix error handling with form construction
+ - Do encoding validation in Printers, or at least, where user data comes in
+ - Config: Add examples to everything (make built-in which also automatically
+   gives output)
+ - Add "register" field to config schemas to eliminate dependence on
+   naming conventions (try to remember why we ultimately decided on this)
+
+5.0 release [HTML 5]
+ # Swap out code to use html5lib tokenizer and tree-builder
+ ! Allow turning off of FixNesting and required attribute insertion
+
+5.1 release [It's All About Trust] (floating)
 # Implement untrusted, dangerous elements/attributes
 # Implement IDREF support (harder than it seems, since you cannot have
   IDREFs to non-existent IDs)
@@ -35,36 +51,23 @@ FUTURE VERSIONS
 # Frameset XHTML 1.0 and HTML 4.01 doctypes
 - Figure out how to simultaneously set %CSS.Trusted and %HTML.Trusted (?)

-4.2 release [Error'ed]
+5.2 release [Error'ed]
 # Error logging for filtering/cleanup procedures
- - XSS-attempt detection--certain errors are flagged XSS-like
-
-4.3 release [Do What I Mean, Not What I Say]
 # Additional support for poorly written HTML
    - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!)
    - Friendly strict handling of <address> (block -> <br>)
- ? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
-    1. Analyzing which tags to remove duplicants
-    2. Ensure attributes are merged into the parent tag
-    3. Extend the tag exclusion system to specify whether or not the
-    contents should be dropped or not (currently, there's code that could do
-    something like this if it didn't drop the inner text too.)
- - Remove <span> tags that don't do anything (no attributes)
+ - XSS-attempt detection--certain errors are flagged XSS-like
 - Append something to duplicate IDs so they're still usable (impl. note: the
   dupe detector would also need to detect the suffix as well)
- - Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg

-5.0 release [Beyond HTML]
+6.0 release [Beyond HTML]
 # Legit token based CSS parsing (will require revamping almost every
-   AttrDef class). Probably will use CSSTidy class?
+   AttrDef class). Probably will use CSSTidy
 # More control over allowed CSS properties using a modularization
- # HTML 5 support
 # IRI support (this includes IDN)
 - Standardize token armor for all areas of processing
- - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
-   Also, enable disabling of directionality

-6.0 release [To XML and Beyond]
+7.0 release [To XML and Beyond]
 - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
    - Hooks for adding custom processors to custom namespaced tags and
      attributes, offer default implementation
@@ -75,25 +78,14 @@ Ongoing
 - Refactor unit tests into lots of test methods
 - Plugins for major CMSes (COMPLEX)
    - phpBB
-    - Drupal needs loving!
-    - Phorum need loving!
-    - more! (look for ones that use WYSIWYGs)
-    - Also, maybe a FAQ for extension writers with HTML Purifier
+    - Also, a FAQ for extension writers with HTML Purifier

 AutoFormat
 - Smileys
 - Syntax highlighting (with GeSHi) with <pre> and possibly <?php
 - Look at http://drupal.org/project/Modules/category/63 for ideas

-Optimizations
- - Reduce size of internal data-structures (esp. HTMLDefinition)
- - Get PH5P working with the latest versions of DOM, which have much more
-   stringent error checking procedures. Maybe convert straight to tokens.
- - Get rid of set_include_path(). Save this for another major release.
-
 Neat feature related
- ! Factor demo.php into a set of Printer classes, and then create a stub
-   file for users here (inside the actual HTML Purifier library)
 ! Support exporting configuration, so users can easily tweak settings
   in the demo, and then copy-paste into their own setup
 - Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
@@ -110,10 +102,21 @@ Neat feature related
 - Full set of color keywords. Also, a way to add onto them without
   finalizing the configuration object.
 - Write a var_export and memcached DefinitionCache - Denis
+ - Built-in support for target="_blank" on all external links
+ - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
+   Also, enable disabling of directionality
+ ? Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg
+ ? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
+    1. Analyzing which tags to remove duplicants
+    2. Ensure attributes are merged into the parent tag
+    3. Extend the tag exclusion system to specify whether or not the
+    contents should be dropped or not (currently, there's code that could do
+    something like this if it didn't drop the inner text too.)

 Maintenance related (slightly boring)
 # CHMOD install script for PEAR installs
 ! Factor out command line parser into its own class, and unit test it
+ - Reduce size of internal data-structures (esp. HTMLDefinition)
 - Allow merging configurations.  Thus,
        a -> b -> default
        c -> d -> default
--- a/2
+++ b/2
@@ -1 +1 @@
-4.0.0
+4.3.0
--- a/15
+++ b/15
@@ -1,7 +1,8 @@
-HTML Purifier 4.0 is a major feature release focused on configuration
-It deprecates the $config->set('Ns', 'Directive', $value) syntax for
-$config->set('Ns.Directive', $value); both syntaxes work but the
-former will throw errors.  There are also some new features:  robust
-support for name/id, configuration inheritance, remove nbsp in
-the RemoveEmpty autoformatter, userland configuration directives
-and configuration serialization.
+HTML Purifier 4.3.0 is a major security release addressing various
+security vulnerabilities related to user-submitted code and legitimate
+client-side scripts.  It also contains an accumulation of new features
+and bugfixes over half a year.  New configuration options include
+%CSS.Trusted, %CSS.AllowedFonts and %Cache.SerializerPermissions.
+There is a backwards-incompatible API change for customized raw
+definitions; see <http://htmlpurifier.org/docs/enduser-customize.html#optimized>
+for details.
--- a/configdoc/styles/plain.xsl
+++ b/configdoc/styles/plain.xsl
@@ -40,12 +40,26 @@
                            </xsl:apply-templates>
                        </ul>
                    </div>
+                    <div id="typesContainer">
+                        <h2>Types</h2>
+                        <xsl:apply-templates select="$typeLookup" mode="types" />
+                    </div>
                    <xsl:apply-templates />
                </div>
            </body>
        </html>
    </xsl:template>

+    <xsl:template match="type" mode="types">
+        <div class="type-block">
+            <xsl:attribute name="id">type-<xsl:value-of select="@id" /></xsl:attribute>
+            <h3><code><xsl:value-of select="@id" /></code>: <xsl:value-of select="@name" /></h3>
+            <div class="type-description">
+                <xsl:copy-of xmlns:xhtml="http://www.w3.org/1999/xhtml" select="xhtml:div/node()" />
+            </div>
+        </div>
+    </xsl:template>
+
    <xsl:template match="title" mode="toc" />
    <xsl:template match="namespace" mode="toc">
        <xsl:param name="overflowNumber" />
@@ -192,10 +206,13 @@
            <td>
                <xsl:variable name="type" select="text()" />
                <xsl:attribute name="class">type type-<xsl:value-of select="$type" /></xsl:attribute>
-                <xsl:value-of select="$typeLookup/type[@id=$type]/text()" />
-                <xsl:if test="@allow-null='yes'">
-                    (or null)
-                </xsl:if>
+                <a>
+                    <xsl:attribute name="href">#type-<xsl:value-of select="$type" /></xsl:attribute>
+                    <xsl:value-of select="$typeLookup/type[@id=$type]/@name" />
+                    <xsl:if test="@allow-null='yes'">
+                        (or null)
+                    </xsl:if>
+                </a>
            </td>
        </tr>
    </xsl:template>
--- a/configdoc/types.xml
+++ b/configdoc/types.xml
@@ -1,16 +1,68 @@
 <?xml version="1.0" encoding="UTF-8"?>
 <types>
-  <type id="string">String</type>
-  <type id="istring">Case-insensitive string</type>
-  <type id="text">Text</type>
-  <type id="itext">Case-insensitive text</type>
-  <type id="int">Integer</type>
-  <type id="float">Float</type>
-  <type id="bool">Boolean</type>
-  <type id="lookup">Lookup array</type>
-  <type id="list">Array list</type>
-  <type id="hash">Associative array</type>
-  <type id="mixed">Mixed</type>
+  <type id="string" name="String"><div xmlns="http://www.w3.org/1999/xhtml">
+    A <a
+    href="http://docs.php.net/manual/en/language.types.string.php">sequence
+    of characters</a>.
+  </div></type>
+  <type id="istring" name="Case-insensitive string"><div xmlns="http://www.w3.org/1999/xhtml">
+    A series of case-insensitive characters.  Internally, upper-case
+    ASCII characters will be converted to lower-case.
+  </div></type>
+  <type id="text" name="Text"><div xmlns="http://www.w3.org/1999/xhtml">
+    A series of characters that may contain newlines.  Text tends to
+    indicate human-oriented text, as opposed to a machine format.
+  </div></type>
+  <type id="itext" name="Case-insensitive text"><div xmlns="http://www.w3.org/1999/xhtml">
+    A series of case-insensitive characters that may contain newlines.
+  </div></type>
+  <type id="int" name="Integer"><div xmlns="http://www.w3.org/1999/xhtml">
+    An <a
+      href="http://docs.php.net/manual/en/language.types.integer.php">
+      integer</a>.  You are alternatively permitted to pass a string of
+    digits instead, which will be cast to an integer using
+    <code>(int)</code>.
+  </div></type>
+  <type id="float" name="Float"><div xmlns="http://www.w3.org/1999/xhtml">
+    A <a href="http://docs.php.net/manual/en/language.types.float.php">
+      floating point number</a>.  You are alternatively permitted to
+    pass a numeric string (as defined by <code>is_numeric()</code>),
+    which will be cast to a float using <code>(float)</code>.
+  </div></type>
+  <type id="bool" name="Boolean"><div xmlns="http://www.w3.org/1999/xhtml">
+    A <a
+      href="http://docs.php.net/manual/en/language.types.boolean.php">boolean</a>.
+    You are alternatively permitted to pass an integer <code>0</code> or
+    <code>1</code> (other integers are not permitted) or a string
+    <code>"on"</code>, <code>"true"</code> or <code>"1"</code> for
+    <code>true</code>, and <code>"off"</code>, <code>"false"</code> or
+    <code>"0"</code> for <code>false</code>.
+  </div></type>
+  <type id="lookup" name="Lookup array"><div xmlns="http://www.w3.org/1999/xhtml">
+    An array whose values are <code>true</code>, e.g. <code>array('key'
+      => true, 'key2' => true)</code>.  You are alternatively permitted
+    to pass an array list of the keys <code>array('key', 'key2')</code>
+    or a comma-separated string of keys <code>"key, key2"</code>.  If
+    you pass an array list of values, ensure that your values are
+    strictly numerically indexed: <code>array('key1', 2 =>
+      'key2')</code> will not do what you expect and emits a warning.
+  </div></type>
+  <type id="list" name="Array list"><div xmlns="http://www.w3.org/1999/xhtml">
+    An array which has consecutive integer indexes, e.g.
+    <code>array('val1', 'val2')</code>.  You are alternatively permitted
+    to pass a comma-separated string of values <code>"val1, val2"</code>.
+    If your array is not in this form, <code>array_values</code> is run
+    on the array and a warning is emitted.
+  </div></type>
+  <type id="hash" name="Associative array"><div xmlns="http://www.w3.org/1999/xhtml">
+    An array which is a mapping of keys to values, e.g.
+    <code>array('key1' => 'val1', 'key2' => 'val2')</code>.  You are
+    alternatively permitted to pass a comma-separated string of
+    key-colon-value strings, e.g. <code>"key1: val1, key2: val2"</code>.
+  </div></type>
+  <type id="mixed" name="Mixed"><div xmlns="http://www.w3.org/1999/xhtml">
+    An arbitrary PHP value of any type.
+  </div></type>
 </types>

 <!-- vim: et sw=4 sts=4
--- a/configdoc/usage.xml
+++ b/configdoc/usage.xml
@@ -6,6 +6,7 @@
  </file>
  <file name="HTMLPurifier/Lexer.php">
   <line>81</line>
+   <line>284</line>
  </file>
  <file name="HTMLPurifier/Lexer/DirectLex.php">
   <line>53</line>
@@ -31,14 +32,24 @@
   <line>218</line>
  </file>
 </directive>
- <directive id="CSS.AllowImportant">
+ <directive id="CSS.Trusted">
  <file name="HTMLPurifier/CSSDefinition.php">
   <line>222</line>
  </file>
 </directive>
+ <directive id="CSS.AllowImportant">
+  <file name="HTMLPurifier/CSSDefinition.php">
+   <line>226</line>
+  </file>
+ </directive>
 <directive id="CSS.AllowedProperties">
  <file name="HTMLPurifier/CSSDefinition.php">
-   <line>275</line>
+   <line>296</line>
+  </file>
+ </directive>
+ <directive id="CSS.ForbiddenProperties">
+  <file name="HTMLPurifier/CSSDefinition.php">
+   <line>310</line>
  </file>
 </directive>
 <directive id="Cache.DefinitionImpl">
@@ -85,22 +96,40 @@
 </directive>
 <directive id="Output.CommentScriptContents">
  <file name="HTMLPurifier/Generator.php">
-   <line>45</line>
+   <line>61</line>
+  </file>
+ </directive>
+ <directive id="Output.FixInnerHTML">
+  <file name="HTMLPurifier/Generator.php">
+   <line>62</line>
  </file>
 </directive>
 <directive id="Output.SortAttr">
  <file name="HTMLPurifier/Generator.php">
-   <line>46</line>
+   <line>63</line>
+  </file>
+ </directive>
+ <directive id="Output.FlashCompat">
+  <file name="HTMLPurifier/Generator.php">
+   <line>64</line>
  </file>
 </directive>
 <directive id="Output.TidyFormat">
  <file name="HTMLPurifier/Generator.php">
-   <line>75</line>
+   <line>93</line>
+  </file>
+ </directive>
+ <directive id="Core.NormalizeNewlines">
+  <file name="HTMLPurifier/Generator.php">
+   <line>107</line>
+  </file>
+  <file name="HTMLPurifier/Lexer.php">
+   <line>266</line>
  </file>
 </directive>
 <directive id="Output.Newline">
  <file name="HTMLPurifier/Generator.php">
-   <line>89</line>
+   <line>108</line>
  </file>
 </directive>
 <directive id="HTML.BlockWrapper">
@@ -130,12 +159,12 @@
 </directive>
 <directive id="HTML.ForbiddenElements">
  <file name="HTMLPurifier/HTMLDefinition.php">
-   <line>337</line>
+   <line>342</line>
  </file>
 </directive>
 <directive id="HTML.ForbiddenAttributes">
  <file name="HTMLPurifier/HTMLDefinition.php">
-   <line>338</line>
+   <line>343</line>
  </file>
 </directive>
 <directive id="HTML.Trusted">
@@ -143,7 +172,7 @@
   <line>202</line>
  </file>
  <file name="HTMLPurifier/Lexer.php">
-   <line>258</line>
+   <line>271</line>
  </file>
  <file name="HTMLPurifier/HTMLModule/Image.php">
   <line>27</line>
@@ -167,15 +196,20 @@
 </directive>
 <directive id="HTML.Proprietary">
  <file name="HTMLPurifier/HTMLModuleManager.php">
-   <line>221</line>
+   <line>220</line>
  </file>
 </directive>
 <directive id="HTML.SafeObject">
  <file name="HTMLPurifier/HTMLModuleManager.php">
-   <line>226</line>
+   <line>223</line>
  </file>
 </directive>
 <directive id="HTML.SafeEmbed">
+  <file name="HTMLPurifier/HTMLModuleManager.php">
+   <line>226</line>
+  </file>
+ </directive>
+ <directive id="HTML.Nofollow">
  <file name="HTMLPurifier/HTMLModuleManager.php">
   <line>229</line>
  </file>
@@ -205,7 +239,12 @@
 </directive>
 <directive id="Core.ConvertDocumentToFragment">
  <file name="HTMLPurifier/Lexer.php">
-   <line>267</line>
+   <line>282</line>
+  </file>
+ </directive>
+ <directive id="Core.RemoveProcessingInstructions">
+  <file name="HTMLPurifier/Lexer.php">
+   <line>303</line>
  </file>
 </directive>
 <directive id="URI.">
@@ -220,6 +259,9 @@
  <file name="HTMLPurifier/URIDefinition.php">
   <line>64</line>
  </file>
+  <file name="HTMLPurifier/URIScheme.php">
+   <line>75</line>
+  </file>
 </directive>
 <directive id="URI.Base">
  <file name="HTMLPurifier/URIDefinition.php">
@@ -254,6 +296,11 @@
   <line>12</line>
  </file>
 </directive>
+ <directive id="CSS.AllowedFonts">
+  <file name="HTMLPurifier/AttrDef/CSS/FontFamily.php">
+   <line>50</line>
+  </file>
+ </directive>
 <directive id="Attr.AllowedClasses">
  <file name="HTMLPurifier/AttrDef/HTML/Class.php">
   <line>18</line>
@@ -320,7 +367,7 @@
 </directive>
 <directive id="Attr.DefaultInvalidImageAlt">
  <file name="HTMLPurifier/AttrTransform/ImgRequired.php">
-   <line>32</line>
+   <line>33</line>
  </file>
 </directive>
 <directive id="HTML.Attr.Name.UseCDATA">
@@ -331,6 +378,11 @@
   <line>13</line>
  </file>
 </directive>
+ <directive id="HTML.FlashAllowFullScreen">
+  <file name="HTMLPurifier/AttrTransform/SafeParam.php">
+   <line>38</line>
+  </file>
+ </directive>
 <directive id="Core.EscapeInvalidChildren">
  <file name="HTMLPurifier/ChildDef/Required.php">
   <line>62</line>
@@ -341,6 +393,12 @@
   <line>91</line>
  </file>
 </directive>
+ <directive id="Cache.SerializerPermissions">
+  <file name="HTMLPurifier/DefinitionCache/Serializer.php">
+   <line>107</line>
+   <line>124</line>
+  </file>
+ </directive>
 <directive id="Filter.ExtractStyleBlocks.TidyImpl">
  <file name="HTMLPurifier/Filter/ExtractStyleBlocks.php">
   <line>41</line>
@@ -409,7 +467,7 @@
 </directive>
 <directive id="Core.EscapeInvalidTags">
  <file name="HTMLPurifier/Strategy/MakeWellFormed.php">
-   <line>45</line>
+   <line>53</line>
  </file>
  <file name="HTMLPurifier/Strategy/RemoveForeignElements.php">
   <line>19</line>
--- a/docs/enduser-customize.html
+++ b/docs/enduser-customize.html
@@ -146,7 +146,9 @@
 <pre>$config = HTMLPurifier_Config::createDefault();
 $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
 $config-&gt;set('HTML.DefinitionRev', 1);
-$def = $config-&gt;getHTMLDefinition(true);</pre>
+if ($def = $config-&gt;maybeGetRawHTMLDefinition()) {
+    // our code will go here
+}</pre>

 <p>
  Assuming that HTML Purifier has already been properly loaded (hint:
@@ -174,23 +176,15 @@ $def = $config-&gt;getHTMLDefinition(true);</pre>
  </li>
  <li>
    The fourth line retrieves a raw <code>HTMLPurifier_HTMLDefinition</code>
-    object that we will be tweaking. If the parameter was removed, we
-    would be retrieving a fully formed definition object, which is somewhat
-    useless for customization purposes.
+    object that we will be tweaking.  Interestingly enough, we have
+    placed it in an if block: this is because
+    <code>maybeGetRawHTMLDefinition</code>, as its name suggests, may
+    return a NULL, in which case we should skip doing any
+    initialization.  This, in fact, will correspond to when our fully
+    customized object is already in the cache.
  </li>
 </ul>

-<h3>Broken backwards-compatibility</h3>
-
-<p>
-  Those of you who have already been twiddling around with the raw
-  HTML definition object, you'll be noticing that you're getting an error
-  when you attempt to retrieve the raw definition object without specifying
-  a DefinitionID.  It is vital to caching (see below) that you make a unique
-  name for your customized definition, so make up something right now and
-  things will operate again.
-</p>
-
 <h2>Turn off caching</h2>

 <p>
@@ -781,6 +775,75 @@ $form-&gt;excludes = array('form' => true);</strong></pre>
  <li><a href="http://repo.or.cz/w/htmlpurifier.git?a=blob;hb=HEAD;f=library/HTMLPurifier/ElementDef.php"><code>library/HTMLPurifier/ElementDef.php</code></a></li>
 </ul>

+<h2 id="optimized">Notes for HTML Purifier 4.2.0 and earlier</h2>
+
+<p>
+    Previously, this tutorial gave some incorrect template code for
+    editing raw definitions, and that template code will now produce the
+    error <q>Due to a documentation error in previous version of HTML
+    Purifier...</q>  Here is how to mechanically transform old-style
+    code into new-style code.
+</p>
+
+<p>
+    First, identify all code that edits the raw definition object, and
+    put it together.  Ensure none of this code must be run on every
+    request; if some sub-part needs to always be run, move it outside
+    this block.  Here is an example below, with the raw definition
+    object code bolded.
+</p>
+
+<pre>$config = HTMLPurifier_Config::createDefault();
+$config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
+$config-&gt;set('HTML.DefinitionRev', 1);
+$def = $config-&gt;getHTMLDefinition(true);
+<strong>$def-&gt;addAttribute('a', 'target', 'Enum#_blank,_self,_parent,_top');</strong>
+$purifier = new HTMLPurifier($config);</pre>
+
+<p>
+    Next, replace the raw definition retrieval with a
+    maybeGetRawHTMLDefinition method call inside an if conditional, and
+    place the editing code inside that if block.
+</p>
+
+<pre>$config = HTMLPurifier_Config::createDefault();
+$config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
+$config-&gt;set('HTML.DefinitionRev', 1);
+<strong>if ($def = $config-&gt;maybeGetRawHTMLDefinition()) {
+    $def-&gt;addAttribute('a', 'target', 'Enum#_blank,_self,_parent,_top');
+}</strong>
+$purifier = new HTMLPurifier($config);</pre>
+
+<p>
+    And you're done!  Alternatively, if you're OK with not ever caching
+    your code, the following will still work and not emit warnings.
+</p>
+
+<pre>$config = HTMLPurifier_Config::createDefault();
+$def = $config-&gt;getHTMLDefinition(true);
+$def-&gt;addAttribute('a', 'target', 'Enum#_blank,_self,_parent,_top');
+$purifier = new HTMLPurifier($config);</pre>
+
+<p>
+    A slightly less efficient version of this was what was going on with
+    old versions of HTML Purifier.
+</p>
+
+<p>
+    <em>Technical notes:</em> ajh pointed out <a
+        href="http://htmlpurifier.org/phorum/read.php?5,5164,5169#msg-5169">in a forum topic</a> that
+    HTML Purifier appeared to be repeatedly writing to the cache even
+    when a cache entry already existed.  Investigation led to the
+    discovery of the following infelicity: caching of customized
+    definitions didn't actually work!  The problem was that even though
+    a cache file would be written out at the end of the process, there
+    was no way for HTML Purifier to say, <q>Actually, I've already got a
+        copy of your work, no need to reconfigure your
+        customizations</q>.  This required the API to change: placing
+    all of the customizations to the raw definition object in a
+    conditional which could be skipped.
+</p>
+
 </body></html>

 <!-- vim: et sw=4 sts=4
--- a/library/HTMLPurifier.autoload.php
+++ b/library/HTMLPurifier.autoload.php
@@ -3,6 +3,7 @@
 /**
 * @file
 * Convenience file that registers autoload handler for HTML Purifier.
+ * It also does some sanity checks.
 */

 if (function_exists('spl_autoload_register') && function_exists('spl_autoload_unregister')) {
@@ -18,4 +19,8 @@ if (function_exists('spl_autoload_register') && function_exists('spl_autoload_un
    }
 }

+if (ini_get('zend.ze1_compatibility_mode')) {
+    trigger_error("HTML Purifier is not compatible with zend.ze1_compatibility_mode; please turn it off", E_USER_ERROR);
+}
+
 // vim: et sw=4 sts=4
--- a/library/HTMLPurifier.includes.php
+++ b/library/HTMLPurifier.includes.php
@@ -7,7 +7,7 @@
 * primary concern and you are using an opcode cache. PLEASE DO NOT EDIT THIS
 * FILE, changes will be overwritten the next time the script is run.
 *
- * @version 4.0.0
+ * @version 4.3.0
 *
 * @warning
 *      You must *not* include any other HTML Purifier files before this file,
@@ -125,6 +125,7 @@ require 'HTMLPurifier/AttrTransform/Lang.php';
 require 'HTMLPurifier/AttrTransform/Length.php';
 require 'HTMLPurifier/AttrTransform/Name.php';
 require 'HTMLPurifier/AttrTransform/NameSync.php';
+require 'HTMLPurifier/AttrTransform/Nofollow.php';
 require 'HTMLPurifier/AttrTransform/SafeEmbed.php';
 require 'HTMLPurifier/AttrTransform/SafeObject.php';
 require 'HTMLPurifier/AttrTransform/SafeParam.php';
@@ -151,6 +152,7 @@ require 'HTMLPurifier/HTMLModule/Image.php';
 require 'HTMLPurifier/HTMLModule/Legacy.php';
 require 'HTMLPurifier/HTMLModule/List.php';
 require 'HTMLPurifier/HTMLModule/Name.php';
+require 'HTMLPurifier/HTMLModule/Nofollow.php';
 require 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
 require 'HTMLPurifier/HTMLModule/Object.php';
 require 'HTMLPurifier/HTMLModule/Presentation.php';
@@ -176,6 +178,7 @@ require 'HTMLPurifier/Injector/DisplayLinkURI.php';
 require 'HTMLPurifier/Injector/Linkify.php';
 require 'HTMLPurifier/Injector/PurifierLinkify.php';
 require 'HTMLPurifier/Injector/RemoveEmpty.php';
+require 'HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php';
 require 'HTMLPurifier/Injector/SafeObject.php';
 require 'HTMLPurifier/Lexer/DOMLex.php';
 require 'HTMLPurifier/Lexer/DirectLex.php';
@@ -195,9 +198,12 @@ require 'HTMLPurifier/Token/Start.php';
 require 'HTMLPurifier/Token/Text.php';
 require 'HTMLPurifier/URIFilter/DisableExternal.php';
 require 'HTMLPurifier/URIFilter/DisableExternalResources.php';
+require 'HTMLPurifier/URIFilter/DisableResources.php';
 require 'HTMLPurifier/URIFilter/HostBlacklist.php';
 require 'HTMLPurifier/URIFilter/MakeAbsolute.php';
 require 'HTMLPurifier/URIFilter/Munge.php';
+require 'HTMLPurifier/URIScheme/data.php';
+require 'HTMLPurifier/URIScheme/file.php';
 require 'HTMLPurifier/URIScheme/ftp.php';
 require 'HTMLPurifier/URIScheme/http.php';
 require 'HTMLPurifier/URIScheme/https.php';
--- a/library/HTMLPurifier.php
+++ b/library/HTMLPurifier.php
@@ -19,7 +19,7 @@
 */

 /*
-    HTML Purifier 4.0.0 - Standards Compliant HTML Filtering
+    HTML Purifier 4.3.0 - Standards Compliant HTML Filtering
    Copyright (C) 2006-2008 Edward Z. Yang

    This library is free software; you can redistribute it and/or
@@ -55,10 +55,10 @@ class HTMLPurifier
 {

    /** Version of HTML Purifier */
-    public $version = '4.0.0';
+    public $version = '4.3.0';

    /** Constant with version of HTML Purifier */
-    const VERSION = '4.0.0';
+    const VERSION = '4.3.0';

    /** Global configuration object */
    public $config;
--- a/library/HTMLPurifier.safe-includes.php
+++ b/library/HTMLPurifier.safe-includes.php
@@ -119,6 +119,7 @@ require_once $__dir . '/HTMLPurifier/AttrTransform/Lang.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/Length.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/Name.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/NameSync.php';
+require_once $__dir . '/HTMLPurifier/AttrTransform/Nofollow.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/SafeEmbed.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/SafeObject.php';
 require_once $__dir . '/HTMLPurifier/AttrTransform/SafeParam.php';
@@ -145,6 +146,7 @@ require_once $__dir . '/HTMLPurifier/HTMLModule/Image.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/Legacy.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/List.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/Name.php';
+require_once $__dir . '/HTMLPurifier/HTMLModule/Nofollow.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/Object.php';
 require_once $__dir . '/HTMLPurifier/HTMLModule/Presentation.php';
@@ -170,6 +172,7 @@ require_once $__dir . '/HTMLPurifier/Injector/DisplayLinkURI.php';
 require_once $__dir . '/HTMLPurifier/Injector/Linkify.php';
 require_once $__dir . '/HTMLPurifier/Injector/PurifierLinkify.php';
 require_once $__dir . '/HTMLPurifier/Injector/RemoveEmpty.php';
+require_once $__dir . '/HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php';
 require_once $__dir . '/HTMLPurifier/Injector/SafeObject.php';
 require_once $__dir . '/HTMLPurifier/Lexer/DOMLex.php';
 require_once $__dir . '/HTMLPurifier/Lexer/DirectLex.php';
@@ -189,9 +192,12 @@ require_once $__dir . '/HTMLPurifier/Token/Start.php';
 require_once $__dir . '/HTMLPurifier/Token/Text.php';
 require_once $__dir . '/HTMLPurifier/URIFilter/DisableExternal.php';
 require_once $__dir . '/HTMLPurifier/URIFilter/DisableExternalResources.php';
+require_once $__dir . '/HTMLPurifier/URIFilter/DisableResources.php';
 require_once $__dir . '/HTMLPurifier/URIFilter/HostBlacklist.php';
 require_once $__dir . '/HTMLPurifier/URIFilter/MakeAbsolute.php';
 require_once $__dir . '/HTMLPurifier/URIFilter/Munge.php';
+require_once $__dir . '/HTMLPurifier/URIScheme/data.php';
+require_once $__dir . '/HTMLPurifier/URIScheme/file.php';
 require_once $__dir . '/HTMLPurifier/URIScheme/ftp.php';
 require_once $__dir . '/HTMLPurifier/URIScheme/http.php';
 require_once $__dir . '/HTMLPurifier/URIScheme/https.php';
--- a/library/HTMLPurifier/AttrDef.php
+++ b/library/HTMLPurifier/AttrDef.php
@@ -82,6 +82,42 @@ abstract class HTMLPurifier_AttrDef
        return preg_replace('/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/', 'rgb(\1,\2,\3)', $string);
    }

+    /**
+     * Parses a possibly escaped CSS string and returns the "pure" 
+     * version of it.
+     */
+    protected function expandCSSEscape($string) {
+        // flexibly parse it
+        $ret = '';
+        for ($i = 0, $c = strlen($string); $i < $c; $i++) {
+            if ($string[$i] === '\\') {
+                $i++;
+                if ($i >= $c) {
+                    $ret .= '\\';
+                    break;
+                }
+                if (ctype_xdigit($string[$i])) {
+                    $code = $string[$i];
+                    for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
+                        if (!ctype_xdigit($string[$i])) break;
+                        $code .= $string[$i];
+                    }
+                    // We have to be extremely careful when adding
+                    // new characters, to make sure we're not breaking
+                    // the encoding.
+                    $char = HTMLPurifier_Encoder::unichr(hexdec($code));
+                    if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
+                    $ret .= $char;
+                    if ($i < $c && trim($string[$i]) !== '') $i--;
+                    continue;
+                }
+                if ($string[$i] === "\n") continue;
+            }
+            $ret .= $string[$i];
+        }
+        return $ret;
+    }
+
 }

 // vim: et sw=4 sts=4
--- a/library/HTMLPurifier/AttrDef/CSS/BackgroundPosition.php
+++ b/library/HTMLPurifier/AttrDef/CSS/BackgroundPosition.php
@@ -59,7 +59,8 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
        $keywords = array();
        $keywords['h'] = false; // left, right
        $keywords['v'] = false; // top, bottom
-        $keywords['c'] = false; // center
+        $keywords['ch'] = false; // center (first word)
+        $keywords['cv'] = false; // center (second word)
        $measures = array();

        $i = 0;
@@ -79,6 +80,13 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
            $lbit = ctype_lower($bit) ? $bit : strtolower($bit);
            if (isset($lookup[$lbit])) {
                $status = $lookup[$lbit];
+                if ($status == 'c') {
+                    if ($i == 0) {
+                        $status = 'ch';
+                    } else {
+                        $status = 'cv';
+                    }
+                }
                $keywords[$status] = $lbit;
                $i++;
            }
@@ -101,20 +109,19 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef

        if (!$i) return false; // no valid values were caught

-
        $ret = array();

        // first keyword
        if     ($keywords['h'])     $ret[] = $keywords['h'];
-        elseif (count($measures))   $ret[] = array_shift($measures);
-        elseif ($keywords['c']) {
-            $ret[] = $keywords['c'];
-            $keywords['c'] = false; // prevent re-use: center = center center
+        elseif ($keywords['ch']) {
+            $ret[] = $keywords['ch'];
+            $keywords['cv'] = false; // prevent re-use: center = center center
        }
+        elseif (count($measures))   $ret[] = array_shift($measures);

        if     ($keywords['v'])     $ret[] = $keywords['v'];
+        elseif ($keywords['cv'])    $ret[] = $keywords['cv'];
        elseif (count($measures))   $ret[] = array_shift($measures);
-        elseif ($keywords['c'])     $ret[] = $keywords['c'];

        if (empty($ret)) return false;
        return implode(' ', $ret);
--- a/library/HTMLPurifier/AttrDef/CSS/FontFamily.php
+++ b/library/HTMLPurifier/AttrDef/CSS/FontFamily.php
@@ -2,11 +2,43 @@

 /**
 * Validates a font family list according to CSS spec
- * @todo whitelisting allowed fonts would be nice
 */
 class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
 {

+    protected $mask = null;
+
+    public function __construct() {
+        $this->mask = '- ';
+        for ($c = 'a'; $c <= 'z'; $c++) $this->mask .= $c;
+        for ($c = 'A'; $c <= 'Z'; $c++) $this->mask .= $c;
+        for ($c = '0'; $c <= '9'; $c++) $this->mask .= $c; // cast-y, but should be fine
+        // special bytes used by UTF-8
+        for ($i = 0x80; $i <= 0xFF; $i++) {
+            // We don't bother excluding invalid bytes in this range,
+            // because our restriction to well-formed UTF-8 will
+            // prevent these from ever occurring.
+            $this->mask .= chr($i);
+        }
+
+        /*
+            PHP's internal strcspn implementation is
+            O(length of string * length of mask), making it inefficient
+            for large masks.  However, it's still faster than
+            preg_match 8)
+          for (p = s1;;) {
+            spanp = s2;
+            do {
+              if (*spanp == c || p == s1_end) {
+                return p - s1;
+              }
+            } while (spanp++ < (s2_end - 1));
+            c = *++p;
+          }
+         */
+        // possible optimization: invert the mask.
+    }
+
    public function validate($string, $config, $context) {
        static $generic_names = array(
            'serif' => true,
@@ -15,6 +47,7 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
            'fantasy' => true,
            'cursive' => true
        );
+        $allowed_fonts = $config->get('CSS.AllowedFonts');

        // assume that no font names contain commas in them
        $fonts = explode(',', $string);
@@ -24,7 +57,9 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
            if ($font === '') continue;
            // match a generic name
            if (isset($generic_names[$font])) {
-                $final .= $font . ', ';
+                if ($allowed_fonts === null || isset($allowed_fonts[$font])) {
+                    $final .= $font . ', ';
+                }
                continue;
            }
            // match a quoted name
@@ -34,50 +69,122 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
                $quote = $font[0];
                if ($font[$length - 1] !== $quote) continue;
                $font = substr($font, 1, $length - 2);
-
-                $new_font = '';
-                for ($i = 0, $c = strlen($font); $i < $c; $i++) {
-                    if ($font[$i] === '\\') {
-                        $i++;
-                        if ($i >= $c) {
-                            $new_font .= '\\';
-                            break;
-                        }
-                        if (ctype_xdigit($font[$i])) {
-                            $code = $font[$i];
-                            for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
-                                if (!ctype_xdigit($font[$i])) break;
-                                $code .= $font[$i];
-                            }
-                            // We have to be extremely careful when adding
-                            // new characters, to make sure we're not breaking
-                            // the encoding.
-                            $char = HTMLPurifier_Encoder::unichr(hexdec($code));
-                            if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
-                            $new_font .= $char;
-                            if ($i < $c && trim($font[$i]) !== '') $i--;
-                            continue;
-                        }
-                        if ($font[$i] === "\n") continue;
-                    }
-                    $new_font .= $font[$i];
-                }
-
-                $font = $new_font;
            }
+
+            $font = $this->expandCSSEscape($font);
+
            // $font is a pure representation of the font name

+            if ($allowed_fonts !== null && !isset($allowed_fonts[$font])) {
+                continue;
+            }
+
            if (ctype_alnum($font) && $font !== '') {
                // very simple font, allow it in unharmed
                $final .= $font . ', ';
                continue;
            }

-            // complicated font, requires quoting
+            // bugger out on whitespace.  form feed (0C) really
+            // shouldn't show up regardless
+            $font = str_replace(array("\n", "\t", "\r", "\x0C"), ' ', $font);

-            // armor single quotes and new lines
-            $font = str_replace("\\", "\\\\", $font);
-            $font = str_replace("'", "\\'", $font);
+            // Here, there are various classes of characters which need
+            // to be treated differently:
+            //  - Alphanumeric characters are essentially safe.  We
+            //    handled these above.
+            //  - Spaces require quoting, though most parsers will do
+            //    the right thing if there aren't any characters that
+            //    can be misinterpreted
+            //  - Dashes rarely occur, but they are fairly unproblematic
+            //    for parsing/rendering purposes.
+            //  The above characters cover the majority of Western font
+            //  names.
+            //  - Arbitrary Unicode characters not in ASCII.  Because
+            //    most parsers give little thought to Unicode, treatment
+            //    of these codepoints is basically uniform, even for
+            //    punctuation-like codepoints.  These characters can
+            //    show up in non-Western pages and are supported by most
+            //    major browsers, for example: "ＭＳ 明朝" is a
+            //    legitimate font-name
+            //    <http://ja.wikipedia.org/wiki/MS_明朝>.  See
+            //    the CSS3 spec for more examples:
+            //    <http://www.w3.org/TR/2011/WD-css3-fonts-20110324/localizedfamilynames.png>
+            //    You can see live samples of these on the Internet:
+            //    <http://www.google.co.jp/search?q=font-family+ＭＳ+明朝|ゴシック>
+            //    However, most of these fonts have ASCII equivalents:
+            //    for example, 'MS Mincho', and it's considered
+            //    professional to use ASCII font names instead of
+            //    Unicode font names.  Thanks Takeshi Terada for
+            //    providing this information.
+            //  The following characters, to my knowledge, have not been
+            //  used in font names.
+            //  - Single quote.  While theoretically you might find a
+            //    font name that has a single quote in its name (serving
+            //    as an apostrophe, e.g. Dave's Scribble), I haven't
+            //    been able to find any actual examples of this.
+            //    Internet Explorer's cssText translation (which I
+            //    believe is invoked by innerHTML) normalizes any
+            //    quoting to single quotes, and fails to escape single
+            //    quotes.  (Note that this is not IE's behavior for all
+            //    CSS properties, just some sort of special casing for
+            //    font-family).  So a single quote *cannot* be used
+            //    safely in the font-family context if there will be an
+            //    innerHTML/cssText translation.  Note that Firefox 3.x
+            //    does this too.
+            //  - Double quote.  In IE, these get normalized to
+            //    single-quotes, no matter what the encoding.  (Fun
+            //    fact, in IE8, the 'content' CSS property gained
+            //    support, where they special-cased it to preserve encoded
+            //    double quotes, but still translate unadorned double
+            //    quotes into single quotes.)  So, because their
+            //    fixpoint behavior is identical to single quotes, they
+            //    cannot be allowed either.  Firefox 3.x displays
+            //    single-quote style behavior.
+            //  - Backslashes are reduced by one (so \\ -> \) every
+            //    iteration, so they cannot be used safely.  This shows
+            //    up in IE7, IE8 and FF3
+            //  - Semicolons, commas and backticks are handled properly.
+            //  - The rest of the ASCII punctuation is handled properly.
+            // We haven't checked what browsers do to unadorned
+            // versions, but this is not important as long as the
+            // browser doesn't /remove/ surrounding quotes (as IE does
+            // for HTML).
+            //
+            // With these results in hand, we conclude that there are
+            // various levels of safety:
+            //  - Paranoid: alphanumeric, spaces and dashes(?)
+            //  - International: Paranoid + non-ASCII Unicode
+            //  - Edgy: Everything except quotes, backslashes
+            //  - NoJS: Standards compliance, e.g. sod IE. Note that
+            //    with some judicious character escaping (since certain
+            //    types of escaping don't work) this is theoretically
+            //    OK as long as innerHTML/cssText is not called.
+            // We believe that international is a reasonable default
+            // (that we will implement now), and once we do more
+            // extensive research, we may feel comfortable with dropping
+            // it down to edgy.
+
+            // International: alphanumeric, spaces, dashes and Unicode.  Use of
+            // str(c)spn assumes that the string was already well formed
+            // Unicode (which of course it is).
+            if (strspn($font, $this->mask) !== strlen($font)) {
+                continue;
+            }
+
+            // Historical:
+            // In the absence of innerHTML/cssText, these ugly
+            // transforms don't pose a security risk (as \\ and \"
+            // might--these escapes are not supported by most browsers).
+            // We could try to be clever and use single-quote wrapping
+            // when there is a double quote present, but I have chosen
+            // not to implement that.  (NOTE: you can reduce the amount
+            // of escapes by one depending on what quoting style you use)
+            // $font = str_replace('\\', '\\5C ', $font);
+            // $font = str_replace('"',  '\\22 ', $font);
+            // $font = str_replace("'",  '\\27 ', $font);
+
+            // font possibly with spaces, requires quoting
            $final .= "'$font', ";
        }
        $final = rtrim($final, ', ');
--- a/library/HTMLPurifier/AttrDef/CSS/URI.php
+++ b/library/HTMLPurifier/AttrDef/CSS/URI.php
@@ -34,20 +34,25 @@ class HTMLPurifier_AttrDef_CSS_URI extends HTMLPurifier_AttrDef_URI
            $uri = substr($uri, 1, $new_length - 1);
        }

-        $keys   = array(  '(',   ')',   ',',   ' ',   '"',   "'");
-        $values = array('\\(', '\\)', '\\,', '\\ ', '\\"', "\\'");
-        $uri = str_replace($values, $keys, $uri);
+        $uri = $this->expandCSSEscape($uri);

        $result = parent::validate($uri, $config, $context);

        if ($result === false) return false;

-        // escape necessary characters according to CSS spec
-        // except for the comma, none of these should appear in the
-        // URI at all
-        $result = str_replace($keys, $values, $result);
+        // extra sanity check; should have been done by URI
+        $result = str_replace(array('"', "\\", "\n", "\x0c", "\r"), "", $result);

-        return "url($result)";
+        // suspicious characters are ()'; we're going to percent encode
+        // them for safety.
+        $result = str_replace(array('(', ')', "'"), array('%28', '%29', '%27'), $result);
+
+        // there's an extra bug where ampersands lose their escaping on
+        // an innerHTML cycle, so a very unlucky query parameter could
+        // then change the meaning of the URL.  Unfortunately, there's
+        // not much we can do about that...
+
+        return "url(\"$result\")";

    }

--- a/library/HTMLPurifier/AttrDef/URI/Host.php
+++ b/library/HTMLPurifier/AttrDef/URI/Host.php
@@ -23,6 +23,12 @@ class HTMLPurifier_AttrDef_URI_Host extends HTMLPurifier_AttrDef

    public function validate($string, $config, $context) {
        $length = strlen($string);
+        // empty hostname is OK; it's usually semantically equivalent:
+        // the default host as defined by a URI scheme is used:
+        //
+        //      If the URI scheme defines a default for host, then that
+        //      default applies when the host subcomponent is undefined
+        //      or when the registered name is empty (zero length).
        if ($string === '') return '';
        if ($length > 1 && $string[0] === '[' && $string[$length-1] === ']') {
            //IPv6
--- a/library/HTMLPurifier/AttrTransform/ImgRequired.php
+++ b/library/HTMLPurifier/AttrTransform/ImgRequired.php
@@ -24,7 +24,8 @@ class HTMLPurifier_AttrTransform_ImgRequired extends HTMLPurifier_AttrTransform
            if ($src) {
                $alt = $config->get('Attr.DefaultImageAlt');
                if ($alt === null) {
-                    $attr['alt'] = basename($attr['src']);
+                    // truncate if the alt is too long
+                    $attr['alt'] = substr(basename($attr['src']),0,40);
                } else {
                    $attr['alt'] = $alt;
                }
--- a/library/HTMLPurifier/AttrTransform/Nofollow.php
+++ b/library/HTMLPurifier/AttrTransform/Nofollow.php
@@ -0,0 +1,41 @@
+<?php
+
+// must be called POST validation
+
+/**
+ * Adds rel="nofollow" to all outbound links.  This transform is
+ * only attached if Attr.Nofollow is TRUE.
+ */
+class HTMLPurifier_AttrTransform_Nofollow extends HTMLPurifier_AttrTransform
+{
+    private $parser;
+
+    public function __construct() {
+        $this->parser = new HTMLPurifier_URIParser();
+    }
+
+    public function transform($attr, $config, $context) {
+
+        if (!isset($attr['href'])) {
+            return $attr;
+        }
+
+        // XXX Kind of inefficient
+        $url = $this->parser->parse($attr['href']);
+        $scheme = $url->getSchemeObj($config, $context);
+
+        if (!is_null($url->host) && $scheme !== false && $scheme->browsable) {
+            if (isset($attr['rel'])) {
+                $attr['rel'] .= ' nofollow';
+            } else {
+                $attr['rel'] = 'nofollow';
+            }
+        }
+
+        return $attr;
+
+    }
+
+}
+
+// vim: et sw=4 sts=4
--- a/library/HTMLPurifier/AttrTransform/SafeParam.php
+++ b/library/HTMLPurifier/AttrTransform/SafeParam.php
@@ -19,6 +19,7 @@ class HTMLPurifier_AttrTransform_SafeParam extends HTMLPurifier_AttrTransform

    public function __construct() {
        $this->uri = new HTMLPurifier_AttrDef_URI(true); // embedded
+        $this->wmode = new HTMLPurifier_AttrDef_Enum(array('window', 'opaque', 'transparent'));
    }

    public function transform($attr, $config, $context) {
@@ -33,12 +34,25 @@ class HTMLPurifier_AttrTransform_SafeParam extends HTMLPurifier_AttrTransform
            case 'allowNetworking':
                $attr['value'] = 'internal';
                break;
+            case 'allowFullScreen':
+                if ($config->get('HTML.FlashAllowFullScreen')) {
+                    $attr['value'] = ($attr['value'] == 'true') ? 'true' : 'false';
+                } else {
+                    $attr['value'] = 'false';
+                }
+                break;
            case 'wmode':
-                $attr['value'] = 'window';
+                $attr['value'] = $this->wmode->validate($attr['value'], $config, $context);
                break;
            case 'movie':
+            case 'src':
+                $attr['name'] = "movie";
                $attr['value'] = $this->uri->validate($attr['value'], $config, $context);
                break;
+            case 'flashvars':
+                // we're going to allow arbitrary inputs to the SWF, on
+                // the reasoning that it could only hack the SWF, not us.
+                break;
            // add other cases to support other param name/value pairs
            default:
                $attr['name'] = $attr['value'] = null;
--- a/library/HTMLPurifier/Bootstrap.php
+++ b/library/HTMLPurifier/Bootstrap.php
@@ -37,7 +37,12 @@ class HTMLPurifier_Bootstrap
    public static function autoload($class) {
        $file = HTMLPurifier_Bootstrap::getPath($class);
        if (!$file) return false;
-        require HTMLPURIFIER_PREFIX . '/' . $file;
+        // Technically speaking, it should be ok and more efficient to
+        // just do 'require', but Antonio Parraga reports that with
+        // Zend extensions such as Zend debugger and APC, this invariant
+        // may be broken.  Since we have efficient alternatives, pay
+        // the cost here and avoid the bug.
+        require_once HTMLPURIFIER_PREFIX . '/' . $file;
        return true;
    }

@@ -65,10 +70,11 @@ class HTMLPurifier_Bootstrap
        if ( ($funcs = spl_autoload_functions()) === false ) {
            spl_autoload_register($autoload);
        } elseif (function_exists('spl_autoload_unregister')) {
+            $buggy  = version_compare(PHP_VERSION, '5.2.11', '<');
            $compat = version_compare(PHP_VERSION, '5.1.2', '<=') &&
                      version_compare(PHP_VERSION, '5.1.0', '>=');
            foreach ($funcs as $func) {
-                if (is_array($func)) {
+                if ($buggy && is_array($func)) {
                    // :TRICKY: There are some compatibility issues and some
                    // places where we need to error out
                    $reflector = new ReflectionMethod($func[0], $func[1]);
--- a/library/HTMLPurifier/CSSDefinition.php
+++ b/library/HTMLPurifier/CSSDefinition.php
@@ -219,6 +219,10 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
            $this->doSetupTricky($config);
        }

+        if ($config->get('CSS.Trusted')) {
+            $this->doSetupTrusted($config);
+        }
+
        $allow_important = $config->get('CSS.AllowImportant');
        // wrap all attr-defs with decorator that handles !important
        foreach ($this->info as $k => $v) {
@@ -260,6 +264,23 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
        $this->info['overflow'] = new HTMLPurifier_AttrDef_Enum(array('visible', 'hidden', 'auto', 'scroll'));
    }

+    protected function doSetupTrusted($config) {
+        $this->info['position'] = new HTMLPurifier_AttrDef_Enum(array(
+            'static', 'relative', 'absolute', 'fixed'
+        ));
+        $this->info['top'] =
+        $this->info['left'] =
+        $this->info['right'] =
+        $this->info['bottom'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
+            new HTMLPurifier_AttrDef_CSS_Length(),
+            new HTMLPurifier_AttrDef_CSS_Percentage(),
+            new HTMLPurifier_AttrDef_Enum(array('auto')),
+        ));
+        $this->info['z-index'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
+            new HTMLPurifier_AttrDef_Integer(),
+            new HTMLPurifier_AttrDef_Enum(array('auto')),
+        ));
+    }

    /**
     * Performs extra config-based processing. Based off of
@@ -272,20 +293,29 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
        // setup allowed elements
        $support = "(for information on implementing this, see the ".
                   "support forums) ";
-        $allowed_attributes = $config->get('CSS.AllowedProperties');
-        if ($allowed_attributes !== null) {
+        $allowed_properties = $config->get('CSS.AllowedProperties');
+        if ($allowed_properties !== null) {
            foreach ($this->info as $name => $d) {
-                if(!isset($allowed_attributes[$name])) unset($this->info[$name]);
-                unset($allowed_attributes[$name]);
+                if(!isset($allowed_properties[$name])) unset($this->info[$name]);
+                unset($allowed_properties[$name]);
            }
            // emit errors
-            foreach ($allowed_attributes as $name => $d) {
+            foreach ($allowed_properties as $name => $d) {
                // :TODO: Is this htmlspecialchars() call really necessary?
                $name = htmlspecialchars($name);
                trigger_error("Style attribute '$name' is not supported $support", E_USER_WARNING);
            }
        }

+        $forbidden_properties = $config->get('CSS.ForbiddenProperties');
+        if ($forbidden_properties !== null) {
+            foreach ($this->info as $name => $d) {
+                if (isset($forbidden_properties[$name])) {
+                    unset($this->info[$name]);
+                }
+            }
+        }
+
    }
 }

--- a/library/HTMLPurifier/Config.php
+++ b/library/HTMLPurifier/Config.php
@@ -20,7 +20,7 @@ class HTMLPurifier_Config
    /**
     * HTML Purifier's version
     */
-    public $version = '4.0.0';
+    public $version = '4.3.0';

    /**
     * Bool indicator whether or not to automatically finalize
@@ -76,7 +76,8 @@ class HTMLPurifier_Config

    /**
     * Set to false if you do not want line and file numbers in errors
-     * (useful when unit testing)
+     * (useful when unit testing).  This will also compress some errors
+     * and exceptions.
     */
    public $chatty = true;

@@ -318,26 +319,64 @@ class HTMLPurifier_Config
     * Retrieves object reference to the HTML definition.
     * @param $raw Return a copy that has not been setup yet. Must be
     *             called before it's been setup, otherwise won't work.
+     * @param $optimized If true, this method may return null, to
+     *             indicate that a cached version of the modified
+     *             definition object is available and no further edits
+     *             are necessary.  Consider using
+     *             maybeGetRawHTMLDefinition, which is more explicitly
+     *             named, instead.
     */
-    public function getHTMLDefinition($raw = false) {
-        return $this->getDefinition('HTML', $raw);
+    public function getHTMLDefinition($raw = false, $optimized = false) {
+        return $this->getDefinition('HTML', $raw, $optimized);
    }

    /**
     * Retrieves object reference to the CSS definition
     * @param $raw Return a copy that has not been setup yet. Must be
     *             called before it's been setup, otherwise won't work.
+     * @param $optimized If true, this method may return null, to
+     *             indicate that a cached version of the modified
+     *             definition object is available and no further edits
+     *             are necessary.  Consider using
+     *             maybeGetRawCSSDefinition, which is more explicitly
+     *             named, instead.
     */
-    public function getCSSDefinition($raw = false) {
-        return $this->getDefinition('CSS', $raw);
+    public function getCSSDefinition($raw = false, $optimized = false) {
+        return $this->getDefinition('CSS', $raw, $optimized);
+    }
+
+    /**
+     * Retrieves object reference to the URI definition
+     * @param $raw Return a copy that has not been setup yet. Must be
+     *             called before it's been setup, otherwise won't work.
+     * @param $optimized If true, this method may return null, to
+     *             indicate that a cached version of the modified
+     *             definition object is available and no further edits
+     *             are necessary.  Consider using
+     *             maybeGetRawURIDefinition, which is more explicitly
+     *             named, instead.
+     */
+    public function getURIDefinition($raw = false, $optimized = false) {
+        return $this->getDefinition('URI', $raw, $optimized);
    }

    /**
     * Retrieves a definition
     * @param $type Type of definition: HTML, CSS, etc
     * @param $raw  Whether or not definition should be returned raw
+     * @param $optimized Only has an effect when $raw is true.  Whether
+     *        or not to return null if the result is already present in
+     *        the cache.  This is off by default for backwards
+     *        compatibility reasons, but you need to do things this
+     *        way in order to ensure that caching is done properly.
+     *        Check out enduser-customize.html for more details.
+     *        We probably won't ever change this default, even though the
+     *        maybe semantics are the "right thing to do."
     */
-    public function getDefinition($type, $raw = false) {
+    public function getDefinition($type, $raw = false, $optimized = false) {
+        if ($optimized && !$raw) {
+            throw new HTMLPurifier_Exception("Cannot set optimized = true when raw = false");
+        }
        if (!$this->finalized) $this->autoFinalize();
        // temporarily suspend locks, so we can handle recursive definition calls
        $lock = $this->lock;
@@ -346,52 +385,137 @@ class HTMLPurifier_Config
        $cache = $factory->create($type, $this);
        $this->lock = $lock;
        if (!$raw) {
-            // see if we can quickly supply a definition
+            // full definition
+            // ---------------
+            // check if definition is in memory
            if (!empty($this->definitions[$type])) {
-                if (!$this->definitions[$type]->setup) {
-                    $this->definitions[$type]->setup($this);
-                    $cache->set($this->definitions[$type], $this);
+                $def = $this->definitions[$type];
+                // check if the definition is setup
+                if ($def->setup) {
+                    return $def;
+                } else {
+                    $def->setup($this);
+                    if ($def->optimized) $cache->add($def, $this);
+                    return $def;
                }
-                return $this->definitions[$type];
            }
-            // memory check missed, try cache
-            $this->definitions[$type] = $cache->get($this);
-            if ($this->definitions[$type]) {
-                // definition in cache, return it
-                return $this->definitions[$type];
+            // check if definition is in cache
+            $def = $cache->get($this);
+            if ($def) {
+                // definition in cache, save to memory and return it
+                $this->definitions[$type] = $def;
+                return $def;
            }
-        } elseif (
-            !empty($this->definitions[$type]) &&
-            !$this->definitions[$type]->setup
-        ) {
-            // raw requested, raw in memory, quick return
-            return $this->definitions[$type];
+            // initialize it
+            $def = $this->initDefinition($type);
+            // set it up
+            $this->lock = $type;
+            $def->setup($this);
+            $this->lock = null;
+            // save in cache
+            $cache->add($def, $this);
+            // return it
+            return $def;
+        } else {
+            // raw definition
+            // --------------
+            // check preconditions
+            $def = null;
+            if ($optimized) {
+                if (is_null($this->get($type . '.DefinitionID'))) {
+                    // fatally error out if definition ID not set
+                    throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID");
+                }
+            }
+            if (!empty($this->definitions[$type])) {
+                $def = $this->definitions[$type];
+                if ($def->setup && !$optimized) {
+                    $extra = $this->chatty ? " (try moving this code block earlier in your initialization)" : "";
+                    throw new HTMLPurifier_Exception("Cannot retrieve raw definition after it has already been setup" . $extra);
+                }
+                if ($def->optimized === null) {
+                    $extra = $this->chatty ? " (try flushing your cache)" : "";
+                    throw new HTMLPurifier_Exception("Optimization status of definition is unknown" . $extra);
+                }
+                if ($def->optimized !== $optimized) {
+                    $msg = $optimized ? "optimized" : "unoptimized";
+                    $extra = $this->chatty ? " (this backtrace is for the first inconsistent call, which was for a $msg raw definition)" : "";
+                    throw new HTMLPurifier_Exception("Inconsistent use of optimized and unoptimized raw definition retrievals" . $extra);
+                }
+            }
+            // check if definition was in memory
+            if ($def) {
+                if ($def->setup) {
+                    // invariant: $optimized === true (checked above)
+                    return null;
+                } else {
+                    return $def;
+                }
+            }
+            // if optimized, check if definition was in cache
+            // (because we do the memory check first, this formulation
+            // is prone to cache slamming, but I think
+            // guaranteeing that either /all/ of the raw
+            // setup code or /none/ of it is run is more important.)
+            if ($optimized) {
+                // This code path only gets run once; once we put
+                // something in $definitions (which is guaranteed by the
+                // trailing code), we always short-circuit above.
+                $def = $cache->get($this);
+                if ($def) {
+                    // save the full definition for later, but don't
+                    // return it yet
+                    $this->definitions[$type] = $def;
+                    return null;
+                }
+            }
+            // check invariants for creation
+            if (!$optimized) {
+                if (!is_null($this->get($type . '.DefinitionID'))) {
+                    if ($this->chatty) {
+                        $this->triggerError("Due to a documentation error in previous versions of HTML Purifier, your definitions are not being cached.  If this is OK, you can remove the %$type.DefinitionRev and %$type.DefinitionID declarations.  Otherwise, modify your code to use maybeGetRawDefinition, and test if the returned value is null before making any edits (if it is null, that means that a cached version is available, and no raw operations are necessary).  See <a href='http://htmlpurifier.org/docs/enduser-customize.html#optimized'>Customize</a> for more details", E_USER_WARNING);
+                    } else {
+                        $this->triggerError("Useless DefinitionID declaration", E_USER_WARNING);
+                    }
+                }
+            }
+            // initialize it
+            $def = $this->initDefinition($type);
+            $def->optimized = $optimized;
+            return $def;
        }
+        throw new HTMLPurifier_Exception("The impossible happened!");
+    }
+
+    private function initDefinition($type) {
        // quick checks failed, let's create the object
        if ($type == 'HTML') {
-            $this->definitions[$type] = new HTMLPurifier_HTMLDefinition();
+            $def = new HTMLPurifier_HTMLDefinition();
        } elseif ($type == 'CSS') {
-            $this->definitions[$type] = new HTMLPurifier_CSSDefinition();
+            $def = new HTMLPurifier_CSSDefinition();
        } elseif ($type == 'URI') {
-            $this->definitions[$type] = new HTMLPurifier_URIDefinition();
+            $def = new HTMLPurifier_URIDefinition();
        } else {
            throw new HTMLPurifier_Exception("Definition of $type type not supported");
        }
-        // quick abort if raw
-        if ($raw) {
-            if (is_null($this->get($type . '.DefinitionID'))) {
-                // fatally error out if definition ID not set
-                throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID");
-            }
-            return $this->definitions[$type];
-        }
-        // set it up
-        $this->lock = $type;
-        $this->definitions[$type]->setup($this);
-        $this->lock = null;
-        // save in cache
-        $cache->set($this->definitions[$type], $this);
-        return $this->definitions[$type];
+        $this->definitions[$type] = $def;
+        return $def;
+    }
+
+    public function maybeGetRawDefinition($name) {
+        return $this->getDefinition($name, true, true);
+    }
+
+    public function maybeGetRawHTMLDefinition() {
+        return $this->getDefinition('HTML', true, true);
+    }
+
+    public function maybeGetRawCSSDefinition() {
+        return $this->getDefinition('CSS', true, true);
+    }
+
+    public function maybeGetRawURIDefinition() {
+        return $this->getDefinition('URI', true, true);
    }

    /**
@@ -549,17 +673,22 @@ class HTMLPurifier_Config

    /**
     * Produces a nicely formatted error message by supplying the
-     * stack frame information from two levels up and OUTSIDE of
-     * HTMLPurifier_Config.
+     * stack frame information OUTSIDE of HTMLPurifier_Config.
     */
    protected function triggerError($msg, $no) {
        // determine previous stack frame
-        $backtrace = debug_backtrace();
-        if ($this->chatty && isset($backtrace[1])) {
-            $frame = $backtrace[1];
-            $extra = " on line {$frame['line']} in file {$frame['file']}";
-        } else {
-            $extra = '';
+        $extra = '';
+        if ($this->chatty) {
+            $trace = debug_backtrace();
+            // zip(tail(trace), trace) -- but PHP is not Haskell har har
+            for ($i = 0, $c = count($trace); $i < $c - 1; $i++) {
+                if (isset($trace[$i + 1]['class']) && $trace[$i + 1]['class'] === 'HTMLPurifier_Config') {
+                    continue;
+                }
+                $frame = $trace[$i];
+                $extra = " invoked on line {$frame['line']} in file {$frame['file']}";
+                break;
+            }
        }
        trigger_error($msg . $extra, $no);
    }
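The rewritten triggerError() above walks the backtrace until it reaches a frame whose caller is not a method of HTMLPurifier_Config, and reports that frame's file and line. The walk can be sketched in isolation as follows. Demo_Config, its methods and demo_caller() are hypothetical stand-ins, not part of HTML Purifier, and the isset() guard is an addition made here because frames for plain functions carry no 'class' key.

```php
<?php
// Hypothetical stand-in class; only the frame-walking loop mirrors the hunk above.
class Demo_Config
{
    public function outer()
    {
        return $this->inner();
    }

    private function inner()
    {
        return $this->callSite();
    }

    // Returns the first backtrace frame whose caller is not a method of
    // this class, i.e. the point where outside code called into the class.
    // Returns null if no such frame is found.
    private function callSite()
    {
        $trace = debug_backtrace();
        for ($i = 0, $c = count($trace); $i < $c - 1; $i++) {
            // Guard added for this sketch: function frames have no 'class' key.
            if (isset($trace[$i + 1]['class']) && $trace[$i + 1]['class'] === __CLASS__) {
                continue;
            }
            return $trace[$i];
        }
        return null;
    }
}

// Outside code calling into the class from within a function.
function demo_caller()
{
    $config = new Demo_Config();
    return $config->outer();
}

$frame = demo_caller();
echo "entered via {$frame['function']}() on line {$frame['line']}\n";
```

In this sketch the reported frame is the call to outer() made inside demo_caller(). If the outermost call is made from top-level script code instead of from a function, there is no later frame to inspect and callSite() returns null.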
--- a/library/HTMLPurifier/ConfigSchema.php
+++ b/library/HTMLPurifier/ConfigSchema.php
@@ -60,7 +60,13 @@ class HTMLPurifier_ConfigSchema {
     * Unserializes the default ConfigSchema.
     */
    public static function makeFromSerial() {
-        return unserialize(file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema.ser'));
+        $contents = file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema.ser');
+        $r = unserialize($contents);
+        if (!$r) {
+            $hash = sha1($contents);
+            trigger_error("Unserialization of configuration schema failed, sha1 of file was $hash", E_USER_ERROR);
+        }
+        return $r;
    }

    /**
--- a/library/HTMLPurifier/ConfigSchema/schema.ser
+++ b/library/HTMLPurifier/ConfigSchema/schema.ser
--- a/library/HTMLPurifier/ConfigSchema/schema/AutoFormat.RemoveSpansWithoutAttributes.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/AutoFormat.RemoveSpansWithoutAttributes.txt
@@ -0,0 +1,11 @@
+AutoFormat.RemoveSpansWithoutAttributes
+TYPE: bool
+VERSION: 4.0.1
+DEFAULT: false
+--DESCRIPTION--
+<p>
+  This directive causes <code>span</code> tags without any attributes
+  to be removed. It will also remove spans that had all attributes
+  removed during processing.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/CSS.AllowedFonts.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/CSS.AllowedFonts.txt
@@ -0,0 +1,12 @@
+CSS.AllowedFonts
+TYPE: lookup/null
+VERSION: 4.3.0
+DEFAULT: NULL
+--DESCRIPTION--
+<p>
+    Allows you to manually specify a set of allowed fonts.  If
+    <code>NULL</code>, all fonts are allowed.  This directive
+    affects generic names (serif, sans-serif, monospace, cursive,
+    fantasy) as well as specific font families.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/CSS.ForbiddenProperties.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/CSS.ForbiddenProperties.txt
@@ -0,0 +1,13 @@
+CSS.ForbiddenProperties
+TYPE: lookup
+VERSION: 4.2.0
+DEFAULT: array()
+--DESCRIPTION--
+<p>
+    This is the logical inverse of %CSS.AllowedProperties, and it will
+    override that directive or any other directive.  If possible,
+    %CSS.AllowedProperties is recommended over this directive,
+    because it can sometimes be difficult to tell whether or not you've
+    forbidden all of the CSS properties you truly would like to disallow.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/CSS.Trusted.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/CSS.Trusted.txt
@@ -0,0 +1,9 @@
+CSS.Trusted
+TYPE: bool
+VERSION: 4.2.1
+DEFAULT: false
+--DESCRIPTION--
+Indicates whether or not the user's CSS input is trusted. If the
+input is trusted, a more expansive set of allowed properties will be
+used.  See also %HTML.Trusted.
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Cache.SerializerPermissions.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Cache.SerializerPermissions.txt
@@ -0,0 +1,11 @@
+Cache.SerializerPermissions
+TYPE: int
+VERSION: 4.3.0
+DEFAULT: 0755
+--DESCRIPTION--
+
+<p>
+    Permissions of the files and directories created inside
+    the DefinitionCache/Serializer or other custom serializer path.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Core.NormalizeNewlines.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Core.NormalizeNewlines.txt
@@ -0,0 +1,11 @@
+Core.NormalizeNewlines
+TYPE: bool
+VERSION: 4.2.0
+DEFAULT: true
+--DESCRIPTION--
+<p>
+    Whether or not to normalize newlines to the operating
+    system default.  When <code>false</code>, HTML Purifier
+    will attempt to preserve mixed newline files.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Core.RemoveProcessingInstructions.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Core.RemoveProcessingInstructions.txt
@@ -0,0 +1,11 @@
+Core.RemoveProcessingInstructions
+TYPE: bool
+VERSION: 4.2.0
+DEFAULT: false
+--DESCRIPTION--
+Instead of escaping processing instructions of the form <code>&lt;? ...
+?&gt;</code>, remove them outright.  This may be useful if the HTML
+you are validating contains XML processing instruction gunk; however,
+it can also be user-unfriendly for people attempting to post PHP
+snippets.
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Filter.YouTube.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Filter.YouTube.txt
@@ -3,6 +3,11 @@ TYPE: bool
 VERSION: 3.1.0
 DEFAULT: false
 --DESCRIPTION--
+<p>
+  <strong>Warning:</strong> Deprecated in favor of %HTML.SafeObject and
+  %Output.FlashCompat (turn both on to allow YouTube videos and other
+  Flash content).
+</p>
 <p>
  This directive enables YouTube video embedding in HTML Purifier. Check
  <a href="http://htmlpurifier.org/docs/enduser-youtube.html">this document
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.Allowed.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.Allowed.txt
@@ -5,11 +5,14 @@ DEFAULT: NULL
 --DESCRIPTION--

 <p>
-    This is a convenience directive that rolls the functionality of
-    %HTML.AllowedElements and %HTML.AllowedAttributes into one directive.
+    This is a preferred convenience directive that combines
+    %HTML.AllowedElements and %HTML.AllowedAttributes.
    Specify elements and attributes that are allowed using:
-    <code>element1[attr1|attr2],element2...</code>. You can also use
-    newlines instead of commas to separate elements.
+    <code>element1[attr1|attr2],element2...</code>.  For example,
+    if you would like to only allow paragraphs and links, specify
+    <code>a[href],p</code>.  You can specify attributes that apply
+    to all elements using an asterisk, e.g. <code>*[lang]</code>.
+    You can also use newlines instead of commas to separate elements.
 </p>
 <p>
    <strong>Warning</strong>:
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.AllowedElements.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.AllowedElements.txt
@@ -4,12 +4,17 @@ VERSION: 1.3.0
 DEFAULT: NULL
 --DESCRIPTION--
 <p>
-    If HTML Purifier's tag set is unsatisfactory for your needs, you
-    can overload it with your own list of tags to allow.  Note that this
-    method is subtractive: it does its job by taking away from HTML Purifier
-    usual feature set, so you cannot add a tag that HTML Purifier never
-    supported in the first place (like embed, form or head).  If you
-    change this, you probably also want to change %HTML.AllowedAttributes.
+    If HTML Purifier's tag set is unsatisfactory for your needs, you can
+    overload it with your own list of tags to allow.  If you change
+    this, you probably also want to change %HTML.AllowedAttributes; see
+    also %HTML.Allowed which lets you set allowed elements and
+    attributes at the same time.
+</p>
+<p>
+    If you attempt to allow an element that HTML Purifier does not know
+    about, HTML Purifier will raise an error.  You will need to manually
+    tell HTML Purifier about this element by using the
+    <a href="http://htmlpurifier.org/docs/enduser-customize.html">advanced customization features</a>.
 </p>
 <p>
    <strong>Warning:</strong> If another directive conflicts with the
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.FlashAllowFullScreen.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.FlashAllowFullScreen.txt
@@ -0,0 +1,11 @@
+HTML.FlashAllowFullScreen
+TYPE: bool
+VERSION: 4.2.0
+DEFAULT: false
+--DESCRIPTION--
+<p>
+    Whether or not to permit embedded Flash content from
+    %HTML.SafeObject to expand to the full screen.  Corresponds to
+    the <code>allowFullScreen</code> parameter.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.Nofollow.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.Nofollow.txt
@@ -0,0 +1,7 @@
+HTML.Nofollow
+TYPE: bool
+VERSION: 4.3.0
+DEFAULT: FALSE
+--DESCRIPTION--
+If enabled, nofollow rel attributes are added to all outgoing links.
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.SafeEmbed.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.SafeEmbed.txt
@@ -7,8 +7,7 @@ DEFAULT: false
    Whether or not to permit embed tags in documents, with a number of extra
    security features added to prevent script execution. This is similar to
    what websites like MySpace do to embed tags. Embed is a proprietary
-    element and will cause your website to stop validating. You probably want
-    to enable this with %HTML.SafeObject.
-    <strong>Highly experimental.</strong>
-</p>
+    element and will cause your website to stop validating; you should
+    first see if you can use %Output.FlashCompat with %HTML.SafeObject
+    instead.</p>
 --# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.SafeObject.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.SafeObject.txt
@@ -6,9 +6,8 @@ DEFAULT: false
 <p>
    Whether or not to permit object tags in documents, with a number of extra
    security features added to prevent script execution. This is similar to
-    what websites like MySpace do to object tags. You may also want to
-    enable %HTML.SafeEmbed for maximum interoperability with Internet Explorer,
-    although embed tags will cause your website to stop validating.
-    <strong>Highly experimental.</strong>
+    what websites like MySpace do to object tags.  You should also enable
+    %Output.FlashCompat in order to generate Internet Explorer
+    compatibility code for your object tags.
 </p>
 --# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/HTML.Trusted.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/HTML.Trusted.txt
@@ -5,4 +5,5 @@ DEFAULT: false
 --DESCRIPTION--
 Indicates whether or not the user input is trusted or not. If the input is
 trusted, a more expansive set of allowed tags and attributes will be used.
+See also %CSS.Trusted.
 --# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Output.FixInnerHTML.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Output.FixInnerHTML.txt
@@ -0,0 +1,15 @@
+Output.FixInnerHTML
+TYPE: bool
+VERSION: 4.3.0
+DEFAULT: true
+--DESCRIPTION--
+<p>
+  If true, HTML Purifier will protect against Internet Explorer's
+  mishandling of the <code>innerHTML</code> property by appending
+  a space to any attribute value that contains a backtick but no
+  angle brackets, spaces or quotes.  This slightly changes the
+  semantics of any given attribute, so if this is unacceptable and
+  you do not use <code>innerHTML</code> on any of your pages, you can
+  turn this directive off.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/Output.FlashCompat.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/Output.FlashCompat.txt
@@ -0,0 +1,11 @@
+Output.FlashCompat
+TYPE: bool
+VERSION: 4.1.0
+DEFAULT: false
+--DESCRIPTION--
+<p>
+  If true, HTML Purifier will generate Internet Explorer compatibility
+  code for all object tags.  This is highly recommended if you enable
+  %HTML.SafeObject.
+</p>
+--# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/URI.AllowedSchemes.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/URI.AllowedSchemes.txt
@@ -12,4 +12,6 @@ array (
 --DESCRIPTION--
 Whitelist that defines the schemes that a URI is allowed to have.  This
 prevents XSS attacks from using pseudo-schemes like javascript or mocha.
+There is also support for the <code>data</code> and <code>file</code>
+URI schemes, but they are not enabled by default.
 --# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/ConfigSchema/schema/URI.DisableResources.txt
+++ b/library/HTMLPurifier/ConfigSchema/schema/URI.DisableResources.txt
@@ -1,12 +1,15 @@
 URI.DisableResources
 TYPE: bool
-VERSION: 1.3.0
+VERSION: 4.2.0
 DEFAULT: false
 --DESCRIPTION--
-
 <p>
    Disables embedding resources, essentially meaning no pictures. You can
    still link to them though. See %URI.DisableExternalResources for why
    this might be a good idea.
 </p>
+<p>
+    <em>Note:</em> While this directive has been available since 1.3.0,
+    it didn't actually start doing anything until 4.2.0.
+</p>
 --# vim: et sw=4 sts=4
--- a/library/HTMLPurifier/Definition.php
+++ b/library/HTMLPurifier/Definition.php
@@ -12,6 +12,17 @@ abstract class HTMLPurifier_Definition
     */
    public $setup = false;

+    /**
+     * If true, write out the final definition object to the cache after
+     * setup.  This will be true only if all invocations to get a raw
+     * definition object are also optimized.  This does not cause file
+     * system thrashing because on subsequent calls the cached object
+     * is used and any writes to the raw definition object are short
+     * circuited.  See enduser-customize.html for the high-level
+     * picture.
+     */
+    public $optimized = null;
+
    /**
     * What type of definition is it?
     */
--- a/library/HTMLPurifier/DefinitionCache/Serializer.php
+++ b/library/HTMLPurifier/DefinitionCache/Serializer.php
@@ -9,14 +9,14 @@ class HTMLPurifier_DefinitionCache_Serializer extends
        $file = $this->generateFilePath($config);
        if (file_exists($file)) return false;
        if (!$this->_prepareDir($config)) return false;
-        return $this->_write($file, serialize($def));
+        return $this->_write($file, serialize($def), $config);
    }

    public function set($def, $config) {
        if (!$this->checkDefType($def)) return;
        $file = $this->generateFilePath($config);
        if (!$this->_prepareDir($config)) return false;
-        return $this->_write($file, serialize($def));
+        return $this->_write($file, serialize($def), $config);
    }

    public function replace($def, $config) {
@@ -24,7 +24,7 @@ class HTMLPurifier_DefinitionCache_Serializer extends
        $file = $this->generateFilePath($config);
        if (!file_exists($file)) return false;
        if (!$this->_prepareDir($config)) return false;
-        return $this->_write($file, serialize($def));
+        return $this->_write($file, serialize($def), $config);
    }

    public function get($config) {
@@ -97,18 +97,34 @@ class HTMLPurifier_DefinitionCache_Serializer extends
     * Convenience wrapper function for file_put_contents
     * @param $file File name to write to
     * @param $data Data to write into file
+     * @param $config Config object
     * @return Number of bytes written if success, or false if failure.
     */
-    private function _write($file, $data) {
-        return file_put_contents($file, $data);
+    private function _write($file, $data, $config) {
+        $result = file_put_contents($file, $data);
+        if ($result !== false) {
+            // set permissions of the new file (no execute)
+            $chmod = $config->get('Cache.SerializerPermissions');
+            if (!$chmod) {
+                $chmod = 0644; // invalid config or simpletest
+            }
+            $chmod = $chmod & 0666;
+            chmod($file, $chmod);
+        }
+        return $result;
    }

    /**
     * Prepares the directory that this type stores the serials in
+     * @param $config Config object
     * @return True if successful
     */
    private function _prepareDir($config) {
        $directory = $this->generateDirectoryPath($config);
+        $chmod = $config->get('Cache.SerializerPermissions');
+        if (!$chmod) {
+            $chmod = 0755; // invalid config or simpletest
+        }
        if (!is_dir($directory)) {
            $base = $this->generateBaseDirectoryPath($config);
            if (!is_dir($base)) {
@@ -116,13 +132,13 @@ class HTMLPurifier_DefinitionCache_Serializer extends
                    please create or change using %Cache.SerializerPath',
                    E_USER_WARNING);
                return false;
-            } elseif (!$this->_testPermissions($base)) {
+            } elseif (!$this->_testPermissions($base, $chmod)) {
                return false;
            }
-            $old = umask(0022); // disable group and world writes
-            mkdir($directory);
+            $old = umask(0000);
+            mkdir($directory, $chmod);
            umask($old);
-        } elseif (!$this->_testPermissions($directory)) {
+        } elseif (!$this->_testPermissions($directory, $chmod)) {
            return false;
        }
        return true;
@@ -131,8 +147,11 @@ class HTMLPurifier_DefinitionCache_Serializer extends
    /**
     * Tests permissions on a directory and throws out friendly
     * error messages and attempts to chmod it itself if possible
+     * @param $dir Directory path
+     * @param $chmod Permissions
+     * @return True if directory writable
     */
-    private function _testPermissions($dir) {
+    private function _testPermissions($dir, $chmod) {
        // early abort, if it is writable, everything is hunky-dory
        if (is_writable($dir)) return true;
        if (!is_dir($dir)) {
@@ -146,17 +165,17 @@ class HTMLPurifier_DefinitionCache_Serializer extends
            // POSIX system, we can give more specific advice
            if (fileowner($dir) === posix_getuid()) {
                // we can chmod it ourselves
-                chmod($dir, 0755);
-                return true;
+                $chmod = $chmod | 0700;
+                if (chmod($dir, $chmod)) return true;
            } elseif (filegroup($dir) === posix_getgid()) {
-                $chmod = '775';
+                $chmod = $chmod | 0070;
            } else {
                // PHP's probably running as nobody, so we'll
                // need to give global permissions
-                $chmod = '777';
+                $chmod = $chmod | 0777;
            }
            trigger_error('Directory '.$dir.' not writable, '.
-                'please chmod to ' . $chmod,
+                'please chmod to ' . decoct($chmod),
                E_USER_WARNING);
        } else {
            // generic error message
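The Serializer changes above derive every mode from %Cache.SerializerPermissions with bit masks: cache files get the configured mode with the execute bits cleared, and the "please chmod" warning widens the mode depending on who owns the directory. A small sketch of that arithmetic follows. The helper function names are hypothetical; only the masks and the default of 0755 come from the diff.

```php
<?php
// Hypothetical helper names; only the bit arithmetic mirrors the hunks above.

// Mode for a cache file: the configured mode with every execute bit cleared.
function cache_file_mode($configured)
{
    return $configured & 0666;
}

// Mode suggested in the warning when the directory belongs to the PHP
// process's group rather than to its user: add full group access.
function group_writable_mode($configured)
{
    return $configured | 0070;
}

$configured = 0755; // default of %Cache.SerializerPermissions

// decoct() renders the integer mode in the octal form chmod users expect.
echo decoct(cache_file_mode($configured)), "\n";      // 644
echo decoct(group_writable_mode($configured)), "\n";  // 775
echo decoct($configured | 0777), "\n";                // 777
```

Without decoct(), the integer 0755 would print as the decimal 493, which is presumably why the diff wraps the suggested mode in decoct() before putting it in the warning.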
--- a/library/HTMLPurifier/ElementDef.php
+++ b/library/HTMLPurifier/ElementDef.php
@@ -97,6 +97,13 @@ class HTMLPurifier_ElementDef
     */
    public $autoclose = array();

+    /**
+     * If a foreign element is found in this element, test if it is
+     * allowed by this sub-element; if it is, instead of closing the
+     * current element, place it inside this element.
+     */
+    public $wrap;
+
    /**
     * Whether or not this is a formatting element affected by the
     * "Active Formatting Elements" algorithm.
--- a/library/HTMLPurifier/EntityLookup/entities.ser
+++ b/library/HTMLPurifier/EntityLookup/entities.ser
--- a/library/HTMLPurifier/Filter/YouTube.php
+++ b/library/HTMLPurifier/Filter/YouTube.php
@@ -7,13 +7,13 @@ class HTMLPurifier_Filter_YouTube extends HTMLPurifier_Filter

    public function preFilter($html, $config, $context) {
        $pre_regex = '#<object[^>]+>.+?'.
-            'http://www.youtube.com/v/([A-Za-z0-9\-_]+).+?</object>#s';
+            'http://www.youtube.com/((?:v|cp)/[A-Za-z0-9\-_=]+).+?</object>#s';
        $pre_replace = '<span class="youtube-embed">\1</span>';
        return preg_replace($pre_regex, $pre_replace, $html);
    }

    public function postFilter($html, $config, $context) {
-        $post_regex = '#<span class="youtube-embed">([A-Za-z0-9\-_]+)</span>#';
+        $post_regex = '#<span class="youtube-embed">((?:v|cp)/[A-Za-z0-9\-_=]+)</span>#';
        return preg_replace_callback($post_regex, array($this, 'postFilterCallback'), $html);
    }

@@ -24,10 +24,10 @@ class HTMLPurifier_Filter_YouTube extends HTMLPurifier_Filter
    protected function postFilterCallback($matches) {
        $url = $this->armorUrl($matches[1]);
        return '<object width="425" height="350" type="application/x-shockwave-flash" '.
-            'data="http://www.youtube.com/v/'.$url.'">'.
-            '<param name="movie" value="http://www.youtube.com/v/'.$url.'"></param>'.
+            'data="http://www.youtube.com/'.$url.'">'.
+            '<param name="movie" value="http://www.youtube.com/'.$url.'"></param>'.
            '<!--[if IE]>'.
-            '<embed src="http://www.youtube.com/v/'.$url.'"'.
+            '<embed src="http://www.youtube.com/'.$url.'"'.
            'type="application/x-shockwave-flash"'.
            'wmode="transparent" width="425" height="350" />'.
            '<![endif]-->'.
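The widened pattern above now captures the path prefix (v/ or cp/) together with the identifier, and also accepts = inside the identifier. The sketch below runs the new preFilter pattern on its own to show what ends up in the placeholder span. The embed markup and both identifiers are made up for illustration.

```php
<?php
// Same pattern and replacement as the new preFilter() in the hunk above.
$pre_regex = '#<object[^>]+>.+?'.
    'http://www.youtube.com/((?:v|cp)/[A-Za-z0-9\-_=]+).+?</object>#s';
$pre_replace = '<span class="youtube-embed">\1</span>';

// Hypothetical embed markup; the identifiers are invented for this sketch.
$video = '<object width="425" height="350">'.
    '<param name="movie" value="http://www.youtube.com/v/abc-123_XYZ"></param>'.
    '</object>';
$player = '<object width="425" height="350">'.
    '<param name="movie" value="http://www.youtube.com/cp/abc123="></param>'.
    '</object>';

// The whole <object> element collapses to a span holding "prefix/identifier".
$video_out = preg_replace($pre_regex, $pre_replace, $video);
$player_out = preg_replace($pre_regex, $pre_replace, $player);

echo $video_out, "\n";   // <span class="youtube-embed">v/abc-123_XYZ</span>
echo $player_out, "\n";  // <span class="youtube-embed">cp/abc123=</span>
```

Because the prefix is now part of the captured value, postFilterCallback() can append it directly after http://www.youtube.com/ rather than hard-coding v/.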
--- a/library/HTMLPurifier/Generator.php
+++ b/library/HTMLPurifier/Generator.php
@@ -31,6 +31,22 @@ class HTMLPurifier_Generator
     */
    private $_sortAttr;

+    /**
+     * Cache of %Output.FlashCompat
+     */
+    private $_flashCompat;
+
+    /**
+     * Cache of %Output.FixInnerHTML
+     */
+    private $_innerHTMLFix;
+
+    /**
+     * Stack for keeping track of object information when outputting IE
+     * compatibility code.
+     */
+    private $_flashStack = array();
+
    /**
     * Configuration for the generator
     */
@@ -43,7 +59,9 @@ class HTMLPurifier_Generator
    public function __construct($config, $context) {
        $this->config = $config;
        $this->_scriptFix = $config->get('Output.CommentScriptContents');
+        $this->_innerHTMLFix = $config->get('Output.FixInnerHTML');
        $this->_sortAttr = $config->get('Output.SortAttr');
+        $this->_flashCompat = $config->get('Output.FlashCompat');
        $this->_def = $config->getHTMLDefinition();
        $this->_xhtml = $this->_def->doctype->xml;
    }
@@ -86,9 +104,11 @@ class HTMLPurifier_Generator
        }

        // Normalize newlines to system defined value
-        $nl = $this->config->get('Output.Newline');
-        if ($nl === null) $nl = PHP_EOL;
-        if ($nl !== "\n") $html = str_replace("\n", $nl, $html);
+        if ($this->config->get('Core.NormalizeNewlines')) {
+            $nl = $this->config->get('Output.Newline');
+            if ($nl === null) $nl = PHP_EOL;
+            if ($nl !== "\n") $html = str_replace("\n", $nl, $html);
+        }
        return $html;
    }

@@ -104,12 +124,29 @@ class HTMLPurifier_Generator

        } elseif ($token instanceof HTMLPurifier_Token_Start) {
            $attr = $this->generateAttributes($token->attr, $token->name);
+            if ($this->_flashCompat) {
+                if ($token->name == "object") {
+                    $flash = new stdclass();
+                    $flash->attr = $token->attr;
+                    $flash->param = array();
+                    $this->_flashStack[] = $flash;
+                }
+            }
            return '<' . $token->name . ($attr ? ' ' : '') . $attr . '>';

        } elseif ($token instanceof HTMLPurifier_Token_End) {
-            return '</' . $token->name . '>';
+            $_extra = '';
+            if ($this->_flashCompat) {
+                if ($token->name == "object" && !empty($this->_flashStack)) {
+                    // doesn't do anything for now
+                }
+            }
+            return $_extra . '</' . $token->name . '>';

        } elseif ($token instanceof HTMLPurifier_Token_Empty) {
+            if ($this->_flashCompat && $token->name == "param" && !empty($this->_flashStack)) {
+                $this->_flashStack[count($this->_flashStack)-1]->param[$token->attr['name']] = $token->attr['value'];
+            }
            $attr = $this->generateAttributes($token->attr, $token->name);
             return '<' . $token->name . ($attr ? ' ' : '') . $attr .
                ( $this->_xhtml ? ' /': '' ) // <br /> v. <br>
@@ -159,6 +196,37 @@ class HTMLPurifier_Generator
                    continue;
                }
            }
+            // Workaround for Internet Explorer innerHTML bug.
+            // Essentially, Internet Explorer, when calculating
+            // innerHTML, omits quotes if there are no instances of
+            // angled brackets, quotes or spaces.  However, when parsing
+            // HTML (for example, when you assign to innerHTML), it
+            // treats backticks as quotes.  Thus,
+            //      <img alt="``" />
+            // becomes
+            //      <img alt=`` />
+            // becomes
+            //      <img alt='' />
+            // Fortunately, all we need to do is trigger an appropriate
+            // quoting style, which we do by adding an extra space.
+            // This also is consistent with the W3C spec, which states
+            // that user agents may ignore leading or trailing
+            // whitespace (in fact, most don't, at least for attributes
+            // like alt, but an extra space at the end is barely
+            // noticeable).  Still, we have a configuration knob for
+            // this, since this transformation is not necessary if you
+            // don't process user input with innerHTML or you don't plan
+            // on supporting Internet Explorer.
+            if ($this->_innerHTMLFix) {
+                if (strpos($value, '`') !== false) {
+                    // check if correct quoting style would not already be
+                    // triggered
+                    if (strcspn($value, '"\' <>') === strlen($value)) {
+                        // protect!
+                        $value .= ' ';
+                    }
+                }
+            }
            $html .= $key.'="'.$this->escape($value).'" ';
        }
        return rtrim($html);
@@ -174,7 +242,10 @@ class HTMLPurifier_Generator
     *               permissible for non-attribute output.
     * @return String escaped data.
     */
-    public function escape($string, $quote = ENT_COMPAT) {
+    public function escape($string, $quote = null) {
+        // Workaround for APC bug on Mac Leopard reported by sidepodcast
+        // http://htmlpurifier.org/phorum/read.php?3,4823,4846
+        if ($quote === null) $quote = ENT_COMPAT;
        return htmlspecialchars($string, $quote, 'UTF-8');
    }
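
Note on the innerHTML workaround in generateAttributes() above: the trailing space is only appended when the value contains a backtick and none of the characters that would already force quoting. A minimal standalone sketch of that condition, assuming the same two checks as the hunk; the function name protect_backticks is hypothetical:

```php
<?php
// Hypothetical helper mirroring the %Output.FixInnerHTML branch.
function protect_backticks($value) {
    if (strpos($value, '`') !== false) {
        // strcspn() returns the length of the leading run that contains
        // none of the listed characters; if that run is the whole string,
        // nothing in the value would trigger quoting on its own.
        if (strcspn($value, '"\' <>') === strlen($value)) {
            $value .= ' ';
        }
    }
    return $value;
}

var_dump(protect_backticks('``'));     // backticks only: space appended
var_dump(protect_backticks('`a b`'));  // already contains a space: unchanged
var_dump(protect_backticks('plain'));  // no backtick: unchanged
```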

--- a/library/HTMLPurifier/HTMLDefinition.php
+++ b/library/HTMLPurifier/HTMLDefinition.php
@@ -300,7 +300,12 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
                            unset($allowed_attributes_mutable[$key]);
                        }
                    }
-                    if ($delete) unset($this->info[$tag]->attr[$attr]);
+                    if ($delete) {
+                        if ($this->info[$tag]->attr[$attr]->required) {
+                            trigger_error("Required attribute '$attr' in element '$tag' was not allowed, which means '$tag' will not be allowed either", E_USER_WARNING);
+                        }
+                        unset($this->info[$tag]->attr[$attr]);
+                    }
                }
            }
            // emit errors
--- a/library/HTMLPurifier/HTMLModule/List.php
+++ b/library/HTMLPurifier/HTMLModule/List.php
@@ -20,8 +20,10 @@ class HTMLPurifier_HTMLModule_List extends HTMLPurifier_HTMLModule
    public $content_sets = array('Flow' => 'List');

    public function setup($config) {
-        $this->addElement('ol', 'List', 'Required: li', 'Common');
-        $this->addElement('ul', 'List', 'Required: li', 'Common');
+        $ol = $this->addElement('ol', 'List', 'Required: li', 'Common');
+        $ol->wrap = "li";
+        $ul = $this->addElement('ul', 'List', 'Required: li', 'Common');
+        $ul->wrap = "li";
        $this->addElement('dl', 'List', 'Required: dt | dd', 'Common');

        $this->addElement('li', false, 'Flow', 'Common');
--- a/library/HTMLPurifier/HTMLModule/Nofollow.php
+++ b/library/HTMLPurifier/HTMLModule/Nofollow.php
@@ -0,0 +1,19 @@
+<?php
+
+/**
+ * Module adds the nofollow attribute transformation to a tags.  It
+ * is enabled by HTML.Nofollow
+ */
+class HTMLPurifier_HTMLModule_Nofollow extends HTMLPurifier_HTMLModule
+{
+
+    public $name = 'Nofollow';
+
+    public function setup($config) {
+        $a = $this->addBlankElement('a');
+        $a->attr_transform_post[] = new HTMLPurifier_AttrTransform_Nofollow();
+    }
+
+}
+
+// vim: et sw=4 sts=4
--- a/library/HTMLPurifier/HTMLModule/SafeEmbed.php
+++ b/library/HTMLPurifier/HTMLModule/SafeEmbed.php
@@ -20,7 +20,8 @@ class HTMLPurifier_HTMLModule_SafeEmbed extends HTMLPurifier_HTMLModule
                'height' => 'Pixels#' . $max,
                'allowscriptaccess' => 'Enum#never',
                'allownetworking' => 'Enum#internal',
-                'wmode' => 'Enum#window',
+                'flashvars' => 'Text',
+                'wmode' => 'Enum#window,transparent,opaque',
                'name' => 'ID',
            )
        );
--- a/library/HTMLPurifier/HTMLModule/SafeObject.php
+++ b/library/HTMLPurifier/HTMLModule/SafeObject.php
@@ -28,7 +28,9 @@ class HTMLPurifier_HTMLModule_SafeObject extends HTMLPurifier_HTMLModule
                'type'   => 'Enum#application/x-shockwave-flash',
                'width'  => 'Pixels#' . $max,
                'height' => 'Pixels#' . $max,
-                'data'   => 'URI#embedded'
+                'data'   => 'URI#embedded',
+                'codebase' => new HTMLPurifier_AttrDef_Enum(array(
+                    'http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0')),
            )
        );
        $object->attr_transform_post[] = new HTMLPurifier_AttrTransform_SafeObject();
--- a/library/HTMLPurifier/HTMLModule/Tidy/Proprietary.php
+++ b/library/HTMLPurifier/HTMLModule/Tidy/Proprietary.php
@@ -15,6 +15,7 @@ class HTMLPurifier_HTMLModule_Tidy_Proprietary extends HTMLPurifier_HTMLModule_T
        $r['thead@background'] = new HTMLPurifier_AttrTransform_Background();
        $r['tfoot@background'] = new HTMLPurifier_AttrTransform_Background();
        $r['tbody@background'] = new HTMLPurifier_AttrTransform_Background();
+        $r['table@height']     = new HTMLPurifier_AttrTransform_Length('height');
        return $r;
    }

--- a/library/HTMLPurifier/HTMLModuleManager.php
+++ b/library/HTMLPurifier/HTMLModuleManager.php
@@ -216,19 +216,19 @@ class HTMLPurifier_HTMLModuleManager
            }
        }

-        // add proprietary module (this gets special treatment because
-        // it is completely removed from doctypes, etc.)
+        // custom modules
        if ($config->get('HTML.Proprietary')) {
            $modules[] = 'Proprietary';
        }
-
-        // add SafeObject/Safeembed modules
        if ($config->get('HTML.SafeObject')) {
            $modules[] = 'SafeObject';
        }
        if ($config->get('HTML.SafeEmbed')) {
            $modules[] = 'SafeEmbed';
        }
+        if ($config->get('HTML.Nofollow')) {
+            $modules[] = 'Nofollow';
+        }

        // merge in custom modules
        $modules = array_merge($modules, $this->userModules);
--- a/library/HTMLPurifier/Injector/AutoParagraph.php
+++ b/library/HTMLPurifier/Injector/AutoParagraph.php
@@ -34,16 +34,21 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
                    //               ----
                    // This is a degenerate case
                } else {
-                    // State 1.2: PAR1
-                    //            ----
+                    if (!$token->is_whitespace || $this->_isInline($current)) {
+                        // State 1.2: PAR1
+                        //            ----

-                    // State 1.3: PAR1\n\nPAR2
-                    //            ------------
+                        // State 1.3: PAR1\n\nPAR2
+                        //            ------------

-                    // State 1.4: <div>PAR1\n\nPAR2 (see State 2)
-                    //                 ------------
-                    $token = array($this->_pStart());
-                    $this->_splitText($text, $token);
+                        // State 1.4: <div>PAR1\n\nPAR2 (see State 2)
+                        //                 ------------
+                        $token = array($this->_pStart());
+                        $this->_splitText($text, $token);
+                    } else {
+                        // State 1.5: \n<hr />
+                        //            --
+                    }
                }
            } else {
                // State 2:   <div>PAR1... (similar to 1.4)
--- a/library/HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php
+++ b/library/HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php
@@ -0,0 +1,60 @@
+<?php
+
+/**
+ * Injector that removes spans with no attributes
+ */
+class HTMLPurifier_Injector_RemoveSpansWithoutAttributes extends HTMLPurifier_Injector
+{
+    public $name = 'RemoveSpansWithoutAttributes';
+    public $needed = array('span');
+
+    private $attrValidator;
+
+    /**
+     * Used by AttrValidator
+     */
+    private $config;
+    private $context;
+
+    public function prepare($config, $context) {
+        $this->attrValidator = new HTMLPurifier_AttrValidator();
+        $this->config = $config;
+        $this->context = $context;
+        return parent::prepare($config, $context);
+    }
+
+    public function handleElement(&$token) {
+        if ($token->name !== 'span' || !$token instanceof HTMLPurifier_Token_Start) {
+            return;
+        }
+
+        // We need to validate the attributes now since this doesn't normally
+        // happen until after MakeWellFormed. If all the attributes are removed
+        // the span needs to be removed too.
+        $this->attrValidator->validateToken($token, $this->config, $this->context);
+        $token->armor['ValidateAttributes'] = true;
+
+        if (!empty($token->attr)) {
+            return;
+        }
+
+        $nesting = 0;
+        $spanContentTokens = array();
+        while ($this->forwardUntilEndToken($i, $current, $nesting)) {}
+
+        if ($current instanceof HTMLPurifier_Token_End && $current->name === 'span') {
+            // Mark closing span tag for deletion
+            $current->markForDeletion = true;
+            // Delete open span tag
+            $token = false;
+        }
+    }
+
+    public function handleEnd(&$token) {
+        if ($token->markForDeletion) {
+            $token = false;
+        }
+    }
+}
+
+// vim: et sw=4 sts=4
--- a/library/HTMLPurifier/Injector/SafeObject.php
+++ b/library/HTMLPurifier/Injector/SafeObject.php
@@ -20,6 +20,9 @@ class HTMLPurifier_Injector_SafeObject extends HTMLPurifier_Injector
    protected $allowedParam = array(
        'wmode' => true,
        'movie' => true,
+        'flashvars' => true,
+        'src' => true,
+        'allowFullScreen' => true, // if omitted, assume to be 'false'
    );

    public function prepare($config, $context) {
@@ -47,7 +50,8 @@ class HTMLPurifier_Injector_SafeObject extends HTMLPurifier_Injector
                // We need this fix because YouTube doesn't supply a data
                // attribute, which we need if a type is specified. This is
                // *very* Flash specific.
-                if (!isset($this->objectStack[$i]->attr['data']) && $token->attr['name'] == 'movie') {
+                if (!isset($this->objectStack[$i]->attr['data']) &&
+                    ($token->attr['name'] == 'movie' || $token->attr['name'] == 'src')) {
                    $this->objectStack[$i]->attr['data'] = $token->attr['value'];
                }
                // Check if the parameter is the correct value but has not
--- a/library/HTMLPurifier/Language/messages/en.php
+++ b/library/HTMLPurifier/Language/messages/en.php
@@ -23,6 +23,7 @@ $messages = array(
 'Lexer: Missing gt'            => 'Missing greater-than sign (>), previous less-than sign (<) should be escaped',
 'Lexer: Missing attribute key' => 'Attribute declaration has no key',
 'Lexer: Missing end quote'     => 'Attribute declaration has no end quote',
+'Lexer: Extracted body'        => 'Removed document metadata tags',

 'Strategy_RemoveForeignElements: Tag transform'              => '<$1> element transformed into $CurrentToken.Serialized',
 'Strategy_RemoveForeignElements: Missing required attribute' => '$CurrentToken.Compact element missing required attribute $1',
--- a/library/HTMLPurifier/Lexer.php
+++ b/library/HTMLPurifier/Lexer.php
@@ -230,6 +230,17 @@ class HTMLPurifier_Lexer
        );
    }

+    /**
+     * Special Internet Explorer conditional comments should be removed.
+     */
+    protected static function removeIEConditional($string) {
+        return preg_replace(
+            '#<!--\[if [^>]+\]>.*?<!\[endif\]-->#si', // probably should generalize for all strings
+            '',
+            $string
+        );
+    }
+
    /**
     * Callback function for escapeCDATA() that does the work.
     *
@@ -252,8 +263,10 @@ class HTMLPurifier_Lexer
    public function normalize($html, $config, $context) {

        // normalize newlines to \n
-        $html = str_replace("\r\n", "\n", $html);
-        $html = str_replace("\r", "\n", $html);
+        if ($config->get('Core.NormalizeNewlines')) {
+            $html = str_replace("\r\n", "\n", $html);
+            $html = str_replace("\r", "\n", $html);
+        }

        if ($config->get('HTML.Trusted')) {
            // escape convoluted CDATA
@@ -263,9 +276,19 @@ class HTMLPurifier_Lexer
        // escape CDATA
        $html = $this->escapeCDATA($html);

+        $html = $this->removeIEConditional($html);
+
        // extract body from document if applicable
        if ($config->get('Core.ConvertDocumentToFragment')) {
-            $html = $this->extractBody($html);
+            $e = false;
+            if ($config->get('Core.CollectErrors')) {
+                $e =& $context->get('ErrorCollector');
+            }
+            $new_html = $this->extractBody($html);
+            if ($e && $new_html != $html) {
+                $e->send(E_WARNING, 'Lexer: Extracted body');
+            }
+            $html = $new_html;
        }

        // expand entities that aren't the big five
@@ -276,6 +299,11 @@ class HTMLPurifier_Lexer
        // represent non-SGML characters (horror, horror!)
        $html = HTMLPurifier_Encoder::cleanUTF8($html);

+        // if processing instructions are to be removed, remove them now
+        if ($config->get('Core.RemoveProcessingInstructions')) {
+            $html = preg_replace('#<\?.+?\?>#s', '', $html);
+        }
+
        return $html;
    }
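
Note on the two new preg_replace() calls in Lexer.php above: both rely on lazy quantifiers with the "s" modifier so that a match can span newlines but stops at the first terminator. A small sketch exercising the two patterns in isolation, assuming only the regexes from the hunks; the wrapper function names are hypothetical:

```php
<?php
// Hypothetical wrappers around the patterns used in normalize().
function strip_ie_conditional($string) {
    return preg_replace('#<!--\[if [^>]+\]>.*?<!\[endif\]-->#si', '', $string);
}

function strip_processing_instructions($string) {
    return preg_replace('#<\?.+?\?>#s', '', $string);
}

var_dump(strip_ie_conditional("a<!--[if IE]><b>x</b>\n<![endif]-->c"));
var_dump(strip_processing_instructions("x<?php echo 1;\n?>y"));
```

Note that in normalize() the processing-instruction pass is gated on Core.RemoveProcessingInstructions, while the conditional-comment pass runs unconditionally.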

--- a/library/HTMLPurifier/Lexer/DOMLex.php
+++ b/library/HTMLPurifier/Lexer/DOMLex.php
@@ -72,23 +72,57 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
    }

    /**
-     * Recursive function that tokenizes a node, putting it into an accumulator.
-     *
+     * Iterative function that tokenizes a node, putting it into an accumulator.
+     * To iterate is human, to recurse divine - L. Peter Deutsch
     * @param $node     DOMNode to be tokenized.
     * @param $tokens   Array-list of already tokenized tokens.
-     * @param $collect  Says whether or start and close are collected, set to
-     *                  false at first recursion because it's the implicit DIV
-     *                  tag you're dealing with.
     * @returns Tokens of node appended to previously passed tokens.
     */
-    protected function tokenizeDOM($node, &$tokens, $collect = false) {
+    protected function tokenizeDOM($node, &$tokens) {

+        $level = 0;
+        $nodes = array($level => array($node));
+        $closingNodes = array();
+        do {
+            while (!empty($nodes[$level])) {
+                $node = array_shift($nodes[$level]); // FIFO
+                $collect = $level > 0 ? true : false;
+                $needEndingTag = $this->createStartNode($node, $tokens, $collect);
+                if ($needEndingTag) {
+                    $closingNodes[$level][] = $node;
+                }
+                if ($node->childNodes && $node->childNodes->length) {
+                    $level++;
+                    $nodes[$level] = array();
+                    foreach ($node->childNodes as $childNode) {
+                        array_push($nodes[$level], $childNode);
+                    }
+                }
+            }
+            $level--;
+            if ($level && isset($closingNodes[$level])) {
+                while($node = array_pop($closingNodes[$level])) {
+                    $this->createEndNode($node, $tokens);
+                }
+            }
+        } while ($level > 0);
+    }
+
+    /**
+     * @param $node  DOMNode to be tokenized.
+     * @param $tokens   Array-list of already tokenized tokens.
+     * @param $collect  Says whether or not start and close tokens are collected; set to
+     *                    false for the root node (level 0) because it's the implicit DIV
+     *                    tag you're dealing with.
+     * @returns bool if the token needs an endtoken
+     */
+    protected function createStartNode($node, &$tokens, $collect) {
        // intercept non element nodes. WE MUST catch all of them,
        // but we're not getting the character reference nodes because
        // those should have been preprocessed
        if ($node->nodeType === XML_TEXT_NODE) {
            $tokens[] = $this->factory->createText($node->data);
-            return;
+            return false;
        } elseif ($node->nodeType === XML_CDATA_SECTION_NODE) {
            // undo libxml's special treatment of <script> and <style> tags
            $last = end($tokens);
@@ -106,48 +140,44 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
                }
            }
            $tokens[] = $this->factory->createText($this->parseData($data));
-            return;
+            return false;
        } elseif ($node->nodeType === XML_COMMENT_NODE) {
            // this is code is only invoked for comments in script/style in versions
            // of libxml pre-2.6.28 (regular comments, of course, are still
            // handled regularly)
            $tokens[] = $this->factory->createComment($node->data);
-            return;
+            return false;
        } elseif (
            // not-well tested: there may be other nodes we have to grab
            $node->nodeType !== XML_ELEMENT_NODE
        ) {
-            return;
+            return false;
        }

-        $attr = $node->hasAttributes() ?
-            $this->transformAttrToAssoc($node->attributes) :
-            array();
+        $attr = $node->hasAttributes() ? $this->transformAttrToAssoc($node->attributes) : array();

        // We still have to make sure that the element actually IS empty
        if (!$node->childNodes->length) {
            if ($collect) {
                $tokens[] = $this->factory->createEmpty($node->tagName, $attr);
            }
+            return false;
        } else {
-            if ($collect) { // don't wrap on first iteration
+            if ($collect) {
                $tokens[] = $this->factory->createStart(
                    $tag_name = $node->tagName, // somehow, it get's dropped
                    $attr
                );
            }
-            foreach ($node->childNodes as $node) {
-                // remember, it's an accumulator. Otherwise, we'd have
-                // to use array_merge
-                $this->tokenizeDOM($node, $tokens, true);
-            }
-            if ($collect) {
-                $tokens[] = $this->factory->createEnd($tag_name);
-            }
+            return true;
        }
-
    }

+    protected function createEndNode($node, &$tokens) {
+        $tokens[] = $this->factory->createEnd($node->tagName);
+    }
+
+
    /**
     * Converts a DOMNamedNodeMap of DOMAttr objects into an assoc array.
     *
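
Note on the tokenizeDOM() rewrite above: the recursion is replaced by a per-level queue of pending nodes plus a per-level list of nodes that still need an end token. The sketch below applies the same control flow to a plain array tree instead of DOMNode objects, so it runs without the DOM extension. The node shape (array with 'name' and 'children') and the function name flatten_tree are assumptions made for illustration only:

```php
<?php
// Hypothetical tree node: array('name' => string, 'children' => list of nodes).
function flatten_tree(array $root) {
    $out = array();
    $level = 0;
    $nodes = array($level => array($root)); // pending nodes, per depth
    $closing = array();                     // names awaiting an end tag, per depth
    do {
        while (!empty($nodes[$level])) {
            $node = array_shift($nodes[$level]); // FIFO, as in the patch
            $hasChildren = !empty($node['children']);
            if ($level > 0) { // the root is the implicit wrapper: not emitted
                $out[] = '<' . $node['name'] . ($hasChildren ? '>' : '/>');
                if ($hasChildren) {
                    $closing[$level][] = $node['name'];
                }
            }
            if ($hasChildren) {
                // Descend: children are processed before remaining siblings.
                $level++;
                $nodes[$level] = $node['children'];
            }
        }
        $level--;
        if ($level > 0 && !empty($closing[$level])) {
            $out[] = '</' . array_pop($closing[$level]) . '>';
        }
    } while ($level > 0);
    return $out;
}

$tree = array('name' => 'div', 'children' => array(
    array('name' => 'p', 'children' => array(
        array('name' => 'b', 'children' => array()),
        array('name' => 'i', 'children' => array(
            array('name' => 'u', 'children' => array()),
        )),
    )),
    array('name' => 'hr', 'children' => array()),
));
echo implode(' ', flatten_tree($tree)), "\n";
```

The design trade-off is the usual one: an explicit stack of levels avoids deep PHP call stacks on heavily nested input, at the cost of bookkeeping that the recursive version got for free.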
--- a/library/HTMLPurifier/Lexer/DirectLex.php
+++ b/library/HTMLPurifier/Lexer/DirectLex.php
@@ -384,7 +384,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
                }
            }
            if ($value === false) $value = '';
-            return array($key => $value);
+            return array($key => $this->parseData($value));
        }

        // setup loop environment
--- a/library/HTMLPurifier/Lexer/PEARSax3.php
+++ b/library/HTMLPurifier/Lexer/PEARSax3.php
@@ -26,13 +26,20 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
     * Internal accumulator array for SAX parsers.
     */
    protected $tokens = array();
+    protected $last_token_was_empty;
+
+    private $parent_handler;
+    private $stack = array();

    public function tokenizeHTML($string, $config, $context) {

        $this->tokens = array();
+        $this->last_token_was_empty = false;

        $string = $this->normalize($string, $config, $context);

+        $this->parent_handler = set_error_handler(array($this, 'muteStrictErrorHandler'));
+
        $parser = new XML_HTMLSax3();
        $parser->set_object($this);
        $parser->set_element_handler('openHandler','closeHandler');
@@ -44,6 +51,8 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer

        $parser->parse($string);

+        restore_error_handler();
+
        return $this->tokens;

    }
@@ -58,9 +67,11 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
        }
        if ($closed) {
            $this->tokens[] = new HTMLPurifier_Token_Empty($name, $attrs);
+            $this->last_token_was_empty = true;
        } else {
            $this->tokens[] = new HTMLPurifier_Token_Start($name, $attrs);
        }
+        $this->stack[] = $name;
        return true;
    }

@@ -71,10 +82,12 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
        // HTMLSax3 seems to always send empty tags an extra close tag
        // check and ignore if you see it:
        // [TESTME] to make sure it doesn't overreach
-        if ($this->tokens[count($this->tokens)-1] instanceof HTMLPurifier_Token_Empty) {
+        if ($this->last_token_was_empty) {
+            $this->last_token_was_empty = false;
            return true;
        }
        $this->tokens[] = new HTMLPurifier_Token_End($name);
+        if (!empty($this->stack)) array_pop($this->stack);
        return true;
    }

@@ -82,6 +95,7 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
     * Data event handler, interface is defined by PEAR package.
     */
    public function dataHandler(&$parser, $data) {
+        $this->last_token_was_empty = false;
        $this->tokens[] = new HTMLPurifier_Token_Text($data);
        return true;
    }
@@ -91,7 +105,18 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
     */
    public function escapeHandler(&$parser, $data) {
        if (strpos($data, '--') === 0) {
-            $this->tokens[] = new HTMLPurifier_Token_Comment($data);
+            // remove trailing and leading double-dashes
+            $data = substr($data, 2);
+            if (strlen($data) >= 2 && substr($data, -2) == "--") {
+                $data = substr($data, 0, -2);
+            }
+            if (isset($this->stack[sizeof($this->stack) - 1]) &&
+                $this->stack[sizeof($this->stack) - 1] == "style") {
+                $this->tokens[] = new HTMLPurifier_Token_Text($data);
+            } else {
+                $this->tokens[] = new HTMLPurifier_Token_Comment($data);
+            }
+            $this->last_token_was_empty = false;
        }
        // CDATA is handled elsewhere, but if it was handled here:
        //if (strpos($data, '[CDATA[') === 0) {
@@ -101,6 +126,14 @@ class HTMLPurifier_Lexer_PEARSax3 extends HTMLPurifier_Lexer
        return true;
    }

+    /**
+     * An error handler that mutes strict errors
+     */
+    public function muteStrictErrorHandler($errno, $errstr, $errfile=null, $errline=null, $errcontext=null) {
+        if ($errno == E_STRICT) return;
+        return call_user_func($this->parent_handler, $errno, $errstr, $errfile, $errline, $errcontext);
+    }
+
 }

 // vim: et sw=4 sts=4
--- a/library/HTMLPurifier/Lexer/PH5P.php
+++ b/library/HTMLPurifier/Lexer/PH5P.php
@@ -125,8 +125,6 @@ class HTML5 {
    const EOF      = 5;

    public function __construct($data) {
-        $data = str_replace("\r\n", "\n", $data);
-        $data = str_replace("\r", null, $data);

        $this->data = $data;
        $this->char = -1;
--- a/library/HTMLPurifier/Strategy/MakeWellFormed.php
+++ b/library/HTMLPurifier/Strategy/MakeWellFormed.php
@@ -2,6 +2,14 @@

 /**
 * Takes tokens makes them well-formed (balance end tags, etc.)
+ *
+ * Specification of the armor attributes this strategy uses:
+ *
+ *      - MakeWellFormed_TagClosedError: This armor field is used to
+ *        suppress tag closed errors for certain tokens [TagClosedSuppress],
+ *        in particular, if a tag was generated automatically by HTML
+ *        Purifier, we may rely on our infrastructure to close it for us
+ *        and shouldn't report an error to the user [TagClosedAuto].
 */
 class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
 {
@@ -43,6 +51,12 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
        // local variables
        $generator = new HTMLPurifier_Generator($config, $context);
        $escape_invalid_tags = $config->get('Core.EscapeInvalidTags');
+        // used for autoclose early abortion
+        $global_parent_allowed_elements = array();
+        if (isset($definition->info[$definition->info_parent])) {
+            // may be unset under testing circumstances
+            $global_parent_allowed_elements = $definition->info[$definition->info_parent]->child->getAllowedElements($config);
+        }
        $e = $context->get('ErrorCollector', true);
        $t = false; // token index
        $i = false; // injector index
@@ -83,6 +97,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            $this->injectors[] = $injector;
        }
        foreach ($custom_injectors as $injector) {
+            if (!$injector) continue;
            if (is_string($injector)) {
                $injector = "HTMLPurifier_Injector_$injector";
                $injector = new $injector;
@@ -101,7 +116,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy

        // -- end INJECTOR --

-        // a note on punting:
+        // a note on reprocessing:
        //      In order to reduce code duplication, whenever some code needs
        //      to make HTML changes in order to make things "correct", the
        //      new HTML gets sent through the purifier, regardless of its
@@ -148,7 +163,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                $top_nesting = array_pop($this->stack);
                $this->stack[] = $top_nesting;

-                // send error
+                // send error [TagClosedSuppress]
                if ($e && !isset($top_nesting->armor['MakeWellFormed_TagClosedError'])) {
                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $top_nesting);
                }
@@ -164,6 +179,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            $token = $tokens[$t];

            //echo '<br>'; printTokens($tokens, $t); printTokens($this->stack);
+            //flush();

            // quick-check: if it's not a tag, no need to process
            if (empty($token->is_tag)) {
@@ -191,12 +207,12 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            $ok = false;
            if ($type === 'empty' && $token instanceof HTMLPurifier_Token_Start) {
                // claims to be a start tag but is empty
-                $token = new HTMLPurifier_Token_Empty($token->name, $token->attr);
+                $token = new HTMLPurifier_Token_Empty($token->name, $token->attr, $token->line, $token->col, $token->armor);
                $ok = true;
            } elseif ($type && $type !== 'empty' && $token instanceof HTMLPurifier_Token_Empty) {
                // claims to be empty but really is a start tag
                $this->swap(new HTMLPurifier_Token_End($token->name));
-                $this->insertBefore(new HTMLPurifier_Token_Start($token->name, $token->attr));
+                $this->insertBefore(new HTMLPurifier_Token_Start($token->name, $token->attr, $token->line, $token->col, $token->armor));
                // punt (since we had to modify the input stream in a non-trivial way)
                $reprocess = true;
                continue;
@@ -209,6 +225,19 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                // ...unless they also have to close their parent
                if (!empty($this->stack)) {

+                    // Performance note: you might think that it's rather
+                    // inefficient, recalculating the autoclose information
+                    // for every tag that a token closes (since when we
+                    // do an autoclose, we push a new token into the
+                    // stream and then /process/ that, before
+                    // re-processing this token.)  But this is
+                    // necessary, because an injector can make
+                    // arbitrary transformations to the autoclosing
+                    // tokens we introduce, so things may have changed
+                    // in the meantime.  Also, doing the inefficient thing is
+                    // "easy" to reason about (for certain perverse definitions
+                    // of "easy")
+
                    $parent = array_pop($this->stack);
                    $this->stack[] = $parent;

@@ -219,30 +248,73 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                        $autoclose = false;
                    }

+                    if ($autoclose && $definition->info[$token->name]->wrap) {
+                        // Check if an element can be wrapped by another
+                        // element to make it valid in a context (for
+                        // example, <ul><ul> needs a <li> in between)
+                        $wrapname = $definition->info[$token->name]->wrap;
+                        $wrapdef = $definition->info[$wrapname];
+                        $elements = $wrapdef->child->getAllowedElements($config);
+                        $parent_elements = $definition->info[$parent->name]->child->getAllowedElements($config);
+                        if (isset($elements[$token->name]) && isset($parent_elements[$wrapname])) {
+                            $newtoken = new HTMLPurifier_Token_Start($wrapname);
+                            $this->insertBefore($newtoken);
+                            $reprocess = true;
+                            continue;
+                        }
+                    }
+
                    $carryover = false;
                    if ($autoclose && $definition->info[$parent->name]->formatting) {
                        $carryover = true;
                    }

                    if ($autoclose) {
-                        // errors need to be updated
-                        $new_token = new HTMLPurifier_Token_End($parent->name);
-                        $new_token->start = $parent;
-                        if ($carryover) {
-                            $element = clone $parent;
-                            $element->armor['MakeWellFormed_TagClosedError'] = true;
-                            $element->carryover = true;
-                            $this->processToken(array($new_token, $token, $element));
-                        } else {
-                            $this->insertBefore($new_token);
-                        }
-                        if ($e && !isset($parent->armor['MakeWellFormed_TagClosedError'])) {
-                            if (!$carryover) {
-                                $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag auto closed', $parent);
-                            } else {
-                                $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag carryover', $parent);
+                        // check if this autoclose is doomed to fail
+                        // (this rechecks $parent, which is harmless)
+                        $autoclose_ok = isset($global_parent_allowed_elements[$token->name]);
+                        if (!$autoclose_ok) {
+                            foreach ($this->stack as $ancestor) {
+                                $elements = $definition->info[$ancestor->name]->child->getAllowedElements($config);
+                                if (isset($elements[$token->name])) {
+                                    $autoclose_ok = true;
+                                    break;
+                                }
+                                if ($definition->info[$token->name]->wrap) {
+                                    $wrapname = $definition->info[$token->name]->wrap;
+                                    $wrapdef = $definition->info[$wrapname];
+                                    $wrap_elements = $wrapdef->child->getAllowedElements($config);
+                                    if (isset($wrap_elements[$token->name]) && isset($elements[$wrapname])) {
+                                        $autoclose_ok = true;
+                                        break;
+                                    }
+                                }
                            }
                        }
+                        if ($autoclose_ok) {
+                            // errors need to be updated
+                            $new_token = new HTMLPurifier_Token_End($parent->name);
+                            $new_token->start = $parent;
+                            if ($carryover) {
+                                $element = clone $parent;
+                                // [TagClosedAuto]
+                                $element->armor['MakeWellFormed_TagClosedError'] = true;
+                                $element->carryover = true;
+                                $this->processToken(array($new_token, $token, $element));
+                            } else {
+                                $this->insertBefore($new_token);
+                            }
+                            // [TagClosedSuppress]
+                            if ($e && !isset($parent->armor['MakeWellFormed_TagClosedError'])) {
+                                if (!$carryover) {
+                                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag auto closed', $parent);
+                                } else {
+                                    $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag carryover', $parent);
+                                }
+                            }
+                        } else {
+                            $this->remove();
+                        }
                        $reprocess = true;
                        continue;
                    }
@@ -348,7 +420,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
            if ($e) {
                for ($j = $c - 1; $j > 0; $j--) {
                    // notice we exclude $j == 0, i.e. the current ending tag, from
-                    // the errors...
+                    // the errors... [TagClosedSuppress]
                    if (!isset($skipped_tags[$j]->armor['MakeWellFormed_TagClosedError'])) {
                        $e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$j]);
                    }
@@ -363,6 +435,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
                $new_token->start = $skipped_tags[$j];
                array_unshift($replace, $new_token);
                if (isset($definition->info[$new_token->name]) && $definition->info[$new_token->name]->formatting) {
+                    // [TagClosedAuto]
                    $element = clone $skipped_tags[$j];
                    $element->carryover = true;
                    $element->armor['MakeWellFormed_TagClosedError'] = true;
@@ -431,7 +504,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
    }

    /**
-     * Inserts a token before the current token. Cursor now points to this token
+     * Inserts a token before the current token. Cursor now points to
+     * this token.  You must reprocess after this.
     */
    private function insertBefore($token) {
        array_splice($this->tokens, $this->t, 0, array($token));
@@ -439,14 +513,15 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy

    /**
     * Removes current token. Cursor now points to new token occupying previously
-     * occupied space.
+     * occupied space.  You must reprocess after this.
     */
    private function remove() {
        array_splice($this->tokens, $this->t, 1);
    }

    /**
-     * Swap current token with new token. Cursor points to new token (no change).
+     * Swap current token with new token. Cursor points to new token (no
+     * change).  You must reprocess after this.
     */
    private function swap($token) {
        $this->tokens[$this->t] = $token;
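
Note on the autoclose check above: the ancestor loop can be read as a standalone predicate. The following is a minimal sketch, assuming a hypothetical `$allowed` table (element name to allowed children) and `$wrap` table (element name to wrapper element) in place of the real `$definition->info[...]` lookups. The function name `sketch_autoclose_ok` is made up for illustration, and the initial check against the global parent's allowed elements is left out.

```php
<?php
// Hypothetical data and helper; the real code consults
// $definition->info[...]->child->getAllowedElements($config).
function sketch_autoclose_ok($token, array $stack, array $allowed, array $wrap) {
    foreach ($stack as $ancestor) {
        $elements = isset($allowed[$ancestor]) ? $allowed[$ancestor] : array();
        // Direct fit: this ancestor accepts the token as a child.
        if (isset($elements[$token])) return true;
        // Indirect fit: the ancestor accepts the token's wrapper
        // element, and the wrapper accepts the token.
        if (isset($wrap[$token])) {
            $w = $wrap[$token];
            if (isset($allowed[$w][$token]) && isset($elements[$w])) return true;
        }
    }
    // No ancestor could ever hold the token, so autoclosing is pointless.
    return false;
}

// Illustrative content model, not HTML Purifier's real one.
$allowed = array(
    'div' => array('p' => true, 'ul' => true),
    'ul'  => array('li' => true),
    'li'  => array('ul' => true, 'p' => true),
    'p'   => array(),
);
$wrap = array('ul' => 'li');
```

When the predicate is false, the diff removes the token instead of closing parents one by one only to discover that nothing up the stack accepts it.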
--- a/library/HTMLPurifier/TagTransform/Font.php
+++ b/library/HTMLPurifier/TagTransform/Font.php
@@ -63,13 +63,15 @@ class HTMLPurifier_TagTransform_Font extends HTMLPurifier_TagTransform
        // handle size transform
        if (isset($attr['size'])) {
            // normalize large numbers
-            if ($attr['size']{0} == '+' || $attr['size']{0} == '-') {
-                $size = (int) $attr['size'];
-                if ($size < -2) $attr['size'] = '-2';
-                if ($size > 4)  $attr['size'] = '+4';
-            } else {
-                $size = (int) $attr['size'];
-                if ($size > 7) $attr['size'] = '7';
+            if ($attr['size'] !== '') {
+                if ($attr['size']{0} == '+' || $attr['size']{0} == '-') {
+                    $size = (int) $attr['size'];
+                    if ($size < -2) $attr['size'] = '-2';
+                    if ($size > 4)  $attr['size'] = '+4';
+                } else {
+                    $size = (int) $attr['size'];
+                    if ($size > 7) $attr['size'] = '7';
+                }
            }
            if (isset($this->_size_lookup[$attr['size']])) {
                $prepend_style .= 'font-size:' .
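
Note on the size guard above: without the `!== ''` check, indexing the first character of an empty `size` attribute reads an offset that does not exist. A minimal sketch of the same clamping as a standalone function follows. `sketch_normalize_font_size` is a hypothetical name, and the sketch uses `[0]` rather than the `{0}` offset syntax from the diff, since the curly-brace form is not accepted by current PHP versions.

```php
<?php
// Hypothetical helper, not part of HTML Purifier: mirrors the size
// clamping in the hunk above, including the new empty-string guard.
function sketch_normalize_font_size($size) {
    if ($size === '') {
        // Nothing to index into; leave it for the size lookup to reject.
        return $size;
    }
    if ($size[0] == '+' || $size[0] == '-') {
        // Relative sizes are clamped to the range -2 .. +4.
        $n = (int) $size;
        if ($n < -2) return '-2';
        if ($n > 4)  return '+4';
    } else {
        // Absolute sizes are capped at 7.
        $n = (int) $size;
        if ($n > 7) return '7';
    }
    return $size;
}
```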
--- a/library/HTMLPurifier/Token/Tag.php
+++ b/library/HTMLPurifier/Token/Tag.php
@@ -33,7 +33,7 @@ class HTMLPurifier_Token_Tag extends HTMLPurifier_Token
     * @param $name String name.
     * @param $attr Associative array of attributes.
     */
-    public function __construct($name, $attr = array(), $line = null, $col = null) {
+    public function __construct($name, $attr = array(), $line = null, $col = null, $armor = array()) {
        $this->name = ctype_lower($name) ? $name : strtolower($name);
        foreach ($attr as $key => $value) {
            // normalization only necessary when key is not lowercase
@@ -50,6 +50,7 @@ class HTMLPurifier_Token_Tag extends HTMLPurifier_Token
        $this->attr = $attr;
        $this->line = $line;
        $this->col  = $col;
+        $this->armor = $armor;
    }
 }

--- a/library/HTMLPurifier/URI.php
+++ b/library/HTMLPurifier/URI.php
@@ -67,14 +67,6 @@ class HTMLPurifier_URI
        $chars_gen_delims = ':/?#[]@';
        $chars_pchar = $chars_sub_delims . ':@';

-        // validate scheme (MUST BE FIRST!)
-        if (!is_null($this->scheme) && is_null($this->host)) {
-            $def = $config->getDefinition('URI');
-            if ($def->defaultScheme === $this->scheme) {
-                $this->scheme = null;
-            }
-        }
-
        // validate host
        if (!is_null($this->host)) {
            $host_def = new HTMLPurifier_AttrDef_URI_Host();
@@ -82,6 +74,21 @@ class HTMLPurifier_URI
            if ($this->host === false) $this->host = null;
        }

+        // validate scheme
+        // NOTE: It's not appropriate to check whether or not this
+        // scheme is in our registry, since a URIFilter may convert a
+        // URI that we don't allow into one we do.  So instead, we just
+        // check if the scheme can be dropped because there is no host
+        // and it is our default scheme.
+        if (!is_null($this->scheme) && (is_null($this->host) || $this->host === '')) {
+            // support for relative paths is pretty abysmal when the
+            // scheme is present, so axe it when possible
+            $def = $config->getDefinition('URI');
+            if ($def->defaultScheme === $this->scheme) {
+                $this->scheme = null;
+            }
+        }
+
        // validate username
        if (!is_null($this->userinfo)) {
            $encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . ':');
@@ -96,32 +103,48 @@ class HTMLPurifier_URI
        // validate path
        $path_parts = array();
        $segments_encoder = new HTMLPurifier_PercentEncoder($chars_pchar . '/');
-        if (!is_null($this->host)) {
+        if (!is_null($this->host)) { // this catches $this->host === ''
            // path-abempty (hier and relative)
+            // http://www.example.com/my/path
+            // //www.example.com/my/path (looks odd, but works, and
+            //                            recognized by most browsers)
+            // (this set is valid or invalid on a scheme by scheme
+            // basis, so we'll deal with it later)
+            // file:///my/path
+            // ///my/path
            $this->path = $segments_encoder->encode($this->path);
-        } elseif ($this->path !== '' && $this->path[0] === '/') {
-            // path-absolute (hier and relative)
-            if (strlen($this->path) >= 2 && $this->path[1] === '/') {
-                // This shouldn't ever happen!
-                $this->path = '';
-            } else {
+        } elseif ($this->path !== '') {
+            if ($this->path[0] === '/') {
+                // path-absolute (hier and relative)
+                // http:/my/path
+                // /my/path
+                if (strlen($this->path) >= 2 && $this->path[1] === '/') {
+                    // This could happen if the host gets stripped
+                    // out
+                    // http://my/path
+                    // //my/path
+                    $this->path = '';
+                } else {
+                    $this->path = $segments_encoder->encode($this->path);
+                }
+            } elseif (!is_null($this->scheme)) {
+                // path-rootless (hier)
+                // http:my/path
+                // Short circuit evaluation means we don't need to check nz
                $this->path = $segments_encoder->encode($this->path);
-            }
-        } elseif (!is_null($this->scheme) && $this->path !== '') {
-            // path-rootless (hier)
-            // Short circuit evaluation means we don't need to check nz
-            $this->path = $segments_encoder->encode($this->path);
-        } elseif (is_null($this->scheme) && $this->path !== '') {
-            // path-noscheme (relative)
-            // (once again, not checking nz)
-            $segment_nc_encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . '@');
-            $c = strpos($this->path, '/');
-            if ($c !== false) {
-                $this->path =
-                    $segment_nc_encoder->encode(substr($this->path, 0, $c)) .
-                    $segments_encoder->encode(substr($this->path, $c));
            } else {
-                $this->path = $segment_nc_encoder->encode($this->path);
+                // path-noscheme (relative)
+                // my/path
+                // (once again, not checking nz)
+                $segment_nc_encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . '@');
+                $c = strpos($this->path, '/');
+                if ($c !== false) {
+                    $this->path =
+                        $segment_nc_encoder->encode(substr($this->path, 0, $c)) .
+                        $segments_encoder->encode(substr($this->path, $c));
+                } else {
+                    $this->path = $segment_nc_encoder->encode($this->path);
+                }
            }
        } else {
            // path-empty (hier and relative)
@@ -150,6 +173,9 @@ class HTMLPurifier_URI
    public function toString() {
        // reconstruct authority
        $authority = null;
+        // there is a rendering difference between a null authority
+        // (http:foo-bar) and an empty string authority
+        // (http:///foo-bar).
        if (!is_null($this->host)) {
            $authority = '';
            if(!is_null($this->userinfo)) $authority .= $this->userinfo . '@';
@@ -157,7 +183,12 @@ class HTMLPurifier_URI
            if(!is_null($this->port))     $authority .= ':' . $this->port;
        }

-        // reconstruct the result
+        // Reconstruct the result
+        // One might wonder about parsing quirks from browsers after
+        // this reconstruction.  Unfortunately, parsing behavior depends
+        // on what *scheme* was employed (file:///foo is handled *very*
+        // differently than http:///foo), so we have to defer to the
+        // schemes to do the right thing.
        $result = '';
        if (!is_null($this->scheme))    $result .= $this->scheme . ':';
        if (!is_null($authority))       $result .=  '//' . $authority;
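
Note on null versus empty authority: the distinction called out in `toString()` is easiest to see with the two inputs side by side. This is a minimal sketch, assuming a hypothetical `sketch_render_uri` function that keeps only scheme, host and path. Userinfo, port, query and fragment are omitted.

```php
<?php
// Hypothetical stand-in for the tail of HTMLPurifier_URI::toString();
// userinfo, port, query and fragment are left out for brevity.
function sketch_render_uri($scheme, $host, $path) {
    // A null host means "no authority"; an empty string means
    // "authority present but blank", which still emits '//'.
    $authority = is_null($host) ? null : $host;
    $result = '';
    if (!is_null($scheme))    $result .= $scheme . ':';
    if (!is_null($authority)) $result .= '//' . $authority;
    $result .= $path;
    return $result;
}
```

A null host with path `foo-bar` renders as `http:foo-bar`, while an empty-string host with path `/foo-bar` renders as `http:///foo-bar`.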
--- a/library/HTMLPurifier/URIFilter/DisableResources.php
+++ b/library/HTMLPurifier/URIFilter/DisableResources.php
@@ -0,0 +1,11 @@
+<?php
+
+class HTMLPurifier_URIFilter_DisableResources extends HTMLPurifier_URIFilter
+{
+    public $name = 'DisableResources';
+    public function filter(&$uri, $config, $context) {
+        return !$context->get('EmbeddedURI', true);
+    }
+}
+
+// vim: et sw=4 sts=4
--- a/library/HTMLPurifier/URIScheme.php
+++ b/library/HTMLPurifier/URIScheme.php
@@ -3,11 +3,13 @@
 /**
 * Validator for the components of a URI for a specific scheme
 */
-class HTMLPurifier_URIScheme
+abstract class HTMLPurifier_URIScheme
 {

    /**
-     * Scheme's default port (integer)
+     * Scheme's default port (integer).  If an explicit port number is
+     * specified that coincides with the default port, it will be
+     * elided.
     */
    public $default_port = null;

@@ -24,17 +26,62 @@ class HTMLPurifier_URIScheme
    public $hierarchical = false;

    /**
-     * Validates the components of a URI
-     * @note This implementation should be called by children if they define
-     *       a default port, as it does port processing.
-     * @param $uri Instance of HTMLPurifier_URI
+     * Whether or not the URI may omit a hostname when the scheme is
+     * explicitly specified, a la file:///path/to/file. As of writing,
+     * 'file' is the only scheme for which browsers support this properly.
+     */
+    public $may_omit_host = false;
+
+    /**
+     * Validates the components of a URI for a specific scheme.
+     * @param $uri Reference to a HTMLPurifier_URI object
+     * @param $config HTMLPurifier_Config object
+     * @param $context HTMLPurifier_Context object
+     * @return Bool success or failure
+     */
+    public abstract function doValidate(&$uri, $config, $context);
+
+    /**
+     * Public interface for validating components of a URI.  Performs a
+     * bunch of default actions. Don't override this method.
+     * @param $uri Reference to a HTMLPurifier_URI object
     * @param $config HTMLPurifier_Config object
     * @param $context HTMLPurifier_Context object
     * @return Bool success or failure
     */
    public function validate(&$uri, $config, $context) {
        if ($this->default_port == $uri->port) $uri->port = null;
-        return true;
+        // kludge: browsers do funny things when the scheme but not the
+        // authority is set
+        if (!$this->may_omit_host &&
+            // if the scheme is present, a missing host is always in error
+            (!is_null($uri->scheme) && ($uri->host === '' || is_null($uri->host))) ||
+            // if the scheme is not present, a *blank* host is in error,
+            // since this translates into '///path' which most browsers
+            // interpret as being 'http://path'.
+             (is_null($uri->scheme) && $uri->host === '')
+        ) {
+            do {
+                if (is_null($uri->scheme)) {
+                    if (substr($uri->path, 0, 2) != '//') {
+                        $uri->host = null;
+                        break;
+                    }
+                    // URI is '////path', so we cannot nullify the
+                    // host to preserve semantics.  Try expanding the
+                    // hostname instead (fall through)
+                }
+                // first see if we can manually insert a hostname
+                $host = $config->get('URI.Host');
+                if (!is_null($host)) {
+                    $uri->host = $host;
+                } else {
+                    // we can't do anything sensible, reject the URL.
+                    return false;
+                }
+            } while (false);
+        }
+        return $this->doValidate($uri, $config, $context);
    }

 }
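
Note on the `validate()` / `doValidate()` split: the base class now owns the shared pre-processing and subclasses only supply the scheme-specific part, so a subclass can no longer forget to call the parent. A minimal sketch of that shape follows, assuming hypothetical `SketchScheme` and `SketchHttp` classes. The host handling from the real `validate()` is omitted, and a plain `stdClass` stands in for the URI object.

```php
<?php
// Hypothetical classes for illustration; only the validate()/doValidate()
// split is taken from the diff above.
abstract class SketchScheme {
    public $default_port = null;

    // Subclasses implement scheme-specific checks here.
    abstract public function doValidate(&$uri, $config, $context);

    // Shared pre-processing lives in one place and always runs first.
    public function validate(&$uri, $config, $context) {
        if ($this->default_port == $uri->port) $uri->port = null;
        return $this->doValidate($uri, $config, $context);
    }
}

class SketchHttp extends SketchScheme {
    public $default_port = 80;
    public function doValidate(&$uri, $config, $context) {
        $uri->userinfo = null;
        return true;
    }
}

// Usage: the default port is elided by the base class, the userinfo
// is stripped by the subclass.
$uri = new stdClass();
$uri->port = 80;
$uri->userinfo = 'user:pass';
$ok = (new SketchHttp())->validate($uri, null, null);
```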
--- a/library/HTMLPurifier/URIScheme/data.php
+++ b/library/HTMLPurifier/URIScheme/data.php
@@ -0,0 +1,96 @@
+<?php
+
+/**
+ * Implements data: URI for base64 encoded images supported by GD.
+ */
+class HTMLPurifier_URIScheme_data extends HTMLPurifier_URIScheme {
+
+    public $browsable = true;
+    public $allowed_types = array(
+        // you better write validation code for other types if you
+        // decide to allow them
+        'image/jpeg' => true,
+        'image/gif' => true,
+        'image/png' => true,
+        );
+    // this is actually irrelevant since we only write out the path
+    // component
+    public $may_omit_host = true;
+
+    public function doValidate(&$uri, $config, $context) {
+        $result = explode(',', $uri->path, 2);
+        $is_base64 = false;
+        $charset = null;
+        $content_type = null;
+        if (count($result) == 2) {
+            list($metadata, $data) = $result;
+            // do some legwork on the metadata
+            $metas = explode(';', $metadata);
+            while(!empty($metas)) {
+                $cur = array_shift($metas);
+                if ($cur == 'base64') {
+                    $is_base64 = true;
+                    break;
+                }
+                if (substr($cur, 0, 8) == 'charset=') {
+                    // doesn't match if there are arbitrary spaces, but
+                    // whatever dude
+                    if ($charset !== null) continue; // garbage
+                    $charset = substr($cur, 8); // not used
+                } else {
+                    if ($content_type !== null) continue; // garbage
+                    $content_type = $cur;
+                }
+            }
+        } else {
+            $data = $result[0];
+        }
+        if ($content_type !== null && empty($this->allowed_types[$content_type])) {
+            return false;
+        }
+        if ($charset !== null) {
+            // error; we don't allow plaintext stuff
+            $charset = null;
+        }
+        $data = rawurldecode($data);
+        if ($is_base64) {
+            $raw_data = base64_decode($data);
+        } else {
+            $raw_data = $data;
+        }
+        // XXX probably want to refactor this into a general mechanism
+        // for filtering arbitrary content types
+        $file = tempnam("/tmp", "");
+        file_put_contents($file, $raw_data);
+        if (function_exists('exif_imagetype')) {
+            $image_code = exif_imagetype($file);
+        } elseif (function_exists('getimagesize')) {
+            set_error_handler(array($this, 'muteErrorHandler'));
+            $info = getimagesize($file);
+            restore_error_handler();
+            if ($info == false) return false;
+            $image_code = $info[2];
+        } else {
+            trigger_error("could not find exif_imagetype or getimagesize functions", E_USER_ERROR);
+        }
+        $real_content_type = image_type_to_mime_type($image_code);
+        if ($real_content_type != $content_type) {
+            // we're nice guys; if the content type is something else we
+            // support, change it over
+            if (empty($this->allowed_types[$real_content_type])) return false;
+            $content_type = $real_content_type;
+        }
+        // ok, it's kosher, rewrite what we need
+        $uri->userinfo = null;
+        $uri->host = null;
+        $uri->port = null;
+        $uri->fragment = null;
+        $uri->query = null;
+        $uri->path = "$content_type;base64," . base64_encode($raw_data);
+        return true;
+    }
+
+    public function muteErrorHandler($errno, $errstr) {}
+
+}
+
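
Note on the `data:` metadata parsing: the first half of `doValidate()` above splits the path at the first comma and then walks the semicolon-separated metadata. A minimal sketch of just that step follows, assuming a hypothetical `sketch_parse_data_path` helper that returns the pieces instead of validating them. The image sniffing and re-encoding are not reproduced.

```php
<?php
// Hypothetical helper: splits the path of a data: URI the same way the
// doValidate() above does, but returns the pieces instead of validating.
function sketch_parse_data_path($path) {
    $is_base64 = false;
    $content_type = null;
    $result = explode(',', $path, 2);
    if (count($result) == 2) {
        list($metadata, $data) = $result;
        foreach (explode(';', $metadata) as $cur) {
            if ($cur == 'base64') {
                // base64 marks the end of the metadata
                $is_base64 = true;
                break;
            }
            if (substr($cur, 0, 8) == 'charset=') continue; // ignored
            // the first non-charset item is taken as the content type
            if ($content_type === null) $content_type = $cur;
        }
    } else {
        // no comma at all: everything is data
        $data = $result[0];
    }
    return array($content_type, $is_base64, $data);
}
```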
--- a/library/HTMLPurifier/URIScheme/file.php
+++ b/library/HTMLPurifier/URIScheme/file.php
@@ -0,0 +1,32 @@
+<?php
+
+/**
+ * Validates file as defined by RFC 1630 and RFC 1738.
+ */
+class HTMLPurifier_URIScheme_file extends HTMLPurifier_URIScheme {
+
+    // Generally file:// URLs are not accessible from most
+    // machines, so placing them as an img src is incorrect.
+    public $browsable = false;
+
+    // Basically the *only* URI scheme for which this is true, since
+    // accessing files on the local machine is very common.  In fact,
+    // browsers on some operating systems don't understand the
+    // authority, though I hear it is used on Windows to refer to
+    // network shares.
+    public $may_omit_host = true;
+
+    public function doValidate(&$uri, $config, $context) {
+        // Authentication method is not supported
+        $uri->userinfo = null;
+        // file:// makes no provisions for accessing the resource
+        $uri->port     = null;
+        // While it seems to work on Firefox, the querystring has
+        // no possible effect and is thus stripped.
+        $uri->query    = null;
+        return true;
+    }
+
+}
+
+// vim: et sw=4 sts=4
--- a/library/HTMLPurifier/URIScheme/ftp.php
+++ b/library/HTMLPurifier/URIScheme/ftp.php
@@ -9,8 +9,7 @@ class HTMLPurifier_URIScheme_ftp extends HTMLPurifier_URIScheme {
    public $browsable = true; // usually
    public $hierarchical = true;

-    public function validate(&$uri, $config, $context) {
-        parent::validate($uri, $config, $context);
+    public function doValidate(&$uri, $config, $context) {
        $uri->query    = null;

        // typecode check
--- a/library/HTMLPurifier/URIScheme/http.php
+++ b/library/HTMLPurifier/URIScheme/http.php
@@ -9,8 +9,7 @@ class HTMLPurifier_URIScheme_http extends HTMLPurifier_URIScheme {
    public $browsable = true;
    public $hierarchical = true;

-    public function validate(&$uri, $config, $context) {
-        parent::validate($uri, $config, $context);
+    public function doValidate(&$uri, $config, $context) {
        $uri->userinfo = null;
        return true;
    }
--- a/library/HTMLPurifier/URIScheme/mailto.php
+++ b/library/HTMLPurifier/URIScheme/mailto.php
@@ -12,9 +12,9 @@
 class HTMLPurifier_URIScheme_mailto extends HTMLPurifier_URIScheme {

    public $browsable = false;
+    public $may_omit_host = true;

-    public function validate(&$uri, $config, $context) {
-        parent::validate($uri, $config, $context);
+    public function doValidate(&$uri, $config, $context) {
        $uri->userinfo = null;
        $uri->host     = null;
        $uri->port     = null;
--- a/library/HTMLPurifier/URIScheme/news.php
+++ b/library/HTMLPurifier/URIScheme/news.php
@@ -6,9 +6,9 @@
 class HTMLPurifier_URIScheme_news extends HTMLPurifier_URIScheme {

    public $browsable = false;
+    public $may_omit_host = true;

-    public function validate(&$uri, $config, $context) {
-        parent::validate($uri, $config, $context);
+    public function doValidate(&$uri, $config, $context) {
        $uri->userinfo = null;
        $uri->host     = null;
        $uri->port     = null;
--- a/library/HTMLPurifier/URIScheme/nntp.php
+++ b/library/HTMLPurifier/URIScheme/nntp.php
@@ -8,8 +8,7 @@ class HTMLPurifier_URIScheme_nntp extends HTMLPurifier_URIScheme {
    public $default_port = 119;
    public $browsable = false;

-    public function validate(&$uri, $config, $context) {
-        parent::validate($uri, $config, $context);
+    public function doValidate(&$uri, $config, $context) {
        $uri->userinfo = null;
        $uri->query    = null;
        return true;
--- a/library/HTMLPurifier/VarParser/Flexible.php
+++ b/library/HTMLPurifier/VarParser/Flexible.php
@@ -62,7 +62,7 @@ class HTMLPurifier_VarParser_Flexible extends HTMLPurifier_VarParser
                        foreach ($var as $keypair) {
                            $c = explode(':', $keypair, 2);
                            if (!isset($c[1])) continue;
-                            $nvar[$c[0]] = $c[1];
+                            $nvar[trim($c[0])] = trim($c[1]);
                        }
                        $var = $nvar;
                    }
@@ -79,8 +79,15 @@ class HTMLPurifier_VarParser_Flexible extends HTMLPurifier_VarParser
                        return $new;
                    } else break;
                }
+                if ($type === self::ALIST) {
+                    trigger_error("Array list did not have consecutive integer indexes", E_USER_WARNING);
+                    return array_values($var);
+                }
                if ($type === self::LOOKUP) {
                    foreach ($var as $key => $value) {
+                        if ($value !== true) {
+                            trigger_error("Lookup array has non-true value at key '$key'; maybe your input array was not indexed numerically", E_USER_WARNING);
+                        }
                        $var[$key] = true;
                    }
                }
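
Note on the `trim()` change in the hash branch: with the trimming in place, whitespace around keys and values no longer ends up in the parsed array. A minimal sketch follows, assuming a hypothetical `sketch_parse_hash` helper and assuming the input has already been split on commas (that step is outside the hunk shown here).

```php
<?php
// Hypothetical helper mirroring the "key:value,key:value" branch above.
function sketch_parse_hash($str) {
    $nvar = array();
    foreach (explode(',', $str) as $keypair) {
        // Split on the first colon only, so values may contain colons.
        $c = explode(':', $keypair, 2);
        if (!isset($c[1])) continue;   // skip entries without a colon
        $nvar[trim($c[0])] = trim($c[1]);
    }
    return $nvar;
}
```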
--- a/maintenance/compile-doxygen.sh
+++ b/maintenance/compile-doxygen.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+cd ..
+mkdir docs/doxygen
+rm -Rf docs/doxygen/*
+doxygen 1>docs/doxygen/info.log 2>docs/doxygen/errors.log
+if [ "$?" != 0 ]; then
+    cat docs/doxygen/errors.log
+    exit 1
+fi
+cd docs
+tar czf doxygen.tgz doxygen
--- a/maintenance/flush.php
+++ b/maintenance/flush.php
@@ -18,8 +18,7 @@ function e($cmd) {
    if ($status) exit($status);
 }

-$php = $_SERVER['argv'][1];
-if (!$php) $php = 'php';
+$php = empty($_SERVER['argv'][1]) ? 'php' : $_SERVER['argv'][1];

 e($php . ' generate-includes.php');
 e($php . ' generate-schema-cache.php');
--- a/maintenance/generate-entity-file.php
+++ b/maintenance/generate-entity-file.php
@@ -36,7 +36,7 @@ function unichr($dec) {
 }

 if ( !is_dir($entity_dir) ) exit("Fatal Error: Can't find entity directory.\n");
-if ( file_exists($output_file) ) exit("Fatal Error: entity-lookup.txt already exists.\n");
+if ( file_exists($output_file) ) exit("Fatal Error: output file already exists.\n");

 $dh = @opendir($entity_dir);
 if ( !$dh ) exit("Fatal Error: Cannot read entity directory.\n");
@@ -52,7 +52,7 @@ closedir($dh);
 if ( !$entity_files ) exit("Fatal Error: No entity files to parse.\n");

 $entity_table = array();
-$regexp = '/<!ENTITY\s+([A-Za-z]+)\s+"&#(?:38;#)?([0-9]+);">/';
+$regexp = '/<!ENTITY\s+([A-Za-z0-9]+)\s+"&#(?:38;#)?([0-9]+);">/';

 foreach ( $entity_files as $file ) {
    $contents = file_get_contents($entity_dir . $file);
--- a/maintenance/generate-includes.php
+++ b/maintenance/generate-includes.php
@@ -80,8 +80,9 @@ function get_dependency_lookup($file) {
        if (strncmp('class', $line, 5) === 0) {
            // The implementation here is fragile and will break if we attempt
            // to use interfaces. Beware!
-            list(, $parent) = explode(' extends ', trim($line, ' {'."\n\r"), 2);
-            if (empty($parent)) break;
+            $arr = explode(' extends ', trim($line, ' {'."\n\r"), 2);
+            if (count($arr) < 2) break;
+            $parent = $arr[1];
            $dep_file = HTMLPurifier_Bootstrap::getPath($parent);
            if (!$dep_file) break;
            $deps[$dep_file] = true;
--- a/maintenance/phpt-modifications.patch
+++ b/maintenance/phpt-modifications.patch
@@ -1,367 +0,0 @@
-Index: src/PHPT/Case.php
-===================================================================
--- src/PHPT/Case.php	(revision 691)
-+++ src/PHPT/Case.php	(working copy)
-@@ -28,17 +28,14 @@
-     {
-         $reporter->onCaseStart($this);
-         try {
-            if ($this->sections->filterByInterface('RunnableBefore')->valid()) {
-                foreach ($this->sections as $section) {
-                    $section->run($this);
-                }
-+            $runnable_before = $this->sections->filterByInterface('RunnableBefore');
-+            foreach ($runnable_before as $section) {
-+                $section->run($this);
-             }
-            $this->sections->filterByInterface();
-             $this->sections->FILE->run($this);
-            if ($this->sections->filterByInterface('RunnableAfter')->valid()) {
-                foreach ($this->sections as $section) {
-                    $section->run($this);
-                }
-+            $runnable_after = $this->sections->filterByInterface('RunnableAfter');
-+            foreach ($runnable_after as $section) {
-+                $section->run($this);
-             }
-             $reporter->onCasePass($this);
-         } catch (PHPT_Case_VetoException $veto) {
-@@ -46,7 +43,6 @@
-         } catch (PHPT_Case_FailureException $failure) {
-             $reporter->onCaseFail($this, $failure);
-         }
-        $this->sections->filterByInterface();
-         $reporter->onCaseEnd($this);
-     }
-     
-Index: src/PHPT/Case/Validator/CgiRequired.php
-===================================================================
--- src/PHPT/Case/Validator/CgiRequired.php	(revision 691)
-+++ src/PHPT/Case/Validator/CgiRequired.php	(working copy)
-@@ -17,7 +17,6 @@
-     public function is(PHPT_Case $case)
-     {
-         $return = $case->sections->filterByInterface('CgiExecutable')->valid();
-        $case->sections->filterByInterface();
-         return $return;
-     }
- }
-Index: src/PHPT/CodeRunner/CommandLine.php
-===================================================================
--- src/PHPT/CodeRunner/CommandLine.php	(revision 691)
-+++ src/PHPT/CodeRunner/CommandLine.php	(working copy)
-@@ -13,7 +13,7 @@
-         $this->_filename = $runner->filename;
-         $this->_ini = (string)$runner->ini;
-         $this->_args = (string)$runner->args;
-        $this->_executable = str_replace(' ', '\ ', (string)$runner->executable);
-+        $this->_executable = $runner->executable;
-         $this->_post_filename = (string)$runner->post_filename;
-     }
-     
-Index: src/PHPT/CodeRunner/Driver/WScriptShell.php
-===================================================================
--- src/PHPT/CodeRunner/Driver/WScriptShell.php	(revision 691)
-+++ src/PHPT/CodeRunner/Driver/WScriptShell.php	(working copy)
-@@ -23,9 +23,9 @@
-             }
-         }
-         if ($found == false) {
-            throw new PHPT_CodeRunner_InvalidExecutableException(
-                'unable to locate PHP executable: ' . $this->executable
-            );
-+            //throw new PHPT_CodeRunner_InvalidExecutableException(
-+            //    'unable to locate PHP executable: ' . $this->executable
-+            //);
-         }
-     }
- 
-@@ -69,7 +69,7 @@
- 
-         $error = $this->_process->StdErr->ReadAll();
-         if (!empty($error)) {
-            throw new PHPT_CodeRunner_ExecutionException($error);
-+            throw new PHPT_CodeRunner_ExecutionException($error, $this->_commandFactory());
-         }
- 
-         return $this->_process->StdOut->ReadAll();
-@@ -93,6 +93,7 @@
-     {
-         $return = '';
-         foreach ($this->environment as $key => $value) {
-+            $value = str_replace('&', '^&', $value);
-             $return .= "set {$key}={$value} & ";
-         }
-         return $return;
-Index: src/PHPT/CodeRunner/Factory.php
-===================================================================
--- src/PHPT/CodeRunner/Factory.php	(revision 691)
-+++ src/PHPT/CodeRunner/Factory.php	(working copy)
-@@ -33,7 +33,13 @@
-                 'php-cgi';
-         }
- 
-        if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
-+        if (
-+            strtoupper(substr(PHP_OS, 0, 3)) == 'WIN' &&
-+            (
-+                $runner->executable == 'php' ||
-+                $runner->executable == 'php-cgi'
-+            )
-+        ) {
-             $runner->executable = $runner->executable . '.exe';
-         }
-         try {
-Index: src/PHPT/Section/ModifiableAbstract.php
-===================================================================
--- src/PHPT/Section/ModifiableAbstract.php	(revision 691)
-+++ src/PHPT/Section/ModifiableAbstract.php	(working copy)
-@@ -15,12 +15,10 @@
-     
-     public function run(PHPT_Case $case)
-     {
-        $sections = clone $case->sections;
-        if ($sections->filterByInterface($this->_modifier_name . 'Modifier')->valid()) {
-            $modifyMethod = 'modify' . $this->_modifier_name;
-            foreach ($sections as $section) {
-                $section->$modifyMethod($this);
-            }
-+        $modifiers = $case->sections->filterByInterface($this->_modifier_name . 'Modifier');
-+        $modifyMethod = 'modify' . $this->_modifier_name;
-+        foreach ($modifiers as $section) {
-+            $section->$modifyMethod($this);
-         }
-     }
-     
-Index: src/PHPT/Section/SKIPIF.php
-===================================================================
--- src/PHPT/Section/SKIPIF.php	(revision 691)
-+++ src/PHPT/Section/SKIPIF.php	(working copy)
-@@ -3,10 +3,12 @@
- class PHPT_Section_SKIPIF implements PHPT_Section_RunnableBefore
- {
-     private $_data = null;
-+    private $_runner_factory = null;
-     
-     public function __construct($data)
-     {
-         $this->_data = $data;
-+        $this->_runner_factory = new PHPT_CodeRunner_Factory();
-     }
-     
-     public function run(PHPT_Case $case)
-@@ -16,9 +18,7 @@
-         
-         // @todo refactor to PHPT_CodeRunner
-         file_put_contents($filename, $this->_data);
-        $response = array();
-        exec('php -f ' . $filename, $response);
-        $response = implode("\n", $response);
-+        $response = $this->_runner_factory->factory($case)->run($filename)->output;
-         unlink($filename);
-         
-         if (preg_match('/^skip( - (.*))?/', $response, $matches)) {
-Index: src/PHPT/SectionList.php
-===================================================================
--- src/PHPT/SectionList.php	(revision 691)
-+++ src/PHPT/SectionList.php	(working copy)
-@@ -2,7 +2,6 @@
- 
- class PHPT_SectionList implements Iterator
- {
-    private $_raw_sections = array();
-     private $_sections = array();
-     private $_section_map = array();
-     private $_key_map = array();
-@@ -15,14 +14,12 @@
-             }
-             $name = strtoupper(str_replace('PHPT_Section_', '', get_class($section)));
-             $key = $section instanceof PHPT_Section_Runnable ? $section->getPriority() . '.' . $name : $name;
-            $this->_raw_sections[$key] = $section;
-+            $this->_sections[$key] = $section;
-             $this->_section_map[$name] = $key;
-             $this->_key_map[$key] = $name;
-         }
-         
-        ksort($this->_raw_sections);
-        
-        $this->_sections = $this->_raw_sections;
-+        ksort($this->_sections);
-     }
-     
-     public function current()
-@@ -52,21 +49,23 @@
-     
-     public function filterByInterface($interface = null)
-     {
-+        $ret = new PHPT_SectionList();
-+        
-         if (is_null($interface)) {
-            $this->_sections = $this->_raw_sections;
-            return $this;
-+            $ret->_sections = $this->_sections;
-+            return $ret;
-         }
-         
-         $full_interface = 'PHPT_Section_' . $interface;
-        $this->_sections = array();
-        foreach ($this->_raw_sections as $name => $section) {
-+        $ret->_sections = array();
-+        foreach ($this->_sections as $name => $section) {
-             if (!$section instanceof $full_interface) {
-                 continue;
-             }
-            $this->_sections[$name] = $section;
-+            $ret->_sections[$name] = $section;
-         }
-         
-        return $this;
-+        return $ret;
-     }
-     
-     public function has($name)
-@@ -74,11 +73,11 @@
-         if (!isset($this->_section_map[$name])) {
-             return false;
-         }
-        return isset($this->_raw_sections[$this->_section_map[$name]]);
-+        return isset($this->_sections[$this->_section_map[$name]]);
-     }
-     
-     public function __get($key)
-     {
-        return $this->_raw_sections[$this->_section_map[$key]];
-+        return $this->_sections[$this->_section_map[$key]];
-     }
- }
-Index: tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt
-===================================================================
--- tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt	(revision 691)
-+++ tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt	(working copy)
-@@ -17,9 +17,9 @@
- 
- // sanity check
- $obj = new FoobarIni();
-assert('(string)$obj == " -d display_errors=1 "');
-+assert('(string)$obj == " -d \"display_errors=1\" "');
- $obj->display_errors = 0;
-assert('(string)$obj == " -d display_errors=0 "');
-+assert('(string)$obj == " -d \"display_errors=0\" "');
- unset($obj);
- 
- 
-Index: tests/Section/File/restores-case-sections.phpt
-===================================================================
--- tests/Section/File/restores-case-sections.phpt	(revision 691)
-+++ tests/Section/File/restores-case-sections.phpt	(working copy)
-@@ -1,24 +0,0 @@
---TEST--
-After PHPT_Section_FILE::run(), the sections property of the provide $case object
-is restored to its unfiltered state
---FILE--
-<?php
-
-require_once dirname(__FILE__) . '/../../_setup.inc';
-require_once dirname(__FILE__) . '/../_simple-test-case.inc';
-require_once dirname(__FILE__) . '/_simple-file-modifier.inc';
-
-$case = new PHPT_SimpleTestCase();
-$case->sections = new PHPT_SectionList(array(
-    new PHPT_Section_ARGS('foo=bar'),
-));
-
-$section = new PHPT_Section_FILE('hello world');
-$section->run($case);
-
-assert('$case->sections->valid()');
-
-?>
-===DONE===
---EXPECT--
-===DONE===
-Index: tests/SectionList/filter-by-interface.phpt
-===================================================================
--- tests/SectionList/filter-by-interface.phpt	(revision 691)
-+++ tests/SectionList/filter-by-interface.phpt	(working copy)
-@@ -17,10 +17,10 @@
- 
- $data = array_merge($runnable, $non_runnable);
- $list = new PHPT_SectionList($data);
-$list->filterByInterface('Runnable');
-assert('$list->valid()');
-$list->filterByInterface('EnvModifier');
-assert('$list->valid() == false');
-+$runnable = $list->filterByInterface('Runnable');
-+assert('$runnable->valid()');
-+$env_modifier = $list->filterByInterface('EnvModifier');
-+assert('$env_modifier->valid() == false');
- 
- ?>
- ===DONE===
-Index: tests/SectionList/filter-resets-with-null.phpt
-===================================================================
--- tests/SectionList/filter-resets-with-null.phpt	(revision 691)
-+++ tests/SectionList/filter-resets-with-null.phpt	(working copy)
-@@ -1,36 +0,0 @@
---TEST--
-If you call filterByInterface() with null or no-value, the full dataset is restored
---FILE--
-<?php
-
-require_once dirname(__FILE__) . '/../_setup.inc';
-
-$runnable = array(
-    'ENV' => new PHPT_Section_ENV(''),
-    'CLEAN' => new PHPT_Section_CLEAN(''),
-);
-
-class PHPT_Section_FOO implements PHPT_Section { }
-$non_runnable = array(
-    'FOO' => new PHPT_Section_FOO(), 
-);
-
-$data = array_merge($runnable, $non_runnable);
-$list = new PHPT_SectionList($data);
-$list->filterByInterface('Runnable');
-
-// sanity check
-foreach ($list as $key => $value) {
-    assert('$runnable[$key] == $value');
-}
-
-$list->filterByInterface();
-
-foreach ($list as $key => $value) {
-    assert('$data[$key] == $value');
-}
-
-?>
-===DONE===
---EXPECT--
-===DONE===
-Index: tests/Util/Code/runAsFile-executes-in-file.phpt
-===================================================================
--- tests/Util/Code/runAsFile-executes-in-file.phpt	(revision 691)
-+++ tests/Util/Code/runAsFile-executes-in-file.phpt	(working copy)
-@@ -10,7 +10,7 @@
- 
- $util = new PHPT_Util_Code($code);
- 
-$file = dirname(__FILE__) . '/foobar.php';
-+$file = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'foobar.php';
- $result = $util->runAsFile($file);
- 
- assert('$result == $file');
-Index: tests/Util/Code/runAsFile-returns-output-if-no-return.phpt
-===================================================================
--- tests/Util/Code/runAsFile-returns-output-if-no-return.phpt	(revision 691)
-+++ tests/Util/Code/runAsFile-returns-output-if-no-return.phpt	(working copy)
-@@ -9,7 +9,7 @@
- 
- $util = new PHPT_Util_Code($code);
- 
-$file = dirname(__FILE__) . '/foobar.php';
-+$file = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'foobar.php';
- $result = $util->runAsFile($file);
- 
- assert('$result == $file');
--- a/maintenance/regenerate-docs.sh
+++ b/maintenance/regenerate-docs.sh
@@ -0,0 +1,5 @@
+#!/bin/bash -e
+./compile-doxygen.sh
+cd ../docs
+scp doxygen.tgz htmlpurifier.org:/home/ezyang/htmlpurifier.org
+ssh htmlpurifier.org "cd /home/ezyang/htmlpurifier.org && ./reload-docs.sh"
--- a/package.php
+++ b/package.php
@@ -10,7 +10,7 @@ $pkg = new PEAR_PackageFileManager2;
 $pkg->setOptions(
    array(
        'baseinstalldir' => '/',
-        'packagefile' => 'package2.xml',
+        'packagefile' => 'package.xml',
        'packagedirectory' => realpath(dirname(__FILE__) . '/library'),
        'filelistgenerator' => 'file',
        'include' => array('*'),
@@ -56,8 +56,6 @@ $pkg->setPearinstallerDep('1.4.3');

 $pkg->generateContents();

-$compat =& $pkg->exportCompatiblePackageFile1();
-$compat->writePackageFile();
 $pkg->writePackageFile();

 // vim: et sw=4 sts=4
--- a/plugins/phorum/Changelog
+++ b/plugins/phorum/Changelog
@@ -9,7 +9,8 @@ Changelog                                         HTMLPurifier : Phorum Mod
    . Internal change
 ==========================

-Version 3.0.0.1 for Phorum 5.2, unknown release date
+Version 4.0.0 for Phorum 5.2, released July 9, 2009
+# Works only with HTML Purifier 4.0.0
 ! Better installation documentation
 - Fixed double encoded quotes
 - Fixed fatal error when migrate.php is blank
--- a/plugins/phorum/INSTALL
+++ b/plugins/phorum/INSTALL
@@ -2,6 +2,11 @@
 Install
    How to install the Phorum HTML Purifier plugin

+0. PREREQUISITES
+----------------
+This Phorum module only works on PHP5 and with HTML Purifier 4.0.0
+or later.
+
 1. UNZIP
 --------
 Unzip phorum-htmlpurifier-x.y.z, producing an htmlpurifier folder.
--- a/plugins/phorum/htmlpurifier.php
+++ b/plugins/phorum/htmlpurifier.php
@@ -17,7 +17,7 @@
 * administrators who need to edit other people's comments may be at
 * risk for some nasty attacks.
 *
- * Tested with Phorum 5.2.6.
+ * Tested with Phorum 5.2.11.
 */

 // Note: Cache data is base64 encoded because Phorum insists on flinging
--- a/plugins/phorum/info.txt
+++ b/plugins/phorum/info.txt
@@ -2,7 +2,7 @@ title:   HTML Purifier Phorum Mod
 desc:    This module enables standards-compliant HTML filtering on Phorum. Please check migrate.bbcode.php before enabling this mod.
 author:  Edward Z. Yang
 url:     http://htmlpurifier.org/
-version: 3.0.0
+version: 4.0.0

 hook:  format|phorum_htmlpurifier_format
 hook:  quote|phorum_htmlpurifier_quote
--- a/smoketests/dataScheme.php
+++ b/smoketests/dataScheme.php
@@ -0,0 +1,37 @@
+<?php
+
+require_once 'common.php';
+
+echo '<?xml version="1.0" encoding="UTF-8" ?>';
+?><!DOCTYPE html
+     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+    <title>HTML Purifier data Scheme Smoketest</title>
+    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+</head>
+<body>
+<h1>HTML Purifier data Scheme Smoketest</h1>
+<?php
+
+$string = '<img src="data:image/png;base64,
+iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
+C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
+AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
+REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
+ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
+vr4MkhoXe0rZigAAAABJRU5ErkJggg==" alt="Red dot" />';
+
+$purifier = new HTMLPurifier(array('URI.AllowedSchemes' => 'data'));
+
+?>
+<div><?php
+echo $purifier->purify($string);
+?></div>
+
+</body>
+</html>
+<?php
+
+// vim: et sw=4 sts=4
--- a/smoketests/innerHTML.html
+++ b/smoketests/innerHTML.html
@@ -0,0 +1,33 @@
+<html>
+<head>
+    <title>innerHTML smoketest</title>
+</head>
+<body>
+<!--
+
+What we're going to do is use JavaScript to calculate
+fixpoints of innerHTML parse and reparsing.  We start with
+an input value, encoded in a JavaScript string.
+
+x.innerHTML = input
+
+We then snapshot the DOM state of x, and then perform the
+iteration:
+
+intermediate = x.innerHTML
+x.innerHTML = intermediate
+
+What inputs are we going to test?
+
+We will generate using the following alphabet:
+
+    a01~!@#$%^&*()_+`-=[]\{}|;':",./<>? (and <space>)
+
+
+
+-->
+<textarea id="out" style="width:100%;height:100%;"></textarea>
+<div id="testContainer" style="display:none"></div>
+<script src="innerHTML.js" type="text/javascript"></script>
+</body>
+</html>
--- a/smoketests/innerHTML.js
+++ b/smoketests/innerHTML.js
@@ -0,0 +1,51 @@
+var alphabet = 'a!`=[]\\;\':"/<> &';
+
+var out             = document.getElementById('out');
+var testContainer   = document.getElementById('testContainer');
+
+function print(s) {
+    out.value += s + "\n";
+}
+
+function testImage() {
+    return testContainer.firstChild;
+}
+
+function test(input) {
+    var count = 0;
+    var oldInput, newInput;
+    testContainer.innerHTML = "<img />";
+    testImage().setAttribute("alt", input);
+    print("------");
+    print("Test input: " + input);
+    do {
+        oldInput = testImage().getAttribute("alt");
+        var intermediate = testContainer.innerHTML;
+        print("Render: " + intermediate);
+        testContainer.innerHTML = intermediate;
+        if (testImage() == null) {
+            print("Image disappeared...");
+            break;
+        }
+        newInput = testImage().getAttribute("alt");
+        print("New value: " + newInput);
+        count++;
+    } while (count < 5 && newInput != oldInput);
+    if (count == 5) {
+        print("Failed to achieve fixpoint");
+    }
+    testContainer.innerHTML = "";
+}
+
+print("Go!");
+
+test("`` ");
+test("'' ");
+
+for (var i = 0; i < alphabet.length; i++) {
+    for (var j = 0; j < alphabet.length; j++) {
+        test(alphabet.charAt(i) + alphabet.charAt(j));
+    }
+}
+
+// document.getElementById('out').textContent = alphabet;
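
Review note on innerHTML.js: the `test()` function above re-serializes and re-parses an attribute value until it stops changing, giving up after five rounds. Below is a DOM-free sketch of the same fixpoint search, so the idea can be tried outside a browser. The `findFixpoint` name and the two `roundTrip` callbacks are hypothetical stand-ins for the `innerHTML` round trip; they do not model any real browser's serialization.

```javascript
// Repeatedly apply roundTrip until the value stops changing or the
// round budget is spent. roundTrip stands in for "read innerHTML,
// write it back, read the attribute again".
function findFixpoint(input, roundTrip, maxRounds = 5) {
    let current = input;
    for (let round = 1; round <= maxRounds; round++) {
        const next = roundTrip(current);
        if (next === current) {
            return { stable: true, rounds: round, value: current };
        }
        current = next;
    }
    return { stable: false, rounds: maxRounds, value: current };
}

// Toy round trip that settles: trailing whitespace is dropped once.
const settles = findFixpoint("`` ", (s) => s.trim());
console.log(settles); // { stable: true, rounds: 2, value: '``' }

// Toy round trip that never settles: every pass adds a character.
const diverges = findFixpoint("ab", (s) => s + "x");
console.log(diverges.stable); // false
```

A value that fails to stabilize is the interesting case for a purifier, because output that looks safe after one parse may become something else after the next.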
--- a/smoketests/preserveYouTube.php
+++ b/smoketests/preserveYouTube.php
@@ -15,12 +15,37 @@ echo '<?xml version="1.0" encoding="UTF-8" ?>';
 <h1>HTML Purifier Preserve YouTube Smoketest</h1>
 <?php

-$string = '<object width="425" height="350"><param name="movie" value="http://www.youtube.com/v/BdU--T8rLns"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/BdU--T8rLns" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed></object>';
+$string = '<object width="425" height="350"><param name="movie" value="http://www.youtube.com/v/BdU--T8rLns"></param><param name="wmode" value="transparent"></param><embed src="http://www.youtube.com/v/BdU--T8rLns" type="application/x-shockwave-flash" wmode="transparent" width="425" height="350"></embed></object>
+
+<object width="416" height="337"><param name="movie" value="http://www.youtube.com/cp/vjVQa1PpcFNbP_fag8PvopkXZyiXyT0J8U47lw7x5Fc="></param><embed src="http://www.youtube.com/cp/vjVQa1PpcFNbP_fag8PvopkXZyiXyT0J8U47lw7x5Fc=" type="application/x-shockwave-flash" width="416" height="337"></embed></object>
+
+<object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/uNxBeJNyAqA&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/uNxBeJNyAqA&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object>
+
+<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" height="385" width="480"><param name="width" value="480" /><param name="height" value="385" /><param name="src" value="http://www.youtube.com/p/E37ADDDFCA0FD050&amp;hl=en" /><embed height="385" src="http://www.youtube.com/p/E37ADDDFCA0FD050&amp;hl=en" type="application/x-shockwave-flash" width="480"></embed></object>
+
+<object
+    classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
+    id="ooyalaPlayer_229z0_gbps1mrs" width="630" height="354"
+    codebase="http://fpdownload.macromedia.com/get/flashplayer/current/swflash.cab"><param
+    name="movie" value="http://player.ooyala.com/player.swf?embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za&version=2"
+    /><param name="bgcolor" value="#000000" /><param
+    name="allowScriptAccess" value="always" /><param
+    name="allowFullScreen" value="true" /><param name="flashvars"
+    value="embedType=noscriptObjectTag&embedCode=pteGRrMTpcKMyQ052c8NwYZ5M5FdSV3j"
+    /><embed src="http://player.ooyala.com/player.swf?embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za&version=2"
+    bgcolor="#000000" width="630" height="354"
+    name="ooyalaPlayer_229z0_gbps1mrs" align="middle" play="true"
+    loop="false" allowscriptaccess="always" allowfullscreen="true"
+    type="application/x-shockwave-flash"
+    flashvars="&embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za"
+    pluginspage="http://www.adobe.com/go/getflashplayer"></embed></object>
+';

 $regular_purifier = new HTMLPurifier();

-$youtube_purifier = new HTMLPurifier(array(
-    'Filter.YouTube' => true,
+$safeobject_purifier = new HTMLPurifier(array(
+    'HTML.SafeObject' => true,
+    'Output.FlashCompat' => true,
 ));

 ?>
@@ -35,9 +60,9 @@ if (isset($_GET['break'])) echo $string;
 echo $regular_purifier->purify($string);
 ?></div>

-<h2>With YouTube exception</h2>
+<h2>With SafeObject exception and flash compatibility</h2>
 <div><?php
-echo $youtube_purifier->purify($string);
+echo $safeobject_purifier->purify($string);
 ?></div>

 </body>
--- a/test-settings.sample.php
+++ b/test-settings.sample.php
@@ -37,13 +37,14 @@ $simpletest_location = '/path/to/simpletest/';
 // OPTIONAL SETTINGS

 // Note on running PHPT:
-//      Vanilla PHPT from http://phpt.info will not work, because there are
-//      a number of bugs that prevent HTML Purifier from doing what they need
-//      to do. If you really want to run PHPT, you'll will need to apply the
-//      patches in maintenance/phpt-modifications.patch on the PHPT Core trunk,
-//      which can be checked out using:
+//      Vanilla PHPT from https://github.com/tswicegood/PHPT_Core should
+//      work fine on Linux w/o multitest.
 //
-//        $ svn co https://svn.phpt.info/Core/trunk phpt-core
+//      To do multitest or Windows testing, you'll need some more
+//      patches at https://github.com/ezyang/PHPT_Core
+//
+//      I haven't tested the Windows setup in a while so I don't know if
+//      it still works.

 // Should PHPT tests be enabled?
 $GLOBALS['HTMLPurifierTest']['PHPT'] = false;
Author	SHA1	Message	Date
Edward Z. Yang	f1439f0af5	Release 4.3.0 Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 23:02:49 +01:00
Edward Z. Yang	0124605918	Fix CSS URL innerHTML/cssText escaping bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 21:24:32 +01:00
Edward Z. Yang	afb007d22f	Protect against font family innerHTML/cssText attacks. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 20:35:43 +01:00
Edward Z. Yang	0dd9e4faf4	Fix Internet Explorer innerHTML bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-27 11:50:52 +01:00
Edward Z. Yang	94ed3b1231	Implement CSS.AllowedFonts. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-24 22:54:39 +00:00
Edward Z. Yang	6a6c0ed5d7	Don't autoclose if no parents support the tag. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-22 00:26:41 +00:00
Edward Z. Yang	e05b555448	Safety update for nested ul test. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-21 21:05:23 +00:00
Edward Z. Yang	ee9c70ab7f	Fix E_NOTICE from indexing into empty string. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-03-17 17:33:11 +00:00
Edward Z. Yang	b4469f17aa	Fix missing numeric entities (shows up when DirectLexing). Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-02-27 11:58:37 +00:00
Edward Z. Yang	e76f4b45d0	Dramatically rewrite null host URI handling. Basically, browsers don't parse what should be valid URIs correctly, so we have to go through some backbends to accommodate them. Specifically, for browseable URIs, the following URIs have unintended behavior: - ///example.com - http:/example.com - http:///example.com Furthermore, if the path begins with //, modifying these URLs must be done with care, as if you remove the host-name component, the parse tree changes. I've modified the engine to follow correct URI semantics as much as possible while outputting browser compatible code, and invalidate the URI in cases where we can't deal. There has been a refactoring of URIScheme so that this important check is always performed, introducing a new member variable allow_empty_host which is true on data, file, mailto and news schemes. This also fixes bypass bugs on URI.Munge. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-25 18:56:46 +00:00
Edward Z. Yang	a32d5b52e1	Fix embedding flash on non-IE browsers and allow more wmode. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-22 12:28:57 +00:00
Maxim Krizhanovsky	a3d71fe606	Iterative traversal of DOM. There are some deep DOMs you can hit the maximum nesting level limit in tokenizeDOM (we've experienced this even with maximum nesting level of 300). Here is an iterative version of the same function with simple queue/dequeue approach. Signed-off-by: Maxim Krizhanovsky <darhazer@gmail.com>	2011-01-19 22:06:40 +00:00
Edward Z. Yang	77982bd61d	Bump version number for Cache.SerializerPermissions. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-14 00:40:39 +00:00
Petr Skoda	78c4e62245	Add new Cache.SerializerPermissions option.	2011-01-13 22:57:40 +00:00
Edward Z. Yang	5803c06765	Check that argv is set before operating on it. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2011-01-13 22:42:47 +00:00
Edward Z. Yang	b63569ac22	Fix bad interaction between bootstrap autoloader and Zend Debugger/APC. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-12-31 09:48:28 +00:00
Edward Z. Yang	f3d050c517	Fix two bugs with caching of customized raw definitions. The first bug is that we will repeatedly write out the result of a customized raw definition to the filesystem, even when a cache entry already exists. The second bug is that caching these definitions doesn't actually work (the cache entry is written but never used.) A new API for retrieving raw definitions permits the user to take advantage of caching. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-12-30 23:51:53 +00:00
Edward Z. Yang	6dcc37cb55	Update PHPT instructions. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-21 14:00:20 +00:00
Edward Z. Yang	cfc4ee1faf	Add initial implementation of CSS.Trusted. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 18:45:03 +00:00
Edward Z. Yang	598c5b60c9	Add sanity check against ze1_compatibility_mode. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 16:15:03 +00:00
Edward Z. Yang	c9e7ffc172	Fix incorrect PEARSax3 test assertion. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-11-12 16:06:34 +00:00
Edward Z. Yang	feeffe6ed2	Check if schema.ser was corrupted. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-10-29 14:47:40 +01:00
Edward Z. Yang	4754d407aa	Fix removal of id with DirectLex by preserving armor. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-10-28 17:25:31 +01:00
Nick Pope	0b9db1f54b	Allow non-static autoload methods w/ PHP >= 5.2.11 HTML Purifier loads itself as the first autoload function by unregistering all existing functions and re-registering them after registering itself. Originally an exception was thrown when a non-static object method was encountered as the behaviour of spl_autoload_functions() did not return the object instance, but only the class name. This was filed on PHP bugs (#44144). The bug was fixed for PHP >= 5.2.11 and >= 5.3 Signed-off-by: Nick Pope <nick@nickpope.me.uk> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-10-28 17:25:17 +01:00
Edward Z. Yang	1d4a38d055	Escape CDATA before handling conditional comments. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-28 12:11:26 -04:00
Edward Z. Yang	8c80349f9d	Implement HTML.Nofollow for external links. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-28 12:01:57 -04:00
Edward Z. Yang	d848c99b74	Make IE conditional comment matching ungreedy. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-28 10:22:38 -04:00
Edward Z. Yang	882ffed9ba	Release 4.2.0. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-15 02:52:57 -04:00
Edward Z. Yang	86990a21f1	Rename newline normalization directive to something better. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-15 02:50:39 -04:00
Tomasz Muras	9573f0933d	Make newline normalization optional.	2010-09-14 23:49:28 -04:00
Edward Z. Yang	632bf2bbd4	Shift to 4.2.0 release cycle. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-14 23:38:51 -04:00
Edward Z. Yang	ec86598446	Add support for file:// URI scheme. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-09 00:01:26 -04:00
Edward Z. Yang	b6c3f5e89b	Update TODO. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-08 23:42:05 -04:00
Edward Z. Yang	7c91104532	Implement HTML.FlashAllowFullScreen. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-08 23:39:20 -04:00
Edward Z. Yang	eac628f490	Add %CSS.ForbiddenProperties directive. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-04 02:59:03 -04:00
Edward Z. Yang	92913bc816	Add documentation about configuration directive types. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-04 02:28:53 -04:00
Edward Z. Yang	479d793562	Reword documentation to be clearer, and give warning on common user error. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-09-04 01:31:20 -04:00
Edward Z. Yang	e2c15f1c98	Fix Mac Snow Leopard APC bug. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-08-26 21:40:58 -07:00
Edward Z. Yang	57ced3f361	Tighten up ignore spec. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-30 06:00:45 -07:00
Edward Z. Yang	c04a441b3e	Actually make URI.DisableResources do something. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-30 05:59:17 -07:00
Edward Z. Yang	1bed8b6d5f	Added %Core.RemoveProcessingInstructions. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-20 18:26:44 -07:00
Edward Z. Yang	33afd7d9e0	Fix improper handling of IE conditional comments. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-06-18 06:08:54 -07:00
Edward Z. Yang	18e538317a	Release 4.1.1. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 20:17:31 -07:00
Edward Z. Yang	96a4193fc9	Fix undefined index warnings in maintenance scripts. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 20:07:27 -07:00
Edward Z. Yang	00c66fa9cb	Fix bug in parsing single attribute with entities. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 19:44:18 -07:00
Edward Z. Yang	d3abcb90e3	Rewrite CSS url() and font-family output logic. The new logic is as follows: * Given a URL to insert into url(), check that it is properly URL encoded (in particular, a doublequote and backslash never occurs within it) and then place it as url("http://example.com"). * Given a font name, if it is strictly alphanumeric, it is safe to omit quotes. Otherwise, wrap in double quotes and replace '"' with '\22 ' (note trailing space) and '\' with '\5C ' (ditto). We introduce expandCSSEscape() which is a hack for common parsing idioms in CSS; this means that CSS escapes are now recognized inside URLs as well as unquoted font names. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-31 18:45:21 -07:00
Edward Z. Yang	df3100b1b3	Make test script less chatty when log_errors is on. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-20 21:50:44 -04:00
Edward Z. Yang	143e1ad718	Remove shebang and +x from test script. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-20 21:21:26 -04:00
Edward Z. Yang	875b0febde	Fix infinite loop involving wrapping formedness. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-17 23:22:51 -04:00
Edward Z. Yang	3166b8a10f	Fix bug in background-position with center keyword. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-05 15:08:57 -04:00
Edward Z. Yang	1a70bffd5a	Emit errors when body is extracted. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-05-04 13:41:09 -04:00
Edward Z. Yang	f4c6e10ff7	Release 4.1.0. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-26 18:31:40 -04:00
Edward Z. Yang	c1cbd9e565	Mute STRICT errors from CSSTidy and don't run PEARSax3 on PHP 5.3. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-26 18:27:32 -04:00
Edward Z. Yang	da94d3d6ac	Always quote the contents of url() in CSS. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-26 12:10:15 -04:00
Edward Z. Yang	80793e925e	Remove +x bit from RemoveSpansWithoutAttributes.php Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-04-17 00:23:09 -04:00
Edward Z. Yang	8ef4fb22db	Support for flashvars in HTML.SafeEmbed. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-30 13:33:13 -04:00
Edward Z. Yang	70a7a3f5dd	Handle <ol><ol> properly by adding missing <li> tag. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-10 00:58:37 -05:00
Edward Z. Yang	4d612d5a77	Improve handling of malformed object parameters. When specifying source material for <object> tags, you must use data inside the object tag as well as specify movie in a param. If you specify a src (which is the appropriate markup for <embed>) we now convert and fill in the other attributes appropriately. Also, fix a PHP warning in Generator code. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-09 17:29:38 -05:00
Edward Z. Yang	63a854ee5d	Remove call-time pass-by-reference. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-08 03:45:11 -05:00
Edward Z. Yang	0229458f8f	Implement Internet Explorer compatibility code for embedded content. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-08 01:56:40 -05:00
Edward Z. Yang	baa477ac08	Truncate alt text from src if it's too long. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-08 01:22:21 -05:00
Edward Z. Yang	dc90e8e85b	Support flashvars. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-08 01:16:57 -05:00
Edward Z. Yang	97125ed18b	Implement data URI scheme. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-07 21:45:39 -05:00
Paul Stone	9a9036c689	Implement auto-formatter that removes empty span tags. Signed-off-by: Paul Stone <patches@pdjs.co.uk> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-07 18:59:33 -05:00
Edward Z. Yang	aea7d02dfe	Support YouTube slideshow embedding. YouTube slideshows contain a /cp/, not a /v/, in their URL; relax the YouTube filter to allow them. Signed-off-by: Nigel McNie <nigel@catalyst.net.nz> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-03-07 18:57:22 -05:00
Brian DeRocher	b3ca1498c2	Add boolean value flag for PEARSax3 for testing if a token is empty. Signed-off-by: Brian DeRocher <brian@derocher.org> Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-26 21:36:51 -05:00
Edward Z. Yang	ac18672aba	Fix extant broken PEARSax3 parsing patterns. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-26 21:14:52 -05:00
Edward Z. Yang	faf28682ad	Manually work around PEARSax3 E_STRICT errors. Previously, my development environment was not running the PEARSax3 tests because my environment was set to E_STRICT error handling, and thus the tests were skipped. Relax this requirement by making the wrapper class E_STRICT safe. This introduces a few failing tests. Also update TODO and add another fresh test. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-26 20:42:42 -05:00
Edward Z. Yang	e2cd852bcf	Add shebang line to tests index script. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-15 02:55:43 -05:00
Edward Z. Yang	694583259c	Fix autoparagraph bug with non-inline elements. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2010-02-15 02:55:33 -05:00
Edward Z. Yang	bde4de3c78	Update TODO. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-08-27 20:17:41 -04:00
Edward Z. Yang	5b4e5c983e	Support proprietary height attribute on table. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-08-27 20:17:24 -04:00
Edward Z. Yang	1ad8fd5ce9	Gracefully deal with null injectors. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-08-27 20:03:31 -04:00
Edward Z. Yang	6bdf161afd	Update TODO. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-15 14:50:52 -04:00
Edward Z. Yang	af45a6c191	Release Phorum module 4.0.0. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-09 21:12:35 -04:00
Edward Z. Yang	2b72d0445f	Add 4.1.0 release NEWS entry. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-09 21:03:46 -04:00
Edward Z. Yang	d7b3117678	Add doxygen doc scripts, and fix package.php Signed-off-by: Edward Z. Yang <ezyang@mit.edu>	2009-07-08 22:11:15 -04:00
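The url() and font-family output rules described in commit d3abcb90e3 above can be sketched as follows. This is an illustrative Python sketch of the rules as the commit message states them, not HTML Purifier's actual PHP code; the function names `quote_font_name` and `css_url` are hypothetical, and "strictly alphanumeric" is assumed to mean ASCII letters and digits only.

```python
import re
from typing import Optional


def quote_font_name(name: str) -> str:
    """Return a font name in a form safe to emit in font-family (hypothetical helper)."""
    # Strictly alphanumeric names (assumed: ASCII letters and digits) need no quotes.
    if re.fullmatch(r"[A-Za-z0-9]+", name):
        return name
    # Escape backslashes first, so the backslash introduced by the
    # quote escape below is not escaped a second time.
    escaped = name.replace("\\", "\\5C ")
    # Replace double quotes with the CSS escape '\22 ' (note the trailing space).
    escaped = escaped.replace('"', "\\22 ")
    return '"' + escaped + '"'


def css_url(uri: str) -> Optional[str]:
    """Wrap an already URL-encoded URI in url("..."), or return None (hypothetical helper)."""
    # A properly URL-encoded URI never contains a raw double quote or backslash;
    # this sketch rejects such input instead of trying to repair it.
    if '"' in uri or "\\" in uri:
        return None
    return 'url("' + uri + '")'


if __name__ == "__main__":
    print(quote_font_name("Arial"))
    print(quote_font_name("Times New Roman"))
    print(css_url("http://example.com"))
```

The trailing space after each escape terminates the hexadecimal CSS escape, so a following character that happens to be a hex digit is not read as part of it.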
@@ -1 +1 @@
-4.0.0
+4.3.0