1
0
mirror of https://github.com/ezyang/htmlpurifier.git synced 2025-08-03 12:47:56 +02:00

Compare commits

...

51 Commits

Author SHA1 Message Date
Edward Z. Yang
f1439f0af5 Release 4.3.0
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-27 23:02:49 +01:00
Edward Z. Yang
0124605918 Fix CSS URL innerHTML/cssText escaping bug.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-27 21:24:32 +01:00
Edward Z. Yang
afb007d22f Protect against font family innerHTML/cssText attacks.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-27 20:35:43 +01:00
Edward Z. Yang
0dd9e4faf4 Fix Internet Explorer innerHTML bug.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-27 11:50:52 +01:00
Edward Z. Yang
94ed3b1231 Implement CSS.AllowedFonts.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-24 22:54:39 +00:00
Edward Z. Yang
6a6c0ed5d7 Don't autoclose if no parents support the tag.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-22 00:26:41 +00:00
Edward Z. Yang
e05b555448 Safety update for nested ul test.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-21 21:05:23 +00:00
Edward Z. Yang
ee9c70ab7f Fix E_NOTICE from indexing into empty string.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-03-17 17:33:11 +00:00
Edward Z. Yang
b4469f17aa Fix missing numeric entities (shows up when DirectLexing).
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-02-27 11:58:37 +00:00
Edward Z. Yang
e76f4b45d0 Dramatically rewrite null host URI handling.
Basically, browsers don't parse what should be valid URIs correctly, so
we have to go through some backbends to accomodate them.  Specifically,
for browseable URIs, the following URIs have unintended behavior:

    - ///example.com
    - http:/example.com
    - http:///example.com

Furthermore, if the path begins with //, modifying these URLs must
be done with care, as if you remove the host-name component, the
parse tree changes.

I've modified the engine to follow correct URI semantics as much
as possible while outputting browser compatible code, and invalidate
the URI in cases where we can't deal.  There has been a refactoring
of URIScheme so that this important check is always performed,
introducing a new member variable allow_empty_host which is true
on data, file, mailto and news schemes.

This also fixes bypass bugs on URI.Munge.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-01-25 18:56:46 +00:00
Edward Z. Yang
a32d5b52e1 Fix embedding flash on non-IE browsers and allow more wmode.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-01-22 12:28:57 +00:00
Maxim Krizhanovsky
a3d71fe606 Iterative traversal of DOM.
There are some deep DOMs you can hit the maximum nesting level
limit in tokenizeDOM (we've experienced this even with maximum nesting
level of 300). Here is an iterative version of the same function with
simple queue/dequeue approach.

Signed-off-by: Maxim Krizhanovsky <darhazer@gmail.com>
2011-01-19 22:06:40 +00:00
Edward Z. Yang
77982bd61d Bump version number for Cache.SerializerPermissions.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-01-14 00:40:39 +00:00
Petr Skoda
78c4e62245 Add new Cache.SerializerPermissions option. 2011-01-13 22:57:40 +00:00
Edward Z. Yang
5803c06765 Check that argv is set before operating on it.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2011-01-13 22:42:47 +00:00
Edward Z. Yang
b63569ac22 Fix bad interaction between bootstrap autoloader and Zend Debugger/APC.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-12-31 09:48:28 +00:00
Edward Z. Yang
f3d050c517 Fix two bugs with caching of customized raw definitions.
The first bug is that we will repeatedly write out the result
of a customized raw definition to the filesystem, even when a cache
entry already exists.

The second bug is that caching these definitions doesn't actually
work (the cache entry is written but never used.)  A new API
for retrieving raw definitions permits the user to take advantage
of caching.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-12-30 23:51:53 +00:00
Edward Z. Yang
6dcc37cb55 Update PHPT instructions.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-11-21 14:00:20 +00:00
Edward Z. Yang
cfc4ee1faf Add initial implementation of CSS.Trusted.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-11-12 18:45:03 +00:00
Edward Z. Yang
598c5b60c9 Add sanity check against ze1_compatibility_mode.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-11-12 16:15:03 +00:00
Edward Z. Yang
c9e7ffc172 Fix incorrect PEARSax3 test assertion.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-11-12 16:06:34 +00:00
Edward Z. Yang
feeffe6ed2 Check if schema.ser was corrupted.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-10-29 14:47:40 +01:00
Edward Z. Yang
4754d407aa Fix removal of id with DirectLex by preserving armor.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-10-28 17:25:31 +01:00
Nick Pope
0b9db1f54b Allow non-static autoload methods w/ PHP >= 5.2.11
HTML Purifier loads itself as the first autoload function by
unregistering all existing functions and re-registering them after
registering itself.

Originally an exception was thrown when a non-static object method was
encountered as the behaviour of spl_autoload_functions() did not return
the object instance, but only the class name.  This was filed on PHP
bugs (#44144).

The bug was fixed for PHP >= 5.2.11 and >= 5.3

Signed-off-by: Nick Pope <nick@nickpope.me.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-10-28 17:25:17 +01:00
Edward Z. Yang
1d4a38d055 Escape CDATA before handling conditional comments.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-28 12:11:26 -04:00
Edward Z. Yang
8c80349f9d Implement HTML.Nofollow for external links.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-28 12:01:57 -04:00
Edward Z. Yang
d848c99b74 Make IE conditional comment matching ungreedy.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-28 10:22:38 -04:00
Edward Z. Yang
882ffed9ba Release 4.2.0.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-15 02:52:57 -04:00
Edward Z. Yang
86990a21f1 Rename newline normalization directive to something better.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-15 02:50:39 -04:00
Tomasz Muras
9573f0933d Make newline normalization optional. 2010-09-14 23:49:28 -04:00
Edward Z. Yang
632bf2bbd4 Shift to 4.2.0 release cycle.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-14 23:38:51 -04:00
Edward Z. Yang
ec86598446 Add support for file:// URI scheme.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-09 00:01:26 -04:00
Edward Z. Yang
b6c3f5e89b Update TODO.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-08 23:42:05 -04:00
Edward Z. Yang
7c91104532 Implement HTML.FlashAllowFullScreen.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-08 23:39:20 -04:00
Edward Z. Yang
eac628f490 Add %CSS.ForbiddenProperties directive.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-04 02:59:03 -04:00
Edward Z. Yang
92913bc816 Add documentation about configuration directive types.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-04 02:28:53 -04:00
Edward Z. Yang
479d793562 Reword documentation to be clearer, and give warning on common user error.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-09-04 01:31:20 -04:00
Edward Z. Yang
e2c15f1c98 Fix Mac Snow Leopard APC bug.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-08-26 21:40:58 -07:00
Edward Z. Yang
57ced3f361 Tighten up ignore spec.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-30 06:00:45 -07:00
Edward Z. Yang
c04a441b3e Actually make URI.DisableResources do something.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-30 05:59:17 -07:00
Edward Z. Yang
1bed8b6d5f Added %Core.RemoveProcessingInstructions.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-20 18:26:44 -07:00
Edward Z. Yang
33afd7d9e0 Fix improper handling of IE conditional comments.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-18 06:08:54 -07:00
Edward Z. Yang
18e538317a Release 4.1.1.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 20:17:31 -07:00
Edward Z. Yang
96a4193fc9 Fix undefined index warnings in maintenance scripts.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 20:07:27 -07:00
Edward Z. Yang
00c66fa9cb Fix bug in parsing single attribute with entities.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 19:44:18 -07:00
Edward Z. Yang
d3abcb90e3 Rewrite CSS url() and font-family output logic.
The new logic is as follows:

* Given a URL to insert into url(), check that it is properly URL
  encoded (in particular, a doublequote and backslash never occurs
  within it) and then place it as url("http://example.com").

* Given a font name, if it is strictly alphanumeric, it is safe to omit
  quotes. Otherwise, wrap in double quotes and replace '"' with '\22 '
  (note trailing space) and '\' with '\5C ' (ditto).

We introduce expandCSSEscape() which is a hack for common parsing
idioms in CSS; this means that CSS escapes are now recognized inside
URLs as well as unquoted font names.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 18:45:21 -07:00
Edward Z. Yang
df3100b1b3 Make test script less chatty when log_errors is on.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-20 21:50:44 -04:00
Edward Z. Yang
143e1ad718 Remove shebang and +x from test script.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-20 21:21:26 -04:00
Edward Z. Yang
875b0febde Fix infinite loop involving wrapping formedness.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-17 23:22:51 -04:00
Edward Z. Yang
3166b8a10f Fix bug in background-position with center keyword.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-05 15:08:57 -04:00
Edward Z. Yang
1a70bffd5a Emit errors when body is extracted.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-04 13:41:09 -04:00
117 changed files with 2129 additions and 794 deletions

2
.gitignore vendored
View File

@@ -18,3 +18,5 @@ docs/doxygen*
*.phpt.php
*.phpt.skip.php
*.htmlt.ini
*.patch
/*.php

View File

@@ -31,7 +31,7 @@ PROJECT_NAME = HTMLPurifier
# This could be handy for archiving the generated documentation or
# if some version control system is used.
PROJECT_NUMBER = 4.1.0
PROJECT_NUMBER = 4.3.0
# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
# base path where the generated documentation will be put.

View File

@@ -18,6 +18,7 @@ with these contents.
HTML Purifier is PHP 5 only, and is actively tested from PHP 5.0.5 and
up. It has no core dependencies with other libraries. PHP
4 support was deprecated on December 31, 2007 with HTML Purifier 3.0.0.
HTML Purifier is not compatible with zend.ze1_compatibility_mode.
These optional extensions can enhance the capabilities of HTML Purifier:

86
NEWS
View File

@@ -9,6 +9,92 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
. Internal change
==========================
4.3.0, released 2011-03-27
# Fixed broken caching of customized raw definitions, but requires an
API change. The old API still works but will emit a warning,
see http://htmlpurifier.org/docs/enduser-customize.html#optimized
for how to upgrade your code.
# Protect against Internet Explorer innerHTML behavior by specially
treating attributes with backticks but no angled brackets, quotes or
spaces. This constitutes a slight semantic change, which can be
reverted using %Output.FixInnerHTML. Reported by Neike Taika-Tessaro
and Mario Heiderich.
# Protect against cssText/innerHTML by restricting allowed characters
used in fonts further than mandated by the specification and encoding
some extra special characters in URLs. Reported by Neike
Taika-Tessaro and Mario Heiderich.
! Added %HTML.Nofollow to add rel="nofollow" to external links.
! More types of SPL autoloaders allowed on later versions of PHP.
! Implementations for position, top, left, right, bottom, z-index
when %CSS.Trusted is on.
! Add %Cache.SerializerPermissions option for custom serializer
directory/file permissions
! Fix longstanding bug in Flash support for non-IE browsers, and
allow more wmode attributes.
! Add %CSS.AllowedFonts to restrict permissible font names.
- Switch to an iterative traversal of the DOM, which prevents us
from running out of stack space for deeply nested documents.
Thanks Maxim Krizhanovsky for contributing a patch.
- Make removal of conditional IE comments ungreedy; thanks Bernd
for reporting.
- Escape CDATA before removing Internet Explorer comments.
- Fix removal of id attributes under certain conditions by ensuring
armor attributes are preserved when recreating tags.
- Check if schema.ser was corrupted.
- Check if zend.ze1_compatibility_mode is on, and error out if it is.
This safety check is only done for HTMLPurifier.auto.php; if you
are using standalone or the specialized includes files, you're
expected to know what you're doing.
- Stop repeatedly writing the cache file after I'm done customizing a
raw definition. Reported by ajh.
- Switch to using require_once in the Bootstrap to work around bad
interaction with Zend Debugger and APC. Reported by Antonio Parraga.
- Fix URI handling when hostname is missing but scheme is present.
Reported by Neike Taika-Tessaro.
- Fix missing numeric entities on DirectLex; thanks Neike Taika-Tessaro
for reporting.
- Fix harmless notice from indexing into empty string. Thanks Matthijs
Kooijman <matthijs@stdin.nl> for reporting.
- Don't autoclose no parent elements are able to support the element
that triggered the autoclose. In particular fixes strange behavior
of stray <li> tags. Thanks pkuliga@gmail.com for reporting and
Neike Taika-Tessaro <pinkgothic@gmail.com> for debugging assistance.
4.2.0, released 2010-09-15
! Added %Core.RemoveProcessingInstructions, which lets you remove
<? ... ?> statements.
! Added %URI.DisableResources functionality; the directive originally
did nothing. Thanks David Rothstein for reporting.
! Add documentation about configuration directive types.
! Add %CSS.ForbiddenProperties configuration directive.
! Add %HTML.FlashAllowFullScreen to permit embedded Flash objects
to utilize full-screen mode.
! Add optional support for the <code>file</code> URI scheme, enable
by explicitly setting %URI.AllowedSchemes.
! Add %Core.NormalizeNewlines options to allow turning off newline
normalization.
- Fix improper handling of Internet Explorer conditional comments
by parser. Thanks zmonteca for reporting.
- Fix missing attributes bug when running on Mac Snow Leopard and APC.
Thanks sidepodcast for the fix.
- Warn if an element is allowed, but an attribute it requires is
not allowed.
4.1.1, released 2010-05-31
- Fix undefined index warnings in maintenance scripts.
- Fix bug in DirectLex for parsing elements with a single attribute
with entities.
- Rewrite CSS output logic for font-family and url(). Thanks Mario
Heiderich <mario.heiderich@googlemail.com> for reporting and Takeshi
Terada <t-terada@violet.plala.or.jp> for suggesting the fix.
- Emit an error for CollectErrors if a body is extracted
- Fix bug where in background-position for center keyword handling.
- Fix infinite loop when a wrapper element is inserted in a context
where it's not allowed. Thanks Lars <lars@renoz.dk> for reporting.
- Remove +x bit and shebang from index.php; only supported mode is to
explicitly call it with php.
- Make test script less chatty when log_errors is on.
4.1.0, released 2010-04-26
! Support proprietary height attribute on table element
! Support YouTube slideshows that contain /cp/ in their URL.

5
TODO
View File

@@ -20,11 +20,14 @@ Things to do as soon as possible:
debugging
- Allowed/Allowed* have strange interactions when both set
- Transform lone embeds into object tags
- Deprecated config options that emit warnings when you set them (with'
a way of muting the warning if you really want to)
- Make HTML.Trusted work with Output.FlashCompat
FUTURE VERSIONS
---------------
4.2 release [OMG CONFIG PONIES]
4.4 release [OMG CONFIG PONIES]
! Fix Printer. It's from the old days when we didn't have decent XML classes
! Factor demo.php into a set of Printer classes, and then create a stub
file for users here (inside the actual HTML Purifier library)

View File

@@ -1 +1 @@
4.1.0
4.3.0

View File

@@ -1,6 +1,8 @@
HTML Purifier 4.1 is a major security release that fixes an XSS
vulnerability exploitable on Internet Explorer. It also contains
a number of new features, including dramatically more flexible Flash
support, including %Output.FlashCompat to replace %HTML.SafeEmbed,
optional support for the data: URI scheme and better HTML parsing
capabilities.
HTML Purifier 4.3.0 is a major security release addressing various
security vulnerabilities related to user-submitted code and legitimate
client-side scripts. It also contains an accumulation of new features
and bugfixes over half a year. New configuration options include
%CSS.Trusted, %CSS.AllowedFonts and %Cache.SerializerPermissions.
There is a backwards-incompatible API change for customized raw
definitions, see <http://htmlpurifier.org/docs/enduser-customize.html#optimized>
for details.

View File

@@ -40,12 +40,26 @@
</xsl:apply-templates>
</ul>
</div>
<div id="typesContainer">
<h2>Types</h2>
<xsl:apply-templates select="$typeLookup" mode="types" />
</div>
<xsl:apply-templates />
</div>
</body>
</html>
</xsl:template>
<xsl:template match="type" mode="types">
<div class="type-block">
<xsl:attribute name="id">type-<xsl:value-of select="@id" /></xsl:attribute>
<h3><code><xsl:value-of select="@id" /></code>: <xsl:value-of select="@name" /></h3>
<div class="type-description">
<xsl:copy-of xmlns:xhtml="http://www.w3.org/1999/xhtml" select="xhtml:div/node()" />
</div>
</div>
</xsl:template>
<xsl:template match="title" mode="toc" />
<xsl:template match="namespace" mode="toc">
<xsl:param name="overflowNumber" />
@@ -192,10 +206,13 @@
<td>
<xsl:variable name="type" select="text()" />
<xsl:attribute name="class">type type-<xsl:value-of select="$type" /></xsl:attribute>
<xsl:value-of select="$typeLookup/type[@id=$type]/text()" />
<xsl:if test="@allow-null='yes'">
(or null)
</xsl:if>
<a>
<xsl:attribute name="href">#type-<xsl:value-of select="$type" /></xsl:attribute>
<xsl:value-of select="$typeLookup/type[@id=$type]/@name" />
<xsl:if test="@allow-null='yes'">
(or null)
</xsl:if>
</a>
</td>
</tr>
</xsl:template>

View File

@@ -1,16 +1,68 @@
<?xml version="1.0" encoding="UTF-8"?>
<types>
<type id="string">String</type>
<type id="istring">Case-insensitive string</type>
<type id="text">Text</type>
<type id="itext">Case-insensitive text</type>
<type id="int">Integer</type>
<type id="float">Float</type>
<type id="bool">Boolean</type>
<type id="lookup">Lookup array</type>
<type id="list">Array list</type>
<type id="hash">Associative array</type>
<type id="mixed">Mixed</type>
<type id="string" name="String"><div xmlns="http://www.w3.org/1999/xhtml">
A <a
href="http://docs.php.net/manual/en/language.types.string.php">sequence
of characters</a>.
</div></type>
<type id="istring" name="Case-insensitive string"><div xmlns="http://www.w3.org/1999/xhtml">
A series of case-insensitive characters. Internally, upper-case
ASCII characters will be converted to lower-case.
</div></type>
<type id="text" name="Text"><div xmlns="http://www.w3.org/1999/xhtml">
A series of characters that may contain newlines. Text tends to
indicate human-oriented text, as opposed to a machine format.
</div></type>
<type id="itext" name="Case-insensitive text"><div xmlns="http://www.w3.org/1999/xhtml">
A series of case-insensitive characters that may contain newlines.
</div></type>
<type id="int" name="Integer"><div xmlns="http://www.w3.org/1999/xhtml">
An <a
href="http://docs.php.net/manual/en/language.types.integer.php">
integer</a>. You are alternatively permitted to pass a string of
digits instead, which will be cast to an integer using
<code>(int)</code>.
</div></type>
<type id="float" name="Float"><div xmlns="http://www.w3.org/1999/xhtml">
A <a href="http://docs.php.net/manual/en/language.types.float.php">
floating point number</a>. You are alternatively permitted to
pass a numeric string (as defined by <code>is_numeric()</code>),
which will be cast to a float using <code>(float)</code>.
</div></type>
<type id="bool" name="Boolean"><div xmlns="http://www.w3.org/1999/xhtml">
A <a
href="http://docs.php.net/manual/en/language.types.boolean.php">boolean</a>.
You are alternatively permitted to pass an integer <code>0</code> or
<code>1</code> (other integers are not permitted) or a string
<code>"on"</code>, <code>"true"</code> or <code>"1"</code> for
<code>true</code>, and <code>"off"</code>, <code>"false"</code> or
<code>"0"</code> for <code>false</code>.
</div></type>
<type id="lookup" name="Lookup array"><div xmlns="http://www.w3.org/1999/xhtml">
An array whose values are <code>true</code>, e.g. <code>array('key'
=> true, 'key2' => true)</code>. You are alternatively permitted
to pass an array list of the keys <code>array('key', 'key2')</code>
or a comma-separated string of keys <code>"key, key2"</code>. If
you pass an array list of values, ensure that your values are
strictly numerically indexed: <code>array('key1', 2 =>
'key2')</code> will not do what you expect and emits a warning.
</div></type>
<type id="list" name="Array list"><div xmlns="http://www.w3.org/1999/xhtml">
An array which has consecutive integer indexes, e.g.
<code>array('val1', 'val2')</code>. You are alternatively permitted
to pass a comma-separated string of keys <code>"val1, val2"</code>.
If your array is not in this form, <code>array_values</code> is run
on the array and a warning is emitted.
</div></type>
<type id="hash" name="Associative array"><div xmlns="http://www.w3.org/1999/xhtml">
An array which is a mapping of keys to values, e.g.
<code>array('key1' => 'val1', 'key2' => 'val2')</code>. You are
alternatively permitted to pass a comma-separated string of
key-colon-value strings, e.g. <code>"key1: val1, key2: val2"</code>.
</div></type>
<type id="mixed" name="Mixed"><div xmlns="http://www.w3.org/1999/xhtml">
An arbitrary PHP value of any type.
</div></type>
</types>
<!-- vim: et sw=4 sts=4

View File

@@ -6,6 +6,7 @@
</file>
<file name="HTMLPurifier/Lexer.php">
<line>81</line>
<line>284</line>
</file>
<file name="HTMLPurifier/Lexer/DirectLex.php">
<line>53</line>
@@ -31,14 +32,24 @@
<line>218</line>
</file>
</directive>
<directive id="CSS.AllowImportant">
<directive id="CSS.Trusted">
<file name="HTMLPurifier/CSSDefinition.php">
<line>222</line>
</file>
</directive>
<directive id="CSS.AllowImportant">
<file name="HTMLPurifier/CSSDefinition.php">
<line>226</line>
</file>
</directive>
<directive id="CSS.AllowedProperties">
<file name="HTMLPurifier/CSSDefinition.php">
<line>275</line>
<line>296</line>
</file>
</directive>
<directive id="CSS.ForbiddenProperties">
<file name="HTMLPurifier/CSSDefinition.php">
<line>310</line>
</file>
</directive>
<directive id="Cache.DefinitionImpl">
@@ -85,27 +96,40 @@
</directive>
<directive id="Output.CommentScriptContents">
<file name="HTMLPurifier/Generator.php">
<line>56</line>
<line>61</line>
</file>
</directive>
<directive id="Output.FixInnerHTML">
<file name="HTMLPurifier/Generator.php">
<line>62</line>
</file>
</directive>
<directive id="Output.SortAttr">
<file name="HTMLPurifier/Generator.php">
<line>57</line>
<line>63</line>
</file>
</directive>
<directive id="Output.FlashCompat">
<file name="HTMLPurifier/Generator.php">
<line>58</line>
<line>64</line>
</file>
</directive>
<directive id="Output.TidyFormat">
<file name="HTMLPurifier/Generator.php">
<line>87</line>
<line>93</line>
</file>
</directive>
<directive id="Core.NormalizeNewlines">
<file name="HTMLPurifier/Generator.php">
<line>107</line>
</file>
<file name="HTMLPurifier/Lexer.php">
<line>266</line>
</file>
</directive>
<directive id="Output.Newline">
<file name="HTMLPurifier/Generator.php">
<line>101</line>
<line>108</line>
</file>
</directive>
<directive id="HTML.BlockWrapper">
@@ -135,12 +159,12 @@
</directive>
<directive id="HTML.ForbiddenElements">
<file name="HTMLPurifier/HTMLDefinition.php">
<line>337</line>
<line>342</line>
</file>
</directive>
<directive id="HTML.ForbiddenAttributes">
<file name="HTMLPurifier/HTMLDefinition.php">
<line>338</line>
<line>343</line>
</file>
</directive>
<directive id="HTML.Trusted">
@@ -148,7 +172,7 @@
<line>202</line>
</file>
<file name="HTMLPurifier/Lexer.php">
<line>258</line>
<line>271</line>
</file>
<file name="HTMLPurifier/HTMLModule/Image.php">
<line>27</line>
@@ -172,15 +196,20 @@
</directive>
<directive id="HTML.Proprietary">
<file name="HTMLPurifier/HTMLModuleManager.php">
<line>221</line>
<line>220</line>
</file>
</directive>
<directive id="HTML.SafeObject">
<file name="HTMLPurifier/HTMLModuleManager.php">
<line>226</line>
<line>223</line>
</file>
</directive>
<directive id="HTML.SafeEmbed">
<file name="HTMLPurifier/HTMLModuleManager.php">
<line>226</line>
</file>
</directive>
<directive id="HTML.Nofollow">
<file name="HTMLPurifier/HTMLModuleManager.php">
<line>229</line>
</file>
@@ -210,7 +239,12 @@
</directive>
<directive id="Core.ConvertDocumentToFragment">
<file name="HTMLPurifier/Lexer.php">
<line>267</line>
<line>282</line>
</file>
</directive>
<directive id="Core.RemoveProcessingInstructions">
<file name="HTMLPurifier/Lexer.php">
<line>303</line>
</file>
</directive>
<directive id="URI.">
@@ -225,6 +259,9 @@
<file name="HTMLPurifier/URIDefinition.php">
<line>64</line>
</file>
<file name="HTMLPurifier/URIScheme.php">
<line>75</line>
</file>
</directive>
<directive id="URI.Base">
<file name="HTMLPurifier/URIDefinition.php">
@@ -259,6 +296,11 @@
<line>12</line>
</file>
</directive>
<directive id="CSS.AllowedFonts">
<file name="HTMLPurifier/AttrDef/CSS/FontFamily.php">
<line>50</line>
</file>
</directive>
<directive id="Attr.AllowedClasses">
<file name="HTMLPurifier/AttrDef/HTML/Class.php">
<line>18</line>
@@ -336,6 +378,11 @@
<line>13</line>
</file>
</directive>
<directive id="HTML.FlashAllowFullScreen">
<file name="HTMLPurifier/AttrTransform/SafeParam.php">
<line>38</line>
</file>
</directive>
<directive id="Core.EscapeInvalidChildren">
<file name="HTMLPurifier/ChildDef/Required.php">
<line>62</line>
@@ -346,6 +393,12 @@
<line>91</line>
</file>
</directive>
<directive id="Cache.SerializerPermissions">
<file name="HTMLPurifier/DefinitionCache/Serializer.php">
<line>107</line>
<line>124</line>
</file>
</directive>
<directive id="Filter.ExtractStyleBlocks.TidyImpl">
<file name="HTMLPurifier/Filter/ExtractStyleBlocks.php">
<line>41</line>
@@ -414,7 +467,7 @@
</directive>
<directive id="Core.EscapeInvalidTags">
<file name="HTMLPurifier/Strategy/MakeWellFormed.php">
<line>45</line>
<line>53</line>
</file>
<file name="HTMLPurifier/Strategy/RemoveForeignElements.php">
<line>19</line>

View File

@@ -146,7 +146,9 @@
<pre>$config = HTMLPurifier_Config::createDefault();
$config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config-&gt;set('HTML.DefinitionRev', 1);
$def = $config-&gt;getHTMLDefinition(true);</pre>
if ($def = $config-&gt;maybeGetRawHTMLDefinition()) {
// our code will go here
}</pre>
<p>
Assuming that HTML Purifier has already been properly loaded (hint:
@@ -174,23 +176,15 @@ $def = $config-&gt;getHTMLDefinition(true);</pre>
</li>
<li>
The fourth line retrieves a raw <code>HTMLPurifier_HTMLDefinition</code>
object that we will be tweaking. If the parameter was removed, we
would be retrieving a fully formed definition object, which is somewhat
useless for customization purposes.
object that we will be tweaking. Interestingly enough, we have
placed it in an if block: this is because
<code>maybeGetRawHTMLDefinition</code>, as its name suggests, may
return a NULL, in which case we should skip doing any
initialization. This, in fact, will correspond to when our fully
customized object is already in the cache.
</li>
</ul>
<h3>Broken backwards-compatibility</h3>
<p>
Those of you who have already been twiddling around with the raw
HTML definition object, you'll be noticing that you're getting an error
when you attempt to retrieve the raw definition object without specifying
a DefinitionID. It is vital to caching (see below) that you make a unique
name for your customized definition, so make up something right now and
things will operate again.
</p>
<h2>Turn off caching</h2>
<p>
@@ -781,6 +775,75 @@ $form-&gt;excludes = array('form' => true);</strong></pre>
<li><a href="http://repo.or.cz/w/htmlpurifier.git?a=blob;hb=HEAD;f=library/HTMLPurifier/ElementDef.php"><code>library/HTMLPurifier/ElementDef.php</code></a></li>
</ul>
<h2 id="optimized">Notes for HTML Purifier 4.2.0 and earlier</h3>
<p>
Previously, this tutorial gave some incorrect template code for
editing raw definitions, and that template code will now produce the
error <q>Due to a documentation error in previous version of HTML
Purifier...</q> Here is how to mechanically transform old-style
code into new-style code.
</p>
<p>
First, identify all code that edits the raw definition object, and
put it together. Ensure none of this code must be run on every
request; if some sub-part needs to always be run, move it outside
this block. Here is an example below, with the raw definition
object code bolded.
</p>
<pre>$config = HTMLPurifier_Config::createDefault();
$config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config-&gt;set('HTML.DefinitionRev', 1);
$def = $config-&gt;getHTMLDefinition(true);
<strong>$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');</strong>
$purifier = new HTMLPurifier($config);</pre>
<p>
Next, replace the raw definition retrieval with a
maybeGetRawHTMLDefinition method call inside an if conditional, and
place the editing code inside that if block.
</p>
<pre>$config = HTMLPurifier_Config::createDefault();
$config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config-&gt;set('HTML.DefinitionRev', 1);
<strong>if ($def = $config-&gt;maybeGetRawHTMLDefinition()) {
$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');
}</strong>
$purifier = new HTMLPurifier($config);</pre>
<p>
And you're done! Alternatively, if you're OK with not ever caching
your code, the following will still work and not emit warnings.
</p>
<pre>$config = HTMLPurifier_Config::createDefault();
$def = $config-&gt;getHTMLDefinition(true);
$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');
$purifier = new HTMLPurifier($config);</pre>
<p>
A slightly less efficient version of this was what was going on with
old versions of HTML Purifier.
</p>
<p>
<em>Technical notes:</em> ajh pointed out on <a
href="http://htmlpurifier.org/phorum/read.php?5,5164,5169#msg-5169">in a forum topic</a> that
HTML Purifier appeared to be repeatedly writing to the cache even
when a cache entry already existed. Investigation lead to the
discovery of the following infelicity: caching of customized
definitions didn't actually work! The problem was that even though
a cache file would be written out at the end of the process, there
was no way for HTML Purifier to say, <q>Actually, I've already got a
copy of your work, no need to reconfigure your
customizations</q>. This required the API to change: placing
all of the customizations to the raw definition object in a
conditional which could be skipped.
</p>
</body></html>
<!-- vim: et sw=4 sts=4

View File

@@ -3,6 +3,7 @@
/**
* @file
* Convenience file that registers autoload handler for HTML Purifier.
* It also does some sanity checks.
*/
if (function_exists('spl_autoload_register') && function_exists('spl_autoload_unregister')) {
@@ -18,4 +19,8 @@ if (function_exists('spl_autoload_register') && function_exists('spl_autoload_un
}
}
if (ini_get('zend.ze1_compatibility_mode')) {
trigger_error("HTML Purifier is not compatible with zend.ze1_compatibility_mode; please turn it off", E_USER_ERROR);
}
// vim: et sw=4 sts=4

View File

@@ -7,7 +7,7 @@
* primary concern and you are using an opcode cache. PLEASE DO NOT EDIT THIS
* FILE, changes will be overwritten the next time the script is run.
*
* @version 4.1.0
* @version 4.3.0
*
* @warning
* You must *not* include any other HTML Purifier files before this file,
@@ -125,6 +125,7 @@ require 'HTMLPurifier/AttrTransform/Lang.php';
require 'HTMLPurifier/AttrTransform/Length.php';
require 'HTMLPurifier/AttrTransform/Name.php';
require 'HTMLPurifier/AttrTransform/NameSync.php';
require 'HTMLPurifier/AttrTransform/Nofollow.php';
require 'HTMLPurifier/AttrTransform/SafeEmbed.php';
require 'HTMLPurifier/AttrTransform/SafeObject.php';
require 'HTMLPurifier/AttrTransform/SafeParam.php';
@@ -151,6 +152,7 @@ require 'HTMLPurifier/HTMLModule/Image.php';
require 'HTMLPurifier/HTMLModule/Legacy.php';
require 'HTMLPurifier/HTMLModule/List.php';
require 'HTMLPurifier/HTMLModule/Name.php';
require 'HTMLPurifier/HTMLModule/Nofollow.php';
require 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
require 'HTMLPurifier/HTMLModule/Object.php';
require 'HTMLPurifier/HTMLModule/Presentation.php';
@@ -196,10 +198,12 @@ require 'HTMLPurifier/Token/Start.php';
require 'HTMLPurifier/Token/Text.php';
require 'HTMLPurifier/URIFilter/DisableExternal.php';
require 'HTMLPurifier/URIFilter/DisableExternalResources.php';
require 'HTMLPurifier/URIFilter/DisableResources.php';
require 'HTMLPurifier/URIFilter/HostBlacklist.php';
require 'HTMLPurifier/URIFilter/MakeAbsolute.php';
require 'HTMLPurifier/URIFilter/Munge.php';
require 'HTMLPurifier/URIScheme/data.php';
require 'HTMLPurifier/URIScheme/file.php';
require 'HTMLPurifier/URIScheme/ftp.php';
require 'HTMLPurifier/URIScheme/http.php';
require 'HTMLPurifier/URIScheme/https.php';

View File

@@ -19,7 +19,7 @@
*/
/*
HTML Purifier 4.1.0 - Standards Compliant HTML Filtering
HTML Purifier 4.3.0 - Standards Compliant HTML Filtering
Copyright (C) 2006-2008 Edward Z. Yang
This library is free software; you can redistribute it and/or
@@ -55,10 +55,10 @@ class HTMLPurifier
{
/** Version of HTML Purifier */
public $version = '4.1.0';
public $version = '4.3.0';
/** Constant with version of HTML Purifier */
const VERSION = '4.1.0';
const VERSION = '4.3.0';
/** Global configuration object */
public $config;

View File

@@ -119,6 +119,7 @@ require_once $__dir . '/HTMLPurifier/AttrTransform/Lang.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Length.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Name.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/NameSync.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Nofollow.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeEmbed.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeObject.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeParam.php';
@@ -145,6 +146,7 @@ require_once $__dir . '/HTMLPurifier/HTMLModule/Image.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/Legacy.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/List.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/Name.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/Nofollow.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/Object.php';
require_once $__dir . '/HTMLPurifier/HTMLModule/Presentation.php';
@@ -190,10 +192,12 @@ require_once $__dir . '/HTMLPurifier/Token/Start.php';
require_once $__dir . '/HTMLPurifier/Token/Text.php';
require_once $__dir . '/HTMLPurifier/URIFilter/DisableExternal.php';
require_once $__dir . '/HTMLPurifier/URIFilter/DisableExternalResources.php';
require_once $__dir . '/HTMLPurifier/URIFilter/DisableResources.php';
require_once $__dir . '/HTMLPurifier/URIFilter/HostBlacklist.php';
require_once $__dir . '/HTMLPurifier/URIFilter/MakeAbsolute.php';
require_once $__dir . '/HTMLPurifier/URIFilter/Munge.php';
require_once $__dir . '/HTMLPurifier/URIScheme/data.php';
require_once $__dir . '/HTMLPurifier/URIScheme/file.php';
require_once $__dir . '/HTMLPurifier/URIScheme/ftp.php';
require_once $__dir . '/HTMLPurifier/URIScheme/http.php';
require_once $__dir . '/HTMLPurifier/URIScheme/https.php';

View File

@@ -82,6 +82,42 @@ abstract class HTMLPurifier_AttrDef
return preg_replace('/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/', 'rgb(\1,\2,\3)', $string);
}
/**
* Parses a possibly escaped CSS string and returns the "pure"
* version of it.
*/
protected function expandCSSEscape($string) {
// flexibly parse it
$ret = '';
for ($i = 0, $c = strlen($string); $i < $c; $i++) {
if ($string[$i] === '\\') {
$i++;
if ($i >= $c) {
$ret .= '\\';
break;
}
if (ctype_xdigit($string[$i])) {
$code = $string[$i];
for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
if (!ctype_xdigit($string[$i])) break;
$code .= $string[$i];
}
// We have to be extremely careful when adding
// new characters, to make sure we're not breaking
// the encoding.
$char = HTMLPurifier_Encoder::unichr(hexdec($code));
if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
$ret .= $char;
if ($i < $c && trim($string[$i]) !== '') $i--;
continue;
}
if ($string[$i] === "\n") continue;
}
$ret .= $string[$i];
}
return $ret;
}
}
// vim: et sw=4 sts=4

View File

@@ -59,7 +59,8 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
$keywords = array();
$keywords['h'] = false; // left, right
$keywords['v'] = false; // top, bottom
$keywords['c'] = false; // center
$keywords['ch'] = false; // center (first word)
$keywords['cv'] = false; // center (second word)
$measures = array();
$i = 0;
@@ -79,6 +80,13 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
$lbit = ctype_lower($bit) ? $bit : strtolower($bit);
if (isset($lookup[$lbit])) {
$status = $lookup[$lbit];
if ($status == 'c') {
if ($i == 0) {
$status = 'ch';
} else {
$status = 'cv';
}
}
$keywords[$status] = $lbit;
$i++;
}
@@ -101,20 +109,19 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
if (!$i) return false; // no valid values were caught
$ret = array();
// first keyword
if ($keywords['h']) $ret[] = $keywords['h'];
elseif (count($measures)) $ret[] = array_shift($measures);
elseif ($keywords['c']) {
$ret[] = $keywords['c'];
$keywords['c'] = false; // prevent re-use: center = center center
elseif ($keywords['ch']) {
$ret[] = $keywords['ch'];
$keywords['cv'] = false; // prevent re-use: center = center center
}
elseif (count($measures)) $ret[] = array_shift($measures);
if ($keywords['v']) $ret[] = $keywords['v'];
elseif ($keywords['cv']) $ret[] = $keywords['cv'];
elseif (count($measures)) $ret[] = array_shift($measures);
elseif ($keywords['c']) $ret[] = $keywords['c'];
if (empty($ret)) return false;
return implode(' ', $ret);

View File

@@ -2,11 +2,43 @@
/**
* Validates a font family list according to CSS spec
* @todo whitelisting allowed fonts would be nice
*/
class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
{
protected $mask = null;
public function __construct() {
$this->mask = '- ';
for ($c = 'a'; $c <= 'z'; $c++) $this->mask .= $c;
for ($c = 'A'; $c <= 'Z'; $c++) $this->mask .= $c;
for ($c = '0'; $c <= '9'; $c++) $this->mask .= $c; // cast-y, but should be fine
// special bytes used by UTF-8
for ($i = 0x80; $i <= 0xFF; $i++) {
// We don't bother excluding invalid bytes in this range,
// because the our restriction of well-formed UTF-8 will
// prevent these from ever occurring.
$this->mask .= chr($i);
}
/*
PHP's internal strcspn implementation is
O(length of string * length of mask), making it inefficient
for large masks. However, it's still faster than
preg_match 8)
for (p = s1;;) {
spanp = s2;
do {
if (*spanp == c || p == s1_end) {
return p - s1;
}
} while (spanp++ < (s2_end - 1));
c = *++p;
}
*/
// possible optimization: invert the mask.
}
public function validate($string, $config, $context) {
static $generic_names = array(
'serif' => true,
@@ -15,6 +47,7 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
'fantasy' => true,
'cursive' => true
);
$allowed_fonts = $config->get('CSS.AllowedFonts');
// assume that no font names contain commas in them
$fonts = explode(',', $string);
@@ -24,7 +57,9 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
if ($font === '') continue;
// match a generic name
if (isset($generic_names[$font])) {
$final .= $font . ', ';
if ($allowed_fonts === null || isset($allowed_fonts[$font])) {
$final .= $font . ', ';
}
continue;
}
// match a quoted name
@@ -34,50 +69,122 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
$quote = $font[0];
if ($font[$length - 1] !== $quote) continue;
$font = substr($font, 1, $length - 2);
$new_font = '';
for ($i = 0, $c = strlen($font); $i < $c; $i++) {
if ($font[$i] === '\\') {
$i++;
if ($i >= $c) {
$new_font .= '\\';
break;
}
if (ctype_xdigit($font[$i])) {
$code = $font[$i];
for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
if (!ctype_xdigit($font[$i])) break;
$code .= $font[$i];
}
// We have to be extremely careful when adding
// new characters, to make sure we're not breaking
// the encoding.
$char = HTMLPurifier_Encoder::unichr(hexdec($code));
if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
$new_font .= $char;
if ($i < $c && trim($font[$i]) !== '') $i--;
continue;
}
if ($font[$i] === "\n") continue;
}
$new_font .= $font[$i];
}
$font = $new_font;
}
$font = $this->expandCSSEscape($font);
// $font is a pure representation of the font name
if ($allowed_fonts !== null && !isset($allowed_fonts[$font])) {
continue;
}
if (ctype_alnum($font) && $font !== '') {
// very simple font, allow it in unharmed
$final .= $font . ', ';
continue;
}
// complicated font, requires quoting
// bugger out on whitespace. form feed (0C) really
// shouldn't show up regardless
$font = str_replace(array("\n", "\t", "\r", "\x0C"), ' ', $font);
// armor single quotes and new lines
$font = str_replace("\\", "\\\\", $font);
$font = str_replace("'", "\\'", $font);
// Here, there are various classes of characters which need
// to be treated differently:
// - Alphanumeric characters are essentially safe. We
// handled these above.
// - Spaces require quoting, though most parsers will do
// the right thing if there aren't any characters that
// can be misinterpreted
// - Dashes rarely occur, but they fairly unproblematic
// for parsing/rendering purposes.
// The above characters cover the majority of Western font
// names.
// - Arbitrary Unicode characters not in ASCII. Because
// most parsers give little thought to Unicode, treatment
// of these codepoints is basically uniform, even for
// punctuation-like codepoints. These characters can
// show up in non-Western pages and are supported by most
// major browsers, for example: " 明朝" is a
// legitimate font-name
// <http://ja.wikipedia.org/wiki/MS_明朝>. See
// the CSS3 spec for more examples:
// <http://www.w3.org/TR/2011/WD-css3-fonts-20110324/localizedfamilynames.png>
// You can see live samples of these on the Internet:
// <http://www.google.co.jp/search?q=font-family++明朝|ゴシック>
// However, most of these fonts have ASCII equivalents:
// for example, 'MS Mincho', and it's considered
// professional to use ASCII font names instead of
// Unicode font names. Thanks Takeshi Terada for
// providing this information.
// The following characters, to my knowledge, have not been
// used to name font names.
// - Single quote. While theoretically you might find a
// font name that has a single quote in its name (serving
// as an apostrophe, e.g. Dave's Scribble), I haven't
// been able to find any actual examples of this.
// Internet Explorer's cssText translation (which I
// believe is invoked by innerHTML) normalizes any
// quoting to single quotes, and fails to escape single
// quotes. (Note that this is not IE's behavior for all
// CSS properties, just some sort of special casing for
// font-family). So a single quote *cannot* be used
// safely in the font-family context if there will be an
// innerHTML/cssText translation. Note that Firefox 3.x
// does this too.
// - Double quote. In IE, these get normalized to
// single-quotes, no matter what the encoding. (Fun
// fact, in IE8, the 'content' CSS property gained
// support, where they special cased to preserve encoded
// double quotes, but still translate unadorned double
// quotes into single quotes.) So, because their
// fixpoint behavior is identical to single quotes, they
// cannot be allowed either. Firefox 3.x displays
// single-quote style behavior.
// - Backslashes are reduced by one (so \\ -> \) every
// iteration, so they cannot be used safely. This shows
// up in IE7, IE8 and FF3
// - Semicolons, commas and backticks are handled properly.
// - The rest of the ASCII punctuation is handled properly.
// We haven't checked what browsers do to unadorned
// versions, but this is not important as long as the
// browser doesn't /remove/ surrounding quotes (as IE does
// for HTML).
//
// With these results in hand, we conclude that there are
// various levels of safety:
// - Paranoid: alphanumeric, spaces and dashes(?)
// - International: Paranoid + non-ASCII Unicode
// - Edgy: Everything except quotes, backslashes
// - NoJS: Standards compliance, e.g. sod IE. Note that
// with some judicious character escaping (since certain
// types of escaping doesn't work) this is theoretically
// OK as long as innerHTML/cssText is not called.
// We believe that international is a reasonable default
// (that we will implement now), and once we do more
// extensive research, we may feel comfortable with dropping
// it down to edgy.
// Edgy: alphanumeric, spaces, dashes and Unicode. Use of
// str(c)spn assumes that the string was already well formed
// Unicode (which of course it is).
if (strspn($font, $this->mask) !== strlen($font)) {
continue;
}
// Historical:
// In the absence of innerHTML/cssText, these ugly
// transforms don't pose a security risk (as \\ and \"
// might--these escapes are not supported by most browsers).
// We could try to be clever and use single-quote wrapping
// when there is a double quote present, but I have choosen
// not to implement that. (NOTE: you can reduce the amount
// of escapes by one depending on what quoting style you use)
// $font = str_replace('\\', '\\5C ', $font);
// $font = str_replace('"', '\\22 ', $font);
// $font = str_replace("'", '\\27 ', $font);
// font possibly with spaces, requires quoting
$final .= "'$font', ";
}
$final = rtrim($final, ', ');

View File

@@ -34,20 +34,25 @@ class HTMLPurifier_AttrDef_CSS_URI extends HTMLPurifier_AttrDef_URI
$uri = substr($uri, 1, $new_length - 1);
}
$keys = array( '(', ')', ',', ' ', '"', "'");
$values = array('\\(', '\\)', '\\,', '\\ ', '\\"', "\\'");
$uri = str_replace($values, $keys, $uri);
$uri = $this->expandCSSEscape($uri);
$result = parent::validate($uri, $config, $context);
if ($result === false) return false;
// escape necessary characters according to CSS spec
// except for the comma, none of these should appear in the
// URI at all
$result = str_replace($keys, $values, $result);
// extra sanity check; should have been done by URI
$result = str_replace(array('"', "\\", "\n", "\x0c", "\r"), "", $result);
return "url('$result')";
// suspicious characters are ()'; we're going to percent encode
// them for safety.
$result = str_replace(array('(', ')', "'"), array('%28', '%29', '%27'), $result);
// there's an extra bug where ampersands lose their escaping on
// an innerHTML cycle, so a very unlucky query parameter could
// then change the meaning of the URL. Unfortunately, there's
// not much we can do about that...
return "url(\"$result\")";
}

View File

@@ -23,6 +23,12 @@ class HTMLPurifier_AttrDef_URI_Host extends HTMLPurifier_AttrDef
public function validate($string, $config, $context) {
$length = strlen($string);
// empty hostname is OK; it's usually semantically equivalent:
// the default host as defined by a URI scheme is used:
//
// If the URI scheme defines a default for host, then that
// default applies when the host subcomponent is undefined
// or when the registered name is empty (zero length).
if ($string === '') return '';
if ($length > 1 && $string[0] === '[' && $string[$length-1] === ']') {
//IPv6

View File

@@ -0,0 +1,41 @@
<?php
// must be called POST validation
/**
* Adds rel="nofollow" to all outbound links. This transform is
* only attached if Attr.Nofollow is TRUE.
*/
class HTMLPurifier_AttrTransform_Nofollow extends HTMLPurifier_AttrTransform
{
private $parser;
public function __construct() {
$this->parser = new HTMLPurifier_URIParser();
}
public function transform($attr, $config, $context) {
if (!isset($attr['href'])) {
return $attr;
}
// XXX Kind of inefficient
$url = $this->parser->parse($attr['href']);
$scheme = $url->getSchemeObj($config, $context);
if (!is_null($url->host) && $scheme !== false && $scheme->browsable) {
if (isset($attr['rel'])) {
$attr['rel'] .= ' nofollow';
} else {
$attr['rel'] = 'nofollow';
}
}
return $attr;
}
}
// vim: et sw=4 sts=4

View File

@@ -19,6 +19,7 @@ class HTMLPurifier_AttrTransform_SafeParam extends HTMLPurifier_AttrTransform
public function __construct() {
$this->uri = new HTMLPurifier_AttrDef_URI(true); // embedded
$this->wmode = new HTMLPurifier_AttrDef_Enum(array('window', 'opaque', 'transparent'));
}
public function transform($attr, $config, $context) {
@@ -33,8 +34,15 @@ class HTMLPurifier_AttrTransform_SafeParam extends HTMLPurifier_AttrTransform
case 'allowNetworking':
$attr['value'] = 'internal';
break;
case 'allowFullScreen':
if ($config->get('HTML.FlashAllowFullScreen')) {
$attr['value'] = ($attr['value'] == 'true') ? 'true' : 'false';
} else {
$attr['value'] = 'false';
}
break;
case 'wmode':
$attr['value'] = 'window';
$attr['value'] = $this->wmode->validate($attr['value'], $config, $context);
break;
case 'movie':
case 'src':

View File

@@ -37,7 +37,12 @@ class HTMLPurifier_Bootstrap
public static function autoload($class) {
$file = HTMLPurifier_Bootstrap::getPath($class);
if (!$file) return false;
require HTMLPURIFIER_PREFIX . '/' . $file;
// Technically speaking, it should be ok and more efficient to
// just do 'require', but Antonio Parraga reports that with
// Zend extensions such as Zend debugger and APC, this invariant
// may be broken. Since we have efficient alternatives, pay
// the cost here and avoid the bug.
require_once HTMLPURIFIER_PREFIX . '/' . $file;
return true;
}
@@ -65,10 +70,11 @@ class HTMLPurifier_Bootstrap
if ( ($funcs = spl_autoload_functions()) === false ) {
spl_autoload_register($autoload);
} elseif (function_exists('spl_autoload_unregister')) {
$buggy = version_compare(PHP_VERSION, '5.2.11', '<');
$compat = version_compare(PHP_VERSION, '5.1.2', '<=') &&
version_compare(PHP_VERSION, '5.1.0', '>=');
foreach ($funcs as $func) {
if (is_array($func)) {
if ($buggy && is_array($func)) {
// :TRICKY: There are some compatibility issues and some
// places where we need to error out
$reflector = new ReflectionMethod($func[0], $func[1]);

View File

@@ -219,6 +219,10 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
$this->doSetupTricky($config);
}
if ($config->get('CSS.Trusted')) {
$this->doSetupTrusted($config);
}
$allow_important = $config->get('CSS.AllowImportant');
// wrap all attr-defs with decorator that handles !important
foreach ($this->info as $k => $v) {
@@ -260,6 +264,23 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
$this->info['overflow'] = new HTMLPurifier_AttrDef_Enum(array('visible', 'hidden', 'auto', 'scroll'));
}
protected function doSetupTrusted($config) {
$this->info['position'] = new HTMLPurifier_AttrDef_Enum(array(
'static', 'relative', 'absolute', 'fixed'
));
$this->info['top'] =
$this->info['left'] =
$this->info['right'] =
$this->info['bottom'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
new HTMLPurifier_AttrDef_CSS_Length(),
new HTMLPurifier_AttrDef_CSS_Percentage(),
new HTMLPurifier_AttrDef_Enum(array('auto')),
));
$this->info['z-index'] = new HTMLPurifier_AttrDef_CSS_Composite(array(
new HTMLPurifier_AttrDef_Integer(),
new HTMLPurifier_AttrDef_Enum(array('auto')),
));
}
/**
* Performs extra config-based processing. Based off of
@@ -272,20 +293,29 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
// setup allowed elements
$support = "(for information on implementing this, see the ".
"support forums) ";
$allowed_attributes = $config->get('CSS.AllowedProperties');
if ($allowed_attributes !== null) {
$allowed_properties = $config->get('CSS.AllowedProperties');
if ($allowed_properties !== null) {
foreach ($this->info as $name => $d) {
if(!isset($allowed_attributes[$name])) unset($this->info[$name]);
unset($allowed_attributes[$name]);
if(!isset($allowed_properties[$name])) unset($this->info[$name]);
unset($allowed_properties[$name]);
}
// emit errors
foreach ($allowed_attributes as $name => $d) {
foreach ($allowed_properties as $name => $d) {
// :TODO: Is this htmlspecialchars() call really necessary?
$name = htmlspecialchars($name);
trigger_error("Style attribute '$name' is not supported $support", E_USER_WARNING);
}
}
$forbidden_properties = $config->get('CSS.ForbiddenProperties');
if ($forbidden_properties !== null) {
foreach ($this->info as $name => $d) {
if (isset($forbidden_properties[$name])) {
unset($this->info[$name]);
}
}
}
}
}

View File

@@ -20,7 +20,7 @@ class HTMLPurifier_Config
/**
* HTML Purifier's version
*/
public $version = '4.1.0';
public $version = '4.3.0';
/**
* Bool indicator whether or not to automatically finalize
@@ -76,7 +76,8 @@ class HTMLPurifier_Config
/**
* Set to false if you do not want line and file numbers in errors
* (useful when unit testing)
* (useful when unit testing). This will also compress some errors
* and exceptions.
*/
public $chatty = true;
@@ -318,26 +319,64 @@ class HTMLPurifier_Config
* Retrieves object reference to the HTML definition.
* @param $raw Return a copy that has not been setup yet. Must be
* called before it's been setup, otherwise won't work.
* @param $optimized If true, this method may return null, to
* indicate that a cached version of the modified
* definition object is available and no further edits
* are necessary. Consider using
* maybeGetRawHTMLDefinition, which is more explicitly
* named, instead.
*/
public function getHTMLDefinition($raw = false) {
return $this->getDefinition('HTML', $raw);
public function getHTMLDefinition($raw = false, $optimized = false) {
return $this->getDefinition('HTML', $raw, $optimized);
}
/**
* Retrieves object reference to the CSS definition
* @param $raw Return a copy that has not been setup yet. Must be
* called before it's been setup, otherwise won't work.
* @param $optimized If true, this method may return null, to
* indicate that a cached version of the modified
* definition object is available and no further edits
* are necessary. Consider using
* maybeGetRawCSSDefinition, which is more explicitly
* named, instead.
*/
public function getCSSDefinition($raw = false) {
return $this->getDefinition('CSS', $raw);
public function getCSSDefinition($raw = false, $optimized = false) {
return $this->getDefinition('CSS', $raw, $optimized);
}
/**
* Retrieves object reference to the URI definition
* @param $raw Return a copy that has not been setup yet. Must be
* called before it's been setup, otherwise won't work.
* @param $optimized If true, this method may return null, to
* indicate that a cached version of the modified
* definition object is available and no further edits
* are necessary. Consider using
* maybeGetRawURIDefinition, which is more explicitly
* named, instead.
*/
public function getURIDefinition($raw = false, $optimized = false) {
return $this->getDefinition('URI', $raw, $optimized);
}
/**
* Retrieves a definition
* @param $type Type of definition: HTML, CSS, etc
* @param $raw Whether or not definition should be returned raw
* @param $optimized Only has an effect when $raw is true. Whether
* or not to return null if the result is already present in
* the cache. This is off by default for backwards
* compatibility reasons, but you need to do things this
* way in order to ensure that caching is done properly.
* Check out enduser-customize.html for more details.
* We probably won't ever change this default, as much as the
* maybe semantics is the "right thing to do."
*/
public function getDefinition($type, $raw = false) {
public function getDefinition($type, $raw = false, $optimized = false) {
if ($optimized && !$raw) {
throw new HTMLPurifier_Exception("Cannot set optimized = true when raw = false");
}
if (!$this->finalized) $this->autoFinalize();
// temporarily suspend locks, so we can handle recursive definition calls
$lock = $this->lock;
@@ -346,52 +385,137 @@ class HTMLPurifier_Config
$cache = $factory->create($type, $this);
$this->lock = $lock;
if (!$raw) {
// see if we can quickly supply a definition
// full definition
// ---------------
// check if definition is in memory
if (!empty($this->definitions[$type])) {
if (!$this->definitions[$type]->setup) {
$this->definitions[$type]->setup($this);
$cache->set($this->definitions[$type], $this);
$def = $this->definitions[$type];
// check if the definition is setup
if ($def->setup) {
return $def;
} else {
$def->setup($this);
if ($def->optimized) $cache->add($def, $this);
return $def;
}
return $this->definitions[$type];
}
// memory check missed, try cache
$this->definitions[$type] = $cache->get($this);
if ($this->definitions[$type]) {
// definition in cache, return it
return $this->definitions[$type];
// check if definition is in cache
$def = $cache->get($this);
if ($def) {
// definition in cache, save to memory and return it
$this->definitions[$type] = $def;
return $def;
}
} elseif (
!empty($this->definitions[$type]) &&
!$this->definitions[$type]->setup
) {
// raw requested, raw in memory, quick return
return $this->definitions[$type];
// initialize it
$def = $this->initDefinition($type);
// set it up
$this->lock = $type;
$def->setup($this);
$this->lock = null;
// save in cache
$cache->add($def, $this);
// return it
return $def;
} else {
// raw definition
// --------------
// check preconditions
$def = null;
if ($optimized) {
if (is_null($this->get($type . '.DefinitionID'))) {
// fatally error out if definition ID not set
throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID");
}
}
if (!empty($this->definitions[$type])) {
$def = $this->definitions[$type];
if ($def->setup && !$optimized) {
$extra = $this->chatty ? " (try moving this code block earlier in your initialization)" : "";
throw new HTMLPurifier_Exception("Cannot retrieve raw definition after it has already been setup" . $extra);
}
if ($def->optimized === null) {
$extra = $this->chatty ? " (try flushing your cache)" : "";
throw new HTMLPurifier_Exception("Optimization status of definition is unknown" . $extra);
}
if ($def->optimized !== $optimized) {
$msg = $optimized ? "optimized" : "unoptimized";
$extra = $this->chatty ? " (this backtrace is for the first inconsistent call, which was for a $msg raw definition)" : "";
throw new HTMLPurifier_Exception("Inconsistent use of optimized and unoptimized raw definition retrievals" . $extra);
}
}
// check if definition was in memory
if ($def) {
if ($def->setup) {
// invariant: $optimized === true (checked above)
return null;
} else {
return $def;
}
}
// if optimized, check if definition was in cache
// (because we do the memory check first, this formulation
// is prone to cache slamming, but I think
// guaranteeing that either /all/ of the raw
// setup code or /none/ of it is run is more important.)
if ($optimized) {
// This code path only gets run once; once we put
// something in $definitions (which is guaranteed by the
// trailing code), we always short-circuit above.
$def = $cache->get($this);
if ($def) {
// save the full definition for later, but don't
// return it yet
$this->definitions[$type] = $def;
return null;
}
}
// check invariants for creation
if (!$optimized) {
if (!is_null($this->get($type . '.DefinitionID'))) {
if ($this->chatty) {
$this->triggerError("Due to a documentation error in previous version of HTML Purifier, your definitions are not being cached. If this is OK, you can remove the %$type.DefinitionRev and %$type.DefinitionID declaration. Otherwise, modify your code to use maybeGetRawDefinition, and test if the returned value is null before making any edits (if it is null, that means that a cached version is available, and no raw operations are necessary). See <a href='http://htmlpurifier.org/docs/enduser-customize.html#optimized'>Customize</a> for more details", E_USER_WARNING);
} else {
$this->triggerError("Useless DefinitionID declaration", E_USER_WARNING);
}
}
}
// initialize it
$def = $this->initDefinition($type);
$def->optimized = $optimized;
return $def;
}
throw new HTMLPurifier_Exception("The impossible happened!");
}
private function initDefinition($type) {
// quick checks failed, let's create the object
if ($type == 'HTML') {
$this->definitions[$type] = new HTMLPurifier_HTMLDefinition();
$def = new HTMLPurifier_HTMLDefinition();
} elseif ($type == 'CSS') {
$this->definitions[$type] = new HTMLPurifier_CSSDefinition();
$def = new HTMLPurifier_CSSDefinition();
} elseif ($type == 'URI') {
$this->definitions[$type] = new HTMLPurifier_URIDefinition();
$def = new HTMLPurifier_URIDefinition();
} else {
throw new HTMLPurifier_Exception("Definition of $type type not supported");
}
// quick abort if raw
if ($raw) {
if (is_null($this->get($type . '.DefinitionID'))) {
// fatally error out if definition ID not set
throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID");
}
return $this->definitions[$type];
}
// set it up
$this->lock = $type;
$this->definitions[$type]->setup($this);
$this->lock = null;
// save in cache
$cache->set($this->definitions[$type], $this);
return $this->definitions[$type];
$this->definitions[$type] = $def;
return $def;
}
public function maybeGetRawDefinition($name) {
return $this->getDefinition($name, true, true);
}
public function maybeGetRawHTMLDefinition() {
return $this->getDefinition('HTML', true, true);
}
public function maybeGetRawCSSDefinition() {
return $this->getDefinition('CSS', true, true);
}
public function maybeGetRawURIDefinition() {
return $this->getDefinition('URI', true, true);
}
/**
@@ -549,17 +673,22 @@ class HTMLPurifier_Config
/**
* Produces a nicely formatted error message by supplying the
* stack frame information from two levels up and OUTSIDE of
* HTMLPurifier_Config.
* stack frame information OUTSIDE of HTMLPurifier_Config.
*/
protected function triggerError($msg, $no) {
// determine previous stack frame
$backtrace = debug_backtrace();
if ($this->chatty && isset($backtrace[1])) {
$frame = $backtrace[1];
$extra = " on line {$frame['line']} in file {$frame['file']}";
} else {
$extra = '';
$extra = '';
if ($this->chatty) {
$trace = debug_backtrace();
// zip(tail(trace), trace) -- but PHP is not Haskell har har
for ($i = 0, $c = count($trace); $i < $c - 1; $i++) {
if ($trace[$i + 1]['class'] === 'HTMLPurifier_Config') {
continue;
}
$frame = $trace[$i];
$extra = " invoked on line {$frame['line']} in file {$frame['file']}";
break;
}
}
trigger_error($msg . $extra, $no);
}

View File

@@ -60,7 +60,13 @@ class HTMLPurifier_ConfigSchema {
* Unserializes the default ConfigSchema.
*/
public static function makeFromSerial() {
return unserialize(file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema.ser'));
$contents = file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema.ser');
$r = unserialize($contents);
if (!$r) {
$hash = sha1($contents);
trigger_error("Unserialization of configuration schema failed, sha1 of file was $hash", E_USER_ERROR);
}
return $r;
}
/**

View File

@@ -0,0 +1,12 @@
CSS.AllowedFonts
TYPE: lookup/null
VERSION: 4.3.0
DEFAULT: NULL
--DESCRIPTION--
<p>
Allows you to manually specify a set of allowed fonts. If
<code>NULL</code>, all fonts are allowed. This directive
affects generic names (serif, sans-serif, monospace, cursive,
fantasy) as well as specific font families.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,13 @@
CSS.ForbiddenProperties
TYPE: lookup
VERSION: 4.2.0
DEFAULT: array()
--DESCRIPTION--
<p>
This is the logical inverse of %CSS.AllowedProperties, and it will
override that directive or any other directive. If possible,
%CSS.AllowedProperties is recommended over this directive,
because it can sometimes be difficult to tell whether or not you've
forbidden all of the CSS properties you truly would like to disallow.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,9 @@
CSS.Trusted
TYPE: bool
VERSION: 4.2.1
DEFAULT: false
--DESCRIPTION--
Indicates whether or not the user's CSS input is trusted or not. If the
input is trusted, a more expansive set of allowed properties. See
also %HTML.Trusted.
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,11 @@
Cache.SerializerPermissions
TYPE: int
VERSION: 4.3.0
DEFAULT: 0755
--DESCRIPTION--
<p>
Directory permissions of the files and directories created inside
the DefinitionCache/Serializer or other custom serializer path.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,11 @@
Core.NormalizeNewlines
TYPE: bool
VERSION: 4.2.0
DEFAULT: true
--DESCRIPTION--
<p>
Whether or not to normalize newlines to the operating
system default. When <code>false</code>, HTML Purifier
will attempt to preserve mixed newline files.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,11 @@
Core.RemoveProcessingInstructions
TYPE: bool
VERSION: 4.2.0
DEFAULT: false
--DESCRIPTION--
Instead of escaping processing instructions in the form <code>&lt;? ...
?&gt;</code>, remove it out-right. This may be useful if the HTML
you are validating contains XML processing instruction gunk, however,
it can also be user-unfriendly for people attempting to post PHP
snippets.
--# vim: et sw=4 sts=4

View File

@@ -3,6 +3,11 @@ TYPE: bool
VERSION: 3.1.0
DEFAULT: false
--DESCRIPTION--
<p>
<strong>Warning:</strong> Deprecated in favor of %HTML.SafeObject and
%Output.FlashCompat (turn both on to allow YouTube videos and other
Flash content).
</p>
<p>
This directive enables YouTube video embedding in HTML Purifier. Check
<a href="http://htmlpurifier.org/docs/enduser-youtube.html">this document

View File

@@ -5,11 +5,14 @@ DEFAULT: NULL
--DESCRIPTION--
<p>
This is a convenience directive that rolls the functionality of
%HTML.AllowedElements and %HTML.AllowedAttributes into one directive.
This is a preferred convenience directive that combines
%HTML.AllowedElements and %HTML.AllowedAttributes.
Specify elements and attributes that are allowed using:
<code>element1[attr1|attr2],element2...</code>. You can also use
newlines instead of commas to separate elements.
<code>element1[attr1|attr2],element2...</code>. For example,
if you would like to only allow paragraphs and links, specify
<code>a[href],p</code>. You can specify attributes that apply
to all elements using an asterisk, e.g. <code>*[lang]</code>.
You can also use newlines instead of commas to separate elements.
</p>
<p>
<strong>Warning</strong>:

View File

@@ -4,12 +4,17 @@ VERSION: 1.3.0
DEFAULT: NULL
--DESCRIPTION--
<p>
If HTML Purifier's tag set is unsatisfactory for your needs, you
can overload it with your own list of tags to allow. Note that this
method is subtractive: it does its job by taking away from HTML Purifier
usual feature set, so you cannot add a tag that HTML Purifier never
supported in the first place (like embed, form or head). If you
change this, you probably also want to change %HTML.AllowedAttributes.
If HTML Purifier's tag set is unsatisfactory for your needs, you can
overload it with your own list of tags to allow. If you change
this, you probably also want to change %HTML.AllowedAttributes; see
also %HTML.Allowed which lets you set allowed elements and
attributes at the same time.
</p>
<p>
If you attempt to allow an element that HTML Purifier does not know
about, HTML Purifier will raise an error. You will need to manually
tell HTML Purifier about this element by using the
<a href="http://htmlpurifier.org/docs/enduser-customize.html">advanced customization features.</a>
</p>
<p>
<strong>Warning:</strong> If another directive conflicts with the

View File

@@ -0,0 +1,11 @@
HTML.FlashAllowFullScreen
TYPE: bool
VERSION: 4.2.0
DEFAULT: false
--DESCRIPTION--
<p>
Whether or not to permit embedded Flash content from
%HTML.SafeObject to expand to the full screen. Corresponds to
the <code>allowFullScreen</code> parameter.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,7 @@
HTML.Nofollow
TYPE: bool
VERSION: 4.3.0
DEFAULT: FALSE
--DESCRIPTION--
If enabled, nofollow rel attributes are added to all outgoing links.
--# vim: et sw=4 sts=4

View File

@@ -5,4 +5,5 @@ DEFAULT: false
--DESCRIPTION--
Indicates whether or not the user input is trusted or not. If the input is
trusted, a more expansive set of allowed tags and attributes will be used.
See also %CSS.Trusted.
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,15 @@
Output.FixInnerHTML
TYPE: bool
VERSION: 4.3.0
DEFAULT: true
--DESCRIPTION--
<p>
If true, HTML Purifier will protect against Internet Explorer's
mishandling of the <code>innerHTML</code> attribute by appending
a space to any attribute that does not contain angled brackets, spaces
or quotes, but contains a backtick. This slightly changes the
semantics of any given attribute, so if this is unacceptable and
you do not use <code>innerHTML</code> on any of your pages, you can
turn this directive off.
</p>
--# vim: et sw=4 sts=4

View File

@@ -12,6 +12,6 @@ array (
--DESCRIPTION--
Whitelist that defines the schemes that a URI is allowed to have. This
prevents XSS attacks from using pseudo-schemes like javascript or mocha.
There is also support for the <code>data</code> URI scheme, but it is not
enabled by default.
There is also support for the <code>data</code> and <code>file</code>
URI schemes, but they are not enabled by default.
--# vim: et sw=4 sts=4

View File

@@ -1,12 +1,15 @@
URI.DisableResources
TYPE: bool
VERSION: 1.3.0
VERSION: 4.2.0
DEFAULT: false
--DESCRIPTION--
<p>
Disables embedding resources, essentially meaning no pictures. You can
still link to them though. See %URI.DisableExternalResources for why
this might be a good idea.
</p>
<p>
<em>Note:</em> While this directive has been available since 1.3.0,
it didn't actually start doing anything until 4.2.0.
</p>
--# vim: et sw=4 sts=4

View File

@@ -12,6 +12,17 @@ abstract class HTMLPurifier_Definition
*/
public $setup = false;
/**
* If true, write out the final definition object to the cache after
* setup. This will be true only if all invocations to get a raw
* definition object are also optimized. This does not cause file
* system thrashing because on subsequent calls the cached object
* is used and any writes to the raw definition object are short
* circuited. See enduser-customize.html for the high-level
* picture.
*/
public $optimized = null;
/**
* What type of definition is it?
*/

View File

@@ -9,14 +9,14 @@ class HTMLPurifier_DefinitionCache_Serializer extends
$file = $this->generateFilePath($config);
if (file_exists($file)) return false;
if (!$this->_prepareDir($config)) return false;
return $this->_write($file, serialize($def));
return $this->_write($file, serialize($def), $config);
}
public function set($def, $config) {
if (!$this->checkDefType($def)) return;
$file = $this->generateFilePath($config);
if (!$this->_prepareDir($config)) return false;
return $this->_write($file, serialize($def));
return $this->_write($file, serialize($def), $config);
}
public function replace($def, $config) {
@@ -24,7 +24,7 @@ class HTMLPurifier_DefinitionCache_Serializer extends
$file = $this->generateFilePath($config);
if (!file_exists($file)) return false;
if (!$this->_prepareDir($config)) return false;
return $this->_write($file, serialize($def));
return $this->_write($file, serialize($def), $config);
}
public function get($config) {
@@ -97,18 +97,34 @@ class HTMLPurifier_DefinitionCache_Serializer extends
* Convenience wrapper function for file_put_contents
* @param $file File name to write to
* @param $data Data to write into file
* @param $config Config object
* @return Number of bytes written if success, or false if failure.
*/
private function _write($file, $data) {
return file_put_contents($file, $data);
private function _write($file, $data, $config) {
$result = file_put_contents($file, $data);
if ($result !== false) {
// set permissions of the new file (no execute)
$chmod = $config->get('Cache.SerializerPermissions');
if (!$chmod) {
$chmod = 0644; // invalid config or simpletest
}
$chmod = $chmod & 0666;
chmod($file, $chmod);
}
return $result;
}
/**
* Prepares the directory that this type stores the serials in
* @param $config Config object
* @return True if successful
*/
private function _prepareDir($config) {
$directory = $this->generateDirectoryPath($config);
$chmod = $config->get('Cache.SerializerPermissions');
if (!$chmod) {
$chmod = 0755; // invalid config or simpletest
}
if (!is_dir($directory)) {
$base = $this->generateBaseDirectoryPath($config);
if (!is_dir($base)) {
@@ -116,13 +132,13 @@ class HTMLPurifier_DefinitionCache_Serializer extends
please create or change using %Cache.SerializerPath',
E_USER_WARNING);
return false;
} elseif (!$this->_testPermissions($base)) {
} elseif (!$this->_testPermissions($base, $chmod)) {
return false;
}
$old = umask(0022); // disable group and world writes
mkdir($directory);
$old = umask(0000);
mkdir($directory, $chmod);
umask($old);
} elseif (!$this->_testPermissions($directory)) {
} elseif (!$this->_testPermissions($directory, $chmod)) {
return false;
}
return true;
@@ -131,8 +147,11 @@ class HTMLPurifier_DefinitionCache_Serializer extends
/**
* Tests permissions on a directory and throws out friendly
* error messages and attempts to chmod it itself if possible
* @param $dir Directory path
* @param $chmod Permissions
* @return True if directory writable
*/
private function _testPermissions($dir) {
private function _testPermissions($dir, $chmod) {
// early abort, if it is writable, everything is hunky-dory
if (is_writable($dir)) return true;
if (!is_dir($dir)) {
@@ -146,17 +165,17 @@ class HTMLPurifier_DefinitionCache_Serializer extends
// POSIX system, we can give more specific advice
if (fileowner($dir) === posix_getuid()) {
// we can chmod it ourselves
chmod($dir, 0755);
return true;
$chmod = $chmod | 0700;
if (chmod($dir, $chmod)) return true;
} elseif (filegroup($dir) === posix_getgid()) {
$chmod = '775';
$chmod = $chmod | 0070;
} else {
// PHP's probably running as nobody, so we'll
// need to give global permissions
$chmod = '777';
$chmod = $chmod | 0777;
}
trigger_error('Directory '.$dir.' not writable, '.
'please chmod to ' . $chmod,
'please chmod to ' . decoct($chmod),
E_USER_WARNING);
} else {
// generic error message

File diff suppressed because one or more lines are too long

View File

@@ -36,6 +36,11 @@ class HTMLPurifier_Generator
*/
private $_flashCompat;
/**
* Cache of %Output.FixInnerHTML
*/
private $_innerHTMLFix;
/**
* Stack for keeping track of object information when outputting IE
* compatibility code.
@@ -54,6 +59,7 @@ class HTMLPurifier_Generator
public function __construct($config, $context) {
$this->config = $config;
$this->_scriptFix = $config->get('Output.CommentScriptContents');
$this->_innerHTMLFix = $config->get('Output.FixInnerHTML');
$this->_sortAttr = $config->get('Output.SortAttr');
$this->_flashCompat = $config->get('Output.FlashCompat');
$this->_def = $config->getHTMLDefinition();
@@ -98,9 +104,11 @@ class HTMLPurifier_Generator
}
// Normalize newlines to system defined value
$nl = $this->config->get('Output.Newline');
if ($nl === null) $nl = PHP_EOL;
if ($nl !== "\n") $html = str_replace("\n", $nl, $html);
if ($this->config->get('Core.NormalizeNewlines')) {
$nl = $this->config->get('Output.Newline');
if ($nl === null) $nl = PHP_EOL;
if ($nl !== "\n") $html = str_replace("\n", $nl, $html);
}
return $html;
}
@@ -130,19 +138,7 @@ class HTMLPurifier_Generator
$_extra = '';
if ($this->_flashCompat) {
if ($token->name == "object" && !empty($this->_flashStack)) {
$flash = array_pop($this->_flashStack);
$compat_token = new HTMLPurifier_Token_Empty("embed");
foreach ($flash->attr as $name => $val) {
if ($name == "classid") continue;
if ($name == "type") continue;
if ($name == "data") $name = "src";
$compat_token->attr[$name] = $val;
}
foreach ($flash->param as $name => $val) {
if ($name == "movie") $name = "src";
$compat_token->attr[$name] = $val;
}
$_extra = "<!--[if IE]>".$this->generateFromToken($compat_token)."<![endif]-->";
// doesn't do anything for now
}
}
return $_extra . '</' . $token->name . '>';
@@ -200,6 +196,37 @@ class HTMLPurifier_Generator
continue;
}
}
// Workaround for Internet Explorer innerHTML bug.
// Essentially, Internet Explorer, when calculating
// innerHTML, omits quotes if there are no instances of
// angled brackets, quotes or spaces. However, when parsing
// HTML (for example, when you assign to innerHTML), it
// treats backticks as quotes. Thus,
// <img alt="``" />
// becomes
// <img alt=`` />
// becomes
// <img alt='' />
// Fortunately, all we need to do is trigger an appropriate
// quoting style, which we do by adding an extra space.
// This also is consistent with the W3C spec, which states
// that user agents may ignore leading or trailing
// whitespace (in fact, most don't, at least for attributes
// like alt, but an extra space at the end is barely
// noticeable). Still, we have a configuration knob for
// this, since this transformation is not necesary if you
// don't process user input with innerHTML or you don't plan
// on supporting Internet Explorer.
if ($this->_innerHTMLFix) {
if (strpos($value, '`') !== false) {
// check if correct quoting style would not already be
// triggered
if (strcspn($value, '"\' <>') === strlen($value)) {
// protect!
$value .= ' ';
}
}
}
$html .= $key.'="'.$this->escape($value).'" ';
}
return rtrim($html);
@@ -215,7 +242,10 @@ class HTMLPurifier_Generator
* permissible for non-attribute output.
* @return String escaped data.
*/
public function escape($string, $quote = ENT_COMPAT) {
public function escape($string, $quote = null) {
// Workaround for APC bug on Mac Leopard reported by sidepodcast
// http://htmlpurifier.org/phorum/read.php?3,4823,4846
if ($quote === null) $quote = ENT_COMPAT;
return htmlspecialchars($string, $quote, 'UTF-8');
}

View File

@@ -300,7 +300,12 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
unset($allowed_attributes_mutable[$key]);
}
}
if ($delete) unset($this->info[$tag]->attr[$attr]);
if ($delete) {
if ($this->info[$tag]->attr[$attr]->required) {
trigger_error("Required attribute '$attr' in element '$tag' was not allowed, which means '$tag' will not be allowed either", E_USER_WARNING);
}
unset($this->info[$tag]->attr[$attr]);
}
}
}
// emit errors

View File

@@ -0,0 +1,19 @@
<?php
/**
* Module adds the nofollow attribute transformation to a tags. It
* is enabled by HTML.Nofollow
*/
class HTMLPurifier_HTMLModule_Nofollow extends HTMLPurifier_HTMLModule
{
public $name = 'Nofollow';
public function setup($config) {
$a = $this->addBlankElement('a');
$a->attr_transform_post[] = new HTMLPurifier_AttrTransform_Nofollow();
}
}
// vim: et sw=4 sts=4

View File

@@ -21,7 +21,7 @@ class HTMLPurifier_HTMLModule_SafeEmbed extends HTMLPurifier_HTMLModule
'allowscriptaccess' => 'Enum#never',
'allownetworking' => 'Enum#internal',
'flashvars' => 'Text',
'wmode' => 'Enum#window',
'wmode' => 'Enum#window,transparent,opaque',
'name' => 'ID',
)
);

View File

@@ -29,7 +29,6 @@ class HTMLPurifier_HTMLModule_SafeObject extends HTMLPurifier_HTMLModule
'width' => 'Pixels#' . $max,
'height' => 'Pixels#' . $max,
'data' => 'URI#embedded',
'classid' => 'Enum#clsid:d27cdb6e-ae6d-11cf-96b8-444553540000',
'codebase' => new HTMLPurifier_AttrDef_Enum(array(
'http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0')),
)

View File

@@ -216,19 +216,19 @@ class HTMLPurifier_HTMLModuleManager
}
}
// add proprietary module (this gets special treatment because
// it is completely removed from doctypes, etc.)
// custom modules
if ($config->get('HTML.Proprietary')) {
$modules[] = 'Proprietary';
}
// add SafeObject/Safeembed modules
if ($config->get('HTML.SafeObject')) {
$modules[] = 'SafeObject';
}
if ($config->get('HTML.SafeEmbed')) {
$modules[] = 'SafeEmbed';
}
if ($config->get('HTML.Nofollow')) {
$modules[] = 'Nofollow';
}
// merge in custom modules
$modules = array_merge($modules, $this->userModules);

View File

@@ -22,6 +22,7 @@ class HTMLPurifier_Injector_SafeObject extends HTMLPurifier_Injector
'movie' => true,
'flashvars' => true,
'src' => true,
'allowFullScreen' => true, // if omitted, assume to be 'false'
);
public function prepare($config, $context) {

View File

@@ -23,6 +23,7 @@ $messages = array(
'Lexer: Missing gt' => 'Missing greater-than sign (>), previous less-than sign (<) should be escaped',
'Lexer: Missing attribute key' => 'Attribute declaration has no key',
'Lexer: Missing end quote' => 'Attribute declaration has no end quote',
'Lexer: Extracted body' => 'Removed document metadata tags',
'Strategy_RemoveForeignElements: Tag transform' => '<$1> element transformed into $CurrentToken.Serialized',
'Strategy_RemoveForeignElements: Missing required attribute' => '$CurrentToken.Compact element missing required attribute $1',

View File

@@ -230,6 +230,17 @@ class HTMLPurifier_Lexer
);
}
/**
* Special Internet Explorer conditional comments should be removed.
*/
protected static function removeIEConditional($string) {
return preg_replace(
'#<!--\[if [^>]+\]>.*?<!\[endif\]-->#si', // probably should generalize for all strings
'',
$string
);
}
/**
* Callback function for escapeCDATA() that does the work.
*
@@ -252,8 +263,10 @@ class HTMLPurifier_Lexer
public function normalize($html, $config, $context) {
// normalize newlines to \n
$html = str_replace("\r\n", "\n", $html);
$html = str_replace("\r", "\n", $html);
if ($config->get('Core.NormalizeNewlines')) {
$html = str_replace("\r\n", "\n", $html);
$html = str_replace("\r", "\n", $html);
}
if ($config->get('HTML.Trusted')) {
// escape convoluted CDATA
@@ -263,9 +276,19 @@ class HTMLPurifier_Lexer
// escape CDATA
$html = $this->escapeCDATA($html);
$html = $this->removeIEConditional($html);
// extract body from document if applicable
if ($config->get('Core.ConvertDocumentToFragment')) {
$html = $this->extractBody($html);
$e = false;
if ($config->get('Core.CollectErrors')) {
$e =& $context->get('ErrorCollector');
}
$new_html = $this->extractBody($html);
if ($e && $new_html != $html) {
$e->send(E_WARNING, 'Lexer: Extracted body');
}
$html = $new_html;
}
// expand entities that aren't the big five
@@ -276,6 +299,11 @@ class HTMLPurifier_Lexer
// represent non-SGML characters (horror, horror!)
$html = HTMLPurifier_Encoder::cleanUTF8($html);
// if processing instructions are to removed, remove them now
if ($config->get('Core.RemoveProcessingInstructions')) {
$html = preg_replace('#<\?.+?\?>#s', '', $html);
}
return $html;
}

View File

@@ -72,23 +72,57 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
}
/**
* Recursive function that tokenizes a node, putting it into an accumulator.
*
* Iterative function that tokenizes a node, putting it into an accumulator.
* To iterate is human, to recurse divine - L. Peter Deutsch
* @param $node DOMNode to be tokenized.
* @param $tokens Array-list of already tokenized tokens.
* @param $collect Says whether or start and close are collected, set to
* false at first recursion because it's the implicit DIV
* tag you're dealing with.
* @returns Tokens of node appended to previously passed tokens.
*/
protected function tokenizeDOM($node, &$tokens, $collect = false) {
protected function tokenizeDOM($node, &$tokens) {
$level = 0;
$nodes = array($level => array($node));
$closingNodes = array();
do {
while (!empty($nodes[$level])) {
$node = array_shift($nodes[$level]); // FIFO
$collect = $level > 0 ? true : false;
$needEndingTag = $this->createStartNode($node, $tokens, $collect);
if ($needEndingTag) {
$closingNodes[$level][] = $node;
}
if ($node->childNodes && $node->childNodes->length) {
$level++;
$nodes[$level] = array();
foreach ($node->childNodes as $childNode) {
array_push($nodes[$level], $childNode);
}
}
}
$level--;
if ($level && isset($closingNodes[$level])) {
while($node = array_pop($closingNodes[$level])) {
$this->createEndNode($node, $tokens);
}
}
} while ($level > 0);
}
/**
* @param $node DOMNode to be tokenized.
* @param $tokens Array-list of already tokenized tokens.
* @param $collect Says whether or start and close are collected, set to
* false at first recursion because it's the implicit DIV
* tag you're dealing with.
* @returns bool if the token needs an endtoken
*/
protected function createStartNode($node, &$tokens, $collect) {
// intercept non element nodes. WE MUST catch all of them,
// but we're not getting the character reference nodes because
// those should have been preprocessed
if ($node->nodeType === XML_TEXT_NODE) {
$tokens[] = $this->factory->createText($node->data);
return;
return false;
} elseif ($node->nodeType === XML_CDATA_SECTION_NODE) {
// undo libxml's special treatment of <script> and <style> tags
$last = end($tokens);
@@ -106,48 +140,44 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
}
}
$tokens[] = $this->factory->createText($this->parseData($data));
return;
return false;
} elseif ($node->nodeType === XML_COMMENT_NODE) {
// this is code is only invoked for comments in script/style in versions
// of libxml pre-2.6.28 (regular comments, of course, are still
// handled regularly)
$tokens[] = $this->factory->createComment($node->data);
return;
return false;
} elseif (
// not-well tested: there may be other nodes we have to grab
$node->nodeType !== XML_ELEMENT_NODE
) {
return;
return false;
}
$attr = $node->hasAttributes() ?
$this->transformAttrToAssoc($node->attributes) :
array();
$attr = $node->hasAttributes() ? $this->transformAttrToAssoc($node->attributes) : array();
// We still have to make sure that the element actually IS empty
if (!$node->childNodes->length) {
if ($collect) {
$tokens[] = $this->factory->createEmpty($node->tagName, $attr);
}
return false;
} else {
if ($collect) { // don't wrap on first iteration
if ($collect) {
$tokens[] = $this->factory->createStart(
$tag_name = $node->tagName, // somehow, it get's dropped
$attr
);
}
foreach ($node->childNodes as $node) {
// remember, it's an accumulator. Otherwise, we'd have
// to use array_merge
$this->tokenizeDOM($node, $tokens, true);
}
if ($collect) {
$tokens[] = $this->factory->createEnd($tag_name);
}
return true;
}
}
protected function createEndNode($node, &$tokens) {
$tokens[] = $this->factory->createEnd($node->tagName);
}
/**
* Converts a DOMNamedNodeMap of DOMAttr objects into an assoc array.
*

View File

@@ -384,7 +384,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
}
}
if ($value === false) $value = '';
return array($key => $value);
return array($key => $this->parseData($value));
}
// setup loop environment

View File

@@ -125,8 +125,6 @@ class HTML5 {
const EOF = 5;
public function __construct($data) {
$data = str_replace("\r\n", "\n", $data);
$data = str_replace("\r", null, $data);
$this->data = $data;
$this->char = -1;

View File

@@ -2,6 +2,14 @@
/**
* Takes tokens makes them well-formed (balance end tags, etc.)
*
* Specification of the armor attributes this strategy uses:
*
* - MakeWellFormed_TagClosedError: This armor field is used to
* suppress tag closed errors for certain tokens [TagClosedSuppress],
* in particular, if a tag was generated automatically by HTML
* Purifier, we may rely on our infrastructure to close it for us
* and shouldn't report an error to the user [TagClosedAuto].
*/
class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
{
@@ -43,6 +51,12 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
// local variables
$generator = new HTMLPurifier_Generator($config, $context);
$escape_invalid_tags = $config->get('Core.EscapeInvalidTags');
// used for autoclose early abortion
$global_parent_allowed_elements = array();
if (isset($definition->info[$definition->info_parent])) {
// may be unset under testing circumstances
$global_parent_allowed_elements = $definition->info[$definition->info_parent]->child->getAllowedElements($config);
}
$e = $context->get('ErrorCollector', true);
$t = false; // token index
$i = false; // injector index
@@ -102,7 +116,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
// -- end INJECTOR --
// a note on punting:
// a note on reprocessing:
// In order to reduce code duplication, whenever some code needs
// to make HTML changes in order to make things "correct", the
// new HTML gets sent through the purifier, regardless of its
@@ -149,7 +163,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
$top_nesting = array_pop($this->stack);
$this->stack[] = $top_nesting;
// send error
// send error [TagClosedSuppress]
if ($e && !isset($top_nesting->armor['MakeWellFormed_TagClosedError'])) {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $top_nesting);
}
@@ -165,6 +179,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
$token = $tokens[$t];
//echo '<br>'; printTokens($tokens, $t); printTokens($this->stack);
//flush();
// quick-check: if it's not a tag, no need to process
if (empty($token->is_tag)) {
@@ -192,12 +207,12 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
$ok = false;
if ($type === 'empty' && $token instanceof HTMLPurifier_Token_Start) {
// claims to be a start tag but is empty
$token = new HTMLPurifier_Token_Empty($token->name, $token->attr);
$token = new HTMLPurifier_Token_Empty($token->name, $token->attr, $token->line, $token->col, $token->armor);
$ok = true;
} elseif ($type && $type !== 'empty' && $token instanceof HTMLPurifier_Token_Empty) {
// claims to be empty but really is a start tag
$this->swap(new HTMLPurifier_Token_End($token->name));
$this->insertBefore(new HTMLPurifier_Token_Start($token->name, $token->attr));
$this->insertBefore(new HTMLPurifier_Token_Start($token->name, $token->attr, $token->line, $token->col, $token->armor));
// punt (since we had to modify the input stream in a non-trivial way)
$reprocess = true;
continue;
@@ -210,6 +225,19 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
// ...unless they also have to close their parent
if (!empty($this->stack)) {
// Performance note: you might think that it's rather
// inefficient, recalculating the autoclose information
// for every tag that a token closes (since when we
// do an autoclose, we push a new token into the
// stream and then /process/ that, before
// re-processing this token.) But this is
// necessary, because an injector can make an
// arbitrary transformations to the autoclosing
// tokens we introduce, so things may have changed
// in the meantime. Also, doing the inefficient thing is
// "easy" to reason about (for certain perverse definitions
// of "easy")
$parent = array_pop($this->stack);
$this->stack[] = $parent;
@@ -221,11 +249,14 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
}
if ($autoclose && $definition->info[$token->name]->wrap) {
// check if this is actually a wrap (mmm wraps!)
// Check if an element can be wrapped by another
// element to make it valid in a context (for
// example, <ul><ul> needs a <li> in between)
$wrapname = $definition->info[$token->name]->wrap;
$wrapdef = $definition->info[$wrapname];
$elements = $wrapdef->child->getAllowedElements($config);
if (isset($elements[$token->name])) {
$parent_elements = $definition->info[$parent->name]->child->getAllowedElements($config);
if (isset($elements[$token->name]) && isset($parent_elements[$wrapname])) {
$newtoken = new HTMLPurifier_Token_Start($wrapname);
$this->insertBefore($newtoken);
$reprocess = true;
@@ -239,24 +270,51 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
}
if ($autoclose) {
// errors need to be updated
$new_token = new HTMLPurifier_Token_End($parent->name);
$new_token->start = $parent;
if ($carryover) {
$element = clone $parent;
$element->armor['MakeWellFormed_TagClosedError'] = true;
$element->carryover = true;
$this->processToken(array($new_token, $token, $element));
} else {
$this->insertBefore($new_token);
}
if ($e && !isset($parent->armor['MakeWellFormed_TagClosedError'])) {
if (!$carryover) {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag auto closed', $parent);
} else {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag carryover', $parent);
// check if this autoclose is doomed to fail
// (this rechecks $parent, which his harmless)
$autoclose_ok = isset($global_parent_allowed_elements[$token->name]);
if (!$autoclose_ok) {
foreach ($this->stack as $ancestor) {
$elements = $definition->info[$ancestor->name]->child->getAllowedElements($config);
if (isset($elements[$token->name])) {
$autoclose_ok = true;
break;
}
if ($definition->info[$token->name]->wrap) {
$wrapname = $definition->info[$token->name]->wrap;
$wrapdef = $definition->info[$wrapname];
$wrap_elements = $wrapdef->child->getAllowedElements($config);
if (isset($wrap_elements[$token->name]) && isset($elements[$wrapname])) {
$autoclose_ok = true;
break;
}
}
}
}
if ($autoclose_ok) {
// errors need to be updated
$new_token = new HTMLPurifier_Token_End($parent->name);
$new_token->start = $parent;
if ($carryover) {
$element = clone $parent;
// [TagClosedAuto]
$element->armor['MakeWellFormed_TagClosedError'] = true;
$element->carryover = true;
$this->processToken(array($new_token, $token, $element));
} else {
$this->insertBefore($new_token);
}
// [TagClosedSuppress]
if ($e && !isset($parent->armor['MakeWellFormed_TagClosedError'])) {
if (!$carryover) {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag auto closed', $parent);
} else {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag carryover', $parent);
}
}
} else {
$this->remove();
}
$reprocess = true;
continue;
}
@@ -362,7 +420,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
if ($e) {
for ($j = $c - 1; $j > 0; $j--) {
// notice we exclude $j == 0, i.e. the current ending tag, from
// the errors...
// the errors... [TagClosedSuppress]
if (!isset($skipped_tags[$j]->armor['MakeWellFormed_TagClosedError'])) {
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$j]);
}
@@ -377,6 +435,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
$new_token->start = $skipped_tags[$j];
array_unshift($replace, $new_token);
if (isset($definition->info[$new_token->name]) && $definition->info[$new_token->name]->formatting) {
// [TagClosedAuto]
$element = clone $skipped_tags[$j];
$element->carryover = true;
$element->armor['MakeWellFormed_TagClosedError'] = true;
@@ -445,7 +504,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
}
/**
* Inserts a token before the current token. Cursor now points to this token
* Inserts a token before the current token. Cursor now points to
* this token. You must reprocess after this.
*/
private function insertBefore($token) {
array_splice($this->tokens, $this->t, 0, array($token));
@@ -453,14 +513,15 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
/**
* Removes current token. Cursor now points to new token occupying previously
* occupied space.
* occupied space. You must reprocess after this.
*/
private function remove() {
array_splice($this->tokens, $this->t, 1);
}
/**
* Swap current token with new token. Cursor points to new token (no change).
* Swap current token with new token. Cursor points to new token (no
* change). You must reprocess after this.
*/
private function swap($token) {
$this->tokens[$this->t] = $token;

View File

@@ -63,13 +63,15 @@ class HTMLPurifier_TagTransform_Font extends HTMLPurifier_TagTransform
// handle size transform
if (isset($attr['size'])) {
// normalize large numbers
if ($attr['size']{0} == '+' || $attr['size']{0} == '-') {
$size = (int) $attr['size'];
if ($size < -2) $attr['size'] = '-2';
if ($size > 4) $attr['size'] = '+4';
} else {
$size = (int) $attr['size'];
if ($size > 7) $attr['size'] = '7';
if ($attr['size'] !== '') {
if ($attr['size']{0} == '+' || $attr['size']{0} == '-') {
$size = (int) $attr['size'];
if ($size < -2) $attr['size'] = '-2';
if ($size > 4) $attr['size'] = '+4';
} else {
$size = (int) $attr['size'];
if ($size > 7) $attr['size'] = '7';
}
}
if (isset($this->_size_lookup[$attr['size']])) {
$prepend_style .= 'font-size:' .

View File

@@ -33,7 +33,7 @@ class HTMLPurifier_Token_Tag extends HTMLPurifier_Token
* @param $name String name.
* @param $attr Associative array of attributes.
*/
public function __construct($name, $attr = array(), $line = null, $col = null) {
public function __construct($name, $attr = array(), $line = null, $col = null, $armor = array()) {
$this->name = ctype_lower($name) ? $name : strtolower($name);
foreach ($attr as $key => $value) {
// normalization only necessary when key is not lowercase
@@ -50,6 +50,7 @@ class HTMLPurifier_Token_Tag extends HTMLPurifier_Token
$this->attr = $attr;
$this->line = $line;
$this->col = $col;
$this->armor = $armor;
}
}

View File

@@ -67,14 +67,6 @@ class HTMLPurifier_URI
$chars_gen_delims = ':/?#[]@';
$chars_pchar = $chars_sub_delims . ':@';
// validate scheme (MUST BE FIRST!)
if (!is_null($this->scheme) && is_null($this->host)) {
$def = $config->getDefinition('URI');
if ($def->defaultScheme === $this->scheme) {
$this->scheme = null;
}
}
// validate host
if (!is_null($this->host)) {
$host_def = new HTMLPurifier_AttrDef_URI_Host();
@@ -82,6 +74,21 @@ class HTMLPurifier_URI
if ($this->host === false) $this->host = null;
}
// validate scheme
// NOTE: It's not appropriate to check whether or not this
// scheme is in our registry, since a URIFilter may convert a
// URI that we don't allow into one we do. So instead, we just
// check if the scheme can be dropped because there is no host
// and it is our default scheme.
if (!is_null($this->scheme) && is_null($this->host) || $this->host === '') {
// support for relative paths is pretty abysmal when the
// scheme is present, so axe it when possible
$def = $config->getDefinition('URI');
if ($def->defaultScheme === $this->scheme) {
$this->scheme = null;
}
}
// validate username
if (!is_null($this->userinfo)) {
$encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . ':');
@@ -96,32 +103,48 @@ class HTMLPurifier_URI
// validate path
$path_parts = array();
$segments_encoder = new HTMLPurifier_PercentEncoder($chars_pchar . '/');
if (!is_null($this->host)) {
if (!is_null($this->host)) { // this catches $this->host === ''
// path-abempty (hier and relative)
// http://www.example.com/my/path
// //www.example.com/my/path (looks odd, but works, and
// recognized by most browsers)
// (this set is valid or invalid on a scheme by scheme
// basis, so we'll deal with it later)
// file:///my/path
// ///my/path
$this->path = $segments_encoder->encode($this->path);
} elseif ($this->path !== '' && $this->path[0] === '/') {
// path-absolute (hier and relative)
if (strlen($this->path) >= 2 && $this->path[1] === '/') {
// This shouldn't ever happen!
$this->path = '';
} else {
} elseif ($this->path !== '') {
if ($this->path[0] === '/') {
// path-absolute (hier and relative)
// http:/my/path
// /my/path
if (strlen($this->path) >= 2 && $this->path[1] === '/') {
// This could happen if both the host gets stripped
// out
// http://my/path
// //my/path
$this->path = '';
} else {
$this->path = $segments_encoder->encode($this->path);
}
} elseif (!is_null($this->scheme)) {
// path-rootless (hier)
// http:my/path
// Short circuit evaluation means we don't need to check nz
$this->path = $segments_encoder->encode($this->path);
}
} elseif (!is_null($this->scheme) && $this->path !== '') {
// path-rootless (hier)
// Short circuit evaluation means we don't need to check nz
$this->path = $segments_encoder->encode($this->path);
} elseif (is_null($this->scheme) && $this->path !== '') {
// path-noscheme (relative)
// (once again, not checking nz)
$segment_nc_encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . '@');
$c = strpos($this->path, '/');
if ($c !== false) {
$this->path =
$segment_nc_encoder->encode(substr($this->path, 0, $c)) .
$segments_encoder->encode(substr($this->path, $c));
} else {
$this->path = $segment_nc_encoder->encode($this->path);
// path-noscheme (relative)
// my/path
// (once again, not checking nz)
$segment_nc_encoder = new HTMLPurifier_PercentEncoder($chars_sub_delims . '@');
$c = strpos($this->path, '/');
if ($c !== false) {
$this->path =
$segment_nc_encoder->encode(substr($this->path, 0, $c)) .
$segments_encoder->encode(substr($this->path, $c));
} else {
$this->path = $segment_nc_encoder->encode($this->path);
}
}
} else {
// path-empty (hier and relative)
@@ -150,6 +173,9 @@ class HTMLPurifier_URI
public function toString() {
// reconstruct authority
$authority = null;
// there is a rendering difference between a null authority
// (http:foo-bar) and an empty string authority
// (http:///foo-bar).
if (!is_null($this->host)) {
$authority = '';
if(!is_null($this->userinfo)) $authority .= $this->userinfo . '@';
@@ -157,7 +183,12 @@ class HTMLPurifier_URI
if(!is_null($this->port)) $authority .= ':' . $this->port;
}
// reconstruct the result
// Reconstruct the result
// One might wonder about parsing quirks from browsers after
// this reconstruction. Unfortunately, parsing behavior depends
// on what *scheme* was employed (file:///foo is handled *very*
// differently than http:///foo), so unfortunately we have to
// defer to the schemes to do the right thing.
$result = '';
if (!is_null($this->scheme)) $result .= $this->scheme . ':';
if (!is_null($authority)) $result .= '//' . $authority;

View File

@@ -0,0 +1,11 @@
<?php
class HTMLPurifier_URIFilter_DisableResources extends HTMLPurifier_URIFilter
{
public $name = 'DisableResources';
public function filter(&$uri, $config, $context) {
return !$context->get('EmbeddedURI', true);
}
}
// vim: et sw=4 sts=4

View File

@@ -3,11 +3,13 @@
/**
* Validator for the components of a URI for a specific scheme
*/
class HTMLPurifier_URIScheme
abstract class HTMLPurifier_URIScheme
{
/**
* Scheme's default port (integer)
* Scheme's default port (integer). If an explicit port number is
* specified that coincides with the default port, it will be
* elided.
*/
public $default_port = null;
@@ -24,17 +26,62 @@ class HTMLPurifier_URIScheme
public $hierarchical = false;
/**
* Validates the components of a URI
* @note This implementation should be called by children if they define
* a default port, as it does port processing.
* @param $uri Instance of HTMLPurifier_URI
* Whether or not the URI may omit a hostname when the scheme is
* explicitly specified, ala file:///path/to/file. As of writing,
* 'file' is the only scheme that browsers support his properly.
*/
public $may_omit_host = false;
/**
* Validates the components of a URI for a specific scheme.
* @param $uri Reference to a HTMLPurifier_URI object
* @param $config HTMLPurifier_Config object
* @param $context HTMLPurifier_Context object
* @return Bool success or failure
*/
public abstract function doValidate(&$uri, $config, $context);
/**
* Public interface for validating components of a URI. Performs a
* bunch of default actions. Don't overload this method.
* @param $uri Reference to a HTMLPurifier_URI object
* @param $config HTMLPurifier_Config object
* @param $context HTMLPurifier_Context object
* @return Bool success or failure
*/
public function validate(&$uri, $config, $context) {
if ($this->default_port == $uri->port) $uri->port = null;
return true;
// kludge: browsers do funny things when the scheme but not the
// authority is set
if (!$this->may_omit_host &&
// if the scheme is present, a missing host is always in error
(!is_null($uri->scheme) && ($uri->host === '' || is_null($uri->host))) ||
// if the scheme is not present, a *blank* host is in error,
// since this translates into '///path' which most browsers
// interpret as being 'http://path'.
(is_null($uri->scheme) && $uri->host === '')
) {
do {
if (is_null($uri->scheme)) {
if (substr($uri->path, 0, 2) != '//') {
$uri->host = null;
break;
}
// URI is '////path', so we cannot nullify the
// host to preserve semantics. Try expanding the
// hostname instead (fall through)
}
// first see if we can manually insert a hostname
$host = $config->get('URI.Host');
if (!is_null($host)) {
$uri->host = $host;
} else {
// we can't do anything sensible, reject the URL.
return false;
}
} while (false);
}
return $this->doValidate($uri, $config, $context);
}
}

View File

@@ -13,8 +13,11 @@ class HTMLPurifier_URIScheme_data extends HTMLPurifier_URIScheme {
'image/gif' => true,
'image/png' => true,
);
// this is actually irrelevant since we only write out the path
// component
public $may_omit_host = true;
public function validate(&$uri, $config, $context) {
public function doValidate(&$uri, $config, $context) {
$result = explode(',', $uri->path, 2);
$is_base64 = false;
$charset = null;

View File

@@ -0,0 +1,32 @@
<?php
/**
* Validates file as defined by RFC 1630 and RFC 1738.
*/
class HTMLPurifier_URIScheme_file extends HTMLPurifier_URIScheme {
// Generally file:// URLs are not accessible from most
// machines, so placing them as an img src is incorrect.
public $browsable = false;
// Basically the *only* URI scheme for which this is true, since
// accessing files on the local machine is very common. In fact,
// browsers on some operating systems don't understand the
// authority, though I hear it is used on Windows to refer to
// network shares.
public $may_omit_host = true;
public function doValidate(&$uri, $config, $context) {
// Authentication method is not supported
$uri->userinfo = null;
// file:// makes no provisions for accessing the resource
$uri->port = null;
// While it seems to work on Firefox, the querystring has
// no possible effect and is thus stripped.
$uri->query = null;
return true;
}
}
// vim: et sw=4 sts=4

View File

@@ -9,8 +9,7 @@ class HTMLPurifier_URIScheme_ftp extends HTMLPurifier_URIScheme {
public $browsable = true; // usually
public $hierarchical = true;
public function validate(&$uri, $config, $context) {
parent::validate($uri, $config, $context);
public function doValidate(&$uri, $config, $context) {
$uri->query = null;
// typecode check

View File

@@ -9,8 +9,7 @@ class HTMLPurifier_URIScheme_http extends HTMLPurifier_URIScheme {
public $browsable = true;
public $hierarchical = true;
public function validate(&$uri, $config, $context) {
parent::validate($uri, $config, $context);
public function doValidate(&$uri, $config, $context) {
$uri->userinfo = null;
return true;
}

View File

@@ -12,9 +12,9 @@
class HTMLPurifier_URIScheme_mailto extends HTMLPurifier_URIScheme {
public $browsable = false;
public $may_omit_host = true;
public function validate(&$uri, $config, $context) {
parent::validate($uri, $config, $context);
public function doValidate(&$uri, $config, $context) {
$uri->userinfo = null;
$uri->host = null;
$uri->port = null;

View File

@@ -6,9 +6,9 @@
class HTMLPurifier_URIScheme_news extends HTMLPurifier_URIScheme {
public $browsable = false;
public $may_omit_host = true;
public function validate(&$uri, $config, $context) {
parent::validate($uri, $config, $context);
public function doValidate(&$uri, $config, $context) {
$uri->userinfo = null;
$uri->host = null;
$uri->port = null;

View File

@@ -8,8 +8,7 @@ class HTMLPurifier_URIScheme_nntp extends HTMLPurifier_URIScheme {
public $default_port = 119;
public $browsable = false;
public function validate(&$uri, $config, $context) {
parent::validate($uri, $config, $context);
public function doValidate(&$uri, $config, $context) {
$uri->userinfo = null;
$uri->query = null;
return true;

View File

@@ -62,7 +62,7 @@ class HTMLPurifier_VarParser_Flexible extends HTMLPurifier_VarParser
foreach ($var as $keypair) {
$c = explode(':', $keypair, 2);
if (!isset($c[1])) continue;
$nvar[$c[0]] = $c[1];
$nvar[trim($c[0])] = trim($c[1]);
}
$var = $nvar;
}
@@ -79,8 +79,15 @@ class HTMLPurifier_VarParser_Flexible extends HTMLPurifier_VarParser
return $new;
} else break;
}
if ($type === self::ALIST) {
trigger_error("Array list did not have consecutive integer indexes", E_USER_WARNING);
return array_values($var);
}
if ($type === self::LOOKUP) {
foreach ($var as $key => $value) {
if ($value !== true) {
trigger_error("Lookup array has non-true value at key '$key'; maybe your input array was not indexed numerically", E_USER_WARNING);
}
$var[$key] = true;
}
}

View File

@@ -18,8 +18,7 @@ function e($cmd) {
if ($status) exit($status);
}
$php = $_SERVER['argv'][1];
if (!$php) $php = 'php';
$php = empty($_SERVER['argv'][1]) ? 'php' : $_SERVER['argv'][1];
e($php . ' generate-includes.php');
e($php . ' generate-schema-cache.php');

View File

@@ -36,7 +36,7 @@ function unichr($dec) {
}
if ( !is_dir($entity_dir) ) exit("Fatal Error: Can't find entity directory.\n");
if ( file_exists($output_file) ) exit("Fatal Error: entity-lookup.txt already exists.\n");
if ( file_exists($output_file) ) exit("Fatal Error: output file already exists.\n");
$dh = @opendir($entity_dir);
if ( !$dh ) exit("Fatal Error: Cannot read entity directory.\n");
@@ -52,7 +52,7 @@ closedir($dh);
if ( !$entity_files ) exit("Fatal Error: No entity files to parse.\n");
$entity_table = array();
$regexp = '/<!ENTITY\s+([A-Za-z]+)\s+"&#(?:38;#)?([0-9]+);">/';
$regexp = '/<!ENTITY\s+([A-Za-z0-9]+)\s+"&#(?:38;#)?([0-9]+);">/';
foreach ( $entity_files as $file ) {
$contents = file_get_contents($entity_dir . $file);

View File

@@ -80,8 +80,9 @@ function get_dependency_lookup($file) {
if (strncmp('class', $line, 5) === 0) {
// The implementation here is fragile and will break if we attempt
// to use interfaces. Beware!
list(, $parent) = explode(' extends ', trim($line, ' {'."\n\r"), 2);
if (empty($parent)) break;
$arr = explode(' extends ', trim($line, ' {'."\n\r"), 2);
if (count($arr) < 2) break;
$parent = $arr[1];
$dep_file = HTMLPurifier_Bootstrap::getPath($parent);
if (!$dep_file) break;
$deps[$dep_file] = true;

View File

@@ -1,367 +0,0 @@
Index: src/PHPT/Case.php
===================================================================
--- src/PHPT/Case.php (revision 691)
+++ src/PHPT/Case.php (working copy)
@@ -28,17 +28,14 @@
{
$reporter->onCaseStart($this);
try {
- if ($this->sections->filterByInterface('RunnableBefore')->valid()) {
- foreach ($this->sections as $section) {
- $section->run($this);
- }
+ $runnable_before = $this->sections->filterByInterface('RunnableBefore');
+ foreach ($runnable_before as $section) {
+ $section->run($this);
}
- $this->sections->filterByInterface();
$this->sections->FILE->run($this);
- if ($this->sections->filterByInterface('RunnableAfter')->valid()) {
- foreach ($this->sections as $section) {
- $section->run($this);
- }
+ $runnable_after = $this->sections->filterByInterface('RunnableAfter');
+ foreach ($runnable_after as $section) {
+ $section->run($this);
}
$reporter->onCasePass($this);
} catch (PHPT_Case_VetoException $veto) {
@@ -46,7 +43,6 @@
} catch (PHPT_Case_FailureException $failure) {
$reporter->onCaseFail($this, $failure);
}
- $this->sections->filterByInterface();
$reporter->onCaseEnd($this);
}
Index: src/PHPT/Case/Validator/CgiRequired.php
===================================================================
--- src/PHPT/Case/Validator/CgiRequired.php (revision 691)
+++ src/PHPT/Case/Validator/CgiRequired.php (working copy)
@@ -17,7 +17,6 @@
public function is(PHPT_Case $case)
{
$return = $case->sections->filterByInterface('CgiExecutable')->valid();
- $case->sections->filterByInterface();
return $return;
}
}
Index: src/PHPT/CodeRunner/CommandLine.php
===================================================================
--- src/PHPT/CodeRunner/CommandLine.php (revision 691)
+++ src/PHPT/CodeRunner/CommandLine.php (working copy)
@@ -13,7 +13,7 @@
$this->_filename = $runner->filename;
$this->_ini = (string)$runner->ini;
$this->_args = (string)$runner->args;
- $this->_executable = str_replace(' ', '\ ', (string)$runner->executable);
+ $this->_executable = $runner->executable;
$this->_post_filename = (string)$runner->post_filename;
}
Index: src/PHPT/CodeRunner/Driver/WScriptShell.php
===================================================================
--- src/PHPT/CodeRunner/Driver/WScriptShell.php (revision 691)
+++ src/PHPT/CodeRunner/Driver/WScriptShell.php (working copy)
@@ -23,9 +23,9 @@
}
}
if ($found == false) {
- throw new PHPT_CodeRunner_InvalidExecutableException(
- 'unable to locate PHP executable: ' . $this->executable
- );
+ //throw new PHPT_CodeRunner_InvalidExecutableException(
+ // 'unable to locate PHP executable: ' . $this->executable
+ //);
}
}
@@ -69,7 +69,7 @@
$error = $this->_process->StdErr->ReadAll();
if (!empty($error)) {
- throw new PHPT_CodeRunner_ExecutionException($error);
+ throw new PHPT_CodeRunner_ExecutionException($error, $this->_commandFactory());
}
return $this->_process->StdOut->ReadAll();
@@ -93,6 +93,7 @@
{
$return = '';
foreach ($this->environment as $key => $value) {
+ $value = str_replace('&', '^&', $value);
$return .= "set {$key}={$value} & ";
}
return $return;
Index: src/PHPT/CodeRunner/Factory.php
===================================================================
--- src/PHPT/CodeRunner/Factory.php (revision 691)
+++ src/PHPT/CodeRunner/Factory.php (working copy)
@@ -33,7 +33,13 @@
'php-cgi';
}
- if (strtoupper(substr(PHP_OS, 0, 3)) == 'WIN') {
+ if (
+ strtoupper(substr(PHP_OS, 0, 3)) == 'WIN' &&
+ (
+ $runner->executable == 'php' ||
+ $runner->executable == 'php-cgi'
+ )
+ ) {
$runner->executable = $runner->executable . '.exe';
}
try {
Index: src/PHPT/Section/ModifiableAbstract.php
===================================================================
--- src/PHPT/Section/ModifiableAbstract.php (revision 691)
+++ src/PHPT/Section/ModifiableAbstract.php (working copy)
@@ -15,12 +15,10 @@
public function run(PHPT_Case $case)
{
- $sections = clone $case->sections;
- if ($sections->filterByInterface($this->_modifier_name . 'Modifier')->valid()) {
- $modifyMethod = 'modify' . $this->_modifier_name;
- foreach ($sections as $section) {
- $section->$modifyMethod($this);
- }
+ $modifiers = $case->sections->filterByInterface($this->_modifier_name . 'Modifier');
+ $modifyMethod = 'modify' . $this->_modifier_name;
+ foreach ($modifiers as $section) {
+ $section->$modifyMethod($this);
}
}
Index: src/PHPT/Section/SKIPIF.php
===================================================================
--- src/PHPT/Section/SKIPIF.php (revision 691)
+++ src/PHPT/Section/SKIPIF.php (working copy)
@@ -3,10 +3,12 @@
class PHPT_Section_SKIPIF implements PHPT_Section_RunnableBefore
{
private $_data = null;
+ private $_runner_factory = null;
public function __construct($data)
{
$this->_data = $data;
+ $this->_runner_factory = new PHPT_CodeRunner_Factory();
}
public function run(PHPT_Case $case)
@@ -16,9 +18,7 @@
// @todo refactor to PHPT_CodeRunner
file_put_contents($filename, $this->_data);
- $response = array();
- exec('php -f ' . $filename, $response);
- $response = implode("\n", $response);
+ $response = $this->_runner_factory->factory($case)->run($filename)->output;
unlink($filename);
if (preg_match('/^skip( - (.*))?/', $response, $matches)) {
Index: src/PHPT/SectionList.php
===================================================================
--- src/PHPT/SectionList.php (revision 691)
+++ src/PHPT/SectionList.php (working copy)
@@ -2,7 +2,6 @@
class PHPT_SectionList implements Iterator
{
- private $_raw_sections = array();
private $_sections = array();
private $_section_map = array();
private $_key_map = array();
@@ -15,14 +14,12 @@
}
$name = strtoupper(str_replace('PHPT_Section_', '', get_class($section)));
$key = $section instanceof PHPT_Section_Runnable ? $section->getPriority() . '.' . $name : $name;
- $this->_raw_sections[$key] = $section;
+ $this->_sections[$key] = $section;
$this->_section_map[$name] = $key;
$this->_key_map[$key] = $name;
}
- ksort($this->_raw_sections);
-
- $this->_sections = $this->_raw_sections;
+ ksort($this->_sections);
}
public function current()
@@ -52,21 +49,23 @@
public function filterByInterface($interface = null)
{
+ $ret = new PHPT_SectionList();
+
if (is_null($interface)) {
- $this->_sections = $this->_raw_sections;
- return $this;
+ $ret->_sections = $this->_sections;
+ return $ret;
}
$full_interface = 'PHPT_Section_' . $interface;
- $this->_sections = array();
- foreach ($this->_raw_sections as $name => $section) {
+ $ret->_sections = array();
+ foreach ($this->_sections as $name => $section) {
if (!$section instanceof $full_interface) {
continue;
}
- $this->_sections[$name] = $section;
+ $ret->_sections[$name] = $section;
}
- return $this;
+ return $ret;
}
public function has($name)
@@ -74,11 +73,11 @@
if (!isset($this->_section_map[$name])) {
return false;
}
- return isset($this->_raw_sections[$this->_section_map[$name]]);
+ return isset($this->_sections[$this->_section_map[$name]]);
}
public function __get($key)
{
- return $this->_raw_sections[$this->_section_map[$key]];
+ return $this->_sections[$this->_section_map[$key]];
}
}
Index: tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt
===================================================================
--- tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt (revision 691)
+++ tests/CodeRunner/Driver/WScriptShell/injects-ini-settings.phpt (working copy)
@@ -17,9 +17,9 @@
// sanity check
$obj = new FoobarIni();
-assert('(string)$obj == " -d display_errors=1 "');
+assert('(string)$obj == " -d \"display_errors=1\" "');
$obj->display_errors = 0;
-assert('(string)$obj == " -d display_errors=0 "');
+assert('(string)$obj == " -d \"display_errors=0\" "');
unset($obj);
Index: tests/Section/File/restores-case-sections.phpt
===================================================================
--- tests/Section/File/restores-case-sections.phpt (revision 691)
+++ tests/Section/File/restores-case-sections.phpt (working copy)
@@ -1,24 +0,0 @@
---TEST--
-After PHPT_Section_FILE::run(), the sections property of the provide $case object
-is restored to its unfiltered state
---FILE--
-<?php
-
-require_once dirname(__FILE__) . '/../../_setup.inc';
-require_once dirname(__FILE__) . '/../_simple-test-case.inc';
-require_once dirname(__FILE__) . '/_simple-file-modifier.inc';
-
-$case = new PHPT_SimpleTestCase();
-$case->sections = new PHPT_SectionList(array(
- new PHPT_Section_ARGS('foo=bar'),
-));
-
-$section = new PHPT_Section_FILE('hello world');
-$section->run($case);
-
-assert('$case->sections->valid()');
-
-?>
-===DONE===
---EXPECT--
-===DONE===
Index: tests/SectionList/filter-by-interface.phpt
===================================================================
--- tests/SectionList/filter-by-interface.phpt (revision 691)
+++ tests/SectionList/filter-by-interface.phpt (working copy)
@@ -17,10 +17,10 @@
$data = array_merge($runnable, $non_runnable);
$list = new PHPT_SectionList($data);
-$list->filterByInterface('Runnable');
-assert('$list->valid()');
-$list->filterByInterface('EnvModifier');
-assert('$list->valid() == false');
+$runnable = $list->filterByInterface('Runnable');
+assert('$runnable->valid()');
+$env_modifier = $list->filterByInterface('EnvModifier');
+assert('$env_modifier->valid() == false');
?>
===DONE===
Index: tests/SectionList/filter-resets-with-null.phpt
===================================================================
--- tests/SectionList/filter-resets-with-null.phpt (revision 691)
+++ tests/SectionList/filter-resets-with-null.phpt (working copy)
@@ -1,36 +0,0 @@
---TEST--
-If you call filterByInterface() with null or no-value, the full dataset is restored
---FILE--
-<?php
-
-require_once dirname(__FILE__) . '/../_setup.inc';
-
-$runnable = array(
- 'ENV' => new PHPT_Section_ENV(''),
- 'CLEAN' => new PHPT_Section_CLEAN(''),
-);
-
-class PHPT_Section_FOO implements PHPT_Section { }
-$non_runnable = array(
- 'FOO' => new PHPT_Section_FOO(),
-);
-
-$data = array_merge($runnable, $non_runnable);
-$list = new PHPT_SectionList($data);
-$list->filterByInterface('Runnable');
-
-// sanity check
-foreach ($list as $key => $value) {
- assert('$runnable[$key] == $value');
-}
-
-$list->filterByInterface();
-
-foreach ($list as $key => $value) {
- assert('$data[$key] == $value');
-}
-
-?>
-===DONE===
---EXPECT--
-===DONE===
Index: tests/Util/Code/runAsFile-executes-in-file.phpt
===================================================================
--- tests/Util/Code/runAsFile-executes-in-file.phpt (revision 691)
+++ tests/Util/Code/runAsFile-executes-in-file.phpt (working copy)
@@ -10,7 +10,7 @@
$util = new PHPT_Util_Code($code);
-$file = dirname(__FILE__) . '/foobar.php';
+$file = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'foobar.php';
$result = $util->runAsFile($file);
assert('$result == $file');
Index: tests/Util/Code/runAsFile-returns-output-if-no-return.phpt
===================================================================
--- tests/Util/Code/runAsFile-returns-output-if-no-return.phpt (revision 691)
+++ tests/Util/Code/runAsFile-returns-output-if-no-return.phpt (working copy)
@@ -9,7 +9,7 @@
$util = new PHPT_Util_Code($code);
-$file = dirname(__FILE__) . '/foobar.php';
+$file = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'foobar.php';
$result = $util->runAsFile($file);
assert('$result == $file');

33
smoketests/innerHTML.html Normal file
View File

@@ -0,0 +1,33 @@
<html>
<head>
<title>innerHTML smoketest</title>
</head>
<body>
<!--
What we're going to do is use JavaScript to calculate
fixpoints of innerHTML parse and reparsing. We start with
an input value, encoded in a JavaScript string.
x.innerHTML = input
We then snapshot the DOM state of x, and then perform the
iteration:
intermediate = x.innerHTML
x.innerHTML = intermediate
What inputs are we going to test?
We will generate using the following alphabet:
a01~!@#$%^&*()_+`-=[]\{}|;':",./<>? (and <space>)
-->
<textarea id="out" style="width:100%;height:100%;"></textarea>
<div id="testContainer" style="display:none"></div>
<script src="innerHTML.js" type="text/javascript"></script>
</body>
</html>

51
smoketests/innerHTML.js Normal file
View File

@@ -0,0 +1,51 @@
var alphabet = 'a!`=[]\\;\':"/<> &';
var out = document.getElementById('out');
var testContainer = document.getElementById('testContainer');
function print(s) {
out.value += s + "\n";
}
function testImage() {
return testContainer.firstChild;
}
function test(input) {
var count = 0;
var oldInput, newInput;
testContainer.innerHTML = "<img />";
testImage().setAttribute("alt", input);
print("------");
print("Test input: " + input);
do {
oldInput = testImage().getAttribute("alt");
var intermediate = testContainer.innerHTML;
print("Render: " + intermediate);
testContainer.innerHTML = intermediate;
if (testImage() == null) {
print("Image disappeared...");
break;
}
newInput = testImage().getAttribute("alt");
print("New value: " + newInput);
count++;
} while (count < 5 && newInput != oldInput);
if (count == 5) {
print("Failed to achieve fixpoint");
}
testContainer.innerHTML = "";
}
print("Go!");
test("`` ");
test("'' ");
for (var i = 0; i < alphabet.length; i++) {
for (var j = 0; j < alphabet.length; j++) {
test(alphabet.charAt(i) + alphabet.charAt(j));
}
}
// document.getElementById('out').textContent = alphabet;

View File

@@ -22,6 +22,23 @@ $string = '<object width="425" height="350"><param name="movie" value="http://ww
<object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/uNxBeJNyAqA&hl=en_US&fs=1&"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/uNxBeJNyAqA&hl=en_US&fs=1&" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object>
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0" height="385" width="480"><param name="width" value="480" /><param name="height" value="385" /><param name="src" value="http://www.youtube.com/p/E37ADDDFCA0FD050&amp;hl=en" /><embed height="385" src="http://www.youtube.com/p/E37ADDDFCA0FD050&amp;hl=en" type="application/x-shockwave-flash" width="480"></embed></object>
<object
classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
id="ooyalaPlayer_229z0_gbps1mrs" width="630" height="354"
codebase="http://fpdownload.macromedia.com/get/flashplayer/current/swflash.cab"><param
name="movie" value="http://player.ooyala.com/player.swf?embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za&version=2"
/><param name="bgcolor" value="#000000" /><param
name="allowScriptAccess" value="always" /><param
name="allowFullScreen" value="true" /><param name="flashvars"
value="embedType=noscriptObjectTag&embedCode=pteGRrMTpcKMyQ052c8NwYZ5M5FdSV3j"
/><embed src="http://player.ooyala.com/player.swf?embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za&version=2"
bgcolor="#000000" width="630" height="354"
name="ooyalaPlayer_229z0_gbps1mrs" align="middle" play="true"
loop="false" allowscriptaccess="always" allowfullscreen="true"
type="application/x-shockwave-flash"
flashvars="&embedCode=FpZnZwMTo1wqBF-ed2__OUBb3V4HR6za"
pluginspage="http://www.adobe.com/go/getflashplayer"></embed></object>
';
$regular_purifier = new HTMLPurifier();

View File

@@ -37,13 +37,14 @@ $simpletest_location = '/path/to/simpletest/';
// OPTIONAL SETTINGS
// Note on running PHPT:
// Vanilla PHPT from http://phpt.info will not work, because there are
// a number of bugs that prevent HTML Purifier from doing what they need
// to do. If you really want to run PHPT, you'll will need to apply the
// patches in maintenance/phpt-modifications.patch on the PHPT Core trunk,
// which can be checked out using:
// Vanilla PHPT from https://github.com/tswicegood/PHPT_Core should
// work fine on Linux w/o multitest.
//
// $ svn co https://svn.phpt.info/Core/trunk phpt-core
// To do multitest or Windows testing, you'll need some more
// patches at https://github.com/ezyang/PHPT_Core
//
// I haven't tested the Windows setup in a while so I don't know if
// it still works.
// Should PHPT tests be enabled?
$GLOBALS['HTMLPurifierTest']['PHPT'] = false;

View File

@@ -28,13 +28,13 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPositionTest extends HTMLPurifier_AttrD
// reordered due to internal impl details
$this->assertDef('top left', 'left top');
$this->assertDef('top center', 'center top');
$this->assertDef('top center', 'top');
$this->assertDef('top right', 'right top');
$this->assertDef('center left', 'left center');
$this->assertDef('center center', 'center'); // two centers collide
$this->assertDef('center right', 'right center');
$this->assertDef('center left', 'left');
$this->assertDef('center center', 'center');
$this->assertDef('center right', 'right');
$this->assertDef('bottom left', 'left bottom');
$this->assertDef('bottom center', 'center bottom');
$this->assertDef('bottom center', 'bottom');
$this->assertDef('bottom right', 'right bottom');
// more cases from the defined syntax

View File

@@ -8,12 +8,12 @@ class HTMLPurifier_AttrDef_CSS_BackgroundTest extends HTMLPurifier_AttrDefHarnes
$config = HTMLPurifier_Config::createDefault();
$this->def = new HTMLPurifier_AttrDef_CSS_Background($config);
$valid = '#333 url(\'chess.png\') repeat fixed 50% top';
$valid = '#333 url("chess.png") repeat fixed 50% top';
$this->assertDef($valid);
$this->assertDef('url("chess.png") #333 50% top repeat fixed', $valid);
$this->assertDef('url(\'chess.png\') #333 50% top repeat fixed', $valid);
$this->assertDef(
'rgb(34, 56, 33) url(chess.png) repeat fixed top',
'rgb(34,56,33) url(\'chess.png\') repeat fixed top'
'rgb(34,56,33) url("chess.png") repeat fixed top'
);
}

View File

@@ -8,30 +8,43 @@ class HTMLPurifier_AttrDef_CSS_FontFamilyTest extends HTMLPurifier_AttrDefHarnes
$this->def = new HTMLPurifier_AttrDef_CSS_FontFamily();
$this->assertDef('Gill, Helvetica, sans-serif');
$this->assertDef('\'Times New Roman\', serif');
$this->assertDef('"Times New Roman"', "'Times New Roman'");
$this->assertDef("'Times New Roman', serif");
$this->assertDef("\"Times New Roman\"", "'Times New Roman'");
$this->assertDef('01234');
$this->assertDef(',', false);
$this->assertDef('Times New Roman, serif', '\'Times New Roman\', serif');
$this->assertDef($d = "'John\\'s Font'");
$this->assertDef("John's Font", $d);
$this->assertDef('Times New Roman, serif', "'Times New Roman', serif");
$this->assertDef($d = "'\xE5\xAE\x8B\xE4\xBD\x93'");
$this->assertDef("\xE5\xAE\x8B\xE4\xBD\x93", $d);
$this->assertDef("'\\','f'", "'\\\\', f");
$this->assertDef("'\\01'", "''");
$this->assertDef("'\\20'", "' '");
$this->assertDef("\\0020", "'\\\\0020'");
$this->assertDef("\\0020", "' '");
$this->assertDef("'\\000045'", "E");
$this->assertDef("','", false);
$this->assertDef("',' foobar','", "' foobar'");
$this->assertDef("'\\27'", "'\''");
$this->assertDef('"\\22"', "'\"'");
$this->assertDef('"\\""', "'\"'");
$this->assertDef('"\'"', "'\\''");
$this->assertDef("'\\000045a'", "Ea");
$this->assertDef("'\\00045 a'", "Ea");
$this->assertDef("'\\00045 a'", "'E a'");
$this->assertDef("'\\\nf'", "f");
// No longer supported, except maybe in NoJS mode (see source
// file for more explanation)
//$this->assertDef($d = '"John\'s Font"');
//$this->assertDef("John's Font", $d);
//$this->assertDef("'\\','f'", "\"\\5C \", f");
//$this->assertDef("'\\27'", "\"'\"");
//$this->assertDef('"\\22"', "\"\\22 \"");
//$this->assertDef('"\\""', "\"\\22 \"");
//$this->assertDef('"\'"', "\"'\"");
}
function testAllowed() {
$this->config->set('CSS.AllowedFonts', array('serif', 'Times New Roman'));
$this->assertDef('serif');
$this->assertDef('sans-serif', false);
$this->assertDef('serif, sans-serif', 'serif');
$this->assertDef('Times New Roman', "'Times New Roman'");
$this->assertDef("'Times New Roman'");
$this->assertDef('foo', false);
}
}

View File

@@ -11,10 +11,10 @@ class HTMLPurifier_AttrDef_CSS_FontTest extends HTMLPurifier_AttrDefHarness
// hodgepodge of usage cases from W3C spec, but " -> '
$this->assertDef('12px/14px sans-serif');
$this->assertDef('80% sans-serif');
$this->assertDef('x-large/110% \'New Century Schoolbook\', serif');
$this->assertDef("x-large/110% 'New Century Schoolbook', serif");
$this->assertDef('bold italic large Palatino, serif');
$this->assertDef('normal small-caps 120%/120% fantasy');
$this->assertDef('300 italic 1.3em/1.7em \'FB Armada\', sans-serif');
$this->assertDef("300 italic 1.3em/1.7em 'FB Armada', sans-serif");
$this->assertDef('600 9px Charcoal');
$this->assertDef('600 9px/ 12px Charcoal', '600 9px/12px Charcoal');

View File

@@ -13,14 +13,14 @@ class HTMLPurifier_AttrDef_CSS_ListStyleTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('circle outside');
$this->assertDef('inside');
$this->assertDef('none');
$this->assertDef('url(\'foo.gif\')');
$this->assertDef('circle url(\'foo.gif\') inside');
$this->assertDef('url("foo.gif")');
$this->assertDef('circle url("foo.gif") inside');
// invalid values
$this->assertDef('outside inside', 'outside');
// ordering
$this->assertDef('url(foo.gif) none', 'none url(\'foo.gif\')');
$this->assertDef('url(foo.gif) none', 'none url("foo.gif")');
$this->assertDef('circle lower-alpha', 'circle');
// the spec is ambiguous about what happens in these
// cases, so we're going off the W3C CSS validator

View File

@@ -12,20 +12,16 @@ class HTMLPurifier_AttrDef_CSS_URITest extends HTMLPurifier_AttrDefHarness
// we could be nice but we won't be
$this->assertDef('http://www.example.com/', false);
// no quotes are used, since that's the most widely supported
// syntax
$this->assertDef('url(', false);
$this->assertDef('url(\'\')', true);
$result = "url('http://www.example.com/')";
$this->assertDef('url("")', true);
$result = 'url("http://www.example.com/")';
$this->assertDef('url(http://www.example.com/)', $result);
$this->assertDef('url("http://www.example.com/")', $result);
$this->assertDef("url('http://www.example.com/')", $result);
$this->assertDef(
' url( "http://www.example.com/" ) ', $result);
// escaping
$this->assertDef("url(http://www.example.com/foo,bar\))",
"url('http://www.example.com/foo\,bar\)')");
$this->assertDef("url(http://www.example.com/foo,bar\)\'\()",
'url("http://www.example.com/foo,bar%29%27%28")');
}
}

View File

@@ -25,7 +25,7 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('text-transform:capitalize;');
$this->assertDef('background-color:rgb(0,0,255);');
$this->assertDef('background-color:transparent;');
$this->assertDef('background:#333 url(\'chess.png\') repeat fixed 50% top;');
$this->assertDef('background:#333 url("chess.png") repeat fixed 50% top;');
$this->assertDef('color:#F00;');
$this->assertDef('border-top-color:#F00;');
$this->assertDef('border-color:#F00 #FF0;');
@@ -62,7 +62,7 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('width:-50px;', false);
$this->assertDef('text-decoration:underline;');
$this->assertDef('font-family:sans-serif;');
$this->assertDef('font-family:Gill, \'Times New Roman\', sans-serif;');
$this->assertDef("font-family:Gill, 'Times New Roman', sans-serif;");
$this->assertDef('font:12px serif;');
$this->assertDef('border:1px solid #000;');
$this->assertDef('border-bottom:2em double #FF00FA;');
@@ -73,9 +73,9 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('vertical-align:12px;');
$this->assertDef('vertical-align:50%;');
$this->assertDef('table-layout:fixed;');
$this->assertDef('list-style-image:url(\'nice.jpg\');');
$this->assertDef('list-style:disc url(\'nice.jpg\') inside;');
$this->assertDef('background-image:url(\'foo.jpg\');');
$this->assertDef('list-style-image:url("nice.jpg");');
$this->assertDef('list-style:disc url("nice.jpg") inside;');
$this->assertDef('background-image:url("foo.jpg");');
$this->assertDef('background-image:none;');
$this->assertDef('background-repeat:repeat-y;');
$this->assertDef('background-attachment:fixed;');
@@ -144,6 +144,21 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
$this->assertDef('overflow:scroll;');
}
function testForbidden() {
$this->config->set('CSS.ForbiddenProperties', 'float');
$this->assertDef('float:left;', false);
$this->assertDef('text-align:right;');
}
function testTrusted() {
$this->config->set('CSS.Trusted', true);
$this->assertDef('position:relative;');
$this->assertDef('left:2px;');
$this->assertDef('right:100%;');
$this->assertDef('top:auto;');
$this->assertDef('z-index:-2;');
}
}
// vim: et sw=4 sts=4

View File

@@ -74,6 +74,15 @@ class HTMLPurifier_AttrDef_URITest extends HTMLPurifier_AttrDefHarness
$this->assertDef('mailto:this-looks-like-a-path@example.com');
}
function testResolveNullSchemeAmbiguity() {
$this->assertDef('///foo', '/foo');
}
function testResolveNullSchemeDoubleAmbiguity() {
$this->config->set('URI.Host', 'example.com');
$this->assertDef('////foo', '//example.com//foo');
}
function testURIDefinitionValidation() {
$parser = new HTMLPurifier_URIParser();
$uri = $parser->parse('http://example.com');

View File

@@ -18,8 +18,6 @@ class HTMLPurifier_AttrValidator_ErrorsTest extends HTMLPurifier_ErrorsHarness
}
function testAttributesTransformedGlobalPre() {
$this->config->set('HTML.DefinitionID',
'HTMLPurifier_AttrValidator_ErrorsTest::testAttributesTransformedGlobalPre');
$def = $this->config->getHTMLDefinition(true);
generate_mock_once('HTMLPurifier_AttrTransform');
$transform = new HTMLPurifier_AttrTransformMock();

View File

@@ -4,6 +4,7 @@ class HTMLPurifier_ConfigTest extends HTMLPurifier_Harness
{
protected $schema;
protected $oldFactory;
public function setUp() {
// set up a dummy schema object for testing
@@ -230,23 +231,58 @@ class HTMLPurifier_ConfigTest extends HTMLPurifier_Harness
$this->assertNotEqual($def, $old_def);
$this->assertTrue($def->setup);
// test retrieval of raw definition
}
function test_getHTMLDefinition_deprecatedRawError() {
$config = HTMLPurifier_Config::createDefault();
$config->chatty = false;
// test deprecated retrieval of raw definition
$config->set('HTML.DefinitionID', 'HTMLPurifier_ConfigTest->test_getHTMLDefinition()');
$config->set('HTML.DefinitionRev', 3);
$this->expectError("Useless DefinitionID declaration");
$def = $config->getHTMLDefinition(true);
$this->assertNotEqual($def, $old_def);
$this->assertEqual(false, $def->setup);
// auto initialization
$config->getHTMLDefinition();
$this->assertTrue($def->setup);
}
function test_getHTMLDefinition_optimizedRawError() {
$this->expectException(new HTMLPurifier_Exception("Cannot set optimized = true when raw = false"));
$config = HTMLPurifier_Config::createDefault();
$config->getHTMLDefinition(false, true);
}
function test_getHTMLDefinition_rawAfterSetupError() {
$this->expectException(new HTMLPurifier_Exception("Cannot retrieve raw definition after it has already been setup"));
$config = HTMLPurifier_Config::createDefault();
$config->chatty = false;
$config->getHTMLDefinition();
$config->getHTMLDefinition(true);
}
function test_getHTMLDefinition_inconsistentOptimizedError() {
$this->expectError("Useless DefinitionID declaration");
$this->expectException(new HTMLPurifier_Exception("Inconsistent use of optimized and unoptimized raw definition retrievals"));
$config = HTMLPurifier_Config::create(array('HTML.DefinitionID' => 'HTMLPurifier_ConfigTest->test_getHTMLDefinition_inconsistentOptimizedError'));
$config->chatty = false;
$config->getHTMLDefinition(true, false);
$config->getHTMLDefinition(true, true);
}
function test_getHTMLDefinition_inconsistentOptimizedError2() {
$this->expectException(new HTMLPurifier_Exception("Inconsistent use of optimized and unoptimized raw definition retrievals"));
$config = HTMLPurifier_Config::create(array('HTML.DefinitionID' => 'HTMLPurifier_ConfigTest->test_getHTMLDefinition_inconsistentOptimizedError2'));
$config->chatty = false;
$config->getHTMLDefinition(true, true);
$config->getHTMLDefinition(true, false);
}
function test_getHTMLDefinition_rawError() {
$config = HTMLPurifier_Config::createDefault();
$this->expectException(new HTMLPurifier_Exception('Cannot retrieve raw version without specifying %HTML.DefinitionID'));
$def = $config->getHTMLDefinition(true);
$def = $config->getHTMLDefinition(true, true);
}
function test_getCSSDefinition() {
@@ -458,6 +494,64 @@ class HTMLPurifier_ConfigTest extends HTMLPurifier_Harness
$this->assertIdentical($config, $config2);
}
function testDefinitionCachingNothing() {
list($mock, $config) = $this->setupCacheMock('HTML');
// should not touch the cache
$mock->expectNever('get');
$mock->expectNever('add');
$mock->expectNever('set');
$config->getDefinition('HTML', true);
$config->getDefinition('HTML', true);
$config->getDefinition('HTML');
$this->teardownCacheMock();
}
function testDefinitionCachingOptimized() {
list($mock, $config) = $this->setupCacheMock('HTML');
$mock->expectNever('set');
$config->set('HTML.DefinitionID', 'HTMLPurifier_ConfigTest->testDefinitionCachingOptimized');
$mock->expectOnce('get');
$mock->setReturnValue('get', null);
$this->assertTrue($config->maybeGetRawHTMLDefinition());
$this->assertTrue($config->maybeGetRawHTMLDefinition());
$mock->expectOnce('add');
$config->getDefinition('HTML');
$this->teardownCacheMock();
}
function testDefinitionCachingOptimizedHit() {
$fake_config = HTMLPurifier_Config::createDefault();
$fake_def = $fake_config->getHTMLDefinition();
list($mock, $config) = $this->setupCacheMock('HTML');
// should never frob cache
$mock->expectNever('add');
$mock->expectNever('set');
$config->set('HTML.DefinitionID', 'HTMLPurifier_ConfigTest->testDefinitionCachingOptimizedHit');
$mock->expectOnce('get');
$mock->setReturnValue('get', $fake_def);
$this->assertNull($config->maybeGetRawHTMLDefinition());
$config->getDefinition('HTML');
$config->getDefinition('HTML');
$this->teardownCacheMock();
}
protected function setupCacheMock($type) {
// inject our definition cache mock globally (borrowed from
// DefinitionFactoryTest)
generate_mock_once("HTMLPurifier_DefinitionCacheFactory");
$factory = new HTMLPurifier_DefinitionCacheFactoryMock();
$this->oldFactory = HTMLPurifier_DefinitionCacheFactory::instance();
HTMLPurifier_DefinitionCacheFactory::instance($factory);
generate_mock_once("HTMLPurifier_DefinitionCache");
$mock = new HTMLPurifier_DefinitionCacheMock();
$config = HTMLPurifier_Config::createDefault();
$factory->setReturnValue('create', $mock, array($type, $config));
return array($mock, $config);
}
protected function teardownCacheMock() {
HTMLPurifier_DefinitionCacheFactory::instance($this->oldFactory);
}
}
// vim: et sw=4 sts=4

View File

@@ -172,6 +172,7 @@ class HTMLPurifier_DefinitionCache_SerializerTest extends HTMLPurifier_Definitio
* Asserts that a file does not exist, ignoring the stat cache
*/
function assertFileNotExist($file) {
clearstatcache();
$this->assertFalse(file_exists($file), 'Expected ' . $file . ' does not exist');
}
@@ -193,6 +194,27 @@ class HTMLPurifier_DefinitionCache_SerializerTest extends HTMLPurifier_Definitio
}
function testAlternatePermissions() {
$cache = new HTMLPurifier_DefinitionCache_Serializer('Test');
$config = $this->generateConfigMock('serial');
$config->version = '1.0.0';
$config->setReturnValue('get', 1, array('Test.DefinitionRev'));
$dir = dirname(__FILE__) . '/SerializerTest';
$config->setReturnValue('get', $dir, array('Cache.SerializerPath'));
$config->setReturnValue('get', 0777, array('Cache.SerializerPermissions'));
$def_original = $this->generateDefinition();
$cache->add($def_original, $config);
$this->assertFileExist($dir . '/Test/1.0.0,serial,1.ser');
$this->assertEqual(0666, 0777 & fileperms($dir . '/Test/1.0.0,serial,1.ser'));
$this->assertEqual(0777, 0777 & fileperms($dir . '/Test'));
unlink($dir . '/Test/1.0.0,serial,1.ser');
rmdir( $dir . '/Test');
}
}
// vim: et sw=4 sts=4

View File

@@ -79,10 +79,13 @@ class HTMLPurifier_Filter_ExtractStyleBlocksTest extends HTMLPurifier_Harness
}
function test_cleanCSS_angledBrackets() {
$this->assertCleanCSS(
".class {\nfont-family:'</style>';\n}",
".class {\nfont-family:'\\3C /style\\3E ';\n}"
);
// [Content] No longer can smuggle in angled brackets using
// font-family; when we add support for 'content', reinstate
// this test.
//$this->assertCleanCSS(
// ".class {\nfont-family:'</style>';\n}",
// ".class {\nfont-family:\"\\3C /style\\3E \";\n}"
//);
}
function test_cleanCSS_angledBrackets2() {
@@ -97,18 +100,20 @@ class HTMLPurifier_Filter_ExtractStyleBlocksTest extends HTMLPurifier_Harness
$this->assertCleanCSS("div {bogus:tree;}", "div {\n}");
}
/* [CONTENT]
function test_cleanCSS_escapeCodes() {
$this->assertCleanCSS(
".class {\nfont-family:'\\3C /style\\3E ';\n}"
".class {\nfont-family:\"\\3C /style\\3E \";\n}"
);
}
function test_cleanCSS_noEscapeCodes() {
$this->config->set('Filter.ExtractStyleBlocks.Escaping', false);
$this->assertCleanCSS(
".class {\nfont-family:'</style>';\n}"
".class {\nfont-family:\"</style>\";\n}"
);
}
*/
function test_cleanCSS_scope() {
$this->config->set('Filter.ExtractStyleBlocks.Scope', '#foo');

View File

@@ -82,7 +82,7 @@ class HTMLPurifier_GeneratorTest extends HTMLPurifier_Harness
$this->assertGenerateFromToken( null, '' );
}
function test_generateFromToken_() {
function test_generateFromToken_unicode() {
$theta_char = $this->_entity_lookup->table['theta'];
$this->assertGenerateFromToken(
new HTMLPurifier_Token_Text($theta_char),
@@ -90,6 +90,28 @@ class HTMLPurifier_GeneratorTest extends HTMLPurifier_Harness
);
}
function test_generateFromToken_backtick() {
$this->assertGenerateFromToken(
new HTMLPurifier_Token_Start('img', array('alt' => '`foo')),
'<img alt="`foo ">'
);
}
function test_generateFromToken_backtickDisabled() {
$this->config->set('Output.FixInnerHTML', false);
$this->assertGenerateFromToken(
new HTMLPurifier_Token_Start('img', array('alt' => '`')),
'<img alt="`">'
);
}
function test_generateFromToken_backtickNoChange() {
$this->assertGenerateFromToken(
new HTMLPurifier_Token_Start('img', array('alt' => '`foo` bar')),
'<img alt="`foo` bar">'
);
}
function assertGenerateAttributes($attr, $expect, $element = false) {
$generator = $this->createGenerator();
$result = $generator->generateAttributes($attr, $element);

View File

@@ -122,17 +122,20 @@ a[href|title]
}
function test_AllowedAttributes_global_preferredSyntax() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'style');
$this->assertPurification_AllowedAttributes_global_style();
}
function test_AllowedAttributes_global_verboseSyntax() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', '*@style');
$this->assertPurification_AllowedAttributes_global_style();
}
function test_AllowedAttributes_global_discouragedSyntax() {
// Emit errors eventually
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', '*.style');
$this->assertPurification_AllowedAttributes_global_style();
}
@@ -144,16 +147,19 @@ a[href|title]
}
function test_AllowedAttributes_local_preferredSyntax() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p@style');
$this->assertPurification_AllowedAttributes_local_p_style();
}
function test_AllowedAttributes_local_discouragedSyntax() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p.style');
$this->assertPurification_AllowedAttributes_local_p_style();
}
function test_AllowedAttributes_multiple() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p@style,br@class,title');
$this->assertPurification(
'<p style="font-weight:bold;" class="foo" title="foo">Jelly</p><br style="clear:both;" class="foo" title="foo" />',
@@ -162,29 +168,34 @@ a[href|title]
}
function test_AllowedAttributes_local_invalidAttribute() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', array('p@style', 'p@<foo>'));
$this->expectError(new PatternExpectation("/Attribute '&lt;foo&gt;' in element 'p' not supported/"));
$this->assertPurification_AllowedAttributes_local_p_style();
}
function test_AllowedAttributes_global_invalidAttribute() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', array('style', '<foo>'));
$this->expectError(new PatternExpectation("/Global attribute '&lt;foo&gt;' is not supported in any elements/"));
$this->assertPurification_AllowedAttributes_global_style();
}
function test_AllowedAttributes_local_invalidAttributeDueToMissingElement() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p.style,foo.style');
$this->expectError(new PatternExpectation("/Cannot allow attribute 'style' if element 'foo' is not allowed\/supported/"));
$this->assertPurification_AllowedAttributes_local_p_style();
}
function test_AllowedAttributes_duplicate() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p.style,p@style');
$this->assertPurification_AllowedAttributes_local_p_style();
}
function test_AllowedAttributes_multipleErrors() {
$this->config->set('HTML.AllowedElements', array('p', 'br'));
$this->config->set('HTML.AllowedAttributes', 'p.style,foo.style,<foo>');
$this->expectError(new PatternExpectation("/Cannot allow attribute 'style' if element 'foo' is not allowed\/supported/"));
$this->expectError(new PatternExpectation("/Global attribute '&lt;foo&gt;' is not supported in any elements/"));
@@ -243,9 +254,7 @@ a[href|title]
function test_addAttribute() {
$config = HTMLPurifier_Config::create(array(
'HTML.DefinitionID' => 'HTMLPurifier_HTMLDefinitionTest->test_addAttribute'
));
$config = HTMLPurifier_Config::createDefault();
$def = $config->getHTMLDefinition(true);
$def->addAttribute('span', 'custom', 'Enum#attribute');
@@ -258,9 +267,7 @@ a[href|title]
function test_addAttribute_multiple() {
$config = HTMLPurifier_Config::create(array(
'HTML.DefinitionID' => 'HTMLPurifier_HTMLDefinitionTest->test_addAttribute_multiple'
));
$config = HTMLPurifier_Config::createDefault();
$def = $config->getHTMLDefinition(true);
$def->addAttribute('span', 'custom', 'Enum#attribute');
$def->addAttribute('span', 'foo', 'Text');
@@ -274,9 +281,7 @@ a[href|title]
function test_addElement() {
$config = HTMLPurifier_Config::create(array(
'HTML.DefinitionID' => 'HTMLPurifier_HTMLDefinitionTest->test_addElement'
));
$config = HTMLPurifier_Config::createDefault();
$def = $config->getHTMLDefinition(true);
$def->addElement('marquee', 'Inline', 'Inline', 'Common', array('width' => 'Length'));
@@ -288,8 +293,6 @@ a[href|title]
}
function test_injector() {
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLDefinitionTest->test_injector');
generate_mock_once('HTMLPurifier_Injector');
$injector = new HTMLPurifier_InjectorMock();
$injector->name = 'MyInjector';
@@ -306,8 +309,6 @@ a[href|title]
}
function test_injectorMissingNeeded() {
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLDefinitionTest->test_injectorMissingNeeded');
generate_mock_once('HTMLPurifier_Injector');
$injector = new HTMLPurifier_InjectorMock();
$injector->name = 'MyInjector';
@@ -322,8 +323,6 @@ a[href|title]
}
function test_injectorIntegration() {
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLDefinitionTest->test_injectorIntegration');
$module = $this->config->getHTMLDefinition(true)->getAnonymousModule();
$module->info_injector[] = 'Linkify';
@@ -334,8 +333,6 @@ a[href|title]
}
function test_injectorIntegrationFail() {
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLDefinitionTest->test_injectorIntegrationFail');
$this->config->set('HTML.Allowed', 'p');
$module = $this->config->getHTMLDefinition(true)->getAnonymousModule();
@@ -347,6 +344,12 @@ a[href|title]
);
}
function test_notAllowedRequiredAttributeError() {
$this->expectError("Required attribute 'src' in element 'img' was not allowed, which means 'img' will not be allowed either");
$this->config->set('HTML.Allowed', 'img[alt]');
$this->config->getHTMLDefinition();
}
}
// vim: et sw=4 sts=4

View File

@@ -0,0 +1,20 @@
<?php
class HTMLPurifier_HTMLModule_NofollowTest extends HTMLPurifier_HTMLModuleHarness
{
function setUp() {
parent::setUp();
$this->config->set('HTML.Nofollow', true);
}
function testNofollow() {
$this->assertResult(
'<a href="http://google.com">a</a><a href="/local">b</a><a href="mailto:foo@example.com">c</a>',
'<a href="http://google.com" rel="nofollow">a</a><a href="/local">b</a><a href="mailto:foo@example.com">c</a>'
);
}
}
// vim: et sw=4 sts=4

View File

@@ -5,7 +5,6 @@ class HTMLPurifier_HTMLModule_SafeEmbedTest extends HTMLPurifier_HTMLModuleHarne
function setUp() {
parent::setUp();
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLModule_SafeEmbedTest');
$def = $this->config->getHTMLDefinition(true);
$def->manager->addModule('SafeEmbed');
}

View File

@@ -6,8 +6,7 @@ class HTMLPurifier_HTMLModule_SafeObjectTest extends HTMLPurifier_HTMLModuleHarn
function setUp() {
parent::setUp();
$this->config->set('HTML.DefinitionID', 'HTMLPurifier_HTMLModule_SafeObjectTest');
$def = $this->config->getHTMLDefinition(true);
$def->manager->addModule('SafeObject');
$this->config->set('HTML.SafeObject', true);
}
function testMinimal() {
@@ -38,6 +37,13 @@ class HTMLPurifier_HTMLModule_SafeObjectTest extends HTMLPurifier_HTMLModuleHarn
);
}
function testFullScreen() {
$this->config->set('HTML.FlashAllowFullScreen', true);
$this->assertResult(
'<b><object width="425" height="344" type="application/x-shockwave-flash" data="Foobar"><param name="allowScriptAccess" value="never" /><param name="allowNetworking" value="internal" /><param name="flashvars" value="foobarbaz=bally" /><param name="movie" value="http://www.youtube.com/v/RVtEQxH7PWA&amp;hl=en" /><param name="wmode" value="window" /><param name="allowFullScreen" value="true" /></object></b>'
);
}
}
// vim: et sw=4 sts=4

View File

@@ -0,0 +1,6 @@
--INI--
HTML.SafeObject = true
Output.FlashCompat = true
--HTML--
<object width="425" height="350" data="http://www.youtube.com/v/BdU--T8rLns" type="application/x-shockwave-flash"><param name="allowScriptAccess" value="never" /><param name="allowNetworking" value="internal" /><param name="movie" value="http://www.youtube.com/v/BdU--T8rLns" /><param name="wmode" value="window" /></object>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,5 @@
--INI--
URI.AllowedSchemes = file
--HTML--
<a href="file:///foo">foo</a>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,8 @@
--INI--
Attr.EnableID = true
Core.LexerImpl = DirectLex
--HTML--
<img src="img_11775.jpg" alt="[Img #11775]" id="EMBEDDED_IMG_11775" >
--EXPECT--
<img src="img_11775.jpg" alt="[Img #11775]" id="EMBEDDED_IMG_11775" />
--# vim: et sw=4 sts=4

Some files were not shown because too many files have changed in this diff Show More