mirror of https://github.com/ezyang/htmlpurifier.git synced 2025-08-04 21:28:06 +02:00

Compare commits

70 Commits

Author SHA1 Message Date
Edward Z. Yang
18e538317a Release 4.1.1.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 20:17:31 -07:00
Edward Z. Yang
96a4193fc9 Fix undefined index warnings in maintenance scripts.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 20:07:27 -07:00
Edward Z. Yang
00c66fa9cb Fix bug in parsing single attribute with entities.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 19:44:18 -07:00
Edward Z. Yang
d3abcb90e3 Rewrite CSS url() and font-family output logic.
The new logic is as follows:

* Given a URL to insert into url(), check that it is properly URL
  encoded (in particular, a doublequote and backslash never occurs
  within it) and then place it as url("http://example.com").

* Given a font name, if it is strictly alphanumeric, it is safe to omit
  quotes. Otherwise, wrap in double quotes and replace '"' with '\22 '
  (note trailing space) and '\' with '\5C ' (ditto).

We introduce expandCSSEscape() which is a hack for common parsing
idioms in CSS; this means that CSS escapes are now recognized inside
URLs as well as unquoted font names.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 18:45:21 -07:00
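The quoting rules described in d3abcb90e3 can be sketched roughly as follows. This is an illustration only, not the library's code: `quoteFontName()` and `cssUrl()` are hypothetical names, the alphanumeric test is ASCII-only, and returning null for a rejected URL is an assumption, since the commit message does not say what happens when the check fails.

```php
<?php
// Hypothetical helper: quote a font name following the rules in the
// commit message above.
function quoteFontName(string $name): string
{
    // Strictly alphanumeric names are safe to emit without quotes.
    if (preg_match('/^[A-Za-z0-9]+$/', $name) === 1) {
        return $name;
    }
    // strtr() with an array replaces in a single pass, so a backslash
    // introduced by one escape is never re-escaped by the other.
    // Note the trailing space after each CSS escape.
    $escaped = strtr($name, array('"' => '\22 ', '\\' => '\5C '));
    return '"' . $escaped . '"';
}

// Hypothetical helper: wrap an already URL-encoded URI in url("...").
function cssUrl(string $uri): ?string
{
    // A properly encoded URI never contains a doublequote or backslash.
    // Assumption: reject such input by returning null.
    if (strpbrk($uri, "\"\\") !== false) {
        return null;
    }
    return 'url("' . $uri . '")';
}
```

Under these assumptions, `quoteFontName('Times New Roman')` yields the name wrapped in double quotes, while `quoteFontName('Arial')` is returned unquoted.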
Edward Z. Yang
df3100b1b3 Make test script less chatty when log_errors is on.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-20 21:50:44 -04:00
Edward Z. Yang
143e1ad718 Remove shebang and +x from test script.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-20 21:21:26 -04:00
Edward Z. Yang
875b0febde Fix infinite loop involving wrapping formedness.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-17 23:22:51 -04:00
Edward Z. Yang
3166b8a10f Fix bug in background-position with center keyword.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-05 15:08:57 -04:00
Edward Z. Yang
1a70bffd5a Emit errors when body is extracted.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-04 13:41:09 -04:00
Edward Z. Yang
f4c6e10ff7 Release 4.1.0.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-04-26 18:31:40 -04:00
Edward Z. Yang
c1cbd9e565 Mute STRICT errors from CSSTidy and don't run PEARSax3 on PHP 5.3.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-04-26 18:27:32 -04:00
Edward Z. Yang
da94d3d6ac Always quote the contents of url() in CSS.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-04-26 12:10:15 -04:00
Edward Z. Yang
80793e925e Remove +x bit from RemoveSpansWithoutAttributes.php
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-04-17 00:23:09 -04:00
Edward Z. Yang
8ef4fb22db Support for flashvars in HTML.SafeEmbed.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-30 13:33:13 -04:00
Edward Z. Yang
70a7a3f5dd Handle <ol><ol> properly by adding missing <li> tag.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-10 00:58:37 -05:00
Edward Z. Yang
4d612d5a77 Improve handling of malformed object parameters.
When specifying source material for <object> tags, you must use
data inside the object tag as well as specify movie in a param.
If you specify a src (which is the appropriate markup for <embed>)
we now convert and fill in the other attributes appropriately.

Also, fix a PHP warning in Generator code.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-09 17:29:38 -05:00
Edward Z. Yang
63a854ee5d Remove call-time pass-by-reference.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-08 03:45:11 -05:00
Edward Z. Yang
0229458f8f Implement Internet Explorer compatibility code for embedded content.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-08 01:56:40 -05:00
Edward Z. Yang
baa477ac08 Truncate alt text from src if it's too long.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-08 01:22:21 -05:00
Edward Z. Yang
dc90e8e85b Support flashvars.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-08 01:16:57 -05:00
Edward Z. Yang
97125ed18b Implement data URI scheme.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-07 21:45:39 -05:00
Paul Stone
9a9036c689 Implement auto-formatter that removes empty span tags.
Signed-off-by: Paul Stone <patches@pdjs.co.uk>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-07 18:59:33 -05:00
Edward Z. Yang
aea7d02dfe Support YouTube slideshow embedding.
YouTube slideshows contain a /cp/, not a /v/, in their URL;
relax the YouTube filter to allow them.

Signed-off-by: Nigel McNie <nigel@catalyst.net.nz>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-03-07 18:57:22 -05:00
Brian DeRocher
b3ca1498c2 Add boolean value flag for PEARSax3 for testing if a token is empty.
Signed-off-by: Brian DeRocher <brian@derocher.org>
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-26 21:36:51 -05:00
Edward Z. Yang
ac18672aba Fix extant broken PEARSax3 parsing patterns.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-26 21:14:52 -05:00
Edward Z. Yang
faf28682ad Manually work around PEARSax3 E_STRICT errors.
Previously, my development environment was not running the PEARSax3
tests because my environment was set to E_STRICT error handling, and
thus the tests were skipped.  Relax this requirement by making the
wrapper class E_STRICT safe.  This introduces a few failing tests.

Also update TODO and add another fresh test.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-26 20:42:42 -05:00
Edward Z. Yang
e2cd852bcf Add shebang line to tests index script.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-15 02:55:43 -05:00
Edward Z. Yang
694583259c Fix autoparagraph bug with non-inline elements.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-15 02:55:33 -05:00
Edward Z. Yang
bde4de3c78 Update TODO.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-08-27 20:17:41 -04:00
Edward Z. Yang
5b4e5c983e Support proprietary height attribute on table.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-08-27 20:17:24 -04:00
Edward Z. Yang
1ad8fd5ce9 Gracefully deal with null injectors.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-08-27 20:03:31 -04:00
Edward Z. Yang
6bdf161afd Update TODO.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-15 14:50:52 -04:00
Edward Z. Yang
af45a6c191 Release Phorum module 4.0.0.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-09 21:12:35 -04:00
Edward Z. Yang
2b72d0445f Add 4.1.0 release NEWS entry.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-09 21:03:46 -04:00
Edward Z. Yang
d7b3117678 Add doxygen doc scripts, and fix package.php
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-08 22:11:15 -04:00
Edward Z. Yang
53ff3e2744 Release 4.0.0.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-07 22:41:01 -04:00
Edward Z. Yang
6776efccdd Update configuration scanner to parse new format.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-07 22:32:44 -04:00
Edward Z. Yang
ba9fd175d7 Make extractBody not terminate prematurely on first </body>.
Previously, if two </body> tags were present, HTML Purifier
would truncate everything after the first </body>.  This is
not ideal behavior; so HTML Purifier has been changed to
match up to the last </body>.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-07 22:19:04 -04:00
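The difference between stopping at the first `</body>` and matching up to the last one comes down to greedy versus lazy matching. A minimal sketch, assuming a regex-based extraction; `extractBodySketch()` is a hypothetical name and is not claimed to be the library's implementation:

```php
<?php
// Hypothetical helper illustrating the behavior described in ba9fd175d7.
function extractBodySketch(string $html): string
{
    // (.*) is greedy and the "s" flag lets "." span newlines, so the
    // match runs to the LAST </body> in the input, not the first.
    if (preg_match('!<body[^>]*>(.*)</body>!is', $html, $m) === 1) {
        return $m[1];
    }
    // Assumption: input without a body element is returned unchanged.
    return $html;
}
```

With a lazy quantifier (`.*?`) the same pattern would stop at the first `</body>`, which is the truncating behavior the commit message describes as not ideal.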
Edward Z. Yang
4d27906b02 Make %URI.Munge respect %URI.Host (don't munge).
%URI.Munge incorrectly munged URIs that pointed to the
same host as the current website (it did, however, have
the correct behavior for when the munge URL was on the
same server).

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-06 22:04:51 -04:00
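The host check implied by 4d27906b02 might look something like the sketch below. `shouldMunge()` is a hypothetical predicate written for illustration; treating host-less (relative) URIs as local is an assumption, not something the commit message states.

```php
<?php
// Hypothetical predicate: should this URI be rewritten by a munger?
function shouldMunge(string $uri, string $ownHost): bool
{
    $host = parse_url($uri, PHP_URL_HOST);
    // Assumption: URIs without a host component are local links.
    if (!is_string($host) || $host === '') {
        return false;
    }
    // Host names are case-insensitive; only foreign hosts get munged.
    return strcasecmp($host, $ownHost) !== 0;
}
```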
Edward Z. Yang
8f573df3dc XHTML 2 is dead. Long live XHTML 2.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-02 15:43:42 -04:00
Edward Z. Yang
c7594487a2 Fix inability to totally override content model.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-06-10 18:24:52 -04:00
Edward Z. Yang
733a5ce5c3 Fix allowsElement() bug manifesting in LinkifyTest.
Thanks frank farmer for reporting.

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-06-10 18:11:34 -04:00
Edward Z. Yang
e8abd5953c Fix prototype impedance in HTMLDefinition and typo in
docs/enduser-customize.html
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-06-07 16:05:46 -04:00
Edward Z. Yang
1b8c8865b2 Fix PHP 5.3.0 problem with numeric indices causing -0 problem.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-06-07 16:04:07 -04:00
Edward Z. Yang
6e66dc9cad Add HTMLPurifier_config->serialize()
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-30 00:25:14 -04:00
Edward Z. Yang
77b60a4206 Update documentation to new configuration format.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-29 23:46:40 -04:00
Edward Z. Yang
5bf7ac4e9f Add docs and facilities for having separate directories of schemas.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-29 22:16:35 -04:00
Edward Z. Yang
a025203b18 Minor updates to Config and TODO items thereof.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-29 18:03:57 -04:00
Edward Z. Yang
809da84ae1 Ignore tags files (from exuberant ctags)
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-29 18:03:44 -04:00
Edward Z. Yang
777781a95c Don't have mute error handler be private.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-29 17:59:30 -04:00
Edward Z. Yang
4a87f732ca Fix two minor bugs, updating Phorum and removing unused $dir variable.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-27 01:17:23 -04:00
Edward Z. Yang
a2885181df Update TODO file.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-26 12:55:09 -04:00
Edward Z. Yang
84abae08f5 Relax allowed values of class for certain doctypes, see %Attr.ClassUseCDATA
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-26 01:07:40 -04:00
Edward Z. Yang
10e2d32a79 Lock configuration objects to a single namespace, to help prevent bugs.
* Also, fix a slight bug with URI definition clearing.

Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-25 23:38:49 -04:00
Edward Z. Yang
baf053b016 Implement %Attr.AllowedClasses and %Attr.ForbiddenClasses.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-25 22:08:45 -04:00
Edward Z. Yang
bf71c3f392 Add documents on how to restructure configuration directives.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-25 21:54:43 -04:00
Edward Z. Yang
bfbe29d5a1 Rename ExtractStyleBlocks configuration parameters.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-25 21:54:39 -04:00
Edward Z. Yang
e194b8efc6 Rename AutoFormatParam.PurifierLinkifyDocURL.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-25 21:51:08 -04:00
Edward Z. Yang
4214ac9d67 Update TODO list.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-22 14:52:43 -04:00
Edward Z. Yang
24f761d84a Remove PHP4 cruft from URISchemeRegistry.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-05-13 16:14:57 -04:00
Edward Z. Yang
41c9226f3d Style refresh: add/remove vimlines, fix minor factual errors.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-04-09 12:47:10 -04:00
Edward Z. Yang
e3c2063f69 Implement %AutoFormat.RemoveEmpty.RemoveNbsp, by popular demand.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-04-09 00:53:19 -04:00
Edward Z. Yang
398a02039e Implement %HTML.Attr.Name.UseCDATA which relaxes name validation rules.
Sponsored-by: Ian Cook <thinkspill@gmail.com>
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-03-20 19:34:38 -04:00
Edward Z. Yang
84e2e141fc Fix bad configuration call in NameSyncTest.php.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-03-14 19:18:02 -04:00
Edward Z. Yang
47bbbad000 Fix typo in YouTube docs. Thanks vbMark for reporting.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-03-13 13:33:51 -04:00
Edward Z. Yang
eaa906f8fc Implement configuration inheritance.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:01:02 -05:00
Edward Z. Yang
86ca784da3 Convert all to new configuration get/set format.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:34 -05:00
Edward Z. Yang
b107eec452 Revamp configuration backend.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:33 -05:00
Edward Z. Yang
fcbf724e6e Make name="" and id="" play nicely together.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 02:58:30 -05:00
Edward Z. Yang
92344cc83a Add 4.0.0 release information.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-16 22:00:22 -05:00
284 changed files with 2903 additions and 1590 deletions

2
.gitignore

@@ -1,5 +1,7 @@
+tags
 conf/
 test-settings.php
+config-schema.php
 library/HTMLPurifier/DefinitionCache/Serializer/*/
 library/standalone/
 library/HTMLPurifier.standalone.php


@@ -31,7 +31,7 @@ PROJECT_NAME = HTMLPurifier
 # This could be handy for archiving the generated documentation or
 # if some version control system is used.
-PROJECT_NUMBER = 3.3.0
+PROJECT_NUMBER = 4.1.1
 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
 # base path where the generated documentation will be put.

2
FOCUS

@@ -1,4 +1,4 @@
-7 - Major bugfixes
+9 - Major security fixes
 [ Appendix A: Release focus IDs ]
 0 - N/A

24
INSTALL

@@ -231,12 +231,12 @@ HTML Purifier uses iconv to support other character encodings, as such,
 any encoding that iconv supports <http://www.gnu.org/software/libiconv/>
 HTML Purifier supports with this code:
-$config->set('Core', 'Encoding', /* put your encoding here */);
+$config->set('Core.Encoding', /* put your encoding here */);
 An example usage for Latin-1 websites (the most common encoding for English
 websites):
-$config->set('Core', 'Encoding', 'ISO-8859-1');
+$config->set('Core.Encoding', 'ISO-8859-1');
 Note that HTML Purifier's support for non-Unicode encodings is crippled by the
 fact that any character not supported by that encoding will be silently
@@ -251,7 +251,7 @@ reason, I do not include the solution in this document).
 For those of you using HTML 4.01 Transitional, you can disable
 XHTML output like this:
-$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional');
+$config->set('HTML.Doctype', 'HTML 4.01 Transitional');
 Other supported doctypes include:
@@ -277,14 +277,14 @@ are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
 %AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
 translates to:
-$config->set('Namespace', 'Directive', $value);
+$config->set('Namespace.Directive', $value);
 E.g.
-$config->set('HTML', 'Allowed', 'p,b,a[href],i');
-$config->set('URI', 'Base', 'http://www.example.com');
-$config->set('URI', 'MakeAbsolute', true);
-$config->set('AutoFormat', 'AutoParagraph', true);
+$config->set('HTML.Allowed', 'p,b,a[href],i');
+$config->set('URI.Base', 'http://www.example.com');
+$config->set('URI.MakeAbsolute', true);
+$config->set('AutoFormat.AutoParagraph', true);
 ---------------------------------------------------------------------------
@@ -318,11 +318,11 @@ If you are unable or unwilling to give write permissions to the cache
 directory, you can either disable the cache (and suffer a performance
 hit):
-$config->set('Core', 'DefinitionCache', null);
+$config->set('Core.DefinitionCache', null);
 Or move the cache directory somewhere else (no trailing slash):
-$config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
+$config->set('Cache.SerializerPath', '/home/user/absolute/path');
 ---------------------------------------------------------------------------
@@ -363,8 +363,8 @@ If your website is in a different encoding or doctype, use this code:
 require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
 $config = HTMLPurifier_Config::createDefault();
-$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
-$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
+$config->set('Core.Encoding', 'ISO-8859-1'); // replace with your encoding
+$config->set('HTML.Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
 $purifier = new HTMLPurifier($config);
 $clean_html = $purifier->purify($dirty_html);
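
Every change in the INSTALL diff follows one mechanical rule: the old (namespace, directive) argument pair collapses into a single dotted key. When migrating caller code, that rule could be captured in a tiny helper. `unifiedKey()` below is a hypothetical name and is not part of HTML Purifier.

```php
<?php
// Hypothetical migration helper: build the unified key from the two
// arguments that the pre-4.0 calls in the INSTALL diff used.
function unifiedKey(string $namespace, string $directive): string
{
    return $namespace . '.' . $directive;
}

// Old style: $config->set('Core', 'Encoding', 'ISO-8859-1');
// New style: $config->set(unifiedKey('Core', 'Encoding'), 'ISO-8859-1');
echo unifiedKey('HTML', 'Doctype'), "\n"; // prints "HTML.Doctype"
```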

76
NEWS

@@ -9,6 +9,82 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
 . Internal change
 ==========================
+4.1.1, released 2010-05-31
+- Fix undefined index warnings in maintenance scripts.
+- Fix bug in DirectLex for parsing elements with a single attribute
+with entities.
+- Rewrite CSS output logic for font-family and url(). Thanks Mario
+Heiderich <mario.heiderich@googlemail.com> for reporting and Takeshi
+Terada <t-terada@violet.plala.or.jp> for suggesting the fix.
+- Emit an error for CollectErrors if a body is extracted
+- Fix bug where in background-position for center keyword handling.
+- Fix infinite loop when a wrapper element is inserted in a context
+where it's not allowed. Thanks Lars <lars@renoz.dk> for reporting.
+- Remove +x bit and shebang from index.php; only supported mode is to
+explicitly call it with php.
+- Make test script less chatty when log_errors is on.
+4.1.0, released 2010-04-26
+! Support proprietary height attribute on table element
+! Support YouTube slideshows that contain /cp/ in their URL.
+! Support for data: URI scheme; not enabled by default, add it using
+%URI.AllowedSchemes
+! Support flashvars when using %HTML.SafeObject and %HTML.SafeEmbed.
+! Support for Internet Explorer compatibility with %HTML.SafeObject
+using %Output.FlashCompat.
+! Handle <ol><ol> properly, by inserting the necessary <li> tag.
+- Always quote the insides of url(...) in CSS.
+4.0.0, released 2009-07-07
+# APIs for ConfigSchema subsystem have substantially changed. See
+docs/dev-config-bcbreaks.txt for details; in essence, anything that
+had both namespace and directive now have a single unified key.
+# Some configuration directives were renamed, specifically:
+%AutoFormatParam.PurifierLinkifyDocURL -> %AutoFormat.PurifierLinkify.DocURL
+%FilterParam.ExtractStyleBlocksEscaping -> %Filter.ExtractStyleBlocks.Escaping
+%FilterParam.ExtractStyleBlocksScope -> %Filter.ExtractStyleBlocks.Scope
+%FilterParam.ExtractStyleBlocksTidyImpl -> %Filter.ExtractStyleBlocks.TidyImpl
+As usual, the old directive names will still work, but will throw E_NOTICE
+errors.
+# The allowed values for class have been relaxed to allow all of CDATA for
+doctypes that are not XHTML 1.1 or XHTML 2.0. For old behavior, set
+%Attr.ClassUseCDATA to false.
+# Instead of appending the content model to an old content model, a blank
+element will replace the old content model. You can use #SUPER to get
+the old content model.
+! More robust support for name="" and id=""
+! HTMLPurifier_Config::inherit($config) allows you to inherit one
+configuration, and have changes to that configuration be propagated
+to all of its children.
+! Implement %HTML.Attr.Name.UseCDATA, which relaxes validation rules on
+the name attribute when set. Use with care. Thanks Ian Cook for
+sponsoring.
+! Implement %AutoFormat.RemoveEmpty.RemoveNbsp, which removes empty
+tags that contain non-breaking spaces as well other whitespace. You
+can also modify which tags should have &nbsp; maintained with
+%AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions.
+! Implement %Attr.AllowedClasses, which allows administrators to restrict
+classes users can use to a specified finite set of classes, and
+%Attr.ForbiddenClasses, which is the logical inverse.
+! You can now maintain your own configuration schema directories by
+creating a config-schema.php file or passing an extra argument. Check
+docs/dev-config-schema.html for more details.
+! Added HTMLPurifier_Config->serialize() method, which lets you save away
+your configuration in a compact serial file, which you can unserialize
+and use directly without having to go through the overhead of setup.
+- Fix bug where URIDefinition would not get cleared if it's directives got
+changed.
+- Fix fatal error in HTMLPurifier_Encoder on certain platforms (probably NetBSD 5.0)
+- Fix bug in Linkify autoformatter involving <a><span>http://foo</span></a>
+- Make %URI.Munge not apply to links that have the same host as your host.
+- Prevent stray </body> tag from truncating output, if a second </body>
+is present.
+. Created script maintenance/rename-config.php for renaming a configuration
+directive while maintaining its alias. This script does not change source code.
+. Implement namespace locking for definition construction, to prevent
+bugs where a directive is used for definition construction but is not
+used to construct the cache hash.
 3.3.0, released 2009-02-16
 ! Implement CSS property 'overflow' when %CSS.AllowTricky is true.
 ! Implement generic property list classess

95
TODO

@@ -11,54 +11,60 @@ If no interest is expressed for a feature that may require a considerable
amount of effort to implement, it may get endlessly delayed. Do not be amount of effort to implement, it may get endlessly delayed. Do not be
afraid to cast your vote for the next feature to be implemented! afraid to cast your vote for the next feature to be implemented!
- Investigate how early internal structures can be accessed; this would Things to do as soon as possible:
prevent structures from being parsed and serialized multiple times.
- Built-in support for target="_blank" on all external links - Think about allowing explicit order of operations hooks for transforms
- Allow <a id="asdf" name="asdf"> - Inputs don't do the right thing with submit
- Convert configuration to allow an arbitrary number of namespaces; - Fix "<.<" bug (trailing < is removed if not EOD)
then rename as appropriate. - Build in better internal state dumps and debugging tools for remote
debugging
- Allowed/Allowed* have strange interactions when both set
- Transform lone embeds into object tags
FUTURE VERSIONS FUTURE VERSIONS
--------------- ---------------
4.1 release [It's All About Trust] (floating) 4.2 release [OMG CONFIG PONIES]
! Fix Printer. It's from the old days when we didn't have decent XML classes
! Factor demo.php into a set of Printer classes, and then create a stub
file for users here (inside the actual HTML Purifier library)
- Fix error handling with form construction
- Do encoding validation in Printers, or at least, where user data comes in
- Config: Add examples to everything (make built-in which also automatically
gives output)
- Add "register" field to config schemas to eliminate dependence on
naming conventions (try to remember why we ultimately decided on tihs)
5.0 release [HTML 5]
# Swap out code to use html5lib tokenizer and tree-builder
! Allow turning off of FixNesting and required attribute insertion
5.1 release [It's All About Trust] (floating)
# Implement untrusted, dangerous elements/attributes # Implement untrusted, dangerous elements/attributes
# Implement IDREF support (harder than it seems, since you cannot have # Implement IDREF support (harder than it seems, since you cannot have
IDREFs to non-existent IDs) IDREFs to non-existent IDs)
- Implement <area> (client and server side image maps are blocking
on IDREF support)
# Frameset XHTML 1.0 and HTML 4.01 doctypes # Frameset XHTML 1.0 and HTML 4.01 doctypes
- Implement <area>
- Figure out how to simultaneously set %CSS.Trusted and %HTML.Trusted (?) - Figure out how to simultaneously set %CSS.Trusted and %HTML.Trusted (?)
4.2 release [Error'ed] 5.2 release [Error'ed]
# Error logging for filtering/cleanup procedures # Error logging for filtering/cleanup procedures
- XSS-attempt detection--certain errors are flagged XSS-like
4.3 release [Do What I Mean, Not What I Say]
# Additional support for poorly written HTML # Additional support for poorly written HTML
- Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!) - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!)
- Friendly strict handling of <address> (block -> <br>) - Friendly strict handling of <address> (block -> <br>)
? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes: - XSS-attempt detection--certain errors are flagged XSS-like
1. Analyzing which tags to remove duplicants
2. Ensure attributes are merged into the parent tag
3. Extend the tag exclusion system to specify whether or not the
contents should be dropped or not (currently, there's code that could do
something like this if it didn't drop the inner text too.)
- Remove <span> tags that don't do anything (no attributes)
- Append something to duplicate IDs so they're still usable (impl. note: the - Append something to duplicate IDs so they're still usable (impl. note: the
dupe detector would also need to detect the suffix as well) dupe detector would also need to detect the suffix as well)
- Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg
5.0 release [Beyond HTML] 6.0 release [Beyond HTML]
# Legit token based CSS parsing (will require revamping almost every # Legit token based CSS parsing (will require revamping almost every
AttrDef class). Probably will use CSSTidy class? AttrDef class). Probably will use CSSTidy
# More control over allowed CSS properties using a modularization # More control over allowed CSS properties using a modularization
# HTML 5 support
# IRI support (this includes IDN) # IRI support (this includes IDN)
- Standardize token armor for all areas of processing - Standardize token armor for all areas of processing
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
Also, enable disabling of directionality
6.0 release [To XML and Beyond] 7.0 release [To XML and Beyond]
- Extended HTML capabilities based on namespacing and tag transforms (COMPLEX) - Extended HTML capabilities based on namespacing and tag transforms (COMPLEX)
- Hooks for adding custom processors to custom namespaced tags and - Hooks for adding custom processors to custom namespaced tags and
attributes, offer default implementation attributes, offer default implementation
@@ -69,27 +75,14 @@ Ongoing
- Refactor unit tests into lots of test methods - Refactor unit tests into lots of test methods
- Plugins for major CMSes (COMPLEX) - Plugins for major CMSes (COMPLEX)
- phpBB - phpBB
- Drupal needs loving! - Also, a FAQ for extension writers with HTML Purifier
- Phorum need loving!
- more! (look for ones that use WYSIWYGs)
- Also, maybe a FAQ for extension writers with HTML Purifier
AutoFormat AutoFormat
- Smileys - Smileys
- Syntax highlighting (with GeSHi) with <pre> and possibly <?php - Syntax highlighting (with GeSHi) with <pre> and possibly <?php
- Look at http://drupal.org/project/Modules/category/63 for ideas - Look at http://drupal.org/project/Modules/category/63 for ideas
Optimizations
- Reduce size of internal data-structures (esp. HTMLDefinition)
- Research memory usage of objects versus arrays
- Combine multiple strategies into a single, single-pass strategy
- Get PH5P working with the latest versions of DOM, which have much more
stringent error checking procedures. Maybe convert straight to tokens.
- Get rid of set_include_path(). Save this for another major release.
Neat feature related Neat feature related
! Factor demo.php into a set of Printer classes, and then create a stub
file for users here (inside the actual HTML Purifier library)
! Support exporting configuration, so users can easily tweak settings ! Support exporting configuration, so users can easily tweak settings
in the demo, and then copy-paste into their own setup in the demo, and then copy-paste into their own setup
- Advanced URI filtering schemes (see docs/proposal-new-directives.txt) - Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
@@ -106,14 +99,28 @@ Neat feature related
- Full set of color keywords. Also, a way to add onto them without - Full set of color keywords. Also, a way to add onto them without
finalizing the configuration object. finalizing the configuration object.
- Write a var_export and memcached DefinitionCache - Denis - Write a var_export and memcached DefinitionCache - Denis
- Allow restriction of allowed class values - Built-in support for target="_blank" on all external links
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
Also, enable disabling of directionality
? Externalize inline CSS to promote clean HTML, proposed by Sander Tekelenburg
? Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
1. Analyzing which tags to remove duplicants
2. Ensure attributes are merged into the parent tag
3. Extend the tag exclusion system to specify whether or not the
contents should be dropped or not (currently, there's code that could do
something like this if it didn't drop the inner text too.)
Maintenance related (slightly boring) Maintenance related (slightly boring)
# CHMOD install script for PEAR installs # CHMOD install script for PEAR installs
! Factor out command line parser into its own class, and unit test it ! Factor out command line parser into its own class, and unit test it
! Nested configuration namespaces - Reduce size of internal data-structures (esp. HTMLDefinition)
- Distinguish between default settings and explicitly set settings, so - Allow merging configurations. Thus,
configurations can be merged a -> b -> default
c -> d -> default
becomes
a -> b -> c -> d -> default
Maybe allow more fine-grained tuning of this behavior. Alternatively,
encourage people to use short plist depths before building them up.
- Time PHPT tests - Time PHPT tests
ChildDef related (very boring) ChildDef related (very boring)


@@ -1 +1 @@
-3.3.0
+4.1.1

View File

@@ -1,6 +1,5 @@
-HTML Purifier 3.3.0 is fixes a number of obscure bugs reported and fixed
-over a four month period. It is probably the last release in the 3.x
-series. Notable new features include support for the overflow CSS
-property; notable bugfixes include fixed YouTube rendering in certain
-versions of Firefox, CSSDefinition Printer, improved early PHP support
-and bugs in iconv.
+HTML Purifier 4.1.1 is a major security and bugfix release that
+improves on 4.1's fix for an XSS vulnerability exploitable on Internet
+Explorer. It also contains a number of important bugfixes, including
+the removal of improper logic that could result in infinite loops and
+fixed parsing for single-attributes with entities with DirectLex.


@@ -52,4 +52,5 @@
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -16,4 +16,5 @@ function qs(el) {if (window.RegExp && window.encodeURIComponent) {var ue=el.href
// --> // -->
</script><table border=0 cellspacing=0 cellpadding=4><tr><td nowrap><font size=-1><b>Web</b>&nbsp;&nbsp;&nbsp;&nbsp;<a id=1a class=q href="/imghp?hl=en&tab=wi" onClick="return qs(this);">Images</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=2a class=q href="http://groups.google.com/grphp?hl=en&tab=wg" onClick="return qs(this);">Groups</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=4a class=q href="http://news.google.com/nwshp?hl=en&tab=wn" onClick="return qs(this);">News</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=5a class=q href="http://froogle.google.com/frghp?hl=en&tab=wf" onClick="return qs(this);">Froogle</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=8a class=q href="/lochp?hl=en&tab=wl" onClick="return qs(this);">Local</a>&nbsp;&nbsp;&nbsp;&nbsp;<b><a href="/intl/en/options/" class=q>more&nbsp;&raquo;</a></b></font></td></tr></table><table cellspacing=0 cellpadding=0><tr><td width=25%>&nbsp;</td><td align=center><input type=hidden name=hl value=en><input maxlength=2048 size=55 name=q value="" title="Google Search"><br><input type=submit value="Google Search" name=btnG><input type=submit value="I'm Feeling Lucky" name=btnI></td><td valign=top nowrap width=25%><font size=-2>&nbsp;&nbsp;<a href=/advanced_search?hl=en>Advanced Search</a><br>&nbsp;&nbsp;<a href=/preferences?hl=en>Preferences</a><br>&nbsp;&nbsp;<a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><br><font size=-1><a href="/ads/">Advertising&nbsp;Programs</a> - <a href=/services/>Business Solutions</a> - <a href=/about.html>About Google</a></font><p><font size=-2>&copy;2006 Google</font></p></center></body></html> </script><table border=0 cellspacing=0 cellpadding=4><tr><td nowrap><font size=-1><b>Web</b>&nbsp;&nbsp;&nbsp;&nbsp;<a id=1a class=q href="/imghp?hl=en&tab=wi" onClick="return qs(this);">Images</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=2a class=q href="http://groups.google.com/grphp?hl=en&tab=wg" onClick="return qs(this);">Groups</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=4a class=q 
href="http://news.google.com/nwshp?hl=en&tab=wn" onClick="return qs(this);">News</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=5a class=q href="http://froogle.google.com/frghp?hl=en&tab=wf" onClick="return qs(this);">Froogle</a>&nbsp;&nbsp;&nbsp;&nbsp;<a id=8a class=q href="/lochp?hl=en&tab=wl" onClick="return qs(this);">Local</a>&nbsp;&nbsp;&nbsp;&nbsp;<b><a href="/intl/en/options/" class=q>more&nbsp;&raquo;</a></b></font></td></tr></table><table cellspacing=0 cellpadding=0><tr><td width=25%>&nbsp;</td><td align=center><input type=hidden name=hl value=en><input maxlength=2048 size=55 name=q value="" title="Google Search"><br><input type=submit value="Google Search" name=btnG><input type=submit value="I'm Feeling Lucky" name=btnI></td><td valign=top nowrap width=25%><font size=-2>&nbsp;&nbsp;<a href=/advanced_search?hl=en>Advanced Search</a><br>&nbsp;&nbsp;<a href=/preferences?hl=en>Preferences</a><br>&nbsp;&nbsp;<a href=/language_tools?hl=en>Language Tools</a></font></td></tr></table></form><br><br><font size=-1><a href="/ads/">Advertising&nbsp;Programs</a> - <a href=/services/>Business Solutions</a> - <a href=/about.html>About Google</a></font><p><font size=-2>&copy;2006 Google</font></p></center></body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -127,4 +127,5 @@ if (objAdMgr.isSlotAvailable("leaderboard")) {
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -539,4 +539,5 @@ Retrieved from "<a href="http://en.wikipedia.org/wiki/Tai_Chi_Chuan">http://en.w
<!-- Served by srv25 in 0.089 secs. --> <!-- Served by srv25 in 0.089 secs. -->
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -18,22 +18,24 @@ TODO:
if (version_compare(PHP_VERSION, '5.2', '<')) exit('PHP 5.2+ required.'); if (version_compare(PHP_VERSION, '5.2', '<')) exit('PHP 5.2+ required.');
error_reporting(E_ALL | E_STRICT); error_reporting(E_ALL | E_STRICT);
chdir(dirname(__FILE__));
// load dual-libraries // load dual-libraries
require_once '../extras/HTMLPurifierExtras.auto.php'; require_once dirname(__FILE__) . '/../extras/HTMLPurifierExtras.auto.php';
require_once '../library/HTMLPurifier.auto.php'; require_once dirname(__FILE__) . '/../library/HTMLPurifier.auto.php';
// setup HTML Purifier singleton // setup HTML Purifier singleton
HTMLPurifier::getInstance(array( HTMLPurifier::getInstance(array(
'AutoFormat.PurifierLinkify' => true 'AutoFormat.PurifierLinkify' => true
)); ));
$interchange = HTMLPurifier_ConfigSchema_InterchangeBuilder::buildFromDirectory(); $builder = new HTMLPurifier_ConfigSchema_InterchangeBuilder();
$interchange = new HTMLPurifier_ConfigSchema_Interchange();
$builder->buildDir($interchange);
$loader = dirname(__FILE__) . '/../config-schema.php';
if (file_exists($loader)) include $loader;
$interchange->validate(); $interchange->validate();
$style = 'plain'; // use $_GET in the future, careful to validate! $style = 'plain'; // use $_GET in the future, careful to validate!
$configdoc_xml = 'configdoc.xml'; $configdoc_xml = dirname(__FILE__) . '/configdoc.xml';
$xml_builder = new HTMLPurifier_ConfigSchema_Builder_Xml(); $xml_builder = new HTMLPurifier_ConfigSchema_Builder_Xml();
$xml_builder->openURI($configdoc_xml); $xml_builder->openURI($configdoc_xml);
@@ -50,13 +52,13 @@ if (!$output) {
} }
// write out // write out
file_put_contents("$style.html", $output); file_put_contents(dirname(__FILE__) . "/$style.html", $output);
if (php_sapi_name() != 'cli') { if (php_sapi_name() != 'cli') {
// output (instant feedback if it's a browser) // output (instant feedback if it's a browser)
echo $output; echo $output;
} else { } else {
echo 'Files generated successfully.'; echo "Files generated successfully.\n";
} }
// vim: et sw=4 sts=4 // vim: et sw=4 sts=4


@@ -232,4 +232,5 @@
</xsl:stylesheet> </xsl:stylesheet>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -13,4 +13,5 @@
<type id="mixed">Mixed</type> <type id="mixed">Mixed</type>
</types> </types>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -6,6 +6,7 @@
</file> </file>
<file name="HTMLPurifier/Lexer.php"> <file name="HTMLPurifier/Lexer.php">
<line>81</line> <line>81</line>
<line>269</line>
</file> </file>
<file name="HTMLPurifier/Lexer/DirectLex.php"> <file name="HTMLPurifier/Lexer/DirectLex.php">
<line>53</line> <line>53</line>
@@ -85,22 +86,27 @@
</directive> </directive>
<directive id="Output.CommentScriptContents"> <directive id="Output.CommentScriptContents">
<file name="HTMLPurifier/Generator.php"> <file name="HTMLPurifier/Generator.php">
<line>45</line> <line>56</line>
</file> </file>
</directive> </directive>
<directive id="Output.SortAttr"> <directive id="Output.SortAttr">
<file name="HTMLPurifier/Generator.php"> <file name="HTMLPurifier/Generator.php">
<line>46</line> <line>57</line>
</file>
</directive>
<directive id="Output.FlashCompat">
<file name="HTMLPurifier/Generator.php">
<line>58</line>
</file> </file>
</directive> </directive>
<directive id="Output.TidyFormat"> <directive id="Output.TidyFormat">
<file name="HTMLPurifier/Generator.php"> <file name="HTMLPurifier/Generator.php">
<line>75</line> <line>87</line>
</file> </file>
</directive> </directive>
<directive id="Output.Newline"> <directive id="Output.Newline">
<file name="HTMLPurifier/Generator.php"> <file name="HTMLPurifier/Generator.php">
<line>89</line> <line>101</line>
</file> </file>
</directive> </directive>
<directive id="HTML.BlockWrapper"> <directive id="HTML.BlockWrapper">
@@ -208,6 +214,14 @@
<line>267</line> <line>267</line>
</file> </file>
</directive> </directive>
<directive id="URI.">
<file name="HTMLPurifier/URIDefinition.php">
<line>55</line>
</file>
<file name="HTMLPurifier/URIFilter/Munge.php">
<line>12</line>
</file>
</directive>
<directive id="URI.Host"> <directive id="URI.Host">
<file name="HTMLPurifier/URIDefinition.php"> <file name="HTMLPurifier/URIDefinition.php">
<line>64</line> <line>64</line>
@@ -225,12 +239,12 @@
</directive> </directive>
<directive id="URI.AllowedSchemes"> <directive id="URI.AllowedSchemes">
<file name="HTMLPurifier/URISchemeRegistry.php"> <file name="HTMLPurifier/URISchemeRegistry.php">
<line>42</line> <line>41</line>
</file> </file>
</directive> </directive>
<directive id="URI.OverrideAllowedSchemes"> <directive id="URI.OverrideAllowedSchemes">
<file name="HTMLPurifier/URISchemeRegistry.php"> <file name="HTMLPurifier/URISchemeRegistry.php">
<line>43</line> <line>42</line>
</file> </file>
</directive> </directive>
<directive id="URI.Disable"> <directive id="URI.Disable">
@@ -246,6 +260,16 @@
<line>12</line> <line>12</line>
</file> </file>
</directive> </directive>
<directive id="Attr.AllowedClasses">
<file name="HTMLPurifier/AttrDef/HTML/Class.php">
<line>18</line>
</file>
</directive>
<directive id="Attr.ForbiddenClasses">
<file name="HTMLPurifier/AttrDef/HTML/Class.php">
<line>19</line>
</file>
</directive>
<directive id="Attr.AllowedFrameTargets"> <directive id="Attr.AllowedFrameTargets">
<file name="HTMLPurifier/AttrDef/HTML/FrameTarget.php"> <file name="HTMLPurifier/AttrDef/HTML/FrameTarget.php">
<line>15</line> <line>15</line>
@@ -272,6 +296,11 @@
<line>54</line> <line>54</line>
</file> </file>
</directive> </directive>
<directive id="Attr.">
<file name="HTMLPurifier/AttrDef/HTML/LinkTypes.php">
<line>30</line>
</file>
</directive>
<directive id="Attr.DefaultTextDir"> <directive id="Attr.DefaultTextDir">
<file name="HTMLPurifier/AttrTransform/BdoDir.php"> <file name="HTMLPurifier/AttrTransform/BdoDir.php">
<line>13</line> <line>13</line>
@@ -297,7 +326,15 @@
</directive> </directive>
<directive id="Attr.DefaultInvalidImageAlt"> <directive id="Attr.DefaultInvalidImageAlt">
<file name="HTMLPurifier/AttrTransform/ImgRequired.php"> <file name="HTMLPurifier/AttrTransform/ImgRequired.php">
<line>32</line> <line>33</line>
</file>
</directive>
<directive id="HTML.Attr.Name.UseCDATA">
<file name="HTMLPurifier/AttrTransform/Name.php">
<line>11</line>
</file>
<file name="HTMLPurifier/HTMLModule/Name.php">
<line>13</line>
</file> </file>
</directive> </directive>
<directive id="Core.EscapeInvalidChildren"> <directive id="Core.EscapeInvalidChildren">
@@ -310,17 +347,17 @@
<line>91</line> <line>91</line>
</file> </file>
</directive> </directive>
<directive id="FilterParam.ExtractStyleBlocksTidyImpl"> <directive id="Filter.ExtractStyleBlocks.TidyImpl">
<file name="HTMLPurifier/Filter/ExtractStyleBlocks.php"> <file name="HTMLPurifier/Filter/ExtractStyleBlocks.php">
<line>41</line> <line>41</line>
</file> </file>
</directive> </directive>
<directive id="FilterParam.ExtractStyleBlocksScope"> <directive id="Filter.ExtractStyleBlocks.Scope">
<file name="HTMLPurifier/Filter/ExtractStyleBlocks.php"> <file name="HTMLPurifier/Filter/ExtractStyleBlocks.php">
<line>65</line> <line>65</line>
</file> </file>
</directive> </directive>
<directive id="FilterParam.ExtractStyleBlocksEscaping"> <directive id="Filter.ExtractStyleBlocks.Escaping">
<file name="HTMLPurifier/Filter/ExtractStyleBlocks.php"> <file name="HTMLPurifier/Filter/ExtractStyleBlocks.php">
<line>123</line> <line>123</line>
</file> </file>
@@ -351,11 +388,21 @@
<line>50</line> <line>50</line>
</file> </file>
</directive> </directive>
<directive id="AutoFormatParam.PurifierLinkifyDocURL"> <directive id="AutoFormat.PurifierLinkify.DocURL">
<file name="HTMLPurifier/Injector/PurifierLinkify.php"> <file name="HTMLPurifier/Injector/PurifierLinkify.php">
<line>15</line> <line>15</line>
</file> </file>
</directive> </directive>
<directive id="AutoFormat.RemoveEmpty.RemoveNbsp">
<file name="HTMLPurifier/Injector/RemoveEmpty.php">
<line>12</line>
</file>
</directive>
<directive id="AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions">
<file name="HTMLPurifier/Injector/RemoveEmpty.php">
<line>13</line>
</file>
</directive>
<directive id="Core.AggressivelyFixLt"> <directive id="Core.AggressivelyFixLt">
<file name="HTMLPurifier/Lexer/DOMLex.php"> <file name="HTMLPurifier/Lexer/DOMLex.php">
<line>44</line> <line>44</line>


@@ -17,202 +17,10 @@
<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div> <div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>
<p> <p>
<strong>Warning:</strong> This document may be out-of-date. When in doubt, Please see <a href="enduser-customize.html">Customize!</a>
consult the source code documentation.
</p> </p>
<p>HTML Purifier currently natively supports only a subset of HTML's
allowed elements, attributes, and behavior; specifically, this subset
is the set of elements that are safe for untrusted users to use.
However, HTML Purifier is often utilized to ensure standards-compliance
from input that is trusted (making it a sort of Tidy substitute),
and often users need to define new elements or attributes. The
advanced API is oriented specifically for these use-cases.</p>
<p>Our goals are to let the user:</p>
<dl>
<dt>Select</dt>
<dd><ul>
<li>Doctype</li>
<!-- <li>Filterset</li> -->
<li>Elements / Attributes / Modules</li>
<li>Tidy</li>
</ul></dd>
<dt>Customize</dt>
<dd><ul>
<li>Attributes</li>
<li>Elements</li>
<!--<li>Doctypes</li>-->
</ul></dd>
</dl>
<h2>Select</h2>
<p>For basic use, the user will have to specify some basic parameters. This
is not strictly necessary, as HTML Purifier's default setting will always
output safe code, but is required for standards-compliant output.</p>
<h3>Selecting a Doctype</h3>
<p>The first thing to select is the <strong>doctype</strong>. This
is essential for standards-compliant output.</p>
<p class="technical">This identifier is based
on the name the W3C has given to the document type and <em>not</em>
the DTD identifier.</p>
<p>This parameter is set via the configuration object:</p>
<pre>$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional');</pre>
<p>For historical reasons, the default doctype is XHTML 1.0
Transitional; however, we really shouldn't be guessing what the user's
doctype is. Fortunately, people who can't be bothered to set this won't
be bothered when their pages stop validating.</p>
<h3>Selecting Elements / Attributes / Modules</h3>
<p>HTML Purifier will, by default, allow as many elements and attributes
as possible. However, a user may decide to roll their own filterset by
selecting modules, elements and attributes to allow for their own
specific use-case. This can be done using %HTML.Allowed:</p>
<pre>$config->set('HTML', 'Allowed', 'a[href|title],em,p,blockquote');</pre>
<p class="technical">The directive %HTML.Allowed is a convenience feature
that may be fully expressed with the legacy interface.</p>
<p>We currently support another interface from older versions:</p>
<pre>$config->set('HTML', 'AllowedElements', 'a,em,p,blockquote');
$config->set('HTML', 'AllowedAttributes', 'a.href,a.title');</pre>
<p>A user may also choose to allow modules using a specialized
directive:</p>
<pre>$config->set('HTML', 'AllowedModules', 'Hypertext,Text,Lists');</pre>
<p>But it is not expected that this feature will be widely used.</p>
<p class="technical">Module selection will work slightly differently
from the other AllowedElements and AllowedAttributes directives by
directly modifying the doctype you are operating in, in the spirit of
XHTML 1.1's modularization. We stop users from shooting themselves in the
foot by mandating the modules in %HTML.CoreModules be used.</p>
<p class="technical">Modules are distinguished from regular elements by the
case of their first letter. While XML distinguishes between and allows
lower and uppercase letters in element names, XHTML uses only lower-case
element names for sake of consistency.</p>
<h3>Selecting Tidy</h3>
<p>The name of this segment of functionality is inspired by Dave
Raggett's program HTML Tidy, which purported to help clean up HTML. In
HTML Purifier, Tidy functionality involves turning unsupported and
deprecated elements into standards-compliant ones, maintaining
backwards compatibility, and enforcing best practices.</p>
<p>This is a complicated feature, and is explained more in depth at
<a href="enduser-tidy.html">the Tidy documentation page</a>.</p>
<!--
<h3>Unified selector</h3>
<p>Because selecting each and every one of these configuration options
is a chore, we may wish to offer a specialized configuration method
for selecting a filterset. Possibility:</p>
<pre>function selectFilter($doctype, $filterset, $tidy)</pre>
<p>...which is simply a light wrapper over the individual configuration
calls. A custom config file format or text format could also be adopted.</p>
-->
<h2>Customize</h2>
<p>By reviewing topic posts in the support forum, we determined that
there were two primarily demanded customization features people wanted:
to add an attribute to an existing element, and to add an element.
Thus, we'll want to create convenience functions for these common
use-cases.</p>
<p>Note that the functions described here are only available if
a raw copy of <code>HTMLPurifier_HTMLDefinition</code> was retrieved.
Furthermore, caching may prevent your changes from immediately
being seen: consult <a href="enduser-customize.html">enduser-customize.html</a> on how
to work around this.</p>
<h3>Attributes</h3>
<p>An attribute is bound to an element by a name and has a specific
<code>AttrDef</code> that validates it. The interface is therefore:</p>
<pre>function addAttribute($element, $attribute, $attribute_def);</pre>
<p>Example of the functionality in action:</p>
<pre>$def->addAttribute('a', 'rel', 'Enum#nofollow');</pre>
<p>The <code>$attribute_def</code> value is flexible,
to make things simpler. It can be a literal object or:</p>
<ul>
<!--<li>Class name: We'll instantiate it for you</li>
<li>Function name: We'll create an <code>HTMLPurifier_AttrDef_Anonymous</code>
class with that function registered as a callback.</li>-->
<li>String attribute type: We'll use <code>HTMLPurifier_AttrTypes</code>
to resolve it for you. Any data that follows a hash mark (#) will
be used to customize the attribute type: in the example above,
we specify which values for Enum to allow.</li>
</ul>
<h3>Elements</h3>
<p>An element requires certain information as specified by
<code>HTMLPurifier_ElementDef</code>. However, not all of it is necessary,
the usual things required are:</p>
<ul>
<li>Attributes</li>
<li>Content model/type</li>
<li>Registration in a content set</li>
</ul>
<p>This suggests an API like this:</p>
<pre>function addElement($element, $type, $contents,
$attr_collections = array(), $attributes = array());</pre>
<p>Each parameter explained in depth:</p>
<dl>
<dt><code>$element</code></dt>
<dd>Element name, ex. 'label'</dd>
<dt><code>$type</code></dt>
<dd>Content set to register in, ex. 'Inline' or 'Flow'</dd>
<dt><code>$contents</code></dt>
<dd>Description of allowed children. This is a merged form of
<code>HTMLPurifier_ElementDef</code>'s member variables
<code>$content_model</code> and <code>$content_model_type</code>,
where the form is <q>Type: Model</q>, ex. 'Optional: Inline'.
There are also a number of predefined templates one may use.</dd>
<dt><code>$attr_collections</code></dt>
<dd>Array (or string if only one) of attribute collection(s) to
merge into the attributes array.</dd>
<dt><code>$attributes</code></dt>
<dd>Array of attribute names to attribute definitions, much like
the above-described attribute customization.</dd>
</dl>
<p>A possible usage:</p>
<pre>$def->addElement('font', 'Inline', 'Optional: Inline', 'Common',
array('color' => 'Color'));</pre>
<p>See <code>HTMLPurifier/HTMLModule.php</code> for details.</p>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -0,0 +1,79 @@
Configuration Backwards-Compatibility Breaks
In version 4.0.0, the configuration subsystem (composed of the outward-facing
Config class, as well as the ConfigSchema and ConfigSchema_Interchange
subsystems) was significantly revamped to make use of property lists.
While most of the changes are internal, some internal APIs were changed for the
sake of clarity. HTMLPurifier_Config was kept completely backwards compatible,
although some of the functions were retrofitted with an unambiguous alternate
syntax. Both of these changes are discussed in this document.
1. Outwards Facing Changes
--------------------------------------------------------------------------------
The HTMLPurifier_Config class now takes an alternate syntax. The general rule
is:
If you passed $namespace, $directive, pass "$namespace.$directive"
instead.
An example:
$config->set('HTML', 'Allowed', 'p');
becomes:
$config->set('HTML.Allowed', 'p');
New configuration options may have more than one namespace; they might
look something like %Filter.YouTube.Blacklist. While you could technically
set it with ('Filter', 'YouTube.Blacklist'), the logical extension
('Filter', 'YouTube', 'Blacklist') does not work.
The old API will still work, but will emit E_USER_NOTICEs.
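As a rough illustration of the rule above, the following sketch joins a legacy
pair into the dotted key form. The helper name legacy_to_key() is hypothetical
and is not part of HTML Purifier; it only shows the string transformation, not
how HTMLPurifier_Config performs it internally.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): joins a legacy
// ($namespace, $directive) pair into the single dotted key.
function legacy_to_key($namespace, $directive)
{
    return $namespace . '.' . $directive;
}

// Legacy call:  $config->set('HTML', 'Allowed', 'p');
// Dotted call:  $config->set(legacy_to_key('HTML', 'Allowed'), 'p');
echo legacy_to_key('HTML', 'Allowed'), "\n";             // HTML.Allowed

// A directive with more than one namespace still splits into two parts only.
echo legacy_to_key('Filter', 'YouTube.Blacklist'), "\n"; // Filter.YouTube.Blacklist
```

When migrating calling code, passing the dotted key directly is the simpler
choice, since it reads the same way the directive is written in documentation.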
2. Internal API Changes
--------------------------------------------------------------------------------
Some overarching notes: we've completely eliminated the notion of namespace;
it's now an informal construct for organizing related configuration directives.
Also, the validation routines for keys (formerly "$namespace.$directive")
have been completely relaxed. I don't think it really should be necessary.
2.1 HTMLPurifier_ConfigSchema
First off, if you're interfacing with this class, you really shouldn't.
HTMLPurifier_ConfigSchema_Builder_ConfigSchema is really the only class that
should ever be creating HTMLPurifier_ConfigSchema, and HTMLPurifier_Config the
only class that should be reading it.
All namespace related methods were removed; they are completely unnecessary
now. Any $namespace, $name arguments must be replaced with $key (where
$key == "$namespace.$name"), including for addAlias().
The $info and $defaults member variables are no longer indexed as
[$namespace][$name]; they are now indexed as ["$namespace.$name"].
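
The change of indexing can be pictured with a small sketch. The function
flatten_schema_array() is hypothetical (it does not exist in HTML Purifier),
and the sample values are made up for illustration; it only shows how a
two-level [$namespace][$name] array maps onto the flat ["$namespace.$name"]
layout.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): converts the old
// two-level [$namespace][$name] layout into the flat "$namespace.$name" one.
function flatten_schema_array(array $nested)
{
    $flat = array();
    foreach ($nested as $namespace => $directives) {
        foreach ($directives as $name => $value) {
            $flat["$namespace.$name"] = $value;
        }
    }
    return $flat;
}

// Illustrative data only; these are not the library's real defaults.
$old = array(
    'HTML' => array('Allowed' => null, 'Doctype' => 'sample-doctype'),
    'Core' => array('Encoding' => 'sample-encoding'),
);

// Keys become HTML.Allowed, HTML.Doctype and Core.Encoding.
print_r(array_keys(flatten_schema_array($old)));
```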
All deprecated methods were finally removed, after having yelled at you as
an E_USER_NOTICE for a while now.
2.2 HTMLPurifier_ConfigSchema_Interchange
Member variable $namespaces was removed.
2.3 HTMLPurifier_ConfigSchema_Interchange_Id
Member variables $namespace and $directive were removed; member variable $key was added.
Any method that took $namespace, $directive now takes $key.
2.4 HTMLPurifier_ConfigSchema_Interchange_Namespace
Removed.
vim: et sw=4 sts=4

docs/dev-config-naming.txt Normal file

@@ -0,0 +1,164 @@
Configuration naming
HTML Purifier 4.0.0 features a new configuration naming system that
allows arbitrary nesting of namespaces. While there are certain cases
in which using two namespaces is obviously better (the canonical example
is where we were using AutoFormatParam to contain directives for AutoFormat
parameters), it is unclear whether or not a general migration to highly
namespaced directives is a good idea or not.
== Case studies ==
=== Attr.* ===
We have a dead duck HTML.Attr.Name.UseCDATA which migrated before we decided
to think this out thoroughly.
We currently have a large number of directives in the Attr.* namespace.
These directives tweak the behavior of some HTML attributes. They have
the properties:
* While they apply to only one attribute at a time, the attribute can
span over multiple elements (not necessarily all attributes, either).
The information of which elements it impacts is either omitted or
informally stated (EnableID applies to all elements, DefaultImageAlt
applies to <img> tags, AllowedRev doesn't say but only applies to a tags).
* There is a certain degree of clustering that could be applied, especially
to the ID directives. The clustering could be done with respect to
what element/attribute was used, i.e.
*.id -> EnableID, IDBlacklistRegexp, IDBlacklist, IDPrefixLocal, IDPrefix
img.src -> DefaultInvalidImage
img.alt -> DefaultImageAlt, DefaultInvalidImageAlt
bdo.dir -> DefaultTextDir
a.rel -> AllowedRel
a.rev -> AllowedRev
a.target -> AllowedFrameTargets
a.name -> Name.UseCDATA
* The directives often reference generic attribute types that were specified
in the DTD/specification. However, some of the behavior specifically relies
on the fact that other use cases of the attribute are not, at current,
supported by HTML Purifier.
AllowedRel, AllowedRev -> heavily <a> specific; if <link> ends up being
allowed, we will also have to give users specificity there (we also
want to preserve generality) DTD %Linktypes, HTML5 distinguishes
between <link> and <a>/<area>
AllowedFrameTargets -> heavily <a> specific, but also used by <area>
and <form>. Transitional DTD %FrameTarget, not present in strict,
HTML5 calls them "browsing contexts"
Default*Image* -> as a default parameter, is almost entirely exclusive
to <img>
EnableID -> global attribute
Name.UseCDATA -> heavily <a> specific, but has heavy other usage by
many things
== AutoFormat.* ==
These have the fairly normal pluggable architecture that lends itself to
large amounts of namespaces (pluggability may be the key to figuring
out when gratuitous namespacing is good.) Properties:
* Boolean directives are fair game for being namespaced: for example,
RemoveEmpty.RemoveNbsp triggers RemoveEmpty.RemoveNbsp.Exceptions,
the latter of which only makes sense when RemoveEmpty.RemoveNbsp
is set to true. (The same applies to RemoveNbsp too)
The AutoFormat string is a bit long, but is the only bit of repeated
context.
== Core.* ==
Core is the potpourri of directives, mostly regarding some minor behavioral
tweaks for HTML handling abilities.
AggressivelyFixLt
ConvertDocumentToFragment
DirectLexLineNumberSyncInterval
LexerImpl
MaintainLineNumbers
Lexer
CollectErrors
Language
Error handling (Language is ostensibly a little more general, but
it's only used for error handling right now)
ColorKeywords
CSS and HTML
Encoding
EscapeNonASCIICharacters
Character encoding
EscapeInvalidChildren
EscapeInvalidTags
HiddenElements
RemoveInvalidImg
Lexing/Output
RemoveScriptContents
Deprecated
== HTML.* ==
AllowedAttributes
AllowedElements
AllowedModules
Allowed
ForbiddenAttributes
ForbiddenElements
Element set tuning
BlockWrapper
Child def advanced twiddle
CoreModules
CustomDoctype
Advanced HTMLModuleManager twiddles
DefinitionID
DefinitionRev
Caching
Doctype
Parent
Strict
XHTML
Global environment
MaxImgLength
Attribute twiddle? (applies to two attributes)
Proprietary
SafeEmbed
SafeObject
Trusted
Extra functionality/tagsets
TidyAdd
TidyLevel
TidyRemove
Tidy
== Output.* ==
These directly affect the output of Generator. These are all advanced
twiddles.
== URI.* ==
AllowedSchemes
OverrideAllowedSchemes
Scheme tuning
Base
DefaultScheme
Host
Global environment
DefinitionID
DefinitionRev
Caching
DisableExternalResources
DisableExternal
DisableResources
Disable
Contextual/authority tuning
HostBlacklist
Authority tuning
MakeAbsolute
MungeResources
MungeSecretKey
Munge
Transformation behavior (munge can be grouped)


@@ -114,7 +114,7 @@ Test.Example</pre>
</tr> </tr>
<tr> <tr>
<td>VALUE-ALIASES</td> <td>VALUE-ALIASES</td>
<td>'baz' => 'bar'</td> <td>'baz' =&gt; 'bar'</td>
<td><em>Optional</em>. Mapping of one value to another, and <td><em>Optional</em>. Mapping of one value to another, and
should be a comma separated list of keypair duples. This should be a comma separated list of keypair duples. This
is only allowed string, istring, text and itext TYPEs.</td> is only allowed string, istring, text and itext TYPEs.</td>
@@ -213,7 +213,7 @@ Test.Example</pre>
</tr> </tr>
<tr> <tr>
<td>lookup</td> <td>lookup</td>
<td>array('key' => true)</td> <td>array('key' =&gt; true)</td>
<td>Lookup array, used with <code>isset($var[$key])</code></td> <td>Lookup array, used with <code>isset($var[$key])</code></td>
</tr> </tr>
<tr> <tr>
@@ -223,7 +223,7 @@ Test.Example</pre>
</tr> </tr>
<tr> <tr>
<td>hash</td> <td>hash</td>
<td>array('key' => 'val')</td> <td>array('key' =&gt; 'val')</td>
<td>Associative array of keys to values</td> <td>Associative array of keys to values</td>
</tr> </tr>
<tr> <tr>
@@ -267,6 +267,41 @@ Test.Example</pre>
If you ever make changes to your configuration directives, you If you ever make changes to your configuration directives, you
will need to run this script again. will need to run this script again.
</p> </p>
<h2>Adding in-house schema definitions</h2>
<p>
Placing stuff directly in HTML Purifier's source tree is generally not a
good idea, so HTML Purifier 4.0.0+ has some facilities in place to make your
life easier.
</p>
<p>
The first is to pass an extra parameter to <code>maintenance/generate-schema-cache.php</code>
with the location of your directory (relative or absolute path will do). For example,
if I'm storing my custom definitions in <em>/var/htmlpurifier/myschema</em>, run:
<code>php maintenance/generate-schema-cache.php /var/htmlpurifier/myschema</code>.
</p>
<p>
Alternatively, you can create a small loader PHP file in the HTML Purifier base
directory named <code>config-schema.php</code> (this is the same directory
you would place a <code>test-settings.php</code> file). In this file, add
the following line for each directory you want to load:
</p>
<pre>$builder-&gt;buildDir($interchange, '/var/htmlpurifier/myschema');</pre>
<p>You can even load a single file using:</p>
<pre>$builder-&gt;buildFile($interchange, '/var/htmlpurifier/myschema/MyApp.Directive.txt');</pre>
<p>Storing custom definitions that you don't plan on sending back upstream in
a separate directory is <em>definitely</em> a good idea! Additionally, picking
a good namespace can go a long way to saving you grief if you want to use
someone else's change, but they picked the same name, or if HTML Purifier
decides to add support for a configuration directive that has the same name.</p>
<!-- TODO: how to name directives that rely on naming conventions -->
<h2>Errors</h2> <h2>Errors</h2>
@@ -373,4 +408,5 @@ Test.Example</pre>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -64,4 +64,5 @@
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -79,4 +79,5 @@ help you find the correct functionality more quickly. Here they are:</p>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -29,4 +29,5 @@ that itch, put it here!</p>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -305,4 +305,5 @@ Mozilla on inside and needs -moz-outline, no IE support.</td></tr>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->


@@ -18,12 +18,11 @@
<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div> <div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>
<p> <p>
You may have heard of the <a href="dev-advanced-api.html">Advanced API</a>. HTML Purifier has this quirk where if you try to allow certain elements or
If you're interested in reading dry prose and boring functional attributes, HTML Purifier will tell you that it's not supported, and that
specifications, feel free to click that link to get a no-nonsense overview you should go to the forums to find out how to implement it. Well, this
on the Advanced API. For the rest of us, there's this tutorial. By the time document is how to implement elements and attributes which HTML Purifier
you're finished reading this, you should have a pretty good idea on doesn't support out of the box.
how to implement custom tags and attributes that HTML Purifier may not have.
</p> </p>
<h2>Is it necessary?</h2> <h2>Is it necessary?</h2>
@@ -84,17 +83,6 @@
limited to translations) above or below other corresponding text. limited to translations) above or below other corresponding text.
</p> </p>
<h3>XHTML 2.0</h3>
<p>
<a href="http://www.w3.org/TR/xhtml2/">XHTML 2.0</a> is still a
working draft, so any elements introduced in the
specification have not been implemented and will not be implemented
until we get a recommendation or proposal. Because XHTML 2.0 is
an entirely new markup language, implementing rules for it will be
no easy task.
</p>
<h3>HTML 5</h3> <h3>HTML 5</h3>
<p> <p>
@@ -156,9 +144,9 @@
</p> </p>
<pre>$config = HTMLPurifier_Config::createDefault(); <pre>$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', 'enduser-customize.html tutorial'); $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML', 'DefinitionRev', 1); $config-&gt;set('HTML.DefinitionRev', 1);
$def = $config->getHTMLDefinition(true);</pre> $def = $config-&gt;getHTMLDefinition(true);</pre>
<p> <p>
Assuming that HTML Purifier has already been properly loaded (hint: Assuming that HTML Purifier has already been properly loaded (hint:
@@ -211,10 +199,10 @@ $def = $config->getHTMLDefinition(true);</pre>
</p> </p>
<pre>$config = HTMLPurifier_Config::createDefault(); <pre>$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', 'enduser-customize.html tutorial'); $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML', 'DefinitionRev', 1); $config-&gt;set('HTML.DefinitionRev', 1);
<strong>$config->set('Cache', 'DefinitionImpl', null); // remove this later!</strong> <strong>$config-&gt;set('Cache.DefinitionImpl', null); // TODO: remove this later!</strong>
$def = $config->getHTMLDefinition(true);</pre> $def = $config-&gt;getHTMLDefinition(true);</pre>
<p> <p>
A few things should be mentioned about the caching mechanism before A few things should be mentioned about the caching mechanism before
@@ -267,10 +255,10 @@ $def = $config->getHTMLDefinition(true);</pre>
</p> </p>
<pre>$config = HTMLPurifier_Config::createDefault(); <pre>$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', 'enduser-customize.html tutorial'); $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML', 'DefinitionRev', 1); $config-&gt;set('HTML.DefinitionRev', 1);
$config->set('Cache', 'DefinitionImpl', null); // remove this later! $config-&gt;set('Cache.DefinitionImpl', null); // remove this later!
$def = $config->getHTMLDefinition(true); $def = $config-&gt;getHTMLDefinition(true);
<strong>$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');</strong></pre> <strong>$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');</strong></pre>
<p> <p>
@@ -385,11 +373,11 @@ $def = $config->getHTMLDefinition(true);
</p> </p>
<pre>$config = HTMLPurifier_Config::createDefault(); <pre>$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', 'enduser-customize.html tutorial'); $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML', 'DefinitionRev', 1); $config-&gt;set('HTML.DefinitionRev', 1);
$config->set('Cache', 'DefinitionImpl', null); // remove this later! $config-&gt;set('Cache.DefinitionImpl', null); // remove this later!
$def = $config->getHTMLDefinition(true); $def = $config-&gt;getHTMLDefinition(true);
<strong>$def->addAttribute('a', 'target', new HTMLPurifier_AttrDef_Enum( <strong>$def-&gt;addAttribute('a', 'target', new HTMLPurifier_AttrDef_Enum(
array('_blank','_self','_target','_top') array('_blank','_self','_target','_top')
));</strong></pre> ));</strong></pre>
@@ -724,7 +712,7 @@ $def = $config->getHTMLDefinition(true);
or more flow elements, but no nested <code>form</code>s</strong></li> or more flow elements, but no nested <code>form</code>s</strong></li>
<li>What attributes does the element allow that are general? <strong>Common</strong></li> <li>What attributes does the element allow that are general? <strong>Common</strong></li>
<li>What attributes does the element allow that are specific to this element? <strong>A whole bunch, see ATTLIST; <li>What attributes does the element allow that are specific to this element? <strong>A whole bunch, see ATTLIST;
we're going to the vital ones: <code>action</code>, <code>method</code> and <code>name</code></strong></li> we're going to do the vital ones: <code>action</code>, <code>method</code> and <code>name</code></strong></li>
</ol> </ol>
<p> <p>
@@ -732,14 +720,14 @@ $def = $config->getHTMLDefinition(true);
</p> </p>
<pre>$config = HTMLPurifier_Config::createDefault(); <pre>$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', 'enduser-customize.html tutorial'); $config-&gt;set('HTML.DefinitionID', 'enduser-customize.html tutorial');
$config->set('HTML', 'DefinitionRev', 1); $config-&gt;set('HTML.DefinitionRev', 1);
$config->set('Cache', 'DefinitionImpl', null); // remove this later! $config-&gt;set('Cache.DefinitionImpl', null); // remove this later!
$def = $config->getHTMLDefinition(true); $def = $config-&gt;getHTMLDefinition(true);
$def->addAttribute('a', 'target', new HTMLPurifier_AttrDef_Enum( $def-&gt;addAttribute('a', 'target', new HTMLPurifier_AttrDef_Enum(
array('_blank','_self','_target','_top') array('_blank','_self','_target','_top')
)); ));
<strong>$form = $def->addElement( <strong>$form = $def-&gt;addElement(
'form', // name 'form', // name
'Block', // content set 'Block', // content set
'Flow', // allowed children 'Flow', // allowed children
@@ -750,7 +738,7 @@ $def->addAttribute('a', 'target', new HTMLPurifier_AttrDef_Enum(
'name' => 'ID' 'name' => 'ID'
) )
); );
$form->excludes = array('form' => true);</strong></pre> $form-&gt;excludes = array('form' => true);</strong></pre>
<p> <p>
Each of the parameters corresponds to one of the questions we asked. Each of the parameters corresponds to one of the questions we asked.
@@ -795,4 +783,5 @@ $form->excludes = array('form' => true);</strong></pre>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -31,7 +31,7 @@ by default.</p>
<p>IDs, however, are quite useful functionality to have, so if users start <p>IDs, however, are quite useful functionality to have, so if users start
complaining about broken anchors you'll probably want to turn them back on complaining about broken anchors you'll probably want to turn them back on
with %HTML.EnableAttrID. But before you go mucking around with the config with %Attr.EnableID. But before you go mucking around with the config
object, it's probably worth taking some precautions to keep your page object, it's probably worth taking some precautions to keep your page
validating. Why?</p> validating. Why?</p>
@@ -56,8 +56,8 @@ validating. Why?</p>
deal with the most obvious solution: preventing users from using any IDs that deal with the most obvious solution: preventing users from using any IDs that
appear elsewhere on the document. The method is simple:</p> appear elsewhere on the document. The method is simple:</p>
<pre>$config->set('HTML', 'EnableAttrID', true); <pre>$config-&gt;set('Attr.EnableID', true);
$config->set('Attr', 'IDBlacklist', array( $config-&gt;set('Attr.IDBlacklist', array(
'list', 'of', 'attribute', 'values', 'that', 'are', 'forbidden' 'list', 'of', 'attribute', 'values', 'that', 'are', 'forbidden'
));</pre> ));</pre>
@@ -88,8 +88,8 @@ all, they might have simply specified a duplicate ID by accident.</p>
<p>This method, too, is quite simple: add a prefix to all user IDs. With this <p>This method, too, is quite simple: add a prefix to all user IDs. With this
code:</p> code:</p>
<pre>$config->set('HTML', 'EnableAttrID', true); <pre>$config-&gt;set('Attr.EnableID', true);
$config->set('Attr', 'IDPrefix', 'user_');</pre> $config-&gt;set('Attr.IDPrefix', 'user_');</pre>
<p>...this:</p> <p>...this:</p>
@@ -109,7 +109,7 @@ user_ to the beginning.&quot;</p>
nothing about multiple HTML Purifier outputs on one page. Thus, we have nothing about multiple HTML Purifier outputs on one page. Thus, we have
a second configuration value to piggy-back off of: %Attr.IDPrefixLocal:</p> a second configuration value to piggy-back off of: %Attr.IDPrefixLocal:</p>
<pre>$config->set('Attr', 'IDPrefixLocal', 'comment' . $id . '_');</pre> <pre>$config-&gt;set('Attr.IDPrefixLocal', 'comment' . $id . '_');</pre>
<p>This new attribute does nothing but append on to regular IDPrefix, but is <p>This new attribute does nothing but append on to regular IDPrefix, but is
special in that it is volatile: its value is determined at run-time and special in that it is volatile: its value is determined at run-time and
@@ -137,11 +137,12 @@ anchors is beyond me.</p>
<p>To revert back to pre-1.2.0 behavior, simply:</p> <p>To revert back to pre-1.2.0 behavior, simply:</p>
<pre>$config->set('HTML', 'EnableAttrID', true);</pre> <pre>$config-&gt;set('Attr.EnableID', true);</pre>
<p>Don't come crying to me when your page mysteriously stops validating, though.</p> <p>Don't come crying to me when your page mysteriously stops validating, though.</p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -116,4 +116,5 @@ if you decide to do that! Especially if you port HTML Purifier to C++.
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -76,7 +76,7 @@ associated with it, although it may change depending on your doctype.</p>
change the level of cleaning by setting the %HTML.TidyLevel configuration change the level of cleaning by setting the %HTML.TidyLevel configuration
directive:</p> directive:</p>
<pre>$config->set('HTML', 'TidyLevel', 'heavy'); // burn baby burn!</pre> <pre>$config-&gt;set('HTML.TidyLevel', 'heavy'); // burn baby burn!</pre>
<h2>Is the light level really light?</h2> <h2>Is the light level really light?</h2>
@@ -165,17 +165,17 @@ smoketest</a>.</p>
so happy about the br@clear implementation. That's perfectly fine! so happy about the br@clear implementation. That's perfectly fine!
HTML Purifier will make accommodations:</p> HTML Purifier will make accommodations:</p>
<pre>$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional'); <pre>$config-&gt;set('HTML.Doctype', 'XHTML 1.0 Transitional');
$config->set('HTML', 'TidyLevel', 'heavy'); // all changes, minus... $config-&gt;set('HTML.TidyLevel', 'heavy'); // all changes, minus...
<strong>$config->set('HTML', 'TidyRemove', 'br@clear');</strong></pre> <strong>$config-&gt;set('HTML.TidyRemove', 'br@clear');</strong></pre>
<p>That third line does the magic, removing the br@clear fix <p>That third line does the magic, removing the br@clear fix
from the module, ensuring that <code>&lt;br clear="both" /&gt;</code> from the module, ensuring that <code>&lt;br clear="both" /&gt;</code>
will pass through unharmed. The reverse is possible too:</p> will pass through unharmed. The reverse is possible too:</p>
<pre>$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional'); <pre>$config-&gt;set('HTML.Doctype', 'XHTML 1.0 Transitional');
$config->set('HTML', 'TidyLevel', 'none'); // no changes, plus... $config-&gt;set('HTML.TidyLevel', 'none'); // no changes, plus...
<strong>$config->set('HTML', 'TidyAdd', 'p@align');</strong></pre> <strong>$config-&gt;set('HTML.TidyAdd', 'p@align');</strong></pre>
<p>In this case, all transformations are shut off, except for the p@align <p>In this case, all transformations are shut off, except for the p@align
one, which you found handy.</p> one, which you found handy.</p>
@@ -227,4 +227,5 @@ effectively in the background.</p>
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -160,27 +160,14 @@
</p> </p>
<pre>$uri = $config->getDefinition('URI'); <pre>$uri = $config->getDefinition('URI');
$uri->addFilter(new HTMLPurifier_URIFilter_<strong>NameOfFilter</strong>());</pre> $uri->addFilter(new HTMLPurifier_URIFilter_<strong>NameOfFilter</strong>(), $config);</pre>
<p> <p>
If you want to be really fancy, you can define a configuration directive After adding a filter, you won't be able to set configuration directives.
for your filter and have HTML Purifier automatically manage whether or Structure your code accordingly.
not your filter gets loaded or not (this is how internal filters manage
things):
</p> </p>
<pre>HTMLPurifier_ConfigSchema::define( <!-- XXX: link to new documentation system -->
'URI', '<strong>NameOfFilter</strong>', false, 'bool',
'<strong>What your filter does.</strong>'
);
$uri = $config->getDefinition('URI', true);
$uri->registerFilter(new HTMLPurifier_URIFilter_<strong>NameOfFilter</strong>());
</pre>
<p>
Now, your filter will only be called when %URI.<strong>NameOfFilter</strong>
is set to true.
</p>
<h2>Post-filter</h2> <h2>Post-filter</h2>
@@ -213,4 +200,5 @@ $uri->registerFilter(new HTMLPurifier_URIFilter_<strong>NameOfFilter</strong>())
</body></html> </body></html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -1056,4 +1056,5 @@ a more in-depth look into character sets and encodings.</p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -67,7 +67,7 @@ into your documents. YouTube's code goes like this:</p>
</ol> </ol>
<p>What point 2 means is that if we have code like <code>&lt;span <p>What point 2 means is that if we have code like <code>&lt;span
class=&quot;embed-youtube&quot;&gt;AyPzM5WK8ys&lt;/span&gt;</code> your class=&quot;youtube-embed&quot;&gt;AyPzM5WK8ys&lt;/span&gt;</code> your
application can reconstruct the full object from this small snippet that application can reconstruct the full object from this small snippet that
passes through HTML Purifier <em>unharmed</em>. passes through HTML Purifier <em>unharmed</em>.
<a href="http://repo.or.cz/w/htmlpurifier.git?a=blob;hb=HEAD;f=library/HTMLPurifier/Filter/YouTube.php">Show me the code!</a></p> <a href="http://repo.or.cz/w/htmlpurifier.git?a=blob;hb=HEAD;f=library/HTMLPurifier/Filter/YouTube.php">Show me the code!</a></p>
@@ -75,7 +75,7 @@ passes through HTML Purifier <em>unharmed</em>.
<p>And the corresponding usage:</p> <p>And the corresponding usage:</p>
<pre>&lt;?php <pre>&lt;?php
$config->set('Filter', 'YouTube', true); $config-&gt;set('Filter.YouTube', true);
?&gt;</pre> ?&gt;</pre>
<p>There is a bit going on in the two code snippets, so let's explain.</p> <p>There is a bit going on in the two code snippets, so let's explain.</p>
@@ -149,4 +149,5 @@ with the core!</p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -8,8 +8,8 @@ require_once '../../library/HTMLPurifier.auto.php';
$config = HTMLPurifier_Config::createDefault(); $config = HTMLPurifier_Config::createDefault();
// configuration goes here: // configuration goes here:
$config->set('Core', 'Encoding', 'UTF-8'); // replace with your encoding $config->set('Core.Encoding', 'UTF-8'); // replace with your encoding
$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional'); // replace with your doctype $config->set('HTML.Doctype', 'XHTML 1.0 Transitional'); // replace with your doctype
$purifier = new HTMLPurifier($config); $purifier = new HTMLPurifier($config);

View File

@@ -5,4 +5,5 @@ function init() {
} }
</script> </script>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -117,6 +117,12 @@ the code. They may be upgraded to HTML files or stay as TXT scratchpads.</p>
<td>Common security issues that may still arise (half-baked).</td> <td>Common security issues that may still arise (half-baked).</td>
</tr> </tr>
<tr>
<td>Development</td>
<td><a href="dev-config-bcbreaks.txt">Config BC Breaks</a></td>
<td>Backwards-incompatible changes in HTML Purifier 4.0.0</td>
</tr>
<tr> <tr>
<td>Development</td> <td>Development</td>
<td><a href="dev-code-quality.txt">Code Quality Issues</a></td> <td><a href="dev-code-quality.txt">Code Quality Issues</a></td>
@@ -178,4 +184,5 @@ the code. They may be upgraded to HTML files or stay as TXT scratchpads.</p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -45,4 +45,5 @@ something like that?</li>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

218
docs/proposal-plists.txt Normal file
View File

@@ -0,0 +1,218 @@
THE UNIVERSAL DESIGN PATTERN: PROPERTIES
Steve Yegge
Implementation:
get(name)
put(name, value)
has(name)
remove(name)
iteration, with filtering [this will be our namespaces]
parent
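The operations listed above can be sketched as a small class. This is only an illustration under stated assumptions: the class name PlistSketch and its layout are hypothetical and are not part of HTML Purifier; iteration with filtering is left out here.

```php
<?php
// Hypothetical sketch of a property list with a single parent link.
// PlistSketch is an illustrative name, not an HTML Purifier class.
class PlistSketch
{
    /** @var array<string, mixed> properties stored on this list only */
    private array $data = [];
    private ?PlistSketch $parent;

    public function __construct(?PlistSketch $parent = null)
    {
        $this->parent = $parent;
    }

    // Look up locally first, then walk up the parent chain iteratively.
    public function get(string $name)
    {
        for ($p = $this; $p !== null; $p = $p->parent) {
            if (array_key_exists($name, $p->data)) {
                return $p->data[$name];
            }
        }
        return null; // not found anywhere in the chain
    }

    // Writes always land on the local list; parents are never modified.
    public function put(string $name, $value): void
    {
        $this->data[$name] = $value;
    }

    public function has(string $name): bool
    {
        for ($p = $this; $p !== null; $p = $p->parent) {
            if (array_key_exists($name, $p->data)) {
                return true;
            }
        }
        return false;
    }

    // Removes only the local entry; an inherited value shows through again.
    public function remove(string $name): void
    {
        unset($this->data[$name]);
    }
}

$defaults = new PlistSketch();
$defaults->put('HTML.Doctype', 'XHTML 1.0 Transitional');
$user = new PlistSketch($defaults);
$user->put('HTML.TidyLevel', 'heavy');
```

Here $user inherits HTML.Doctype from $defaults, while a put() on $user never changes $defaults.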
Representations:
- Keys are strings
- It's nice to not need to quote keys (if we formulate our own language,
consider this)
- Property not present representation (key missing)
- If properties are frequently removed and re-added, marking them absent
with null may help. If null is a valid value, use another sentinel.
(PHP semantics are weird here)
Data structures:
- LinkedHashMap is wonderful (O(1) access and maintains order)
- Using a special property that points to the parent is usual
- Multiple inheritance possible, need rules for which to lookup first
- Iterative inheritance is best
- Consider performance!
Deletion
- Tricky problem with inheritance
- Distinguish between "not found" and "look in my parent for the property"
[Maybe HTML Purifier won't allow deletion]
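One way to tell "not found" apart from "look in my parent" is a sentinel value that marks a property as deleted in a child list. The sketch below assumes plain arrays ordered child first; the Tombstone class and plist_lookup() are hypothetical names, and nothing here implies HTML Purifier will support deletion.

```php
<?php
// Hypothetical sketch: a sentinel ("tombstone") marks a property as deleted
// in a child list, so lookup stops instead of falling through to the parent.
final class Tombstone {}

function plist_lookup(array $chain, string $name, &$found = null)
{
    // $chain is ordered child first, root parent last.
    foreach ($chain as $plist) {
        if (array_key_exists($name, $plist)) {
            // A tombstone means "deleted here": do not consult the parents.
            $found = !($plist[$name] instanceof Tombstone);
            return $found ? $plist[$name] : null;
        }
    }
    $found = false; // no list in the chain mentions the property
    return null;
}

$parent = ['Attr.IDPrefix' => 'user_', 'Attr.EnableID' => true];
$child  = ['Attr.IDPrefix' => new Tombstone()]; // deleted in the child only
$chain  = [$child, $parent];
```

The $found out-parameter is what lets a caller distinguish a stored null from a missing or deleted property.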
Read/write asymmetry (it's correct!)
Read-only plists
- Allow ability to freeze [this is what we have already]
- Don't overuse it
Performance:
- Intern strings (PHP does this already)
- Don't be case-insensitive
- If all properties in a plist are known a-priori, you can use a "perfect"
hash function. Often overkill.
- Copy-on-read caching "plundering" reduces lookup, but uses memory and can
grow stale. Use as last resort.
- Refactoring to fields. Watch for API compatibility, system complexity,
and lack of flexibility.
- Refrigerator: external data-structure to hold plists
Transient properties:
[Don't need to worry about this]
- Use a separate plist for transient properties
- Non-numeric override; numeric should ADD
- Deletion: removeTransientProperty() and transientlyRemoveProperty()
Persistence:
- XML/JSON are good
- Text-based is good for readability, maintainability and bootstrapping
- Compressed binary format for network transport [not necessary]
- RDBMS or XML database
Querying: [not relevant]
- XML database is nice for XPath/XQuery
- jQuery for JSON
- Just load it all into a program
Backfills/Data integrity:
- Use usual methods
- Lazy backfill is a nice hack
Type systems:
- Flags: ReadOnly, Permanent, DontEnum
- Typed properties aren't that useful [It's also Not-PHP]
- Separate meta-list of directive properties IS useful
- Duck typing is useful for systems designed fully around properties pattern
Trade-off:
+ Flexibility
+ Extensibility
+ Unit-testing/prototype-speed
- Performance
- Data integrity
- Navigability/Query-ability
- Reversibility (hard to go back)
HTML Purifier
We are not happy with our current system of defining configuration directives,
because it has become clear that things will get a lot nicer if we allow
multiple namespaces, and there are some features that naturally lend themselves
to inheritance, which we do not really support well.
One of the considered implementation changes would be to go from a structure
like:
array(
'Namespace' => array(
'Directive' => 'val1',
'Directive2' => 'val2',
)
)
to:
array(
'Namespace.Directive' => 'val1',
'Namespace.Directive2' => 'val2',
)
The latter (flat) structure takes more memory, however, and it makes it a bit
complicated to grab all values from a namespace.
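The conversion between the two structures can be sketched as follows. The function name flatten_directives() is hypothetical, and the sketch assumes exactly one level of namespace nesting, as in the arrays shown above.

```php
<?php
// Hypothetical helper: turn array('Namespace' => array('Directive' => v))
// into array('Namespace.Directive' => v). Assumes one nesting level.
function flatten_directives(array $nested): array
{
    $flat = [];
    foreach ($nested as $namespace => $directives) {
        foreach ($directives as $directive => $value) {
            $flat["$namespace.$directive"] = $value;
        }
    }
    return $flat;
}

$nested = [
    'HTML' => ['Doctype' => 'XHTML 1.0 Transitional', 'TidyLevel' => 'heavy'],
    'Attr' => ['EnableID' => true],
];
$flat = flatten_directives($nested);
```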
The alternate implementation choice is to allow nested plists. This keeps
iteration easy, but is problematic for inheritance (it would be difficult
to distinguish a plist from an array) and retrieval (when specifying multiple
namespaces we would need some multiple de-referencing).
----
We can bite the performance hit, and just do iteration with filter
(the strncmp call should be relatively cheap). Then, users should be able
to optimize doing something like:
$config = HTMLPurifier_Config::createDefault();
if (!file_exists('config.php')) {
// set up $config
$config->save('config.php');
} else {
$config->load('config.php');
}
Or maybe memcache, or something. This means that "// set up $config" must
not have any dynamic parts, or the user has to invalidate the cache when
they do update it. We have to think about this a little more carefully; the
file call might be more expensive.
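The "iteration with filter" idea from the start of this section might look like the sketch below. The function name get_batch() is borrowed from the demands listed later in this note, but the signature and the flat-array input are assumptions made for illustration.

```php
<?php
// Hypothetical sketch: collect every directive in one namespace from a flat
// 'Namespace.Directive' => value array, using a cheap strncmp prefix test.
function get_batch(array $flat, string $namespace): array
{
    $prefix = $namespace . '.';
    $len = strlen($prefix);
    $batch = [];
    foreach ($flat as $key => $value) {
        // Compare only the first $len bytes of the key against the prefix.
        if (strncmp($key, $prefix, $len) === 0) {
            $batch[substr($key, $len)] = $value;
        }
    }
    return $batch;
}

$flat = [
    'HTML.Doctype'   => 'XHTML 1.0 Transitional',
    'HTML.TidyLevel' => 'heavy',
    'HTMLExtra.Flag' => true,  // made-up namespace; must not match 'HTML'
    'Attr.EnableID'  => true,
];
$html = get_batch($flat, 'HTML');
```

Including the trailing dot in the prefix keeps a namespace such as HTMLExtra from being mistaken for HTML.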
----
This might get expensive, however, when we actually care about iterating
over the configuration and want the actual values. So what about nesting the
lists?
"ns.sub.directive" => values['ns']['sub']['directive']
We can distinguish between plists and arrays by using ArrayObjects for the
plists, and regular arrays for the arrays? Alternatively, use ArrayObjects
for the arrays, and regular arrays for the plists.
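The multiple de-referencing needed for nested lists can be sketched like this. It uses plain arrays throughout and does not address the plist-versus-array ambiguity discussed above; get_nested() and the key 'HTML.Tidy.Level' are hypothetical.

```php
<?php
// Hypothetical sketch: resolve "ns.sub.directive" against nested arrays,
// i.e. $values['ns']['sub']['directive'], one key segment at a time.
function get_nested(array $values, string $key, &$found = null)
{
    $node = $values;
    foreach (explode('.', $key) as $part) {
        if (!is_array($node) || !array_key_exists($part, $node)) {
            $found = false; // path does not exist
            return null;
        }
        $node = $node[$part];
    }
    $found = true;
    return $node;
}

$values = ['HTML' => ['Tidy' => ['Level' => 'heavy']]];
$level = get_nested($values, 'HTML.Tidy.Level', $found);
```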
----
Implementation demands, and what has caused them:
1. DefinitionCache, the HTML, CSS and URI namespaces have caches attached to them
Results:
- getBatchSerial()
- getBatch() : in general, the ability to traverse just a namespace
2. AutoFormat/Filter, this is a plugin architecture, directives not hard-coded
- getBatch()
3. Configuration form
- Namespaces used to organize directives
Other than that, we have a pure plist. PERHAPS we should maintain separate things
for these different demands.
Issue 2: Directives for configuring the plugins are regular plists, but
when enabling them, while it's "plist-ish", what you're really doing is adding
them to an array of "autoformatters"/"filters" to enable. We can set up
magic BC as well as in the new interface, but there should also be an
add('AutoFormat', 'AutoParagraph'); which does the right thing.
One thing to consider is whether or not inheritance rules will apply to these.
I'd say yes. That means that they're still plisty, in fact, the underlying
implementation will probably be a plist. However, they will get their OWN
plists, and will NOT support nesting.
Issue 1: Our current implementation is generally not efficient; md5(serialize($foo))
is pretty expensive. So, I don't think there will be any problems if it
gets "less" efficient, as long as we give users a properly fast alternative;
DefinitionRev gives us a way to do this, by simply telling the user they must
update it whenever they update Configuration directives as well. (There are
obvious BC concerns here).
In such a case, we simply iterate over our plist (performing full retrievals
for each value), grab the entries we care about, and then serialize and hash.
It's going to be slow either way, due to the ability of plists to inherit.
If we ksort(), we don't have to traverse the entire array, however, the
cost of a ksort() call may not be worth it.
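The "grab the entries we care about, then serialize and hash" step could be sketched as below. The function name batch_serial() and the flat-array input are assumptions. Note that ksort() is used here only to make the hash independent of insertion order, which is a different purpose from the early-exit traversal mentioned above.

```php
<?php
// Hypothetical sketch: hash only the directives of one namespace, so that a
// cache keyed on the result ignores changes in unrelated namespaces.
function batch_serial(array $flat, string $namespace): string
{
    $prefix = $namespace . '.';
    $len = strlen($prefix);
    $batch = [];
    foreach ($flat as $key => $value) {
        if (strncmp($key, $prefix, $len) === 0) {
            $batch[$key] = $value;
        }
    }
    ksort($batch); // same entries in any order give the same hash
    return md5(serialize($batch));
}

$a = ['HTML.Doctype' => 'XHTML 1.0 Strict', 'HTML.TidyLevel' => 'heavy',
      'URI.Base' => 'http://example.com/'];
$b = ['URI.Base' => 'http://example.org/', 'HTML.TidyLevel' => 'heavy',
      'HTML.Doctype' => 'XHTML 1.0 Strict'];
$same = batch_serial($a, 'HTML') === batch_serial($b, 'HTML');
```

In this example $a and $b differ only in ordering and in the URI namespace, so their HTML hashes compare equal.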
At this point, last time, I started worrying about the performance implications
of allowing inheritance, and wondering whether or not I wanted to squash
the plist. At first blush, our code might be under the assumption that
accessing properties is cheap; but actually we prefer to copy out the value
into a member variable if it's going to be used many times. With this in mind,
I don't think CPU consumption from a few nested function calls is going to
be a problem. We *are* going to enforce a function-only interface.
The next issue at hand is how we're going to manage the "special" plists,
which should still be able to be inherited. Basically, it means that multiple
plists would be attached to the configuration object, which is not the
best for memory performance. The alternative is to keep them all in one
big plist, and then eat the one-time cost of traversing the entire plist
to grab the appropriate values.
I think at this point we can write the generic interface, and then set up separate
plists if that ends up being necessary for performance (it probably won't). Now
let's code our generic plist implementation.
----
Iterating over the plist presents some problems. The way we've chosen to solve
this is to squash all of the parents.
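Squashing the parents can be sketched as merging the chain from the root down, so that values closer to the child win. The function name squash() and the child-first array-of-arrays representation are assumptions for illustration.

```php
<?php
// Hypothetical sketch: flatten an inheritance chain into one array that can
// be iterated directly. $chain is ordered child first, root parent last.
function squash(array $chain): array
{
    $flat = [];
    // Apply the root first so that each more specific list overrides it.
    foreach (array_reverse($chain) as $plist) {
        $flat = array_replace($flat, $plist);
    }
    return $flat;
}

$parent = ['HTML.TidyLevel' => 'medium', 'Attr.EnableID' => false];
$child  = ['HTML.TidyLevel' => 'heavy', 'Filter.YouTube' => true];
$all = squash([$child, $parent]);
```

The trade-off is that the squashed copy is a snapshot: later changes to a parent are not reflected until squash() is called again.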
----
But I don't need iteration.
vim: et sw=4 sts=4

View File

@@ -43,4 +43,5 @@ the development of this library in these forum threads:</p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 --> <!-- vim: et sw=4 sts=4
-->

View File

@@ -163,5 +163,3 @@ div.segment {width:250px; float:left; margin-top:1em;}
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 -->

View File

@@ -127,5 +127,3 @@ style='color:black'>www.example.com/disclaimer</span></a><o:p></o:p></span></p>
</body> </body>
</html> </html>
<!-- vim: et sw=4 sts=4 -->

View File

@@ -72,5 +72,3 @@ title="Join Windows Live to share photos using Windows Live Photo E-mail.">Onlin
pictures are available for 30 days. <A style="COLOR: #0088e4" pictures are available for 30 days. <A style="COLOR: #0088e4"
href="http://g.msn.com/5meen_us/175">Get Windows Live Mail desktop to create href="http://g.msn.com/5meen_us/175">Get Windows Live Mail desktop to create
your own photo e-mails. </A></SPAN></NOBR></DIV></BODY></HTML> your own photo e-mails. </A></SPAN></NOBR></DIV></BODY></HTML>
<!-- vim: et sw=4 sts=4 -->

View File

@@ -7,7 +7,7 @@
* primary concern and you are using an opcode cache. PLEASE DO NOT EDIT THIS * primary concern and you are using an opcode cache. PLEASE DO NOT EDIT THIS
* FILE, changes will be overwritten the next time the script is run. * FILE, changes will be overwritten the next time the script is run.
* *
* @version 3.3.0 * @version 4.1.1
* *
* @warning * @warning
* You must *not* include any other HTML Purifier files before this file, * You must *not* include any other HTML Purifier files before this file,
@@ -98,6 +98,8 @@ require 'HTMLPurifier/AttrDef/CSS/Percentage.php';
require 'HTMLPurifier/AttrDef/CSS/TextDecoration.php'; require 'HTMLPurifier/AttrDef/CSS/TextDecoration.php';
require 'HTMLPurifier/AttrDef/CSS/URI.php'; require 'HTMLPurifier/AttrDef/CSS/URI.php';
require 'HTMLPurifier/AttrDef/HTML/Bool.php'; require 'HTMLPurifier/AttrDef/HTML/Bool.php';
require 'HTMLPurifier/AttrDef/HTML/Nmtokens.php';
require 'HTMLPurifier/AttrDef/HTML/Class.php';
require 'HTMLPurifier/AttrDef/HTML/Color.php'; require 'HTMLPurifier/AttrDef/HTML/Color.php';
require 'HTMLPurifier/AttrDef/HTML/FrameTarget.php'; require 'HTMLPurifier/AttrDef/HTML/FrameTarget.php';
require 'HTMLPurifier/AttrDef/HTML/ID.php'; require 'HTMLPurifier/AttrDef/HTML/ID.php';
@@ -105,7 +107,6 @@ require 'HTMLPurifier/AttrDef/HTML/Pixels.php';
require 'HTMLPurifier/AttrDef/HTML/Length.php'; require 'HTMLPurifier/AttrDef/HTML/Length.php';
require 'HTMLPurifier/AttrDef/HTML/LinkTypes.php'; require 'HTMLPurifier/AttrDef/HTML/LinkTypes.php';
require 'HTMLPurifier/AttrDef/HTML/MultiLength.php'; require 'HTMLPurifier/AttrDef/HTML/MultiLength.php';
require 'HTMLPurifier/AttrDef/HTML/Nmtokens.php';
require 'HTMLPurifier/AttrDef/URI/Email.php'; require 'HTMLPurifier/AttrDef/URI/Email.php';
require 'HTMLPurifier/AttrDef/URI/Host.php'; require 'HTMLPurifier/AttrDef/URI/Host.php';
require 'HTMLPurifier/AttrDef/URI/IPv4.php'; require 'HTMLPurifier/AttrDef/URI/IPv4.php';
@@ -123,6 +124,7 @@ require 'HTMLPurifier/AttrTransform/Input.php';
require 'HTMLPurifier/AttrTransform/Lang.php'; require 'HTMLPurifier/AttrTransform/Lang.php';
require 'HTMLPurifier/AttrTransform/Length.php'; require 'HTMLPurifier/AttrTransform/Length.php';
require 'HTMLPurifier/AttrTransform/Name.php'; require 'HTMLPurifier/AttrTransform/Name.php';
require 'HTMLPurifier/AttrTransform/NameSync.php';
require 'HTMLPurifier/AttrTransform/SafeEmbed.php'; require 'HTMLPurifier/AttrTransform/SafeEmbed.php';
require 'HTMLPurifier/AttrTransform/SafeObject.php'; require 'HTMLPurifier/AttrTransform/SafeObject.php';
require 'HTMLPurifier/AttrTransform/SafeParam.php'; require 'HTMLPurifier/AttrTransform/SafeParam.php';
@@ -174,6 +176,7 @@ require 'HTMLPurifier/Injector/DisplayLinkURI.php';
require 'HTMLPurifier/Injector/Linkify.php'; require 'HTMLPurifier/Injector/Linkify.php';
require 'HTMLPurifier/Injector/PurifierLinkify.php'; require 'HTMLPurifier/Injector/PurifierLinkify.php';
require 'HTMLPurifier/Injector/RemoveEmpty.php'; require 'HTMLPurifier/Injector/RemoveEmpty.php';
require 'HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php';
require 'HTMLPurifier/Injector/SafeObject.php'; require 'HTMLPurifier/Injector/SafeObject.php';
require 'HTMLPurifier/Lexer/DOMLex.php'; require 'HTMLPurifier/Lexer/DOMLex.php';
require 'HTMLPurifier/Lexer/DirectLex.php'; require 'HTMLPurifier/Lexer/DirectLex.php';
@@ -196,6 +199,7 @@ require 'HTMLPurifier/URIFilter/DisableExternalResources.php';
require 'HTMLPurifier/URIFilter/HostBlacklist.php'; require 'HTMLPurifier/URIFilter/HostBlacklist.php';
require 'HTMLPurifier/URIFilter/MakeAbsolute.php'; require 'HTMLPurifier/URIFilter/MakeAbsolute.php';
require 'HTMLPurifier/URIFilter/Munge.php'; require 'HTMLPurifier/URIFilter/Munge.php';
require 'HTMLPurifier/URIScheme/data.php';
require 'HTMLPurifier/URIScheme/ftp.php'; require 'HTMLPurifier/URIScheme/ftp.php';
require 'HTMLPurifier/URIScheme/http.php'; require 'HTMLPurifier/URIScheme/http.php';
require 'HTMLPurifier/URIScheme/https.php'; require 'HTMLPurifier/URIScheme/https.php';

View File

@@ -17,11 +17,11 @@ function kses($string, $allowed_html, $allowed_protocols = null) {
$allowed_attributes["$element.$attribute"] = true; $allowed_attributes["$element.$attribute"] = true;
} }
} }
$config->set('HTML', 'AllowedElements', $allowed_elements); $config->set('HTML.AllowedElements', $allowed_elements);
$config->set('HTML', 'AllowedAttributes', $allowed_attributes); $config->set('HTML.AllowedAttributes', $allowed_attributes);
$allowed_schemes = array(); $allowed_schemes = array();
if ($allowed_protocols !== null) { if ($allowed_protocols !== null) {
$config->set('URI', 'AllowedSchemes', $allowed_protocols); $config->set('URI.AllowedSchemes', $allowed_protocols);
} }
$purifier = new HTMLPurifier($config); $purifier = new HTMLPurifier($config);
return $purifier->purify($string); return $purifier->purify($string);

View File

@@ -19,7 +19,7 @@
*/ */
/* /*
HTML Purifier 3.3.0 - Standards Compliant HTML Filtering HTML Purifier 4.1.1 - Standards Compliant HTML Filtering
Copyright (C) 2006-2008 Edward Z. Yang Copyright (C) 2006-2008 Edward Z. Yang
This library is free software; you can redistribute it and/or This library is free software; you can redistribute it and/or
@@ -55,10 +55,10 @@ class HTMLPurifier
{ {
/** Version of HTML Purifier */ /** Version of HTML Purifier */
public $version = '3.3.0'; public $version = '4.1.1';
/** Constant with version of HTML Purifier */ /** Constant with version of HTML Purifier */
const VERSION = '3.3.0'; const VERSION = '4.1.1';
/** Global configuration object */ /** Global configuration object */
public $config; public $config;
@@ -128,7 +128,7 @@ class HTMLPurifier
$context->register('Generator', $this->generator); $context->register('Generator', $this->generator);
// set up global context variables // set up global context variables
if ($config->get('Core', 'CollectErrors')) { if ($config->get('Core.CollectErrors')) {
// may get moved out if other facilities use it // may get moved out if other facilities use it
$language_factory = HTMLPurifier_LanguageFactory::instance(); $language_factory = HTMLPurifier_LanguageFactory::instance();
$language = $language_factory->create($config, $context); $language = $language_factory->create($config, $context);
@@ -152,6 +152,7 @@ class HTMLPurifier
$filters = array(); $filters = array();
foreach ($filter_flags as $filter => $flag) { foreach ($filter_flags as $filter => $flag) {
if (!$flag) continue; if (!$flag) continue;
if (strpos($filter, '.') !== false) continue;
$class = "HTMLPurifier_Filter_$filter"; $class = "HTMLPurifier_Filter_$filter";
$filters[] = new $class; $filters[] = new $class;
} }

View File

@@ -92,6 +92,8 @@ require_once $__dir . '/HTMLPurifier/AttrDef/CSS/Percentage.php';
require_once $__dir . '/HTMLPurifier/AttrDef/CSS/TextDecoration.php'; require_once $__dir . '/HTMLPurifier/AttrDef/CSS/TextDecoration.php';
require_once $__dir . '/HTMLPurifier/AttrDef/CSS/URI.php'; require_once $__dir . '/HTMLPurifier/AttrDef/CSS/URI.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Bool.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Bool.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Nmtokens.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Class.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Color.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Color.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/FrameTarget.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/FrameTarget.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/ID.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/ID.php';
@@ -99,7 +101,6 @@ require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Pixels.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Length.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Length.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/LinkTypes.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/LinkTypes.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/MultiLength.php'; require_once $__dir . '/HTMLPurifier/AttrDef/HTML/MultiLength.php';
require_once $__dir . '/HTMLPurifier/AttrDef/HTML/Nmtokens.php';
require_once $__dir . '/HTMLPurifier/AttrDef/URI/Email.php'; require_once $__dir . '/HTMLPurifier/AttrDef/URI/Email.php';
require_once $__dir . '/HTMLPurifier/AttrDef/URI/Host.php'; require_once $__dir . '/HTMLPurifier/AttrDef/URI/Host.php';
require_once $__dir . '/HTMLPurifier/AttrDef/URI/IPv4.php'; require_once $__dir . '/HTMLPurifier/AttrDef/URI/IPv4.php';
@@ -117,6 +118,7 @@ require_once $__dir . '/HTMLPurifier/AttrTransform/Input.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Lang.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/Lang.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Length.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/Length.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/Name.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/Name.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/NameSync.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeEmbed.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/SafeEmbed.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeObject.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/SafeObject.php';
require_once $__dir . '/HTMLPurifier/AttrTransform/SafeParam.php'; require_once $__dir . '/HTMLPurifier/AttrTransform/SafeParam.php';
@@ -168,6 +170,7 @@ require_once $__dir . '/HTMLPurifier/Injector/DisplayLinkURI.php';
require_once $__dir . '/HTMLPurifier/Injector/Linkify.php'; require_once $__dir . '/HTMLPurifier/Injector/Linkify.php';
require_once $__dir . '/HTMLPurifier/Injector/PurifierLinkify.php'; require_once $__dir . '/HTMLPurifier/Injector/PurifierLinkify.php';
require_once $__dir . '/HTMLPurifier/Injector/RemoveEmpty.php'; require_once $__dir . '/HTMLPurifier/Injector/RemoveEmpty.php';
require_once $__dir . '/HTMLPurifier/Injector/RemoveSpansWithoutAttributes.php';
require_once $__dir . '/HTMLPurifier/Injector/SafeObject.php'; require_once $__dir . '/HTMLPurifier/Injector/SafeObject.php';
require_once $__dir . '/HTMLPurifier/Lexer/DOMLex.php'; require_once $__dir . '/HTMLPurifier/Lexer/DOMLex.php';
require_once $__dir . '/HTMLPurifier/Lexer/DirectLex.php'; require_once $__dir . '/HTMLPurifier/Lexer/DirectLex.php';
@@ -190,6 +193,7 @@ require_once $__dir . '/HTMLPurifier/URIFilter/DisableExternalResources.php';
require_once $__dir . '/HTMLPurifier/URIFilter/HostBlacklist.php'; require_once $__dir . '/HTMLPurifier/URIFilter/HostBlacklist.php';
require_once $__dir . '/HTMLPurifier/URIFilter/MakeAbsolute.php'; require_once $__dir . '/HTMLPurifier/URIFilter/MakeAbsolute.php';
require_once $__dir . '/HTMLPurifier/URIFilter/Munge.php'; require_once $__dir . '/HTMLPurifier/URIFilter/Munge.php';
require_once $__dir . '/HTMLPurifier/URIScheme/data.php';
require_once $__dir . '/HTMLPurifier/URIScheme/ftp.php'; require_once $__dir . '/HTMLPurifier/URIScheme/ftp.php';
require_once $__dir . '/HTMLPurifier/URIScheme/http.php'; require_once $__dir . '/HTMLPurifier/URIScheme/http.php';
require_once $__dir . '/HTMLPurifier/URIScheme/https.php'; require_once $__dir . '/HTMLPurifier/URIScheme/https.php';


@@ -82,6 +82,42 @@ abstract class HTMLPurifier_AttrDef
return preg_replace('/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/', 'rgb(\1,\2,\3)', $string); return preg_replace('/rgb\((\d+)\s*,\s*(\d+)\s*,\s*(\d+)\)/', 'rgb(\1,\2,\3)', $string);
} }
/**
* Parses a possibly escaped CSS string and returns the "pure"
* version of it.
*/
protected function expandCSSEscape($string) {
// flexibly parse it
$ret = '';
for ($i = 0, $c = strlen($string); $i < $c; $i++) {
if ($string[$i] === '\\') {
$i++;
if ($i >= $c) {
$ret .= '\\';
break;
}
if (ctype_xdigit($string[$i])) {
$code = $string[$i];
for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
if (!ctype_xdigit($string[$i])) break;
$code .= $string[$i];
}
// We have to be extremely careful when adding
// new characters, to make sure we're not breaking
// the encoding.
$char = HTMLPurifier_Encoder::unichr(hexdec($code));
if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
$ret .= $char;
if ($i < $c && trim($string[$i]) !== '') $i--;
continue;
}
if ($string[$i] === "\n") continue;
}
$ret .= $string[$i];
}
return $ret;
}
} }
// vim: et sw=4 sts=4 // vim: et sw=4 sts=4
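
Note: the escape handling added in expandCSSEscape() can be tried outside the library with a standalone sketch like the one below. It is not the library's code: the function names are hypothetical, and the small UTF-8 encoder stands in for HTMLPurifier_Encoder::unichr() and cleanUTF8(), which this sketch assumes only need to reject invalid code points.

```php
<?php
// Hypothetical helper: encode one code point as UTF-8, or return ''
// for values that are not valid Unicode scalar values.
function utf8_from_codepoint_sketch(int $cp): string
{
    if ($cp <= 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        return '';
    }
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical standalone take on CSS escape expansion.
function expand_css_escape_sketch(string $string): string
{
    $ret = '';
    $len = strlen($string);
    for ($i = 0; $i < $len; $i++) {
        if ($string[$i] === '\\') {
            $i++;
            if ($i >= $len) {
                // A trailing backslash is kept literally.
                $ret .= '\\';
                break;
            }
            if (ctype_xdigit($string[$i])) {
                // Collect up to six hexadecimal digits.
                $code = '';
                while ($i < $len && strlen($code) < 6 && ctype_xdigit($string[$i])) {
                    $code .= $string[$i++];
                }
                $ret .= utf8_from_codepoint_sketch((int) hexdec($code));
                // A single whitespace character after the escape is consumed;
                // any other character is stepped back onto and re-read.
                if ($i < $len && trim($string[$i]) !== '') {
                    $i--;
                }
                continue;
            }
            // An escaped newline is dropped entirely.
            if ($string[$i] === "\n") {
                continue;
            }
        }
        // Ordinary characters, and escaped non-hex characters, pass through.
        $ret .= $string[$i];
    }
    return $ret;
}
```

Unlike the method in the diff, this sketch applies the whitespace adjustment even when the code point is rejected; that difference is a choice of the example, not a statement about the library.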


@@ -59,7 +59,8 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
$keywords = array(); $keywords = array();
$keywords['h'] = false; // left, right $keywords['h'] = false; // left, right
$keywords['v'] = false; // top, bottom $keywords['v'] = false; // top, bottom
$keywords['c'] = false; // center $keywords['ch'] = false; // center (first word)
$keywords['cv'] = false; // center (second word)
$measures = array(); $measures = array();
$i = 0; $i = 0;
@@ -79,6 +80,13 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
$lbit = ctype_lower($bit) ? $bit : strtolower($bit); $lbit = ctype_lower($bit) ? $bit : strtolower($bit);
if (isset($lookup[$lbit])) { if (isset($lookup[$lbit])) {
$status = $lookup[$lbit]; $status = $lookup[$lbit];
if ($status == 'c') {
if ($i == 0) {
$status = 'ch';
} else {
$status = 'cv';
}
}
$keywords[$status] = $lbit; $keywords[$status] = $lbit;
$i++; $i++;
} }
@@ -101,20 +109,19 @@ class HTMLPurifier_AttrDef_CSS_BackgroundPosition extends HTMLPurifier_AttrDef
if (!$i) return false; // no valid values were caught if (!$i) return false; // no valid values were caught
$ret = array(); $ret = array();
// first keyword // first keyword
if ($keywords['h']) $ret[] = $keywords['h']; if ($keywords['h']) $ret[] = $keywords['h'];
elseif (count($measures)) $ret[] = array_shift($measures); elseif ($keywords['ch']) {
elseif ($keywords['c']) { $ret[] = $keywords['ch'];
$ret[] = $keywords['c']; $keywords['cv'] = false; // prevent re-use: center = center center
$keywords['c'] = false; // prevent re-use: center = center center
} }
elseif (count($measures)) $ret[] = array_shift($measures);
if ($keywords['v']) $ret[] = $keywords['v']; if ($keywords['v']) $ret[] = $keywords['v'];
elseif ($keywords['cv']) $ret[] = $keywords['cv'];
elseif (count($measures)) $ret[] = array_shift($measures); elseif (count($measures)) $ret[] = array_shift($measures);
elseif ($keywords['c']) $ret[] = $keywords['c'];
if (empty($ret)) return false; if (empty($ret)) return false;
return implode(' ', $ret); return implode(' ', $ret);


@@ -9,7 +9,7 @@ class HTMLPurifier_AttrDef_CSS_Color extends HTMLPurifier_AttrDef
public function validate($color, $config, $context) { public function validate($color, $config, $context) {
static $colors = null; static $colors = null;
if ($colors === null) $colors = $config->get('Core', 'ColorKeywords'); if ($colors === null) $colors = $config->get('Core.ColorKeywords');
$color = trim($color); $color = trim($color);
if ($color === '') return false; if ($color === '') return false;


@@ -34,37 +34,10 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
$quote = $font[0]; $quote = $font[0];
if ($font[$length - 1] !== $quote) continue; if ($font[$length - 1] !== $quote) continue;
$font = substr($font, 1, $length - 2); $font = substr($font, 1, $length - 2);
$new_font = '';
for ($i = 0, $c = strlen($font); $i < $c; $i++) {
if ($font[$i] === '\\') {
$i++;
if ($i >= $c) {
$new_font .= '\\';
break;
}
if (ctype_xdigit($font[$i])) {
$code = $font[$i];
for ($a = 1, $i++; $i < $c && $a < 6; $i++, $a++) {
if (!ctype_xdigit($font[$i])) break;
$code .= $font[$i];
}
// We have to be extremely careful when adding
// new characters, to make sure we're not breaking
// the encoding.
$char = HTMLPurifier_Encoder::unichr(hexdec($code));
if (HTMLPurifier_Encoder::cleanUTF8($char) === '') continue;
$new_font .= $char;
if ($i < $c && trim($font[$i]) !== '') $i--;
continue;
}
if ($font[$i] === "\n") continue;
}
$new_font .= $font[$i];
}
$font = $new_font;
} }
$font = $this->expandCSSEscape($font);
// $font is a pure representation of the font name // $font is a pure representation of the font name
if (ctype_alnum($font) && $font !== '') { if (ctype_alnum($font) && $font !== '') {
@@ -73,12 +46,21 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
continue; continue;
} }
// complicated font, requires quoting // bugger out on whitespace. form feed (0C) really
// shouldn't show up regardless
$font = str_replace(array("\n", "\t", "\r", "\x0C"), ' ', $font);
// armor single quotes and new lines // These ugly transforms don't pose a security
$font = str_replace("\\", "\\\\", $font); // risk (as \\ and \" might). We could try to be clever and
$font = str_replace("'", "\\'", $font); // use single-quote wrapping when there is a double quote
$final .= "'$font', "; // present, but I have chosen not to implement that.
// (warning: this code relies on the selection of quotation
// mark below)
$font = str_replace('\\', '\\5C ', $font);
$font = str_replace('"', '\\22 ', $font);
// complicated font, requires quoting
$final .= "\"$font\", "; // note that this will later get turned into &quot;
} }
$final = rtrim($final, ', '); $final = rtrim($final, ', ');
if ($final === '') return false; if ($final === '') return false;
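
Note: the new font-family output rule (bare output for strictly alphanumeric names, otherwise double quotes with `\5C ` and `\22 ` escapes) can be illustrated in isolation. The function below is a hypothetical sketch of that rule for a single, already-unescaped font name; it is not the validator itself and skips the comma-separated list handling around it.

```php
<?php
// Hypothetical sketch: serialize one "pure" font name for CSS output.
function quote_css_font_name_sketch(string $font): string
{
    // Strictly alphanumeric names are safe without quotes.
    if ($font !== '' && ctype_alnum($font)) {
        return $font;
    }
    // Collapse newline-like whitespace to plain spaces.
    $font = str_replace(array("\n", "\t", "\r", "\x0C"), ' ', $font);
    // Escape backslashes first so the backslash introduced by the quote
    // escape on the next line is not escaped a second time.
    $font = str_replace('\\', '\\5C ', $font);
    $font = str_replace('"', '\\22 ', $font);
    return '"' . $font . '"';
}
```

The trailing space after each hexadecimal escape terminates it, so a following character that happens to be a hex digit is not absorbed into the escape.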


@@ -34,20 +34,16 @@ class HTMLPurifier_AttrDef_CSS_URI extends HTMLPurifier_AttrDef_URI
$uri = substr($uri, 1, $new_length - 1); $uri = substr($uri, 1, $new_length - 1);
} }
$keys = array( '(', ')', ',', ' ', '"', "'"); $uri = $this->expandCSSEscape($uri);
$values = array('\\(', '\\)', '\\,', '\\ ', '\\"', "\\'");
$uri = str_replace($values, $keys, $uri);
$result = parent::validate($uri, $config, $context); $result = parent::validate($uri, $config, $context);
if ($result === false) return false; if ($result === false) return false;
// escape necessary characters according to CSS spec // extra sanity check; should have been done by URI
// except for the comma, none of these should appear in the $result = str_replace(array('"', "\\", "\n", "\x0c", "\r"), "", $result);
// URI at all
$result = str_replace($keys, $values, $result);
return "url($result)"; return "url(\"$result\")";
} }
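
Note: the new url() output path reduces to "strip anything that could break out of a double-quoted string, then wrap in double quotes". A minimal sketch follows, assuming the URI has already passed URI validation; the function name is hypothetical.

```php
<?php
// Hypothetical sketch: wrap an already-validated URI in url("...").
function format_css_url_sketch(string $validatedUri): string
{
    // Extra sanity check: remove characters that must never appear
    // inside the double-quoted url() argument.
    $clean = str_replace(array('"', "\\", "\n", "\x0c", "\r"), '', $validatedUri);
    return 'url("' . $clean . '")';
}
```

A properly percent-encoded URI should contain none of the stripped characters, so for well-formed input the replacement is expected to be a no-op.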


@@ -0,0 +1,34 @@
<?php
/**
* Implements special behavior for class attribute (normally NMTOKENS)
*/
class HTMLPurifier_AttrDef_HTML_Class extends HTMLPurifier_AttrDef_HTML_Nmtokens
{
protected function split($string, $config, $context) {
// really, this twiddle should be lazy loaded
$name = $config->getDefinition('HTML')->doctype->name;
if ($name == "XHTML 1.1" || $name == "XHTML 2.0") {
return parent::split($string, $config, $context);
} else {
return preg_split('/\s+/', $string);
}
}
protected function filter($tokens, $config, $context) {
$allowed = $config->get('Attr.AllowedClasses');
$forbidden = $config->get('Attr.ForbiddenClasses');
$ret = array();
foreach ($tokens as $token) {
if (
($allowed === null || isset($allowed[$token])) &&
!isset($forbidden[$token]) &&
// We need this O(n) check because of PHP's array
// implementation that casts -0 to 0.
!in_array($token, $ret, true)
) {
$ret[] = $token;
}
}
return $ret;
}
}
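
Note: the filter() step of the new Class attribute definition keeps a token only if it is allowed (or no allow-list is set), not forbidden, and not already emitted. The standalone sketch below mirrors that check; the function name and parameters are hypothetical, and the allow and forbid lists are assumed to be lookup arrays keyed by class name, as the isset() calls in the diff suggest.

```php
<?php
// Hypothetical sketch of class-token filtering.
// $allowed is null (no restriction) or a lookup array keyed by class name;
// $forbidden is a lookup array keyed by class name.
function filter_class_tokens_sketch(array $tokens, ?array $allowed, array $forbidden): array
{
    $ret = array();
    foreach ($tokens as $token) {
        if (
            ($allowed === null || isset($allowed[$token])) &&
            !isset($forbidden[$token]) &&
            // Strict linear search avoids PHP's array-key casting of
            // numeric-looking strings.
            !in_array($token, $ret, true)
        ) {
            $ret[] = $token;
        }
    }
    return $ret;
}

// Usage: split on whitespace, filter, and re-join.
$tokens = preg_split('/\s+/', 'note warning note hidden', -1, PREG_SPLIT_NO_EMPTY);
$kept = filter_class_tokens_sketch($tokens, null, array('hidden' => true));
$classAttr = implode(' ', $kept);
```

Duplicates are removed with in_array() rather than by using tokens as array keys, which follows the comment in the diff about key casting.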


@@ -9,7 +9,7 @@ class HTMLPurifier_AttrDef_HTML_Color extends HTMLPurifier_AttrDef
public function validate($string, $config, $context) { public function validate($string, $config, $context) {
static $colors = null; static $colors = null;
if ($colors === null) $colors = $config->get('Core', 'ColorKeywords'); if ($colors === null) $colors = $config->get('Core.ColorKeywords');
$string = trim($string); $string = trim($string);


@@ -12,7 +12,7 @@ class HTMLPurifier_AttrDef_HTML_FrameTarget extends HTMLPurifier_AttrDef_Enum
public function __construct() {} public function __construct() {}
public function validate($string, $config, $context) { public function validate($string, $config, $context) {
if ($this->valid_values === false) $this->valid_values = $config->get('Attr', 'AllowedFrameTargets'); if ($this->valid_values === false) $this->valid_values = $config->get('Attr.AllowedFrameTargets');
return parent::validate($string, $config, $context); return parent::validate($string, $config, $context);
} }


@@ -17,18 +17,18 @@ class HTMLPurifier_AttrDef_HTML_ID extends HTMLPurifier_AttrDef
public function validate($id, $config, $context) { public function validate($id, $config, $context) {
if (!$config->get('Attr', 'EnableID')) return false; if (!$config->get('Attr.EnableID')) return false;
$id = trim($id); // trim it first $id = trim($id); // trim it first
if ($id === '') return false; if ($id === '') return false;
$prefix = $config->get('Attr', 'IDPrefix'); $prefix = $config->get('Attr.IDPrefix');
if ($prefix !== '') { if ($prefix !== '') {
$prefix .= $config->get('Attr', 'IDPrefixLocal'); $prefix .= $config->get('Attr.IDPrefixLocal');
// prevent re-appending the prefix // prevent re-appending the prefix
if (strpos($id, $prefix) !== 0) $id = $prefix . $id; if (strpos($id, $prefix) !== 0) $id = $prefix . $id;
} elseif ($config->get('Attr', 'IDPrefixLocal') !== '') { } elseif ($config->get('Attr.IDPrefixLocal') !== '') {
trigger_error('%Attr.IDPrefixLocal cannot be used unless '. trigger_error('%Attr.IDPrefixLocal cannot be used unless '.
'%Attr.IDPrefix is set', E_USER_WARNING); '%Attr.IDPrefix is set', E_USER_WARNING);
} }
@@ -51,7 +51,7 @@ class HTMLPurifier_AttrDef_HTML_ID extends HTMLPurifier_AttrDef
$result = ($trim === ''); $result = ($trim === '');
} }
$regexp = $config->get('Attr', 'IDBlacklistRegexp'); $regexp = $config->get('Attr.IDBlacklistRegexp');
if ($regexp && preg_match($regexp, $id)) { if ($regexp && preg_match($regexp, $id)) {
return false; return false;
} }


@@ -27,7 +27,7 @@ class HTMLPurifier_AttrDef_HTML_LinkTypes extends HTMLPurifier_AttrDef
public function validate($string, $config, $context) { public function validate($string, $config, $context) {
$allowed = $config->get('Attr', $this->name); $allowed = $config->get('Attr.' . $this->name);
if (empty($allowed)) return false; if (empty($allowed)) return false;
$string = $this->parseCDATA($string); $string = $this->parseCDATA($string);


@@ -2,10 +2,6 @@
/** /**
* Validates contents based on NMTOKENS attribute type. * Validates contents based on NMTOKENS attribute type.
* @note The only current use for this is the class attribute in HTML
* @note Could have some functionality factored out into Nmtoken class
* @warning We cannot assume this class will be used only for 'class'
* attributes. Not sure how to hook in magic behavior, then.
*/ */
class HTMLPurifier_AttrDef_HTML_Nmtokens extends HTMLPurifier_AttrDef class HTMLPurifier_AttrDef_HTML_Nmtokens extends HTMLPurifier_AttrDef
{ {
@@ -17,6 +13,17 @@ class HTMLPurifier_AttrDef_HTML_Nmtokens extends HTMLPurifier_AttrDef
// early abort: '' and '0' (strings that convert to false) are invalid // early abort: '' and '0' (strings that convert to false) are invalid
if (!$string) return false; if (!$string) return false;
$tokens = $this->split($string, $config, $context);
$tokens = $this->filter($tokens, $config, $context);
if (empty($tokens)) return false;
return implode(' ', $tokens);
}
/**
* Splits a space separated list of tokens into its constituent parts.
*/
protected function split($string, $config, $context) {
// OPTIMIZABLE! // OPTIMIZABLE!
// do the preg_match, capture all subpatterns for reformulation // do the preg_match, capture all subpatterns for reformulation
@@ -24,23 +31,20 @@ class HTMLPurifier_AttrDef_HTML_Nmtokens extends HTMLPurifier_AttrDef
// escaping because I don't know how to do that with regexps // escaping because I don't know how to do that with regexps
// and plus it would complicate optimization efforts (you never // and plus it would complicate optimization efforts (you never
// see that anyway). // see that anyway).
$matches = array();
$pattern = '/(?:(?<=\s)|\A)'. // look behind for space or string start $pattern = '/(?:(?<=\s)|\A)'. // look behind for space or string start
'((?:--|-?[A-Za-z_])[A-Za-z_\-0-9]*)'. '((?:--|-?[A-Za-z_])[A-Za-z_\-0-9]*)'.
'(?:(?=\s)|\z)/'; // look ahead for space or string end '(?:(?=\s)|\z)/'; // look ahead for space or string end
preg_match_all($pattern, $string, $matches); preg_match_all($pattern, $string, $matches);
return $matches[1];
}
if (empty($matches[1])) return false; /**
* Template method for removing certain tokens based on arbitrary criteria.
// reconstruct string * @note If we wanted to be really functional, we'd do an array_filter
$new_string = ''; * with a callback. But... we're not.
foreach ($matches[1] as $token) { */
$new_string .= $token . ' '; protected function filter($tokens, $config, $context) {
} return $tokens;
$new_string = rtrim($new_string);
return $new_string;
} }
} }


@@ -25,7 +25,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
public function validate($uri, $config, $context) { public function validate($uri, $config, $context) {
if ($config->get('URI', 'Disable')) return false; if ($config->get('URI.Disable')) return false;
$uri = $this->parseCDATA($uri); $uri = $this->parseCDATA($uri);


@@ -10,7 +10,7 @@ class HTMLPurifier_AttrTransform_BdoDir extends HTMLPurifier_AttrTransform
public function transform($attr, $config, $context) { public function transform($attr, $config, $context) {
if (isset($attr['dir'])) return $attr; if (isset($attr['dir'])) return $attr;
$attr['dir'] = $config->get('Attr', 'DefaultTextDir'); $attr['dir'] = $config->get('Attr.DefaultTextDir');
return $attr; return $attr;
} }


@@ -15,21 +15,22 @@ class HTMLPurifier_AttrTransform_ImgRequired extends HTMLPurifier_AttrTransform
$src = true; $src = true;
if (!isset($attr['src'])) { if (!isset($attr['src'])) {
if ($config->get('Core', 'RemoveInvalidImg')) return $attr; if ($config->get('Core.RemoveInvalidImg')) return $attr;
$attr['src'] = $config->get('Attr', 'DefaultInvalidImage'); $attr['src'] = $config->get('Attr.DefaultInvalidImage');
$src = false; $src = false;
} }
if (!isset($attr['alt'])) { if (!isset($attr['alt'])) {
if ($src) { if ($src) {
$alt = $config->get('Attr', 'DefaultImageAlt'); $alt = $config->get('Attr.DefaultImageAlt');
if ($alt === null) { if ($alt === null) {
$attr['alt'] = basename($attr['src']); // truncate if the alt is too long
$attr['alt'] = substr(basename($attr['src']),0,40);
} else { } else {
$attr['alt'] = $alt; $attr['alt'] = $alt;
} }
} else { } else {
$attr['alt'] = $config->get('Attr', 'DefaultInvalidImageAlt'); $attr['alt'] = $config->get('Attr.DefaultInvalidImageAlt');
} }
} }


@@ -7,6 +7,8 @@ class HTMLPurifier_AttrTransform_Name extends HTMLPurifier_AttrTransform
{ {
public function transform($attr, $config, $context) { public function transform($attr, $config, $context) {
// Abort early if we're using relaxed definition of name
if ($config->get('HTML.Attr.Name.UseCDATA')) return $attr;
if (!isset($attr['name'])) return $attr; if (!isset($attr['name'])) return $attr;
$id = $this->confiscateAttr($attr, 'name'); $id = $this->confiscateAttr($attr, 'name');
if ( isset($attr['id'])) return $attr; if ( isset($attr['id'])) return $attr;


@@ -0,0 +1,27 @@
<?php
/**
* Post-transform that performs validation to the name attribute; if
* it is present with an equivalent id attribute, it is passed through;
* otherwise validation is performed.
*/
class HTMLPurifier_AttrTransform_NameSync extends HTMLPurifier_AttrTransform
{
public function __construct() {
$this->idDef = new HTMLPurifier_AttrDef_HTML_ID();
}
public function transform($attr, $config, $context) {
if (!isset($attr['name'])) return $attr;
$name = $attr['name'];
if (isset($attr['id']) && $attr['id'] === $name) return $attr;
$result = $this->idDef->validate($name, $config, $context);
if ($result === false) unset($attr['name']);
else $attr['name'] = $result;
return $attr;
}
}
// vim: et sw=4 sts=4


@@ -37,8 +37,14 @@ class HTMLPurifier_AttrTransform_SafeParam extends HTMLPurifier_AttrTransform
$attr['value'] = 'window'; $attr['value'] = 'window';
break; break;
case 'movie': case 'movie':
case 'src':
$attr['name'] = "movie";
$attr['value'] = $this->uri->validate($attr['value'], $config, $context); $attr['value'] = $this->uri->validate($attr['value'], $config, $context);
break; break;
case 'flashvars':
// we're going to allow arbitrary inputs to the SWF, on
// the reasoning that it could only hack the SWF, not us.
break;
// add other cases to support other param name/value pairs // add other cases to support other param name/value pairs
default: default:
$attr['name'] = $attr['value'] = null; $attr['name'] = $attr['value'] = null;


@@ -36,6 +36,9 @@ class HTMLPurifier_AttrTypes
$this->info['Charsets'] = new HTMLPurifier_AttrDef_Text(); $this->info['Charsets'] = new HTMLPurifier_AttrDef_Text();
$this->info['Character'] = new HTMLPurifier_AttrDef_Text(); $this->info['Character'] = new HTMLPurifier_AttrDef_Text();
// "proprietary" types
$this->info['Class'] = new HTMLPurifier_AttrDef_HTML_Class();
// number is really a positive integer (one or more digits) // number is really a positive integer (one or more digits)
// FIXME: ^^ not always, see start and value of list items // FIXME: ^^ not always, see start and value of list items
$this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true); $this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true);


@@ -154,7 +154,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
new HTMLPurifier_AttrDef_CSS_Percentage(true), new HTMLPurifier_AttrDef_CSS_Percentage(true),
new HTMLPurifier_AttrDef_Enum(array('auto')) new HTMLPurifier_AttrDef_Enum(array('auto'))
)); ));
$max = $config->get('CSS', 'MaxImgLength'); $max = $config->get('CSS.MaxImgLength');
$this->info['width'] = $this->info['width'] =
$this->info['height'] = $this->info['height'] =
@@ -211,15 +211,15 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
// partial support // partial support
$this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap')); $this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap'));
if ($config->get('CSS', 'Proprietary')) { if ($config->get('CSS.Proprietary')) {
$this->doSetupProprietary($config); $this->doSetupProprietary($config);
} }
if ($config->get('CSS', 'AllowTricky')) { if ($config->get('CSS.AllowTricky')) {
$this->doSetupTricky($config); $this->doSetupTricky($config);
} }
$allow_important = $config->get('CSS', 'AllowImportant'); $allow_important = $config->get('CSS.AllowImportant');
// wrap all attr-defs with decorator that handles !important // wrap all attr-defs with decorator that handles !important
foreach ($this->info as $k => $v) { foreach ($this->info as $k => $v) {
$this->info[$k] = new HTMLPurifier_AttrDef_CSS_ImportantDecorator($v, $allow_important); $this->info[$k] = new HTMLPurifier_AttrDef_CSS_ImportantDecorator($v, $allow_important);
@@ -272,7 +272,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
// setup allowed elements // setup allowed elements
$support = "(for information on implementing this, see the ". $support = "(for information on implementing this, see the ".
"support forums) "; "support forums) ";
$allowed_attributes = $config->get('CSS', 'AllowedProperties'); $allowed_attributes = $config->get('CSS.AllowedProperties');
if ($allowed_attributes !== null) { if ($allowed_attributes !== null) {
foreach ($this->info as $name => $d) { foreach ($this->info as $name => $d) {
if(!isset($allowed_attributes[$name])) unset($this->info[$name]); if(!isset($allowed_attributes[$name])) unset($this->info[$name]);


@@ -59,7 +59,7 @@ class HTMLPurifier_ChildDef_Required extends HTMLPurifier_ChildDef
$all_whitespace = true; $all_whitespace = true;
// some configuration // some configuration
$escape_invalid_children = $config->get('Core', 'EscapeInvalidChildren'); $escape_invalid_children = $config->get('Core.EscapeInvalidChildren');
// generator // generator
$gen = new HTMLPurifier_Generator($config, $context); $gen = new HTMLPurifier_Generator($config, $context);


@@ -20,7 +20,7 @@ class HTMLPurifier_Config
/** /**
* HTML Purifier's version * HTML Purifier's version
*/ */
public $version = '3.3.0'; public $version = '4.1.1';
/** /**
* Bool indicator whether or not to automatically finalize * Bool indicator whether or not to automatically finalize
@@ -68,12 +68,30 @@ class HTMLPurifier_Config
*/ */
protected $plist; protected $plist;
/**
* Whether or not a set is taking place due to an
* alias lookup.
*/
private $aliasMode;
/**
* Set to false if you do not want line and file numbers in errors
* (useful when unit testing)
*/
public $chatty = true;
/**
* Current lock; only gets to this namespace are allowed.
*/
private $lock;
/** /**
* @param $definition HTMLPurifier_ConfigSchema that defines what directives * @param $definition HTMLPurifier_ConfigSchema that defines what directives
* are allowed. * are allowed.
*/ */
public function __construct($definition) { public function __construct($definition, $parent = null) {
$this->plist = new HTMLPurifier_PropertyList($definition->defaultPlist); $parent = $parent ? $parent : $definition->defaultPlist;
$this->plist = new HTMLPurifier_PropertyList($parent);
$this->def = $definition; // keep a copy around for checking $this->def = $definition; // keep a copy around for checking
$this->parser = new HTMLPurifier_VarParser_Flexible(); $this->parser = new HTMLPurifier_VarParser_Flexible();
} }
@@ -102,6 +120,16 @@ class HTMLPurifier_Config
return $ret; return $ret;
} }
/**
* Creates a new config object that inherits from a previous one.
* @param HTMLPurifier_Config $config Configuration object to inherit
* from.
* @return HTMLPurifier_Config object with $config as its parent.
*/
public static function inherit(HTMLPurifier_Config $config) {
return new HTMLPurifier_Config($config->def, $config->plist);
}
/** /**
* Convenience constructor that creates a default configuration object. * Convenience constructor that creates a default configuration object.
* @return Default HTMLPurifier_Config object. * @return Default HTMLPurifier_Config object.
@@ -114,24 +142,34 @@ class HTMLPurifier_Config
/** /**
* Retrieves a value from the configuration. * Retrieves a value from the configuration.
* @param $namespace String namespace
* @param $key String key * @param $key String key
*/ */
public function get($namespace, $key) { public function get($key, $a = null) {
if (!$this->finalized) $this->autoFinalize ? $this->finalize() : $this->plist->squash(true); if ($a !== null) {
if (!isset($this->def->info[$namespace][$key])) { $this->triggerError("Using deprecated API: use \$config->get('$key.$a') instead", E_USER_WARNING);
$key = "$key.$a";
}
if (!$this->finalized) $this->autoFinalize();
if (!isset($this->def->info[$key])) {
// can't add % due to SimpleTest bug // can't add % due to SimpleTest bug
trigger_error('Cannot retrieve value of undefined directive ' . htmlspecialchars("$namespace.$key"), $this->triggerError('Cannot retrieve value of undefined directive ' . htmlspecialchars($key),
E_USER_WARNING); E_USER_WARNING);
return; return;
} }
if (isset($this->def->info[$namespace][$key]->isAlias)) { if (isset($this->def->info[$key]->isAlias)) {
$d = $this->def->info[$namespace][$key]; $d = $this->def->info[$key];
trigger_error('Cannot get value from aliased directive, use real name ' . $d->namespace . '.' . $d->name, $this->triggerError('Cannot get value from aliased directive, use real name ' . $d->key,
E_USER_ERROR); E_USER_ERROR);
return; return;
} }
return $this->plist->get("$namespace.$key"); if ($this->lock) {
list($ns) = explode('.', $key);
if ($ns !== $this->lock) {
$this->triggerError('Cannot get value of namespace ' . $ns . ' when lock for ' . $this->lock . ' is active, this probably indicates a Definition setup method is accessing directives that are not within its namespace', E_USER_ERROR);
return;
}
}
return $this->plist->get($key);
} }
/** /**
@@ -139,13 +177,13 @@ class HTMLPurifier_Config
* @param $namespace String namespace * @param $namespace String namespace
*/ */
public function getBatch($namespace) { public function getBatch($namespace) {
if (!$this->finalized) $this->autoFinalize ? $this->finalize() : $this->plist->squash(true); if (!$this->finalized) $this->autoFinalize();
if (!isset($this->def->info[$namespace])) { $full = $this->getAll();
trigger_error('Cannot retrieve undefined namespace ' . htmlspecialchars($namespace), if (!isset($full[$namespace])) {
$this->triggerError('Cannot retrieve undefined namespace ' . htmlspecialchars($namespace),
E_USER_WARNING); E_USER_WARNING);
return; return;
} }
$full = $this->getAll();
return $full[$namespace]; return $full[$namespace];
} }
@@ -178,9 +216,10 @@ class HTMLPurifier_Config
/** /**
* Retrieves all directives, organized by namespace * Retrieves all directives, organized by namespace
* @warning This is a pretty inefficient function, avoid if you can
*/ */
public function getAll() { public function getAll() {
if (!$this->finalized) $this->autoFinalize ? $this->finalize() : $this->plist->squash(true); if (!$this->finalized) $this->autoFinalize();
$ret = array(); $ret = array();
foreach ($this->plist->squash() as $name => $value) { foreach ($this->plist->squash() as $name => $value) {
list($ns, $key) = explode('.', $name, 2); list($ns, $key) = explode('.', $name, 2);
@@ -191,29 +230,37 @@ class HTMLPurifier_Config
/** /**
* Sets a value to configuration. * Sets a value to configuration.
* @param $namespace String namespace
* @param $key String key * @param $key String key
* @param $value Mixed value * @param $value Mixed value
*/ */
public function set($namespace, $key, $value, $from_alias = false) { public function set($key, $value, $a = null) {
if (strpos($key, '.') === false) {
$namespace = $key;
$directive = $value;
$value = $a;
$key = "$key.$directive";
$this->triggerError("Using deprecated API: use \$config->set('$key', ...) instead", E_USER_NOTICE);
} else {
list($namespace) = explode('.', $key);
}
if ($this->isFinalized('Cannot set directive after finalization')) return; if ($this->isFinalized('Cannot set directive after finalization')) return;
if (!isset($this->def->info[$namespace][$key])) { if (!isset($this->def->info[$key])) {
trigger_error('Cannot set undefined directive ' . htmlspecialchars("$namespace.$key") . ' to value', $this->triggerError('Cannot set undefined directive ' . htmlspecialchars($key) . ' to value',
E_USER_WARNING); E_USER_WARNING);
return; return;
} }
$def = $this->def->info[$namespace][$key]; $def = $this->def->info[$key];
if (isset($def->isAlias)) { if (isset($def->isAlias)) {
if ($from_alias) { if ($this->aliasMode) {
trigger_error('Double-aliases not allowed, please fix '. $this->triggerError('Double-aliases not allowed, please fix '.
'ConfigSchema bug with' . "$namespace.$key", E_USER_ERROR); 'ConfigSchema bug with' . $key, E_USER_ERROR);
return; return;
} }
$this->set($new_ns = $def->namespace, $this->aliasMode = true;
$new_dir = $def->name, $this->set($def->key, $value);
$value, true); $this->aliasMode = false;
trigger_error("$namespace.$key is an alias, preferred directive name is $new_ns.$new_dir", E_USER_NOTICE); $this->triggerError("$key is an alias, preferred directive name is {$def->key}", E_USER_NOTICE);
return; return;
} }
@@ -231,7 +278,7 @@ class HTMLPurifier_Config
try { try {
$value = $this->parser->parse($value, $type, $allow_null); $value = $this->parser->parse($value, $type, $allow_null);
} catch (HTMLPurifier_VarParserException $e) { } catch (HTMLPurifier_VarParserException $e) {
trigger_error('Value for ' . "$namespace.$key" . ' is of invalid type, should be ' . HTMLPurifier_VarParser::getTypeName($type), E_USER_WARNING); $this->triggerError('Value for ' . $key . ' is of invalid type, should be ' . HTMLPurifier_VarParser::getTypeName($type), E_USER_WARNING);
return; return;
} }
if (is_string($value) && is_object($def)) { if (is_string($value) && is_object($def)) {
@@ -241,17 +288,17 @@ class HTMLPurifier_Config
} }
// check to see if the value is allowed // check to see if the value is allowed
if (isset($def->allowed) && !isset($def->allowed[$value])) { if (isset($def->allowed) && !isset($def->allowed[$value])) {
trigger_error('Value not supported, valid values are: ' . $this->triggerError('Value not supported, valid values are: ' .
$this->_listify($def->allowed), E_USER_WARNING); $this->_listify($def->allowed), E_USER_WARNING);
return; return;
} }
} }
$this->plist->set("$namespace.$key", $value); $this->plist->set($key, $value);
// reset definitions if the directives they depend on changed // reset definitions if the directives they depend on changed
// this is a very costly process, so it's discouraged // this is a very costly process, so it's discouraged
// with finalization // with finalization
if ($namespace == 'HTML' || $namespace == 'CSS') { if ($namespace == 'HTML' || $namespace == 'CSS' || $namespace == 'URI') {
$this->definitions[$namespace] = null; $this->definitions[$namespace] = null;
} }
@@ -291,9 +338,13 @@ class HTMLPurifier_Config
* @param $raw Whether or not definition should be returned raw * @param $raw Whether or not definition should be returned raw
*/ */
public function getDefinition($type, $raw = false) { public function getDefinition($type, $raw = false) {
if (!$this->finalized) $this->autoFinalize ? $this->finalize() : $this->plist->squash(true); if (!$this->finalized) $this->autoFinalize();
// temporarily suspend locks, so we can handle recursive definition calls
$lock = $this->lock;
$this->lock = null;
$factory = HTMLPurifier_DefinitionCacheFactory::instance(); $factory = HTMLPurifier_DefinitionCacheFactory::instance();
$cache = $factory->create($type, $this); $cache = $factory->create($type, $this);
$this->lock = $lock;
if (!$raw) { if (!$raw) {
// see if we can quickly supply a definition // see if we can quickly supply a definition
if (!empty($this->definitions[$type])) { if (!empty($this->definitions[$type])) {
@@ -328,14 +379,16 @@ class HTMLPurifier_Config
} }
// quick abort if raw // quick abort if raw
if ($raw) { if ($raw) {
if (is_null($this->get($type, 'DefinitionID'))) { if (is_null($this->get($type . '.DefinitionID'))) {
// fatally error out if definition ID not set // fatally error out if definition ID not set
throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID"); throw new HTMLPurifier_Exception("Cannot retrieve raw version without specifying %$type.DefinitionID");
} }
return $this->definitions[$type]; return $this->definitions[$type];
} }
// set it up // set it up
$this->lock = $type;
$this->definitions[$type]->setup($this); $this->definitions[$type]->setup($this);
$this->lock = null;
// save in cache // save in cache
$cache->set($this->definitions[$type], $this); $cache->set($this->definitions[$type], $this);
return $this->definitions[$type]; return $this->definitions[$type];
@@ -351,14 +404,12 @@ class HTMLPurifier_Config
foreach ($config_array as $key => $value) { foreach ($config_array as $key => $value) {
$key = str_replace('_', '.', $key); $key = str_replace('_', '.', $key);
if (strpos($key, '.') !== false) { if (strpos($key, '.') !== false) {
// condensed form $this->set($key, $value);
list($namespace, $directive) = explode('.', $key);
$this->set($namespace, $directive, $value);
} else { } else {
$namespace = $key; $namespace = $key;
$namespace_values = $value; $namespace_values = $value;
foreach ($namespace_values as $directive => $value) { foreach ($namespace_values as $directive => $value) {
$this->set($namespace, $directive, $value); $this->set($namespace .'.'. $directive, $value);
} }
} }
} }
@@ -394,16 +445,15 @@ class HTMLPurifier_Config
} }
} }
$ret = array(); $ret = array();
foreach ($schema->info as $ns => $keypairs) { foreach ($schema->info as $key => $def) {
foreach ($keypairs as $directive => $def) { list($ns, $directive) = explode('.', $key, 2);
if ($allowed !== true) { if ($allowed !== true) {
if (isset($blacklisted_directives["$ns.$directive"])) continue; if (isset($blacklisted_directives["$ns.$directive"])) continue;
if (!isset($allowed_directives["$ns.$directive"]) && !isset($allowed_ns[$ns])) continue; if (!isset($allowed_directives["$ns.$directive"]) && !isset($allowed_ns[$ns])) continue;
}
if (isset($def->isAlias)) continue;
if ($directive == 'DefinitionID' || $directive == 'DefinitionRev') continue;
$ret[] = array($ns, $directive);
} }
if (isset($def->isAlias)) continue;
if ($directive == 'DefinitionID' || $directive == 'DefinitionRev') continue;
$ret[] = array($ns, $directive);
} }
return $ret; return $ret;
} }
@@ -472,7 +522,7 @@ class HTMLPurifier_Config
*/ */
public function isFinalized($error = false) { public function isFinalized($error = false) {
if ($this->finalized && $error) { if ($this->finalized && $error) {
trigger_error($error, E_USER_ERROR); $this->triggerError($error, E_USER_ERROR);
} }
return $this->finalized; return $this->finalized;
} }
@@ -482,7 +532,11 @@ class HTMLPurifier_Config
* already finalized * already finalized
*/ */
public function autoFinalize() { public function autoFinalize() {
if (!$this->finalized && $this->autoFinalize) $this->finalize(); if ($this->autoFinalize) {
$this->finalize();
} else {
$this->plist->squash(true);
}
} }
/** /**
@@ -490,6 +544,35 @@ class HTMLPurifier_Config
*/ */
public function finalize() { public function finalize() {
$this->finalized = true; $this->finalized = true;
unset($this->parser);
}
/**
* Produces a nicely formatted error message by supplying the
* stack frame information from two levels up and OUTSIDE of
* HTMLPurifier_Config.
*/
protected function triggerError($msg, $no) {
// determine previous stack frame
$backtrace = debug_backtrace();
if ($this->chatty && isset($backtrace[1])) {
$frame = $backtrace[1];
$extra = " on line {$frame['line']} in file {$frame['file']}";
} else {
$extra = '';
}
trigger_error($msg . $extra, $no);
}
/**
* Returns a serialized form of the configuration object that can
* be reconstituted.
*/
public function serialize() {
$this->getDefinition('HTML');
$this->getDefinition('CSS');
$this->getDefinition('URI');
return serialize($this);
} }
} }
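
The reworked `set()` in this file takes a single dotted key and treats a dot-less first argument as the deprecated `(namespace, directive, value)` call form. The following is a minimal standalone sketch of that argument handling only. `normalizeSetArgs()` is a hypothetical helper written for illustration and is not part of HTML Purifier; the real method goes on to validate the directive and store the value.

```php
<?php
/**
 * Hypothetical helper (NOT part of HTML Purifier) that mirrors the argument
 * handling at the top of the new HTMLPurifier_Config::set().
 * Returns array(namespace, full key, value, deprecated-form-used).
 */
function normalizeSetArgs($key, $value, $a = null) {
    if (strpos($key, '.') === false) {
        // Deprecated form: set('HTML', 'Allowed', $value).
        // The first argument is the namespace, the second the directive.
        $namespace  = $key;
        $directive  = $value;
        $value      = $a;
        $key        = "$namespace.$directive";
        $deprecated = true;
    } else {
        // New form: set('HTML.Allowed', $value).
        // The namespace is the segment before the first dot.
        list($namespace) = explode('.', $key);
        $deprecated = false;
    }
    return array($namespace, $key, $value, $deprecated);
}

// Both call styles resolve to the same namespace, key and value.
$newStyle = normalizeSetArgs('HTML.Allowed', 'p,b');
$oldStyle = normalizeSetArgs('HTML', 'Allowed', 'p,b');
print_r($newStyle);
print_r($oldStyle);
```

Only the leading segment is kept as the namespace, which is what the later `HTML`/`CSS`/`URI` definition reset in `set()` compares against, so nested keys such as `AutoFormat.RemoveEmpty.RemoveNbsp` still map to a single root namespace.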

View File

@@ -87,24 +87,13 @@ class HTMLPurifier_ConfigSchema {
* HTMLPurifier_DirectiveDef::$type for allowed values * HTMLPurifier_DirectiveDef::$type for allowed values
* @param $allow_null Whether or not to allow null values * @param $allow_null Whether or not to allow null values
*/ */
public function add($namespace, $name, $default, $type, $allow_null) { public function add($key, $default, $type, $allow_null) {
$obj = new stdclass(); $obj = new stdclass();
$obj->type = is_int($type) ? $type : HTMLPurifier_VarParser::$types[$type]; $obj->type = is_int($type) ? $type : HTMLPurifier_VarParser::$types[$type];
if ($allow_null) $obj->allow_null = true; if ($allow_null) $obj->allow_null = true;
$this->info[$namespace][$name] = $obj; $this->info[$key] = $obj;
$this->defaults[$namespace][$name] = $default; $this->defaults[$key] = $default;
$this->defaultPlist->set("$namespace.$name", $default); $this->defaultPlist->set($key, $default);
}
/**
* Defines a namespace for directives to be put into.
* @warning This is slightly different from the corresponding static
* method.
* @param $namespace Namespace's name
*/
public function addNamespace($namespace) {
$this->info[$namespace] = array();
$this->defaults[$namespace] = array();
} }
/** /**
@@ -116,12 +105,12 @@ class HTMLPurifier_ConfigSchema {
* @param $name Name of Directive * @param $name Name of Directive
* @param $aliases Hash of aliased values to the real alias * @param $aliases Hash of aliased values to the real alias
*/ */
public function addValueAliases($namespace, $name, $aliases) { public function addValueAliases($key, $aliases) {
if (!isset($this->info[$namespace][$name]->aliases)) { if (!isset($this->info[$key]->aliases)) {
$this->info[$namespace][$name]->aliases = array(); $this->info[$key]->aliases = array();
} }
foreach ($aliases as $alias => $real) { foreach ($aliases as $alias => $real) {
$this->info[$namespace][$name]->aliases[$alias] = $real; $this->info[$key]->aliases[$alias] = $real;
} }
} }
@@ -133,8 +122,8 @@ class HTMLPurifier_ConfigSchema {
* @param $name Name of directive * @param $name Name of directive
* @param $allowed Lookup array of allowed values * @param $allowed Lookup array of allowed values
*/ */
public function addAllowedValues($namespace, $name, $allowed) { public function addAllowedValues($key, $allowed) {
$this->info[$namespace][$name]->allowed = $allowed; $this->info[$key]->allowed = $allowed;
} }
/** /**
@@ -144,88 +133,26 @@ class HTMLPurifier_ConfigSchema {
* @param $new_namespace * @param $new_namespace
* @param $new_name Directive that the alias will be to * @param $new_name Directive that the alias will be to
*/ */
public function addAlias($namespace, $name, $new_namespace, $new_name) { public function addAlias($key, $new_key) {
$obj = new stdclass; $obj = new stdclass;
$obj->namespace = $new_namespace; $obj->key = $new_key;
$obj->name = $new_name;
$obj->isAlias = true; $obj->isAlias = true;
$this->info[$namespace][$name] = $obj; $this->info[$key] = $obj;
} }
/** /**
* Replaces any stdclass that only has the type property with type integer. * Replaces any stdclass that only has the type property with type integer.
*/ */
public function postProcess() { public function postProcess() {
foreach ($this->info as $namespace => $info) { foreach ($this->info as $key => $v) {
foreach ($info as $directive => $v) { if (count((array) $v) == 1) {
if (count((array) $v) == 1) { $this->info[$key] = $v->type;
$this->info[$namespace][$directive] = $v->type; } elseif (count((array) $v) == 2 && isset($v->allow_null)) {
} elseif (count((array) $v) == 2 && isset($v->allow_null)) { $this->info[$key] = -$v->type;
$this->info[$namespace][$directive] = -$v->type;
}
} }
} }
} }
// DEPRECATED METHODS
/** @see HTMLPurifier_ConfigSchema->set() */
public static function define($namespace, $name, $default, $type, $description) {
HTMLPurifier_ConfigSchema::deprecated(__METHOD__);
$type_values = explode('/', $type, 2);
$type = $type_values[0];
$modifier = isset($type_values[1]) ? $type_values[1] : false;
$allow_null = ($modifier === 'null');
$def = HTMLPurifier_ConfigSchema::instance();
$def->add($namespace, $name, $default, $type, $allow_null);
}
/** @see HTMLPurifier_ConfigSchema->addNamespace() */
public static function defineNamespace($namespace, $description) {
HTMLPurifier_ConfigSchema::deprecated(__METHOD__);
$def = HTMLPurifier_ConfigSchema::instance();
$def->addNamespace($namespace);
}
/** @see HTMLPurifier_ConfigSchema->addValueAliases() */
public static function defineValueAliases($namespace, $name, $aliases) {
HTMLPurifier_ConfigSchema::deprecated(__METHOD__);
$def = HTMLPurifier_ConfigSchema::instance();
$def->addValueAliases($namespace, $name, $aliases);
}
/** @see HTMLPurifier_ConfigSchema->addAllowedValues() */
public static function defineAllowedValues($namespace, $name, $allowed_values) {
HTMLPurifier_ConfigSchema::deprecated(__METHOD__);
$allowed = array();
foreach ($allowed_values as $value) {
$allowed[$value] = true;
}
$def = HTMLPurifier_ConfigSchema::instance();
$def->addAllowedValues($namespace, $name, $allowed);
}
/** @see HTMLPurifier_ConfigSchema->addAlias() */
public static function defineAlias($namespace, $name, $new_namespace, $new_name) {
HTMLPurifier_ConfigSchema::deprecated(__METHOD__);
$def = HTMLPurifier_ConfigSchema::instance();
$def->addAlias($namespace, $name, $new_namespace, $new_name);
}
/** @deprecated, use HTMLPurifier_VarParser->parse() */
public function validate($a, $b, $c = false) {
trigger_error("HTMLPurifier_ConfigSchema->validate deprecated, use HTMLPurifier_VarParser->parse instead", E_USER_NOTICE);
$parser = new HTMLPurifier_VarParser();
return $parser->parse($a, $b, $c);
}
/**
* Throws an E_USER_NOTICE stating that a method is deprecated.
*/
private static function deprecated($method) {
trigger_error("Static HTMLPurifier_ConfigSchema::$method deprecated, use add*() method instead", E_USER_NOTICE);
}
} }
// vim: et sw=4 sts=4 // vim: et sw=4 sts=4
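
With `$info` now keyed by the full directive key instead of nested by namespace, `postProcess()` reduces to a single loop. The sketch below reproduces that compaction rule on plain `stdClass` objects so it can be run on its own. The function name `compactSchemaInfo()`, the sample keys, and the integer type IDs are illustrative assumptions, not values taken from HTML Purifier.

```php
<?php
/**
 * Standalone sketch of the flattened postProcess() rule:
 *  - an entry with only ->type becomes the integer type,
 *  - an entry with ->type and ->allow_null becomes the negated type,
 *  - anything else (aliases, entries with ->allowed, ...) stays an object.
 * Assumes every entry is still an object, i.e. the rule is applied once.
 */
function compactSchemaInfo(array $info) {
    foreach ($info as $key => $v) {
        $props = count((array) $v);
        if ($props == 1) {
            $info[$key] = $v->type;
        } elseif ($props == 2 && isset($v->allow_null)) {
            $info[$key] = -$v->type;
        }
    }
    return $info;
}

// Illustrative entries; the type IDs 1 and 8 are made up for this example.
$plain = new stdClass();
$plain->type = 1;

$nullable = new stdClass();
$nullable->type = 8;
$nullable->allow_null = true;

$alias = new stdClass();
$alias->key = 'Example.Target';
$alias->isAlias = true;

$out = compactSchemaInfo(array(
    'Example.Plain'    => $plain,
    'Example.Nullable' => $nullable,
    'Example.Alias'    => $alias,
));
var_dump($out);
```

The alias entry has two properties but no `allow_null`, so it falls through both branches and remains an object, which is what lets `set()` later detect it through `isset($def->isAlias)`.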

View File

@@ -9,36 +9,28 @@ class HTMLPurifier_ConfigSchema_Builder_ConfigSchema
public function build($interchange) { public function build($interchange) {
$schema = new HTMLPurifier_ConfigSchema(); $schema = new HTMLPurifier_ConfigSchema();
foreach ($interchange->namespaces as $n) {
$schema->addNamespace($n->namespace);
}
foreach ($interchange->directives as $d) { foreach ($interchange->directives as $d) {
$schema->add( $schema->add(
$d->id->namespace, $d->id->key,
$d->id->directive,
$d->default, $d->default,
$d->type, $d->type,
$d->typeAllowsNull $d->typeAllowsNull
); );
if ($d->allowed !== null) { if ($d->allowed !== null) {
$schema->addAllowedValues( $schema->addAllowedValues(
$d->id->namespace, $d->id->key,
$d->id->directive,
$d->allowed $d->allowed
); );
} }
foreach ($d->aliases as $alias) { foreach ($d->aliases as $alias) {
$schema->addAlias( $schema->addAlias(
$alias->namespace, $alias->key,
$alias->directive, $d->id->key
$d->id->namespace,
$d->id->directive
); );
} }
if ($d->valueAliases !== null) { if ($d->valueAliases !== null) {
$schema->addValueAliases( $schema->addValueAliases(
$d->id->namespace, $d->id->key,
$d->id->directive,
$d->valueAliases $d->valueAliases
); );
} }

View File

@@ -8,6 +8,7 @@ class HTMLPurifier_ConfigSchema_Builder_Xml extends XMLWriter
{ {
protected $interchange; protected $interchange;
private $namespace;
protected function writeHTMLDiv($html) { protected function writeHTMLDiv($html) {
$this->startElement('div'); $this->startElement('div');
@@ -34,36 +35,33 @@ class HTMLPurifier_ConfigSchema_Builder_Xml extends XMLWriter
$this->startElement('configdoc'); $this->startElement('configdoc');
$this->writeElement('title', $interchange->name); $this->writeElement('title', $interchange->name);
foreach ($interchange->namespaces as $namespace) { foreach ($interchange->directives as $directive) {
$this->buildNamespace($namespace); $this->buildDirective($directive);
} }
if ($this->namespace) $this->endElement(); // namespace
$this->endElement(); // configdoc $this->endElement(); // configdoc
$this->flush(); $this->flush();
} }
public function buildNamespace($namespace) { public function buildDirective($directive) {
$this->startElement('namespace');
$this->writeAttribute('id', $namespace->namespace);
$this->writeElement('name', $namespace->namespace); // Kludge, although I suppose having a notion of a "root namespace"
$this->startElement('description'); // certainly makes things look nicer when documentation is built.
$this->writeHTMLDiv($namespace->description); // Depends on things being sorted.
$this->endElement(); // description if (!$this->namespace || $this->namespace !== $directive->id->getRootNamespace()) {
if ($this->namespace) $this->endElement(); // namespace
foreach ($this->interchange->directives as $directive) { $this->namespace = $directive->id->getRootNamespace();
if ($directive->id->namespace !== $namespace->namespace) continue; $this->startElement('namespace');
$this->buildDirective($directive); $this->writeAttribute('id', $this->namespace);
$this->writeElement('name', $this->namespace);
} }
$this->endElement(); // namespace
}
public function buildDirective($directive) {
$this->startElement('directive'); $this->startElement('directive');
$this->writeAttribute('id', $directive->id->toString()); $this->writeAttribute('id', $directive->id->toString());
$this->writeElement('name', $directive->id->directive); $this->writeElement('name', $directive->id->getDirective());
$this->startElement('aliases'); $this->startElement('aliases');
foreach ($directive->aliases as $alias) $this->writeElement('alias', $alias->toString()); foreach ($directive->aliases as $alias) $this->writeElement('alias', $alias->toString());
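
The new `buildDirective()` opens a `namespace` element whenever the root namespace changes, which, as its comment notes, depends on the directives arriving sorted. A minimal sketch of that grouping idea, written without XMLWriter so it runs standalone; `groupByRootNamespace()` and the sample keys are assumptions made for illustration only.

```php
<?php
/**
 * Hypothetical illustration of "start a new group when the root namespace
 * changes". Sorting first guarantees that each namespace forms one
 * contiguous run, which is the property the XML builder relies on.
 */
function groupByRootNamespace(array $keys) {
    sort($keys);
    $groups  = array();
    $current = null;
    foreach ($keys as $key) {
        $root = substr($key, 0, strpos($key, '.'));
        if ($current === null || $current !== $root) {
            // Root namespace changed: open a new group.
            $current  = $root;
            $groups[] = array('namespace' => $root, 'directives' => array());
        }
        $groups[count($groups) - 1]['directives'][] = $key;
    }
    return $groups;
}

$groups = groupByRootNamespace(array(
    'AutoFormat.RemoveEmpty',
    'Attr.AllowedClasses',
    'AutoFormat.RemoveEmpty.RemoveNbsp',
    'Attr.ForbiddenClasses',
));
print_r($groups);
```

If the input were not sorted, the same namespace could be opened more than once, which is presumably why the original comment calls the approach a kludge.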

View File

@@ -13,26 +13,11 @@ class HTMLPurifier_ConfigSchema_Interchange
*/ */
public $name; public $name;
/**
* Array of Namespace ID => array(namespace info)
*/
public $namespaces = array();
/** /**
* Array of Directive ID => array(directive info) * Array of Directive ID => array(directive info)
*/ */
public $directives = array(); public $directives = array();
/**
* Adds a namespace array to $namespaces
*/
public function addNamespace($namespace) {
if (isset($this->namespaces[$i = $namespace->namespace])) {
throw new HTMLPurifier_ConfigSchema_Exception("Cannot redefine namespace '$i'");
}
$this->namespaces[$i] = $namespace;
}
/** /**
* Adds a directive array to $directives * Adds a directive array to $directives
*/ */

View File

@@ -6,11 +6,10 @@
class HTMLPurifier_ConfigSchema_Interchange_Id class HTMLPurifier_ConfigSchema_Interchange_Id
{ {
public $namespace, $directive; public $key;
public function __construct($namespace, $directive) { public function __construct($key) {
$this->namespace = $namespace; $this->key = $key;
$this->directive = $directive;
} }
/** /**
@@ -18,12 +17,19 @@ class HTMLPurifier_ConfigSchema_Interchange_Id
* cause problems for PHP 5.0 support. * cause problems for PHP 5.0 support.
*/ */
public function toString() { public function toString() {
return $this->namespace . '.' . $this->directive; return $this->key;
}
public function getRootNamespace() {
return substr($this->key, 0, strpos($this->key, "."));
}
public function getDirective() {
return substr($this->key, strpos($this->key, ".") + 1);
} }
public static function make($id) { public static function make($id) {
list($namespace, $directive) = explode('.', $id); return new HTMLPurifier_ConfigSchema_Interchange_Id($id);
return new HTMLPurifier_ConfigSchema_Interchange_Id($namespace, $directive);
} }
} }
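
Because `Interchange_Id` now stores one key string, `getRootNamespace()` and `getDirective()` split on the first dot only, so everything after it counts as the directive name. The sketch below copies the two expressions into plain functions to show the effect on a nested key. The function names are assumptions for illustration; the class itself is unchanged.

```php
<?php
// Standalone copies of the two accessor expressions from the diff,
// written as free functions so they can be run without HTML Purifier.
function rootNamespace($key) {
    return substr($key, 0, strpos($key, '.'));
}

function directivePart($key) {
    return substr($key, strpos($key, '.') + 1);
}

$key = 'AutoFormat.RemoveEmpty.RemoveNbsp';
$root      = rootNamespace($key);   // segment before the first dot
$directive = directivePart($key);   // everything after the first dot
echo $root, "\n", $directive, "\n";
```

Both functions assume the key contains at least one dot. A dot-less key is not split meaningfully, which matches the validator's note that keys should probably be checked for at least one namespace.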

View File

@@ -1,21 +0,0 @@
<?php
/**
* Interchange component class describing namespaces.
*/
class HTMLPurifier_ConfigSchema_Interchange_Namespace
{
/**
* Name of namespace defined.
*/
public $namespace;
/**
* HTML description.
*/
public $description;
}
// vim: et sw=4 sts=4

View File

@@ -13,13 +13,17 @@ class HTMLPurifier_ConfigSchema_InterchangeBuilder
} }
public static function buildFromDirectory($dir = null) { public static function buildFromDirectory($dir = null) {
$parser = new HTMLPurifier_StringHashParser();
$builder = new HTMLPurifier_ConfigSchema_InterchangeBuilder(); $builder = new HTMLPurifier_ConfigSchema_InterchangeBuilder();
$interchange = new HTMLPurifier_ConfigSchema_Interchange(); $interchange = new HTMLPurifier_ConfigSchema_Interchange();
return $builder->buildDir($interchange, $dir);
}
if (!$dir) $dir = HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema/'; public function buildDir($interchange, $dir = null) {
$info = parse_ini_file($dir . 'info.ini'); if (!$dir) $dir = HTMLPURIFIER_PREFIX . '/HTMLPurifier/ConfigSchema/schema';
$interchange->name = $info['name']; if (file_exists($dir . '/info.ini')) {
$info = parse_ini_file($dir . '/info.ini');
$interchange->name = $info['name'];
}
$files = array(); $files = array();
$dh = opendir($dir); $dh = opendir($dir);
@@ -33,15 +37,20 @@ class HTMLPurifier_ConfigSchema_InterchangeBuilder
sort($files); sort($files);
foreach ($files as $file) { foreach ($files as $file) {
$builder->build( $this->buildFile($interchange, $dir . '/' . $file);
$interchange,
new HTMLPurifier_StringHash( $parser->parseFile($dir . $file) )
);
} }
return $interchange; return $interchange;
} }
public function buildFile($interchange, $file) {
$parser = new HTMLPurifier_StringHashParser();
$this->build(
$interchange,
new HTMLPurifier_StringHash( $parser->parseFile($file) )
);
}
/** /**
* Builds an interchange object based on a hash. * Builds an interchange object based on a hash.
* @param $interchange HTMLPurifier_ConfigSchema_Interchange object to build * @param $interchange HTMLPurifier_ConfigSchema_Interchange object to build
@@ -55,22 +64,17 @@ class HTMLPurifier_ConfigSchema_InterchangeBuilder
throw new HTMLPurifier_ConfigSchema_Exception('Hash does not have any ID'); throw new HTMLPurifier_ConfigSchema_Exception('Hash does not have any ID');
} }
if (strpos($hash['ID'], '.') === false) { if (strpos($hash['ID'], '.') === false) {
$this->buildNamespace($interchange, $hash); if (count($hash) == 2 && isset($hash['DESCRIPTION'])) {
$hash->offsetGet('DESCRIPTION'); // prevent complaining
} else {
throw new HTMLPurifier_ConfigSchema_Exception('All directives must have a namespace');
}
} else { } else {
$this->buildDirective($interchange, $hash); $this->buildDirective($interchange, $hash);
} }
$this->_findUnused($hash); $this->_findUnused($hash);
} }
public function buildNamespace($interchange, $hash) {
$namespace = new HTMLPurifier_ConfigSchema_Interchange_Namespace();
$namespace->namespace = $hash->offsetGet('ID');
if (isset($hash['DESCRIPTION'])) {
$namespace->description = $hash->offsetGet('DESCRIPTION');
}
$interchange->addNamespace($namespace);
}
public function buildDirective($interchange, $hash) { public function buildDirective($interchange, $hash) {
$directive = new HTMLPurifier_ConfigSchema_Interchange_Directive(); $directive = new HTMLPurifier_ConfigSchema_Interchange_Directive();

View File

@@ -39,10 +39,6 @@ class HTMLPurifier_ConfigSchema_Validator
$this->aliases = array(); $this->aliases = array();
// PHP is a bit lax with integer <=> string conversions in // PHP is a bit lax with integer <=> string conversions in
// arrays, so we don't use the identical !== comparison // arrays, so we don't use the identical !== comparison
foreach ($interchange->namespaces as $i => $namespace) {
if ($i != $namespace->namespace) $this->error(false, "Integrity violation: key '$i' does not match internal id '{$namespace->namespace}'");
$this->validateNamespace($namespace);
}
foreach ($interchange->directives as $i => $directive) { foreach ($interchange->directives as $i => $directive) {
$id = $directive->id->toString(); $id = $directive->id->toString();
if ($i != $id) $this->error(false, "Integrity violation: key '$i' does not match internal id '$id'"); if ($i != $id) $this->error(false, "Integrity violation: key '$i' does not match internal id '$id'");
@@ -51,20 +47,6 @@ class HTMLPurifier_ConfigSchema_Validator
return true; return true;
} }
/**
* Validates a HTMLPurifier_ConfigSchema_Interchange_Namespace object.
*/
public function validateNamespace($n) {
$this->context[] = "namespace '{$n->namespace}'";
$this->with($n, 'namespace')
->assertNotEmpty()
->assertAlnum(); // implicit assertIsString handled by InterchangeBuilder
$this->with($n, 'description')
->assertNotEmpty()
->assertIsString(); // handled by InterchangeBuilder
array_pop($this->context);
}
/** /**
* Validates a HTMLPurifier_ConfigSchema_Interchange_Id object. * Validates a HTMLPurifier_ConfigSchema_Interchange_Id object.
*/ */
@@ -75,12 +57,11 @@ class HTMLPurifier_ConfigSchema_Validator
// handled by InterchangeBuilder // handled by InterchangeBuilder
$this->error(false, 'is not an instance of HTMLPurifier_ConfigSchema_Interchange_Id'); $this->error(false, 'is not an instance of HTMLPurifier_ConfigSchema_Interchange_Id');
} }
if (!isset($this->interchange->namespaces[$id->namespace])) { // keys are now unconstrained (we might want to narrow down to A-Za-z0-9.)
$this->error('namespace', 'does not exist'); // assumes that the namespace was validated already // we probably should check that it has at least one namespace
} $this->with($id, 'key')
$this->with($id, 'directive')
->assertNotEmpty() ->assertNotEmpty()
->assertAlnum(); // implicit assertIsString handled by InterchangeBuilder ->assertIsString(); // implicit assertIsString handled by InterchangeBuilder
array_pop($this->context); array_pop($this->context);
} }

View File

@@ -0,0 +1,8 @@
Attr.AllowedClasses
TYPE: lookup/null
VERSION: 4.0.0
DEFAULT: null
--DESCRIPTION--
List of allowed class values in the class attribute. By default, this is null,
which means all classes are allowed.
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,19 @@
Attr.ClassUseCDATA
TYPE: bool/null
DEFAULT: null
VERSION: 4.0.0
--DESCRIPTION--
If null, class will auto-detect the doctype and, if matching XHTML 1.1 or
XHTML 2.0, will use the restrictive NMTOKENS specification of class. Otherwise,
it will use a relaxed CDATA definition. If true, the relaxed CDATA definition
is forced; if false, the NMTOKENS definition is forced. To get behavior
of HTML Purifier prior to 4.0.0, set this directive to false.
Some rationale behind the auto-detection:
in previous versions of HTML Purifier, it was assumed that the form of
class was NMTOKENS, as specified by the XHTML Modularization (representing
XHTML 1.1 and XHTML 2.0). The DTDs for HTML 4.01 and XHTML 1.0, however,
specify class as CDATA. HTML 5 effectively defines it as CDATA, but
with the additional constraint that each name should be unique (this is not
explicitly outlined in previous specifications).
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,8 @@
Attr.ForbiddenClasses
TYPE: lookup
VERSION: 4.0.0
DEFAULT: array()
--DESCRIPTION--
List of forbidden class values in the class attribute. By default, this is
empty, which means that no classes are forbidden. See also %Attr.AllowedClasses.
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
Attr
DESCRIPTION: Features regarding attribute validation.
--# vim: et sw=4 sts=4

View File

@@ -1,9 +1,9 @@
AutoFormatParam.PurifierLinkifyDocURL AutoFormat.PurifierLinkify.DocURL
TYPE: string TYPE: string
VERSION: 2.0.1 VERSION: 2.0.1
DEFAULT: '#%s' DEFAULT: '#%s'
ALIASES: AutoFormatParam.PurifierLinkifyDocURL
--DESCRIPTION-- --DESCRIPTION--
<p> <p>
Location of configuration documentation to link to, let %s substitute Location of configuration documentation to link to, let %s substitute
into the configuration's namespace and directive names sans the percent into the configuration's namespace and directive names sans the percent

View File

@@ -0,0 +1,11 @@
AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions
TYPE: lookup
VERSION: 4.0.0
DEFAULT: array('td' => true, 'th' => true)
--DESCRIPTION--
<p>
When %AutoFormat.RemoveEmpty and %AutoFormat.RemoveEmpty.RemoveNbsp
are enabled, this directive defines what HTML elements should not be
removed if they have only a non-breaking space in them.
</p>
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,15 @@
AutoFormat.RemoveEmpty.RemoveNbsp
TYPE: bool
VERSION: 4.0.0
DEFAULT: false
--DESCRIPTION--
<p>
When enabled, HTML Purifier will treat any elements that contain only
non-breaking spaces as well as regular whitespace as empty, and remove
them when %AutoFormat.RemoveEmpty is enabled.
</p>
<p>
See %AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions for a list of elements
that don't have this behavior applied to them.
</p>
--# vim: et sw=4 sts=4

View File

@@ -31,7 +31,8 @@ DEFAULT: false
</p> </p>
<p> <p>
Elements that contain only whitespace will be treated as empty. Non-breaking Elements that contain only whitespace will be treated as empty. Non-breaking
spaces, however, do not count as whitespace. spaces, however, do not count as whitespace. See
%AutoFormat.RemoveEmpty.RemoveNbsp for alternate behavior.
</p> </p>
<p> <p>
This algorithm is not perfect; you may still notice some empty tags, This algorithm is not perfect; you may still notice some empty tags,
@@ -39,7 +40,7 @@ DEFAULT: false
because they were not permitted in that context, or tags that, after because they were not permitted in that context, or tags that, after
being auto-closed by another tag, were empty. This is for safety reasons being auto-closed by another tag, were empty. This is for safety reasons
to prevent clever code from breaking validation. The general rule of thumb: to prevent clever code from breaking validation. The general rule of thumb:
if a tag looked empty on the way end, it will get removed; if HTML Purifier if a tag looked empty on the way in, it will get removed; if HTML Purifier
made it empty, it will stay. made it empty, it will stay.
</p> </p>
--# vim: et sw=4 sts=4 --# vim: et sw=4 sts=4

View File

@@ -0,0 +1,11 @@
AutoFormat.RemoveSpansWithoutAttributes
TYPE: bool
VERSION: 4.0.1
DEFAULT: false
--DESCRIPTION--
<p>
This directive causes <code>span</code> tags without any attributes
to be removed. It will also remove spans that had all attributes
removed during processing.
</p>
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
AutoFormat
DESCRIPTION: Configuration for activating auto-formatting functionality (also known as <code>Injector</code>s)
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
AutoFormatParam
DESCRIPTION: Configuration for customizing auto-formatting functionality
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
CSS
DESCRIPTION: Configuration regarding allowed CSS.
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
Cache
DESCRIPTION: Configuration for DefinitionCache and related subclasses.
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
Core
DESCRIPTION: Core features that are always available.
--# vim: et sw=4 sts=4

View File

@@ -1,8 +1,8 @@
FilterParam.ExtractStyleBlocksEscaping Filter.ExtractStyleBlocks.Escaping
TYPE: bool TYPE: bool
VERSION: 3.0.0 VERSION: 3.0.0
DEFAULT: true DEFAULT: true
ALIASES: Filter.ExtractStyleBlocksEscaping ALIASES: Filter.ExtractStyleBlocksEscaping, FilterParam.ExtractStyleBlocksEscaping
--DESCRIPTION-- --DESCRIPTION--
<p> <p>

View File

@@ -1,8 +1,8 @@
FilterParam.ExtractStyleBlocksScope Filter.ExtractStyleBlocks.Scope
TYPE: string/null TYPE: string/null
VERSION: 3.0.0 VERSION: 3.0.0
DEFAULT: NULL DEFAULT: NULL
ALIASES: Filter.ExtractStyleBlocksScope ALIASES: Filter.ExtractStyleBlocksScope, FilterParam.ExtractStyleBlocksScope
--DESCRIPTION-- --DESCRIPTION--
<p> <p>

View File

@@ -1,7 +1,8 @@
FilterParam.ExtractStyleBlocksTidyImpl Filter.ExtractStyleBlocks.TidyImpl
TYPE: mixed/null TYPE: mixed/null
VERSION: 3.1.0 VERSION: 3.1.0
DEFAULT: NULL DEFAULT: NULL
ALIASES: FilterParam.ExtractStyleBlocksTidyImpl
--DESCRIPTION-- --DESCRIPTION--
<p> <p>
If left NULL, HTML Purifier will attempt to instantiate a <code>csstidy</code> If left NULL, HTML Purifier will attempt to instantiate a <code>csstidy</code>

View File

@@ -1,3 +0,0 @@
Filter
DESCRIPTION: Directives for turning filters on and off, or specifying custom filters.
--# vim: et sw=4 sts=4

View File

@@ -1,3 +0,0 @@
FilterParam
DESCRIPTION: Configuration for filters.
--# vim: et sw=4 sts=4

View File

@@ -0,0 +1,11 @@
HTML.Attr.Name.UseCDATA
TYPE: bool
DEFAULT: false
VERSION: 4.0.0
--DESCRIPTION--
The W3C specification DTD defines the name attribute to be CDATA, not ID, due
to limitations of DTD. In certain documents, this relaxed behavior is desired,
whether it is to specify duplicate names, or to specify names that would be
illegal IDs (for example, names that begin with a digit.) Set this configuration
directive to true to use the relaxed parsing rules.
--# vim: et sw=4 sts=4

View File

@@ -7,8 +7,7 @@ DEFAULT: false
 Whether or not to permit embed tags in documents, with a number of extra
 security features added to prevent script execution. This is similar to
 what websites like MySpace do to embed tags. Embed is a proprietary
-element and will cause your website to stop validating. You probably want
-to enable this with %HTML.SafeObject.
-<strong>Highly experimental.</strong>
-</p>
+element and will cause your website to stop validating; you should
+see if you can use %Output.FlashCompat with %HTML.SafeObject instead
+first.</p>
 --# vim: et sw=4 sts=4

@@ -6,9 +6,8 @@ DEFAULT: false
 <p>
 Whether or not to permit object tags in documents, with a number of extra
 security features added to prevent script execution. This is similar to
-what websites like MySpace do to object tags. You may also want to
-enable %HTML.SafeEmbed for maximum interoperability with Internet Explorer,
-although embed tags will cause your website to stop validating.
-<strong>Highly experimental.</strong>
+what websites like MySpace do to object tags. You should also enable
+%Output.FlashCompat in order to generate Internet Explorer
+compatibility code for your object tags.
 </p>
 --# vim: et sw=4 sts=4

@@ -1,3 +0,0 @@
HTML
DESCRIPTION: Configuration regarding allowed HTML.
--# vim: et sw=4 sts=4

@@ -0,0 +1,11 @@
Output.FlashCompat
TYPE: bool
VERSION: 4.1.0
DEFAULT: false
--DESCRIPTION--
<p>
If true, HTML Purifier will generate Internet Explorer compatibility
code for all object code. This is highly recommended if you enable
%HTML.SafeObject.
</p>
--# vim: et sw=4 sts=4
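
The pairing recommended in the description above could be configured roughly as follows. This is a minimal sketch, assuming HTML Purifier 4.1.0 or later (the version that introduces %Output.FlashCompat) is on the include path; the object markup and example.com URL are placeholders chosen for illustration.

```php
<?php
// Assumption: HTML Purifier (4.1.0+) is installed and on the include path.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Permit object tags with the extra safety restrictions applied.
$config->set('HTML.SafeObject', true);

// Ask the generator to emit Internet Explorer compatibility code
// for the object tags that survive purification.
$config->set('Output.FlashCompat', true);

$purifier = new HTMLPurifier($config);

// Hypothetical input: a Flash object pointing at a placeholder URL.
$dirty = '<object type="application/x-shockwave-flash" '
       . 'data="http://example.com/movie.swf" width="425" height="350">'
       . '<param name="movie" value="http://example.com/movie.swf" />'
       . '</object>';
echo $purifier->purify($dirty);
```

Keeping both directives in one configuration object follows the description's advice to try this combination before reaching for %HTML.SafeEmbed.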

@@ -1,3 +0,0 @@
Output
DESCRIPTION: Configuration relating to the generation of (X)HTML.
--# vim: et sw=4 sts=4

@@ -1,3 +0,0 @@
Test
DESCRIPTION: Developer testing configuration for our unit tests.
--# vim: et sw=4 sts=4

Some files were not shown because too many files have changed in this diff