mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-08-04 05:07:55 +02:00
Compare commits: v2.1.1-str...v2.1.3-str (2 commits)
SHA1: 9db861e356, b3f0e6c86c
237
INSTALL
@@ -1,34 +1,81 @@
|
||||
|
||||
Install
|
||||
How to install HTML Purifier
|
||||
|
||||
HTML Purifier is designed to run out of the box, so actually using the library
|
||||
is extremely easy. (Although, if you were looking for a step-by-step
|
||||
installation GUI, you've come to the wrong place!) The impatient can scroll
|
||||
down to the bottom of this INSTALL document to see the code, but you really
|
||||
should make sure a few things are properly done.
|
||||
|
||||
|
||||
HTML Purifier is designed to run out of the box, so actually using the
|
||||
library is extremely easy. (Although... if you were looking for a
|
||||
step-by-step installation GUI, you've downloaded the wrong software!)
|
||||
|
||||
While the impatient can get going immediately with some of the sample
|
||||
code at the bottom of this library, it's well worth performing some
|
||||
basic sanity checks to get the most out of this library.
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
1. Compatibility
|
||||
|
||||
HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has no
|
||||
core dependencies with other libraries.
|
||||
HTML Purifier works in both PHP 4 and PHP 5, and is actively tested from
|
||||
PHP 4.3.7 and up (see tests/multitest.php for specific versions). It has
|
||||
no core dependencies with other libraries. PHP 4 support will be
|
||||
deprecated on December 31, 2007, at which time only essential security
|
||||
fixes will be issued for the PHP 4 version until August 8, 2008.
|
||||
|
||||
Optional extensions are iconv (usually installed) and tidy (also common).
|
||||
If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
|
||||
not having either of these extensions.
|
||||
These optional extensions can enhance the capabilities of HTML Purifier:
|
||||
|
||||
* iconv : Converts text to and from non-UTF-8 encodings
|
||||
* tidy : Used for pretty-printing HTML
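
A quick way to see which of these optional extensions a given PHP build provides is sketched below. It relies only on the core function extension_loaded(); the $optional and $report variable names and the printed messages are illustrative and not part of HTML Purifier.

```php
<?php
// Report which optional extensions are available in this PHP build.
// extension_loaded() is a core PHP function; the messages are illustrative only.
$optional = array(
    'iconv' => 'converting text to and from non-UTF-8 encodings',
    'tidy'  => 'pretty-printing HTML',
);

$report = array();
foreach ($optional as $name => $purpose) {
    $report[$name] = extension_loaded($name);
    echo $name . ': ' . ($report[$name] ? 'available' : 'missing')
        . ' (' . $purpose . ")\n";
}
```

A missing extension is not an error here: per the text above, both are optional enhancements.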
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
2. Reconnaissance
|
||||
|
||||
2. Including the library
|
||||
A big plus of HTML Purifier is its inerrant support of standards, so
|
||||
your web-pages should be standards-compliant. (They should also use
|
||||
semantic markup, but that's another issue altogether, one HTML Purifier
|
||||
cannot fix without reading your mind.)
|
||||
|
||||
Simply use:
|
||||
HTML Purifier can process these doctypes:
|
||||
|
||||
* XHTML 1.0 Transitional (default)
|
||||
* XHTML 1.0 Strict
|
||||
* HTML 4.01 Transitional
|
||||
* HTML 4.01 Strict
|
||||
* XHTML 1.1
|
||||
|
||||
...and these character encodings:
|
||||
|
||||
* UTF-8 (default)
|
||||
* Any encoding iconv supports (with crippled internationalization support)
|
||||
|
||||
These defaults reflect what my choices would be if I were authoring an
|
||||
HTML document; however, what you choose depends on the nature of your
|
||||
codebase. If you don't know what doctype you are using, you can determine
|
||||
the doctype from this identifier at the top of your source code:
|
||||
|
||||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
||||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
|
||||
...and the character encoding from this code:
|
||||
|
||||
<meta http-equiv="Content-type" content="text/html;charset=ENCODING">
|
||||
|
||||
If the character encoding declaration is missing, STOP NOW, and
|
||||
read 'docs/enduser-utf8.html' (web accessible at
|
||||
http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is
|
||||
present, read this document anyway, as most websites specify character
|
||||
encoding incorrectly.
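
As a rough aid for this reconnaissance step, the declared encoding can be pulled out of a page with a regular expression. This is a minimal sketch, not part of HTML Purifier: find_meta_charset() is a hypothetical helper, and it only reports what the META tag claims, which (as noted above) may not be the encoding the page really uses.

```php
<?php
// Hypothetical helper: pull the charset out of a Content-Type META tag.
// Returns the declared encoding, or null when no declaration is found.
function find_meta_charset($html)
{
    $pattern = '/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)/i';
    if (preg_match($pattern, $html, $matches)) {
        return $matches[1];
    }
    return null;
}

$page = '<meta http-equiv="Content-type" content="text/html;charset=ISO-8859-1">';
$charset = find_meta_charset($page);
```

A null result corresponds to the "declaration is missing" case, where docs/enduser-utf8.html should be read before going further.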
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
3. Including the library
|
||||
|
||||
The procedure is quite simple:
|
||||
|
||||
require_once '/path/to/library/HTMLPurifier.auto.php';
|
||||
|
||||
...and you're good to go. Since HTML Purifier's codebase is fairly
|
||||
large, I recommend only including HTML Purifier when you need it.
|
||||
I recommend only including HTML Purifier when you need it, because that
|
||||
call represents the inclusion of a lot of PHP files which constitute
|
||||
the bulk of HTML Purifier's memory usage.
|
||||
|
||||
If you don't like your include_path to be fiddled around with, simply set
|
||||
HTML Purifier's library/ directory to the include path yourself and then:
|
||||
@@ -39,46 +86,7 @@ Only the contents in the library/ folder are necessary, so you can remove
|
||||
everything else when using HTML Purifier in a production environment.
|
||||
|
||||
|
||||
|
||||
3. Preparing the proper output environment
|
||||
|
||||
HTML Purifier is all about web-standards, so accordingly your webpages should
|
||||
be standards compliant. HTML Purifier can deal with these doctypes:
|
||||
|
||||
* XHTML 1.0 Transitional (default)
|
||||
* XHTML 1.0 Strict
|
||||
* HTML 4.01 Transitional
|
||||
* HTML 4.01 Strict
|
||||
* XHTML 1.1 (sans Ruby)
|
||||
|
||||
...and these character encodings:
|
||||
|
||||
* UTF-8 (default)
|
||||
* Any encoding iconv supports (support is crippled for i18n though)
|
||||
|
||||
The defaults are there for a reason: they are best-practice choices that
|
||||
should not be changed lightly. For those of you in the dark, you can determine
|
||||
the doctype from this code in your HTML documents:
|
||||
|
||||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
||||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||
|
||||
...and the character encoding from this code:
|
||||
|
||||
<meta http-equiv="Content-type" content="text/html;charset=ENCODING">
|
||||
|
||||
For legacy codebases these declarations may be missing. If that is the case,
|
||||
STOP, and read docs/enduser-utf8.html
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
You may currently be vulnerable to XSS and other security threats, and HTML
|
||||
Purifier won't be able to fix that.
|
||||
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
4. Configuration
|
||||
|
||||
HTML Purifier is designed to run out-of-the-box, but occasionally HTML
|
||||
@@ -95,7 +103,6 @@ object and read on:
|
||||
$config = HTMLPurifier_Config::createDefault();
|
||||
|
||||
|
||||
|
||||
4.1. Setting a different character encoding
|
||||
|
||||
You really shouldn't use any other encoding except UTF-8, especially if you
|
||||
@@ -122,10 +129,6 @@ but please be cognizant of the issues the "solution" creates (for this
|
||||
reason, I do not include the solution in this document).
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
4.2. Setting a different doctype
|
||||
|
||||
For those of you using HTML 4.01 Transitional, you can disable
|
||||
@@ -135,7 +138,6 @@ XHTML output like this:
|
||||
|
||||
Other supported doctypes include:
|
||||
|
||||
|
||||
* HTML 4.01 Strict
|
||||
* HTML 4.01 Transitional
|
||||
* XHTML 1.0 Strict
|
||||
@@ -143,7 +145,6 @@ Other supported doctypes include:
|
||||
* XHTML 1.1
|
||||
|
||||
|
||||
|
||||
4.3. Other settings
|
||||
|
||||
There are more configuration directives which can be read about
|
||||
@@ -153,55 +154,24 @@ your code. Some of the more interesting ones are configurable at the
|
||||
demo <http://htmlpurifier.org/demo.php> and are well worth looking into
|
||||
for your own system.
|
||||
|
||||
For example, you can fine tune allowed elements and attributes, convert
|
||||
relative URLs to absolute ones, and even autoparagraph input text! These
|
||||
are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
|
||||
%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
|
||||
translates to:
|
||||
|
||||
$config->set('Namespace', 'Directive', $value);
|
||||
|
||||
E.g.
|
||||
|
||||
$config->set('HTML', 'Allowed', 'p,b,a[href],i');
|
||||
$config->set('URI', 'Base', 'http://www.example.com');
|
||||
$config->set('URI', 'MakeAbsolute', true);
|
||||
$config->set('AutoFormat', 'AutoParagraph', true);
|
||||
|
||||
|
||||
5. Using the code
|
||||
|
||||
The interface is mind-numbingly simple:
|
||||
|
||||
$purifier = new HTMLPurifier();
|
||||
$clean_html = $purifier->purify( $dirty_html );
|
||||
|
||||
...or, if you're using the configuration object:
|
||||
|
||||
$purifier = new HTMLPurifier($config);
|
||||
$clean_html = $purifier->purify( $dirty_html );
|
||||
|
||||
That's it! For more examples, check out docs/examples/ (they aren't very
|
||||
different though). Also, docs/enduser-slow.html gives advice on what to
|
||||
do if HTML Purifier is slowing down your application.
|
||||
|
||||
|
||||
|
||||
6. Quick install
|
||||
|
||||
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
|
||||
writable by the webserver (see Section 7: Caching below for details).
|
||||
If your website is in UTF-8 and XHTML Transitional, use this code:
|
||||
|
||||
<?php
|
||||
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
||||
|
||||
$purifier = new HTMLPurifier();
|
||||
$clean_html = $purifier->purify($dirty_html);
|
||||
?>
|
||||
|
||||
If your website is in a different encoding or doctype, use this code:
|
||||
|
||||
<?php
|
||||
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
||||
|
||||
$config = HTMLPurifier_Config::createDefault();
|
||||
$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
|
||||
$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
|
||||
$purifier = new HTMLPurifier($config);
|
||||
|
||||
$clean_html = $purifier->purify($dirty_html);
|
||||
?>
|
||||
|
||||
|
||||
|
||||
7. Caching
|
||||
---------------------------------------------------------------------------
|
||||
5. Caching
|
||||
|
||||
HTML Purifier generates some cache files (generally one or two) to speed up
|
||||
its execution. For maximum performance, make sure that
|
||||
@@ -236,3 +206,50 @@ hit):
|
||||
Or move the cache directory somewhere else (no trailing slash):
|
||||
|
||||
$config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
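
Before relying on the cache, it can help to confirm that the directory exists and is writable. The sketch below uses only the core functions is_dir() and is_writable(); the path is a placeholder to adjust, and the check reflects the user running the script, which may differ from the webserver user.

```php
<?php
// Assumed location: adjust to wherever the library is installed.
$cache_dir = '/path/to/htmlpurifier/library/HTMLPurifier/DefinitionCache/Serializer';

// is_dir() and is_writable() are core PHP functions.
$cache_ok = is_dir($cache_dir) && is_writable($cache_dir);
if (!$cache_ok) {
    echo "Cache directory is missing or not writable: $cache_dir\n";
}
```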
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
6. Using the code

The interface is mind-numbingly simple:

    $purifier = new HTMLPurifier();
    $clean_html = $purifier->purify( $dirty_html );

...or, if you're using the configuration object:

    $purifier = new HTMLPurifier($config);
    $clean_html = $purifier->purify( $dirty_html );

That's it! For more examples, check out docs/examples/ (they aren't very
different though). Also, docs/enduser-slow.html gives advice on what to
do if HTML Purifier is slowing down your application.
|
||||
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
7. Quick install

First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
writable by the webserver (see Section 5: Caching above for details).
If your website is in UTF-8 and XHTML Transitional, use this code:

<?php
    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';

    $purifier = new HTMLPurifier();
    $clean_html = $purifier->purify($dirty_html);
?>

If your website is in a different encoding or doctype, use this code:

<?php
    require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';

    $config = HTMLPurifier_Config::createDefault();
    $config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
    $config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
    $purifier = new HTMLPurifier($config);

    $clean_html = $purifier->purify($dirty_html);
?>
|
||||
|
||||
|
77
NEWS
@@ -9,6 +9,83 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
|
||||
. Internal change
|
||||
==========================
|
||||
|
||||
2.1.3, released 2007-11-05
|
||||
! tests/multitest.php allows you to test multiple versions by running
|
||||
tests/index.php through multiple interpreters using `phpv` shell
|
||||
script (you must provide this script!)
|
||||
- Fixed poor include ordering for Email URI AttrDefs, causes fatal errors
|
||||
on some systems.
|
||||
- Injector algorithm further refined: off-by-one error regarding skip
|
||||
counts for dormant injectors fixed
|
||||
- Corrective blockquote definition now enabled for HTML 4.01 Strict
|
||||
- Fatal error when <img> tag (or any other element with required attributes)
|
||||
has 'id' attribute fixed, thanks NykO18 for reporting
|
||||
- Fix warning emitted when a non-supported URI scheme is passed to the
|
||||
MakeAbsolute URIFilter, thanks NykO18 (again)
|
||||
- Further refine AutoParagraph injector. Behavior inside of elements
|
||||
allowing paragraph tags clarified: only inline content delimited by
|
||||
double newlines (not block elements) are paragraphed.
|
||||
- Buggy treatment of end tags of elements that have required attributes
|
||||
fixed (does not manifest on default tag-set)
|
||||
- Spurious internal content reorganization error suppressed
|
||||
- HTMLDefinition->addElement now returns a reference to the created
|
||||
element object, as implied by the documentation
|
||||
- Phorum mod's HTML Purifier help message expanded (unreleased elsewhere)
|
||||
- Fix a theoretical class of infinite loops from DirectLex reported
|
||||
by Nate Abele
|
||||
- Work around unnecessary DOMElement type-cast in PH5P that caused errors
|
||||
in PHP 5.1
|
||||
- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only
|
||||
HTMLDefinition errors, this may indicate problems with error-collecting
|
||||
facilities in PHP 5
|
||||
- Make ErrorCollectorEMock work in both PHP 4 and PHP 5
|
||||
- Make PH5P work with PHP 5.0 by removing unnecessary array parameter typedef
|
||||
. %Core.AcceptFullDocuments renamed to %Core.ConvertDocumentToFragment
|
||||
to better communicate its purpose
|
||||
. Error unit tests can now specify the expectation of no errors. Future
|
||||
iterations of the harness will be extremely strict about what errors
|
||||
are allowed
|
||||
. Extend Injector hooks to allow for more powerful injector routines
|
||||
. HTMLDefinition->addBlankElement created, as according to the HTMLModule
|
||||
method
|
||||
. Doxygen configuration file updated, with minor improvements
|
||||
. Test runner now checks for similarly named files in conf/ directory too.
|
||||
. Minor cosmetic change to flush-definition-cache.php: trailing newline is
|
||||
outputted
|
||||
. Maintenance script for generating PH5P patch added, original PH5P source
|
||||
file also added under version control
|
||||
. Full unit test runner script title made more descriptive with PHP version
|
||||
. Updated INSTALL file to state that 4.3.7 is the earliest version we
|
||||
are actively testing
|
||||
|
||||
2.1.2, released 2007-09-03
|
||||
! Implemented Object module for trusted users
|
||||
! Implemented experimental HTML5 parsing mode using PH5P. To use, add
|
||||
this to your code:
|
||||
require_once 'HTMLPurifier/Lexer/PH5P.php';
|
||||
$config->set('Core', 'LexerImpl', 'PH5P');
|
||||
Note that this Lexer introduces some classes not in the HTMLPurifier
|
||||
namespace. Also, this is PHP5 only.
|
||||
! CSS property border-spacing implemented
|
||||
- Fix non-visible parsing error in DirectLex with empty tags that have
|
||||
slashes inside attribute values.
|
||||
- Fix typo in CSS definition: border-collapse:seperate; was incorrectly
|
||||
accepted as valid CSS. Usually non-visible, because this styling is the
|
||||
default for tables in most browsers. Thanks Brett Zamir for pointing
|
||||
this out.
|
||||
- Fix validation errors in configuration form
|
||||
- Hammer out a bunch of edge-case bugs in the standalone distribution
|
||||
- Inclusion reflection removed from URISchemeRegistry; you must manually
|
||||
include any new schema files you wish to use
|
||||
- Numerous typo fixes in documentation thanks to Brett Zamir
|
||||
. Unit test refactoring for one logical test per test function
|
||||
. Config and context parameters in ComplexHarness deprecated: instead, edit
|
||||
the $config and $context member variables
|
||||
. HTML wrapper in DOMLex now takes DTD identifiers into account; doesn't
|
||||
really make a difference, but is good for completeness sake
|
||||
. merge-library.php script refactored for greater code reusability and
|
||||
PHP4 compatibility
|
||||
|
||||
2.1.1, released 2007-08-04
|
||||
- Fix show-stopper bug in %URI.MakeAbsolute functionality
|
||||
- Fix PHP4 syntax error in standalone version
|
||||
|
15
TODO
@@ -28,23 +28,22 @@ afraid to cast your vote for the next feature to be implemented!
|
||||
- Remove empty inline tags<i></i>
|
||||
- Append something to duplicate IDs so they're still usable (impl. note: the
|
||||
dupe detector would also need to detect the suffix as well)
|
||||
- Externalize inline CSS to promote clean HTML
|
||||
|
||||
2.4 release [It's All About Trust] (floating)
|
||||
# Implement untrusted, dangerous elements/attributes
|
||||
# Implement IDREF support (harder than it seems, since you cannot have
|
||||
IDREFs to non-existent IDs)
|
||||
# Frameset XHTML 1.0 and HTML 4.01 doctypes
|
||||
|
||||
3.0 release [Beyond HTML]
|
||||
# Legit token based CSS parsing (will require revamping almost every
|
||||
AttrDef class)
|
||||
AttrDef class). Probably will use CSSTidy class
|
||||
# More control over allowed CSS properties (maybe modularize it in the
|
||||
same fashion!)
|
||||
# Formatters for plaintext
|
||||
- Smileys
|
||||
- Standardize token armor for all areas of processing
|
||||
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
|
||||
- Automatically add non-breaking spaces to empty table cells when
|
||||
empty-cells:show is applied to have compatibility with Internet Explorer
|
||||
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
|
||||
Also, enable disabling of directionality
|
||||
|
||||
@@ -63,25 +62,27 @@ Ongoing
|
||||
- Complete basic smoketests
|
||||
|
||||
Unknown release (on a scratch-an-itch basis)
|
||||
? Semi-lossy dumb alternate character encoding transfor
|
||||
# CHMOD install script for PEAR installs
|
||||
? Have 'lang' attribute be checked against official lists, achieved by
|
||||
encoding all characters that have string entity equivalents
|
||||
- Abstract ChildDef_BlockQuote to work with all elements that only
|
||||
allow blocks in them, required or optional
|
||||
- Reorganize Unit Tests
|
||||
- Refactor loop tests: Lexer
|
||||
- Reorganize configuration directives (Create more namespaces! Get messy!)
|
||||
- Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
|
||||
- Implement lenient <ruby> child validation
|
||||
- Explain how to use HTML Purifier in non-PHP languages / create
|
||||
a simple command line stub (or complicated?)
|
||||
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
|
||||
- Automatically add non-breaking spaces to empty table cells when
|
||||
empty-cells:show is applied to have compatibility with Internet Explorer
|
||||
|
||||
Requested
|
||||
|
||||
Wontfix
|
||||
- Non-lossy smart alternate character encoding transformations (unless
|
||||
patch provided)
|
||||
- Pretty-printing HTML, users can use Tidy on the output on entire page
|
||||
- Pretty-printing HTML: users can use Tidy on the output on entire page
|
||||
- Native content compression, whitespace stripping (don't rely on Tidy, make
|
||||
sure we don't remove from <pre> or related tags): use gzip if this is
|
||||
really important
|
||||
|
16
WHATSNEW
@@ -1,10 +1,6 @@
|
||||
In version 2.1, HTML Purifier's URI validation and filtering handling
|
||||
system has been revamped with a new, extensible URIFilter system. Also
|
||||
notable features include preservation of emoticons in PHP5 with
|
||||
%Core.AggressivelyFixLt, standalone and lite download versions,
|
||||
transforming relative URIs to absolute URIs, Ruby in XHTML 1.1, a Phorum
|
||||
mod, and UTF-8 font names. Notable bug-fixes include refinement of
|
||||
the auto-paragraphing algorithm (no longer experimental), better XHTML
|
||||
1.1 support and the removal of the contents of <style> elements. Version
|
||||
2.1.1 amends a few bugs in some of newly introduced features, namely
|
||||
running the standalone download version in PHP4 and %URI.MakeAbsolute.
|
||||
Stability release 2.1.3 fixes a slew of minor bugs found in HTML Purifier,
|
||||
and also includes some internal code enhancements and refactorings.
|
||||
Notably, tests/multitest.php automates testing in multiple versions,
|
||||
fatal AttrDef_URI_Email error fixed, blockquote contents are more lenient
|
||||
in HTML 4.01 Strict and fatal errors involving ID tags in img tags were
|
||||
fixed.
|
||||
|
@@ -39,7 +39,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
|
||||
<table cellspacing="0"><tbody>
|
||||
<tr><td class="impl-yes">Implemented</td></tr>
|
||||
<tr><td class="impl-partial">Partially implemented</td></tr>
|
||||
<tr><td class="impl-no">Will not implement</td></tr>
|
||||
<tr><td class="impl-no">Not priority to implement</td></tr>
|
||||
<tr><td class="danger">Dangerous attribute/property</td></tr>
|
||||
<tr><td class="css1">Present in CSS1</td></tr>
|
||||
<tr><td class="feature">Feature, requires extra work</td></tr>
|
||||
@@ -118,6 +118,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
|
||||
<tbody>
|
||||
<tr><th colspan="2">Table</th></tr>
|
||||
<tr class="impl-yes"><td>border-collapse</td><td>ENUM(collapse, separate)</td></tr>
|
||||
<tr class="impl-yes"><td>border-spacing</td><td>MULTIPLE</td></tr>
|
||||
<tr class="impl-yes"><td>caption-side</td><td>ENUM(top, bottom)</td></tr>
|
||||
<tr class="feature"><td>empty-cells</td><td>ENUM(show, hide), No IE support makes this useless,
|
||||
possible fix with &nbsp;? Unknown release milestone.</td></tr>
|
||||
|
@@ -32,7 +32,7 @@
|
||||
Before we even write any code, it is paramount to consider whether or
|
||||
not the code we're writing is necessary or not. HTML Purifier, by default,
|
||||
contains a large set of elements and attributes: large enough so that
|
||||
<em>any</em> element or attribute in XHTML 1.0 (and its HTML variant)
|
||||
<em>any</em> element or attribute in XHTML 1.0 or 1.1 (and its HTML variants)
|
||||
that can be safely used by the general public is implemented.
|
||||
</p>
|
||||
|
||||
@@ -76,11 +76,12 @@
|
||||
<h3>XHTML 1.1</h3>
|
||||
|
||||
<p>
|
||||
We have not implemented the
|
||||
As of HTMLPurifier 2.1.0, we have implemented the
|
||||
<a href="http://www.w3.org/TR/2001/REC-ruby-20010531/">Ruby module</a>,
|
||||
which defines a set of tags
|
||||
for publishing short annotations for text, used mostly in Japanese
|
||||
and Chinese school texts.
|
||||
and Chinese school texts, but applicable for positioning any text (not
|
||||
limited to translations) above or below other corresponding text.
|
||||
</p>
|
||||
|
||||
<h3>XHTML 2.0</h3>
|
||||
@@ -492,10 +493,11 @@ $def =& $config->getHTMLDefinition(true);
|
||||
<p>
|
||||
The <code>(%flow;)*</code> indicates the allowed children of the
|
||||
<code>li</code> tag: <code>li</code> allows any number of flow
|
||||
elements as its children. In HTML Purifier, we'd write it like
|
||||
<code>Flow</code> (here's where the content sets we were
|
||||
discussing earlier come into play). There are three shorthand content models you
|
||||
can specify:
|
||||
elements as its children. (The <code>- O</code> allows the closing tag to be
|
||||
omitted, though in XML this is not allowed.) In HTML Purifier,
|
||||
we'd write it like <code>Flow</code> (here's where the content sets
|
||||
we were discussing earlier come into play). There are three shorthand
|
||||
content models you can specify:
|
||||
</p>
|
||||
|
||||
<table class="table">
|
||||
@@ -668,12 +670,22 @@ $def =& $config->getHTMLDefinition(true);
|
||||
Common is a combination of the above-mentioned collections.
|
||||
</p>
|
||||
|
||||
<p class="aside">
|
||||
Readers familiar with the modularization may have noticed that the Core
|
||||
attribute collection differs from that specified by the <a
|
||||
href="http://www.w3.org/TR/xhtml-modularization/abstract_modules.html#s_commonatts">abstract
|
||||
modules of the XHTML Modularization 1.1</a>. We believe this section
|
||||
to be in error, as <code>br</code> permits the use of the <code>style</code>
|
||||
attribute even though it uses the <code>Core</code> collection, and
|
||||
the DTD and XML Schemas supplied by W3C support our interpretation.
|
||||
</p>
|
||||
|
||||
<h3>Attributes</h3>
|
||||
|
||||
<p>
|
||||
If you didn't read the <a href="#addAttribute">previous section on
|
||||
If you didn't read the <a href="#addAttribute">earlier section on
|
||||
adding attributes</a>, read it now. The last parameter is simply
|
||||
array of attribute names to attribute implementations, in the exact
|
||||
an array of attribute names to attribute implementations, in the exact
|
||||
same format as <code>addAttribute()</code>.
|
||||
</p>
|
||||
|
||||
|
@@ -58,7 +58,7 @@ appear elsewhere on the document. The method is simple:</p>
|
||||
|
||||
<pre>$config->set('HTML', 'EnableAttrID', true);
|
||||
$config->set('Attr', 'IDBlacklist', array(
|
||||
'list', 'of', 'attributes', 'that', 'are', 'forbidden'
|
||||
'list', 'of', 'attribute', 'values', 'that', 'are', 'forbidden'
|
||||
));</pre>
|
||||
|
||||
<p>That being said, there are some notable drawbacks. First of all, you have to
|
||||
@@ -71,9 +71,9 @@ to possible standards-compliance issues.</p>
|
||||
<p>Furthermore, this position becomes untenable when a single web page must hold
|
||||
multiple portions of user-submitted content. Since there's obviously no way
|
||||
to find out before-hand what IDs users will use, the blacklist is helpless.
|
||||
And even since HTML Purifier validates each segment seperately, perhaps doing
|
||||
And since HTML Purifier validates each segment separately, perhaps doing
|
||||
so at different times, it would be extremely difficult to dynamically update
|
||||
the blacklist inbetween runs.</p>
|
||||
the blacklist in between runs.</p>
|
||||
|
||||
<p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after
|
||||
all, they might have simply specified a duplicate ID by accident.</p>
|
||||
|
@@ -22,7 +22,7 @@ out:</p>
|
||||
|
||||
<p class="emphasis">This ain't HTML Tidy!</p>
|
||||
|
||||
<p>Rather, Tidy stands for a cool set of Tidy-inspired in HTML Purifier
|
||||
<p>Rather, Tidy stands for a cool set of Tidy-inspired features in HTML Purifier
|
||||
that allows users to submit deprecated elements and attributes and get
|
||||
valid strict markup back. For example:</p>
|
||||
|
||||
@@ -33,8 +33,8 @@ valid strict markup back. For example:</p>
|
||||
<pre><div style="text-align:center;">Centered</div></pre>
|
||||
|
||||
<p>...when this particular fix is run on the HTML. This tutorial will give
|
||||
you down the lowdown of what exactly HTML Purifier will do when Tidy
|
||||
is on, and how to fine tune this behavior. Once again, <strong>you do
|
||||
you the lowdown of what exactly HTML Purifier will do when Tidy
|
||||
is on, and how to fine-tune this behavior. Once again, <strong>you do
|
||||
not need Tidy installed on your PHP to use these features!</strong></p>
|
||||
|
||||
<h2>What does it do?</h2>
|
||||
@@ -221,7 +221,7 @@ general syntax:</p>
|
||||
|
||||
<p>The lowdown is, quite frankly, HTML Purifier's default settings are
|
||||
probably good enough. The next step is to bump the level up to heavy,
|
||||
and if that still doesn't satisfy your appetite, do some fine tuning.
|
||||
and if that still doesn't satisfy your appetite, do some fine-tuning.
|
||||
Other than that, don't worry about it: this all works silently and
|
||||
effectively in the background.</p>
|
||||
|
||||
|
@@ -96,7 +96,7 @@ which can be a rewarding (but difficult) task.</p>
|
||||
<h2 id="findcharset">Finding the real encoding</h2>
|
||||
|
||||
<p>In the beginning, there was ASCII, and things were simple. But they
|
||||
weren't good, for no one could write in Cryllic or Thai. So there
|
||||
weren't good, for no one could write in Cyrillic or Thai. So there
|
||||
exploded a proliferation of character encodings to remedy the problem
|
||||
by extending the characters ASCII could express. This ridiculously
|
||||
simplified version of the history of character encodings shows us that
|
||||
@@ -138,7 +138,7 @@ browser:</p>
|
||||
<dd>View > Encoding: bulleted item is unofficial name</dd>
|
||||
</dl>
|
||||
|
||||
<p>Internet Explorer won't give you the mime (i.e. useful/real) name of the
|
||||
<p>Internet Explorer won't give you the MIME (i.e. useful/real) name of the
|
||||
character encoding, so you'll have to look it up using their description.
|
||||
Some common ones:</p>
|
||||
|
||||
@@ -216,6 +216,12 @@ if your <code>META</code> tag claims that either:</p>
|
||||
|
||||
<h2 id="fixcharset">Fixing the encoding</h2>
|
||||
|
||||
<p class="aside">The advice given here is for pages being served as
|
||||
vanilla <code>text/html</code>. Different practices must be used
|
||||
for <code>application/xml</code> or <code>application/xhtml+xml</code>, see
|
||||
<a href="http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020430/">W3C's
|
||||
document on XHTML media types</a> for more information.</p>
|
||||
|
||||
<p>If your <code>META</code> encoding and your real encoding match,
|
||||
savvy! You can skip this section. If they don't...</p>
|
||||
|
||||
@@ -302,7 +308,8 @@ languages</a>. The appropriate code is:</p>
|
||||
|
||||
<p>...replacing UTF-8 with whatever your embedded encoding is.
|
||||
This code must come before any output, so be careful about
|
||||
stray whitespace in your application.</p>
|
||||
stray whitespace in your application (i.e., any whitespace before
|
||||
output excluding whitespace within <?php ?> tags).</p>
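
One way to guard against the stray-output problem described above is to check headers_sent() before calling header(). This is a sketch only, using core PHP functions; the $can_send variable and the log message are illustrative, and UTF-8 stands in for whatever your embedded encoding is.

```php
<?php
// headers_sent() reports whether output has already started.
// When it has, $file and $line are filled with where the output began.
$can_send = !headers_sent($file, $line);

if ($can_send) {
    header('Content-Type: text/html; charset=UTF-8');
} else {
    // Output began earlier, often from stray whitespace outside the PHP tags.
    error_log("Cannot set charset: output started at $file:$line");
}
```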
|
||||
|
||||
<h4 id="fixcharset-server-phpini">PHP ini directive</h4>
|
||||
|
||||
@@ -313,8 +320,8 @@ header call: <code><a href="http://php.net/ini.core#ini.default-charset">default
|
||||
|
||||
<p>...will also do the trick. If PHP is running as an Apache module (and
|
||||
not as FastCGI, consult
|
||||
<a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess do apply this property
|
||||
globally:</p>
|
||||
<a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess to apply this property
|
||||
across many PHP files:</p>
|
||||
|
||||
<pre><a href="http://php.net/configuration.changes#configuration.changes.apache">php_value</a> default_charset "UTF-8"</pre>
|
||||
|
||||
@@ -360,10 +367,11 @@ to send anything at all:</p>
|
||||
|
||||
<pre><a href="http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset">AddDefaultCharset</a> Off</pre>
|
||||
|
||||
<p>...making your <code>META</code> tags the sole source of
|
||||
character encoding information. In these cases, it is
|
||||
<em>especially</em> important to make sure you have valid <code>META</code>
|
||||
tags on your pages and all the text before them is ASCII.</p>
|
||||
<p>...making your internal charset declaration (usually the <code>META</code> tags)
|
||||
the sole source of character encoding
|
||||
information. In these cases, it is <em>especially</em> important to make
|
||||
sure you have valid <code>META</code> tags on your pages and all the
|
||||
text before them is ASCII.</p>
|
||||
|
||||
<blockquote class="aside"><p>These directives can also be
|
||||
placed in httpd.conf file for Apache, but
|
||||
@@ -428,28 +436,30 @@ IIS to change character encodings, I'd be grateful.</p>
|
||||
|
||||
<p><code>META</code> tags are the most common source of embedded
|
||||
encodings, but they can also come from somewhere else: XML
|
||||
processing instructions. They look like:</p>
|
||||
Declarations. They look like:</p>
|
||||
|
||||
<pre><?xml version="1.0" encoding="UTF-8"?></pre>
|
||||
|
||||
<p>...and are most often found in XML documents (including XHTML).</p>
|
||||
|
||||
<p>For XHTML, this processing instruction theoretically
|
||||
<p>For XHTML, this XML Declaration theoretically
|
||||
overrides the <code>META</code> tag. In reality, this happens only when the
|
||||
XHTML is actually served as legit XML and not HTML, which almost
|
||||
never happens due to Internet Explorer's lack of support for
|
||||
<code>application/xhtml+xml</code> (even though doing so is often
|
||||
argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good practice</a>).</p>
|
||||
argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good
|
||||
practice</a> and is required by the XHTML 1.1 specification).</p>
|
||||
|
||||
<p>For XML, however, this processing instruction is extremely important.
|
||||
<p>For XML, however, this XML Declaration is extremely important.
|
||||
Since most webservers are not configured to send charsets for .xml files,
|
||||
this is the only thing a parser has to go on. Furthermore, the default
|
||||
for XML files is UTF-8, which often butts heads with more common
|
||||
ISO-8859-1 encoding (you see this in garbled RSS feeds).</p>
|
||||
|
||||
<p>In short, if you use XHTML and have gone through the
|
||||
trouble of adding the XML header, make sure it jives
|
||||
with your <code>META</code> tags and HTTP headers.</p>
|
||||
trouble of adding the XML Declaration, make sure it jibes
|
||||
with your <code>META</code> tags (which should only be present
|
||||
if served in text/html) and HTTP headers.</p>
|
||||
|
||||
<h3 id="fixcharset-internals">Inside the process</h3>
|
||||
|
||||
@@ -506,7 +516,7 @@ usage in one language sometimes requires the occasional special character
|
||||
that, without surprise, is not available in your character set. Sometimes
|
||||
developers get around this by adding support for multiple encodings: when
|
||||
using Chinese, use Big5, when using Japanese, use Shift-JIS, when
|
||||
using Greek, etc. Other times, they use character entities with great
|
||||
using Greek, etc. Other times, they use character references with great
|
||||
zeal.</p>
|
||||
|
||||
<p>UTF-8, however, obviates the need for any of these complicated
|
||||
@@ -520,14 +530,14 @@ you don't have to use those user-unfriendly entities.</p>
|
||||
|
||||
<p>Websites encoded in Latin-1 (ISO-8859-1) which occasionally need
|
||||
a special character outside of their scope often will use a character
|
||||
entity to achieve the desired effect. For instance, θ can be
|
||||
entity reference to achieve the desired effect. For instance, θ can be
|
||||
written <code>&theta;</code>, regardless of the character encoding's
|
||||
support of Greek letters.</p>
|
||||
|
||||
<p>This works nicely for limited use of special characters, but
|
||||
say you wanted this sentence of Chinese text: 激光,
|
||||
這兩個字是甚麼意思.
|
||||
The entity-ized version would look like this:</p>
|
||||
The ampersand encoded version would look like this:</p>
|
||||
|
||||
<pre>&#28608;&#20809;, &#36889;&#20841;&#20491;&#23383;&#26159;&#29978;&#40636;&#24847;&#24605;</pre>
|
||||
|
||||
@@ -545,7 +555,7 @@ an application that originally used ISO-8859-1 but switched to UTF-8
|
||||
when it became far too cumbersome to support foreign languages. Bots
|
||||
will now actually go through articles and convert character entities
|
||||
to their corresponding real characters for the sake of user-friendliness
|
||||
and searcheability. See
|
||||
and searchability. See
|
||||
<a href="http://meta.wikimedia.org/wiki/Help:Special_characters">Meta's
|
||||
page on special characters</a> for more details.
|
||||
</p></blockquote>
|
||||
@@ -593,7 +603,7 @@ browser you're using, they might:</p>
|
||||
<ul>
|
||||
<li>Replace the unsupported characters with useless question marks,</li>
|
||||
<li>Attempt to fix the characters (example: smart quotes to regular quotes),</li>
|
||||
<li>Replace the character with a character entity, or</li>
|
||||
<li>Replace the character with a character entity reference, or</li>
|
||||
<li>Send it anyway as a different character encoding mixed in
|
||||
with the original encoding (usually Windows-1252 rather than
|
||||
iso-8859-1 or UTF-8 interspersed in 8-bit)</li>
|
||||
@@ -609,7 +619,7 @@ since UTF-8 supports every character.</p>
|
||||
|
||||
<h4 id="whyutf8-forms-multipart"><code>multipart/form-data</code></h4>
|
||||
|
||||
<p>Multipart form submission takes a way a lot of the ambiguity
|
||||
<p>Multipart form submission takes away a lot of the ambiguity
|
||||
that percent-encoding had: the server now can explicitly ask for
|
||||
certain encodings, and the client can explicitly tell the server
|
||||
during the form submission what encoding the fields are in.</p>
|
||||
@@ -622,9 +632,9 @@ Each method has deficiencies, especially the former.</p>
|
||||
<p>If you tell the browser to send the form in the same encoding as
|
||||
the page, you still have the trouble of what to do with characters
|
||||
that are outside of the character encoding's range. The behavior, once
|
||||
again, varies: Firefox 2.0 entity-izes them while Internet Explorer
|
||||
7.0 mangles them beyond intelligibility. For serious internationalization purposes,
|
||||
this is not an option.</p>
|
||||
again, varies: Firefox 2.0 converts them to character entity references
|
||||
while Internet Explorer 7.0 mangles them beyond intelligibility. For
|
||||
serious internationalization purposes, this is not an option.</p>
|
||||
|
||||
<p>The other possibility is to set the form's <code>accept-charset</code> attribute to UTF-8, which
|
||||
begs the question: Why aren't you using UTF-8 for everything then?
|
||||
@@ -664,12 +674,12 @@ it up to the module iconv to do the dirty work.</p>
|
||||
<p>This approach, however, is not perfect. iconv is blithely unaware
|
||||
of HTML character entities. HTML Purifier, in order to
|
||||
protect against sophisticated escaping schemes, normalizes all character
|
||||
and numeric entities before processing the text. This leads to
|
||||
and numeric entity references before processing the text. This leads to
|
||||
one important ramification:</p>
|
||||
|
||||
<p><strong>Any character that is not supported by the target character
|
||||
set, regardless of whether or not it is in the form of a character
|
||||
entity or a raw character, will be silently ignored.</strong></p>
|
||||
entity reference or a raw character, will be silently ignored.</strong></p>
|
||||
|
||||
<p>Example of this principle at work: say you have <code>&theta;</code>
|
||||
in your HTML, but the output is in Latin-1 (which, understandably,
|
||||
@@ -678,7 +688,7 @@ set the encoding correctly using %Core.Encoding):</p>
|
||||
|
||||
<ul>
|
||||
<li>The <code>Encoder</code> will transform the text from ISO 8859-1 to UTF-8
|
||||
(note that theta is preserved since it doesn't actually use
|
||||
(note that theta is preserved here since it doesn't actually use
|
||||
any non-ASCII characters): <code>&theta;</code></li>
|
||||
<li>The <code>EntityParser</code> will transform all named and numeric
|
||||
character entities to their corresponding raw UTF-8 equivalents:
|
||||
@@ -701,7 +711,7 @@ Purifier has provided a slightly more palatable workaround using
|
||||
<li>The <code>EntityParser</code> transforms entities: <code>θ</code></li>
|
||||
<li>HTML Purifier processes the code: <code>θ</code></li>
|
||||
<li>The <code>Encoder</code> replaces all non-ASCII characters
|
||||
with numeric entities: <code>&#952;</code></li>
|
||||
with numeric entity references: <code>&#952;</code></li>
|
||||
<li>For good measure, <code>Encoder</code> transforms encoding back to
|
||||
original (which is strictly unnecessary for 99% of encodings
|
||||
out there): <code>&#952;</code> (remember, it's all ASCII!)</li>
|
||||
@@ -711,19 +721,19 @@ Purifier has provided a slightly more palatable workaround using
|
||||
the land of Unicode characters, and is totally unacceptable for Chinese
|
||||
or Japanese texts. The even bigger kicker is that, supposing the
|
||||
input encoding was actually ISO-8859-7, which <em>does</em> support
|
||||
theta, the character would get entity-ized anyway! (The Encoder does
|
||||
not discriminate).</p>
|
||||
theta, the character would get converted into a character entity reference
|
||||
anyway! (The Encoder does not discriminate).</p>
|
||||
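The numeric-reference fallback described above can be sketched roughly as follows. This is an illustration only, not HTML Purifier's Encoder implementation: both function names are hypothetical, the input is assumed to be valid UTF-8, and the closure syntax assumes a PHP version newer than the ones this release targets.

```php
<?php
// Hypothetical helper: decodes a single UTF-8 character to its code point.
function sketch_utf8_codepoint($char) {
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0]; // plain ASCII
    }
    // Strip the length-marker bits from the leading byte...
    $codepoint = $bytes[0] & (0xFF >> ($count + 1));
    // ...then append six payload bits from each continuation byte.
    for ($i = 1; $i < $count; $i++) {
        $codepoint = ($codepoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codepoint;
}

// Hypothetical helper: replaces every non-ASCII character in valid UTF-8
// text with a numeric character reference, leaving ASCII untouched.
function sketch_escape_non_ascii($utf8) {
    return preg_replace_callback(
        '/[^\x00-\x7F]/u',
        function ($match) {
            return '&#' . sketch_utf8_codepoint($match[0]) . ';';
        },
        $utf8
    );
}

echo sketch_escape_non_ascii("\xCE\xB8"), "\n"; // theta, prints &#952;
```

Note that the sketch shares the limitation the text complains about: it escapes every non-ASCII character, whether or not the target encoding could have represented it.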
|
||||
<p>The current functionality is about where HTML Purifier will be for
|
||||
the rest of eternity. HTML Purifier could attempt to preserve the original
|
||||
form of the entities so that they could be substituted back in, only the
|
||||
form of the character references so that they could be substituted back in, only the
|
||||
DOM extension kills them off irreversibly. HTML Purifier could also attempt
|
||||
to be smart and only convert non-ASCII characters that weren't supported
|
||||
by the target encoding, but that would require reimplementing iconv
|
||||
with HTML awareness, something I will not do.</p>
|
||||
|
||||
<p>So there: either it's UTF-8 or crippled international support. Your pick! (and I'm
|
||||
not being sarcastic here: some people could care less about other languages)</p>
|
||||
not being sarcastic here: some people could care less about other languages).</p>
|
||||
|
||||
<h2 id="migrate">Migrate to UTF-8</h2>
|
||||
|
||||
@@ -985,7 +995,7 @@ and yes, it is variable width. Other traits:</p>
|
||||
in different ways. It is beyond the scope of this document to explain
|
||||
what precisely these implications are. PHPWact provides
|
||||
a very good <a href="http://www.phpwact.org/php/i18n/utf-8">reference document</a>
|
||||
on what to expect from each functions, although coverage is spotty in
|
||||
on what to expect from each function, although coverage is spotty in
|
||||
some areas. Their more general notes on
|
||||
<a href="http://www.phpwact.org/php/i18n/charsets">character sets</a>
|
||||
are also worth looking at for information on UTF-8. Some rules of thumb
|
||||
@@ -999,7 +1009,7 @@ when dealing with Unicode text:</p>
|
||||
<li>Think twice before using functions that:<ul>
|
||||
<li>...count characters (strlen will return bytes, not characters;
|
||||
str_split and wordwrap may corrupt)</li>
|
||||
<li>...entity-ize things (UTF-8 doesn't need entities)</li>
|
||||
<li>...convert characters to entity references (UTF-8 doesn't need entities)</li>
|
||||
<li>...do very complex string processing (*printf)</li>
|
||||
</ul></li>
|
||||
</ul>
|
||||
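To make the byte-versus-character rule of thumb concrete, here is a small sketch. The helper sketch_utf8_length is hypothetical; it assumes valid UTF-8 input and a PCRE build with UTF-8 support.

```php
<?php
// "\xCE\xB8" is the two-byte UTF-8 encoding of the Greek letter theta.
$sketch_text = "\xCE\xB8";

// strlen() counts bytes, not characters.
$sketch_bytes = strlen($sketch_text);

// Hypothetical helper: counts characters by matching one UTF-8
// character at a time (the /u modifier puts PCRE in UTF-8 mode).
function sketch_utf8_length($utf8) {
    return preg_match_all('/./su', $utf8, $sketch_unused);
}

$sketch_chars = sketch_utf8_length($sketch_text);
```

Here $sketch_bytes is 2 while $sketch_chars is 1, which is exactly the mismatch that trips up length checks and string splitting.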
|
@@ -22,8 +22,8 @@
|
||||
*/
|
||||
|
||||
/*
|
||||
HTML Purifier 2.1.1 - Standards Compliant HTML Filtering
|
||||
Copyright (C) 2006 Edward Z. Yang
|
||||
HTML Purifier 2.1.3 - Standards Compliant HTML Filtering
|
||||
Copyright (C) 2006-2007 Edward Z. Yang
|
||||
|
||||
This library is free software; you can redistribute it and/or
|
||||
modify it under the terms of the GNU Lesser General Public
|
||||
@@ -43,9 +43,8 @@
|
||||
// constants are slow, but we'll make one exception
|
||||
define('HTMLPURIFIER_PREFIX', dirname(__FILE__));
|
||||
|
||||
// almost every class has an undocumented dependency to these, so make sure
|
||||
// they get included
|
||||
require_once 'HTMLPurifier/ConfigSchema.php'; // important
|
||||
// every class has an undocumented dependency to these, must be included!
|
||||
require_once 'HTMLPurifier/ConfigSchema.php'; // fatal errors if not included
|
||||
require_once 'HTMLPurifier/Config.php';
|
||||
require_once 'HTMLPurifier/Context.php';
|
||||
|
||||
@@ -60,16 +59,23 @@ require_once 'HTMLPurifier/LanguageFactory.php';
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'Core', 'CollectErrors', false, 'bool', '
|
||||
Whether or not to collect errors found while filtering the document. This
|
||||
is a useful way to give feedback to your users. CURRENTLY NOT IMPLEMENTED.
|
||||
This directive has been available since 2.0.0.
|
||||
is a useful way to give feedback to your users. <strong>Warning:</strong>
|
||||
Currently this feature is very patchy and experimental, with lots of
|
||||
possible error messages not yet implemented. It will not cause any problems,
|
||||
but it may not help your users either. This directive has been available
|
||||
since 2.0.0.
|
||||
');
|
||||
|
||||
/**
|
||||
* Main library execution class.
|
||||
* Facade that coordinates HTML Purifier's subsystems in order to purify HTML.
|
||||
*
|
||||
* Facade that performs calls to the HTMLPurifier_Lexer,
|
||||
* HTMLPurifier_Strategy and HTMLPurifier_Generator subsystems in order to
|
||||
* purify HTML.
|
||||
* @note There are several points in which configuration can be specified
|
||||
* for HTML Purifier. The precedence of these (from lowest to
|
||||
* highest) is as follows:
|
||||
* -# Instance: new HTMLPurifier($config)
|
||||
* -# Invocation: purify($html, $config)
|
||||
* These configurations are entirely independent of each other and
|
||||
* are *not* merged.
|
||||
*
|
||||
* @todo We need an easier way to inject strategies, it'll probably end
|
||||
* up getting done through config though.
|
||||
@@ -77,15 +83,16 @@ This directive has been available since 2.0.0.
|
||||
class HTMLPurifier
|
||||
{
|
||||
|
||||
var $version = '2.1.1';
|
||||
var $version = '2.1.3';
|
||||
|
||||
var $config;
|
||||
var $filters;
|
||||
var $filters = array();
|
||||
|
||||
var $strategy, $generator;
|
||||
|
||||
/**
|
||||
* Final HTMLPurifier_Context of last run purification. Might be an array.
|
||||
* Resultant HTMLPurifier_Context of last run purification. Is an array
|
||||
* of contexts if the last called method was purifyArray().
|
||||
* @public
|
||||
*/
|
||||
var $context;
|
||||
@@ -150,6 +157,11 @@ class HTMLPurifier
|
||||
$context->register('ErrorCollector', $error_collector);
|
||||
}
|
||||
|
||||
// setup id_accumulator context, necessary due to the fact that
|
||||
// AttrValidator can be called from many places
|
||||
$id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
|
||||
$context->register('IDAccumulator', $id_accumulator);
|
||||
|
||||
$html = HTMLPurifier_Encoder::convertToUTF8($html, $config, $context);
|
||||
|
||||
for ($i = 0, $size = count($this->filters); $i < $size; $i++) {
|
||||
@@ -198,6 +210,8 @@ class HTMLPurifier
|
||||
|
||||
/**
|
||||
* Singleton for enforcing just one HTML Purifier in your system
|
||||
* @param $prototype Optional prototype HTMLPurifier instance to
|
||||
* overload singleton with.
|
||||
*/
|
||||
static function &getInstance($prototype = null) {
|
||||
static $htmlpurifier;
|
||||
|
@@ -6,6 +6,7 @@ require_once 'HTMLPurifier/URIScheme.php';
|
||||
require_once 'HTMLPurifier/URISchemeRegistry.php';
|
||||
require_once 'HTMLPurifier/AttrDef/URI/Host.php';
|
||||
require_once 'HTMLPurifier/PercentEncoder.php';
|
||||
require_once 'HTMLPurifier/AttrDef/URI/Email.php';
|
||||
|
||||
// special case filtering directives
|
||||
|
||||
@@ -101,7 +102,7 @@ class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
|
||||
$result = $uri->validate($config, $context);
|
||||
if (!$result) break;
|
||||
|
||||
// chained validation
|
||||
// chained filtering
|
||||
$uri_def =& $config->getDefinition('URI');
|
||||
$result = $uri_def->filter($uri, $config, $context);
|
||||
if (!$result) break;
|
||||
|
@@ -14,3 +14,5 @@ class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
|
||||
|
||||
}
|
||||
|
||||
// sub-implementations
|
||||
require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';
|
||||
|
@@ -44,6 +44,9 @@ class HTMLPurifier_AttrTypes
|
||||
$this->info['LanguageCode'] = new HTMLPurifier_AttrDef_Lang();
|
||||
$this->info['Color'] = new HTMLPurifier_AttrDef_HTML_Color();
|
||||
|
||||
// unimplemented aliases
|
||||
$this->info['ContentType'] = new HTMLPurifier_AttrDef_Text();
|
||||
|
||||
// number is really a positive integer (one or more digits)
|
||||
// FIXME: ^^ not always, see start and value of list items
|
||||
$this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true);
|
||||
|
@@ -23,6 +23,13 @@ class HTMLPurifier_AttrValidator
|
||||
$definition = $config->getHTMLDefinition();
|
||||
$e =& $context->get('ErrorCollector', true);
|
||||
|
||||
// initialize IDAccumulator if necessary
|
||||
$ok =& $context->get('IDAccumulator', true);
|
||||
if (!$ok) {
|
||||
$id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
|
||||
$context->register('IDAccumulator', $id_accumulator);
|
||||
}
|
||||
|
||||
// initialize CurrentToken if necessary
|
||||
$current_token =& $context->get('CurrentToken', true);
|
||||
if (!$current_token) $context->register('CurrentToken', $token);
|
||||
|
@@ -204,7 +204,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
|
||||
$this->info['border-right'] = new HTMLPurifier_AttrDef_CSS_Border($config);
|
||||
|
||||
$this->info['border-collapse'] = new HTMLPurifier_AttrDef_Enum(array(
|
||||
'collapse', 'seperate'));
|
||||
'collapse', 'separate'));
|
||||
|
||||
$this->info['caption-side'] = new HTMLPurifier_AttrDef_Enum(array(
|
||||
'top', 'bottom'));
|
||||
@@ -219,6 +219,8 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
|
||||
new HTMLPurifier_AttrDef_CSS_Percentage()
|
||||
));
|
||||
|
||||
$this->info['border-spacing'] = new HTMLPurifier_AttrDef_CSS_Multiple(new HTMLPurifier_AttrDef_CSS_Length(), 2);
|
||||
|
||||
// partial support
|
||||
$this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap'));
|
||||
|
||||
|
@@ -15,7 +15,10 @@ class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
|
||||
var $type = 'optional';
|
||||
function validateChildren($tokens_of_children, $config, &$context) {
|
||||
$result = parent::validateChildren($tokens_of_children, $config, $context);
|
||||
if ($result === false) return array();
|
||||
if ($result === false) {
|
||||
if (empty($tokens_of_children)) return true;
|
||||
else return array();
|
||||
}
|
||||
return $result;
|
||||
}
|
||||
}
|
||||
|
@@ -42,7 +42,7 @@ class HTMLPurifier_Config
|
||||
/**
|
||||
* HTML Purifier's version
|
||||
*/
|
||||
var $version = '2.1.1';
|
||||
var $version = '2.1.3';
|
||||
|
||||
/**
|
||||
* Two-level associative array of configuration directives
|
||||
|
@@ -236,13 +236,26 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
||||
/**
|
||||
* Adds a custom element to your HTML definition
|
||||
* @note See HTMLPurifier_HTMLModule::addElement for detailed
|
||||
* parameter descriptions.
|
||||
* parameter and return value descriptions.
|
||||
*/
|
||||
function addElement($element_name, $type, $contents, $attr_collections, $attributes) {
|
||||
function &addElement($element_name, $type, $contents, $attr_collections, $attributes) {
|
||||
$module =& $this->getAnonymousModule();
|
||||
// assume that if the user is calling this, the element
|
||||
// is safe. This may not be a good idea
|
||||
$module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
|
||||
$element =& $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
|
||||
return $element;
|
||||
}
|
||||
|
||||
/**
|
||||
* Adds a blank element to your HTML definition, for overriding
|
||||
* existing behavior
|
||||
* @note See HTMLPurifier_HTMLModule::addBlankElement for detailed
|
||||
* parameter and return value descriptions.
|
||||
*/
|
||||
function &addBlankElement($element_name) {
|
||||
$module =& $this->getAnonymousModule();
|
||||
$element =& $module->addBlankElement($element_name);
|
||||
return $element;
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -330,7 +343,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
||||
if (isset($this->info_content_sets['Block'][$block_wrapper])) {
|
||||
$this->info_block_wrapper = $block_wrapper;
|
||||
} else {
|
||||
trigger_error('Cannot use non-block element as block wrapper.',
|
||||
trigger_error('Cannot use non-block element as block wrapper',
|
||||
E_USER_ERROR);
|
||||
}
|
||||
|
||||
@@ -340,7 +353,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
||||
$this->info_parent = $parent;
|
||||
$this->info_parent_def = $def;
|
||||
} else {
|
||||
trigger_error('Cannot use unrecognized element as parent.',
|
||||
trigger_error('Cannot use unrecognized element as parent',
|
||||
E_USER_ERROR);
|
||||
$this->info_parent_def = $this->manager->getElement($this->info_parent, true);
|
||||
}
|
||||
|
47
library/HTMLPurifier/HTMLModule/Object.php
Normal file
@@ -0,0 +1,47 @@
|
||||
<?php
|
||||
|
||||
require_once 'HTMLPurifier/HTMLModule.php';
|
||||
|
||||
/**
|
||||
* XHTML 1.1 Object Module, defines elements for generic object inclusion
|
||||
* @warning Users will commonly use <embed> to cater to legacy browsers: this
|
||||
* module does not allow this sort of behavior
|
||||
*/
|
||||
class HTMLPurifier_HTMLModule_Object extends HTMLPurifier_HTMLModule
|
||||
{
|
||||
|
||||
var $name = 'Object';
|
||||
|
||||
function HTMLPurifier_HTMLModule_Object() {
|
||||
|
||||
$this->addElement('object', false, 'Inline', 'Optional: #PCDATA | Flow | param', 'Common',
|
||||
array(
|
||||
'archive' => 'URI',
|
||||
'classid' => 'URI',
|
||||
'codebase' => 'URI',
|
||||
'codetype' => 'Text',
|
||||
'data' => 'URI',
|
||||
'declare' => 'Bool#declare',
|
||||
'height' => 'Length',
|
||||
'name' => 'CDATA',
|
||||
'standby' => 'Text',
|
||||
'tabindex' => 'Number',
|
||||
'type' => 'ContentType',
|
||||
'width' => 'Length'
|
||||
)
|
||||
);
|
||||
|
||||
$this->addElement('param', false, false, 'Empty', false,
|
||||
array(
|
||||
'id' => 'ID',
|
||||
'name*' => 'Text',
|
||||
'type' => 'Text',
|
||||
'value' => 'Text',
|
||||
'valuetype' => 'Enum#data,ref,object'
|
||||
)
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -13,6 +13,8 @@ require_once 'HTMLPurifier/AttrTransform/Length.php';
|
||||
require_once 'HTMLPurifier/AttrTransform/ImgSpace.php';
|
||||
require_once 'HTMLPurifier/AttrTransform/EnumToCSS.php';
|
||||
|
||||
require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
|
||||
|
||||
class HTMLPurifier_HTMLModule_Tidy_XHTMLAndHTML4 extends
|
||||
HTMLPurifier_HTMLModule_Tidy
|
||||
{
|
||||
@@ -188,5 +190,17 @@ class HTMLPurifier_HTMLModule_Tidy_Strict extends
|
||||
{
|
||||
var $name = 'Tidy_Strict';
|
||||
var $defaultLevel = 'light';
|
||||
|
||||
function makeFixes() {
|
||||
$r = parent::makeFixes();
|
||||
$r['blockquote#content_model_type'] = 'strictblockquote';
|
||||
return $r;
|
||||
}
|
||||
|
||||
var $defines_child_def = true;
|
||||
function getChildDef($def) {
|
||||
if ($def->content_model_type != 'strictblockquote') return parent::getChildDef($def);
|
||||
return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
|
||||
}
|
||||
}
|
||||
|
||||
|
@@ -1,26 +0,0 @@
|
||||
<?php
|
||||
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy.php';
|
||||
require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
|
||||
|
||||
class HTMLPurifier_HTMLModule_Tidy_XHTMLStrict extends
|
||||
HTMLPurifier_HTMLModule_Tidy
|
||||
{
|
||||
|
||||
var $name = 'Tidy_XHTMLStrict';
|
||||
var $defaultLevel = 'light';
|
||||
|
||||
function makeFixes() {
|
||||
$r = array();
|
||||
$r['blockquote#content_model_type'] = 'strictblockquote';
|
||||
return $r;
|
||||
}
|
||||
|
||||
var $defines_child_def = true;
|
||||
function getChildDef($def) {
|
||||
if ($def->content_model_type != 'strictblockquote') return false;
|
||||
return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -29,12 +29,12 @@ require_once 'HTMLPurifier/HTMLModule/Scripting.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/XMLCommonAttributes.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Ruby.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Object.php';
|
||||
|
||||
// tidy modules
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTML.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php';
|
||||
require_once 'HTMLPurifier/HTMLModule/Tidy/Proprietary.php';
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
@@ -172,7 +172,7 @@ class HTMLPurifier_HTMLModuleManager
|
||||
$common = array(
|
||||
'CommonAttributes', 'Text', 'Hypertext', 'List',
|
||||
'Presentation', 'Edit', 'Bdo', 'Tables', 'Image',
|
||||
'StyleAttribute', 'Scripting'
|
||||
'StyleAttribute', 'Scripting', 'Object'
|
||||
);
|
||||
$transitional = array('Legacy', 'Target');
|
||||
$xml = array('XMLCommonAttributes');
|
||||
@@ -208,7 +208,7 @@ class HTMLPurifier_HTMLModuleManager
|
||||
$this->doctypes->register(
|
||||
'XHTML 1.0 Strict', true,
|
||||
array_merge($common, $xml, $non_xml),
|
||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_XHTMLStrict', 'Tidy_Proprietary'),
|
||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Strict', 'Tidy_Proprietary'),
|
||||
array(),
|
||||
'-//W3C//DTD XHTML 1.0 Strict//EN',
|
||||
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
|
||||
@@ -217,7 +217,7 @@ class HTMLPurifier_HTMLModuleManager
|
||||
$this->doctypes->register(
|
||||
'XHTML 1.1', true,
|
||||
array_merge($common, $xml, array('Ruby')),
|
||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_XHTMLStrict'), // Tidy_XHTML1_1
|
||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_Strict'), // Tidy_XHTML1_1
|
||||
array(),
|
||||
'-//W3C//DTD XHTML 1.1//EN',
|
||||
'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'
|
||||
|
@@ -1,11 +1,15 @@
|
||||
<?php
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'Attr', 'IDBlacklist', array(), 'list',
|
||||
'Array of IDs not allowed in the document.'
|
||||
);
|
||||
|
||||
/**
|
||||
* Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
|
||||
* @note In Slashdot-speak, dupe means duplicate.
|
||||
* @note This class does not accept $config or $context, thus, it is the
|
||||
* burden of the callee to register the appropriate errors or
|
||||
* configuration.
|
||||
* @note The default constructor does not accept $config or $context objects:
|
||||
you must use the static build() factory method to perform initialization.
|
||||
*/
|
||||
class HTMLPurifier_IDAccumulator
|
||||
{
|
||||
@@ -16,6 +20,19 @@ class HTMLPurifier_IDAccumulator
|
||||
*/
|
||||
var $ids = array();
|
||||
|
||||
/**
|
||||
* Builds an IDAccumulator, also initializing the default blacklist
|
||||
* @param $config Instance of HTMLPurifier_Config
|
||||
* @param $context Instance of HTMLPurifier_Context
|
||||
* @return Fully initialized HTMLPurifier_IDAccumulator
|
||||
* @static
|
||||
*/
|
||||
static function build($config, &$context) {
|
||||
$id_accumulator = new HTMLPurifier_IDAccumulator();
|
||||
$id_accumulator->load($config->get('Attr', 'IDBlacklist'));
|
||||
return $id_accumulator;
|
||||
}
|
||||
|
||||
/**
|
||||
* Add an ID to the lookup table.
|
||||
* @param $id ID to be added.
|
||||
|
@@ -4,6 +4,9 @@
|
||||
* Injects tokens into the document while parsing for well-formedness.
|
||||
* This enables "formatter-like" functionality such as auto-paragraphing,
|
||||
* smiley-ification and linkification to take place.
|
||||
*
|
||||
* @todo Allow injectors to request a re-run on their output. This
|
||||
* would help if an operation is recursive.
|
||||
*/
|
||||
class HTMLPurifier_Injector
|
||||
{
|
||||
@@ -107,5 +110,12 @@ class HTMLPurifier_Injector
|
||||
*/
|
||||
function handleElement(&$token) {}
|
||||
|
||||
/**
|
||||
* Notifier that is called when an end token is processed
|
||||
* @note This differs from handlers in that the token is read-only
|
||||
*/
|
||||
function notifyEnd($token) {}
|
||||
|
||||
|
||||
}
|
||||
|
||||
|
@@ -6,20 +6,28 @@ HTMLPurifier_ConfigSchema::define(
|
||||
'AutoFormat', 'AutoParagraph', false, 'bool', '
|
||||
<p>
|
||||
This directive turns on auto-paragraphing, where double newlines are
|
||||
converted in to paragraphs whenever possible. Auto-paragraphing
|
||||
applies when:
|
||||
converted into paragraphs whenever possible. Auto-paragraphing:
|
||||
</p>
|
||||
<ul>
|
||||
<li>There are inline elements or text in the root node</li>
|
||||
<li>There are inline elements or text with double newlines or
|
||||
block elements in nodes that allow paragraph tags</li>
|
||||
<li>There are double newlines in paragraph tags</li>
|
||||
<li>Always applies to inline elements or text in the root node,</li>
|
||||
<li>Applies to inline elements or text with double newlines in nodes
|
||||
that allow paragraph tags,</li>
|
||||
<li>Applies to double newlines in paragraph tags</li>
|
||||
</ul>
|
||||
<p>
|
||||
<code>p</code> tags must be allowed for this directive to take effect.
|
||||
We do not use <code>br</code> tags for paragraphing, as that is
|
||||
semantically incorrect.
|
||||
</p>
|
||||
<p>
|
||||
To prevent auto-paragraphing as a content-producer, refrain from using
|
||||
double-newlines except to specify a new paragraph or in contexts where
|
||||
it has special meaning (whitespace usually has no meaning except in
|
||||
tags like <code>pre</code>, so this should not be difficult.) To prevent
|
||||
the paragraphing of inline text adjacent to block elements, wrap them
|
||||
in <code>div</code> tags (the behavior is slightly different outside of
|
||||
the root node.)
|
||||
</p>
|
||||
<p>
|
||||
This directive has been available since 2.0.1.
|
||||
</p>
|
||||
@@ -62,19 +70,27 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
||||
$ok = false;
|
||||
// test if up-coming tokens are either block or have
|
||||
// a double newline in them
|
||||
$nesting = 0;
|
||||
for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
|
||||
if ($this->inputTokens[$i]->type == 'start'){
|
||||
if (!$this->_isInline($this->inputTokens[$i])) {
|
||||
$ok = true;
|
||||
// we haven't found a double-newline, and
|
||||
// we've hit a block element, so don't paragraph
|
||||
$ok = false;
|
||||
break;
|
||||
}
|
||||
break;
|
||||
$nesting++;
|
||||
}
|
||||
if ($this->inputTokens[$i]->type == 'end') {
|
||||
if ($nesting <= 0) break;
|
||||
$nesting--;
|
||||
}
|
||||
if ($this->inputTokens[$i]->type == 'end') break;
|
||||
if ($this->inputTokens[$i]->type == 'text') {
|
||||
// found it!
|
||||
if (strpos($this->inputTokens[$i]->data, "\n\n") !== false) {
|
||||
$ok = true;
|
||||
break;
|
||||
}
|
||||
if (!$this->inputTokens[$i]->is_whitespace) break;
|
||||
}
|
||||
}
|
||||
if ($ok) {
|
||||
|
@@ -13,11 +13,14 @@ if (version_compare(PHP_VERSION, "5", ">=")) {
|
||||
}
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'Core', 'AcceptFullDocuments', true, 'bool',
|
||||
'This parameter determines whether or not the filter should accept full '.
|
||||
'HTML documents, not just HTML fragments. When on, it will '.
|
||||
'drop all sections except the content between body.'
|
||||
);
|
||||
'Core', 'ConvertDocumentToFragment', true, 'bool', '
|
||||
This parameter determines whether or not the filter should convert
|
||||
input that is a full document with html and body tags to a fragment
|
||||
of just the contents of a body tag. This parameter is simply something
|
||||
HTML Purifier can do during an edge-case: for most inputs, this
|
||||
processing is not necessary.
|
||||
');
|
||||
HTMLPurifier_ConfigSchema::defineAlias('Core', 'AcceptFullDocuments', 'Core', 'ConvertDocumentToFragment');
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'Core', 'LexerImpl', null, 'mixed/null', '
|
||||
@@ -189,6 +192,9 @@ class HTMLPurifier_Lexer
|
||||
return new HTMLPurifier_Lexer_DOMLex();
|
||||
case 'DirectLex':
|
||||
return new HTMLPurifier_Lexer_DirectLex();
|
||||
case 'PH5P':
|
||||
// experimental Lexer that must be manually included
|
||||
return new HTMLPurifier_Lexer_PH5P();
|
||||
default:
|
||||
trigger_error("Cannot instantiate unrecognized Lexer type " . htmlspecialchars($lexer), E_USER_ERROR);
|
||||
}
|
||||
@@ -313,7 +319,7 @@ class HTMLPurifier_Lexer
|
||||
function normalize($html, $config, &$context) {
|
||||
|
||||
// extract body from document if applicable
|
||||
if ($config->get('Core', 'AcceptFullDocuments')) {
|
||||
if ($config->get('Core', 'ConvertDocumentToFragment')) {
|
||||
$html = $this->extractBody($html);
|
||||
}
|
||||
|
||||
|
@@ -53,14 +53,7 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
|
||||
}
|
||||
|
||||
// preprocess html, essential for UTF-8
|
||||
$html =
|
||||
'<!DOCTYPE html '.
|
||||
'PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"'.
|
||||
'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'.
|
||||
'<html><head>'.
|
||||
'<meta http-equiv="Content-Type" content="text/html;'.
|
||||
' charset=utf-8" />'.
|
||||
'</head><body><div>'.$html.'</div></body></html>';
|
||||
$html = $this->wrapHTML($html, $config, $context);
|
||||
|
||||
$doc = new DOMDocument();
|
||||
$doc->encoding = 'UTF-8'; // theoretically, the above has this covered
|
||||
@@ -177,5 +170,25 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
|
||||
return '<!--' . str_replace('&', '&amp;', $matches[1]) . $matches[2];
|
||||
}
|
||||
|
||||
/**
|
||||
* Wraps an HTML fragment in the necessary HTML
|
||||
*/
|
||||
function wrapHTML($html, $config, &$context) {
|
||||
$def = $config->getDefinition('HTML');
|
||||
$ret = '';
|
||||
|
||||
if (!empty($def->doctype->dtdPublic) || !empty($def->doctype->dtdSystem)) {
|
||||
$ret .= '<!DOCTYPE html ';
|
||||
if (!empty($def->doctype->dtdPublic)) $ret .= 'PUBLIC "' . $def->doctype->dtdPublic . '" ';
|
||||
if (!empty($def->doctype->dtdSystem)) $ret .= '"' . $def->doctype->dtdSystem . '" ';
|
||||
$ret .= '>';
|
||||
}
|
||||
|
||||
$ret .= '<html><head>';
|
||||
$ret .= '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />';
|
||||
$ret .= '</head><body><div>'.$html.'</div></body></html>';
|
||||
return $ret;
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
@@ -160,9 +160,15 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
||||
|
||||
$segment = substr($html, $cursor, $strlen_segment);
|
||||
|
||||
if ($segment === false) {
|
||||
// somehow, we attempted to access beyond the end of
|
||||
// the string, defense-in-depth, reported by Nate Abele
|
||||
break;
|
||||
}
|
||||
|
||||
// Check if it's a comment
|
||||
if (
|
||||
substr($segment, 0, 3) == '!--'
|
||||
substr($segment, 0, 3) === '!--'
|
||||
) {
|
||||
// re-determine segment length, looking for -->
|
||||
$position_comment_end = strpos($html, '-->', $cursor);
|
||||
@@ -237,7 +243,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
||||
// trailing slash. Remember, we could have a tag like <br>, so
|
||||
// any later token processing scripts must convert improperly
|
||||
// classified EmptyTags from StartTags.
|
||||
$is_self_closing= (strpos($segment,'/') === $strlen_segment-1);
|
||||
$is_self_closing = (strrpos($segment,'/') === $strlen_segment-1);
|
||||
if ($is_self_closing) {
|
||||
$strlen_segment--;
|
||||
$segment = substr($segment, 0, $strlen_segment);
|
||||
|
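The switch from strpos to strrpos in the hunk above matters whenever a tag segment contains a slash before its final character, for example inside an attribute value. The following is a minimal sketch of that difference; the helper names endsWithSlashFirst and endsWithSlashLast are hypothetical and exist only for this illustration, while the comparison itself mirrors the two lines of the diff.

```php
<?php
// Hypothetical helpers: only the strpos/strrpos comparison mirrors the diff.

// Old check: position of the FIRST slash must be the last character.
function endsWithSlashFirst($segment) {
    return strpos($segment, '/') === strlen($segment) - 1;
}

// New check: position of the LAST slash must be the last character.
function endsWithSlashLast($segment) {
    return strrpos($segment, '/') === strlen($segment) - 1;
}

// A self-closing tag whose attribute value also contains a slash.
$segment = 'a href="/x"/';

var_dump(endsWithSlashFirst($segment)); // bool(false): first slash is inside the attribute
var_dump(endsWithSlashLast($segment));  // bool(true): last slash closes the tag
```

For a segment such as 'br/' both checks agree, which is why the old code worked for simple tags.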
3886
library/HTMLPurifier/Lexer/PH5P.php
Normal file
File diff suppressed because it is too large
@@ -25,7 +25,9 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
||||
|
||||
/**
|
||||
* Whether or not to compress directive names, clipping them off
|
||||
* after a certain amount of letters
|
||||
* after a certain amount of letters. False to disable or integer letters
|
||||
* before clipping.
|
||||
* @protected
|
||||
*/
|
||||
var $compress = false;
|
||||
|
||||
@@ -41,11 +43,13 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
||||
$this->docURL = $doc_url;
|
||||
$this->name = $name;
|
||||
$this->compress = $compress;
|
||||
// initialize sub-printers
|
||||
$this->fields['default'] = new HTMLPurifier_Printer_ConfigForm_default();
|
||||
$this->fields['bool'] = new HTMLPurifier_Printer_ConfigForm_bool();
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets default column and row size for textareas in sub-printers
|
||||
* @param $cols Integer columns of textarea, null to use default
|
||||
* @param $rows Integer rows of textarea, null to use default
|
||||
*/
|
||||
@@ -55,15 +59,14 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
||||
}
|
||||
|
||||
/**
|
||||
* Retrieves styling, in case the directory it's in is not publically
|
||||
* available
|
||||
* Retrieves styling, in case it is not accessible by webserver
|
||||
*/
|
||||
function getCSS() {
|
||||
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.css');
|
||||
}
|
||||
|
||||
/**
|
||||
* Retrieves JavaScript, in case directory is not public
|
||||
* Retrieves JavaScript, in case it is not accessible by webserver
|
||||
*/
|
||||
function getJavaScript() {
|
||||
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.js');
|
||||
@@ -97,14 +100,14 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
||||
$ret .= $this->renderNamespace($ns, $directives);
|
||||
}
|
||||
if ($render_controls) {
|
||||
$ret .= $this->start('tfoot');
|
||||
$ret .= $this->start('tbody');
|
||||
$ret .= $this->start('tr');
|
||||
$ret .= $this->start('td', array('colspan' => 2, 'class' => 'controls'));
|
||||
$ret .= $this->elementEmpty('input', array('type' => 'Submit', 'value' => 'Submit'));
|
||||
$ret .= $this->elementEmpty('input', array('type' => 'submit', 'value' => 'Submit'));
|
||||
$ret .= '[<a href="?">Reset</a>]';
|
||||
$ret .= $this->end('td');
|
||||
$ret .= $this->end('tr');
|
||||
$ret .= $this->end('tfoot');
|
||||
$ret .= $this->end('tbody');
|
||||
}
|
||||
$ret .= $this->end('table');
|
||||
return $ret;
|
||||
|
@@ -102,6 +102,7 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
|
||||
$ret .= $this->element('td', $this->listifyTagLookup($lookup));
|
||||
$ret .= $this->end('tr');
|
||||
}
|
||||
$ret .= $this->end('table');
|
||||
return $ret;
|
||||
}
|
||||
|
||||
@@ -179,7 +180,8 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
|
||||
$def->validateChildren(array(), $this->config, $context);
|
||||
}
|
||||
$elements = $def->elements;
|
||||
} elseif ($def->type == 'chameleon') {
|
||||
}
|
||||
if ($def->type == 'chameleon') {
|
||||
$attr['rowspan'] = 2;
|
||||
} elseif ($def->type == 'empty') {
|
||||
$elements = array();
|
||||
|
@@ -195,7 +195,7 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
|
||||
//################################################################//
|
||||
// Process result by interpreting $result
|
||||
|
||||
if ($result === true) {
|
||||
if ($result === true || $child_tokens === $result) {
|
||||
// leave the node as is
|
||||
|
||||
// register start token as a parental node start
|
||||
|
@@ -36,28 +36,23 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
|
||||
$definition = $config->getHTMLDefinition();
|
||||
|
||||
// CurrentNesting
|
||||
$this->currentNesting = array();
|
||||
$context->register('CurrentNesting', $this->currentNesting);
|
||||
|
||||
// InputIndex
|
||||
$this->inputIndex = false;
|
||||
$context->register('InputIndex', $this->inputIndex);
|
||||
|
||||
// InputTokens
|
||||
$context->register('InputTokens', $tokens);
|
||||
$this->inputTokens =& $tokens;
|
||||
|
||||
// OutputTokens
|
||||
// local variables
|
||||
$result = array();
|
||||
$this->outputTokens =& $result;
|
||||
|
||||
// %Core.EscapeInvalidTags
|
||||
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
||||
$generator = new HTMLPurifier_Generator();
|
||||
|
||||
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
||||
$e =& $context->get('ErrorCollector', true);
|
||||
|
||||
// member variables
|
||||
$this->currentNesting = array();
|
||||
$this->inputIndex = false;
|
||||
$this->inputTokens =& $tokens;
|
||||
$this->outputTokens =& $result;
|
||||
|
||||
// context variables
|
||||
$context->register('CurrentNesting', $this->currentNesting);
|
||||
$context->register('InputIndex', $this->inputIndex);
|
||||
$context->register('InputTokens', $tokens);
|
||||
|
||||
// -- begin INJECTOR --
|
||||
|
||||
$this->injectors = array();
|
||||
@@ -95,6 +90,10 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
trigger_error("Cannot enable $name injector because $error is not allowed", E_USER_WARNING);
|
||||
}
|
||||
|
||||
// warning: most foreach loops follow the convention $i => $x.
|
||||
// be sure, for PHP4 compatibility, to only perform write operations
|
||||
// directly referencing the object using $i: $x is only safe for reads
|
||||
|
||||
// -- end INJECTOR --
|
||||
|
||||
$token = false;
|
||||
@@ -105,6 +104,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
// if all goes well, this token will be passed through unharmed
|
||||
$token = $tokens[$this->inputIndex];
|
||||
|
||||
//printTokens($tokens, $this->inputIndex);
|
||||
|
||||
foreach ($this->injectors as $i => $x) {
|
||||
if ($x->skip > 0) $this->injectors[$i]->skip--;
|
||||
}
|
||||
@@ -114,7 +115,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
if ($token->type === 'text') {
|
||||
// injector handler code; duplicated for performance reasons
|
||||
foreach ($this->injectors as $i => $x) {
|
||||
if (!$x->skip) $x->handleText($token);
|
||||
if (!$x->skip) $this->injectors[$i]->handleText($token);
|
||||
if (is_array($token)) {
|
||||
$this->currentInjector = $i;
|
||||
break;
|
||||
@@ -172,7 +173,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
// injector handler code; duplicated for performance reasons
|
||||
if ($ok) {
|
||||
foreach ($this->injectors as $i => $x) {
|
||||
if (!$x->skip) $x->handleElement($token);
|
||||
if (!$x->skip) $this->injectors[$i]->handleElement($token);
|
||||
if (is_array($token)) {
|
||||
$this->currentInjector = $i;
|
||||
break;
|
||||
@@ -202,6 +203,9 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
$current_parent = array_pop($this->currentNesting);
|
||||
if ($current_parent->name == $token->name) {
|
||||
$result[] = $token;
|
||||
foreach ($this->injectors as $i => $x) {
|
||||
$this->injectors[$i]->notifyEnd($token);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
|
||||
@@ -238,16 +242,16 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
|
||||
// okay, we found it, close all the skipped tags
|
||||
// note that skipped tags contains the element we need closed
|
||||
$size = count($skipped_tags);
|
||||
for ($i = $size - 1; $i > 0; $i--) {
|
||||
if ($e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
||||
for ($i = count($skipped_tags) - 1; $i >= 0; $i--) {
|
||||
if ($i && $e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
||||
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$i]);
|
||||
}
|
||||
$result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
||||
$result[] = $new_token = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
||||
foreach ($this->injectors as $j => $x) { // $j, not $i!!!
|
||||
$this->injectors[$j]->notifyEnd($new_token);
|
||||
}
|
||||
}
|
||||
|
||||
$result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
||||
|
||||
}
|
||||
|
||||
$context->destroy('CurrentNesting');
|
||||
@@ -255,17 +259,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
$context->destroy('InputIndex');
|
||||
$context->destroy('CurrentToken');
|
||||
|
||||
// we're at the end now, fix all still unclosed tags
|
||||
// not using processToken() because at this point we don't
|
||||
// care about current nesting
|
||||
// we're at the end now, fix all still unclosed tags (this is
|
||||
// duplicated from the end of the loop with some slight modifications)
|
||||
// not using $skipped_tags since it would invariably be all of them
|
||||
if (!empty($this->currentNesting)) {
|
||||
$size = count($this->currentNesting);
|
||||
for ($i = $size - 1; $i >= 0; $i--) {
|
||||
for ($i = count($this->currentNesting) - 1; $i >= 0; $i--) {
|
||||
if ($e && !isset($this->currentNesting[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
||||
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $this->currentNesting[$i]);
|
||||
}
|
||||
$result[] =
|
||||
new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
|
||||
$result[] = $new_token = new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
|
||||
foreach ($this->injectors as $j => $x) { // $j, not $i!!!
|
||||
$this->injectors[$j]->notifyEnd($new_token);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -286,8 +291,14 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
||||
|
||||
// adjust the injector skips based on the array substitution
|
||||
if ($this->injectors) {
|
||||
$offset = count($token) + 1;
|
||||
$offset = count($token);
|
||||
for ($i = 0; $i <= $this->currentInjector; $i++) {
|
||||
// because of the skip back, we need to add one more
|
||||
// for uninitialized injectors. I'm not exactly
|
||||
// sure why this is the case, but I think it has to
|
||||
// do with the fact that we're decrementing skips
|
||||
// before re-checking text
|
||||
if (!$this->injectors[$i]->skip) $this->injectors[$i]->skip++;
|
||||
$this->injectors[$i]->skip += $offset;
|
||||
}
|
||||
}
|
||||
|
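The warning in the MakeWellFormed hunks above ("$x is only safe for reads") comes from foreach handing out a copy of each element. In PHP 4 that applied to objects as well, which is why the diff writes through $this->injectors[$i]. The sketch below shows the same convention using plain arrays, which are still copied by value on current PHP versions; the $injectors data is made up for the illustration and is not the library's injector class.

```php
<?php
// Made-up data standing in for the injector list; arrays are used because
// they are copied by value in foreach, as objects were in PHP 4.
$injectors = array(
    array('name' => 'first',  'skip' => 2),
    array('name' => 'second', 'skip' => 0),
);

// Writing to the loop variable only changes the copy.
foreach ($injectors as $i => $x) {
    if ($x['skip'] > 0) {
        $x['skip']--; // lost when the iteration ends
    }
}
$afterCopyWrite = $injectors[0]['skip']; // still 2

// Reading from $x but writing through the key updates the real element.
foreach ($injectors as $i => $x) {
    if ($x['skip'] > 0) {
        $injectors[$i]['skip']--;
    }
}
$afterIndexedWrite = $injectors[0]['skip']; // now 1
```

The second loop is the shape the diff adopts for handleText, handleElement and notifyEnd.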
@@ -116,6 +116,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
||||
// mostly everything's good, but
|
||||
// we need to make sure required attributes are in order
|
||||
if (
|
||||
($token->type === 'start' || $token->type === 'empty') &&
|
||||
$definition->info[$token->name]->required_attr &&
|
||||
($token->name != 'img' || $remove_invalid_img) // ensure config option still works
|
||||
) {
|
||||
@@ -134,7 +135,6 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
||||
$token->armor['ValidateAttributes'] = true;
|
||||
}
|
||||
|
||||
// CAN BE GENERICIZED
|
||||
if (isset($hidden_elements[$token->name]) && $token->type == 'start') {
|
||||
$textify_comments = $token->name;
|
||||
} elseif ($token->name === $textify_comments && $token->type == 'end') {
|
||||
|
@@ -6,10 +6,6 @@ require_once 'HTMLPurifier/IDAccumulator.php';
|
||||
|
||||
require_once 'HTMLPurifier/AttrValidator.php';
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'Attr', 'IDBlacklist', array(), 'list',
|
||||
'Array of IDs not allowed in the document.');
|
||||
|
||||
/**
|
||||
* Validate all attributes in the tokens.
|
||||
*/
|
||||
@@ -19,11 +15,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
|
||||
|
||||
function execute($tokens, $config, &$context) {
|
||||
|
||||
// setup id_accumulator context
|
||||
$id_accumulator = new HTMLPurifier_IDAccumulator();
|
||||
$id_accumulator->load($config->get('Attr', 'IDBlacklist'));
|
||||
$context->register('IDAccumulator', $id_accumulator);
|
||||
|
||||
// setup validator
|
||||
$validator = new HTMLPurifier_AttrValidator();
|
||||
|
||||
@@ -44,8 +35,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
|
||||
|
||||
$tokens[$key] = $token; // for PHP 4
|
||||
}
|
||||
|
||||
$context->destroy('IDAccumulator');
|
||||
$context->destroy('CurrentToken');
|
||||
|
||||
return $tokens;
|
||||
|
@@ -1,10 +1,22 @@
|
||||
<?php
|
||||
|
||||
/**
|
||||
* Chainable filters for custom URI processing
|
||||
* Chainable filters for custom URI processing.
|
||||
*
|
||||
* These filters can perform custom actions on a URI filter object,
|
||||
* including transformation or blacklisting.
|
||||
*
|
||||
* @warning This filter is called before scheme object validation occurs.
|
||||
* Make sure, if you require a specific scheme object, you
|
||||
* check that it exists. This allows filters to convert
|
||||
* proprietary URI schemes into regular ones.
|
||||
*/
|
||||
class HTMLPurifier_URIFilter
|
||||
{
|
||||
|
||||
/**
|
||||
* Unique identifier of filter
|
||||
*/
|
||||
var $name;
|
||||
|
||||
/**
|
||||
@@ -17,8 +29,12 @@ class HTMLPurifier_URIFilter
|
||||
* @param &$uri Reference to URI object
|
||||
* @param $config Instance of HTMLPurifier_Config
|
||||
* @param &$context Instance of HTMLPurifier_Context
|
||||
* @return bool Whether or not to continue processing: false indicates
|
||||
* URL is no good, true indicates continue processing. Note that
|
||||
* all changes are committed directly on the URI object
|
||||
*/
|
||||
function filter(&$uri, $config, &$context) {
|
||||
trigger_error('Cannot call abstract function', E_USER_ERROR);
|
||||
}
|
||||
|
||||
}
|
||||
|
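The docblocks above describe the filter contract: filter() receives the URI by reference, may change it in place, and returns false to reject it or true to continue. A minimal sketch of a filter following that contract is given below. Everything in it is an assumption made for illustration: DemoURIFilter is a stand-in for the real base class, DemoURIFilter_BlockHost is a hypothetical filter, and the URI is a plain object with only a host property rather than the library's URI class.

```php
<?php
// Stand-in for the base class, reduced to the contract in the docblock.
class DemoURIFilter
{
    public $name;

    public function filter(&$uri, $config, &$context) {
        trigger_error('Cannot call abstract function', E_USER_ERROR);
    }
}

// Hypothetical filter: normalizes the host and rejects one blocked host.
class DemoURIFilter_BlockHost extends DemoURIFilter
{
    public $name = 'BlockHost';
    private $blocked;

    public function __construct($blocked) {
        $this->blocked = strtolower($blocked);
    }

    public function filter(&$uri, $config, &$context) {
        if ($uri->host === null) {
            return true; // nothing to inspect, keep processing
        }
        // transformation: changes are made directly on the URI object
        $uri->host = strtolower($uri->host);
        // blacklisting: false means the URI is no good
        return $uri->host !== $this->blocked;
    }
}

// Usage with a plain object standing in for a parsed URI.
$filter  = new DemoURIFilter_BlockHost('evil.example');
$context = null;

$good = new stdClass();
$good->host = 'Example.COM';
$goodResult = $filter->filter($good, null, $context);

$bad = new stdClass();
$bad->host = 'EVIL.example';
$badResult = $filter->filter($bad, null, $context);
```

Because the filter runs before scheme validation, a real filter that needs a scheme object would have to check that one exists first, as the MakeAbsolute change in this diff does.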
@@ -47,6 +47,10 @@ class HTMLPurifier_URIFilter_MakeAbsolute extends HTMLPurifier_URIFilter
|
||||
// absolute URI already: don't change
|
||||
if (!is_null($uri->host)) return true;
|
||||
$scheme_obj = $uri->getSchemeObj($config, $context);
|
||||
if (!$scheme_obj) {
|
||||
// scheme not recognized
|
||||
return false;
|
||||
}
|
||||
if (!$scheme_obj->hierarchical) {
|
||||
// non-hierarchical URI with explicit scheme, don't change
|
||||
return true;
|
||||
|
@@ -1,5 +1,12 @@
|
||||
<?php
|
||||
|
||||
require_once 'HTMLPurifier/URIScheme/http.php';
|
||||
require_once 'HTMLPurifier/URIScheme/https.php';
|
||||
require_once 'HTMLPurifier/URIScheme/mailto.php';
|
||||
require_once 'HTMLPurifier/URIScheme/ftp.php';
|
||||
require_once 'HTMLPurifier/URIScheme/nntp.php';
|
||||
require_once 'HTMLPurifier/URIScheme/news.php';
|
||||
|
||||
HTMLPurifier_ConfigSchema::define(
|
||||
'URI', 'AllowedSchemes', array(
|
||||
'http' => true, // "Hypertext Transfer Protocol", nuf' said
|
||||
@@ -7,7 +14,6 @@ HTMLPurifier_ConfigSchema::define(
|
||||
// quite useful, but not necessary
|
||||
'mailto' => true,// Email
|
||||
'ftp' => true, // "File Transfer Protocol"
|
||||
'irc' => true, // "Internet Relay Chat", usually needs another app
|
||||
// for Usenet, these two are similar, but distinct
|
||||
'nntp' => true, // individual Netnews articles
|
||||
'news' => true // newsgroup or individual Netnews articles
|
||||
@@ -54,12 +60,6 @@ class HTMLPurifier_URISchemeRegistry
|
||||
*/
|
||||
var $schemes = array();
|
||||
|
||||
/**
|
||||
* Directory where scheme objects can be found
|
||||
* @private
|
||||
*/
|
||||
var $_scheme_dir = null;
|
||||
|
||||
/**
|
||||
* Retrieves a scheme validator object
|
||||
* @param $scheme String scheme name like http or mailto
|
||||
@@ -79,21 +79,16 @@ class HTMLPurifier_URISchemeRegistry
|
||||
}
|
||||
|
||||
if (isset($this->schemes[$scheme])) return $this->schemes[$scheme];
|
||||
if (empty($this->_dir)) $this->_dir = HTMLPURIFIER_PREFIX . '/HTMLPurifier/URIScheme/';
|
||||
|
||||
if (!isset($allowed_schemes[$scheme])) return $null;
|
||||
|
||||
// this bit of reflection is not very efficient, and a bit
|
||||
// hacky too
|
||||
$class = 'HTMLPurifier_URIScheme_' . $scheme;
|
||||
if (!class_exists($class)) include_once $this->_dir . $scheme . '.php';
|
||||
if (!class_exists($class)) return $null;
|
||||
$this->schemes[$scheme] = new $class();
|
||||
return $this->schemes[$scheme];
|
||||
}
|
||||
|
||||
/**
|
||||
* Registers a custom scheme to the cache.
|
||||
* Registers a custom scheme to the cache, bypassing reflection.
|
||||
* @param $scheme Scheme name
|
||||
* @param $scheme_obj HTMLPurifier_URIScheme object
|
||||
*/
|
||||
|
64
maintenance/PH5P.patch
Normal file
@@ -0,0 +1,64 @@
|
||||
--- C:\Users\Edward\Webs\htmlpurifier\maintenance\PH5P.php 2007-11-04 23:41:49.074543700 -0500
|
||||
+++ C:\Users\Edward\Webs\htmlpurifier\maintenance/PH5P.new.php 2007-11-05 00:23:52.839543700 -0500
|
||||
@@ -211,7 +211,10 @@
|
||||
// If nothing is returned, emit a U+0026 AMPERSAND character token.
|
||||
// Otherwise, emit the character token that was returned.
|
||||
$char = (!$entity) ? '&' : $entity;
|
||||
- $this->emitToken($char);
|
||||
+ $this->emitToken(array(
|
||||
+ 'type' => self::CHARACTR,
|
||||
+ 'data' => $char
|
||||
+ ));
|
||||
|
||||
// Finally, switch to the data state.
|
||||
$this->state = 'data';
|
||||
@@ -708,7 +711,7 @@
|
||||
} elseif($char === '&') {
|
||||
/* U+0026 AMPERSAND (&)
|
||||
Switch to the entity in attribute value state. */
|
||||
- $this->entityInAttributeValueState('non');
|
||||
+ $this->entityInAttributeValueState();
|
||||
|
||||
} elseif($char === '>') {
|
||||
/* U+003E GREATER-THAN SIGN (>)
|
||||
@@ -738,7 +741,8 @@
|
||||
? '&'
|
||||
: $entity;
|
||||
|
||||
- $this->emitToken($char);
|
||||
+ $last = count($this->token['attr']) - 1;
|
||||
+ $this->token['attr'][$last]['value'] .= $char;
|
||||
}
|
||||
|
||||
private function bogusCommentState() {
|
||||
@@ -1066,6 +1070,11 @@
|
||||
$this->char++;
|
||||
|
||||
if(in_array($id, $this->entities)) {
|
||||
+ if ($e_name[$c-1] !== ';') {
|
||||
+ if ($c < $len && $e_name[$c] == ';') {
|
||||
+ $this->char++; // consume extra semicolon
|
||||
+ }
|
||||
+ }
|
||||
$entity = $id;
|
||||
break;
|
||||
}
|
||||
@@ -3659,7 +3668,7 @@
|
||||
}
|
||||
}
|
||||
|
||||
- private function generateImpliedEndTags(array $exclude = array()) {
|
||||
+ private function generateImpliedEndTags($exclude = array()) {
|
||||
/* When the steps below require the UA to generate implied end tags,
|
||||
then, if the current node is a dd element, a dt element, an li element,
|
||||
a p element, a td element, a th element, or a tr element, the UA must
|
||||
@@ -3673,7 +3682,8 @@
|
||||
}
|
||||
}
|
||||
|
||||
- private function getElementCategory($name) {
|
||||
+ private function getElementCategory($node) {
|
||||
+ $name = $node->tagName;
|
||||
if(in_array($name, $this->special))
|
||||
return self::SPECIAL;
|
||||
|
3824
maintenance/PH5P.php
Normal file
File diff suppressed because it is too large
@@ -1,5 +1,7 @@
|
||||
<?php
|
||||
|
||||
require_once 'compat-function-file-put-contents.php';
|
||||
|
||||
function assertCli() {
|
||||
if (php_sapi_name() != 'cli' && !getenv('PHP_IS_CLI')) {
|
||||
echo 'Script cannot be called from web-browser (if you are calling via cli,
|
||||
@@ -7,3 +9,135 @@ set environment variable PHP_IS_CLI to work around this).';
|
||||
exit;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Filesystem tools not provided by default; can recursively create, copy
|
||||
* and delete folders. Some template methods are provided for extensibility.
|
||||
* @note This class must be instantiated to be used, although it does
|
||||
* not maintain state.
|
||||
*/
|
||||
class FSTools
|
||||
{
|
||||
|
||||
/**
|
||||
* Recursively creates a directory
|
||||
* @param string $folder Name of folder to create
|
||||
* @note Adapted from the PHP manual comment 76612
|
||||
*/
|
||||
function mkdir($folder) {
|
||||
$folders = preg_split("#[\\\\/]#", $folder);
|
||||
$base = '';
|
||||
for($i = 0, $c = count($folders); $i < $c; $i++) {
|
||||
if(empty($folders[$i])) {
|
||||
if (!$i) {
|
||||
// special case for root level
|
||||
$base .= DIRECTORY_SEPARATOR;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
$base .= $folders[$i];
|
||||
if(!is_dir($base)){
|
||||
mkdir($base);
|
||||
}
|
||||
$base .= DIRECTORY_SEPARATOR;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Copy a file, or recursively copy a folder and its contents; modified
|
||||
* so that copied files, if PHP, have includes removed
|
||||
*
|
||||
* @author Aidan Lister <aidan@php.net>
|
||||
* @version 1.0.1-modified
|
||||
* @link http://aidanlister.com/repos/v/function.copyr.php
|
||||
* @param string $source Source path
|
||||
* @param string $dest Destination path
|
||||
* @return bool Returns TRUE on success, FALSE on failure
|
||||
*/
|
||||
function copyr($source, $dest) {
|
||||
// Simple copy for a file
|
||||
if (is_file($source)) {
|
||||
return $this->copy($source, $dest);
|
||||
}
|
||||
// Make destination directory
|
||||
if (!is_dir($dest)) {
|
||||
mkdir($dest);
|
||||
}
|
||||
// Loop through the folder
|
||||
$dir = dir($source);
|
||||
while (false !== $entry = $dir->read()) {
|
||||
// Skip pointers
|
||||
if ($entry == '.' || $entry == '..') {
|
||||
continue;
|
||||
}
|
||||
if (!$this->copyable($entry)) {
|
||||
continue;
|
||||
}
|
||||
// Deep copy directories
|
||||
if ($dest !== "$source/$entry") {
|
||||
$this->copyr("$source/$entry", "$dest/$entry");
|
||||
}
|
||||
}
|
||||
// Clean up
|
||||
$dir->close();
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Stub for PHP's built-in copy function, can be used to overload
|
||||
* functionality
|
||||
*/
|
||||
function copy($source, $dest) {
|
||||
return copy($source, $dest);
|
||||
}
|
||||
|
||||
/**
|
||||
* Overloadable function that tests a filename for copyability. By
|
||||
* default, everything should be copied; you can restrict things to
|
||||
* ignore hidden files, unreadable files, etc.
|
||||
*/
|
||||
function copyable($file) {
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete a file, or a folder and its contents
|
||||
*
|
||||
* @author Aidan Lister <aidan@php.net>
|
||||
* @version 1.0.3
|
||||
* @link http://aidanlister.com/repos/v/function.rmdirr.php
|
||||
* @param string $dirname Directory to delete
|
||||
* @return bool Returns TRUE on success, FALSE on failure
|
||||
*/
|
||||
function rmdirr($dirname)
|
||||
{
|
||||
// Sanity check
|
||||
if (!file_exists($dirname)) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Simple delete for a file
|
||||
if (is_file($dirname) || is_link($dirname)) {
|
||||
return unlink($dirname);
|
||||
}
|
||||
|
||||
// Loop through the folder
|
||||
$dir = dir($dirname);
|
||||
while (false !== $entry = $dir->read()) {
|
||||
// Skip pointers
|
||||
if ($entry == '.' || $entry == '..') {
|
||||
continue;
|
||||
}
|
||||
// Recurse
|
||||
$this->rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
|
||||
}
|
||||
|
||||
// Clean up
|
||||
$dir->close();
|
||||
return rmdir($dirname);
|
||||
}
|
||||
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
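FSTools::mkdir() above walks the path segment by segment so that it also works on PHP 4. On PHP 5 and later, the built-in mkdir() accepts a third argument that creates parent directories in one call. The sketch below shows that alternative under the assumption that PHP 4 support is not needed; make_dir_recursive and the temporary path are hypothetical names chosen for the example, not part of the maintenance scripts.

```php
<?php
// Hypothetical helper: relies on the recursive flag of mkdir() (PHP 5+),
// so it is not a drop-in replacement for the PHP 4 compatible FSTools::mkdir().
function make_dir_recursive($folder) {
    if (is_dir($folder)) {
        return true; // already there, nothing to do
    }
    return mkdir($folder, 0777, true);
}

// Build a nested path under the system temp directory.
$root   = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'fstools_demo_' . uniqid();
$target = $root . DIRECTORY_SEPARATOR . 'a' . DIRECTORY_SEPARATOR . 'b';

$created           = make_dir_recursive($target);
$existsAfterCreate = is_dir($target);

// Clean up from the innermost directory outwards.
rmdir($target);
rmdir(dirname($target));
rmdir($root);
$existsAfterCleanup = is_dir($root);
```

The same flag would also cover the plain mkdir($dest) call in copyr(), which as written only succeeds when the parent of $dest already exists.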
107
maintenance/compat-function-file-put-contents.php
Normal file
@@ -0,0 +1,107 @@
|
||||
<?php
|
||||
// $Id: file_put_contents.php,v 1.27 2007/04/17 10:09:56 arpad Exp $
|
||||
|
||||
|
||||
if (!defined('FILE_USE_INCLUDE_PATH')) {
|
||||
define('FILE_USE_INCLUDE_PATH', 1);
|
||||
}
|
||||
|
||||
if (!defined('LOCK_EX')) {
|
||||
define('LOCK_EX', 2);
|
||||
}
|
||||
|
||||
if (!defined('FILE_APPEND')) {
|
||||
define('FILE_APPEND', 8);
|
||||
}
|
||||
|
||||
|
||||
/**
|
||||
* Replace file_put_contents()
|
||||
*
|
||||
* @category PHP
|
||||
* @package PHP_Compat
|
||||
* @license LGPL - http://www.gnu.org/licenses/lgpl.html
|
||||
* @copyright 2004-2007 Aidan Lister <aidan@php.net>, Arpad Ray <arpad@php.net>
|
||||
* @link http://php.net/function.file_put_contents
|
||||
* @author Aidan Lister <aidan@php.net>
|
||||
* @version $Revision: 1.27 $
|
||||
* @internal resource_context is not supported
|
||||
* @since PHP 5
|
||||
* @require PHP 4.0.0 (user_error)
|
||||
*/
|
||||
function php_compat_file_put_contents($filename, $content, $flags = null, $resource_context = null)
|
||||
{
|
||||
// If $content is an array, convert it to a string
|
||||
if (is_array($content)) {
|
||||
$content = implode('', $content);
|
||||
}
|
||||
|
||||
// If we don't have a string, throw an error
|
||||
if (!is_scalar($content)) {
|
||||
user_error('file_put_contents() The 2nd parameter should be either a string or an array',
|
||||
E_USER_WARNING);
|
||||
return false;
|
||||
}
|
||||
|
||||
// Get the length of data to write
|
||||
$length = strlen($content);
|
||||
|
||||
// Check what mode we are using
|
||||
$mode = ($flags & FILE_APPEND) ?
|
||||
'a' :
|
||||
'wb';
|
||||
|
||||
// Check if we're using the include path
|
||||
$use_inc_path = ($flags & FILE_USE_INCLUDE_PATH) ?
|
||||
true :
|
||||
false;
|
||||
|
||||
// Open the file for writing
|
||||
if (($fh = @fopen($filename, $mode, $use_inc_path)) === false) {
|
||||
user_error('file_put_contents() failed to open stream: Permission denied',
|
||||
E_USER_WARNING);
|
||||
return false;
|
||||
}
|
||||
|
||||
// Attempt to get an exclusive lock
|
||||
$use_lock = ($flags & LOCK_EX) ? true : false ;
|
||||
if ($use_lock === true) {
|
||||
if (!flock($fh, LOCK_EX)) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
// Write to the file
|
||||
$bytes = 0;
|
||||
if (($bytes = @fwrite($fh, $content)) === false) {
|
||||
$errormsg = sprintf('file_put_contents() Failed to write %d bytes to %s',
|
||||
$length,
|
||||
$filename);
|
||||
user_error($errormsg, E_USER_WARNING);
|
||||
return false;
|
||||
}
|
||||
|
||||
// Close the handle
|
||||
@fclose($fh);
|
||||
|
||||
// Check all the data was written
|
||||
if ($bytes != $length) {
|
||||
$errormsg = sprintf('file_put_contents() Only %d of %d bytes written, possibly out of free disk space.',
|
||||
$bytes,
|
||||
$length);
|
||||
user_error($errormsg, E_USER_WARNING);
|
||||
return false;
|
||||
}
|
||||
|
||||
// Return length
|
||||
return $bytes;
|
||||
}
|
||||
|
||||
|
||||
// Define
|
||||
if (!function_exists('file_put_contents')) {
|
||||
function file_put_contents($filename, $content, $flags = null, $resource_context = null)
|
||||
{
|
||||
return php_compat_file_put_contents($filename, $content, $flags, $resource_context);
|
||||
}
|
||||
}
|
@@ -32,5 +32,5 @@ foreach ($names as $name) {
|
||||
$cache->flush($config);
|
||||
}
|
||||
|
||||
echo 'Cache flushed successfully.';
|
||||
echo "Cache flushed successfully.\n";
|
||||
|
||||
|
13
maintenance/generate-ph5p-patch.php
Normal file
@@ -0,0 +1,13 @@
|
||||
<?php
|
||||
|
||||
$orig = realpath(dirname(__FILE__) . '/PH5P.php');
|
||||
$new = realpath(dirname(__FILE__) . '/../library/HTMLPurifier/Lexer/PH5P.php');
|
||||
$newt = dirname(__FILE__) . '/PH5P.new.php'; // temporary file
|
||||
|
||||
// minor text-processing of new file to get into same format as original
|
||||
$new_src = file_get_contents($new);
|
||||
$new_src = '<?php' . PHP_EOL . substr($new_src, strpos($new_src, 'class HTML5 {'));
|
||||
|
||||
file_put_contents($newt, $new_src);
|
||||
shell_exec("diff -u \"$orig\" \"$newt\" > PH5P.patch");
|
||||
unlink($newt);
|
@@ -6,20 +6,38 @@ assertCli();
|
||||
|
||||
/**
|
||||
* Compiles all of HTML Purifier's library files into one big file
|
||||
* named HTMLPurifier.standalone.php. Operates recursively, and will
|
||||
* barf if there are conditional includes.
|
||||
*
|
||||
* Details: also creates blank "include" files in the test/blank directory
|
||||
* in order to simulate require_once's inside the test files.
|
||||
* named HTMLPurifier.standalone.php.
|
||||
*/
|
||||
|
||||
/**
|
||||
* Global array that tracks already loaded includes
|
||||
* Global hash that tracks already loaded includes
|
||||
*/
|
||||
$GLOBALS['loaded'] = array('HTMLPurifier.php' => true);
|
||||
|
||||
/**
|
||||
* @param $text Text to replace includes from
|
||||
* Custom FSTools for this script that overloads some behavior
|
||||
* @warning The overloading of copy() is not necessarily global for
|
||||
* this script. Watch out!
|
||||
*/
|
||||
class MergeLibraryFSTools extends FSTools
|
||||
{
|
||||
function copyable($entry) {
|
||||
// Skip hidden files
|
||||
if ($entry[0] == '.') {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
function copy($source, $dest) {
|
||||
copy_and_remove_includes($source, $dest);
|
||||
}
|
||||
}
|
||||
$FS = new MergeLibraryFSTools();
|
||||
|
||||
/**
|
||||
* Replaces the includes inside PHP source code with the corresponding
|
||||
* source.
|
||||
* @param string $text PHP source code to replace includes from
|
||||
*/
|
||||
function replace_includes($text) {
|
||||
return preg_replace_callback(
|
||||
@@ -32,6 +50,8 @@ function replace_includes($text) {
|
||||
/**
|
||||
* Removes leading PHP tags from included files. Assumes that there is
|
||||
* no trailing tag.
|
||||
* @note This is safe for files that have internal <?php
|
||||
* @param string $text Text to have leading PHP tag from
|
||||
*/
|
||||
function remove_php_tags($text) {
|
||||
return substr($text, 5);
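As a rough illustration of why `substr($text, 5)` is enough here: the opening tag `<?php` is exactly five characters, so only the leading tag is dropped and any later occurrence is left alone. The function name and sample string below are made up for the example.

```php
<?php
// Same idea as remove_php_tags(): drop the first five characters ("<?php").
function strip_leading_php_tag($text)
{
    return substr($text, 5);
}

$source   = "<?php\necho '<?php inside a string is untouched';\n";
$stripped = strip_leading_php_tag($source);
```

This only works under the assumption the docblock states: the file starts with the tag at offset 0 and has no trailing close tag.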
|
||||
@@ -40,125 +60,48 @@ function remove_php_tags($text) {
|
||||
/**
|
||||
* Creates an appropriate blank file, recursively generating directories
|
||||
* if necessary
|
||||
* @param string $file Filename to create blank for
|
||||
*/
|
||||
function create_blank($file) {
|
||||
global $FS;
|
||||
$dir = dirname($file);
|
||||
$base = realpath('../tests/blanks/') . DIRECTORY_SEPARATOR ;
|
||||
if ($dir != '.') mkdir_deep($base . $dir);
|
||||
if ($dir != '.') {
|
||||
$FS->mkdir($base . $dir);
|
||||
}
|
||||
file_put_contents($base . $file, '');
|
||||
}
|
||||
|
||||
/**
|
||||
* Recursively creates a directory
|
||||
* @note Adapted from the PHP manual comment 76612
|
||||
* Copies the contents of a directory to the standalone directory
|
||||
* @param string $dir Directory to copy
|
||||
*/
|
||||
function mkdir_deep($folder) {
|
||||
$folders = preg_split("#[\\\\/]#", $folder);
|
||||
$base = '';
|
||||
for($i = 0, $c = count($folders); $i < $c; $i++) {
|
||||
if(empty($folders[$i])) {
|
||||
if (!$i) {
|
||||
// special case for root level
|
||||
$base .= DIRECTORY_SEPARATOR;
|
||||
}
|
||||
continue;
|
||||
}
|
||||
$base .= $folders[$i];
|
||||
if(!is_dir($base)){
|
||||
mkdir($base);
|
||||
}
|
||||
$base .= DIRECTORY_SEPARATOR;
|
||||
}
|
||||
function make_dir_standalone($dir) {
|
||||
global $FS;
|
||||
return $FS->copyr($dir, 'standalone/' . $dir);
|
||||
}
|
||||
|
||||
/**
|
||||
* Copy a file, or recursively copy a folder and its contents
|
||||
*
|
||||
* @author Aidan Lister <aidan@php.net>
|
||||
* @version 1.0.1
|
||||
* @link http://aidanlister.com/repos/v/function.copyr.php
|
||||
* @param string $source Source path
|
||||
* @param string $dest Destination path
|
||||
* @return bool Returns TRUE on success, FALSE on failure
|
||||
* Copies the contents of a file to the standalone directory
|
||||
* @param string $file File to copy
|
||||
*/
|
||||
function copyr($source, $dest) {
|
||||
// Simple copy for a file
|
||||
if (is_file($source)) {
|
||||
return copy($source, $dest);
|
||||
}
|
||||
// Make destination directory
|
||||
if (!is_dir($dest)) {
|
||||
mkdir($dest);
|
||||
}
|
||||
// Loop through the folder
|
||||
$dir = dir($source);
|
||||
while (false !== $entry = $dir->read()) {
|
||||
// Skip pointers
|
||||
if ($entry == '.' || $entry == '..') {
|
||||
continue;
|
||||
}
|
||||
// Skip hidden files
|
||||
if ($entry[0] == '.') {
|
||||
continue;
|
||||
}
|
||||
// Deep copy directories
|
||||
if ($dest !== "$source/$entry") {
|
||||
copyr("$source/$entry", "$dest/$entry");
|
||||
}
|
||||
}
|
||||
// Clean up
|
||||
$dir->close();
|
||||
function make_file_standalone($file) {
|
||||
global $FS;
|
||||
$FS->mkdir('standalone/' . dirname($file));
|
||||
copy_and_remove_includes($file, 'standalone/' . $file);
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete a file, or a folder and its contents
|
||||
*
|
||||
* @author Aidan Lister <aidan@php.net>
|
||||
* @version 1.0.3
|
||||
* @link http://aidanlister.com/repos/v/function.rmdirr.php
|
||||
* @param string $dirname Directory to delete
|
||||
* @return bool Returns TRUE on success, FALSE on failure
|
||||
* Copies a file to another location recursively, if it is a PHP file
|
||||
* remove includes
|
||||
* @param string $file Original file
|
||||
* @param string $sfile New location of file
|
||||
*/
|
||||
function rmdirr($dirname)
|
||||
{
|
||||
// Sanity check
|
||||
if (!file_exists($dirname)) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Simple delete for a file
|
||||
if (is_file($dirname) || is_link($dirname)) {
|
||||
return unlink($dirname);
|
||||
}
|
||||
|
||||
// Loop through the folder
|
||||
$dir = dir($dirname);
|
||||
while (false !== $entry = $dir->read()) {
|
||||
// Skip pointers
|
||||
if ($entry == '.' || $entry == '..') {
|
||||
continue;
|
||||
}
|
||||
|
||||
// Recurse
|
||||
rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
|
||||
}
|
||||
|
||||
// Clean up
|
||||
$dir->close();
|
||||
return rmdir($dirname);
|
||||
}
|
||||
|
||||
/**
|
||||
* Copies the contents of a directory to the standalone directory
|
||||
*/
|
||||
function make_dir_standalone($dir) {
|
||||
return copyr($dir, 'standalone/' . $dir);
|
||||
}
|
||||
|
||||
function make_file_standalone($file) {
|
||||
mkdir_deep('standalone/' . dirname($file));
|
||||
return copy($file, 'standalone/' . $file);
|
||||
function copy_and_remove_includes($file, $sfile) {
|
||||
$contents = file_get_contents($file);
|
||||
if (strrchr($file, '.') === '.php') $contents = replace_includes($contents);
|
||||
return file_put_contents($sfile, $contents);
|
||||
}
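The extension test in `copy_and_remove_includes()` relies on `strrchr()`, which returns the tail of the string starting at the last occurrence of the needle, or `false` when the needle is absent. A small sketch of that check in isolation, using a hypothetical helper name:

```php
<?php
// Hypothetical helper: true only when the path ends in ".php".
function has_php_extension($file)
{
    // strrchr() yields e.g. ".php" or ".js", or false if there is no dot.
    return strrchr($file, '.') === '.php';
}

$is_php = has_php_extension('HTMLPurifier/Lexer/DOMLex.php');
$is_js  = has_php_extension('HTMLPurifier/Printer/ConfigForm.js');
```

Because the comparison is strict, a path without any dot (where `strrchr()` returns `false`) is treated as a non-PHP file and copied verbatim.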
|
||||
|
||||
/**
|
||||
@@ -167,8 +110,14 @@ function make_file_standalone($file) {
|
||||
*/
|
||||
function replace_includes_callback($matches) {
|
||||
$file = $matches[1];
|
||||
// PHP 5 only file
|
||||
if ($file == 'HTMLPurifier/Lexer/DOMLex.php') {
|
||||
$preserve = array(
|
||||
// PHP 5 only
|
||||
'HTMLPurifier/Lexer/DOMLex.php' => 1,
|
||||
'HTMLPurifier/Printer.php' => 1,
|
||||
// PEAR (external)
|
||||
'XML/HTMLSax3.php' => 1
|
||||
);
|
||||
if (isset($preserve[$file])) {
|
||||
return $matches[0];
|
||||
}
|
||||
if (isset($GLOBALS['loaded'][$file])) return '';
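Both `$preserve` and `$GLOBALS['loaded']` use array keys as a set, so membership is a single `isset()` call. The sketch below shows the three outcomes for an include; `classify_include` is a hypothetical helper, and marking the file as loaded before inlining is an assumption, since that part of the callback is outside this hunk.

```php
<?php
// Files that must stay as real includes (keys act as a set).
$preserve = array(
    'HTMLPurifier/Lexer/DOMLex.php' => 1,
    'XML/HTMLSax3.php' => 1,
);
// Files whose source has already been merged.
$loaded = array('HTMLPurifier.php' => true);

// Hypothetical helper: reports what a merger would do with $file.
function classify_include($file, $preserve, &$loaded)
{
    if (isset($preserve[$file])) {
        return 'keep';   // leave the require_once statement untouched
    }
    if (isset($loaded[$file])) {
        return 'skip';   // already inlined once, emit nothing
    }
    $loaded[$file] = true; // assumption: mark before inlining
    return 'inline';
}

$first  = classify_include('HTMLPurifier/Config.php', $preserve, $loaded);
$second = classify_include('HTMLPurifier/Config.php', $preserve, $loaded);
```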
|
||||
@@ -192,16 +141,22 @@ file_put_contents('HTMLPurifier.standalone.php', $contents);
|
||||
echo ' done!' . PHP_EOL;
|
||||
|
||||
echo 'Creating standalone directory...';
|
||||
rmdirr('standalone'); // ensure a clean copy
|
||||
mkdir_deep('standalone/HTMLPurifier/DefinitionCache/Serializer');
|
||||
make_dir_standalone('HTMLPurifier/EntityLookup');
|
||||
make_dir_standalone('HTMLPurifier/Language');
|
||||
make_file_standalone('HTMLPurifier/Printer/ConfigForm.js');
|
||||
make_file_standalone('HTMLPurifier/Printer/ConfigForm.css');
|
||||
make_dir_standalone('HTMLPurifier/URIScheme');
|
||||
// PHP 5 only file
|
||||
mkdir_deep('standalone/HTMLPurifier/Lexer');
|
||||
make_file_standalone('HTMLPurifier/Lexer/DOMLex.php');
|
||||
make_file_standalone('HTMLPurifier/TokenFactory.php');
|
||||
echo ' done!' . PHP_EOL;
|
||||
$FS->rmdirr('standalone'); // ensure a clean copy
|
||||
|
||||
// data files
|
||||
$FS->mkdir('standalone/HTMLPurifier/DefinitionCache/Serializer');
|
||||
make_dir_standalone('HTMLPurifier/EntityLookup');
|
||||
|
||||
// non-standard inclusion setup
|
||||
make_dir_standalone('HTMLPurifier/Language');
|
||||
|
||||
// optional components
|
||||
make_file_standalone('HTMLPurifier/Printer.php');
|
||||
make_dir_standalone('HTMLPurifier/Printer');
|
||||
make_dir_standalone('HTMLPurifier/Filter');
|
||||
make_file_standalone('HTMLPurifier/Lexer/PEARSax3.php');
|
||||
|
||||
// PHP 5 only files
|
||||
make_file_standalone('HTMLPurifier/Lexer/DOMLex.php');
|
||||
make_file_standalone('HTMLPurifier/Lexer/PH5P.php');
|
||||
echo ' done!' . PHP_EOL;
|
||||
|
@@ -261,12 +261,42 @@ function phorum_htmlpurifier_editor_after_subject() {
|
||||
// don't show this message if it's a WYSIWYG editor, since it will
|
||||
// then be handled automatically
|
||||
if (!empty($GLOBALS['PHORUM']['mod_htmlpurifier']['wysiwyg'])) return;
|
||||
?><tr><td colspan="2" style="padding:1em 0.3em;">
|
||||
HTML input is <strong>on</strong>. Make sure you escape all HTML and
|
||||
angled-brackets with &amp;lt; and &amp;gt; (you can also use CDATA
|
||||
tags, simply wrap the suspect text with
|
||||
&lt;![CDATA[<em>text</em>]]&gt;. Paragraphs will only be applied to
|
||||
double-spaces; single-spaces will not generate <tt>&lt;br&gt;</tt> tags.
|
||||
?><tr><td colspan="2" style="padding:1em 0.3em;" class="htmlpurifier-help">
|
||||
<p>
|
||||
<strong>HTML input</strong> is enabled. Make sure you escape all HTML and
|
||||
angled brackets with <code>&amp;lt;</code> and <code>&amp;gt;</code>.
|
||||
</p><?php
|
||||
$purifier =& HTMLPurifier::getInstance();
|
||||
$config = $purifier->config;
|
||||
if ($config->get('AutoFormat', 'AutoParagraph')) {
|
||||
?><p>
|
||||
<strong>Auto-paragraphing</strong> is enabled. Double
|
||||
newlines will be converted to paragraphs; for single
|
||||
newlines, use the <code>pre</code> tag.
|
||||
</p><?php
|
||||
}
|
||||
$html_definition = $config->getDefinition('HTML');
|
||||
$allowed = array();
|
||||
foreach ($html_definition->info as $name => $x) $allowed[] = "<code>$name</code>";
|
||||
sort($allowed);
|
||||
$allowed_text = implode(', ', $allowed);
|
||||
?><p><strong>Allowed tags:</strong> <?php
|
||||
echo $allowed_text;
|
||||
?>.</p><?php
|
||||
?>
|
||||
</p>
|
||||
<p>
|
||||
For inputting literal code such as HTML and PHP for display, use
|
||||
CDATA tags to auto-escape your angled brackets, and <code>pre</code>
|
||||
to preserve newlines:
|
||||
</p>
|
||||
<pre>&lt;pre&gt;&lt;![CDATA[
|
||||
<em>Place code here</em>
|
||||
]]&gt;&lt;/pre&gt;</pre>
|
||||
<p>
|
||||
Power users, you can hide this notice with:
|
||||
<pre>.htmlpurifier-help {display:none;}</pre>
|
||||
</p>
|
||||
</td></tr><?php
|
||||
}
|
||||
|
||||
|
@@ -20,8 +20,10 @@ function phorum_htmlpurifier_migrate_sigs_check() {
|
||||
function phorum_htmlpurifier_migrate_sigs($offset) {
|
||||
global $PHORUM;
|
||||
|
||||
if(!$offset) return; // bail out quick of $offset == 0
|
||||
if(!$offset) return; // bail out quick if $offset == 0
|
||||
|
||||
// theoretically, we could get rid of this multi-request
|
||||
// doo-hickery if safe mode is off
|
||||
@set_time_limit(0); // attempt to let this run
|
||||
$increment = $PHORUM['mod_htmlpurifier']['migrate-sigs-increment'];
|
||||
|
||||
@@ -52,21 +54,19 @@ function phorum_htmlpurifier_migrate_sigs($offset) {
|
||||
|
||||
// query for highest ID in database
|
||||
$type = $PHORUM['DBCONFIG']['type'];
|
||||
$sql = "select MAX(user_id) from {$PHORUM['user_table']}";
|
||||
if ($type == 'mysql') {
|
||||
$conn = phorum_db_mysql_connect();
|
||||
$sql = "select MAX(user_id) from {$PHORUM['user_table']}";
|
||||
$res = mysql_query($sql, $conn);
|
||||
$row = mysql_fetch_row($res);
|
||||
$top_id = (int) $row[0];
|
||||
} elseif ($type == 'mysqli') {
|
||||
$conn = phorum_db_mysqli_connect();
|
||||
$sql = "select MAX(user_id) from {$PHORUM['user_table']}";
|
||||
$res = mysqli_query($conn, $sql);
|
||||
$row = mysqli_fetch_row($res);
|
||||
$top_id = (int) $row[0];
|
||||
} else {
|
||||
exit('Unrecognized database!');
|
||||
}
|
||||
$top_id = (int) $row[0];
|
||||
|
||||
$offset += $increment;
|
||||
if ($offset > $top_id) { // test for end condition
|
||||
|
@@ -31,7 +31,7 @@ while (false !== ($filename = readdir($dh))) {
|
||||
if ($filename == 'all.php') continue;
|
||||
if ($filename == 'testSchema.php') continue;
|
||||
?>
|
||||
<iframe src="<?php echo escapeHTML($filename); ?>"></iframe>
|
||||
<iframe src="<?php echo escapeHTML($filename); if (isset($_GET['standalone'])) {echo '?standalone';} ?>"></iframe>
|
||||
<?php
|
||||
}
|
||||
|
||||
|
@@ -2,7 +2,11 @@
|
||||
|
||||
header('Content-type: text/html; charset=UTF-8');
|
||||
|
||||
require_once '../library/HTMLPurifier.auto.php';
|
||||
if (!isset($_GET['standalone'])) {
|
||||
require_once '../library/HTMLPurifier.auto.php';
|
||||
} else {
|
||||
require_once '../library/HTMLPurifier.standalone.php';
|
||||
}
|
||||
error_reporting(E_ALL | E_STRICT);
|
||||
|
||||
function escapeHTML($string) {
|
||||
|
@@ -54,14 +54,14 @@ function isInScopes($array = array()) {
|
||||
}
|
||||
/**#@-*/
|
||||
|
||||
function printTokens($tokens, $index) {
|
||||
function printTokens($tokens, $index = null) {
|
||||
$string = '<pre>';
|
||||
$generator = new HTMLPurifier_Generator();
|
||||
foreach ($tokens as $i => $token) {
|
||||
if ($index == $i) $string .= '[<strong>';
|
||||
if ($index === $i) $string .= '[<strong>';
|
||||
$string .= "<sup>$i</sup>";
|
||||
$string .= $generator->escape($generator->generateFromToken($token));
|
||||
if ($index == $i) $string .= '</strong>]';
|
||||
if ($index === $i) $string .= '</strong>]';
|
||||
}
|
||||
$string .= '</pre>';
|
||||
echo $string;
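The switch from `==` to `===` matters once `$index` defaults to `null`: in PHP, `null == 0` is true, so a loose comparison would highlight the first token whenever no index is passed. The sketch below isolates that behavior; `highlighted_positions` and its `$strict` flag are hypothetical and exist only to compare the two operators side by side.

```php
<?php
// Hypothetical helper: returns the token positions that would be highlighted.
function highlighted_positions($tokens, $index = null, $strict = true)
{
    $hits = array();
    foreach ($tokens as $i => $token) {
        $match = $strict ? ($index === $i) : ($index == $i);
        if ($match) {
            $hits[] = $i;
        }
    }
    return $hits;
}

$tokens = array('a', 'b', 'c');
$loose  = highlighted_positions($tokens, null, false); // null == 0 matches position 0
$strict = highlighted_positions($tokens, null, true);  // null === 0 never matches
```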
|
||||
|
@@ -67,6 +67,7 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
|
||||
$this->assertDef('border:1px solid #000;');
|
||||
$this->assertDef('border-bottom:2em double #FF00FA;');
|
||||
$this->assertDef('border-collapse:collapse;');
|
||||
$this->assertDef('border-collapse:separate;');
|
||||
$this->assertDef('caption-side:top;');
|
||||
$this->assertDef('vertical-align:middle;');
|
||||
$this->assertDef('vertical-align:12px;');
|
||||
@@ -79,6 +80,8 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
|
||||
$this->assertDef('background-repeat:repeat-y;');
|
||||
$this->assertDef('background-attachment:fixed;');
|
||||
$this->assertDef('background-position:left 90%;');
|
||||
$this->assertDef('border-spacing:1em;');
|
||||
$this->assertDef('border-spacing:1em 2em;');
|
||||
|
||||
// duplicates
|
||||
$this->assertDef('text-align:right;text-align:left;',
|
||||
|
@@ -11,18 +11,19 @@ class HTMLPurifier_AttrTransform_BdoDirTest extends HTMLPurifier_AttrTransformHa
|
||||
$this->obj = new HTMLPurifier_AttrTransform_BdoDir();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testAddDefaultDir() {
|
||||
$this->assertResult( array(), array('dir' => 'ltr') );
|
||||
}
|
||||
|
||||
// leave existing dir alone
|
||||
function testPreserveExistingDir() {
|
||||
$this->assertResult( array('dir' => 'rtl') );
|
||||
}
|
||||
|
||||
// use a different default
|
||||
function testAlternateDefault() {
|
||||
$this->config->set('Attr', 'DefaultTextDir', 'rtl');
|
||||
$this->assertResult(
|
||||
array(),
|
||||
array('dir' => 'rtl'),
|
||||
array('Attr.DefaultTextDir' => 'rtl')
|
||||
array('dir' => 'rtl')
|
||||
);
|
||||
|
||||
}
|
||||
|
@@ -3,6 +3,10 @@
|
||||
require_once 'HTMLPurifier/AttrTransform/BgColor.php';
|
||||
require_once 'HTMLPurifier/AttrTransformHarness.php';
|
||||
|
||||
// we currently rely on the CSS validator to fix any problems.
|
||||
// This means that this transform, strictly speaking, supports
|
||||
// a superset of the functionality.
|
||||
|
||||
class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformHarness
|
||||
{
|
||||
|
||||
@@ -11,31 +15,31 @@ class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformH
|
||||
$this->obj = new HTMLPurifier_AttrTransform_BgColor();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
// we currently rely on the CSS validator to fix any problems.
|
||||
// This means that this transform, strictly speaking, supports
|
||||
// a superset of the functionality.
|
||||
|
||||
function testBasicTransform() {
|
||||
$this->assertResult(
|
||||
array('bgcolor' => '#000000'),
|
||||
array('style' => 'background-color:#000000;')
|
||||
);
|
||||
}
|
||||
|
||||
function testPrependNewCSS() {
|
||||
$this->assertResult(
|
||||
array('bgcolor' => '#000000', 'style' => 'font-weight:bold'),
|
||||
array('style' => 'background-color:#000000;font-weight:bold')
|
||||
);
|
||||
}
|
||||
|
||||
function testLenientTreatmentOfInvalidInput() {
|
||||
// this may change when we natively support the datatype and
|
||||
// validate its contents before forwarding it on
|
||||
$this->assertResult(
|
||||
array('bgcolor' => '#F00'),
|
||||
array('style' => 'background-color:#F00;')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -11,27 +11,29 @@ class HTMLPurifier_AttrTransform_BoolToCSSTest extends HTMLPurifier_AttrTransfor
|
||||
$this->obj = new HTMLPurifier_AttrTransform_BoolToCSS('foo', 'bar:3in;');
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
function testBasicTransform() {
|
||||
$this->assertResult(
|
||||
array('foo' => 'foo'),
|
||||
array('style' => 'bar:3in;')
|
||||
);
|
||||
}
|
||||
|
||||
// boolean attribute just has to be set: we don't care about
|
||||
// anything else
|
||||
function testIgnoreValueOfBooleanAttribute() {
|
||||
$this->assertResult(
|
||||
array('foo' => 'no'),
|
||||
array('style' => 'bar:3in;')
|
||||
);
|
||||
}
|
||||
|
||||
function testPrependCSS() {
|
||||
$this->assertResult(
|
||||
array('foo' => 'foo', 'style' => 'background-color:#F00;'),
|
||||
array('style' => 'bar:3in;background-color:#F00;')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -12,27 +12,29 @@ class HTMLPurifier_AttrTransform_BorderTest extends HTMLPurifier_AttrTransformHa
|
||||
$this->obj = new HTMLPurifier_AttrTransform_Border();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
function testBasicTransform() {
|
||||
$this->assertResult(
|
||||
array('border' => '1'),
|
||||
array('style' => 'border:1px solid;')
|
||||
);
|
||||
}
|
||||
|
||||
// once again, no validation done here, we expect CSS validator
|
||||
// to catch it
|
||||
function testLenientTreatmentOfInvalidInput() {
|
||||
$this->assertResult(
|
||||
array('border' => '10%'),
|
||||
array('style' => 'border:10%px solid;')
|
||||
);
|
||||
}
|
||||
|
||||
function testPrependNewCSS() {
|
||||
$this->assertResult(
|
||||
array('border' => '23', 'style' => 'font-weight:bold;'),
|
||||
array('style' => 'border:23px solid;font-weight:bold;')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -6,38 +6,44 @@ require_once 'HTMLPurifier/AttrTransformHarness.php';
|
||||
class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransformHarness
|
||||
{
|
||||
|
||||
function testRegular() {
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||
'left' => 'text-align:left;',
|
||||
'right' => 'text-align:right;'
|
||||
));
|
||||
}
|
||||
|
||||
// leave empty arrays alone
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
// leave arrays without interesting stuff alone
|
||||
function testPreserveArraysWithoutInterestingAttributes() {
|
||||
$this->assertResult( array('style' => 'font-weight:bold;') );
|
||||
}
|
||||
|
||||
// test each of the conversions
|
||||
|
||||
function testConvertAlignLeft() {
|
||||
$this->assertResult(
|
||||
array('align' => 'left'),
|
||||
array('style' => 'text-align:left;')
|
||||
);
|
||||
}
|
||||
|
||||
function testConvertAlignRight() {
|
||||
$this->assertResult(
|
||||
array('align' => 'right'),
|
||||
array('style' => 'text-align:right;')
|
||||
);
|
||||
}
|
||||
|
||||
// drop garbage value
|
||||
function testRemoveInvalidAlign() {
|
||||
$this->assertResult(
|
||||
array('align' => 'invalid'),
|
||||
array()
|
||||
);
|
||||
}
|
||||
|
||||
// test CSS munging
|
||||
function testPrependNewCSS() {
|
||||
$this->assertResult(
|
||||
array('align' => 'left', 'style' => 'font-weight:bold;'),
|
||||
array('style' => 'text-align:left;font-weight:bold;')
|
||||
@@ -46,31 +52,23 @@ class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransfor
|
||||
}
|
||||
|
||||
function testCaseInsensitive() {
|
||||
|
||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||
'right' => 'text-align:right;'
|
||||
));
|
||||
|
||||
// test case insensitivity
|
||||
$this->assertResult(
|
||||
array('align' => 'RIGHT'),
|
||||
array('style' => 'text-align:right;')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testCaseSensitive() {
|
||||
|
||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||
'right' => 'text-align:right;'
|
||||
), true);
|
||||
|
||||
// test case insensitivity
|
||||
$this->assertResult(
|
||||
array('align' => 'RIGHT'),
|
||||
array()
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -11,39 +11,37 @@ class HTMLPurifier_AttrTransform_ImgRequiredTest extends HTMLPurifier_AttrTransf
|
||||
$this->obj = new HTMLPurifier_AttrTransform_ImgRequired();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testAddMissingAttr() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
array(),
|
||||
array('src' => '', 'alt' => 'Invalid image'),
|
||||
array(
|
||||
'Core.RemoveInvalidImg' => false
|
||||
)
|
||||
array('src' => '', 'alt' => 'Invalid image')
|
||||
);
|
||||
}
|
||||
|
||||
function testAlternateDefaults() {
|
||||
$this->config->set('Attr', 'DefaultInvalidImage', 'blank.png');
|
||||
$this->config->set('Attr', 'DefaultInvalidImageAlt', 'Pawned!');
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
array(),
|
||||
array('src' => 'blank.png', 'alt' => 'Pawned!'),
|
||||
array(
|
||||
'Attr.DefaultInvalidImage' => 'blank.png',
|
||||
'Attr.DefaultInvalidImageAlt' => 'Pawned!',
|
||||
'Core.RemoveInvalidImg' => false
|
||||
)
|
||||
array('src' => 'blank.png', 'alt' => 'Pawned!')
|
||||
);
|
||||
}
|
||||
|
||||
function testGenerateAlt() {
|
||||
$this->assertResult(
|
||||
array('src' => '/path/to/foobar.png'),
|
||||
array('src' => '/path/to/foobar.png', 'alt' => 'foobar.png')
|
||||
);
|
||||
}
|
||||
|
||||
function testAddDefaultSrc() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
array('alt' => 'intrigue'),
|
||||
array('alt' => 'intrigue', 'src' => ''),
|
||||
array(
|
||||
'Core.RemoveInvalidImg' => false
|
||||
)
|
||||
array('alt' => 'intrigue', 'src' => '')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -9,33 +9,35 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
|
||||
}
|
||||
|
||||
function testVertical() {
|
||||
|
||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
function testVerticalBasicUsage() {
|
||||
$this->assertResult(
|
||||
array('vspace' => '1'),
|
||||
array('style' => 'margin-top:1px;margin-bottom:1px;')
|
||||
);
|
||||
}
|
||||
|
||||
// no validation done here, we expect CSS validator to catch it
|
||||
function testLenientHandlingOfInvalidInput() {
|
||||
$this->assertResult(
|
||||
array('vspace' => '10%'),
|
||||
array('style' => 'margin-top:10%px;margin-bottom:10%px;')
|
||||
);
|
||||
}
|
||||
|
||||
function testPrependNewCSS() {
|
||||
$this->assertResult(
|
||||
array('vspace' => '23', 'style' => 'font-weight:bold;'),
|
||||
array('style' => 'margin-top:23px;margin-bottom:23px;font-weight:bold;')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testHorizontal() {
|
||||
function testHorizontalBasicUsage() {
|
||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('hspace');
|
||||
$this->assertResult(
|
||||
array('hspace' => '1'),
|
||||
@@ -43,7 +45,7 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
|
||||
);
|
||||
}
|
||||
|
||||
function testInvalid() {
|
||||
function testInvalidConstructionParameter() {
|
||||
$this->expectError('ispace is not valid space attribute');
|
||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('ispace');
|
||||
$this->assertResult(
|
||||
|
@@ -13,35 +13,36 @@ class HTMLPurifier_AttrTransform_LangTest
|
||||
$this->obj = new HTMLPurifier_AttrTransform_Lang();
|
||||
}
|
||||
|
||||
function test() {
|
||||
function testEmptyInput() {
|
||||
$this->assertResult(array());
|
||||
}
|
||||
|
||||
// leave non-lang'ed elements alone
|
||||
$this->assertResult(array(), true);
|
||||
|
||||
// copy lang to xml:lang
|
||||
function testCopyLangToXMLLang() {
|
||||
$this->assertResult(
|
||||
array('lang' => 'en'),
|
||||
array('lang' => 'en', 'xml:lang' => 'en')
|
||||
);
|
||||
}
|
||||
|
||||
// preserve attributes
|
||||
function testPreserveAttributes() {
|
||||
$this->assertResult(
|
||||
array('src' => 'vert.png', 'lang' => 'fr'),
|
||||
array('src' => 'vert.png', 'lang' => 'fr', 'xml:lang' => 'fr')
|
||||
);
|
||||
}
|
||||
|
||||
// copy xml:lang to lang
|
||||
function testCopyXMLLangToLang() {
|
||||
$this->assertResult(
|
||||
array('xml:lang' => 'en'),
|
||||
array('xml:lang' => 'en', 'lang' => 'en')
|
||||
);
|
||||
}
|
||||
|
||||
// both set, override lang with xml:lang
|
||||
function testXMLLangOverridesLang() {
|
||||
$this->assertResult(
|
||||
array('lang' => 'fr', 'xml:lang' => 'de'),
|
||||
array('lang' => 'de', 'xml:lang' => 'de')
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -11,21 +11,32 @@ class HTMLPurifier_AttrTransform_LengthTest extends HTMLPurifier_AttrTransformHa
|
||||
$this->obj = new HTMLPurifier_AttrTransform_Length('width');
|
||||
}
|
||||
|
||||
function test() {
|
||||
function testEmptyInput() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
function testTransformPixel() {
|
||||
$this->assertResult(
|
||||
array('width' => '10'),
|
||||
array('style' => 'width:10px;')
|
||||
);
|
||||
}
|
||||
|
||||
function testTransformPercentage() {
|
||||
$this->assertResult(
|
||||
array('width' => '10%'),
|
||||
array('style' => 'width:10%;')
|
||||
);
|
||||
}
|
||||
|
||||
function testPrependNewCSS() {
|
||||
$this->assertResult(
|
||||
array('width' => '10%', 'style' => 'font-weight:bold'),
|
||||
array('style' => 'width:10%;font-weight:bold')
|
||||
);
|
||||
// this behavior might change
|
||||
}
|
||||
|
||||
function testLenientTreatmentOfInvalidInput() {
|
||||
$this->assertResult(
|
||||
array('width' => 'asdf'),
|
||||
array('style' => 'width:asdf;')
|
||||
|
@@ -11,12 +11,18 @@ class HTMLPurifier_AttrTransform_NameTest extends HTMLPurifier_AttrTransformHarn
|
||||
$this->obj = new HTMLPurifier_AttrTransform_Name();
|
||||
}
|
||||
|
||||
function test() {
|
||||
function testEmpty() {
|
||||
$this->assertResult( array() );
|
||||
}
|
||||
|
||||
function testTransformNameToID() {
|
||||
$this->assertResult(
|
||||
array('name' => 'free'),
|
||||
array('id' => 'free')
|
||||
);
|
||||
}
|
||||
|
||||
function testExistingIDOverridesName() {
|
||||
$this->assertResult(
|
||||
array('name' => 'tryit', 'id' => 'tobad'),
|
||||
array('id' => 'tobad')
|
||||
|
@@ -6,6 +6,7 @@ class HTMLPurifier_AttrTransformHarness extends HTMLPurifier_ComplexHarness
|
||||
{
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->func = 'transform';
|
||||
}
|
||||
|
||||
|
@@ -35,7 +35,7 @@ class HTMLPurifier_AttrValidator_ErrorsTest extends HTMLPurifier_ErrorsHarness
|
||||
$this->invoke($token);
|
||||
}
|
||||
|
||||
// to lazy to check for global post and global pre
|
||||
// too lazy to check for global post and global pre
|
||||
|
||||
function testAttributeRemoved() {
|
||||
$this->expectErrorCollection(E_ERROR, 'AttrValidator: Attribute removed');
|
||||
|
@@ -6,28 +6,36 @@ require_once 'HTMLPurifier/ChildDef/Chameleon.php';
|
||||
class HTMLPurifier_ChildDef_ChameleonTest extends HTMLPurifier_ChildDefHarness
|
||||
{
|
||||
|
||||
function test() {
|
||||
var $isInline;
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_ChildDef_Chameleon(
|
||||
'b | i', // allowed only when in inline context
|
||||
'b | i | div' // allowed only when in block context
|
||||
);
|
||||
$this->context->register('IsInline', $this->isInline);
|
||||
}
|
||||
|
||||
function testInlineAlwaysAllowed() {
|
||||
$this->isInline = true;
|
||||
$this->assertResult(
|
||||
'<b>Allowed.</b>', true,
|
||||
array(), array('IsInline' => true)
|
||||
'<b>Allowed.</b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBlockNotAllowedInInline() {
|
||||
$this->isInline = true;
|
||||
$this->assertResult(
|
||||
'<div>Not allowed.</div>', '',
|
||||
array(), array('IsInline' => true)
|
||||
'<div>Not allowed.</div>', ''
|
||||
);
|
||||
}
|
||||
|
||||
function testBlockAllowedInNonInline() {
|
||||
$this->isInline = false;
|
||||
$this->assertResult(
|
||||
'<div>Allowed.</div>', true,
|
||||
array(), array('IsInline' => false)
|
||||
'<div>Allowed.</div>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -6,13 +6,21 @@ require_once 'HTMLPurifier/ChildDef/Optional.php';
|
||||
class HTMLPurifier_ChildDef_OptionalTest extends HTMLPurifier_ChildDefHarness
|
||||
{
|
||||
|
||||
function test() {
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_ChildDef_Optional('b | i');
|
||||
}
|
||||
|
||||
function testBasicUsage() {
|
||||
$this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
|
||||
$this->assertResult('Not allowed text', '');
|
||||
}
|
||||
|
||||
function testRemoveForbiddenText() {
|
||||
$this->assertResult('Not allowed text', '');
|
||||
}
|
||||
|
||||
function testEmpty() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -6,8 +6,7 @@ require_once 'HTMLPurifier/ChildDef/Required.php';
|
||||
class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness
|
||||
{
|
||||
|
||||
function testParsing() {
|
||||
|
||||
function testPrepareString() {
|
||||
$def = new HTMLPurifier_ChildDef_Required('foobar | bang |gizmo');
|
||||
$this->assertIdentical($def->elements,
|
||||
array(
|
||||
@@ -15,51 +14,61 @@ class HTMLPurifier_ChildDef_RequiredTest extends HTMLPurifier_ChildDefHarness
|
||||
,'bang' => true
|
||||
,'gizmo' => true
|
||||
));
|
||||
}
|
||||
|
||||
function testPrepareArray() {
|
||||
$def = new HTMLPurifier_ChildDef_Required(array('href', 'src'));
|
||||
$this->assertIdentical($def->elements,
|
||||
array(
|
||||
'href' => true
|
||||
,'src' => true
|
||||
));
|
||||
|
||||
}
|
||||
|
||||
function testPCDATAForbidden() {
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_ChildDef_Required('dt | dd');
|
||||
}
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult('', false);
|
||||
}
|
||||
|
||||
function testRemoveIllegalTagsAndElements() {
|
||||
$this->assertResult(
|
||||
'<dt>Term</dt>Text in an illegal location'.
|
||||
'<dd>Definition</dd><b>Illegal tag</b>',
|
||||
'<dt>Term</dt><dd>Definition</dd>');
|
||||
$this->assertResult('How do you do!', false);
|
||||
}
|
||||
|
||||
function testIgnoreWhitespace() {
|
||||
// whitespace shouldn't trigger it
|
||||
$this->assertResult("\n<dd>Definition</dd> ");
|
||||
}
|
||||
|
||||
function testPreserveWhitespaceAfterRemoval() {
|
||||
$this->assertResult(
|
||||
'<dd>Definition</dd> <b></b> ',
|
||||
'<dd>Definition</dd> '
|
||||
);
|
||||
$this->assertResult("\t ", false);
|
||||
}
|
||||
|
||||
function testDeleteNodeIfOnlyWhitespace() {
|
||||
$this->assertResult("\t ", false);
|
||||
}
|
||||
|
||||
function testPCDATAAllowed() {
|
||||
|
||||
$this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
|
||||
$this->assertResult('Out <b>Bold text</b><img />', 'Out <b>Bold text</b>');
|
||||
}
|
||||
|
||||
$this->assertResult('<b>Bold text</b><img />', '<b>Bold text</b>');
|
||||
|
||||
// with child escaping on
|
||||
function testPCDATAAllowedWithEscaping() {
|
||||
$this->obj = new HTMLPurifier_ChildDef_Required('#PCDATA | b');
|
||||
$this->config->set('Core', 'EscapeInvalidChildren', true);
|
||||
$this->assertResult(
|
||||
'<b>Bold text</b><img />',
|
||||
'<b>Bold text</b>&lt;img /&gt;',
|
||||
array(
|
||||
'Core.EscapeInvalidChildren' => true
|
||||
)
|
||||
'Out <b>Bold text</b><img />',
|
||||
'Out <b>Bold text</b>&lt;img /&gt;'
|
||||
);
|
||||
|
||||
}
|
||||
|
@@ -7,47 +7,77 @@ class HTMLPurifier_ChildDef_StrictBlockquoteTest
|
||||
extends HTMLPurifier_ChildDefHarness
|
||||
{
|
||||
|
||||
function test() {
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
|
||||
}
|
||||
|
||||
// assuming default wrap is p
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
function testPreserveValidP() {
|
||||
$this->assertResult('<p>Valid</p>');
|
||||
}
|
||||
|
||||
function testPreserveValidDiv() {
|
||||
$this->assertResult('<div>Still valid</div>');
|
||||
}
|
||||
|
||||
function testWrapTextWithP() {
|
||||
$this->assertResult('Needs wrap', '<p>Needs wrap</p>');
|
||||
}
|
||||
|
||||
function testNoWrapForWhitespaceOrValidElements() {
|
||||
$this->assertResult('<p>Do not wrap</p> <p>Whitespace</p>');
|
||||
}
|
||||
|
||||
function testWrapTextNextToValidElements() {
|
||||
$this->assertResult(
|
||||
'Wrap'. '<p>Do not wrap</p>',
|
||||
'<p>Wrap</p><p>Do not wrap</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testWrapInlineElements() {
|
||||
$this->assertResult(
|
||||
'<p>Do not</p>'.'<b>Wrap</b>',
|
||||
'<p>Do not</p><p><b>Wrap</b></p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testWrapAndRemoveInvalidTags() {
|
||||
$this->assertResult(
|
||||
'<li>Not allowed</li>Paragraph.<p>Hmm.</p>',
|
||||
'<p>Not allowedParagraph.</p><p>Hmm.</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testWrapComplicatedSring() {
|
||||
$this->assertResult(
|
||||
$var = 'He said<br />perhaps<br />we should <b>nuke</b> them.',
|
||||
"<p>$var</p>"
|
||||
);
|
||||
}
|
||||
|
||||
function testWrapAndRemoveInvalidTagsComplex() {
|
||||
$this->assertResult(
|
||||
'<foo>Bar</foo><bas /><b>People</b>Conniving.'. '<p>Fools!</p>',
|
||||
'<p>Bar'. '<b>People</b>Conniving.</p><p>Fools!</p>'
|
||||
);
|
||||
}
|
||||
|
||||
$this->assertResult('Needs wrap', '<div>Needs wrap</div>',
|
||||
array('HTML.BlockWrapper' => 'div'));
|
||||
function testAlternateWrapper() {
|
||||
$this->config->set('HTML', 'BlockWrapper', 'div');
|
||||
$this->assertResult('Needs wrap', '<div>Needs wrap</div>');
|
||||
|
||||
}
|
||||
|
||||
function testError() {
|
||||
// $this->expectError('Cannot use non-block element as block wrapper');
|
||||
$this->obj = new HTMLPurifier_ChildDef_StrictBlockquote('div | p');
|
||||
$this->assertResult('Needs wrap', '<p>Needs wrap</p>',
|
||||
array('HTML.BlockWrapper' => 'dav'));
|
||||
$this->config->set('HTML', 'BlockWrapper', 'dav');
|
||||
$this->assertResult('Needs wrap', '<p>Needs wrap</p>');
|
||||
$this->swallowErrors();
|
||||
}
|
||||
|
||||
|
@@ -3,46 +3,58 @@
|
||||
require_once 'HTMLPurifier/ChildDefHarness.php';
|
||||
require_once 'HTMLPurifier/ChildDef/Table.php';
|
||||
|
||||
// we're using empty tags to compact the tests: under real circumstances
|
||||
// there would be contents in them
|
||||
|
||||
class HTMLPurifier_ChildDef_TableTest extends HTMLPurifier_ChildDefHarness
|
||||
{
|
||||
|
||||
function test() {
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_ChildDef_Table();
|
||||
}
|
||||
|
||||
function testEmptyInput() {
|
||||
$this->assertResult('', false);
|
||||
}
|
||||
|
||||
// we're using empty tags to compact the tests: under real circumstances
|
||||
// there would be contents in them
|
||||
|
||||
function testSingleRow() {
|
||||
$this->assertResult('<tr />');
|
||||
}
|
||||
|
||||
function testComplexContents() {
|
||||
$this->assertResult('<caption /><col /><thead /><tfoot /><tbody>'.
|
||||
'<tr><td>asdf</td></tr></tbody>');
|
||||
$this->assertResult('<col /><col /><col /><tr />');
|
||||
}
|
||||
|
||||
// mixed up order
|
||||
function testReorderContents() {
|
||||
$this->assertResult(
|
||||
'<col /><colgroup /><tbody /><tfoot /><thead /><tr>1</tr><caption /><tr />',
|
||||
'<caption /><col /><colgroup /><thead /><tfoot /><tbody /><tr>1</tr><tr />');
|
||||
}
|
||||
|
||||
// duplicates of singles
|
||||
// - first caption serves
|
||||
// - trailing tfoots/theads get turned into tbodys
|
||||
function testDuplicateProcessing() {
|
||||
$this->assertResult(
|
||||
'<caption>1</caption><caption /><tbody /><tbody /><tfoot>1</tfoot><tfoot />',
|
||||
'<caption>1</caption><tfoot>1</tfoot><tbody /><tbody /><tbody />'
|
||||
);
|
||||
}
|
||||
|
||||
// errant text dropped (until bubbling is implemented)
|
||||
function testRemoveText() {
|
||||
$this->assertResult('foo', false);
|
||||
}
|
||||
|
||||
// whitespace sticks to the previous element, last whitespace is
|
||||
// stationary
|
||||
$this->assertResult("\n <tr />\n <tr />\n ", true, array('Output.Newline' => "\n"));
|
||||
function testStickyWhitespaceOnTr() {
|
||||
$this->config->set('Output', 'Newline', "\n");
|
||||
$this->assertResult("\n <tr />\n <tr />\n ");
|
||||
}
|
||||
|
||||
function testStickyWhitespaceOnTSection() {
|
||||
$this->config->set('Output', 'Newline', "\n");
|
||||
$this->assertResult(
|
||||
"\n\t<tbody />\n\t\t<tfoot />\n\t\t\t",
|
||||
"\n\t\t<tfoot />\n\t<tbody />\n\t\t\t",
|
||||
array('Output.Newline' => "\n")
|
||||
"\n\t\t<tfoot />\n\t<tbody />\n\t\t\t"
|
||||
);
|
||||
|
||||
}
|
||||
|
@@ -7,6 +7,7 @@ class HTMLPurifier_ChildDefHarness extends HTMLPurifier_ComplexHarness
|
||||
{
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = null;
|
||||
$this->func = 'validateChildren';
|
||||
$this->to_tokens = true;
|
||||
|
@@ -67,41 +67,20 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
|
||||
* @param $context_array Context array in form of Key => Value or an actual
|
||||
* context object.
|
||||
*/
|
||||
function assertResult($input, $expect = true,
|
||||
$config_array = array(), $context_array = array()
|
||||
) {
|
||||
|
||||
// setup config
|
||||
if ($this->config) {
|
||||
$config = HTMLPurifier_Config::create($this->config);
|
||||
$config->autoFinalize = false;
|
||||
$config->loadArray($config_array);
|
||||
} else {
|
||||
$config = HTMLPurifier_Config::create($config_array);
|
||||
}
|
||||
|
||||
// setup context object. Note that we are operating on a copy of it!
|
||||
// When necessary, extend the test harness to allow post-tests
|
||||
// on the context object
|
||||
if (empty($this->context)) {
|
||||
$context = new HTMLPurifier_Context();
|
||||
$context->loadArray($context_array);
|
||||
} else {
|
||||
$context =& $this->context;
|
||||
}
|
||||
function assertResult($input, $expect = true) {
|
||||
|
||||
if ($this->to_tokens && is_string($input)) {
|
||||
// $func may cause $input to change, so "clone" another copy
|
||||
// to sacrifice
|
||||
$input = $this->lexer->tokenizeHTML($s = $input, $config, $context);
|
||||
$input_c = $this->lexer->tokenizeHTML($s, $config, $context);
|
||||
$input = $this->tokenize($temp = $input);
|
||||
$input_c = $this->tokenize($temp);
|
||||
} else {
|
||||
$input_c = $input;
|
||||
}
|
||||
|
||||
// call the function
|
||||
$func = $this->func;
|
||||
$result = $this->obj->$func($input_c, $config, $context);
|
||||
$result = $this->obj->$func($input_c, $this->config, $this->context);
|
||||
|
||||
// test a bool result
|
||||
if (is_bool($result)) {
|
||||
@@ -112,11 +91,9 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
|
||||
}
|
||||
|
||||
if ($this->to_html) {
|
||||
$result = $this->generator->
|
||||
generateFromTokens($result, $config, $context);
|
||||
$result = $this->generate($result);
|
||||
if (is_array($expect)) {
|
||||
$expect = $this->generator->
|
||||
generateFromTokens($expect, $config, $context);
|
||||
$expect = $this->generate($expect);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -124,6 +101,20 @@ class HTMLPurifier_ComplexHarness extends HTMLPurifier_Harness
|
||||
|
||||
}
|
||||
|
||||
/**
|
||||
* Tokenize HTML into tokens, uses member variables for common variables
|
||||
*/
|
||||
function tokenize($html) {
|
||||
return $this->lexer->tokenizeHTML($html, $this->config, $this->context);
|
||||
}
|
||||
|
||||
/**
|
||||
* Generate textual HTML from tokens
|
||||
*/
|
||||
function generate($tokens) {
|
||||
return $this->generator->generateFromTokens($tokens, $this->config, $this->context);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
||||
|
@@ -17,7 +17,7 @@ class HTMLPurifier_EntityLookupTest extends HTMLPurifier_Harness
|
||||
// special char
|
||||
$this->assertIdentical('"', $lookup->table['quot']);
|
||||
$this->assertIdentical('“', $lookup->table['ldquo']);
|
||||
$this->assertIdentical('<', $lookup->table['lt']); //expressed strangely
|
||||
$this->assertIdentical('<', $lookup->table['lt']); // expressed strangely in source file
|
||||
|
||||
// symbol char
|
||||
$this->assertIdentical('θ', $lookup->table['theta']);
|
||||
|
@@ -27,11 +27,20 @@ class HTMLPurifier_ErrorCollectorEMock extends HTMLPurifier_ErrorCollectorMock
|
||||
|
||||
function send($severity, $msg) {
|
||||
// test for context
|
||||
$test = &$this->_getCurrentTestCase();
|
||||
$context =& SimpleTest::getContext();
|
||||
$test =& $context->getTest();
|
||||
|
||||
// compat
|
||||
if (empty($this->_mock)) {
|
||||
$mock =& $this;
|
||||
} else {
|
||||
$mock =& $this->_mock;
|
||||
}
|
||||
|
||||
foreach ($this->_expected_context as $key => $value) {
|
||||
$test->assertEqual($value, $this->_context->get($key));
|
||||
}
|
||||
$step = $this->getCallCount('send');
|
||||
$step = $mock->getCallCount('send');
|
||||
if (isset($this->_expected_context_at[$step])) {
|
||||
foreach ($this->_expected_context_at[$step] as $key => $value) {
|
||||
$test->assertEqual($value, $this->_context->get($key));
|
||||
@@ -39,7 +48,7 @@ class HTMLPurifier_ErrorCollectorEMock extends HTMLPurifier_ErrorCollectorMock
|
||||
}
|
||||
// boilerplate mock code, does not have return value or references
|
||||
$args = func_get_args();
|
||||
$this->_invoke('send', $args);
|
||||
$mock->_invoke('send', $args);
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -3,11 +3,15 @@
|
||||
require_once 'HTMLPurifier/ErrorCollectorEMock.php';
|
||||
require_once 'HTMLPurifier/Lexer/DirectLex.php';
|
||||
|
||||
/**
|
||||
* @todo Make the callCount variable actually work, so we can precisely
|
||||
* specify what errors we want: no more, no less
|
||||
*/
|
||||
class HTMLPurifier_ErrorsHarness extends HTMLPurifier_Harness
|
||||
{
|
||||
|
||||
var $config, $context;
|
||||
var $collector, $generator;
|
||||
var $collector, $generator, $callCount;
|
||||
|
||||
function setup() {
|
||||
$this->config = HTMLPurifier_Config::create(array('Core.CollectErrors' => true));
|
||||
@@ -16,6 +20,11 @@ class HTMLPurifier_ErrorsHarness extends HTMLPurifier_Harness
|
||||
$this->collector = new HTMLPurifier_ErrorCollectorEMock();
|
||||
$this->collector->prepare($this->context);
|
||||
$this->context->register('ErrorCollector', $this->collector);
|
||||
$this->callCount = 0;
|
||||
}
|
||||
|
||||
function expectNoErrorCollection() {
|
||||
$this->collector->expectNever('send');
|
||||
}
|
||||
|
||||
function expectErrorCollection() {
|
||||
|
39
tests/HTMLPurifier/HTMLModule/ObjectTest.php
Normal file
@@ -0,0 +1,39 @@
|
||||
<?php
|
||||
|
||||
require_once 'HTMLPurifier/HTMLModuleHarness.php';
|
||||
|
||||
class HTMLPurifier_HTMLModule_ObjectTest extends HTMLPurifier_HTMLModuleHarness
|
||||
{
|
||||
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->config->set('HTML', 'Trusted', true);
|
||||
}
|
||||
|
||||
function testDefaultRemoval() {
|
||||
$this->config->set('HTML', 'Trusted', false);
|
||||
$this->assertResult(
|
||||
'<object></object>', ''
|
||||
);
|
||||
}
|
||||
|
||||
function testMinimal() {
|
||||
$this->assertResult('<object></object>');
|
||||
}
|
||||
|
||||
function testStandardUseCase() {
|
||||
$this->assertResult(
|
||||
'<object type="video/x-ms-wmv" data="http://domain.com/video.wmv" width="320" height="256">
|
||||
<param name="src" value="http://domain.com/video.wmv" />
|
||||
<param name="autostart" value="false" />
|
||||
<param name="controller" value="true" />
|
||||
<param name="pluginurl" value="http://www.microsoft.com/Windows/MediaPlayer/" />
|
||||
<a href="http://www.microsoft.com/Windows/MediaPlayer/">Windows Media player required</a>
|
||||
</object>'
|
||||
);
|
||||
}
|
||||
|
||||
// more test-cases?
|
||||
|
||||
}
|
||||
|
@@ -5,47 +5,51 @@ require_once 'HTMLPurifier/HTMLModuleHarness.php';
|
||||
class HTMLPurifier_HTMLModule_ScriptingTest extends HTMLPurifier_HTMLModuleHarness
|
||||
{
|
||||
|
||||
function test() {
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->config->set('HTML', 'Trusted', true);
|
||||
$this->config->set('Core', 'CommentScriptContents', false);
|
||||
}
|
||||
|
||||
// default (remove everything)
|
||||
function testDefaultRemoval() {
|
||||
$this->config->set('HTML', 'Trusted', false);
|
||||
$this->assertResult(
|
||||
'<script type="text/javascript">foo();</script>', ''
|
||||
);
|
||||
}
|
||||
|
||||
// enabled
|
||||
function testPreserve() {
|
||||
$this->assertResult(
|
||||
'<script type="text/javascript">foo();</script>', true,
|
||||
array('HTML.Trusted' => true)
|
||||
'<script type="text/javascript">foo();</script>'
|
||||
);
|
||||
}
|
||||
|
||||
// CDATA
|
||||
function testCDATAEnclosure() {
|
||||
$this->assertResult(
|
||||
'//<![CDATA[
|
||||
'<script type="text/javascript">//<![CDATA[
|
||||
alert("<This is compatible with XHTML>");
|
||||
//]]> ', true,
|
||||
array('HTML.Trusted' => true)
|
||||
//]]></script>'
|
||||
);
|
||||
}
|
||||
|
||||
// max
|
||||
function testAllAttributes() {
|
||||
$this->assertResult(
|
||||
'<script
|
||||
defer="defer"
|
||||
src="test.js"
|
||||
type="text/javascript"
|
||||
>PCDATA</script>', true,
|
||||
array('HTML.Trusted' => true, 'Core.CommentScriptContents' => false)
|
||||
>PCDATA</script>'
|
||||
);
|
||||
}
|
||||
|
||||
// unsupported
|
||||
function testUnsupportedAttributes() {
|
||||
$this->assertResult(
|
||||
'<script
|
||||
type="text/javascript"
|
||||
charset="utf-8"
|
||||
>PCDATA</script>',
|
||||
'<script type="text/javascript">PCDATA</script>',
|
||||
array('HTML.Trusted' => true, 'Core.CommentScriptContents' => false)
|
||||
'<script type="text/javascript">PCDATA</script>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -30,5 +30,11 @@ class HTMLPurifier_IDAccumulatorTest extends HTMLPurifier_Harness
|
||||
|
||||
}
|
||||
|
||||
function testBuild() {
|
||||
$this->config->set('Attr', 'IDBlacklist', array('foo'));
|
||||
$accumulator = HTMLPurifier_IDAccumulator::build($this->config, $this->context);
|
||||
$this->assertTrue( isset($accumulator->ids['foo']) );
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
@@ -8,29 +8,35 @@ class HTMLPurifier_Injector_AutoParagraphTest extends HTMLPurifier_InjectorHarne
|
||||
|
||||
function setup() {
|
||||
parent::setup();
|
||||
$this->config = array('AutoFormat.AutoParagraph' => true);
|
||||
$this->config->set('AutoFormat', 'AutoParagraph', true);
|
||||
}
|
||||
|
||||
function test() {
|
||||
function testSingleParagraph() {
|
||||
$this->assertResult(
|
||||
'Foobar',
|
||||
'<p>Foobar</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testSingleMultiLineParagraph() {
|
||||
$this->assertResult(
|
||||
'Par 1
|
||||
Par 1 still',
|
||||
'<p>Par 1
|
||||
Par 1 still</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTwoParagraphs() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
Par2',
|
||||
'<p>Par1</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTwoParagraphsWithLotsOfSpace() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
@@ -39,15 +45,18 @@ Par2',
|
||||
Par2',
|
||||
'<p>Par1</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTwoParagraphsWithInlineElements() {
|
||||
$this->assertResult(
|
||||
'<b>Par1</b>
|
||||
|
||||
<i>Par2</i>',
|
||||
'<p><b>Par1</b></p><p><i>Par2</i></p>'
|
||||
);
|
||||
}
|
||||
|
||||
|
||||
function testSingleParagraphThatLooksLikeTwo() {
|
||||
$this->assertResult(
|
||||
'<b>Par1
|
||||
|
||||
@@ -56,29 +65,40 @@ Par2</b>',
|
||||
|
||||
Par2</b></p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAddParagraphAdjacentToParagraph() {
|
||||
$this->assertResult(
|
||||
'Par1<p>Par2</p>',
|
||||
'<p>Par1</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphUnclosedInlineElement() {
|
||||
$this->assertResult(
|
||||
'<b>Par1',
|
||||
'<p><b>Par1</b></p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testPreservePreTags() {
|
||||
$this->assertResult(
|
||||
'<pre>Par1
|
||||
|
||||
Par1</pre>'
|
||||
);
|
||||
}
|
||||
|
||||
function testIgnoreTrailingWhitespace() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
',
|
||||
'<p>Par1</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testDoNotParagraphBlockElements() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
@@ -87,19 +107,25 @@ Par1</pre>'
|
||||
Par3',
|
||||
'<p>Par1</p><div>Par2</div><p>Par3</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphTextAndInlineNodes() {
|
||||
$this->assertResult(
|
||||
'Par<b>1</b>',
|
||||
'<p>Par<b>1</b></p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testIgnoreLeadingWhitespace() {
|
||||
$this->assertResult(
|
||||
'
|
||||
|
||||
Par',
|
||||
'<p>Par</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testIgnoreSurroundingWhitespace() {
|
||||
$this->assertResult(
|
||||
'
|
||||
|
||||
@@ -108,69 +134,87 @@ Par
|
||||
',
|
||||
'<p>Par</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphInsideBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div>Par1
|
||||
|
||||
Par2</div>',
|
||||
'<div><p>Par1</p><p>Par2</p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphInlineNodeInsideBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><b>Par1</b>
|
||||
|
||||
Par2</div>',
|
||||
'<div><p><b>Par1</b></p><p>Par2</p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testNoParagraphWhenOnlyOneInsideBlockNode() {
|
||||
$this->assertResult('<div>Par1</div>');
|
||||
}
|
||||
|
||||
function testParagraphTwoInlineNodesInsideBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><b>Par1</b>
|
||||
|
||||
<i>Par2</i></div>',
|
||||
'<div><p><b>Par1</b></p><p><i>Par2</i></p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testPreserveInlineNodesInPreTag() {
|
||||
$this->assertResult(
|
||||
'<pre><b>Par1</b>
|
||||
|
||||
<i>Par2</i></pre>',
|
||||
true
|
||||
<i>Par2</i></pre>'
|
||||
);
|
||||
}
|
||||
|
||||
function testSplitUpInternalsOfPTagInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><p>Foo
|
||||
|
||||
Bar</p></div>',
|
||||
'<div><p>Foo</p><p>Bar</p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testSplitUpInlineNodesInPTagInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><p><b>Foo</b>
|
||||
|
||||
<i>Bar</i></p></div>',
|
||||
'<div><p><b>Foo</b></p><p><i>Bar</i></p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
$this->assertResult(
|
||||
'<div><b>Foo</b></div>',
|
||||
'<div><b>Foo</b></div>'
|
||||
);
|
||||
function testNoParagraphSingleInlineNodeInBlockNode() {
|
||||
$this->assertResult( '<div><b>Foo</b></div>' );
|
||||
}
|
||||
|
||||
function testParagraphInBlockquote() {
|
||||
$this->assertResult(
|
||||
'<blockquote>Par1
|
||||
|
||||
Par2</blockquote>',
|
||||
'<blockquote><p>Par1</p><p>Par2</p></blockquote>'
|
||||
);
|
||||
}
|
||||
|
||||
function testNoParagraphBetweenListItem() {
|
||||
$this->assertResult(
|
||||
'<ul><li>Foo</li>
|
||||
|
||||
<li>Bar</li></ul>', true
|
||||
<li>Bar</li></ul>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphSingleElementWithSurroundingSpace() {
|
||||
$this->assertResult(
|
||||
'<div>
|
||||
|
||||
@@ -179,7 +223,9 @@ Bar
|
||||
</div>',
|
||||
'<div><p>Bar</p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testIgnoreExtraSpaceWithLeadingInlineNode() {
|
||||
$this->assertResult(
|
||||
'<b>Par1</b>a
|
||||
|
||||
@@ -188,99 +234,146 @@ Bar
|
||||
Par2',
|
||||
'<p><b>Par1</b>a</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAbsorbExtraEndingPTag() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
Par2</p>',
|
||||
'<p>Par1</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAbsorbExtraEndingDivTag() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
Par2</div>',
|
||||
'<p>Par1</p><p>Par2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testDoNotParagraphSingleSurroundingSpaceInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div>
|
||||
Par1
|
||||
</div>', true
|
||||
</div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBlockNodeTextDelimeterInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div>Par1
|
||||
|
||||
<div>Par2</div></div>',
|
||||
'<div><p>Par1</p><div>Par2</div></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBlockNodeTextDelimeterWithoutDoublespaceInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div>Par1
|
||||
<div>Par2</div></div>',
|
||||
'<div><p>Par1
|
||||
</p><div>Par2</div></div>'
|
||||
<div>Par2</div></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBlockNodeTextDelimeterWithoutDoublespace() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
<div>Par2</div>',
|
||||
'<p>Par1
|
||||
</p><div>Par2</div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTwoParagraphsOfTextAndInlineNode() {
|
||||
$this->assertResult(
|
||||
'Par1
|
||||
|
||||
<b>Par2</b>',
|
||||
'<p>Par1</p><p><b>Par2</b></p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testLeadingInlineNodeParagraph() {
|
||||
$this->assertResult(
|
||||
'<img /> Foo',
|
||||
'<p><img /> Foo</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTrailingInlineNodeParagraph() {
|
||||
$this->assertResult(
|
||||
'<li>Foo <a>bar</a></li>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTwoInlineNodeParagraph() {
|
||||
$this->assertResult(
|
||||
'<li><b>baz</b><a>bar</a></li>'
|
||||
);
|
||||
}
|
||||
|
||||
function testNoParagraphTrailingBlockNodeInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><div>asdf</div><b>asdf</b></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphTrailingBlockNodeWithDoublespaceInBlockNode() {
|
||||
$this->assertResult(
|
||||
'<div><div>asdf</div>
|
||||
|
||||
<b>asdf</b></div>',
|
||||
'<div><div>asdf</div><p><b>asdf</b></p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testParagraphTwoInlineNodesAndWhitespaceNode() {
|
||||
$this->assertResult(
|
||||
'<b>One</b> <i>Two</i>',
|
||||
'<p><b>One</b> <i>Two</i></p>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testInlineRootNode() {
|
||||
function testNoParagraphWithInlineRootNode() {
|
||||
$this->config->set('HTML', 'Parent', 'span');
|
||||
$this->assertResult(
|
||||
'Par
|
||||
|
||||
Par2',
|
||||
true,
|
||||
array('AutoFormat.AutoParagraph' => true, 'HTML.Parent' => 'span')
|
||||
Par2'
|
||||
);
|
||||
}
|
||||
|
||||
function testNeeded() {
|
||||
function testInlineAndBlockTagInDivNoParagraph() {
|
||||
$this->assertResult(
|
||||
'<div><code>bar</code> mmm <pre>asdf</pre></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testInlineAndBlockTagInDivNeedingParagraph() {
|
||||
$this->assertResult(
|
||||
'<div><code>bar</code> mmm
|
||||
|
||||
<pre>asdf</pre></div>',
|
||||
'<div><p><code>bar</code> mmm</p><pre>asdf</pre></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTextInlineNodeTextThenDoubleNewlineNeedsParagraph() {
|
||||
$this->assertResult(
|
||||
'<div>asdf <code>bar</code> mmm
|
||||
|
||||
<pre>asdf</pre></div>',
|
||||
'<div><p>asdf <code>bar</code> mmm</p><pre>asdf</pre></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testErrorNeeded() {
|
||||
$this->config->set('HTML', 'Allowed', 'b');
|
||||
$this->expectError('Cannot enable AutoParagraph injector because p is not allowed');
|
||||
$this->assertResult('<b>foobar</b>', true, array('AutoFormat.AutoParagraph' => true, 'HTML.Allowed' => 'b'));
|
||||
$this->assertResult('<b>foobar</b>');
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -8,35 +8,40 @@ class HTMLPurifier_Injector_LinkifyTest extends HTMLPurifier_InjectorHarness
|
||||
|
||||
function setup() {
|
||||
parent::setup();
|
||||
$this->config = array('AutoFormat.Linkify' => true);
|
||||
$this->config->set('AutoFormat', 'Linkify', true);
|
||||
}
|
||||
|
||||
function testLinkify() {
|
||||
|
||||
function testLinkifyURLInRootNode() {
|
||||
$this->assertResult(
|
||||
'http://example.com',
|
||||
'<a href="http://example.com">http://example.com</a>'
|
||||
);
|
||||
}
|
||||
|
||||
function testLinkifyURLInInlineNode() {
|
||||
$this->assertResult(
|
||||
'<b>http://example.com</b>',
|
||||
'<b><a href="http://example.com">http://example.com</a></b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBasicUsageCase() {
|
||||
$this->assertResult(
|
||||
'This URL http://example.com is what you need',
|
||||
'This URL <a href="http://example.com">http://example.com</a> is what you need'
|
||||
);
|
||||
}
|
||||
|
||||
function testIgnoreURLInATag() {
|
||||
$this->assertResult(
|
||||
'<a>http://example.com/</a>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testNeeded() {
|
||||
$this->config->set('HTML', 'Allowed', 'b');
|
||||
$this->expectError('Cannot enable Linkify injector because a is not allowed');
|
||||
$this->assertResult('http://example.com/', true, array('AutoFormat.Linkify' => true, 'HTML.Allowed' => 'b'));
|
||||
$this->assertResult('http://example.com/');
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -8,39 +8,53 @@ class HTMLPurifier_Injector_PurifierLinkifyTest extends HTMLPurifier_InjectorHar
|
||||
|
||||
function setup() {
|
||||
parent::setup();
|
||||
$this->config = array(
|
||||
'AutoFormat.PurifierLinkify' => true,
|
||||
'AutoFormatParam.PurifierLinkifyDocURL' => '#%s'
|
||||
);
|
||||
$this->config->set('AutoFormat', 'PurifierLinkify', true);
|
||||
$this->config->set('AutoFormatParam', 'PurifierLinkifyDocURL', '#%s');
|
||||
}
|
||||
|
||||
function testLinkify() {
|
||||
|
||||
function testNoTriggerCharacer() {
|
||||
$this->assertResult('Foobar');
|
||||
}
|
||||
|
||||
function testTriggerCharacterInIrrelevantContext() {
|
||||
$this->assertResult('20% off!');
|
||||
}
|
||||
|
||||
function testPreserveNamespace() {
|
||||
$this->assertResult('%Core namespace (not recognized)');
|
||||
}
|
||||
|
||||
function testLinkifyBasic() {
|
||||
$this->assertResult(
|
||||
'%Namespace.Directive',
|
||||
'<a href="#Namespace.Directive">%Namespace.Directive</a>'
|
||||
);
|
||||
}
|
||||
|
||||
function testLinkifyWithAdjacentTextNodes() {
|
||||
$this->assertResult(
|
||||
'This %Namespace.Directive thing',
|
||||
'This <a href="#Namespace.Directive">%Namespace.Directive</a> thing'
|
||||
);
|
||||
}
|
||||
|
||||
function testLinkifyInBlock() {
|
||||
$this->assertResult(
|
||||
'<div>This %Namespace.Directive thing</div>',
|
||||
'<div>This <a href="#Namespace.Directive">%Namespace.Directive</a> thing</div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testPreserveInATag() {
|
||||
$this->assertResult(
|
||||
'<a>%Namespace.Directive</a>'
|
||||
);
|
||||
|
||||
|
||||
}
|
||||
|
||||
function testNeeded() {
|
||||
$this->config->set('HTML', 'Allowed', 'b');
|
||||
$this->expectError('Cannot enable PurifierLinkify injector because a is not allowed');
|
||||
$this->assertResult('%Namespace.Directive', true, array('AutoFormat.PurifierLinkify' => true, 'HTML.Allowed' => 'b'));
|
||||
$this->assertResult('%Namespace.Directive');
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -5,70 +5,98 @@ require_once 'HTMLPurifier/Lexer/DirectLex.php';
|
||||
class HTMLPurifier_LexerTest extends HTMLPurifier_Harness
|
||||
{
|
||||
|
||||
var $Lexer;
|
||||
var $DirectLex, $PEARSax3, $DOMLex;
|
||||
var $_entity_lookup;
|
||||
var $_has_pear = false;
|
||||
var $_has_dom = false;
|
||||
|
||||
function setUp() {
|
||||
$this->Lexer = new HTMLPurifier_Lexer();
|
||||
|
||||
$this->DirectLex = new HTMLPurifier_Lexer_DirectLex();
|
||||
|
||||
if ( $GLOBALS['HTMLPurifierTest']['PEAR'] &&
|
||||
((error_reporting() & E_STRICT) != E_STRICT)
|
||||
function HTMLPurifier_LexerTest() {
|
||||
parent::HTMLPurifier_Harness();
|
||||
// E_STRICT = 2048, int used for PHP4 compat: this check disables
|
||||
// PEAR if PHP 5 strict mode is on, since the class is not strict safe
|
||||
if (
|
||||
$GLOBALS['HTMLPurifierTest']['PEAR'] &&
|
||||
((error_reporting() & 2048) != 2048) // ought to be a better way
|
||||
) {
|
||||
$this->_has_pear = true;
|
||||
require_once 'HTMLPurifier/Lexer/PEARSax3.php';
|
||||
$this->PEARSax3 = new HTMLPurifier_Lexer_PEARSax3();
|
||||
$this->_has_pear = true;
|
||||
}
|
||||
|
||||
$this->_has_dom = version_compare(PHP_VERSION, '5', '>=');
|
||||
if ($this->_has_dom) {
|
||||
require_once 'HTMLPurifier/Lexer/DOMLex.php';
|
||||
$this->DOMLex = new HTMLPurifier_Lexer_DOMLex();
|
||||
if ($GLOBALS['HTMLPurifierTest']['PH5P']) {
|
||||
require_once 'HTMLPurifier/Lexer/PH5P.php';
|
||||
}
|
||||
|
||||
$this->_entity_lookup = HTMLPurifier_EntityLookup::instance();
|
||||
|
||||
}
|
||||
|
||||
// HTMLPurifier_Lexer::create() --------------------------------------------
|
||||
|
||||
function test_create() {
|
||||
$config = HTMLPurifier_Config::create(array('Core.MaintainLineNumbers' => true));
|
||||
$lexer = HTMLPurifier_Lexer::create($config);
|
||||
$this->config->set('Core', 'MaintainLineNumbers', true);
|
||||
$lexer = HTMLPurifier_Lexer::create($this->config);
|
||||
$this->assertIsA($lexer, 'HTMLPurifier_Lexer_DirectLex');
|
||||
}
|
||||
|
||||
// HTMLPurifier_Lexer->parseData() -----------------------------------------
|
||||
|
||||
function assertParseData($input, $expect = true) {
|
||||
if ($expect === true) $expect = $input;
|
||||
$lexer = new HTMLPurifier_Lexer();
|
||||
$this->assertIdentical($expect, $lexer->parseData($input));
|
||||
}
|
||||
|
||||
function test_parseData_plainText() {
|
||||
$this->assertParseData('asdf');
|
||||
}
|
||||
|
||||
function test_parseData_ampersandEntity() {
|
||||
$this->assertParseData('&amp;', '&');
|
||||
}
|
||||
|
||||
function test_parseData_quotEntity() {
|
||||
$this->assertParseData('&quot;', '"');
|
||||
}
|
||||
|
||||
function test_parseData_aposNumericEntity() {
|
||||
$this->assertParseData('&#039;', "'");
|
||||
}
|
||||
|
||||
function test_parseData_aposCompactNumericEntity() {
|
||||
$this->assertParseData('&#39;', "'");
|
||||
}
|
||||
|
||||
function test_parseData_adjacentAmpersandEntities() {
|
||||
$this->assertParseData('&amp;&amp;&amp;', '&&&');
|
||||
}
|
||||
|
||||
function test_parseData_trailingUnescapedAmpersand() {
|
||||
$this->assertParseData('&amp;&', '&&');
|
||||
}
|
||||
|
||||
function test_parseData_internalUnescapedAmpersand() {
|
||||
$this->assertParseData('Procter & Gamble');
|
||||
}
|
||||
|
||||
function test_parseData_improperEntityFaultToleranceTest() {
|
||||
$this->assertParseData('&#x2D;');
|
||||
}
|
||||
|
||||
// HTMLPurifier_Lexer->extractBody() ---------------------------------------
|
||||
|
||||
function assertExtractBody($text, $extract = true) {
|
||||
$result = $this->Lexer->extractBody($text);
|
||||
$lexer = new HTMLPurifier_Lexer();
|
||||
$result = $lexer->extractBody($text);
|
||||
if ($extract === true) $extract = $text;
|
||||
$this->assertIdentical($extract, $result);
|
||||
}
|
||||
|
||||
function test_parseData() {
|
||||
$HP =& $this->Lexer;
|
||||
|
||||
$this->assertIdentical('asdf', $HP->parseData('asdf'));
|
||||
$this->assertIdentical('&', $HP->parseData('&'));
|
||||
$this->assertIdentical('"', $HP->parseData('"'));
|
||||
$this->assertIdentical("'", $HP->parseData('''));
|
||||
$this->assertIdentical("'", $HP->parseData('''));
|
||||
$this->assertIdentical('&&&', $HP->parseData('&amp;&amp;&amp;'));
|
||||
$this->assertIdentical('&&', $HP->parseData('&amp;&')); // [INVALID]
|
||||
$this->assertIdentical('Procter & Gamble',
|
||||
$HP->parseData('Procter & Gamble')); // [INVALID]
|
||||
|
||||
// This is not special, thus not converted. Test of fault tolerance,
|
||||
// realistically speaking, this should never happen
|
||||
$this->assertIdentical('&#x2D;', $HP->parseData('&#x2D;'));
|
||||
function test_extractBody_noBodyTags() {
|
||||
$this->assertExtractBody('<b>Bold</b>');
|
||||
}
|
||||
|
||||
|
||||
function test_extractBody() {
|
||||
$this->assertExtractBody('<b>Bold</b>');
|
||||
function test_extractBody_lowercaseBodyTags() {
|
||||
$this->assertExtractBody('<html><body><b>Bold</b></body></html>', '<b>Bold</b>');
|
||||
}
|
||||
|
||||
function test_extractBody_uppercaseBodyTags() {
|
||||
$this->assertExtractBody('<HTML><BODY><B>Bold</B></BODY></HTML>', '<B>Bold</B>');
|
||||
}
|
||||
|
||||
function test_extractBody_realisticUseCase() {
|
||||
$this->assertExtractBody(
|
||||
'<?xml version="1.0"
|
||||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
|
||||
@@ -96,303 +124,404 @@ class HTMLPurifier_LexerTest extends HTMLPurifier_Harness
|
||||
</div>
|
||||
</form>
|
||||
');
|
||||
}
|
||||
|
||||
function test_extractBody_bodyWithAttributes() {
|
||||
$this->assertExtractBody('<html><body bgcolor="#F00"><b>Bold</b></body></html>', '<b>Bold</b>');
|
||||
}
|
||||
|
||||
function test_extractBody_preserveUnclosedBody() {
|
||||
$this->assertExtractBody('<body>asdf'); // not closed, don't accept
|
||||
|
||||
}
|
||||
|
||||
function test_tokenizeHTML() {
|
||||
|
||||
$input = array();
|
||||
$expect = array();
|
||||
$sax_expect = array();
|
||||
$config = array();
|
||||
|
||||
$input[0] = '';
|
||||
$expect[0] = array();
|
||||
|
||||
$input[1] = 'This is regular text.';
|
||||
$expect[1] = array(
|
||||
new HTMLPurifier_Token_Text('This is regular text.')
|
||||
);
|
||||
|
||||
$input[2] = 'This is <b>bold</b> text';
|
||||
$expect[2] = array(
|
||||
new HTMLPurifier_Token_Text('This is ')
|
||||
,new HTMLPurifier_Token_Start('b', array())
|
||||
,new HTMLPurifier_Token_Text('bold')
|
||||
,new HTMLPurifier_Token_End('b')
|
||||
,new HTMLPurifier_Token_Text(' text')
|
||||
);
|
||||
|
||||
$input[3] = '<DIV>Totally rad dude. <b>asdf</b></div>';
|
||||
$expect[3] = array(
|
||||
new HTMLPurifier_Token_Start('DIV', array())
|
||||
,new HTMLPurifier_Token_Text('Totally rad dude. ')
|
||||
,new HTMLPurifier_Token_Start('b', array())
|
||||
,new HTMLPurifier_Token_Text('asdf')
|
||||
,new HTMLPurifier_Token_End('b')
|
||||
,new HTMLPurifier_Token_End('div')
|
||||
);
|
||||
|
||||
// [XML-INVALID]
|
||||
$input[4] = '<asdf></asdf><d></d><poOloka><poolasdf><ds></asdf></ASDF>';
|
||||
$expect[4] = array(
|
||||
new HTMLPurifier_Token_Start('asdf')
|
||||
,new HTMLPurifier_Token_End('asdf')
|
||||
,new HTMLPurifier_Token_Start('d')
|
||||
,new HTMLPurifier_Token_End('d')
|
||||
,new HTMLPurifier_Token_Start('poOloka')
|
||||
,new HTMLPurifier_Token_Start('poolasdf')
|
||||
,new HTMLPurifier_Token_Start('ds')
|
||||
,new HTMLPurifier_Token_End('asdf')
|
||||
,new HTMLPurifier_Token_End('ASDF')
|
||||
);
|
||||
// DOM is different because it condenses empty tags into REAL empty ones
|
||||
// as well as makes it well-formed
|
||||
$dom_expect[4] = array(
|
||||
new HTMLPurifier_Token_Empty('asdf')
|
||||
,new HTMLPurifier_Token_Empty('d')
|
||||
,new HTMLPurifier_Token_Start('pooloka')
|
||||
,new HTMLPurifier_Token_Start('poolasdf')
|
||||
,new HTMLPurifier_Token_Empty('ds')
|
||||
,new HTMLPurifier_Token_End('poolasdf')
|
||||
,new HTMLPurifier_Token_End('pooloka')
|
||||
);
|
||||
|
||||
$input[5] = '<a'."\t".'href="foobar.php"'."\n".'title="foo!">Link to <b id="asdf">foobar</b></a>';
|
||||
$expect[5] = array(
|
||||
new HTMLPurifier_Token_Start('a',array('href'=>'foobar.php','title'=>'foo!'))
|
||||
,new HTMLPurifier_Token_Text('Link to ')
|
||||
,new HTMLPurifier_Token_Start('b',array('id'=>'asdf'))
|
||||
,new HTMLPurifier_Token_Text('foobar')
|
||||
,new HTMLPurifier_Token_End('b')
|
||||
,new HTMLPurifier_Token_End('a')
|
||||
);
|
||||
|
||||
$input[6] = '<br />';
|
||||
$expect[6] = array(
|
||||
new HTMLPurifier_Token_Empty('br')
|
||||
);
|
||||
|
||||
// [SGML-INVALID] [RECOVERABLE]
|
||||
$input[7] = '<!-- Comment --> <!-- not so well formed --->';
|
||||
$expect[7] = array(
|
||||
new HTMLPurifier_Token_Comment(' Comment ')
|
||||
,new HTMLPurifier_Token_Text(' ')
|
||||
,new HTMLPurifier_Token_Comment(' not so well formed -')
|
||||
);
|
||||
$sax_expect[7] = false; // we need to figure out proper comment output
|
||||
|
||||
// [SGML-INVALID]
|
||||
$input[8] = '<a href=""';
|
||||
$expect[8] = array(
|
||||
new HTMLPurifier_Token_Text('<a href=""')
|
||||
);
|
||||
// SAX parses it into a tag
|
||||
$sax_expect[8] = array(
|
||||
new HTMLPurifier_Token_Start('a', array('href'=>''))
|
||||
);
|
||||
// DOM parses it into an empty tag
|
||||
$dom_expect[8] = array(
|
||||
new HTMLPurifier_Token_Empty('a', array('href'=>''))
|
||||
);
|
||||
|
||||
$input[9] = '&lt;b&gt;';
|
||||
$expect[9] = array(
|
||||
new HTMLPurifier_Token_Text('<b>')
|
||||
);
|
||||
$sax_expect[9] = array(
|
||||
new HTMLPurifier_Token_Text('<')
|
||||
,new HTMLPurifier_Token_Text('b')
|
||||
,new HTMLPurifier_Token_Text('>')
|
||||
);
|
||||
// note that SAX can clump text nodes together. We won't be
|
||||
// too picky though
|
||||
|
||||
// [SGML-INVALID]
|
||||
$input[10] = '<a "=>';
|
||||
// We barf on this, aim for no attributes
|
||||
$expect[10] = array(
|
||||
new HTMLPurifier_Token_Start('a', array('"' => ''))
|
||||
);
|
||||
// DOM correctly has no attributes, but also closes the tag
|
||||
$dom_expect[10] = array(
|
||||
new HTMLPurifier_Token_Empty('a')
|
||||
);
|
||||
// SAX barfs on this
|
||||
$sax_expect[10] = array(
|
||||
new HTMLPurifier_Token_Start('a', array('"' => ''))
|
||||
);
|
||||
|
||||
// [INVALID] [RECOVERABLE]
|
||||
$input[11] = '"';
|
||||
$expect[11] = array( new HTMLPurifier_Token_Text('"') );
|
||||
|
||||
// compare with this valid one:
|
||||
$input[12] = '&quot;';
|
||||
$expect[12] = array( new HTMLPurifier_Token_Text('"') );
|
||||
$sax_expect[12] = false; // choked!
|
||||
|
||||
// CDATA sections!
|
||||
$input[13] = '<![CDATA[You <b>can&#39;t</b> get me!]]>';
|
||||
$expect[13] = array( new HTMLPurifier_Token_Text(
|
||||
'You <b>can&#39;t</b> get me!' // raw
|
||||
) );
|
||||
$sax_expect[13] = array( // SAX has a separate call for each entity
|
||||
new HTMLPurifier_Token_Text('You '),
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('b'),
|
||||
new HTMLPurifier_Token_Text('>'),
|
||||
new HTMLPurifier_Token_Text('can'),
|
||||
new HTMLPurifier_Token_Text('&'),
|
||||
new HTMLPurifier_Token_Text('#39;t'),
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('/b'),
|
||||
new HTMLPurifier_Token_Text('>'),
|
||||
new HTMLPurifier_Token_Text(' get me!')
|
||||
);
|
||||
|
||||
$char_theta = $this->_entity_lookup->table['theta'];
|
||||
$char_rarr = $this->_entity_lookup->table['rarr'];
|
||||
|
||||
// test entity replacement
|
||||
$input[14] = '&theta;';
|
||||
$expect[14] = array( new HTMLPurifier_Token_Text($char_theta) );
|
||||
|
||||
// test that entities aren't replaced in CDATA sections
|
||||
$input[15] = '&theta; <![CDATA[&rarr;]]>';
|
||||
$expect[15] = array( new HTMLPurifier_Token_Text($char_theta . ' &rarr;') );
|
||||
$sax_expect[15] = array(
|
||||
new HTMLPurifier_Token_Text($char_theta . ' '),
|
||||
new HTMLPurifier_Token_Text('&'),
|
||||
new HTMLPurifier_Token_Text('rarr;')
|
||||
);
|
||||
|
||||
// test entity resolution in attributes
|
||||
$input[16] = '<a href="index.php?title=foo&amp;id=bar">Link</a>';
|
||||
$expect[16] = array(
|
||||
new HTMLPurifier_Token_Start('a',array('href' => 'index.php?title=foo&id=bar'))
|
||||
,new HTMLPurifier_Token_Text('Link')
|
||||
,new HTMLPurifier_Token_End('a')
|
||||
);
|
||||
|
||||
// test that UTF-8 is preserved
|
||||
$char_hearts = $this->_entity_lookup->table['hearts'];
|
||||
$input[17] = $char_hearts;
|
||||
$expect[17] = array( new HTMLPurifier_Token_Text($char_hearts) );
|
||||
|
||||
// test weird characters in attributes
|
||||
$input[18] = '<br test="x &lt; 6" />';
|
||||
$expect[18] = array( new HTMLPurifier_Token_Empty('br', array('test' => 'x < 6')) );
|
||||
|
||||
// test emoticon protection
|
||||
$input[19] = '<b>Whoa! <3 That\'s not good >.></b>';
|
||||
$expect[19] = array(
|
||||
new HTMLPurifier_Token_Start('b'),
|
||||
new HTMLPurifier_Token_Text('Whoa! '),
|
||||
new HTMLPurifier_Token_Text('<3 That\'s not good >'),
|
||||
new HTMLPurifier_Token_Text('.>'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
);
|
||||
$dom_expect[19] = array(
|
||||
new HTMLPurifier_Token_Start('b'),
|
||||
new HTMLPurifier_Token_Text('Whoa! <3 That\'s not good >.>'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
);
|
||||
$sax_expect[19] = false; // SAX drops the < character
|
||||
$config[19] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
|
||||
|
||||
// test comment parsing with funky characters inside
|
||||
$input[20] = '<!-- This >< comment --><br />';
|
||||
$expect[20] = array(
|
||||
new HTMLPurifier_Token_Comment(' This >< comment '),
|
||||
new HTMLPurifier_Token_Empty('br')
|
||||
);
|
||||
$sax_expect[20] = false;
|
||||
$config[20] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
|
||||
|
||||
// test comment parsing of missing end
|
||||
$input[21] = '<!-- This >< comment';
|
||||
$expect[21] = array(
|
||||
new HTMLPurifier_Token_Comment(' This >< comment')
|
||||
);
|
||||
$sax_expect[21] = false;
|
||||
$dom_expect[21] = false;
|
||||
$config[21] = HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' => true));
|
||||
|
||||
// test CDATA tags
|
||||
$input[22] = '<script>alert("<foo>");</script>';
|
||||
$expect[22] = array(
|
||||
new HTMLPurifier_Token_Start('script')
|
||||
,new HTMLPurifier_Token_Text('alert("<foo>");')
|
||||
,new HTMLPurifier_Token_End('script')
|
||||
);
|
||||
$config[22] = HTMLPurifier_Config::create(array('HTML.Trusted' => true));
|
||||
$sax_expect[22] = false;
|
||||
|
||||
// test escaping
|
||||
$input[23] = '<!-- This comment < &lt; & -->';
|
||||
$expect[23] = array(
|
||||
new HTMLPurifier_Token_Comment(' This comment < &lt; & ') );
|
||||
$sax_expect[23] = false; $config[23] =
|
||||
HTMLPurifier_Config::create(array('Core.AggressivelyFixLt' =>
|
||||
true));
|
||||
|
||||
// more DirectLex edge-cases
|
||||
$input[24] = '<a href="><>">';
|
||||
$expect[24] = array(
|
||||
new HTMLPurifier_Token_Start('a', array('href' => '')),
|
||||
new HTMLPurifier_Token_Text('<">')
|
||||
);
|
||||
$sax_expect[24] = false;
|
||||
$dom_expect[24] = array(
|
||||
new HTMLPurifier_Token_Empty('a', array('href' => '><>'))
|
||||
);
|
||||
|
||||
$default_config = HTMLPurifier_Config::createDefault();
|
||||
$default_context = new HTMLPurifier_Context();
|
||||
foreach($input as $i => $discard) {
|
||||
if (!isset($config[$i])) $config[$i] = $default_config;
|
||||
|
||||
$result = $this->DirectLex->tokenizeHTML($input[$i], $config[$i], $default_context);
|
||||
$this->assertIdentical($expect[$i], $result, 'DirectLexTest '.$i.': %s');
|
||||
paintIf($result, $expect[$i] != $result);
|
||||
|
||||
if ($this->_has_pear) {
|
||||
// assert unless I say otherwise
|
||||
$sax_result = $this->PEARSax3->tokenizeHTML($input[$i], $config[$i], $default_context);
|
||||
if (!isset($sax_expect[$i])) {
|
||||
// by default, assert with normal result
|
||||
$this->assertIdentical($expect[$i], $sax_result, 'PEARSax3Test '.$i.': %s');
|
||||
paintIf($sax_result, $expect[$i] != $sax_result);
|
||||
} elseif ($sax_expect[$i] === false) {
|
||||
// assertions were turned off, optionally dump
|
||||
// paintIf($sax_expect, $i == NUMBER);
|
||||
} else {
|
||||
// match with a custom SAX result array
|
||||
$this->assertIdentical($sax_expect[$i], $sax_result, 'PEARSax3Test (custom) '.$i.': %s');
|
||||
paintIf($sax_result, $sax_expect[$i] != $sax_result);
|
||||
}
|
||||
}
|
||||
|
||||
if ($this->_has_dom) {
|
||||
$dom_result = $this->DOMLex->tokenizeHTML($input[$i], $config[$i], $default_context);
|
||||
// same structure as SAX
|
||||
if (!isset($dom_expect[$i])) {
|
||||
$this->assertIdentical($expect[$i], $dom_result, 'DOMLexTest '.$i.': %s');
|
||||
paintIf($dom_result, $expect[$i] != $dom_result);
|
||||
} elseif ($dom_expect[$i] === false) {
|
||||
// paintIf($dom_result, $i == NUMBER);
|
||||
} else {
|
||||
$this->assertIdentical($dom_expect[$i], $dom_result, 'DOMLexTest (custom) '.$i.': %s');
|
||||
paintIf($dom_result, $dom_expect[$i] != $dom_result);
|
||||
}
|
||||
}
|
||||
// HTMLPurifier_Lexer->tokenizeHTML() --------------------------------------
|
||||
|
||||
function assertTokenization($input, $expect, $alt_expect = array()) {
|
||||
$lexers = array();
|
||||
$lexers['DirectLex'] = new HTMLPurifier_Lexer_DirectLex();
|
||||
if ($this->_has_pear) $lexers['PEARSax3'] = new HTMLPurifier_Lexer_PEARSax3();
|
||||
if (version_compare(PHP_VERSION, "5", ">=") && class_exists('DOMDocument')) {
|
||||
$lexers['DOMLex'] = new HTMLPurifier_Lexer_DOMLex();
|
||||
$lexers['PH5P'] = new HTMLPurifier_Lexer_PH5P();
|
||||
}
|
||||
foreach ($lexers as $name => $lexer) {
|
||||
$result = $lexer->tokenizeHTML($input, $this->config, $this->context);
|
||||
if (isset($alt_expect[$name])) {
|
||||
if ($alt_expect[$name] === false) continue;
|
||||
$t_expect = $alt_expect[$name];
|
||||
$this->assertIdentical($result, $alt_expect[$name], "$name: %s");
|
||||
} else {
|
||||
$t_expect = $expect;
|
||||
$this->assertIdentical($result, $expect, "$name: %s");
|
||||
}
|
||||
if ($t_expect != $result) {
|
||||
printTokens($result);
|
||||
//var_dump($result);
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_emptyInput() {
|
||||
$this->assertTokenization('', array());
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_plainText() {
|
||||
$this->assertTokenization(
|
||||
'This is regular text.',
|
||||
array(
|
||||
new HTMLPurifier_Token_Text('This is regular text.')
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_textAndTags() {
|
||||
$this->assertTokenization(
|
||||
'This is <b>bold</b> text',
|
||||
array(
|
||||
new HTMLPurifier_Token_Text('This is '),
|
||||
new HTMLPurifier_Token_Start('b', array()),
|
||||
new HTMLPurifier_Token_Text('bold'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
new HTMLPurifier_Token_Text(' text'),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_normalizeCase() {
|
||||
$this->assertTokenization(
|
||||
'<DIV>Totally rad dude. <b>asdf</b></div>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Start('DIV', array()),
|
||||
new HTMLPurifier_Token_Text('Totally rad dude. '),
|
||||
new HTMLPurifier_Token_Start('b', array()),
|
||||
new HTMLPurifier_Token_Text('asdf'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
new HTMLPurifier_Token_End('div'),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_notWellFormed() {
|
||||
$this->assertTokenization(
|
||||
'<asdf></asdf><d></d><poOloka><poolasdf><ds></asdf></ASDF>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Start('asdf'),
|
||||
new HTMLPurifier_Token_End('asdf'),
|
||||
new HTMLPurifier_Token_Start('d'),
|
||||
new HTMLPurifier_Token_End('d'),
|
||||
new HTMLPurifier_Token_Start('poOloka'),
|
||||
new HTMLPurifier_Token_Start('poolasdf'),
|
||||
new HTMLPurifier_Token_Start('ds'),
|
||||
new HTMLPurifier_Token_End('asdf'),
|
||||
new HTMLPurifier_Token_End('ASDF'),
|
||||
),
|
||||
array(
|
||||
'DOMLex' => $alt = array(
|
||||
new HTMLPurifier_Token_Empty('asdf'),
|
||||
new HTMLPurifier_Token_Empty('d'),
|
||||
new HTMLPurifier_Token_Start('pooloka'),
|
||||
new HTMLPurifier_Token_Start('poolasdf'),
|
||||
new HTMLPurifier_Token_Empty('ds'),
|
||||
new HTMLPurifier_Token_End('poolasdf'),
|
||||
new HTMLPurifier_Token_End('pooloka'),
|
||||
),
|
||||
'PH5P' => $alt,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_whitespaceInTag() {
|
||||
$this->assertTokenization(
|
||||
'<a'."\t".'href="foobar.php"'."\n".'title="foo!">Link to <b id="asdf">foobar</b></a>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Start('a',array('href'=>'foobar.php','title'=>'foo!')),
|
||||
new HTMLPurifier_Token_Text('Link to '),
|
||||
new HTMLPurifier_Token_Start('b',array('id'=>'asdf')),
|
||||
new HTMLPurifier_Token_Text('foobar'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
new HTMLPurifier_Token_End('a'),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_emptyTag() {
|
||||
$this->assertTokenization(
|
||||
'<br />',
|
||||
array( new HTMLPurifier_Token_Empty('br') )
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_comment() {
|
||||
$this->assertTokenization(
|
||||
'<!-- Comment -->',
|
||||
array( new HTMLPurifier_Token_Comment(' Comment ') ),
|
||||
array(
|
||||
'PEARSax3' => array( new HTMLPurifier_Token_Comment('-- Comment --') ),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_malformedComment() {
|
||||
$this->assertTokenization(
|
||||
'<!-- not so well formed --->',
|
||||
array( new HTMLPurifier_Token_Comment(' not so well formed -') ),
|
||||
array(
|
||||
'PEARSax3' => array( new HTMLPurifier_Token_Comment('-- not so well formed ---') ),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_unterminatedTag() {
|
||||
$this->assertTokenization(
|
||||
'<a href=""',
|
||||
array( new HTMLPurifier_Token_Text('<a href=""') ),
|
||||
array(
|
||||
// I like our behavior better, but it's non-standard
|
||||
'DOMLex' => array( new HTMLPurifier_Token_Empty('a', array('href'=>'')) ),
|
||||
'PEARSax3' => array( new HTMLPurifier_Token_Start('a', array('href'=>'')) ),
|
||||
'PH5P' => false, // total barfing, grabs scaffolding too
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_specialEntities() {
|
||||
$this->assertTokenization(
|
||||
'&lt;b&gt;',
|
||||
array(
|
||||
new HTMLPurifier_Token_Text('<b>')
|
||||
),
|
||||
array(
|
||||
// some parsers will separate entities out
|
||||
'PEARSax3' => $split = array(
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('b'),
|
||||
new HTMLPurifier_Token_Text('>'),
|
||||
),
|
||||
'PH5P' => $split,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_earlyQuote() {
|
||||
$this->assertTokenization(
|
||||
'<a "=>',
|
||||
array( new HTMLPurifier_Token_Empty('a') ),
|
||||
array(
|
||||
// we barf on this input
|
||||
'DirectLex' => $tokens = array(
|
||||
new HTMLPurifier_Token_Start('a', array('"' => ''))
|
||||
),
|
||||
'PEARSax3' => $tokens,
|
||||
'PH5P' => array(
|
||||
new HTMLPurifier_Token_Empty('a', array('"' => ''))
|
||||
),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_unescapedQuote() {
|
||||
$this->assertTokenization(
|
||||
'"',
|
||||
array( new HTMLPurifier_Token_Text('"') )
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_escapedQuote() {
|
||||
$this->assertTokenization(
|
||||
'&quot;',
|
||||
array( new HTMLPurifier_Token_Text('"') ),
|
||||
array(
|
||||
'PEARSax3' => false, // PEAR barfs on this
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_cdata() {
|
||||
$this->assertTokenization(
|
||||
'<![CDATA[You <b>can&#39;t</b> get me!]]>',
|
||||
array( new HTMLPurifier_Token_Text('You <b>can&#39;t</b> get me!') ),
|
||||
array(
|
||||
// PEAR splits up all of the CDATA
|
||||
'PEARSax3' => $split = array(
|
||||
new HTMLPurifier_Token_Text('You '),
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('b'),
|
||||
new HTMLPurifier_Token_Text('>'),
|
||||
new HTMLPurifier_Token_Text('can'),
|
||||
new HTMLPurifier_Token_Text('&'),
|
||||
new HTMLPurifier_Token_Text('#39;t'),
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('/b'),
|
||||
new HTMLPurifier_Token_Text('>'),
|
||||
new HTMLPurifier_Token_Text(' get me!'),
|
||||
),
|
||||
'PH5P' => $split,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_characterEntity() {
|
||||
$this->assertTokenization(
|
||||
'&theta;',
|
||||
array( new HTMLPurifier_Token_Text("\xCE\xB8") )
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_characterEntityInCDATA() {
|
||||
$this->assertTokenization(
|
||||
'<![CDATA[&rarr;]]>',
|
||||
array( new HTMLPurifier_Token_Text("&rarr;") ),
|
||||
array(
|
||||
'PEARSax3' => $split = array(
|
||||
new HTMLPurifier_Token_Text('&'),
|
||||
new HTMLPurifier_Token_Text('rarr;'),
|
||||
),
|
||||
'PH5P' => $split,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_entityInAttribute() {
|
||||
$this->assertTokenization(
|
||||
'<a href="index.php?title=foo&amp;id=bar">Link</a>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Start('a',array('href' => 'index.php?title=foo&id=bar')),
|
||||
new HTMLPurifier_Token_Text('Link'),
|
||||
new HTMLPurifier_Token_End('a'),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_preserveUTF8() {
|
||||
$this->assertTokenization(
|
||||
"\xCE\xB8",
|
||||
array( new HTMLPurifier_Token_Text("\xCE\xB8") )
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_specialEntityInAttribute() {
|
||||
$this->assertTokenization(
|
||||
'<br test="x &lt; 6" />',
|
||||
array( new HTMLPurifier_Token_Empty('br', array('test' => 'x < 6')) )
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_emoticonProtection() {
|
||||
$this->config->set('Core', 'AggressivelyFixLt', true);
|
||||
$this->assertTokenization(
|
||||
'<b>Whoa! <3 That\'s not good >.></b>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Start('b'),
|
||||
new HTMLPurifier_Token_Text('Whoa! '),
|
||||
new HTMLPurifier_Token_Text('<3 That\'s not good >'),
|
||||
new HTMLPurifier_Token_Text('.>'),
|
||||
new HTMLPurifier_Token_End('b')
|
||||
),
|
||||
array(
|
||||
// text is absorbed together
|
||||
'DOMLex' => array(
|
||||
new HTMLPurifier_Token_Start('b'),
|
||||
new HTMLPurifier_Token_Text('Whoa! <3 That\'s not good >.>'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
),
|
||||
'PEARSax3' => false, // totally mangled
|
||||
'PH5P' => array( // interesting grouping
|
||||
new HTMLPurifier_Token_Start('b'),
|
||||
new HTMLPurifier_Token_Text('Whoa! '),
|
||||
new HTMLPurifier_Token_Text('<'),
|
||||
new HTMLPurifier_Token_Text('3 That\'s not good >.>'),
|
||||
new HTMLPurifier_Token_End('b'),
|
||||
),
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_commentWithFunkyChars() {
|
||||
$this->assertTokenization(
|
||||
'<!-- This >< comment --><br />',
|
||||
array(
|
||||
new HTMLPurifier_Token_Comment(' This >< comment '),
|
||||
new HTMLPurifier_Token_Empty('br'),
|
||||
),
|
||||
array(
|
||||
'PEARSax3' => false,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_unterminatedComment() {
|
||||
$this->assertTokenization(
|
||||
'<!-- This >< comment',
|
||||
array( new HTMLPurifier_Token_Comment(' This >< comment') ),
|
||||
array(
|
||||
'DOMLex' => false,
|
||||
'PEARSax3' => false,
|
||||
'PH5P' => false,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_scriptCDATAContents() {
|
||||
$this->config->set('HTML', 'Trusted', true);
|
||||
$this->assertTokenization(
|
||||
'Foo: <script>alert("<foo>");</script>',
|
||||
array(
|
||||
new HTMLPurifier_Token_Text('Foo: '),
|
||||
new HTMLPurifier_Token_Start('script'),
|
||||
new HTMLPurifier_Token_Text('alert("<foo>");'),
|
||||
new HTMLPurifier_Token_End('script'),
|
||||
),
|
||||
array(
|
||||
'PEARSax3' => false,
|
||||
// PH5P, for some reason, bubbles the script to <head>
|
||||
'PH5P' => false,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_entitiesInComment() {
|
||||
$this->config->set('Core', 'AggressivelyFixLt', true);
|
||||
$this->assertTokenization(
|
||||
'<!-- This comment < &lt; & -->',
|
||||
array( new HTMLPurifier_Token_Comment(' This comment < &lt; & ') ),
|
||||
array(
|
||||
'PEARSax3' => false
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_attributeWithSpecialCharacters() {
|
||||
$this->assertTokenization(
|
||||
'<a href="><>">',
|
||||
array( new HTMLPurifier_Token_Empty('a', array('href' => '><>')) ),
|
||||
array(
|
||||
'DirectLex' => array(
|
||||
new HTMLPurifier_Token_Start('a', array('href' => '')),
|
||||
new HTMLPurifier_Token_Text('<">'),
|
||||
),
|
||||
'PEARSax3' => false,
|
||||
)
|
||||
);
|
||||
}
|
||||
|
||||
function test_tokenizeHTML_emptyTagWithSlashInAttribute() {
|
||||
$this->assertTokenization(
|
||||
'<param name="src" value="http://example.com/video.wmv" />',
|
||||
array( new HTMLPurifier_Token_Empty('param', array('name' => 'src', 'value' => 'http://example.com/video.wmv')) )
|
||||
);
|
||||
}
|
||||
|
||||
/*
|
||||
|
||||
function test_tokenizeHTML_() {
|
||||
$this->assertTokenization(
|
||||
,
|
||||
array(
|
||||
|
||||
)
|
||||
);
|
||||
}
|
||||
*/
|
||||
|
||||
}
|
||||
|
||||
|
@@ -16,6 +16,7 @@ class HTMLPurifier_SimpleTest_Reporter extends HTMLReporter
|
||||
?>><?php echo $file ?></option>
|
||||
<?php } ?>
|
||||
</select>
|
||||
<input type="checkbox" name="standalone" title="Standalone version?" <?php if(isset($_GET['standalone'])) {echo 'checked="checked" ';} ?>/>
|
||||
<input type="submit" value="Go">
|
||||
</form>
|
||||
<?php
|
||||
|
@@ -11,26 +11,36 @@ class HTMLPurifier_Strategy_CoreTest extends HTMLPurifier_StrategyHarness
|
||||
$this->obj = new HTMLPurifier_Strategy_Core();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
function testBlankInput() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
function testMakeWellFormed() {
|
||||
$this->assertResult(
|
||||
'<b>Make well formed.',
|
||||
'<b>Make well formed.</b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testFixNesting() {
|
||||
$this->assertResult(
|
||||
'<b><div>Fix nesting.</div></b>',
|
||||
'<b></b><div>Fix nesting.</div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveForeignElements() {
|
||||
$this->assertResult(
|
||||
'<asdf>Foreign element removal.</asdf>',
|
||||
'Foreign element removal.'
|
||||
);
|
||||
}
|
||||
|
||||
function testFirstThree() {
|
||||
$this->assertResult(
|
||||
'<foo><b><div>All three.</div></b>',
|
||||
'<b></b><div>All three.</div>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -11,79 +11,81 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
|
||||
$this->obj = new HTMLPurifier_Strategy_FixNesting();
|
||||
}
|
||||
|
||||
function testBlockAndInlineIntegration() {
|
||||
|
||||
// legal inline
|
||||
function testPreserveInlineInRoot() {
|
||||
$this->assertResult('<b>Bold text</b>');
|
||||
}
|
||||
|
||||
// legal inline and block (default parent element is FLOW)
|
||||
function testPreserveInlineAndBlockInRoot() {
|
||||
$this->assertResult('<a href="about:blank">Blank</a><div>Block</div>');
|
||||
}
|
||||
|
||||
// illegal block in inline
|
||||
function testRemoveBlockInInline() {
|
||||
$this->assertResult(
|
||||
'<b><div>Illegal div.</div></b>',
|
||||
'<b>Illegal div.</b>'
|
||||
);
|
||||
|
||||
// same test with different configuration (fragile)
|
||||
$this->assertResult(
|
||||
'<b><div>Illegal div.</div></b>',
|
||||
'<b>&lt;div&gt;Illegal div.&lt;/div&gt;</b>',
|
||||
array('Core.EscapeInvalidChildren' => true)
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testNodeRemovalIntegration() {
|
||||
function testEscapeBlockInInline() {
|
||||
$this->config->set('Core', 'EscapeInvalidChildren', true);
|
||||
$this->assertResult(
|
||||
'<b><div>Illegal div.</div></b>',
|
||||
'<b>&lt;div&gt;Illegal div.&lt;/div&gt;</b>'
|
||||
);
|
||||
}
|
||||
|
||||
// test of empty set that's required, resulting in removal of node
|
||||
function testRemoveNodeWithMissingRequiredElements() {
|
||||
$this->assertResult('<ul></ul>', '');
|
||||
}
|
||||
|
||||
// test illegal text which gets removed
|
||||
function testRemoveIllegalPCDATA() {
|
||||
$this->assertResult(
|
||||
'<ul>Illegal text<li>Legal item</li></ul>',
|
||||
'<ul><li>Legal item</li></ul>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testTableIntegration() {
|
||||
// test custom table definition
|
||||
$this->assertResult(
|
||||
'<table><tr><td>Cell 1</td></tr></table>'
|
||||
);
|
||||
function testCustomTableDefinition() {
|
||||
$this->assertResult('<table><tr><td>Cell 1</td></tr></table>');
|
||||
}
|
||||
|
||||
function testRemoveEmptyTable() {
|
||||
$this->assertResult('<table></table>', '');
|
||||
}
|
||||
|
||||
function testChameleonIntegration() {
|
||||
|
||||
// block in inline ins not allowed
|
||||
function testChameleonRemoveBlockInNodeInInline() {
|
||||
$this->assertResult(
|
||||
'<span><ins><div>Not allowed!</div></ins></span>',
|
||||
'<span><ins>Not allowed!</ins></span>'
|
||||
);
|
||||
}
|
||||
|
||||
// test block element that has inline content
|
||||
function testChameleonRemoveBlockInBlockNodeWithInlineContent() {
|
||||
$this->assertResult(
|
||||
'<h1><ins><div>Not allowed!</div></ins></h1>',
|
||||
'<h1><ins>Not allowed!</ins></h1>'
|
||||
);
|
||||
}
|
||||
|
||||
// stacked ins/del
|
||||
function testNestedChameleonRemoveBlockInNodeWithInlineContent() {
|
||||
$this->assertResult(
|
||||
'<h1><ins><del><div>Not allowed!</div></del></ins></h1>',
|
||||
'<h1><ins><del>Not allowed!</del></ins></h1>'
|
||||
);
|
||||
}
|
||||
|
||||
function testNestedChameleonPreserveBlockInBlock() {
|
||||
$this->assertResult(
|
||||
'<div><ins><del><div>Allowed!</div></del></ins></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testChameleonEscapeInvalidBlockInInline() {
|
||||
$this->config->set('Core', 'EscapeInvalidChildren', true);
|
||||
$this->assertResult( // alt config
|
||||
'<span><ins><div>Not allowed!</div></ins></span>',
|
||||
'<span><ins>&lt;div&gt;Not allowed!&lt;/div&gt;</ins></span>',
|
||||
array('Core.EscapeInvalidChildren' => true)
|
||||
'<span><ins>&lt;div&gt;Not allowed!&lt;/div&gt;</ins></span>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testExclusionsIntegration() {
|
||||
@@ -94,42 +96,44 @@ class HTMLPurifier_Strategy_FixNestingTest extends HTMLPurifier_StrategyHarness
|
||||
);
|
||||
}
|
||||
|
||||
function testCustomParentIntegration() {
|
||||
// test inline parent
|
||||
$this->assertResult(
|
||||
'<b>Bold</b>', true, array('HTML.Parent' => 'span')
|
||||
);
|
||||
$this->assertResult(
|
||||
'<div>Reject</div>', 'Reject', array('HTML.Parent' => 'span')
|
||||
);
|
||||
}
|
||||
|
||||
function testError() {
|
||||
// test fallback to div
|
||||
$this->expectError('Cannot use unrecognized element as parent.');
|
||||
$this->assertResult(
|
||||
'<div>Accept</div>', true, array('HTML.Parent' => 'obviously-impossible')
|
||||
);
|
||||
$this->swallowErrors();
|
||||
|
||||
function testPreserveInlineNodeInInlineRootNode() {
|
||||
$this->config->set('HTML', 'Parent', 'span');
|
||||
$this->assertResult('<b>Bold</b>');
|
||||
}
|
||||
|
||||
function testDoubleCheckIntegration() {
|
||||
// breaks without the redundant checking code
|
||||
function testRemoveBlockNodeInInlineRootNode() {
|
||||
$this->config->set('HTML', 'Parent', 'span');
|
||||
$this->assertResult('<div>Reject</div>', 'Reject');
|
||||
}
|
||||
|
||||
function testInvalidParentError() {
|
||||
// test fallback to div
|
||||
$this->config->set('HTML', 'Parent', 'obviously-impossible');
|
||||
// $this->expectError('Cannot use unrecognized element as parent');
|
||||
$this->assertResult('<div>Accept</div>');
|
||||
$this->swallowErrors();
|
||||
}
|
||||
|
||||
function testCascadingRemovalOfNodesMissingRequiredChildren() {
|
||||
$this->assertResult('<table><tr></tr></table>', '');
|
||||
}
|
||||
|
||||
// special case, prevents scrolling one back to find parent
|
||||
function testCascadingRemovalSpecialCaseCannotScrollOneBack() {
|
||||
$this->assertResult('<table><tr></tr><tr></tr></table>', '');
|
||||
}
|
||||
|
||||
// cascading rollbacks
|
||||
$this->assertResult(
|
||||
'<table><tbody><tr></tr><tr></tr></tbody><tr></tr><tr></tr></table>',
|
||||
''
|
||||
);
|
||||
function testLotsOfCascadingRemovalOfNodes() {
|
||||
$this->assertResult('<table><tbody><tr></tr><tr></tr></tbody><tr></tr><tr></tr></table>', '');
|
||||
}
|
||||
|
||||
// rollbacks twice
|
||||
function testAdjacentRemovalOfNodeMissingRequiredChildren() {
|
||||
$this->assertResult('<table></table><table></table>', '');
|
||||
}
|
||||
|
||||
function testStrictBlockquoteInHTML401() {
|
||||
$this->config->set('HTML', 'Doctype', 'HTML 4.01 Strict');
|
||||
$this->assertResult('<blockquote>text</blockquote>', '<blockquote><p>text</p></blockquote>');
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
|
@@ -28,6 +28,11 @@ class HTMLPurifier_Strategy_FixNesting_ErrorsTest extends HTMLPurifier_Strategy_
|
||||
$this->invoke("<span>Valid<div>Invalid</div></span>");
|
||||
}
|
||||
|
||||
function testNoNodeReorganizedForEmptyNode() {
|
||||
$this->expectNoErrorCollection();
|
||||
$this->invoke("<span></span>");
|
||||
}
|
||||
|
||||
function testNodeContentsRemoved() {
|
||||
$this->expectErrorCollection(E_ERROR, 'Strategy_FixNesting: Node contents removed');
|
||||
$this->expectContext('CurrentToken', new HTMLPurifier_Token_Start('span', array(), 1));
|
||||
|
@@ -9,113 +9,77 @@ class HTMLPurifier_Strategy_MakeWellFormedTest extends HTMLPurifier_StrategyHarn
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_Strategy_MakeWellFormed();
|
||||
$this->config = array();
|
||||
}
|
||||
|
||||
function testNormalIntegration() {
|
||||
function testEmptyInput() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
function testWellFormedInput() {
|
||||
$this->assertResult('This is <b>bold text</b>.');
|
||||
}
|
||||
|
||||
function testUnclosedTagIntegration() {
|
||||
function testUnclosedTagTerminatedByDocumentEnd() {
|
||||
$this->assertResult(
|
||||
'<b>Unclosed tag, gasp!',
|
||||
'<b>Unclosed tag, gasp!</b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testUnclosedTagTerminatedByParentNodeEnd() {
|
||||
$this->assertResult(
|
||||
'<b><i>Bold and italic?</b>',
|
||||
'<b><i>Bold and italic?</i></b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveStrayClosingTag() {
|
||||
$this->assertResult(
|
||||
'Unused end tags... recycle!</b>',
|
||||
'Unused end tags... recycle!'
|
||||
);
|
||||
}
|
||||
|
||||
function testEmptyTagDetectionIntegration() {
|
||||
function testConvertStartToEmpty() {
|
||||
$this->assertResult(
|
||||
'<br style="clear:both;">',
|
||||
'<br style="clear:both;" />'
|
||||
);
|
||||
}
|
||||
|
||||
function testConvertEmptyToStart() {
|
||||
$this->assertResult(
|
||||
'<div style="clear:both;" />',
|
||||
'<div style="clear:both;"></div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAutoClose() {
|
||||
// paragraph
|
||||
|
||||
function testAutoCloseParagraph() {
|
||||
$this->assertResult(
|
||||
'<p>Paragraph 1<p>Paragraph 2',
|
||||
'<p>Paragraph 1</p><p>Paragraph 2</p>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAutoCloseParagraphInsideDiv() {
|
||||
$this->assertResult(
|
||||
'<div><p>Paragraphs<p>In<p>A<p>Div</div>',
|
||||
'<div><p>Paragraphs</p><p>In</p><p>A</p><p>Div</p></div>'
|
||||
);
|
||||
}
|
||||
|
||||
// list
|
||||
|
||||
function testAutoCloseListItem() {
|
||||
$this->assertResult(
|
||||
'<ol><li>Item 1<li>Item 2</ol>',
|
||||
'<ol><li>Item 1</li><li>Item 2</li></ol>'
|
||||
);
|
||||
}
|
||||
|
||||
// colgroup
|
||||
|
||||
function testAutoCloseColgroup() {
|
||||
$this->assertResult(
|
||||
'<table><colgroup><col /><tr></tr></table>',
|
||||
'<table><colgroup><col /></colgroup><tr></tr></table>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testMultipleInjectors() {
|
||||
|
||||
$this->config = array('AutoFormat.AutoParagraph' => true, 'AutoFormat.Linkify' => true);
|
||||
|
||||
$this->assertResult(
|
||||
'Foobar',
|
||||
'<p>Foobar</p>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'http://example.com',
|
||||
'<p><a href="http://example.com">http://example.com</a></p>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'<b>http://example.com</b>',
|
||||
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'<b>http://example.com',
|
||||
'<p><b><a href="http://example.com">http://example.com</a></b></p>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'http://example.com
|
||||
|
||||
http://dev.example.com',
|
||||
'<p><a href="http://example.com">http://example.com</a></p><p><a href="http://dev.example.com">http://dev.example.com</a></p>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'http://example.com <div>http://example.com</div>',
|
||||
'<p><a href="http://example.com">http://example.com</a> </p><div><a href="http://example.com">http://example.com</a></div>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'This URL http://example.com is what you need',
|
||||
'<p>This URL <a href="http://example.com">http://example.com</a> is what you need</p>'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
85
tests/HTMLPurifier/Strategy/MakeWellFormed_InjectorTest.php
Normal file
@@ -0,0 +1,85 @@
<?php

require_once 'HTMLPurifier/StrategyHarness.php';
require_once 'HTMLPurifier/Strategy/MakeWellFormed.php';

class HTMLPurifier_Strategy_MakeWellFormed_InjectorTest extends HTMLPurifier_StrategyHarness
{

    function setUp() {
        parent::setUp();
        $this->obj = new HTMLPurifier_Strategy_MakeWellFormed();
        $this->config->set('AutoFormat', 'AutoParagraph', true);
        $this->config->set('AutoFormat', 'Linkify', true);
        generate_mock_once('HTMLPurifier_Injector');
    }

    function testEndNotification() {
        $mock = new HTMLPurifier_InjectorMock();
        $mock->skip = false;
        $mock->expectAt(0, 'notifyEnd', array(new HTMLPurifier_Token_End('b')));
        $mock->expectAt(1, 'notifyEnd', array(new HTMLPurifier_Token_End('i')));
        $mock->expectCallCount('notifyEnd', 2);
        $this->config->set('AutoFormat', 'AutoParagraph', false);
        $this->config->set('AutoFormat', 'Linkify', false);
        $this->config->set('AutoFormat', 'Custom', array($mock));
        $this->assertResult('<i><b>asdf</b>', '<i><b>asdf</b></i>');
    }

    function testOnlyAutoParagraph() {
        $this->assertResult(
            'Foobar',
            '<p>Foobar</p>'
        );
    }

    function testParagraphWrappingOnlyLink() {
        $this->assertResult(
            'http://example.com',
            '<p><a href="http://example.com">http://example.com</a></p>'
        );
    }

    function testParagraphWrappingNodeContainingLink() {
        $this->assertResult(
            '<b>http://example.com</b>',
            '<p><b><a href="http://example.com">http://example.com</a></b></p>'
        );
    }

    function testParagraphWrappingPoorlyFormedNodeContainingLink() {
        $this->assertResult(
            '<b>http://example.com',
            '<p><b><a href="http://example.com">http://example.com</a></b></p>'
        );
    }

    function testTwoParagraphsContainingOnlyOneLink() {
        $this->assertResult(
            "http://example.com\n\nhttp://dev.example.com",
            '<p><a href="http://example.com">http://example.com</a></p><p><a href="http://dev.example.com">http://dev.example.com</a></p>'
        );
    }

    function testParagraphNextToDivWithLinks() {
        $this->assertResult(
            'http://example.com <div>http://example.com</div>',
            '<p><a href="http://example.com">http://example.com</a> </p><div><a href="http://example.com">http://example.com</a></div>'
        );
    }

    function testRealisticLinkInSentence() {
        $this->assertResult(
            'This URL http://example.com is what you need',
            '<p>This URL <a href="http://example.com">http://example.com</a> is what you need</p>'
        );
    }

    function testParagraphAfterLinkifiedURL() {
        $this->assertResult(
            "http://google.com\n\n<b>b</b>",
            "<p><a href=\"http://google.com\">http://google.com</a></p><p><b>b</b></p>"
        );
    }

}
@@ -3,8 +3,7 @@
|
||||
require_once 'HTMLPurifier/StrategyHarness.php';
|
||||
require_once 'HTMLPurifier/Strategy/RemoveForeignElements.php';
|
||||
|
||||
class HTMLPurifier_Strategy_RemoveForeignElementsTest
|
||||
extends HTMLPurifier_StrategyHarness
|
||||
class HTMLPurifier_Strategy_RemoveForeignElementsTest extends HTMLPurifier_StrategyHarness
|
||||
{
|
||||
|
||||
function setUp() {
|
||||
@@ -12,96 +11,84 @@ class HTMLPurifier_Strategy_RemoveForeignElementsTest
|
||||
$this->obj = new HTMLPurifier_Strategy_RemoveForeignElements();
|
||||
}
|
||||
|
||||
function test() {
|
||||
|
||||
$this->config = array('HTML.Doctype' => 'XHTML 1.0 Strict');
|
||||
|
||||
function testBlankInput() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
function testPreserveRecognizedElements() {
|
||||
$this->assertResult('This is <b>bold text</b>.');
|
||||
}
|
||||
|
||||
function testRemoveForeignElements() {
|
||||
$this->assertResult(
|
||||
'<asdf>Bling</asdf><d href="bang">Bong</d><foobar />',
|
||||
'BlingBong'
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveScriptAndContents() {
|
||||
$this->assertResult(
|
||||
'<script>alert();</script>',
|
||||
''
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveStyleAndContents() {
|
||||
$this->assertResult(
|
||||
'<style>.foo {blink;}</style>',
|
||||
''
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveOnlyScriptTagsLegacy() {
|
||||
$this->config->set('Core', 'RemoveScriptContents', false);
|
||||
$this->assertResult(
|
||||
'<script>alert();</script>',
|
||||
'alert();',
|
||||
array('Core.RemoveScriptContents' => false)
|
||||
'alert();'
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveOnlyScriptTags() {
|
||||
$this->config->set('Core', 'HiddenElements', array());
|
||||
$this->assertResult(
|
||||
'<script>alert();</script>',
|
||||
'alert();',
|
||||
array('Core.HiddenElements' => array())
|
||||
'alert();'
|
||||
);
|
||||
}
|
||||
|
||||
$this->assertResult(
|
||||
'<menu><li>Item 1</li></menu>',
|
||||
'<ul><li>Item 1</li></ul>'
|
||||
);
|
||||
function testRemoveInvalidImg() {
|
||||
$this->assertResult('<img />', '');
|
||||
}
|
||||
|
||||
// test center transform
|
||||
$this->assertResult(
|
||||
'<center>Look I am Centered!</center>',
|
||||
'<div style="text-align:center;">Look I am Centered!</div>'
|
||||
);
|
||||
|
||||
// test font transform
|
||||
$this->assertResult(
|
||||
'<font color="red" face="Arial" size="6">Big Warning!</font>',
|
||||
'<span style="color:red;font-family:Arial;font-size:xx-large;">Big'.
|
||||
' Warning!</span>'
|
||||
);
|
||||
|
||||
// test removal of invalid img tag
|
||||
$this->assertResult(
|
||||
'<img />',
|
||||
''
|
||||
);
|
||||
|
||||
// test preservation of valid img tag
|
||||
function testPreserveValidImg() {
|
||||
$this->assertResult('<img src="foobar.gif" alt="foobar.gif" />');
|
||||
}
|
||||
|
||||
// test preservation of invalid img tag when removal is disabled
|
||||
$this->assertResult(
|
||||
'<img />',
|
||||
true,
|
||||
array(
|
||||
'Core.RemoveInvalidImg' => false
|
||||
)
|
||||
);
|
||||
function testPreserveInvalidImgWhenRemovalIsDisabled() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult('<img />');
|
||||
}
|
||||
|
||||
// test transform to unallowed element
|
||||
$this->assertResult(
|
||||
'<font color="red" face="Arial" size="6">Big Warning!</font>',
|
||||
'Big Warning!',
|
||||
array('HTML.Allowed' => 'div')
|
||||
);
|
||||
|
||||
// text-ify commented script contents ( the trailing comment gets
|
||||
// removed during generation )
|
||||
function testTextifyCommentedScriptContents() {
|
||||
$this->config->set('HTML', 'Trusted', true);
|
||||
$this->config->set('Output', 'CommentScriptContents', false); // simplify output
|
||||
$this->assertResult(
|
||||
'<script type="text/javascript"><!--
|
||||
alert(<b>bold</b>);
|
||||
// --></script>',
|
||||
'<script type="text/javascript">
|
||||
alert(<b>bold</b>);
|
||||
// </script>',
|
||||
array('HTML.Trusted' => true, 'Output.CommentScriptContents' => false)
|
||||
// </script>'
|
||||
);
|
||||
}
|
||||
|
||||
function testRequiredAttributesTestNotPerformedOnEndTag() {
|
||||
$this->config->set('HTML', 'DefinitionID',
|
||||
'HTMLPurifier_Strategy_RemoveForeignElementsTest'.
|
||||
'->testRequiredAttributesTestNotPerformedOnEndTag');
|
||||
$def =& $this->config->getHTMLDefinition(true);
|
||||
$def->addElement('f', 'Block', 'Optional: #PCDATA', false, array('req*' => 'Text'));
|
||||
$this->assertResult('<f req="text">Foo</f> Bar');
|
||||
}
|
||||
|
||||
}
|
||||
|
@@ -0,0 +1,46 @@
<?php

require_once 'HTMLPurifier/StrategyHarness.php';
require_once 'HTMLPurifier/Strategy/RemoveForeignElements.php';

class HTMLPurifier_Strategy_RemoveForeignElements_TidyTest
    extends HTMLPurifier_StrategyHarness
{

    function setUp() {
        parent::setUp();
        $this->obj = new HTMLPurifier_Strategy_RemoveForeignElements();
        $this->config->set('HTML', 'TidyLevel', 'heavy');
    }

    function testCenterTransform() {
        $this->assertResult(
            '<center>Look I am Centered!</center>',
            '<div style="text-align:center;">Look I am Centered!</div>'
        );
    }

    function testFontTransform() {
        $this->assertResult(
            '<font color="red" face="Arial" size="6">Big Warning!</font>',
            '<span style="color:red;font-family:Arial;font-size:xx-large;">Big'.
            ' Warning!</span>'
        );
    }

    function testTransformToForbiddenElement() {
        $this->config->set('HTML', 'Allowed', 'div');
        $this->assertResult(
            '<font color="red" face="Arial" size="6">Big Warning!</font>',
            'Big Warning!'
        );
    }

    function testMenuTransform() {
        $this->assertResult(
            '<menu><li>Item 1</li></menu>',
            '<ul><li>Item 1</li></ul>'
        );
    }

}
@@ -1,6 +1,5 @@
|
||||
<?php
|
||||
|
||||
require_once('HTMLPurifier/Config.php');
|
||||
require_once('HTMLPurifier/StrategyHarness.php');
|
||||
require_once('HTMLPurifier/Strategy/ValidateAttributes.php');
|
||||
|
||||
@@ -11,126 +10,99 @@ class HTMLPurifier_Strategy_ValidateAttributesTest extends
|
||||
function setUp() {
|
||||
parent::setUp();
|
||||
$this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
|
||||
$this->config = array('HTML.Doctype' => 'XHTML 1.0 Strict');
|
||||
}
|
||||
|
||||
function testEmpty() {
|
||||
function testEmptyInput() {
|
||||
$this->assertResult('');
|
||||
}
|
||||
|
||||
function testIDs() {
|
||||
function testRemoveIDByDefault() {
|
||||
$this->assertResult(
|
||||
'<div id="valid">Kill the ID.</div>',
|
||||
'<div>Kill the ID.</div>'
|
||||
);
|
||||
}
|
||||
|
||||
$this->assertResult('<div id="valid">Preserve the ID.</div>', true,
|
||||
array('HTML.EnableAttrID' => true));
|
||||
|
||||
$this->assertResult(
|
||||
'<div id="0invalid">Kill the ID.</div>',
|
||||
'<div>Kill the ID.</div>',
|
||||
array('HTML.EnableAttrID' => true)
|
||||
);
|
||||
|
||||
// test id accumulator
|
||||
$this->assertResult(
|
||||
'<div id="valid">Valid</div><div id="valid">Invalid</div>',
|
||||
'<div id="valid">Valid</div><div>Invalid</div>',
|
||||
array('HTML.EnableAttrID' => true)
|
||||
);
|
||||
|
||||
function testRemoveInvalidDir() {
|
||||
$this->assertResult(
|
||||
'<span dir="up-to-down">Bad dir.</span>',
|
||||
'<span>Bad dir.</span>'
|
||||
);
|
||||
|
||||
// test attribute key case sensitivity
|
||||
$this->assertResult(
|
||||
'<div ID="valid">Convert ID to lowercase.</div>',
|
||||
'<div id="valid">Convert ID to lowercase.</div>',
|
||||
array('HTML.EnableAttrID' => true)
|
||||
);
|
||||
|
||||
// test simple attribute substitution
|
||||
$this->assertResult(
|
||||
'<div id=" valid ">Trim whitespace.</div>',
|
||||
'<div id="valid">Trim whitespace.</div>',
|
||||
array('HTML.EnableAttrID' => true)
|
||||
);
|
||||
|
||||
// test configuration id blacklist
|
||||
$this->assertResult(
|
||||
'<div id="invalid">Invalid</div>',
|
||||
'<div>Invalid</div>',
|
||||
array(
|
||||
'Attr.IDBlacklist' => array('invalid'),
|
||||
'HTML.EnableAttrID' => true
|
||||
)
|
||||
);
|
||||
|
||||
// name rewritten as id
|
||||
$this->assertResult(
|
||||
'<a name="foobar" />',
|
||||
'<a id="foobar" />',
|
||||
array('HTML.EnableAttrID' => true)
|
||||
);
|
||||
}
|
||||
|
||||
function testClasses() {
|
||||
function testPreserveValidClass() {
|
||||
$this->assertResult('<div class="valid">Valid</div>');
|
||||
}
|
||||
|
||||
function testSelectivelyRemoveInvalidClasses() {
|
||||
$this->assertResult(
|
||||
'<div class="valid 0invalid">Keep valid.</div>',
|
||||
'<div class="valid">Keep valid.</div>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTitle() {
|
||||
function testPreserveTitle() {
|
||||
$this->assertResult(
|
||||
'<acronym title="PHP: Hypertext Preprocessor">PHP</acronym>'
|
||||
);
|
||||
}
|
||||
|
||||
function testLang() {
|
||||
function testAddXMLLang() {
|
||||
$this->assertResult(
|
||||
'<span lang="fr">La soupe.</span>',
|
||||
'<span lang="fr" xml:lang="fr">La soupe.</span>'
|
||||
);
|
||||
}
|
||||
|
||||
// test only xml:lang for XHTML 1.1
|
||||
function testOnlyXMLLangInXHTML11() {
|
||||
$this->config->set('HTML', 'Doctype', 'XHTML 1.1');
|
||||
$this->assertResult(
|
||||
'<b lang="en">asdf</b>',
|
||||
'<b xml:lang="en">asdf</b>', array('HTML.Doctype' => 'XHTML 1.1')
|
||||
'<b xml:lang="en">asdf</b>'
|
||||
);
|
||||
}
|
||||
|
||||
function testAlign() {
|
||||
|
||||
$this->assertResult(
|
||||
'<h1 align="center">Centered Headline</h1>',
|
||||
'<h1 style="text-align:center;">Centered Headline</h1>'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<h1 align="right">Right-aligned Headline</h1>',
|
||||
'<h1 style="text-align:right;">Right-aligned Headline</h1>'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<h1 align="left">Left-aligned Headline</h1>',
|
||||
'<h1 style="text-align:left;">Left-aligned Headline</h1>'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<p align="justify">Justified Paragraph</p>',
|
||||
'<p style="text-align:justify;">Justified Paragraph</p>'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<h1 align="invalid">Invalid Headline</h1>',
|
||||
'<h1>Invalid Headline</h1>'
|
||||
);
|
||||
|
||||
function testBasicURI() {
|
||||
$this->assertResult('<a href="http://www.google.com/">Google</a>');
|
||||
}
|
||||
|
||||
function testTable() {
|
||||
function testInvalidURI() {
|
||||
$this->assertResult(
|
||||
'<a href="javascript:badstuff();">Google</a>',
|
||||
'<a>Google</a>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBdoAddMissingDir() {
|
||||
$this->assertResult(
|
||||
'<bdo>Go left.</bdo>',
|
||||
'<bdo dir="ltr">Go left.</bdo>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBdoReplaceInvalidDirWithDefault() {
|
||||
$this->assertResult(
|
||||
'<bdo dir="blahblah">Invalid value!</bdo>',
|
||||
'<bdo dir="ltr">Invalid value!</bdo>'
|
||||
);
|
||||
}
|
||||
|
||||
function testBdoAlternateDefaultDir() {
|
||||
$this->config->set('Attr', 'DefaultTextDir', 'rtl');
|
||||
$this->assertResult(
|
||||
'<bdo>Go right.</bdo>',
|
||||
'<bdo dir="rtl">Go right.</bdo>'
|
||||
);
|
||||
}
|
||||
|
||||
function testRemoveDirWhenNotRequired() {
|
||||
$this->assertResult(
|
||||
'<span dir="blahblah">Invalid value!</span>',
|
||||
'<span>Invalid value!</span>'
|
||||
);
|
||||
}
|
||||
|
||||
function testTableAttributes() {
|
||||
$this->assertResult(
|
||||
'<table frame="above" rules="rows" summary="A test table" border="2" cellpadding="5%" cellspacing="3" width="100%">
|
||||
<col align="right" width="4*" />
|
||||
@@ -148,293 +120,64 @@ class HTMLPurifier_Strategy_ValidateAttributesTest extends
|
||||
</tr>
|
||||
</table>'
|
||||
);
|
||||
}
|
||||
|
||||
// test col.span is non-zero
|
||||
function testColSpanIsNonZero() {
|
||||
$this->assertResult(
|
||||
'<col span="0" />',
|
||||
'<col />'
|
||||
);
|
||||
// lengths
|
||||
$this->assertResult(
|
||||
'<td width="5%" height="10" /><th width="10" height="5%" /><hr width="10" height="10" />',
|
||||
'<td style="width:5%;height:10px;" /><th style="width:10px;height:5%;" /><hr style="width:10px;" />'
|
||||
);
|
||||
// td boolean transformation
|
||||
$this->assertResult(
|
||||
'<td nowrap />',
|
||||
'<td style="white-space:nowrap;" />'
|
||||
);
|
||||
|
||||
// caption align transformation
|
||||
$this->assertResult(
|
||||
'<caption align="left" />',
|
||||
'<caption style="text-align:left;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<caption align="right" />',
|
||||
'<caption style="text-align:right;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<caption align="top" />',
|
||||
'<caption style="caption-side:top;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<caption align="bottom" />',
|
||||
'<caption style="caption-side:bottom;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<caption align="nonsense" />',
|
||||
'<caption />'
|
||||
);
|
||||
|
||||
// align transformation
|
||||
$this->assertResult(
|
||||
'<table align="left" />',
|
||||
'<table style="float:left;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<table align="center" />',
|
||||
'<table style="margin-left:auto;margin-right:auto;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<table align="right" />',
|
||||
'<table style="float:right;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<table align="top" />',
|
||||
'<table />'
|
||||
);
|
||||
}
|
||||
|
||||
function testURI() {
|
||||
$this->assertResult('<a href="http://www.google.com/">Google</a>');
|
||||
|
||||
// test invalid URI
|
||||
$this->assertResult(
|
||||
'<a href="javascript:badstuff();">Google</a>',
|
||||
'<a>Google</a>'
|
||||
);
|
||||
}
|
||||
|
||||
function testImg() {
|
||||
function testImgAddDefaults() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
'<img />',
|
||||
'<img src="" alt="Invalid image" />',
|
||||
array('Core.RemoveInvalidImg' => false)
|
||||
'<img src="" alt="Invalid image" />'
|
||||
);
|
||||
}
|
||||
|
||||
function testImgGenerateAlt() {
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" />',
|
||||
'<img src="foobar.jpg" alt="foobar.jpg" />'
|
||||
);
|
||||
}
|
||||
|
||||
function testImgAddDefaultSrc() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
'<img alt="pretty picture" />',
|
||||
'<img alt="pretty picture" src="" />',
|
||||
array('Core.RemoveInvalidImg' => false)
|
||||
'<img alt="pretty picture" src="" />'
|
||||
);
|
||||
// mailto in image is not allowed
|
||||
}
|
||||
|
||||
function testImgRemoveNonRetrievableProtocol() {
|
||||
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||
$this->assertResult(
|
||||
'<img src="mailto:foo@example.com" />',
|
||||
'<img alt="mailto:foo@example.com" src="" />',
|
||||
array('Core.RemoveInvalidImg' => false)
|
||||
);
|
||||
// align transformation
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="left" />',
|
||||
'<img src="foobar.jpg" alt="foobar" style="float:left;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="right" />',
|
||||
'<img src="foobar.jpg" alt="foobar" style="float:right;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="bottom" />',
|
||||
'<img src="foobar.jpg" alt="foobar" style="vertical-align:baseline;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="middle" />',
|
||||
'<img src="foobar.jpg" alt="foobar" style="vertical-align:middle;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="top" />',
|
||||
'<img src="foobar.jpg" alt="foobar" style="vertical-align:top;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<img src="foobar.jpg" alt="foobar" align="outerspace" />',
|
||||
'<img src="foobar.jpg" alt="foobar" />'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
function testBdo() {
|
||||
// test required attributes for bdo
|
||||
$this->assertResult(
|
||||
'<bdo>Go left.</bdo>',
|
||||
'<bdo dir="ltr">Go left.</bdo>'
|
||||
);
|
||||
|
||||
$this->assertResult(
|
||||
'<bdo dir="blahblah">Invalid value!</bdo>',
|
||||
'<bdo dir="ltr">Invalid value!</bdo>'
|
||||
'<img alt="mailto:foo@example.com" src="" />'
|
||||
);
|
||||
}
|
||||
|
||||
function testDir() {
|
||||
// see testBdo, behavior is subtly different
|
||||
$this->assertResult(
|
||||
'<span dir="blahblah">Invalid value!</span>',
|
||||
'<span>Invalid value!</span>'
|
||||
);
|
||||
function testPreserveRel() {
|
||||
$this->config->set('Attr', 'AllowedRel', 'nofollow');
|
||||
$this->assertResult('<a href="foo" rel="nofollow" />');
|
||||
}
|
||||
|
||||
function testLinks() {
|
||||
// link types
|
||||
$this->assertResult(
|
||||
'<a href="foo" rel="nofollow" />',
|
||||
true,
|
||||
array('Attr.AllowedRel' => 'nofollow')
|
||||
);
|
||||
// link targets
|
||||
$this->assertResult(
|
||||
'<a href="foo" target="_top" />',
|
||||
true,
|
||||
array('Attr.AllowedFrameTargets' => '_top',
|
||||
'HTML.Doctype' => 'XHTML 1.0 Transitional')
|
||||
);
|
||||
function testPreserveTarget() {
|
||||
$this->config->set('Attr', 'AllowedFrameTargets', '_top');
|
||||
$this->config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional');
|
||||
$this->assertResult('<a href="foo" target="_top" />');
|
||||
}
|
||||
|
||||
function testRemoveTargetWhenNotSupported() {
|
||||
$this->config->set('HTML', 'Doctype', 'XHTML 1.0 Strict');
|
||||
$this->config->set('Attr', 'AllowedFrameTargets', '_top');
|
||||
$this->assertResult(
|
||||
'<a href="foo" target="_top" />',
|
||||
'<a href="foo" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<a href="foo" target="_top" />',
|
||||
'<a href="foo" />',
|
||||
array('Attr.AllowedFrameTargets' => '_top', 'HTML.Strict' => true)
|
||||
);
|
||||
}
|
||||
|
||||
function testBorder() {
|
||||
// border
|
||||
$this->assertResult(
|
||||
'<img src="foo" alt="foo" hspace="1" vspace="3" />',
|
||||
'<img src="foo" alt="foo" style="margin-top:3px;margin-bottom:3px;margin-left:1px;margin-right:1px;" />',
|
||||
array('Attr.AllowedRel' => 'nofollow')
|
||||
);
|
||||
}
|
||||
|
||||
function testHr() {
|
||||
$this->assertResult(
|
||||
'<hr size="3" />',
|
||||
'<hr style="height:3px;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<hr noshade />',
|
||||
'<hr style="color:#808080;background-color:#808080;border:0;" />'
|
||||
);
|
||||
// align transformation
|
||||
$this->assertResult(
|
||||
'<hr align="left" />',
|
||||
'<hr style="margin-left:0;margin-right:auto;text-align:left;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<hr align="center" />',
|
||||
'<hr style="margin-left:auto;margin-right:auto;text-align:center;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<hr align="right" />',
|
||||
'<hr style="margin-left:auto;margin-right:0;text-align:right;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<hr align="bottom" />',
|
||||
'<hr />'
|
||||
);
|
||||
}
|
||||
|
||||
function testBr() {
|
||||
// br clear transformation
|
||||
$this->assertResult(
|
||||
'<br clear="left" />',
|
||||
'<br style="clear:left;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<br clear="right" />',
|
||||
'<br style="clear:right;" />'
|
||||
);
|
||||
$this->assertResult( // test both?
|
||||
'<br clear="all" />',
|
||||
'<br style="clear:both;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<br clear="none" />',
|
||||
'<br style="clear:none;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<br clear="foo" />',
|
||||
'<br />'
|
||||
);
|
||||
}
|
||||
|
||||
function testListTypeTransform() {
|
||||
// ul
|
||||
$this->assertResult(
|
||||
'<ul type="disc" />',
|
||||
'<ul style="list-style-type:disc;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ul type="square" />',
|
||||
'<ul style="list-style-type:square;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ul type="circle" />',
|
||||
'<ul style="list-style-type:circle;" />'
|
||||
);
|
||||
$this->assertResult( // case insensitive
|
||||
'<ul type="CIRCLE" />',
|
||||
'<ul style="list-style-type:circle;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ul type="a" />',
|
||||
'<ul />'
|
||||
);
|
||||
// ol
|
||||
$this->assertResult(
|
||||
'<ol type="1" />',
|
||||
'<ol style="list-style-type:decimal;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ol type="i" />',
|
||||
'<ol style="list-style-type:lower-roman;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ol type="I" />',
|
||||
'<ol style="list-style-type:upper-roman;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ol type="a" />',
|
||||
'<ol style="list-style-type:lower-alpha;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ol type="A" />',
|
||||
'<ol style="list-style-type:upper-alpha;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<ol type="disc" />',
|
||||
'<ol />'
|
||||
);
|
||||
// li
|
||||
$this->assertResult(
|
||||
'<li type="circle" />',
|
||||
'<li style="list-style-type:circle;" />'
|
||||
);
|
||||
$this->assertResult(
|
||||
'<li type="A" />',
|
||||
'<li style="list-style-type:upper-alpha;" />'
|
||||
);
|
||||
$this->assertResult( // case sensitive
|
||||
'<li type="CIRCLE" />',
|
||||
'<li />'
|
||||
);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
65
tests/HTMLPurifier/Strategy/ValidateAttributes_IDTest.php
Normal file
@@ -0,0 +1,65 @@
<?php

require_once('HTMLPurifier/StrategyHarness.php');
require_once('HTMLPurifier/Strategy/ValidateAttributes.php');

class HTMLPurifier_Strategy_ValidateAttributes_IDTest extends HTMLPurifier_StrategyHarness
{

    function setUp() {
        parent::setUp();
        $this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
        $this->config->set('HTML', 'EnableAttrID', true);
    }


    function testPreserveIDWhenEnabled() {
        $this->assertResult('<div id="valid">Preserve the ID.</div>');
    }

    function testRemoveInvalidID() {
        $this->assertResult(
            '<div id="0invalid">Kill the ID.</div>',
            '<div>Kill the ID.</div>'
        );
    }

    function testRemoveDuplicateID() {
        $this->assertResult(
            '<div id="valid">Valid</div><div id="valid">Invalid</div>',
            '<div id="valid">Valid</div><div>Invalid</div>'
        );
    }

    function testAttributeKeyCaseInsensitivity() {
        $this->assertResult(
            '<div ID="valid">Convert ID to lowercase.</div>',
            '<div id="valid">Convert ID to lowercase.</div>'
        );
    }

    function testTrimWhitespace() {
        $this->assertResult(
            '<div id=" valid ">Trim whitespace.</div>',
            '<div id="valid">Trim whitespace.</div>'
        );
    }

    function testIDBlacklist() {
        $this->config->set('Attr', 'IDBlacklist', array('invalid'));
        $this->assertResult(
            '<div id="invalid">Invalid</div>',
            '<div>Invalid</div>'
        );
    }

    function testNameConvertedToID() {
        $this->config->set('HTML', 'TidyLevel', 'heavy');
        $this->assertResult(
            '<a name="foobar" />',
            '<a id="foobar" />'
        );
    }

}
353
tests/HTMLPurifier/Strategy/ValidateAttributes_TidyTest.php
Normal file
@@ -0,0 +1,353 @@
<?php

require_once('HTMLPurifier/StrategyHarness.php');
require_once('HTMLPurifier/Strategy/ValidateAttributes.php');

class HTMLPurifier_Strategy_ValidateAttributes_TidyTest extends HTMLPurifier_StrategyHarness
{

    function setUp() {
        parent::setUp();
        $this->obj = new HTMLPurifier_Strategy_ValidateAttributes();
        $this->config->set('HTML', 'TidyLevel', 'heavy');
    }

    function testConvertCenterAlign() {
        $this->assertResult(
            '<h1 align="center">Centered Headline</h1>',
            '<h1 style="text-align:center;">Centered Headline</h1>'
        );
    }

    function testConvertRightAlign() {
        $this->assertResult(
            '<h1 align="right">Right-aligned Headline</h1>',
            '<h1 style="text-align:right;">Right-aligned Headline</h1>'
        );
    }

    function testConvertLeftAlign() {
        $this->assertResult(
            '<h1 align="left">Left-aligned Headline</h1>',
            '<h1 style="text-align:left;">Left-aligned Headline</h1>'
        );
    }

    function testConvertJustifyAlign() {
        $this->assertResult(
            '<p align="justify">Justified Paragraph</p>',
            '<p style="text-align:justify;">Justified Paragraph</p>'
        );
    }

    function testRemoveInvalidAlign() {
        $this->assertResult(
            '<h1 align="invalid">Invalid Headline</h1>',
            '<h1>Invalid Headline</h1>'
        );
    }

    function testConvertTableLengths() {
        $this->assertResult(
            '<td width="5%" height="10" /><th width="10" height="5%" /><hr width="10" height="10" />',
            '<td style="width:5%;height:10px;" /><th style="width:10px;height:5%;" /><hr style="width:10px;" />'
        );
    }

    function testTdConvertNowrap() {
        $this->assertResult(
            '<td nowrap />',
            '<td style="white-space:nowrap;" />'
        );
    }

    function testCaptionConvertAlignLeft() {
        $this->assertResult(
            '<caption align="left" />',
            '<caption style="text-align:left;" />'
        );
    }

    function testCaptionConvertAlignRight() {
        $this->assertResult(
            '<caption align="right" />',
            '<caption style="text-align:right;" />'
        );
    }

    function testCaptionConvertAlignTop() {
        $this->assertResult(
            '<caption align="top" />',
            '<caption style="caption-side:top;" />'
        );
    }

    function testCaptionConvertAlignBottom() {
        $this->assertResult(
            '<caption align="bottom" />',
            '<caption style="caption-side:bottom;" />'
        );
    }

    function testCaptionRemoveInvalidAlign() {
        $this->assertResult(
            '<caption align="nonsense" />',
            '<caption />'
        );
    }

    function testTableConvertAlignLeft() {
        $this->assertResult(
            '<table align="left" />',
            '<table style="float:left;" />'
        );
    }

    function testTableConvertAlignCenter() {
        $this->assertResult(
            '<table align="center" />',
            '<table style="margin-left:auto;margin-right:auto;" />'
        );
    }

    function testTableConvertAlignRight() {
        $this->assertResult(
            '<table align="right" />',
            '<table style="float:right;" />'
        );
    }

    function testTableRemoveInvalidAlign() {
        $this->assertResult(
            '<table align="top" />',
            '<table />'
        );
    }

    function testImgConvertAlignLeft() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="left" />',
            '<img src="foobar.jpg" alt="foobar" style="float:left;" />'
        );
    }

    function testImgConvertAlignRight() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="right" />',
            '<img src="foobar.jpg" alt="foobar" style="float:right;" />'
        );
    }

    function testImgConvertAlignBottom() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="bottom" />',
            '<img src="foobar.jpg" alt="foobar" style="vertical-align:baseline;" />'
        );
    }

    function testImgConvertAlignMiddle() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="middle" />',
            '<img src="foobar.jpg" alt="foobar" style="vertical-align:middle;" />'
        );
    }

    function testImgConvertAlignTop() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="top" />',
            '<img src="foobar.jpg" alt="foobar" style="vertical-align:top;" />'
        );
    }

    function testImgRemoveInvalidAlign() {
        $this->assertResult(
            '<img src="foobar.jpg" alt="foobar" align="outerspace" />',
            '<img src="foobar.jpg" alt="foobar" />'
        );
    }

    function testBorderConvertHVSpace() {
        $this->assertResult(
            '<img src="foo" alt="foo" hspace="1" vspace="3" />',
            '<img src="foo" alt="foo" style="margin-top:3px;margin-bottom:3px;margin-left:1px;margin-right:1px;" />'
        );
    }

    function testHrConvertSize() {
        $this->assertResult(
            '<hr size="3" />',
            '<hr style="height:3px;" />'
        );
    }

    function testHrConvertNoshade() {
        $this->assertResult(
            '<hr noshade />',
            '<hr style="color:#808080;background-color:#808080;border:0;" />'
        );
    }

    function testHrConvertAlignLeft() {
        $this->assertResult(
            '<hr align="left" />',
            '<hr style="margin-left:0;margin-right:auto;text-align:left;" />'
        );
    }

    function testHrConvertAlignCenter() {
        $this->assertResult(
            '<hr align="center" />',
            '<hr style="margin-left:auto;margin-right:auto;text-align:center;" />'
        );
    }

    function testHrConvertAlignRight() {
        $this->assertResult(
            '<hr align="right" />',
            '<hr style="margin-left:auto;margin-right:0;text-align:right;" />'
        );
    }

    function testHrRemoveInvalidAlign() {
        $this->assertResult(
            '<hr align="bottom" />',
            '<hr />'
        );
    }

    function testBrConvertClearLeft() {
        $this->assertResult(
            '<br clear="left" />',
            '<br style="clear:left;" />'
        );
    }

    function testBrConvertClearRight() {
        $this->assertResult(
            '<br clear="right" />',
            '<br style="clear:right;" />'
        );
    }

    function testBrConvertClearAll() {
        $this->assertResult(
            '<br clear="all" />',
            '<br style="clear:both;" />'
        );
    }

    function testBrConvertClearNone() {
        $this->assertResult(
            '<br clear="none" />',
            '<br style="clear:none;" />'
        );
    }

    function testBrRemoveInvalidClear() {
        $this->assertResult(
            '<br clear="foo" />',
            '<br />'
        );
    }

    function testUlConvertTypeDisc() {
        $this->assertResult(
            '<ul type="disc" />',
            '<ul style="list-style-type:disc;" />'
        );
    }

    function testUlConvertTypeSquare() {
        $this->assertResult(
            '<ul type="square" />',
            '<ul style="list-style-type:square;" />'
        );
    }

    function testUlConvertTypeCircle() {
        $this->assertResult(
            '<ul type="circle" />',
            '<ul style="list-style-type:circle;" />'
        );
    }

    function testUlConvertTypeCaseInsensitive() {
        $this->assertResult(
            '<ul type="CIRCLE" />',
            '<ul style="list-style-type:circle;" />'
        );
    }

    function testUlRemoveInvalidType() {
        $this->assertResult(
            '<ul type="a" />',
            '<ul />'
        );
    }

    function testOlConvertType1() {
        $this->assertResult(
            '<ol type="1" />',
            '<ol style="list-style-type:decimal;" />'
        );
    }

    function testOlConvertTypeLowerI() {
        $this->assertResult(
            '<ol type="i" />',
            '<ol style="list-style-type:lower-roman;" />'
        );
    }

    function testOlConvertTypeUpperI() {
        $this->assertResult(
            '<ol type="I" />',
            '<ol style="list-style-type:upper-roman;" />'
        );
    }

    function testOlConvertTypeLowerA() {
        $this->assertResult(
            '<ol type="a" />',
            '<ol style="list-style-type:lower-alpha;" />'
        );
    }

    function testOlConvertTypeUpperA() {
        $this->assertResult(
            '<ol type="A" />',
            '<ol style="list-style-type:upper-alpha;" />'
        );
    }

    function testOlRemoveInvalidType() {
        $this->assertResult(
            '<ol type="disc" />',
            '<ol />'
        );
    }

    function testLiConvertTypeCircle() {
        $this->assertResult(
            '<li type="circle" />',
            '<li style="list-style-type:circle;" />'
        );
    }

    function testLiConvertTypeA() {
        $this->assertResult(
            '<li type="A" />',
            '<li style="list-style-type:upper-alpha;" />'
        );
    }

    function testLiConvertTypeCaseSensitive() {
        $this->assertResult(
            '<li type="CIRCLE" />',
            '<li />'
        );
    }


}
|
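Reviewer note: the `align` cases in the Tidy tests above amount to a per-element lookup from a deprecated attribute value to a CSS declaration. The sketch below only restates those expectations as a table. `sketch_align_to_css` is a hypothetical name, sharing one table between `h1` and `p` is an assumption (the tests cover only some values for each), and nothing here is taken from HTML Purifier's own transform classes.

```php
<?php
// Hypothetical lookup, NOT HTML Purifier's implementation: it mirrors the
// align-to-CSS expectations asserted in ValidateAttributes_TidyTest.
// Returns the CSS declaration string, or false when the value is invalid
// for that element (in which case the attribute is simply dropped).
function sketch_align_to_css($element, $align) {
    // Assumption: block-level text elements share one text-align table.
    $text_align = array(
        'left'    => 'text-align:left;',
        'right'   => 'text-align:right;',
        'center'  => 'text-align:center;',
        'justify' => 'text-align:justify;',
    );
    $map = array(
        'h1' => $text_align,
        'p'  => $text_align,
        'caption' => array(
            'left'   => 'text-align:left;',
            'right'  => 'text-align:right;',
            'top'    => 'caption-side:top;',
            'bottom' => 'caption-side:bottom;',
        ),
        'table' => array(
            'left'   => 'float:left;',
            'center' => 'margin-left:auto;margin-right:auto;',
            'right'  => 'float:right;',
        ),
        'img' => array(
            'left'   => 'float:left;',
            'right'  => 'float:right;',
            'bottom' => 'vertical-align:baseline;',
            'middle' => 'vertical-align:middle;',
            'top'    => 'vertical-align:top;',
        ),
        'hr' => array(
            'left'   => 'margin-left:0;margin-right:auto;text-align:left;',
            'center' => 'margin-left:auto;margin-right:auto;text-align:center;',
            'right'  => 'margin-left:auto;margin-right:0;text-align:right;',
        ),
    );
    // isset() on a nested key is safe even when $element is unknown.
    if (!isset($map[$element][$align])) return false;
    return $map[$element][$align];
}

$centered = sketch_align_to_css('table', 'center'); // margin-left:auto;margin-right:auto;
$rejected = sketch_align_to_css('table', 'top');    // false, as in testTableRemoveInvalidAlign
```

The same value can map to different CSS depending on the element (`left` floats a table or image but sets `text-align` on a heading), which is why the table is keyed by element first. Case handling is deliberately left out: the tests show it differs between attributes (`ul` type is case-insensitive, `li` type is not).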
@@ -111,6 +111,12 @@ class HTMLPurifier_URIFilter_MakeAbsoluteTest extends HTMLPurifier_URIFilterHarn
        $this->assertFiltering('.', '../');
    }

    function testRemoveJavaScriptWithEmbeddedLink() {
        // credits: NykO18
        $this->setBase('http://www.example.com/');
        $this->assertFiltering('javascript: window.location = \'http://www.example.com\';', false);
    }

    // error case

    function testErrorNoBase() {
@@ -94,6 +94,7 @@ class HTMLPurifierTest extends HTMLPurifier_Harness
        $this->purifier = new HTMLPurifier(array('HTML.EnableAttrID' => true));
        $this->assertPurification('<span id="moon">foobar</span>');
        $this->assertPurification('<img id="folly" src="folly.png" alt="Omigosh!" />');

    }
@@ -3,9 +3,11 @@
 // call one file using /?f=FileTest.php , see $test_files array for
 // valid values

-error_reporting(E_ALL | E_STRICT);
+if (version_compare(PHP_VERSION, '5.1', '>=')) error_reporting(E_ALL | E_STRICT);
+else error_reporting(E_ALL);

 define('HTMLPurifierTest', 1);
-define('HTMLPURIFIER_SCHEMA_STRICT', true);
+define('HTMLPURIFIER_SCHEMA_STRICT', true); // validate schemas

 // wishlist: automated calling of this file from multiple PHP versions so we
 // don't have to constantly switch around
@@ -13,10 +15,12 @@ define('HTMLPURIFIER_SCHEMA_STRICT', true);
 // default settings (protect against register_globals)
 $GLOBALS['HTMLPurifierTest'] = array();
 $GLOBALS['HTMLPurifierTest']['PEAR'] = false; // do PEAR tests
+$GLOBALS['HTMLPurifierTest']['PH5P'] = version_compare(PHP_VERSION, "5", ">=") && class_exists('DOMDocument');
 $simpletest_location = 'simpletest/'; // reasonable guess

 // load SimpleTest
-@include '../test-settings.php'; // don't mind if it isn't there
+if (file_exists('../conf/test-settings.php')) include '../conf/test-settings.php';
+if (file_exists('../test-settings.php')) include '../test-settings.php';
 require_once $simpletest_location . 'unit_tester.php';
 require_once $simpletest_location . 'reporter.php';
 require_once $simpletest_location . 'mock_objects.php';
@@ -78,8 +82,7 @@ if ($test_file = $GLOBALS['HTMLPurifierTest']['File']) {
 } else {

-    $test = new GroupTest('All Tests');

+    $test = new GroupTest('All tests on PHP ' . PHP_VERSION);
     foreach ($test_files as $test_file) {
         require_once $test_file;
         $test->addTestClass(path2class($test_file));
@@ -91,5 +94,3 @@ if (SimpleReporter::inCli()) $reporter = new TextReporter();
 else $reporter = new HTMLPurifier_SimpleTest_Reporter('UTF-8');

 $test->run($reporter);
33
tests/multitest.php
Normal file
@@ -0,0 +1,33 @@
<?php

$versions_to_test = array(
    'FLUSH',
    '5.0.4',
    '5.0.5',
    '5.1.4',
    '5.1.6',
    '5.2.0',
    '5.2.1',
    '5.2.2',
    '5.2.3',
    '5.2.4',
    '5.2.5RC2-dev',
    '5.3.0-dev',
    // '6.0.0-dev',
);

echo str_repeat('-', 70) . "\n";
echo "HTML Purifier\n";
echo "Multiple PHP Versions Test\n\n";

passthru("php ../maintenance/merge-library.php");

foreach ($versions_to_test as $version) {
    if ($version === 'FLUSH') {
        shell_exec('php ../maintenance/flush-definition-cache.php');
        continue;
    }
    passthru("phpv $version index.php");
    passthru("phpv $version index.php standalone");
    echo "\n\n";
}
|
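Reviewer note: `multitest.php` shells out to a `phpv` command, which the diff does not define (presumably a local wrapper that selects a PHP build by version). To see which commands a given version list would trigger without executing anything, a dry-run variant along these lines may help. `sketch_multitest_commands` is a hypothetical name and the function is not part of the commit.

```php
<?php
// Hypothetical dry-run helper, NOT part of the commit: it builds the same
// command strings multitest.php passes to shell_exec()/passthru(), but only
// returns them instead of running them.
function sketch_multitest_commands(array $versions) {
    $commands = array();
    foreach ($versions as $version) {
        if ($version === 'FLUSH') {
            // The FLUSH sentinel clears the definition cache rather than
            // naming a PHP version.
            $commands[] = 'php ../maintenance/flush-definition-cache.php';
            continue;
        }
        // Each version is exercised twice: regular and standalone runs.
        $commands[] = "phpv $version index.php";
        $commands[] = "phpv $version index.php standalone";
    }
    return $commands;
}

$plan = sketch_multitest_commands(array('FLUSH', '5.2.4'));
// $plan holds three commands: the cache flush, then the two 5.2.4 runs.
```

Keeping command construction separate from execution also makes the `FLUSH` sentinel easy to check in isolation.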
@@ -79,6 +79,7 @@ $test_files[] = 'HTMLPurifier/GeneratorTest.php';
$test_files[] = 'HTMLPurifier/HTMLDefinitionTest.php';
$test_files[] = 'HTMLPurifier/HTMLModuleManagerTest.php';
$test_files[] = 'HTMLPurifier/HTMLModuleTest.php';
$test_files[] = 'HTMLPurifier/HTMLModule/ObjectTest.php';
$test_files[] = 'HTMLPurifier/HTMLModule/RubyTest.php';
$test_files[] = 'HTMLPurifier/HTMLModule/ScriptingTest.php';
$test_files[] = 'HTMLPurifier/HTMLModule/TidyTest.php';
@@ -98,9 +99,13 @@ $test_files[] = 'HTMLPurifier/Strategy/FixNestingTest.php';
$test_files[] = 'HTMLPurifier/Strategy/FixNesting_ErrorsTest.php';
$test_files[] = 'HTMLPurifier/Strategy/MakeWellFormedTest.php';
$test_files[] = 'HTMLPurifier/Strategy/MakeWellFormed_ErrorsTest.php';
$test_files[] = 'HTMLPurifier/Strategy/MakeWellFormed_InjectorTest.php';
$test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElementsTest.php';
$test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElements_ErrorsTest.php';
$test_files[] = 'HTMLPurifier/Strategy/RemoveForeignElements_TidyTest.php';
$test_files[] = 'HTMLPurifier/Strategy/ValidateAttributesTest.php';
$test_files[] = 'HTMLPurifier/Strategy/ValidateAttributes_IDTest.php';
$test_files[] = 'HTMLPurifier/Strategy/ValidateAttributes_TidyTest.php';
$test_files[] = 'HTMLPurifier/TagTransformTest.php';
$test_files[] = 'HTMLPurifier/TokenTest.php';
$test_files[] = 'HTMLPurifier/URIDefinitionTest.php';