mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-08-04 13:18:00 +02:00
Compare commits: v2.1.0-str ... v2.1.3-str (4 commits)
Author | SHA1 | Date
---|---|---
 | 9db861e356 |
 | b3f0e6c86c |
 | 80c60bb9b5 |
 | 503e76081b |
237
INSTALL
@@ -1,34 +1,81 @@
|
|||||||
|
|
||||||
Install
|
Install
|
||||||
How to install HTML Purifier
|
How to install HTML Purifier
|
||||||
|
|
||||||
HTML Purifier is designed to run out of the box, so actually using the library
|
HTML Purifier is designed to run out of the box, so actually using the
|
||||||
is extremely easy. (Although, if you were looking for a step-by-step
|
library is extremely easy. (Although... if you were looking for a
|
||||||
installation GUI, you've come to the wrong place!) The impatient can scroll
|
step-by-step installation GUI, you've downloaded the wrong software!)
|
||||||
down to the bottom of this INSTALL document to see the code, but you really
|
|
||||||
should make sure a few things are properly done.
|
While the impatient can get going immediately with some of the sample
|
||||||
|
code at the bottom of this document, it's well worth performing some
|
||||||
|
basic sanity checks to get the most out of this library.
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
1. Compatibility
|
1. Compatibility
|
||||||
|
|
||||||
HTML Purifier works in both PHP 4 and PHP 5, from PHP 4.3.2 and up. It has no
|
HTML Purifier works in both PHP 4 and PHP 5, and is actively tested from
|
||||||
core dependencies with other libraries.
|
PHP 4.3.7 and up (see tests/multitest.php for specific versions). It has
|
||||||
|
no core dependencies with other libraries. PHP 4 support will be
|
||||||
|
deprecated on December 31, 2007, at which time only essential security
|
||||||
|
fixes will be issued for the PHP 4 version until August 8, 2008.
|
||||||
|
|
||||||
Optional extensions are iconv (usually installed) and tidy (also common).
|
These optional extensions can enhance the capabilities of HTML Purifier:
|
||||||
If you use UTF-8 and don't plan on pretty-printing HTML, you can get away with
|
|
||||||
not having either of these extensions.
|
* iconv : Converts text to and from non-UTF-8 encodings
|
||||||
|
* tidy : Used for pretty-printing HTML
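
To see which of the two optional extensions listed above are present, a quick check along these lines may help. This is only a sketch: report_optional_extensions() is a hypothetical helper of my own naming, not something HTML Purifier ships, and it relies on nothing but PHP's built-in extension_loaded().

```php
<?php
// Hypothetical helper (not part of HTML Purifier): report whether the
// optional extensions named in this section are loaded.
function report_optional_extensions()
{
    $purposes = array(
        'iconv' => 'converts text to and from non-UTF-8 encodings',
        'tidy'  => 'pretty-prints HTML',
    );
    $report = array();
    foreach ($purposes as $name => $purpose) {
        $report[$name] = array(
            'loaded'  => extension_loaded($name), // built-in PHP function
            'purpose' => $purpose,
        );
    }
    return $report;
}

// Print one line per extension.
foreach (report_optional_extensions() as $name => $info) {
    echo $name, ': ', ($info['loaded'] ? 'available' : 'missing'),
         ' (', $info['purpose'], ")\n";
}
```

If both come back missing, you can still use HTML Purifier as long as you stick to UTF-8 and don't need pretty-printing.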
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
2. Reconnaissance
|
||||||
|
|
||||||
2. Including the library
|
A big plus of HTML Purifier is its inerrant support of standards, so
|
||||||
|
your web-pages should be standards-compliant. (They should also use
|
||||||
|
semantic markup, but that's another issue altogether, one HTML Purifier
|
||||||
|
cannot fix without reading your mind.)
|
||||||
|
|
||||||
Simply use:
|
HTML Purifier can process these doctypes:
|
||||||
|
|
||||||
|
* XHTML 1.0 Transitional (default)
|
||||||
|
* XHTML 1.0 Strict
|
||||||
|
* HTML 4.01 Transitional
|
||||||
|
* HTML 4.01 Strict
|
||||||
|
* XHTML 1.1
|
||||||
|
|
||||||
|
...and these character encodings:
|
||||||
|
|
||||||
|
* UTF-8 (default)
|
||||||
|
* Any encoding iconv supports (with crippled internationalization support)
|
||||||
|
|
||||||
|
These defaults reflect what my choices would be if I were authoring an
|
||||||
|
HTML document; however, what you choose depends on the nature of your
|
||||||
|
codebase. If you don't know what doctype you are using, you can determine
|
||||||
|
the doctype from this identifier at the top of your source code:
|
||||||
|
|
||||||
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
||||||
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||||||
|
|
||||||
|
...and the character encoding from this code:
|
||||||
|
|
||||||
|
<meta http-equiv="Content-type" content="text/html;charset=ENCODING">
|
||||||
|
|
||||||
|
If the character encoding declaration is missing, STOP NOW, and
|
||||||
|
read 'docs/enduser-utf8.html' (web accessible at
|
||||||
|
http://htmlpurifier.org/docs/enduser-utf8.html). In fact, even if it is
|
||||||
|
present, read this document anyway, as most websites specify character
|
||||||
|
encoding incorrectly.
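
If you would rather not eyeball the source of every page, the two declarations described above can be pulled out with a couple of regular expressions. The following is a rough sketch, not part of HTML Purifier: detect_doctype_and_charset() is a hypothetical name, and it assumes the http-equiv attribute appears before the content attribute, as in the meta tag shown above.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): extract the doctype's
// public identifier and the declared character encoding from a page.
function detect_doctype_and_charset($html)
{
    $result = array('doctype' => null, 'charset' => null);

    // Public identifier inside <!DOCTYPE html PUBLIC "...">
    if (preg_match('/<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"/i', $html, $m)) {
        $result['doctype'] = $m[1];
    }

    // charset=... inside <meta http-equiv="Content-type" content="...">
    // Assumption: http-equiv comes before content in the tag.
    $pattern = '/<meta[^>]+http-equiv=["\']?content-type["\']?[^>]*'
             . 'charset=([A-Za-z0-9._:-]+)/i';
    if (preg_match($pattern, $html, $m)) {
        $result['charset'] = strtoupper($m[1]);
    }

    return $result;
}

$page = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"' . "\n"
      . ' "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">' . "\n"
      . '<meta http-equiv="Content-type" content="text/html;charset=UTF-8">';
$found = detect_doctype_and_charset($page);
echo 'Doctype: ', $found['doctype'], "\n";
echo 'Charset: ', $found['charset'], "\n";
```

A null charset here means the declaration is missing, which is exactly the case where you should stop and read the UTF-8 document mentioned above.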
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
3. Including the library
|
||||||
|
|
||||||
|
The procedure is quite simple:
|
||||||
|
|
||||||
require_once '/path/to/library/HTMLPurifier.auto.php';
|
require_once '/path/to/library/HTMLPurifier.auto.php';
|
||||||
|
|
||||||
...and you're good to go. Since HTML Purifier's codebase is fairly
|
I recommend only including HTML Purifier when you need it, because that
|
||||||
large, I recommend only including HTML Purifier when you need it.
|
call represents the inclusion of a lot of PHP files which constitute
|
||||||
|
the bulk of HTML Purifier's memory usage.
|
||||||
|
|
||||||
If you don't like your include_path to be fiddled around with, simply set
|
If you don't like your include_path to be fiddled around with, simply set
|
||||||
HTML Purifier's library/ directory to the include path yourself and then:
|
HTML Purifier's library/ directory to the include path yourself and then:
|
||||||
@@ -39,46 +86,7 @@ Only the contents in the library/ folder are necessary, so you can remove
|
|||||||
everything else when using HTML Purifier in a production environment.
|
everything else when using HTML Purifier in a production environment.
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
3. Preparing the proper output environment
|
|
||||||
|
|
||||||
HTML Purifier is all about web-standards, so accordingly your webpages should
|
|
||||||
be standards compliant. HTML Purifier can deal with these doctypes:
|
|
||||||
|
|
||||||
* XHTML 1.0 Transitional (default)
|
|
||||||
* XHTML 1.0 Strict
|
|
||||||
* HTML 4.01 Transitional
|
|
||||||
* HTML 4.01 Strict
|
|
||||||
* XHTML 1.1 (sans Ruby)
|
|
||||||
|
|
||||||
...and these character encodings:
|
|
||||||
|
|
||||||
* UTF-8 (default)
|
|
||||||
* Any encoding iconv supports (support is crippled for i18n though)
|
|
||||||
|
|
||||||
The defaults are there for a reason: they are best-practice choices that
|
|
||||||
should not be changed lightly. For those of you in the dark, you can determine
|
|
||||||
the doctype from this code in your HTML documents:
|
|
||||||
|
|
||||||
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
|
||||||
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
||||||
|
|
||||||
...and the character encoding from this code:
|
|
||||||
|
|
||||||
<meta http-equiv="Content-type" content="text/html;charset=ENCODING">
|
|
||||||
|
|
||||||
For legacy codebases these declarations may be missing. If that is the case,
|
|
||||||
STOP, and read docs/enduser-utf8.html
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
You may currently be vulnerable to XSS and other security threats, and HTML
|
|
||||||
Purifier won't be able to fix that.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4. Configuration
|
4. Configuration
|
||||||
|
|
||||||
HTML Purifier is designed to run out-of-the-box, but occasionally HTML
|
HTML Purifier is designed to run out-of-the-box, but occasionally HTML
|
||||||
@@ -95,7 +103,6 @@ object and read on:
|
|||||||
$config = HTMLPurifier_Config::createDefault();
|
$config = HTMLPurifier_Config::createDefault();
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4.1. Setting a different character encoding
|
4.1. Setting a different character encoding
|
||||||
|
|
||||||
You really shouldn't use any other encoding except UTF-8, especially if you
|
You really shouldn't use any other encoding except UTF-8, especially if you
|
||||||
@@ -122,10 +129,6 @@ but please be cognizant of the issues the "solution" creates (for this
|
|||||||
reason, I do not include the solution in this document).
|
reason, I do not include the solution in this document).
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4.2. Setting a different doctype
|
4.2. Setting a different doctype
|
||||||
|
|
||||||
For those of you using HTML 4.01 Transitional, you can disable
|
For those of you using HTML 4.01 Transitional, you can disable
|
||||||
@@ -135,7 +138,6 @@ XHTML output like this:
|
|||||||
|
|
||||||
Other supported doctypes include:
|
Other supported doctypes include:
|
||||||
|
|
||||||
|
|
||||||
* HTML 4.01 Strict
|
* HTML 4.01 Strict
|
||||||
* HTML 4.01 Transitional
|
* HTML 4.01 Transitional
|
||||||
* XHTML 1.0 Strict
|
* XHTML 1.0 Strict
|
||||||
@@ -143,7 +145,6 @@ Other supported doctypes include:
|
|||||||
* XHTML 1.1
|
* XHTML 1.1
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4.3. Other settings
|
4.3. Other settings
|
||||||
|
|
||||||
There are more configuration directives which can be read about
|
There are more configuration directives which can be read about
|
||||||
@@ -153,55 +154,24 @@ your code. Some of the more interesting ones are configurable at the
|
|||||||
demo <http://htmlpurifier.org/demo.php> and are well worth looking into
|
demo <http://htmlpurifier.org/demo.php> and are well worth looking into
|
||||||
for your own system.
|
for your own system.
|
||||||
|
|
||||||
|
For example, you can fine tune allowed elements and attributes, convert
|
||||||
|
relative URLs to absolute ones, and even autoparagraph input text! These
|
||||||
|
are, respectively, %HTML.Allowed, %URI.MakeAbsolute and %URI.Base, and
|
||||||
|
%AutoFormat.AutoParagraph. The %Namespace.Directive naming convention
|
||||||
|
translates to:
|
||||||
|
|
||||||
|
$config->set('Namespace', 'Directive', $value);
|
||||||
|
|
||||||
|
E.g.
|
||||||
|
|
||||||
|
$config->set('HTML', 'Allowed', 'p,b,a[href],i');
|
||||||
|
$config->set('URI', 'Base', 'http://www.example.com');
|
||||||
|
$config->set('URI', 'MakeAbsolute', true);
|
||||||
|
$config->set('AutoFormat', 'AutoParagraph', true);
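
The %Namespace.Directive convention above is mechanical enough to automate. Here is a small sketch of how that might look; split_directive(), apply_directives() and RecordingConfigStub are all hypothetical names of mine, and the stub merely stands in for the real configuration object so the snippet runs on its own. The only thing it assumes about the real object is the three-argument set() call shown above.

```php
<?php
// Hypothetical helper: turn '%URI.MakeAbsolute' into array('URI', 'MakeAbsolute').
// Returns false when the name does not have a Namespace.Directive shape.
function split_directive($name)
{
    $parts = explode('.', ltrim($name, '%'), 2);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        return false;
    }
    return $parts;
}

// Hypothetical helper: apply a map of '%Namespace.Directive' => value to any
// object exposing set($namespace, $directive, $value).
function apply_directives($config, $directives)
{
    foreach ($directives as $name => $value) {
        $parts = split_directive($name);
        if ($parts === false) {
            continue; // skip malformed names
        }
        $config->set($parts[0], $parts[1], $value);
    }
}

// Stand-in for the real configuration object; it only records the calls.
class RecordingConfigStub
{
    var $calls = array();

    function set($namespace, $directive, $value)
    {
        $this->calls[] = array($namespace, $directive, $value);
    }
}

$config = new RecordingConfigStub();
apply_directives($config, array(
    '%HTML.Allowed'             => 'p,b,a[href],i',
    '%URI.Base'                 => 'http://www.example.com',
    '%URI.MakeAbsolute'         => true,
    '%AutoFormat.AutoParagraph' => true,
));
echo count($config->calls), " directives applied\n";
```

In real code you would pass the object returned by HTMLPurifier_Config::createDefault() instead of the stub.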
|
||||||
|
|
||||||
|
|
||||||
5. Using the code
|
---------------------------------------------------------------------------
|
||||||
|
5. Caching
|
||||||
The interface is mind-numbingly simple:
|
|
||||||
|
|
||||||
$purifier = new HTMLPurifier();
|
|
||||||
$clean_html = $purifier->purify( $dirty_html );
|
|
||||||
|
|
||||||
...or, if you're using the configuration object:
|
|
||||||
|
|
||||||
$purifier = new HTMLPurifier($config);
|
|
||||||
$clean_html = $purifier->purify( $dirty_html );
|
|
||||||
|
|
||||||
That's it! For more examples, check out docs/examples/ (they aren't very
|
|
||||||
different though). Also, docs/enduser-slow.html gives advice on what to
|
|
||||||
do if HTML Purifier is slowing down your application.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
6. Quick install
|
|
||||||
|
|
||||||
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
|
|
||||||
writable by the webserver (see Section 7: Caching below for details).
|
|
||||||
If your website is in UTF-8 and XHTML Transitional, use this code:
|
|
||||||
|
|
||||||
<?php
|
|
||||||
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
|
||||||
|
|
||||||
$purifier = new HTMLPurifier();
|
|
||||||
$clean_html = $purifier->purify($dirty_html);
|
|
||||||
?>
|
|
||||||
|
|
||||||
If your website is in a different encoding or doctype, use this code:
|
|
||||||
|
|
||||||
<?php
|
|
||||||
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
|
||||||
|
|
||||||
$config = HTMLPurifier_Config::createDefault();
|
|
||||||
$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
|
|
||||||
$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
|
|
||||||
$purifier = new HTMLPurifier($config);
|
|
||||||
|
|
||||||
$clean_html = $purifier->purify($dirty_html);
|
|
||||||
?>
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
7. Caching
|
|
||||||
|
|
||||||
HTML Purifier generates some cache files (generally one or two) to speed up
|
HTML Purifier generates some cache files (generally one or two) to speed up
|
||||||
its execution. For maximum performance, make sure that
|
its execution. For maximum performance, make sure that
|
||||||
@@ -236,3 +206,50 @@ hit):
|
|||||||
Or move the cache directory somewhere else (no trailing slash):
|
Or move the cache directory somewhere else (no trailing slash):
|
||||||
|
|
||||||
$config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
|
$config->set('Cache', 'SerializerPath', '/home/user/absolute/path');
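
Before pointing the cache somewhere else, it is worth confirming that the target directory exists and is writable by the webserver. A minimal sketch, assuming nothing beyond PHP's built-in is_dir() and is_writable(); choose_cache_path() is a hypothetical helper and the first candidate path is a placeholder:

```php
<?php
// Hypothetical helper (not part of HTML Purifier): return the first
// candidate directory that exists and is writable, or null if none is.
function choose_cache_path($candidates)
{
    foreach ($candidates as $path) {
        $path = rtrim($path, '/\\'); // the directive wants no trailing slash
        if ($path !== '' && is_dir($path) && is_writable($path)) {
            return $path;
        }
    }
    return null;
}

$path = choose_cache_path(array(
    '/home/user/absolute/path', // placeholder, replace with your own
    sys_get_temp_dir(),         // fallback used here for illustration only
));

if ($path !== null) {
    // With a real configuration object you would now call:
    // $config->set('Cache', 'SerializerPath', $path);
    echo "Cache directory: $path\n";
} else {
    echo "No writable cache directory found\n";
}
```

Whether the system temporary directory is an acceptable fallback depends on your host; on shared hosting you probably want a directory of your own.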
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
6. Using the code
|
||||||
|
|
||||||
|
The interface is mind-numbingly simple:
|
||||||
|
|
||||||
|
$purifier = new HTMLPurifier();
|
||||||
|
$clean_html = $purifier->purify( $dirty_html );
|
||||||
|
|
||||||
|
...or, if you're using the configuration object:
|
||||||
|
|
||||||
|
$purifier = new HTMLPurifier($config);
|
||||||
|
$clean_html = $purifier->purify( $dirty_html );
|
||||||
|
|
||||||
|
That's it! For more examples, check out docs/examples/ (they aren't very
|
||||||
|
different though). Also, docs/enduser-slow.html gives advice on what to
|
||||||
|
do if HTML Purifier is slowing down your application.
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
7. Quick install
|
||||||
|
|
||||||
|
First, make sure library/HTMLPurifier/DefinitionCache/Serializer is
|
||||||
|
writable by the webserver (see Section 5: Caching above for details).
|
||||||
|
If your website is in UTF-8 and XHTML Transitional, use this code:
|
||||||
|
|
||||||
|
<?php
|
||||||
|
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
||||||
|
|
||||||
|
$purifier = new HTMLPurifier();
|
||||||
|
$clean_html = $purifier->purify($dirty_html);
|
||||||
|
?>
|
||||||
|
|
||||||
|
If your website is in a different encoding or doctype, use this code:
|
||||||
|
|
||||||
|
<?php
|
||||||
|
require_once '/path/to/htmlpurifier/library/HTMLPurifier.auto.php';
|
||||||
|
|
||||||
|
$config = HTMLPurifier_Config::createDefault();
|
||||||
|
$config->set('Core', 'Encoding', 'ISO-8859-1'); // replace with your encoding
|
||||||
|
$config->set('HTML', 'Doctype', 'HTML 4.01 Transitional'); // replace with your doctype
|
||||||
|
$purifier = new HTMLPurifier($config);
|
||||||
|
|
||||||
|
$clean_html = $purifier->purify($dirty_html);
|
||||||
|
?>
|
||||||
|
|
||||||
|
139
NEWS
@@ -9,6 +9,145 @@ NEWS ( CHANGELOG and HISTORY ) HTMLPurifier
|
|||||||
. Internal change
|
. Internal change
|
||||||
==========================
|
==========================
|
||||||
|
|
||||||
|
2.1.3, released 2007-11-05
|
||||||
|
! tests/multitest.php allows you to test multiple versions by running
|
||||||
|
tests/index.php through multiple interpreters using `phpv` shell
|
||||||
|
script (you must provide this script!)
|
||||||
|
- Fixed poor include ordering for Email URI AttrDefs, causes fatal errors
|
||||||
|
on some systems.
|
||||||
|
- Injector algorithm further refined: off-by-one error regarding skip
|
||||||
|
counts for dormant injectors fixed
|
||||||
|
- Corrective blockquote definition now enabled for HTML 4.01 Strict
|
||||||
|
- Fatal error when <img> tag (or any other element with required attributes)
|
||||||
|
has 'id' attribute fixed, thanks NykO18 for reporting
|
||||||
|
- Fix warning emitted when a non-supported URI scheme is passed to the
|
||||||
|
MakeAbsolute URIFilter, thanks NykO18 (again)
|
||||||
|
- Further refine AutoParagraph injector. Behavior inside of elements
|
||||||
|
allowing paragraph tags clarified: only inline content delimited by
|
||||||
|
double newlines (not block elements) are paragraphed.
|
||||||
|
- Buggy treatment of end tags of elements that have required attributes
|
||||||
|
fixed (does not manifest on default tag-set)
|
||||||
|
- Spurious internal content reorganization error suppressed
|
||||||
|
- HTMLDefinition->addElement now returns a reference to the created
|
||||||
|
element object, as implied by the documentation
|
||||||
|
- Phorum mod's HTML Purifier help message expanded (unreleased elsewhere)
|
||||||
|
- Fix a theoretical class of infinite loops from DirectLex reported
|
||||||
|
by Nate Abele
|
||||||
|
- Work around unnecessary DOMElement type-cast in PH5P that caused errors
|
||||||
|
in PHP 5.1
|
||||||
|
- Work around PHP 4 SimpleTest lack-of-error complaining for one-time-only
|
||||||
|
HTMLDefinition errors, this may indicate problems with error-collecting
|
||||||
|
facilities in PHP 5
|
||||||
|
- Make ErrorCollectorEMock work in both PHP 4 and PHP 5
|
||||||
|
- Make PH5P work with PHP 5.0 by removing unnecessary array parameter typedef
|
||||||
|
. %Core.AcceptFullDocuments renamed to %Core.ConvertDocumentToFragment
|
||||||
|
to better communicate its purpose
|
||||||
|
. Error unit tests can now specify the expectation of no errors. Future
|
||||||
|
iterations of the harness will be extremely strict about what errors
|
||||||
|
are allowed
|
||||||
|
. Extend Injector hooks to allow for more powerful injector routines
|
||||||
|
. HTMLDefinition->addBlankElement created, as according to the HTMLModule
|
||||||
|
method
|
||||||
|
. Doxygen configuration file updated, with minor improvements
|
||||||
|
. Test runner now checks for similarly named files in conf/ directory too.
|
||||||
|
. Minor cosmetic change to flush-definition-cache.php: trailing newline is
|
||||||
|
outputted
|
||||||
|
. Maintenance script for generating PH5P patch added, original PH5P source
|
||||||
|
file also added under version control
|
||||||
|
. Full unit test runner script title made more descriptive with PHP version
|
||||||
|
. Updated INSTALL file to state that 4.3.7 is the earliest version we
|
||||||
|
are actively testing
|
||||||
|
|
||||||
|
2.1.2, released 2007-09-03
|
||||||
|
! Implemented Object module for trusted users
|
||||||
|
! Implemented experimental HTML5 parsing mode using PH5P. To use, add
|
||||||
|
this to your code:
|
||||||
|
require_once 'HTMLPurifier/Lexer/PH5P.php';
|
||||||
|
$config->set('Core', 'LexerImpl', 'PH5P');
|
||||||
|
Note that this Lexer introduces some classes not in the HTMLPurifier
|
||||||
|
namespace. Also, this is PHP5 only.
|
||||||
|
! CSS property border-spacing implemented
|
||||||
|
- Fix non-visible parsing error in DirectLex with empty tags that have
|
||||||
|
slashes inside attribute values.
|
||||||
|
- Fix typo in CSS definition: border-collapse:seperate; was incorrectly
|
||||||
|
accepted as valid CSS. Usually non-visible, because this styling is the
|
||||||
|
default for tables in most browsers. Thanks Brett Zamir for pointing
|
||||||
|
this out.
|
||||||
|
- Fix validation errors in configuration form
|
||||||
|
- Hammer out a bunch of edge-case bugs in the standalone distribution
|
||||||
|
- Inclusion reflection removed from URISchemeRegistry; you must manually
|
||||||
|
include any new scheme files you wish to use
|
||||||
|
- Numerous typo fixes in documentation thanks to Brett Zamir
|
||||||
|
. Unit test refactoring for one logical test per test function
|
||||||
|
. Config and context parameters in ComplexHarness deprecated: instead, edit
|
||||||
|
the $config and $context member variables
|
||||||
|
. HTML wrapper in DOMLex now takes DTD identifiers into account; doesn't
|
||||||
|
really make a difference, but is good for completeness' sake
|
||||||
|
. merge-library.php script refactored for greater code reusability and
|
||||||
|
PHP4 compatibility
|
||||||
|
|
||||||
|
2.1.1, released 2007-08-04
|
||||||
|
- Fix show-stopper bug in %URI.MakeAbsolute functionality
|
||||||
|
- Fix PHP4 syntax error in standalone version
|
||||||
|
. Add prefix directory to include path for standalone, this prevents
|
||||||
|
other installations from clobbering the standalone's URI schemes
|
||||||
|
. Single test methods can be invoked by prefixing with __only
|
||||||
|
|
||||||
|
2.1.0, released 2007-08-02
|
||||||
|
# flush-htmldefinition-cache.php superseded in favor of a generic
|
||||||
|
flush-definition-cache.php script, you can clear a specific cache
|
||||||
|
by passing its name as a parameter to the script
|
||||||
|
! Phorum mod implemented for HTML Purifier
|
||||||
|
! With %Core.AggressivelyFixLt, <3 and similar emoticons no longer
|
||||||
|
trigger HTML removal in PHP5 (DOMLex). This directive is not necessary
|
||||||
|
for PHP4 (DirectLex).
|
||||||
|
! Standalone file now available, which greatly reduces the amount of
|
||||||
|
includes (although there are still a few files that reside in the
|
||||||
|
standalone folder)
|
||||||
|
! Relative URIs can now be transformed into their absolute equivalents
|
||||||
|
using %URI.Base and %URI.MakeAbsolute
|
||||||
|
! Ruby implemented for XHTML 1.1
|
||||||
|
! You can now define custom URI filtering behavior, see enduser-uri-filter.html
|
||||||
|
for more details
|
||||||
|
! UTF-8 font names now supported in CSS
|
||||||
|
- AutoFormatters emit friendly error messages if tags or attributes they
|
||||||
|
need are not allowed
|
||||||
|
- ConfigForm's compactification of directive names is now configurable
|
||||||
|
- AutoParagraph autoformatter algorithm refined after field-testing
|
||||||
|
- XHTML 1.1 now applies XHTML 1.0 Strict cleanup routines, namely
|
||||||
|
blockquote wrapping
|
||||||
|
- Contents of <style> tags removed by default when tags are removed
|
||||||
|
. HTMLPurifier_Config->getSerial() implemented, this is extremely useful
|
||||||
|
for output cache invalidation
|
||||||
|
. ConfigForm printer now can retrieve CSS and JS files as strings, in
|
||||||
|
case HTML Purifier's directory is not publicly accessible
|
||||||
|
. Introduce new text/itext configuration directive values: these represent
|
||||||
|
longer strings that would be more appropriately edited with a textarea
|
||||||
|
. Allow newlines to act as separators for lists, hashes, lookups and
|
||||||
|
%HTML.Allowed
|
||||||
|
. ConfigForm generates textareas instead of text inputs for lists, hashes,
|
||||||
|
lookups, text and itext fields
|
||||||
|
. Hidden element content removal genericized: %Core.HiddenElements can
|
||||||
|
be used to customize this behavior, by default <script> and <style> are
|
||||||
|
hidden
|
||||||
|
. Added HTMLPURIFIER_PREFIX constant, should be used instead of dirname(__FILE__)
|
||||||
|
. Custom ChildDef added to default include list
|
||||||
|
. URIScheme reflection improved: will not attempt to include file if class
|
||||||
|
already exists. May clobber autoload, so I need to keep an eye on it
|
||||||
|
. ConfigSchema heavily optimized, will only collect information and validate
|
||||||
|
definitions when HTMLPURIFIER_SCHEMA_STRICT is true.
|
||||||
|
. AttrDef_URI unit tests and implementation refactored
|
||||||
|
. benchmarks/ directory now protected from public view with .htaccess file;
|
||||||
|
run the tests via command line
|
||||||
|
. URI scheme is munged off if there is no authority and the scheme is the
|
||||||
|
default one
|
||||||
|
. All unit tests inherit from HTMLPurifier_Harness, not UnitTestCase
|
||||||
|
. Interface for URIScheme changed
|
||||||
|
. Generic URI object to hold components of URI added, most systems involved
|
||||||
|
in URI validation have been migrated to use it
|
||||||
|
. Custom filtering for URIs factored out to URIDefinition interface for
|
||||||
|
maximum extensibility
|
||||||
|
|
||||||
2.0.1, released 2007-06-27
|
2.0.1, released 2007-06-27
|
||||||
! Tag auto-closing now based on a ChildDef heuristic rather than a
|
! Tag auto-closing now based on a ChildDef heuristic rather than a
|
||||||
manually set auto_close array; some behavior may change
|
manually set auto_close array; some behavior may change
|
||||||
|
36
TODO
@@ -6,14 +6,9 @@ TODO List
|
|||||||
? Maybe I'll Do It
|
? Maybe I'll Do It
|
||||||
==========================
|
==========================
|
||||||
|
|
||||||
2.1 release [Refactor, refactor!]
|
If no interest is expressed for a feature that may require a considerable
|
||||||
# URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX)
|
amount of effort to implement, it may get endlessly delayed. Do not be
|
||||||
# Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
|
afraid to cast your vote for the next feature to be implemented!
|
||||||
# Ruby support
|
|
||||||
- Configuration profiles: predefined directives set with one func call
|
|
||||||
- Implement IDREF support (harder than it seems, since you cannot have
|
|
||||||
IDREFs to non-existent IDs)
|
|
||||||
- Allow non-ASCII characters in font names
|
|
||||||
|
|
||||||
2.2 release [Error'ed]
|
2.2 release [Error'ed]
|
||||||
# Error logging for filtering/cleanup procedures
|
# Error logging for filtering/cleanup procedures
|
||||||
@@ -33,21 +28,22 @@ TODO List
|
|||||||
- Remove empty inline tags<i></i>
|
- Remove empty inline tags<i></i>
|
||||||
- Append something to duplicate IDs so they're still usable (impl. note: the
|
- Append something to duplicate IDs so they're still usable (impl. note: the
|
||||||
dupe detector would also need to detect the suffix as well)
|
dupe detector would also need to detect the suffix as well)
|
||||||
|
- Externalize inline CSS to promote clean HTML
|
||||||
|
|
||||||
2.4 release [It's All About Trust] (floating)
|
2.4 release [It's All About Trust] (floating)
|
||||||
# Implement untrusted, dangerous elements/attributes
|
# Implement untrusted, dangerous elements/attributes
|
||||||
|
# Implement IDREF support (harder than it seems, since you cannot have
|
||||||
|
IDREFs to non-existent IDs)
|
||||||
|
# Frameset XHTML 1.0 and HTML 4.01 doctypes
|
||||||
|
|
||||||
3.0 release [Beyond HTML]
|
3.0 release [Beyond HTML]
|
||||||
# Legit token based CSS parsing (will require revamping almost every
|
# Legit token based CSS parsing (will require revamping almost every
|
||||||
AttrDef class)
|
AttrDef class). Probably will use CSSTidy class
|
||||||
# More control over allowed CSS properties (maybe modularize it in the
|
# More control over allowed CSS properties (maybe modularize it in the
|
||||||
same fashion!)
|
same fashion!)
|
||||||
# Formatters for plaintext
|
# Formatters for plaintext
|
||||||
- Smileys
|
- Smileys
|
||||||
- Standardize token armor for all areas of processing
|
- Standardize token armor for all areas of processing
|
||||||
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
|
|
||||||
- Automatically add non-breaking spaces to empty table cells when
|
|
||||||
empty-cells:show is applied to have compatibility with Internet Explorer
|
|
||||||
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
|
- Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
|
||||||
Also, enable disabling of directionality
|
Also, enable disabling of directionality
|
||||||
|
|
||||||
@@ -60,31 +56,33 @@ TODO List
|
|||||||
Ongoing
|
Ongoing
|
||||||
- Lots of profiling, make it faster!
|
- Lots of profiling, make it faster!
|
||||||
- Plugins for major CMSes (COMPLEX)
|
- Plugins for major CMSes (COMPLEX)
|
||||||
- WordPress (mostly written, needs beta-testing)
|
|
||||||
- phpBB
|
- phpBB
|
||||||
- Phorum
|
|
||||||
- eFiction
|
- eFiction
|
||||||
- more! (look for ones that use WYSIWYGs)
|
- more! (look for ones that use WYSIWYGs)
|
||||||
- Complete basic smoketests
|
- Complete basic smoketests
|
||||||
|
|
||||||
Unknown release (on a scratch-an-itch basis)
|
Unknown release (on a scratch-an-itch basis)
|
||||||
? Semi-lossy dumb alternate character encoding transfor
|
# CHMOD install script for PEAR installs
|
||||||
? Have 'lang' attribute be checked against official lists, achieved by
|
? Have 'lang' attribute be checked against official lists, achieved by
|
||||||
encoding all characters that have string entity equivalents
|
encoding all characters that have string entity equivalents
|
||||||
- Explain how to use HTML Purifier in non-PHP languages / create
|
|
||||||
a simple command line stub
|
|
||||||
- Abstract ChildDef_BlockQuote to work with all elements that only
|
- Abstract ChildDef_BlockQuote to work with all elements that only
|
||||||
allow blocks in them, required or optional
|
allow blocks in them, required or optional
|
||||||
- Reorganize Unit Tests
|
- Reorganize Unit Tests
|
||||||
- Refactor loop tests (esp. AttrDef_URI)
|
|
||||||
- Reorganize configuration directives (Create more namespaces! Get messy!)
|
- Reorganize configuration directives (Create more namespaces! Get messy!)
|
||||||
|
- Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
|
||||||
|
- Implement lenient <ruby> child validation
|
||||||
|
- Explain how to use HTML Purifier in non-PHP languages / create
|
||||||
|
a simple command line stub (or complicated?)
|
||||||
|
- Fixes for Firefox's inability to handle COL alignment props (Bug 915)
|
||||||
|
- Automatically add non-breaking spaces to empty table cells when
|
||||||
|
empty-cells:show is applied to have compatibility with Internet Explorer
|
||||||
|
|
||||||
Requested
|
Requested
|
||||||
|
|
||||||
Wontfix
|
Wontfix
|
||||||
- Non-lossy smart alternate character encoding transformations (unless
|
- Non-lossy smart alternate character encoding transformations (unless
|
||||||
patch provided)
|
patch provided)
|
||||||
- Pretty-printing HTML, users can use Tidy on the output on entire page
|
- Pretty-printing HTML: users can use Tidy on the output on entire page
|
||||||
- Native content compression, whitespace stripping (don't rely on Tidy, make
|
- Native content compression, whitespace stripping (don't rely on Tidy, make
|
||||||
sure we don't remove from <pre> or related tags): use gzip if this is
|
sure we don't remove from <pre> or related tags): use gzip if this is
|
||||||
really important
|
really important
|
||||||
|
14
WHATSNEW
@@ -1,8 +1,6 @@
|
|||||||
In version 2.1, HTML Purifier's URI validation and filtering handling
|
Stability release 2.1.3 fixes a slew of minor bugs found in HTML Purifier,
|
||||||
system has been revamped with a new, extensible URIFilter system. Also
|
and also includes some internal code enhancements and refactorings.
|
||||||
notable features include preservation of emoticons in PHP5 with
|
Notably, tests/multitest.php automates testing in multiple versions,
|
||||||
%Core.AggressivelyFixLt, standalone and lite download versions,
|
fatal AttrDef_URI_Email error fixed, blockquote contents are more lenient
|
||||||
transforming relative URIs to absolute URIs, Ruby in XHTML 1.1, a Phorum
|
in HTML 4.01 Strict and fatal errors involving id attributes in img tags were
|
||||||
mod, and UTF-8 font names. Notable bug-fixes include refinement of
|
fixed.
|
||||||
the auto-paragraphing algorithm (no longer experimental), better XHTML
|
|
||||||
1.1 support and the removal of the contents of <style> elements.
|
|
||||||
|
1
benchmarks/.htaccess
Normal file
@@ -0,0 +1 @@
Deny from all
12
benchmarks/Trace.php
Normal file
@@ -0,0 +1,12 @@
<?php

ini_set('xdebug.trace_format', 1);
ini_set('xdebug.show_mem_delta', true);

xdebug_start_trace(dirname(__FILE__) . '/Trace');
require_once '../library/HTMLPurifier.auto.php';

$purifier = new HTMLPurifier();

$data = $purifier->purify(file_get_contents('samples/Lexer/4.html'));
xdebug_stop_trace();
@@ -39,7 +39,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
|
|||||||
<table cellspacing="0"><tbody>
|
<table cellspacing="0"><tbody>
|
||||||
<tr><td class="impl-yes">Implemented</td></tr>
|
<tr><td class="impl-yes">Implemented</td></tr>
|
||||||
<tr><td class="impl-partial">Partially implemented</td></tr>
|
<tr><td class="impl-partial">Partially implemented</td></tr>
|
||||||
<tr><td class="impl-no">Will not implement</td></tr>
|
<tr><td class="impl-no">Not priority to implement</td></tr>
|
||||||
<tr><td class="danger">Dangerous attribute/property</td></tr>
|
<tr><td class="danger">Dangerous attribute/property</td></tr>
|
||||||
<tr><td class="css1">Present in CSS1</td></tr>
|
<tr><td class="css1">Present in CSS1</td></tr>
|
||||||
<tr><td class="feature">Feature, requires extra work</td></tr>
|
<tr><td class="feature">Feature, requires extra work</td></tr>
|
||||||
@@ -118,6 +118,7 @@ thead th {text-align:left;padding:0.1em;background-color:#EEE;}
|
|||||||
<tbody>
|
<tbody>
|
||||||
<tr><th colspan="2">Table</th></tr>
|
<tr><th colspan="2">Table</th></tr>
|
||||||
<tr class="impl-yes"><td>border-collapse</td><td>ENUM(collapse, separate)</td></tr>
|
<tr class="impl-yes"><td>border-collapse</td><td>ENUM(collapse, separate)</td></tr>
|
||||||
|
<tr class="impl-yes"><td>border-spacing</td><td>MULTIPLE</td></tr>
|
||||||
<tr class="impl-yes"><td>caption-side</td><td>ENUM(top, bottom)</td></tr>
|
<tr class="impl-yes"><td>caption-side</td><td>ENUM(top, bottom)</td></tr>
|
||||||
<tr class="feature"><td>empty-cells</td><td>ENUM(show, hide), No IE support makes this useless,
|
<tr class="feature"><td>empty-cells</td><td>ENUM(show, hide), No IE support makes this useless,
|
||||||
possible fix with &nbsp;? Unknown release milestone.</td></tr>
|
possible fix with &nbsp;? Unknown release milestone.</td></tr>
|
||||||
|
@@ -32,7 +32,7 @@
|
|||||||
Before we even write any code, it is paramount to consider whether or
|
Before we even write any code, it is paramount to consider whether or
|
||||||
not the code we're writing is necessary. HTML Purifier, by default,
|
not the code we're writing is necessary. HTML Purifier, by default,
|
||||||
contains a large set of elements and attributes: large enough so that
|
contains a large set of elements and attributes: large enough so that
|
||||||
<em>any</em> element or attribute in XHTML 1.0 (and its HTML variant)
|
<em>any</em> element or attribute in XHTML 1.0 or 1.1 (and its HTML variants)
|
||||||
that can be safely used by the general public is implemented.
|
that can be safely used by the general public is implemented.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
@@ -76,11 +76,12 @@
|
|||||||
<h3>XHTML 1.1</h3>
|
<h3>XHTML 1.1</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
We have not implemented the
|
As of HTMLPurifier 2.1.0, we have implemented the
|
||||||
<a href="http://www.w3.org/TR/2001/REC-ruby-20010531/">Ruby module</a>,
|
<a href="http://www.w3.org/TR/2001/REC-ruby-20010531/">Ruby module</a>,
|
||||||
which defines a set of tags
|
which defines a set of tags
|
||||||
for publishing short annotations for text, used mostly in Japanese
|
for publishing short annotations for text, used mostly in Japanese
|
||||||
and Chinese school texts.
|
and Chinese school texts, but applicable for positioning any text (not
|
||||||
|
limited to translations) above or below other corresponding text.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<h3>XHTML 2.0</h3>
|
<h3>XHTML 2.0</h3>
|
||||||
@@ -492,10 +493,11 @@ $def =& $config->getHTMLDefinition(true);
|
|||||||
<p>
|
<p>
|
||||||
The <code>(%flow;)*</code> indicates the allowed children of the
|
The <code>(%flow;)*</code> indicates the allowed children of the
|
||||||
<code>li</code> tag: <code>li</code> allows any number of flow
|
<code>li</code> tag: <code>li</code> allows any number of flow
|
||||||
elements as its children. In HTML Purifier, we'd write it like
|
elements as its children. (The <code>- O</code> allows the closing tag to be
|
||||||
<code>Flow</code> (here's where the content sets we were
|
omitted, though in XML this is not allowed.) In HTML Purifier,
|
||||||
discussing earlier come into play). There are three shorthand content models you
|
we'd write it like <code>Flow</code> (here's where the content sets
|
||||||
can specify:
|
we were discussing earlier come into play). There are three shorthand
|
||||||
|
content models you can specify:
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
<table class="table">
|
<table class="table">
|
||||||
@@ -668,12 +670,22 @@ $def =& $config->getHTMLDefinition(true);
|
|||||||
Common is a combination of the above-mentioned collections.
|
Common is a combination of the above-mentioned collections.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
<p class="aside">
|
||||||
|
Readers familiar with the modularization may have noticed that the Core
|
||||||
|
attribute collection differs from that specified by the <a
|
||||||
|
href="http://www.w3.org/TR/xhtml-modularization/abstract_modules.html#s_commonatts">abstract
|
||||||
|
modules of the XHTML Modularization 1.1</a>. We believe this section
|
||||||
|
to be in error, as <code>br</code> permits the use of the <code>style</code>
|
||||||
|
attribute even though it uses the <code>Core</code> collection, and
|
||||||
|
the DTD and XML Schemas supplied by W3C support our interpretation.
|
||||||
|
</p>
|
||||||
|
|
||||||
<h3>Attributes</h3>
|
<h3>Attributes</h3>
|
||||||
|
|
||||||
<p>
|
<p>
|
||||||
If you didn't read the <a href="#addAttribute">previous section on
|
If you didn't read the <a href="#addAttribute">earlier section on
|
||||||
adding attributes</a>, read it now. The last parameter is simply
|
adding attributes</a>, read it now. The last parameter is simply
|
||||||
array of attribute names to attribute implementations, in the exact
|
an array of attribute names to attribute implementations, in the exact
|
||||||
same format as <code>addAttribute()</code>.
|
same format as <code>addAttribute()</code>.
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
|
@@ -58,7 +58,7 @@ appear elsewhere on the document. The method is simple:</p>
|
|||||||
|
|
||||||
<pre>$config->set('HTML', 'EnableAttrID', true);
|
<pre>$config->set('HTML', 'EnableAttrID', true);
|
||||||
$config->set('Attr', 'IDBlacklist', array(
|
$config->set('Attr', 'IDBlacklist', array(
|
||||||
'list', 'of', 'attributes', 'that', 'are', 'forbidden'
|
'list', 'of', 'attribute', 'values', 'that', 'are', 'forbidden'
|
||||||
));</pre>
|
));</pre>
|
||||||
|
|
||||||
<p>That being said, there are some notable drawbacks. First of all, you have to
|
<p>That being said, there are some notable drawbacks. First of all, you have to
|
||||||
@@ -71,9 +71,9 @@ to possible standards-compliance issues.</p>
|
|||||||
<p>Furthermore, this position becomes untenable when a single web page must hold
|
<p>Furthermore, this position becomes untenable when a single web page must hold
|
||||||
multiple portions of user-submitted content. Since there's obviously no way
|
multiple portions of user-submitted content. Since there's obviously no way
|
||||||
to find out beforehand what IDs users will use, the blacklist is helpless.
|
to find out beforehand what IDs users will use, the blacklist is helpless.
|
||||||
And even since HTML Purifier validates each segment seperately, perhaps doing
|
And since HTML Purifier validates each segment separately, perhaps doing
|
||||||
so at different times, it would be extremely difficult to dynamically update
|
so at different times, it would be extremely difficult to dynamically update
|
||||||
the blacklist inbetween runs.</p>
|
the blacklist in between runs.</p>
|
||||||
|
|
||||||
<p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after
|
<p>Finally, simply destroying the ID is extremely un-userfriendly behavior: after
|
||||||
all, they might have simply specified a duplicate ID by accident.</p>
|
all, they might have simply specified a duplicate ID by accident.</p>
|
||||||
|
@@ -22,7 +22,7 @@ out:</p>
|
|||||||
|
|
||||||
<p class="emphasis">This ain't HTML Tidy!</p>
|
<p class="emphasis">This ain't HTML Tidy!</p>
|
||||||
|
|
||||||
<p>Rather, Tidy stands for a cool set of Tidy-inspired in HTML Purifier
|
<p>Rather, Tidy stands for a cool set of Tidy-inspired features in HTML Purifier
|
||||||
that allows users to submit deprecated elements and attributes and get
|
that allows users to submit deprecated elements and attributes and get
|
||||||
valid strict markup back. For example:</p>
|
valid strict markup back. For example:</p>
|
||||||
|
|
||||||
@@ -33,8 +33,8 @@ valid strict markup back. For example:</p>
|
|||||||
<pre><div style="text-align:center;">Centered</div></pre>
|
<pre><div style="text-align:center;">Centered</div></pre>
|
||||||
|
|
||||||
<p>...when this particular fix is run on the HTML. This tutorial will give
|
<p>...when this particular fix is run on the HTML. This tutorial will give
|
||||||
you down the lowdown of what exactly HTML Purifier will do when Tidy
|
you the lowdown of what exactly HTML Purifier will do when Tidy
|
||||||
is on, and how to fine tune this behavior. Once again, <strong>you do
|
is on, and how to fine-tune this behavior. Once again, <strong>you do
|
||||||
not need Tidy installed on your PHP to use these features!</strong></p>
|
not need Tidy installed on your PHP to use these features!</strong></p>
|
||||||
|
|
||||||
<h2>What does it do?</h2>
|
<h2>What does it do?</h2>
|
||||||
@@ -221,7 +221,7 @@ general syntax:</p>
|
|||||||
|
|
||||||
<p>The lowdown is, quite frankly, HTML Purifier's default settings are
|
<p>The lowdown is, quite frankly, HTML Purifier's default settings are
|
||||||
probably good enough. The next step is to bump the level up to heavy,
|
probably good enough. The next step is to bump the level up to heavy,
|
||||||
and if that still doesn't satisfy your appetite, do some fine tuning.
|
and if that still doesn't satisfy your appetite, do some fine-tuning.
|
||||||
Other than that, don't worry about it: this all works silently and
|
Other than that, don't worry about it: this all works silently and
|
||||||
effectively in the background.</p>
|
effectively in the background.</p>
|
||||||
|
|
||||||
|
@@ -96,7 +96,7 @@ which can be a rewarding (but difficult) task.</p>
|
|||||||
<h2 id="findcharset">Finding the real encoding</h2>
|
<h2 id="findcharset">Finding the real encoding</h2>
|
||||||
|
|
||||||
<p>In the beginning, there was ASCII, and things were simple. But they
|
<p>In the beginning, there was ASCII, and things were simple. But they
|
||||||
weren't good, for no one could write in Cryllic or Thai. So there
|
weren't good, for no one could write in Cyrillic or Thai. So there
|
||||||
exploded a proliferation of character encodings to remedy the problem
|
exploded a proliferation of character encodings to remedy the problem
|
||||||
by extending the characters ASCII could express. This ridiculously
|
by extending the characters ASCII could express. This ridiculously
|
||||||
simplified version of the history of character encodings shows us that
|
simplified version of the history of character encodings shows us that
|
||||||
@@ -138,7 +138,7 @@ browser:</p>
|
|||||||
<dd>View > Encoding: bulleted item is unofficial name</dd>
|
<dd>View > Encoding: bulleted item is unofficial name</dd>
|
||||||
</dl>
|
</dl>
|
||||||
|
|
||||||
<p>Internet Explorer won't give you the mime (i.e. useful/real) name of the
|
<p>Internet Explorer won't give you the MIME (i.e. useful/real) name of the
|
||||||
character encoding, so you'll have to look it up using their description.
|
character encoding, so you'll have to look it up using their description.
|
||||||
Some common ones:</p>
|
Some common ones:</p>
|
||||||
|
|
||||||
@@ -216,6 +216,12 @@ if your <code>META</code> tag claims that either:</p>
|
|||||||
|
|
||||||
<h2 id="fixcharset">Fixing the encoding</h2>
|
<h2 id="fixcharset">Fixing the encoding</h2>
|
||||||
|
|
||||||
|
<p class="aside">The advice given here is for pages being served as
|
||||||
|
vanilla <code>text/html</code>. Different practices must be used
|
||||||
|
for <code>application/xml</code> or <code>application/xhtml+xml</code>; see
|
||||||
|
<a href="http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020430/">W3C's
|
||||||
|
document on XHTML media types</a> for more information.</p>
|
||||||
|
|
||||||
<p>If your <code>META</code> encoding and your real encoding match,
|
<p>If your <code>META</code> encoding and your real encoding match,
|
||||||
savvy! You can skip this section. If they don't...</p>
|
savvy! You can skip this section. If they don't...</p>
|
||||||
|
|
||||||
@@ -231,7 +237,7 @@ of your real encoding.</p>
|
|||||||
why the character encoding should be explicitly stated. When the
|
why the character encoding should be explicitly stated. When the
|
||||||
browser isn't told what the character encoding of a text is, it
|
browser isn't told what the character encoding of a text is, it
|
||||||
has to guess: and sometimes the guess is wrong. Hackers can manipulate
|
has to guess: and sometimes the guess is wrong. Hackers can manipulate
|
||||||
this guess in order to slip XSS pass filters and then fool the
|
this guess in order to slip XSS past filters and then fool the
|
||||||
browser into executing it as active code. A great example of this
|
browser into executing it as active code. A great example of this
|
||||||
is the <a href="http://shiflett.org/archive/177">Google UTF-7
|
is the <a href="http://shiflett.org/archive/177">Google UTF-7
|
||||||
exploit</a>.</p>
|
exploit</a>.</p>
|
||||||
@@ -302,7 +308,8 @@ languages</a>. The appropriate code is:</p>
|
|||||||
|
|
||||||
<p>...replacing UTF-8 with whatever your embedded encoding is.
|
<p>...replacing UTF-8 with whatever your embedded encoding is.
|
||||||
This code must come before any output, so be careful about
|
This code must come before any output, so be careful about
|
||||||
stray whitespace in your application.</p>
|
stray whitespace in your application (i.e., any whitespace before
|
||||||
|
output excluding whitespace within <?php ?> tags).</p>
|
||||||
|
|
||||||
<h4 id="fixcharset-server-phpini">PHP ini directive</h4>
|
<h4 id="fixcharset-server-phpini">PHP ini directive</h4>
|
||||||
|
|
||||||
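The requirement that the header go out before any output can be guarded in code. The following is only a sketch, assuming PHP 7 or later: `content_type_header()` and `send_charset_header()` are hypothetical helper names, not part of HTML Purifier, while `headers_sent()` and `header()` are standard PHP functions.

```php
<?php
// Hypothetical helpers (not part of HTML Purifier); assumes PHP 7+.

// Build the header line that announces the page's character encoding.
function content_type_header(string $charset = 'UTF-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Send the header only if no output (not even stray whitespace) has
// gone out yet. Returns false when it is already too late.
function send_charset_header(string $charset = 'UTF-8'): bool
{
    if (headers_sent($file, $line)) {
        // $file and $line are filled in by headers_sent() and point
        // at the place where output started.
        error_log("Output already started at $file:$line");
        return false;
    }
    header(content_type_header($charset));
    return true;
}
```

Calling `send_charset_header('UTF-8')` at the very top of a script, before any include that might emit whitespace, is one way to find out early where stray output comes from.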
@@ -313,8 +320,8 @@ header call: <code><a href="http://php.net/ini.core#ini.default-charset">default
|
|||||||
|
|
||||||
<p>...will also do the trick. If PHP is running as an Apache module (and
|
<p>...will also do the trick. If PHP is running as an Apache module (and
|
||||||
not as FastCGI, consult
|
not as FastCGI, consult
|
||||||
<a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess do apply this property
|
<a href="http://php.net/phpinfo">phpinfo</a>() for details), you can even use htaccess to apply this property
|
||||||
globally:</p>
|
across many PHP files:</p>
|
||||||
|
|
||||||
<pre><a href="http://php.net/configuration.changes#configuration.changes.apache">php_value</a> default_charset "UTF-8"</pre>
|
<pre><a href="http://php.net/configuration.changes#configuration.changes.apache">php_value</a> default_charset "UTF-8"</pre>
|
||||||
|
|
||||||
@@ -360,10 +367,11 @@ to send anything at all:</p>
|
|||||||
|
|
||||||
<pre><a href="http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset">AddDefaultCharset</a> Off</pre>
|
<pre><a href="http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset">AddDefaultCharset</a> Off</pre>
|
||||||
|
|
||||||
<p>...making your <code>META</code> tags the sole source of
|
<p>...making your internal charset declaration (usually the <code>META</code> tags)
|
||||||
character encoding information. In these cases, it is
|
the sole source of character encoding
|
||||||
<em>especially</em> important to make sure you have valid <code>META</code>
|
information. In these cases, it is <em>especially</em> important to make
|
||||||
tags on your pages and all the text before them is ASCII.</p>
|
sure you have valid <code>META</code> tags on your pages and all the
|
||||||
|
text before them is ASCII.</p>
|
||||||
|
|
||||||
<blockquote class="aside"><p>These directives can also be
|
<blockquote class="aside"><p>These directives can also be
|
||||||
placed in httpd.conf file for Apache, but
|
placed in httpd.conf file for Apache, but
|
||||||
@@ -428,28 +436,30 @@ IIS to change character encodings, I'd be grateful.</p>
|
|||||||
|
|
||||||
<p><code>META</code> tags are the most common source of embedded
|
<p><code>META</code> tags are the most common source of embedded
|
||||||
encodings, but they can also come from somewhere else: XML
|
encodings, but they can also come from somewhere else: XML
|
||||||
processing instructions. They look like:</p>
|
Declarations. They look like:</p>
|
||||||
|
|
||||||
<pre><?xml version="1.0" encoding="UTF-8"?></pre>
|
<pre><?xml version="1.0" encoding="UTF-8"?></pre>
|
||||||
|
|
||||||
<p>...and are most often found in XML documents (including XHTML).</p>
|
<p>...and are most often found in XML documents (including XHTML).</p>
|
||||||
|
|
||||||
<p>For XHTML, this processing instruction theoretically
|
<p>For XHTML, this XML Declaration theoretically
|
||||||
overrides the <code>META</code> tag. In reality, this happens only when the
|
overrides the <code>META</code> tag. In reality, this happens only when the
|
||||||
XHTML is actually served as legit XML and not HTML, which almost
|
XHTML is actually served as legit XML and not HTML, which almost
|
||||||
never happens due to Internet Explorer's lack of support for
|
never happens due to Internet Explorer's lack of support for
|
||||||
<code>application/xhtml+xml</code> (even though doing so is often
|
<code>application/xhtml+xml</code> (even though doing so is often
|
||||||
argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good practice</a>).</p>
|
argued to be <a href="http://www.hixie.ch/advocacy/xhtml">good
|
||||||
|
practice</a> and is required by the XHTML 1.1 specification).</p>
|
||||||
|
|
||||||
<p>For XML, however, this processing instruction is extremely important.
|
<p>For XML, however, this XML Declaration is extremely important.
|
||||||
Since most webservers are not configured to send charsets for .xml files,
|
Since most webservers are not configured to send charsets for .xml files,
|
||||||
this is the only thing a parser has to go on. Furthermore, the default
|
this is the only thing a parser has to go on. Furthermore, the default
|
||||||
for XML files is UTF-8, which often butts heads with more common
|
for XML files is UTF-8, which often butts heads with more common
|
||||||
ISO-8859-1 encoding (you see this in garbled RSS feeds).</p>
|
ISO-8859-1 encoding (you see this in garbled RSS feeds).</p>
|
||||||
|
|
||||||
<p>In short, if you use XHTML and have gone through the
|
<p>In short, if you use XHTML and have gone through the
|
||||||
trouble of adding the XML header, make sure it jibes
|
trouble of adding the XML Declaration, make sure it jibes
|
||||||
with your <code>META</code> tags and HTTP headers.</p>
|
with your <code>META</code> tags (which should only be present
|
||||||
|
if served in text/html) and HTTP headers.</p>
|
||||||
|
|
||||||
<h3 id="fixcharset-internals">Inside the process</h3>
|
<h3 id="fixcharset-internals">Inside the process</h3>
|
||||||
|
|
||||||
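A quick way to check whether an XML Declaration and a `META` tag agree is to pull both encodings out of the markup and compare them. The sketch below assumes PHP 7 or later and uses deliberately simple regular expressions rather than a real parser. `declared_encodings()` and `encodings_agree()` are hypothetical names introduced for illustration, and the HTTP header is not examined here.

```php
<?php
// Hypothetical helpers (not part of HTML Purifier); assumes PHP 7+.

// Extract the encoding named in the XML Declaration and in the first
// META charset, upper-cased for comparison. Missing ones are null.
function declared_encodings(string $markup): array
{
    $found = array('xml' => null, 'meta' => null);
    // The XML Declaration must be the very first thing in the document.
    if (preg_match('/^<\?xml[^>]*\bencoding=(["\'])([A-Za-z][A-Za-z0-9._-]*)\1/', $markup, $m)) {
        $found['xml'] = strtoupper($m[2]);
    }
    // Matches both content="text/html; charset=X" and charset="X".
    if (preg_match('/<meta[^>]+charset=["\']?([A-Za-z][A-Za-z0-9._-]*)/i', $markup, $m)) {
        $found['meta'] = strtoupper($m[1]);
    }
    return $found;
}

// Two declarations conflict only when both are present and differ.
function encodings_agree(array $found): bool
{
    return $found['xml'] === null
        || $found['meta'] === null
        || $found['xml'] === $found['meta'];
}

$page = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
    . '<html><head><meta http-equiv="Content-Type" '
    . 'content="text/html; charset=ISO-8859-1" /></head></html>';
$found = declared_encodings($page);
var_dump(encodings_agree($found)); // bool(false): UTF-8 vs ISO-8859-1
```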
@@ -506,7 +516,7 @@ usage in one language sometimes requires the occasional special character
|
|||||||
that, without surprise, is not available in your character set. Sometimes
|
that, without surprise, is not available in your character set. Sometimes
|
||||||
developers get around this by adding support for multiple encodings: when
|
developers get around this by adding support for multiple encodings: when
|
||||||
using Chinese, use Big5, when using Japanese, use Shift-JIS, when
|
using Chinese, use Big5, when using Japanese, use Shift-JIS, when
|
||||||
using Greek, etc. Other times, they use character entities with great
|
using Greek, etc. Other times, they use character references with great
|
||||||
zeal.</p>
|
zeal.</p>
|
||||||
|
|
||||||
<p>UTF-8, however, obviates the need for any of these complicated
|
<p>UTF-8, however, obviates the need for any of these complicated
|
||||||
@@ -520,14 +530,14 @@ you don't have to use those user-unfriendly entities.</p>
|
|||||||
|
|
||||||
<p>Websites encoded in Latin-1 (ISO-8859-1) which occasionally need
|
<p>Websites encoded in Latin-1 (ISO-8859-1) which occasionally need
|
||||||
a special character outside of their scope often will use a character
|
a special character outside of their scope often will use a character
|
||||||
entity to achieve the desired effect. For instance, θ can be
|
entity reference to achieve the desired effect. For instance, θ can be
|
||||||
written <code>&theta;</code>, regardless of the character encoding's
|
written <code>&theta;</code>, regardless of the character encoding's
|
||||||
support of Greek letters.</p>
|
support of Greek letters.</p>
|
||||||
|
|
||||||
<p>This works nicely for limited use of special characters, but
|
<p>This works nicely for limited use of special characters, but
|
||||||
say you wanted this sentence of Chinese text: 激光,
|
say you wanted this sentence of Chinese text: 激光,
|
||||||
這兩個字是甚麼意思.
|
這兩個字是甚麼意思.
|
||||||
The entity-ized version would look like this:</p>
|
The ampersand encoded version would look like this:</p>
|
||||||
|
|
||||||
<pre>&#28608;&#20809;, &#36889;&#20841;&#20491;&#23383;&#26159;&#29978;&#40636;&#24847;&#24605;</pre>
|
<pre>&#28608;&#20809;, &#36889;&#20841;&#20491;&#23383;&#26159;&#29978;&#40636;&#24847;&#24605;</pre>
|
||||||
|
|
||||||
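To see where numeric character references such as `&#952;` come from, the sketch below converts every non-ASCII character of a UTF-8 string into its decimal reference. It assumes PHP 7 or later and valid UTF-8 input. `utf8_to_numeric_references()` is a hypothetical helper written for this illustration, not HTML Purifier's own encoder.

```php
<?php
// Hypothetical helper (not HTML Purifier's Encoder); assumes PHP 7+.

// Replace each non-ASCII character of a UTF-8 string with a decimal
// numeric character reference, leaving ASCII untouched.
function utf8_to_numeric_references(string $text): string
{
    $result = preg_replace_callback(
        '/[^\x00-\x7F]/u', // with /u, this matches one whole code point
        function (array $match): string {
            // Decode the 2-, 3- or 4-byte UTF-8 sequence by hand.
            $bytes = array_values(unpack('C*', $match[0]));
            switch (count($bytes)) {
                case 2:
                    $codePoint = (($bytes[0] & 0x1F) << 6)
                        | ($bytes[1] & 0x3F);
                    break;
                case 3:
                    $codePoint = (($bytes[0] & 0x0F) << 12)
                        | (($bytes[1] & 0x3F) << 6)
                        | ($bytes[2] & 0x3F);
                    break;
                default: // four bytes
                    $codePoint = (($bytes[0] & 0x07) << 18)
                        | (($bytes[1] & 0x3F) << 12)
                        | (($bytes[2] & 0x3F) << 6)
                        | ($bytes[3] & 0x3F);
            }
            return '&#' . $codePoint . ';';
        },
        $text
    );
    // preg_replace_callback() returns null on malformed UTF-8;
    // in that case the input is handed back unchanged.
    return $result === null ? $text : $result;
}

echo utf8_to_numeric_references("\u{03B8}"), "\n"; // prints &#952;
```

Every converted character costs several ASCII bytes and is unreadable in the page source, which is the cumbersomeness described above.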
@@ -545,7 +555,7 @@ an application that originally used ISO-8859-1 but switched to UTF-8
|
|||||||
when it became far too cumbersome to support foreign languages. Bots
|
when it became far too cumbersome to support foreign languages. Bots
|
||||||
will now actually go through articles and convert character entities
|
will now actually go through articles and convert character entities
|
||||||
to their corresponding real characters for the sake of user-friendliness
|
to their corresponding real characters for the sake of user-friendliness
|
||||||
and searcheability. See
|
and searchability. See
|
||||||
<a href="http://meta.wikimedia.org/wiki/Help:Special_characters">Meta's
|
<a href="http://meta.wikimedia.org/wiki/Help:Special_characters">Meta's
|
||||||
page on special characters</a> for more details.
|
page on special characters</a> for more details.
|
||||||
</p></blockquote>
|
</p></blockquote>
|
||||||
@@ -567,10 +577,11 @@ which may be used by POST, and is required when you want to upload
|
|||||||
files.</p>
|
files.</p>
|
||||||
|
|
||||||
<p>The following is a summarization of notes from
|
<p>The following is a summarization of notes from
|
||||||
<a href="http://ppewww.physics.gla.ac.uk/~flavell/charset/form-i18n.html">
|
<a href="http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html">
|
||||||
<code>FORM</code> submission and i18n</a>. That document contains lots
|
<code>FORM</code> submission and i18n</a>. That document contains lots
|
||||||
of useful information, but is written in a rambly manner, so
|
of useful information, but is written in a rambly manner, so
|
||||||
here I try to get right to the point.</p>
|
here I try to get right to the point. (Note: the original has
|
||||||
|
disappeared off the web, so I am linking to the Web Archive copy.)</p>
|
||||||
|
|
||||||
<h4 id="whyutf8-forms-urlencoded"><code>application/x-www-form-urlencoded</code></h4>
|
<h4 id="whyutf8-forms-urlencoded"><code>application/x-www-form-urlencoded</code></h4>
|
||||||
|
|
||||||
@@ -592,7 +603,7 @@ browser you're using, they might:</p>
|
|||||||
<ul>
|
<ul>
|
||||||
<li>Replace the unsupported characters with useless question marks,</li>
|
<li>Replace the unsupported characters with useless question marks,</li>
|
||||||
<li>Attempt to fix the characters (example: smart quotes to regular quotes),</li>
|
<li>Attempt to fix the characters (example: smart quotes to regular quotes),</li>
|
||||||
<li>Replace the character with a character entity, or</li>
|
<li>Replace the character with a character entity reference, or</li>
|
||||||
<li>Send it anyway as a different character encoding mixed in
|
<li>Send it anyway as a different character encoding mixed in
|
||||||
with the original encoding (usually Windows-1252 rather than
|
with the original encoding (usually Windows-1252 rather than
|
||||||
iso-8859-1 or UTF-8 interspersed in 8-bit)</li>
|
iso-8859-1 or UTF-8 interspersed in 8-bit)</li>
|
||||||
@@ -608,7 +619,7 @@ since UTF-8 supports every character.</p>
|
|||||||
|
|
||||||
<h4 id="whyutf8-forms-multipart"><code>multipart/form-data</code></h4>
|
<h4 id="whyutf8-forms-multipart"><code>multipart/form-data</code></h4>
|
||||||
|
|
||||||
<p>Multipart form submission takes a way a lot of the ambiguity
|
<p>Multipart form submission takes away a lot of the ambiguity
|
||||||
that percent-encoding had: the server now can explicitly ask for
|
that percent-encoding had: the server now can explicitly ask for
|
||||||
certain encodings, and the client can explicitly tell the server
|
certain encodings, and the client can explicitly tell the server
|
||||||
during the form submission what encoding the fields are in.</p>
|
during the form submission what encoding the fields are in.</p>
|
||||||
@@ -621,9 +632,9 @@ Each method has deficiencies, especially the former.</p>
|
|||||||
<p>If you tell the browser to send the form in the same encoding as
|
<p>If you tell the browser to send the form in the same encoding as
|
||||||
the page, you still have the trouble of what to do with characters
|
the page, you still have the trouble of what to do with characters
|
||||||
that are outside of the character encoding's range. The behavior, once
|
that are outside of the character encoding's range. The behavior, once
|
||||||
again, varies: Firefox 2.0 entity-izes them while Internet Explorer
|
again, varies: Firefox 2.0 converts them to character entity references
|
||||||
7.0 mangles them beyond intelligibility. For serious internationalization purposes,
|
while Internet Explorer 7.0 mangles them beyond intelligibility. For
|
||||||
this is not an option.</p>
|
serious internationalization purposes, this is not an option.</p>
|
||||||
|
|
||||||
<p>The other possibility is to set the form's <code>accept-charset</code> attribute to UTF-8, which
|
<p>The other possibility is to set the form's <code>accept-charset</code> attribute to UTF-8, which
|
||||||
raises the question: Why aren't you using UTF-8 for everything then?
|
raises the question: Why aren't you using UTF-8 for everything then?
|
||||||
@@ -663,12 +674,12 @@ it up to the module iconv to do the dirty work.</p>
|
|||||||
<p>This approach, however, is not perfect. iconv is blithely unaware
|
<p>This approach, however, is not perfect. iconv is blithely unaware
|
||||||
of HTML character entities. HTML Purifier, in order to
|
of HTML character entities. HTML Purifier, in order to
|
||||||
protect against sophisticated escaping schemes, normalizes all character
|
protect against sophisticated escaping schemes, normalizes all character
|
||||||
and numeric entities before processing the text. This leads to
|
and numeric entity references before processing the text. This leads to
|
||||||
one important ramification:</p>
|
one important ramification:</p>
|
||||||
|
|
||||||
<p><strong>Any character that is not supported by the target character
|
<p><strong>Any character that is not supported by the target character
|
||||||
set, regardless of whether or not it is in the form of a character
|
set, regardless of whether or not it is in the form of a character
|
||||||
entity or a raw character, will be silently ignored.</strong></p>
|
entity reference or a raw character, will be silently ignored.</strong></p>
|
||||||
|
|
||||||
<p>Example of this principle at work: say you have <code>&theta;</code>
|
<p>Example of this principle at work: say you have <code>&theta;</code>
|
||||||
in your HTML, but the output is in Latin-1 (which, understandably,
|
in your HTML, but the output is in Latin-1 (which, understandably,
|
||||||
@@ -677,7 +688,7 @@ set the encoding correctly using %Core.Encoding):</p>
|
|||||||
|
|
||||||
<ul>
|
<ul>
|
||||||
<li>The <code>Encoder</code> will transform the text from ISO 8859-1 to UTF-8
|
<li>The <code>Encoder</code> will transform the text from ISO 8859-1 to UTF-8
|
||||||
(note that theta is preserved since it doesn't actually use
|
(note that theta is preserved here since it doesn't actually use
|
||||||
any non-ASCII characters): <code>&theta;</code></li>
|
any non-ASCII characters): <code>&theta;</code></li>
|
||||||
<li>The <code>EntityParser</code> will transform all named and numeric
|
<li>The <code>EntityParser</code> will transform all named and numeric
|
||||||
character entities to their corresponding raw UTF-8 equivalents:
|
character entities to their corresponding raw UTF-8 equivalents:
|
||||||
@@ -700,7 +711,7 @@ Purifier has provided a slightly more palatable workaround using
|
|||||||
<li>The <code>EntityParser</code> transforms entities: <code>θ</code></li>
|
<li>The <code>EntityParser</code> transforms entities: <code>θ</code></li>
|
||||||
<li>HTML Purifier processes the code: <code>θ</code></li>
|
<li>HTML Purifier processes the code: <code>θ</code></li>
|
||||||
<li>The <code>Encoder</code> replaces all non-ASCII characters
|
<li>The <code>Encoder</code> replaces all non-ASCII characters
|
||||||
with numeric entities: <code>&#952;</code></li>
|
with numeric entity references: <code>&#952;</code></li>
|
||||||
<li>For good measure, <code>Encoder</code> transforms encoding back to
|
<li>For good measure, <code>Encoder</code> transforms encoding back to
|
||||||
original (which is strictly unnecessary for 99% of encodings
|
original (which is strictly unnecessary for 99% of encodings
|
||||||
out there): <code>&#952;</code> (remember, it's all ASCII!)</li>
|
out there): <code>&#952;</code> (remember, it's all ASCII!)</li>
|
||||||
@@ -710,19 +721,19 @@ Purifier has provided a slightly more palatable workaround using
|
|||||||
the land of Unicode characters, and is totally unacceptable for Chinese
|
the land of Unicode characters, and is totally unacceptable for Chinese
|
||||||
or Japanese texts. The even bigger kicker is that, supposing the
|
or Japanese texts. The even bigger kicker is that, supposing the
|
||||||
input encoding was actually ISO-8859-7, which <em>does</em> support
|
input encoding was actually ISO-8859-7, which <em>does</em> support
|
||||||
theta, the character would get entity-ized anyway! (The Encoder does
|
theta, the character would get converted into a character entity reference
|
||||||
not discriminate).</p>
|
anyway! (The Encoder does not discriminate).</p>
|
||||||
|
|
||||||
<p>The current functionality is about where HTML Purifier will be for
|
<p>The current functionality is about where HTML Purifier will be for
|
||||||
the rest of eternity. HTML Purifier could attempt to preserve the original
|
the rest of eternity. HTML Purifier could attempt to preserve the original
|
||||||
form of the entities so that they could be substituted back in, only the
|
form of the character references so that they could be substituted back in, only the
|
||||||
DOM extension kills them off irreversibly. HTML Purifier could also attempt
|
DOM extension kills them off irreversibly. HTML Purifier could also attempt
|
||||||
to be smart and only convert non-ASCII characters that weren't supported
|
to be smart and only convert non-ASCII characters that weren't supported
|
||||||
by the target encoding, but that would require reimplementing iconv
|
by the target encoding, but that would require reimplementing iconv
|
||||||
with HTML awareness, something I will not do.</p>
|
with HTML awareness, something I will not do.</p>
|
||||||
|
|
||||||
<p>So there: either it's UTF-8 or crippled international support. Your pick! (and I'm
|
<p>So there: either it's UTF-8 or crippled international support. Your pick! (and I'm
|
||||||
not being sarcastic here: some people could care less about other languages)</p>
|
not being sarcastic here: some people couldn't care less about other languages).</p>
|
||||||
|
|
||||||
<h2 id="migrate">Migrate to UTF-8</h2>
|
<h2 id="migrate">Migrate to UTF-8</h2>
|
||||||
|
|
||||||
@@ -984,7 +995,7 @@ and yes, it is variable width. Other traits:</p>
|
|||||||
in different ways. It is beyond the scope of this document to explain
|
in different ways. It is beyond the scope of this document to explain
|
||||||
what precisely these implications are. PHPWact provides
|
what precisely these implications are. PHPWact provides
|
||||||
a very good <a href="http://www.phpwact.org/php/i18n/utf-8">reference document</a>
|
a very good <a href="http://www.phpwact.org/php/i18n/utf-8">reference document</a>
|
||||||
on what to expect from each functions, although coverage is spotty in
|
on what to expect from each function, although coverage is spotty in
|
||||||
some areas. Their more general notes on
|
some areas. Their more general notes on
|
||||||
<a href="http://www.phpwact.org/php/i18n/charsets">character sets</a>
|
<a href="http://www.phpwact.org/php/i18n/charsets">character sets</a>
|
||||||
are also worth looking at for information on UTF-8. Some rules of thumb
|
are also worth looking at for information on UTF-8. Some rules of thumb
|
||||||
@@ -998,7 +1009,7 @@ when dealing with Unicode text:</p>
|
|||||||
<li>Think twice before using functions that:<ul>
|
<li>Think twice before using functions that:<ul>
|
||||||
<li>...count characters (strlen will return bytes, not characters;
|
<li>...count characters (strlen will return bytes, not characters;
|
||||||
str_split and wordwrap may corrupt)</li>
|
str_split and wordwrap may corrupt)</li>
|
||||||
<li>...entity-ize things (UTF-8 doesn't need entities)</li>
|
<li>...convert characters to entity references (UTF-8 doesn't need entities)</li>
|
||||||
<li>...do very complex string processing (*printf)</li>
|
<li>...do very complex string processing (*printf)</li>
|
||||||
</ul></li>
|
</ul></li>
|
||||||
</ul>
|
</ul>
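The byte-versus-character distinction in the list above can be illustrated with a minimal sketch. It assumes PCRE was built with UTF-8 support (the `/u` modifier), and `count_utf8_chars()` is a hypothetical helper name, not part of PHP or HTML Purifier:

```php
<?php
// Illustrative sketch only; count_utf8_chars() is a hypothetical helper.

// Greek small letter theta (U+03B8) encoded as UTF-8 is two bytes: 0xCE 0xB8.
$theta = "\xCE\xB8";

// Counts code points rather than bytes by matching with the PCRE /u modifier.
function count_utf8_chars($string) {
    // /s lets the dot match newlines, /u treats the subject as UTF-8
    return preg_match_all('/./su', $string, $matches);
}

$bytes = strlen($theta);            // byte length
$chars = count_utf8_chars($theta);  // character (code point) length

echo "bytes: $bytes, characters: $chars\n";
```

For the single theta above, this should report two bytes but only one character, which is why byte-oriented functions need a second look when the text is UTF-8.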
|
||||||
|
28
docs/ref-css-length.txt
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
|
||||||
|
CSS Length Reference
|
||||||
|
To bound, or not to bound, that is the question
|
||||||
|
|
||||||
|
It's quite a reasonable request, really, and it's already been implemented
|
||||||
|
for HTML. That is, length bounding. It makes little sense to let users
|
||||||
|
define text blocks that have a font-size of 63,360 inches (that's a mile,
|
||||||
|
by the way) or a width forty times that of the parent container.
|
||||||
|
|
||||||
|
But it's a little more complicated than that. There are multiple units
|
||||||
|
one can use, and we have to do a little unit conversion to get things working.
|
||||||
|
Here's what we have:
|
||||||
|
|
||||||
|
Absolute:
|
||||||
|
1 in = 2.54 cm
|
||||||
|
1 cm = 10 mm
|
||||||
|
1 pt = 1/72 in
|
||||||
|
1 pc = 12 pt
|
||||||
|
|
||||||
|
Relative:
|
||||||
|
1 em ~= 10.0667 px
|
||||||
|
1 ex ~= 0.5 em, though Mozilla Firefox says 1 ex = 6px
|
||||||
|
1 px ~= 1 pt
|
||||||
|
|
||||||
|
Watch out: font-sizes can also be nested to get successively larger
|
||||||
|
(although I do not relish having to keep track of context font-sizes,
|
||||||
|
this may be necessary, especially for some of the more advanced features
|
||||||
|
for preventing things like white on white).
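As a rough illustration of the unit conversion discussed above, the sketch below normalizes absolute lengths to points and checks them against a bound. Both function names and the bound value are hypothetical and are not part of HTML Purifier; relative units (em, ex, px) are deliberately left out because they depend on context:

```php
<?php
// Hypothetical helper, not part of HTML Purifier: converts an absolute CSS
// length to points using the absolute ratios from the reference above.
function css_length_to_pt($value, $unit) {
    // points per unit
    $pt_per_unit = array(
        'pt' => 1.0,
        'pc' => 12.0,         // 1 pc = 12 pt
        'in' => 72.0,         // 1 pt = 1/72 in
        'cm' => 72.0 / 2.54,  // 1 in = 2.54 cm
        'mm' => 72.0 / 25.4,  // 1 cm = 10 mm
    );
    if (!isset($pt_per_unit[$unit])) return false; // relative or unknown unit
    return $value * $pt_per_unit[$unit];
}

// Hypothetical bound check: $max_pt is an arbitrary illustration value.
function css_length_within($value, $unit, $max_pt) {
    $pt = css_length_to_pt($value, $unit);
    return $pt !== false && $pt <= $max_pt;
}

// The mile-high font-size from the text, expressed in points.
$mile_in_pt = css_length_to_pt(63360, 'in');
$mile_ok    = css_length_within(63360, 'in', 1000);
$pica_ok    = css_length_within(1, 'pc', 1000);
```

Converting everything to one unit first keeps the bound itself a single number; how relative units should be resolved is left open here, as in the notes above.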
|
@@ -22,8 +22,8 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
/*
|
/*
|
||||||
HTML Purifier 2.0.1 - Standards Compliant HTML Filtering
|
HTML Purifier 2.1.3 - Standards Compliant HTML Filtering
|
||||||
Copyright (C) 2006 Edward Z. Yang
|
Copyright (C) 2006-2007 Edward Z. Yang
|
||||||
|
|
||||||
This library is free software; you can redistribute it and/or
|
This library is free software; you can redistribute it and/or
|
||||||
modify it under the terms of the GNU Lesser General Public
|
modify it under the terms of the GNU Lesser General Public
|
||||||
@@ -40,9 +40,11 @@
|
|||||||
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||||
*/
|
*/
|
||||||
|
|
||||||
// almost every class has an undocumented dependency to these, so make sure
|
// constants are slow, but we'll make one exception
|
||||||
// they get included
|
define('HTMLPURIFIER_PREFIX', dirname(__FILE__));
|
||||||
require_once 'HTMLPurifier/ConfigSchema.php'; // important
|
|
||||||
|
// every class has an undocumented dependency on these, so they must be included!
|
||||||
|
require_once 'HTMLPurifier/ConfigSchema.php'; // fatal errors if not included
|
||||||
require_once 'HTMLPurifier/Config.php';
|
require_once 'HTMLPurifier/Config.php';
|
||||||
require_once 'HTMLPurifier/Context.php';
|
require_once 'HTMLPurifier/Context.php';
|
||||||
|
|
||||||
@@ -57,16 +59,23 @@ require_once 'HTMLPurifier/LanguageFactory.php';
|
|||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'Core', 'CollectErrors', false, 'bool', '
|
'Core', 'CollectErrors', false, 'bool', '
|
||||||
Whether or not to collect errors found while filtering the document. This
|
Whether or not to collect errors found while filtering the document. This
|
||||||
is a useful way to give feedback to your users. CURRENTLY NOT IMPLEMENTED.
|
is a useful way to give feedback to your users. <strong>Warning:</strong>
|
||||||
This directive has been available since 2.0.0.
|
Currently this feature is very patchy and experimental, with lots of
|
||||||
|
possible error messages not yet implemented. It will not cause any problems,
|
||||||
|
but it may not help your users either. This directive has been available
|
||||||
|
since 2.0.0.
|
||||||
');
|
');
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Main library execution class.
|
* Facade that coordinates HTML Purifier's subsystems in order to purify HTML.
|
||||||
*
|
*
|
||||||
* Facade that performs calls to the HTMLPurifier_Lexer,
|
* @note There are several points in which configuration can be specified
|
||||||
* HTMLPurifier_Strategy and HTMLPurifier_Generator subsystems in order to
|
* for HTML Purifier. The precedence of these (from lowest to
|
||||||
* purify HTML.
|
* highest) is as follows:
|
||||||
|
* -# Instance: new HTMLPurifier($config)
|
||||||
|
* -# Invocation: purify($html, $config)
|
||||||
|
* These configurations are entirely independent of each other and
|
||||||
|
* are *not* merged.
|
||||||
*
|
*
|
||||||
* @todo We need an easier way to inject strategies, it'll probably end
|
* @todo We need an easier way to inject strategies, it'll probably end
|
||||||
* up getting done through config though.
|
* up getting done through config though.
|
||||||
@@ -74,15 +83,16 @@ This directive has been available since 2.0.0.
|
|||||||
class HTMLPurifier
|
class HTMLPurifier
|
||||||
{
|
{
|
||||||
|
|
||||||
var $version = '2.0.1';
|
var $version = '2.1.3';
|
||||||
|
|
||||||
var $config;
|
var $config;
|
||||||
var $filters;
|
var $filters = array();
|
||||||
|
|
||||||
var $strategy, $generator;
|
var $strategy, $generator;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Final HTMLPurifier_Context of last run purification. Might be an array.
|
* Resultant HTMLPurifier_Context of last run purification. Is an array
|
||||||
|
* of contexts if the last called method was purifyArray().
|
||||||
* @public
|
* @public
|
||||||
*/
|
*/
|
||||||
var $context;
|
var $context;
|
||||||
@@ -147,6 +157,11 @@ class HTMLPurifier
|
|||||||
$context->register('ErrorCollector', $error_collector);
|
$context->register('ErrorCollector', $error_collector);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// setup id_accumulator context, necessary due to the fact that
|
||||||
|
// AttrValidator can be called from many places
|
||||||
|
$id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
|
||||||
|
$context->register('IDAccumulator', $id_accumulator);
|
||||||
|
|
||||||
$html = HTMLPurifier_Encoder::convertToUTF8($html, $config, $context);
|
$html = HTMLPurifier_Encoder::convertToUTF8($html, $config, $context);
|
||||||
|
|
||||||
for ($i = 0, $size = count($this->filters); $i < $size; $i++) {
|
for ($i = 0, $size = count($this->filters); $i < $size; $i++) {
|
||||||
@@ -195,14 +210,16 @@ class HTMLPurifier
|
|||||||
|
|
||||||
/**
|
/**
|
||||||
* Singleton for enforcing just one HTML Purifier in your system
|
* Singleton for enforcing just one HTML Purifier in your system
|
||||||
|
* @param $prototype Optional prototype HTMLPurifier instance to
|
||||||
|
* overload singleton with.
|
||||||
*/
|
*/
|
||||||
function &getInstance($prototype = null) {
|
static function &getInstance($prototype = null) {
|
||||||
static $htmlpurifier;
|
static $htmlpurifier;
|
||||||
if (!$htmlpurifier || $prototype) {
|
if (!$htmlpurifier || $prototype) {
|
||||||
if (is_a($prototype, 'HTMLPurifier')) {
|
if ($prototype instanceof HTMLPurifier) {
|
||||||
$htmlpurifier = $prototype;
|
$htmlpurifier = $prototype;
|
||||||
} elseif ($prototype) {
|
} elseif ($prototype) {
|
||||||
$htmlpurifier = new HTMLPurifier(HTMLPurifier_Config::create($prototype));
|
$htmlpurifier = new HTMLPurifier($prototype);
|
||||||
} else {
|
} else {
|
||||||
$htmlpurifier = new HTMLPurifier();
|
$htmlpurifier = new HTMLPurifier();
|
||||||
}
|
}
|
||||||
|
@@ -38,19 +38,24 @@ class HTMLPurifier_AttrDef_CSS_FontFamily extends HTMLPurifier_AttrDef
|
|||||||
$quote = $font[0];
|
$quote = $font[0];
|
||||||
if ($font[$length - 1] !== $quote) continue;
|
if ($font[$length - 1] !== $quote) continue;
|
||||||
$font = substr($font, 1, $length - 2);
|
$font = substr($font, 1, $length - 2);
|
||||||
|
// double-backslash processing is buggy
|
||||||
|
$font = str_replace("\\$quote", $quote, $font); // de-escape quote
|
||||||
|
$font = str_replace("\\\n", "\n", $font); // de-escape newlines
|
||||||
}
|
}
|
||||||
// process font
|
// $font is a pure representation of the font name
|
||||||
|
|
||||||
if (ctype_alnum($font)) {
|
if (ctype_alnum($font)) {
|
||||||
// very simple font, allow it in unharmed
|
// very simple font, allow it in unharmed
|
||||||
$final .= $font . ', ';
|
$final .= $font . ', ';
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
$nospace = str_replace(array(' ', '.', '!'), '', $font);
|
|
||||||
if (ctype_alnum($nospace)) {
|
// complicated font, requires quoting
|
||||||
// font with spaces in it
|
|
||||||
|
// armor single quotes and new lines
|
||||||
|
$font = str_replace("'", "\\'", $font);
|
||||||
|
$font = str_replace("\n", "\\\n", $font);
|
||||||
$final .= "'$font', ";
|
$final .= "'$font', ";
|
||||||
continue;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
$final = rtrim($final, ', ');
|
$final = rtrim($final, ', ');
|
||||||
if ($final === '') return false;
|
if ($final === '') return false;
|
||||||
|
@@ -15,7 +15,7 @@ class HTMLPurifier_AttrDef_CSS_URI extends HTMLPurifier_AttrDef_URI
|
|||||||
{
|
{
|
||||||
|
|
||||||
function HTMLPurifier_AttrDef_CSS_URI() {
|
function HTMLPurifier_AttrDef_CSS_URI() {
|
||||||
$this->HTMLPurifier_AttrDef_URI(true); // always embedded
|
parent::HTMLPurifier_AttrDef_URI(true); // always embedded
|
||||||
}
|
}
|
||||||
|
|
||||||
function validate($uri_string, $config, &$context) {
|
function validate($uri_string, $config, &$context) {
|
||||||
|
@@ -1,90 +1,66 @@
|
|||||||
<?php
|
<?php
|
||||||
|
|
||||||
require_once 'HTMLPurifier/AttrDef.php';
|
require_once 'HTMLPurifier/AttrDef.php';
|
||||||
|
require_once 'HTMLPurifier/URIParser.php';
|
||||||
require_once 'HTMLPurifier/URIScheme.php';
|
require_once 'HTMLPurifier/URIScheme.php';
|
||||||
require_once 'HTMLPurifier/URISchemeRegistry.php';
|
require_once 'HTMLPurifier/URISchemeRegistry.php';
|
||||||
require_once 'HTMLPurifier/AttrDef/URI/Host.php';
|
require_once 'HTMLPurifier/AttrDef/URI/Host.php';
|
||||||
require_once 'HTMLPurifier/PercentEncoder.php';
|
require_once 'HTMLPurifier/PercentEncoder.php';
|
||||||
|
require_once 'HTMLPurifier/AttrDef/URI/Email.php';
|
||||||
|
|
||||||
|
// special case filtering directives
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'URI', 'DefaultScheme', 'http', 'string',
|
'URI', 'Munge', null, 'string/null', '
|
||||||
'Defines through what scheme the output will be served, in order to '.
|
<p>
|
||||||
'select the proper object validator when no scheme information is present.'
|
Munges all browsable (usually http, https and ftp)
|
||||||
);
|
absolute URIs into another URI, usually a URI redirection service.
|
||||||
|
This directive accepts a URI, formatted with a <code>%s</code> where
|
||||||
|
the url-encoded original URI should be inserted (sample:
|
||||||
|
<code>http://www.google.com/url?q=%s</code>).
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
Uses for this directive:
|
||||||
|
</p>
|
||||||
|
<ul>
|
||||||
|
<li>
|
||||||
|
Prevent PageRank leaks, while being fairly transparent
|
||||||
|
to users (you may also want to add some client side JavaScript to
|
||||||
|
override the text in the statusbar). <strong>Notice</strong>:
|
||||||
|
Many security experts believe that this form of protection does not deter spam-bots.
|
||||||
|
</li>
|
||||||
|
<li>
|
||||||
|
Redirect users to a splash page telling them they are leaving your
|
||||||
|
website. While this is poor usability practice, it is often mandated
|
||||||
|
in corporate environments.
|
||||||
|
</li>
|
||||||
|
</ul>
|
||||||
|
<p>
|
||||||
|
This directive has been available since 1.3.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
// disabling directives
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'URI', 'Host', null, 'string/null',
|
'URI', 'Disable', false, 'bool', '
|
||||||
'Defines the domain name of the server, so we can determine whether or '.
|
<p>
|
||||||
'an absolute URI is from your website or not. Not strictly necessary, '.
|
Disables all URIs in all forms. Not sure why you\'d want to do that
|
||||||
'as users should be using relative URIs to reference resources on your '.
|
(after all, the Internet\'s founded on the notion of a hyperlink).
|
||||||
'website. It will, however, let you use absolute URIs to link to '.
|
This directive has been available since 1.3.0.
|
||||||
'subdomains of the domain you post here: i.e. example.com will allow '.
|
</p>
|
||||||
'sub.example.com. However, higher up domains will still be excluded: '.
|
');
|
||||||
'if you set %URI.Host to sub.example.com, example.com will be blocked. '.
|
|
||||||
'This directive has been available since 1.2.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'DisableExternal', false, 'bool',
|
|
||||||
'Disables links to external websites. This is a highly effective '.
|
|
||||||
'anti-spam and anti-pagerank-leech measure, but comes at a hefty price: no'.
|
|
||||||
'links or images outside of your domain will be allowed. Non-linkified '.
|
|
||||||
'URIs will still be preserved. If you want to be able to link to '.
|
|
||||||
'subdomains or use absolute URIs, specify %URI.Host for your website. '.
|
|
||||||
'This directive has been available since 1.2.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'DisableExternalResources', false, 'bool',
|
|
||||||
'Disables the embedding of external resources, preventing users from '.
|
|
||||||
'embedding things like images from other hosts. This prevents '.
|
|
||||||
'access tracking (good for email viewers), bandwidth leeching, '.
|
|
||||||
'cross-site request forging, goatse.cx posting, and '.
|
|
||||||
'other nasties, but also results in '.
|
|
||||||
'a loss of end-user functionality (they can\'t directly post a pic '.
|
|
||||||
'they posted from Flickr anymore). Use it if you don\'t have a '.
|
|
||||||
'robust user-content moderation team. This directive has been '.
|
|
||||||
'available since 1.3.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'DisableResources', false, 'bool',
|
|
||||||
'Disables embedding resources, essentially meaning no pictures. You can '.
|
|
||||||
'still link to them though. See %URI.DisableExternalResources for why '.
|
|
||||||
'this might be a good idea. This directive has been available since 1.3.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'Munge', null, 'string/null',
|
|
||||||
'Munges all browsable (usually http, https and ftp) URI\'s into some URL '.
|
|
||||||
'redirection service. Pass this directive a URI, with %s inserted where '.
|
|
||||||
'the url-encoded original URI should be inserted (sample: '.
|
|
||||||
'<code>http://www.google.com/url?q=%s</code>). '.
|
|
||||||
'This prevents PageRank leaks, while being as transparent as possible '.
|
|
||||||
'to users (you may also want to add some client side JavaScript to '.
|
|
||||||
'override the text in the statusbar). Warning: many security experts '.
|
|
||||||
'believe that this form of protection does not deter spam-bots. '.
|
|
||||||
'You can also use this directive to redirect users to a splash page '.
|
|
||||||
'telling them they are leaving your website. '.
|
|
||||||
'This directive has been available since 1.3.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'HostBlacklist', array(), 'list',
|
|
||||||
'List of strings that are forbidden in the host of any URI. Use it to '.
|
|
||||||
'kill domain names of spam, etc. Note that it will catch anything in '.
|
|
||||||
'the domain, so <tt>moo.com</tt> will catch <tt>moo.com.example.com</tt>. '.
|
|
||||||
'This directive has been available since 1.3.0.'
|
|
||||||
);
|
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'URI', 'Disable', false, 'bool',
|
|
||||||
'Disables all URIs in all forms. Not sure why you\'d want to do that '.
|
|
||||||
'(after all, the Internet\'s founded on the notion of a hyperlink). '.
|
|
||||||
'This directive has been available since 1.3.0.'
|
|
||||||
);
|
|
||||||
HTMLPurifier_ConfigSchema::defineAlias('Attr', 'DisableURI', 'URI', 'Disable');
|
HTMLPurifier_ConfigSchema::defineAlias('Attr', 'DisableURI', 'URI', 'Disable');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DisableResources', false, 'bool', '
|
||||||
|
<p>
|
||||||
|
Disables embedding resources, essentially meaning no pictures. You can
|
||||||
|
still link to them though. See %URI.DisableExternalResources for why
|
||||||
|
this might be a good idea. This directive has been available since 1.3.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Validates a URI as defined by RFC 3986.
|
* Validates a URI as defined by RFC 3986.
|
||||||
* @note Scheme-specific mechanics deferred to HTMLPurifier_URIScheme
|
* @note Scheme-specific mechanics deferred to HTMLPurifier_URIScheme
|
||||||
@@ -92,214 +68,83 @@ HTMLPurifier_ConfigSchema::defineAlias('Attr', 'DisableURI', 'URI', 'Disable');
|
|||||||
class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
|
class HTMLPurifier_AttrDef_URI extends HTMLPurifier_AttrDef
|
||||||
{
|
{
|
||||||
|
|
||||||
var $host;
|
var $parser, $percentEncoder;
|
||||||
var $embeds_resource;
|
var $embedsResource;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* @param $embeds_resource Does the URI here result in an extra HTTP request?
|
* @param $embeds_resource Does the URI here result in an extra HTTP request?
|
||||||
*/
|
*/
|
||||||
function HTMLPurifier_AttrDef_URI($embeds_resource = false) {
|
function HTMLPurifier_AttrDef_URI($embeds_resource = false) {
|
||||||
$this->host = new HTMLPurifier_AttrDef_URI_Host();
|
$this->parser = new HTMLPurifier_URIParser();
|
||||||
$this->embeds_resource = (bool) $embeds_resource;
|
$this->percentEncoder = new HTMLPurifier_PercentEncoder();
|
||||||
|
$this->embedsResource = (bool) $embeds_resource;
|
||||||
}
|
}
|
||||||
|
|
||||||
function validate($uri, $config, &$context) {
|
function validate($uri, $config, &$context) {
|
||||||
|
|
||||||
static $PercentEncoder = null;
|
|
||||||
if ($PercentEncoder === null) $PercentEncoder = new HTMLPurifier_PercentEncoder();
|
|
||||||
|
|
||||||
// We'll write stack-based parsers later, for now, use regexps to
|
|
||||||
// get things working as fast as possible (irony)
|
|
||||||
|
|
||||||
if ($config->get('URI', 'Disable')) return false;
|
if ($config->get('URI', 'Disable')) return false;
|
||||||
|
|
||||||
// parse as CDATA
|
// initial operations
|
||||||
$uri = $this->parseCDATA($uri);
|
$uri = $this->parseCDATA($uri);
|
||||||
|
$uri = $this->percentEncoder->normalize($uri);
|
||||||
|
|
||||||
// fix up percent-encoding
|
// parse the URI
|
||||||
$uri = $PercentEncoder->normalize($uri);
|
$uri = $this->parser->parse($uri);
|
||||||
|
if ($uri === false) return false;
|
||||||
|
|
||||||
// while it would be nice to use parse_url(), that's specifically
|
// add embedded flag to context for validators
|
||||||
// for HTTP and thus won't work for our generic URI parsing
|
$context->register('EmbeddedURI', $this->embedsResource);
|
||||||
|
|
||||||
// according to the RFC... (but this cuts corners, i.e. non-validating)
|
$ok = false;
|
||||||
$r_URI = '!'.
|
do {
|
||||||
'(([^:/?#<>\'"]+):)?'. // 2. Scheme
|
|
||||||
'(//([^/?#<>\'"]*))?'. // 4. Authority
|
|
||||||
'([^?#<>\'"]*)'. // 5. Path
|
|
||||||
'(\?([^#<>\'"]*))?'. // 7. Query
|
|
||||||
'(#([^<>\'"]*))?'. // 8. Fragment
|
|
||||||
'!';
|
|
||||||
|
|
||||||
$matches = array();
|
// generic validation
|
||||||
$result = preg_match($r_URI, $uri, $matches);
|
$result = $uri->validate($config, $context);
|
||||||
|
if (!$result) break;
|
||||||
|
|
||||||
if (!$result) return false; // invalid URI
|
// chained filtering
|
||||||
|
$uri_def =& $config->getDefinition('URI');
|
||||||
|
$result = $uri_def->filter($uri, $config, $context);
|
||||||
|
if (!$result) break;
|
||||||
|
|
||||||
// seperate out parts
|
// scheme-specific validation
|
||||||
$scheme = !empty($matches[1]) ? $matches[2] : null;
|
$scheme_obj = $uri->getSchemeObj($config, $context);
|
||||||
$authority = !empty($matches[3]) ? $matches[4] : null;
|
if (!$scheme_obj) break;
|
||||||
$path = $matches[5]; // always present, can be empty
|
if ($this->embedsResource && !$scheme_obj->browsable) break;
|
||||||
$query = !empty($matches[6]) ? $matches[7] : null;
|
$result = $scheme_obj->validate($uri, $config, $context);
|
||||||
$fragment = !empty($matches[8]) ? $matches[9] : null;
|
if (!$result) break;
|
||||||
|
|
||||||
|
// survived gauntlet
|
||||||
|
$ok = true;
|
||||||
|
|
||||||
|
} while (false);
|
||||||
|
|
||||||
$registry =& HTMLPurifier_URISchemeRegistry::instance();
|
$context->destroy('EmbeddedURI');
|
||||||
if ($scheme !== null) {
|
if (!$ok) return false;
|
||||||
// no need to validate the scheme's fmt since we do that when we
|
|
||||||
// retrieve the specific scheme object from the registry
|
// munge scheme off if necessary (this must be last)
|
||||||
$scheme = ctype_lower($scheme) ? $scheme : strtolower($scheme);
|
if (!is_null($uri->scheme) && is_null($uri->host)) {
|
||||||
$scheme_obj = $registry->getScheme($scheme, $config, $context);
|
if ($uri_def->defaultScheme == $uri->scheme) {
|
||||||
if (!$scheme_obj) return false; // invalid scheme, clean it out
|
$uri->scheme = null;
|
||||||
} else {
|
}
|
||||||
$scheme_obj = $registry->getScheme(
|
|
||||||
$config->get('URI', 'DefaultScheme'), $config, $context
|
|
||||||
);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// something funky weird happened in the registry, abort!
|
// back to string
|
||||||
if (!$scheme_obj) {
|
$result = $uri->toString();
|
||||||
trigger_error(
|
|
||||||
'Default scheme object "' . $config->get('URI', 'DefaultScheme') . '" was not readable',
|
|
||||||
E_USER_WARNING
|
|
||||||
);
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
// the URI we're processing embeds_resource a resource in the page, but the URI
|
// munge entire URI if necessary
|
||||||
// it references cannot be located
|
|
||||||
if ($this->embeds_resource && !$scheme_obj->browsable) {
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
if ($authority !== null) {
|
|
||||||
|
|
||||||
// remove URI if it's absolute and we disabled externals or
|
|
||||||
// if it's absolute and embedded and we disabled external resources
|
|
||||||
unset($our_host);
|
|
||||||
if (
|
if (
|
||||||
$config->get('URI', 'DisableExternal') ||
|
!is_null($uri->host) && // indicator for authority
|
||||||
(
|
!empty($scheme_obj->browsable) &&
|
||||||
$config->get('URI', 'DisableExternalResources') &&
|
!is_null($munge = $config->get('URI', 'Munge'))
|
||||||
$this->embeds_resource
|
|
||||||
)
|
|
||||||
) {
|
) {
|
||||||
$our_host = $config->get('URI', 'Host');
|
|
||||||
if ($our_host === null) return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
$HEXDIG = '[A-Fa-f0-9]';
|
|
||||||
$unreserved = 'A-Za-z0-9-._~'; // make sure you wrap with []
|
|
||||||
$sub_delims = '!$&\'()'; // needs []
|
|
||||||
$pct_encoded = "%$HEXDIG$HEXDIG";
|
|
||||||
$r_userinfo = "(?:[$unreserved$sub_delims:]|$pct_encoded)*";
|
|
||||||
$r_authority = "/^(($r_userinfo)@)?(\[[^\]]+\]|[^:]*)(:(\d*))?/";
|
|
||||||
$matches = array();
|
|
||||||
preg_match($r_authority, $authority, $matches);
|
|
||||||
// overloads regexp!
|
|
||||||
$userinfo = !empty($matches[1]) ? $matches[2] : null;
|
|
||||||
$host = !empty($matches[3]) ? $matches[3] : null;
|
|
||||||
$port = !empty($matches[4]) ? $matches[5] : null;
|
|
||||||
|
|
||||||
// validate port
|
|
||||||
if ($port !== null) {
|
|
||||||
$port = (int) $port;
|
|
||||||
if ($port < 1 || $port > 65535) $port = null;
|
|
||||||
}
|
|
||||||
|
|
||||||
$host = $this->host->validate($host, $config, $context);
|
|
||||||
if ($host === false) $host = null;
|
|
||||||
|
|
||||||
if ($this->checkBlacklist($host, $config, $context)) return false;
|
|
||||||
|
|
||||||
// more lenient absolute checking
|
|
||||||
if (isset($our_host)) {
|
|
||||||
$host_parts = array_reverse(explode('.', $host));
|
|
||||||
// could be cached
|
|
||||||
$our_host_parts = array_reverse(explode('.', $our_host));
|
|
||||||
foreach ($our_host_parts as $i => $discard) {
|
|
||||||
if (!isset($host_parts[$i])) return false;
|
|
||||||
if ($host_parts[$i] != $our_host_parts[$i]) return false;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// userinfo and host are validated within the regexp
|
|
||||||
|
|
||||||
} else {
|
|
||||||
$port = $host = $userinfo = null;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
// query and fragment are quite simple in terms of definition:
|
|
||||||
// *( pchar / "/" / "?" ), so define their validation routines
|
|
||||||
// when we start fixing percent encoding
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
// path gets to be validated against a hodge-podge of rules depending
|
|
||||||
// on the status of authority and scheme, but it's not that important,
|
|
||||||
// esp. since it won't be applicable to everyone
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
// okay, now we defer execution to the subobject for more processing
|
|
||||||
// note that $fragment is omitted
|
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
|
||||||
$scheme_obj->validateComponents(
|
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context
|
|
||||||
);
|
|
||||||
|
|
||||||
|
|
||||||
// reconstruct authority
|
|
||||||
$authority = null;
|
|
||||||
if (!is_null($userinfo) || !is_null($host) || !is_null($port)) {
|
|
||||||
$authority = '';
|
|
||||||
if($userinfo !== null) $authority .= $userinfo . '@';
|
|
||||||
$authority .= $host;
|
|
||||||
if($port !== null) $authority .= ':' . $port;
|
|
||||||
}
|
|
||||||
|
|
||||||
// reconstruct the result
|
|
||||||
$result = '';
|
|
||||||
if ($scheme !== null) $result .= "$scheme:";
|
|
||||||
if ($authority !== null) $result .= "//$authority";
|
|
||||||
$result .= $path;
|
|
||||||
if ($query !== null) $result .= "?$query";
|
|
||||||
if ($fragment !== null) $result .= "#$fragment";
|
|
||||||
|
|
||||||
// munge if necessary
|
|
||||||
$munge = $config->get('URI', 'Munge');
|
|
||||||
if (!empty($scheme_obj->browsable) && $munge !== null) {
|
|
||||||
if ($authority !== null) {
|
|
||||||
$result = str_replace('%s', rawurlencode($result), $munge);
|
$result = str_replace('%s', rawurlencode($result), $munge);
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
return $result;
|
return $result;
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
|
||||||
* Checks a host against an array blacklist
|
|
||||||
* @param $host Host to check
|
|
||||||
* @param $config HTMLPurifier_Config instance
|
|
||||||
* @param $context HTMLPurifier_Context instance
|
|
||||||
* @return bool Is spam?
|
|
||||||
*/
|
|
||||||
function checkBlacklist($host, &$config, &$context) {
|
|
||||||
$blacklist = $config->get('URI', 'HostBlacklist');
|
|
||||||
if (!empty($blacklist)) {
|
|
||||||
foreach($blacklist as $blacklisted_host_fragment) {
|
|
||||||
if (strpos($host, $blacklisted_host_fragment) !== false) {
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
}
|
}
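The munging step near the end of validate() comes down to a single str_replace() call. The sketch below isolates it as a stand-alone function to show what a %URI.Munge value does to a URI; `munge_uri()` is a hypothetical name, and the sample URIs are illustrative only:

```php
<?php
// Hypothetical stand-alone helper mirroring the str_replace() call in
// validate(); it is not part of HTML Purifier's API.
function munge_uri($uri, $munge) {
    // a null directive means munging is turned off
    if ($munge === null) return $uri;
    // the original URI is url-encoded and substituted for %s
    return str_replace('%s', rawurlencode($uri), $munge);
}

$munged   = munge_uri('http://example.com/?a=b', 'http://www.google.com/url?q=%s');
$untouched = munge_uri('http://example.com/?a=b', null);

echo $munged, "\n";
```

In the actual method the substitution is applied only when the URI has a host and its scheme is browsable, so relative links are left alone.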
|
||||||
|
|
||||||
|
|
||||||
|
@@ -14,3 +14,5 @@ class HTMLPurifier_AttrDef_URI_Email extends HTMLPurifier_AttrDef
|
|||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// sub-implementations
|
||||||
|
require_once 'HTMLPurifier/AttrDef/URI/Email/SimpleCheck.php';
|
||||||
|
@@ -44,6 +44,9 @@ class HTMLPurifier_AttrTypes
|
|||||||
$this->info['LanguageCode'] = new HTMLPurifier_AttrDef_Lang();
|
$this->info['LanguageCode'] = new HTMLPurifier_AttrDef_Lang();
|
||||||
$this->info['Color'] = new HTMLPurifier_AttrDef_HTML_Color();
|
$this->info['Color'] = new HTMLPurifier_AttrDef_HTML_Color();
|
||||||
|
|
||||||
|
// unimplemented aliases
|
||||||
|
$this->info['ContentType'] = new HTMLPurifier_AttrDef_Text();
|
||||||
|
|
||||||
// number is really a positive integer (one or more digits)
|
// number is really a positive integer (one or more digits)
|
||||||
// FIXME: ^^ not always, see start and value of list items
|
// FIXME: ^^ not always, see start and value of list items
|
||||||
$this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true);
|
$this->info['Number'] = new HTMLPurifier_AttrDef_Integer(false, false, true);
|
||||||
|
@@ -23,6 +23,13 @@ class HTMLPurifier_AttrValidator
|
|||||||
$definition = $config->getHTMLDefinition();
|
$definition = $config->getHTMLDefinition();
|
||||||
$e =& $context->get('ErrorCollector', true);
|
$e =& $context->get('ErrorCollector', true);
|
||||||
|
|
||||||
|
// initialize IDAccumulator if necessary
|
||||||
|
$ok =& $context->get('IDAccumulator', true);
|
||||||
|
if (!$ok) {
|
||||||
|
$id_accumulator = HTMLPurifier_IDAccumulator::build($config, $context);
|
||||||
|
$context->register('IDAccumulator', $id_accumulator);
|
||||||
|
}
|
||||||
|
|
||||||
// initialize CurrentToken if necessary
|
// initialize CurrentToken if necessary
|
||||||
$current_token =& $context->get('CurrentToken', true);
|
$current_token =& $context->get('CurrentToken', true);
|
||||||
if (!$current_token) $context->register('CurrentToken', $token);
|
if (!$current_token) $context->register('CurrentToken', $token);
|
||||||
|
@@ -204,7 +204,7 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
|
|||||||
$this->info['border-right'] = new HTMLPurifier_AttrDef_CSS_Border($config);
|
$this->info['border-right'] = new HTMLPurifier_AttrDef_CSS_Border($config);
|
||||||
|
|
||||||
$this->info['border-collapse'] = new HTMLPurifier_AttrDef_Enum(array(
|
$this->info['border-collapse'] = new HTMLPurifier_AttrDef_Enum(array(
|
||||||
'collapse', 'seperate'));
|
'collapse', 'separate'));
|
||||||
|
|
||||||
$this->info['caption-side'] = new HTMLPurifier_AttrDef_Enum(array(
|
$this->info['caption-side'] = new HTMLPurifier_AttrDef_Enum(array(
|
||||||
'top', 'bottom'));
|
'top', 'bottom'));
|
||||||
@@ -219,6 +219,8 @@ class HTMLPurifier_CSSDefinition extends HTMLPurifier_Definition
|
|||||||
new HTMLPurifier_AttrDef_CSS_Percentage()
|
new HTMLPurifier_AttrDef_CSS_Percentage()
|
||||||
));
|
));
|
||||||
|
|
||||||
|
$this->info['border-spacing'] = new HTMLPurifier_AttrDef_CSS_Multiple(new HTMLPurifier_AttrDef_CSS_Length(), 2);
|
||||||
|
|
||||||
// partial support
|
// partial support
|
||||||
$this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap'));
|
$this->info['white-space'] = new HTMLPurifier_AttrDef_Enum(array('nowrap'));
|
||||||
|
|
||||||
|
@@ -15,7 +15,10 @@ class HTMLPurifier_ChildDef_Optional extends HTMLPurifier_ChildDef_Required
|
|||||||
var $type = 'optional';
|
var $type = 'optional';
|
||||||
function validateChildren($tokens_of_children, $config, &$context) {
|
function validateChildren($tokens_of_children, $config, &$context) {
|
||||||
$result = parent::validateChildren($tokens_of_children, $config, $context);
|
$result = parent::validateChildren($tokens_of_children, $config, $context);
|
||||||
if ($result === false) return array();
|
if ($result === false) {
|
||||||
|
if (empty($tokens_of_children)) return true;
|
||||||
|
else return array();
|
||||||
|
}
|
||||||
return $result;
|
return $result;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
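Review note: the new branch in `validateChildren` can be tried in isolation. The following is a minimal sketch, not the library's classes. `validate_required` and `validate_optional` are hypothetical stand-ins, and the return convention (`false` = no valid children, `true` = leave as-is, array = replacement list) is an assumption inferred from this hunk.

```php
<?php
// Hypothetical stand-in for the parent "required" check: returns false
// when no valid children remain, otherwise the filtered child list.
function validate_required(array $children, array $allowed) {
    $kept = array();
    foreach ($children as $child) {
        if (isset($allowed[$child])) $kept[] = $child;
    }
    if (empty($kept)) return false;
    return $kept;
}

// Mirrors the new branch in the hunk: an already-empty child list is
// accepted unchanged (true); a failed non-empty list becomes array().
function validate_optional(array $children, array $allowed) {
    $result = validate_required($children, $allowed);
    if ($result === false) {
        if (empty($children)) return true;
        else return array();
    }
    return $result;
}

$allowed = array('b' => true, 'i' => true);
var_dump(validate_optional(array(), $allowed));         // bool(true)
var_dump(validate_optional(array('script'), $allowed)); // empty array
var_dump(validate_optional(array('b', 'x'), $allowed)); // array containing 'b'
```

The practical effect appears to be that an element with no children is no longer reported as changed, since `true` is returned instead of a fresh empty array.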
|
@@ -5,6 +5,7 @@ require_once 'HTMLPurifier/ConfigSchema.php';
|
|||||||
// member variables
|
// member variables
|
||||||
require_once 'HTMLPurifier/HTMLDefinition.php';
|
require_once 'HTMLPurifier/HTMLDefinition.php';
|
||||||
require_once 'HTMLPurifier/CSSDefinition.php';
|
require_once 'HTMLPurifier/CSSDefinition.php';
|
||||||
|
require_once 'HTMLPurifier/URIDefinition.php';
|
||||||
require_once 'HTMLPurifier/Doctype.php';
|
require_once 'HTMLPurifier/Doctype.php';
|
||||||
require_once 'HTMLPurifier/DefinitionCacheFactory.php';
|
require_once 'HTMLPurifier/DefinitionCacheFactory.php';
|
||||||
|
|
||||||
@@ -41,7 +42,7 @@ class HTMLPurifier_Config
|
|||||||
/**
|
/**
|
||||||
* HTML Purifier's version
|
* HTML Purifier's version
|
||||||
*/
|
*/
|
||||||
var $version = '2.0.1';
|
var $version = '2.1.3';
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Two-level associative array of configuration directives
|
* Two-level associative array of configuration directives
|
||||||
@@ -75,6 +76,11 @@ class HTMLPurifier_Config
|
|||||||
*/
|
*/
|
||||||
var $serials = array();
|
var $serials = array();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Serial for entire configuration object
|
||||||
|
*/
|
||||||
|
var $serial;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* @param $definition HTMLPurifier_ConfigSchema that defines what directives
|
* @param $definition HTMLPurifier_ConfigSchema that defines what directives
|
||||||
* are allowed.
|
* are allowed.
|
||||||
@@ -98,7 +104,6 @@ class HTMLPurifier_Config
|
|||||||
$ret = HTMLPurifier_Config::createDefault();
|
$ret = HTMLPurifier_Config::createDefault();
|
||||||
if (is_string($config)) $ret->loadIni($config);
|
if (is_string($config)) $ret->loadIni($config);
|
||||||
elseif (is_array($config)) $ret->loadArray($config);
|
elseif (is_array($config)) $ret->loadArray($config);
|
||||||
if (isset($revision)) $ret->revision = $revision;
|
|
||||||
return $ret;
|
return $ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -165,6 +170,17 @@ class HTMLPurifier_Config
|
|||||||
return $this->serials[$namespace];
|
return $this->serials[$namespace];
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a md5 signature for the entire configuration object
|
||||||
|
* that uniquely identifies that particular configuration
|
||||||
|
*/
|
||||||
|
function getSerial() {
|
||||||
|
if (empty($this->serial)) {
|
||||||
|
$this->serial = md5(serialize($this->getAll()));
|
||||||
|
}
|
||||||
|
return $this->serial;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Retrieves all directives, organized by namespace
|
* Retrieves all directives, organized by namespace
|
||||||
*/
|
*/
|
||||||
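Review note: the caching idea behind `getSerial()` can be sketched as follows. `DemoConfig` is a hypothetical miniature used only for illustration; it reproduces the memoized `md5(serialize(...))` call from the hunk and nothing else from `HTMLPurifier_Config`.

```php
<?php
// Hypothetical miniature of the config object; only the caching idea
// from getSerial() is reproduced here.
class DemoConfig {
    private $conf;
    private $serial;

    public function __construct(array $conf) {
        $this->conf = $conf;
    }

    public function getAll() {
        return $this->conf;
    }

    // Computes the md5 of the serialized directives once, then reuses it.
    public function getSerial() {
        if (empty($this->serial)) {
            $this->serial = md5(serialize($this->getAll()));
        }
        return $this->serial;
    }
}

$a = new DemoConfig(array('HTML' => array('Allowed' => 'b,i')));
$b = new DemoConfig(array('HTML' => array('Allowed' => 'b,i')));
$c = new DemoConfig(array('HTML' => array('Allowed' => 'b')));

var_dump($a->getSerial() === $b->getSerial()); // bool(true)
var_dump($a->getSerial() === $c->getSerial()); // bool(false)
```

Because the value is cached, a directive changed after the first call would not alter the serial in this sketch. Whether the real class resets the cached value when directives change is not visible in this hunk.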
@@ -295,6 +311,8 @@ class HTMLPurifier_Config
|
|||||||
$this->definitions[$type] = new HTMLPurifier_HTMLDefinition();
|
$this->definitions[$type] = new HTMLPurifier_HTMLDefinition();
|
||||||
} elseif ($type == 'CSS') {
|
} elseif ($type == 'CSS') {
|
||||||
$this->definitions[$type] = new HTMLPurifier_CSSDefinition();
|
$this->definitions[$type] = new HTMLPurifier_CSSDefinition();
|
||||||
|
} elseif ($type == 'URI') {
|
||||||
|
$this->definitions[$type] = new HTMLPurifier_URIDefinition();
|
||||||
} else {
|
} else {
|
||||||
trigger_error("Definition of $type type not supported");
|
trigger_error("Definition of $type type not supported");
|
||||||
$false = false;
|
$false = false;
|
||||||
@@ -393,6 +411,26 @@ class HTMLPurifier_Config
|
|||||||
* @static
|
* @static
|
||||||
*/
|
*/
|
||||||
static function loadArrayFromForm($array, $index, $allowed = true, $mq_fix = true) {
|
static function loadArrayFromForm($array, $index, $allowed = true, $mq_fix = true) {
|
||||||
|
$ret = HTMLPurifier_Config::prepareArrayFromForm($array, $index, $allowed, $mq_fix);
|
||||||
|
$config = HTMLPurifier_Config::create($ret);
|
||||||
|
return $config;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Merges in configuration values from $_GET/$_POST to object. NOT STATIC.
|
||||||
|
* @note Same parameters as loadArrayFromForm
|
||||||
|
*/
|
||||||
|
function mergeArrayFromForm($array, $index, $allowed = true, $mq_fix = true) {
|
||||||
|
$ret = HTMLPurifier_Config::prepareArrayFromForm($array, $index, $allowed, $mq_fix);
|
||||||
|
$this->loadArray($ret);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Prepares an array from a form into something usable for the more
|
||||||
|
* strict parts of HTMLPurifier_Config
|
||||||
|
* @static
|
||||||
|
*/
|
||||||
|
static function prepareArrayFromForm($array, $index, $allowed = true, $mq_fix = true) {
|
||||||
$array = (isset($array[$index]) && is_array($array[$index])) ? $array[$index] : array();
|
$array = (isset($array[$index]) && is_array($array[$index])) ? $array[$index] : array();
|
||||||
$mq = get_magic_quotes_gpc() && $mq_fix;
|
$mq = get_magic_quotes_gpc() && $mq_fix;
|
||||||
|
|
||||||
@@ -409,9 +447,7 @@ class HTMLPurifier_Config
|
|||||||
$value = $mq ? stripslashes($array[$skey]) : $array[$skey];
|
$value = $mq ? stripslashes($array[$skey]) : $array[$skey];
|
||||||
$ret[$ns][$directive] = $value;
|
$ret[$ns][$directive] = $value;
|
||||||
}
|
}
|
||||||
|
return $ret;
|
||||||
$config = HTMLPurifier_Config::create($ret);
|
|
||||||
return $config;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@@ -6,6 +6,8 @@ require_once 'HTMLPurifier/ConfigDef/Namespace.php';
|
|||||||
require_once 'HTMLPurifier/ConfigDef/Directive.php';
|
require_once 'HTMLPurifier/ConfigDef/Directive.php';
|
||||||
require_once 'HTMLPurifier/ConfigDef/DirectiveAlias.php';
|
require_once 'HTMLPurifier/ConfigDef/DirectiveAlias.php';
|
||||||
|
|
||||||
|
if (!defined('HTMLPURIFIER_SCHEMA_STRICT')) define('HTMLPURIFIER_SCHEMA_STRICT', false);
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Configuration definition, defines directives and their defaults.
|
* Configuration definition, defines directives and their defaults.
|
||||||
* @note If you update this, please update Printer_ConfigForm
|
* @note If you update this, please update Printer_ConfigForm
|
||||||
@@ -49,6 +51,8 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
var $types = array(
|
var $types = array(
|
||||||
'string' => 'String',
|
'string' => 'String',
|
||||||
'istring' => 'Case-insensitive string',
|
'istring' => 'Case-insensitive string',
|
||||||
|
'text' => 'Text',
|
||||||
|
'itext' => 'Case-insensitive text',
|
||||||
'int' => 'Integer',
|
'int' => 'Integer',
|
||||||
'float' => 'Float',
|
'float' => 'Float',
|
||||||
'bool' => 'Boolean',
|
'bool' => 'Boolean',
|
||||||
@@ -100,11 +104,11 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
* HTMLPurifier_DirectiveDef::$type for allowed values
|
* HTMLPurifier_DirectiveDef::$type for allowed values
|
||||||
* @param $description Description of directive for documentation
|
* @param $description Description of directive for documentation
|
||||||
*/
|
*/
|
||||||
static function define(
|
static function define($namespace, $name, $default, $type, $description) {
|
||||||
$namespace, $name, $default, $type,
|
|
||||||
$description
|
|
||||||
) {
|
|
||||||
$def =& HTMLPurifier_ConfigSchema::instance();
|
$def =& HTMLPurifier_ConfigSchema::instance();
|
||||||
|
|
||||||
|
// basic sanity checks
|
||||||
|
if (HTMLPURIFIER_SCHEMA_STRICT) {
|
||||||
if (!isset($def->info[$namespace])) {
|
if (!isset($def->info[$namespace])) {
|
||||||
trigger_error('Cannot define directive for undefined namespace',
|
trigger_error('Cannot define directive for undefined namespace',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
@@ -120,7 +124,10 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (isset($def->info[$namespace][$name])) {
|
if (isset($def->info[$namespace][$name])) {
|
||||||
|
// already defined
|
||||||
if (
|
if (
|
||||||
$def->info[$namespace][$name]->type !== $type ||
|
$def->info[$namespace][$name]->type !== $type ||
|
||||||
$def->defaults[$namespace][$name] !== $default
|
$def->defaults[$namespace][$name] !== $default
|
||||||
@@ -129,12 +136,15 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
} else {
|
} else {
|
||||||
// process modifiers
|
// needs defining
|
||||||
|
|
||||||
|
// process modifiers (OPTIMIZE!)
|
||||||
$type_values = explode('/', $type, 2);
|
$type_values = explode('/', $type, 2);
|
||||||
$type = $type_values[0];
|
$type = $type_values[0];
|
||||||
$modifier = isset($type_values[1]) ? $type_values[1] : false;
|
$modifier = isset($type_values[1]) ? $type_values[1] : false;
|
||||||
$allow_null = ($modifier === 'null');
|
$allow_null = ($modifier === 'null');
|
||||||
|
|
||||||
|
if (HTMLPURIFIER_SCHEMA_STRICT) {
|
||||||
if (!isset($def->types[$type])) {
|
if (!isset($def->types[$type])) {
|
||||||
trigger_error('Invalid type for configuration directive',
|
trigger_error('Invalid type for configuration directive',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
@@ -146,12 +156,15 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
|
|
||||||
$def->info[$namespace][$name] =
|
$def->info[$namespace][$name] =
|
||||||
new HTMLPurifier_ConfigDef_Directive();
|
new HTMLPurifier_ConfigDef_Directive();
|
||||||
$def->info[$namespace][$name]->type = $type;
|
$def->info[$namespace][$name]->type = $type;
|
||||||
$def->info[$namespace][$name]->allow_null = $allow_null;
|
$def->info[$namespace][$name]->allow_null = $allow_null;
|
||||||
$def->defaults[$namespace][$name] = $default;
|
$def->defaults[$namespace][$name] = $default;
|
||||||
}
|
}
|
||||||
|
if (!HTMLPURIFIER_SCHEMA_STRICT) return;
|
||||||
$backtrace = debug_backtrace();
|
$backtrace = debug_backtrace();
|
||||||
$file = $def->mungeFilename($backtrace[0]['file']);
|
$file = $def->mungeFilename($backtrace[0]['file']);
|
||||||
$line = $backtrace[0]['line'];
|
$line = $backtrace[0]['line'];
|
||||||
@@ -166,6 +179,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
*/
|
*/
|
||||||
static function defineNamespace($namespace, $description) {
|
static function defineNamespace($namespace, $description) {
|
||||||
$def =& HTMLPurifier_ConfigSchema::instance();
|
$def =& HTMLPurifier_ConfigSchema::instance();
|
||||||
|
if (HTMLPURIFIER_SCHEMA_STRICT) {
|
||||||
if (isset($def->info[$namespace])) {
|
if (isset($def->info[$namespace])) {
|
||||||
trigger_error('Cannot redefine namespace', E_USER_ERROR);
|
trigger_error('Cannot redefine namespace', E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
@@ -180,6 +194,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
$def->info[$namespace] = array();
|
$def->info[$namespace] = array();
|
||||||
$def->info_namespace[$namespace] = new HTMLPurifier_ConfigDef_Namespace();
|
$def->info_namespace[$namespace] = new HTMLPurifier_ConfigDef_Namespace();
|
||||||
$def->info_namespace[$namespace]->description = $description;
|
$def->info_namespace[$namespace]->description = $description;
|
||||||
@@ -199,12 +214,13 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
*/
|
*/
|
||||||
static function defineValueAliases($namespace, $name, $aliases) {
|
static function defineValueAliases($namespace, $name, $aliases) {
|
||||||
$def =& HTMLPurifier_ConfigSchema::instance();
|
$def =& HTMLPurifier_ConfigSchema::instance();
|
||||||
if (!isset($def->info[$namespace][$name])) {
|
if (HTMLPURIFIER_SCHEMA_STRICT && !isset($def->info[$namespace][$name])) {
|
||||||
trigger_error('Cannot set value alias for non-existant directive',
|
trigger_error('Cannot set value alias for non-existant directive',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
foreach ($aliases as $alias => $real) {
|
foreach ($aliases as $alias => $real) {
|
||||||
|
if (HTMLPURIFIER_SCHEMA_STRICT) {
|
||||||
if (!$def->info[$namespace][$name] !== true &&
|
if (!$def->info[$namespace][$name] !== true &&
|
||||||
!isset($def->info[$namespace][$name]->allowed[$real])
|
!isset($def->info[$namespace][$name]->allowed[$real])
|
||||||
) {
|
) {
|
||||||
@@ -217,6 +233,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
$def->info[$namespace][$name]->aliases[$alias] = $real;
|
$def->info[$namespace][$name]->aliases[$alias] = $real;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -230,14 +247,14 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
*/
|
*/
|
||||||
static function defineAllowedValues($namespace, $name, $allowed_values) {
|
static function defineAllowedValues($namespace, $name, $allowed_values) {
|
||||||
$def =& HTMLPurifier_ConfigSchema::instance();
|
$def =& HTMLPurifier_ConfigSchema::instance();
|
||||||
if (!isset($def->info[$namespace][$name])) {
|
if (HTMLPURIFIER_SCHEMA_STRICT && !isset($def->info[$namespace][$name])) {
|
||||||
trigger_error('Cannot define allowed values for undefined directive',
|
trigger_error('Cannot define allowed values for undefined directive',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
$directive =& $def->info[$namespace][$name];
|
$directive =& $def->info[$namespace][$name];
|
||||||
$type = $directive->type;
|
$type = $directive->type;
|
||||||
if ($type != 'string' && $type != 'istring') {
|
if (HTMLPURIFIER_SCHEMA_STRICT && $type != 'string' && $type != 'istring') {
|
||||||
trigger_error('Cannot define allowed values for directive whose type is not string',
|
trigger_error('Cannot define allowed values for directive whose type is not string',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
@@ -248,8 +265,11 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
foreach ($allowed_values as $value) {
|
foreach ($allowed_values as $value) {
|
||||||
$directive->allowed[$value] = true;
|
$directive->allowed[$value] = true;
|
||||||
}
|
}
|
||||||
if ($def->defaults[$namespace][$name] !== null &&
|
if (
|
||||||
!isset($directive->allowed[$def->defaults[$namespace][$name]])) {
|
HTMLPURIFIER_SCHEMA_STRICT &&
|
||||||
|
$def->defaults[$namespace][$name] !== null &&
|
||||||
|
!isset($directive->allowed[$def->defaults[$namespace][$name]])
|
||||||
|
) {
|
||||||
trigger_error('Default value must be in allowed range of variables',
|
trigger_error('Default value must be in allowed range of variables',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
$directive->allowed = true; // undo undo!
|
$directive->allowed = true; // undo undo!
|
||||||
@@ -267,6 +287,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
*/
|
*/
|
||||||
static function defineAlias($namespace, $name, $new_namespace, $new_name) {
|
static function defineAlias($namespace, $name, $new_namespace, $new_name) {
|
||||||
$def =& HTMLPurifier_ConfigSchema::instance();
|
$def =& HTMLPurifier_ConfigSchema::instance();
|
||||||
|
if (HTMLPURIFIER_SCHEMA_STRICT) {
|
||||||
if (!isset($def->info[$namespace])) {
|
if (!isset($def->info[$namespace])) {
|
||||||
trigger_error('Cannot define directive alias in undefined namespace',
|
trigger_error('Cannot define directive alias in undefined namespace',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
@@ -292,6 +313,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
$def->info[$namespace][$name] =
|
$def->info[$namespace][$name] =
|
||||||
new HTMLPurifier_ConfigDef_DirectiveAlias(
|
new HTMLPurifier_ConfigDef_DirectiveAlias(
|
||||||
$new_namespace, $new_name);
|
$new_namespace, $new_name);
|
||||||
@@ -313,8 +335,10 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
return $var;
|
return $var;
|
||||||
case 'istring':
|
case 'istring':
|
||||||
case 'string':
|
case 'string':
|
||||||
|
case 'text': // no difference, just is longer/multiple line string
|
||||||
|
case 'itext':
|
||||||
if (!is_string($var)) break;
|
if (!is_string($var)) break;
|
||||||
if ($type === 'istring') $var = strtolower($var);
|
if ($type === 'istring' || $type === 'itext') $var = strtolower($var);
|
||||||
return $var;
|
return $var;
|
||||||
case 'int':
|
case 'int':
|
||||||
if (is_string($var) && ctype_digit($var)) $var = (int) $var;
|
if (is_string($var) && ctype_digit($var)) $var = (int) $var;
|
||||||
@@ -345,9 +369,13 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
// a single empty string item, but having an empty
|
// a single empty string item, but having an empty
|
||||||
// array is more intuitive
|
// array is more intuitive
|
||||||
if ($var == '') return array();
|
if ($var == '') return array();
|
||||||
|
if (strpos($var, "\n") === false && strpos($var, "\r") === false) {
|
||||||
// simplistic string to array method that only works
|
// simplistic string to array method that only works
|
||||||
// for simple lists of tag names or alphanumeric characters
|
// for simple lists of tag names or alphanumeric characters
|
||||||
$var = explode(',',$var);
|
$var = explode(',',$var);
|
||||||
|
} else {
|
||||||
|
$var = preg_split('/(,|[\n\r]+)/', $var);
|
||||||
|
}
|
||||||
// remove spaces
|
// remove spaces
|
||||||
foreach ($var as $i => $j) $var[$i] = trim($j);
|
foreach ($var as $i => $j) $var[$i] = trim($j);
|
||||||
if ($type === 'hash') {
|
if ($type === 'hash') {
|
||||||
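Review note: the string-to-list conversion added here can be exercised on its own. This is a sketch under the assumption that the surrounding code behaves as the hunk shows; the function name `parse_list` is hypothetical.

```php
<?php
// Sketch of the string-to-list conversion from the hunk; parse_list is a
// hypothetical name, not a library function.
function parse_list($var) {
    if ($var == '') return array();
    if (strpos($var, "\n") === false && strpos($var, "\r") === false) {
        // single-line input: plain comma split
        $var = explode(',', $var);
    } else {
        // multi-line input: split on commas or runs of line breaks
        $var = preg_split('/(,|[\n\r]+)/', $var);
    }
    // remove surrounding spaces from every item
    foreach ($var as $i => $j) $var[$i] = trim($j);
    return $var;
}

print_r(parse_list("a, b,c"));        // a, b, c
print_r(parse_list("a\nb\r\nc,d"));   // a, b, c, d
print_r(parse_list("a,\nb"));         // a, (empty string), b
```

The last call shows a consequence of the pattern: a comma immediately followed by a line break matches twice, so an empty item is produced between the two separators.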
@@ -388,6 +416,7 @@ class HTMLPurifier_ConfigSchema {
|
|||||||
* Takes an absolute path and munges it into a more manageable relative path
|
* Takes an absolute path and munges it into a more manageable relative path
|
||||||
*/
|
*/
|
||||||
function mungeFilename($filename) {
|
function mungeFilename($filename) {
|
||||||
|
if (!HTMLPURIFIER_SCHEMA_STRICT) return $filename;
|
||||||
$offset = strrpos($filename, 'HTMLPurifier');
|
$offset = strrpos($filename, 'HTMLPurifier');
|
||||||
$filename = substr($filename, $offset);
|
$filename = substr($filename, $offset);
|
||||||
$filename = str_replace('\\', '/', $filename);
|
$filename = str_replace('\\', '/', $filename);
|
||||||
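Review note: the `HTMLPURIFIER_SCHEMA_STRICT` guard pattern used throughout this file can be sketched as below. `DEMO_SCHEMA_STRICT` and `demo_define` are hypothetical names; the sketch raises a warning rather than `E_USER_ERROR` so that it stays runnable on current PHP versions.

```php
<?php
// The constant defaults to false unless the caller defined it earlier,
// which is the same "define if not defined" idiom as in the diff.
if (!defined('DEMO_SCHEMA_STRICT')) define('DEMO_SCHEMA_STRICT', false);

// Hypothetical stand-in for a schema definition call: sanity checks run
// only in strict mode, so the default path skips them entirely.
function demo_define(array &$info, $namespace, $name) {
    if (DEMO_SCHEMA_STRICT && !isset($info[$namespace])) {
        trigger_error('Cannot define directive for undefined namespace',
            E_USER_WARNING);
        return false;
    }
    $info[$namespace][$name] = true;
    return true;
}

$info = array();
var_dump(demo_define($info, 'HTML', 'Allowed')); // bool(true) when not strict
```

A caller who wants the checks would define the constant as `true` before the file is loaded; once a constant is defined it cannot be changed, which is why the guard uses `defined()` first.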
|
@@ -5,6 +5,7 @@ require_once 'HTMLPurifier/ChildDef.php';
|
|||||||
require_once 'HTMLPurifier/ChildDef/Empty.php';
|
require_once 'HTMLPurifier/ChildDef/Empty.php';
|
||||||
require_once 'HTMLPurifier/ChildDef/Required.php';
|
require_once 'HTMLPurifier/ChildDef/Required.php';
|
||||||
require_once 'HTMLPurifier/ChildDef/Optional.php';
|
require_once 'HTMLPurifier/ChildDef/Optional.php';
|
||||||
|
require_once 'HTMLPurifier/ChildDef/Custom.php';
|
||||||
|
|
||||||
// NOT UNIT TESTED!!!
|
// NOT UNIT TESTED!!!
|
||||||
|
|
||||||
|
@@ -99,7 +99,7 @@ class HTMLPurifier_DefinitionCache_Serializer extends
|
|||||||
*/
|
*/
|
||||||
function generateBaseDirectoryPath($config) {
|
function generateBaseDirectoryPath($config) {
|
||||||
$base = $config->get('Cache', 'SerializerPath');
|
$base = $config->get('Cache', 'SerializerPath');
|
||||||
$base = is_null($base) ? dirname(__FILE__) . '/Serializer' : $base;
|
$base = is_null($base) ? HTMLPURIFIER_PREFIX . '/HTMLPurifier/DefinitionCache/Serializer' : $base;
|
||||||
return $base;
|
return $base;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -19,7 +19,7 @@ class HTMLPurifier_EntityLookup {
|
|||||||
*/
|
*/
|
||||||
function setup($file = false) {
|
function setup($file = false) {
|
||||||
if (!$file) {
|
if (!$file) {
|
||||||
$file = dirname(__FILE__) . '/EntityLookup/entities.ser';
|
$file = HTMLPURIFIER_PREFIX . '/HTMLPurifier/EntityLookup/entities.ser';
|
||||||
}
|
}
|
||||||
$this->table = unserialize(file_get_contents($file));
|
$this->table = unserialize(file_get_contents($file));
|
||||||
}
|
}
|
||||||
|
@@ -110,12 +110,13 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
');
|
');
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'HTML', 'Allowed', null, 'string/null', '
|
'HTML', 'Allowed', null, 'itext/null', '
|
||||||
<p>
|
<p>
|
||||||
This is a convenience directive that rolls the functionality of
|
This is a convenience directive that rolls the functionality of
|
||||||
%HTML.AllowedElements and %HTML.AllowedAttributes into one directive.
|
%HTML.AllowedElements and %HTML.AllowedAttributes into one directive.
|
||||||
Specify elements and attributes that are allowed using:
|
Specify elements and attributes that are allowed using:
|
||||||
<code>element1[attr1|attr2],element2...</code>.
|
<code>element1[attr1|attr2],element2...</code>. You can also use
|
||||||
|
newlines instead of commas to separate elements.
|
||||||
</p>
|
</p>
|
||||||
<p>
|
<p>
|
||||||
<strong>Warning</strong>:
|
<strong>Warning</strong>:
|
||||||
@@ -235,13 +236,26 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
|||||||
/**
|
/**
|
||||||
* Adds a custom element to your HTML definition
|
* Adds a custom element to your HTML definition
|
||||||
* @note See HTMLPurifier_HTMLModule::addElement for detailed
|
* @note See HTMLPurifier_HTMLModule::addElement for detailed
|
||||||
* parameter descriptions.
|
* parameter and return value descriptions.
|
||||||
*/
|
*/
|
||||||
function addElement($element_name, $type, $contents, $attr_collections, $attributes) {
|
function &addElement($element_name, $type, $contents, $attr_collections, $attributes) {
|
||||||
$module =& $this->getAnonymousModule();
|
$module =& $this->getAnonymousModule();
|
||||||
// assume that if the user is calling this, the element
|
// assume that if the user is calling this, the element
|
||||||
// is safe. This may not be a good idea
|
// is safe. This may not be a good idea
|
||||||
$module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
|
$element =& $module->addElement($element_name, true, $type, $contents, $attr_collections, $attributes);
|
||||||
|
return $element;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Adds a blank element to your HTML definition, for overriding
|
||||||
|
* existing behavior
|
||||||
|
* @note See HTMLPurifier_HTMLModule::addBlankElement for detailed
|
||||||
|
* parameter and return value descriptions.
|
||||||
|
*/
|
||||||
|
function &addBlankElement($element_name) {
|
||||||
|
$module =& $this->getAnonymousModule();
|
||||||
|
$element =& $module->addBlankElement($element_name);
|
||||||
|
return $element;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -329,7 +343,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
|||||||
if (isset($this->info_content_sets['Block'][$block_wrapper])) {
|
if (isset($this->info_content_sets['Block'][$block_wrapper])) {
|
||||||
$this->info_block_wrapper = $block_wrapper;
|
$this->info_block_wrapper = $block_wrapper;
|
||||||
} else {
|
} else {
|
||||||
trigger_error('Cannot use non-block element as block wrapper.',
|
trigger_error('Cannot use non-block element as block wrapper',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -339,7 +353,7 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
|||||||
$this->info_parent = $parent;
|
$this->info_parent = $parent;
|
||||||
$this->info_parent_def = $def;
|
$this->info_parent_def = $def;
|
||||||
} else {
|
} else {
|
||||||
trigger_error('Cannot use unrecognized element as parent.',
|
trigger_error('Cannot use unrecognized element as parent',
|
||||||
E_USER_ERROR);
|
E_USER_ERROR);
|
||||||
$this->info_parent_def = $this->manager->getElement($this->info_parent, true);
|
$this->info_parent_def = $this->manager->getElement($this->info_parent, true);
|
||||||
}
|
}
|
||||||
@@ -426,8 +440,9 @@ class HTMLPurifier_HTMLDefinition extends HTMLPurifier_Definition
|
|||||||
$elements = array();
|
$elements = array();
|
||||||
$attributes = array();
|
$attributes = array();
|
||||||
|
|
||||||
$chunks = explode(',', $list);
|
$chunks = preg_split('/(,|[\n\r]+)/', $list);
|
||||||
foreach ($chunks as $chunk) {
|
foreach ($chunks as $chunk) {
|
||||||
|
if (empty($chunk)) continue;
|
||||||
// remove TinyMCE element control characters
|
// remove TinyMCE element control characters
|
||||||
if (!strpos($chunk, '[')) {
|
if (!strpos($chunk, '[')) {
|
||||||
$element = $chunk;
|
$element = $chunk;
|
||||||
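Review note: a self-contained sketch of how an `element1[attr1|attr2],element2` list could be parsed with the new `preg_split` call and the empty-chunk check. `parse_allowed` is a hypothetical helper; the extra `trim()` and the `"element.attr"` key format are assumptions made for this illustration and are not shown in the hunk.

```php
<?php
// Hypothetical helper, not a library function.
function parse_allowed($list) {
    $elements = array();
    $attributes = array();
    // split on commas or runs of line breaks, as in the hunk
    $chunks = preg_split('/(,|[\n\r]+)/', $list);
    foreach ($chunks as $chunk) {
        $chunk = trim($chunk);            // assumption: tolerate stray spaces
        if (empty($chunk)) continue;      // skip empty pieces such as ",\n"
        if (!strpos($chunk, '[')) {
            $elements[$chunk] = true;     // bare element, no attribute list
            continue;
        }
        list($element, $attr) = explode('[', $chunk, 2);
        $elements[$element] = true;
        foreach (explode('|', rtrim($attr, ']')) as $key) {
            $attributes["$element.$key"] = true; // assumed key format
        }
    }
    return array($elements, $attributes);
}

list($el, $at) = parse_allowed("a[href|title],\nb\ni");
print_r(array_keys($el)); // a, b, i
print_r(array_keys($at)); // a.href, a.title
```

Without the `empty($chunk)` check, the comma followed by a line break in the example would register an element with an empty name.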
|
47
library/HTMLPurifier/HTMLModule/Object.php
Normal file
@@ -0,0 +1,47 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/HTMLModule.php';
|
||||||
|
|
||||||
|
/**
|
||||||
|
* XHTML 1.1 Object Module, defines elements for generic object inclusion
|
||||||
|
* @warning Users will commonly use <embed> to cater to legacy browsers: this
|
||||||
|
* module does not allow this sort of behavior
|
||||||
|
*/
|
||||||
|
class HTMLPurifier_HTMLModule_Object extends HTMLPurifier_HTMLModule
|
||||||
|
{
|
||||||
|
|
||||||
|
var $name = 'Object';
|
||||||
|
|
||||||
|
function HTMLPurifier_HTMLModule_Object() {
|
||||||
|
|
||||||
|
$this->addElement('object', false, 'Inline', 'Optional: #PCDATA | Flow | param', 'Common',
|
||||||
|
array(
|
||||||
|
'archive' => 'URI',
|
||||||
|
'classid' => 'URI',
|
||||||
|
'codebase' => 'URI',
|
||||||
|
'codetype' => 'Text',
|
||||||
|
'data' => 'URI',
|
||||||
|
'declare' => 'Bool#declare',
|
||||||
|
'height' => 'Length',
|
||||||
|
'name' => 'CDATA',
|
||||||
|
'standby' => 'Text',
|
||||||
|
'tabindex' => 'Number',
|
||||||
|
'type' => 'ContentType',
|
||||||
|
'width' => 'Length'
|
||||||
|
)
|
||||||
|
);
|
||||||
|
|
||||||
|
$this->addElement('param', false, false, 'Empty', false,
|
||||||
|
array(
|
||||||
|
'id' => 'ID',
|
||||||
|
'name*' => 'Text',
|
||||||
|
'type' => 'Text',
|
||||||
|
'value' => 'Text',
|
||||||
|
'valuetype' => 'Enum#data,ref,object'
|
||||||
|
)
|
||||||
|
);
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
28
library/HTMLPurifier/HTMLModule/Ruby.php
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/HTMLModule.php';
|
||||||
|
|
||||||
|
/**
|
||||||
|
* XHTML 1.1 Ruby Annotation Module, defines elements that indicate
|
||||||
|
 * short runs of text alongside base text for annotation or pronunciation.
|
||||||
|
*/
|
||||||
|
class HTMLPurifier_HTMLModule_Ruby extends HTMLPurifier_HTMLModule
|
||||||
|
{
|
||||||
|
|
||||||
|
var $name = 'Ruby';
|
||||||
|
|
||||||
|
function HTMLPurifier_HTMLModule_Ruby() {
|
||||||
|
$this->addElement('ruby', true, 'Inline',
|
||||||
|
'Custom: ((rb, (rt | (rp, rt, rp))) | (rbc, rtc, rtc?))',
|
||||||
|
'Common');
|
||||||
|
$this->addElement('rbc', true, false, 'Required: rb', 'Common');
|
||||||
|
$this->addElement('rtc', true, false, 'Required: rt', 'Common');
|
||||||
|
$rb =& $this->addElement('rb', true, false, 'Inline', 'Common');
|
||||||
|
$rb->excludes = array('ruby' => true);
|
||||||
|
$rt =& $this->addElement('rt', true, false, 'Inline', 'Common', array('rbspan' => 'Number'));
|
||||||
|
$rt->excludes = array('ruby' => true);
|
||||||
|
$this->addElement('rp', true, false, 'Optional: #PCDATA', 'Common');
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
@@ -13,6 +13,8 @@ require_once 'HTMLPurifier/AttrTransform/Length.php';
|
|||||||
require_once 'HTMLPurifier/AttrTransform/ImgSpace.php';
|
require_once 'HTMLPurifier/AttrTransform/ImgSpace.php';
|
||||||
require_once 'HTMLPurifier/AttrTransform/EnumToCSS.php';
|
require_once 'HTMLPurifier/AttrTransform/EnumToCSS.php';
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
|
||||||
|
|
||||||
class HTMLPurifier_HTMLModule_Tidy_XHTMLAndHTML4 extends
|
class HTMLPurifier_HTMLModule_Tidy_XHTMLAndHTML4 extends
|
||||||
HTMLPurifier_HTMLModule_Tidy
|
HTMLPurifier_HTMLModule_Tidy
|
||||||
{
|
{
|
||||||
@@ -188,5 +190,17 @@ class HTMLPurifier_HTMLModule_Tidy_Strict extends
|
|||||||
{
|
{
|
||||||
var $name = 'Tidy_Strict';
|
var $name = 'Tidy_Strict';
|
||||||
var $defaultLevel = 'light';
|
var $defaultLevel = 'light';
|
||||||
|
|
||||||
|
function makeFixes() {
|
||||||
|
$r = parent::makeFixes();
|
||||||
|
$r['blockquote#content_model_type'] = 'strictblockquote';
|
||||||
|
return $r;
|
||||||
|
}
|
||||||
|
|
||||||
|
var $defines_child_def = true;
|
||||||
|
function getChildDef($def) {
|
||||||
|
if ($def->content_model_type != 'strictblockquote') return parent::getChildDef($def);
|
||||||
|
return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -1,26 +0,0 @@
|
|||||||
<?php
|
|
||||||
|
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy.php';
|
|
||||||
require_once 'HTMLPurifier/ChildDef/StrictBlockquote.php';
|
|
||||||
|
|
||||||
class HTMLPurifier_HTMLModule_Tidy_XHTMLStrict extends
|
|
||||||
HTMLPurifier_HTMLModule_Tidy
|
|
||||||
{
|
|
||||||
|
|
||||||
var $name = 'Tidy_XHTMLStrict';
|
|
||||||
var $defaultLevel = 'light';
|
|
||||||
|
|
||||||
function makeFixes() {
|
|
||||||
$r = array();
|
|
||||||
$r['blockquote#content_model_type'] = 'strictblockquote';
|
|
||||||
return $r;
|
|
||||||
}
|
|
||||||
|
|
||||||
var $defines_child_def = true;
|
|
||||||
function getChildDef($def) {
|
|
||||||
if ($def->content_model_type != 'strictblockquote') return false;
|
|
||||||
return new HTMLPurifier_ChildDef_StrictBlockquote($def->content_model);
|
|
||||||
}
|
|
||||||
|
|
||||||
}
|
|
||||||
|
|
@@ -28,12 +28,13 @@ require_once 'HTMLPurifier/HTMLModule/Target.php';
|
|||||||
require_once 'HTMLPurifier/HTMLModule/Scripting.php';
|
require_once 'HTMLPurifier/HTMLModule/Scripting.php';
|
||||||
require_once 'HTMLPurifier/HTMLModule/XMLCommonAttributes.php';
|
require_once 'HTMLPurifier/HTMLModule/XMLCommonAttributes.php';
|
||||||
require_once 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
|
require_once 'HTMLPurifier/HTMLModule/NonXMLCommonAttributes.php';
|
||||||
|
require_once 'HTMLPurifier/HTMLModule/Ruby.php';
|
||||||
|
require_once 'HTMLPurifier/HTMLModule/Object.php';
|
||||||
|
|
||||||
// tidy modules
|
// tidy modules
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy.php';
|
require_once 'HTMLPurifier/HTMLModule/Tidy.php';
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php';
|
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLAndHTML4.php';
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTML.php';
|
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTML.php';
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy/XHTMLStrict.php';
|
|
||||||
require_once 'HTMLPurifier/HTMLModule/Tidy/Proprietary.php';
|
require_once 'HTMLPurifier/HTMLModule/Tidy/Proprietary.php';
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
@@ -171,7 +172,7 @@ class HTMLPurifier_HTMLModuleManager
|
|||||||
$common = array(
|
$common = array(
|
||||||
'CommonAttributes', 'Text', 'Hypertext', 'List',
|
'CommonAttributes', 'Text', 'Hypertext', 'List',
|
||||||
'Presentation', 'Edit', 'Bdo', 'Tables', 'Image',
|
'Presentation', 'Edit', 'Bdo', 'Tables', 'Image',
|
||||||
'StyleAttribute', 'Scripting'
|
'StyleAttribute', 'Scripting', 'Object'
|
||||||
);
|
);
|
||||||
$transitional = array('Legacy', 'Target');
|
$transitional = array('Legacy', 'Target');
|
||||||
$xml = array('XMLCommonAttributes');
|
$xml = array('XMLCommonAttributes');
|
||||||
@@ -207,7 +208,7 @@ class HTMLPurifier_HTMLModuleManager
|
|||||||
$this->doctypes->register(
|
$this->doctypes->register(
|
||||||
'XHTML 1.0 Strict', true,
|
'XHTML 1.0 Strict', true,
|
||||||
array_merge($common, $xml, $non_xml),
|
array_merge($common, $xml, $non_xml),
|
||||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_XHTMLStrict', 'Tidy_Proprietary'),
|
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Strict', 'Tidy_Proprietary'),
|
||||||
array(),
|
array(),
|
||||||
'-//W3C//DTD XHTML 1.0 Strict//EN',
|
'-//W3C//DTD XHTML 1.0 Strict//EN',
|
||||||
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
|
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
|
||||||
@@ -215,8 +216,8 @@ class HTMLPurifier_HTMLModuleManager
|
|||||||
|
|
||||||
$this->doctypes->register(
|
$this->doctypes->register(
|
||||||
'XHTML 1.1', true,
|
'XHTML 1.1', true,
|
||||||
array_merge($common, $xml),
|
array_merge($common, $xml, array('Ruby')),
|
||||||
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary'), // Tidy_XHTML1_1
|
array('Tidy_Strict', 'Tidy_XHTML', 'Tidy_Proprietary', 'Tidy_Strict'), // Tidy_XHTML1_1
|
||||||
array(),
|
array(),
|
||||||
'-//W3C//DTD XHTML 1.1//EN',
|
'-//W3C//DTD XHTML 1.1//EN',
|
||||||
'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'
|
'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'
|
||||||
|
@@ -1,11 +1,15 @@
|
|||||||
<?php
|
<?php
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'Attr', 'IDBlacklist', array(), 'list',
|
||||||
|
'Array of IDs not allowed in the document.'
|
||||||
|
);
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
|
* Component of HTMLPurifier_AttrContext that accumulates IDs to prevent dupes
|
||||||
* @note In Slashdot-speak, dupe means duplicate.
|
* @note In Slashdot-speak, dupe means duplicate.
|
||||||
* @note This class does not accept $config or $context, thus, it is the
|
* @note The default constructor does not accept $config or $context objects:
|
||||||
* burden of the callee to register the appropriate errors or
|
* you must use the static build() factory method to perform initialization.
|
||||||
* configuration.
|
|
||||||
*/
|
*/
|
||||||
class HTMLPurifier_IDAccumulator
|
class HTMLPurifier_IDAccumulator
|
||||||
{
|
{
|
||||||
@@ -16,6 +20,19 @@ class HTMLPurifier_IDAccumulator
|
|||||||
*/
|
*/
|
||||||
var $ids = array();
|
var $ids = array();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Builds an IDAccumulator, also initializing the default blacklist
|
||||||
|
* @param $config Instance of HTMLPurifier_Config
|
||||||
|
* @param $context Instance of HTMLPurifier_Context
|
||||||
|
* @return Fully initialized HTMLPurifier_IDAccumulator
|
||||||
|
* @static
|
||||||
|
*/
|
||||||
|
static function build($config, &$context) {
|
||||||
|
$id_accumulator = new HTMLPurifier_IDAccumulator();
|
||||||
|
$id_accumulator->load($config->get('Attr', 'IDBlacklist'));
|
||||||
|
return $id_accumulator;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Add an ID to the lookup table.
|
* Add an ID to the lookup table.
|
||||||
* @param $id ID to be added.
|
* @param $id ID to be added.
|
||||||
|
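The build() factory added in this hunk seeds the accumulator with the configured blacklist before any document IDs are seen, so a blacklisted ID is rejected just as a duplicate would be. A minimal sketch of that idea follows. SketchIDAccumulator is a hypothetical stand-in written for illustration, the blacklist is passed as a plain array instead of being read from the Attr.IDBlacklist directive, and the bodies of add() and load() are assumptions, since the diff shows only their names.

```php
<?php
// Hypothetical stand-in for HTMLPurifier_IDAccumulator. Only the pieces
// visible in the diff (the $ids lookup, build(), load(), add()) are sketched.
class SketchIDAccumulator
{
    /** Lookup table of IDs already seen, keyed by ID. */
    public $ids = array();

    // Factory: returns an accumulator pre-loaded with the blacklist.
    // The diff reads the list from config; here it is a plain array.
    public static function build(array $blacklist)
    {
        $acc = new SketchIDAccumulator();
        $acc->load($blacklist);
        return $acc;
    }

    // Records an ID. Returns true if it was new, false if already present
    // (either a duplicate or a blacklisted ID).
    public function add($id)
    {
        if (isset($this->ids[$id])) return false;
        $this->ids[$id] = true;
        return true;
    }

    // Marks every ID in the list as already taken.
    public function load(array $ids)
    {
        foreach ($ids as $id) {
            $this->ids[$id] = true;
        }
    }
}
```

Seeding through the same lookup table means no separate blacklist check is needed at validation time: a blacklisted ID simply looks like one that has already been used.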
@@ -4,10 +4,18 @@
|
|||||||
* Injects tokens into the document while parsing for well-formedness.
|
* Injects tokens into the document while parsing for well-formedness.
|
||||||
* This enables "formatter-like" functionality such as auto-paragraphing,
|
* This enables "formatter-like" functionality such as auto-paragraphing,
|
||||||
* smiley-ification and linkification to take place.
|
* smiley-ification and linkification to take place.
|
||||||
|
*
|
||||||
|
* @todo Allow injectors to request a re-run on their output. This
|
||||||
|
* would help if an operation is recursive.
|
||||||
*/
|
*/
|
||||||
class HTMLPurifier_Injector
|
class HTMLPurifier_Injector
|
||||||
{
|
{
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Advisory name of injector, this is for friendly error messages
|
||||||
|
*/
|
||||||
|
var $name;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Amount of tokens the injector needs to skip + 1. Because
|
* Amount of tokens the injector needs to skip + 1. Because
|
||||||
* the decrement is the first thing that happens, this needs to
|
* the decrement is the first thing that happens, this needs to
|
||||||
@@ -40,16 +48,37 @@ class HTMLPurifier_Injector
|
|||||||
var $inputIndex;
|
var $inputIndex;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Prepares the injector by giving it the config and context objects,
|
* Array of elements and attributes this injector creates and therefore
|
||||||
* so that important variables can be extracted and not passed via
|
* need to be allowed by the definition. Takes form of
|
||||||
* parameter constantly. Remember: always instantiate a new injector
|
* array('element' => array('attr', 'attr2'), 'element2')
|
||||||
* when handling a set of HTML.
|
*/
|
||||||
|
var $needed = array();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Prepares the injector by giving it the config and context objects:
|
||||||
|
* this allows references to important variables to be made within
|
||||||
|
* the injector. This function also checks if the HTML environment
|
||||||
|
* will work with the Injector: if p tags are not allowed, the
|
||||||
|
* Auto-Paragraphing injector should not be enabled.
|
||||||
|
* @param $config Instance of HTMLPurifier_Config
|
||||||
|
* @param $context Instance of HTMLPurifier_Context
|
||||||
|
* @return Boolean false if success, string of missing needed element/attribute if failure
|
||||||
*/
|
*/
|
||||||
function prepare($config, &$context) {
|
function prepare($config, &$context) {
|
||||||
$this->htmlDefinition = $config->getHTMLDefinition();
|
$this->htmlDefinition = $config->getHTMLDefinition();
|
||||||
|
// perform $needed checks
|
||||||
|
foreach ($this->needed as $element => $attributes) {
|
||||||
|
if (is_int($element)) $element = $attributes;
|
||||||
|
if (!isset($this->htmlDefinition->info[$element])) return $element;
|
||||||
|
if (!is_array($attributes)) continue;
|
||||||
|
foreach ($attributes as $name) {
|
||||||
|
if (!isset($this->htmlDefinition->info[$element]->attr[$name])) return "$element.$name";
|
||||||
|
}
|
||||||
|
}
|
||||||
$this->currentNesting =& $context->get('CurrentNesting');
|
$this->currentNesting =& $context->get('CurrentNesting');
|
||||||
$this->inputTokens =& $context->get('InputTokens');
|
$this->inputTokens =& $context->get('InputTokens');
|
||||||
$this->inputIndex =& $context->get('InputIndex');
|
$this->inputIndex =& $context->get('InputIndex');
|
||||||
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -74,12 +103,19 @@ class HTMLPurifier_Injector
|
|||||||
/**
|
/**
|
||||||
* Handler that is called when a text token is processed
|
* Handler that is called when a text token is processed
|
||||||
*/
|
*/
|
||||||
function handleText(&$token, $config, &$context) {}
|
function handleText(&$token) {}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Handler that is called when a start token is processed
|
* Handler that is called when a start or empty token is processed
|
||||||
*/
|
*/
|
||||||
function handleStart(&$token, $config, &$context) {}
|
function handleElement(&$token) {}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Notifier that is called when an end token is processed
|
||||||
|
* @note This differs from handlers in that the token is read-only
|
||||||
|
*/
|
||||||
|
function notifyEnd($token) {}
|
||||||
|
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
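The $needed check added to prepare() can be read in isolation. The sketch below lifts that loop into a free function; findMissingRequirement and the $allowed array are hypothetical names, with $allowed standing in for $this->htmlDefinition->info as a plain map of element name to attribute map.

```php
<?php
// Sketch of the $needed check from prepare(). $allowed is an assumed
// stand-in for the HTML definition: element name => (attribute => anything).
function findMissingRequirement(array $needed, array $allowed)
{
    foreach ($needed as $element => $attributes) {
        // An entry like 'p' has an integer key, so the value is really the
        // element name and there are no attributes to check.
        if (is_int($element)) $element = $attributes;
        if (!isset($allowed[$element])) return $element;
        if (!is_array($attributes)) continue;
        foreach ($attributes as $name) {
            if (!isset($allowed[$element][$name])) return "$element.$name";
        }
    }
    // false means "nothing is missing", mirroring prepare()'s return value
    return false;
}
```

Note the inverted convention: false signals success and a string names the first missing element or element.attribute pair, so callers should compare against false with === and not rely on truthiness alone.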
@@ -6,15 +6,28 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
'AutoFormat', 'AutoParagraph', false, 'bool', '
|
'AutoFormat', 'AutoParagraph', false, 'bool', '
|
||||||
<p>
|
<p>
|
||||||
This directive turns on auto-paragraphing, where double newlines are
|
This directive turns on auto-paragraphing, where double newlines are
|
||||||
converted into paragraphs whenever possible. Auto-paragraphing
|
converted into paragraphs whenever possible. Auto-paragraphing:
|
||||||
applies when:
|
|
||||||
</p>
|
</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>There are inline elements or text in the root node</li>
|
<li>Always applies to inline elements or text in the root node,</li>
|
||||||
<li>There are inline elements or text with double newlines or
|
<li>Applies to inline elements or text with double newlines in nodes
|
||||||
block elements in nodes that allow paragraph tags</li>
|
that allow paragraph tags,</li>
|
||||||
<li>There are double newlines in paragraph tags</li>
|
<li>Applies to double newlines in paragraph tags</li>
|
||||||
</ul>
|
</ul>
|
||||||
|
<p>
|
||||||
|
<code>p</code> tags must be allowed for this directive to take effect.
|
||||||
|
We do not use <code>br</code> tags for paragraphing, as that is
|
||||||
|
semantically incorrect.
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
To prevent auto-paragraphing as a content-producer, refrain from using
|
||||||
|
double-newlines except to specify a new paragraph or in contexts where
|
||||||
|
it has special meaning (whitespace usually has no meaning except in
|
||||||
|
tags like <code>pre</code>, so this should not be difficult.) To prevent
|
||||||
|
the paragraphing of inline text adjacent to block elements, wrap them
|
||||||
|
in <code>div</code> tags (the behavior is slightly different outside of
|
||||||
|
the root node.)
|
||||||
|
</p>
|
||||||
<p>
|
<p>
|
||||||
This directive has been available since 2.0.1.
|
This directive has been available since 2.0.1.
|
||||||
</p>
|
</p>
|
||||||
@@ -27,13 +40,16 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
||||||
{
|
{
|
||||||
|
|
||||||
|
var $name = 'AutoParagraph';
|
||||||
|
var $needed = array('p');
|
||||||
|
|
||||||
function _pStart() {
|
function _pStart() {
|
||||||
$par = new HTMLPurifier_Token_Start('p');
|
$par = new HTMLPurifier_Token_Start('p');
|
||||||
$par->armor['MakeWellFormed_TagClosedError'] = true;
|
$par->armor['MakeWellFormed_TagClosedError'] = true;
|
||||||
return $par;
|
return $par;
|
||||||
}
|
}
|
||||||
|
|
||||||
function handleText(&$token, $config, &$context) {
|
function handleText(&$token) {
|
||||||
$text = $token->data;
|
$text = $token->data;
|
||||||
if (empty($this->currentNesting)) {
|
if (empty($this->currentNesting)) {
|
||||||
if (!$this->allowsElement('p')) return;
|
if (!$this->allowsElement('p')) return;
|
||||||
@@ -54,19 +70,27 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
|||||||
$ok = false;
|
$ok = false;
|
||||||
// test if up-coming tokens are either block or have
|
// test if up-coming tokens are either block or have
|
||||||
// a double newline in them
|
// a double newline in them
|
||||||
|
$nesting = 0;
|
||||||
for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
|
for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
|
||||||
if ($this->inputTokens[$i]->type == 'start'){
|
if ($this->inputTokens[$i]->type == 'start'){
|
||||||
if (!$this->_isInline($this->inputTokens[$i])) {
|
if (!$this->_isInline($this->inputTokens[$i])) {
|
||||||
$ok = true;
|
// we haven't found a double-newline, and
|
||||||
}
|
// we've hit a block element, so don't paragraph
|
||||||
|
$ok = false;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
if ($this->inputTokens[$i]->type == 'end') break;
|
$nesting++;
|
||||||
|
}
|
||||||
|
if ($this->inputTokens[$i]->type == 'end') {
|
||||||
|
if ($nesting <= 0) break;
|
||||||
|
$nesting--;
|
||||||
|
}
|
||||||
if ($this->inputTokens[$i]->type == 'text') {
|
if ($this->inputTokens[$i]->type == 'text') {
|
||||||
|
// found it!
|
||||||
if (strpos($this->inputTokens[$i]->data, "\n\n") !== false) {
|
if (strpos($this->inputTokens[$i]->data, "\n\n") !== false) {
|
||||||
$ok = true;
|
$ok = true;
|
||||||
|
break;
|
||||||
}
|
}
|
||||||
if (!$this->inputTokens[$i]->is_whitespace) break;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if ($ok) {
|
if ($ok) {
|
||||||
@@ -79,7 +103,7 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
|||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
function handleStart(&$token, $config, &$context) {
|
function handleElement(&$token) {
|
||||||
// check if we're inside a tag already
|
// check if we're inside a tag already
|
||||||
if (!empty($this->currentNesting)) {
|
if (!empty($this->currentNesting)) {
|
||||||
if ($this->allowsElement('p')) {
|
if ($this->allowsElement('p')) {
|
||||||
@@ -88,11 +112,19 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
|||||||
// this token is already paragraph, abort
|
// this token is already paragraph, abort
|
||||||
if ($token->name == 'p') return;
|
if ($token->name == 'p') return;
|
||||||
|
|
||||||
// check if this token is adjacent to the parent
|
// this token is a block level, abort
|
||||||
if ($this->inputTokens[$this->inputIndex - 1]->type != 'start') {
|
if (!$this->_isInline($token)) return;
|
||||||
|
|
||||||
|
// check if this token is adjacent to the parent token
|
||||||
|
$prev = $this->inputTokens[$this->inputIndex - 1];
|
||||||
|
if ($prev->type != 'start') {
|
||||||
// not adjacent, we can abort early
|
// not adjacent, we can abort early
|
||||||
// add lead paragraph tag if our token is inline
|
// add lead paragraph tag if our token is inline
|
||||||
if ($this->_isInline($token)) {
|
// and the previous tag was an end paragraph
|
||||||
|
if (
|
||||||
|
$prev->name == 'p' && $prev->type == 'end' &&
|
||||||
|
$this->_isInline($token)
|
||||||
|
) {
|
||||||
$token = array($this->_pStart(), $token);
|
$token = array($this->_pStart(), $token);
|
||||||
}
|
}
|
||||||
return;
|
return;
|
||||||
@@ -105,8 +137,8 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
|||||||
$ok = false;
|
$ok = false;
|
||||||
// maintain a mini-nesting counter, this lets us bail out
|
// maintain a mini-nesting counter, this lets us bail out
|
||||||
// early if possible
|
// early if possible
|
||||||
$j = 2; // current nesting, is two due to parent and this start
|
$j = 1; // current nesting, one is due to parent (we recalculate current token)
|
||||||
for ($i = $this->inputIndex + 1; isset($this->inputTokens[$i]); $i++) {
|
for ($i = $this->inputIndex; isset($this->inputTokens[$i]); $i++) {
|
||||||
if ($this->inputTokens[$i]->type == 'start') $j++;
|
if ($this->inputTokens[$i]->type == 'start') $j++;
|
||||||
if ($this->inputTokens[$i]->type == 'end') $j--;
|
if ($this->inputTokens[$i]->type == 'end') $j--;
|
||||||
if ($this->inputTokens[$i]->type == 'text') {
|
if ($this->inputTokens[$i]->type == 'text') {
|
||||||
@@ -150,7 +182,14 @@ class HTMLPurifier_Injector_AutoParagraph extends HTMLPurifier_Injector
|
|||||||
$needs_start = false;
|
$needs_start = false;
|
||||||
$needs_end = false;
|
$needs_end = false;
|
||||||
|
|
||||||
for ($i = 0, $c = count($raw_paragraphs); $i < $c; $i++) {
|
$c = count($raw_paragraphs);
|
||||||
|
if ($c == 1) {
|
||||||
|
// there were no double-newlines, abort quickly
|
||||||
|
$result[] = new HTMLPurifier_Token_Text($data);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
for ($i = 0; $i < $c; $i++) {
|
||||||
$par = $raw_paragraphs[$i];
|
$par = $raw_paragraphs[$i];
|
||||||
if (trim($par) !== '') {
|
if (trim($par) !== '') {
|
||||||
$paragraphs[] = $par;
|
$paragraphs[] = $par;
|
||||||
|
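The reworked lookahead in handleText() decides whether root-level text should be paragraphed by scanning forward with a small nesting counter. The sketch below reproduces that decision under simplifying assumptions: tokens are plain arrays, not HTMLPurifier_Token objects, the 'inline' flag replaces the _isInline() lookup, and shouldParagraph is a hypothetical name.

```php
<?php
// Assumed token shape for this sketch:
//   array('type' => 'start'|'end'|'text', 'inline' => bool, 'data' => string)
function shouldParagraph(array $tokens, $from)
{
    $nesting = 0;
    for ($i = $from; isset($tokens[$i]); $i++) {
        $t = $tokens[$i];
        if ($t['type'] === 'start') {
            // A block element before any double newline: do not paragraph.
            if (empty($t['inline'])) return false;
            $nesting++;
        }
        if ($t['type'] === 'end') {
            // An end tag we did not open belongs to the parent: stop looking.
            if ($nesting <= 0) return false;
            $nesting--;
        }
        if ($t['type'] === 'text' && strpos($t['data'], "\n\n") !== false) {
            return true; // found a paragraph break
        }
    }
    return false;
}
```

Compared with the old loop, which stopped at the first end tag, the counter lets the scan step over complete inline elements such as a bold run and still find a double newline further along.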
@@ -6,7 +6,8 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
'AutoFormat', 'Linkify', false, 'bool', '
|
'AutoFormat', 'Linkify', false, 'bool', '
|
||||||
<p>
|
<p>
|
||||||
This directive turns on linkification, auto-linking http, ftp and
|
This directive turns on linkification, auto-linking http, ftp and
|
||||||
https URLs. This directive has been available since 2.0.1.
|
https URLs. <code>a</code> tags with the <code>href</code> attribute
|
||||||
|
must be allowed. This directive has been available since 2.0.1.
|
||||||
</p>
|
</p>
|
||||||
');
|
');
|
||||||
|
|
||||||
@@ -16,7 +17,10 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
class HTMLPurifier_Injector_Linkify extends HTMLPurifier_Injector
|
class HTMLPurifier_Injector_Linkify extends HTMLPurifier_Injector
|
||||||
{
|
{
|
||||||
|
|
||||||
function handleText(&$token, $config, &$context) {
|
var $name = 'Linkify';
|
||||||
|
var $needed = array('a' => array('href'));
|
||||||
|
|
||||||
|
function handleText(&$token) {
|
||||||
if (!$this->allowsElement('a')) return;
|
if (!$this->allowsElement('a')) return;
|
||||||
|
|
||||||
if (strpos($token->data, '://') === false) {
|
if (strpos($token->data, '://') === false) {
|
||||||
|
@@ -6,8 +6,9 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
'AutoFormat', 'PurifierLinkify', false, 'bool', '
|
'AutoFormat', 'PurifierLinkify', false, 'bool', '
|
||||||
<p>
|
<p>
|
||||||
Internal auto-formatter that converts configuration directives in
|
Internal auto-formatter that converts configuration directives in
|
||||||
syntax <a>%Namespace.Directive</a> to links. This directive has been available
|
syntax <a>%Namespace.Directive</a> to links. <code>a</code> tags
|
||||||
since 2.0.1.
|
with the <code>href</code> attribute must be allowed.
|
||||||
|
This directive has been available since 2.0.1.
|
||||||
</p>
|
</p>
|
||||||
');
|
');
|
||||||
|
|
||||||
@@ -27,14 +28,16 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
class HTMLPurifier_Injector_PurifierLinkify extends HTMLPurifier_Injector
|
class HTMLPurifier_Injector_PurifierLinkify extends HTMLPurifier_Injector
|
||||||
{
|
{
|
||||||
|
|
||||||
|
var $name = 'PurifierLinkify';
|
||||||
var $docURL;
|
var $docURL;
|
||||||
|
var $needed = array('a' => array('href'));
|
||||||
|
|
||||||
function prepare($config, &$context) {
|
function prepare($config, &$context) {
|
||||||
parent::prepare($config, $context);
|
|
||||||
$this->docURL = $config->get('AutoFormatParam', 'PurifierLinkifyDocURL');
|
$this->docURL = $config->get('AutoFormatParam', 'PurifierLinkifyDocURL');
|
||||||
|
return parent::prepare($config, $context);
|
||||||
}
|
}
|
||||||
|
|
||||||
function handleText(&$token, $config, &$context) {
|
function handleText(&$token) {
|
||||||
if (!$this->allowsElement('a')) return;
|
if (!$this->allowsElement('a')) return;
|
||||||
if (strpos($token->data, '%') === false) return;
|
if (strpos($token->data, '%') === false) return;
|
||||||
|
|
||||||
|
@@ -28,7 +28,7 @@ $messages = array(
|
|||||||
'Strategy_RemoveForeignElements: Foreign element to text' => 'Unrecognized $CurrentToken.Serialized tag converted to text',
|
'Strategy_RemoveForeignElements: Foreign element to text' => 'Unrecognized $CurrentToken.Serialized tag converted to text',
|
||||||
'Strategy_RemoveForeignElements: Foreign element removed' => 'Unrecognized $CurrentToken.Serialized tag removed',
|
'Strategy_RemoveForeignElements: Foreign element removed' => 'Unrecognized $CurrentToken.Serialized tag removed',
|
||||||
'Strategy_RemoveForeignElements: Comment removed' => 'Comment containing "$CurrentToken.Data" removed',
|
'Strategy_RemoveForeignElements: Comment removed' => 'Comment containing "$CurrentToken.Data" removed',
|
||||||
'Strategy_RemoveForeignElements: Script removed' => 'Script removed',
|
'Strategy_RemoveForeignElements: Foreign meta element removed' => 'Unrecognized $CurrentToken.Serialized meta tag and all descendants removed',
|
||||||
'Strategy_RemoveForeignElements: Token removed to end' => 'Tags and text starting from $1 element were removed to end',
|
'Strategy_RemoveForeignElements: Token removed to end' => 'Tags and text starting from $1 element were removed to end',
|
||||||
|
|
||||||
'Strategy_MakeWellFormed: Unnecessary end tag removed' => 'Unnecessary $CurrentToken.Serialized tag removed',
|
'Strategy_MakeWellFormed: Unnecessary end tag removed' => 'Unnecessary $CurrentToken.Serialized tag removed',
|
||||||
|
@@ -82,7 +82,7 @@ class HTMLPurifier_LanguageFactory
|
|||||||
*/
|
*/
|
||||||
function setup() {
|
function setup() {
|
||||||
$this->validator = new HTMLPurifier_AttrDef_Lang();
|
$this->validator = new HTMLPurifier_AttrDef_Lang();
|
||||||
$this->dir = dirname(__FILE__);
|
$this->dir = HTMLPURIFIER_PREFIX . '/HTMLPurifier';
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
|
@@ -13,11 +13,14 @@ if (version_compare(PHP_VERSION, "5", ">=")) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'Core', 'AcceptFullDocuments', true, 'bool',
|
'Core', 'ConvertDocumentToFragment', true, 'bool', '
|
||||||
'This parameter determines whether or not the filter should accept full '.
|
This parameter determines whether or not the filter should convert
|
||||||
'HTML documents, not just HTML fragments. When on, it will '.
|
input that is a full document with html and body tags to a fragment
|
||||||
'drop all sections except the content between body.'
|
of just the contents of a body tag. This parameter is simply something
|
||||||
);
|
HTML Purifier can do during an edge-case: for most inputs, this
|
||||||
|
processing is not necessary.
|
||||||
|
');
|
||||||
|
HTMLPurifier_ConfigSchema::defineAlias('Core', 'AcceptFullDocuments', 'Core', 'ConvertDocumentToFragment');
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'Core', 'LexerImpl', null, 'mixed/null', '
|
'Core', 'LexerImpl', null, 'mixed/null', '
|
||||||
@@ -66,6 +69,16 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
</p>
|
</p>
|
||||||
');
|
');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'Core', 'AggressivelyFixLt', false, 'bool', '
|
||||||
|
This directive enables aggressive pre-filter fixes HTML Purifier can
|
||||||
|
perform in order to ensure that open angled-brackets do not get killed
|
||||||
|
during parsing stage. Enabling this will result in two preg_replace_callback
|
||||||
|
calls and one preg_replace call for every bit of HTML passed through here.
|
||||||
|
It is not necessary and will have no effect for PHP 4.
|
||||||
|
This directive has been available since 2.1.0.
|
||||||
|
');
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Forgivingly lexes HTML (SGML-style) markup into tokens.
|
* Forgivingly lexes HTML (SGML-style) markup into tokens.
|
||||||
*
|
*
|
||||||
@@ -179,6 +192,9 @@ class HTMLPurifier_Lexer
|
|||||||
return new HTMLPurifier_Lexer_DOMLex();
|
return new HTMLPurifier_Lexer_DOMLex();
|
||||||
case 'DirectLex':
|
case 'DirectLex':
|
||||||
return new HTMLPurifier_Lexer_DirectLex();
|
return new HTMLPurifier_Lexer_DirectLex();
|
||||||
|
case 'PH5P':
|
||||||
|
// experimental Lexer that must be manually included
|
||||||
|
return new HTMLPurifier_Lexer_PH5P();
|
||||||
default:
|
default:
|
||||||
trigger_error("Cannot instantiate unrecognized Lexer type " . htmlspecialchars($lexer), E_USER_ERROR);
|
trigger_error("Cannot instantiate unrecognized Lexer type " . htmlspecialchars($lexer), E_USER_ERROR);
|
||||||
}
|
}
|
||||||
@@ -303,7 +319,7 @@ class HTMLPurifier_Lexer
|
|||||||
function normalize($html, $config, &$context) {
|
function normalize($html, $config, &$context) {
|
||||||
|
|
||||||
// extract body from document if applicable
|
// extract body from document if applicable
|
||||||
if ($config->get('Core', 'AcceptFullDocuments')) {
|
if ($config->get('Core', 'ConvertDocumentToFragment')) {
|
||||||
$html = $this->extractBody($html);
|
$html = $this->extractBody($html);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -42,15 +42,18 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
|
|||||||
|
|
||||||
$html = $this->normalize($html, $config, $context);
|
$html = $this->normalize($html, $config, $context);
|
||||||
|
|
||||||
|
// attempt to armor stray angled brackets that cannot possibly
|
||||||
|
// form tags and thus are probably being used as emoticons
|
||||||
|
if ($config->get('Core', 'AggressivelyFixLt')) {
|
||||||
|
$char = '[^a-z!\/]';
|
||||||
|
$comment = "/<!--(.*?)(-->|\z)/is";
|
||||||
|
$html = preg_replace_callback($comment, array('HTMLPurifier_Lexer_DOMLex', 'callbackArmorCommentEntities'), $html);
|
||||||
|
$html = preg_replace("/<($char)/i", '&lt;\\1', $html);
|
||||||
|
$html = preg_replace_callback($comment, array('HTMLPurifier_Lexer_DOMLex', 'callbackUndoCommentSubst'), $html); // fix comments
|
||||||
|
}
|
||||||
|
|
||||||
// preprocess html, essential for UTF-8
|
// preprocess html, essential for UTF-8
|
||||||
$html =
|
$html = $this->wrapHTML($html, $config, $context);
|
||||||
'<!DOCTYPE html '.
|
|
||||||
'PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"'.
|
|
||||||
'"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'.
|
|
||||||
'<html><head>'.
|
|
||||||
'<meta http-equiv="Content-Type" content="text/html;'.
|
|
||||||
' charset=utf-8" />'.
|
|
||||||
'</head><body><div>'.$html.'</div></body></html>';
|
|
||||||
|
|
||||||
$doc = new DOMDocument();
|
$doc = new DOMDocument();
|
||||||
$doc->encoding = 'UTF-8'; // theoretically, the above has this covered
|
$doc->encoding = 'UTF-8'; // theoretically, the above has this covered
|
||||||
@@ -151,5 +154,41 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
|
|||||||
*/
|
*/
|
||||||
public function muteErrorHandler($errno, $errstr) {}
|
public function muteErrorHandler($errno, $errstr) {}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Callback function for undoing escaping of stray angled brackets
|
||||||
|
* in comments
|
||||||
|
*/
|
||||||
|
static public function callbackUndoCommentSubst($matches) {
|
||||||
|
return '<!--' . strtr($matches[1], array('&amp;'=>'&','&lt;'=>'<')) . $matches[2];
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Callback function that entity-izes ampersands in comments so that
|
||||||
|
* callbackUndoCommentSubst doesn't clobber them
|
||||||
|
*/
|
||||||
|
static public function callbackArmorCommentEntities($matches) {
|
||||||
|
return '<!--' . str_replace('&', '&amp;', $matches[1]) . $matches[2];
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Wraps an HTML fragment in the necessary HTML
|
||||||
|
*/
|
||||||
|
function wrapHTML($html, $config, &$context) {
|
||||||
|
$def = $config->getDefinition('HTML');
|
||||||
|
$ret = '';
|
||||||
|
|
||||||
|
if (!empty($def->doctype->dtdPublic) || !empty($def->doctype->dtdSystem)) {
|
||||||
|
$ret .= '<!DOCTYPE html ';
|
||||||
|
if (!empty($def->doctype->dtdPublic)) $ret .= 'PUBLIC "' . $def->doctype->dtdPublic . '" ';
|
||||||
|
if (!empty($def->doctype->dtdSystem)) $ret .= '"' . $def->doctype->dtdSystem . '" ';
|
||||||
|
$ret .= '>';
|
||||||
|
}
|
||||||
|
|
||||||
|
$ret .= '<html><head>';
|
||||||
|
$ret .= '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />';
|
||||||
|
$ret .= '</head><body><div>'.$html.'</div></body></html>';
|
||||||
|
return $ret;
|
||||||
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
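The Core.AggressivelyFixLt block works in three passes: armor ampersands inside comments, escape stray open angle brackets everywhere, then undo both substitutions inside comments. The sketch below strings those passes together as free functions so the round trip can be tried outside the lexer. The function names are hypothetical; the regular expressions are the ones shown in the diff.

```php
<?php
// Pass 1 callback: entity-ize ampersands inside comments so that pass 3
// can tell original text apart from escapes introduced by pass 2.
function armorCommentEntities($m)
{
    return '<!--' . str_replace('&', '&amp;', $m[1]) . $m[2];
}

// Pass 3 callback: restore comment bodies to their original text.
function undoCommentSubst($m)
{
    return '<!--' . strtr($m[1], array('&amp;' => '&', '&lt;' => '<')) . $m[2];
}

function fixStrayLt($html)
{
    // A comment runs to "-->" or, if unterminated, to the end of input.
    $comment = "/<!--(.*?)(-->|\z)/is";
    $html = preg_replace_callback($comment, 'armorCommentEntities', $html);
    // Pass 2: escape any "<" that is not followed by a letter, "!" or "/".
    $html = preg_replace('/<([^a-z!\/])/i', '&lt;\\1', $html);
    return preg_replace_callback($comment, 'undoCommentSubst', $html);
}
```

Without the first pass, a comment that already contained the literal text `&lt;` would be turned into `<` by the undo step; armoring the ampersand first makes the round trip lossless.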
@@ -150,11 +150,25 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
// We are in tag and it is well formed
|
// We are in tag and it is well formed
|
||||||
// Grab the internals of the tag
|
// Grab the internals of the tag
|
||||||
$strlen_segment = $position_next_gt - $cursor;
|
$strlen_segment = $position_next_gt - $cursor;
|
||||||
|
|
||||||
|
if ($strlen_segment < 1) {
|
||||||
|
// there's nothing to process!
|
||||||
|
$token = new HTMLPurifier_Token_Text('<');
|
||||||
|
$cursor++;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
$segment = substr($html, $cursor, $strlen_segment);
|
$segment = substr($html, $cursor, $strlen_segment);
|
||||||
|
|
||||||
|
if ($segment === false) {
|
||||||
|
// somehow, we attempted to access beyond the end of
|
||||||
|
// the string, defense-in-depth, reported by Nate Abele
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
// Check if it's a comment
|
// Check if it's a comment
|
||||||
if (
|
if (
|
||||||
substr($segment, 0, 3) == '!--'
|
substr($segment, 0, 3) === '!--'
|
||||||
) {
|
) {
|
||||||
// re-determine segment length, looking for -->
|
// re-determine segment length, looking for -->
|
||||||
$position_comment_end = strpos($html, '-->', $cursor);
|
$position_comment_end = strpos($html, '-->', $cursor);
|
||||||
@@ -204,7 +218,8 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
// Check leading character is alnum, if not, we may
|
// Check leading character is alnum, if not, we may
|
||||||
// have accidentally grabbed an emoticon. Translate into
|
// have accidentally grabbed an emoticon. Translate into
|
||||||
// text and go our merry way
|
// text and go our merry way
|
||||||
if (!ctype_alnum($segment[0])) {
|
if (!ctype_alpha($segment[0])) {
|
||||||
|
// XML: $segment[0] !== '_' && $segment[0] !== ':'
|
||||||
if ($e) $e->send(E_NOTICE, 'Lexer: Unescaped lt');
|
if ($e) $e->send(E_NOTICE, 'Lexer: Unescaped lt');
|
||||||
$token = new
|
$token = new
|
||||||
HTMLPurifier_Token_Text(
|
HTMLPurifier_Token_Text(
|
||||||
@@ -228,7 +243,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
// trailing slash. Remember, we could have a tag like <br>, so
|
// trailing slash. Remember, we could have a tag like <br>, so
|
||||||
// any later token processing scripts must convert improperly
|
// any later token processing scripts must convert improperly
|
||||||
// classified EmptyTags from StartTags.
|
// classified EmptyTags from StartTags.
|
||||||
$is_self_closing= (strpos($segment,'/') === $strlen_segment-1);
|
$is_self_closing = (strrpos($segment,'/') === $strlen_segment-1);
|
||||||
if ($is_self_closing) {
|
if ($is_self_closing) {
|
||||||
$strlen_segment--;
|
$strlen_segment--;
|
||||||
$segment = substr($segment, 0, $strlen_segment);
|
$segment = substr($segment, 0, $strlen_segment);
|
||||||
@@ -371,6 +386,7 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
$value = $quoted_value;
|
$value = $quoted_value;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
if ($value === false) $value = '';
|
||||||
return array($key => $value);
|
return array($key => $value);
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -385,7 +401,6 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
|
|
||||||
// infinite loop protection
|
// infinite loop protection
|
||||||
$loops = 0;
|
$loops = 0;
|
||||||
|
|
||||||
while(true) {
|
while(true) {
|
||||||
|
|
||||||
// infinite loop protection
|
// infinite loop protection
|
||||||
@@ -399,7 +414,6 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
}
|
}
|
||||||
|
|
||||||
$cursor += ($value = strspn($string, $this->_whitespace, $cursor));
|
$cursor += ($value = strspn($string, $this->_whitespace, $cursor));
|
||||||
|
|
||||||
// grab the key
|
// grab the key
|
||||||
|
|
||||||
$key_begin = $cursor; //we're currently at the start of the key
|
$key_begin = $cursor; //we're currently at the start of the key
|
||||||
@@ -435,6 +449,11 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
$cursor++;
|
$cursor++;
|
||||||
$cursor += strspn($string, $this->_whitespace, $cursor);
|
$cursor += strspn($string, $this->_whitespace, $cursor);
|
||||||
|
|
||||||
|
if ($cursor === false) {
|
||||||
|
$array[$key] = '';
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
// we might be in front of a quote right now
|
// we might be in front of a quote right now
|
||||||
|
|
||||||
$char = @$string[$cursor];
|
$char = @$string[$cursor];
|
||||||
@@ -452,7 +471,14 @@ class HTMLPurifier_Lexer_DirectLex extends HTMLPurifier_Lexer
|
|||||||
$value_end = $cursor;
|
$value_end = $cursor;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// we reached a premature end
|
||||||
|
if ($cursor === false) {
|
||||||
|
$cursor = $size;
|
||||||
|
$value_end = $cursor;
|
||||||
|
}
|
||||||
|
|
||||||
$value = substr($string, $value_begin, $value_end - $value_begin);
|
$value = substr($string, $value_begin, $value_end - $value_begin);
|
||||||
|
if ($value === false) $value = '';
|
||||||
$array[$key] = $this->parseData($value);
|
$array[$key] = $this->parseData($value);
|
||||||
$cursor++;
|
$cursor++;
|
||||||
|
|
||||||
|
3886
library/HTMLPurifier/Lexer/PH5P.php
Normal file
File diff suppressed because it is too large
@@ -1,7 +1,7 @@
|
|||||||
|
|
||||||
.hp-config {}
|
.hp-config {}
|
||||||
|
|
||||||
.hp-config tbody th {text-align:right;}
|
.hp-config tbody th {text-align:right; padding-right:0.5em;}
|
||||||
.hp-config thead, .hp-config .namespace {background:#3C578C; color:#FFF;}
|
.hp-config thead, .hp-config .namespace {background:#3C578C; color:#FFF;}
|
||||||
.hp-config .namespace th {text-align:center;}
|
.hp-config .namespace th {text-align:center;}
|
||||||
.hp-config .verbose {display:none;}
|
.hp-config .verbose {display:none;}
|
||||||
|
@@ -23,18 +23,55 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
|||||||
*/
|
*/
|
||||||
var $name;
|
var $name;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Whether or not to compress directive names, clipping them off
|
||||||
|
* after a certain amount of letters. False to disable or integer letters
|
||||||
|
* before clipping.
|
||||||
|
* @protected
|
||||||
|
*/
|
||||||
|
var $compress = false;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* @param $name Form element name for directives to be stuffed into
|
* @param $name Form element name for directives to be stuffed into
|
||||||
* @param $doc_url String documentation URL, will have fragment tagged on
|
* @param $doc_url String documentation URL, will have fragment tagged on
|
||||||
|
* @param $compress Integer max length before compressing a directive name, set to false to turn off
|
||||||
*/
|
*/
|
||||||
function HTMLPurifier_Printer_ConfigForm($name, $doc_url = null) {
|
function HTMLPurifier_Printer_ConfigForm(
|
||||||
|
$name, $doc_url = null, $compress = false
|
||||||
|
) {
|
||||||
parent::HTMLPurifier_Printer();
|
parent::HTMLPurifier_Printer();
|
||||||
$this->docURL = $doc_url;
|
$this->docURL = $doc_url;
|
||||||
$this->name = $name;
|
$this->name = $name;
|
||||||
|
$this->compress = $compress;
|
||||||
|
// initialize sub-printers
|
||||||
$this->fields['default'] = new HTMLPurifier_Printer_ConfigForm_default();
|
$this->fields['default'] = new HTMLPurifier_Printer_ConfigForm_default();
|
||||||
$this->fields['bool'] = new HTMLPurifier_Printer_ConfigForm_bool();
|
$this->fields['bool'] = new HTMLPurifier_Printer_ConfigForm_bool();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Sets default column and row size for textareas in sub-printers
|
||||||
|
* @param $cols Integer columns of textarea, null to use default
|
||||||
|
* @param $rows Integer rows of textarea, null to use default
|
||||||
|
*/
|
||||||
|
function setTextareaDimensions($cols = null, $rows = null) {
|
||||||
|
if ($cols) $this->fields['default']->cols = $cols;
|
||||||
|
if ($rows) $this->fields['default']->rows = $rows;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves styling, in case it is not accessible by webserver
|
||||||
|
*/
|
||||||
|
function getCSS() {
|
||||||
|
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.css');
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves JavaScript, in case it is not accessible by webserver
|
||||||
|
*/
|
||||||
|
function getJavaScript() {
|
||||||
|
return file_get_contents(HTMLPURIFIER_PREFIX . '/HTMLPurifier/Printer/ConfigForm.js');
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Returns HTML output for a configuration form
|
* Returns HTML output for a configuration form
|
||||||
* @param $config Configuration object of current form state
|
* @param $config Configuration object of current form state
|
||||||
@@ -63,14 +100,14 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
|||||||
$ret .= $this->renderNamespace($ns, $directives);
|
$ret .= $this->renderNamespace($ns, $directives);
|
||||||
}
|
}
|
||||||
if ($render_controls) {
|
if ($render_controls) {
|
||||||
$ret .= $this->start('tfoot');
|
$ret .= $this->start('tbody');
|
||||||
$ret .= $this->start('tr');
|
$ret .= $this->start('tr');
|
||||||
$ret .= $this->start('td', array('colspan' => 2, 'class' => 'controls'));
|
$ret .= $this->start('td', array('colspan' => 2, 'class' => 'controls'));
|
||||||
$ret .= $this->elementEmpty('input', array('type' => 'Submit', 'value' => 'Submit'));
|
$ret .= $this->elementEmpty('input', array('type' => 'submit', 'value' => 'Submit'));
|
||||||
$ret .= '[<a href="?">Reset</a>]';
|
$ret .= '[<a href="?">Reset</a>]';
|
||||||
$ret .= $this->end('td');
|
$ret .= $this->end('td');
|
||||||
$ret .= $this->end('tr');
|
$ret .= $this->end('tr');
|
||||||
$ret .= $this->end('tfoot');
|
$ret .= $this->end('tbody');
|
||||||
}
|
}
|
||||||
$ret .= $this->end('table');
|
$ret .= $this->end('table');
|
||||||
return $ret;
|
return $ret;
|
||||||
@@ -98,11 +135,12 @@ class HTMLPurifier_Printer_ConfigForm extends HTMLPurifier_Printer
|
|||||||
$ret .= $this->start('a', array('href' => $url));
|
$ret .= $this->start('a', array('href' => $url));
|
||||||
}
|
}
|
||||||
$attr = array('for' => "{$this->name}:$ns.$directive");
|
$attr = array('for' => "{$this->name}:$ns.$directive");
|
||||||
|
|
||||||
// crop directive name if it's too long
|
// crop directive name if it's too long
|
||||||
if (strlen($directive) < 14) {
|
if (!$this->compress || (strlen($directive) < $this->compress)) {
|
||||||
$directive_disp = $directive;
|
$directive_disp = $directive;
|
||||||
} else {
|
} else {
|
||||||
$directive_disp = substr($directive, 0, 12) . '...';
|
$directive_disp = substr($directive, 0, $this->compress - 2) . '...';
|
||||||
$attr['title'] = $directive;
|
$attr['title'] = $directive;
|
||||||
}
|
}
|
||||||
|
|
||||||
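The cropping rule introduced in this hunk can be read in isolation as the sketch below. The function name `crop_directive_name` is hypothetical and not part of HTML Purifier; only the condition and the `substr` call are taken from the diff.

```php
<?php
// Hypothetical standalone helper (not in the library) that mirrors the
// condition used in the hunk above.
function crop_directive_name($directive, $compress = false) {
    // false disables cropping; names shorter than the limit pass through
    if (!$compress || strlen($directive) < $compress) {
        return $directive;
    }
    // keep the first ($compress - 2) characters and append an ellipsis
    return substr($directive, 0, $compress - 2) . '...';
}

echo crop_directive_name('EscapeInvalidTags', 14), "\n"; // EscapeInvali...
echo crop_directive_name('Encoding', 14), "\n";          // Encoding
```

Note that, as written in the diff, a cropped name is `$compress + 1` characters long (`$compress - 2` kept characters plus three dots), so `$compress` behaves as a threshold and not as a strict maximum width.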
@@ -176,6 +214,8 @@ class HTMLPurifier_Printer_ConfigForm_NullDecorator extends HTMLPurifier_Printer
|
|||||||
* Swiss-army knife configuration form field printer
|
* Swiss-army knife configuration form field printer
|
||||||
*/
|
*/
|
||||||
class HTMLPurifier_Printer_ConfigForm_default extends HTMLPurifier_Printer {
|
class HTMLPurifier_Printer_ConfigForm_default extends HTMLPurifier_Printer {
|
||||||
|
var $cols = 18;
|
||||||
|
var $rows = 5;
|
||||||
function render($ns, $directive, $value, $name, $config) {
|
function render($ns, $directive, $value, $name, $config) {
|
||||||
$this->prepareGenerator($config);
|
$this->prepareGenerator($config);
|
||||||
// this should probably be split up a little
|
// this should probably be split up a little
|
||||||
@@ -190,12 +230,12 @@ class HTMLPurifier_Printer_ConfigForm_default extends HTMLPurifier_Printer {
|
|||||||
$value[] = $val;
|
$value[] = $val;
|
||||||
}
|
}
|
||||||
case 'list':
|
case 'list':
|
||||||
$value = implode(',', $value);
|
$value = implode(PHP_EOL, $value);
|
||||||
break;
|
break;
|
||||||
case 'hash':
|
case 'hash':
|
||||||
$nvalue = '';
|
$nvalue = '';
|
||||||
foreach ($value as $i => $v) {
|
foreach ($value as $i => $v) {
|
||||||
$nvalue .= "$i:$v,";
|
$nvalue .= "$i:$v" . PHP_EOL;
|
||||||
}
|
}
|
||||||
$value = $nvalue;
|
$value = $nvalue;
|
||||||
break;
|
break;
|
||||||
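A minimal sketch of what the change from comma separators to `PHP_EOL` means for the text placed in the form field. The helper name `serialize_directive_value` is an assumption made for illustration; only the `list` and `hash` branches shown in the hunk are reproduced.

```php
<?php
// Hypothetical helper (not in the library): one entry per line, as in the
// new side of the hunk above.
function serialize_directive_value($type, $value) {
    switch ($type) {
        case 'list':
            // items joined by the platform line ending
            return implode(PHP_EOL, $value);
        case 'hash':
            // each pair rendered as key:value followed by a line ending
            $nvalue = '';
            foreach ($value as $i => $v) {
                $nvalue .= "$i:$v" . PHP_EOL;
            }
            return $nvalue;
    }
    return $value; // other types are left untouched in this sketch
}

echo serialize_directive_value('list', array('a', 'b')), PHP_EOL;
echo serialize_directive_value('hash', array('http' => 'true'));
```

One entry per line suits the `textarea` rendering that a later hunk adds for these types; note that the `hash` form ends with a trailing line break while the `list` form does not.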
@@ -220,6 +260,15 @@ class HTMLPurifier_Printer_ConfigForm_default extends HTMLPurifier_Printer {
|
|||||||
$ret .= $this->element('option', $val, $attr);
|
$ret .= $this->element('option', $val, $attr);
|
||||||
}
|
}
|
||||||
$ret .= $this->end('select');
|
$ret .= $this->end('select');
|
||||||
|
} elseif (
|
||||||
|
$def->type == 'text' || $def->type == 'itext' ||
|
||||||
|
$def->type == 'list' || $def->type == 'hash' || $def->type == 'lookup'
|
||||||
|
) {
|
||||||
|
$attr['cols'] = $this->cols;
|
||||||
|
$attr['rows'] = $this->rows;
|
||||||
|
$ret .= $this->start('textarea', $attr);
|
||||||
|
$ret .= $this->text($value);
|
||||||
|
$ret .= $this->end('textarea');
|
||||||
} else {
|
} else {
|
||||||
$attr['value'] = $value;
|
$attr['value'] = $value;
|
||||||
$attr['type'] = 'text';
|
$attr['type'] = 'text';
|
||||||
|
@@ -102,6 +102,7 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
|
|||||||
$ret .= $this->element('td', $this->listifyTagLookup($lookup));
|
$ret .= $this->element('td', $this->listifyTagLookup($lookup));
|
||||||
$ret .= $this->end('tr');
|
$ret .= $this->end('tr');
|
||||||
}
|
}
|
||||||
|
$ret .= $this->end('table');
|
||||||
return $ret;
|
return $ret;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -179,7 +180,8 @@ class HTMLPurifier_Printer_HTMLDefinition extends HTMLPurifier_Printer
|
|||||||
$def->validateChildren(array(), $this->config, $context);
|
$def->validateChildren(array(), $this->config, $context);
|
||||||
}
|
}
|
||||||
$elements = $def->elements;
|
$elements = $def->elements;
|
||||||
} elseif ($def->type == 'chameleon') {
|
}
|
||||||
|
if ($def->type == 'chameleon') {
|
||||||
$attr['rowspan'] = 2;
|
$attr['rowspan'] = 2;
|
||||||
} elseif ($def->type == 'empty') {
|
} elseif ($def->type == 'empty') {
|
||||||
$elements = array();
|
$elements = array();
|
||||||
|
@@ -195,7 +195,7 @@ class HTMLPurifier_Strategy_FixNesting extends HTMLPurifier_Strategy
|
|||||||
//################################################################//
|
//################################################################//
|
||||||
// Process result by interpreting $result
|
// Process result by interpreting $result
|
||||||
|
|
||||||
if ($result === true) {
|
if ($result === true || $child_tokens === $result) {
|
||||||
// leave the node as is
|
// leave the node as is
|
||||||
|
|
||||||
// register start token as a parental node start
|
// register start token as a parental node start
|
||||||
|
@@ -36,27 +36,22 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
$definition = $config->getHTMLDefinition();
|
$definition = $config->getHTMLDefinition();
|
||||||
|
|
||||||
// CurrentNesting
|
// local variables
|
||||||
$this->currentNesting = array();
|
|
||||||
$context->register('CurrentNesting', $this->currentNesting);
|
|
||||||
|
|
||||||
// InputIndex
|
|
||||||
$this->inputIndex = false;
|
|
||||||
$context->register('InputIndex', $this->inputIndex);
|
|
||||||
|
|
||||||
// InputTokens
|
|
||||||
$context->register('InputTokens', $tokens);
|
|
||||||
$this->inputTokens =& $tokens;
|
|
||||||
|
|
||||||
// OutputTokens
|
|
||||||
$result = array();
|
$result = array();
|
||||||
|
$generator = new HTMLPurifier_Generator();
|
||||||
|
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
||||||
|
$e =& $context->get('ErrorCollector', true);
|
||||||
|
|
||||||
|
// member variables
|
||||||
|
$this->currentNesting = array();
|
||||||
|
$this->inputIndex = false;
|
||||||
|
$this->inputTokens =& $tokens;
|
||||||
$this->outputTokens =& $result;
|
$this->outputTokens =& $result;
|
||||||
|
|
||||||
// %Core.EscapeInvalidTags
|
// context variables
|
||||||
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
$context->register('CurrentNesting', $this->currentNesting);
|
||||||
$generator = new HTMLPurifier_Generator();
|
$context->register('InputIndex', $this->inputIndex);
|
||||||
|
$context->register('InputTokens', $tokens);
|
||||||
$e =& $context->get('ErrorCollector', true);
|
|
||||||
|
|
||||||
// -- begin INJECTOR --
|
// -- begin INJECTOR --
|
||||||
|
|
||||||
@@ -67,7 +62,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
unset($injectors['Custom']); // special case
|
unset($injectors['Custom']); // special case
|
||||||
foreach ($injectors as $injector => $b) {
|
foreach ($injectors as $injector => $b) {
|
||||||
$injector = "HTMLPurifier_Injector_$injector";
|
$injector = "HTMLPurifier_Injector_$injector";
|
||||||
if ($b) $this->injectors[] = new $injector;
|
if (!$b) continue;
|
||||||
|
$this->injectors[] = new $injector;
|
||||||
}
|
}
|
||||||
foreach ($custom_injectors as $injector) {
|
foreach ($custom_injectors as $injector) {
|
||||||
if (is_string($injector)) {
|
if (is_string($injector)) {
|
||||||
@@ -87,9 +83,17 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
// give the injectors references to the definition and context
|
// give the injectors references to the definition and context
|
||||||
// variables for performance reasons
|
// variables for performance reasons
|
||||||
foreach ($this->injectors as $i => $x) {
|
foreach ($this->injectors as $i => $x) {
|
||||||
$this->injectors[$i]->prepare($config, $context);
|
$error = $this->injectors[$i]->prepare($config, $context);
|
||||||
|
if (!$error) continue;
|
||||||
|
list($injector) = array_splice($this->injectors, $i, 1);
|
||||||
|
$name = $injector->name;
|
||||||
|
trigger_error("Cannot enable $name injector because $error is not allowed", E_USER_WARNING);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// warning: most foreach loops follow the convention $i => $x.
|
||||||
|
// be sure, for PHP4 compatibility, to only perform write operations
|
||||||
|
// directly referencing the object using $i: $x is only safe for reads
|
||||||
|
|
||||||
// -- end INJECTOR --
|
// -- end INJECTOR --
|
||||||
|
|
||||||
$token = false;
|
$token = false;
|
||||||
@@ -100,6 +104,8 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
// if all goes well, this token will be passed through unharmed
|
// if all goes well, this token will be passed through unharmed
|
||||||
$token = $tokens[$this->inputIndex];
|
$token = $tokens[$this->inputIndex];
|
||||||
|
|
||||||
|
//printTokens($tokens, $this->inputIndex);
|
||||||
|
|
||||||
foreach ($this->injectors as $i => $x) {
|
foreach ($this->injectors as $i => $x) {
|
||||||
if ($x->skip > 0) $this->injectors[$i]->skip--;
|
if ($x->skip > 0) $this->injectors[$i]->skip--;
|
||||||
}
|
}
|
||||||
@@ -109,7 +115,7 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
if ($token->type === 'text') {
|
if ($token->type === 'text') {
|
||||||
// injector handler code; duplicated for performance reasons
|
// injector handler code; duplicated for performance reasons
|
||||||
foreach ($this->injectors as $i => $x) {
|
foreach ($this->injectors as $i => $x) {
|
||||||
if (!$x->skip) $x->handleText($token, $config, $context);
|
if (!$x->skip) $this->injectors[$i]->handleText($token);
|
||||||
if (is_array($token)) {
|
if (is_array($token)) {
|
||||||
$this->currentInjector = $i;
|
$this->currentInjector = $i;
|
||||||
break;
|
break;
|
||||||
@@ -122,26 +128,24 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
$info = $definition->info[$token->name]->child;
|
$info = $definition->info[$token->name]->child;
|
||||||
|
|
||||||
// quick checks:
|
// quick tag checks: anything that's *not* an end tag
|
||||||
// test if it claims to be a start tag but is empty
|
$ok = false;
|
||||||
if ($info->type == 'empty' && $token->type == 'start') {
|
if ($info->type == 'empty' && $token->type == 'start') {
|
||||||
$result[] = new HTMLPurifier_Token_Empty($token->name, $token->attr);
|
// test if it claims to be a start tag but is empty
|
||||||
continue;
|
$token = new HTMLPurifier_Token_Empty($token->name, $token->attr);
|
||||||
}
|
$ok = true;
|
||||||
// test if it claims to be empty but really is a start tag
|
} elseif ($info->type != 'empty' && $token->type == 'empty' ) {
|
||||||
if ($info->type != 'empty' && $token->type == 'empty' ) {
|
// claims to be empty but really is a start tag
|
||||||
$result[] = new HTMLPurifier_Token_Start($token->name, $token->attr);
|
$token = array(
|
||||||
$result[] = new HTMLPurifier_Token_End($token->name);
|
new HTMLPurifier_Token_Start($token->name, $token->attr),
|
||||||
continue;
|
new HTMLPurifier_Token_End($token->name)
|
||||||
}
|
);
|
||||||
// automatically insert empty tags
|
$ok = true;
|
||||||
if ($token->type == 'empty') {
|
} elseif ($token->type == 'empty') {
|
||||||
$result[] = $token;
|
// real empty token
|
||||||
continue;
|
$ok = true;
|
||||||
}
|
} elseif ($token->type == 'start') {
|
||||||
|
// start tag
|
||||||
// start tags have precedence, so they get passed through...
|
|
||||||
if ($token->type == 'start') {
|
|
||||||
|
|
||||||
// ...unless they also have to close their parent
|
// ...unless they also have to close their parent
|
||||||
if (!empty($this->currentNesting)) {
|
if (!empty($this->currentNesting)) {
|
||||||
@@ -163,16 +167,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
$this->currentNesting[] = $parent; // undo the pop
|
$this->currentNesting[] = $parent; // undo the pop
|
||||||
}
|
}
|
||||||
|
$ok = true;
|
||||||
|
}
|
||||||
|
|
||||||
// injector handler code; duplicated for performance reasons
|
// injector handler code; duplicated for performance reasons
|
||||||
|
if ($ok) {
|
||||||
foreach ($this->injectors as $i => $x) {
|
foreach ($this->injectors as $i => $x) {
|
||||||
if (!$x->skip) $x->handleStart($token, $config, $context);
|
if (!$x->skip) $this->injectors[$i]->handleElement($token);
|
||||||
if (is_array($token)) {
|
if (is_array($token)) {
|
||||||
$this->currentInjector = $i;
|
$this->currentInjector = $i;
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
$this->processToken($token, $config, $context);
|
$this->processToken($token, $config, $context);
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
@@ -197,6 +203,9 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
$current_parent = array_pop($this->currentNesting);
|
$current_parent = array_pop($this->currentNesting);
|
||||||
if ($current_parent->name == $token->name) {
|
if ($current_parent->name == $token->name) {
|
||||||
$result[] = $token;
|
$result[] = $token;
|
||||||
|
foreach ($this->injectors as $i => $x) {
|
||||||
|
$this->injectors[$i]->notifyEnd($token);
|
||||||
|
}
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -233,15 +242,15 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
// okay, we found it, close all the skipped tags
|
// okay, we found it, close all the skipped tags
|
||||||
// note that skipped tags contains the element we need closed
|
// note that skipped tags contains the element we need closed
|
||||||
$size = count($skipped_tags);
|
for ($i = count($skipped_tags) - 1; $i >= 0; $i--) {
|
||||||
for ($i = $size - 1; $i > 0; $i--) {
|
if ($i && $e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
||||||
if ($e && !isset($skipped_tags[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
|
||||||
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$i]);
|
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by element end', $skipped_tags[$i]);
|
||||||
}
|
}
|
||||||
$result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
$result[] = $new_token = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
||||||
|
foreach ($this->injectors as $j => $x) { // $j, not $i!!!
|
||||||
|
$this->injectors[$j]->notifyEnd($new_token);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
$result[] = new HTMLPurifier_Token_End($skipped_tags[$i]->name);
|
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -250,17 +259,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
$context->destroy('InputIndex');
|
$context->destroy('InputIndex');
|
||||||
$context->destroy('CurrentToken');
|
$context->destroy('CurrentToken');
|
||||||
|
|
||||||
// we're at the end now, fix all still unclosed tags
|
// we're at the end now, fix all still unclosed tags (this is
|
||||||
// not using processToken() because at this point we don't
|
// duplicated from the end of the loop with some slight modifications)
|
||||||
// care about current nesting
|
// not using $skipped_tags since it would invariably be all of them
|
||||||
if (!empty($this->currentNesting)) {
|
if (!empty($this->currentNesting)) {
|
||||||
$size = count($this->currentNesting);
|
for ($i = count($this->currentNesting) - 1; $i >= 0; $i--) {
|
||||||
for ($i = $size - 1; $i >= 0; $i--) {
|
|
||||||
if ($e && !isset($this->currentNesting[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
if ($e && !isset($this->currentNesting[$i]->armor['MakeWellFormed_TagClosedError'])) {
|
||||||
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $this->currentNesting[$i]);
|
$e->send(E_NOTICE, 'Strategy_MakeWellFormed: Tag closed by document end', $this->currentNesting[$i]);
|
||||||
}
|
}
|
||||||
$result[] =
|
$result[] = $new_token = new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
|
||||||
new HTMLPurifier_Token_End($this->currentNesting[$i]->name);
|
foreach ($this->injectors as $j => $x) { // $j, not $i!!!
|
||||||
|
$this->injectors[$j]->notifyEnd($new_token);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -280,10 +290,18 @@ class HTMLPurifier_Strategy_MakeWellFormed extends HTMLPurifier_Strategy
|
|||||||
array_splice($this->inputTokens, $this->inputIndex--, 1, $token);
|
array_splice($this->inputTokens, $this->inputIndex--, 1, $token);
|
||||||
|
|
||||||
// adjust the injector skips based on the array substitution
|
// adjust the injector skips based on the array substitution
|
||||||
$offset = count($token) + 1;
|
if ($this->injectors) {
|
||||||
|
$offset = count($token);
|
||||||
for ($i = 0; $i <= $this->currentInjector; $i++) {
|
for ($i = 0; $i <= $this->currentInjector; $i++) {
|
||||||
|
// because of the skip back, we need to add one more
|
||||||
|
// for uninitialized injectors. I'm not exactly
|
||||||
|
// sure why this is the case, but I think it has to
|
||||||
|
// do with the fact that we're decrementing skips
|
||||||
|
// before re-checking text
|
||||||
|
if (!$this->injectors[$i]->skip) $this->injectors[$i]->skip++;
|
||||||
$this->injectors[$i]->skip += $offset;
|
$this->injectors[$i]->skip += $offset;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
} elseif ($token) {
|
} elseif ($token) {
|
||||||
// regular case
|
// regular case
|
||||||
$this->outputTokens[] = $token;
|
$this->outputTokens[] = $token;
|
||||||
|
@@ -8,19 +8,38 @@ require_once 'HTMLPurifier/TagTransform.php';
|
|||||||
require_once 'HTMLPurifier/AttrValidator.php';
|
require_once 'HTMLPurifier/AttrValidator.php';
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'Core', 'RemoveInvalidImg', true, 'bool',
|
'Core', 'RemoveInvalidImg', true, 'bool', '
|
||||||
'This directive enables pre-emptive URI checking in <code>img</code> '.
|
<p>
|
||||||
'tags, as the attribute validation strategy is not authorized to '.
|
This directive enables pre-emptive URI checking in <code>img</code>
|
||||||
'remove elements from the document. This directive has been available '.
|
tags, as the attribute validation strategy is not authorized to
|
||||||
'since 1.3.0, revert to pre-1.3.0 behavior by setting to false.'
|
remove elements from the document. This directive has been available
|
||||||
|
since 1.3.0, revert to pre-1.3.0 behavior by setting to false.
|
||||||
|
</p>
|
||||||
|
'
|
||||||
);
|
);
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'Core', 'RemoveScriptContents', true, 'bool', '
|
'Core', 'RemoveScriptContents', null, 'bool/null', '
|
||||||
<p>
|
<p>
|
||||||
This directive enables HTML Purifier to remove not only script tags
|
This directive enables HTML Purifier to remove not only script tags
|
||||||
but all of their contents. This directive has been available since 2.0.0,
|
but all of their contents. This directive has been deprecated since 2.1.0,
|
||||||
revert to pre-2.0.0 behavior by setting to false.
|
and when not set the value of %Core.HiddenElements will take
|
||||||
|
precedence. This directive has been available since 2.0.0, and can be used to
|
||||||
|
revert to pre-2.0.0 behavior by setting it to false.
|
||||||
|
</p>
|
||||||
|
'
|
||||||
|
);
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'Core', 'HiddenElements', array('script' => true, 'style' => true), 'lookup', '
|
||||||
|
<p>
|
||||||
|
This directive is a lookup array of elements which should have their
|
||||||
|
contents removed when they are not allowed by the HTML definition.
|
||||||
|
For example, the contents of a <code>script</code> tag are not
|
||||||
|
normally shown in a document, so if script tags are to be removed,
|
||||||
|
their contents should be removed too. This is opposed to a <code>b</code>
|
||||||
|
tag, which defines some presentational changes but does not hide its
|
||||||
|
contents.
|
||||||
</p>
|
</p>
|
||||||
'
|
'
|
||||||
);
|
);
|
||||||
@@ -43,7 +62,16 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
$escape_invalid_tags = $config->get('Core', 'EscapeInvalidTags');
|
||||||
$remove_invalid_img = $config->get('Core', 'RemoveInvalidImg');
|
$remove_invalid_img = $config->get('Core', 'RemoveInvalidImg');
|
||||||
|
|
||||||
$remove_script_contents = $config->get('Core', 'RemoveScriptContents');
|
$remove_script_contents = $config->get('Core', 'RemoveScriptContents');
|
||||||
|
$hidden_elements = $config->get('Core', 'HiddenElements');
|
||||||
|
|
||||||
|
// remove script contents compatibility
|
||||||
|
if ($remove_script_contents === true) {
|
||||||
|
$hidden_elements['script'] = true;
|
||||||
|
} elseif ($remove_script_contents === false && isset($hidden_elements['script'])) {
|
||||||
|
unset($hidden_elements['script']);
|
||||||
|
}
|
||||||
|
|
||||||
$attr_validator = new HTMLPurifier_AttrValidator();
|
$attr_validator = new HTMLPurifier_AttrValidator();
|
||||||
|
|
||||||
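The compatibility shim in this hunk folds the deprecated tri-state `%Core.RemoveScriptContents` into the `%Core.HiddenElements` lookup. A sketch of that merge as a pure function follows; the name `resolve_hidden_elements` is hypothetical, and the `isset` guard from the diff is dropped because `unset` on a missing key is harmless.

```php
<?php
// Hypothetical helper (not in the library) reproducing the merge logic.
// $remove_script_contents is true, false, or null (null = not set).
function resolve_hidden_elements($hidden_elements, $remove_script_contents) {
    if ($remove_script_contents === true) {
        // explicit true: script contents are always removed
        $hidden_elements['script'] = true;
    } elseif ($remove_script_contents === false) {
        // explicit false: restore pre-2.0.0 behavior for script only
        unset($hidden_elements['script']);
    }
    // null: the lookup is used exactly as configured
    return $hidden_elements;
}

$default = array('script' => true, 'style' => true);
print_r(resolve_hidden_elements($default, false)); // only 'style' remains
```

The strict comparisons matter here: with a loose `== false`, an unset (`null`) directive would be treated the same as an explicit `false` and would remove `script` from the lookup.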
@@ -88,6 +116,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
|||||||
// mostly everything's good, but
|
// mostly everything's good, but
|
||||||
// we need to make sure required attributes are in order
|
// we need to make sure required attributes are in order
|
||||||
if (
|
if (
|
||||||
|
($token->type === 'start' || $token->type === 'empty') &&
|
||||||
$definition->info[$token->name]->required_attr &&
|
$definition->info[$token->name]->required_attr &&
|
||||||
($token->name != 'img' || $remove_invalid_img) // ensure config option still works
|
($token->name != 'img' || $remove_invalid_img) // ensure config option still works
|
||||||
) {
|
) {
|
||||||
@@ -106,8 +135,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
|||||||
$token->armor['ValidateAttributes'] = true;
|
$token->armor['ValidateAttributes'] = true;
|
||||||
}
|
}
|
||||||
|
|
||||||
// CAN BE GENERICIZED
|
if (isset($hidden_elements[$token->name]) && $token->type == 'start') {
|
||||||
if ($token->name == 'script' && $token->type == 'start') {
|
|
||||||
$textify_comments = $token->name;
|
$textify_comments = $token->name;
|
||||||
} elseif ($token->name === $textify_comments && $token->type == 'end') {
|
} elseif ($token->name === $textify_comments && $token->type == 'end') {
|
||||||
$textify_comments = false;
|
$textify_comments = false;
|
||||||
@@ -122,7 +150,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
|||||||
} else {
|
} else {
|
||||||
// check if we need to destroy all of the tag's children
|
// check if we need to destroy all of the tag's children
|
||||||
// CAN BE GENERICIZED
|
// CAN BE GENERICIZED
|
||||||
if ($token->name == 'script' && $remove_script_contents) {
|
if (isset($hidden_elements[$token->name])) {
|
||||||
if ($token->type == 'start') {
|
if ($token->type == 'start') {
|
||||||
$remove_until = $token->name;
|
$remove_until = $token->name;
|
||||||
} elseif ($token->type == 'empty') {
|
} elseif ($token->type == 'empty') {
|
||||||
@@ -130,7 +158,7 @@ class HTMLPurifier_Strategy_RemoveForeignElements extends HTMLPurifier_Strategy
|
|||||||
} else {
|
} else {
|
||||||
$remove_until = false;
|
$remove_until = false;
|
||||||
}
|
}
|
||||||
if ($e) $e->send(E_ERROR, 'Strategy_RemoveForeignElements: Script removed');
|
if ($e) $e->send(E_ERROR, 'Strategy_RemoveForeignElements: Foreign meta element removed');
|
||||||
} else {
|
} else {
|
||||||
if ($e) $e->send(E_ERROR, 'Strategy_RemoveForeignElements: Foreign element removed');
|
if ($e) $e->send(E_ERROR, 'Strategy_RemoveForeignElements: Foreign element removed');
|
||||||
}
|
}
|
||||||
|
@@ -6,10 +6,6 @@ require_once 'HTMLPurifier/IDAccumulator.php';
|
|||||||
|
|
||||||
require_once 'HTMLPurifier/AttrValidator.php';
|
require_once 'HTMLPurifier/AttrValidator.php';
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
|
||||||
'Attr', 'IDBlacklist', array(), 'list',
|
|
||||||
'Array of IDs not allowed in the document.');
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Validate all attributes in the tokens.
|
* Validate all attributes in the tokens.
|
||||||
*/
|
*/
|
||||||
@@ -19,11 +15,6 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
function execute($tokens, $config, &$context) {
|
function execute($tokens, $config, &$context) {
|
||||||
|
|
||||||
// setup id_accumulator context
|
|
||||||
$id_accumulator = new HTMLPurifier_IDAccumulator();
|
|
||||||
$id_accumulator->load($config->get('Attr', 'IDBlacklist'));
|
|
||||||
$context->register('IDAccumulator', $id_accumulator);
|
|
||||||
|
|
||||||
// setup validator
|
// setup validator
|
||||||
$validator = new HTMLPurifier_AttrValidator();
|
$validator = new HTMLPurifier_AttrValidator();
|
||||||
|
|
||||||
@@ -44,8 +35,7 @@ class HTMLPurifier_Strategy_ValidateAttributes extends HTMLPurifier_Strategy
|
|||||||
|
|
||||||
$tokens[$key] = $token; // for PHP 4
|
$tokens[$key] = $token; // for PHP 4
|
||||||
}
|
}
|
||||||
|
$context->destroy('CurrentToken');
|
||||||
$context->destroy('IDAccumulator');
|
|
||||||
|
|
||||||
return $tokens;
|
return $tokens;
|
||||||
}
|
}
|
||||||
|
119
library/HTMLPurifier/URI.php
Normal file
@@ -0,0 +1,119 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIParser.php';
|
||||||
|
require_once 'HTMLPurifier/URIFilter.php';
|
||||||
|
|
||||||
|
/**
|
||||||
|
* HTML Purifier's internal representation of a URI
|
||||||
|
*/
|
||||||
|
class HTMLPurifier_URI
|
||||||
|
{
|
||||||
|
|
||||||
|
var $scheme, $userinfo, $host, $port, $path, $query, $fragment;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* @note Automatically normalizes scheme and port
|
||||||
|
*/
|
||||||
|
function HTMLPurifier_URI($scheme, $userinfo, $host, $port, $path, $query, $fragment) {
|
||||||
|
$this->scheme = is_null($scheme) || ctype_lower($scheme) ? $scheme : strtolower($scheme);
|
||||||
|
$this->userinfo = $userinfo;
|
||||||
|
$this->host = $host;
|
||||||
|
$this->port = is_null($port) ? $port : (int) $port;
|
||||||
|
$this->path = $path;
|
||||||
|
$this->query = $query;
|
||||||
|
$this->fragment = $fragment;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Retrieves a scheme object corresponding to the URI's scheme/default
|
||||||
|
* @param $config Instance of HTMLPurifier_Config
|
||||||
|
* @param $context Instance of HTMLPurifier_Context
|
||||||
|
* @return Scheme object appropriate for validating this URI
|
||||||
|
*/
|
||||||
|
function getSchemeObj($config, &$context) {
|
||||||
|
$registry =& HTMLPurifier_URISchemeRegistry::instance();
|
||||||
|
if ($this->scheme !== null) {
|
||||||
|
$scheme_obj = $registry->getScheme($this->scheme, $config, $context);
|
||||||
|
if (!$scheme_obj) return false; // invalid scheme, clean it out
|
||||||
|
} else {
|
||||||
|
// no scheme: retrieve the default one
|
||||||
|
$def = $config->getDefinition('URI');
|
||||||
|
$scheme_obj = $registry->getScheme($def->defaultScheme, $config, $context);
|
||||||
|
if (!$scheme_obj) {
|
||||||
|
// something funky happened to the default scheme object
|
||||||
|
trigger_error(
|
||||||
|
'Default scheme object "' . $def->defaultScheme . '" was not readable',
|
||||||
|
E_USER_WARNING
|
||||||
|
);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return $scheme_obj;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Generic validation method applicable for all schemes
|
||||||
|
* @param $config Instance of HTMLPurifier_Config
|
||||||
|
* @param $context Instance of HTMLPurifier_Context
|
||||||
|
* @return True if validation/filtering succeeds, false if failure
|
||||||
|
*/
|
||||||
|
function validate($config, &$context) {
|
||||||
|
|
||||||
|
// validate host
|
||||||
|
if (!is_null($this->host)) {
|
||||||
|
$host_def = new HTMLPurifier_AttrDef_URI_Host();
|
||||||
|
$this->host = $host_def->validate($this->host, $config, $context);
|
||||||
|
if ($this->host === false) $this->host = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// validate port
|
||||||
|
if (!is_null($this->port)) {
|
||||||
|
if ($this->port < 1 || $this->port > 65535) $this->port = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// query and fragment are quite simple in terms of definition:
|
||||||
|
// *( pchar / "/" / "?" ), so define their validation routines
|
||||||
|
// when we start fixing percent encoding
|
||||||
|
|
||||||
|
// path gets to be validated against a hodge-podge of rules depending
|
||||||
|
// on the status of authority and scheme, but it's not that important,
|
||||||
|
// esp. since it won't be applicable to everyone
|
||||||
|
|
||||||
|
return true;
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Convert URI back to string
|
||||||
|
* @return String URI appropriate for output
|
||||||
|
*/
|
||||||
|
function toString() {
|
||||||
|
// reconstruct authority
|
||||||
|
$authority = null;
|
||||||
|
if (!is_null($this->host)) {
|
||||||
|
$authority = '';
|
||||||
|
if(!is_null($this->userinfo)) $authority .= $this->userinfo . '@';
|
||||||
|
$authority .= $this->host;
|
||||||
|
if(!is_null($this->port)) $authority .= ':' . $this->port;
|
||||||
|
}
|
||||||
|
|
||||||
|
// reconstruct the result
|
||||||
|
$result = '';
|
||||||
|
if (!is_null($this->scheme)) $result .= $this->scheme . ':';
|
||||||
|
if (!is_null($authority)) $result .= '//' . $authority;
|
||||||
|
$result .= $this->path;
|
||||||
|
if (!is_null($this->query)) $result .= '?' . $this->query;
|
||||||
|
if (!is_null($this->fragment)) $result .= '#' . $this->fragment;
|
||||||
|
|
||||||
|
return $result;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Returns a copy of the URI object
|
||||||
|
*/
|
||||||
|
function copy() {
|
||||||
|
return unserialize(serialize($this));
|
||||||
|
}
|
||||||
|
|
||||||
|
}
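
A minimal standalone sketch of the recomposition that toString() performs above. The function name `sketch_recompose_uri` and the array-based input are assumptions made for illustration only; they are not part of HTML Purifier. The point it shows: a null component is dropped together with its delimiter, and the authority is only emitted when a host is present.

```php
<?php
// Hypothetical helper (not part of the library): rebuild a URI string
// from its components, mirroring the order used by toString() above.
function sketch_recompose_uri(array $parts) {
    // Fill in any component the caller left out.
    $parts += array(
        'scheme' => null, 'userinfo' => null, 'host' => null, 'port' => null,
        'path' => '', 'query' => null, 'fragment' => null,
    );
    // The authority exists only when there is a host; userinfo and port
    // are ignored otherwise.
    $authority = null;
    if ($parts['host'] !== null) {
        $authority = '';
        if ($parts['userinfo'] !== null) $authority .= $parts['userinfo'] . '@';
        $authority .= $parts['host'];
        if ($parts['port'] !== null) $authority .= ':' . $parts['port'];
    }
    $result = '';
    if ($parts['scheme'] !== null) $result .= $parts['scheme'] . ':';
    if ($authority !== null) $result .= '//' . $authority;
    $result .= $parts['path']; // path is always present, possibly empty
    if ($parts['query'] !== null) $result .= '?' . $parts['query'];
    if ($parts['fragment'] !== null) $result .= '#' . $parts['fragment'];
    return $result;
}
```

Because an empty string is not null, a URI with an empty query (`?` followed by nothing) keeps its question mark, which matches the is_null() checks in the method above.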
|
||||||
|
|
145
library/HTMLPurifier/URIDefinition.php
Normal file
@@ -0,0 +1,145 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/Definition.php';
|
||||||
|
require_once 'HTMLPurifier/URIFilter.php';
|
||||||
|
require_once 'HTMLPurifier/URIParser.php';
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIFilter/DisableExternal.php';
|
||||||
|
require_once 'HTMLPurifier/URIFilter/DisableExternalResources.php';
|
||||||
|
require_once 'HTMLPurifier/URIFilter/HostBlacklist.php';
|
||||||
|
require_once 'HTMLPurifier/URIFilter/MakeAbsolute.php';
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DefinitionID', null, 'string/null', '
|
||||||
|
<p>
|
||||||
|
Unique identifier for a custom-built URI definition. If you want
|
||||||
|
to add custom URIFilters, you must specify this value.
|
||||||
|
This directive has been available since 2.1.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DefinitionRev', 1, 'int', '
|
||||||
|
<p>
|
||||||
|
Revision identifier for your custom definition. See
|
||||||
|
%HTML.DefinitionRev for details. This directive has been available
|
||||||
|
since 2.1.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
// informative URI directives
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DefaultScheme', 'http', 'string', '
|
||||||
|
<p>
|
||||||
|
Defines through what scheme the output will be served, in order to
|
||||||
|
select the proper object validator when no scheme information is present.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'Host', null, 'string/null', '
|
||||||
|
<p>
|
||||||
|
    Defines the domain name of the server, so we can determine whether
|
||||||
|
an absolute URI is from your website or not. Not strictly necessary,
|
||||||
|
as users should be using relative URIs to reference resources on your
|
||||||
|
website. It will, however, let you use absolute URIs to link to
|
||||||
|
subdomains of the domain you post here: i.e. example.com will allow
|
||||||
|
sub.example.com. However, higher up domains will still be excluded:
|
||||||
|
if you set %URI.Host to sub.example.com, example.com will be blocked.
|
||||||
|
<strong>Note:</strong> This directive overrides %URI.Base because
|
||||||
|
a given page may be on a sub-domain, but you wish HTML Purifier to be
|
||||||
|
more relaxed and allow some of the parent domains too.
|
||||||
|
This directive has been available since 1.2.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'Base', null, 'string/null', '
|
||||||
|
<p>
|
||||||
|
The base URI is the URI of the document this purified HTML will be
|
||||||
|
inserted into. This information is important if HTML Purifier needs
|
||||||
|
to calculate absolute URIs from relative URIs, such as when %URI.MakeAbsolute
|
||||||
|
is on. You may use a non-absolute URI for this value, but behavior
|
||||||
|
may vary (%URI.MakeAbsolute deals nicely with both absolute and
|
||||||
|
relative paths, but forwards-compatibility is not guaranteed).
|
||||||
|
<strong>Warning:</strong> If set, the scheme on this URI
|
||||||
|
overrides the one specified by %URI.DefaultScheme. This directive has
|
||||||
|
been available since 2.1.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
class HTMLPurifier_URIDefinition extends HTMLPurifier_Definition
|
||||||
|
{
|
||||||
|
|
||||||
|
var $type = 'URI';
|
||||||
|
var $filters = array();
|
||||||
|
var $registeredFilters = array();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* HTMLPurifier_URI object of the base specified at %URI.Base
|
||||||
|
*/
|
||||||
|
var $base;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* String host to consider "home" base
|
||||||
|
*/
|
||||||
|
var $host;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Name of default scheme based on %URI.DefaultScheme and %URI.Base
|
||||||
|
*/
|
||||||
|
var $defaultScheme;
|
||||||
|
|
||||||
|
function HTMLPurifier_URIDefinition() {
|
||||||
|
$this->registerFilter(new HTMLPurifier_URIFilter_DisableExternal());
|
||||||
|
$this->registerFilter(new HTMLPurifier_URIFilter_DisableExternalResources());
|
||||||
|
$this->registerFilter(new HTMLPurifier_URIFilter_HostBlacklist());
|
||||||
|
$this->registerFilter(new HTMLPurifier_URIFilter_MakeAbsolute());
|
||||||
|
}
|
||||||
|
|
||||||
|
function registerFilter($filter) {
|
||||||
|
$this->registeredFilters[$filter->name] = $filter;
|
||||||
|
}
|
||||||
|
|
||||||
|
function addFilter($filter, $config) {
|
||||||
|
$filter->prepare($config);
|
||||||
|
$this->filters[$filter->name] = $filter;
|
||||||
|
}
|
||||||
|
|
||||||
|
function doSetup($config) {
|
||||||
|
$this->setupMemberVariables($config);
|
||||||
|
$this->setupFilters($config);
|
||||||
|
}
|
||||||
|
|
||||||
|
function setupFilters($config) {
|
||||||
|
foreach ($this->registeredFilters as $name => $filter) {
|
||||||
|
$conf = $config->get('URI', $name);
|
||||||
|
if ($conf !== false && $conf !== null) {
|
||||||
|
$this->addFilter($filter, $config);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
unset($this->registeredFilters);
|
||||||
|
}
|
||||||
|
|
||||||
|
function setupMemberVariables($config) {
|
||||||
|
$this->host = $config->get('URI', 'Host');
|
||||||
|
$base_uri = $config->get('URI', 'Base');
|
||||||
|
if (!is_null($base_uri)) {
|
||||||
|
$parser = new HTMLPurifier_URIParser();
|
||||||
|
$this->base = $parser->parse($base_uri);
|
||||||
|
$this->defaultScheme = $this->base->scheme;
|
||||||
|
if (is_null($this->host)) $this->host = $this->base->host;
|
||||||
|
}
|
||||||
|
if (is_null($this->defaultScheme)) $this->defaultScheme = $config->get('URI', 'DefaultScheme');
|
||||||
|
}
|
||||||
|
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
foreach ($this->filters as $name => $x) {
|
||||||
|
$result = $this->filters[$name]->filter($uri, $config, $context);
|
||||||
|
if (!$result) return false;
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
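
The filter() method above runs every active filter in turn and stops at the first one that returns false. A minimal sketch of that chaining, assuming filters are plain closures and the URI is a plain array (both are simplifications; `sketch_run_filters` and the two example filters are hypothetical names):

```php
<?php
// Hypothetical stand-in for the chain in filter(): each filter may change
// $uri in place, and a false return value vetoes the URI immediately.
function sketch_run_filters(array $filters, array &$uri) {
    foreach ($filters as $name => $filter) {
        if (!$filter($uri)) return false;
    }
    return true;
}

$filters = array(
    // Transforming filter: normalizes the host, never rejects.
    'LowercaseHost' => function (array &$uri) {
        if ($uri['host'] !== null) $uri['host'] = strtolower($uri['host']);
        return true;
    },
    // Rejecting filter: sees the host as left by the previous filter.
    'RejectExampleOrg' => function (array &$uri) {
        return $uri['host'] !== 'example.org';
    },
);

$ok = array('host' => 'Example.COM');
$bad = array('host' => 'EXAMPLE.org');
$ok_result = sketch_run_filters($filters, $ok);
$bad_result = sketch_run_filters($filters, $bad);
```

Order matters in this design: a later filter observes the changes made by earlier ones, which is why the rejecting filter in the sketch still catches the upper-case host.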
|
40
library/HTMLPurifier/URIFilter.php
Normal file
@@ -0,0 +1,40 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Chainable filters for custom URI processing.
|
||||||
|
*
|
||||||
|
 * These filters can perform custom actions on a URI object,
|
||||||
|
* including transformation or blacklisting.
|
||||||
|
*
|
||||||
|
* @warning This filter is called before scheme object validation occurs.
|
||||||
|
* Make sure, if you require a specific scheme object, you
|
||||||
|
 *          check that it exists. This allows filters to convert
|
||||||
|
* proprietary URI schemes into regular ones.
|
||||||
|
*/
|
||||||
|
class HTMLPurifier_URIFilter
|
||||||
|
{
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Unique identifier of filter
|
||||||
|
*/
|
||||||
|
var $name;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Performs initialization for the filter
|
||||||
|
*/
|
||||||
|
function prepare($config) {}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Filter a URI object
|
||||||
|
* @param &$uri Reference to URI object
|
||||||
|
* @param $config Instance of HTMLPurifier_Config
|
||||||
|
* @param &$context Instance of HTMLPurifier_Context
|
||||||
|
* @return bool Whether or not to continue processing: false indicates
|
||||||
|
* URL is no good, true indicates continue processing. Note that
|
||||||
|
* all changes are committed directly on the URI object
|
||||||
|
*/
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
trigger_error('Cannot call abstract function', E_USER_ERROR);
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
34
library/HTMLPurifier/URIFilter/DisableExternal.php
Normal file
@@ -0,0 +1,34 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIFilter.php';
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DisableExternal', false, 'bool',
|
||||||
|
'Disables links to external websites. This is a highly effective '.
|
||||||
|
    'anti-spam and anti-pagerank-leech measure, but comes at a hefty price: no '.
|
||||||
|
'links or images outside of your domain will be allowed. Non-linkified '.
|
||||||
|
'URIs will still be preserved. If you want to be able to link to '.
|
||||||
|
'subdomains or use absolute URIs, specify %URI.Host for your website. '.
|
||||||
|
'This directive has been available since 1.2.0.'
|
||||||
|
);
|
||||||
|
|
||||||
|
class HTMLPurifier_URIFilter_DisableExternal extends HTMLPurifier_URIFilter
|
||||||
|
{
|
||||||
|
var $name = 'DisableExternal';
|
||||||
|
var $ourHostParts = false;
|
||||||
|
function prepare($config) {
|
||||||
|
$our_host = $config->get('URI', 'Host');
|
||||||
|
if ($our_host !== null) $this->ourHostParts = array_reverse(explode('.', $our_host));
|
||||||
|
}
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
if (is_null($uri->host)) return true;
|
||||||
|
if ($this->ourHostParts === false) return false;
|
||||||
|
$host_parts = array_reverse(explode('.', $uri->host));
|
||||||
|
foreach ($this->ourHostParts as $i => $x) {
|
||||||
|
if (!isset($host_parts[$i])) return false;
|
||||||
|
if ($host_parts[$i] != $this->ourHostParts[$i]) return false;
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
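
The host comparison in DisableExternal works label by label from the right, so subdomains of the configured host pass while parent or unrelated domains do not. A minimal standalone sketch of that check (the function name `sketch_is_internal_host` is hypothetical; unlike the class above it uses a strict comparison and takes both hosts as arguments):

```php
<?php
// Hypothetical helper: true when $host equals $ourHost or is one of
// its subdomains. Labels are compared from the top-level domain downward.
function sketch_is_internal_host($ourHost, $host) {
    $ours = array_reverse(explode('.', $ourHost));
    $theirs = array_reverse(explode('.', $host));
    foreach ($ours as $i => $label) {
        // The candidate has fewer labels than our host: it is a parent domain.
        if (!isset($theirs[$i])) return false;
        if ($theirs[$i] !== $label) return false;
    }
    return true;
}
```

Comparing whole labels rather than doing a string suffix match is what keeps a host such as badexample.com from passing as example.com.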
|
||||||
|
|
26
library/HTMLPurifier/URIFilter/DisableExternalResources.php
Normal file
@@ -0,0 +1,26 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIFilter/DisableExternal.php';
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'DisableExternalResources', false, 'bool',
|
||||||
|
'Disables the embedding of external resources, preventing users from '.
|
||||||
|
'embedding things like images from other hosts. This prevents '.
|
||||||
|
'access tracking (good for email viewers), bandwidth leeching, '.
|
||||||
|
'cross-site request forging, goatse.cx posting, and '.
|
||||||
|
'other nasties, but also results in '.
|
||||||
|
'a loss of end-user functionality (they can\'t directly post a pic '.
|
||||||
|
'they posted from Flickr anymore). Use it if you don\'t have a '.
|
||||||
|
'robust user-content moderation team. This directive has been '.
|
||||||
|
'available since 1.3.0.'
|
||||||
|
);
|
||||||
|
|
||||||
|
class HTMLPurifier_URIFilter_DisableExternalResources extends HTMLPurifier_URIFilter_DisableExternal
|
||||||
|
{
|
||||||
|
var $name = 'DisableExternalResources';
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
if (!$context->get('EmbeddedURI', true)) return true;
|
||||||
|
return parent::filter($uri, $config, $context);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
28
library/HTMLPurifier/URIFilter/HostBlacklist.php
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIFilter.php';
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'HostBlacklist', array(), 'list',
|
||||||
|
'List of strings that are forbidden in the host of any URI. Use it to '.
|
||||||
|
'kill domain names of spam, etc. Note that it will catch anything in '.
|
||||||
|
'the domain, so <tt>moo.com</tt> will catch <tt>moo.com.example.com</tt>. '.
|
||||||
|
'This directive has been available since 1.3.0.'
|
||||||
|
);
|
||||||
|
|
||||||
|
class HTMLPurifier_URIFilter_HostBlacklist extends HTMLPurifier_URIFilter
|
||||||
|
{
|
||||||
|
var $name = 'HostBlacklist';
|
||||||
|
var $blacklist = array();
|
||||||
|
function prepare($config) {
|
||||||
|
$this->blacklist = $config->get('URI', 'HostBlacklist');
|
||||||
|
}
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
foreach($this->blacklist as $blacklisted_host_fragment) {
|
||||||
|
if (strpos($uri->host, $blacklisted_host_fragment) !== false) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
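
The blacklist is a plain substring test against the host, which is why the directive text warns that moo.com also catches moo.com.example.com. A small sketch of that behavior, with a hypothetical function name and with the return value inverted relative to filter() (true here means "blacklisted"):

```php
<?php
// Hypothetical helper: true when any blacklisted fragment occurs anywhere
// in the host. A null host is treated as an empty string and never matches
// a non-empty fragment.
function sketch_host_is_blacklisted($host, array $blacklist) {
    foreach ($blacklist as $fragment) {
        if (strpos((string) $host, $fragment) !== false) return true;
    }
    return false;
}
```

The strict `!== false` comparison matters: strpos() returns 0 when the fragment sits at the start of the host, and 0 is falsy.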
|
119
library/HTMLPurifier/URIFilter/MakeAbsolute.php
Normal file
@@ -0,0 +1,119 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
// does not support network paths
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIFilter.php';
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::define(
|
||||||
|
'URI', 'MakeAbsolute', false, 'bool', '
|
||||||
|
<p>
|
||||||
|
Converts all URIs into absolute forms. This is useful when the HTML
|
||||||
|
being filtered assumes a specific base path, but will actually be
|
||||||
|
viewed in a different context (and setting an alternate base URI is
|
||||||
|
not possible). %URI.Base must be set for this directive to work.
|
||||||
|
This directive has been available since 2.1.0.
|
||||||
|
</p>
|
||||||
|
');
|
||||||
|
|
||||||
|
class HTMLPurifier_URIFilter_MakeAbsolute extends HTMLPurifier_URIFilter
|
||||||
|
{
|
||||||
|
var $name = 'MakeAbsolute';
|
||||||
|
var $base;
|
||||||
|
var $basePathStack = array();
|
||||||
|
function prepare($config) {
|
||||||
|
$def = $config->getDefinition('URI');
|
||||||
|
$this->base = $def->base;
|
||||||
|
if (is_null($this->base)) {
|
||||||
|
            trigger_error('URI.MakeAbsolute is being ignored due to lack of value for URI.Base configuration', E_USER_WARNING);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
$this->base->fragment = null; // fragment is invalid for base URI
|
||||||
|
$stack = explode('/', $this->base->path);
|
||||||
|
array_pop($stack); // discard last segment
|
||||||
|
$stack = $this->_collapseStack($stack); // do pre-parsing
|
||||||
|
$this->basePathStack = $stack;
|
||||||
|
}
|
||||||
|
function filter(&$uri, $config, &$context) {
|
||||||
|
if (is_null($this->base)) return true; // abort early
|
||||||
|
if (
|
||||||
|
$uri->path === '' && is_null($uri->scheme) &&
|
||||||
|
is_null($uri->host) && is_null($uri->query) && is_null($uri->fragment)
|
||||||
|
) {
|
||||||
|
// reference to current document
|
||||||
|
$uri = $this->base->copy();
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
if (!is_null($uri->scheme)) {
|
||||||
|
// absolute URI already: don't change
|
||||||
|
if (!is_null($uri->host)) return true;
|
||||||
|
$scheme_obj = $uri->getSchemeObj($config, $context);
|
||||||
|
if (!$scheme_obj) {
|
||||||
|
// scheme not recognized
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
if (!$scheme_obj->hierarchical) {
|
||||||
|
                // non-hierarchical URI with explicit scheme, don't change
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
            // special case: has a scheme that is always hierarchical, but no authority
|
||||||
|
}
|
||||||
|
if (!is_null($uri->host)) {
|
||||||
|
// network path, don't bother
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
if ($uri->path === '') {
|
||||||
|
$uri->path = $this->base->path;
|
||||||
|
}elseif ($uri->path[0] !== '/') {
|
||||||
|
// relative path, needs more complicated processing
|
||||||
|
$stack = explode('/', $uri->path);
|
||||||
|
$new_stack = array_merge($this->basePathStack, $stack);
|
||||||
|
$new_stack = $this->_collapseStack($new_stack);
|
||||||
|
$uri->path = implode('/', $new_stack);
|
||||||
|
}
|
||||||
|
// re-combine
|
||||||
|
$uri->scheme = $this->base->scheme;
|
||||||
|
if (is_null($uri->userinfo)) $uri->userinfo = $this->base->userinfo;
|
||||||
|
if (is_null($uri->host)) $uri->host = $this->base->host;
|
||||||
|
if (is_null($uri->port)) $uri->port = $this->base->port;
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Resolve dots and double-dots in a path stack
|
||||||
|
* @private
|
||||||
|
*/
|
||||||
|
function _collapseStack($stack) {
|
||||||
|
        $result = array();
        $is_folder = false; // stays defined even when $stack is empty
|
||||||
|
for ($i = 0; isset($stack[$i]); $i++) {
|
||||||
|
$is_folder = false;
|
||||||
|
// absorb an internally duplicated slash
|
||||||
|
if ($stack[$i] == '' && $i && isset($stack[$i+1])) continue;
|
||||||
|
if ($stack[$i] == '..') {
|
||||||
|
if (!empty($result)) {
|
||||||
|
$segment = array_pop($result);
|
||||||
|
if ($segment === '' && empty($result)) {
|
||||||
|
// error case: attempted to back out too far:
|
||||||
|
// restore the leading slash
|
||||||
|
$result[] = '';
|
||||||
|
} elseif ($segment === '..') {
|
||||||
|
$result[] = '..'; // cannot remove .. with ..
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// relative path, preserve the double-dots
|
||||||
|
$result[] = '..';
|
||||||
|
}
|
||||||
|
$is_folder = true;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if ($stack[$i] == '.') {
|
||||||
|
// silently absorb
|
||||||
|
$is_folder = true;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
$result[] = $stack[$i];
|
||||||
|
}
|
||||||
|
if ($is_folder) $result[] = '';
|
||||||
|
return $result;
|
||||||
|
}
|
||||||
|
}
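
A standalone sketch of the dot-segment handling in _collapseStack(), assuming the path has already been split on "/". The function name `sketch_collapse_stack` is hypothetical, and the sketch initializes `$is_folder` before the loop so that an empty stack is handled without a notice. It is an illustration of the approach above, not a full implementation of the RFC's dot-segment removal.

```php
<?php
// Hypothetical port: resolve "." and ".." in a list of path segments.
// A leading '' segment stands for the leading slash of an absolute path.
function sketch_collapse_stack(array $stack) {
    $result = array();
    $is_folder = false;
    for ($i = 0; isset($stack[$i]); $i++) {
        $is_folder = false;
        // Absorb an internally duplicated slash ("a//b").
        if ($stack[$i] === '' && $i && isset($stack[$i + 1])) continue;
        if ($stack[$i] === '..') {
            if (!empty($result)) {
                $segment = array_pop($result);
                if ($segment === '' && empty($result)) {
                    // Backed out past the root: restore the leading slash.
                    $result[] = '';
                } elseif ($segment === '..') {
                    $result[] = '..'; // cannot remove .. with ..
                }
            } else {
                // Relative path: keep the double-dots.
                $result[] = '..';
            }
            $is_folder = true;
            continue;
        }
        if ($stack[$i] === '.') {
            $is_folder = true; // silently absorbed
            continue;
        }
        $result[] = $stack[$i];
    }
    // A path ending in "." or ".." refers to a folder: keep a trailing slash.
    if ($is_folder) $result[] = '';
    return $result;
}
```

Usage is split, collapse, join: `implode('/', sketch_collapse_stack(explode('/', $path)))`.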
|
||||||
|
|
62
library/HTMLPurifier/URIParser.php
Normal file
@@ -0,0 +1,62 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URI.php';
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Parses a URI into the components and fragment identifier as specified
|
||||||
|
* by RFC 2396.
|
||||||
|
* @todo Replace regexps with a native PHP parser
|
||||||
|
*/
|
||||||
|
class HTMLPurifier_URIParser
|
||||||
|
{
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Parses a URI
|
||||||
|
* @param $uri string URI to parse
|
||||||
|
* @return HTMLPurifier_URI representation of URI
|
||||||
|
*/
|
||||||
|
function parse($uri) {
|
||||||
|
$r_URI = '!'.
|
||||||
|
'(([^:/?#<>\'"]+):)?'. // 2. Scheme
|
||||||
|
'(//([^/?#<>\'"]*))?'. // 4. Authority
|
||||||
|
'([^?#<>\'"]*)'. // 5. Path
|
||||||
|
'(\?([^#<>\'"]*))?'. // 7. Query
|
||||||
|
'(#([^<>\'"]*))?'. // 8. Fragment
|
||||||
|
'!';
|
||||||
|
|
||||||
|
$matches = array();
|
||||||
|
$result = preg_match($r_URI, $uri, $matches);
|
||||||
|
|
||||||
|
if (!$result) return false; // *really* invalid URI
|
||||||
|
|
||||||
|
        // separate out parts
|
||||||
|
$scheme = !empty($matches[1]) ? $matches[2] : null;
|
||||||
|
$authority = !empty($matches[3]) ? $matches[4] : null;
|
||||||
|
$path = $matches[5]; // always present, can be empty
|
||||||
|
$query = !empty($matches[6]) ? $matches[7] : null;
|
||||||
|
$fragment = !empty($matches[8]) ? $matches[9] : null;
|
||||||
|
|
||||||
|
// further parse authority
|
||||||
|
if ($authority !== null) {
|
||||||
|
// ridiculously inefficient: it's a stacked regex!
|
||||||
|
$HEXDIG = '[A-Fa-f0-9]';
|
||||||
|
$unreserved = 'A-Za-z0-9-._~'; // make sure you wrap with []
|
||||||
|
$sub_delims = '!$&\'()'; // needs []
|
||||||
|
$pct_encoded = "%$HEXDIG$HEXDIG";
|
||||||
|
$r_userinfo = "(?:[$unreserved$sub_delims:]|$pct_encoded)*";
|
||||||
|
$r_authority = "/^(($r_userinfo)@)?(\[[^\]]+\]|[^:]*)(:(\d*))?/";
|
||||||
|
$matches = array();
|
||||||
|
preg_match($r_authority, $authority, $matches);
|
||||||
|
$userinfo = !empty($matches[1]) ? $matches[2] : null;
|
||||||
|
$host = !empty($matches[3]) ? $matches[3] : '';
|
||||||
|
$port = !empty($matches[4]) ? (int) $matches[5] : null;
|
||||||
|
} else {
|
||||||
|
$port = $host = $userinfo = null;
|
||||||
|
}
|
||||||
|
|
||||||
|
return new HTMLPurifier_URI(
|
||||||
|
$scheme, $userinfo, $host, $port, $path, $query, $fragment);
|
||||||
|
}
|
||||||
|
|
||||||
|
}
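
A minimal sketch of the two-stage split that parse() performs: one regular expression separates scheme, authority, path, query and fragment, and a second one breaks the authority into userinfo, host and port. The function name `sketch_parse_uri` and the array return value are assumptions for illustration, and the userinfo pattern is deliberately simplified to "anything before the @" rather than the stricter character classes used above.

```php
<?php
// Hypothetical standalone version of the regex split shown above.
// Returns an associative array instead of an HTMLPurifier_URI object.
function sketch_parse_uri($uri) {
    $r_URI = '!' .
        '(([^:/?#<>\'"]+):)?' .   // 1-2. scheme
        '(//([^/?#<>\'"]*))?' .   // 3-4. authority
        '([^?#<>\'"]*)' .         // 5.   path (always present, may be empty)
        '(\?([^#<>\'"]*))?' .     // 6-7. query
        '(#([^<>\'"]*))?' .       // 8-9. fragment
        '!';
    if (!preg_match($r_URI, $uri, $m)) return false;
    $parts = array(
        'scheme'   => !empty($m[1]) ? $m[2] : null,
        'userinfo' => null,
        'host'     => null,
        'port'     => null,
        'path'     => $m[5],
        'query'    => !empty($m[6]) ? $m[7] : null,
        'fragment' => !empty($m[8]) ? $m[9] : null,
    );
    if (!empty($m[3])) {
        // Simplified authority split: [userinfo@]host[:port]
        preg_match('/^(([^@]*)@)?(\[[^\]]+\]|[^:]*)(:(\d*))?/', $m[4], $a);
        $parts['userinfo'] = !empty($a[1]) ? $a[2] : null;
        $parts['host']     = !empty($a[3]) ? $a[3] : '';
        $parts['port']     = !empty($a[4]) ? (int) $a[5] : null;
    }
    return $parts;
}
```

Testing the outer group (for example `$m[6]`, which includes the question mark) rather than the inner one is what distinguishes "no query" (null) from "empty query" (an empty string).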
|
||||||
|
|
@@ -19,24 +19,24 @@ class HTMLPurifier_URIScheme
|
|||||||
*/
|
*/
|
||||||
var $browsable = false;
|
var $browsable = false;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Whether or not the URI always uses <hier_part>, resolves edge cases
|
||||||
|
* with making relative URIs absolute
|
||||||
|
*/
|
||||||
|
var $hierarchical = false;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Validates the components of a URI
|
* Validates the components of a URI
|
||||||
* @note This implementation should be called by children if they define
|
* @note This implementation should be called by children if they define
|
||||||
* a default port, as it does port processing.
|
* a default port, as it does port processing.
|
||||||
* @note Fragment is omitted as that is scheme independent
|
* @param $uri Instance of HTMLPurifier_URI
|
||||||
* @param $userinfo User info found before at sign in authority
|
|
||||||
* @param $host Hostname in authority
|
|
||||||
* @param $port Port found after colon in authority
|
|
||||||
* @param $path Path of URI
|
|
||||||
* @param $query Query of URI, found after question mark
|
|
||||||
* @param $config HTMLPurifier_Config object
|
* @param $config HTMLPurifier_Config object
|
||||||
* @param $context HTMLPurifier_Context object
|
* @param $context HTMLPurifier_Context object
|
||||||
|
* @return Bool success or failure
|
||||||
*/
|
*/
|
||||||
function validateComponents(
|
function validate(&$uri, $config, &$context) {
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
if ($this->default_port == $uri->port) $uri->port = null;
|
||||||
) {
|
return true;
|
||||||
if ($this->default_port == $port) $port = null;
|
|
||||||
return array($userinfo, $host, $port, $path, $query);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
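
Port handling is spread over two places in this change: HTMLPurifier_URI::validate() discards ports outside 1 to 65535, and the scheme's validate() drops a port equal to the scheme default. A combined sketch under those assumptions (the function name `sketch_normalize_port` is hypothetical, and it uses a strict comparison after an integer cast where the scheme code uses a loose one):

```php
<?php
// Hypothetical helper combining both port rules described above.
// Returns null when the port should be omitted from the output URI.
function sketch_normalize_port($port, $default_port) {
    if ($port === null) return null;
    $port = (int) $port;
    // Out-of-range ports are discarded rather than rejected.
    if ($port < 1 || $port > 65535) return null;
    // A port equal to the scheme default is redundant.
    if ($port === $default_port) return null;
    return $port;
}
```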
|
||||||
|
@@ -9,35 +9,35 @@ class HTMLPurifier_URIScheme_ftp extends HTMLPurifier_URIScheme {
|
|||||||
|
|
||||||
var $default_port = 21;
|
var $default_port = 21;
|
||||||
var $browsable = true; // usually
|
var $browsable = true; // usually
|
||||||
|
var $hierarchical = true;
|
||||||
|
|
||||||
|
function validate(&$uri, $config, &$context) {
|
||||||
|
parent::validate($uri, $config, $context);
|
||||||
|
$uri->query = null;
|
||||||
|
|
||||||
function validateComponents(
|
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
|
||||||
) {
|
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
|
||||||
parent::validateComponents(
|
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context );
|
|
||||||
$semicolon_pos = strrpos($path, ';'); // reverse
|
|
||||||
if ($semicolon_pos !== false) {
|
|
||||||
// typecode check
|
// typecode check
|
||||||
$type = substr($path, $semicolon_pos + 1); // no semicolon
|
$semicolon_pos = strrpos($uri->path, ';'); // reverse
|
||||||
$path = substr($path, 0, $semicolon_pos);
|
if ($semicolon_pos !== false) {
|
||||||
|
$type = substr($uri->path, $semicolon_pos + 1); // no semicolon
|
||||||
|
$uri->path = substr($uri->path, 0, $semicolon_pos);
|
||||||
$type_ret = '';
|
$type_ret = '';
|
||||||
if (strpos($type, '=') !== false) {
|
if (strpos($type, '=') !== false) {
|
||||||
// figure out whether or not the declaration is correct
|
// figure out whether or not the declaration is correct
|
||||||
list($key, $typecode) = explode('=', $type, 2);
|
list($key, $typecode) = explode('=', $type, 2);
|
||||||
if ($key !== 'type') {
|
if ($key !== 'type') {
|
||||||
// invalid key, tack it back on encoded
|
// invalid key, tack it back on encoded
|
||||||
$path .= '%3B' . $type;
|
$uri->path .= '%3B' . $type;
|
||||||
} elseif ($typecode === 'a' || $typecode === 'i' || $typecode === 'd') {
|
} elseif ($typecode === 'a' || $typecode === 'i' || $typecode === 'd') {
|
||||||
$type_ret = ";type=$typecode";
|
$type_ret = ";type=$typecode";
|
||||||
}
|
}
|
||||||
} else {
|
} else {
|
||||||
$path .= '%3B' . $type;
|
$uri->path .= '%3B' . $type;
|
||||||
}
|
}
|
||||||
$path = str_replace(';', '%3B', $path);
|
$uri->path = str_replace(';', '%3B', $uri->path);
|
||||||
$path .= $type_ret;
|
$uri->path .= $type_ret;
|
||||||
}
|
}
|
||||||
return array($userinfo, $host, $port, $path, null);
|
|
||||||
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
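
The ftp typecode check above operates on the text after the last semicolon in the path. A standalone sketch of the same steps on a plain string (the function name `sketch_clean_ftp_path` is hypothetical): a valid `;type=a`, `;type=i` or `;type=d` suffix is kept, an unknown typecode is dropped, and any other semicolon is percent-encoded.

```php
<?php
// Hypothetical port of the typecode handling, working on a path string
// instead of an HTMLPurifier_URI object.
function sketch_clean_ftp_path($path) {
    $semicolon_pos = strrpos($path, ';'); // last semicolon
    if ($semicolon_pos === false) return $path;
    $type = substr($path, $semicolon_pos + 1); // text after the semicolon
    $path = substr($path, 0, $semicolon_pos);
    $type_ret = '';
    if (strpos($type, '=') !== false) {
        list($key, $typecode) = explode('=', $type, 2);
        if ($key !== 'type') {
            // Not a typecode at all: put it back, with the semicolon encoded.
            $path .= '%3B' . $type;
        } elseif ($typecode === 'a' || $typecode === 'i' || $typecode === 'd') {
            $type_ret = ";type=$typecode";
        }
        // A "type" key with an unknown code is silently dropped.
    } else {
        $path .= '%3B' . $type;
    }
    // Any remaining semicolons belong to the path proper and are encoded.
    $path = str_replace(';', '%3B', $path);
    return $path . $type_ret;
}
```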
|
||||||
|
@@ -9,14 +9,12 @@ class HTMLPurifier_URIScheme_http extends HTMLPurifier_URIScheme {
|
|||||||
|
|
||||||
var $default_port = 80;
|
var $default_port = 80;
|
||||||
var $browsable = true;
|
var $browsable = true;
|
||||||
|
var $hierarchical = true;
|
||||||
|
|
||||||
function validateComponents(
|
function validate(&$uri, $config, &$context) {
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
parent::validate($uri, $config, $context);
|
||||||
) {
|
$uri->userinfo = null;
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
return true;
|
||||||
parent::validateComponents(
|
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context );
|
|
||||||
return array(null, $host, $port, $path, $query);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -15,14 +15,13 @@ class HTMLPurifier_URIScheme_mailto extends HTMLPurifier_URIScheme {
|
|||||||
|
|
||||||
var $browsable = false;
|
var $browsable = false;
|
||||||
|
|
||||||
function validateComponents(
|
function validate(&$uri, $config, &$context) {
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
parent::validate($uri, $config, $context);
|
||||||
) {
|
$uri->userinfo = null;
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
$uri->host = null;
|
||||||
parent::validateComponents(
|
$uri->port = null;
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context );
|
|
||||||
// we need to validate path against RFC 2368's addr-spec
|
// we need to validate path against RFC 2368's addr-spec
|
||||||
return array(null, null, null, $path, $query);
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -9,14 +9,14 @@ class HTMLPurifier_URIScheme_news extends HTMLPurifier_URIScheme {
|
|||||||
|
|
||||||
var $browsable = false;
|
var $browsable = false;
|
||||||
|
|
||||||
function validateComponents(
|
function validate(&$uri, $config, &$context) {
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
parent::validate($uri, $config, $context);
|
||||||
) {
|
$uri->userinfo = null;
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
$uri->host = null;
|
||||||
parent::validateComponents(
|
$uri->port = null;
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context );
|
$uri->query = null;
|
||||||
// typecode check needed on path
|
// typecode check needed on path
|
||||||
return array(null, null, null, $path, null);
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -10,13 +10,11 @@ class HTMLPurifier_URIScheme_nntp extends HTMLPurifier_URIScheme {
|
|||||||
var $default_port = 119;
|
var $default_port = 119;
|
||||||
var $browsable = false;
|
var $browsable = false;
|
||||||
|
|
||||||
function validateComponents(
|
function validate(&$uri, $config, &$context) {
|
||||||
$userinfo, $host, $port, $path, $query, $config, &$context
|
parent::validate($uri, $config, $context);
|
||||||
) {
|
$uri->userinfo = null;
|
||||||
list($userinfo, $host, $port, $path, $query) =
|
$uri->query = null;
|
||||||
parent::validateComponents(
|
return true;
|
||||||
$userinfo, $host, $port, $path, $query, $config, $context );
|
|
||||||
return array(null, $host, $port, $path, null);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -1,5 +1,12 @@
|
|||||||
<?php
|
<?php
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/URIScheme/http.php';
|
||||||
|
require_once 'HTMLPurifier/URIScheme/https.php';
|
||||||
|
require_once 'HTMLPurifier/URIScheme/mailto.php';
|
||||||
|
require_once 'HTMLPurifier/URIScheme/ftp.php';
|
||||||
|
require_once 'HTMLPurifier/URIScheme/nntp.php';
|
||||||
|
require_once 'HTMLPurifier/URIScheme/news.php';
|
||||||
|
|
||||||
HTMLPurifier_ConfigSchema::define(
|
HTMLPurifier_ConfigSchema::define(
|
||||||
'URI', 'AllowedSchemes', array(
|
'URI', 'AllowedSchemes', array(
|
||||||
        'http' => true, // "Hypertext Transfer Protocol", 'nuff said
|
        'http' => true, // "Hypertext Transfer Protocol", 'nuff said
|
||||||
@@ -7,7 +14,6 @@ HTMLPurifier_ConfigSchema::define(
|
|||||||
// quite useful, but not necessary
|
// quite useful, but not necessary
|
||||||
'mailto' => true,// Email
|
'mailto' => true,// Email
|
||||||
'ftp' => true, // "File Transfer Protocol"
|
'ftp' => true, // "File Transfer Protocol"
|
||||||
'irc' => true, // "Internet Relay Chat", usually needs another app
|
|
||||||
// for Usenet, these two are similar, but distinct
|
// for Usenet, these two are similar, but distinct
|
||||||
'nntp' => true, // individual Netnews articles
|
'nntp' => true, // individual Netnews articles
|
||||||
'news' => true // newsgroup or individual Netnews articles
|
'news' => true // newsgroup or individual Netnews articles
|
||||||
@@ -54,12 +60,6 @@ class HTMLPurifier_URISchemeRegistry
|
|||||||
*/
|
*/
|
||||||
var $schemes = array();
|
var $schemes = array();
|
||||||
|
|
||||||
/**
|
|
||||||
* Directory where scheme objects can be found
|
|
||||||
* @private
|
|
||||||
*/
|
|
||||||
var $_scheme_dir = null;
|
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Retrieves a scheme validator object
|
* Retrieves a scheme validator object
|
||||||
* @param $scheme String scheme name like http or mailto
|
* @param $scheme String scheme name like http or mailto
|
||||||
@@ -79,11 +79,8 @@ class HTMLPurifier_URISchemeRegistry
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (isset($this->schemes[$scheme])) return $this->schemes[$scheme];
|
if (isset($this->schemes[$scheme])) return $this->schemes[$scheme];
|
||||||
if (empty($this->_dir)) $this->_dir = dirname(__FILE__) . '/URIScheme/';
|
|
||||||
|
|
||||||
if (!isset($allowed_schemes[$scheme])) return $null;
|
if (!isset($allowed_schemes[$scheme])) return $null;
|
||||||
|
|
||||||
@include_once $this->_dir . $scheme . '.php';
|
|
||||||
$class = 'HTMLPurifier_URIScheme_' . $scheme;
|
$class = 'HTMLPurifier_URIScheme_' . $scheme;
|
||||||
if (!class_exists($class)) return $null;
|
if (!class_exists($class)) return $null;
|
||||||
$this->schemes[$scheme] = new $class();
|
$this->schemes[$scheme] = new $class();
|
||||||
@@ -91,7 +88,7 @@ class HTMLPurifier_URISchemeRegistry
|
|||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Registers a custom scheme to the cache.
|
* Registers a custom scheme to the cache, bypassing reflection.
|
||||||
* @param $scheme Scheme name
|
* @param $scheme Scheme name
|
||||||
* @param $scheme_obj HTMLPurifier_URIScheme object
|
* @param $scheme_obj HTMLPurifier_URIScheme object
|
||||||
*/
|
*/
|
||||||
|
64
maintenance/PH5P.patch
Normal file
@@ -0,0 +1,64 @@
|
|||||||
|
--- C:\Users\Edward\Webs\htmlpurifier\maintenance\PH5P.php 2007-11-04 23:41:49.074543700 -0500
|
||||||
|
+++ C:\Users\Edward\Webs\htmlpurifier\maintenance/PH5P.new.php 2007-11-05 00:23:52.839543700 -0500
|
||||||
|
@@ -211,7 +211,10 @@
|
||||||
|
// If nothing is returned, emit a U+0026 AMPERSAND character token.
|
||||||
|
// Otherwise, emit the character token that was returned.
|
||||||
|
$char = (!$entity) ? '&' : $entity;
|
||||||
|
- $this->emitToken($char);
|
||||||
|
+ $this->emitToken(array(
|
||||||
|
+ 'type' => self::CHARACTR,
|
||||||
|
+ 'data' => $char
|
||||||
|
+ ));
|
||||||
|
|
||||||
|
// Finally, switch to the data state.
|
||||||
|
$this->state = 'data';
|
||||||
|
@@ -708,7 +711,7 @@
|
||||||
|
} elseif($char === '&') {
|
||||||
|
/* U+0026 AMPERSAND (&)
|
||||||
|
Switch to the entity in attribute value state. */
|
||||||
|
- $this->entityInAttributeValueState('non');
|
||||||
|
+ $this->entityInAttributeValueState();
|
||||||
|
|
||||||
|
} elseif($char === '>') {
|
||||||
|
/* U+003E GREATER-THAN SIGN (>)
|
||||||
|
@@ -738,7 +741,8 @@
|
||||||
|
? '&'
|
||||||
|
: $entity;
|
||||||
|
|
||||||
|
- $this->emitToken($char);
|
||||||
|
+ $last = count($this->token['attr']) - 1;
|
||||||
|
+ $this->token['attr'][$last]['value'] .= $char;
|
||||||
|
}
|
||||||
|
|
||||||
|
private function bogusCommentState() {
|
||||||
|
@@ -1066,6 +1070,11 @@
|
||||||
|
$this->char++;
|
||||||
|
|
||||||
|
if(in_array($id, $this->entities)) {
|
||||||
|
+ if ($e_name[$c-1] !== ';') {
|
||||||
|
+ if ($c < $len && $e_name[$c] == ';') {
|
||||||
|
+ $this->char++; // consume extra semicolon
|
||||||
|
+ }
|
||||||
|
+ }
|
||||||
|
$entity = $id;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
@@ -3659,7 +3668,7 @@
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
- private function generateImpliedEndTags(array $exclude = array()) {
|
||||||
|
+ private function generateImpliedEndTags($exclude = array()) {
|
||||||
|
/* When the steps below require the UA to generate implied end tags,
|
||||||
|
then, if the current node is a dd element, a dt element, an li element,
|
||||||
|
a p element, a td element, a th element, or a tr element, the UA must
|
||||||
|
@@ -3673,7 +3682,8 @@
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
- private function getElementCategory($name) {
|
||||||
|
+ private function getElementCategory($node) {
|
||||||
|
+ $name = $node->tagName;
|
||||||
|
if(in_array($name, $this->special))
|
||||||
|
return self::SPECIAL;
|
||||||
|
|
3824
maintenance/PH5P.php
Normal file
File diff suppressed because it is too large
@@ -1,5 +1,7 @@
|
|||||||
<?php
|
<?php
|
||||||
|
|
||||||
|
require_once 'compat-function-file-put-contents.php';
|
||||||
|
|
||||||
function assertCli() {
|
function assertCli() {
|
||||||
if (php_sapi_name() != 'cli' && !getenv('PHP_IS_CLI')) {
|
if (php_sapi_name() != 'cli' && !getenv('PHP_IS_CLI')) {
|
||||||
echo 'Script cannot be called from web-browser (if you are calling via cli,
|
echo 'Script cannot be called from web-browser (if you are calling via cli,
|
||||||
@@ -7,3 +9,135 @@ set environment variable PHP_IS_CLI to work around this).';
|
|||||||
exit;
|
exit;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Filesystem tools not provided by default; can recursively create, copy
|
||||||
|
* and delete folders. Some template methods are provided for extensibility.
|
||||||
|
* @note This class must be instantiated to be used, although it does
|
||||||
|
* not maintain state.
|
||||||
|
*/
|
||||||
|
class FSTools
|
||||||
|
{
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Recursively creates a directory
|
||||||
|
* @param string $folder Name of folder to create
|
||||||
|
* @note Adapted from the PHP manual comment 76612
|
||||||
|
*/
|
||||||
|
function mkdir($folder) {
|
||||||
|
$folders = preg_split("#[\\\\/]#", $folder);
|
||||||
|
$base = '';
|
||||||
|
for($i = 0, $c = count($folders); $i < $c; $i++) {
|
||||||
|
if(empty($folders[$i])) {
|
||||||
|
if (!$i) {
|
||||||
|
// special case for root level
|
||||||
|
$base .= DIRECTORY_SEPARATOR;
|
||||||
|
}
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
$base .= $folders[$i];
|
||||||
|
if(!is_dir($base)){
|
||||||
|
mkdir($base);
|
||||||
|
}
|
||||||
|
$base .= DIRECTORY_SEPARATOR;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Copy a file, or recursively copy a folder and its contents; modified
|
||||||
|
* so that copied files, if PHP, have includes removed
|
||||||
|
*
|
||||||
|
* @author Aidan Lister <aidan@php.net>
|
||||||
|
* @version 1.0.1-modified
|
||||||
|
* @link http://aidanlister.com/repos/v/function.copyr.php
|
||||||
|
* @param string $source Source path
|
||||||
|
* @param string $dest Destination path
|
||||||
|
* @return bool Returns TRUE on success, FALSE on failure
|
||||||
|
*/
|
||||||
|
function copyr($source, $dest) {
|
||||||
|
// Simple copy for a file
|
||||||
|
if (is_file($source)) {
|
||||||
|
return $this->copy($source, $dest);
|
||||||
|
}
|
||||||
|
// Make destination directory
|
||||||
|
if (!is_dir($dest)) {
|
||||||
|
mkdir($dest);
|
||||||
|
}
|
||||||
|
// Loop through the folder
|
||||||
|
$dir = dir($source);
|
||||||
|
while (false !== $entry = $dir->read()) {
|
||||||
|
// Skip pointers
|
||||||
|
if ($entry == '.' || $entry == '..') {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if (!$this->copyable($entry)) {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
// Deep copy directories
|
||||||
|
if ($dest !== "$source/$entry") {
|
||||||
|
$this->copyr("$source/$entry", "$dest/$entry");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Clean up
|
||||||
|
$dir->close();
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Stub for PHP's built-in copy function, can be used to overload
|
||||||
|
* functionality
|
||||||
|
*/
|
||||||
|
function copy($source, $dest) {
|
||||||
|
return copy($source, $dest);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Overloadable function that tests a filename for copyability. By
|
||||||
|
* default, everything should be copied; you can restrict things to
|
||||||
|
* ignore hidden files, unreadable files, etc.
|
||||||
|
*/
|
||||||
|
function copyable($file) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Delete a file, or a folder and its contents
|
||||||
|
*
|
||||||
|
* @author Aidan Lister <aidan@php.net>
|
||||||
|
* @version 1.0.3
|
||||||
|
* @link http://aidanlister.com/repos/v/function.rmdirr.php
|
||||||
|
* @param string $dirname Directory to delete
|
||||||
|
* @return bool Returns TRUE on success, FALSE on failure
|
||||||
|
*/
|
||||||
|
function rmdirr($dirname)
|
||||||
|
{
|
||||||
|
// Sanity check
|
||||||
|
if (!file_exists($dirname)) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Simple delete for a file
|
||||||
|
if (is_file($dirname) || is_link($dirname)) {
|
||||||
|
return unlink($dirname);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Loop through the folder
|
||||||
|
$dir = dir($dirname);
|
||||||
|
while (false !== $entry = $dir->read()) {
|
||||||
|
// Skip pointers
|
||||||
|
if ($entry == '.' || $entry == '..') {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
// Recurse
|
||||||
|
$this->rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Clean up
|
||||||
|
$dir->close();
|
||||||
|
return rmdir($dirname);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
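The copyable() hook documented above is the extension point of FSTools: copyr() consults it for every directory entry. A minimal sketch of how a subclass might use it to skip hidden entries, assuming a cut-down stand-in class (SketchFSTools and SkipHiddenFSTools are hypothetical names, not part of this commit) and temporary directories created only for the demonstration:

```php
<?php
// Hypothetical stand-in for FSTools: only the pieces needed to show the
// copyable() template method. This is not the class from the diff.
class SketchFSTools
{
    // Recursively copy $source to $dest, consulting copyable() per entry.
    function copyr($source, $dest) {
        if (is_file($source)) {
            return $this->copy($source, $dest);
        }
        if (!is_dir($dest)) {
            mkdir($dest, 0777, true);
        }
        $dir = dir($source);
        while (false !== ($entry = $dir->read())) {
            if ($entry == '.' || $entry == '..') continue; // skip pointers
            if (!$this->copyable($entry)) continue;        // subclass decides
            $this->copyr("$source/$entry", "$dest/$entry");
        }
        $dir->close();
        return true;
    }
    // Thin wrapper around the built-in copy(), overridable by subclasses.
    function copy($source, $dest) {
        return copy($source, $dest);
    }
    // Default: everything is copied.
    function copyable($file) {
        return true;
    }
}

// Subclass that ignores hidden entries (names starting with a dot).
class SkipHiddenFSTools extends SketchFSTools
{
    function copyable($file) {
        return $file[0] != '.';
    }
}

// Demonstration tree in the system temp directory (assumed writable).
$base = sys_get_temp_dir() . '/fstools_sketch_' . uniqid();
$src = $base . '/src';
$dst = $base . '/dst';
mkdir($src . '/sub', 0777, true);
file_put_contents($src . '/visible.txt', 'a');
file_put_contents($src . '/.hidden', 'b');
file_put_contents($src . '/sub/nested.txt', 'c');

$fs = new SkipHiddenFSTools();
$fs->copyr($src, $dst);
```

After the call, the destination should contain visible.txt and sub/nested.txt but not .hidden, since the decision is made on the bare entry name.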
107
maintenance/compat-function-file-put-contents.php
Normal file
@@ -0,0 +1,107 @@
|
|||||||
|
<?php
|
||||||
|
// $Id: file_put_contents.php,v 1.27 2007/04/17 10:09:56 arpad Exp $
|
||||||
|
|
||||||
|
|
||||||
|
if (!defined('FILE_USE_INCLUDE_PATH')) {
|
||||||
|
define('FILE_USE_INCLUDE_PATH', 1);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!defined('LOCK_EX')) {
|
||||||
|
define('LOCK_EX', 2);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!defined('FILE_APPEND')) {
|
||||||
|
define('FILE_APPEND', 8);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Replace file_put_contents()
|
||||||
|
*
|
||||||
|
* @category PHP
|
||||||
|
* @package PHP_Compat
|
||||||
|
* @license LGPL - http://www.gnu.org/licenses/lgpl.html
|
||||||
|
* @copyright 2004-2007 Aidan Lister <aidan@php.net>, Arpad Ray <arpad@php.net>
|
||||||
|
* @link http://php.net/function.file_put_contents
|
||||||
|
* @author Aidan Lister <aidan@php.net>
|
||||||
|
* @version $Revision: 1.27 $
|
||||||
|
* @internal resource_context is not supported
|
||||||
|
* @since PHP 5
|
||||||
|
* @require PHP 4.0.0 (user_error)
|
||||||
|
*/
|
||||||
|
function php_compat_file_put_contents($filename, $content, $flags = null, $resource_context = null)
|
||||||
|
{
|
||||||
|
// If $content is an array, convert it to a string
|
||||||
|
if (is_array($content)) {
|
||||||
|
$content = implode('', $content);
|
||||||
|
}
|
||||||
|
|
||||||
|
// If we don't have a string, throw an error
|
||||||
|
if (!is_scalar($content)) {
|
||||||
|
user_error('file_put_contents() The 2nd parameter should be either a string or an array',
|
||||||
|
E_USER_WARNING);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get the length of data to write
|
||||||
|
$length = strlen($content);
|
||||||
|
|
||||||
|
// Check what mode we are using
|
||||||
|
$mode = ($flags & FILE_APPEND) ?
|
||||||
|
'a' :
|
||||||
|
'wb';
|
||||||
|
|
||||||
|
// Check if we're using the include path
|
||||||
|
$use_inc_path = ($flags & FILE_USE_INCLUDE_PATH) ?
|
||||||
|
true :
|
||||||
|
false;
|
||||||
|
|
||||||
|
// Open the file for writing
|
||||||
|
if (($fh = @fopen($filename, $mode, $use_inc_path)) === false) {
|
||||||
|
user_error('file_put_contents() failed to open stream: Permission denied',
|
||||||
|
E_USER_WARNING);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Attempt to get an exclusive lock
|
||||||
|
$use_lock = ($flags & LOCK_EX) ? true : false ;
|
||||||
|
if ($use_lock === true) {
|
||||||
|
if (!flock($fh, LOCK_EX)) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Write to the file
|
||||||
|
$bytes = 0;
|
||||||
|
if (($bytes = @fwrite($fh, $content)) === false) {
|
||||||
|
$errormsg = sprintf('file_put_contents() Failed to write %d bytes to %s',
|
||||||
|
$length,
|
||||||
|
$filename);
|
||||||
|
user_error($errormsg, E_USER_WARNING);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Close the handle
|
||||||
|
@fclose($fh);
|
||||||
|
|
||||||
|
// Check all the data was written
|
||||||
|
if ($bytes != $length) {
|
||||||
|
$errormsg = sprintf('file_put_contents() Only %d of %d bytes written, possibly out of free disk space.',
|
||||||
|
$bytes,
|
||||||
|
$length);
|
||||||
|
user_error($errormsg, E_USER_WARNING);
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Return length
|
||||||
|
return $bytes;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
// Define
|
||||||
|
if (!function_exists('file_put_contents')) {
|
||||||
|
function file_put_contents($filename, $content, $flags = null, $resource_context = null)
|
||||||
|
{
|
||||||
|
return php_compat_file_put_contents($filename, $content, $flags, $resource_context);
|
||||||
|
}
|
||||||
|
}
|
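The shim above defines file_put_contents() only when PHP does not already provide it, so calling code looks the same on either path. A short sketch of the flag handling the shim mirrors (array content is joined, FILE_APPEND switches to append mode, LOCK_EX requests an exclusive lock); the temporary file is an assumption made for the demonstration:

```php
<?php
// Temporary file used only for this demonstration.
$file = tempnam(sys_get_temp_dir(), 'fpc');

// An array is joined into a single string before writing;
// the return value is the number of bytes written.
$written = file_put_contents($file, array("first\n", "second\n"));

// FILE_APPEND adds to the end instead of truncating;
// LOCK_EX takes an exclusive lock while writing.
file_put_contents($file, "third\n", FILE_APPEND | LOCK_EX);

$lines = file($file);
unlink($file);
```

Flags are combined with a bitwise OR, which is why the shim tests each one with a bitwise AND.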
@@ -32,5 +32,5 @@ foreach ($names as $name) {
|
|||||||
$cache->flush($config);
|
$cache->flush($config);
|
||||||
}
|
}
|
||||||
|
|
||||||
echo 'Cache flushed successfully.';
|
echo "Cache flushed successfully.\n";
|
||||||
|
|
||||||
|
13
maintenance/generate-ph5p-patch.php
Normal file
@@ -0,0 +1,13 @@
<?php

$orig = realpath(dirname(__FILE__) . '/PH5P.php');
$new = realpath(dirname(__FILE__) . '/../library/HTMLPurifier/Lexer/PH5P.php');
$newt = dirname(__FILE__) . '/PH5P.new.php'; // temporary file

// minor text-processing of new file to get into same format as original
$new_src = file_get_contents($new);
$new_src = '<?php' . PHP_EOL . substr($new_src, strpos($new_src, 'class HTML5 {'));

file_put_contents($newt, $new_src);
shell_exec("diff -u \"$orig\" \"$newt\" > PH5P.patch");
unlink($newt);
|
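The text-processing step in generate-ph5p-patch.php keeps everything from the 'class HTML5 {' marker onward and prepends a bare opening tag, so that the library copy lines up with the pristine PH5P.php before diffing. A small sketch on an in-memory string; the sample source and the explicit check for a missing marker are assumptions added here (the script itself does not check, and a strpos() result of false would make substr() keep the whole file):

```php
<?php
// Hypothetical library copy: extra header material precedes the class.
$new_src = "<?php\n\n// header added by the library copy\n\nclass HTML5 {\n    // ...\n}\n";

$marker = 'class HTML5 {';
$pos = strpos($new_src, $marker);
if ($pos === false) {
    // Assumption: fail loudly rather than diff an unnormalised file.
    throw new RuntimeException('marker not found');
}

// Same expression as in the script: bare opening tag, then the class onward.
$normalised = '<?php' . PHP_EOL . substr($new_src, $pos);
```

With both files starting the same way, the resulting unified diff contains only the real code changes rather than header noise.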
@@ -6,20 +6,38 @@ assertCli();
|
|||||||
|
|
||||||
/**
|
/**
|
||||||
* Compiles all of HTML Purifier's library files into one big file
|
* Compiles all of HTML Purifier's library files into one big file
|
||||||
* named HTMLPurifier.standalone.php. Operates recursively, and will
|
* named HTMLPurifier.standalone.php.
|
||||||
* barf if there are conditional includes.
|
|
||||||
*
|
|
||||||
* Details: also creates blank "include" files in the test/blank directory
|
|
||||||
* in order to simulate require_once's inside the test files.
|
|
||||||
*/
|
*/
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Global array that tracks already loaded includes
|
* Global hash that tracks already loaded includes
|
||||||
*/
|
*/
|
||||||
$GLOBALS['loaded'] = array('HTMLPurifier.php' => true);
|
$GLOBALS['loaded'] = array('HTMLPurifier.php' => true);
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* @param $text Text to replace includes from
|
* Custom FSTools for this script that overloads some behavior
|
||||||
|
* @warning The overloading of copy() is not necessarily global for
|
||||||
|
* this script. Watch out!
|
||||||
|
*/
|
||||||
|
class MergeLibraryFSTools extends FSTools
|
||||||
|
{
|
||||||
|
function copyable($entry) {
|
||||||
|
// Skip hidden files
|
||||||
|
if ($entry[0] == '.') {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
function copy($source, $dest) {
|
||||||
|
return copy_and_remove_includes($source, $dest);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
$FS = new MergeLibraryFSTools();
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Replaces the includes inside PHP source code with the corresponding
|
||||||
|
* source.
|
||||||
|
* @param string $text PHP source code to replace includes from
|
||||||
*/
|
*/
|
||||||
function replace_includes($text) {
|
function replace_includes($text) {
|
||||||
return preg_replace_callback(
|
return preg_replace_callback(
|
||||||
@@ -32,6 +50,8 @@ function replace_includes($text) {
|
|||||||
/**
|
/**
|
||||||
* Removes leading PHP tags from included files. Assumes that there is
|
* Removes leading PHP tags from included files. Assumes that there is
|
||||||
* no trailing tag.
|
* no trailing tag.
|
||||||
|
* @note This is safe for files that have internal <?php
|
||||||
|
* @param string $text Text to remove the leading PHP tag from
|
||||||
*/
|
*/
|
||||||
function remove_php_tags($text) {
|
function remove_php_tags($text) {
|
||||||
return substr($text, 5);
|
return substr($text, 5);
|
||||||
@@ -40,125 +60,48 @@ function remove_php_tags($text) {
|
|||||||
/**
|
/**
|
||||||
* Creates an appropriate blank file, recursively generating directories
|
* Creates an appropriate blank file, recursively generating directories
|
||||||
* if necessary
|
* if necessary
|
||||||
|
* @param string $file Filename to create blank for
|
||||||
*/
|
*/
|
||||||
function create_blank($file) {
|
function create_blank($file) {
|
||||||
|
global $FS;
|
||||||
$dir = dirname($file);
|
$dir = dirname($file);
|
||||||
$base = realpath('../tests/blanks/') . DIRECTORY_SEPARATOR ;
|
$base = realpath('../tests/blanks/') . DIRECTORY_SEPARATOR ;
|
||||||
if ($dir != '.') mkdir_deep($base . $dir);
|
if ($dir != '.') {
|
||||||
|
$FS->mkdir($base . $dir);
|
||||||
|
}
|
||||||
file_put_contents($base . $file, '');
|
file_put_contents($base . $file, '');
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Recursively creates a directory
|
* Copies the contents of a directory to the standalone directory
|
||||||
* @note Adapted from the PHP manual comment 76612
|
* @param string $dir Directory to copy
|
||||||
*/
|
*/
|
||||||
function mkdir_deep($folder) {
|
function make_dir_standalone($dir) {
|
||||||
$folders = preg_split("#[\\\\/]#", $folder);
|
global $FS;
|
||||||
$base = '';
|
return $FS->copyr($dir, 'standalone/' . $dir);
|
||||||
for($i = 0, $c = count($folders); $i < $c; $i++) {
|
|
||||||
if(empty($folders[$i])) {
|
|
||||||
if (!$i) {
|
|
||||||
// special case for root level
|
|
||||||
$base .= DIRECTORY_SEPARATOR;
|
|
||||||
}
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
$base .= $folders[$i];
|
|
||||||
if(!is_dir($base)){
|
|
||||||
mkdir($base);
|
|
||||||
}
|
|
||||||
$base .= DIRECTORY_SEPARATOR;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Copy a file, or recursively copy a folder and its contents
|
* Copies the contents of a file to the standalone directory
|
||||||
*
|
* @param string $file File to copy
|
||||||
* @author Aidan Lister <aidan@php.net>
|
|
||||||
* @version 1.0.1
|
|
||||||
* @link http://aidanlister.com/repos/v/function.copyr.php
|
|
||||||
* @param string $source Source path
|
|
||||||
* @param string $dest Destination path
|
|
||||||
* @return bool Returns TRUE on success, FALSE on failure
|
|
||||||
*/
|
*/
|
||||||
function copyr($source, $dest) {
|
function make_file_standalone($file) {
|
||||||
// Simple copy for a file
|
global $FS;
|
||||||
if (is_file($source)) {
|
$FS->mkdir('standalone/' . dirname($file));
|
||||||
return copy($source, $dest);
|
copy_and_remove_includes($file, 'standalone/' . $file);
|
||||||
}
|
|
||||||
// Make destination directory
|
|
||||||
if (!is_dir($dest)) {
|
|
||||||
mkdir($dest);
|
|
||||||
}
|
|
||||||
// Loop through the folder
|
|
||||||
$dir = dir($source);
|
|
||||||
while (false !== $entry = $dir->read()) {
|
|
||||||
// Skip pointers
|
|
||||||
if ($entry == '.' || $entry == '..') {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
// Skip hidden files
|
|
||||||
if ($entry[0] == '.') {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
// Deep copy directories
|
|
||||||
if ($dest !== "$source/$entry") {
|
|
||||||
copyr("$source/$entry", "$dest/$entry");
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Clean up
|
|
||||||
$dir->close();
|
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Delete a file, or a folder and its contents
|
* Copies a file to another location; if it is a PHP file, its
|
||||||
*
|
* includes are processed by replace_includes() first
|
||||||
* @author Aidan Lister <aidan@php.net>
|
* @param string $file Original file
|
||||||
* @version 1.0.3
|
* @param string $sfile New location of file
|
||||||
* @link http://aidanlister.com/repos/v/function.rmdirr.php
|
|
||||||
* @param string $dirname Directory to delete
|
|
||||||
* @return bool Returns TRUE on success, FALSE on failure
|
|
||||||
*/
|
*/
|
||||||
function rmdirr($dirname)
|
function copy_and_remove_includes($file, $sfile) {
|
||||||
{
|
$contents = file_get_contents($file);
|
||||||
// Sanity check
|
if (strrchr($file, '.') === '.php') $contents = replace_includes($contents);
|
||||||
if (!file_exists($dirname)) {
|
return file_put_contents($sfile, $contents);
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Simple delete for a file
|
|
||||||
if (is_file($dirname) || is_link($dirname)) {
|
|
||||||
return unlink($dirname);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Loop through the folder
|
|
||||||
$dir = dir($dirname);
|
|
||||||
while (false !== $entry = $dir->read()) {
|
|
||||||
// Skip pointers
|
|
||||||
if ($entry == '.' || $entry == '..') {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Recurse
|
|
||||||
rmdirr($dirname . DIRECTORY_SEPARATOR . $entry);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Clean up
|
|
||||||
$dir->close();
|
|
||||||
return rmdir($dirname);
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Copies the contents of a directory to the standalone directory
|
|
||||||
*/
|
|
||||||
function make_dir_standalone($dir) {
|
|
||||||
return copyr($dir, 'standalone/' . $dir);
|
|
||||||
}
|
|
||||||
|
|
||||||
function make_file_standalone($file) {
|
|
||||||
mkdir_deep('standalone/' . dirname($file));
|
|
||||||
return copy($file, 'standalone/' . $file);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
@@ -167,6 +110,16 @@ function make_file_standalone($file) {
|
|||||||
*/
|
*/
|
||||||
function replace_includes_callback($matches) {
|
function replace_includes_callback($matches) {
|
||||||
$file = $matches[1];
|
$file = $matches[1];
|
||||||
|
$preserve = array(
|
||||||
|
// PHP 5 only
|
||||||
|
'HTMLPurifier/Lexer/DOMLex.php' => 1,
|
||||||
|
'HTMLPurifier/Printer.php' => 1,
|
||||||
|
// PEAR (external)
|
||||||
|
'XML/HTMLSax3.php' => 1
|
||||||
|
);
|
||||||
|
if (isset($preserve[$file])) {
|
||||||
|
return $matches[0];
|
||||||
|
}
|
||||||
if (isset($GLOBALS['loaded'][$file])) return '';
|
if (isset($GLOBALS['loaded'][$file])) return '';
|
||||||
$GLOBALS['loaded'][$file] = true;
|
$GLOBALS['loaded'][$file] = true;
|
||||||
create_blank($file);
|
create_blank($file);
|
||||||
@@ -180,19 +133,30 @@ echo 'Creating full file...';
|
|||||||
$contents = replace_includes(file_get_contents('HTMLPurifier.php'));
|
$contents = replace_includes(file_get_contents('HTMLPurifier.php'));
|
||||||
$contents = str_replace(
|
$contents = str_replace(
|
||||||
"define('HTMLPURIFIER_PREFIX', dirname(__FILE__));",
|
"define('HTMLPURIFIER_PREFIX', dirname(__FILE__));",
|
||||||
"define('HTMLPURIFIER_PREFIX', dirname(__FILE__) . '/standalone');",
|
"define('HTMLPURIFIER_PREFIX', dirname(__FILE__) . '/standalone');
|
||||||
|
set_include_path(HTMLPURIFIER_PREFIX . PATH_SEPARATOR . get_include_path());",
|
||||||
$contents
|
$contents
|
||||||
);
|
);
|
||||||
file_put_contents('HTMLPurifier.standalone.php', $contents);
|
file_put_contents('HTMLPurifier.standalone.php', $contents);
|
||||||
echo ' done!' . PHP_EOL;
|
echo ' done!' . PHP_EOL;
|
||||||
|
|
||||||
echo 'Creating standalone directory...';
|
echo 'Creating standalone directory...';
|
||||||
rmdirr('standalone'); // ensure a clean copy
|
$FS->rmdirr('standalone'); // ensure a clean copy
|
||||||
mkdir_deep('standalone/HTMLPurifier/DefinitionCache/Serializer');
|
|
||||||
make_dir_standalone('HTMLPurifier/EntityLookup');
|
|
||||||
make_dir_standalone('HTMLPurifier/Language');
|
|
||||||
make_file_standalone('HTMLPurifier/Printer/ConfigForm.js');
|
|
||||||
make_file_standalone('HTMLPurifier/Printer/ConfigForm.css');
|
|
||||||
make_dir_standalone('HTMLPurifier/URIScheme');
|
|
||||||
echo ' done!' . PHP_EOL;
|
|
||||||
|
|
||||||
|
// data files
|
||||||
|
$FS->mkdir('standalone/HTMLPurifier/DefinitionCache/Serializer');
|
||||||
|
make_dir_standalone('HTMLPurifier/EntityLookup');
|
||||||
|
|
||||||
|
// non-standard inclusion setup
|
||||||
|
make_dir_standalone('HTMLPurifier/Language');
|
||||||
|
|
||||||
|
// optional components
|
||||||
|
make_file_standalone('HTMLPurifier/Printer.php');
|
||||||
|
make_dir_standalone('HTMLPurifier/Printer');
|
||||||
|
make_dir_standalone('HTMLPurifier/Filter');
|
||||||
|
make_file_standalone('HTMLPurifier/Lexer/PEARSax3.php');
|
||||||
|
|
||||||
|
// PHP 5 only files
|
||||||
|
make_file_standalone('HTMLPurifier/Lexer/DOMLex.php');
|
||||||
|
make_file_standalone('HTMLPurifier/Lexer/PH5P.php');
|
||||||
|
echo ' done!' . PHP_EOL;
|
||||||
|
56
plugins/phorum/config.default.php
Normal file
@@ -0,0 +1,56 @@
<?php

if(!defined("PHORUM")) exit;

// default HTML Purifier configuration settings
$config->set('HTML', 'Allowed',
  // alphabetically sorted
'a[href|title]
abbr[title]
acronym[title]
b
blockquote[cite]
br
caption
cite
code
dd
del
dfn
div
dl
dt
em
i
img[src|alt|title|class]
ins
kbd
li
ol
p
pre
s
strike
strong
sub
sup
table
tbody
td
tfoot
th
thead
tr
tt
u
ul
var');
$config->set('AutoFormat', 'AutoParagraph', true);
$config->set('AutoFormat', 'Linkify', true);
$config->set('HTML', 'Doctype', 'XHTML 1.0 Transitional');
$config->set('Core', 'AggressivelyFixLt', true);
$config->set('Core', 'Encoding', $GLOBALS['PHORUM']['DATA']['CHARSET']); // we'll change this eventually
if (strtolower($GLOBALS['PHORUM']['DATA']['CHARSET']) !== 'utf-8') {
    $config->set('Core', 'EscapeNonASCIICharacters', true);
}
302
plugins/phorum/htmlpurifier.php
Normal file
@@ -0,0 +1,302 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
/**
|
||||||
|
* HTML Purifier Phorum Mod. Filter your HTML the Standards-Compliant Way!
|
||||||
|
*
|
||||||
|
* This Phorum mod enables users to post raw HTML into Phorum. But never
|
||||||
|
* fear: with the help of HTML Purifier, this HTML will be beat into
|
||||||
|
* de-XSSed and standards-compliant form, safe for general consumption.
|
||||||
|
* It is not recommended, but possible to run this mod in parallel
|
||||||
|
* with other formatters (in short, please DISABLE the BBcode mod).
|
||||||
|
*
|
||||||
|
* For help migrating from your previous markup language to pure HTML
|
||||||
|
* please check the migrate.bbcode.php file.
|
||||||
|
*
|
||||||
|
* If you'd like to use this with a WYSIWYG editor, make sure that
|
||||||
|
* editor sets $PHORUM['mod_htmlpurifier']['wysiwyg'] to true. Otherwise,
|
||||||
|
* administrators who need to edit other people's comments may be at
|
||||||
|
* risk for some nasty attacks.
|
||||||
|
*
|
||||||
|
* Tested with Phorum 5.1.22. This module will almost definitely need
|
||||||
|
* to be upgraded when Phorum 6 rolls around.
|
||||||
|
*/
|
||||||
|
|
||||||
|
// Note: Cache data is base64 encoded because Phorum insists on flinging
|
||||||
|
// it to the user and expecting it to come back unharmed, newlines and
|
||||||
|
// all, which ain't happening. It's slower, it takes up more space, but
|
||||||
|
// at least it won't get mutilated
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Purifies a data array
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_format($data)
|
||||||
|
{
|
||||||
|
$PHORUM = $GLOBALS["PHORUM"];
|
||||||
|
|
||||||
|
$purifier =& HTMLPurifier::getInstance();
|
||||||
|
$cache_serial = $PHORUM['mod_htmlpurifier']['body_cache_serial'];
|
||||||
|
|
||||||
|
foreach($data as $message_id => $message){
|
||||||
|
if(isset($message['body'])) {
|
||||||
|
|
||||||
|
if ($message_id) {
|
||||||
|
// we're dealing with a real message, not a fake, so
|
||||||
|
// there are a number of shortcuts that can be taken
|
||||||
|
|
||||||
|
if (isset($message['meta']['htmlpurifier_light'])) {
|
||||||
|
// format hook was called outside of Phorum's normal
|
||||||
|
// functions, do the abridged purification
|
||||||
|
$data[$message_id]['body'] = $purifier->purify($message['body']);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!empty($PHORUM['args']['purge'])) {
|
||||||
|
// purge the cache, must be below the following if
|
||||||
|
unset($message['meta']['body_cache']);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (
|
||||||
|
isset($message['meta']['body_cache']) &&
|
||||||
|
isset($message['meta']['body_cache_serial']) &&
|
||||||
|
$message['meta']['body_cache_serial'] == $cache_serial
|
||||||
|
) {
|
||||||
|
// cached version is present, bail out early
|
||||||
|
$data[$message_id]['body'] = base64_decode($message['meta']['body_cache']);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// migration might edit this array, that's why it's defined
|
||||||
|
// so early
|
||||||
|
$updated_message = array();
|
||||||
|
|
||||||
|
// create the $body variable
|
||||||
|
if (
|
||||||
|
$message_id && // message must be real to migrate
|
||||||
|
!isset($message['meta']['body_cache_serial'])
|
||||||
|
) {
|
||||||
|
// perform migration
|
||||||
|
$fake_data = array();
|
||||||
|
list($signature, $edit_message) = phorum_htmlpurifier_remove_sig_and_editmessage($message);
|
||||||
|
$fake_data[$message_id] = $message;
|
||||||
|
$fake_data = phorum_htmlpurifier_migrate($fake_data);
|
||||||
|
$body = $fake_data[$message_id]['body'];
|
||||||
|
$body = str_replace("<phorum break>", '', $body);
|
||||||
|
$updated_message['body'] = $body; // save it in
|
||||||
|
$body .= $signature . $edit_message; // add it back in
|
||||||
|
} else {
|
||||||
|
// reverse Phorum's pre-processing
|
||||||
|
$body = $message['body'];
|
||||||
|
// order is important
|
||||||
|
$body = str_replace("<phorum break>\n", "\n", $body);
|
||||||
|
$body = str_replace(array('&lt;','&gt;','&amp;'), array('<','>','&'), $body);
|
||||||
|
if (!$message_id && defined('PHORUM_CONTROL_CENTER')) {
|
||||||
|
// we're in control.php, so it was double-escaped
|
||||||
|
$body = str_replace(array('&lt;','&gt;','&amp;', '&quot;'), array('<','>','&','"'), $body);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
$body = $purifier->purify($body);
|
||||||
|
|
||||||
|
// dynamically update the cache (MUST BE DONE HERE!)
|
||||||
|
// this is inefficient because it's one db call per
|
||||||
|
// cache miss, but once the cache is in place things are
|
||||||
|
// a lot zippier.
|
||||||
|
|
||||||
|
if ($message_id) { // make sure it's not a fake id
|
||||||
|
$updated_message['meta'] = $message['meta'];
|
||||||
|
$updated_message['meta']['body_cache'] = base64_encode($body);
|
||||||
|
$updated_message['meta']['body_cache_serial'] = $cache_serial;
|
||||||
|
phorum_db_update_message($message_id, $updated_message);
|
||||||
|
}
|
||||||
|
|
||||||
|
// must not get overloaded until after we cache it, otherwise
|
||||||
|
// we'll inadvertently change the original text
|
||||||
|
$data[$message_id]['body'] = $body;
|
||||||
|
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return $data;
|
||||||
|
}
|
||||||
|
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// This is fragile code, copied from read.php:359. It will break if
|
||||||
|
// that is changed
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Generates a signature based on a message array
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_generate_sig($row) {
|
||||||
|
$phorum_sig = '';
|
||||||
|
if(isset($row["user"]["signature"])
|
||||||
|
&& isset($row['meta']['show_signature']) && $row['meta']['show_signature']==1){
|
||||||
|
$phorum_sig=trim($row["user"]["signature"]);
|
||||||
|
if(!empty($phorum_sig)){
|
||||||
|
$phorum_sig="\n\n$phorum_sig";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return $phorum_sig;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Generates an edit message based on a message array
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_generate_editmessage($row) {
|
||||||
|
$PHORUM = $GLOBALS['PHORUM'];
|
||||||
|
$editmessage = '';
|
||||||
|
if(isset($row['meta']['edit_count']) && $row['meta']['edit_count'] > 0) {
|
||||||
|
$editmessage = str_replace ("%count%", $row['meta']['edit_count'], $PHORUM["DATA"]["LANG"]["EditedMessage"]);
|
||||||
|
$editmessage = str_replace ("%lastedit%", phorum_date($PHORUM["short_date"],$row['meta']['edit_date']), $editmessage);
|
||||||
|
$editmessage = str_replace ("%lastuser%", $row['meta']['edit_username'], $editmessage);
|
||||||
|
$editmessage="\n\n\n\n$editmessage";
|
||||||
|
}
|
||||||
|
return $editmessage;
|
||||||
|
}
|
||||||
|
|
||||||
|
// End fragile code
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Removes the signature and edit message from a message
|
||||||
|
* @param $row Message passed by reference
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_remove_sig_and_editmessage(&$row) {
|
||||||
|
// attempt to remove Phorum's pre-processing:
|
||||||
|
// we must not process the signature or editmessage
|
||||||
|
$signature = phorum_htmlpurifier_generate_sig($row);
|
||||||
|
$editmessage = phorum_htmlpurifier_generate_editmessage($row);
|
||||||
|
$row['body'] = strtr($row['body'], array($signature => '', $editmessage => ''));
|
||||||
|
return array($signature, $editmessage);
|
||||||
|
}
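The removal step above relies on `strtr()` with a replacement map. A minimal sketch of the same idea follows, assuming a hypothetical helper `remove_fragments` and made-up sample text. It skips empty fragments explicitly rather than relying on how `strtr()` treats an empty-string key, since a message may have no signature or edit notice at all.

```php
<?php
// Hypothetical helper for illustration; strtr() with an array map is real PHP API.
function remove_fragments($body, $fragments) {
    $map = array();
    foreach ($fragments as $fragment) {
        if ($fragment !== '') {
            // Map each non-empty fragment to the empty string.
            $map[$fragment] = '';
        }
    }
    // With nothing to remove, return the body untouched.
    return $map ? strtr($body, $map) : $body;
}

// Made-up sample data shaped like the generated signature and edit notice.
$signature   = "\n\n-- \nAlice";
$editmessage = "\n\n\n\nEdited 1 time(s).";
$body        = "Hello world" . $signature . $editmessage;

var_dump(remove_fragments($body, array($signature, $editmessage)));
var_dump(remove_fragments($body, array('', $editmessage)));
```

Because `strtr()` tries the longest matching key first at each position, the edit notice (which begins with four newlines) is not mistaken for the start of the signature.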
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Indicate that data is fully HTML and not from migration, invalidate
|
||||||
|
* previous caches
|
||||||
|
* @note This function used to generate the actual cache entries, but
* since there's data missing, that must be deferred to the first read
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_posting($message) {
|
||||||
|
$PHORUM = $GLOBALS["PHORUM"];
|
||||||
|
unset($message['meta']['body_cache']); // invalidate the cache
|
||||||
|
$message['meta']['body_cache_serial'] = $PHORUM['mod_htmlpurifier']['body_cache_serial'];
|
||||||
|
return $message;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Overload quoting mechanism to prevent default, mail-style quote from happening
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_quote($array) {
|
||||||
|
$PHORUM = $GLOBALS["PHORUM"];
|
||||||
|
$purifier =& HTMLPurifier::getInstance();
|
||||||
|
$text = $purifier->purify($array[1]);
|
||||||
|
return "<blockquote cite=\"$array[0]\">\n$text\n</blockquote>";
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Ensure that our format hook is processed last. Also, loads the library.
|
||||||
|
* @credits <http://secretsauce.phorum.org/snippets/make_bbcode_last_formatter.php.txt>
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_common() {
|
||||||
|
|
||||||
|
require_once(dirname(__FILE__).'/htmlpurifier/HTMLPurifier.auto.php');
|
||||||
|
require(dirname(__FILE__).'/init-config.php');
|
||||||
|
|
||||||
|
$config = phorum_htmlpurifier_get_config();
|
||||||
|
HTMLPurifier::getInstance($config);
|
||||||
|
|
||||||
|
// increment revision.txt if you want to invalidate the cache
|
||||||
|
$GLOBALS['PHORUM']['mod_htmlpurifier']['body_cache_serial'] = $config->getSerial();
|
||||||
|
|
||||||
|
// load migration
|
||||||
|
if (file_exists(dirname(__FILE__) . '/migrate.php')) {
|
||||||
|
include(dirname(__FILE__) . '/migrate.php');
|
||||||
|
} else {
|
||||||
|
echo '<strong>Error:</strong> No migration path specified for HTML Purifier, please check
|
||||||
|
<tt>mods/htmlpurifier/migrate.bbcode.php</tt> for instructions on
|
||||||
|
how to migrate from your previous markup language.';
|
||||||
|
exit;
|
||||||
|
}
|
||||||
|
|
||||||
|
// see if our hooks need to be bubbled to the end
|
||||||
|
phorum_htmlpurifier_bubble_hook('format');
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_bubble_hook($hook) {
|
||||||
|
global $PHORUM;
|
||||||
|
$our_idx = null;
|
||||||
|
$last_idx = null;
|
||||||
|
if (!isset($PHORUM['hooks'][$hook]['mods'])) return;
|
||||||
|
foreach ($PHORUM['hooks'][$hook]['mods'] as $idx => $mod) {
|
||||||
|
if ($mod == 'htmlpurifier') $our_idx = $idx;
|
||||||
|
$last_idx = $idx;
|
||||||
|
}
|
||||||
|
list($mod) = array_splice($PHORUM['hooks'][$hook]['mods'], $our_idx, 1);
|
||||||
|
$PHORUM['hooks'][$hook]['mods'][] = $mod;
|
||||||
|
list($func) = array_splice($PHORUM['hooks'][$hook]['funcs'], $our_idx, 1);
|
||||||
|
$PHORUM['hooks'][$hook]['funcs'][] = $func;
|
||||||
|
}
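The hook-bubbling function above moves one entry to the end of a list with `array_splice()`. A minimal sketch of that technique on a single array is shown here, assuming a hypothetical helper `move_to_end` and sample module names. Unlike the function above, it returns early when the entry is not present, and it handles only one array, whereas the plugin keeps its `mods` and `funcs` arrays in step.

```php
<?php
// Hypothetical helper for illustration; array_search() and array_splice() are real PHP API.
function move_to_end($items, $needle) {
    // Re-index so that keys match the positional offsets array_splice() expects.
    $items = array_values($items);
    $idx = array_search($needle, $items, true);
    if ($idx === false) {
        return $items; // entry not registered: nothing to move
    }
    // Cut the entry out of its current position and append it.
    list($item) = array_splice($items, $idx, 1);
    $items[] = $item;
    return $items;
}

print_r(move_to_end(array('bbcode', 'htmlpurifier', 'smileys'), 'htmlpurifier'));
print_r(move_to_end(array('bbcode', 'smileys'), 'htmlpurifier'));
```

The early return is a design choice for the sketch: splicing at a missing index would otherwise move whichever entry happens to sit first.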
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Pre-emptively performs purification if it looks like a WYSIWYG editor
|
||||||
|
* is being used
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_before_editor($message) {
|
||||||
|
if (!empty($GLOBALS['PHORUM']['mod_htmlpurifier']['wysiwyg'])) {
|
||||||
|
if (!empty($message['body'])) {
|
||||||
|
$body = $message['body'];
|
||||||
|
// de-entity-ize contents
|
||||||
|
$body = str_replace(array('&lt;','&gt;','&amp;'), array('<','>','&'), $body);
$purifier =& HTMLPurifier::getInstance();
// purify the de-entity-ized text, not the still-escaped original
$body = $purifier->purify($body);
// re-entity-ize contents
$body = htmlspecialchars($body, ENT_QUOTES, $GLOBALS['PHORUM']['DATA']['CHARSET']);
// write the result back, otherwise the purified body is discarded
$message['body'] = $body;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return $message;
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_editor_after_subject() {
|
||||||
|
// don't show this message if it's a WYSIWYG editor, since it will
|
||||||
|
// then be handled automatically
|
||||||
|
if (!empty($GLOBALS['PHORUM']['mod_htmlpurifier']['wysiwyg'])) return;
|
||||||
|
?><tr><td colspan="2" style="padding:1em 0.3em;" class="htmlpurifier-help">
|
||||||
|
<p>
|
||||||
|
<strong>HTML input</strong> is enabled. Make sure you escape all HTML and
|
||||||
|
angled brackets with <code>&amp;lt;</code> and <code>&amp;gt;</code>.
|
||||||
|
</p><?php
|
||||||
|
$purifier =& HTMLPurifier::getInstance();
|
||||||
|
$config = $purifier->config;
|
||||||
|
if ($config->get('AutoFormat', 'AutoParagraph')) {
|
||||||
|
?><p>
|
||||||
|
<strong>Auto-paragraphing</strong> is enabled. Double
|
||||||
|
newlines will be converted to paragraphs; for single
|
||||||
|
newlines, use the <code>pre</code> tag.
|
||||||
|
</p><?php
|
||||||
|
}
|
||||||
|
$html_definition = $config->getDefinition('HTML');
|
||||||
|
$allowed = array();
|
||||||
|
foreach ($html_definition->info as $name => $x) $allowed[] = "<code>$name</code>";
|
||||||
|
sort($allowed);
|
||||||
|
$allowed_text = implode(', ', $allowed);
|
||||||
|
?><p><strong>Allowed tags:</strong> <?php
|
||||||
|
echo $allowed_text;
|
||||||
|
?>.</p><?php
|
||||||
|
?>
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
For inputting literal code such as HTML and PHP for display, use
|
||||||
|
CDATA tags to auto-escape your angled brackets, and <code>pre</code>
|
||||||
|
to preserve newlines:
|
||||||
|
</p>
|
||||||
|
<pre>&lt;pre&gt;&lt;![CDATA[
<em>Place code here</em>
]]&gt;&lt;/pre&gt;</pre>
|
||||||
|
<p>
|
||||||
|
Power users, you can hide this notice with:
|
||||||
|
<pre>.htmlpurifier-help {display:none;}</pre>
|
||||||
|
</p>
|
||||||
|
</td></tr><?php
|
||||||
|
}
|
||||||
|
|
504
plugins/phorum/htmlpurifier/LICENSE
Normal file
@@ -0,0 +1,504 @@
|
|||||||
|
GNU LESSER GENERAL PUBLIC LICENSE
|
||||||
|
Version 2.1, February 1999
|
||||||
|
|
||||||
|
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
|
||||||
|
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||||
|
Everyone is permitted to copy and distribute verbatim copies
|
||||||
|
of this license document, but changing it is not allowed.
|
||||||
|
|
||||||
|
[This is the first released version of the Lesser GPL. It also counts
|
||||||
|
as the successor of the GNU Library Public License, version 2, hence
|
||||||
|
the version number 2.1.]
|
||||||
|
|
||||||
|
Preamble
|
||||||
|
|
||||||
|
The licenses for most software are designed to take away your
|
||||||
|
freedom to share and change it. By contrast, the GNU General Public
|
||||||
|
Licenses are intended to guarantee your freedom to share and change
|
||||||
|
free software--to make sure the software is free for all its users.
|
||||||
|
|
||||||
|
This license, the Lesser General Public License, applies to some
|
||||||
|
specially designated software packages--typically libraries--of the
|
||||||
|
Free Software Foundation and other authors who decide to use it. You
|
||||||
|
can use it too, but we suggest you first think carefully about whether
|
||||||
|
this license or the ordinary General Public License is the better
|
||||||
|
strategy to use in any particular case, based on the explanations below.
|
||||||
|
|
||||||
|
When we speak of free software, we are referring to freedom of use,
|
||||||
|
not price. Our General Public Licenses are designed to make sure that
|
||||||
|
you have the freedom to distribute copies of free software (and charge
|
||||||
|
for this service if you wish); that you receive source code or can get
|
||||||
|
it if you want it; that you can change the software and use pieces of
|
||||||
|
it in new free programs; and that you are informed that you can do
|
||||||
|
these things.
|
||||||
|
|
||||||
|
To protect your rights, we need to make restrictions that forbid
|
||||||
|
distributors to deny you these rights or to ask you to surrender these
|
||||||
|
rights. These restrictions translate to certain responsibilities for
|
||||||
|
you if you distribute copies of the library or if you modify it.
|
||||||
|
|
||||||
|
For example, if you distribute copies of the library, whether gratis
|
||||||
|
or for a fee, you must give the recipients all the rights that we gave
|
||||||
|
you. You must make sure that they, too, receive or can get the source
|
||||||
|
code. If you link other code with the library, you must provide
|
||||||
|
complete object files to the recipients, so that they can relink them
|
||||||
|
with the library after making changes to the library and recompiling
|
||||||
|
it. And you must show them these terms so they know their rights.
|
||||||
|
|
||||||
|
We protect your rights with a two-step method: (1) we copyright the
|
||||||
|
library, and (2) we offer you this license, which gives you legal
|
||||||
|
permission to copy, distribute and/or modify the library.
|
||||||
|
|
||||||
|
To protect each distributor, we want to make it very clear that
|
||||||
|
there is no warranty for the free library. Also, if the library is
|
||||||
|
modified by someone else and passed on, the recipients should know
|
||||||
|
that what they have is not the original version, so that the original
|
||||||
|
author's reputation will not be affected by problems that might be
|
||||||
|
introduced by others.
|
||||||
|
|
||||||
|
Finally, software patents pose a constant threat to the existence of
|
||||||
|
any free program. We wish to make sure that a company cannot
|
||||||
|
effectively restrict the users of a free program by obtaining a
|
||||||
|
restrictive license from a patent holder. Therefore, we insist that
|
||||||
|
any patent license obtained for a version of the library must be
|
||||||
|
consistent with the full freedom of use specified in this license.
|
||||||
|
|
||||||
|
Most GNU software, including some libraries, is covered by the
|
||||||
|
ordinary GNU General Public License. This license, the GNU Lesser
|
||||||
|
General Public License, applies to certain designated libraries, and
|
||||||
|
is quite different from the ordinary General Public License. We use
|
||||||
|
this license for certain libraries in order to permit linking those
|
||||||
|
libraries into non-free programs.
|
||||||
|
|
||||||
|
When a program is linked with a library, whether statically or using
|
||||||
|
a shared library, the combination of the two is legally speaking a
|
||||||
|
combined work, a derivative of the original library. The ordinary
|
||||||
|
General Public License therefore permits such linking only if the
|
||||||
|
entire combination fits its criteria of freedom. The Lesser General
|
||||||
|
Public License permits more lax criteria for linking other code with
|
||||||
|
the library.
|
||||||
|
|
||||||
|
We call this license the "Lesser" General Public License because it
|
||||||
|
does Less to protect the user's freedom than the ordinary General
|
||||||
|
Public License. It also provides other free software developers Less
|
||||||
|
of an advantage over competing non-free programs. These disadvantages
|
||||||
|
are the reason we use the ordinary General Public License for many
|
||||||
|
libraries. However, the Lesser license provides advantages in certain
|
||||||
|
special circumstances.
|
||||||
|
|
||||||
|
For example, on rare occasions, there may be a special need to
|
||||||
|
encourage the widest possible use of a certain library, so that it becomes
|
||||||
|
a de-facto standard. To achieve this, non-free programs must be
|
||||||
|
allowed to use the library. A more frequent case is that a free
|
||||||
|
library does the same job as widely used non-free libraries. In this
|
||||||
|
case, there is little to gain by limiting the free library to free
|
||||||
|
software only, so we use the Lesser General Public License.
|
||||||
|
|
||||||
|
In other cases, permission to use a particular library in non-free
|
||||||
|
programs enables a greater number of people to use a large body of
|
||||||
|
free software. For example, permission to use the GNU C Library in
|
||||||
|
non-free programs enables many more people to use the whole GNU
|
||||||
|
operating system, as well as its variant, the GNU/Linux operating
|
||||||
|
system.
|
||||||
|
|
||||||
|
Although the Lesser General Public License is Less protective of the
|
||||||
|
users' freedom, it does ensure that the user of a program that is
|
||||||
|
linked with the Library has the freedom and the wherewithal to run
|
||||||
|
that program using a modified version of the Library.
|
||||||
|
|
||||||
|
The precise terms and conditions for copying, distribution and
|
||||||
|
modification follow. Pay close attention to the difference between a
|
||||||
|
"work based on the library" and a "work that uses the library". The
|
||||||
|
former contains code derived from the library, whereas the latter must
|
||||||
|
be combined with the library in order to run.
|
||||||
|
|
||||||
|
GNU LESSER GENERAL PUBLIC LICENSE
|
||||||
|
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||||
|
|
||||||
|
0. This License Agreement applies to any software library or other
|
||||||
|
program which contains a notice placed by the copyright holder or
|
||||||
|
other authorized party saying it may be distributed under the terms of
|
||||||
|
this Lesser General Public License (also called "this License").
|
||||||
|
Each licensee is addressed as "you".
|
||||||
|
|
||||||
|
A "library" means a collection of software functions and/or data
|
||||||
|
prepared so as to be conveniently linked with application programs
|
||||||
|
(which use some of those functions and data) to form executables.
|
||||||
|
|
||||||
|
The "Library", below, refers to any such software library or work
|
||||||
|
which has been distributed under these terms. A "work based on the
|
||||||
|
Library" means either the Library or any derivative work under
|
||||||
|
copyright law: that is to say, a work containing the Library or a
|
||||||
|
portion of it, either verbatim or with modifications and/or translated
|
||||||
|
straightforwardly into another language. (Hereinafter, translation is
|
||||||
|
included without limitation in the term "modification".)
|
||||||
|
|
||||||
|
"Source code" for a work means the preferred form of the work for
|
||||||
|
making modifications to it. For a library, complete source code means
|
||||||
|
all the source code for all modules it contains, plus any associated
|
||||||
|
interface definition files, plus the scripts used to control compilation
|
||||||
|
and installation of the library.
|
||||||
|
|
||||||
|
Activities other than copying, distribution and modification are not
|
||||||
|
covered by this License; they are outside its scope. The act of
|
||||||
|
running a program using the Library is not restricted, and output from
|
||||||
|
such a program is covered only if its contents constitute a work based
|
||||||
|
on the Library (independent of the use of the Library in a tool for
|
||||||
|
writing it). Whether that is true depends on what the Library does
|
||||||
|
and what the program that uses the Library does.
|
||||||
|
|
||||||
|
1. You may copy and distribute verbatim copies of the Library's
|
||||||
|
complete source code as you receive it, in any medium, provided that
|
||||||
|
you conspicuously and appropriately publish on each copy an
|
||||||
|
appropriate copyright notice and disclaimer of warranty; keep intact
|
||||||
|
all the notices that refer to this License and to the absence of any
|
||||||
|
warranty; and distribute a copy of this License along with the
|
||||||
|
Library.
|
||||||
|
|
||||||
|
You may charge a fee for the physical act of transferring a copy,
|
||||||
|
and you may at your option offer warranty protection in exchange for a
|
||||||
|
fee.
|
||||||
|
|
||||||
|
2. You may modify your copy or copies of the Library or any portion
|
||||||
|
of it, thus forming a work based on the Library, and copy and
|
||||||
|
distribute such modifications or work under the terms of Section 1
|
||||||
|
above, provided that you also meet all of these conditions:
|
||||||
|
|
||||||
|
a) The modified work must itself be a software library.
|
||||||
|
|
||||||
|
b) You must cause the files modified to carry prominent notices
|
||||||
|
stating that you changed the files and the date of any change.
|
||||||
|
|
||||||
|
c) You must cause the whole of the work to be licensed at no
|
||||||
|
charge to all third parties under the terms of this License.
|
||||||
|
|
||||||
|
d) If a facility in the modified Library refers to a function or a
|
||||||
|
table of data to be supplied by an application program that uses
|
||||||
|
the facility, other than as an argument passed when the facility
|
||||||
|
is invoked, then you must make a good faith effort to ensure that,
|
||||||
|
in the event an application does not supply such function or
|
||||||
|
table, the facility still operates, and performs whatever part of
|
||||||
|
its purpose remains meaningful.
|
||||||
|
|
||||||
|
(For example, a function in a library to compute square roots has
|
||||||
|
a purpose that is entirely well-defined independent of the
|
||||||
|
application. Therefore, Subsection 2d requires that any
|
||||||
|
application-supplied function or table used by this function must
|
||||||
|
be optional: if the application does not supply it, the square
|
||||||
|
root function must still compute square roots.)
|
||||||
|
|
||||||
|
These requirements apply to the modified work as a whole. If
|
||||||
|
identifiable sections of that work are not derived from the Library,
|
||||||
|
and can be reasonably considered independent and separate works in
|
||||||
|
themselves, then this License, and its terms, do not apply to those
|
||||||
|
sections when you distribute them as separate works. But when you
|
||||||
|
distribute the same sections as part of a whole which is a work based
|
||||||
|
on the Library, the distribution of the whole must be on the terms of
|
||||||
|
this License, whose permissions for other licensees extend to the
|
||||||
|
entire whole, and thus to each and every part regardless of who wrote
|
||||||
|
it.
|
||||||
|
|
||||||
|
Thus, it is not the intent of this section to claim rights or contest
|
||||||
|
your rights to work written entirely by you; rather, the intent is to
|
||||||
|
exercise the right to control the distribution of derivative or
|
||||||
|
collective works based on the Library.
|
||||||
|
|
||||||
|
In addition, mere aggregation of another work not based on the Library
|
||||||
|
with the Library (or with a work based on the Library) on a volume of
|
||||||
|
a storage or distribution medium does not bring the other work under
|
||||||
|
the scope of this License.
|
||||||
|
|
||||||
|
3. You may opt to apply the terms of the ordinary GNU General Public
|
||||||
|
License instead of this License to a given copy of the Library. To do
|
||||||
|
this, you must alter all the notices that refer to this License, so
|
||||||
|
that they refer to the ordinary GNU General Public License, version 2,
|
||||||
|
instead of to this License. (If a newer version than version 2 of the
|
||||||
|
ordinary GNU General Public License has appeared, then you can specify
|
||||||
|
that version instead if you wish.) Do not make any other change in
|
||||||
|
these notices.
|
||||||
|
|
||||||
|
Once this change is made in a given copy, it is irreversible for
|
||||||
|
that copy, so the ordinary GNU General Public License applies to all
|
||||||
|
subsequent copies and derivative works made from that copy.
|
||||||
|
|
||||||
|
This option is useful when you wish to copy part of the code of
|
||||||
|
the Library into a program that is not a library.
|
||||||
|
|
||||||
|
4. You may copy and distribute the Library (or a portion or
|
||||||
|
derivative of it, under Section 2) in object code or executable form
|
||||||
|
under the terms of Sections 1 and 2 above provided that you accompany
|
||||||
|
it with the complete corresponding machine-readable source code, which
|
||||||
|
must be distributed under the terms of Sections 1 and 2 above on a
|
||||||
|
medium customarily used for software interchange.
|
||||||
|
|
||||||
|
If distribution of object code is made by offering access to copy
|
||||||
|
from a designated place, then offering equivalent access to copy the
|
||||||
|
source code from the same place satisfies the requirement to
|
||||||
|
distribute the source code, even though third parties are not
|
||||||
|
compelled to copy the source along with the object code.
|
||||||
|
|
||||||
|
5. A program that contains no derivative of any portion of the
|
||||||
|
Library, but is designed to work with the Library by being compiled or
|
||||||
|
linked with it, is called a "work that uses the Library". Such a
|
||||||
|
work, in isolation, is not a derivative work of the Library, and
|
||||||
|
therefore falls outside the scope of this License.
|
||||||
|
|
||||||
|
However, linking a "work that uses the Library" with the Library
|
||||||
|
creates an executable that is a derivative of the Library (because it
|
||||||
|
contains portions of the Library), rather than a "work that uses the
|
||||||
|
library". The executable is therefore covered by this License.
|
||||||
|
Section 6 states terms for distribution of such executables.
|
||||||
|
|
||||||
|
When a "work that uses the Library" uses material from a header file
|
||||||
|
that is part of the Library, the object code for the work may be a
|
||||||
|
derivative work of the Library even though the source code is not.
|
||||||
|
Whether this is true is especially significant if the work can be
|
||||||
|
linked without the Library, or if the work is itself a library. The
|
||||||
|
threshold for this to be true is not precisely defined by law.
|
||||||
|
|
||||||
|
If such an object file uses only numerical parameters, data
|
||||||
|
structure layouts and accessors, and small macros and small inline
|
||||||
|
functions (ten lines or less in length), then the use of the object
|
||||||
|
file is unrestricted, regardless of whether it is legally a derivative
|
||||||
|
work. (Executables containing this object code plus portions of the
|
||||||
|
Library will still fall under Section 6.)
|
||||||
|
|
||||||
|
Otherwise, if the work is a derivative of the Library, you may
|
||||||
|
distribute the object code for the work under the terms of Section 6.
|
||||||
|
Any executables containing that work also fall under Section 6,
|
||||||
|
whether or not they are linked directly with the Library itself.
|
||||||
|
|
||||||
|
6. As an exception to the Sections above, you may also combine or
|
||||||
|
link a "work that uses the Library" with the Library to produce a
|
||||||
|
work containing portions of the Library, and distribute that work
|
||||||
|
under terms of your choice, provided that the terms permit
|
||||||
|
modification of the work for the customer's own use and reverse
|
||||||
|
engineering for debugging such modifications.
|
||||||
|
|
||||||
|
You must give prominent notice with each copy of the work that the
|
||||||
|
Library is used in it and that the Library and its use are covered by
|
||||||
|
this License. You must supply a copy of this License. If the work
|
||||||
|
during execution displays copyright notices, you must include the
|
||||||
|
copyright notice for the Library among them, as well as a reference
|
||||||
|
directing the user to the copy of this License. Also, you must do one
|
||||||
|
of these things:
|
||||||
|
|
||||||
|
a) Accompany the work with the complete corresponding
|
||||||
|
machine-readable source code for the Library including whatever
|
||||||
|
changes were used in the work (which must be distributed under
|
||||||
|
Sections 1 and 2 above); and, if the work is an executable linked
|
||||||
|
with the Library, with the complete machine-readable "work that
|
||||||
|
uses the Library", as object code and/or source code, so that the
|
||||||
|
user can modify the Library and then relink to produce a modified
|
||||||
|
executable containing the modified Library. (It is understood
|
||||||
|
that the user who changes the contents of definitions files in the
|
||||||
|
Library will not necessarily be able to recompile the application
|
||||||
|
to use the modified definitions.)
|
||||||
|
|
||||||
|
b) Use a suitable shared library mechanism for linking with the
|
||||||
|
Library. A suitable mechanism is one that (1) uses at run time a
|
||||||
|
copy of the library already present on the user's computer system,
|
||||||
|
rather than copying library functions into the executable, and (2)
|
||||||
|
will operate properly with a modified version of the library, if
|
||||||
|
the user installs one, as long as the modified version is
|
||||||
|
interface-compatible with the version that the work was made with.
|
||||||
|
|
||||||
|
c) Accompany the work with a written offer, valid for at
|
||||||
|
least three years, to give the same user the materials
|
||||||
|
specified in Subsection 6a, above, for a charge no more
|
||||||
|
than the cost of performing this distribution.
|
||||||
|
|
||||||
|
d) If distribution of the work is made by offering access to copy
|
||||||
|
from a designated place, offer equivalent access to copy the above
|
||||||
|
specified materials from the same place.
|
||||||
|
|
||||||
|
e) Verify that the user has already received a copy of these
|
||||||
|
materials or that you have already sent this user a copy.
|
||||||
|
|
||||||
|
For an executable, the required form of the "work that uses the
|
||||||
|
Library" must include any data and utility programs needed for
|
||||||
|
reproducing the executable from it. However, as a special exception,
|
||||||
|
the materials to be distributed need not include anything that is
|
||||||
|
normally distributed (in either source or binary form) with the major
|
||||||
|
components (compiler, kernel, and so on) of the operating system on
|
||||||
|
which the executable runs, unless that component itself accompanies
|
||||||
|
the executable.
|
||||||
|
|
||||||
|
It may happen that this requirement contradicts the license
|
||||||
|
restrictions of other proprietary libraries that do not normally
|
||||||
|
accompany the operating system. Such a contradiction means you cannot
|
||||||
|
use both them and the Library together in an executable that you
|
||||||
|
distribute.
|
||||||
|
|
||||||
|
7. You may place library facilities that are a work based on the
|
||||||
|
Library side-by-side in a single library together with other library
|
||||||
|
facilities not covered by this License, and distribute such a combined
|
||||||
|
library, provided that the separate distribution of the work based on
|
||||||
|
the Library and of the other library facilities is otherwise
|
||||||
|
permitted, and provided that you do these two things:
|
||||||
|
|
||||||
|
a) Accompany the combined library with a copy of the same work
|
||||||
|
based on the Library, uncombined with any other library
|
||||||
|
facilities. This must be distributed under the terms of the
|
||||||
|
Sections above.
|
||||||
|
|
||||||
|
b) Give prominent notice with the combined library of the fact
|
||||||
|
that part of it is a work based on the Library, and explaining
|
||||||
|
where to find the accompanying uncombined form of the same work.
|
||||||
|
|
||||||
|
8. You may not copy, modify, sublicense, link with, or distribute
|
||||||
|
the Library except as expressly provided under this License. Any
|
||||||
|
attempt otherwise to copy, modify, sublicense, link with, or
|
||||||
|
distribute the Library is void, and will automatically terminate your
|
||||||
|
rights under this License. However, parties who have received copies,
|
||||||
|
or rights, from you under this License will not have their licenses
|
||||||
|
terminated so long as such parties remain in full compliance.
|
||||||
|
|
||||||
|
9. You are not required to accept this License, since you have not
|
||||||
|
signed it. However, nothing else grants you permission to modify or
|
||||||
|
distribute the Library or its derivative works. These actions are
|
||||||
|
prohibited by law if you do not accept this License. Therefore, by
|
||||||
|
modifying or distributing the Library (or any work based on the
|
||||||
|
Library), you indicate your acceptance of this License to do so, and
|
||||||
|
all its terms and conditions for copying, distributing or modifying
|
||||||
|
the Library or works based on it.
|
||||||
|
|
||||||
|
10. Each time you redistribute the Library (or any work based on the
|
||||||
|
Library), the recipient automatically receives a license from the
|
||||||
|
original licensor to copy, distribute, link with or modify the Library
|
||||||
|
subject to these terms and conditions. You may not impose any further
|
||||||
|
restrictions on the recipients' exercise of the rights granted herein.
|
||||||
|
You are not responsible for enforcing compliance by third parties with
|
||||||
|
this License.
|
||||||
|
|
||||||
|
11. If, as a consequence of a court judgment or allegation of patent
|
||||||
|
infringement or for any other reason (not limited to patent issues),
|
||||||
|
conditions are imposed on you (whether by court order, agreement or
|
||||||
|
otherwise) that contradict the conditions of this License, they do not
|
||||||
|
excuse you from the conditions of this License. If you cannot
|
||||||
|
distribute so as to satisfy simultaneously your obligations under this
|
||||||
|
License and any other pertinent obligations, then as a consequence you
|
||||||
|
may not distribute the Library at all. For example, if a patent
|
||||||
|
license would not permit royalty-free redistribution of the Library by
|
||||||
|
all those who receive copies directly or indirectly through you, then
|
||||||
|
the only way you could satisfy both it and this License would be to
|
||||||
|
refrain entirely from distribution of the Library.
|
||||||
|
|
||||||
|
If any portion of this section is held invalid or unenforceable under any
|
||||||
|
particular circumstance, the balance of the section is intended to apply,
|
||||||
|
and the section as a whole is intended to apply in other circumstances.
|
||||||
|
|
||||||
|
It is not the purpose of this section to induce you to infringe any
|
||||||
|
patents or other property right claims or to contest validity of any
|
||||||
|
such claims; this section has the sole purpose of protecting the
|
||||||
|
integrity of the free software distribution system which is
|
||||||
|
implemented by public license practices. Many people have made
|
||||||
|
generous contributions to the wide range of software distributed
|
||||||
|
through that system in reliance on consistent application of that
|
||||||
|
system; it is up to the author/donor to decide if he or she is willing
|
||||||
|
to distribute software through any other system and a licensee cannot
|
||||||
|
impose that choice.
|
||||||
|
|
||||||
|
This section is intended to make thoroughly clear what is believed to
|
||||||
|
be a consequence of the rest of this License.
|
||||||
|
|
||||||
|
12. If the distribution and/or use of the Library is restricted in
|
||||||
|
certain countries either by patents or by copyrighted interfaces, the
|
||||||
|
original copyright holder who places the Library under this License may add
|
||||||
|
an explicit geographical distribution limitation excluding those countries,
|
||||||
|
so that distribution is permitted only in or among countries not thus
|
||||||
|
excluded. In such case, this License incorporates the limitation as if
|
||||||
|
written in the body of this License.
|
||||||
|
|
||||||
|
13. The Free Software Foundation may publish revised and/or new
|
||||||
|
versions of the Lesser General Public License from time to time.
|
||||||
|
Such new versions will be similar in spirit to the present version,
|
||||||
|
but may differ in detail to address new problems or concerns.
|
||||||
|
|
||||||
|
Each version is given a distinguishing version number. If the Library
|
||||||
|
specifies a version number of this License which applies to it and
|
||||||
|
"any later version", you have the option of following the terms and
|
||||||
|
conditions either of that version or of any later version published by
|
||||||
|
the Free Software Foundation. If the Library does not specify a
|
||||||
|
license version number, you may choose any version ever published by
|
||||||
|
the Free Software Foundation.
|
||||||
|
|
||||||
|
14. If you wish to incorporate parts of the Library into other free
|
||||||
|
programs whose distribution conditions are incompatible with these,
|
||||||
|
write to the author to ask for permission. For software which is
|
||||||
|
copyrighted by the Free Software Foundation, write to the Free
|
||||||
|
Software Foundation; we sometimes make exceptions for this. Our
|
||||||
|
decision will be guided by the two goals of preserving the free status
|
||||||
|
of all derivatives of our free software and of promoting the sharing
|
||||||
|
and reuse of software generally.
|
||||||
|
|
||||||
|
NO WARRANTY
|
||||||
|
|
||||||
|
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
|
||||||
|
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
|
||||||
|
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
|
||||||
|
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
|
||||||
|
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
|
||||||
|
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||||
|
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
|
||||||
|
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
|
||||||
|
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
|
||||||
|
|
||||||
|
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
|
||||||
|
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
|
||||||
|
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
|
||||||
|
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
|
||||||
|
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
|
||||||
|
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
|
||||||
|
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
|
||||||
|
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
|
||||||
|
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
|
||||||
|
DAMAGES.
|
||||||
|
|
||||||
|
END OF TERMS AND CONDITIONS
|
||||||
|
|
||||||
|
How to Apply These Terms to Your New Libraries
|
||||||
|
|
||||||
|
If you develop a new library, and you want it to be of the greatest
|
||||||
|
possible use to the public, we recommend making it free software that
|
||||||
|
everyone can redistribute and change. You can do so by permitting
|
||||||
|
redistribution under these terms (or, alternatively, under the terms of the
|
||||||
|
ordinary General Public License).
|
||||||
|
|
||||||
|
To apply these terms, attach the following notices to the library. It is
|
||||||
|
safest to attach them to the start of each source file to most effectively
|
||||||
|
convey the exclusion of warranty; and each file should have at least the
|
||||||
|
"copyright" line and a pointer to where the full notice is found.
|
||||||
|
|
||||||
|
<one line to give the library's name and a brief idea of what it does.>
|
||||||
|
Copyright (C) <year> <name of author>
|
||||||
|
|
||||||
|
This library is free software; you can redistribute it and/or
|
||||||
|
modify it under the terms of the GNU Lesser General Public
|
||||||
|
License as published by the Free Software Foundation; either
|
||||||
|
version 2.1 of the License, or (at your option) any later version.
|
||||||
|
|
||||||
|
This library is distributed in the hope that it will be useful,
|
||||||
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
|
||||||
|
Lesser General Public License for more details.
|
||||||
|
|
||||||
|
You should have received a copy of the GNU Lesser General Public
|
||||||
|
License along with this library; if not, write to the Free Software
|
||||||
|
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||||
|
|
||||||
|
Also add information on how to contact you by electronic and paper mail.
|
||||||
|
|
||||||
|
You should also get your employer (if you work as a programmer) or your
|
||||||
|
school, if any, to sign a "copyright disclaimer" for the library, if
|
||||||
|
necessary. Here is a sample; alter the names:
|
||||||
|
|
||||||
|
Yoyodyne, Inc., hereby disclaims all copyright interest in the
|
||||||
|
library `Frob' (a library for tweaking knobs) written by James Random Hacker.
|
||||||
|
|
||||||
|
<signature of Ty Coon>, 1 April 1990
|
||||||
|
Ty Coon, President of Vice
|
||||||
|
|
||||||
|
That's all there is to it!
|
||||||
|
|
||||||
|
|
1
plugins/phorum/htmlpurifier/README
Normal file
@@ -0,0 +1 @@
|
|||||||
|
The contents of the library/ folder should be here.
|
8
plugins/phorum/info.txt
Normal file
@@ -0,0 +1,8 @@
|
|||||||
|
hook: format|phorum_htmlpurifier_format
|
||||||
|
hook: quote|phorum_htmlpurifier_quote
|
||||||
|
hook: posting_custom_action|phorum_htmlpurifier_posting
|
||||||
|
hook: common|phorum_htmlpurifier_common
|
||||||
|
hook: before_editor|phorum_htmlpurifier_before_editor
|
||||||
|
hook: tpl_editor_after_subject|phorum_htmlpurifier_editor_after_subject
|
||||||
|
title: HTML Purifier Phorum Mod
|
||||||
|
desc: This module enables standards-compliant HTML filtering on Phorum. Please check migrate.bbcode.php before enabling this mod.
|
27
plugins/phorum/init-config.php
Normal file
@@ -0,0 +1,27 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Initializes the appropriate configuration from either a PHP file
|
||||||
|
* or a module configuration value
|
||||||
|
* @return Instance of HTMLPurifier_Config
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_get_config() {
|
||||||
|
global $PHORUM;
|
||||||
|
$config_exists = phorum_htmlpurifier_config_file_exists();
|
||||||
|
if ($config_exists || !isset($PHORUM['mod_htmlpurifier']['config'])) {
|
||||||
|
$config = HTMLPurifier_Config::createDefault();
|
||||||
|
include(dirname(__FILE__) . '/config.default.php');
|
||||||
|
if ($config_exists) {
|
||||||
|
include(dirname(__FILE__) . '/config.php');
|
||||||
|
}
|
||||||
|
unset($PHORUM['mod_htmlpurifier']['config']); // unnecessary
|
||||||
|
} else {
|
||||||
|
$config = HTMLPurifier_Config::create($PHORUM['mod_htmlpurifier']['config']);
|
||||||
|
}
|
||||||
|
return $config;
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_config_file_exists() {
|
||||||
|
return file_exists(dirname(__FILE__) . '/config.php');
|
||||||
|
}
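Because config.php is include()d from inside phorum_htmlpurifier_get_config(), it can adjust the local $config object directly. A minimal sketch of such a file follows, assuming the two-argument namespace/key form of set() that this revision's tests use; the directive values are illustrative, not recommendations from the mod's author.

```php
<?php
// Hypothetical mods/htmlpurifier/config.php (illustrative values only).
// It is include()d from phorum_htmlpurifier_get_config(), so $config is
// the HTMLPurifier_Config instance created there.

// Block images and other resources hosted on third-party sites.
$config->set('URI', 'DisableExternalResources', true);

// Apply only light Tidy-style clean-up to legacy markup.
$config->set('HTML', 'TidyLevel', 'light');
```

Note that once this file exists, the web configuration form is replaced by a notice telling the administrator to edit the file by hand.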
|
||||||
|
|
33
plugins/phorum/install.txt
Normal file
@@ -0,0 +1,33 @@

HTML Purifier Phorum Mod - Filter your HTML the Standards-Compliant Way!

This Phorum mod enables HTML posting on Phorum. Under normal circumstances,
this would pose a huge security risk, but because we are running
HTML through HTML Purifier, output is guaranteed to be XSS free and
standards-compliant.

This mod requires HTML input, and previous markup languages need to be
converted accordingly. Thus, it is vital that you create a 'migrate.php'
file that works with your installation. If you're using the built-in
BBCode formatting, simply copy migrate.bbcode.php to migrate.php; for
other markup languages, consult said file for instructions on how
to adapt it to your needs.

This module will not work if 'migrate.php' is not created, and an improperly
made migration file may *CORRUPT* Phorum, so please take your time to
do this correctly. It should go without saying that you must
*BACK UP YOUR DATABASE* before attempting anything here.

This module will not automatically migrate user signatures, because this
process may take a long time. After installing the HTML Purifier module and
then configuring 'migrate.php', navigate to Settings and click 'Migrate
Signatures' to migrate all user signatures.

The version of HTML Purifier bundled with this mod is a custom-modified 2.0.1.
Do not attempt to replace it with an equal or earlier version
downloaded from the HTML Purifier website: the module will combust
spectacularly. (Later versions, however, are okay, because the changes
made to accommodate this module have been committed to the trunk.)

Visit HTML Purifier at <http://htmlpurifier.org/>. May the force
be with you.
28
plugins/phorum/migrate.bbcode.php
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
/**
|
||||||
|
* This file is responsible for migrating from a specific markup language
|
||||||
|
* like BBCode or Markdown to HTML. WARNING: THIS PROCESS IS NOT REVERSIBLE
|
||||||
|
*
|
||||||
|
* Copy this file to 'migrate.php' and it will automatically work for
|
||||||
|
* BBCode; you may need to tweak this a little to get it to work for other
|
||||||
|
* languages (usually, just replace the include name and the function name).
|
||||||
|
*
|
||||||
|
* If you do NOT want to have any migration performed (for instance, you
|
||||||
|
* are installing the module on a new forum with no posts), simply remove
|
||||||
|
* the phorum_htmlpurifier_migrate() function. You still need migrate.php
|
||||||
|
* present, otherwise the module won't work.
|
||||||
|
*/
|
||||||
|
|
||||||
|
if(!defined("PHORUM")) exit;
|
||||||
|
|
||||||
|
require_once(dirname(__FILE__) . "/../bbcode/bbcode.php");
|
||||||
|
|
||||||
|
/**
|
||||||
|
* 'format' hook style function that will be called to convert
|
||||||
|
* legacy markup into HTML.
|
||||||
|
*/
|
||||||
|
function phorum_htmlpurifier_migrate($data) {
|
||||||
|
return phorum_bb_code($data); // bbcode's 'format' hook
|
||||||
|
}
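For markup other than BBCode, the migration function has to follow the same 'format' hook contract: it receives an array of messages and returns it with each 'body' converted to HTML. The sketch below illustrates that shape only. The function names and the one-rule converter are hypothetical stand-ins; in a real migrate.php the outer function would be named phorum_htmlpurifier_migrate() and would call the actual formatter of the markup module in use.

```php
<?php
// Hypothetical stand-in for a real markup converter; it only turns
// *text* into <em>text</em> so the example stays self-contained.
function sketch_markup_to_html($text) {
    return preg_replace('/\*([^*]+)\*/', '<em>$1</em>', $text);
}

// 'format'-hook-style function: takes an array of messages and returns
// it with every 'body' converted to HTML.
function sketch_migrate($data) {
    foreach ($data as $key => $message) {
        if (isset($message['body'])) {
            $data[$key]['body'] = sketch_markup_to_html($message['body']);
        }
    }
    return $data;
}

// Same message shape the signature migration passes in.
$data = array(array('author' => '', 'email' => '', 'subject' => '', 'body' => 'a *b* c'));
list($message) = sketch_migrate($data);
echo $message['body'], "\n"; // prints "a <em>b</em> c"
```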
|
||||||
|
|
63
plugins/phorum/settings.php
Normal file
@@ -0,0 +1,63 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
// based off of BBCode's settings file
|
||||||
|
|
||||||
|
/**
|
||||||
|
* HTML Purifier Phorum mod settings configuration. This provides
|
||||||
|
* a convenient web-interface for editing the most common HTML Purifier
|
||||||
|
* configuration directives. You can also specify custom configuration
|
||||||
|
* by creating a 'config.php' file.
|
||||||
|
*/
|
||||||
|
|
||||||
|
if(!defined("PHORUM_ADMIN")) exit;
|
||||||
|
|
||||||
|
// error reporting is good!
|
||||||
|
error_reporting(E_ALL ^ E_NOTICE);
|
||||||
|
|
||||||
|
// load library and other paraphernalia
|
||||||
|
require_once './include/admin/PhorumInputForm.php';
|
||||||
|
require_once (dirname(__FILE__) . '/htmlpurifier/HTMLPurifier.auto.php');
|
||||||
|
require_once (dirname(__FILE__) . '/init-config.php');
|
||||||
|
require_once (dirname(__FILE__) . '/settings/migrate-sigs-form.php');
|
||||||
|
require_once (dirname(__FILE__) . '/settings/migrate-sigs.php');
|
||||||
|
require_once (dirname(__FILE__) . '/settings/form.php');
|
||||||
|
require_once (dirname(__FILE__) . '/settings/save.php');
|
||||||
|
|
||||||
|
// define friendly configuration directives. you can expand this array
|
||||||
|
// to get more web-definable directives
|
||||||
|
$PHORUM['mod_htmlpurifier']['directives'] = array(
|
||||||
|
'URI.Host', // auto-detectable
|
||||||
|
'URI.DisableExternal',
|
||||||
|
'URI.DisableExternalResources',
|
||||||
|
'URI.DisableResources',
|
||||||
|
'URI.Munge',
|
||||||
|
'URI.HostBlacklist',
|
||||||
|
'URI.Disable',
|
||||||
|
'HTML.TidyLevel',
|
||||||
|
'HTML.Doctype', // auto-detectable
|
||||||
|
'HTML.Allowed',
|
||||||
|
'AutoFormat',
|
||||||
|
'-AutoFormat.Custom',
|
||||||
|
'-AutoFormat.PurifierLinkify',
|
||||||
|
'Output.TidyFormat',
|
||||||
|
);
|
||||||
|
|
||||||
|
// lower this setting if you're getting timeouts or out-of-memory errors
|
||||||
|
$PHORUM['mod_htmlpurifier']['migrate-sigs-increment'] = 100;
|
||||||
|
|
||||||
|
if (isset($_POST['reset'])) {
|
||||||
|
unset($PHORUM['mod_htmlpurifier']['config']);
|
||||||
|
}
|
||||||
|
|
||||||
|
if ($offset = phorum_htmlpurifier_migrate_sigs_check()) {
|
||||||
|
// migrate signatures
|
||||||
|
phorum_htmlpurifier_migrate_sigs($offset);
|
||||||
|
} elseif(!empty($_POST)){
|
||||||
|
// save settings
|
||||||
|
phorum_htmlpurifier_save_settings();
|
||||||
|
}
|
||||||
|
|
||||||
|
phorum_htmlpurifier_show_migrate_sigs_form();
|
||||||
|
echo '<br />';
|
||||||
|
phorum_htmlpurifier_show_form();
|
||||||
|
|
79
plugins/phorum/settings/form.php
Normal file
@@ -0,0 +1,79 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_show_form() {
|
||||||
|
if (phorum_htmlpurifier_config_file_exists()) {
|
||||||
|
phorum_htmlpurifier_show_config_info();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
global $PHORUM;
|
||||||
|
|
||||||
|
$config = phorum_htmlpurifier_get_config();
|
||||||
|
|
||||||
|
$frm = new PhorumInputForm ("", "post", "Save");
|
||||||
|
$frm->hidden("module", "modsettings");
|
||||||
|
$frm->hidden("mod", "htmlpurifier"); // this is the directory name that the Settings file lives in
|
||||||
|
|
||||||
|
if (!empty($error)){
|
||||||
|
echo "$error<br />";
|
||||||
|
}
|
||||||
|
|
||||||
|
$frm->addbreak("Edit settings for the HTML Purifier module");
|
||||||
|
|
||||||
|
$frm->addMessage('<p>Click on directive links to read what each option does
|
||||||
|
(links do not open in new windows).</p>
|
||||||
|
<p>For more flexibility (for instance, you want to edit the full
|
||||||
|
range of configuration directives), you can create a <tt>config.php</tt>
|
||||||
|
file in your <tt>mods/htmlpurifier/</tt> directory. Doing so will,
|
||||||
|
however, make the web configuration interface unavailable.</p>');
|
||||||
|
|
||||||
|
require_once 'HTMLPurifier/Printer/ConfigForm.php';
|
||||||
|
$htmlpurifier_form = new HTMLPurifier_Printer_ConfigForm('config', 'http://htmlpurifier.org/live/configdoc/plain.html#%s');
|
||||||
|
$htmlpurifier_form->setTextareaDimensions(23, 7); // widen a little, since we have space
|
||||||
|
|
||||||
|
$frm->addMessage($htmlpurifier_form->render(
|
||||||
|
$config, $PHORUM['mod_htmlpurifier']['directives'], false));
|
||||||
|
|
||||||
|
$frm->addMessage("<strong>Warning: Changing HTML Purifier's configuration will invalidate
|
||||||
|
the cache. Expect to see a flurry of database activity after you change
|
||||||
|
any of these settings.</strong>");
|
||||||
|
|
||||||
|
$frm->addrow('Reset to defaults:', $frm->checkbox("reset", "1", "", false));
|
||||||
|
|
||||||
|
// hack to include extra styling
|
||||||
|
echo '<style type="text/css">' . $htmlpurifier_form->getCSS() . '
|
||||||
|
.hp-config {margin-left:auto;margin-right:auto;}
|
||||||
|
</style>';
|
||||||
|
$js = $htmlpurifier_form->getJavaScript();
|
||||||
|
echo '<script type="text/javascript">'."<!--\n$js\n//-->".'</script>';
|
||||||
|
|
||||||
|
$frm->show();
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_show_config_info() {
|
||||||
|
global $PHORUM;
|
||||||
|
|
||||||
|
// update mod_htmlpurifier for housekeeping
|
||||||
|
phorum_htmlpurifier_commit_settings();
|
||||||
|
|
||||||
|
// politely tell user how to edit settings manually
|
||||||
|
?>
|
||||||
|
<div class="input-form-td-break">How to edit settings for HTML Purifier module</div>
|
||||||
|
<p>
|
||||||
|
A <tt>config.php</tt> file exists in your <tt>mods/htmlpurifier/</tt>
|
||||||
|
directory. This file contains your custom configuration: in order to
|
||||||
|
change it, please navigate to that file and edit it accordingly.
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
To use the web interface, delete <tt>config.php</tt> (or rename it to
|
||||||
|
<tt>config.php.bak</tt>).
|
||||||
|
</p>
|
||||||
|
<p>
|
||||||
|
<strong>Warning: Changing HTML Purifier's configuration will invalidate
|
||||||
|
the cache. Expect to see a flurry of database activity after you change
|
||||||
|
any of these settings.</strong>
|
||||||
|
</p>
|
||||||
|
<?php
|
||||||
|
|
||||||
|
}
|
||||||
|
|
21
plugins/phorum/settings/migrate-sigs-form.php
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_show_migrate_sigs_form() {
|
||||||
|
|
||||||
|
$frm = new PhorumInputForm ('', "post", "Migrate");
|
||||||
|
$frm->hidden("module", "modsettings");
|
||||||
|
$frm->hidden("mod", "htmlpurifier");
|
||||||
|
$frm->hidden("migrate-sigs", "1");
|
||||||
|
$frm->addbreak("Migrate user signatures to HTML");
|
||||||
|
$frm->addMessage('This operation will migrate all user signatures
|
||||||
|
to HTML. <strong>This process is irreversible and must only be performed once.</strong>
|
||||||
|
Type in yes in the confirmation field to migrate.');
|
||||||
|
if (!file_exists(dirname(__FILE__) . '/../migrate.php')) {
|
||||||
|
$frm->addMessage('Migration file does not exist, cannot migrate signatures.
|
||||||
|
Please check <tt>migrate.bbcode.php</tt> on how to create an appropriate file.');
|
||||||
|
} else {
|
||||||
|
$frm->addrow('Confirm:', $frm->text_box("confirmation", ""));
|
||||||
|
}
|
||||||
|
$frm->show();
|
||||||
|
}
|
||||||
|
|
85
plugins/phorum/settings/migrate-sigs.php
Normal file
@@ -0,0 +1,85 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_migrate_sigs_check() {
|
||||||
|
global $PHORUM;
|
||||||
|
$offset = 0;
|
||||||
|
if (!empty($_POST['migrate-sigs'])) {
|
||||||
|
if (!isset($_POST['confirmation']) || strtolower($_POST['confirmation']) !== 'yes') {
|
||||||
|
echo 'Invalid confirmation code.';
|
||||||
|
exit;
|
||||||
|
}
|
||||||
|
$PHORUM['mod_htmlpurifier']['migrate-sigs'] = true;
|
||||||
|
phorum_db_update_settings(array("mod_htmlpurifier"=>$PHORUM["mod_htmlpurifier"]));
|
||||||
|
$offset = 1;
|
||||||
|
} elseif (!empty($_GET['migrate-sigs']) && $PHORUM['mod_htmlpurifier']['migrate-sigs']) {
|
||||||
|
$offset = (int) $_GET['migrate-sigs'];
|
||||||
|
}
|
||||||
|
return $offset;
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_migrate_sigs($offset) {
|
||||||
|
global $PHORUM;
|
||||||
|
|
||||||
|
if(!$offset) return; // bail out quick if $offset == 0
|
||||||
|
|
||||||
|
// theoretically, we could get rid of this multi-request
|
||||||
|
// doo-hickery if safe mode is off
|
||||||
|
@set_time_limit(0); // attempt to let this run
|
||||||
|
$increment = $PHORUM['mod_htmlpurifier']['migrate-sigs-increment'];
|
||||||
|
|
||||||
|
require_once(dirname(__FILE__) . '/../migrate.php');
|
||||||
|
// migrate signatures
|
||||||
|
// do this in batches so we don't run out of time/space
|
||||||
|
$end = $offset + $increment;
|
||||||
|
$user_ids = array();
|
||||||
|
for ($i = $offset; $i < $end; $i++) {
|
||||||
|
$user_ids[] = $i;
|
||||||
|
}
|
||||||
|
$userinfos = phorum_db_user_get_fields($user_ids, 'signature');
|
||||||
|
foreach ($userinfos as $i => $user) {
|
||||||
|
if (empty($user['signature'])) continue;
|
||||||
|
$sig = $user['signature'];
|
||||||
|
// perform standard Phorum processing on the sig
|
||||||
|
$sig = str_replace(array("&","<",">"), array("&amp;","&lt;","&gt;"), $sig);
|
||||||
|
$sig = preg_replace("/&lt;((http|https|ftp):\/\/[a-z0-9;\/\?:@=\&\$\-_\.\+!*'\(\),~%]+?)&gt;/i", "$1", $sig);
|
||||||
|
// prepare fake data to pass to migration function
|
||||||
|
$fake_data = array(array("author"=>"", "email"=>"", "subject"=>"", 'body' => $sig));
|
||||||
|
list($fake_message) = phorum_htmlpurifier_migrate($fake_data);
|
||||||
|
$user['signature'] = $fake_message['body'];
|
||||||
|
if (!phorum_user_save($user)) {
|
||||||
|
exit('Error while saving user data');
|
||||||
|
}
|
||||||
|
}
|
||||||
|
unset($userinfos); // free up memory
|
||||||
|
|
||||||
|
// query for highest ID in database
|
||||||
|
$type = $PHORUM['DBCONFIG']['type'];
|
||||||
|
$sql = "select MAX(user_id) from {$PHORUM['user_table']}";
|
||||||
|
if ($type == 'mysql') {
|
||||||
|
$conn = phorum_db_mysql_connect();
|
||||||
|
$res = mysql_query($sql, $conn);
|
||||||
|
$row = mysql_fetch_row($res);
|
||||||
|
} elseif ($type == 'mysqli') {
|
||||||
|
$conn = phorum_db_mysqli_connect();
|
||||||
|
$res = mysqli_query($conn, $sql);
|
||||||
|
$row = mysqli_fetch_row($res);
|
||||||
|
} else {
|
||||||
|
exit('Unrecognized database!');
|
||||||
|
}
|
||||||
|
$top_id = (int) $row[0];
|
||||||
|
|
||||||
|
$offset += $increment;
|
||||||
|
if ($offset > $top_id) { // test for end condition
|
||||||
|
echo 'Migration finished';
|
||||||
|
$PHORUM['mod_htmlpurifier']['migrate-sigs'] = false;
|
||||||
|
phorum_htmlpurifier_commit_settings();
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
$host = $_SERVER['HTTP_HOST'];
|
||||||
|
$uri = rtrim(dirname($_SERVER['PHP_SELF']), '/\\');
|
||||||
|
$extra = 'admin.php?module=modsettings&mod=htmlpurifier&migrate-sigs=' . $offset;
|
||||||
|
// relies on output buffering to work
|
||||||
|
header("Location: http://$host$uri/$extra");
|
||||||
|
exit;
|
||||||
|
|
||||||
|
}
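The multi-request migration above comes down to simple offset arithmetic: each request handles the user IDs from $offset up to, but not including, $offset + $increment, then redirects to the next offset until it passes the highest user_id. A minimal sketch of just that arithmetic, with no Phorum or database calls; the function names are hypothetical and $increment is assumed to be at least 1.

```php
<?php
// Sketch of the batch arithmetic only; these helpers are not part of the mod.

// IDs handled by one request: $offset .. $offset + $increment - 1.
function sketch_batch_ids($offset, $increment) {
    return range($offset, $offset + $increment - 1);
}

// Offset for the next request, or null once it passes the highest user ID.
function sketch_next_offset($offset, $increment, $top_id) {
    $next = $offset + $increment;
    return ($next > $top_id) ? null : $next;
}

// Walk a forum whose highest user_id is 250 in batches of 100.
$offset = 1;
$starts = array();
while ($offset !== null) {
    $starts[] = $offset;
    $offset = sketch_next_offset($offset, 100, 250);
}
echo implode(',', $starts), "\n"; // prints "1,101,201"
```

Lowering the increment makes each request cheaper at the cost of more redirects, which is the trade-off behind the 'migrate-sigs-increment' setting.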
|
23
plugins/phorum/settings/save.php
Normal file
@@ -0,0 +1,23 @@
|
|||||||
|
<?php
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_save_settings() {
|
||||||
|
global $PHORUM;
|
||||||
|
if (phorum_htmlpurifier_config_file_exists()) {
|
||||||
|
echo "Cannot update settings, <code>mods/htmlpurifier/config.php</code> already exists. To change
|
||||||
|
settings, edit that file. To use the web form, delete that file.<br />";
|
||||||
|
} else {
|
||||||
|
$config = phorum_htmlpurifier_get_config();
|
||||||
|
if (!isset($_POST['reset'])) $config->mergeArrayFromForm($_POST, 'config', $PHORUM['mod_htmlpurifier']['directives']);
|
||||||
|
$PHORUM['mod_htmlpurifier']['config'] = $config->getAll();
|
||||||
|
if(!phorum_htmlpurifier_commit_settings()){
|
||||||
|
echo "Database error while updating settings.<br />"; // report the failure; a local $error would never be shown
|
||||||
|
} else {
|
||||||
|
echo "Settings Updated<br />";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function phorum_htmlpurifier_commit_settings() {
|
||||||
|
global $PHORUM;
|
||||||
|
return phorum_db_update_settings(array("mod_htmlpurifier"=>$PHORUM["mod_htmlpurifier"]));
|
||||||
|
}
|
@@ -31,7 +31,7 @@ while (false !== ($filename = readdir($dh))) {
|
|||||||
if ($filename == 'all.php') continue;
|
if ($filename == 'all.php') continue;
|
||||||
if ($filename == 'testSchema.php') continue;
|
if ($filename == 'testSchema.php') continue;
|
||||||
?>
|
?>
|
||||||
<iframe src="<?php echo escapeHTML($filename); ?>"></iframe>
|
<iframe src="<?php echo escapeHTML($filename); if (isset($_GET['standalone'])) {echo '?standalone';} ?>"></iframe>
|
||||||
<?php
|
<?php
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@@ -2,7 +2,11 @@
|
|||||||
|
|
||||||
header('Content-type: text/html; charset=UTF-8');
|
header('Content-type: text/html; charset=UTF-8');
|
||||||
|
|
||||||
require_once '../library/HTMLPurifier.auto.php';
|
if (!isset($_GET['standalone'])) {
|
||||||
|
require_once '../library/HTMLPurifier.auto.php';
|
||||||
|
} else {
|
||||||
|
require_once '../library/HTMLPurifier.standalone.php';
|
||||||
|
}
|
||||||
error_reporting(E_ALL | E_STRICT);
|
error_reporting(E_ALL | E_STRICT);
|
||||||
|
|
||||||
function escapeHTML($string) {
|
function escapeHTML($string) {
|
||||||
|
@@ -37,3 +37,7 @@ HTMLPurifier_ConfigSchema::defineNamespace('ReportCard', 'It is for grades.');
|
|||||||
HTMLPurifier_ConfigSchema::define('ReportCard', 'English', null, 'string/null', 'Grade from English class.');
|
HTMLPurifier_ConfigSchema::define('ReportCard', 'English', null, 'string/null', 'Grade from English class.');
|
||||||
HTMLPurifier_ConfigSchema::define('ReportCard', 'Absences', 0, 'int', 'How many times missing from school?');
|
HTMLPurifier_ConfigSchema::define('ReportCard', 'Absences', 0, 'int', 'How many times missing from school?');
|
||||||
|
|
||||||
|
HTMLPurifier_ConfigSchema::defineNamespace('Text', 'This stuff is long, boring, and English.');
|
||||||
|
HTMLPurifier_ConfigSchema::define('Text', 'AboutUs', 'Nothing much, but this should be decently long so that a textarea would be better', 'text', 'Who are we? What are we up to?');
|
||||||
|
HTMLPurifier_ConfigSchema::define('Text', 'Hash', "not-case-sensitive\nstill-not-case-sensitive\nsuper-not-case-sensitive", 'itext', 'This is of limited utility, but of course it ends up being used.');
|
||||||
|
|
||||||
|
@@ -1,16 +1,20 @@
|
|||||||
<?php
|
<?php
|
||||||
|
|
||||||
// This file is necessary to run the unit tests and profiling
|
// ATTENTION! DO NOT EDIT THIS FILE!
|
||||||
// scripts.
|
// This file is necessary to run the unit tests and profiling scripts.
|
||||||
|
// Please copy it to 'test-settings.php' and make the necessary edits.
|
||||||
|
|
||||||
// Is PEAR available on your system? If it isn't, set to false. If PEAR
|
// Some of these scripts run a long time, so it is recommended that you
|
||||||
// is not part of the default include_path, add it.
|
// turn off the time limit
|
||||||
$GLOBALS['HTMLPurifierTest']['PEAR'] = true;
|
set_time_limit(0);
|
||||||
|
|
||||||
|
// Turning off output buffering will prevent mysterious errors from core dumps
|
||||||
|
@ob_end_flush();
|
||||||
|
|
||||||
|
// Where is SimpleTest located?
|
||||||
|
$simpletest_location = '/path/to/simpletest/';
|
||||||
|
|
||||||
// How many times should profiling scripts iterate over the function? More runs
|
// How many times should profiling scripts iterate over the function? More runs
|
||||||
// means more accurate results, but they'll take longer to perform.
|
// means more accurate results, but they'll take longer to perform.
|
||||||
$GLOBALS['HTMLPurifierTest']['Runs'] = 2;
|
$GLOBALS['HTMLPurifierTest']['Runs'] = 2;
|
||||||
|
|
||||||
// Where is SimpleTest located?
|
|
||||||
$simpletest_location = '/path/to/simpletest/';
|
|
||||||
|
|
||||||
|
@@ -54,14 +54,14 @@ function isInScopes($array = array()) {
|
|||||||
}
|
}
|
||||||
/**#@-*/
|
/**#@-*/
|
||||||
|
|
||||||
function printTokens($tokens, $index) {
|
function printTokens($tokens, $index = null) {
|
||||||
$string = '<pre>';
|
$string = '<pre>';
|
||||||
$generator = new HTMLPurifier_Generator();
|
$generator = new HTMLPurifier_Generator();
|
||||||
foreach ($tokens as $i => $token) {
|
foreach ($tokens as $i => $token) {
|
||||||
if ($index == $i) $string .= '[<strong>';
|
if ($index === $i) $string .= '[<strong>';
|
||||||
$string .= "<sup>$i</sup>";
|
$string .= "<sup>$i</sup>";
|
||||||
$string .= $generator->escape($generator->generateFromToken($token));
|
$string .= $generator->escape($generator->generateFromToken($token));
|
||||||
if ($index == $i) $string .= '</strong>]';
|
if ($index === $i) $string .= '</strong>]';
|
||||||
}
|
}
|
||||||
$string .= '</pre>';
|
$string .= '</pre>';
|
||||||
echo $string;
|
echo $string;
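The switch from == to === matters once $index defaults to null: under PHP's loose comparison null == 0 is true, so the first token would be highlighted even when no index was requested. A small sketch of the difference, using a hypothetical helper that mirrors the comparison in printTokens():

```php
<?php
// Hypothetical helper: returns the positions that would be highlighted
// for a given $index, using loose or strict comparison.
function sketch_highlighted($index, $count, $strict) {
    $hits = array();
    for ($i = 0; $i < $count; $i++) {
        $match = $strict ? ($index === $i) : ($index == $i);
        if ($match) {
            $hits[] = $i;
        }
    }
    return $hits;
}

var_dump(sketch_highlighted(null, 3, false)); // loose: position 0 matches
var_dump(sketch_highlighted(null, 3, true));  // strict: nothing matches
var_dump(sketch_highlighted(1, 3, true));     // strict: only position 1 matches
```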
|
||||||
|
@@ -67,6 +67,7 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
|
|||||||
$this->assertDef('border:1px solid #000;');
|
$this->assertDef('border:1px solid #000;');
|
||||||
$this->assertDef('border-bottom:2em double #FF00FA;');
|
$this->assertDef('border-bottom:2em double #FF00FA;');
|
||||||
$this->assertDef('border-collapse:collapse;');
|
$this->assertDef('border-collapse:collapse;');
|
||||||
|
$this->assertDef('border-collapse:separate;');
|
||||||
$this->assertDef('caption-side:top;');
|
$this->assertDef('caption-side:top;');
|
||||||
$this->assertDef('vertical-align:middle;');
|
$this->assertDef('vertical-align:middle;');
|
||||||
$this->assertDef('vertical-align:12px;');
|
$this->assertDef('vertical-align:12px;');
|
||||||
@@ -79,6 +80,8 @@ class HTMLPurifier_AttrDef_CSSTest extends HTMLPurifier_AttrDefHarness
|
|||||||
$this->assertDef('background-repeat:repeat-y;');
|
$this->assertDef('background-repeat:repeat-y;');
|
||||||
$this->assertDef('background-attachment:fixed;');
|
$this->assertDef('background-attachment:fixed;');
|
||||||
$this->assertDef('background-position:left 90%;');
|
$this->assertDef('background-position:left 90%;');
|
||||||
|
$this->assertDef('border-spacing:1em;');
|
||||||
|
$this->assertDef('border-spacing:1em 2em;');
|
||||||
|
|
||||||
// duplicates
|
// duplicates
|
||||||
$this->assertDef('text-align:right;text-align:left;',
|
$this->assertDef('text-align:right;text-align:left;',
|
||||||
|
@@ -11,18 +11,19 @@ class HTMLPurifier_AttrTransform_BdoDirTest extends HTMLPurifier_AttrTransformHa
|
|||||||
$this->obj = new HTMLPurifier_AttrTransform_BdoDir();
|
$this->obj = new HTMLPurifier_AttrTransform_BdoDir();
|
||||||
}
|
}
|
||||||
|
|
||||||
function test() {
|
function testAddDefaultDir() {
|
||||||
|
|
||||||
$this->assertResult( array(), array('dir' => 'ltr') );
|
$this->assertResult( array(), array('dir' => 'ltr') );
|
||||||
|
}
|
||||||
|
|
||||||
// leave existing dir alone
|
function testPreserveExistingDir() {
|
||||||
$this->assertResult( array('dir' => 'rtl') );
|
$this->assertResult( array('dir' => 'rtl') );
|
||||||
|
}
|
||||||
|
|
||||||
// use a different default
|
function testAlternateDefault() {
|
||||||
|
$this->config->set('Attr', 'DefaultTextDir', 'rtl');
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array(),
|
array(),
|
||||||
array('dir' => 'rtl'),
|
array('dir' => 'rtl')
|
||||||
array('Attr.DefaultTextDir' => 'rtl')
|
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -3,6 +3,10 @@
|
|||||||
require_once 'HTMLPurifier/AttrTransform/BgColor.php';
|
require_once 'HTMLPurifier/AttrTransform/BgColor.php';
|
||||||
require_once 'HTMLPurifier/AttrTransformHarness.php';
|
require_once 'HTMLPurifier/AttrTransformHarness.php';
|
||||||
|
|
||||||
|
// we currently rely on the CSS validator to fix any problems.
|
||||||
|
// This means that this transform, strictly speaking, supports
|
||||||
|
// a superset of the functionality.
|
||||||
|
|
||||||
class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformHarness
|
class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformHarness
|
||||||
{
|
{
|
||||||
|
|
||||||
@@ -11,31 +15,31 @@ class HTMLPurifier_AttrTransform_BgColorTest extends HTMLPurifier_AttrTransformH
|
|||||||
$this->obj = new HTMLPurifier_AttrTransform_BgColor();
|
$this->obj = new HTMLPurifier_AttrTransform_BgColor();
|
||||||
}
|
}
|
||||||
|
|
||||||
function test() {
|
function testEmptyInput() {
|
||||||
|
|
||||||
$this->assertResult( array() );
|
$this->assertResult( array() );
|
||||||
|
}
|
||||||
|
|
||||||
// we currently rely on the CSS validator to fix any problems.
|
function testBasicTransform() {
|
||||||
// This means that this transform, strictly speaking, supports
|
|
||||||
// a superset of the functionality.
|
|
||||||
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('bgcolor' => '#000000'),
|
array('bgcolor' => '#000000'),
|
||||||
array('style' => 'background-color:#000000;')
|
array('style' => 'background-color:#000000;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testPrependNewCSS() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('bgcolor' => '#000000', 'style' => 'font-weight:bold'),
|
array('bgcolor' => '#000000', 'style' => 'font-weight:bold'),
|
||||||
array('style' => 'background-color:#000000;font-weight:bold')
|
array('style' => 'background-color:#000000;font-weight:bold')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testLenientTreatmentOfInvalidInput() {
|
||||||
// this may change when we natively support the datatype and
|
// this may change when we natively support the datatype and
|
||||||
// validate its contents before forwarding it on
|
// validate its contents before forwarding it on
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('bgcolor' => '#F00'),
|
array('bgcolor' => '#F00'),
|
||||||
array('style' => 'background-color:#F00;')
|
array('style' => 'background-color:#F00;')
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -11,27 +11,29 @@ class HTMLPurifier_AttrTransform_BoolToCSSTest extends HTMLPurifier_AttrTransfor
|
|||||||
$this->obj = new HTMLPurifier_AttrTransform_BoolToCSS('foo', 'bar:3in;');
|
$this->obj = new HTMLPurifier_AttrTransform_BoolToCSS('foo', 'bar:3in;');
|
||||||
}
|
}
|
||||||
|
|
||||||
function test() {
|
function testEmptyInput() {
|
||||||
|
|
||||||
$this->assertResult( array() );
|
$this->assertResult( array() );
|
||||||
|
}
|
||||||
|
|
||||||
|
function testBasicTransform() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('foo' => 'foo'),
|
array('foo' => 'foo'),
|
||||||
array('style' => 'bar:3in;')
|
array('style' => 'bar:3in;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
// boolean attribute just has to be set: we don't care about
|
function testIgnoreValueOfBooleanAttribute() {
|
||||||
// anything else
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('foo' => 'no'),
|
array('foo' => 'no'),
|
||||||
array('style' => 'bar:3in;')
|
array('style' => 'bar:3in;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testPrependCSS() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('foo' => 'foo', 'style' => 'background-color:#F00;'),
|
array('foo' => 'foo', 'style' => 'background-color:#F00;'),
|
||||||
array('style' => 'bar:3in;background-color:#F00;')
|
array('style' => 'bar:3in;background-color:#F00;')
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -12,27 +12,29 @@ class HTMLPurifier_AttrTransform_BorderTest extends HTMLPurifier_AttrTransformHa
|
|||||||
$this->obj = new HTMLPurifier_AttrTransform_Border();
|
$this->obj = new HTMLPurifier_AttrTransform_Border();
|
||||||
}
|
}
|
||||||
|
|
||||||
function test() {
|
function testEmptyInput() {
|
||||||
|
|
||||||
$this->assertResult( array() );
|
$this->assertResult( array() );
|
||||||
|
}
|
||||||
|
|
||||||
|
function testBasicTransform() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('border' => '1'),
|
array('border' => '1'),
|
||||||
array('style' => 'border:1px solid;')
|
array('style' => 'border:1px solid;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
// once again, no validation done here, we expect CSS validator
|
function testLenientTreatmentOfInvalidInput() {
|
||||||
// to catch it
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('border' => '10%'),
|
array('border' => '10%'),
|
||||||
array('style' => 'border:10%px solid;')
|
array('style' => 'border:10%px solid;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testPrependNewCSS() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('border' => '23', 'style' => 'font-weight:bold;'),
|
array('border' => '23', 'style' => 'font-weight:bold;'),
|
||||||
array('style' => 'border:23px solid;font-weight:bold;')
|
array('style' => 'border:23px solid;font-weight:bold;')
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -6,38 +6,44 @@ require_once 'HTMLPurifier/AttrTransformHarness.php';
|
|||||||
class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransformHarness
|
class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransformHarness
|
||||||
{
|
{
|
||||||
|
|
||||||
function testRegular() {
|
function setUp() {
|
||||||
|
parent::setUp();
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||||
'left' => 'text-align:left;',
|
'left' => 'text-align:left;',
|
||||||
'right' => 'text-align:right;'
|
'right' => 'text-align:right;'
|
||||||
));
|
));
|
||||||
|
}
|
||||||
|
|
||||||
// leave empty arrays alone
|
function testEmptyInput() {
|
||||||
$this->assertResult( array() );
|
$this->assertResult( array() );
|
||||||
|
}
|
||||||
|
|
||||||
// leave arrays without interesting stuff alone
|
function testPreserveArraysWithoutInterestingAttributes() {
|
||||||
$this->assertResult( array('style' => 'font-weight:bold;') );
|
$this->assertResult( array('style' => 'font-weight:bold;') );
|
||||||
|
}
|
||||||
|
|
||||||
// test each of the conversions
|
function testConvertAlignLeft() {
|
||||||
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'left'),
|
array('align' => 'left'),
|
||||||
array('style' => 'text-align:left;')
|
array('style' => 'text-align:left;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testConvertAlignRight() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'right'),
|
array('align' => 'right'),
|
||||||
array('style' => 'text-align:right;')
|
array('style' => 'text-align:right;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
// drop garbage value
|
function testRemoveInvalidAlign() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'invalid'),
|
array('align' => 'invalid'),
|
||||||
array()
|
array()
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
// test CSS munging
|
function testPrependNewCSS() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'left', 'style' => 'font-weight:bold;'),
|
array('align' => 'left', 'style' => 'font-weight:bold;'),
|
||||||
array('style' => 'text-align:left;font-weight:bold;')
|
array('style' => 'text-align:left;font-weight:bold;')
|
||||||
@@ -46,31 +52,23 @@ class HTMLPurifier_AttrTransform_EnumToCSSTest extends HTMLPurifier_AttrTransfor
|
|||||||
}
|
}
|
||||||
|
|
||||||
function testCaseInsensitive() {
|
function testCaseInsensitive() {
|
||||||
|
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||||
'right' => 'text-align:right;'
|
'right' => 'text-align:right;'
|
||||||
));
|
));
|
||||||
|
|
||||||
// test case insensitivity
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'RIGHT'),
|
array('align' => 'RIGHT'),
|
||||||
array('style' => 'text-align:right;')
|
array('style' => 'text-align:right;')
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
function testCaseSensitive() {
|
function testCaseSensitive() {
|
||||||
|
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
$this->obj = new HTMLPurifier_AttrTransform_EnumToCSS('align', array(
|
||||||
'right' => 'text-align:right;'
|
'right' => 'text-align:right;'
|
||||||
), true);
|
), true);
|
||||||
|
|
||||||
// test case insensitivity
|
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('align' => 'RIGHT'),
|
array('align' => 'RIGHT'),
|
||||||
array()
|
array()
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -11,39 +11,37 @@ class HTMLPurifier_AttrTransform_ImgRequiredTest extends HTMLPurifier_AttrTransf
|
|||||||
$this->obj = new HTMLPurifier_AttrTransform_ImgRequired();
|
$this->obj = new HTMLPurifier_AttrTransform_ImgRequired();
|
||||||
}
|
}
|
||||||
|
|
||||||
function test() {
|
function testAddMissingAttr() {
|
||||||
|
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array(),
|
array(),
|
||||||
array('src' => '', 'alt' => 'Invalid image'),
|
array('src' => '', 'alt' => 'Invalid image')
|
||||||
array(
|
|
||||||
'Core.RemoveInvalidImg' => false
|
|
||||||
)
|
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testAlternateDefaults() {
|
||||||
|
$this->config->set('Attr', 'DefaultInvalidImage', 'blank.png');
|
||||||
|
$this->config->set('Attr', 'DefaultInvalidImageAlt', 'Pawned!');
|
||||||
|
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array(),
|
array(),
|
||||||
array('src' => 'blank.png', 'alt' => 'Pawned!'),
|
array('src' => 'blank.png', 'alt' => 'Pawned!')
|
||||||
array(
|
|
||||||
'Attr.DefaultInvalidImage' => 'blank.png',
|
|
||||||
'Attr.DefaultInvalidImageAlt' => 'Pawned!',
|
|
||||||
'Core.RemoveInvalidImg' => false
|
|
||||||
)
|
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testGenerateAlt() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('src' => '/path/to/foobar.png'),
|
array('src' => '/path/to/foobar.png'),
|
||||||
array('src' => '/path/to/foobar.png', 'alt' => 'foobar.png')
|
array('src' => '/path/to/foobar.png', 'alt' => 'foobar.png')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testAddDefaultSrc() {
|
||||||
|
$this->config->set('Core', 'RemoveInvalidImg', false);
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('alt' => 'intrigue'),
|
array('alt' => 'intrigue'),
|
||||||
array('alt' => 'intrigue', 'src' => ''),
|
array('alt' => 'intrigue', 'src' => '')
|
||||||
array(
|
|
||||||
'Core.RemoveInvalidImg' => false
|
|
||||||
)
|
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
}
|
}
|
||||||
|
@@ -9,33 +9,35 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
|
|||||||
|
|
||||||
function setUp() {
|
function setUp() {
|
||||||
parent::setUp();
|
parent::setUp();
|
||||||
|
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
|
||||||
}
|
}
|
||||||
|
|
||||||
function testVertical() {
|
function testEmptyInput() {
|
||||||
|
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('vspace');
|
|
||||||
|
|
||||||
$this->assertResult( array() );
|
$this->assertResult( array() );
|
||||||
|
}
|
||||||
|
|
||||||
|
function testVerticalBasicUsage() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('vspace' => '1'),
|
array('vspace' => '1'),
|
||||||
array('style' => 'margin-top:1px;margin-bottom:1px;')
|
array('style' => 'margin-top:1px;margin-bottom:1px;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
// no validation done here, we expect CSS validator to catch it
|
function testLenientHandlingOfInvalidInput() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('vspace' => '10%'),
|
array('vspace' => '10%'),
|
||||||
array('style' => 'margin-top:10%px;margin-bottom:10%px;')
|
array('style' => 'margin-top:10%px;margin-bottom:10%px;')
|
||||||
);
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function testPrependNewCSS() {
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('vspace' => '23', 'style' => 'font-weight:bold;'),
|
array('vspace' => '23', 'style' => 'font-weight:bold;'),
|
||||||
array('style' => 'margin-top:23px;margin-bottom:23px;font-weight:bold;')
|
array('style' => 'margin-top:23px;margin-bottom:23px;font-weight:bold;')
|
||||||
);
|
);
|
||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
function testHorizontal() {
|
function testHorizontalBasicUsage() {
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('hspace');
|
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('hspace');
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
array('hspace' => '1'),
|
array('hspace' => '1'),
|
||||||
@@ -43,7 +45,7 @@ class HTMLPurifier_AttrTransform_ImgSpaceTest extends HTMLPurifier_AttrTransform
|
|||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
function testInvalid() {
|
function testInvalidConstructionParameter() {
|
||||||
$this->expectError('ispace is not valid space attribute');
|
$this->expectError('ispace is not valid space attribute');
|
||||||
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('ispace');
|
$this->obj = new HTMLPurifier_AttrTransform_ImgSpace('ispace');
|
||||||
$this->assertResult(
|
$this->assertResult(
|
||||||
|
Some files were not shown because too many files have changed in this diff.