Mirror of https://github.com/ezyang/htmlpurifier.git (synced 2025-07-31 19:30:21 +02:00)
Make the definition format much more logical. Begin migrating specification docs to their respective classes.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@133 48356398-32a2-884e-a903-53898d9a118a
@@ -1,9 +1,7 @@

HTML Purifier Specification
HTML Purifier
by Edward Z. Yang

== Introduction ==

There are a number of ad hoc HTML filtering solutions out there on the web
(some examples including HTML_Safe, kses and SafeHtmlChecker.class.php) that
claim to filter HTML properly, preventing malicious JavaScript and layout
@@ -56,29 +54,6 @@ HTML tags. Things like blog comments are, in all likelihood, most appropriately
written in an extremely restrictive set of markup that doesn't require
all this functionality (or not written in HTML at all).



== STAGE 1 - parsing ==

Status: A (see source, mainly internals and UTF-8)

The Lexer (currently we have three choices) handles parsing into Tokens.

Here are the mappings for Lexer_PEARSax3

* Start(name, attributes) is openHandler
* End(name) is closeHandler
* Empty(name, attributes) is openHandler (is in array of empties)
* Data(parse(text)) is dataHandler
* Comment(text) is escapeHandler (has leading -)
* Data(text) is escapeHandler (has leading [, CDATA)

Ignorable/not being implemented (although we probably want to output them raw):
* ProcessingInstructions(text) is piHandler
* JavaOrASPInstructions(text) is jaspHandler
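
As an illustration only, the handler-to-token mapping listed above can be sketched as a small dispatcher. The sketch is in Python rather than the library's PHP, and everything in it is an assumption made for the example: the TokenCollector class, the snake_case handler names, the EMPTY_ELEMENTS set, the reading of parse(text) as entity decoding, and the shape of the strings handed to the escape handler. It does not reproduce the PEAR parser's actual API.

```python
import html
from dataclasses import dataclass, field


# Hypothetical token types, named after the entries in the list above.
@dataclass
class Start:
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class End:
    name: str


@dataclass
class Empty:
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Data:
    text: str


@dataclass
class Comment:
    text: str


# Assumed "array of empties": elements that never take a closing tag.
EMPTY_ELEMENTS = {"br", "hr", "img", "input"}


class TokenCollector:
    """Turns SAX-style callbacks into a flat list of tokens."""

    def __init__(self):
        self.tokens = []

    def open_handler(self, name, attributes):
        # The same callback yields Empty or Start depending on the element.
        if name in EMPTY_ELEMENTS:
            self.tokens.append(Empty(name, attributes))
        else:
            self.tokens.append(Start(name, attributes))

    def close_handler(self, name):
        # Assumption: a close callback for an empty element is dropped,
        # because the Empty token already stands for the whole element.
        if name not in EMPTY_ELEMENTS:
            self.tokens.append(End(name))

    def data_handler(self, text):
        # Assumption: parse(text) means decoding character entities.
        self.tokens.append(Data(html.unescape(text)))

    def escape_handler(self, text):
        # Assumption: text is whatever sits between "<!" and ">".
        if len(text) >= 4 and text.startswith("--") and text.endswith("--"):
            self.tokens.append(Comment(text[2:-2]))
        elif text.startswith("[CDATA[") and text.endswith("]]"):
            # CDATA content is kept verbatim, with no entity decoding.
            self.tokens.append(Data(text[len("[CDATA["):-2]))

    def pi_handler(self, text):
        # Processing instructions are ignored in this sketch.
        pass

    def jasp_handler(self, text):
        # Java/ASP-style instructions are ignored in this sketch.
        pass


collector = TokenCollector()
collector.open_handler("b", {})
collector.data_handler("a &amp; b")
collector.close_handler("b")
collector.escape_handler("-- note --")
print(collector.tokens)
```

Processing instructions and Java/ASP blocks produce no tokens here, matching the "ignorable" list; emitting them raw would need one more token type.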



== STAGE 2 - remove foreign elements ==

Status: A- (transformations need to be implemented)