Merge in r657-674, prompted by near release of 1.4.0.

git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/branches/strict@675 48356398-32a2-884e-a903-53898d9a118a
2025-08-19 20:34:01 +02:00 · 2007-01-21 16:07:36 +00:00
parent 37ea1673dd
commit 9a84e11f34
56 changed files with 1356 additions and 509 deletions
--- a/61
+++ b/61
@@ -7,19 +7,14 @@ TODO List
    ? At-risk
 ==========================

-1.4 release
- # More extensive URI filtering schemes (see docs/proposal-new-directives.txt)
- # Allow for background-image and list-style-image (intrinsically tied to above)
- # Add hooks for custom behavior (for instance, YouTube preservation)
- - Aggressive caching
- ? Rich set* methods and config file loaders for HTMLPurifier_Config
- ? Configuration profiles: sets of directives that get set with one func call
- ? ConfigSchema directive aliases (so we can rename some of them)
- ? URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX)
-
 1.5 release
+ # Implement all non-essential attribute transforms
+ # URI validation routines tighter (see docs/dev-code-quality.html) (COMPLEX)
+ # Advanced URI filtering schemes (see docs/proposal-new-directives.txt)
 # Error logging for filtering/cleanup procedures
    - Requires I18N facilities to be created first (COMPLEX)
+ ? Configuration profiles: sets of directives that get set with one func call
+ - XSS-attempt detection

 1.6 release
 # Add pre-packaged "levels" of cleaning (custom behavior already done)
@@ -28,14 +23,30 @@ TODO List
      specification of elements that, when detected as foreign, trigger removal
      of children, although unbalanced tags could wreak havoc (or at least
      delete the rest of the document)).
+ - Allow specifying global attributes on a tag-by-tag basis in
+   %HTML.AllowAttributes
+ ? More user-friendly warnings when %HTML.Allow* attempts to specify a
+   tag or attribute that is not supported
+ - Parse TinyMCE whitelist into our %HTML.Allow* whitelists

 1.7 release
 # Additional support for poorly written HTML
-    - Implement all non-essential attribute transforms (BIG!)
    - Microsoft Word HTML cleaning (i.e. MsoNormal, but research essential!)
    - Friendly strict handling of <address> (block -> <br>)
+ - Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
+    1. Analyze which tags should have duplicates removed
+    2. Ensure attributes are merged into the parent tag
+    3. Extend the tag exclusion system to specify whether or not the
+    contents should be dropped (currently, there's code that could do
+    something like this if it didn't drop the inner text too.)
+ - Remove <span> tags that don't do anything (no attributes)
+ - Remove empty inline tags, ex. <i></i>
+ - Append something to duplicate IDs so they're still usable (impl. note: the
+   dupe detector would also need to detect the suffix)

 2.0 release
+ # Legit token based CSS parsing (will require revamping almost every
+   AttrDef class)
 # Formatters for plaintext (COMPLEX)
    - Auto-paragraphing (be sure to leverage fact that we know when things
      shouldn't be paragraphed, such as lists and tables).
@@ -48,48 +59,32 @@ TODO List
    - Hooks for adding custom processors to custom namespaced tags and
      attributes, offer default implementation
    - Lots of documentation and samples
+ - Allow tags to be "armored", an internal flag that protects them
+   from validation and passes them out unharmed
 - XHTML 1.1 support

 Ongoing
 - Lots of profiling, make it faster!
 - Plugins for major CMSes (COMPLEX)
-    - Drupal
    - WordPress
    - eFiction
    - more! (look for ones that use WYSIWYGs)

 Unknown release (on a scratch-an-itch basis)
+ - Upgrade SimpleTest testing code to newest versions
 - Fixes for Firefox's inability to handle COL alignment props (Bug 915)
 - Automatically add non-breaking spaces to empty table cells when
   empty-cells:show is applied to have compatibility with Internet Explorer
 - Convert RTL/LTR override characters to <bdo> tags, or vice versa on demand.
   Also, enable disabling of directionality
- - Append something to duplicate IDs so they're still usable (impl. note: the
-   dupe detector would also need to detect the suffix as well)
 - Have 'lang' attribute be checked against official lists
-
-Encoding workarounds
- - Non-lossy dumb alternate character encoding transformations, achieved by
-   numerically encoding all non-ASCII characters
- - Semi-lossy dumb alternate character encoding transformations, achieved by
+ ? Semi-lossy dumb alternate character encoding transformations, achieved by
   encoding all characters that have string entity equivalents

 Requested
- - Native content compression, whitespace stripping (don't rely on Tidy, make
+ ? Native content compression, whitespace stripping (don't rely on Tidy, make
   sure we don't remove from <pre> or related tags)
- - Win32 Phalanger C# binaries (?)
- - Remove redundant tags, ex. <u><u>Underlined</u></u>. Implementation notes:
-    1. Analyzing which tags to remove duplicants
-    2. Ensure attributes are merged into the parent tag
-    3. Extend the tag exclusion system to specify whether or not the
-    contents should be dropped or not (currently, there's code that could do
-    something like this if it didn't drop the inner text too.)
- - More user-friendly warnings when %HTML.Allow* attempts to specify a
-   tag or attribute that is not supported
- - Allow specifying global attributes on a tag-by-tag basis in
-   %HTML.AllowAttributes
- - Parse TinyMCE whitelist into our %HTML.Allow* whitelists
- - XSS-attempt detection
+ ? Win32 Phalanger C# binaries

 Wontfix
 - Non-lossy smart alternate character encoding transformations (unless