mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-08-06 14:16:32 +02:00
[1.2.0]
- Update TODO . Add another possible plaintext formatter . Reference config-ideas.txt for URI options - Update code-quality.txt, removing issues that have been addressed and updating time for post-beta - Update config-ideas.txt . Added more possible URI directives . Removed silly language control directive - Improved documentation on Class, CSS and Host git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@524 48356398-32a2-884e-a903-53898d9a118a
This commit is contained in:
@@ -4,11 +4,8 @@ Code Quality Issues
|
||||
Okay, face it. Programmers can get lazy, cut corners, or make mistakes. They
|
||||
also can do quick prototypes, and then forget to rewrite them later. Well,
|
||||
while I can't list mistakes in here, I can list prototype-like segments
|
||||
of code that should be aggressively refactored after the beta is released.
|
||||
This does not list optimization issues, that needs to be done after intense
|
||||
profiling.
|
||||
|
||||
Here we go:
|
||||
of code that should be aggressively refactored. This does not list
|
||||
optimization issues, that needs to be done after intense profiling.
|
||||
|
||||
AttrDef
|
||||
Class - doesn't support Unicode characters (fringe); uses regular
|
||||
@@ -16,12 +13,10 @@ AttrDef
|
||||
Lang - code duplication; premature optimization; doesn't consult official
|
||||
lists (fringe)
|
||||
Length - easily mistaken for CSSLength
|
||||
URI - multiple regular expressions; needs host validation routines factored
|
||||
out for mailto scheme; missing validation for query; fragment and path,
|
||||
no percent-encode fixing
|
||||
URI - multiple regular expressions; missing validation for query,
|
||||
fragment and path
|
||||
CSS - parser doesn't accept advanced CSS (fringe)
|
||||
Number - constructor interface is inconsistent with Integer
|
||||
AttrTransform - doesn't accept AttrContext
|
||||
Number - constructor interface inconsistent with Integer
|
||||
Config - "load configuration" hooks missing, rich set* accessors missing
|
||||
ConfigSchema - redefinition is a mess
|
||||
Strategy
|
||||
@@ -31,8 +26,7 @@ Strategy
|
||||
might be efficient).
|
||||
RemoveForeignElements - should be run in parallel with MakeWellFormed
|
||||
URIScheme - needs to have callable generic checks
|
||||
ftp - missing typecode check
|
||||
mailto - doesn't validate emails
|
||||
mailto - doesn't validate emails, doesn't validate querystring
|
||||
news - doesn't validate opaque path
|
||||
nntp - doesn't constrain path
|
||||
EOL
|
||||
|
@@ -17,24 +17,17 @@ time. Note the naming convention: %Namespace.Directive
|
||||
%Attr.ClassBlacklist. When it's Whitelist, only allow those in
|
||||
%Attr.ClassWhitelist.
|
||||
|
||||
%Attr.LangAlphaOnly - designate whether or not to allow numerals in language
|
||||
code subtags
|
||||
* RFC 1766, the current standard referenced by XML, does not permit
|
||||
numbers, but,
|
||||
* RFC 3066, the superseding best practice standard since January 2001,
|
||||
permits them.
|
||||
We allow numbers by default, but you generally never see them
|
||||
at all, which makes this a little more sane.
|
||||
|
||||
%Attr.MaxWidth,
|
||||
%Attr.MaxHeight - caps for width and height related checks.
|
||||
(a hack in Pixels for an image crashing attack could be replaced by this)
|
||||
(the hack in Pixels for an image crashing attack could be replaced by this)
|
||||
|
||||
%URI.Munge - will munge all URIs to a different URI, which should redirect
|
||||
%URI.Munge - will munge all external URIs to a different URI, which redirects
|
||||
the user to the applicable page. A urlencoded version of the URI
|
||||
will replace any instances of %s in the string. One possible
|
||||
string is 'http://www.google.com/url?q=%s'. Useful for preventing
|
||||
pagerank from being sent to other sites
|
||||
pagerank from being sent to other sites, but can also be used to
|
||||
redirect to a splash page notifying user that they are leaving your
|
||||
website.
|
||||
|
||||
%URI.AddRelNofollow - will add rel="nofollow" to all links, preventing the
|
||||
spread of ill-gotten pagerank
|
||||
@@ -49,7 +42,16 @@ time. Note the naming convention: %Namespace.Directive
|
||||
'DenyAll' or 'AllowAll' (default)
|
||||
|
||||
%URI.DisableIPHosts - URIs that have IP addresses for hosts are disallowed.
|
||||
Be sure to also grab unusual encodings (dword, hex and octal)
|
||||
Be sure to also grab unusual encodings (dword, hex and octal), which may
|
||||
be currently be caught by regular DNS
|
||||
%URI.DisableAbsoluteDNS - Remove extra dots after host names that trigger
|
||||
absolute DNS. While this is actually the preferred method according to
|
||||
the RFC, most people opt to use a relative domain name relative to . (root).
|
||||
%URI.DisableIDN - Disallow raw internationalized domain names. Punycode
|
||||
will still be permitted.
|
||||
|
||||
%URI.ConvertUnusualIPHosts - transform dword/hex/octal IP addresses to the
|
||||
regular form
|
||||
|
||||
%URI.DisableExternalResources - disallow resource links (i.e. URIs that result
|
||||
in immediate requests, such as src in IMG) to external websites
|
||||
|
Reference in New Issue
Block a user