mirror of
https://github.com/ezyang/htmlpurifier.git
synced 2025-08-05 05:37:49 +02:00
Refine Lexers for parsing stray angled brackets; %Core.AggressivelyFixLt = true
By default, the DirectLex and DOMLex behavior with stray angled brackets varied a great deal due to their implementations. A little known directive %Core.AggressivelyFixLt attempted to match DOMLex's behavior with DirectLex's, but it was off by default. By turning it on by default, users now enjoy these benefits, and performance-minded users can turn it back off. Also, several refinements to stray angled bracket parsing was made. Specifically: * DirectLex: Handle each left angled bracket individually, which prevents strange behavior as reported by eon. * DOMLex: Iterate aggressive lt fix, so that stacked brackets like << are handled. Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
This commit is contained in:
@@ -45,7 +45,10 @@ class HTMLPurifier_Lexer_DOMLex extends HTMLPurifier_Lexer
|
||||
$char = '[^a-z!\/]';
|
||||
$comment = "/<!--(.*?)(-->|\z)/is";
|
||||
$html = preg_replace_callback($comment, array($this, 'callbackArmorCommentEntities'), $html);
|
||||
$html = preg_replace("/<($char)/i", '<\\1', $html);
|
||||
do {
|
||||
$old = $html;
|
||||
$html = preg_replace("/<($char)/i", '<\\1', $html);
|
||||
} while ($html !== $old);
|
||||
$html = preg_replace_callback($comment, array($this, 'callbackUndoCommentSubst'), $html); // fix comments
|
||||
}
|
||||
|
||||
|
Reference in New Issue
Block a user