1bed8b6d5f
Added %Core.RemoveProcessingInstructions.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-20 18:26:44 -07:00
33afd7d9e0
Fix improper handling of IE conditional comments.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-06-18 06:08:54 -07:00
00c66fa9cb
Fix bug in parsing single attribute with entities.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-05-31 19:44:18 -07:00
c1cbd9e565
Mute STRICT errors from CSSTidy and don't run PEARSax3 on PHP 5.3.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-04-26 18:27:32 -04:00
ac18672aba
Fix extant broken PEARSax3 parsing patterns.
...
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-26 21:14:52 -05:00
faf28682ad
Manually work around PEARSax3 E_STRICT errors.
...
Previously, my development environment was not running the PEARSax3
tests because my environment was set to E_STRICT error handling, and
thus the tests were skipped. Relax this requirement by making the
wrapper class E_STRICT safe. This introduces a few failing tests.
Also update TODO and add another fresh test.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2010-02-26 20:42:42 -05:00
ba9fd175d7
Make extractBody not terminate prematurely on first </body>.
...
Previously, if two </body> tags were present, HTML Purifier
would truncate everything after the first </body>. This is
not ideal behavior, so HTML Purifier has been changed to
match up to the last </body>.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
2009-07-07 22:19:04 -04:00
86ca784da3
Convert all to new configuration get/set format.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-21 03:00:34 -05:00
e802065b65
Punt Lexer test entirely for 5.0.5.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2009-02-16 17:18:30 -05:00
12b811d749
Add vim modelines to all files.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 04:24:59 -05:00
2c955af135
Remove trailing whitespace.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-12-06 02:28:20 -05:00
ed7983b559
Refactor lexer instantiation logic with exceptions and forced line tracking.
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-09-05 14:04:23 -04:00
1d90bb2397
Allow <![CDATA[<body>...</body>]]> not to trigger Core.ConvertDocumentToFragment
...
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-08-01 19:06:28 -04:00
aa0fdeee30
Refine Lexers for parsing stray angled brackets; %Core.AggressivelyFixLt = true
...
By default, the DirectLex and DOMLex behavior with stray angled brackets
varied a great deal due to their implementations. A little known directive
%Core.AggressivelyFixLt attempted to match DOMLex's behavior with DirectLex's,
but it was off by default. By turning it on by default, users now enjoy these
benefits, and performance-minded users can turn it back off.
Also, several refinements to stray angled bracket parsing were made. Specifically:
* DirectLex: Handle each left angled bracket individually, which prevents
strange behavior as reported by eon.
* DOMLex: Iterate aggressive lt fix, so that stacked brackets like << are
handled.
Signed-off-by: Edward Z. Yang <edwardzyang@thewritingpot.com>
2008-07-07 08:52:29 -04:00
2f29c27a59
[3.1.0] Fix broken PH5P in latest versions of DOM with bandaid; punt to DirectLex.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1698 48356398-32a2-884e-a903-53898d9a118a
2008-04-26 19:47:22 +00:00
59605d592b
Classname() constructors to __construct() constructors, as per SimpleTest. Also normalized ppp declarations; no public declaration for test methods, public/protected for the rest
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1663 48356398-32a2-884e-a903-53898d9a118a
2008-04-21 15:24:18 +00:00
9f1e678b48
[3.1.0] Fixed fatal error in PH5P lexer with invalid tag names
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1650 48356398-32a2-884e-a903-53898d9a118a
2008-04-05 04:28:37 +00:00
cb793cd9b9
- Restore substr_count compatibility method; it's not just PHP 4
...
- Update missing includes
- Fix generate-standalone.php fatal error
- Make LexerTest resilient against variant versions of libxml
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1573 48356398-32a2-884e-a903-53898d9a118a
2008-02-20 01:28:19 +00:00
6c9c8f2380
[3.1.0] [BACKPORT] Fix bug with comments in styles, and some associated issues
...
- Restore printTokens()
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1570 48356398-32a2-884e-a903-53898d9a118a
2008-02-20 00:15:44 +00:00
fbc595ebed
Remove includes from unit tests.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1569 48356398-32a2-884e-a903-53898d9a118a
2008-02-18 04:41:42 +00:00
5b3431d889
[3.0.0] Fully implement CSS extraction and cleaning. See NEWS for more information, it is now a Filter.
...
- Some Lexer things were moved around
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1469 48356398-32a2-884e-a903-53898d9a118a
2007-12-12 21:46:30 +00:00
831f552ec5
[3.0.0] <style> tags can now be extracted from input HTML using %HTML.ExtractStyleBlocks. These contents can be retrieved from $context->get('StyleBlocks');
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1468 48356398-32a2-884e-a903-53898d9a118a
2007-12-12 03:29:12 +00:00
43f01925cd
Convert to PHP 5 only codebase, adding visibility modifiers to all members and methods in the main library area (function only for test methods)
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1458 48356398-32a2-884e-a903-53898d9a118a
2007-11-25 02:24:39 +00:00
cb92a57e4e
[2.1.2] Implement experimental HTML5 parsing using PH5P
...
- Fix debugger so that tokens can be printed without an index
- Fix some broken PEAR unit tests
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1383 48356398-32a2-884e-a903-53898d9a118a
2007-08-19 18:49:35 +00:00
9881a34712
More unit test refactoring into separate methods.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1380 48356398-32a2-884e-a903-53898d9a118a
2007-08-16 06:48:24 +00:00
24a4dfdf83
[2.1.2?] Fix invisible DirectLex parsing error with empty elements that have attributes containing slashes
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1375 48356398-32a2-884e-a903-53898d9a118a
2007-08-08 05:05:30 +00:00
2a002857ce
[2.1.0] All unit tests inherit from HTMLPurifier_Harness, not UnitTestCase. prepareCommon() refactored to global test-case.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1332 48356398-32a2-884e-a903-53898d9a118a
2007-08-01 14:06:59 +00:00
e7e81c0a5b
[2.1.0] Fix some minor DirectLex bugs that may lead to PHP errors
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1310 48356398-32a2-884e-a903-53898d9a118a
2007-07-05 21:29:07 +00:00
a6ede3804e
[2.1.0] True emoticon < fix.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1260 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 16:40:18 +00:00
e99520ab96
Remove trailing ?> in PHP library files, add trailing newlines to all other files.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1253 48356398-32a2-884e-a903-53898d9a118a
2007-06-27 13:58:32 +00:00
275932ec05
[2.0.1] Fix DirectLex's incomprehension of un-armored script contents as CDATA using custom preg_replace_callback
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1244 48356398-32a2-884e-a903-53898d9a118a
2007-06-26 16:08:42 +00:00
bf0d659c47
[2.0.1] Improve special case handling for <script>
...
- DirectLex now honors comments with greater than or less than signs in them
- Comments are transformed into script elements, ending comments are scrapped
- Buggy generator code rewritten to be more error-proof
- AttrValidator checks if token has attributes before processing
- Remove invalid documentation from Scripting
- "Commenting" of script elements switched to the more advanced version
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1189 48356398-32a2-884e-a903-53898d9a118a
2007-06-21 14:44:26 +00:00
4bf15de536
[1.7.0] Implement line number counting in DirectLex, in preparation for error reporting
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1155 48356398-32a2-884e-a903-53898d9a118a
2007-06-18 02:01:01 +00:00
c5e33416d3
[1.6.1] Unit tests now use exclusively assertIdentical
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@1024 48356398-32a2-884e-a903-53898d9a118a
2007-05-05 20:17:04 +00:00
ac3ab2a556
[1.6.1] DirectLex now preserves text in which a < bracket is followed by a non-alphanumeric character. This means that certain emoticons are now preserved.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@939 48356398-32a2-884e-a903-53898d9a118a
2007-04-04 02:22:27 +00:00
61f852d429
Merge in PHP5 strict changes that are applicable to PHP4.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@650 48356398-32a2-884e-a903-53898d9a118a
2007-01-16 22:22:08 +00:00
8f515b9cda
[1.2.0]
...
- Partially finished migrating to new Context object (done in r485).
- Created HTMLPurifier_Harness to assist with testing, ChildDefTest migrated to that framework.
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@484 48356398-32a2-884e-a903-53898d9a118a
2006-10-01 20:47:07 +00:00
37def0104b
[1.1.2]
...
- Documentation updated
- API docs now exclude more files that are not classes
- Fixed lack of attribute parsing in HTMLPurifier_Lexer_PEARSax3
- (internal) Refactored parseData() to general Lexer class
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@466 48356398-32a2-884e-a903-53898d9a118a
2006-09-27 02:09:54 +00:00
1de3088276
Refactor encoding and entity specific processing to HTMLPurifier_Encoder. We also need to refactor the escaping to this class too.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@339 48356398-32a2-884e-a903-53898d9a118a
2006-08-29 19:36:40 +00:00
973cc43b64
Malformed UTF-8 and non-SGML character detection and cleaning implemented
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@303 48356398-32a2-884e-a903-53898d9a118a
2006-08-19 17:53:59 +00:00
a33cd12f1a
Fixed broken multibyte numeric entity conversion in Lexer::substituteNonSpecialEntities()
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@299 48356398-32a2-884e-a903-53898d9a118a
2006-08-18 17:49:33 +00:00
cedcbb9e15
Update TODO, add extra fringe test-case for extractBody()
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@259 48356398-32a2-884e-a903-53898d9a118a
2006-08-15 01:14:39 +00:00
9a35dfa6b9
Add support for full document parsing, aka discard everything that's not in-between body if applicable.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@258 48356398-32a2-884e-a903-53898d9a118a
2006-08-15 00:53:24 +00:00
d7140f2e05
Outfit a bunch of other classes so they can accept a configuration object. Put in basic scaffolding for extractBody() functionality.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@257 48356398-32a2-884e-a903-53898d9a118a
2006-08-15 00:31:12 +00:00
299236f695
Fix DOM bug where default encoding for HTML docs is not UTF-8.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@252 48356398-32a2-884e-a903-53898d9a118a
2006-08-14 13:27:18 +00:00
3c2c0c1a1b
Make PEAR tests configurable.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@201 48356398-32a2-884e-a903-53898d9a118a
2006-08-10 12:41:39 +00:00
b267b0c202
Add an attribute entity parse test to Lexer and change PEARSax3 to a proof of concept.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@158 48356398-32a2-884e-a903-53898d9a118a
2006-08-04 02:59:15 +00:00
609977f9f5
Add CDATA support to the Lexers, as well as give PEARSax3 entity replacement.
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@106 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 23:04:34 +00:00
5ce0ae7056
Implement EntityLookup and put in the Lexer. Some behavior was migrated, since it looks like it will have to be used in all Lexers, not just DirectLex (which is the only one that uses it).
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@105 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 21:07:30 +00:00
14f481bcf6
svn:eol-style = native
...
git-svn-id: http://htmlpurifier.org/svnroot/htmlpurifier/trunk@97 48356398-32a2-884e-a903-53898d9a118a
2006-07-23 00:11:03 +00:00