mirror of
https://github.com/processwire/processwire.git
synced 2025-08-09 08:17:12 +02:00
Upgrade SmartyPants version for TextformatterSmartypants module, plus updates per processwire/processwire-issues#17
36
wire/modules/Textformatter/TextformatterSmartypants/Michelf/License.md
Executable file
@@ -0,0 +1,36 @@
PHP SmartyPants Lib
Copyright (c) 2005-2016 Michel Fortin
<https://michelf.ca/>
All rights reserved.

Original SmartyPants
Copyright (c) 2003-2004 John Gruber
<https://daringfireball.net/>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright
  notice, this list of conditions and the following disclaimer in the
  documentation and/or other materials provided with the distribution.

* Neither the name "SmartyPants" nor the names of its contributors may
  be used to endorse or promote products derived from this software
  without specific prior written permission.

This software is provided by the copyright holders and contributors "as
is" and any express or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the copyright owner
or contributors be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to,
procurement of substitute goods or services; loss of use, data, or
profits; or business interruption) however caused and on any theory of
liability, whether in contract, strict liability, or tort (including
negligence or otherwise) arising in any way out of the use of this
software, even if advised of the possibility of such damage.
220
wire/modules/Textformatter/TextformatterSmartypants/Michelf/Readme.md
Executable file
@@ -0,0 +1,220 @@
PHP SmartyPants
===============

PHP SmartyPants Lib 1.7.1 - 16 Oct 2016

by Michel Fortin
<https://michelf.ca/>

Original SmartyPants by John Gruber
<https://daringfireball.net/>


Introduction
------------

This is a library package that includes the PHP SmartyPants and its
sibling PHP SmartyPants Typographer with additional features.

SmartyPants is a free web typography prettifier tool for web writers. It
easily translates plain ASCII punctuation characters into "smart" typographic
punctuation HTML entities.

PHP SmartyPants is a port to PHP of the original SmartyPants written
in Perl by John Gruber.

SmartyPants can perform the following transformations:

* Straight quotes (`"` and `'`) into “curly” quote HTML entities
* Backtick-style quotes (` ``like this'' `) into “curly” quote HTML
  entities
* Dashes (`--` and `---`) into en- and em-dash entities
* Three consecutive dots (`...`) into an ellipsis entity
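
The dash and ellipsis items in the list above can be sketched in a few lines. This is only an illustration: `sketch_dashes_and_ellipses()` is a hypothetical helper, not a library function, and it assumes the "old school" mapping (`---` to an em-dash, `--` to an en-dash), which is one of the three dash modes the parser in this commit supports.

```php
<?php
// Hypothetical helper for illustration only; not part of PHP SmartyPants.
function sketch_dashes_and_ellipses($text) {
    // str_replace() applies the search strings in array order, so the longer
    // "---" has to come first; otherwise "--" would consume part of it.
    $text = str_replace(array('---', '--'), array('&#8212;', '&#8211;'), $text);

    // Three dots, with or without spaces between them, become one ellipsis.
    $text = str_replace(array('...', '. . .'), '&#8230;', $text);

    return $text;
}

echo sketch_dashes_and_ellipses('1990--1995 --- wait...');
// prints: 1990&#8211;1995 &#8212; wait&#8230;
```

The ordering of the search array is the design point here: swapping the two dash entries would turn `---` into an en-dash followed by a stray hyphen.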

SmartyPants Typographer can perform additional transformations:

* French guillemets done using (`<<` and `>>`) into true « guillemets »
  HTML entities.
* Comma-style quotes (` ,,like this`` ` or ` ''like this,, `) into their
  curly equivalent.
* Replace existing spaces with non-break spaces around punctuation marks
  where appropriate; it can also add or remove them if configured to.
* Replace existing spaces with non-break spaces for spaces used as
  a thousand separator and between a number and the unit symbol that
  follows it (for most common units).

This means you can write, edit, and save using plain old ASCII straight
quotes, plain dashes, and plain dots, but your published posts (and
final HTML output) will appear with smart quotes, em-dashes, proper
ellipses, and proper no-break spaces (with Typographer).

SmartyPants does not modify characters within `<pre>`, `<code>`,
`<kbd>`, or `<script>` tag blocks. Typically, these tags are used to
display text where smart quotes and other "smart punctuation" would not
be appropriate, such as source code or example markup.


### Backslash Escapes ###

If you need to use literal straight quotes (or plain hyphens and
periods), SmartyPants accepts the following backslash escape sequences
to force non-smart punctuation. It does so by transforming the escape
sequence into a decimal-encoded HTML entity:


    Escape  Value  Character
    ------  -----  ---------
      \\    &#92;    \
      \"    &#34;    "
      \'    &#39;    '
      \.    &#46;    .
      \-    &#45;    -
      \`    &#96;    `


This is useful, for example, when you want to use straight quotes as
foot and inch marks:

    6\'2\" tall

translates into:

    6&#39;2&#34; tall

in SmartyPants's HTML output. Which, when rendered by a web browser,
looks like:

    6'2" tall
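
The escape table above amounts to a fixed string substitution. The sketch below mirrors that table with a standalone function; the name `sketch_process_escapes()` is an assumption for illustration, and the library's own version is the protected `processEscapes()` method in `SmartyPants.php`.

```php
<?php
// Hypothetical standalone version of the escape table; illustration only.
function sketch_process_escapes($text) {
    return str_replace(
        // Each search string is a backslash followed by one character.
        array('\\\\', '\"', "\'", '\.', '\-', '\`'),
        // Each replacement is the decimal entity for that character.
        array('&#92;', '&#34;', '&#39;', '&#46;', '&#45;', '&#96;'),
        $text
    );
}

// A nowdoc keeps the backslashes exactly as a writer would type them.
$input = <<<'TXT'
6\'2\" tall
TXT;

echo sketch_process_escapes($input);
// prints: 6&#39;2&#34; tall
```

Because the result contains entities rather than literal quote characters, the later quote-curling steps have nothing left to match in the escaped positions.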


Requirements
------------

This library package requires PHP 5.3 or later.

Note: The older plugin/library hybrid package for PHP SmartyPants and
PHP SmartyPants Typographer will still work with PHP 4.0.5 and later.


Usage
-----

This library package is meant to be used with class autoloading. For autoloading
to work, your project needs to have set up a PSR-0-compatible autoloader. See the
included Readme.php file for a minimal autoloader setup. (If you don't want to
use autoloading you can do a classic `require_once` to manually include the
files prior to use instead.)
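
For readers without an autoloader in place, a PSR-0-style mapping can be sketched as follows. This is not the autoloader shipped in Readme.php: the helper name `sketch_psr0_path()` and the choice of `__DIR__` as the base directory are assumptions made for the example.

```php
<?php
// Hypothetical helper: turns a class name into a relative path, PSR-0 style.
function sketch_psr0_path($class) {
    $class = ltrim($class, '\\');
    $path = '';
    $pos = strrpos($class, '\\');
    if ($pos !== false) {
        // Namespace separators become directory separators.
        $path = str_replace('\\', '/', substr($class, 0, $pos)) . '/';
        $class = substr($class, $pos + 1);
    }
    // Underscores in the class name itself also become directory separators.
    return $path . str_replace('_', '/', $class) . '.php';
}

// Assumption: the 'Michelf' folder sits next to this script.
spl_autoload_register(function ($class) {
    $file = __DIR__ . '/' . sketch_psr0_path($class);
    if (is_file($file)) {
        require_once $file;
    }
});

echo sketch_psr0_path('\Michelf\SmartyPants');
// prints: Michelf/SmartyPants.php
```

With a mapping like this registered, the first reference to `\Michelf\SmartyPants` would trigger the lookup of `Michelf/SmartyPants.php`.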

With class autoloading in place, putting the 'Michelf' folder in your
include path should be enough for this to work:

    use \Michelf\SmartyPants;
    $html_output = SmartyPants::defaultTransform($html_input);

SmartyPants Typographer is also available the same way:

    use \Michelf\SmartyPantsTypographer;
    $html_output = SmartyPantsTypographer::defaultTransform($html_input);

If you are using PHP SmartyPants with another text filter function that
generates HTML such as Markdown, you should filter the text *after* the
HTML-generating filter. This is an example with [PHP Markdown][pmd]:

    use \Michelf\Markdown, \Michelf\SmartyPants;
    $my_html = Markdown::defaultTransform($my_text);
    $my_html = SmartyPants::defaultTransform($my_html);

To learn more about configuration options, see the full list of
[configuration variables].

 [configuration variables]: https://michelf.ca/projects/php-smartypants/configuration/
 [pmd]: https://michelf.ca/projects/php-markdown/
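
According to the constructor in `SmartyPants.php` in this commit, `defaultTransform()` also takes an optional second argument: either a numeric preset or a string of option letters (`q`, `b`, `B`, `d`, `D`, `i`, `e`, `w`). The sketch below shows how such a letter string maps to option values. `sketch_parse_attr()` and its array keys are hypothetical names, and the numeric presets are deliberately left out.

```php
<?php
// Hypothetical helper showing how an option-letter string could be decoded.
// It covers the letter form only, not the numeric presets.
function sketch_parse_attr($attr) {
    $opts = array(
        'quotes' => 0, 'backticks' => 0, 'dashes' => 0,
        'ellipses' => 0, 'convert_quot' => 0,
    );
    // letter => (option name, value), following the constructor's comment block
    $map = array(
        'q' => array('quotes', 1),
        'b' => array('backticks', 1),   // ``double'' only
        'B' => array('backticks', 2),   // ``double'' and `single'
        'd' => array('dashes', 1),
        'D' => array('dashes', 2),      // old school dashes
        'i' => array('dashes', 3),      // inverted old school dashes
        'e' => array('ellipses', 1),
        'w' => array('convert_quot', 1),
    );
    foreach (str_split((string) $attr) as $c) {
        if (isset($map[$c])) {
            list($name, $value) = $map[$c];
            $opts[$name] = $value;
        }
        // Unknown letters are ignored.
    }
    return $opts;
}

print_r(sketch_parse_attr('qDe'));
```

Under the same reading of the constructor, a call such as `SmartyPants::defaultTransform($html, 'qDe')` would enable quotes, old-school dashes, and ellipses while leaving backtick handling off.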


### Usage Without an Autoloader ###

If you cannot use class autoloading, you can still use include or require to
access the parser. To load the \Michelf\SmartyPants parser, do it this way:

    require_once 'Michelf/SmartyPants.inc.php';

Or, if you need the \Michelf\SmartyPantsTypographer parser:

    require_once 'Michelf/SmartyPantsTypographer.inc.php';

While the plain `.php` files depend on autoloading to work correctly, using the
`.inc.php` files instead will eagerly load the dependencies that would be loaded
on demand if you were using autoloading.


Algorithmic Shortcomings
------------------------

One situation in which quotes will get curled the wrong way is when
apostrophes are used at the start of leading contractions. For example:

    'Twas the night before Christmas.

In the case above, SmartyPants will turn the apostrophe into an opening
single-quote, when in fact it should be a closing one. I don't think
this problem can be solved in the general case -- every word processor
I've tried gets this wrong as well. In such cases, it's best to use the
proper HTML entity for closing single-quotes (`&#8217;` or `&rsquo;`) by
hand.
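
One hedged way to apply that advice automatically is a pre-pass over a known word list before the text reaches SmartyPants. The helper name and the word list below are assumptions for illustration and are not part of the library; the approach only covers words you list explicitly.

```php
<?php
// Hypothetical pre-pass: replaces the leading apostrophe of listed
// contractions with the closing single-quote entity.
function sketch_fix_leading_contractions($text, array $words = array('Twas', 'Tis', 'Twill')) {
    $alternatives = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words));

    // Match an apostrophe that is not preceded by a word character or another
    // quote, and that is followed by one of the listed words as a whole word.
    $pattern = "/(?<![\\w'])'(?=(?:" . $alternatives . ")\\b)/i";

    return preg_replace($pattern, '&#8217;', $text);
}

echo sketch_fix_leading_contractions("'Twas the night before Christmas.");
// prints: &#8217;Twas the night before Christmas.
```

Run before the SmartyPants transform, this leaves an entity in place of the ASCII apostrophe, so the quote-curling rules no longer see it.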


Bugs
----

To file bug reports or feature requests (other than topics listed in the
Algorithmic Shortcomings section above) please send email to:

<michel.fortin@michelf.ca>

If the bug involves quotes being curled the wrong way, please send
example text to illustrate.


Version History
---------------

PHP SmartyPants Lib 1.7.1 (16 Oct 2016)

* Fixing bug where `decodeEntitiesInConfiguration()` would cause the
  configuration to set the space for units to an empty string.


PHP SmartyPants Lib 1.7.0 (15 Oct 2016)

* Made `public` some configuration variables that were documented
  as `public` but were actually `protected`.

* Added the `decodeEntitiesInConfiguration()` method on
  `SmartyPantsTypographer` to quickly convert HTML entities in configuration
  variables to their corresponding UTF-8 character.


PHP SmartyPants Lib 1.6.0 (10 Oct 2016)

This is the first release of PHP SmartyPants Lib. This package requires PHP
version 5.3 or later and is designed to work with PSR-0 autoloading and,
optionally, with Composer. Here is a list of the changes since
PHP SmartyPants 1.5.1f:

* Plugin interface for Wordpress and Smarty is no longer present in
  the Lib package. The classic package is still available if you need it:
  <https://michelf.ca/projects/php-markdown/classic/>

* SmartyPants parser is now encapsulated in its own class, with methods and
  configuration variables having `public` and `protected` protection attributes.
  This has been available in unreleased versions for a few years, but now
  it's official.

* SmartyPants now works great with PSR-0 autoloading and Composer. If
  however you prefer to more directly `require_once` the files, the
  ".inc.php" variants of the file will make sure everything is included.

* For those of you who cannot use class autoloading, you can now
  include `Michelf/SmartyPants.inc.php` or
  `Michelf/SmartyPantsTypographer.inc.php` (note the `.inc.php` extension)
  to automatically include other files required by the parser.
@@ -0,0 +1,9 @@
<?php

// Use this file if you cannot use class autoloading. It will include all the
// files needed for the SmartyPants parser.
//
// Take a look at the PSR-0-compatible class autoloading implementation
// in the Readme.php file if you want a simple autoloader setup.

require_once dirname(__FILE__) . '/SmartyPants.php';
513
wire/modules/Textformatter/TextformatterSmartypants/Michelf/SmartyPants.php
Executable file
@@ -0,0 +1,513 @@
<?php
#
# SmartyPants - Smart typography for web sites
#
# PHP SmartyPants
# Copyright (c) 2004-2016 Michel Fortin
# <https://michelf.ca/>
#
# Original SmartyPants
# Copyright (c) 2003-2004 John Gruber
# <https://daringfireball.net/>
#
namespace Michelf;


#
# SmartyPants Parser Class
#

class SmartyPants {

    ### Version ###

    const SMARTYPANTSLIB_VERSION = "1.7.1";


    ### Presets

    # SmartyPants does nothing at all
    const ATTR_DO_NOTHING = 0;
    # "--" for em-dashes; no en-dash support
    const ATTR_EM_DASH = 1;
    # "---" for em-dashes; "--" for en-dashes
    const ATTR_LONG_EM_DASH_SHORT_EN = 2;
    # "--" for em-dashes; "---" for en-dashes
    const ATTR_SHORT_EM_DASH_LONG_EN = 3;
    # Stupefy mode: translate SmartyPants HTML entities back to ASCII
    const ATTR_STUPEFY = -1;

    # The default preset: ATTR_EM_DASH
    const ATTR_DEFAULT = SmartyPants::ATTR_EM_DASH;


    ### Standard Function Interface ###

    public static function defaultTransform($text, $attr = SmartyPants::ATTR_DEFAULT) {
        #
        # Initialize the parser and return the result of its transform method.
        # This will work fine for derived classes too.
        #
        # Take parser class on which this function was called.
        $parser_class = \get_called_class();

        # try to take parser from the static parser list
        static $parser_list;
        $parser =& $parser_list[$parser_class][$attr];

        # create the parser if not already set
        if (!$parser)
            $parser = new $parser_class($attr);

        # Transform text using parser.
        return $parser->transform($text);
    }


    ### Configuration Variables ###

    # Partial regex for matching tags to skip
    public $tags_to_skip = 'pre|code|kbd|script|style|math';

    # Options to specify which transformations to make:
    public $do_nothing   = 0; # disable all transforms
    public $do_quotes    = 0;
    public $do_backticks = 0; # 1 => double only, 2 => double & single
    public $do_dashes    = 0; # 1, 2, or 3 for the three modes described above
    public $do_ellipses  = 0;
    public $do_stupefy   = 0;
    public $convert_quot = 0; # should we translate &quot; entities into normal quotes?


    ### Parser Implementation ###

    public function __construct($attr = SmartyPants::ATTR_DEFAULT) {
        #
        # Initialize a parser with certain attributes.
        #
        # Parser attributes:
        # 0 : do nothing
        # 1 : set all
        # 2 : set all, using old school en- and em- dash shortcuts
        # 3 : set all, using inverted old school en and em- dash shortcuts
        #
        # q : quotes
        # b : backtick quotes (``double'' only)
        # B : backtick quotes (``double'' and `single')
        # d : dashes
        # D : old school dashes
        # i : inverted old school dashes
        # e : ellipses
        # w : convert &quot; entities to " for Dreamweaver users
        #
        if ($attr == "0") {
            $this->do_nothing = 1;
        }
        else if ($attr == "1") {
            # Do everything, turn all options on.
            $this->do_quotes = 1;
            $this->do_backticks = 1;
            $this->do_dashes = 1;
            $this->do_ellipses = 1;
        }
        else if ($attr == "2") {
            # Do everything, turn all options on, use old school dash shorthand.
            $this->do_quotes = 1;
            $this->do_backticks = 1;
            $this->do_dashes = 2;
            $this->do_ellipses = 1;
        }
        else if ($attr == "3") {
            # Do everything, turn all options on, use inverted old school dash shorthand.
            $this->do_quotes = 1;
            $this->do_backticks = 1;
            $this->do_dashes = 3;
            $this->do_ellipses = 1;
        }
        else if ($attr == "-1") {
            # Special "stupefy" mode.
            $this->do_stupefy = 1;
        }
        else {
            $chars = preg_split('//', $attr);
            foreach ($chars as $c){
                if ($c == "q") { $this->do_quotes = 1; }
                else if ($c == "b") { $this->do_backticks = 1; }
                else if ($c == "B") { $this->do_backticks = 2; }
                else if ($c == "d") { $this->do_dashes = 1; }
                else if ($c == "D") { $this->do_dashes = 2; }
                else if ($c == "i") { $this->do_dashes = 3; }
                else if ($c == "e") { $this->do_ellipses = 1; }
                else if ($c == "w") { $this->convert_quot = 1; }
                else {
                    # Unknown attribute option, ignore.
                }
            }
        }
    }

    public function transform($text) {

        if ($this->do_nothing) {
            return $text;
        }

        $tokens = $this->tokenizeHTML($text);
        $result = '';
        $in_pre = 0; # Keep track of when we're inside <pre> or <code> tags.

        $prev_token_last_char = ""; # This is a cheat, used to get some context
                                    # for one-character tokens that consist of
                                    # just a quote char. What we do is remember
                                    # the last character of the previous text
                                    # token, to use as context to curl single-
                                    # character quote tokens correctly.

        foreach ($tokens as $cur_token) {
            if ($cur_token[0] == "tag") {
                # Don't mess with quotes inside tags.
                $result .= $cur_token[1];
                if (preg_match('@<(/?)(?:'.$this->tags_to_skip.')[\s>]@', $cur_token[1], $matches)) {
                    $in_pre = isset($matches[1]) && $matches[1] == '/' ? 0 : 1;
                }
            } else {
                $t = $cur_token[1];
                $last_char = substr($t, -1); # Remember last char of this token before processing.
                if (! $in_pre) {
                    $t = $this->educate($t, $prev_token_last_char);
                }
                $prev_token_last_char = $last_char;
                $result .= $t;
            }
        }

        return $result;
    }


    protected function educate($t, $prev_token_last_char) {
        $t = $this->processEscapes($t);

        if ($this->convert_quot) {
            $t = preg_replace('/&quot;/', '"', $t);
        }

        if ($this->do_dashes) {
            if ($this->do_dashes == 1) $t = $this->educateDashes($t);
            if ($this->do_dashes == 2) $t = $this->educateDashesOldSchool($t);
            if ($this->do_dashes == 3) $t = $this->educateDashesOldSchoolInverted($t);
        }

        if ($this->do_ellipses) $t = $this->educateEllipses($t);

        # Note: backticks need to be processed before quotes.
        if ($this->do_backticks) {
            $t = $this->educateBackticks($t);
            if ($this->do_backticks == 2) $t = $this->educateSingleBackticks($t);
        }

        if ($this->do_quotes) {
            if ($t == "'") {
                # Special case: single-character ' token
                if (preg_match('/\S/', $prev_token_last_char)) {
                    $t = "&#8217;";
                }
                else {
                    $t = "&#8216;";
                }
            }
            else if ($t == '"') {
                # Special case: single-character " token
                if (preg_match('/\S/', $prev_token_last_char)) {
                    $t = "&#8221;";
                }
                else {
                    $t = "&#8220;";
                }
            }
            else {
                # Normal case:
                $t = $this->educateQuotes($t);
            }
        }

        if ($this->do_stupefy) $t = $this->stupefyEntities($t);

        return $t;
    }


    protected function educateQuotes($_) {
        #
        # Parameter: String.
        #
        # Returns: The string, with "educated" curly quote HTML entities.
        #
        # Example input: "Isn't this fun?"
        # Example output: &#8220;Isn&#8217;t this fun?&#8221;
        #
        # Make our own "punctuation" character class, because the POSIX-style
        # [:PUNCT:] is only available in Perl 5.6 or later:
        $punct_class = "[!\"#\\$\\%'()*+,-.\\/:;<=>?\\@\\[\\\\\]\\^_`{|}~]";

        # Special case if the very first character is a quote
        # followed by punctuation at a non-word-break. Close the quotes by brute force:
        $_ = preg_replace(
            array("/^'(?=$punct_class\\B)/", "/^\"(?=$punct_class\\B)/"),
            array('&#8217;', '&#8221;'), $_);


        # Special case for double sets of quotes, e.g.:
        # <p>He said, "'Quoted' words in a larger quote."</p>
        $_ = preg_replace(
            array("/\"'(?=\w)/", "/'\"(?=\w)/"),
            array('&#8220;&#8216;', '&#8216;&#8220;'), $_);

        # Special case for decade abbreviations (the '80s):
        $_ = preg_replace("/'(?=\\d{2}s)/", '&#8217;', $_);

        $close_class = '[^\ \t\r\n\[\{\(\-]';
        $dec_dashes = '&\#8211;|&\#8212;';

        # Get most opening single quotes:
        $_ = preg_replace("{
            (
                \\s | # a whitespace char, or
                &nbsp; | # a non-breaking space entity, or
                -- | # dashes, or
                &[mn]dash; | # named dash entities
                $dec_dashes | # or decimal entities
                &\\#x201[34]; # or hex
            )
            ' # the quote
            (?=\\w) # followed by a word character
            }x", '\1&#8216;', $_);
        # Single closing quotes:
        $_ = preg_replace("{
            ($close_class)?
            '
            (?(1)| # If $1 captured, then do nothing;
              (?=\\s | s\\b) # otherwise, positive lookahead for a whitespace
            ) # char or an 's' at a word ending position. This
              # is a special case to handle something like:
              # \"<i>Custer</i>'s Last Stand.\"
            }xi", '\1&#8217;', $_);

        # Any remaining single quotes should be opening ones:
        $_ = str_replace("'", '&#8216;', $_);


        # Get most opening double quotes:
        $_ = preg_replace("{
            (
                \\s | # a whitespace char, or
                &nbsp; | # a non-breaking space entity, or
                -- | # dashes, or
                &[mn]dash; | # named dash entities
                $dec_dashes | # or decimal entities
                &\\#x201[34]; # or hex
            )
            \" # the quote
            (?=\\w) # followed by a word character
            }x", '\1&#8220;', $_);

        # Double closing quotes:
        $_ = preg_replace("{
            ($close_class)?
            \"
            (?(1)|(?=\\s)) # If $1 captured, then do nothing;
                           # if not, then make sure the next char is whitespace.
            }x", '\1&#8221;', $_);

        # Any remaining quotes should be opening ones.
        $_ = str_replace('"', '&#8220;', $_);

        return $_;
    }


    protected function educateBackticks($_) {
        #
        # Parameter: String.
        # Returns: The string, with ``backticks'' -style double quotes
        # translated into HTML curly quote entities.
        #
        # Example input: ``Isn't this fun?''
        # Example output: &#8220;Isn't this fun?&#8221;
        #

        $_ = str_replace(array("``", "''",),
                         array('&#8220;', '&#8221;'), $_);
        return $_;
    }


    protected function educateSingleBackticks($_) {
        #
        # Parameter: String.
        # Returns: The string, with `backticks' -style single quotes
        # translated into HTML curly quote entities.
        #
        # Example input: `Isn't this fun?'
        # Example output: &#8216;Isn&#8217;t this fun?&#8217;
        #

        $_ = str_replace(array("`", "'",),
                         array('&#8216;', '&#8217;'), $_);
        return $_;
    }


    protected function educateDashes($_) {
        #
        # Parameter: String.
        #
        # Returns: The string, with each instance of "--" translated to
        # an em-dash HTML entity.
        #

        $_ = str_replace('--', '&#8212;', $_);
        return $_;
    }


    protected function educateDashesOldSchool($_) {
        #
        # Parameter: String.
        #
        # Returns: The string, with each instance of "--" translated to
        # an en-dash HTML entity, and each "---" translated to
        # an em-dash HTML entity.
        #

        #                      em         en
        $_ = str_replace(array("---", "--",),
                         array('&#8212;', '&#8211;'), $_);
        return $_;
    }


    protected function educateDashesOldSchoolInverted($_) {
        #
        # Parameter: String.
        #
        # Returns: The string, with each instance of "--" translated to
        # an em-dash HTML entity, and each "---" translated to
        # an en-dash HTML entity. Two reasons why: First, unlike the
        # en- and em-dash syntax supported by
        # EducateDashesOldSchool(), it's compatible with existing
        # entries written before SmartyPants 1.1, back when "--" was
        # only used for em-dashes. Second, em-dashes are more
        # common than en-dashes, and so it sort of makes sense that
        # the shortcut should be shorter to type. (Thanks to Aaron
        # Swartz for the idea.)
        #

        #                      en         em
        $_ = str_replace(array("---", "--",),
                         array('&#8211;', '&#8212;'), $_);
        return $_;
    }


    protected function educateEllipses($_) {
        #
        # Parameter: String.
        # Returns: The string, with each instance of "..." translated to
        # an ellipsis HTML entity. Also converts the case where
        # there are spaces between the dots.
        #
        # Example input: Huh...?
        # Example output: Huh&#8230;?
        #

        $_ = str_replace(array("...", ". . .",), '&#8230;', $_);
        return $_;
    }


    protected function stupefyEntities($_) {
        #
        # Parameter: String.
        # Returns: The string, with each SmartyPants HTML entity translated to
        # its ASCII counterpart.
        #
        # Example input: &#8220;Hello &#8212; world.&#8221;
        # Example output: "Hello -- world."
        #

        #                      en-dash    em-dash
        $_ = str_replace(array('&#8211;', '&#8212;'),
                         array('-', '--'), $_);

        # single quote         open       close
        $_ = str_replace(array('&#8216;', '&#8217;'), "'", $_);

        # double quote         open       close
        $_ = str_replace(array('&#8220;', '&#8221;'), '"', $_);

        $_ = str_replace('&#8230;', '...', $_); # ellipsis

        return $_;
    }


    protected function processEscapes($_) {
        #
        # Parameter: String.
        # Returns: The string, after processing the following backslash
        # escape sequences. This is useful if you want to force a "dumb"
        # quote or other character to appear.
        #
        #   Escape  Value
        #   ------  -----
        #   \\      &#92;
        #   \"      &#34;
        #   \'      &#39;
        #   \.      &#46;
        #   \-      &#45;
        #   \`      &#96;
        #
        $_ = str_replace(
            array('\\\\', '\"', "\'", '\.', '\-', '\`'),
            array('&#92;', '&#34;', '&#39;', '&#46;', '&#45;', '&#96;'), $_);

        return $_;
    }


    protected function tokenizeHTML($str) {
        #
        # Parameter: String containing HTML markup.
        # Returns: An array of the tokens comprising the input
        # string. Each token is either a tag (possibly with nested,
        # tags contained therein, such as <a href="<MTFoo>">, or a
        # run of text between tags. Each element of the array is a
        # two-element array; the first is either 'tag' or 'text';
        # the second is the actual value.
        #
        #
        # Regular expression derived from the _tokenize() subroutine in
        # Brad Choate's MTRegex plugin.
        # <http://www.bradchoate.com/past/mtregex.php>
        #
        $index = 0;
        $tokens = array();

        $match = '(?s:<!--.*?-->)|'. # comment
                 '(?s:<\?.*?\?>)|'. # processing instruction
                 # regular tags
                 '(?:<[/!$]?[-a-zA-Z0-9:]+\b(?>[^"\'>]+|"[^"]*"|\'[^\']*\')*>)';

        $parts = preg_split("{($match)}", $str, -1, PREG_SPLIT_DELIM_CAPTURE);

        foreach ($parts as $part) {
            if (++$index % 2 && $part != '')
                $tokens[] = array('text', $part);
            else
                $tokens[] = array('tag', $part);
        }
        return $tokens;
    }

}
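
The tokenize-then-skip flow of `transform()` and `tokenizeHTML()` above can be illustrated with a much smaller standalone sketch. Everything here is an assumption made for the example: the function names are hypothetical, the tag pattern is far cruder than the library's, and the callback stands in for `educate()`.

```php
<?php
// Hypothetical, simplified tokenizer: splits on anything shaped like <...>.
function sketch_tokenize($html) {
    $parts = preg_split('{(<[^>]*>)}', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $tokens = array();
    foreach ($parts as $i => $part) {
        if ($i % 2) {
            $tokens[] = array('tag', $part);   // odd indexes hold the captured tags
        } elseif ($part !== '') {
            $tokens[] = array('text', $part);  // even indexes hold the text in between
        }
    }
    return $tokens;
}

// Applies $callback to text tokens, except inside the listed tags.
function sketch_transform($html, $callback, $skip = 'pre|code|kbd|script|style|math') {
    $out = '';
    $in_skip = false;
    foreach (sketch_tokenize($html) as $token) {
        list($type, $value) = $token;
        if ($type === 'tag') {
            // An opening skip tag turns protection on; its closing tag turns it off.
            if (preg_match('@<(/?)(?:' . $skip . ')[\s>]@', $value, $m)) {
                $in_skip = ($m[1] !== '/');
            }
            $out .= $value;
        } else {
            $out .= $in_skip ? $value : call_user_func($callback, $value);
        }
    }
    return $out;
}

echo sketch_transform('<p>a--b</p><code>c--d</code>', function ($t) {
    return str_replace('--', '&#8212;', $t);
});
// prints: <p>a&#8212;b</p><code>c--d</code>
```

Keeping tags as separate tokens is what lets the parser leave attribute values untouched, and the single flag is enough because the skip state only changes at tag boundaries.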
@@ -0,0 +1,10 @@
<?php

// Use this file if you cannot use class autoloading. It will include all the
// files needed for the SmartyPants Typographer parser.
//
// Take a look at the PSR-0-compatible class autoloading implementation
// in the Readme.php file if you want a simple autoloader setup.

require_once dirname(__FILE__) . '/SmartyPants.php';
require_once dirname(__FILE__) . '/SmartyPantsTypographer.php';
@@ -0,0 +1,553 @@
|
|||||||
|
<?php
|
||||||
|
#
|
||||||
|
# SmartyPants Typographer - Smart typography for web sites
|
||||||
|
#
|
||||||
|
# PHP SmartyPants & Typographer
|
||||||
|
# Copyright (c) 2004-2016 Michel Fortin
|
||||||
|
# <https://michelf.ca/>
|
||||||
|
#
|
||||||
|
# Original SmartyPants
|
||||||
|
# Copyright (c) 2003-2004 John Gruber
|
||||||
|
# <https://daringfireball.net/>
|
||||||
|
#
|
||||||
|
namespace Michelf;
|
||||||
|
|
||||||
|
|
||||||
|
#
|
||||||
|
# SmartyPants Typographer Parser Class
|
||||||
|
#
|
||||||
|
class SmartyPantsTypographer extends \Michelf\SmartyPants {
|
||||||
|
|
||||||
|
### Configuration Variables ###
|
||||||
|
|
||||||
|
# Options to specify which transformations to make:
|
||||||
|
public $do_comma_quotes = 0;
|
||||||
|
public $do_guillemets = 0;
|
||||||
|
public $do_space_emdash = 0;
|
||||||
|
public $do_space_endash = 0;
|
||||||
|
public $do_space_colon = 0;
|
||||||
|
public $do_space_semicolon = 0;
|
||||||
|
public $do_space_marks = 0;
|
||||||
|
public $do_space_frenchquote = 0;
|
||||||
|
public $do_space_thousand = 0;
|
||||||
|
public $do_space_unit = 0;
|
||||||
|
|
||||||
|
# Smart quote characters:
|
||||||
|
# Opening and closing smart double-quotes.
|
||||||
|
public $smart_doublequote_open = '&#8220;';
public $smart_doublequote_close = '&#8221;';
public $smart_singlequote_open = '&#8216;';
public $smart_singlequote_close = '&#8217;'; # Also apostrophe.
|
||||||
|
|
||||||
|
# Space characters for different places:
|
||||||
|
# Space around em-dashes. "He_—_or she_—_should change that."
|
||||||
|
public $space_emdash = " ";
|
||||||
|
# Space around en-dashes. "He_–_or she_–_should change that."
|
||||||
|
public $space_endash = " ";
|
||||||
|
# Space before a colon. "He said_: here it is."
|
||||||
|
public $space_colon = " ";
|
||||||
|
# Space before a semicolon. "That's what I said_; that's what he said."
|
||||||
|
public $space_semicolon = " ";
|
||||||
|
# Space before a question mark and an exclamation mark: "¡_Holà_! What_?"
|
||||||
|
public $space_marks = " ";
|
||||||
|
# Space inside french quotes. "Voici la «_chose_» qui m'a attaqué."
|
||||||
|
public $space_frenchquote = " ";
|
||||||
|
# Space as thousand separator. "On compte 10_000 maisons sur cette liste."
|
||||||
|
public $space_thousand = " ";
|
||||||
|
# Space before a unit abbreviation. "This 12_kg of matter costs 10_$."
|
||||||
|
public $space_unit = " ";
|
||||||
|
|
||||||
|
# Expression of a space (breakable or not):
|
||||||
|
public $space = '(?: | |&nbsp;|&#0*160;|&#x0*[aA]0;)';
|
||||||
|
|
||||||
|
|
||||||
|
### Parser Implementation ###
|
||||||
|
|
||||||
|
public function __construct($attr = SmartyPants::ATTR_DEFAULT) {
|
||||||
|
#
|
||||||
|
# Initialize a SmartyPantsTypographer_Parser with certain attributes.
|
||||||
|
#
|
||||||
|
# Parser attributes:
|
||||||
|
# 0 : do nothing
|
||||||
|
# 1 : set all, except dash spacing
|
||||||
|
# 2 : set all, except dash spacing, using old school en- and em- dash shortcuts
|
||||||
|
# 3 : set all, except dash spacing, using inverted old school en and em- dash shortcuts
|
||||||
|
#
|
||||||
|
# Punctuation:
|
||||||
|
# q -> quotes
|
||||||
|
# b -> backtick quotes (``double'' only)
|
||||||
|
# B -> backtick quotes (``double'' and `single')
|
||||||
|
# c -> comma quotes (,,double`` only)
|
||||||
|
# g -> guillemets (<<double>> only)
|
||||||
|
# d -> dashes
|
||||||
|
# D -> old school dashes
|
||||||
|
# i -> inverted old school dashes
|
||||||
|
# e -> ellipses
|
||||||
|
# w -> convert &quot; entities to " for Dreamweaver users
|
||||||
|
#
|
||||||
|
# Spacing:
|
||||||
|
# : -> colon spacing +-
|
||||||
|
# ; -> semicolon spacing +-
|
||||||
|
# m -> question and exclamation marks spacing +-
|
||||||
|
# h -> em-dash spacing +-
|
||||||
|
# H -> en-dash spacing +-
|
||||||
|
# f -> french quote spacing +-
|
||||||
|
# t -> thousand separator spacing -
|
||||||
|
# u -> unit spacing +-
|
||||||
|
# (you can add a plus sign after some of these options denoted by + to
|
||||||
|
# add the space when it is not already present, or you can add a minus
|
||||||
|
# sign to completely remove any space present)
|
||||||
|
#
|
||||||
|
# Initialize inherited SmartyPants parser.
|
||||||
|
parent::__construct($attr);
|
||||||
|
|
||||||
|
if ($attr == "1" || $attr == "2" || $attr == "3") {
|
||||||
|
# Do everything, turn all options on.
|
||||||
|
$this->do_comma_quotes = 1;
|
||||||
|
$this->do_guillemets = 1;
|
||||||
|
$this->do_space_emdash = 1;
|
||||||
|
$this->do_space_endash = 1;
|
||||||
|
$this->do_space_colon = 1;
|
||||||
|
$this->do_space_semicolon = 1;
|
||||||
|
$this->do_space_marks = 1;
|
||||||
|
$this->do_space_frenchquote = 1;
|
||||||
|
$this->do_space_thousand = 1;
|
||||||
|
$this->do_space_unit = 1;
|
||||||
|
}
|
||||||
|
else if ($attr == "-1") {
|
||||||
|
# Special "stupefy" mode.
|
||||||
|
$this->do_stupefy = 1;
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
$chars = preg_split('//', $attr);
|
||||||
|
foreach ($chars as $c){
|
||||||
|
if ($c == "c") { $current =& $this->do_comma_quotes; }
|
||||||
|
else if ($c == "g") { $current =& $this->do_guillemets; }
|
||||||
|
else if ($c == ":") { $current =& $this->do_space_colon; }
|
||||||
|
else if ($c == ";") { $current =& $this->do_space_semicolon; }
|
||||||
|
else if ($c == "m") { $current =& $this->do_space_marks; }
|
||||||
|
else if ($c == "h") { $current =& $this->do_space_emdash; }
|
||||||
|
else if ($c == "H") { $current =& $this->do_space_endash; }
|
||||||
|
else if ($c == "f") { $current =& $this->do_space_frenchquote; }
|
||||||
|
else if ($c == "t") { $current =& $this->do_space_thousand; }
|
||||||
|
else if ($c == "u") { $current =& $this->do_space_unit; }
|
||||||
|
else if ($c == "+") {
|
||||||
|
$current = 2;
|
||||||
|
unset($current);
|
||||||
|
}
|
||||||
|
else if ($c == "-") {
|
||||||
|
$current = -1;
|
||||||
|
unset($current);
|
||||||
|
}
|
||||||
|
else {
|
||||||
|
# Unknown attribute option, ignore.
|
||||||
|
}
|
||||||
|
$current = 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
function decodeEntitiesInConfiguration() {
|
||||||
|
#
|
||||||
|
# Utility function that converts entities in configuration variables to
|
||||||
|
# UTF-8 characters.
|
||||||
|
#
|
||||||
|
$this->smart_doublequote_open = html_entity_decode($this->smart_doublequote_open);
|
||||||
|
$this->smart_doublequote_close = html_entity_decode($this->smart_doublequote_close);
|
||||||
|
$this->smart_singlequote_open = html_entity_decode($this->smart_singlequote_open);
|
||||||
|
$this->smart_singlequote_close = html_entity_decode($this->smart_singlequote_close);
|
||||||
|
$this->space_emdash = html_entity_decode($this->space_emdash);
|
||||||
|
$this->space_endash = html_entity_decode($this->space_endash);
|
||||||
|
$this->space_colon = html_entity_decode($this->space_colon);
|
||||||
|
$this->space_semicolon = html_entity_decode($this->space_semicolon);
|
||||||
|
$this->space_marks = html_entity_decode($this->space_marks);
|
||||||
|
$this->space_frenchquote = html_entity_decode($this->space_frenchquote);
|
||||||
|
$this->space_thousand = html_entity_decode($this->space_thousand);
|
||||||
|
$this->space_unit = html_entity_decode($this->space_unit);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
function educate($t, $prev_token_last_char) {
|
||||||
|
$t = parent::educate($t, $prev_token_last_char);
|
||||||
|
|
||||||
|
if ($this->do_comma_quotes) $t = $this->educateCommaQuotes($t);
|
||||||
|
if ($this->do_guillemets) $t = $this->educateGuillemets($t);
|
||||||
|
|
||||||
|
if ($this->do_space_emdash) $t = $this->spaceEmDash($t);
|
||||||
|
if ($this->do_space_endash) $t = $this->spaceEnDash($t);
|
||||||
|
if ($this->do_space_colon) $t = $this->spaceColon($t);
|
||||||
|
if ($this->do_space_semicolon) $t = $this->spaceSemicolon($t);
|
||||||
|
if ($this->do_space_marks) $t = $this->spaceMarks($t);
|
||||||
|
if ($this->do_space_frenchquote) $t = $this->spaceFrenchQuotes($t);
|
||||||
|
if ($this->do_space_thousand) $t = $this->spaceThousandSeparator($t);
|
||||||
|
if ($this->do_space_unit) $t = $this->spaceUnit($t);
|
||||||
|
|
||||||
|
return $t;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function educateQuotes($_) {
|
||||||
|
#
|
||||||
|
# Parameter: String.
|
||||||
|
#
|
||||||
|
# Returns: The string, with "educated" curly quote HTML entities.
|
||||||
|
#
|
||||||
|
# Example input: "Isn't this fun?"
|
||||||
|
# Example output: &#8220;Isn&#8217;t this fun?&#8221;
|
||||||
|
#
|
||||||
|
$dq_open = $this->smart_doublequote_open;
|
||||||
|
$dq_close = $this->smart_doublequote_close;
|
||||||
|
$sq_open = $this->smart_singlequote_open;
|
||||||
|
$sq_close = $this->smart_singlequote_close;
|
||||||
|
|
||||||
|
# Make our own "punctuation" character class, because the POSIX-style
|
||||||
|
# [:PUNCT:] is only available in Perl 5.6 or later:
|
||||||
|
$punct_class = "[!\"#\\$\\%'()*+,-.\\/:;<=>?\\@\\[\\\\\]\\^_`{|}~]";
|
||||||
|
|
||||||
|
# Special case if the very first character is a quote
|
||||||
|
# followed by punctuation at a non-word-break. Close the quotes by brute force:
|
||||||
|
$_ = preg_replace(
|
||||||
|
array("/^'(?=$punct_class\\B)/", "/^\"(?=$punct_class\\B)/"),
|
||||||
|
array($sq_close, $dq_close), $_);
|
||||||
|
|
||||||
|
# Special case for double sets of quotes, e.g.:
|
||||||
|
# <p>He said, "'Quoted' words in a larger quote."</p>
|
||||||
|
$_ = preg_replace(
|
||||||
|
array("/\"'(?=\w)/", "/'\"(?=\w)/"),
|
||||||
|
array($dq_open.$sq_open, $sq_open.$dq_open), $_);
|
||||||
|
|
||||||
|
# Special case for decade abbreviations (the '80s):
|
||||||
|
$_ = preg_replace("/'(?=\\d{2}s)/", $sq_close, $_);
|
||||||
|
|
||||||
|
$close_class = '[^\ \t\r\n\[\{\(\-]';
|
||||||
|
$dec_dashes = '&\#8211;|&\#8212;';
|
||||||
|
|
||||||
|
# Get most opening single quotes:
|
||||||
|
$_ = preg_replace("{
|
||||||
|
(
|
||||||
|
\\s | # a whitespace char, or
|
||||||
|
&nbsp; | # a non-breaking space entity, or
|
||||||
|
-- | # dashes, or
|
||||||
|
&[mn]dash; | # named dash entities
|
||||||
|
$dec_dashes | # or decimal entities
|
||||||
|
&\\#x201[34]; # or hex
|
||||||
|
)
|
||||||
|
' # the quote
|
||||||
|
(?=\\w) # followed by a word character
|
||||||
|
}x", '\1'.$sq_open, $_);
|
||||||
|
# Single closing quotes:
|
||||||
|
$_ = preg_replace("{
|
||||||
|
($close_class)?
|
||||||
|
'
|
||||||
|
(?(1)| # If $1 captured, then do nothing;
|
||||||
|
(?=\\s | s\\b) # otherwise, positive lookahead for a whitespace
|
||||||
|
) # char or an 's' at a word ending position. This
|
||||||
|
# is a special case to handle something like:
|
||||||
|
# \"<i>Custer</i>'s Last Stand.\"
|
||||||
|
}xi", '\1'.$sq_close, $_);
|
||||||
|
|
||||||
|
# Any remaining single quotes should be opening ones:
|
||||||
|
$_ = str_replace("'", $sq_open, $_);
|
||||||
|
|
||||||
|
|
||||||
|
# Get most opening double quotes:
|
||||||
|
$_ = preg_replace("{
|
||||||
|
(
|
||||||
|
\\s | # a whitespace char, or
|
||||||
|
&nbsp; | # a non-breaking space entity, or
|
||||||
|
-- | # dashes, or
|
||||||
|
&[mn]dash; | # named dash entities
|
||||||
|
$dec_dashes | # or decimal entities
|
||||||
|
&\\#x201[34]; # or hex
|
||||||
|
)
|
||||||
|
\" # the quote
|
||||||
|
(?=\\w) # followed by a word character
|
||||||
|
}x", '\1'.$dq_open, $_);
|
||||||
|
|
||||||
|
# Double closing quotes:
|
||||||
|
$_ = preg_replace("{
|
||||||
|
($close_class)?
|
||||||
|
\"
|
||||||
|
(?(1)|(?=\\s)) # If $1 captured, then do nothing;
|
||||||
|
# if not, then make sure the next char is whitespace.
|
||||||
|
}x", '\1'.$dq_close, $_);
|
||||||
|
|
||||||
|
# Any remaining quotes should be opening ones.
|
||||||
|
$_ = str_replace('"', $dq_open, $_);
|
||||||
|
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function educateCommaQuotes($_) {
|
||||||
|
#
|
||||||
|
# Parameter: String.
|
||||||
|
# Returns: The string, with ,,comma,, -style double quotes
|
||||||
|
# translated into HTML curly quote entities.
|
||||||
|
#
|
||||||
|
# Example input: ,,Isn't this fun?,,
|
||||||
|
# Example output: &#8222;Isn't this fun?&#8222;
|
||||||
|
#
|
||||||
|
# Note: this is meant to be used alongside backtick quotes; no language
# uses lower quotation marks alone, as in the example.
|
||||||
|
#
|
||||||
|
$_ = str_replace(",,", '&#8222;', $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function educateGuillemets($_) {
|
||||||
|
#
|
||||||
|
# Parameter: String.
|
||||||
|
# Returns: The string, with << guillemets >> -style quotes
|
||||||
|
# translated into HTML guillemets entities.
|
||||||
|
#
|
||||||
|
# Example input: << Isn't this fun? >>
|
||||||
|
# Example output: &#171; Isn't this fun? &#187;
|
||||||
|
#
|
||||||
|
$_ = preg_replace("/(?:<|&lt;){2}/", '&#171;', $_);
|
||||||
|
$_ = preg_replace("/(?:>|&gt;){2}/", '&#187;', $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceFrenchQuotes($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# inside french-style quotes, only french quotes.
|
||||||
|
#
|
||||||
|
# Example input: Quotes in « French », »German« and »Finnish» style.
|
||||||
|
# Example output: Quotes in «_French_», »German« and »Finnish» style.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_frenchquote == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_frenchquote != -1 ? $this->space_frenchquote : '' );
|
||||||
|
|
||||||
|
# Characters allowed immediately outside quotes.
|
||||||
|
$outside_char = $this->space . '|\s|[.,:;!?\[\](){}|@*~=+-]|¡|¿';
|
||||||
|
|
||||||
|
$_ = preg_replace(
"/(^|$outside_char)(&#171;|«|&#8250;|‹)$this->space$opt/",
"\\1\\2$chr", $_);
|
||||||
|
$_ = preg_replace(
"/$this->space$opt(&#187;|»|&#8249;|›)($outside_char|$)/",
"$chr\\1\\2", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceColon($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# before colons.
|
||||||
|
#
|
||||||
|
# Example input: Ingredients : fun.
|
||||||
|
# Example output: Ingredients_: fun.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_colon == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_colon != -1 ? $this->space_colon : '' );
|
||||||
|
|
||||||
|
$_ = preg_replace("/$this->space$opt(:)(\\s|$)/m",
|
||||||
|
"$chr\\1\\2", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceSemicolon($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# before semicolons.
|
||||||
|
#
|
||||||
|
# Example input: There he goes ; there she goes.
|
||||||
|
# Example output: There he goes_; there she goes.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_semicolon == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_semicolon != -1 ? $this->space_semicolon : '' );
|
||||||
|
|
||||||
|
$_ = preg_replace("/$this->space(;)(?=\\s|$)/m",
|
||||||
|
" \\1", $_);
|
||||||
|
$_ = preg_replace("/((?:^|\\s)(?>[^&;\\s]+|&#?[a-zA-Z0-9]+;)*)".
|
||||||
|
" $opt(;)(?=\\s|$)/m",
|
||||||
|
"\\1$chr\\2", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceMarks($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# around question and exclamation marks.
|
||||||
|
#
|
||||||
|
# Example input: ¡ Holà ! What ?
|
||||||
|
# Example output: ¡_Holà_! What_?
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_marks == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_marks != -1 ? $this->space_marks : '' );
|
||||||
|
|
||||||
|
// Regular marks.
|
||||||
|
$_ = preg_replace("/$this->space$opt([?!]+)/", "$chr\\1", $_);
|
||||||
|
|
||||||
|
// Inverted marks.
|
||||||
|
$imarks = "(?:¡|&iexcl;|&#161;|&#x[Aa]1;|¿|&iquest;|&#191;|&#x[Bb][Ff];)";
|
||||||
|
$_ = preg_replace("/($imarks+)$this->space$opt/", "\\1$chr", $_);
|
||||||
|
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceEmDash($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, two replacement characters separated by a hyphen (`-`),
|
||||||
|
# and forcing flag.
|
||||||
|
#
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# around dashes.
|
||||||
|
#
|
||||||
|
# Example input: Then — without any plan — the fun happened.
# Example output: Then_—_without any plan_—_the fun happened.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_emdash == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_emdash != -1 ? $this->space_emdash : '' );
|
||||||
|
$_ = preg_replace("/$this->space$opt(&#8212;|—)$this->space$opt/",
|
||||||
|
"$chr\\1$chr", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceEnDash($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, two replacement characters separated by a hyphen (`-`),
|
||||||
|
# and forcing flag.
|
||||||
|
#
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# around dashes.
|
||||||
|
#
|
||||||
|
# Example input: Then – without any plan – the fun happened.
# Example output: Then_–_without any plan_–_the fun happened.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_endash == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_endash != -1 ? $this->space_endash : '' );
|
||||||
|
$_ = preg_replace("/$this->space$opt(&#8211;|–)$this->space$opt/",
|
||||||
|
"$chr\\1$chr", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceThousandSeparator($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# inside numbers (thousand separator in french).
|
||||||
|
#
|
||||||
|
# Example input: Il y a 10 000 insectes amusants dans ton jardin.
|
||||||
|
# Example output: Il y a 10_000 insectes amusants dans ton jardin.
|
||||||
|
#
|
||||||
|
$chr = ( $this->do_space_thousand != -1 ? $this->space_thousand : '' );
|
||||||
|
$_ = preg_replace('/([0-9]) ([0-9])/', "\\1$chr\\2", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected $units = '
|
||||||
|
### Metric units (with prefixes)
|
||||||
|
(?:
|
||||||
|
p |
|
||||||
|
µ | &micro; | &\#0*181; | &\#[xX]0*[Bb]5; |
|
||||||
|
[mcdhkMGT]
|
||||||
|
)?
|
||||||
|
(?:
|
||||||
|
[mgstAKNJWCVFSTHBL]|mol|cd|rad|Hz|Pa|Wb|lm|lx|Bq|Gy|Sv|kat|
|
||||||
|
Ω | Ohm | &Omega; | &\#0*937; | &\#[xX]0*3[Aa]9;
|
||||||
|
)|
|
||||||
|
### Computer units (KB, Kb, TB, Kbps)
|
||||||
|
[kKMGT]?(?:[oBb]|[oBb]ps|flops)|
|
||||||
|
### Money
|
||||||
|
¢ | &cent; | &\#0*162; | &\#[xX]0*[Aa]2; |
|
||||||
|
M?(?:
|
||||||
|
£ | &pound; | &\#0*163; | &\#[xX]0*[Aa]3; |
¥ | &yen; | &\#0*165; | &\#[xX]0*[Aa]5; |
€ | &euro; | &\#0*8364; | &\#[xX]0*20[Aa][Cc]; |
|
||||||
|
$
|
||||||
|
)|
|
||||||
|
### Other units
|
||||||
|
(?: ° | &deg; | &\#0*176; | &\#[xX]0*[Bb]0; ) [CF]? |
|
||||||
|
%|pt|pi|M?px|em|en|gal|lb|[NSEOW]|[NS][EOW]|ha|mbar
|
||||||
|
'; //x
|
||||||
|
|
||||||
|
protected function spaceUnit($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# before unit symbols.
|
||||||
|
#
|
||||||
|
# Example input: Get 3 mol of fun for 3 $.
|
||||||
|
# Example output: Get 3_mol of fun for 3_$.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_unit == 2 ? '?' : '' );
|
||||||
|
$chr = ( $this->do_space_unit != -1 ? $this->space_unit : '' );
|
||||||
|
|
||||||
|
$_ = preg_replace('/
|
||||||
|
(?:([0-9])[ ]'.$opt.') # Number followed by space.
|
||||||
|
('.$this->units.') # Unit.
|
||||||
|
(?![a-zA-Z0-9]) # Negative lookahead for other unit characters.
|
||||||
|
/x',
|
||||||
|
"\\1$chr\\2", $_);
|
||||||
|
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function spaceAbbr($_) {
|
||||||
|
#
|
||||||
|
# Parameters: String, replacement character, and forcing flag.
|
||||||
|
# Returns: The string, with appropriate spaces replaced
|
||||||
|
# around abbreviations.
|
||||||
|
#
|
||||||
|
# Example input: Fun i.e. something pleasant.
|
||||||
|
# Example output: Fun i.e._something pleasant.
|
||||||
|
#
|
||||||
|
$opt = ( $this->do_space_abbr == 2 ? '?' : '' );
|
||||||
|
|
||||||
|
$_ = preg_replace("/(^|\s)($this->abbr_after) $opt/m",
|
||||||
|
"\\1\\2$this->space_abbr", $_);
|
||||||
|
$_ = preg_replace("/( )$opt($this->abbr_sp_before)(?![a-zA-Z'])/m",
|
||||||
|
"\\1$this->space_abbr\\2", $_);
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function stupefyEntities($_) {
|
||||||
|
#
|
||||||
|
# Adding angle quotes and lower quotes to SmartyPants's stupefy mode.
|
||||||
|
#
|
||||||
|
$_ = parent::stupefyEntities($_);
|
||||||
|
|
||||||
|
$_ = str_replace(array('&#8222;', '&#171;', '&#187;'), '"', $_);
|
||||||
|
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
protected function processEscapes($_) {
|
||||||
|
#
|
||||||
|
# Adding a few more escapes to SmartyPants's escapes:
|
||||||
|
#
|
||||||
|
# Escape Value
|
||||||
|
# ------ -----
|
||||||
|
# \, &#44;
# \< &#60;
# \> &#62;
|
||||||
|
#
|
||||||
|
$_ = parent::processEscapes($_);
|
||||||
|
|
||||||
|
$_ = str_replace(
|
||||||
|
array('\,', '\<', '\>', '\&lt;', '\&gt;'),
array('&#44;', '&#60;', '&#62;', '&#60;', '&#62;'), $_);
|
||||||
|
|
||||||
|
return $_;
|
||||||
|
}
|
||||||
|
}
|
@@ -1,28 +1,104 @@
|
|||||||
<?php namespace ProcessWire;
|
<?php namespace ProcessWire;
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* ProcessWire Smartypants Textformatter
|
* ProcessWire Smartypants Typographer Textformatter
|
||||||
*
|
*
|
||||||
* See: http://daringfireball.net/projects/smartypants/
|
* See: http://daringfireball.net/projects/smartypants/
|
||||||
|
* See: https://github.com/michelf/php-smartypants
|
||||||
*
|
*
|
||||||
* ProcessWire 3.x, Copyright 2016 by Ryan Cramer
|
* ProcessWire 3.x, Copyright 2016 by Ryan Cramer
|
||||||
* https://processwire.com
|
* https://processwire.com
|
||||||
*
|
*
|
||||||
|
* @property int $useUTF8
|
||||||
|
*
|
||||||
*/
|
*/
|
||||||
|
|
||||||
class TextformatterSmartypants extends Textformatter {
|
class TextformatterSmartypants extends Textformatter implements ConfigurableModule {
|
||||||
|
|
||||||
public static function getModuleInfo() {
|
public static function getModuleInfo() {
|
||||||
return array(
|
return array(
|
||||||
'title' => 'SmartyPants Typographer',
|
'title' => 'SmartyPants Typographer',
|
||||||
'version' => 152,
|
'version' => 171,
|
||||||
'summary' => "Smart typography for web sites, by Michel Fortin based on SmartyPants by John Gruber. If combined with Markdown, it should be applied AFTER Markdown.",
|
'summary' => "Smart typography for web sites, by Michel Fortin based on SmartyPants by John Gruber. If combined with Markdown, it should be applied AFTER Markdown.",
|
||||||
'url' => 'http://michelf.com/projects/php-smartypants/typographer/',
|
'url' => 'https://github.com/michelf/php-smartypants'
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Replacements when useUTF8 aggressive mode is active
|
||||||
|
*
|
||||||
|
* @var array
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
protected static $replacementsUTF8 = array(
|
||||||
|
'&#8212;' => '—', // em dash
'&#8211;' => '–', // en dash
'&#8230;' => '…', // ellipsis
'&#8220;' => '“', // open double quote
'&#8221;' => '”', // close double quote
'&#8222;' => '„', // low double open quote
'&#171;' => '«', // guillemets <<
'&#187;' => '»', // guillemets >>
|
||||||
|
);
|
||||||
|
|
||||||
|
public function __construct() {
|
||||||
|
$this->set('useUTF8', 0);
|
||||||
|
parent::__construct();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Textformatter format
|
||||||
|
*
|
||||||
|
* @param string $str
|
||||||
|
*
|
||||||
|
*/
|
||||||
public function format(&$str) {
|
public function format(&$str) {
|
||||||
require_once(dirname(__FILE__) . "/smartypants.php");
|
$str = self::typographer($str, ((int) $this->useUTF8) ? true : false);
|
||||||
$str = \SmartyPants($str);
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Format string with SmartyPants Typographer
|
||||||
|
*
|
||||||
|
* The SmartyPants classes do a lot of expensive setup in their constructor, so we keep a static
|
||||||
|
* version of the parser so that hopefully we don't need to construct more than one parser per
|
||||||
|
* request even if lots of format() calls are made.
|
||||||
|
*
|
||||||
|
* @param string $str
|
||||||
|
* @param bool $useUTF8 Specify true to use UTF-8 replacements rather than HTML entity replacements
|
||||||
|
* @return string
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
static public function typographer($str, $useUTF8 = false) {
|
||||||
|
|
||||||
|
static $parser = null;
|
||||||
|
static $parserUseUTF8 = false;
|
||||||
|
|
||||||
|
if(is_null($parser) || $parserUseUTF8 != $useUTF8) {
|
||||||
|
require_once(dirname(__FILE__) . "/Michelf/SmartyPantsTypographer.inc.php");
|
||||||
|
$parser = new \Michelf\SmartyPantsTypographer(\Michelf\SmartyPants::ATTR_LONG_EM_DASH_SHORT_EN);
|
||||||
|
if($useUTF8) $parser->decodeEntitiesInConfiguration();
|
||||||
|
$parserUseUTF8 = $useUTF8;
|
||||||
|
}
|
||||||
|
|
||||||
|
$str = $parser->transform($str);
|
||||||
|
|
||||||
|
// uncomment this for aggressive UTF8 replacement
|
||||||
|
// if($useUTF8) $str = str_replace(array_keys(self::$replacementsUTF8), array_values(self::$replacementsUTF8), $str);
|
||||||
|
|
||||||
|
return $str;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Module config
|
||||||
|
*
|
||||||
|
* @param InputfieldWrapper $inputfields
|
||||||
|
*
|
||||||
|
*/
|
||||||
|
public function getModuleConfigInputfields(InputfieldWrapper $inputfields) {
|
||||||
|
$f = $this->wire('modules')->get('InputfieldCheckbox');
|
||||||
|
$f->attr('name', 'useUTF8');
|
||||||
|
$f->label = $this->_('Use UTF-8 characters for replacements rather than HTML entities?');
|
||||||
|
$f->attr('checked', $this->useUTF8 ? true : false);
|
||||||
|
$inputfields->add($f);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
File diff suppressed because it is too large