Update docs

This commit is contained in:
Nikita Popov 2016-10-29 13:37:47 +02:00
parent c0f0edf044
commit 71438559ae
6 changed files with 163 additions and 28 deletions

View File

@ -27,6 +27,8 @@ This release primarily improves our support for error recovery.
* Due to the error handling changes, the `Parser` interface and `Lexer` API have changed.
* The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
This changes the protected API of the lexer.
* The `Name::slice()` method now returns `null` for empty slices, previously `new Name([])` was
used. `Name::concat()` now also supports concatenation with `null`.
### Removed

View File

@ -6,7 +6,9 @@ PHP Parser
This is a PHP 5.2 to PHP 7.1 parser written in PHP. Its purpose is to simplify static code analysis and
manipulation.
[**Documentation for version 2.x**][doc_master] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).
[Documentation for version 3.x][doc_master] (beta; for running on PHP >= 5.5; for parsing PHP 5.2 to PHP 7.1).
[**Documentation for version 2.x**][doc_2_x] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).
[Documentation for version 1.x][doc_1_x] (unsupported; for running on PHP >= 5.3; for parsing PHP 5.2 to PHP 5.6).
@ -89,7 +91,7 @@ Documentation
Component documentation:
1. [Error](doc/component/Error.markdown)
1. [Error handling](doc/component/Error_handling.markdown)
2. [Lexer](doc/component/Lexer.markdown)
[doc_1_x]: https://github.com/nikic/PHP-Parser/tree/1.x/doc

View File

@ -152,3 +152,5 @@ The following methods, arguments or options have been removed:
* The constants on `NameTraverserInterface` have been moved into the `NameTraverser` class.
* The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
This changes the protected API of the emulative lexer.
* The `Name::slice()` method now returns `null` for empty slices, previously `new Name([])` was
used. `Name::concat()` now also supports concatenation with `null`.

View File

@ -8,7 +8,7 @@ Simple serialization
It is possible to serialize the node tree using `serialize()` and also unserialize it using
`unserialize()`. The output is not human readable and not easily processable from anything
but PHP, but it is compact and generates fast. The main application thus is in caching.
but PHP, but it is compact and generates quickly. The main application thus is in caching.
Human readable dumping
----------------------
@ -86,6 +86,134 @@ array(
)
```
JSON encoding
-------------
Nodes (and comments) implement the `JsonSerializable` interface. As such, it is possible to JSON
encode the AST directly using `json_encode()`:
```php
$code = <<<'CODE'
<?php
function printLine($msg) {
echo $msg, "\n";
}
printLine('Hello World!!!');
CODE;
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7);
$nodeDumper = new PhpParser\NodeDumper;
try {
$stmts = $parser->parse($code);
echo json_encode($stmts, JSON_PRETTY_PRINT), "\n";
} catch (PhpParser\Error $e) {
echo 'Parse Error: ', $e->getMessage();
}
```
This will result in the following output (which includes attributes):
```json
[
{
"nodeType": "Stmt_Function",
"byRef": false,
"name": "printLine",
"params": [
{
"nodeType": "Param",
"type": null,
"byRef": false,
"variadic": false,
"name": "msg",
"default": null,
"attributes": {
"startLine": 3,
"endLine": 3
}
}
],
"returnType": null,
"stmts": [
{
"nodeType": "Stmt_Echo",
"exprs": [
{
"nodeType": "Expr_Variable",
"name": "msg",
"attributes": {
"startLine": 4,
"endLine": 4
}
},
{
"nodeType": "Scalar_String",
"value": "\n",
"attributes": {
"startLine": 4,
"endLine": 4,
"kind": 2
}
}
],
"attributes": {
"startLine": 4,
"endLine": 4
}
}
],
"attributes": {
"startLine": 3,
"endLine": 5
}
},
{
"nodeType": "Expr_FuncCall",
"name": {
"nodeType": "Name",
"parts": [
"printLine"
],
"attributes": {
"startLine": 7,
"endLine": 7
}
},
"args": [
{
"nodeType": "Arg",
"value": {
"nodeType": "Scalar_String",
"value": "Hello World!!!",
"attributes": {
"startLine": 7,
"endLine": 7,
"kind": 1
}
},
"byRef": false,
"unpack": false,
"attributes": {
"startLine": 7,
"endLine": 7
}
}
],
"attributes": {
"startLine": 7,
"endLine": 7
}
}
]
```
There is currently no mechanism to convert JSON back into a node tree. Furthermore, not all ASTs
can be JSON encoded. In particular, JSON only supports UTF-8 strings.
Serialization to XML
--------------------

View File

@ -35,6 +35,8 @@ the source code of the parsed file. An example for printing an error:
if ($e->hasColumnInfo()) {
echo $e->getRawMessage() . ' from ' . $e->getStartLine() . ':' . $e->getStartColumn($code)
. ' to ' . $e->getEndLine() . ':' . $e->getEndColumn($code);
// or:
echo $e->getMessageWithColumnInfo();
} else {
echo $e->getMessage();
}
@ -46,27 +48,23 @@ file.
Error recovery
--------------
> **EXPERIMENTAL**
The error behavior of the parser (and other components) is controlled by an `ErrorHandler`. Whenever an error is
encountered, `ErrorHandler::handleError()` is invoked. The default error handling strategy is `ErrorHandler\Throwing`,
which will immediately throw when an error is encountered.
By default the parser will throw an exception upon encountering the first error during parsing. An alternative mode is
also supported, in which the parser will remember the error, but try to continue parsing the rest of the source code.
To enable this mode the `throwOnError` parser option needs to be disabled. Any errors that occurred during parsing can
then be retrieved using `$parser->getErrors()`. The `$parser->parse()` method will either return a partial syntax tree
or `null` if recovery fails.
A usage example:
To instead collect all encountered errors into an array, while trying to continue parsing the rest of the source code,
an instance of `ErrorHandler\Collecting` can be passed to the `Parser::parse()` method. A usage example:
```php
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, null, array(
'throwOnError' => false,
));
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7);
$errorHandler = new PhpParser\ErrorHandler\Collecting;
$stmts = $parser->parse($code);
$errors = $parser->getErrors();
$stmts = $parser->parse($code, $errorHandler);
foreach ($errors as $error) {
// $error is an ordinary PhpParser\Error
if ($errorHandler->hasErrors()) {
foreach ($errorHandler->getErrors() as $error) {
// $error is an ordinary PhpParser\Error
}
}
if (null !== $stmts) {
@ -74,4 +72,4 @@ if (null !== $stmts) {
}
```
The error recovery implementation is experimental -- it currently won't be able to recover from many types of errors.
The `NameResolver` visitor also accepts an `ErrorHandler` as a constructor argument.

View File

@ -95,13 +95,14 @@ Lexer extension
A lexer has to define the following public interface:
void startLexing(string $code);
void startLexing(string $code, ErrorHandler $errorHandler = null);
array getTokens();
string handleHaltCompiler();
int getNextToken(string &$value = null, array &$startAttributes = null, array &$endAttributes = null);
The `startLexing()` method is invoked with the source code that is to be lexed (including the opening tag) whenever the
`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens.
`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens. The
passes `ErrorHandler` should be used to report lexing errors.
The `getTokens()` method returns the current token array, in the usual `token_get_all()` format. This method is not
used by the parser (which uses `getNextToken()`), but is useful in combination with the token position attributes.
@ -122,9 +123,10 @@ node and the `$endAttributes` from the last token that is part of the node.
E.g. if the tokens `T_FUNCTION T_STRING ... '{' ... '}'` constitute a node, then the `$startAttributes` from the
`T_FUNCTION` token will be taken and the `$endAttributes` from the `'}'` token.
An application of custom attributes is storing the original formatting of literals: The parser does not retain
information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type or used
escape sequences). This can be remedied by storing the original value in an attribute:
An application of custom attributes is storing the exact original formatting of literals: While the parser does retain
some information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type), it
does not preserve the exact original formatting (e.g. leading zeros for integers or escape sequences in strings). This
can be remedied by storing the original value in an attribute:
```php
use PhpParser\Lexer;
@ -135,9 +137,10 @@ class KeepOriginalValueLexer extends Lexer // or Lexer\Emulative
public function getNextToken(&$value = null, &$startAttributes = null, &$endAttributes = null) {
$tokenId = parent::getNextToken($value, $startAttributes, $endAttributes);
if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING // non-interpolated string
|| $tokenId == Tokens::T_LNUMBER // integer
|| $tokenId == Tokens::T_DNUMBER // floating point number
if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING // non-interpolated string
|| $tokenId == Tokens::T_ENCAPSED_AND_WHITESPACE // interpolated string
|| $tokenId == Tokens::T_LNUMBER // integer
|| $tokenId == Tokens::T_DNUMBER // floating point number
) {
// could also use $startAttributes, doesn't really matter here
$endAttributes['originalValue'] = $value;