mirror of
https://github.com/nikic/PHP-Parser.git
synced 2025-01-17 15:18:17 +01:00
Update docs
This commit is contained in:
parent
c0f0edf044
commit
71438559ae
@ -27,6 +27,8 @@ This release primarily improves our support for error recovery.
|
||||
* Due to the error handling changes, the `Parser` interface and `Lexer` API have changed.
|
||||
* The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
|
||||
This changes the protected API of the lexer.
|
||||
* The `Name::slice()` method now returns `null` for empty slices, previously `new Name([])` was
|
||||
used. `Name::concat()` now also supports concatenation with `null`.
|
||||
|
||||
### Removed
|
||||
|
||||
|
@ -6,7 +6,9 @@ PHP Parser
|
||||
This is a PHP 5.2 to PHP 7.1 parser written in PHP. Its purpose is to simplify static code analysis and
|
||||
manipulation.
|
||||
|
||||
[**Documentation for version 2.x**][doc_master] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).
|
||||
[Documentation for version 3.x][doc_master] (beta; for running on PHP >= 5.5; for parsing PHP 5.2 to PHP 7.1).
|
||||
|
||||
[**Documentation for version 2.x**][doc_2_x] (stable; for running on PHP >= 5.4; for parsing PHP 5.2 to PHP 7.0).
|
||||
|
||||
[Documentation for version 1.x][doc_1_x] (unsupported; for running on PHP >= 5.3; for parsing PHP 5.2 to PHP 5.6).
|
||||
|
||||
@ -89,7 +91,7 @@ Documentation
|
||||
|
||||
Component documentation:
|
||||
|
||||
1. [Error](doc/component/Error.markdown)
|
||||
1. [Error handling](doc/component/Error_handling.markdown)
|
||||
2. [Lexer](doc/component/Lexer.markdown)
|
||||
|
||||
[doc_1_x]: https://github.com/nikic/PHP-Parser/tree/1.x/doc
|
||||
|
@ -152,3 +152,5 @@ The following methods, arguments or options have been removed:
|
||||
* The constants on `NameTraverserInterface` have been moved into the `NameTraverser` class.
|
||||
* The emulative lexer now directly postprocesses tokens, instead of using `~__EMU__~` sequences.
|
||||
This changes the protected API of the emulative lexer.
|
||||
* The `Name::slice()` method now returns `null` for empty slices, previously `new Name([])` was
|
||||
used. `Name::concat()` now also supports concatenation with `null`.
|
||||
|
@ -8,7 +8,7 @@ Simple serialization
|
||||
|
||||
It is possible to serialize the node tree using `serialize()` and also unserialize it using
|
||||
`unserialize()`. The output is not human readable and not easily processable from anything
|
||||
but PHP, but it is compact and generates fast. The main application thus is in caching.
|
||||
but PHP, but it is compact and generates quickly. The main application thus is in caching.
|
||||
|
||||
Human readable dumping
|
||||
----------------------
|
||||
@ -86,6 +86,134 @@ array(
|
||||
)
|
||||
```
|
||||
|
||||
JSON encoding
|
||||
-------------
|
||||
|
||||
Nodes (and comments) implement the `JsonSerializable` interface. As such, it is possible to JSON
|
||||
encode the AST directly using `json_encode()`:
|
||||
|
||||
```php
|
||||
$code = <<<'CODE'
|
||||
<?php
|
||||
|
||||
function printLine($msg) {
|
||||
echo $msg, "\n";
|
||||
}
|
||||
|
||||
printLine('Hello World!!!');
|
||||
CODE;
|
||||
|
||||
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7);
|
||||
$nodeDumper = new PhpParser\NodeDumper;
|
||||
|
||||
try {
|
||||
$stmts = $parser->parse($code);
|
||||
|
||||
echo json_encode($stmts, JSON_PRETTY_PRINT), "\n";
|
||||
} catch (PhpParser\Error $e) {
|
||||
echo 'Parse Error: ', $e->getMessage();
|
||||
}
|
||||
```
|
||||
|
||||
This will result in the following output (which includes attributes):
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"nodeType": "Stmt_Function",
|
||||
"byRef": false,
|
||||
"name": "printLine",
|
||||
"params": [
|
||||
{
|
||||
"nodeType": "Param",
|
||||
"type": null,
|
||||
"byRef": false,
|
||||
"variadic": false,
|
||||
"name": "msg",
|
||||
"default": null,
|
||||
"attributes": {
|
||||
"startLine": 3,
|
||||
"endLine": 3
|
||||
}
|
||||
}
|
||||
],
|
||||
"returnType": null,
|
||||
"stmts": [
|
||||
{
|
||||
"nodeType": "Stmt_Echo",
|
||||
"exprs": [
|
||||
{
|
||||
"nodeType": "Expr_Variable",
|
||||
"name": "msg",
|
||||
"attributes": {
|
||||
"startLine": 4,
|
||||
"endLine": 4
|
||||
}
|
||||
},
|
||||
{
|
||||
"nodeType": "Scalar_String",
|
||||
"value": "\n",
|
||||
"attributes": {
|
||||
"startLine": 4,
|
||||
"endLine": 4,
|
||||
"kind": 2
|
||||
}
|
||||
}
|
||||
],
|
||||
"attributes": {
|
||||
"startLine": 4,
|
||||
"endLine": 4
|
||||
}
|
||||
}
|
||||
],
|
||||
"attributes": {
|
||||
"startLine": 3,
|
||||
"endLine": 5
|
||||
}
|
||||
},
|
||||
{
|
||||
"nodeType": "Expr_FuncCall",
|
||||
"name": {
|
||||
"nodeType": "Name",
|
||||
"parts": [
|
||||
"printLine"
|
||||
],
|
||||
"attributes": {
|
||||
"startLine": 7,
|
||||
"endLine": 7
|
||||
}
|
||||
},
|
||||
"args": [
|
||||
{
|
||||
"nodeType": "Arg",
|
||||
"value": {
|
||||
"nodeType": "Scalar_String",
|
||||
"value": "Hello World!!!",
|
||||
"attributes": {
|
||||
"startLine": 7,
|
||||
"endLine": 7,
|
||||
"kind": 1
|
||||
}
|
||||
},
|
||||
"byRef": false,
|
||||
"unpack": false,
|
||||
"attributes": {
|
||||
"startLine": 7,
|
||||
"endLine": 7
|
||||
}
|
||||
}
|
||||
],
|
||||
"attributes": {
|
||||
"startLine": 7,
|
||||
"endLine": 7
|
||||
}
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
There is currently no mechanism to convert JSON back into a node tree. Furthermore, not all ASTs
|
||||
can be JSON encoded. In particular, JSON only supports UTF-8 strings.
|
||||
|
||||
Serialization to XML
|
||||
--------------------
|
||||
|
||||
|
@ -35,6 +35,8 @@ the source code of the parsed file. An example for printing an error:
|
||||
if ($e->hasColumnInfo()) {
|
||||
echo $e->getRawMessage() . ' from ' . $e->getStartLine() . ':' . $e->getStartColumn($code)
|
||||
. ' to ' . $e->getEndLine() . ':' . $e->getEndColumn($code);
|
||||
// or:
|
||||
echo $e->getMessageWithColumnInfo();
|
||||
} else {
|
||||
echo $e->getMessage();
|
||||
}
|
||||
@ -46,27 +48,23 @@ file.
|
||||
Error recovery
|
||||
--------------
|
||||
|
||||
> **EXPERIMENTAL**
|
||||
The error behavior of the parser (and other components) is controlled by an `ErrorHandler`. Whenever an error is
|
||||
encountered, `ErrorHandler::handleError()` is invoked. The default error handling strategy is `ErrorHandler\Throwing`,
|
||||
which will immediately throw when an error is encountered.
|
||||
|
||||
By default the parser will throw an exception upon encountering the first error during parsing. An alternative mode is
|
||||
also supported, in which the parser will remember the error, but try to continue parsing the rest of the source code.
|
||||
|
||||
To enable this mode the `throwOnError` parser option needs to be disabled. Any errors that occurred during parsing can
|
||||
then be retrieved using `$parser->getErrors()`. The `$parser->parse()` method will either return a partial syntax tree
|
||||
or `null` if recovery fails.
|
||||
|
||||
A usage example:
|
||||
To instead collect all encountered errors into an array, while trying to continue parsing the rest of the source code,
|
||||
an instance of `ErrorHandler\Collecting` can be passed to the `Parser::parse()` method. A usage example:
|
||||
|
||||
```php
|
||||
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, null, array(
|
||||
'throwOnError' => false,
|
||||
));
|
||||
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7);
|
||||
$errorHandler = new PhpParser\ErrorHandler\Collecting;
|
||||
|
||||
$stmts = $parser->parse($code);
|
||||
$errors = $parser->getErrors();
|
||||
$stmts = $parser->parse($code, $errorHandler);
|
||||
|
||||
foreach ($errors as $error) {
|
||||
// $error is an ordinary PhpParser\Error
|
||||
if ($errorHandler->hasErrors()) {
|
||||
foreach ($errorHandler->getErrors() as $error) {
|
||||
// $error is an ordinary PhpParser\Error
|
||||
}
|
||||
}
|
||||
|
||||
if (null !== $stmts) {
|
||||
@ -74,4 +72,4 @@ if (null !== $stmts) {
|
||||
}
|
||||
```
|
||||
|
||||
The error recovery implementation is experimental -- it currently won't be able to recover from many types of errors.
|
||||
The `NameResolver` visitor also accepts an `ErrorHandler` as a constructor argument.
|
@ -95,13 +95,14 @@ Lexer extension
|
||||
|
||||
A lexer has to define the following public interface:
|
||||
|
||||
void startLexing(string $code);
|
||||
void startLexing(string $code, ErrorHandler $errorHandler = null);
|
||||
array getTokens();
|
||||
string handleHaltCompiler();
|
||||
int getNextToken(string &$value = null, array &$startAttributes = null, array &$endAttributes = null);
|
||||
|
||||
The `startLexing()` method is invoked with the source code that is to be lexed (including the opening tag) whenever the
|
||||
`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens.
|
||||
`parse()` method of the parser is called. It can be used to reset state or preprocess the source code or tokens. The
|
||||
passes `ErrorHandler` should be used to report lexing errors.
|
||||
|
||||
The `getTokens()` method returns the current token array, in the usual `token_get_all()` format. This method is not
|
||||
used by the parser (which uses `getNextToken()`), but is useful in combination with the token position attributes.
|
||||
@ -122,9 +123,10 @@ node and the `$endAttributes` from the last token that is part of the node.
|
||||
E.g. if the tokens `T_FUNCTION T_STRING ... '{' ... '}'` constitute a node, then the `$startAttributes` from the
|
||||
`T_FUNCTION` token will be taken and the `$endAttributes` from the `'}'` token.
|
||||
|
||||
An application of custom attributes is storing the original formatting of literals: The parser does not retain
|
||||
information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type or used
|
||||
escape sequences). This can be remedied by storing the original value in an attribute:
|
||||
An application of custom attributes is storing the exact original formatting of literals: While the parser does retain
|
||||
some information about the formatting of integers (like decimal vs. hexadecimal) or strings (like used quote type), it
|
||||
does not preserve the exact original formatting (e.g. leading zeros for integers or escape sequences in strings). This
|
||||
can be remedied by storing the original value in an attribute:
|
||||
|
||||
```php
|
||||
use PhpParser\Lexer;
|
||||
@ -135,9 +137,10 @@ class KeepOriginalValueLexer extends Lexer // or Lexer\Emulative
|
||||
public function getNextToken(&$value = null, &$startAttributes = null, &$endAttributes = null) {
|
||||
$tokenId = parent::getNextToken($value, $startAttributes, $endAttributes);
|
||||
|
||||
if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING // non-interpolated string
|
||||
|| $tokenId == Tokens::T_LNUMBER // integer
|
||||
|| $tokenId == Tokens::T_DNUMBER // floating point number
|
||||
if ($tokenId == Tokens::T_CONSTANT_ENCAPSED_STRING // non-interpolated string
|
||||
|| $tokenId == Tokens::T_ENCAPSED_AND_WHITESPACE // interpolated string
|
||||
|| $tokenId == Tokens::T_LNUMBER // integer
|
||||
|| $tokenId == Tokens::T_DNUMBER // floating point number
|
||||
) {
|
||||
// could also use $startAttributes, doesn't really matter here
|
||||
$endAttributes['originalValue'] = $value;
|
||||
|
Loading…
x
Reference in New Issue
Block a user