Add name resolution, pretty printing component docs

The docs are receiving too little love...
Nikita Popov 2017-10-03 19:09:27 +02:00
parent f6cc85a796
commit f5f3b0d49d
5 changed files with 225 additions and 10 deletions


@ -90,8 +90,10 @@ Documentation
Component documentation:
1. [Error handling](doc/component/Error_handling.markdown)
2. [Lexer](doc/component/Lexer.markdown)
* [Name resolution](doc/component/Name_resolution.markdown)
* [Pretty printing](doc/component/Pretty_printing.markdown)
* [Lexer](doc/component/Lexer.markdown)
* [Error handling](doc/component/Error_handling.markdown)
[doc_2_x]: https://github.com/nikic/PHP-Parser/tree/2.x/doc
[doc_3_x]: https://github.com/nikic/PHP-Parser/tree/3.x/doc

29
doc/README.md Normal file

@ -0,0 +1,29 @@
Table of Contents
=================
Guide
-----
1. [Introduction](0_Introduction.markdown)
2. [Usage of basic components](2_Usage_of_basic_components.markdown)
3. [Other node tree representations](3_Other_node_tree_representations.markdown)
4. [Code generation](4_Code_generation.markdown)
5. [Frequently asked questions](5_FAQ.markdown)
Component documentation
-----------------------
* [Name resolution](component/Name_resolution.markdown)
* Name resolver options
* Name resolution context
* [Pretty printing](component/Pretty_printing.markdown)
* Converting AST back to PHP code
* Customizing formatting
* Formatting-preserving code transformations
* [Lexer](component/Lexer.markdown)
* Lexer options
* Token and file positions for nodes
* Custom attributes
* [Error handling](component/Error_handling.markdown)
* Column information for errors
* Error recovery (parsing of syntactically incorrect code)


@ -27,18 +27,23 @@ The attributes used in this example match the default behavior of the lexer. The
* `comments`: Array of `PhpParser\Comment` or `PhpParser\Comment\Doc` instances, representing all comments that occurred
between the previous non-discarded token and the current one. Use of this attribute is required for the
`$node->getDocComment()` method to work. The attribute is also needed if you wish the pretty printer to retain
comments present in the original code.
`$node->getComments()` and `$node->getDocComment()` methods to work. The attribute is also needed if you wish the pretty
printer to retain comments present in the original code.
* `startLine`: Line in which the node starts. This attribute is required for `$node->getLine()` to work. It is also
required if syntax errors should contain line number information.
* `endLine`: Line in which the node ends.
* `startTokenPos`: Offset into the token array of the first token in the node.
* `endTokenPos`: Offset into the token array of the last token in the node.
* `startFilePos`: Offset into the code string of the first character that is part of the node.
* `endFilePos`: Offset into the code string of the last character that is part of the node.
* `endLine`: Line in which the node ends. Required for `$node->getEndLine()`.
* `startTokenPos`: Offset into the token array of the first token in the node. Required for `$node->getStartTokenPos()`.
* `endTokenPos`: Offset into the token array of the last token in the node. Required for `$node->getEndTokenPos()`.
* `startFilePos`: Offset into the code string of the first character that is part of the node. Required for `$node->getStartFilePos()`.
* `endFilePos`: Offset into the code string of the last character that is part of the node. Required for `$node->getEndFilePos()`.
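As a rough illustration of the file position attributes, the following sketch recovers the original source text of an expression. It assumes a Composer installation of PHP-Parser 4.x; the sample code string and the variable names are made up for this example.

```php
<?php
require 'vendor/autoload.php'; // assumption: nikic/php-parser 4.x installed via Composer

use PhpParser\Lexer;
use PhpParser\ParserFactory;

// Only request the attributes this example needs.
$lexer = new Lexer(['usedAttributes' => ['startFilePos', 'endFilePos']]);
$parser = (new ParserFactory)->create(ParserFactory::ONLY_PHP7, $lexer);

$code = '<?php echo 1 + 2;';
$stmts = $parser->parse($code);

// First expression of the echo statement.
$expr = $stmts[0]->exprs[0];

// endFilePos points at the last character, so the length is end - start + 1.
$start = $expr->getStartFilePos();
$source = substr($code, $start, $expr->getEndFilePos() - $start + 1);
```

Because `endFilePos` is inclusive, forgetting the `+ 1` would cut off the last character of the node.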
### Using token positions
> **Note:** The example in this section is outdated in that this information is directly available in the AST: While
> `$property->isPublic()` does not distinguish between `public` and `var`, checking whether
> `($property->flags & Class_::VISIBILITY_MODIFIER_MASK) === 0` allows making this distinction without resorting to
> tokens. However, the general idea behind the example still applies in other cases.
The token offset information is useful if you wish to examine the exact formatting used for a node. For example the AST
does not distinguish whether a property was declared using `public` or using `var`, but you can retrieve this
information based on the token position:
@ -72,7 +77,7 @@ $lexer = new PhpParser\Lexer(array(
'comments', 'startLine', 'endLine', 'startTokenPos', 'endTokenPos'
)
));
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, $lexer);
$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7, $lexer);
$visitor = new MyNodeVisitor();
$traverser = new PhpParser\NodeTraverser();


@ -0,0 +1,87 @@
Name resolution
===============
Since the introduction of namespaces in PHP 5.3, literal names in PHP code are subject to a
relatively complex name resolution process, which is based on the current namespace, the current
import table state, as well as the type of the referenced symbol. PHP-Parser implements name
resolution and related functionality, both as reusable logic (`NameContext`) and as a node
visitor (`NameResolver`) based on it.
The NameResolver visitor
------------------------
The `NameResolver` visitor can (and for nearly all uses of the AST, should) be applied to resolve
names to their fully-qualified form, to the degree that this is possible.
```php
$nameResolver = new PhpParser\NodeVisitor\NameResolver;
$nodeTraverser = new PhpParser\NodeTraverser;
$nodeTraverser->addVisitor($nameResolver);
// Resolve names
$stmts = $nodeTraverser->traverse($stmts);
```
In the default configuration, the name resolver will perform three actions:
* Declarations of functions, classes, interfaces, traits and global constants will have a
`namespacedName` property added, which contains the function/class/etc name including the
namespace prefix. For historic reasons this is a **property** rather than an attribute.
* Names will be replaced by fully qualified resolved names, which are instances of
`Node\Name\FullyQualified`.
* Unqualified function and constant names inside a namespace cannot be statically resolved. Inside
a namespace `Foo`, a call to `strlen()` may either refer to the namespaced `\Foo\strlen()`, or
the global `\strlen()`. Because PHP-Parser does not have the necessary context to decide this,
such names are left unresolved. Additionally a `namespacedName` **attribute** is added to the
name node.
The name resolver accepts an option array as the second argument, with the following default values:
```php
$nameResolver = new PhpParser\NodeVisitor\NameResolver(null, [
'preserveOriginalNames' => false,
'replaceNodes' => true,
]);
```
If the `preserveOriginalNames` option is enabled, then the resolved (fully qualified) name will have
an `originalName` attribute, which contains the unresolved name.
If the `replaceNodes` option is disabled, then names will no longer be resolved in-place. Instead a
`resolvedName` attribute will be added to each name, which contains the resolved (fully qualified)
name. Once again, if an unqualified function or constant name cannot be resolved, then the
`resolvedName` attribute will not be present, and instead a `namespacedName` attribute is added.
Disabling the `replaceNodes` option is useful if you wish to perform modifications on the AST, as
you probably do not wish the resulting code to have fully resolved names as a side-effect.
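A minimal sketch of reading the `resolvedName` attribute with `replaceNodes` disabled could look as follows. It assumes a Composer installation of PHP-Parser 4.x; the sample code and the collecting visitor are hypothetical and only serve as illustration.

```php
<?php
require 'vendor/autoload.php'; // assumption: nikic/php-parser 4.x installed via Composer

use PhpParser\Node;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

$code = <<<'CODE'
<?php
namespace App;
use Lib\Helper;
new Helper();
CODE;

$parser = (new ParserFactory)->create(ParserFactory::ONLY_PHP7);
$stmts = $parser->parse($code);

// Hypothetical visitor: records name as written => resolved name.
$collector = new class extends NodeVisitorAbstract {
    public $resolved = [];

    public function enterNode(Node $node) {
        if ($node instanceof Node\Name && $node->hasAttribute('resolvedName')) {
            $this->resolved[$node->toString()]
                = $node->getAttribute('resolvedName')->toString();
        }
    }
};

$traverser = new NodeTraverser;
// The resolver must be added first, so the attribute is set before the collector runs.
$traverser->addVisitor(new NameResolver(null, ['replaceNodes' => false]));
$traverser->addVisitor($collector);
$traverser->traverse($stmts);
```

The `Name` nodes in the AST keep their original spelling (`Helper`), so printing the tree afterwards does not introduce fully qualified names.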
The NameContext
---------------
The actual name resolution logic is implemented in the `NameContext` class, which has the following
public API:
```php
class NameContext {
public function __construct(ErrorHandler $errorHandler);
public function startNamespace(Name $namespace = null);
public function addAlias(Name $name, string $aliasName, int $type, array $errorAttrs = []);
public function getNamespace();
public function getResolvedName(Name $name, int $type);
public function getResolvedClassName(Name $name) : Name;
public function getPossibleNames(string $name, int $type) : array;
public function getShortName(string $name, int $type) : Name;
}
```
The `$type` parameters accept one of the `Stmt\Use_::TYPE_*` constants, which represent the three
basic symbol types in PHP (functions, constants and everything else).
Next to name resolution, the `NameContext` also supports the reverse operation of finding a short
representation of a name given the current name resolution environment.
The name context is intended to be used for name resolution operations outside the AST itself, such
as class names inside doc comments. A visitor running in parallel with the name resolver can access
the name context using `$nameResolver->getNameContext()`. Alternatively a visitor can use an
independent context and explicitly feed `Namespace` and `Use` nodes to it.
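The following sketch uses an independent `NameContext` that is fed by hand, as one might do when resolving class names found in doc comments. It assumes a Composer installation of PHP-Parser 4.x; the namespace, alias and class names are made up for this example.

```php
<?php
require 'vendor/autoload.php'; // assumption: nikic/php-parser 4.x installed via Composer

use PhpParser\ErrorHandler;
use PhpParser\NameContext;
use PhpParser\Node\Name;
use PhpParser\Node\Stmt\Use_;

$context = new NameContext(new ErrorHandler\Throwing);

// Equivalent of "namespace App;" followed by "use Lib\Helper;".
$context->startNamespace(new Name('App'));
$context->addAlias(new Name('Lib\Helper'), 'Helper', Use_::TYPE_NORMAL);

// Resolved through the import table.
$imported = $context->getResolvedClassName(new Name('Helper'));

// Not imported, so resolved relative to the current namespace.
$local = $context->getResolvedClassName(new Name('Model'));

// Reverse operation: shortest way to write Lib\Helper in this context.
$short = $context->getShortName('Lib\Helper', Use_::TYPE_NORMAL);
```

Note that `startNamespace()` is called before `addAlias()`, mirroring the order in which the corresponding statements appear in PHP code.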


@ -0,0 +1,92 @@
Pretty printing
===============
Pretty printing is the process of converting a syntax tree back to PHP code. In its basic mode of
operation the pretty printer provided by this library will print the AST using a certain predefined
code style and will discard (nearly) all formatting of the original code. Because programmers tend
to be rather picky about their code formatting, this mode of operation is not very suitable for
refactoring code, but can be used for automatically generated code, which is usually only read for
debugging purposes.
Basic usage
-----------
```php
$stmts = $parser->parse($code);
// MODIFY $stmts here
$prettyPrinter = new PhpParser\PrettyPrinter\Standard;
$newCode = $prettyPrinter->prettyPrintFile($stmts);
```
The pretty printer has three basic printing methods: `prettyPrint()`, `prettyPrintFile()` and
`prettyPrintExpr()`. The one that is most commonly useful is `prettyPrintFile()`, which takes an
array of statements and produces a full PHP file, including the opening `<?php` tag.
`prettyPrint()` also takes a statement array, but produces code which is valid inside an already
open `<?php` context. Lastly, `prettyPrintExpr()` takes an `Expr` node and prints only a single
expression.
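To make the difference between the three methods concrete, here is a small sketch that prints the same parsed code in all three ways. It assumes a Composer installation of PHP-Parser 4.x; the input string is made up for this example.

```php
<?php
require 'vendor/autoload.php'; // assumption: nikic/php-parser 4.x installed via Composer

use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$parser = (new ParserFactory)->create(ParserFactory::ONLY_PHP7);
$stmts = $parser->parse('<?php echo 1 + 2;');

$printer = new PrettyPrinter\Standard;

// Full file, including the opening tag.
$file = $printer->prettyPrintFile($stmts);

// Statements only, for use inside an already open PHP context.
$body = $printer->prettyPrint($stmts);

// A single expression: the first expression of the echo statement.
$expr = $printer->prettyPrintExpr($stmts[0]->exprs[0]);
```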
Customizing the formatting
--------------------------
Apart from a `shortArraySyntax` option, the default pretty printer does not provide any
functionality to customize the formatting of the generated code. The pretty printer does respect a
number of `kind` attributes used by some nodes (e.g., whether an integer should be printed as
decimal, hexadecimal, etc), but there are no options to control brace placement or similar.
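The interaction between the `shortArraySyntax` option and the `kind` attribute can be sketched as follows. The example assumes a Composer installation of PHP-Parser 4.x and builds the array node by hand, since nodes produced by the parser typically already carry a `kind` attribute.

```php
<?php
require 'vendor/autoload.php'; // assumption: nikic/php-parser 4.x installed via Composer

use PhpParser\Node\Expr;
use PhpParser\Node\Scalar\LNumber;
use PhpParser\PrettyPrinter;

// Manually constructed node without a "kind" attribute.
$array = new Expr\Array_([
    new Expr\ArrayItem(new LNumber(1)),
    new Expr\ArrayItem(new LNumber(2)),
]);

$defaultPrinter = new PrettyPrinter\Standard;
$shortPrinter = new PrettyPrinter\Standard(['shortArraySyntax' => true]);

// Without a "kind" attribute, the option decides the syntax.
$long = $defaultPrinter->prettyPrintExpr($array);
$short = $shortPrinter->prettyPrintExpr($array);

// An explicit "kind" attribute takes precedence over the option.
$array->setAttribute('kind', Expr\Array_::KIND_LONG);
$forced = $shortPrinter->prettyPrintExpr($array);
```

In other words, the option acts as a default for nodes that do not specify their own array syntax.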
If you want to make minor changes to the formatting, the easiest way is to extend the pretty printer
and override the methods responsible for the node types you are interested in.
If you want to have more fine-grained formatting control, the recommended method is to combine the
default pretty printer with an existing library for code reformatting, such as
[PHP-CS-Fixer](https://github.com/FriendsOfPHP/PHP-CS-Fixer).
Formatting-preserving pretty printing
-------------------------------------
> **Note:** This functionality is **experimental** and not yet complete.
For automated code refactoring, migration and similar, you will usually only want to modify a small
portion of the code and leave the remainder alone. The basic pretty printer is not suitable for
this, because it will also reformat parts of the code that have not been modified.
Since PHP-Parser 4.0 an experimental formatting-preserving pretty-printing mode is available, which
attempts to preserve the formatting of code whose AST nodes have not changed, and only reformat
code which has been modified or newly inserted.
Use of the formatting-preservation functionality currently requires some additional preparatory
steps:
```php
use PhpParser\{Lexer, NodeTraverser, NodeVisitor, Parser, PrettyPrinter};
$lexer = new Lexer\Emulative([
'usedAttributes' => [
'comments',
'startLine', 'endLine',
'startTokenPos', 'endTokenPos',
],
]);
$parser = new Parser\Php7($lexer);
$traverser = new NodeTraverser();
$traverser->addVisitor(new NodeVisitor\CloningVisitor());
$printer = new PrettyPrinter\Standard();
$oldStmts = $parser->parse($code);
$oldTokens = $lexer->getTokens();
$newStmts = $traverser->traverse($oldStmts);
// MODIFY $newStmts HERE
$newCode = $printer->printFormatPreserving($newStmts, $oldStmts, $oldTokens);
```
This functionality is experimental and not yet fully implemented. It should not produce incorrect
code, but it may sometimes reformat more code than necessary. Open issues are tracked in
[issue #344](https://github.com/nikic/PHP-Parser/issues/344). If you encounter problems while using
this functionality, please open an issue, so we know what to prioritize.