Add name resolution, pretty printing component docs

The docs are receiving too little love...
2025-04-21 06:22:12 +02:00 · 2017-10-03 19:09:27 +02:00 · f5f3b0d49d
commit f5f3b0d49d
parent f6cc85a796
5 changed files with 225 additions and 10 deletions
--- a/README.md
+++ b/README.md
@@ -90,8 +90,10 @@ Documentation

 Component documentation:

- 1. [Error handling](doc/component/Error_handling.markdown)
- 2. [Lexer](doc/component/Lexer.markdown)
+ * [Name resolution](doc/component/Name_resolution.markdown)
+ * [Pretty printing](doc/component/Pretty_printing.markdown)
+ * [Lexer](doc/component/Lexer.markdown)
+ * [Error handling](doc/component/Error_handling.markdown)

 [doc_2_x]: https://github.com/nikic/PHP-Parser/tree/2.x/doc
 [doc_3_x]: https://github.com/nikic/PHP-Parser/tree/3.x/doc
--- a/doc/README.md
+++ b/doc/README.md
@@ -0,0 +1,29 @@
+Table of Contents
+=================
+
+Guide
+-----
+
+  1. [Introduction](0_Introduction.markdown)
+  2. [Usage of basic components](2_Usage_of_basic_components.markdown)
+  3. [Other node tree representations](3_Other_node_tree_representations.markdown)
+  4. [Code generation](4_Code_generation.markdown)
+  5. [Frequently asked questions](5_FAQ.markdown)
+ 
+Component documentation
+-----------------------
+ 
+  * [Name resolution](component/Name_resolution.markdown)
+    * Name resolver options
+    * Name resolution context
+  * [Pretty printing](component/Pretty_printing.markdown)
+    * Converting AST back to PHP code
+    * Customizing formatting
+    * Formatting-preserving code transformations
+  * [Lexer](component/Lexer.markdown)
+    * Lexer options
+    * Token and file positions for nodes
+    * Custom attributes
+  * [Error handling](component/Error_handling.markdown)
+    * Column information for errors
+    * Error recovery (parsing of syntactically incorrect code)
--- a/doc/component/Lexer.markdown
+++ b/doc/component/Lexer.markdown
@@ -27,18 +27,23 @@ The attributes used in this example match the default behavior of the lexer. The

 * `comments`: Array of `PhpParser\Comment` or `PhpParser\Comment\Doc` instances, representing all comments that occurred
   between the previous non-discarded token and the current one. Use of this attribute is required for the
-   `$node->getDocComment()` method to work. The attribute is also needed if you wish the pretty printer to retain
-   comments present in the original code.
+   `$node->getComments()` and `$node->getDocComment()` methods to work. The attribute is also needed if you wish the pretty
+   printer to retain comments present in the original code.
 * `startLine`: Line in which the node starts. This attribute is required for the `$node->getLine()` to work. It is also
   required if syntax errors should contain line number information.
- * `endLine`: Line in which the node ends.
- * `startTokenPos`: Offset into the token array of the first token in the node.
- * `endTokenPos`: Offset into the token array of the last token in the node.
- * `startFilePos`: Offset into the code string of the first character that is part of the node.
- * `endFilePos`: Offset into the code string of the last character that is part of the node.
+ * `endLine`: Line in which the node ends. Required for `$node->getEndLine()`.
+ * `startTokenPos`: Offset into the token array of the first token in the node. Required for `$node->getStartTokenPos()`.
+ * `endTokenPos`: Offset into the token array of the last token in the node. Required for `$node->getEndTokenPos()`.
+ * `startFilePos`: Offset into the code string of the first character that is part of the node. Required for `$node->getStartFilePos()`.
+ * `endFilePos`: Offset into the code string of the last character that is part of the node. Required for `$node->getEndFilePos()`.

 ### Using token positions

+> **Note:** The example in this section is outdated in that this information is directly available in the AST: While
+> `$property->isPublic()` does not distinguish between `public` and `var`, directly checking whether
+> `($property->flags & Class_::VISIBILITY_MODIFIER_MASK) === 0` allows making this distinction without resorting to
+> tokens. However, the general idea behind the example still applies in other cases.
+
 The token offset information is useful if you wish to examine the exact formatting used for a node. For example the AST
 does not distinguish whether a property was declared using `public` or using `var`, but you can retrieve this
 information based on the token position:
@@ -72,7 +77,7 @@ $lexer = new PhpParser\Lexer(array(
        'comments', 'startLine', 'endLine', 'startTokenPos', 'endTokenPos'
    )
 ));
-$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, $lexer);
+$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7, $lexer);

 $visitor = new MyNodeVisitor();
 $traverser = new PhpParser\NodeTraverser();
--- a/doc/component/Name_resolution.markdown
+++ b/doc/component/Name_resolution.markdown
@@ -0,0 +1,87 @@
+Name resolution
+===============
+
+Since the introduction of namespaces in PHP 5.3, literal names in PHP code are subject to a
+relatively complex name resolution process, which is based on the current namespace, the current
+import table state, as well as the type of the referenced symbol. PHP-Parser implements name
+resolution and related functionality, both as reusable logic (NameContext), as well as a node
+visitor (NameResolver) based on it.
+
+The NameResolver visitor
+------------------------
+
+The `NameResolver` visitor can be (and for nearly all uses of the AST, should be) applied to
+resolve names to their fully-qualified form, to the degree that this is possible.
+
+```php
+$nameResolver = new PhpParser\NodeVisitor\NameResolver;
+$nodeTraverser = new PhpParser\NodeTraverser;
+$nodeTraverser->addVisitor($nameResolver);
+
+// Resolve names
+$stmts = $nodeTraverser->traverse($stmts);
+```
+
+In the default configuration, the name resolver will perform three actions:
+
+ * Declarations of functions, classes, interfaces, traits and global constants will have a
+   `namespacedName` property added, which contains the function/class/etc name including the
+   namespace prefix. For historic reasons this is a **property** rather than an attribute.
+ * Names will be replaced by fully qualified resolved names, which are instances of
+   `Node\Name\FullyQualified`.
+ * Unqualified function and constant names inside a namespace cannot be statically resolved. Inside
+   a namespace `Foo`, a call to `strlen()` may either refer to the namespaced `\Foo\strlen()`, or
+   the global `\strlen()`. Because PHP-Parser does not have the necessary context to decide this,
+   such names are left unresolved. Additionally a `namespacedName` **attribute** is added to the
+   name node.
+
+The name resolver accepts an option array as the second argument, with the following default values:
+
+```php
+$nameResolver = new PhpParser\NodeVisitor\NameResolver(null, [
+    'preserveOriginalNames' => false,
+    'replaceNodes' => true,
+]);
+```
+
+If the `preserveOriginalNames` option is enabled, then the resolved (fully qualified) name will have
+an `originalName` attribute, which contains the unresolved name.
+
+If the `replaceNodes` option is disabled, then names will no longer be resolved in-place. Instead a
+`resolvedName` attribute will be added to each name, which contains the resolved (fully qualified)
+name. Once again, if an unqualified function or constant name cannot be resolved, then the
+`resolvedName` attribute will not be present, and instead a `namespacedName` attribute is added.
+
+Disabling the `replaceNodes` option is useful if you wish to perform modifications on the AST, as
+you probably do not wish the resulting code to have fully resolved names as a side-effect.
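+
+As a rough illustration of the two attributes, the following sketch runs the resolver with
+`replaceNodes` disabled and records what was attached to each name. The sample code, the
+`$collector` visitor and the `vendor/autoload.php` path are assumptions made for this example.
+
+```php
+<?php
+require 'vendor/autoload.php'; // assumed Composer autoloader location
+
+use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};
+use PhpParser\NodeVisitor\NameResolver;
+
+$code = '<?php namespace App; use Lib\Util; new Util; strlen("x");';
+$stmts = (new ParserFactory)->create(ParserFactory::ONLY_PHP7)->parse($code);
+
+// Hypothetical helper visitor: records the attribute the resolver attached to each name.
+$collector = new class extends NodeVisitorAbstract {
+    public $seen = [];
+    public function enterNode(Node $node) {
+        if (!$node instanceof Node\Name) {
+            return;
+        }
+        foreach (['resolvedName', 'namespacedName'] as $attr) {
+            if ($node->hasAttribute($attr)) {
+                $this->seen[$node->toString()] = [$attr, $node->getAttribute($attr)->toString()];
+            }
+        }
+    }
+};
+
+$traverser = new NodeTraverser;
+$traverser->addVisitor(new NameResolver(null, ['replaceNodes' => false]));
+$traverser->addVisitor($collector); // runs after the resolver on every node
+$traverser->traverse($stmts);
+
+// Util should carry a resolvedName attribute, while the unqualified strlen
+// should only carry a namespacedName attribute.
+var_dump($collector->seen);
+```
+
+The name nodes themselves stay untouched, so pretty printing the tree afterwards would still emit
+`Util` and `strlen` as written.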
+
+The NameContext
+---------------
+
+The actual name resolution logic is implemented in the `NameContext` class, which has the following
+public API:
+
+```php
+class NameContext {
+    public function __construct(ErrorHandler $errorHandler);
+    public function startNamespace(Name $namespace = null);
+    public function addAlias(Name $name, string $aliasName, int $type, array $errorAttrs = []);
+
+    public function getNamespace();
+    public function getResolvedName(Name $name, int $type);
+    public function getResolvedClassName(Name $name) : Name;
+    public function getPossibleNames(string $name, int $type) : array;
+    public function getShortName(string $name, int $type) : Name;
+}
+```
+
+The `$type` parameters accept one of the `Stmt\Use_::TYPE_*` constants, which represent the three
+basic symbol types in PHP (functions, constants and everything else).
+
+Next to name resolution, the `NameContext` also supports the reverse operation of finding a short
+representation of a name given the current name resolution environment.
+
+The name context is intended to be used for name resolution operations outside the AST itself, such
+as class names inside doc comments. A visitor running in parallel with the name resolver can access
+the name context using `$nameResolver->getNameContext()`. Alternatively a visitor can use an
+independent context and explicitly feed `Namespace` and `Use` nodes to it.
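+
+A minimal sketch of an independent context, based only on the API listed above: it is fed by hand
+rather than from `Namespace` and `Use` nodes, and then resolves a class name as it might appear in
+a doc comment. The namespace `App`, the alias `Util` and the autoloader path are assumptions for
+illustration.
+
+```php
+<?php
+require 'vendor/autoload.php'; // assumed Composer autoloader location
+
+use PhpParser\ErrorHandler;
+use PhpParser\NameContext;
+use PhpParser\Node\Name;
+use PhpParser\Node\Stmt\Use_;
+
+$context = new NameContext(new ErrorHandler\Throwing);
+
+// Equivalent of "namespace App;" followed by "use Lib\Util;"
+$context->startNamespace(new Name('App'));
+$context->addAlias(new Name('Lib\Util'), 'Util', Use_::TYPE_NORMAL);
+
+// Resolve a class name found outside the AST, e.g. in "@param Util $util"
+$resolved = $context->getResolvedClassName(new Name('Util'));
+echo $resolved->toString(), "\n"; // Lib\Util
+
+// Reverse operation: a short form of the name that is valid in this context
+$short = $context->getShortName('Lib\Util', Use_::TYPE_NORMAL);
+echo $short->toString(), "\n";
+```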
--- a/doc/component/Pretty_printing.markdown
+++ b/doc/component/Pretty_printing.markdown
@@ -0,0 +1,92 @@
+Pretty printing
+===============
+
+Pretty printing is the process of converting a syntax tree back to PHP code. In its basic mode of
+operation the pretty printer provided by this library will print the AST using a certain predefined
+code style and will discard (nearly) all formatting of the original code. Because programmers tend
+to be rather picky about their code formatting, this mode of operation is not very suitable for
+refactoring code, but can be used for automatically generated code, which is usually only read for
+debugging purposes.
+
+Basic usage
+-----------
+
+```php
+$stmts = $parser->parse($code);
+
+// MODIFY $stmts here
+
+$prettyPrinter = new PhpParser\PrettyPrinter\Standard;
+$newCode = $prettyPrinter->prettyPrintFile($stmts);
+```
+
+The pretty printer has three basic printing methods: `prettyPrint()`, `prettyPrintFile()` and
+`prettyPrintExpr()`. The one that is most commonly useful is `prettyPrintFile()`, which takes an
+array of statements and produces a full PHP file, including opening `<?php`.
+
+`prettyPrint()` also takes a statement array, but produces code which is valid inside an already
+open `<?php` context. Lastly, `prettyPrintExpr()` takes an `Expr` node and prints only a single
+expression.
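+
+To make the difference between the three methods concrete, here is a small sketch that prints the
+same hand-built nodes with each of them. The nodes and the autoloader path are assumptions chosen
+for the example; the node class names are those of PHP-Parser 4.x.
+
+```php
+<?php
+require 'vendor/autoload.php'; // assumed Composer autoloader location
+
+use PhpParser\Node\Expr\BinaryOp\Plus;
+use PhpParser\Node\Expr\Variable;
+use PhpParser\Node\Scalar\LNumber;
+use PhpParser\Node\Stmt\Echo_;
+use PhpParser\PrettyPrinter;
+
+$printer = new PrettyPrinter\Standard;
+
+// Expression node for: $a + 1
+$expr = new Plus(new Variable('a'), new LNumber(1));
+$stmts = [new Echo_([$expr])];
+
+$exprCode = $printer->prettyPrintExpr($expr);  // $a + 1
+$stmtCode = $printer->prettyPrint($stmts);     // echo $a + 1;
+$fileCode = $printer->prettyPrintFile($stmts); // same statement, preceded by the opening <?php
+
+echo $exprCode, "\n", $stmtCode, "\n", $fileCode, "\n";
+```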
+
+Customizing the formatting
+--------------------------
+
+Apart from a `shortArraySyntax` option, the default pretty printer does not provide any
+functionality to customize the formatting of the generated code. The pretty printer does respect a
+number of `kind` attributes used by some nodes (e.g., whether an integer should be printed as
+decimal, hexadecimal, etc.), but there are no options to control brace placement or similar.
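+
+As a brief sketch of the `shortArraySyntax` option, the snippet below prints the same hand-built
+array node with and without it. The array node carries no `kind` attribute, so the option decides
+the syntax; the node contents and the autoloader path are assumptions for illustration.
+
+```php
+<?php
+require 'vendor/autoload.php'; // assumed Composer autoloader location
+
+use PhpParser\Node\Expr\Array_;
+use PhpParser\Node\Expr\ArrayItem;
+use PhpParser\Node\Scalar\LNumber;
+use PhpParser\PrettyPrinter;
+
+// Array node built by hand, so it has no "kind" attribute from the parser
+$array = new Array_([
+    new ArrayItem(new LNumber(1)),
+    new ArrayItem(new LNumber(2)),
+]);
+
+$long = (new PrettyPrinter\Standard)->prettyPrintExpr($array);
+$short = (new PrettyPrinter\Standard(['shortArraySyntax' => true]))->prettyPrintExpr($array);
+
+echo $long, "\n";  // array(1, 2)
+echo $short, "\n"; // [1, 2]
+```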
+
+If you want to make minor changes to the formatting, the easiest way is to extend the pretty printer
+and override the methods responsible for the node types you are interested in.
+
+If you want to have more fine-grained formatting control, the recommended method is to combine the
+default pretty printer with an existing library for code reformatting, such as
+[PHP-CS-Fixer](https://github.com/FriendsOfPHP/PHP-CS-Fixer).
+
+Formatting-preserving pretty printing
+-------------------------------------
+
+> **Note:** This functionality is **experimental** and not yet complete.
+
+For automated code refactoring, migration and similar, you will usually only want to modify a small
+portion of the code and leave the remainder alone. The basic pretty printer is not suitable for
+this, because it will also reformat parts of the code that have not been modified.
+
+Since PHP-Parser 4.0 an experimental formatting-preserving pretty-printing mode is available, which
+attempts to preserve the formatting of code whose AST nodes have not changed, and only reformats
+code which has been modified or newly inserted.
+
+Use of the formatting-preservation functionality currently requires some additional preparatory
+steps:
+
+```php
+use PhpParser\{Lexer, NodeTraverser, NodeVisitor, Parser, PrettyPrinter};
+
+$lexer = new Lexer\Emulative([
+    'usedAttributes' => [
+        'comments',
+        'startLine', 'endLine',
+        'startTokenPos', 'endTokenPos',
+    ],
+]);
+$parser = new Parser\Php7($lexer);
+
+$traverser = new NodeTraverser();
+$traverser->addVisitor(new NodeVisitor\CloningVisitor());
+
+$printer = new PrettyPrinter\Standard();
+
+$oldStmts = $parser->parse($code);
+$oldTokens = $lexer->getTokens();
+
+$newStmts = $traverser->traverse($oldStmts);
+
+// MODIFY $newStmts HERE
+
+$newCode = $printer->printFormatPreserving($newStmts, $oldStmts, $oldTokens);
+```
+
+This functionality is experimental and not yet fully implemented. It should not produce incorrect
+code, but it may sometimes reformat more code than necessary. Open issues are tracked in
+[issue #344](https://github.com/nikic/PHP-Parser/issues/344). If you encounter problems while using
+this functionality, please open an issue, so we know what to prioritize.