diff --git a/README.md b/README.md
index 7b0a28b7..9dc59000 100644
--- a/README.md
+++ b/README.md
@@ -90,8 +90,10 @@ Documentation
 
 Component documentation:
 
-  1. [Error handling](doc/component/Error_handling.markdown)
-  2. [Lexer](doc/component/Lexer.markdown)
+  * [Name resolution](doc/component/Name_resolution.markdown)
+  * [Pretty printing](doc/component/Pretty_printing.markdown)
+  * [Lexer](doc/component/Lexer.markdown)
+  * [Error handling](doc/component/Error_handling.markdown)
 
  [doc_2_x]: https://github.com/nikic/PHP-Parser/tree/2.x/doc
  [doc_3_x]: https://github.com/nikic/PHP-Parser/tree/3.x/doc
diff --git a/doc/README.md b/doc/README.md
new file mode 100644
index 00000000..f65041b3
--- /dev/null
+++ b/doc/README.md
@@ -0,0 +1,29 @@
+Table of Contents
+=================
+
+Guide
+-----
+
+  1. [Introduction](0_Introduction.markdown)
+  2. [Usage of basic components](2_Usage_of_basic_components.markdown)
+  3. [Other node tree representations](3_Other_node_tree_representations.markdown)
+  4. [Code generation](4_Code_generation.markdown)
+  5. [Frequently asked questions](5_FAQ.markdown)
+
+Component documentation
+-----------------------
+
+  * [Name resolution](component/Name_resolution.markdown)
+    * Name resolver options
+    * Name resolution context
+  * [Pretty printing](component/Pretty_printing.markdown)
+    * Converting AST back to PHP code
+    * Customizing formatting
+    * Formatting-preserving code transformations
+  * [Lexer](component/Lexer.markdown)
+    * Lexer options
+    * Token and file positions for nodes
+    * Custom attributes
+  * [Error handling](component/Error_handling.markdown)
+    * Column information for errors
+    * Error recovery (parsing of syntactically incorrect code)
diff --git a/doc/component/Lexer.markdown b/doc/component/Lexer.markdown
index b0913b5d..399c92b9 100644
--- a/doc/component/Lexer.markdown
+++ b/doc/component/Lexer.markdown
@@ -27,18 +27,23 @@ The attributes used in this example match the default behavior of the lexer. The
 
  * `comments`: Array of `PhpParser\Comment` or `PhpParser\Comment\Doc` instances, representing all comments that occurred
    between the previous non-discarded token and the current one. Use of this attribute is required for the
-   `$node->getDocComment()` method to work. The attribute is also needed if you wish the pretty printer to retain
-   comments present in the original code.
+   `$node->getComments()` and `$node->getDocComment()` methods to work. The attribute is also needed if you wish the pretty
+   printer to retain comments present in the original code.
  * `startLine`: Line in which the node starts. This attribute is required for the `$node->getLine()` to work. It is also
    required if syntax errors should contain line number information.
- * `endLine`: Line in which the node ends.
- * `startTokenPos`: Offset into the token array of the first token in the node.
- * `endTokenPos`: Offset into the token array of the last token in the node.
- * `startFilePos`: Offset into the code string of the first character that is part of the node.
- * `endFilePos`: Offset into the code string of the last character that is part of the node.
+ * `endLine`: Line in which the node ends. Required for `$node->getEndLine()`.
+ * `startTokenPos`: Offset into the token array of the first token in the node. Required for `$node->getStartTokenPos()`.
+ * `endTokenPos`: Offset into the token array of the last token in the node. Required for `$node->getEndTokenPos()`.
+ * `startFilePos`: Offset into the code string of the first character that is part of the node. Required for `$node->getStartFilePos()`.
+ * `endFilePos`: Offset into the code string of the last character that is part of the node. Required for `$node->getEndFilePos()`.
 
 ### Using token positions
 
+> **Note:** The example in this section is outdated in that this information is directly available in the AST: While
+> `$property->isPublic()` does not distinguish between `public` and `var`, directly checking
+> `($property->flags & Class_::VISIBILITY_MODIFIER_MASK) === 0` allows making this distinction without resorting to
+> tokens. However, the general idea behind the example still applies in other cases.
+
 The token offset information is useful if you wish to examine the exact formatting used for a node. For example the AST
 does not distinguish whether a property was declared using `public` or using `var`, but you can retrieve this
 information based on the token position:
@@ -72,7 +77,7 @@ $lexer = new PhpParser\Lexer(array(
         'comments', 'startLine', 'endLine', 'startTokenPos', 'endTokenPos'
     )
 ));
-$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::PREFER_PHP7, $lexer);
+$parser = (new PhpParser\ParserFactory)->create(PhpParser\ParserFactory::ONLY_PHP7, $lexer);
 
 $visitor = new MyNodeVisitor();
 $traverser = new PhpParser\NodeTraverser();
diff --git a/doc/component/Name_resolution.markdown b/doc/component/Name_resolution.markdown
new file mode 100644
index 00000000..2a7eb603
--- /dev/null
+++ b/doc/component/Name_resolution.markdown
@@ -0,0 +1,87 @@
+Name resolution
+===============
+
+Since the introduction of namespaces in PHP 5.3, literal names in PHP code are subject to a
+relatively complex name resolution process, which is based on the current namespace, the current
+import table state, as well as the type of the referenced symbol. PHP-Parser implements name
+resolution and related functionality, both as reusable logic (NameContext), as well as a node
+visitor (NameResolver) based on it.
+
+The NameResolver visitor
+------------------------
+
+The `NameResolver` visitor can (and for nearly all uses of the AST, should) be applied to resolve names
+to their fully-qualified form, to the degree that this is possible.
+
+```php
+$nameResolver = new PhpParser\NodeVisitor\NameResolver;
+$nodeTraverser = new PhpParser\NodeTraverser;
+$nodeTraverser->addVisitor($nameResolver);
+
+// Resolve names
+$stmts = $nodeTraverser->traverse($stmts);
+```
+
+In the default configuration, the name resolver will perform three actions:
+
+ * Declarations of functions, classes, interfaces, traits and global constants will have a
+   `namespacedName` property added, which contains the function/class/etc name including the
+   namespace prefix. For historical reasons this is a **property** rather than an attribute.
+ * Names will be replaced by fully qualified resolved names, which are instances of
+   `Node\Name\FullyQualified`.
+ * Unqualified function and constant names inside a namespace cannot be statically resolved. Inside
+   a namespace `Foo`, a call to `strlen()` may either refer to the namespaced `\Foo\strlen()`, or
+   the global `\strlen()`. Because PHP-Parser does not have the necessary context to decide this,
+   such names are left unresolved. Additionally a `namespacedName` **attribute** is added to the
+   name node.
+
+The name resolver accepts an option array as the second argument, with the following default values:
+
+```php
+$nameResolver = new PhpParser\NodeVisitor\NameResolver(null, [
+    'preserveOriginalNames' => false,
+    'replaceNodes' => true,
+]);
+```
+
+If the `preserveOriginalNames` option is enabled, then the resolved (fully qualified) name will have
+an `originalName` attribute, which contains the unresolved name.
+
+If the `replaceNodes` option is disabled, then names will no longer be resolved in-place. Instead a
+`resolvedName` attribute will be added to each name, which contains the resolved (fully qualified)
+name. Once again, if an unqualified function or constant name cannot be resolved, then the
+`resolvedName` attribute will not be present, and instead a `namespacedName` attribute is added.
+
+Disabling the `replaceNodes` option is useful if you wish to perform modifications on the AST, as you
+probably do not wish the resulting code to have fully resolved names as a side-effect.
+
+The NameContext
+---------------
+
+The actual name resolution logic is implemented in the `NameContext` class, which has the following
+public API:
+
+```php
+class NameContext {
+    public function __construct(ErrorHandler $errorHandler);
+    public function startNamespace(Name $namespace = null);
+    public function addAlias(Name $name, string $aliasName, int $type, array $errorAttrs = []);
+
+    public function getNamespace();
+    public function getResolvedName(Name $name, int $type);
+    public function getResolvedClassName(Name $name) : Name;
+    public function getPossibleNames(string $name, int $type) : array;
+    public function getShortName(string $name, int $type) : Name;
+}
+```
+
+The `$type` parameters accept one of the `Stmt\Use_::TYPE_*` constants, which represent the three
+basic symbol types in PHP (functions, constants and everything else).
+
+In addition to name resolution, the `NameContext` also supports the reverse operation of finding a short
+representation of a name given the current name resolution environment.
+
+The name context is intended to be used for name resolution operations outside the AST itself, such
+as class names inside doc comments. A visitor running in parallel with the name resolver can access
+the name context using `$nameResolver->getNameContext()`. Alternatively a visitor can use an
+independent context and explicitly feed `Namespace` and `Use` nodes to it.
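The `replaceNodes` option and the name context described above could be combined roughly as follows. This is a minimal sketch rather than part of the patch: it assumes a Composer install of PHP-Parser 4.x, and the sample code, the `$collector` visitor and its `$resolved` property are hypothetical names introduced only for illustration.

```php
<?php
// Assumption: nikic/php-parser 4.x is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\Node;
use PhpParser\Node\Name;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical input used only for this sketch.
$code = <<<'CODE'
<?php
namespace App;

use Vendor\Lib\Widget;

function build() { return new Widget(); }
CODE;

$parser = (new ParserFactory)->create(ParserFactory::ONLY_PHP7);
$stmts = $parser->parse($code);

// Leave the original Name nodes in place; resolved names are attached as attributes.
$nameResolver = new NameResolver(null, ['replaceNodes' => false]);

// Hypothetical visitor that records every name carrying a resolvedName attribute.
$collector = new class extends NodeVisitorAbstract {
    public $resolved = [];

    public function enterNode(Node $node) {
        if ($node instanceof Name && $node->hasAttribute('resolvedName')) {
            $this->resolved[$node->toString()] = $node->getAttribute('resolvedName')->toString();
        }
    }
};

$traverser = new NodeTraverser;
$traverser->addVisitor($nameResolver); // must run before the collector
$traverser->addVisitor($collector);
$traverser->traverse($stmts);

// Resolve a class name that is not part of the AST, e.g. one found in a doc comment.
// After traversal, the context still reflects the last namespace that was entered.
$docCommentClass = $nameResolver->getNameContext()->getResolvedClassName(new Name('Widget'));
```

Because the name resolver is registered first, it has already attached the `resolvedName` attribute by the time the collector reaches a name node. A visitor that needs the context while traversal is still in progress would call `getNameContext()` from inside its own `enterNode()` instead.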
\ No newline at end of file
diff --git a/doc/component/Pretty_printing.markdown b/doc/component/Pretty_printing.markdown
new file mode 100644
index 00000000..2e7adcc5
--- /dev/null
+++ b/doc/component/Pretty_printing.markdown
@@ -0,0 +1,92 @@
+Pretty printing
+===============
+
+Pretty printing is the process of converting a syntax tree back to PHP code. In its basic mode of
+operation the pretty printer provided by this library will print the AST using a certain predefined
+code style and will discard (nearly) all formatting of the original code. Because programmers tend
+to be rather picky about their code formatting, this mode of operation is not very suitable for
+refactoring code, but can be used for automatically generated code, which is usually only read for
+debugging purposes.
+
+Basic usage
+-----------
+
+```php
+$stmts = $parser->parse($code);
+
+// MODIFY $stmts here
+
+$prettyPrinter = new PhpParser\PrettyPrinter\Standard;
+$newCode = $prettyPrinter->prettyPrintFile($stmts);
+```
+
+The pretty printer has three basic printing methods: `prettyPrint()`, `prettyPrintFile()` and
+`prettyPrintExpr()`. The one that is most commonly useful is `prettyPrintFile()`, which takes an
+array of statements and produces a full PHP file, including opening `<?php`.
+[...]
+> **Note:** This functionality is **experimental** and not yet complete.
+
+For automated code refactoring, migration and similar, you will usually only want to modify a small
+portion of the code and leave the remainder alone. The basic pretty printer is not suitable for
+this, because it will also reformat parts of the code which have not been modified.
+
+Since PHP-Parser 4.0 an experimental formatting-preserving pretty-printing mode is available, which
+attempts to preserve the formatting of code whose AST nodes have not changed, and only reformat
+code which has been modified or newly inserted.
+
+Use of the formatting-preservation functionality currently requires some additional preparatory
+steps:
+
+```php
+use PhpParser\{Lexer, NodeTraverser, NodeVisitor, Parser, PrettyPrinter};
+
+$lexer = new Lexer\Emulative([
+    'usedAttributes' => [
+        'comments',
+        'startLine', 'endLine',
+        'startTokenPos', 'endTokenPos',
+    ],
+]);
+$parser = new Parser\Php7($lexer);
+
+$traverser = new NodeTraverser();
+$traverser->addVisitor(new NodeVisitor\CloningVisitor());
+
+$printer = new PrettyPrinter\Standard();
+
+$oldStmts = $parser->parse($code);
+$oldTokens = $lexer->getTokens();
+
+$newStmts = $traverser->traverse($oldStmts);
+
+// MODIFY $newStmts HERE
+
+$newCode = $printer->printFormatPreserving($newStmts, $oldStmts, $oldTokens);
+```
+
+This functionality is experimental and not yet fully implemented. It should not produce incorrect
+code, but it may sometimes reformat more code than necessary. Open issues are tracked in
+[issue #344](https://github.com/nikic/PHP-Parser/issues/344). If you encounter problems while using
+this functionality, please open an issue, so we know what to prioritize.
\ No newline at end of file
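To make the `// MODIFY $newStmts HERE` step of the formatting-preserving example more concrete, the following sketch renames a function on the cloned AST before printing. It is only an illustration under assumptions: PHP-Parser 4.x installed via Composer, with the sample input, the second traverser (`$renamer`) and the rename itself being hypothetical. Since the feature is experimental, how much of the original formatting survives may vary between versions.

```php
<?php
// Assumption: nikic/php-parser 4.x is installed via Composer.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Lexer, Node, NodeTraverser, NodeVisitor, NodeVisitorAbstract, Parser, PrettyPrinter};

// Hypothetical input with deliberately unusual spacing, so preserved formatting is easy to spot.
$code = <<<'CODE'
<?php
function foo( $a,$b ) { return $a+$b; }
CODE;

$lexer = new Lexer\Emulative([
    'usedAttributes' => [
        'comments',
        'startLine', 'endLine',
        'startTokenPos', 'endTokenPos',
    ],
]);
$parser = new Parser\Php7($lexer);

$oldStmts = $parser->parse($code);
$oldTokens = $lexer->getTokens();

// Clone the AST first, so the original nodes stay available for comparison.
$cloner = new NodeTraverser();
$cloner->addVisitor(new NodeVisitor\CloningVisitor());
$newStmts = $cloner->traverse($oldStmts);

// Hypothetical modification: rename function foo() to bar() on the cloned tree.
$renamer = new NodeTraverser();
$renamer->addVisitor(new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Stmt\Function_ && $node->name->toString() === 'foo') {
            $node->name = new Node\Identifier('bar');
        }
    }
});
$newStmts = $renamer->traverse($newStmts);

$printer = new PrettyPrinter\Standard();
$newCode = $printer->printFormatPreserving($newStmts, $oldStmts, $oldTokens);

echo $newCode;
```

Running the modification in a separate traverser after cloning keeps the two concerns apart: the cloning pass establishes the link between new and original nodes, and the second pass only touches the clones.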