Parsing PHP Doc comments into a data structure

Ala*_*orm 25 php parsing phpdoc

I'm using the Reflection API in PHP to pull a DocComment (PHPDoc) string from a method:

// ReflectionMethod needs a method name as well; 'someMethod' is a placeholder.
$r = new ReflectionMethod($object, 'someMethod');
$comment = $r->getDocComment();

This returns a string that looks something like this (depending on how well the method was documented):

/**
 * Does this great things
 *
 * @param string $thing
 * @return Some_Great_Thing
 */

Are there any built-in methods or functions that can parse a PHP Doc comment string into a data structure?

$object = some_magic_function_or_method($comment_string);

echo 'Returns a: ', $object->return;

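Lacking that, which part of the PHPDoc source code should I be looking at?

Lacking that and/or in addition to it, is there third-party code that is considered "better" at this than the PHPDoc code?

I realize parsing these strings isn't rocket science, or even computer science, but I'd prefer a well-tested library/routine/method that was built to deal with the janky, semi-incorrect PHP Doc code that may exist in the wild.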
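
To make the target concrete: as far as I know, core PHP has no built-in for this, so the following is only a minimal sketch of the kind of structure being asked for. `parse_docblock_tags()` is a hypothetical helper (not a PHP or PHPDoc function), and its regex only handles tags that sit on a single line, so it is no substitute for a well-tested library.

```php
<?php
// Hypothetical helper, for illustration only: collects the @tags of a
// docblock into an array keyed by tag name. Each entry is a list of the
// raw text that followed the tag on its line.
function parse_docblock_tags(string $comment): array
{
    $tags = [];
    // Match lines such as " * @param string $thing" and capture the tag
    // name plus the remainder of the line (trailing whitespace dropped).
    $pattern = '/^\s*\*\s*@([A-Za-z_-]+)[ \t]*(.*?)\s*$/m';
    if (preg_match_all($pattern, $comment, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $tags[$m[1]][] = $m[2];
        }
    }
    return $tags;
}

$comment = <<<'DOC'
/**
 * Does this great things
 *
 * @param string $thing
 * @return Some_Great_Thing
 */
DOC;

$tags = parse_docblock_tags($comment);
echo 'Returns a: ', $tags['return'][0], "\n"; // prints "Returns a: Some_Great_Thing"
```

Multi-line tag descriptions, inline tags, and single-line `/** @tag */` comments are deliberately out of scope here, which is exactly why a maintained parser is the better choice for real code.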

Mat*_*eis 21

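I'm surprised this hasn't been mentioned yet: what about using Zend_Reflection from Zend Framework? It may come in handy, especially if you work with software built on Zend Framework, such as Magento.

See the Zend Framework manual for some code examples, and the API documentation for the available methods.

There are different ways to do this: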

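  • Pass a file name to Zend_Reflection_File.
  • Pass an object to Zend_Reflection_Class.
  • Pass an object and a method name to Zend_Reflection_Method.
  • If you really only have the comment string, you could even throw together a small dummy class, save it to a temporary file, and pass that file to Zend_Reflection_File.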

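Let's take the simple case and assume you have an existing class that you want to inspect.

The code would look like this (untested, please forgive me):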

$method = new Zend_Reflection_Method($class, 'yourMethod');
$docBlock = $method->getDocBlock(); // same variable name as used below (PHP variables are case-sensitive)

if ($docBlock->hasTag('return')) {
    $tagReturn = $docBlock->getTag('return'); // $tagReturn is an instance of Zend_Reflection_Docblock_Tag_Return
    echo "Returns a: " . $tagReturn->getType() . "<br>";
    echo "Comment for return type: " . $tagReturn->getDescription();
}


小智 16

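You can use the "DocBlockParser" class from Fabien Potencier's Sami ("Yet Another PHP API Documentation Generator") open-source project. First, get Sami from GitHub. Here is an example of how to use it: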

<?php

require_once 'Sami/Parser/DocBlockParser.php';
require_once 'Sami/Parser/Node/DocBlockNode.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Node\DocBlockNode;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $doc = $dbp->parse($comment);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

Here is the output of the test page:

** getDesc:
This is the short description.

This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description
** getTags:
Array
(
    [param] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => foo
                    [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
                )

            [1] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => int
                                    [1] => 
                                )

                        )

                    [1] => bar
                    [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
                )

        )

    [return] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => de-html_entitied string (no entities at all)
                )

        )

)

** getTag('param'):
Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => string
                            [1] => 
                        )

                )

            [1] => foo
            [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => int
                            [1] => 
                        )

                )

            [1] => bar
            [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
        )

)

** getErrors:
Array
(
)

** getOtherTags:
Array
(
)

** getShortDesc:
This is the short description.
** getLongDesc:
This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description

  • This approach no longer works. In 2014 they apparently decided to drop their own parser and use only the DocBlox/phpDocumentor parser. (2 upvotes)

Tom*_*uba 7

2022 phpdoc parser

PHPStan now has its own AST-based docblock parser:

https://github.com/phpstan/phpdoc-parser

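  • It is maintained long-term
  • It allows node traversal
  • It can parse FQNs (fully qualified names)
  • It has a format-preserving printer

Here is how to modify it using a custom node visitor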


mvr*_*iel 6

You can use DocBlox (http://github.com/mvriel/docblox) to generate an XML data structure for you; install DocBlox using PEAR and then run the command:

docblox parse -d [FOLDER] -t [TARGET_LOCATION]

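This generates a file called structure.xml that contains all the metadata about your source code, including the parsed docblocks.

OR

You can use the DocBlox_Reflection_DocBlock* classes to parse a piece of DocBlock text directly.

To do so, make sure autoloading is enabled (or include all the DocBlox_Reflection_DocBlock* files) and execute the following: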

$parsed = new DocBlox_Reflection_DocBlock($docblock);

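After that, you can use the getters to extract the information you need.

Note: you do not need to strip the asterisks; the Reflection class takes care of this.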
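
For context, the asterisk stripping mentioned in the note amounts to roughly the following. This is only a sketch: `strip_docblock_delimiters()` is a hypothetical name, it is not DocBlox's actual code, and it assumes the usual `/** ... */` layout with one optional asterisk at the start of each line.

```php
<?php
// Hypothetical helper, for illustration only: removes the comment
// delimiters and the leading asterisks from a docblock string.
function strip_docblock_delimiters(string $comment): string
{
    // Drop the opening "/**" and the closing "*/".
    $body = preg_replace('#^\s*/\*\*|\*/\s*$#', '', $comment);
    // Remove the "whitespace, asterisk, optional space" prefix of each line.
    $body = preg_replace('/^[ \t]*\*[ \t]?/m', '', $body);
    return trim($body);
}

$docblock = "/**\n * Does this great things\n *\n * @param string \$thing\n */";
echo strip_docblock_delimiters($docblock), "\n";
```

The result keeps the blank line between the description and the tags, which is what lets a later step tell the description apart from the tag section.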


Ian*_*ips 5

Check out

http://pecl.php.net/package/docblock

The docblock_tokenize() function will get you part of the way there, I think.


小智 5

I suggest Addendum. It's pretty cool, works well, and is used in many PHP5 frameworks...

http://code.google.com/p/addendum/

Check the test examples

http://code.google.com/p/addendum/source/browse/trunk#trunk%2Fannotations%2Ftests