What do the "Features enabled" in GNU find mean?

nne*_*neo 8 find gnu

When I run find --version with GNU find, I get something like this:

find (GNU findutils) 4.5.9     
[license text]
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS(FTS_CWDFD) CBO(level=2)

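What do these "features" mean? man find has some references to O_NOFOLLOW as a security measure, and it mentions LEAF_OPTIMISATION as an optimisation that saves some lstat calls on leaf nodes. But I can't find any information about FTS, D_TYPE, or CBO.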

nne*_*neo 8

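This is a complete answer derived from the answers by Ketan and daniel kullman, as well as my own research.

Most of the "features" turn out to be query optimisations, since find is generally capable of (almost) arbitrarily complex queries on the filesystem.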


D_TYPE

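The presence of the D_TYPE feature means that find was compiled with support for the d_type field in struct dirent. This field is a BSD extension, also adopted by Linux, that provides the file type (directory, file, pipe, socket, char/block device, etc.) in the structure returned by readdir and friends. As an optimisation, find can use it to reduce or eliminate lstat calls when -type is used as a filter expression.

readdir does not always populate d_type on some filesystems (it reports DT_UNKNOWN instead), so an lstat is sometimes still needed.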
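To make the idea concrete, here is a minimal Python sketch rather than anything taken from find itself. Python's os.scandir exposes the same mechanism: DirEntry.is_dir() is answered from the type reported by the directory read when one is available, and falls back to a stat-style call otherwise. The helper name list_subdirs is made up for illustration.

```python
import os


def list_subdirs(path):
    """Return the names of the real subdirectories (not symlinks) of path.

    Hypothetical helper, roughly what
    `find path -mindepth 1 -maxdepth 1 -type d` has to decide.
    """
    subdirs = []
    with os.scandir(path) as entries:
        for entry in entries:
            # With follow_symlinks=False this is answered from the type the
            # directory read reported, when the filesystem supplies one.
            # If the type is unknown, Python falls back to an lstat-style call.
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.name)
    return sorted(subdirs)
```

On a filesystem that fills in d_type, this loop needs no per-entry stat call at all, which is the saving the D_TYPE feature is about.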

More information from the official documentation: https://www.gnu.org/software/findutils/manual/html_node/find_html/d_005ftype-Optimisation.html

O_NOFOLLOW

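This option will read either (enabled) or (disabled). If present and enabled, this feature implements a security measure that protects find from certain TOCTTOU race attacks. Specifically, it prevents find from traversing a symbolic link during directory traversal, which could happen if a directory were replaced by a symlink after its file type was checked but before find entered it.

With this option enabled, find uses open(..., O_NOFOLLOW) on the directory so that it only opens real directories, then uses openat to open files within that directory.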
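The open/openat pattern described above can be sketched in Python, which exposes the same flags through os.open and its dir_fd parameter. This is only an illustration of the technique, not find's actual code, and open_child_dir is a hypothetical name.

```python
import os


def open_child_dir(parent_fd, name):
    """Open directory `name` inside parent_fd, refusing to follow a symlink.

    Hypothetical helper: raises OSError if `name` is a symlink (even one
    that points at a directory) or is not a directory at all.
    """
    flags = os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW
    # dir_fd makes this an openat()-style call: `name` is resolved relative
    # to the already-open parent directory, not to a path that could have
    # been swapped out in the meantime.
    return os.open(name, flags, dir_fd=parent_fd)
```

Because the check and the open happen in a single system call, there is no window between "is this a directory?" and "enter it" for an attacker to slip a symlink into.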

LEAF_OPTIMISATION

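This slightly obscure optimisation allows find to infer which entries of a directory are subdirectories by using the directory's link count, since each subdirectory contributes to its parent's link count (via its .. link). In certain circumstances this lets find omit stat calls. However, if the filesystem or operating system misrepresents st_nlink, it can cause find to produce bogus results (fortunately, this is very rare).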
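The link-count reasoning can be sketched as follows. This is a simplified illustration under the traditional Unix assumption that a directory's st_nlink is 2 plus its number of subdirectories; it is not how findutils implements it, and both function names are made up.

```python
import os
import stat


def expected_subdir_count(path):
    """Return how many subdirectories path should have, or None if unknown.

    Assumes the traditional convention st_nlink == 2 + subdirectories.
    A link count below 2 means the filesystem does not follow that
    convention, so no hint is available.
    """
    nlink = os.lstat(path).st_nlink
    if nlink < 2:
        return None
    return nlink - 2


def list_subdirs_with_leaf_hint(path):
    """List subdirectories, skipping lstat once all of them are found."""
    remaining = expected_subdir_count(path)
    subdirs = []
    for name in sorted(os.listdir(path)):
        if remaining == 0:
            # Every subdirectory is accounted for, so the remaining
            # entries must be leaves and need no lstat call.
            break
        if stat.S_ISDIR(os.lstat(os.path.join(path, name)).st_mode):
            subdirs.append(name)
            if remaining is not None:
                remaining -= 1
    return subdirs
```

If the filesystem reports a link count that is too low, the loop stops early and misses directories, which is exactly the kind of bogus result mentioned above.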

More information from the official documentation: https://www.gnu.org/software/findutils/manual/html_node/find_html/Leaf-Optimisation.html

FTS

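When enabled, the FTS feature causes find to use the fts API to traverse the file hierarchy, instead of a straight recursive implementation.

It's not clear to me what the advantage of fts is, but FTS is basically the default in every stock build of find I've seen so far.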

More information: https://www.gnu.org/software/findutils/manual/html_node/find_html/fts.html, http://man7.org/linux/man-pages/man3/fts.3.html

CBO

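It turns out (after reading the find source code, as daniel kullman suggested) that "CBO" refers to the query optimisation level (it stands for "cost-based optimiser"). For example, if I run find -O9001 --version, I get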

Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=9001) 

Looking at the -O option in man find, I see

-Olevel
  Enables query optimisation.   The find program reorders tests to speed up execution  while  preserving  the  overall
  effect; that is, predicates with side effects are not reordered relative to each other.  The optimisations performed
  at each optimisation level are as follows.

  0      Equivalent to optimisation level 1.

  1      This is the default optimisation level  and  corresponds  to  the  traditional  behaviour.   Expressions  are
         reordered  so that tests based only on the names of files (for example -name and -regex) are performed first.

  2      Any -type or -xtype tests are performed after any tests based only on the names  of  files,  but  before  any
         tests  that  require information from the inode.  On many modern versions of Unix, file types are returned by
         readdir() and so these predicates are faster to evaluate than predicates which need to stat the file first.

  3      At this optimisation level, the full cost-based query optimiser is enabled.  The order of tests  is  modified
         so  that  cheap  (i.e. fast) tests are performed first and more expensive ones are performed later, if neces-
         sary.  Within each cost band, predicates are evaluated earlier or later according to whether they are  likely
         to  succeed or not.  For -o, predicates which are likely to succeed are evaluated earlier, and for -a, predi-
         cates which are likely to fail are evaluated earlier.

  The cost-based optimiser has a fixed idea of how likely any given test is to succeed.  In some cases the probability
  takes  account of the specific nature of the test (for example, -type f is assumed to be more likely to succeed than
  -type c).  The cost-based optimiser is currently being evaluated.   If it does not actually improve the  performance
  of find, it will be removed again.  Conversely, optimisations that prove to be reliable, robust and effective may be
  enabled at lower optimisation levels over time.  However, the default behaviour (i.e. optimisation level 1) will not
  be  changed  in  the 4.3.x release series.  The findutils test suite runs all the tests on find at each optimisation
  level and ensures that the result is the same.
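The ordering rule that the man page excerpt describes for level 3 can be illustrated with a toy sketch. Everything here is an assumption made up for illustration: the Predicate class, the cost bands, and the probabilities are not findutils' real data structures or numbers, and the sketch ignores the rule that predicates with side effects are never reordered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    name: str
    cost: int         # hypothetical cost band: 0 = name only, 1 = type, 2 = inode
    p_success: float  # hypothetical fixed guess at how often the test succeeds


def order_for_and(preds):
    """Cheapest band first; within a band, likely failures first (for -a)."""
    return sorted(preds, key=lambda p: (p.cost, p.p_success))


def order_for_or(preds):
    """Cheapest band first; within a band, likely successes first (for -o)."""
    return sorted(preds, key=lambda p: (p.cost, -p.p_success))


if __name__ == "__main__":
    preds = [
        Predicate("-size +1M", cost=2, p_success=0.3),
        Predicate("-type f", cost=1, p_success=0.9),
        Predicate("-type c", cost=1, p_success=0.01),
        Predicate("-name '*.log'", cost=0, p_success=0.1),
    ]
    print([p.name for p in order_for_and(preds)])
    print([p.name for p in order_for_or(preds)])
```

The point of both orderings is short-circuiting: an -a chain stops at the first failure and an -o chain stops at the first success, so putting the cheap, decisive tests first avoids running the expensive ones.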

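Mystery solved! It's a little odd that this option is a runtime value; usually I'd expect the --version output to reflect only compile-time options.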


dan*_*ann 0

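Looking at the findutils source tree (http://git.savannah.gnu.org/cgit/findutils.git/tree/), I found the following:

  • configure.ac: --enable-d_type-optimization, which makes use of the file type data that readdir() returns in struct dirent.d_type,
  • m4/withfts.m4: --without-fts, which uses an older mechanism for searching the filesystem instead of fts()

I did not find anything about CBO; you may need to download the source code and search for the term.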