Bash script to select a single Python function from a file

Ste*_*ett 4 python bash awk parsing git-diff

For a git alias problem, I'd like to be able to select a single Python function from a file, by name. For example:

  ...
  def notyet():
      wait for it

  def ok_start(x):
      stuff
      stuff
      def dontgettrickednow():
         keep going
  #stuff
      more stuff

  def ok_stop_now():

From an algorithmic point of view, the following would be close enough:

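  1. Start filtering when a line is found that matches /^(\s*)def $1[^a-zA-Z0-9]/
  2. Keep matching until a line is found that matches neither /^\s*#/ nor /^\1\s/ (that is, a line that is neither a possibly indented comment nor indented more deeply than the matched def)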
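
The two steps above can be sketched in Python to make the intended behaviour concrete. This is only an illustration under some assumptions, not the awk solution being asked for: `select_function` is a hypothetical name, `\b` is swapped in for the `[^a-zA-Z0-9]` class so that a name like `ok` does not match `ok_start`, and blank lines are treated as part of the function (trailing ones are trimmed), which the two steps do not spell out.

```python
import re

def select_function(lines, name):
    """Return the lines of the first `def name` block, following the two steps."""
    # Step 1: group 1 captures the indentation in front of the matching def.
    start = re.compile(r"^(\s*)def " + re.escape(name) + r"\b")
    selected = []
    indent = None  # indentation of the matched def; None while still searching
    for line in lines:
        if indent is None:
            m = start.match(line)
            if m:
                indent = m.group(1)
                selected.append(line)
            continue
        stripped = line.strip()
        # Step 2: blank lines and (possibly indented) comments never end the block.
        if not stripped or stripped.startswith("#"):
            selected.append(line)
            continue
        # A line that is not indented more deeply than the def ends the block.
        if not (line.startswith(indent) and line[len(indent)].isspace()):
            break
        selected.append(line)
    # Assumption: blank lines trailing the function are not wanted in the result.
    while selected and not selected[-1].strip():
        selected.pop()
    return selected
```

Called on the lines of the example above with `name="ok_start"`, this returns everything from `def ok_start(x):` through `more stuff`, including the nested `def` and the outdented `#stuff` comment.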

(I don't really care whether decorators preceding the following function are picked up. The result is meant for human reading.)

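I was trying to do this with Awk (which I barely know), but it's a bit harder than I thought. For starters, I'd need a way of storing the length of the indent before the original def.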
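
Storing the indent length comes down to counting the leading whitespace of the def line and comparing later lines against that number. A minimal Python sketch of that idea, assuming indentation uses spaces only (a tab would count as a single character); `indent_width` and `ends_block` are hypothetical helper names:

```python
def indent_width(line):
    """Number of leading whitespace characters on the line."""
    return len(line) - len(line.lstrip())

def ends_block(line, def_width):
    """True if `line` is code indented no deeper than the def line."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return False  # blank lines and comments never end the block
    return indent_width(line) <= def_width
```

Comparing widths as numbers avoids needing a regex backreference, at the cost of not distinguishing tabs from spaces.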

Bir*_*rei 5

One way is to use awk. The code is well commented, so I hope it is easy to follow.

Content of infile:

  ...
  def notyet():
      wait for it

  def ok_start(x):
      stuff
      stuff
      def dontgettrickednow():
         keep going
  #stuff
      more stuff

  def ok_stop_now():

Content of script.awk:

BEGIN {
        ## 'f' is the name of the function to search for; build a regexp from it.
        ## The underscore is excluded too, so that f="ok" does not match "ok_start".
        f_regex = "^" f "[^a-zA-Z0-9_]"

        ## When set, lines are printed; otherwise they are omitted.
        ## It is set when the searched function is found.
        ## It is unset at the first non-blank line that does not start with '#'
        ## and is indented no deeper than the 'def' line.
        in_func = 0
}

## Found function.
$1 == "def" && $2 ~ f_regex {

        ## Get position of first 'd' in the line.
        i = index( $0, "d" )

        ## Sanity check. This should never trigger, because the pattern above
        ## already guarantees that the line contains 'def'.
        if ( i == 0 ) {
                next
        }

        ## Get the characters before that position, check that all of
        ## them are whitespace, and store their count. Awk strings are
        ## 1-indexed, so the substring has to start at position 1.
        indent = substr( $0, 1, i - 1 )
        if ( indent ~ /^[[:space:]]*$/ ) {
                num_spaces = length( indent )
        }

        ## Set variable, print line and read next one.
        in_func = 1
        print
        next
}

## When we are inside the function, line doesn't begin with '#' and
## it's not a blank line (only spaces).
in_func == 1 && $1 ~ /^[^#]/ && $0 ~ /[^[:space:]]/ {

        ## Get how many characters there are until the first non-space. The result
        ## is the position of the first non-blank, so subtract one to get the number
        ## of spaces.
        spaces = match( $0, /[^[:space:]]/ )
        spaces -= 1

        ## If the current indent is less than or equal to the indent of the function
        ## definition, the end of the function was found, so stop printing.
        if ( spaces <= num_spaces ) {
                in_func = 0
        }
}

## Self-explanatory.
in_func == 1 { 
        print
}

Run it like this:

awk -f script.awk -v f="ok_start" infile

It produces the following output:

  def ok_start(x):
      stuff
      stuff
      def dontgettrickednow():
         keep going
  #stuff
      more stuff