Ger*_*ica 6 java regex awk grep git-bash
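I want to recursively print the headers of all *.java files in all subdirectories that have more than two type parameters (i.e. the <R ... H> in the example below). One of the files looks like this (names shortened for brevity):
multiple-lines.java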
class ClazzA<R extends A,
S extends B<T>, T extends C<T>,
U extends D, W extends E,
X extends F, Y extends G, Z extends H>
extends OtherClazz<S> implements I<T> {
public void method(Type<Q, R> x) {
// ... code ...
}
}
Expected output:
ClazzA.java:10: class ClazzA<R extends A,
ClazzA.java:11: S extends B<T>, T extends C<T>,
ClazzA.java:12: U extends D, W extends E,
ClazzA.java:13: X extends F, Y extends G, Z extends H>
ClazzA.java:14: extends OtherClazz<S> implements I<T> {
But another one could also look like this:
single-line.java
class ClazzB<R extends A, S extends B<T>, T extends C<T>, U extends D, W extends E, X extends F, Y extends G, Z extends H> extends OtherClazz<S> implements I<T> {
public void method(Type<Q, R> x) {
// ... code ...
}
}
Expected output:
ClazzB.java:42: class ClazzB<R extends A, S extends B<T>, T extends C<T>, U extends D, W extends E, X extends F, Y extends G, Z extends H> extends OtherClazz<S> implements I<T> {
Files that should not be considered/printed:
X-no-parameter.java
class ClazzC /* no type parameter */ extends OtherClazz<S> implements I<T> {
public void method(Type<A, B> x) {
// ... code ...
}
}
X-one-parameter.java
class ClazzD<R extends A> // only one type parameter
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
X-two-parameters.java
class ClazzE<R extends A, S extends B<T>> // only two type parameters
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
X-two-line-parameters.java
class ClazzF<R extends A, // only two type parameters
S extends B<T>> // on two lines
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
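All whitespace in the files can be \s+. extends [...] and implements [...] immediately before the { are optional. extends [...] is also optional in each type parameter. See the Java® Language Specification, 8.1. Class Declarations, for details.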
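As a rough illustration of the whitespace rule, a single-line check for the start of a generic class declaration could be sketched as follows. The function name is_generic_class_start is hypothetical, [ \t] stands in for \s (which is a gawk extension rather than POSIX awk), and the sketch assumes that class, the class name and the opening < all sit on the same line, which the rule above does not actually promise.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given line looks like
# the start of a class declaration with a type parameter list.
# Assumption: "class", the class name and "<" are on one line.
is_generic_class_start() {
    printf '%s\n' "$1" | awk '
        # "class", one or more blanks, an identifier, optional blanks, then "<"
        /(^|[ \t])class[ \t]+[A-Za-z_][A-Za-z0-9_]*[ \t]*</ { found = 1 }
        END { exit !found }'
}

if is_generic_class_start 'class   ClazzA  <R extends A,'; then
    echo "generic class header"
fi
```

A header such as class ClazzC /* no type parameter */ extends OtherClazz<S> is rejected, because the < does not follow the class name directly.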
I'm using gawk in Git Bash:
$ gawk --version
GNU Awk 5.0.0, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0)
with:
find . -type f -name '*.java' | xargs gawk -f ws-class-type-parameter.awk > ws-class-type-parameter.log
and ws-class-type-parameter.awk:
# /start/ , /end/ ... pattern
#/class ClazzA<.*,.*/ , /{/ { # 5 lines, OK for ClazzA, but in real it prints classes with 2 or less type parameters, too
#/class ClazzA<.*,.*,/ , /{/ { # no line with ClazzA, since there's no second ',' on its first line
#/class ClazzA<.*,.*,/s , /{/ { # 500.000+(!) lines
#/class ClazzA<.*,.*,/s , /{/U { # 500.000+(!) lines
#/class ClazzA<.*,.*,/sU , /{/U { # 500.000+(!) lines
/(?s)class ClazzA<.*,.*,/ , /{/ { # no line
    match( FILENAME, "/.*/.." )
    print substr( FILENAME, RLENGTH ) ":" FNR ": " $0
}
This finds all *.java files ... great. gawk is executed with each of them ... great. But the results are what you see as comments after each of my attempts. Please note: the ClazzA literal here is only for testing and for the MCVE. In reality it would be \w+, but with 500.000+ lines in thousands of files while testing...
If I try it at regex101.com it works. Well, sort of. I didn't find out how to define /start-regex/,/end-regex/ there, so I added another .* in between.
I took the flags from there, but I couldn't find any description of whether gawk supports the flag syntax /.../sU, /.../U, so I just tried it. A now-deleted comment told me that no flavour of awk supports this.
I also tried grep:
$ grep --version
grep (GNU grep) 3.1
...
$ grep -nrPf types.grep *.java
with types.grep:
(?s).*class\s+\w+\s*<.*,.*,.*>.*{
This produces output only for single-line.java.
(?s) is --perl-regexp, -P syntax, and grep --help claims to support it.
The solution in Ed Morton's answer works well, but it turned out that there are auto-generated files with methods like:
/** more code before here */
public void setId(String value) {
this.id = value;
}
/**
* Gets a map that contains attributes that aren't bound to any typed property on this class.
*
* <p>
* the map is keyed by the name of the attribute and
* the value is the string value of the attribute.
*
* the map returned by this method is live, and you can add new attribute
* by updating the map directly. Because of this design, there's no setter.
*
*
* @return
* always non-null
*/
public Map<QName, String> getOtherAttributes() {
return otherAttributes;
}
which gives output such as:
AbstractAddressType.java:81: * Gets a map that contains attributes that aren't bound to any typed property on this class.
AbstractAddressType.java:82: *
AbstractAddressType.java:83: * <p>
AbstractAddressType.java:84: * the map is keyed by the name of the attribute and
AbstractAddressType.java:85: * the value is the string value of the attribute.
AbstractAddressType.java:86: *
AbstractAddressType.java:87: * the map returned by this method is live, and you can add new attribute
AbstractAddressType.java:88: * by updating the map directly. Because of this design, there's no setter.
AbstractAddressType.java:89: *
AbstractAddressType.java:90: *
AbstractAddressType.java:91: * @return
AbstractAddressType.java:92: * always non-null
AbstractAddressType.java:93: */
AbstractAddressType.java:94: public Map<QName, String> getOtherAttributes() {
There are also others with class comments and annotations.
Ed *_*ton 10
Using any POSIX awk in any shell on every UNIX box:
$ cat tst.awk
/[[:space:]]*class[[:space:]]*/ {
    # Any line containing "class" starts a candidate header
    inDef = 1
    fname = FILENAME
    sub(".*/","",fname)
    def = out = ""
}
inDef {
    out = out fname ":" FNR ": " $0 ORS
    # Remove comments (not perfect but should work for 99.9% of cases)
    sub("//.*","")
    gsub("/[*]|[*]/","\n")
    gsub(/\n[^\n]*\n/,"")
    def = def $0 ORS
    if ( /{/ ) {
        # More than two type parameters means at least two commas
        if ( gsub(/,/,"&",def) > 1 ) {
            printf "%s", out
        }
        inDef = 0
    }
}
$ find tmp -type f -name '*.java' -exec awk -f tst.awk {} +
multiple-lines.java:1: class ClazzA<R extends A,
multiple-lines.java:2: S extends B<T>, T extends C<T>,
multiple-lines.java:3: U extends D, W extends E,
multiple-lines.java:4: X extends F, Y extends G, Z extends H>
multiple-lines.java:5: extends OtherClazz<S> implements I<T> {
single-line.java:1: class ClazzB<R extends A, S extends B<T>, T extends C<T>, U extends D, W extends E, X extends F, Y extends G, Z extends H> extends OtherClazz<S> implements I<T> {
The above was run against this input:
$ head tmp/*
==> tmp/X-no-parameter.java <==
class ClazzC /* no type parameter */ extends OtherClazz<S> implements I<T> {
public void method(Type<A, B> x) {
// ... code ...
}
}
==> tmp/X-one-parameter.java <==
class ClazzD<R extends A> // only one type parameter
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
==> tmp/X-two-line-parameters.java <==
class ClazzF<R extends A, // only two type parameters
S extends B<T>> // on two lines
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
==> tmp/X-two-parameters.java <==
class ClazzE<R extends A, S extends B<T>> // only two type parameters
extends OtherClazz<S> implements I<T> {
public void method(Type<X, Y> x) {
// ... code ...
}
}
==> tmp/multiple-lines.java <==
class ClazzA<R extends A,
S extends B<T>, T extends C<T>,
U extends D, W extends E,
X extends F, Y extends G, Z extends H>
extends OtherClazz<S> implements I<T> {
public void method(Type<Q, R> x) {
// ... code ...
}
}
==> tmp/single-line.java <==
class ClazzB<R extends A, S extends B<T>, T extends C<T>, U extends D, W extends E, X extends F, Y extends G, Z extends H> extends OtherClazz<S> implements I<T> {
public void method(Type<Q, R> x) {
// ... code ...
}
}
The above is just a best effort without writing a parser for the language, with only the OP's posted sample input/output to go on for what needs to be handled.
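One limit of counting every comma in the header is that commas nested inside a bound such as Map<K, V>, or in an implements list, are counted too. A minimal sketch of a depth-aware count follows. It is not part of the tested script above: the helper name count_type_params is hypothetical, it expects a single header on stdin with comments already stripped, and it assumes the < follows the class name directly.

```shell
# Hypothetical helper: reads one class header (possibly spanning several
# lines) on stdin and prints the number of top-level type parameters.
count_type_params() {
    awk '
    { hdr = hdr " " $0 }                    # join a multi-line header into one string
    END {
        n = 0
        # Only a "<" right after the class name opens a type parameter list
        if (match(hdr, /class[ \t]+[A-Za-z0-9_]+[ \t]*</)) {
            depth = 0
            for (i = RSTART + RLENGTH - 1; i <= length(hdr); i++) {
                c = substr(hdr, i, 1)
                if (c == "<") {
                    if (++depth == 1) n = 1     # list opened: first parameter
                } else if (c == ">") {
                    if (--depth == 0) break     # list closed: ignore extends/implements
                } else if (c == "," && depth == 1) {
                    n++                         # comma directly inside the list
                }
            }
        }
        print n
    }'
}

printf '%s\n' 'class ClazzE<R extends A, S extends B<T>>' | count_type_params   # prints 2
```

A caller would then print the collected header lines only when the reported count is greater than 2.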