Convert arbitrary output to JSON by column in the terminal?

Bra*_*rks 3 linux bash json command-line-interface

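I would like to be able to pipe the output of any command-line program into a command that converts it to JSON.

For example, my unknown program could accept the target columns, a delimiter, and the output field names: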

# select columns 1 and 3 from the output and convert it to simple json
netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other

and it would generate something like this:

[ 
  {"name": "tcp4", "other": "31"},
  {"name": "tcp4", "other": "0"} 
...
]

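I'm looking for something that works for any program, not just netstat!

I'm willing to install any third-party tool or open source project, and I lean toward running on Linux/OS X. It doesn't have to be a bash script solution; it could be written in Node, Perl, Python, etc.

EDIT: I'm of course willing to pass whatever information is needed to make it work, such as a delimiter or multiple delimiters. I just want to avoid explicit parsing on the command line and have the tool do it.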

F. *_*uri 5

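Filtering STDIN to build a JSON variable

Introduction

Because a terminal is a very particular kind of interface, with a monospaced font and tools designed to be watched on that terminal, the output of many tools can be hard to parse.

The output of netstat is a good sample: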

Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   Path
unix  2      [ ACC ]     STREAM     LISTENING     13947569 @/tmp/.X11-unix/X1
unix  2      [ ]         DGRAM                    8760     /run/systemd/notify
unix  2      [ ACC ]     SEQPACKET  LISTENING     8790     /run/udev/control

When some lines contain empty fields, you can't simply split on whitespace.
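To make the problem concrete, here is a minimal Python sketch (Python is used purely for illustration; the sample lines are copied from the netstat output above, and naive_split is a name made up for this example). It shows how whitespace splitting loses the column alignment:

```python
# Header and two rows from the netstat sample above.
# The second row has an empty State column.
header = "Proto RefCnt Flags       Type       State         I-Node   Path"
rows = [
    "unix  2      [ ACC ]     STREAM     LISTENING     13947569 @/tmp/.X11-unix/X1",
    "unix  2      [ ]         DGRAM                    8760     /run/systemd/notify",
]

def naive_split(line):
    """Split on runs of whitespace, ignoring column alignment."""
    return line.split()

print(len(header.split()), header.split())
for row in rows:
    fields = naive_split(row)
    print(len(fields), fields)
```

The header has seven names, yet the first row yields nine tokens (the flags [ ACC ] are split into three) and the second row yields seven tokens in which DGRAM sits at the position of State.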

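For that reason, the requested convert_to_json script is posted at the bottom of this answer.

Simple whitespace-based splitting with awk

With awk, you get a nice, compact syntax: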

netstat -an |
    awk '/CLOSE_WAIT/{
        printf "  { \42%s\42:\42%s\42,\42%s\42:\42%s\42},\n","name",$1,"other",$3
    }' |
    sed '1s/^/[\n/;$s/,$/\n]/'
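One caveat when JSON is formatted by hand with printf is that the values are not escaped, so a field containing a quote or a backslash produces invalid JSON. The following Python sketch is an illustration, not part of the original answer; rows_to_json and the sample lines are made up for this example. It performs the same column selection but leaves the quoting to the standard json module:

```python
import json

def rows_to_json(lines, columns, names, needle="CLOSE_WAIT"):
    """Pick 1-based columns from whitespace-split lines that contain `needle`."""
    out = []
    for line in lines:
        if needle not in line:
            continue
        fields = line.split()
        # A column past the end of the line becomes None (null in JSON).
        out.append({name: fields[col - 1] if col <= len(fields) else None
                    for col, name in zip(columns, names)})
    return json.dumps(out, indent=2)

# Invented sample input; the last line contains characters that need escaping.
sample = [
    'tcp4  0  31  10.0.0.1.51074  10.0.0.2.443  CLOSE_WAIT',
    'tcp4  0  0   10.0.0.1.51075  10.0.0.2.443  ESTABLISHED',
    'say"hi" 1 x\\y CLOSE_WAIT',
]
print(rows_to_json(sample, [1, 3], ["name", "other"]))
```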

Simple whitespace-based splitting with perl, but using a JSON library

But this way is more flexible:

netstat -an | perl -MJSON::XS -ne 'push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;END{print encode_json(\@out)."\n";}'

Or the same, split over several lines:

netstat -an |
    perl -MJSON::XS -ne '
        push @out,{"name"=>$1,"other"=>$2} if
                /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
        END{print encode_json(\@out)."\n";
}'

Pretty-printed:

netstat -an | perl -MJSON::XS -ne '
    push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
    END{$coder = JSON::XS->new->ascii->pretty->allow_nonref;
        print $coder->encode(\@out);}'

Finally, I like this version, which uses split instead of regex captures:

netstat -an | perl -MJSON::XS -ne '
    do {
        my @line=split(/\s+/);
        push @out,{"name"=>$line[0],"other"=>$line[2]}
    } if /CLOSE_WAIT/;
    END{
        $coder = JSON::XS->new->ascii->pretty->allow_nonref;
        print $coder->encode(\@out);
    }'

But you can also run the command from inside the perl script:

perl -MJSON::XS -e '
    open STDIN,"netstat -an|";
    my @out;
    while (<>){
        push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
    };
    print encode_json \@out;'

This could become a basic prototype:

#!/usr/bin/perl -w

use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;

$ENV{'LANG'}='C';
open STDIN,"netstat -naut|";
my @out;
my @fields;

my $searchre=":";
$searchre = shift @ARGV if @ARGV;

while (<>){
    map { s/_/ /g;push @fields,$_; } split(/\s+/) if
        /^Proto.*State/ && s/\sAddr/_Addr/g;
    do {
        my @line=split(/\s+/);
        my %entry;
        for my $i (0..$#fields) {
            $entry{$fields[$i]}=$line[$i];
        };
        push @out,\%entry;
    } if /$searchre/;
}

print $coder->encode(\@out);

(Without an argument, this dumps the whole netstat -naut output, but you can pass any search string as the argument, such as CLOSE or an IP address.)
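The header handling in this prototype temporarily glues two-word column names such as Local Address together with an underscore so that they survive whitespace splitting. That idea can be sketched in Python as follows. This is an illustrative rendering, not the author's code: parse_header and parse_row are hypothetical names, and the header line is assumed to look like a typical Linux netstat header.

```python
import re

def parse_header(header):
    """Return the column names, keeping 'Local Address' style names together."""
    # Glue ' Address' to the previous word, split, then restore the space.
    joined = re.sub(r"\sAddress", "_Address", header)
    return [name.replace("_", " ") for name in joined.split()]

def parse_row(names, line):
    """Pair whitespace-split values with the header names, in order."""
    return dict(zip(names, line.split()))

header = "Proto Recv-Q Send-Q Local Address           Foreign Address         State"
row = "tcp        18      0 127.0.0.1:31001         127.0.0.1:55938         CLOSE_WAIT"

names = parse_header(header)
print(names)
print(parse_row(names, row))
```

Because the rows are still split on whitespace, this approach shares the limitation described in the introduction: a row with an empty field shifts the following values one column to the left.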

Fields by position: netstat2json.pl

With a few adjustments, this method could work with many tools other than netstat:

#!/usr/bin/perl -w
use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
$ENV{'LANG'}='C';
open STDIN,"netstat -nap|";
my ( $searchre ,@out,%fields)=( "[/:]" );
$searchre = shift @ARGV if @ARGV;
while (<>){
    next if /^Active\s.*\)$/;
    /^Proto.*State/ && do {
        s/\s(name|Addr)/_$1/g;
        my @head;
        map { s/_/ /g;push @head,$_; } split(/\s+/);
        s/_/ /g;
        %fields=();
        for my $i (0..$#head) {
            my $crt=index($_,$head[$i]);
            my $next=-1;
            $next=index($_,$head[$i+1])-$crt-1 if $i < $#head;
            $fields{$head[$i]}=[$crt,$next];
        }
        next;
    };
    do {
        my $line=$_;
        my %entry;
        for my $i (keys %fields) {
            my $crt=substr($line,$fields{$i}[0],$fields{$i}[1]);
            $crt=~s/^\s*(\S(|.*\S))\s*$/$1/;
            $entry{$i}=$crt;
        };
        push @out,\%entry;
    } if /$searchre/;
}
print $coder->encode(\@out);
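
The script works as follows:

  • Find the header line matching Proto.*State (specific to netstat)
  • Store each field name with its position and length
  • Cut each line by field length, then trim the whitespace
  • Dump the variable as a JSON string.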
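The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a port of netstat2json.pl: column_spans and parse_fixed_width are hypothetical names, every header name is assumed to be a single word, and the sample lines are taken from the netstat output in the introduction.

```python
import json

def column_spans(header, names):
    """Map each name to (start, end) offsets; the last column runs to end of line."""
    starts = [header.index(name) for name in names]
    ends = starts[1:] + [None]
    return dict(zip(names, zip(starts, ends)))

def parse_fixed_width(line, spans):
    """Slice a line at the header offsets and trim the surrounding whitespace."""
    return {name: line[start:end].strip() for name, (start, end) in spans.items()}

header = "Proto RefCnt Flags       Type       State         I-Node   Path"
names = header.split()
rows = [
    "unix  2      [ ACC ]     STREAM     LISTENING     13947569 @/tmp/.X11-unix/X1",
    "unix  2      [ ]         DGRAM                    8760     /run/systemd/notify",
]

spans = column_spans(header, names)
print(json.dumps([parse_fixed_width(row, spans) for row in rows], indent=3))
```

Since every value is cut at the offsets of the header, an empty State column simply becomes an empty string and the flags [ ACC ] stay in one field.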

It can be run with an argument, as before:

./netstat2json.pl CLOS
[
   {
      "Local Address" : "127.0.0.1:31001",
      "State" : "CLOSE_WAIT",
      "Recv-Q" : "18",
      "Proto" : "tcp",
      "Send-Q" : "0",
      "Foreign Address" : "127.0.0.1:55938",
      "PID/Program name" : "-"
   },
   {
      "Recv-Q" : "1",
      "Local Address" : "::1:53816",
      "State" : "CLOSE_WAIT",
      "Send-Q" : "0",
      "PID/Program name" : "-",
      "Foreign Address" : "::1:631",
      "Proto" : "tcp6"
   }
]

Empty fields don't break the field assignment:

./netstat2json.pl 1000.*systemd/notify
[
   {
      "Proto" : "unix",
      "I-Node" : "33378",
      "RefCnt" : "2",
      "Path" : "/run/user/1000/systemd/notify",
      "PID/Program name" : "-",
      "Type" : "DGRAM",
      "Flags" : "[ ]",
      "State" : ""
   }
]

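Nota: this modified version runs netstat with the -nap arguments in order to get the PID/Program name field.

If you don't run it as the superuser root, you may get the following output on STDERR: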

(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)

You can avoid it:

  • by running netstat2json.pl 2>/dev/null,
  • by running it as root or with sudo, or
  • by editing line #6 to change "netstat -nap|" into "netstat -na|".
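When the command is launched from inside a script, as the Perl versions above do, the warning can also be discarded at that point. Here is a minimal Python sketch using the standard subprocess module; run_quiet is a name made up for this example, and a small Python one-liner stands in for netstat so that the sketch runs anywhere:

```python
import subprocess
import sys

def run_quiet(cmd):
    """Run a command, return its stdout lines, and discard anything on stderr."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL, text=True, check=False)
    return result.stdout.splitlines()

# Stand-in for netstat: writes one line to stdout and one to stderr.
demo = [sys.executable, "-c",
        "import sys; print('tcp 0 0 CLOSE_WAIT'); print('warning', file=sys.stderr)"]
print(run_quiet(demo))

# With netstat installed, the call would be: run_quiet(["netstat", "-nap"])
```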

convert_to_json.pl: a perl script to convert STDIN to JSON

Here is a convert_to_json.pl perl script that does strictly what was asked. Run it as: netstat -an | grep CLOSE | ./convert_to_json.pl 1,3 name,other

#!/usr/bin/perl -w

use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;

my (@fields,@pos,@out);

map {
    push @pos,1*$_-1
} split ",",shift @ARGV;      

map { 
    push @fields,$_
} split ",",shift @ARGV;

die "Number of fields doesn't match number of positions" if $#fields != $#pos;

while (<>) {
    my @line=split(/\s+/);
    my %entry;
    for my $i (0..$#fields) {
         $entry{$fields[$i]}=$line[$pos[$i]];
    };
    push @out,\%entry;
}
print $coder->encode(\@out);


Eri*_*nil 5

Here is my Ruby version:

#! /usr/bin/env ruby
#
# Converts stdin columns to a JSON array of hashes
#
# Installation : Save as convert_to_json, make it executable and put it somewhere in PATH. Ruby must be installed
#
# Examples :
#
# netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
# ls -l | convert_to_json
# ls -l | convert_to_json 6,7,8,9
# ls -l | convert_to_json 6,7,8,9 month,day,time,name
# convert_to_json 1,2 time,value ";" < some_file.csv
#
#
# http://stackoverflow.com/questions/40246134/convert-arbitrary-output-to-json-by-column-in-the-terminal

require 'json'

script_name = File.basename(__FILE__)
syntax = "Syntax : command_which_outputs_columns | #{script_name} column1_id,column2_id,...,columnN_id column1_name,column2_name,...,columnN_name delimiter"


if $stdin.tty? or $stdin.closed? then
  $stderr.puts syntax
else
  if ARGV[2]
    delimiter = ARGV[2]
    $stderr.puts "#{script_name} : Using #{delimiter} as delimiter"
  else
    delimiter = /\s+/
  end

  column_ids = (ARGV[0] || "").split(',').map{|column_id| column_id.to_i-1}
  column_names = (ARGV[1] || "").split(',')

  results = []
  $stdin.each do |stdin_line|
    if column_ids.empty?
      values = stdin_line.strip.split(delimiter)
    else
      values = stdin_line.strip.split(delimiter).values_at(*column_ids)
    end
    line_hash = {}
    values.each_with_index do |value, i|
      column_name = column_names[i] || "column#{(column_ids[i] || i)+1}"
      line_hash[column_name] = value
    end
    results<<line_hash
  end
  puts JSON.pretty_generate(results)
end

It works as defined in your example:

netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
[
  {
    "name": "tcp",
    "other": "0"
  },
  {
    "name": "tcp6",
    "other": "0"
  }
]

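As a bonus, you can:

  • omit the arguments: every column will be converted to JSON
  • omit the names: the columns will be called column1, column2, ...
  • select a missing column: its value will be null
  • define a delimiter as the third argument. The default is whitespace.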
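The defaulting rules listed above fit in a few lines. The following is a rough Python rendering of the same behavior, written for illustration only: it is not a port of the Ruby script, the function signature is an assumption, and it treats any out-of-range column as missing.

```python
import json

def convert_to_json(lines, column_ids=None, column_names=None, delimiter=None):
    """Convert delimited lines to a list of dicts, following the rules above."""
    ids = [int(c) - 1 for c in column_ids.split(",")] if column_ids else []
    names = column_names.split(",") if column_names else []
    results = []
    for line in lines:
        line = line.strip()
        # Default: split on runs of whitespace; otherwise on the literal delimiter.
        values = line.split(delimiter) if delimiter else line.split()
        if ids:
            # A column beyond the end of the line becomes None (JSON null).
            values = [values[i] if 0 <= i < len(values) else None for i in ids]
        row = {}
        for i, value in enumerate(values):
            default = "column%d" % ((ids[i] if ids else i) + 1)
            row[names[i] if i < len(names) else default] = value
        results.append(row)
    return results

print(json.dumps(convert_to_json(["1;3", "2;5"], "1,2", "time,value", ";"), indent=2))
print(json.dumps(convert_to_json(["a b"], "2,12"), indent=2))
```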

Other examples:

netstat -a | grep CLOSE_WAIT | ./convert_to_json
# [
#   {
#     "column1": "tcp",
#     "column2": "1",
#     "column3": "0",
#     "column4": "10.0.2.15:51074",
#     "column5": "123.45.101.207:https",
#     "column6": "CLOSE_WAIT"
#   },
#   {
#     "column1": "tcp6",
#     "column2": "1",
#     "column3": "0",
#     "column4": "ip6-localhost:50293",
#     "column5": "ip6-localhost:ipp",
#     "column6": "CLOSE_WAIT"
#   }
# ]

netstat -a | grep CLOSE_WAIT | ./convert_to_json 1,3
# [
#   {
#     "column1": "tcp",
#     "column3": "0"
#   },
#   {
#     "column1": "tcp6",
#     "column3": "0"
#   }
# ]

ls -l | tail -n3 | convert_to_json 6,7,8,9 month,day,time,name
# [
#   {
#     "month": "Oct",
#     "day": "27",
#     "time": "10:35",
#     "name": "test.dot"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "uniq.rb"
#   },
#   {
#     "month": "Nov",
#     "day": "2",
#     "time": "14:27",
#     "name": "utf8_nokogiri.rb"
#   }
# ]

# NOTE: ls -l uses the 8th column for year, not time, for older files :
ls --full-time -t /usr/share/doc | tail -n3 | ./convert_to_json 6,7,9 yyyymmdd,time,name
# [
#   {
#     "yyyymmdd": "2013-10-21",
#     "time": "15:15:20.000000000",
#     "name": "libbz2-dev"
#   },
#   {
#     "yyyymmdd": "2013-10-10",
#     "time": "16:27:32.000000000",
#     "name": "zsh"
#   },
#   {
#     "yyyymmdd": "2013-10-03",
#     "time": "18:52:45.000000000",
#     "name": "manpages-dev"
#   }
# ]

ls -l | tail -n3 | convert_to_json 9,12
# [
#   {
#     "column9": "test.dot",
#     "column12": null
#   },
#   {
#     "column9": "uniq.rb",
#     "column12": null
#   },
#   {
#     "column9": "utf8_nokogiri.rb",
#     "column12": null
#   }
# ]

convert_to_json 1,2 time,value ";" < some_file.csv
# convert_to_json : Using ; as delimiter
# [
#   {
#     "time": "1",
#     "value": "3"
#   },
#   {
#     "time": "2",
#     "value": "5"
#   }
# ]