Using grok to parse multi-line JSON in Logstash

Jos*_*eph 5 json elasticsearch logstash logstash-grok

I have JSON in the following format:

{
    "SOURCE":"Source A",
    "Model":"ModelABC",
    "Qty":"3"
}

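I am trying to parse this JSON with Logstash. Basically, I want the Logstash output to be a list of key:value pairs that I can analyze in Kibana. I thought this would work out of the box. From a lot of reading, I understand that I must use the grok plugin (I am still not sure what the json plugin is for). But I cannot get a single event that contains all the fields. Instead I get multiple events (even one for every attribute of my JSON). Like this: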

{
       "message" => "  \"SOURCE\": \"Source A\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.432Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
       "message" => "  \"Model\": \"ModelABC\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.438Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
       "message" => "  \"Qty\": \"3\",",
      "@version" => "1",
    "@timestamp" => "2014-08-31T01:26:23.438Z",
          "type" => "my-json",
          "tags" => [
        [0] "tag-json"
    ],
          "host" => "myserver.example.com",
          "path" => "/opt/mount/ELK/json/mytestjson.json"
}

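Should I use the multiline codec or the json_lines codec? If so, how do I do that? Do I need to write my own grok pattern, or is there something generic for JSON that will give me one event containing the key:value pairs that are currently spread across the events above? I could not find any documentation that sheds light on this. Any help would be appreciated. My conf file is shown below: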

input
{
        file
        {
                type => "my-json"
                path => ["/opt/mount/ELK/json/mytestjson.json"]
                codec => json
                tags => "tag-json"
        }
}

filter
{
   if [type] == "my-json"
   {
        date { locale => "en"  match => [ "RECEIVE-TIMESTAMP", "yyyy-MM-dd HH:mm:ss" ] }
   }
}

output
{
        elasticsearch
        {
                host => localhost
        }
        stdout { codec => rubydebug }
}

Jos*_*eph 6

I think I found a solution to my problem. I am not sure whether it is a clean solution, but it does parse multi-line JSON of the type shown above.

input 
{   
    file 
    {
        codec => multiline
        {
            pattern => '^\{'
            negate => true
            what => previous                
        }
        path => ["/opt/mount/ELK/json/*.json"]
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
    }
}

filter 
{
    mutate
    {
        replace => [ "message", "%{message}}" ]
        gsub => [ 'message','\n','']
    }
    if [message] =~ /^{.*}$/ 
    {
        json { source => message }
    }

}

output 
{ 
    stdout { codec => rubydebug }
}
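
To make the multiline settings above concrete, here is a rough Python simulation (not Logstash code) of what pattern => '^\{' with negate => true and what => previous expresses: any line that does not start with { is appended to the previous event. The function name group_multiline and the flush at end of input are assumptions made for illustration; the real codec decides on its own when a buffered event is emitted.

```python
import re

def group_multiline(lines, pattern=r"^\{"):
    """Group lines as `negate => true, what => previous` describes:
    a line that does NOT match the pattern joins the previous event."""
    start = re.compile(pattern)
    events, buffer = [], []
    for line in lines:
        if start.search(line) and buffer:
            # A matching line starts a new event, so emit the buffered one.
            events.append("\n".join(buffer))
            buffer = []
        buffer.append(line)
    if buffer:
        # Simplification (assumption): flush what is left at end of input.
        events.append("\n".join(buffer))
    return events

# The sample object from the question, one list entry per file line.
sample = [
    "{",
    '    "SOURCE":"Source A",',
    '    "Model":"ModelABC",',
    '    "Qty":"3"',
    "}",
]

# Two objects back to back, as if the file held the sample twice.
events = group_multiline(sample + sample)
print(len(events))
print(events[0])
```

Each resulting event holds one whole object, with its lines joined by newline characters.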

My multiline codec does not pick up the final closing brace, so the message does not look like JSON to json { source => message }. Hence the mutate filter:

replace => [ "message", "%{message}}" ]

This adds the missing brace. And

gsub => [ 'message','\n','']

removes the \n characters that the multiline codec introduced. In the end, I have single-line JSON that json { source => message } can read.
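
The effect of those two mutate steps followed by the json filter can be traced in a small Python sketch. This is only an illustration of the string handling, not Logstash code; repair_and_parse is a hypothetical helper, and the input string assumes the event looks the way the answer describes it (lines joined by newlines, final brace missing).

```python
import json

def repair_and_parse(message):
    """Mimic the two mutate steps and the guarded json filter."""
    # replace => [ "message", "%{message}}" ]  appends the missing closing brace.
    message = message + "}"
    # gsub => [ 'message', '\n', '' ]  strips the newlines between the joined lines.
    message = message.replace("\n", "")
    # Roughly the same guard as `if [message] =~ /^{.*}$/`.
    if message.startswith("{") and message.endswith("}"):
        return json.loads(message)
    return None

# Assumed event text: the sample object without its final brace.
message = '{\n    "SOURCE":"Source A",\n    "Model":"ModelABC",\n    "Qty":"3"'
fields = repair_and_parse(message)
print(fields)
```

The parsed result is a flat set of key:value pairs (SOURCE, Model, Qty), which is what the question wanted as fields on a single event.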

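If there is a cleaner or simpler way to convert the original multi-line JSON into single-line JSON, please post it, because I feel the approach above is not very clean.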
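
One option outside Logstash, offered only as a sketch: if the files can be preprocessed before Logstash reads them (an assumption, since the setup above reads them directly), a short Python script can rewrite each pretty-printed object as one compact line. The helper name to_single_lines is hypothetical, and the sketch assumes the file contains only whitespace-separated JSON values.

```python
import json

def to_single_lines(text):
    """Yield every JSON value found in `text` as a compact one-line string."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield json.dumps(obj, separators=(",", ":"))

# The sample object from the question, pretty-printed, repeated twice.
pretty = '{\n    "SOURCE":"Source A",\n    "Model":"ModelABC",\n    "Qty":"3"\n}\n'
single_lines = list(to_single_lines(pretty + pretty))
for line in single_lines:
    print(line)
```

With one object per line, a line-oriented input should be able to decode each line as JSON directly, without the multiline and mutate steps.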