How to parse JSON with a shell script, including curly-brace edge cases

0 linux json

I have some JSON output, and I need to extract a few parameters from it on Linux.

This is the JSON output:

{
  items:[
    {
      provider_name:"ucp-ipg",
      subject_name:"rtm-instrumentation",
      dataset_name:"rtm-instrumentation-dataset-hour-sliced",
      dataset_key:[
        2018-03-06T06:00:00Z,
        "000394e3-a9eb-40b6-9463-fbd588d20ba4"
      ],
      record_count:21,
      state:"complete",
      version:0,
      etag:"a221df62",
      creation_timestamp:2018-03-06T06:10:46.294-00:00,
      created_by:"AAA",
      modification_timestamp:2018-03-06T06:10:46.294-00:00,
      modified_by:"AAA"
    },
    {
      provider_name:"ucp-ipg",
      subject_name:"rtm-instrumentation",
      dataset_name:"rtm-instrumentation-dataset-hour-sliced",
      dataset_key:[
        2018-03-06T06:00:00Z,
        "00097722-b02f-4938-bd4b-d935047c3837"
      ],
      record_count:17,
      state:"complete",
      version:0,
      etag:"aa4dbc25",
      creation_timestamp:2018-03-06T06:12:23.293-00:00,
      created_by:"AAA",
      modification_timestamp:2018-03-06T06:12:23.293-00:00,
      modified_by:"AAA"
    }

The output I want:

dataset_key:[
        2018-03-06T06:00:00Z,
        "00097722-b02f-4938-bd4b-d935047c3837"
      ]

I have tried the following, but it does not work:

file.txt | python -mjson.tool | grep 'dataset_key'

Kus*_*nda 5

Assuming the JSON document is well-formed and complete, for example

{
  "items": [
    {
      "provider_name": "ucp-ipg",
      "subject_name": "rtm-instrumentation",
      "dataset_name": "rtm-instrumentation-dataset-hour-sliced",
      "dataset_key": [
        "2018-03-06T06:00:00Z",
        "000394e3-a9eb-40b6-9463-fbd588d20ba4"
      ],
      "record_count": 21,
      "state": "complete",
      "version": 0,
      "etag": "a221df62",
      "creation_timestamp": "2018-03-06T06:10:46.294-00:00",
      "created_by": "AAA",
      "modification_timestamp": "2018-03-06T06:10:46.294-00:00",
      "modified_by": "AAA"
    },
    {
      "provider_name": "ucp-ipg",
      "subject_name": "rtm-instrumentation",
      "dataset_name": "rtm-instrumentation-dataset-hour-sliced",
      "dataset_key": [
        "2018-03-06T06:00:00Z",
        "00097722-b02f-4938-bd4b-d935047c3837"
      ],
      "record_count": 17,
      "state": "complete",
      "version": 0,
      "etag": "aa4dbc25",
      "creation_timestamp": "2018-03-06T06:12:23.293-00:00",
      "created_by": "AAA",
      "modification_timestamp": "2018-03-06T06:12:23.293-00:00",
      "modified_by": "AAA"
    }
  ]
}

the `dataset_key` array of the second element of `items` can be extracted with `jq`:

$ jq '.items[1].dataset_key' file.json
[
  "2018-03-06T06:00:00Z",
  "00097722-b02f-4938-bd4b-d935047c3837"
]

Change `[1]` to `[-1]` to get the `dataset_key` of the last element of `items`.

To get the raw data of the array elements:

$ jq -r '.items[1].dataset_key[]' file.json
2018-03-06T06:00:00Z
00097722-b02f-4938-bd4b-d935047c3837
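If `jq` is not available, the same lookups can be sketched with Python's standard `json` module, which the question already reaches for via `python -mjson.tool`. This is only an illustration, assuming the input is valid JSON like the document above; the helper name `dataset_key` and the trimmed inline sample are made up for the example.

```python
import json

# Trimmed copy of the well-formed document above (only the fields used here).
SAMPLE = """
{
  "items": [
    {"dataset_key": ["2018-03-06T06:00:00Z", "000394e3-a9eb-40b6-9463-fbd588d20ba4"]},
    {"dataset_key": ["2018-03-06T06:00:00Z", "00097722-b02f-4938-bd4b-d935047c3837"]}
  ]
}
"""


def dataset_key(document, index):
    """Hypothetical helper: roughly what jq's .items[index].dataset_key selects."""
    return document["items"][index]["dataset_key"]


doc = json.loads(SAMPLE)

# Like: jq '.items[1].dataset_key'
print(json.dumps(dataset_key(doc, 1), indent=2))

# Like: jq -r '.items[-1].dataset_key[]' (one raw string per line)
for value in dataset_key(doc, -1):
    print(value)
```

One difference worth knowing: for an out-of-range index `jq` yields `null`, whereas the Python list lookup raises `IndexError`.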

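  • @user3256289: If this worked for you, then the example you gave in the question does not match your actual data, because AFAIK `jq` will not parse malformed JSON. So next time, please include the correct data. (2 upvotes)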
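
The point about malformed JSON is easy to check with any strict parser. As a small sketch (using Python's standard `json` module rather than `jq`, with a made-up helper `is_valid_json` and shortened samples), the unquoted form from the question is rejected while the quoted form is accepted:

```python
import json


def is_valid_json(text):
    """Hypothetical helper: True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Shortened form of the question's data: unquoted keys and an unquoted timestamp.
as_posted = '{ items:[ { dataset_key:[ 2018-03-06T06:00:00Z, "000394e3" ] } ] }'

# The same data with the keys and the timestamp quoted, as in the answer.
well_formed = '{ "items": [ { "dataset_key": [ "2018-03-06T06:00:00Z", "000394e3" ] } ] }'

print(is_valid_json(as_posted))    # False
print(is_valid_json(well_formed))  # True
```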