Split a JSON file's objects into multiple files

Question

Waf*_*ffy 0 javascript python powershell json jq

I have a JSON file that contains too many data objects, in the following format:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -42.01171875,
              78.31385955743478
            ],
            [
              -37.6171875,
              78.06198918665974
            ],
            [
              -37.880859375,
              78.81903553711727
            ]
          ]
        ]
      }
    },
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.6171875,
              78.07107600956168
            ],
            [
              -35.48583984375,
              78.42019327591201
            ],
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -37.6171875,
              78.07107600956168
            ]
          ]
        ]
      }
    }
  ]
}

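I want to split the large file so that each feature object gets its own file, containing the top-level type field and a features array holding just that one feature (with its coordinates). So basically, I am trying to get many of these: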

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -37.6171875,
              78.07107600956168
            ],
            [
              -35.48583984375,
              78.42019327591201
            ],
            [
              -37.880859375,
              78.81903553711727
            ],
            [
              -37.6171875,
              78.07107600956168
            ]
          ]
        ]
      }
    }
  ]
}

Answer 1

pea*_*eak 5

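Here is a solution that requires only one call to jq and one to awk, assuming the input is in a file (input.json) and that the N-th component should be written to the file /tmp/file$N.json, starting with N=1: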

jq -c '.features = (.features[] | [.]) ' input.json |
  awk '{ print > "/tmp/file" NR ".json"}'

An alternative to awk here would be split -l 1, since jq -c writes each resulting object on a single line.
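Since the question is also tagged python, here is a minimal Python sketch of what the jq filter `.features = (.features[] | [.])` does: it emits one copy of the top-level object per feature, with "features" replaced by a one-element list. The function name `split_features` and the tiny `sample` input are made up for illustration and are not part of the original answer.

```python
import json


def split_features(collection):
    """Yield one FeatureCollection per feature, keeping the other top-level keys."""
    for feature in collection["features"]:
        # Copy the top-level object and swap in a one-element "features" list.
        yield {**collection, "features": [feature]}


# Tiny hypothetical input, not the data from the question.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {}, "geometry": None},
        {"type": "Feature", "properties": {"id": 2}, "geometry": None},
    ],
}

for part in split_features(sample):
    # Compact separators give one object per line, similar to jq -c.
    print(json.dumps(part, separators=(",", ":")))
```

Each printed line corresponds to one of the lines that awk (or split -l 1) would turn into a separate file.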

If you want each output file to be pretty-printed, then with a shell such as bash you could (at the cost of n additional calls to jq, one per feature) write:

N=0
jq -c '.features = (.features[] | [.])' input.json |
  while read -r json ; do
  N=$((N+1))
  jq . <<< "$json"  > "/tmp/file${N}.json"
done

Each additional call to jq will be fast, so this may be acceptable.
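If spawning one jq process per feature is a concern, the same pretty-printed split can be sketched in Python with only the standard library. This is an alternative under stated assumptions, not the answer's method: the helper name `write_feature_files` and its `prefix` parameter are hypothetical, and the whole input is loaded into memory, so it only suits files that fit in RAM.

```python
import json
from pathlib import Path


def write_feature_files(input_path, out_dir, prefix="file"):
    """Write one pretty-printed FeatureCollection per feature; return the paths."""
    collection = json.loads(Path(input_path).read_text(encoding="utf-8"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Number the files from 1, matching the file$N.json scheme used above.
    for n, feature in enumerate(collection["features"], start=1):
        # Keep every top-level key, but with a single-feature "features" list.
        part = {**collection, "features": [feature]}
        target = out_dir / f"{prefix}{n}.json"
        target.write_text(json.dumps(part, indent=2) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Calling `write_feature_files("input.json", "/tmp")` would then produce /tmp/file1.json, /tmp/file2.json, and so on, one per feature.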

Archived:	8 years, 10 months ago
Views:	5054
Last activity:	7 years, 6 months ago