How do I extract JSON from JSON in ClickHouse?

Jen*_*ens 4 clickhouse

My database contains a JSON value:

{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}]}

I need to get all unique values for each key, but I am having trouble extracting the values for keys "d" and "e".

Using:

SELECT
   DISTINCT JSONExtractRaw(column, 'c')
FROM t1

I get:

[{"d":3,"e":"str_1"}, 
{"d":4,"e":"str_2"}]

However, if I then apply the JSONExtract family of functions again for keys "d" and "e", nothing is returned. How can I fix this?

vla*_*mir 5

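If needed, I would use a "safe" query like this one, which correctly handles unordered members and missing members. This approach is not very fast, but it is reliable.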

SELECT
  json,
  a_and_b,
  d_uniq_values,
  e_uniq_values
FROM (
  SELECT
      json,
      JSONExtract(json, 'Tuple(a Nullable(Int32), b Nullable(Int32))') a_and_b,
      JSONExtractRaw(json, 'c') c_json,
      range(JSONLength(c_json)) AS array_indices,
      arrayDistinct(arrayMap(i -> JSONExtractInt(c_json, i + 1, 'd'), array_indices)) AS d_uniq_values,
      arrayDistinct(arrayMap(i -> JSONExtractString(c_json, i + 1, 'e'), array_indices)) AS e_uniq_values
  FROM
  (
      /* test data */
      SELECT arrayJoin([
        '{}',
        '{"a":1,"b":2}',
        '{"b":1,"a":2}',
        '{"b":1}',
        '{"a":1,"b":2,"c":[]}',
        '{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}]}',
        '{"b":1,"a":2,"c":[{"e":"3","d":1}, {"e":"4","d":2}]}',
        '{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}, {"d":3,"e":"str_1"}, {"d":4,"e":"str_1"}, {"d":7,"e":"str_9"}]}'
        ]) AS json
  ))
FORMAT Vertical;

/* Result:

Row 1:
──────
json:          {}
a_and_b:       (NULL,NULL)
d_uniq_values: []
e_uniq_values: []

Row 2:
──────
json:          {"a":1,"b":2}
a_and_b:       (1,2)
d_uniq_values: []
e_uniq_values: []

Row 3:
──────
json:          {"b":1,"a":2}
a_and_b:       (2,1)
d_uniq_values: []
e_uniq_values: []

Row 4:
──────
json:          {"b":1}
a_and_b:       (NULL,1)
d_uniq_values: []
e_uniq_values: []

Row 5:
──────
json:          {"a":1,"b":2,"c":[]}
a_and_b:       (1,2)
d_uniq_values: []
e_uniq_values: []

Row 6:
──────
json:          {"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}]}
a_and_b:       (1,2)
d_uniq_values: [3,4]
e_uniq_values: ['str_1','str_2']

Row 7:
──────
json:          {"b":1,"a":2,"c":[{"e":"3","d":1}, {"e":"4","d":2}]}
a_and_b:       (2,1)
d_uniq_values: [1,2]
e_uniq_values: ['3','4']

Row 8:
──────
json:          {"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}, {"d":3,"e":"str_1"}, {"d":4,"e":"str_1"}, {"d":7,"e":"str_9"}]}
a_and_b:       (1,2)
d_uniq_values: [3,4,7]
e_uniq_values: ['str_1','str_2','str_9']
*/
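
The query above gets to "d" and "e" by passing a 1-based array index before the key (i + 1, 'd'). That also suggests why the attempt in the question returned nothing: the value under "c" is an array, so a key lookup needs an element index first. As a rough alternative sketch, not part of the original answer, the array can be split into one row per element with JSONExtractArrayRaw and arrayJoin, after which a plain key lookup works and the unique values can be collected across all rows. The inline test string stands in for the real table, and the aliases json and elem are illustrative names only.

```sql
-- Sketch only: the inner SELECT is inline test data standing in for the real table.
SELECT
    groupUniqArray(JSONExtractInt(elem, 'd'))    AS d_uniq_values,
    groupUniqArray(JSONExtractString(elem, 'e')) AS e_uniq_values
FROM
(
    -- JSONExtractArrayRaw returns the elements of the array under key 'c'
    -- as raw JSON strings; arrayJoin emits one row per element.
    SELECT arrayJoin(JSONExtractArrayRaw(json, 'c')) AS elem
    FROM
    (
        SELECT '{"a":1,"b":2,"c":[{"d":3,"e":"str_1"}, {"d":4,"e":"str_2"}]}' AS json
    )
);
```

Note that groupUniqArray does not guarantee the order of the collected values, and that JSONExtractInt and JSONExtractString fall back to 0 and an empty string when a key is missing, so elements lacking "d" or "e" may need to be filtered out first (for example with JSONHas).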