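I have the following JSON that I want to convert to CSV.
The JSON's elements are fixed: if a value is absent it will be null, but the attribute will still be present.
I want to convert it to a CSV with 9 columns.
My flow looks like this =>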
InvokeHTTP->EvaluateJSONPath->InferAvroSchema-> UpdateAttribute->ConvertRecord
The last processor fails.
a,b,c,d,e,sub-a,sub-b,sub-c,sub-d
[
{
"a": "a-value",
"b": "b-value",
"c": "c-value",
"d": "d-value",
"e": 123,
"sub_value": {
"sub-a": "sub-a-value",
"sub-b": null,
"sub-c": "sub-c-value",
"sub-d": "sub-d-value"
}
},
{
"a": "a2-value",
"b": "b2-value",
"c": "c2-value",
"d": "d2-value",
"e": 123,
"sub_value": {
"sub-a": "sub-2a-value",
"sub-b": "sub-2a-value",
"sub-c": "sub-2c-value",
"sub-d": "sub-2d-value"
}
}
]
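What if I don't want to use properties in the JsonReader? (I may have too many values, and it could become a maintenance nightmare.) Is there another way?
Can I still solve my use case by using an Avro schema for both reading and writing?
Input JSON =>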
{"store_number":"33152","store_name":"33152 WALMART JARDINES DEL COUNTRY 2374","store_display_name":"WALMART JARDINES DEL COUNTRY 2374","store_type_name":"Off Trade","street":"CALLE MORELOS 2019, JARDINES DEL COUNTRY, EL PERUL 2DA SECC.","city":"SALAMANCA","postal_code":"36764","latitude":20.665996,"longitude":-101.232345,"is_active":true,"manager_name":"MITZI ORTIZ","is_deleted":false,"manager_phone":null,"manager_email":null,"region_name":"A III","district_name":null,"branch_name":null,"retailer_name":"WALMART DE MEXICO","state_code":null,"store_additional_attributes":{"additional_attribute_1":"SSS MAINSTREAM","additional_attribute_2":"MAINSTREAM","additional_attribute_3":"PROM.SEM.53","additional_attribute_4":"SSS","additional_attribute_5":"R G","additional_attribute_6":"R.G@diageo.com","additional_attribute_11":"XEL HA MASTACHE","additional_attribute_12":"CALDERON VANESSA"}}
avro-schema-text (used for the JsonTreeReader) =>
{"type":"record","name":"jsn_to_csv","fields":[{"name":"store_number","type":"string","doc":"Type inferred from '\"1001\"'"},{"name":"store_name","type":"string","doc":"Type inferred from '\"1001 BODEGA AURRERA CHIMALHUACAN 3762\"'"},{"name":"store_display_name","type":"string","doc":"Type inferred from '\"BODEGA AURRERA CHIMALHUACAN 3762\"'"},{"name":"store_type_name","type":["string","null"],"doc":"Type inferred from '\"Off Trade\"'"},{"name":"street","type":["string","null"],"doc":"Type inferred from '\"AV CHIMALHUACAN 428 COL: BENITO JUAREZ\"'"},{"name":"city","type":["string","null"],"doc":"Type inferred from '\"NEZAHUALCOYOTL\"'"},{"name":"postal_code","type":["string","null"],"doc":"Type inferred from '\"57000\"'"},{"name":"latitude","type":["double","null","int"],"doc":"Type inferred from '19.403'"},{"name":"longitude","type":["double","null","int"],"doc":"Type inferred from '-99.0078'"},{"name":"is_active","type":"boolean","doc":"Type inferred from 'true'"},{"name":"manager_name","type":["string","null"],"doc":"Type inferred from '\"GIOVANI GARCIA\"'"},{"name":"is_deleted","type":"boolean","doc":"Type inferred from 'false'"},{"name":"manager_phone","type":"null","doc":"Type inferred from 'null'"},{"name":"manager_email","type":"null","doc":"Type inferred from 'null'"},{"name":"region_name","type":"string","doc":"Type inferred from '\"A IV\"'"},{"name":"district_name","type":"null","doc":"Type inferred from 'null'"},{"name":"branch_name","type":"null","doc":"Type inferred from 'null'"},{"name":"retailer_name","type":["string","null"],"doc":"Type inferred from '\"WALMART DE MEXICO\"'"},{"name":"state_code","type":"null","doc":"Type inferred from 'null'"},{"name":"store_additional_attributes","type":{"type":"record","name":"store_additional_attributes","fields":[{"name":"additional_attribute_1","type":["null","string"],"doc":"Type inferred from '\"SSS EMC\"'","default":null},{"name":"additional_attribute_2","type":["null","string"],"doc":"Type inferred from '\"EMC\"'","default":null},{"name":"additional_attribute_3","type":["null","string"],"doc":"Type inferred from '\"PROM.SEM.259\"'","default":null},{"name":"additional_attribute_4","type":["null","string"],"doc":"Type inferred from '\"SSS\"'","default":null},{"name":"additional_attribute_5","type":["null","string"],"doc":"Type inferred from '\"FRANCISCO HERNANDEZ\"'","default":null},{"name":"additional_attribute_6","type":["null","string"],"doc":"Type inferred from '\"a@a.com\"'","default":null},{"name":"additional_attribute_11","type":["null","string"],"doc":"Type inferred from '\"GIOVANI GARCIA\"'","default":null},{"name":"additional_attribute_12","type":["null","string"],"doc":"Type inferred from '\"SOLIS MARCIAL ALAN\"'","default":null}]}}]}
JsonTreeReader
CSVRecordSetWriter - Avro schema
{"type":"record","name":"jsn_to_csv","fields":[{"name":"store_number","type":"string"},{"name":"store_name","type":"string"},{"name":"store_display_name","type":"string"},{"name":"additional_attribute_1","default":null,"type":[{"type":"null"},{"type":"string"}]}]}
Output =>
store_number,store_name,store_display_name,additional_attribute_1
33152,33152 WALMART JARDINES DEL COUNTRY 2374,WALMART JARDINES DEL COUNTRY 2374,
The value in the last column appears to be empty. How do I reference a nested JSON attribute when writing?
Template code
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><template encoding-version="1.2"><description></description><groupId>e67d1c1e-38c8-1085-8460-d34d1eab4f96</groupId><name>DNU_r_writer</name><snippet><connections><id>a554ad4c-5255-38a8-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold><backPressureObjectThreshold>10000</backPressureObjectThreshold><destination><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>7ad8214e-4359-3339-0000-000000000000</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>success</selectedRelationships><source><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>89b6d4b6-6c38-34f5-0000-000000000000</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><connections><id>350019df-a89c-3065-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold><backPressureObjectThreshold>10000</backPressureObjectThreshold><destination><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>ac868965-be33-3190-0000-000000000000</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>success</selectedRelationships><source><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>7ad8214e-4359-3339-0000-000000000000</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><connections><id>51283208-4111-3683-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><backPressureDataSizeThreshold>1 
GB</backPressureDataSizeThreshold><backPressureObjectThreshold>10000</backPressureObjectThreshold><destination><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>9060f008-2053-3dc0-0000-000000000000</id><type>PROCESSOR</type></destination><flowFileExpiration>0 sec</flowFileExpiration><labelIndex>1</labelIndex><name></name><selectedRelationships>failure</selectedRelationships><source><groupId>50098679-9a85-3379-0000-000000000000</groupId><id>7ad8214e-4359-3339-0000-000000000000</id><type>PROCESSOR</type></source><zIndex>0</zIndex></connections><controllerServices><id>2b600734-09f6-3b38-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><bundle><artifact>nifi-record-serialization-services-nar</artifact><group>org.apache.nifi</group><version>1.6.0</version></bundle><comments></comments><descriptors><entry><key>Schema Write Strategy</key><value><name>Schema Write Strategy</name></value></entry><entry><key>schema-access-strategy</key><value><name>schema-access-strategy</name></value></entry><entry><key>schema-registry</key><value><identifiesControllerService>org.apache.nifi.schemaregistry.services.SchemaRegistry</identifiesControllerService><name>schema-registry</name></value></entry><entry><key>schema-name</key><value><name>schema-name</name></value></entry><entry><key>schema-version</key><value><name>schema-version</name></value></entry><entry><key>schema-branch</key><value><name>schema-branch</name></value></entry><entry><key>schema-text</key><value><name>schema-text</name></value></entry><entry><key>Date Format</key><value><name>Date Format</name></value></entry><entry><key>Time Format</key><value><name>Time Format</name></value></entry><entry><key>Timestamp Format</key><value><name>Timestamp Format</name></value></entry><entry><key>CSV Format</key><value><name>CSV Format</name></value></entry><entry><key>Value Separator</key><value><name>Value Separator</name></value></entry><entry><key>Include Header 
Line</key><value><name>Include Header Line</name></value></entry><entry><key>Quote Character</key><value><name>Quote Character</name></value></entry><entry><key>Escape Character</key><value><name>Escape Character</name></value></entry><entry><key>Comment Marker</key><value><name>Comment Marker</name></value></entry><entry><key>Null String</key><value><name>Null String</name></value></entry><entry><key>Trim Fields</key><value><name>Trim Fields</name></value></entry><entry><key>Quote Mode</key><value><name>Quote Mode</name></value></entry><entry><key>Record Separator</key><value><name>Record Separator</name></value></entry><entry><key>Include Trailing Delimiter</key><value><name>Include Trailing Delimiter</name></value></entry><entry><key>csvutils-character-set</key><value><name>csvutils-character-set</name></value></entry></descriptors><name>CSVRecordSetWriter</name><persistsState>false</persistsState><properties><entry><key>Schema Write Strategy</key><value>no-schema</value></entry><entry><key>schema-access-strategy</key><value>schema-text-property</value></entry><entry><key>schema-registry</key></entry><entry><key>schema-name</key><value>jsn_to_csv</value></entry><entry><key>schema-version</key><value>${avro.schema}</value></entry><entry><key>schema-branch</key></entry><entry><key>schema-text</key><value>{"type":"record","name":"jsn_to_csv","fields":[{"name":"store_number","type":"string","doc":"Type inferred from '\"33152\"'"},{"name":"store_name","type":"string","doc":"Type inferred from '\"33152 WALMART JARDINES DEL COUNTRY 2374\"'"},{"name":"store_display_name","type":"string","doc":"Type inferred from '\"WALMART JARDINES DEL COUNTRY 2374\"'"},{"name":"additional_attribute_1","default":null,"type":[{"type":"null"},{"type":"string"}]}]}</value></entry><entry><key>Date Format</key></entry><entry><key>Time Format</key></entry><entry><key>Timestamp Format</key></entry><entry><key>CSV Format</key></entry><entry><key>Value Separator</key></entry><entry><key>Include 
Header Line</key></entry><entry><key>Quote Character</key></entry><entry><key>Escape Character</key></entry><entry><key>Comment Marker</key></entry><entry><key>Null String</key></entry><entry><key>Trim Fields</key></entry><entry><key>Quote Mode</key></entry><entry><key>Record Separator</key></entry><entry><key>Include Trailing Delimiter</key></entry><entry><key>csvutils-character-set</key></entry></properties><state>DISABLED</state><type>org.apache.nifi.csv.CSVRecordSetWriter</type></controllerServices><controllerServices><id>50827b89-5d05-3339-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><bundle><artifact>nifi-record-serialization-services-nar</artifact><group>org.apache.nifi</group><version>1.6.0</version></bundle><comments></comments><descriptors><entry><key>schema-access-strategy</key><value><name>schema-access-strategy</name></value></entry><entry><key>schema-registry</key><value><identifiesControllerService>org.apache.nifi.schemaregistry.services.SchemaRegistry</identifiesControllerService><name>schema-registry</name></value></entry><entry><key>schema-name</key><value><name>schema-name</name></value></entry><entry><key>schema-version</key><value><name>schema-version</name></value></entry><entry><key>schema-branch</key><value><name>schema-branch</name></value></entry><entry><key>schema-text</key><value><name>schema-text</name></value></entry><entry><key>Date Format</key><value><name>Date Format</name></value></entry><entry><key>Time Format</key><value><name>Time Format</name></value></entry><entry><key>Timestamp Format</key><value><name>Timestamp 
Format</name></value></entry></descriptors><name>JsonTreeReader</name><persistsState>false</persistsState><properties><entry><key>schema-access-strategy</key><value>schema-text-property</value></entry><entry><key>schema-registry</key></entry><entry><key>schema-name</key></entry><entry><key>schema-version</key></entry><entry><key>schema-branch</key></entry><entry><key>schema-text</key><value>{"type":"record","name":"jsn_to_csv","fields":[{"name":"store_number","type":"string","doc":"Type inferred from '\"1001\"'"},{"name":"store_name","type":"string","doc":"Type inferred from '\"1001 BODEGA AURRERA CHIMALHUACAN 3762\"'"},{"name":"store_display_name","type":"string","doc":"Type inferred from '\"BODEGA AURRERA CHIMALHUACAN 3762\"'"},{"name":"store_type_name","type":["string","null"],"doc":"Type inferred from '\"Off Trade\"'"},{"name":"street","type":["string","null"],"doc":"Type inferred from '\"AV CHIMALHUACAN 428 COL: BENITO JUAREZ\"'"},{"name":"city","type":["string","null"],"doc":"Type inferred from '\"NEZAHUALCOYOTL\"'"},{"name":"postal_code","type":["string","null"],"doc":"Type inferred from '\"57000\"'"},{"name":"latitude","type":["double","null","int"],"doc":"Type inferred from '19.403'"},{"name":"longitude","type":["double","null","int"],"doc":"Type inferred from '-99.0078'"},{"name":"is_active","type":"boolean","doc":"Type inferred from 'true'"},{"name":"manager_name","type":["string","null"],"doc":"Type inferred from '\"GIOVANI GARCIA\"'"},{"name":"is_deleted","type":"boolean","doc":"Type inferred from 'false'"},{"name":"manager_phone","type":"null","doc":"Type inferred from 'null'"},{"name":"manager_email","type":"null","doc":"Type inferred from 'null'"},{"name":"region_name","type":"string","doc":"Type inferred from '\"A IV\"'"},{"name":"district_name","type":"null","doc":"Type inferred from 'null'"},{"name":"branch_name","type":"null","doc":"Type inferred from 'null'"},{"name":"retailer_name","type":["string","null"],"doc":"Type inferred from '\"WALMART 
DE MEXICO\"'"},{"name":"state_code","type":"null","doc":"Type inferred from 'null'"},{"name":"store_additional_attributes","type":{"type":"record","name":"store_additional_attributes","fields":[{"name":"additional_attribute_1","type":["null","string"],"doc":"Type inferred from '\"SSS EMC\"'","default":null},{"name":"additional_attribute_2","type":["null","string"],"doc":"Type inferred from '\"EMC\"'","default":null},{"name":"additional_attribute_3","type":["null","string"],"doc":"Type inferred from '\"PROM.SEM.259\"'","default":null},{"name":"additional_attribute_4","type":["null","string"],"doc":"Type inferred from '\"SSS\"'","default":null},{"name":"additional_attribute_5","type":["null","string"],"doc":"Type inferred from '\"FRANCISCO HERNANDEZ\"'","default":null},{"name":"additional_attribute_6","type":["null","string"],"doc":"Type inferred from '\"a@a.com\"'","default":null},{"name":"additional_attribute_11","type":["null","string"],"doc":"Type inferred from '\"GIOVANI GARCIA\"'","default":null},{"name":"additional_attribute_12","type":["null","string"],"doc":"Type inferred from '\"SOLIS MARCIAL ALAN\"'","default":null}]}}]}</value></entry><entry><key>Date Format</key></entry><entry><key>Time Format</key></entry><entry><key>Timestamp Format</key></entry></properties><state>ENABLED</state><type>org.apache.nifi.json.JsonTreeReader</type></controllerServices><processors><id>89b6d4b6-6c38-34f5-0000-000000000000</id><parentGroupId>50098679-9a85-3379-0000-000000000000</parentGroupId><position><x>640.8075
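Use the ConvertRecord processor with
JsonPathReader as the Record Reader
CSVRecordSetWriter as the Record Writer
JsonPathReader configuration:
Because the elements in your JSON are static, add a new property for every key of the JSON message, with the matching JSON path as its value.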
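To make the property-per-key idea concrete, here is a minimal Python sketch of what such a property table amounts to. It is only an illustration under assumptions: the property names, the bracket-style path for hyphenated keys, and the tiny resolve helper are hypothetical and are not NiFi's implementation. The helper understands only plain $.key and ['key'] steps.

```python
import re

# Hypothetical property table: name = output field, value = JSON path.
# Hyphenated keys are written in bracket form here as an assumption.
JSON_PATH_PROPERTIES = {
    "a": "$.a",
    "b": "$.b",
    "c": "$.c",
    "d": "$.d",
    "e": "$.e",
    "sub-a": "$.sub_value['sub-a']",
    "sub-b": "$.sub_value['sub-b']",
    "sub-c": "$.sub_value['sub-c']",
    "sub-d": "$.sub_value['sub-d']",
}

# One path step: either .name or ['name'].
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\['([^']+)'\]")


def resolve(path, record):
    """Evaluate a very small JSON path subset against a dict."""
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    value, pos = record, 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path step in {path!r}")
        key = match.group(1) or match.group(2)
        # A missing or null parent simply yields None (an empty CSV cell).
        value = value.get(key) if isinstance(value, dict) else None
        pos = match.end()
    return value


record = {
    "a": "a-value", "b": "b-value", "c": "c-value", "d": "d-value", "e": 123,
    "sub_value": {"sub-a": "sub-a-value", "sub-b": None,
                  "sub-c": "sub-c-value", "sub-d": "sub-d-value"},
}
flat = {name: resolve(path, record) for name, path in JSON_PATH_PROPERTIES.items()}
print(flat)
```

Each entry pulls one value out of the nested message into a flat record, which is why the nested sub_value keys end up as ordinary columns.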
AvroSchemaRegistry configuration:
This schema needs to match the properties we added in the JsonPathReader controller service.
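As a rough sketch of what a matching flat schema could look like, the Python below generates an Avro record with one nullable field per output column and prints it as JSON. The field list is taken from the CSV header in the question; the record name is borrowed from the question's other schema, and treating e as an int is an assumption based on the sample value 123. Note that Avro's naming rules do not allow hyphens, so names such as sub-a may need to be renamed (for example to sub_a) or have name validation relaxed, depending on the NiFi version.

```python
import json

# Column names taken from the expected CSV header in the question.
FIELD_NAMES = ["a", "b", "c", "d", "e", "sub-a", "sub-b", "sub-c", "sub-d"]


def build_schema(names, int_fields=("e",)):
    """Build a flat Avro record schema with one nullable field per column."""
    fields = []
    for name in names:
        base_type = "int" if name in int_fields else "string"
        # ["null", T] with a null default lets absent values become empty cells.
        fields.append({"name": name, "type": ["null", base_type], "default": None})
    # "jsn_to_csv" is reused from the question; any record name would do.
    return {"type": "record", "name": "jsn_to_csv", "fields": fields}


schema = build_schema(FIELD_NAMES)
schema_text = json.dumps(schema)
print(schema_text)
```

The point of the sketch is only that the schema is flat: the nested sub_value record disappears because the reader properties have already lifted its keys to the top level.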
CSVRecordSetWriter configuration:
Input:
[
{
"a": "a-value",
"b": "b-value",
"c": "c-value",
"d": "d-value",
"e": 123,
"sub_value": {
"sub-a": "sub-a-value",
"sub-b": null,
"sub-c": "sub-c-value",
"sub-d": "sub-d-value"
}
},
{
"a": "a2-value",
"b": "b2-value",
"c": "c2-value",
"d": "d2-value",
"e": 123,
"sub_value": {
"sub-a": "sub-2a-value",
"sub-b": "sub-2a-value",
"sub-c": "sub-2c-value",
"sub-d": "sub-2d-value"
}
}
]
Output:
a,b,c,d,e,sub-a,sub-b,sub-c,sub-d
a-value,b-value,c-value,d-value,123,sub-a-value,,sub-c-value,sub-d-value
a2-value,b2-value,c2-value,d2-value,123,sub-2a-value,sub-2a-value,sub-2c-value,sub-2d-value
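To sanity-check the expected output outside NiFi, the same flattening can be reproduced with Python's standard library. This is a minimal sketch, not a description of how ConvertRecord works internally; the flatten and to_csv helpers are hypothetical names, and the sample records are the ones from the input above.

```python
import csv
import io

HEADER = ["a", "b", "c", "d", "e", "sub-a", "sub-b", "sub-c", "sub-d"]

# The two sample records from the input, with JSON null written as None.
RECORDS = [
    {"a": "a-value", "b": "b-value", "c": "c-value", "d": "d-value", "e": 123,
     "sub_value": {"sub-a": "sub-a-value", "sub-b": None,
                   "sub-c": "sub-c-value", "sub-d": "sub-d-value"}},
    {"a": "a2-value", "b": "b2-value", "c": "c2-value", "d": "d2-value", "e": 123,
     "sub_value": {"sub-a": "sub-2a-value", "sub-b": "sub-2a-value",
                   "sub-c": "sub-2c-value", "sub-d": "sub-2d-value"}},
]


def flatten(record):
    """Lift the keys of the nested sub_value object to the top level."""
    row = {key: value for key, value in record.items() if key != "sub_value"}
    row.update(record.get("sub_value") or {})
    return row


def to_csv(records):
    """Write flattened records as CSV text; None becomes an empty cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()


print(to_csv(RECORDS))
```

With the sample records this yields the header plus two rows, with an empty cell where sub-b is null in the first record.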
The template I tried can be found at this link.