Processing non-uniform data with the JSON Input step

rsi*_*va4 9 json pentaho data-integration kettle

I am trying to process the following with the JSON Input step:

{"address":[
  {"AddressId":"1_1","Street":"A Street"},
  {"AddressId":"1_101","Street":"Another Street"},
  {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
  {"AddressId":"1_102","Locality":"New York"}
]}

However, this does not seem to be possible:

Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) : 
The data structure is not the same inside the resource! 
We found 1 values for json path [$..Locality], which is different that the number retourned for path [$..Street] (3509 values). 
We MUST have the same number of values for all paths.

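The step offers an Ignore Missing Path flag, but it only works when every row is missing the same path. In that case the step behaves as expected and fills the missing values with null.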

This limits the step's ability to read non-uniform data, which is actually one of my priorities.
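To make the mismatch concrete, here is a minimal sketch in plain JavaScript (run outside Kettle, for example in Node.js) that counts how many array elements carry each key in the sample above. It assumes, as the error message suggests, that the step evaluates each path separately and then compares the number of values found; `countValuesPerKey` is a hypothetical helper and not part of Kettle.

```javascript
// Hypothetical helper: count how many elements of doc.address define each key.
function countValuesPerKey(doc) {
  const counts = {};
  for (const element of doc.address) {
    for (const key of Object.keys(element)) {
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

// The sample document from the question.
const sample = {"address":[
  {"AddressId":"1_1","Street":"A Street"},
  {"AddressId":"1_101","Street":"Another Street"},
  {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
  {"AddressId":"1_102","Locality":"New York"}
]};

const counts = countValuesPerKey(sample);
console.log(counts); // { AddressId: 4, Street: 3, Locality: 2 }
```

The three counts differ, which is the condition the error message complains about.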

My step's fields are defined as follows:

[Image: JSON Input field definitions]

Am I missing something? Is this the correct behavior?

rsi*_*va4 11

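What I did was use a JSON Input step with the path $.address[*] to read the complete map of each array element into a jsonRow field, for example with this input: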

{"address":[
    {"AddressId":"1_1","Street":"A Street"},  
    {"AddressId":"1_101","Street":"Another Street"},  
    {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},   
    {"AddressId":"1_102","Locality":"New York"} 
]}

This produces 4 jsonRows, one per element, e.g. jsonRow = {"AddressId":"1_101","Street":"Another Street"}. Then, in a Javascript step, I map my values as follows:

var AddressId = getFromMap('AddressId', jsonRow);
var Street = getFromMap('Street', jsonRow);
var Locality = getFromMap('Locality', jsonRow);

In a second script tab I inserted the minified JSON parsing code from https://github.com/douglascrockford/JSON-js together with the getFromMap function:

function getFromMap(key,jsonRow){
  var map;
  try{
   // Parse the JSON text of a single array element
   map = JSON.parse(jsonRow);
  }
  catch(e){
   // Report the unparsable row through the step's error handling and skip it
   var message = "Unparsable JSON: "+jsonRow+" Desc: "+e.message;
   var nr_errors = 1;
   var field = "jsonRow";
   var errcode = "JSON_PARSE";
   _step_.putError(getInputRowMeta(), row, nr_errors, message, field, errcode);
   trans_Status = SKIP_TRANSFORMATION;
   return null;
  }

  // A key that is absent from this element yields null
  if(map[key] == undefined){
   return null;
  }
  trans_Status = CONTINUE_TRANSFORMATION;
  return map[key];
}
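For trying the idea outside Kettle, the same two stages can be sketched in plain JavaScript (for example in Node.js). This is only an illustration under assumptions: `splitIntoRows` is a hypothetical stand-in for what the JSON Input step does with $.address[*], and `getOrNull` is a simplified variant of getFromMap without the Kettle error-handling calls, so invalid JSON simply throws here.

```javascript
// Hypothetical stand-in for JSON Input with path $.address[*]:
// one JSON string per element of the "address" array.
function splitIntoRows(jsonText) {
  return JSON.parse(jsonText).address.map(function (element) {
    return JSON.stringify(element);
  });
}

// Simplified variant of getFromMap: return the value for key,
// or null when this element does not have that key.
function getOrNull(key, jsonRow) {
  const map = JSON.parse(jsonRow);
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : null;
}

const input = JSON.stringify({"address":[
  {"AddressId":"1_1","Street":"A Street"},
  {"AddressId":"1_101","Street":"Another Street"},
  {"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
  {"AddressId":"1_102","Locality":"New York"}
]});

// Map every jsonRow to the three output fields, filling gaps with null.
const rows = splitIntoRows(input).map(function (jsonRow) {
  return {
    AddressId: getOrNull('AddressId', jsonRow),
    Street: getOrNull('Street', jsonRow),
    Locality: getOrNull('Locality', jsonRow)
  };
});

console.log(rows);
```

Because each element is parsed on its own, a missing key affects only that row, so the per-path value counts never need to match.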