如何过滤可能是几个树级别深度的父子数组的多个属性

ghi*_*ing 7 javascript arrays datagrid slickgrid typescript

TL; 博士; 为简单起见,我如何过滤父子数组的多个属性,这些属性可能是几个树级别的深度。这是针对数百个用户使用的开源数据网格库。

所以我有一系列父/子引用,孩子们也可以有孩子自己等等,树级深度没有限制。此外,我不仅需要能够过滤具有树结构的属性,还需要能够过滤该数组的任何属性,即网格中的列。

例如,我有这个数组,它代表一个文件浏览器列表

const myFiles = [
    {id: 11, file: "Music", parentId: null },
    {id: 12, file: "mp3", parentId: 11 },
    {id: 14, file: "pop", parentId: 12 },
    {id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85, parentId: 14, },
    {id: 16, file: "rock", parentId: 12 },
    {id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16, },
    {id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null, },
    {id: 21, file: "Documents", parentId: null, },
    {id: 2, file: "txt", parentId: 21 },
    {id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2, },
    {id: 4, file: "pdf", parentId: 21 },
    {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, },
    {id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4, },
    {id: 7, file: "xls", parentId: 21 },
    {id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7, },
    {id: 9, file: "misc", parentId: 21 },
    {id: 10, file: "something.txt", dateModified: "2015-02-26", size: 0.4, parentId: 9, },
]
Run Code Online (Sandbox Code Playgroud)

数组看起来很扁平,但实际上,它是一个树视图结构,在数据网格中表示,如下所示。

我发现部分有效的是遍历整个数组并添加每个项目可以包含的文件的完整列表,例如,如果 Documents 有一个子 PDF,它本身有一个子 Map.pdf,那么树映射可以用 ["Documents", "PDF", "map.pdf"] 表示,我们将其存储在父对象上,然后在下一个子对象上存储 ["PDF", "map.pdf"] ,最后在我们存储的最后一个孩子 ["map.pdf"] 像这样

    {id: 21, file: "Documents", parentId: null, treeMap: ["Documents", "PDF", "map.pdf"] }
    {id: 4, file: "pdf", parentId: 21, treeMap: ["PDF", "map.pdf"] }
    {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, treeMap: ["map.pdf"] }
Run Code Online (Sandbox Code Playgroud)

这是允许我这样做的方法

export function modifyDatasetToAddTreeMapping(items: any[], treeViewColumn: Column, dataView: any) {
  for (let i = 0; i < items.length; i++) {
    items[i]['treeMap'] = [items[i][treeViewColumn.id]];
    let item = items[i];

    if (item['parentId'] !== null) {
      let parent = dataView.getItemById(item['parentId']);

      while (parent) {
        parent['treeMap'] = dedupePrimitiveArray(parent['treeMap'].concat(item['treeMap']));
        item = parent;
        parent = dataView.getItemById(item['parentId']);
      }
    }
  }
}

export function dedupePrimitiveArray(inputArray: Array<number | string>): Array<number | string> {
  const seen = {};
  const out = [];
  const len = inputArray.length;
  let j = 0;
  for (let i = 0; i < len; i++) {
    const item = inputArray[i];
    if (seen[item] !== 1) {
      seen[item] = 1;
      out[j++] = item;
    }
  }
  return out;
}
Run Code Online (Sandbox Code Playgroud)

然后 datagrid lib 使用 Filter 方法,我可以使用这种方式,其中columnFilters包含 1 个或多个过滤器的对象,例如const columnFilters = { file: 'map', size: '>3' }

数据网格是一个库(SlickGrid),它使用像这样的过滤器方法 dataView.setFilter(treeFilter);

function treeFilter(dataView: any, item: any) {
    const columnFilters = { file: this.searchString.toLowerCase(), size: 2 };
    let filterCount = 0;

    if (item[parentPropName] !== null) {
      let parent = dataView.getItemById(item['parentId']);
      while (parent) {
        if (parent.__collapsed) {
          return false;
        }
        parent = dataView.getItemById(parent['parentId']);
      }
    }

    for (const columnId in columnFilters) {
      if (columnId !== undefined && columnFilters[columnId] !== '') {
        filterCount++;

        if (item.treeMap === undefined || !item.treeMap.find((itm: string) => itm.endsWith(columnFilters[columnId]))) {
          return false;
        }
      }
    }
    return true;
  }
Run Code Online (Sandbox Code Playgroud)

modifyDatasetToAddTreeMapping()如果我想在 File 列上过滤,调用它就可以了,但是如果我添加更多的列过滤器,它就不能按预期工作。例如,如果您查看第二个打印屏幕,您会看到我输入了“地图”,这将显示“文档 > PDF > map.pdf”,这很好,但是如果添加的文件大小小于 3Mb,它应该't display "map.pdf" 并且因为该文件未显示并且 "Documents > PDF" 不包含 "map" 一词,因此不应显示任何内容,因此您可以看到过滤器的行为不正常。

因此,对于当前的实现,我有 2 个问题 1. 未显示树节点时行为不正确,不应显示它的父节点 2. 必须调用modifyDatasetToAddTreeMapping()是可能不需要的额外调用 3. 它还修改源数组,我可以深度克隆数组,但这将是另一项性能开销

在转换为层次结构(树)之后,这可能可以通过递归来实现,但是如果使用递归,我无法找出执行此操作的最佳算法,总是向下钻取树以查找项目不是很昂贵吗?

最后,目的是将它与可能有 10k 甚至 50k 行的 SlickGrid 一起使用,因此它必须很快。您可以看到此SlickGrid 演示,但它们的过滤实现不正确,我还发现该方法在其他SO Answer 中添加映射

注意:我还想指出,这个问题的解决方案可能会使数百(或数千)个用户受益,因为它将在Angular-SlickgridAurelia-Slickgrid 中实现,它们都是开源库并由 at至少 300+ 用户。

在此处输入图片说明

使用“map”一词进行过滤不应在此处返回任何内容,因为没有任何节点/子节点具有该文本。 在此处输入图片说明

编辑

最好的代码是将完成这项工作的任何代码插入到常规的 JS 中filter,这意味着最终的解决方案myFilter将是一个filter回调方法。我坚持这样做的原因是因为我使用外部库SlickGrid并且我必须使用该库所提供的作为公开的公共方法。

function myFilter(item, args) {
  const columnFilters = args.columnFilters;

  // iterate through each items of the dataset
  // return true/false on each item
}

// to be used as a drop in
dataView.setFilterArgs({ columnFilters: this._columnFilters });
dataView.setFilter(myFilter.bind(this));
Run Code Online (Sandbox Code Playgroud)

如果我有const columnFilters = { file: "map", size: "<3.2" };,数组的预期结果将是 4 行

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  {id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4, }
]
Run Code Online (Sandbox Code Playgroud)

如果我有const columnFilters = { file: "map", size: "<3" };,数组的预期结果将是 3 行

// result
[
  {id: 21, file: "Documents", parentId: null },
  {id: 4, file: "pdf", parentId: 21, },
  {id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
]
Run Code Online (Sandbox Code Playgroud)

最后,如果我有const columnFilters = { file: "map", size: ">3" };那么预期的结果将是一个空数组,因为没有一个文件具有该字符和文件大小条件。

编辑 2

从@AlexL 的回答来看,它开始起作用了。只是一些调整,但它看起来已经很有希望了 在此处输入图片说明

编辑 3

感谢 Alex 出色的工作,他的回答帮助我将其合并到我的开源库中。我现在有 2 个带有父/子引用(平面数据集)和分层数据集(树数据集)的现场演示。我希望我能不止一次投票 :)

Ale*_*x L 5

我有办法做到这一点。它应该具有相当的性能,但我们可能还想将 map 和 reduce 等替换为旧的 for 循环以进一步优化速度(我看过各种博客和文章,比较了 forEach、map 等与 for 循环和 for 的速度) -循环似乎赢了)

这是一个演示(也在这里:https : //codepen.io/Alexander9111/pen/abvojzN):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

//example how to use the "<3" string - better way than using eval():
const columnFilters = { file: "map", size: "<3.2" }; //, size: "<3" 
const isSizeValid = Function("return " + myFiles[11].size + "<3")();
//console.log(isSizeValid);

const myObj = myFiles.reduce((aggObj, child) => {
  aggObj[child.id] = child;
  //the filtered data is used again as each subsequent letter is typed
  //we need to delete the ._used property, otherwise the logic below
  //in the while loop (which checks for parents) doesn't work:
  delete aggObj[child.id]._used;
  return aggObj;
}, {});

function filterMyFiles(myArray, columnFilters){
  const filteredChildren = myArray.filter(a => {
    for (let key in columnFilters){
      //console.log(key)      
      if (a.hasOwnProperty(key)){
        const strContains =  String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>=]?\d+)?\s*))+$/;
        const comparison = re.test(columnFilters[key]) && Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison){
          //don't return true as need to check other keys in columnFilters
        }else{
          //console.log('false', a)
          return false;
        }
      } else{
        return false;
      }           
    }
    //console.log('true', a)
    return true;
  })
  return filteredChildren;
}

const initFiltered = filterMyFiles(myFiles, columnFilters);

const finalWithParents = initFiltered.map(child => {
  const childWithParents = [child];
  let parent = myObj[child.parentId];
  while (parent){
    //console.log('parent', parent)
    parent._used || childWithParents.unshift(parent)
    myObj[parent.id]._used = true;
    parent = myObj[parent.parentId] || false;    
  }
  return childWithParents;
}).flat();

console.log(finalWithParents)
Run Code Online (Sandbox Code Playgroud)
.as-console-wrapper { max-height: 100% !important; top: 0; }
Run Code Online (Sandbox Code Playgroud)

基本上设置一个对象以供以后用于查找所有父母。

然后执行一个过滤器(即数组的一次迭代)并过滤那些与 columnFilters 对象中的条件匹配的过滤器。

然后在这个过滤后的数组上进行映射(即一次迭代)并使用在开始时创建的对象找到每个父对象(因此嵌套迭代到 N 深度)。

使用 .flat() 将数组展平(假设最后一次迭代),然后我们就完成了。

有任何问题请告诉我。

更新 - For 循环方法加上尝试减少对数组的迭代

删减几次迭代:)(https://codepen.io/Alexander9111/pen/MWagdVz):

const myFiles = [
  { id: 11, file: "Music", parentId: null },
  { id: 12, file: "mp3", parentId: 11 },
  { id: 14, file: "pop", parentId: 12 },
  { id: 15, file: "theme.mp3", dateModified: "2015-03-01", size: 85,  parentId: 14 },
  { id: 16, file: "rock", parentId: 12 },
  { id: 17, file: "soft.mp3", dateModified: "2015-05-13", size: 98, parentId: 16 },
  { id: 18, file: "else.txt", dateModified: "2015-03-03", size: 90, parentId: null },
  { id: 21, file: "Documents", parentId: null },
  { id: 2, file: "txt", parentId: 21 },
  { id: 3, file: "todo.txt", dateModified: "2015-05-12", size: 0.7, parentId: 2 },
  { id: 4, file: "pdf", parentId: 21 },
  { id: 22, file: "map2.pdf", dateModified: "2015-05-21", size: 2.9, parentId: 4 },
  { id: 5, file: "map.pdf", dateModified: "2015-05-21", size: 3.1, parentId: 4 },
  { id: 6, file: "internet-bill.pdf", dateModified: "2015-05-12", size: 1.4, parentId: 4 },
  { id: 7, file: "xls", parentId: 21 },
  { id: 8, file: "compilation.xls", dateModified: "2014-10-02", size: 2.3, parentId: 7 },
  { id: 9, file: "misc", parentId: 21 },
  { id: 10,  file: "something.txt", dateModified: "2015-02-26", size: 0.4,  parentId: 9 }
];

const columnFilters = { file: "map", size: "<3.2" };
console.log(customLocalFilter(myFiles, columnFilters));

function customLocalFilter(array, filters){  
  const myObj = {};
  for (let i = 0; i < myFiles.length; i++) {
    myObj[myFiles[i].id] = myFiles[i];
    //the filtered data is used again as each subsequent letter is typed
    //we need to delete the ._used property, otherwise the logic below
    //in the while loop (which checks for parents) doesn't work:
    delete myObj[myFiles[i].id]._used;
  }

  const filteredChildrenAndParents = [];
  for (let i = 0; i < myFiles.length; i++) {
    const a = myFiles[i];
    let matchFilter = true;
    for (let key in columnFilters) {
      if (a.hasOwnProperty(key)) {
        const strContains = String(a[key]).includes(columnFilters[key]);
        const re = /(?:(?:^|[-+<>!=_*/])(?:\s*-?\d+(\.\d+)?(?:[eE][+-<>!=]?\d+)?\s*))+$/;
        const comparison =
          re.test(columnFilters[key]) &&
          Function("return " + a[key] + columnFilters[key])();
        if (strContains || comparison) {
          //don't return true as need to check other keys in columnFilters
        } else {
          matchFilter = false;
          continue;
        }
      } else {
        matchFilter = false;
        continue;
      }
    }
    if (matchFilter) {
      const len = filteredChildrenAndParents.length;
      filteredChildrenAndParents.splice(len, 0, a);
      let parent = myObj[a.parentId] || false;
      while (parent) {
        //only add parent if not already added:
        parent._used || filteredChildrenAndParents.splice(len, 0, parent);
        //mark each parent as used so not used again:
        myObj[parent.id]._used = true;
        //try to find parent of the current parent, if exists:
        parent = myObj[parent.parentId] || false;
      }
    }
  }
  return filteredChildrenAndParents;
}
Run Code Online (Sandbox Code Playgroud)
.as-console-wrapper { max-height: 100% !important; top: 0; }
Run Code Online (Sandbox Code Playgroud)