Esb*_*rdt 5 directory tree recursion r summarization
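I am using a data.tree structure to summarize various pieces of information about a folder of files. Each folder contains a number of files (Value), and for each folder I need to total how many files that folder plus all of its subfolders contain.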

Example data:

library(data.tree)
data <- data.frame(pathString = c("MainFolder",
                                  "MainFolder/Folder1",
                                  "MainFolder/Folder2",
                                  "MainFolder/Folder3",
                                  "MainFolder/Folder1/Subfolder1",
                                  "MainFolder/Folder1/Subfolder2"),
                   Value = c(1,1,5,2,4,10))
tree <- as.Node(data, Value)
print(tree, "Value")
           levelName Value
1 MainFolder             1
2  ¦--Folder1            1
3  ¦   ¦--Subfolder1     4
4  ¦   °--Subfolder2    10
5  ¦--Folder2            5
6  °--Folder3            2

My current solution to the problem is very slow:

# Function to sum up file counts per folder + subfolders
total_count <- function(node) {
  results <- sum(as.data.frame(print(node, "Value"))$Value)
  return(results)
}

# Summing up file counts per folder + subfolders
tree$Do(function(node) node$Value_by_folder <- total_count(node))

# Results
print(tree, "Value", "Value_by_folder")
           levelName Value Value_by_folder
1 MainFolder             1              23
2  ¦--Folder1            1              15
3  ¦   ¦--Subfolder1     4               4
4  ¦   °--Subfolder2    10              10
5  ¦--Folder2            5               5
6  °--Folder3            2               2

Do you have any suggestions on how to do this more efficiently? I have been trying to build a recursive method using the "isLeaf" and "children" functions on the nodes, but could not get it to work.

Here is an efficient way to do this. It uses the data.tree API and stores the values in the tree:

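MyAggregate <- function(node) {
  # A leaf's total is just its own value
  if (node$isLeaf) return(node$Value)
  # Otherwise: the children's totals (already computed) plus the node's own value
  sum(Get(node$children, "Value_by_folder")) + node$Value
}
# Post-order traversal visits children before their parent,
# so Value_by_folder is set on every child before the parent reads it
tree$Do(function(node) node$Value_by_folder <- MyAggregate(node), traversal = "post-order")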
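
For comparison, the explicit recursion the question was attempting with "isLeaf" and "children" could look roughly like the sketch below. It is only an illustration, not part of the data.tree API: `set_totals` is a hypothetical helper name, and the sketch assumes every node carries a numeric `Value` field, as in the sample data. Each call stores the subtree total on its node and returns it, so every node is visited exactly once.

```r
library(data.tree)

# Sample data from the question
data <- data.frame(
  pathString = c("MainFolder",
                 "MainFolder/Folder1",
                 "MainFolder/Folder2",
                 "MainFolder/Folder3",
                 "MainFolder/Folder1/Subfolder1",
                 "MainFolder/Folder1/Subfolder2"),
  Value = c(1, 1, 5, 2, 4, 10)
)
tree <- as.Node(data)

# Hypothetical helper (not provided by data.tree).
# Assumes every node has a numeric Value field.
set_totals <- function(node) {
  total <- node$Value
  if (!node$isLeaf) {
    # Recurse into each child and add up the totals they return
    total <- total + sum(vapply(node$children, set_totals, numeric(1)))
  }
  # Nodes are reference objects, so this assignment updates the tree in place
  node$Value_by_folder <- total
  total
}

invisible(set_totals(tree))

# Named vector of totals, one entry per node
totals <- tree$Get("Value_by_folder")
print(tree, "Value", "Value_by_folder")
```

The printed tree should match the Value_by_folder column shown in the question. The post-order `Do` version above does the same work without hand-written recursion, which is probably the more idiomatic choice when the tree is already a data.tree object.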