Summarizing an array of hashes with a crosstab (also called a "pivot table")

Tom*_*hes 1 ruby aggregate

I have an array of hashes like this:

my_array_of_hashes = [
  { :customer=>"Matthew",
    :fruit=>"Apples",
    :quantity=>2,
    :order_month => "January"
  },
  { :customer => "Philip",
    :fruit => "Oranges",
    :quantity => 3,
    :order_month => "July"
  },
  { :customer => "Matthew",
    :fruit => "Oranges",
    :quantity => 1,
    :order_month => "March"
  },
  { :customer => "Charles",
    :fruit => "Pears",
    :quantity => 3,
    :order_month => "January"
  },
  { :customer => "Philip",
    :fruit => "Apples",
    :quantity => 2,
    :order_month => "April"
  },
  { :customer => "Philip",
    :fruit => "Oranges",
    :quantity => 1,
    :order_month => "July"
  }
]

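I would like to summarize it in a row-column format. With my sample data, that means summing the :quantity values, with one row per unique customer and one column per unique fruit.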

-----------------------------------
Customer | Apples | Oranges | Pears
Charles  |        |         |   3            
Matthew  |   2    |    1    |
Philip   |   2    |    4    |
-----------------------------------

This feels like a problem that Ruby's enumerables can solve, but I can't work out how.

Car*_*and 5

Create the arrays needed to build the table

I will construct three arrays containing the row labels (customers), the column labels (fruit), and the values in the table (values).

arr_of_hash = [
  {:customer=>"Matthew", :fruit=>"Apples",  :quantity=>2, :order_month=>"January"},
  {:customer=>"Philip",  :fruit=>"Oranges", :quantity=>3, :order_month=>"July"   },
  {:customer=>"Matthew", :fruit=>"Oranges", :quantity=>1, :order_month=>"March"  },
  {:customer=>"Charles", :fruit=>"Pears",   :quantity=>3, :order_month=>"January"},
  {:customer=>"Philip",  :fruit=>"Apples",  :quantity=>2, :order_month=>"April"  }, 
  {:customer=>"Philip",  :fruit=>"Oranges", :quantity=>1, :order_month=>"July"   }
]

customers = arr_of_hash.map { |g| g[:customer] }.uniq.sort
  #=> ["Charles", "Matthew", "Philip"]
fruit = arr_of_hash.map { |g| g[:fruit] }.uniq.sort
  #=> ["Apples", "Oranges", "Pears"]
h = customers.each_with_object({}) { |cust,h| h[cust] = fruit.product([0]).to_h }
  #=> {"Charles"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>0},
  #    "Matthew"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>0},
  #    "Philip" =>{"Apples"=>0, "Oranges"=>0, "Pears"=>0}} 
arr_of_hash.each do |g|
  customer = g[:customer]
  h[customer][g[:fruit]] += g[:quantity]
end
values = h.map { |_,v| v.values }
  #=> [[0, 0, 3],
  #    [2, 1, 0],
  #    [2, 4, 0]] 

Note that immediately before values = h.map { |_,v| v.values } is executed:

  h #=> {"Charles"=>{"Apples"=>0, "Oranges"=>0, "Pears"=>3},
  #      "Matthew"=>{"Apples"=>2, "Oranges"=>1, "Pears"=>0},
  #      "Philip" =>{"Apples"=>2, "Oranges"=>4, "Pears"=>0}} 
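The same three arrays can also be built in a single pass over the data. The following is only a sketch of one alternative, not the method used above: it relies on a nested `Hash.new` default so that no zero-filled hash has to be prepared in advance. The names `orders` and `totals` are chosen here for illustration and do not appear in the original answer.

```ruby
# Sample data from the question (the variable name `orders` is illustrative).
orders = [
  { customer: "Matthew", fruit: "Apples",  quantity: 2, order_month: "January" },
  { customer: "Philip",  fruit: "Oranges", quantity: 3, order_month: "July"    },
  { customer: "Matthew", fruit: "Oranges", quantity: 1, order_month: "March"   },
  { customer: "Charles", fruit: "Pears",   quantity: 3, order_month: "January" },
  { customer: "Philip",  fruit: "Apples",  quantity: 2, order_month: "April"   },
  { customer: "Philip",  fruit: "Oranges", quantity: 1, order_month: "July"    }
]

# An unseen customer gets a fresh inner hash; an unseen fruit reads as 0.
nested = Hash.new { |hash, key| hash[key] = Hash.new(0) }

# Accumulate quantities per customer and fruit.
totals = orders.each_with_object(nested) do |order, acc|
  acc[order[:customer]][order[:fruit]] += order[:quantity]
end

customers = totals.keys.sort
fruit     = orders.map { |order| order[:fruit] }.uniq.sort

# Reading a missing fruit returns the default 0 without storing it.
values = customers.map { |cust| fruit.map { |f| totals[cust][f] } }
  #=> [[0, 0, 3], [2, 1, 0], [2, 4, 0]]
```

The trade-off is that `totals` only stores the combinations that actually occur, so the zeros appear only when `values` is assembled.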

Print the table

def print_table(row_labels_title, row_labels, col_labels, values, gap_size=3)
  # Use the printed length of each value; Integer#size returns a byte count, not a digit count
  col_width = [values.flatten.map { |v| v.to_s.size }.max, col_labels.max_by(&:size).size].max + gap_size
  row_labels_width = [row_labels_title.size, row_labels.max_by(&:size).size].max +
    gap_size
  horiz_line = '-'*(row_labels_width + col_labels.size * col_width + col_labels.size)
  puts horiz_line
  print row_labels_title.ljust(row_labels_width)
  col_labels.each do |s|
    print "|#{s.center(col_width)}"
  end
  puts
  # Pair each row label with its row of values without mutating the caller's arrays
  row_labels.zip(values).each do |row_label, vals|
    print row_label.ljust(row_labels_width)
    vals.each do |val|
      print "|#{val.to_s.center(col_width)}"
    end
    puts
  end
  puts horiz_line
end

print_table("Customers", customers, fruit, values, 2)
-----------------------------------------
Customers  | Apples  | Oranges |  Pears  
Charles    |    0    |    0    |    3    
Matthew    |    2    |    1    |    0    
Philip     |    2    |    4    |    0    
-----------------------------------------

  • @DiodonHystrix, yes, of course. Thank you. For earlier versions of Ruby, the line that uses `.to_h` can be replaced with `h = customers.each_with_object({}) { |cust,h| h[cust] = Hash[fruit.product([0])] }`. (2 upvotes)
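As a quick check of the comment above, this small sketch compares the two ways of turning the `[fruit, 0]` pairs into a hash. The variable names `pairs`, `zero_row_new`, `zero_row_old` and `acc` are illustrative only.

```ruby
fruit     = ["Apples", "Oranges", "Pears"]
customers = ["Charles", "Matthew", "Philip"]

# Pair every fruit with an initial count of 0.
pairs = fruit.product([0])
  #=> [["Apples", 0], ["Oranges", 0], ["Pears", 0]]

zero_row_new = pairs.to_h   # Array#to_h, available from Ruby 2.1
zero_row_old = Hash[pairs]  # Hash[] also works on older Rubies

# The block runs once per customer, so each customer gets its own inner hash.
h = customers.each_with_object({}) { |cust, acc| acc[cust] = Hash[fruit.product([0])] }
h["Philip"]["Oranges"] += 4
```

Since every customer has a separate inner hash, updating Philip's count leaves the other rows at 0.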