Total revenue generated by routes in R

use*_*360 5 r permutation

I have a dataset with an origin ("from"), a destination ("to"), and a price, as follows:

from    to  price
A       B   28109
A       D   2356
A       E   4216
B       A   445789
B       D   123
D       A   45674
D       B   1979

I want the total price of each route to include the return leg as well. For example, A - B consists of the following rows:

from    to  price
  A     B   28109
  B     A   445789

I then take the sum of the prices (28109 + 445789). The output would look like this:

route   total_price
A - B   473898
A - D   48030
A - E   4216
B - D   2102

I was thinking of running a for loop, but my dataset is large (800k rows). Any help would be greatly appreciated. Thanks a lot.

Ice*_*can 6

You can do this by sorting each from-to pair, then grouping by the sorted pair and summing.

Edit: see @JasonAizkalns's answer for a tidyverse equivalent.

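library(data.table)
setDT(df)  # convert df to a data.table by reference

# pmin/pmax put each from/to pair in alphabetical order, so A -> B and
# B -> A share the key "A - B"; price is then summed within each key
df[, .(total_price = sum(price))
   , by = .(route = paste(pmin(from, to), '-', pmax(from, to)))]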

#    route total_price
# 1: A - B      473898
# 2: A - D       48030
# 3: A - E        4216
# 4: B - D        2102
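
To see why the sorted pair works as a grouping key, here is a minimal base-R sketch on a few made-up endpoints (the vectors `from` and `to` below are illustrative, not the question's data). It assumes the columns are character vectors, which is what `fread` produces by default; factor columns would probably need `as.character()` first.

```r
# Toy endpoints: rows 1 and 3 are the same route in opposite directions,
# and so are rows 2 and 4
from <- c("A", "A", "B", "D")
to   <- c("B", "D", "A", "A")

# pmin/pmax compare element-wise, so each pair comes out in alphabetical order
route <- paste(pmin(from, to), "-", pmax(from, to))
route
# [1] "A - B" "A - D" "A - B" "A - D"
```

Because both directions map to the same string, grouping by `route` adds the outbound and return prices together.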

@Frank notes that this result hides the fact that route "A - E" is not complete, in the sense that there is no row of the original data with from == 'E' and to == 'A'. He's offered a good way of capturing that info (and more), and I've added some others below.

df[, .(total_price = sum(price), complete = .N > 1)
   , by = .(route = paste(pmin(from, to), '-', pmax(from, to)))]

#    route total_price complete
# 1: A - B      473898     TRUE
# 2: A - D       48030     TRUE
# 3: A - E        4216    FALSE
# 4: B - D        2102     TRUE

df[, .(total_price = sum(price), paths_counted = .(paste(from, '-', to)))
   , by = .(route = paste(pmin(from, to), '-', pmax(from, to)))]

#    route total_price paths_counted
# 1: A - B      473898   A - B,B - A
# 2: A - D       48030   A - D,D - A
# 3: A - E        4216         A - E
# 4: B - D        2102   B - D,D - B
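
One caveat with `complete = .N > 1`: it only counts rows, so a route listed twice in the same direction would also be flagged as complete. If such duplicates can occur in your data (an assumption; the sample data has none), a stricter check is to count the distinct origins per route with data.table's `uniqueN`. The toy table below is made up for illustration.

```r
library(data.table)

# Toy data (hypothetical): "A -> E" appears twice, "E -> A" never does
toy <- data.table(
  from  = c("A", "B", "A", "A"),
  to    = c("B", "A", "E", "E"),
  price = c(10, 20, 5, 7)
)

res <- toy[, .(total_price = sum(price),
               n_rows      = .N,
               # two distinct origins means both directions are present
               complete    = uniqueN(from) > 1)
           , by = .(route = paste(pmin(from, to), "-", pmax(from, to)))]

res$n_rows    # 2 rows for each route
res$complete  # TRUE for "A - B", FALSE for "A - E"
```

Here `.N > 1` would report `A - E` as complete, while the distinct-origin check does not.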

Data used

df <- fread('
from    to  price
A       B   28109
A       D   2356
A       E   4216
B       A   445789
B       D   123
D       A   45674
D       B   1979')