use*_*611 3 sql r plyr sqldf dplyr
I have a table with two columns:
Order | CustomerID
1. A | C1
2. B | C1
3. C | C1
4. D | C2
5. B | C3
6. C | C3
7. D | C4
It is a very long table. I want output that shows:
C1 | C3 | 2 #Customer C1 and Customer C3 share 2 orders (i.e. orders, B & C)
C1 | C2 | 0 #Customer C1 and Customer C2 share 0 orders
C2 | C4 | 1 #Customer C2 and Customer C4 share 1 order (i.e. order D)
C2 | C3 | 0 #Customer C2 and Customer C3 share 0 orders
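-- Pair every customer with every later customer (a < b avoids
-- self-pairs and mirrored duplicates), then count matching orders.
select
    a.CustomerId
  , b.CustomerId
  , sum(case when a.[Order] = b.[Order] then 1 else 0 end) as SharedOrders
from t as a
  inner join t as b
    on a.CustomerId < b.CustomerId
group by a.CustomerId, b.CustomerId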
Test setup: http://rextester.com/ISSCL35174
Returns:
+------------+------------+--------------+
| CustomerId | CustomerId | SharedOrders |
+------------+------------+--------------+
| C1 | C2 | 0 |
| C1 | C3 | 2 |
| C2 | C3 | 0 |
| C1 | C4 | 0 |
| C2 | C4 | 1 |
| C3 | C4 | 0 |
+------------+------------+--------------+
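In case the linked test setup is unavailable, the query above can be reproduced locally. The following is a minimal sketch, assuming SQLite via Python's standard `sqlite3` module rather than the SQL Server dialect used in the answer; the table name `t` and the sample rows come from the question, while the added `order by` and the double-quoted `"Order"` identifier are adjustments made for this sketch.

```python
import sqlite3

# Sample data from the question: (Order, CustomerId)
rows = [
    ("A", "C1"), ("B", "C1"), ("C", "C1"),
    ("D", "C2"),
    ("B", "C3"), ("C", "C3"),
    ("D", "C4"),
]

conn = sqlite3.connect(":memory:")
# "Order" is a reserved word, so it is quoted (assumption: SQLite dialect).
conn.execute('create table t ("Order" text, CustomerId text)')
conn.executemany("insert into t values (?, ?)", rows)

query = """
select a.CustomerId
     , b.CustomerId
     , sum(case when a."Order" = b."Order" then 1 else 0 end) as SharedOrders
from t as a
  inner join t as b
    on a.CustomerId < b.CustomerId
group by a.CustomerId, b.CustomerId
order by a.CustomerId, b.CustomerId  -- added here only for a stable row order
"""

all_pairs = conn.execute(query).fetchall()
for pair in all_pairs:
    print(pair)
```

Because the join condition only requires `a.CustomerId < b.CustomerId`, every pair of customers appears in the result, including pairs with a count of 0.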
To return only the pairs that share orders:
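-- Matching on [Order] inside the join keeps only pairs with at least
-- one order in common, so count(*) is the number of shared orders.
select
    a.CustomerId
  , b.CustomerId
  , count(*) as SharedOrders
from t as a
  inner join t as b
    on a.CustomerId < b.CustomerId
   and a.[Order] = b.[Order]
group by a.CustomerId, b.CustomerId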
Returns:
+------------+------------+--------------+
| CustomerId | CustomerId | SharedOrders |
+------------+------------+--------------+
| C1 | C3 | 2 |
| C2 | C4 | 1 |
+------------+------------+--------------+
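As a cross-check on the result above, the same pairs can be derived without SQL by intersecting each customer's set of orders. This is only an illustrative sketch in plain Python, not part of the original answer; the variable names are hypothetical. It also differs from the join in one respect: sets count distinct orders, so a customer with duplicate rows for the same order would be counted once here but several times by `count(*)`.

```python
from collections import defaultdict
from itertools import combinations

# Sample data from the question: (Order, CustomerId)
rows = [
    ("A", "C1"), ("B", "C1"), ("C", "C1"),
    ("D", "C2"),
    ("B", "C3"), ("C", "C3"),
    ("D", "C4"),
]

# Collect the distinct orders placed by each customer.
orders_by_customer = defaultdict(set)
for order, customer in rows:
    orders_by_customer[customer].add(order)

# For every unordered pair of customers, keep the pair only if the
# two order sets overlap (mirrors the a.CustomerId < b.CustomerId join).
shared = {}
for a, b in combinations(sorted(orders_by_customer), 2):
    common = orders_by_customer[a] & orders_by_customer[b]
    if common:
        shared[(a, b)] = sorted(common)

for (a, b), common in shared.items():
    print(a, b, len(common), common)
```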