xav*_*xav 6 r reshape dataframe
I'm currently learning to use data.frame and am confused about how to reorder them.
At the moment, I have a data.frame that, shown visually, looks like this:
+---+-----------+-------+----------+
|   | Shop.Name | Items | Product  |
+---+-----------+-------+----------+
| 1 | Shop1     | 2     | Product1 |
| 2 | Shop1     | 4     | Product2 |
| 3 | Shop2     | 3     | Product1 |
| 4 | Shop3     | 2     | Product1 |
| 5 | Shop3     | 1     | Product4 |
+---+-----------+-------+----------+
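What I'd like to achieve is the following "shop-centric" structure. If there is no row for a particular shop/product (because there were no sales), I'd like a 0 to be created: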
+---+-------+-------+-------+-------+-------+-----+
|   | Shop  | Prod1 | Prod2 | Prod3 | Prod4 | ... |
+---+-------+-------+-------+-------+-------+-----+
| 1 | Shop1 | 2     | 4     | 0     | 0     | ... |
| 2 | Shop2 | 3     | 0     | 0     | 0     | ... |
| 3 | Shop3 | 2     | 0     | 0     | 1     | ... |
+---+-------+-------+-------+-------+-------+-----+
A5C*_*2T1 12
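The answers so far work to a degree, but they don't fully answer your question. In particular, they don't address the case where no shop sells a particular product. Going by your sample input and desired output, no shop sold "Product3". In fact, "Product3" doesn't even appear in your source data.frame. They also don't address the possibility of having more than one row for a given shop + product combination.
Here's a modified version of your data, followed by the two solutions offered so far. I've added another row for the combination of "Shop1" and "Product1". Note that I've converted your products to a factor variable that includes all the levels the variable can take, even if none of the cases actually has that level.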
mydf <- data.frame(
  Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
  Items = c(2, 4, 3, 2, 1, 2),
  Product = factor(
    c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
    levels = c("Product1", "Product2", "Product3", "Product4")))
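To see why the factor conversion matters, here is a small base-R sketch. The object names (`products`, `chr_counts`, `fct`, `fct_counts`) are only illustrative and are not part of the original answer. Tabulating a factor keeps a zero count for the unused level, whereas a plain character vector has no way of knowing that "Product3" exists.

```r
products <- c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1")

# Character vector: only the values that actually occur are tabulated
chr_counts <- table(products)
names(chr_counts)

# Factor with explicit levels: the unused level "Product3" is kept in the table
fct <- factor(products,
              levels = c("Product1", "Product2", "Product3", "Product4"))
fct_counts <- table(fct)
fct_counts[["Product3"]]
```

The same mechanism is what lets the reshaping functions below produce a "Product3" column filled with zeros.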
dcast from "reshape2"
library(reshape2)
dcast(mydf, formula = Shop.Name ~ Product, value="Items", fill=0)
# Using Product as value column: use value.var to override.
# Aggregation function missing: defaulting to length
# Error in .fun(.value[i], ...) :
# 2 arguments passed to 'length' which requires 1
Huh? Suddenly it doesn't work. Do this instead:
dcast(mydf, formula = Shop.Name ~ Product,
      fill = 0, value.var = "Items",
      fun.aggregate = sum, drop = FALSE)
#   Shop.Name Product1 Product2 Product3 Product4
# 1     Shop1        4        4        0        0
# 2     Shop2        3        0        0        0
# 3     Shop3        2        0        0        1
Let's be old-school. cast from "reshape"
library(reshape)
cast(mydf, formula = Shop.Name ~ Product, value="Items", fill=0)
# Aggregation requires fun.aggregate: length used as default
#   Shop.Name Product1 Product2 Product4
# 1     Shop1        2        1        0
# 2     Shop2        1        0        0
# 3     Shop3        1        0        1
Eh. Not what you wanted... Try this instead:
cast(mydf, formula = Shop.Name ~ Product,
     value = "Items", fill = 0,
     add.missing = TRUE, fun.aggregate = sum)
#   Shop.Name Product1 Product2 Product3 Product4
# 1     Shop1        4        4        0        0
# 2     Shop2        3        0        0        0
# 3     Shop3        2        0        0        1
Let's go back to basics. xtabs from base R
xtabs(Items ~ Shop.Name + Product, mydf)
#          Product
# Shop.Name Product1 Product2 Product3 Product4
#     Shop1        4        4        0        0
#     Shop2        3        0        0        0
#     Shop3        2        0        0        1
Or, if you prefer a data.frame (note that your "Shop.Name" variable has been converted to the row.names of the data.frame):
as.data.frame.matrix(xtabs(Items ~ Shop.Name + Product, mydf))
#       Product1 Product2 Product3 Product4
# Shop1        4        4        0        0
# Shop2        3        0        0        0
# Shop3        2        0        0        1
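If you would rather have "Shop.Name" as an ordinary column again instead of row names, one possible base-R follow-up is sketched below. This step is not part of the original answer, and the object name `wide` is just a placeholder.

```r
# Same sample data as above, repeated so the snippet runs on its own
mydf <- data.frame(
  Shop.Name = c("Shop1", "Shop1", "Shop2", "Shop3", "Shop3", "Shop1"),
  Items = c(2, 4, 3, 2, 1, 2),
  Product = factor(
    c("Product1", "Product2", "Product1", "Product1", "Product4", "Product1"),
    levels = c("Product1", "Product2", "Product3", "Product4")))

# Cross-tabulate, then convert the table to a data.frame
wide <- as.data.frame.matrix(xtabs(Items ~ Shop.Name + Product, mydf))

# Copy the row names into a regular first column, then reset the row names
wide <- cbind(Shop.Name = rownames(wide), wide)
rownames(wide) <- NULL
wide
```

This gives the same layout as the dcast and cast results, using base R only.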