Add a column for each unique value in a given row

Question

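I'm trying to reshape my current dataset so that there is one row per user, with every unique value in the "Color" and "Food" columns (a dynamic number of values) split into its own column containing "Yes" or "No". Each user has a unique ID.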

Current format: 
ID | Name  | Color  | Food 
1  | John  | Blue   | Pizza
1  | John  | Red    | Pizza
1  | John  | Yellow | Pizza
1  | John  | Blue   | Ice Cream
1  | John  | Red    | Ice Cream
1  | John  | Yellow | Ice Cream
2  | Kelly | Blue   | Pizza
2  | Kelly | Red    | Pizza


Desired format: 
ID | Name  | Color_Blue | Color_Red | Color_Yellow | Food_Pizza | Food_Ice Cream |
1  | John  | Yes        | Yes       | Yes          | Yes        | Yes            |
2  | Kelly | Yes        | Yes       | No           | Yes        | No             |

Answer 1

Jon*_*ing 8

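library(dplyr); library(tidyr)
df %>% 
  # stack Color and Food into name/value pairs
  pivot_longer(-c(ID:Name)) %>%
  # build the future column names, e.g. "Color_Blue"
  unite("col", c(name, value)) %>%
  # keep one row per user and column name
  distinct(ID, Name, col) %>%
  mutate(val = "Yes") %>%
  # spread back to one row per user; missing combinations become "No"
  pivot_wider(names_from = col, values_from = "val", values_fill = "No")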

# A tibble: 2 x 7
  ID    Name  Color_Blue Food_Pizza Color_Red Color_Yellow `Food_Ice Cream`
  <chr> <chr> <chr>      <chr>      <chr>     <chr>        <chr>           
1 1     John  Yes        Yes        Yes       Yes          Yes             
2 2     Kelly Yes        Yes        Yes       No           No

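If you'd like a base R equivalent, here is one approach that follows the same steps. (Can someone help me figure out how to remove the row names and the "val." prefix attached to the final column names?)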

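# as.data.frame(): the sample df is a tibble, while reshape() is written for plain data frames
df2 <- reshape(as.data.frame(df), 
        direction = "long", 
        varying = c("Color", "Food"),
        v.names = "Value",
        timevar = "col_name",
        times = c("Color", "Food"))
# build the future column names, e.g. "Color_Blue"
df2$col <- paste(df2$col_name, df2$Value, sep = "_")

# keep one row per user and column name
df3 <- unique(df2[c("ID", "Name", "col")])
df3$val <- "Yes"

df4 <- reshape(df3,
               direction = "wide",
               idvar = c("ID", "Name"),
               timevar = "col")
# combinations that never occurred come back as NA
df4[is.na(df4)] <- "No"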

> df4
        ID  Name val.Color_Blue val.Color_Red val.Color_Yellow val.Food_Pizza val.Food_Ice Cream
1.Color  1  John            Yes           Yes              Yes            Yes                Yes
7.Color  2 Kelly            Yes           Yes               No            Yes                 No
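
One possible way to handle the cleanup asked about above, offered as a sketch rather than as part of the original answer: reset the row names with `rownames(df4) <- NULL` and strip the leading "val." with `sub()`. The block rebuilds the sample data as a plain `data.frame` (an assumption made so that it runs without any packages) and repeats the same base R steps before the two cleanup lines.

```r
# Sample data as a plain data.frame (assumption: base R only, no tibble)
df <- data.frame(
  ID    = c("1", "1", "1", "1", "1", "1", "2", "2"),
  Name  = c(rep("John", 6), rep("Kelly", 2)),
  Color = c("Blue", "Red", "Yellow", "Blue", "Red", "Yellow", "Blue", "Red"),
  Food  = c(rep("Pizza", 3), rep("Ice Cream", 3), rep("Pizza", 2)),
  stringsAsFactors = FALSE
)

# Same steps as the base R version above
df2 <- reshape(df,
               direction = "long",
               varying = c("Color", "Food"),
               v.names = "Value",
               timevar = "col_name",
               times = c("Color", "Food"))
df2$col <- paste(df2$col_name, df2$Value, sep = "_")

df3 <- unique(df2[c("ID", "Name", "col")])
df3$val <- "Yes"

df4 <- reshape(df3,
               direction = "wide",
               idvar = c("ID", "Name"),
               timevar = "col")
df4[is.na(df4)] <- "No"

# Cleanup: reset row names to the default 1, 2, ...
rownames(df4) <- NULL
# Cleanup: drop the "val." prefix that reshape() puts before each new column name
names(df4) <- sub("^val\\.", "", names(df4))

df4
```

The `^` anchor in the pattern restricts the replacement to a prefix, so a value that happens to contain "val." elsewhere in its name is left alone.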

Sample data

df <- tribble(
  ~ID, ~Name,   ~Color,   ~Food,
  "1", "John",  "Blue",   "Pizza",
  "1", "John",  "Red",    "Pizza",
  "1", "John",  "Yellow", "Pizza",
  "1", "John",  "Blue",   "Ice Cream",
  "1", "John",  "Red",    "Ice Cream",
  "1", "John",  "Yellow", "Ice Cream",
  "2", "Kelly", "Blue",   "Pizza",
  "2", "Kelly", "Red",    "Pizza")
