Nested row labels into a column

Ann*_*e C 1 r dplyr data-cleaning

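I have a CSV that appears to be the output of an Excel pivot table, with names nested as row labels above repeating groups. I would like to clean the data so that each row label is repeated in its own column, preferably using dplyr.

The data looks like this: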

dd <- data.frame(
  variables = c("Abington", "Number of Sales", "YTD Number of Sales",
                "Median Sale Price", "YTD Median Sale Price",
                "Acton", "Number of Sales", "YTD Number of Sales",
                "Median Sale Price", "YTD Median Sale Price"),
  Year1 = c(" ", 16, 50, 415000, 413500, " ", 23, 60, 799900, 704000),
  Year2 = c(" ", 8, 13, 583000, 575000, " ", 9, 39, 995000, 800000)
)

dd

variables              Year1   Year2
Abington              
Number of Sales        16      8
YTD Number of Sales    50      13
Median Sale Price      415000  583000
YTD Median Sale Price  413500  575000
Acton              
Number of Sales        23      9
YTD Number of Sales    60      39
Median Sale Price      799900  995000
YTD Median Sale Price  704000  800000

I would like it to look like this:

Town          variables               Year1   Year2
Abington      Number of Sales         16      8
Abington      YTD Number of Sales     50      13
Abington      Median Sale Price       415000  583000
Abington      YTD Median Sale Price   413500  575000
Acton         Number of Sales         23      9
Acton         YTD Number of Sales     60      39
Acton         Median Sale Price       799900  995000
Acton         YTD Median Sale Price   704000  800000

koo*_*ees 5

We can do this with tidyverse (or just dplyr & tidyr):

library(tidyverse)

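dd %>%
  # A row whose Year1 and Year2 are both " " is a town header row
  mutate(Town = ifelse(Year1 == " " & Year2 == " ", variables, NA)) %>%
  # Carry each town name down over the rows that belong to it
  fill(Town, .direction = "down") %>%
  # Drop the header rows themselves
  filter(Town != variables) %>%
  # Move Town to the first column
  relocate(Town)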

Result:

      Town             variables  Year1  Year2
1 Abington       Number of Sales     16      8
2 Abington   YTD Number of Sales     50     13
3 Abington     Median Sale Price 415000 583000
4 Abington YTD Median Sale Price 413500 575000
5    Acton       Number of Sales     23      9
6    Acton   YTD Number of Sales     60     39
7    Acton     Median Sale Price 799900 995000
8    Acton YTD Median Sale Price 704000  8e+05

Note that the blank values in Year1 and Year2 are actually a single space (" "), not empty strings or NA.
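
If the real CSV mixes several kinds of blanks (a space, an empty string, or NA), the comparison against " " will miss some header rows. The following is only a sketch of one way to make the same approach more tolerant, under two assumptions that are not stated in the question: that a row with both year columns blank is always a town header, and that the year columns should end up numeric. The name `cleaned` is just an illustrative choice. Because mixing " " with numbers in `c()` makes Year1 and Year2 character columns, 800000 is stored as "8e+05"; converting back to numeric at the end also takes care of that.

```r
library(dplyr)
library(tidyr)

dd <- data.frame(
  variables = c("Abington", "Number of Sales", "YTD Number of Sales",
                "Median Sale Price", "YTD Median Sale Price",
                "Acton", "Number of Sales", "YTD Number of Sales",
                "Median Sale Price", "YTD Median Sale Price"),
  Year1 = c(" ", 16, 50, 415000, 413500, " ", 23, 60, 799900, 704000),
  Year2 = c(" ", 8, 13, 583000, 575000, " ", 9, 39, 995000, 800000),
  stringsAsFactors = FALSE
)

cleaned <- dd %>%
  # Normalise blanks: whitespace-only cells become NA (existing NAs stay NA)
  mutate(across(c(Year1, Year2), ~ na_if(trimws(.x), ""))) %>%
  # Assumption: a row with both year columns blank is a town header
  mutate(Town = if_else(is.na(Year1) & is.na(Year2), variables, NA_character_)) %>%
  # Carry each town name down over its group
  fill(Town, .direction = "down") %>%
  # Drop the header rows
  filter(!(is.na(Year1) & is.na(Year2))) %>%
  # Convert the year columns from character to numeric
  mutate(across(c(Year1, Year2), as.numeric)) %>%
  relocate(Town)

cleaned
```

One caveat with this variant: a genuine data row that happens to have both years missing would be mistaken for a town header, so it is worth checking the distinct values of Town after running it.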