Extracting data from an irregular list with purrr::map()

Boa*_*ado 5 dictionary r list purrr

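Given a list containing several elements, the goal is to put them into a data frame. `map_df` works well for regular lists, but it throws an error on irregular ones.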


For example, following the tutorial, this works:

    library(purrr)
    library(repurrrsive) # The data comes from this package

    map_dfr(got_chars, magrittr::extract, c("name", "culture", "gender", "id", "born", "alive"))

    #> # A tibble: 30 x 6
    #>    name               culture  gender    id born                                   alive
    #>    <chr>              <chr>    <chr>  <int> <chr>                                  <lgl>
    #>  1 Theon Greyjoy      Ironborn Male    1022 In 278 AC or 279 AC, at Pyke           TRUE
    #>  2 Tyrion Lannister   ""       Male    1052 In 273 AC, at Casterly Rock            TRUE
    #>  3 Victarion Greyjoy  Ironborn Male    1074 In 268 AC or before, at Pyke           TRUE
    #>  4 Will               ""       Male    1109 ""                                     FALSE
    #>  5 Areo Hotah         Norvoshi Male    1166 In 257 AC or before, at Norvos         TRUE
    #>  6 Chett              ""       Male    1267 At Hag's Mire                          FALSE
    #>  7 Cressen            ""       Male    1295 In 219 AC or 220 AC                    FALSE
    #>  8 Arianne Martell    Dornish  Female   130 In 276 AC, at Sunspear                 TRUE
    #>  9 Daenerys Targaryen Valyrian Female  1303 In 284 AC, at Dragonstone              TRUE
    #> 10 Davos Seaworth     Westeros Male    1319 In 260 AC or before, at King's Landing TRUE
    #> # … with 20 more rows

However, if one element is removed from the list, the function fails.

    got_chars[[1]]["gender"] <- NULL
    map_dfr(got_chars, magrittr::extract, c("name", "culture", "gender", "id", "born", "alive"))

    #> Error: Argument 3 is a list, must contain atomic vectors

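The desired output would contain an `NA` value for the missing element. What is an elegant solution? I suspect it involves `purrr::possibly()`, but I haven't figured it out.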


jen*_*yan 6

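The development version of tidyr has powerful new "unnest" functions that can handle this kind of problematic data (Option 1). Another approach is to attack the problem column-wise, which lets you use the `.default` argument of `purrr::map()` to supply a value for missing elements (Option 2).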

    library(tidyverse)   # purrr, tidyr, and dplyr
    library(repurrrsive) # The data comes from this package

    got_chars_mutilated <- got_chars
    got_chars_mutilated[[1]]["gender"] <- NULL

    # original problem
    map_dfr(
      got_chars_mutilated,
      magrittr::extract,
      c("name", "culture", "gender", "id", "born", "alive")
    )
    #> Error: Argument 3 is a list, must contain atomic vectors

    # Option 1:
    # expanded unnest_*() functions coming soon in tidyr
    packageVersion("tidyr")
    #> [1] '0.8.99.9000'

    # automatic unnesting leads to ... unnest_wider()
    tibble(got = got_chars_mutilated) %>%
      unnest_auto(got)
    #> Using `unnest_wider(got)`; elements have {n_common} names in common
    #> # A tibble: 30 x 18
    #>    url      id name  culture born  died  alive titles aliases father mother
    #>    <chr> <int> <chr> <chr>   <chr> <chr> <lgl> <list> <list>  <chr>  <chr>
    #>  1 http…  1022 Theo… Ironbo… In 2… ""    TRUE  <chr … <chr [… ""     ""
    #>  2 http…  1052 Tyri… ""      In 2… ""    TRUE  <chr … <chr [… ""     ""
    #>  3 http…  1074 Vict… Ironbo… In 2… ""    TRUE  <chr … <chr [… ""     ""
    #>  4 http…  1109 Will  ""      ""    In 2… FALSE <chr … <chr [… ""     ""
    #>  5 http…  1166 Areo… Norvos… In 2… ""    TRUE  <chr … <chr [… ""     ""
    #>  6 http…  1267 Chett ""      At H… In 2… FALSE <chr … <chr [… ""     ""
    #>  7 http…  1295 Cres… ""      In 2… In 2… FALSE <chr … <chr [… ""     ""
    #>  8 http…   130 Aria… Dornish In 2… ""    TRUE  <chr … <chr [… ""     ""
    #>  9 http…  1303 Daen… Valyri… In 2… ""    TRUE  <chr … <chr [… ""     ""
    #> 10 http…  1319 Davo… Wester… In 2… ""    TRUE  <chr … <chr [… ""     ""
    #> # … with 20 more rows, and 7 more variables: spouse <chr>,
    #> #   allegiances <list>, books <list>, povBooks <list>, tvSeries <list>,
    #> #   playedBy <list>, gender <chr>

    # let's do it again, calling the proper function, and inspect `gender`
    tibble(got = got_chars_mutilated) %>%
      unnest_wider(got) %>%
      pull(gender)
    #>  [1] NA       "Male"   "Male"   "Male"   "Male"   "Male"   "Male"
    #>  [8] "Female" "Female" "Male"   "Female" "Male"   "Female" "Male"
    #> [15] "Male"   "Male"   "Female" "Female" "Female" "Male"   "Male"
    #> [22] "Male"   "Male"   "Male"   "Male"   "Female" "Male"   "Male"
    #> [29] "Male"   "Female"

    # Option 2:
    # attack this column-wise
    # mapping the names gives access to the `.default` argument for missing elements
    c("name", "culture", "gender", "id", "born", "alive") %>%
      set_names() %>%
      map(~ map(got_chars_mutilated, .x, .default = NA)) %>%
      map(simplify) %>%
      as_tibble()
    #> # A tibble: 30 x 6
    #>    name           culture  gender      id born                        alive
    #>    <chr>          <chr>    <list>   <int> <chr>                       <lgl>
    #>  1 Theon Greyjoy  Ironborn <lgl [1…  1022 In 278 AC or 279 AC, at Py… TRUE
    #>  2 Tyrion Lannis… ""       <chr [1…  1052 In 273 AC, at Casterly Rock TRUE
    #>  3 Victarion Gre… Ironborn <chr [1…  1074 In 268 AC or before, at Py… TRUE
    #>  4 Will           ""       <chr [1…  1109 ""                          FALSE
    #>  5 Areo Hotah     Norvoshi <chr [1…  1166 In 257 AC or before, at No… TRUE
    #>  6 Chett          ""       <chr [1…  1267 At Hag's Mire               FALSE
    #>  7 Cressen        ""       <chr [1…  1295 In 219 AC or 220 AC         FALSE
    #>  8 Arianne Marte… Dornish  <chr [1…   130 In 276 AC, at Sunspear      TRUE
    #>  9 Daenerys Targ… Valyrian <chr [1…  1303 In 284 AC, at Dragonstone   TRUE
    #> 10 Davos Seaworth Westeros <chr [1…  1319 In 260 AC or before, at Ki… TRUE
    #> # … with 20 more rows

Created on 2019-08-15 by the reprex package (v0.3.0.9000)
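In the Option 2 output above, `gender` comes out as a list column, presumably because the logical `NA` default does not share a type with the character values around it. Below is a minimal sketch of one way around that, assuming the typed mappers (`map_chr()`, `map_int()`, `map_lgl()`) are each given a type-matched `.default`. The toy list `chars` and the name `result` are hypothetical stand-ins for `got_chars_mutilated` and are not part of the original answer.

```r
library(purrr)
library(tibble)

# Hypothetical toy data standing in for got_chars_mutilated:
# the first record has no `gender` element.
chars <- list(
  list(name = "Theon Greyjoy",    id = 1022L, alive = TRUE),
  list(name = "Tyrion Lannister", gender = "Male",   id = 1052L, alive = TRUE),
  list(name = "Arianne Martell",  gender = "Female", id = 130L,  alive = TRUE)
)

# Extract each field by name with a typed mapper; `.default` supplies a
# type-matched NA whenever the element is absent from a record.
result <- tibble(
  name   = map_chr(chars, "name",   .default = NA_character_),
  gender = map_chr(chars, "gender", .default = NA_character_),
  id     = map_int(chars, "id",     .default = NA_integer_),
  alive  = map_lgl(chars, "alive",  .default = NA)
)

print(result)
```

The trade-off is that each column's type has to be spelled out by hand, whereas `unnest_wider()` in Option 1 infers the types.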

\n