Removing attributes from data read in with readr::read_csv

con*_*nor 4 r readr

readr::read_csv adds attributes that are not updated when the data is edited. For example,

    library('tidyverse')
    df <- read_csv("A,B,C\na,1,x\nb,1,y\nc,1,z")

    # Remove columns with only one distinct entry
    no_info <- df %>% sapply(n_distinct)
    no_info <- names(no_info[no_info==1])

    df2 <- df %>%
      select(-no_info)

Inspecting the structure, we see that column B is still present in the attributes of df2:

    > str(df)
    Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    3 obs. of  3 variables:
     $ A: chr  "a" "b" "c"
     $ B: num  1 1 1
     $ C: chr  "x" "y" "z"
     - attr(*, "spec")=
      .. cols(
      ..   A = col_character(),
      ..   B = col_double(),
      ..   C = col_character()
      .. )
    > str(df2)
    Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    3 obs. of  2 variables:
     $ A: chr  "a" "b" "c"
     $ C: chr  "x" "y" "z"
     - attr(*, "spec")=
      .. cols(
      ..   A = col_character(),
      ..   B = col_double(),
      ..   C = col_character()
      .. )
    > attributes(df2)
    $class
    [1] "spec_tbl_df" "tbl_df"      "tbl"         "data.frame"

    $row.names
    [1] 1 2 3

    $spec
    cols(
      A = col_character(),
      B = col_double(),
      C = col_character()
    )

    $names
    [1] "A" "C"

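How can I remove a column (or make any other update to the data) and have the change accurately reflected in the new data structure and its attributes?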


mt1*_*022 5

You can remove the column specification by setting it to NULL:

    > attr(df, 'spec') <- NULL
    > str(df)
    Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   3 obs. of  3 variables:
     $ A: chr  "a" "b" "c"
     $ B: int  1 1 1
     $ C: chr  "x" "y" "z"
    > df
    # A tibble: 3 x 3
      A         B C
      <chr> <int> <chr>
    1 a         1 x
    2 b         1 y
    3 c         1 z
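To apply the same idea after dropping a column, one option is to wrap the reset in a small helper. This is only a sketch: `drop_spec` is a hypothetical name, not a readr function, and it assumes the stale information lives only in the "spec" attribute and the "spec_tbl_df" class shown in the question. Depending on the package versions installed, subsetting may already have dropped both, in which case the helper simply changes nothing.

```r
library(readr)

# Hypothetical helper (not part of readr): remove the parse-time column
# specification and the matching "spec_tbl_df" class from a data frame.
drop_spec <- function(x) {
  attr(x, "spec") <- NULL
  class(x) <- setdiff(class(x), "spec_tbl_df")
  x
}

df <- read_csv("A,B,C\na,1,x\nb,1,y\nc,1,z")

# Drop column B, then clear whatever specification is still attached.
df2 <- drop_spec(df[, c("A", "C")])

names(attributes(df2))  # inspect: "spec" should no longer be listed
class(df2)              # inspect: "spec_tbl_df" should no longer be listed
```

The helper uses only base R (`attr<-`, `class<-`, `setdiff`), so it does not depend on any particular readr behaviour beyond the attribute and class names seen in the `str()` output.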

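  • @Connor Yes. I did some searching but did not find any function that updates it. However, the column specification is not used anywhere else. It only tells you how read_csv parsed each column during reading. AFAIK, it is safe to discard and unlikely to have any adverse consequences. (2 upvotes)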