Overriding "Variables not shown" in dplyr, to display all columns of a df

Hug*_*ugh 34 r options output-formatting displayformat dplyr

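When a local data frame has many columns, I sometimes get the message Variables not shown, as in this (contrived) example, which only needed enough columns to trigger it.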

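library(dplyr)
library(ggplot2) # for movies (newer ggplot2 releases moved it to the ggplot2movies package)

# %>% replaces the old %.% operator, which recent dplyr versions no longer provide
movies %>% 
 group_by(year) %>% 
 summarise(Length = mean(length), Title = max(title), 
  Dramaz = sum(Drama), Actionz = sum(Action), 
  Action = sum(Action), Comedyz = sum(Comedy)) %>% 
 mutate(Year1 = year + 1)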

   year    Length                       Title Dramaz Actionz Action Comedyz
1  1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
2  1894  1.000000           Sioux Ghost Dance      0       0      0       0
3  1902  3.555556     Voyage dans la lune, Le      1       0      0       2
4  1893  1.000000            Blacksmith Scene      0       0      0       0
5  1912 24.382353            Unseen Enemy, An     22       0      0       4
6  1922 74.192308      Trapped by the Mormons     20       0      0      16
7  1895  1.000000                 Photographe      0       0      0       0
8  1909  9.266667              What Drink Did     14       0      0       7
9  1900  1.437500      Uncle Josh's Nightmare      2       0      0       5
10 1919 53.461538     When the Clouds Roll by     17       2      2      29
..  ...       ...                         ...    ...     ...    ...     ...
Variables not shown: Year1 (dbl)

I'd like to see Year1! How can I view all the columns, preferably by default?

Mar*_*ham 48

There is (now) a way to override the width of the columns that get printed. If you run this command, all will be well:

options(dplyr.width = Inf)

It is documented here.
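As a rough sketch of how the option might be used in a session: the snippet below builds a hypothetical wide table (the name wide and its 30 columns are made up for illustration), sets the width option, prints, and then restores the previous settings. Which option name is honoured depends on the installed versions: dplyr.width is the name used in this answer, while tibble.width and pillar.width are, as far as I know, the names read by later tibble/pillar releases. Setting all three is an assumption made here to cover either case, not something the answer prescribes.

```r
library(dplyr)

# Hypothetical wide table: 3 rows, 30 columns (V1..V30), wide enough to
# overflow a typical 80-character console.
wide <- tibble::as_tibble(as.data.frame(matrix(1:90, nrow = 3, ncol = 30)))

# options() returns the previous values, so they can be restored afterwards.
# Assumption: only one of these names is actually read, depending on version;
# setting an unused option is harmless.
old <- options(dplyr.width = Inf, tibble.width = Inf, pillar.width = Inf)

print(wide)   # with the option in effect, no column should be left out

# Restore whatever was set before (unset options go back to NULL).
options(old)
```

Putting the options() call in ~/.Rprofile would make it the default for every session, which is what the question asks for.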

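  • This is a nice option, but it is less useful when there are too many columns. It happened with my df: about 200 columns were displayed, but the alignment between rows and columns was lost. Also, because there were so many characters, most rows were truncated at some point. I'd like to share the command that restores the default behaviour: `options(dplyr.width = NULL)` (3 upvotes)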

Rom*_*ois 27

You might like glimpse:

> movies %>%
+  group_by(year) %>%
+  summarise(Length = mean(length), Title = max(title),
+   Dramaz = sum(Drama), Actionz = sum(Action),
+   Action = sum(Action), Comedyz = sum(Comedy)) %>%
+  mutate(Year1 = year + 1) %>% glimpse()
Variables:
$ year    (int) 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902,...
$ Length  (dbl) 1.000000, 1.000000, 1.000000, 1.307692, 1.000000, 1.000000,...
$ Title   (chr) "Blacksmith Scene", "Sioux Ghost Dance", "Photographe", "Ve...
$ Dramaz  (int) 0, 0, 0, 1, 0, 1, 2, 2, 5, 1, 2, 3, 4, 5, 1, 8, 14, 14, 14,...
$ Actionz (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Action  (int) 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 0, 3, 0, 0, 1, 0,...
$ Comedyz (int) 0, 0, 0, 1, 2, 2, 1, 5, 8, 2, 8, 10, 6, 2, 6, 8, 7, 2, 2, 4...
$ Year1   (dbl) 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901, 1902, 1903,...NULL

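  • +1 for discovering `glimpse`. Personally, the main reason I like to see all columns is to check that the columns I added (via summarise or mutate) actually do what I expected. So `glimpse` is not ideal for that. (4 upvotes)
  • For recent dplyr versions, use `%>%` instead of `%.%` (2 upvotes)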

Pau*_*tra 8

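dplyr has its own print functions for dplyr objects. In this case, the object resulting from the operation is a tbl_df, so the matching print function is dplyr:::print.tbl_df. Reading it reveals that trunc_mat is the function responsible for what gets printed, including which variables.

Sadly, dplyr:::print.tbl_df does not pass any arguments on to trunc_mat, and trunc_mat does not support choosing which variables are shown (only how many rows). A workaround is to convert the dplyr result to a data.frame and use head: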

# %>% replaces the old %.% operator, which recent dplyr versions no longer provide
res = movies %>% 
 group_by(year) %>% 
 summarise(Length = mean(length), Title = max(title), 
  Dramaz = sum(Drama), Actionz = sum(Action), 
  Action = sum(Action), Comedyz = sum(Comedy)) %>% 
 mutate(Year1 = year + 1)

head(data.frame(res))
  year    Length                       Title Dramaz Actionz Action Comedyz
1 1898  1.000000 Pack Train at Chilkoot Pass      1       0      0       2
2 1894  1.000000           Sioux Ghost Dance      0       0      0       0
3 1902  3.555556     Voyage dans la lune, Le      1       0      0       2
4 1893  1.000000            Blacksmith Scene      0       0      0       0
5 1912 24.382353            Unseen Enemy, An     22       0      0       4
6 1922 74.192308      Trapped by the Mormons     20       0      0      16
  Year1
1  1899
2  1895
3  1903
4  1894
5  1913
6  1923
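A minimal sketch of the same workaround on a built-in data set, so it runs without the movies data. Here mtcars and the column name cyl1 are stand-ins chosen for illustration. The print(res, width = Inf) call is an assumption about later releases: to my knowledge the print method for tbl_df gained a width argument after this answer was written, so it may not apply to the dplyr version discussed above.

```r
library(dplyr)

# Stand-in for the movies pipeline: summarise per group, then add a column.
res <- mtcars %>%
  group_by(cyl) %>%
  summarise(mpg = mean(mpg), hp = mean(hp), wt = mean(wt)) %>%
  mutate(cyl1 = cyl + 1)

# Workaround from this answer: a plain data.frame is printed by base R,
# which wraps wide output instead of dropping columns.
head(as.data.frame(res))

# Assumption: newer versions accept a per-call width, which avoids the conversion.
print(res, width = Inf)
```

The conversion only changes how the result is printed; res itself stays a tbl_df for further dplyr steps.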

  • Pull requests are always welcome :) But `print.tbl_df` probably does need an `all_columns` argument. (5 upvotes)