我有一个非常大的两个类别的散点图,其中一个点是"点击".我想在剧情的顶部和侧面制作直方图来表示点击率,如下面的网站所示:http://blog.mckuhn.de/2009/09/learning-ggplot2-2d-plot-with.html
我可以将图形排列为2×2网格但是我遇到了一个问题:我的主要散点图的y轴有很长的标题(对于项目很重要),在2x2网格中,顶部直方图延伸到整个宽度,是不再沿x轴对齐.
我的想法是制作一个3x3网格,我用最左边的网格作为标题.但是,这需要将Y轴文本保存为"grob".在上面的博客文章中,这实现如下:
p <- qplot(data = mtcars, mpg, hp, geom = "point", colour = cyl)
legend <- p + opts(keep= "legend_box")
Run Code Online (Sandbox Code Playgroud)
这允许将"图例"放入2x2网格布局中.如果我可以使用相同的逻辑为Yaxis标签制作一个单独的grob,我会很好.我至少试过以下内容:
legend <- p +opts(keep="Yaxis")
legend <- p +opts(keep="axis_text_y")
legend <- p +opts(keep="axis_text")
..... and many others
Run Code Online (Sandbox Code Playgroud)
除了传奇盒之外,是否有可能制作出一些东西呢?如果是这样 - 请告诉我.如果没有,我将采取任何建议,如何安排三个图,同时保持它们对齐并保留Y标签.
谢谢

这个问题已经存在了足够长的时间,现在是时候为后代记录答案了.
简短的回答是,使用'lattice'和'ggplot2'包中的函数包装器无法完成高度自定义的数据可视化.函数包装器的目的是让您完成一些决策,因此您将始终受限于函数编码器最初设想的决策.我强烈建议每个人都学习'lattice'或'ggplot2'软件包,但这些软件包对数据探索比对数据可视化创新更有用.
这个答案适合那些想要创建自定义视觉效果的人. 以下过程可能需要半天时间,但这比将"格子"或"ggplot2"包破解成您想要的形状所花费的时间要少得多.这不是批评任何一个包; 它只是他们目的的副产品.当您需要出版物或客户的创意视觉效果时,与收益相比,一天中的4或5个小时都不算什么.
使用'grid'包来制作自定义视觉的工作非常简单,但这并不意味着它背后的数学总是很简单.这个例子中的大多数工作实际上是数学而不是图形.
前言:在为视觉效果使用基本"网格"包之前,您应该了解一些事项.首先,"网格"是视频的概念.这些是绘图空间,允许您从该空间内引用,忽略图形的其余部分.这很重要,因为它允许您制作图形而无需将工作扩展到整个空间的一小部分.它与基础绘图功能中的布局选项非常相似,除了它们可以重叠,旋转和透明.
单位是另一件要知道的事情.每个视口都有各种单位,可用于指示位置和大小.你可以在'grid'文档中看到整个列表,但是我经常使用的只有一些:npc,native,strwidth和lines.Npc单位从左下角的(0,0)开始,到右上角的c(1,1).原生单位使用'xscale'和'yscale'来创建本质上是数据绘图空间的内容.宽度单位告诉您在图形上打印某段文本后的宽度.线条单位告诉您在图形上打印一行文本的高度.由于多种类型的单元始终可用,因此您应该养成始终使用"单元"功能明确定义数字或从绘图函数中指定"default.units"参数的习惯.
最后,您可以为所有对象的位置指定理由.这太棒了.这意味着您可以指定形状的位置,然后说出您希望该形状水平和垂直对齐的方式(中心,左侧,右侧,底部,顶部).您可以通过引用其他对象的位置以这种方式完美地排列事物.
这就是我们正在制作的:这不是一个完美的图形,因为我不得不猜测OP想要什么,但它足以让我们走向完美的图形.

第1步:加载一些库来使用.如果要进行高度自定义的视觉效果,请使用"网格"包.它是像'lattice'和'ggplot2'这样的包装器调用的基本函数集.如果您想使用日期,请使用'lubridate'套餐,因为它可以让您的生活更美好.最后一个是个人偏好:当我要进行任何类型的数据汇总工作时,我喜欢使用'plyr'包.它允许我快速将数据整形为聚合形式.
library(grid)
library(lubridate)
library(plyr)
Run Code Online (Sandbox Code Playgroud)
样本数据生成:如果您已有数据,则不需要这样做,但是对于此示例,我正在创建一组样本数据.您可以通过更改数据生成的用户设置来解决它.该脚本非常灵活,可以适应生成的数据.随意添加更多网站,并使用lambda值.
set.seed(1)
#############################################
# User settings for the data generation. #
#############################################
# Set number of hours to generate data for.
time_Periods <- 100
# Set starting datetime in m/d/yyyy hh:mm format.
start_Datetime <- "2/24/2013 00:00"
# Specify a list of websites along with a
# Poisson lambda to represent the average
# number of hits in a given time period.
df_Websites <- read.table(text="
url lambda
http://www.asitenoonereallyvisits.com 1
http://www.asitesomepeoplevisit.com 10
http://www.asitesomemorepeoplevisit.com 20
http://www.asiteevenmorepeoplevisit.com 40
http://www.asiteeveryonevisits.com 80
", header=TRUE, sep=" ")
#############################################
# Generate the data. #
#############################################
# Initialize lists to hold hit data and
# website names.
hits <- list()
websites <- list()
# For each time period and for each website,
# flip a coin to see if any visitors come. If
# visitors come, use a Poisson distribution to
# see how many come.
# Also initialize the list of website names.
for (i in 1:nrow(df_Websites)){
hits[[i]] <- rbinom(time_Periods, 1, 0.5) * rpois(time_Periods, df_Websites$lambda[i])
websites[[i]] <- rep(df_Websites$url[i], time_Periods)
}
# Initialize list of time periods.
datetimes <- mdy_hm(start_Datetime) + hours(1:time_Periods)
# Tie the data into a data frame and erase rows with no hits.
# This is what the real data is more likely to look like
# after import and cleaning.
df_Hits <- data.frame(datetime=rep(datetimes, nrow(df_Websites)), hits=unlist(hits), website=unlist(websites))
df_Hits <- df_Hits[df_Hits$hits > 0,]
# Clean up data-generation variables.
rm(list=ls()[ls()!="df_Hits"])
Run Code Online (Sandbox Code Playgroud)
第2步:现在,我们需要决定我们希望图形如何工作.将大小和颜色等内容分隔到代码的不同部分非常有用,这样您就可以快速进行更改.在这里,我选择了一些应该产生一个像样的图形的基本设置.您会注意到一些尺寸设置正在使用"单位"功能.这是关于'grid'包的惊人之处.您可以使用各种单位来描述图形上的空间.例如,unit(1, "lines")是一行文本的高度.这使得布局图形显着更容易.
#############################################
# User settings for the graphic. #
#############################################
# Specify the window width and height and
# pixels per inch.
device_Width=12
device_Height=4.5
pixels_Per_Inch <- 100
# Specify the bin width (in hours) of the
# upper histogram.
bin_Width <- 2
# Specify a padding size for separating text
# from other plot elements.
padding <- unit(1, "strwidth", "W")
# Specify the bin cut-off values for the hit
# counts and the corresponding colors. The
# cutoff should be the maximum value to be
# contained in the bin.
bin_Settings <- read.table(text="
cutoff color
10 'darkblue'
20 'deepskyblue'
40 'purple'
80 'magenta'
160 'red'
", header=TRUE, sep=" ")
# Specify the size of the histogram plots
# in 'grid' units. Override only if necessary.
# histogram_Size <- unit(6, "lines")
histogram_Size <- unit(nrow(bin_Settings) + 1, "lines")
# Set the background color for distinguishing
# between rows of data.
row_Background <- "gray90"
# Set the color for the date lines.
date_Color <- "gray40"
# Set the color for marker lines on histograms.
marker_Color <- "gray80"
# Set the fontsize for labels.
label_Size <- 10
Run Code Online (Sandbox Code Playgroud)
第3步:是时候制作图形了.我在SO答案中的解释空间有限,所以我将总结,然后留下代码注释来解释细节.简而言之,我正在计算一切都将是多么大,然后一次制作一个情节.对于每个绘图,我首先格式化我的数据,因此我可以适当地指定视口.然后我放下需要在数据后面的标签,然后我绘制数据.最后,我"弹出"视口以完成它.
#############################################
# Make the graphic. #
#############################################
# Make sure bin cutoffs are in increasing order.
# This way, we can make assumptions later.
bin_Settings <- bin_Settings[order(bin_Settings$cutoff),]
# Initialize plot window.
# Make sure you always specify the pixels per
# inch, so you have an appropriately scaled
# graphic for output.
windows(
width=device_Width,
height=device_Height,
xpinch=pixels_Per_Inch,
ypinch=pixels_Per_Inch)
grid.newpage()
# Push an initial viewport, so we can set the
# font size to use in calculating label widths.
pushViewport(viewport(gp=gpar(fontsize=label_Size)))
# Find the list of websites in the data.
unique_Urls <- as.character(unique(df_Hits$website))
# Calculate the width of the website
# urls once printed on the screen.
label_Width <- list()
for (i in 1:length(unique_Urls)){
label_Width[[i]] <- convertWidth(unit(1, "strwidth", unique_Urls[i]), "npc")
}
# Use the maximum url width plus two padding.
x_Label_Margin <- unit(max(unlist(label_Width)), "npc") + padding * 2
# Calculate a height for the date labels plus two padding.
y_Label_Margin <- unit(1, "strwidth", "99/99/9999") + padding * 2
# Calculate size of main plot after making
# room for histogram and label margins.
main_Width <- unit(1, "npc") - histogram_Size - x_Label_Margin
main_Height <- unit(1, "npc") - histogram_Size - y_Label_Margin
# Calculate x values, using the minimum datetime
# as zero, and counting the hours between each
# datetime and the minimum.
x_Values <- as.integer((df_Hits$datetime - min(df_Hits$datetime)))/60^2
# Initialize main plotting area
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin,
width=main_Width,
height=main_Height,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, length(unique_Urls) + 1),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Put grey background behind every other website
# to make data easier to read, and write urls as
# y-labels.
for (i in 1:length(unique_Urls)){
if (i%%2==0){
grid.rect(
x=unit(-1, "npc"),
y=i,
width=unit(2, "npc"),
height=1,
default.units="native",
just=c("left", "center"),
gp=gpar(col=row_Background, fill=row_Background))
}
grid.text(
unique_Urls[i],
x=unit(0, "npc") - padding,
y=i,
default.units="native",
just=c("right", "center"))
}
# Find the hour offset of the minimum date value.
time_Offset <- as.integer(format(min(df_Hits$datetime), "%H"))
# Find the dates in the data.
x_Labels <- unique(format(df_Hits$datetime, "%m/%d/%Y"))
# Find where the days begin in the data.
midnight_Locations <- (0:max(x_Values))[(0:max(x_Values)+time_Offset)%%24==0]
# Write the appropriate date labels on the x-axis
# where the days begin.
grid.text(
x_Labels,
x=midnight_Locations,
y=unit(0, "npc") - padding,
default.units="native",
just=c("right", "center"),
rot=90)
# Draw lines to vertically mark when days begin.
grid.polyline(
x=c(midnight_Locations, midnight_Locations),
y=unit(c(rep(0, length(midnight_Locations)), rep(1, length(midnight_Locations))), "npc"),
default.units="native",
id=rep(midnight_Locations, 2),
gp=gpar(lty=2, col=date_Color))
# Initialize bin assignment variable.
bin_Assignment <- 1
# Calculate which bin each hit value belongs in.
for (i in 1:nrow(bin_Settings)){
bin_Assignment <- bin_Assignment + ifelse(df_Hits$hits>bin_Settings$cutoff[i], 1, 0)
}
# Draw points, coloring according to the bin settings.
grid.points(
x=x_Values,
y=match(df_Hits$website, unique_Urls),
pch=19,
size=unit(1, "native"),
gp=gpar(col=as.character(bin_Settings$color[bin_Assignment]), alpha=0.5))
# Finalize the main plotting area.
popViewport()
# Create the bins for the upper histogram.
bins <- ddply(
data.frame(df_Hits, bin_Assignment, mid=floor(x_Values/bin_Width)*bin_Width+bin_Width/2),
.(bin_Assignment, mid),
summarize,
freq=length(hits))
# Initialize upper histogram area
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin + main_Height,
width=main_Width,
height=histogram_Size,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, max(bins$freq) * 1.05),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Calculate where to put four value markers.
marker_Interval <- floor(max(bins$freq)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)
# Draw horizontal lines to mark values.
grid.polyline(
x=unit(c(rep(0,4), rep(1,4)), "npc"),
y=c(1:4 * marker_Interval, 1:4 * marker_Interval),
default.units="native",
id=rep(1:4, 2),
gp=gpar(lty=2, col=marker_Color))
# Write value labels for each marker.
grid.text(
1:4 * marker_Interval,
x=unit(0, "npc") - padding,
y=1:4 * marker_Interval,
default.units="native",
just=c("right", "center"))
# Finalize upper histogram area, so we
# can turn it back on but with clipping.
popViewport()
# Initialize upper histogram area again,
# but with clipping turned on.
pushViewport(viewport(
x=x_Label_Margin,
y=y_Label_Margin + main_Height,
width=main_Width,
height=histogram_Size,
xscale=c(-1, max(x_Values) + 1),
yscale=c(0, max(bins$freq) * 1.05),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size),
clip="on"))
# Draw bars for each bin.
for (i in 1:nrow(bin_Settings)){
active_Bin <- bins[bins$bin_Assignment==i,]
if (nrow(active_Bin)>0){
for (j in 1:nrow(active_Bin)){
grid.rect(
x=active_Bin$mid[j],
y=0,
width=bin_Width,
height=active_Bin$freq[j],
default.units="native",
just=c("center","bottom"),
gp=gpar(col=as.character(bin_Settings$color[i]), fill=as.character(bin_Settings$color[i]), alpha=1/nrow(bin_Settings)))
}
}
}
# Draw x-axis.
grid.lines(x=unit(c(0, 1), "npc"), y=0, default.units="native")
# Finalize upper histogram area.
popViewport()
# Calculate the frequencies for each website and bin.
freq_Data <- ddply(
data.frame(df_Hits, bin_Assignment),
.(website, bin_Assignment),
summarize,
freq=length(hits))
# Create the line data for the side histogram.
line_Data <- matrix(0, nrow=length(unique_Urls)+2, ncol=nrow(bin_Settings))
for (i in 1:nrow(freq_Data)){
line_Data[match(freq_Data$website[i], unique_Urls)+1,freq_Data$bin_Assignment[i]] <- freq_Data$freq[i]
}
# Initialize side histogram area
pushViewport(viewport(
x=x_Label_Margin + main_Width,
y=y_Label_Margin,
width=histogram_Size,
height=main_Height,
xscale=c(0, max(line_Data) * 1.05),
yscale=c(0, length(unique_Urls) + 1),
just=c("left", "bottom"),
gp=gpar(fontsize=label_Size)))
# Calculate where to put four value markers.
marker_Interval <- floor(max(line_Data)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)
# Draw vertical lines to mark values.
grid.polyline(
x=c(1:4 * marker_Interval, 1:4 * marker_Interval),
y=unit(c(rep(0,4), rep(1,4)), "npc"),
default.units="native",
id=rep(1:4, 2),
gp=gpar(lty=2, col=marker_Color))
# Write value labels for each marker.
grid.text(
1:4 * marker_Interval,
x=1:4 * marker_Interval,
y=unit(0, "npc") - padding,
default.units="native",
just=c("center", "top"))
# Draw lines for each bin setting.
grid.polyline(
x=array(line_Data),
y=rep(0:(length(unique_Urls)+1), nrow(bin_Settings)),
default.units="native",
id=array(t(matrix(1:nrow(bin_Settings), nrow=nrow(bin_Settings), ncol=length(unique_Urls)+2))),
gp=gpar(col=as.character(bin_Settings$color)))
# Draw vertical line for the y-axis.
grid.lines(x=0, y=c(0, length(unique_Urls)+1), default.units="native")
# Finalize side histogram area.
popViewport()
# Draw legend.
# Draw box behind legend headers.
grid.rect(
x=0,
y=1,
width=unit(1, "strwidth", names(bin_Settings)[1]) + unit(1, "strwidth", names(bin_Settings)[2]) + 3 * padding,
height=unit(1, "lines"),
default.units="npc",
just=c("left","top"),
gp=gpar(col=row_Background, fill=row_Background))
# Draw legend headers from bin_Settings variable.
grid.text(
names(bin_Settings)[1],
x=padding,
y=1,
default.units="npc",
just=c("left","top"))
grid.text(
names(bin_Settings)[2],
x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
y=1,
default.units="npc",
just=c("left","top"))
# For each row in the bin_Settings variable,
# write the cutoff values and the color associated.
# Write the color name in the color it specifies.
for (i in 1:nrow(bin_Settings)){
grid.text(
bin_Settings$cutoff[i],
x=unit(1, "strwidth", names(bin_Settings)[1]) + padding,
y=unit(1, "npc") - i * unit(1, "lines"),
default.units="npc",
just=c("right","top"))
grid.text(
bin_Settings$color[i],
x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
y=unit(1, "npc") - i * unit(1, "lines"),
default.units="npc",
just=c("left","top"),
gp=gpar(col=as.character(bin_Settings$color[i])))
}
Run Code Online (Sandbox Code Playgroud)