fie*_*edl 17 gnuplot histogram bar-chart
I am trying to use gnuplot to generate the following clustered histogram from this data file, where each category is given on a separate line per year:
# datafile
year category num_of_events
2011 "Category 1" 213
2011 "Category 2" 240
2011 "Category 3" 220
2012 "Category 1" 222
2012 "Category 2" 238
...

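But I don't know how to do this with one line per category. I would be glad if anyone knows how to do it with gnuplot.
Even better would be a stacked clustered histogram like the following, where the stacked sub-categories are given in separate columns of the data file: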
# datafile
year category num_of_events_for_A num_of_events_for_B
2011 "Category 1" 213 30
2011 "Category 2" 240 28
2011 "Category 3" 220 25
2012 "Category 1" 222 13
2012 "Category 2" 238 42
...

Thanks a lot in advance!
fie*_*edl 21
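After some research, I came up with two different solutions.
Both solutions require splitting the data file into several files, grouped by the values of one column. Therefore, I wrote a short ruby script, which can be found in this gist: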
https://gist.github.com/fiedl/6294424
The script is used as follows: to split the data file data.csv into data.Category1.csv and data.Category2.csv, call:
# bash
ruby categorize_csv.rb --column 2 data.csv
# data.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2011";"Category2";"240";"28"
"2012";"Category1";"222";"13"
"2012";"Category2";"238";"42"
...
# data.Category1.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category1";"213";"30"
"2012";"Category1";"222";"13"
...
# data.Category2.csv
# year category num_of_events_for_A num_of_events_for_B
"2011";"Category2";"240";"28"
"2012";"Category2";"238";"42"
...
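The gist itself is not reproduced here. As a rough idea of what such a splitting step involves, the following is a minimal sketch and not the script from the gist: the function names `category_key`, `group_lines_by_column` and `split_file` are made up for illustration, and the semicolon separator and `data.<value>.csv` naming are assumptions taken from the example files above.

```ruby
# Hypothetical sketch, NOT the script from the gist.
# Splits a semicolon-separated file into one file per distinct value of a column.

# Strip double quotes and whitespace so the value can be used in a file name,
# e.g. "Category 1" (with quotes) becomes Category1.
def category_key(field)
  field.delete('"').gsub(/\s+/, '')
end

# lines:  the lines of the data file
# column: 1-based index of the column to split on
# Returns a hash from category key to the lines of that category.
# Comment lines (starting with '#') are copied into every group.
def group_lines_by_column(lines, column)
  header = lines.select { |line| line.start_with?('#') }
  groups = Hash.new { |hash, key| hash[key] = header.dup }
  lines.each do |line|
    next if line.start_with?('#') || line.strip.empty?
    key = category_key(line.strip.split(';')[column - 1])
    groups[key] << line
  end
  groups
end

# Writes data.csv -> data.<key>.csv, one file per group.
def split_file(path, column)
  base = path.sub(/\.csv\z/, '')
  group_lines_by_column(File.readlines(path), column).each do |key, lines|
    File.write("#{base}.#{key}.csv", lines.join)
  end
end

# Assumed invocation for this sketch: ruby split_sketch.rb data.csv 2
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  split_file(ARGV[0], Integer(ARGV[1]))
end
```

Called with column 2, this sketch groups by category as in the example above; called with column 1, it would group by year, which is the layout the second solution expects (data.2011.csv, data.2012.csv, and so on).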
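Strategy: One data file per category. One column per stack. Draw the bars of the histogram "manually" using gnuplot's "with boxes" plotting style.
Upside: Full flexibility regarding bar sizes, caps, colors, etc.
Downside: The bars have to be placed manually.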
# solution1.gnuplot
reset
set terminal postscript eps enhanced 14
set datafile separator ";"
set output 'stacked_boxes.eps'
set auto x
set yrange [0:300]
set xtics 1
set style fill solid border -1
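num_of_categories=2
# Box width and horizontal spacing shrink as the number of categories grows.
set boxwidth 0.3/num_of_categories
dx=0.5/num_of_categories
offset=-0.1
# The full-height box ($3+$4) is drawn first; the column-3 box is drawn on top of it
# and covers its lower part, so the part that stays visible is column 4 (B).
plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 B" linecolor rgb "#cc0000" with boxes, \
     '' using ($1+offset):3 title "Category 1 A" linecolor rgb "#ff0000" with boxes, \
     'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 B" linecolor rgb "#00cc00" with boxes, \
     '' using ($1+offset+dx):3 title "Category 2 A" linecolor rgb "#00ff00" with boxes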
The result looks like this (figure: stacked_boxes.eps):
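The manual placement in this first solution follows a simple pattern: category number i (counting from 0) is shifted by offset + i*dx. Since a ruby script is already part of the workflow, the plot command could be generated rather than typed. The following is only a sketch under that assumption; `boxes_plot_command` and the hash keys `:file`, `:top_title`, `:bottom_title`, `:top_color`, `:bottom_color` are hypothetical names, not part of gnuplot or of the gist.

```ruby
# Hypothetical helper: builds the `plot` command of the "with boxes" solution
# for any number of categories. All names here are illustrative assumptions.
#
# categories: array of hashes with
#   :file          data file of the category
#   :top_title     legend title of the full-height box (column 3 + column 4)
#   :bottom_title  legend title of the box drawn over it (column 3 only)
#   :top_color, :bottom_color  rgb color strings
def boxes_plot_command(categories, offset: -0.1, spread: 0.5)
  dx = spread.to_f / categories.size
  clauses = categories.each_with_index.flat_map do |cat, i|
    # x position of this category's bar: year + offset + i*dx
    x = format('($1%+.3f)', offset + i * dx)
    [
      "'#{cat[:file]}' using #{x}:($3+$4) title \"#{cat[:top_title]}\" " \
        "linecolor rgb \"#{cat[:top_color]}\" with boxes",
      "'' using #{x}:3 title \"#{cat[:bottom_title]}\" " \
        "linecolor rgb \"#{cat[:bottom_color]}\" with boxes"
    ]
  end
  'plot ' + clauses.join(", \\\n     ")
end

categories = [
  { file: 'data.Category1.csv', top_title: 'Category 1 B', bottom_title: 'Category 1 A',
    top_color: '#cc0000', bottom_color: '#ff0000' },
  { file: 'data.Category2.csv', top_title: 'Category 2 B', bottom_title: 'Category 2 A',
    top_color: '#00cc00', bottom_color: '#00ff00' }
]
puts boxes_plot_command(categories)
```

The printed command can be pasted into the gnuplot script in place of the hand-written plot command. The `set boxwidth 0.3/num_of_categories` line still has to match the number of categories passed in.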

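Strategy: One data file per year. One column per stack. Produce the histogram using gnuplot's regular histogram mechanism.
Upside: Easier to use, since the positioning does not have to be done manually.
Downside: Since all categories are in one file, every category has the same color.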
# solution2.gnuplot
reset
set terminal postscript eps enhanced 14
set datafile separator ";"
set output 'histo.eps'
set yrange [0:300]
set style data histogram
set style histogram rowstack gap 1
set style fill solid border -1
set boxwidth 0.5 relative
plot newhistogram "2011", \
'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
'' using 4:xticlabels(2) title "B" linecolor rgb "green", \
newhistogram "2012", \
'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
'' using 4:xticlabels(2) title "" linecolor rgb "green", \
newhistogram "2013", \
'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
'' using 4:xticlabels(2) title "" linecolor rgb "green"
The result looks like this (figure: histo.eps):
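The plot command of this second solution repeats the same two clauses for every year, with the legend titles set only for the first year so that each entry appears once. For more years it may be convenient to generate it. The following sketch assumes one file per year named data.<year>.csv, as in the script above; `histogram_plot_command` and the keys `:index`, `:title`, `:color` are hypothetical names chosen for illustration.

```ruby
# Hypothetical helper: builds the `plot` command of the rowstacked histogram
# solution for a list of years. All names here are illustrative assumptions.
#
# years:   e.g. [2011, 2012, 2013]; each year is read from data.<year>.csv
# columns: array of hashes with :index (data column), :title and :color
def histogram_plot_command(years, columns)
  blocks = years.each_with_index.map do |year, i|
    series = columns.each_with_index.map do |col, j|
      # Only the first clause of a year names the file; '' reuses it.
      file = j.zero? ? "'data.#{year}.csv'" : "''"
      # Legend titles are set for the first year only, to avoid duplicates.
      title = i.zero? ? col[:title] : ''
      "#{file} using #{col[:index]}:xticlabels(2) title \"#{title}\" " \
        "linecolor rgb \"#{col[:color]}\""
    end
    (["newhistogram \"#{year}\""] + series).join(", \\\n     ")
  end
  'plot ' + blocks.join(", \\\n     ")
end

columns = [
  { index: 3, title: 'A', color: 'red' },
  { index: 4, title: 'B', color: 'green' }
]
puts histogram_plot_command([2011, 2012, 2013], columns)
```

The generated text has the same shape as the hand-written command above, so the rest of the gnuplot script (terminal, separator, histogram style) stays unchanged.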
