Gnuplot clustered histogram (bar chart) with one row per category

fie*_*edl 17 gnuplot histogram bar-chart

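Clustered histogram / bar chart

I am trying to use gnuplot to produce the following clustered histogram from this data file, in which each category appears on a separate row for every year: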

# datafile
year   category        num_of_events
2011   "Category 1"    213
2011   "Category 2"    240
2011   "Category 3"    220
2012   "Category 1"    222
2012   "Category 2"    238
...

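(Image: desired clustered histogram)

But I don't know how to do this when each category is on its own row. If anyone knows how to do this with gnuplot, I would be glad to hear it.

Stacked clustered histogram / stacked bar chart

Even better would be a stacked clustered histogram like the one below, where the stacked subcategories are given as separate columns in the data file: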

# datafile
year   category        num_of_events_for_A    num_of_events_for_B
2011   "Category 1"    213                    30
2011   "Category 2"    240                    28
2011   "Category 3"    220                    25
2012   "Category 1"    222                    13
2012   "Category 2"    238                    42
...

(Image: desired stacked clustered histogram)

Thanks a lot in advance!

fie*_*edl 21

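After some research, I came up with two different solutions.

Required: splitting the data file

Both solutions require splitting the data file into several files, grouped by the value of one column. I therefore wrote a short Ruby script, which can be found in this gist:

https://gist.github.com/fiedl/6294424

The script is used as follows: to split the data file data.csv into data.Category1.csv and data.Category2.csv, call: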

# bash
ruby categorize_csv.rb --column 2 data.csv

# data.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category1";"213";"30"
"2011";"Category2";"240";"28"
"2012";"Category1";"222";"13"
"2012";"Category2";"238";"42"
...

# data.Category1.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category1";"213";"30"
"2012";"Category1";"222";"13"
...

# data.Category2.csv
# year   category   num_of_events_for_A   num_of_events_for_B
"2011";"Category2";"240";"28"
"2012";"Category2";"238";"42"
...
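
The gist itself is not reproduced here. As a rough idea of what such a split involves, the following is a minimal sketch, not the author's script: the function name split_by_column and its parameters are made up for illustration, and it assumes a semicolon-separated file whose comment lines start with "#".

```ruby
# Minimal sketch (hypothetical helper, not the code from the gist):
# split a semicolon-separated file into one file per distinct value
# of the given column. Output files are named <base>.<value>.csv.
def split_by_column(path, column, separator: ";")
  groups = Hash.new { |hash, key| hash[key] = [] }
  comments = []

  File.foreach(path) do |raw|
    line = raw.chomp
    next if line.strip.empty?
    if line.start_with?("#")
      comments << line            # comment lines are copied into every output file
      next
    end
    # column is 1-based, like gnuplot's $1, $2, ...
    key = line.split(separator)[column - 1].to_s.delete('"')
    groups[key] << line
  end

  base = path.sub(/\.csv\z/, "")
  groups.map do |key, lines|
    out = "#{base}.#{key}.csv"
    File.write(out, (comments + lines).join("\n") + "\n")
    out                           # return the names of the files written
  end
end
```

Under these assumptions, split_by_column("data.csv", 2) would write data.Category1.csv and data.Category2.csv, and split_by_column("data.csv", 1) would write one file per year.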

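Solution 1: stacked boxes plot

Strategy: one data file per category. One column per stack. The bars of the histogram are drawn "manually" using gnuplot's "with boxes" plot style.

Pro: full flexibility regarding bar size, caps, colors, etc.

Con: the bars have to be placed manually.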

# solution1.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'stacked_boxes.eps'

set auto x
set yrange [0:300]
set xtics 1

set style fill solid border -1

num_of_categories=2
set boxwidth 0.3/num_of_categories
dx=0.5/num_of_categories
offset=-0.1

plot 'data.Category1.csv' using ($1+offset):($3+$4) title "Category 1 A" linecolor rgb "#cc0000" with boxes, \
     ''                   using ($1+offset):3 title "Category 1 B" linecolor rgb "#ff0000" with boxes, \
     'data.Category2.csv' using ($1+offset+dx):($3+$4) title "Category 2 A" linecolor rgb "#00cc00" with boxes, \
     ''                   using ($1+offset+dx):3 title "Category 2 B" linecolor rgb "#00ff00" with boxes
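
Placing the bars by hand gets tedious with more than two categories. As a sketch only, the plot command could be generated instead. The helper name stacked_boxes_plot and its parameters are hypothetical; it assumes files named data.<category>.csv as in the split example, reuses the offset and 0.5/num_of_categories spacing from the script above, and leaves out colors.

```ruby
# Sketch (hypothetical helper): build a gnuplot "plot" command that puts one
# stacked bar per category side by side, shifted by dx per category.
def stacked_boxes_plot(categories, offset: -0.1, group_width: 0.5)
  dx = group_width / categories.size   # horizontal shift between neighbouring categories
  clauses = categories.each_with_index.flat_map do |name, i|
    x = format("($1%+.3f)", offset + i * dx)
    file = "data.#{name}.csv"
    [
      # full height first, so the column-3 box drawn next covers its lower part
      "'#{file}' using #{x}:($3+$4) title \"#{name} (col 3 + col 4)\" with boxes",
      "'' using #{x}:3 title \"#{name} (col 3)\" with boxes"
    ]
  end
  "plot " + clauses.join(", \\\n     ")
end

puts stacked_boxes_plot(["Category1", "Category2"])
```

The printed command can be pasted into the gnuplot script in place of the hand-written plot line; the box width (0.3/num_of_categories above) still has to be set separately.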

The result looks like this:

(Image: stacked_boxes.eps)

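Solution 2: native gnuplot histogram

Strategy: one data file per year, i.e. the data file is split by the year column rather than the category column. One column per stack. The histogram is produced with gnuplot's regular histogram mechanism.

Pro: easy to use, since the positioning does not have to be done manually.

Con: since all categories are in one file, every category has the same color.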

# solution2.gnuplot
reset
set terminal postscript eps enhanced 14

set datafile separator ";"

set output 'histo.eps'
set yrange [0:300]

set style data histogram
set style histogram rowstack gap 1
set style fill solid border -1
set boxwidth 0.5 relative

plot newhistogram "2011", \
       'data.2011.csv' using 3:xticlabels(2) title "A" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "B" linecolor rgb "green", \
     newhistogram "2012", \
       'data.2012.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "" linecolor rgb "green", \
     newhistogram "2013", \
       'data.2013.csv' using 3:xticlabels(2) title "" linecolor rgb "red", \
       ''              using 4:xticlabels(2) title "" linecolor rgb "green"

The result looks like this:

(Image: histo.eps)
