I have designed a mnesia database with 5 different tables. The idea is to simulate queries coming from many nodes (computers), not just one. I can run queries from the terminal, but I need help with how to request information from several machines at once. I am testing scalability and want to compare mnesia's performance against other databases. Any ideas would be highly appreciated.
The best way to test mnesia is to put an intensive threaded workload on it, both from the local Erlang node running mnesia and from remote nodes. Usually you want remote nodes making RPC calls that perform reads and writes against the mnesia tables. Of course, high concurrency comes with a trade-off: transactions slow down and many may be retried, because locks may be numerous at any given time; but mnesia will ensure that every process receives {atomic,ok} for each transaction call it makes.
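For example, a remote node can drive such a workload through rpc:call/4. A minimal sketch, assuming the node running mnesia is reachable as DbNode (e.g. 'db@host'), shares the Erlang cookie, and owns a key_value table:

```erlang
%% Sketch: read/write against mnesia running on a remote node.
%% The node name and the key_value table are assumptions.
remote_write(DbNode, Record) ->
    rpc:call(DbNode, mnesia, transaction,
             [fun() -> mnesia:write(Record) end]).

remote_read(DbNode, Key) ->
    rpc:call(DbNode, mnesia, transaction,
             [fun() -> mnesia:read({key_value, Key}) end]).
```

Each client node can then loop over remote_write/2 and remote_read/2, so the concurrency comes from many machines rather than one.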
The Concept
I propose that we put a non-blocking overload of both writes and reads onto each mnesia table, driven by as many processes as possible. We measure the time difference between the call to the write function and the moment our massive mnesia subscriber receives a write event. Mnesia sends these events after every successful transaction, so we need not interrupt the working/overloading processes; instead we let a "strong" mnesia subscriber wait for asynchronous events reporting successful deletes and writes as soon as they occur.
The technique is this: we take a timestamp just before calling the write function and note down the record key together with that write CALL timestamp. Our mnesia subscriber then notes down the record key together with the write/read EVENT timestamp. The difference between these two timestamps (let us call it the CALL-to-EVENT time) gives us a rough idea of how loaded, or how efficient, we are. As locks increase with concurrency, we should see the CALL-to-EVENT time grow. Processes performing writes (unlimited) will run concurrently, while those performing reads also carry on uninterrupted. We will choose the number of processes for each operation, but first let us lay the foundation for the whole test case.
All of the concepts above apply to local operations (processes running on the same node as Mnesia).
--> Simulating Many Nodes
Well, I have personally not simulated nodes in Erlang; I have always worked with real Erlang nodes on the same box or on several different machines in a networked environment. However, I advise that you look closely at this module: http://www.erlang.org/doc/man/slave.html, concentrate more on this one: http://www.erlang.org/doc/man/ct_slave.html, and look at the following links, as they talk about creating, simulating and controlling many nodes under another parent node (http://www.erlang.org/doc/man/pool.html, Erlang: starting slave node, https://support.process-one.net/doc/display/ERL/Starting+a+set+of+Erlang+cluster+nodes, http://www.berabera.info/oldblog/lenglet/howtos/erlangkerberosremctl/index.html). I will not dive into the jungle of Erlang nodes here because it is another complicated topic; instead I will concentrate on tests on the same node running mnesia. I have come up with the mnesia test concept above, so let us start implementing it.
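As a starting point with the slave module, here is a hedged sketch of spinning up several nodes under the current one (it assumes distribution is already up, i.e. the VM was started with -sname, and that the slaves will share this node's cookie):

```erlang
%% Sketch: start N slave nodes on the local host under this node.
start_slaves(HowMany) ->
    {ok, Host} = inet:gethostname(),
    [begin
         Name = list_to_atom("slave" ++ integer_to_list(N)),
         {ok, Node} = slave:start(list_to_atom(Host), Name),
         Node
     end || N <- lists:seq(1, HowMany)].
```

Each returned node can then have the test module loaded onto it and be pointed at the mnesia node via rpc.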
Now, first of all, you need to make a test plan for each table (separately). This should include both writes and reads. Then you need to decide whether you want to do dirty operations or transactional operations on the tables. You also need to test the speed of traversing a mnesia table in relation to its size. Let us take the example of a simple mnesia table:
-record(key_value,{key,value,instanceId,pid}).
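To inform the dirty-versus-transactional decision above, here is a minimal timing sketch using timer:tc/3 (which reports microseconds); it assumes mnesia is already running and the key_value table exists:

```erlang
%% Sketch: time one dirty write against one transactional write
%% on the same record. Both results are in microseconds.
compare_write(Record) ->
    {DirtyUs, ok} = timer:tc(mnesia, dirty_write, [Record]),
    {TxUs, {atomic, ok}} =
        timer:tc(mnesia, transaction,
                 [fun() -> mnesia:write(Record) end]),
    {{dirty_microseconds, DirtyUs}, {transaction_microseconds, TxUs}}.
```

Run it many times and average; a single sample says little under concurrency.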
We would want to have a general function for writing into our table, here below:
write(Record)->
%% Use mnesia:activity/4 to test several activity
%% contexts (and if your table is fragmented)
%% like the commented code below
%%
%% mnesia:activity(
%% transaction, %% sync_transaction | async_dirty | ets | sync_dirty
%% fun(Y) -> mnesia:write(Y) end,
%% [Record],
%% mnesia_frag
%% )
mnesia:transaction(fun() -> ok = mnesia:write(Record) end).
And for our reads, we will have:
read(Key)->
%% Use mnesia:activity/4 to test several activity
%% contexts (and if your table is fragmented)
%% like the commented code below
%%
%% mnesia:activity(
%% transaction, %% sync_transaction | async_dirty| ets | sync_dirty
%% fun(Y) -> mnesia:read({key_value,Y}) end,
%% [Key],
%% mnesia_frag
%% )
mnesia:transaction(fun() -> mnesia:read({key_value,Key}) end).
Now, we want to write very many records into our small table, so we need a key generator. This key generator will be our own pseudo-random string generator. However, we need the generator to tell us the instant it generates a key so that we can record it; we want to see how long writing a generated key takes. Let us put it down like this:

timestamp() -> erlang:now().

To do very many concurrent writes, we need a function that will be executed by the many processes we spawn. In this function, it is desirable not to put any blocking functions.
str(XX)-> integer_to_list(XX).
generate_instance_id()->
    random:seed(now()),
    %% NOTE: guid/0 below calls this function, so it must not call
    %% guid/0 back (that would recurse forever)
    str(crypto:rand_uniform(1, 65536 * 65536)) ++
        str(erlang:phash2({self(),make_ref(),time()})).

guid()->
    random:seed(now()),
    MD5 = erlang:md5(term_to_binary({self(),time(),node(),now(),make_ref()})),
    MD5List = binary_to_list(MD5),
    L = lists:flatten([io_lib:format("~2.16.0B",[N]) || N <- MD5List]),
    %% tell our massive mnesia subscriber about this generation
    InstanceId = generate_instance_id(),
    mnesia_subscriber ! {self(),{key,write,L,timestamp(),InstanceId}},
    {L,InstanceId}.
A blocking function such as sleep/1, usually implemented as sleep(T) -> receive after T -> true end., would suspend a process for the given number of milliseconds. mnesia_tm does the lock control, retries, blocking and so on on behalf of the processes, to avoid deadlocks. Let us say we want each process to write an unlimited amount of records. Here is our function:
-define(NO_OF_PROCESSES,20).
start_write_jobs()->
[spawn(?MODULE,generate_and_write,[]) || _ <- lists:seq(1,?NO_OF_PROCESSES)],
ok.
generate_and_write()->
%% remember that in the function ?MODULE:guid/0,
%% we inform our mnesia_subscriber about our generated key
%% together with the timestamp of the generation just before
%% a write is made.
%% The subscriber will note this down in an ETS Table and then
%% wait for mnesia Event about the write operation. Then it will
%% take the event time stamp and calculate the time difference
%% From there we can make judgement on performance.
%% In this case, we make the processes make unlimited writes
%% into our mnesia tables. Our subscriber will trap the events as soon as
%% a successful write is made in mnesia
%% For all keys we just write a Zero as its value
{Key,Instance} = guid(),
write(#key_value{key = Key,value = 0,instanceId = Instance,pid = self()}),
generate_and_write().
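For completeness, here are the two helpers the code above and below relies on, exactly as described in the prose:

```erlang
%% Timestamp taken just before a CALL and again at each EVENT.
timestamp() -> erlang:now().

%% Suspend the calling process for T milliseconds.
sleep(T) -> receive after T -> true end.
```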
Likewise, let us see how the read jobs will be done. We will have a key provider, a process that keeps spinning around the mnesia table, picking only keys as it moves up and down the table. Here is its code:
first()-> mnesia:dirty_first(key_value).
next(FromKey)-> mnesia:dirty_next(key_value,FromKey).
start_key_picker()-> register(key_picker,spawn(fun() -> key_picker() end)).
key_picker()->
try ?MODULE:first() of
'$end_of_table' ->
io:format("\n\tTable is empty, my dear !~n",[]),
%% lets throw something there to start with
{K,Inst} = guid(),
?MODULE:write(#key_value{key = K,value = 0,instanceId = Inst,pid = self()}),
key_picker();
Key -> wait_key_reqs(Key)
catch
EXIT:REASON ->
error_logger:error_report(["Key Picker dies",{EXIT,REASON}]),
exit({EXIT,REASON})
end.
wait_key_reqs('$end_of_table')->
receive
{From,<<"get_key">>} ->
Key = ?MODULE:first(),
From ! {self(),Key},
wait_key_reqs(?MODULE:next(Key));
{_,<<"stop">>} -> exit(normal)
end;
wait_key_reqs(Key)->
receive
{From,<<"get_key">>} ->
From ! {self(),Key},
NextKey = ?MODULE:next(Key),
wait_key_reqs(NextKey);
{_,<<"stop">>} -> exit(normal)
end.
key_picker_rpc(Command)->
try erlang:send(key_picker,{self(),Command}) of
_ ->
receive
{_,Reply} -> Reply
after timer:seconds(60) ->
%% key_picker hang, or too busy
erlang:throw({key_picker,hanged})
end
catch
_:_ ->
%% key_picker dead
start_key_picker(),
sleep(timer:seconds(5)),
key_picker_rpc(Command)
end.
%% Now, this is where the reader processes will be
%% accessing keys. It will appear to them as though
%% its random, because its one process doing the
%% traversal. It will all be a game of chance
%% depending on the scheduler's choice
%% he who will have the next read chance, will
%% win ! okay, lets get going below :)
get_key()->
Key = key_picker_rpc(<<"get_key">>),
%% lets report to our "massive" mnesia subscriber
%% about a read which is about to happen
%% together with a time stamp.
Instance = generate_instance_id(),
mnesia_subscriber ! {self(),{key,read,Key,timestamp(),Instance}},
{Key,Instance}.
Whew!!! Now we need to create the function that starts all the readers.
-define(NO_OF_READERS,10).
start_read_jobs()->
[spawn(?MODULE,constant_reader,[]) || _ <- lists:seq(1,?NO_OF_READERS)],
ok.
constant_reader()->
    {Key,InstanceId} = ?MODULE:get_key(),
    %% read/1 returns {atomic,List}; the key came from the table
    %% itself, so we expect at least one record
    {atomic,[Record|_]} = ?MODULE:read(Key),
    %% Tell mnesia_subscriber that a read has been done so it creates a timestamp
    mnesia:report_event({read_success,Record,self(),InstanceId}),
    constant_reader().
Now for the most important part: mnesia_subscriber!!! This is a simple process that subscribes to write and read events. Get the mnesia events documentation from the Mnesia User's Guide. Here is the mnesia subscriber:
-record(read_instance,{
instance_id,
before_read_time,
after_read_time,
read_time %% after_read_time - before_read_time
}).
-record(write_instance,{
instance_id,
before_write_time,
after_write_time,
write_time %% after_write_time - before_write_time
}).
-record(benchmark,{
id, %% {pid(),Key}
read_instances = [],
write_instances = []
}).
subscriber()->
mnesia:subscribe({table,key_value, simple}),
%% lets also subscribe for system
%% events because events passing through
%% mnesia:event/1 will go via
%% system events.
mnesia:subscribe(system),
wait_events().
-include_lib("stdlib/include/qlc.hrl").
wait_events()->
receive
{From,{key,write,Key,TimeStamp,InstanceId}} ->
%% A process is just about to call
%% mnesia:write/1 and so we note this down
Fun = fun() ->
case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),X#benchmark.id == {From,Key}])) of
[] ->
ok = mnesia:write(#benchmark{
id = {From,Key},
write_instances = [
#write_instance{
instance_id = InstanceId,
before_write_time = TimeStamp
}]
}),
ok;
[Here] ->
WIs = Here#benchmark.write_instances,
NewInstance = #write_instance{
instance_id = InstanceId,
before_write_time = TimeStamp
},
ok = mnesia:write(Here#benchmark{write_instances = [NewInstance|WIs]}),
ok
end
end,
mnesia:transaction(Fun),
wait_events();
{mnesia_table_event,{write,#key_value{key = Key,instanceId = I,pid = From},_ActivityId}} ->
%% A process has successfully made a write. So we look it up and
%% get timeStamp difference, and finish bench marking that write
WriteTimeStamp = timestamp(),
F = fun()->
[Here] = mnesia:read({benchmark,{From,Key}}),
WIs = Here#benchmark.write_instances,
{_,WriteInstance} = lists:keysearch(I,2,WIs),
BeforeTmStmp = WriteInstance#write_instance.before_write_time,
NewWI = WriteInstance#write_instance{
after_write_time = WriteTimeStamp,
write_time = time_diff(WriteTimeStamp,BeforeTmStmp)
},
ok = mnesia:write(Here#benchmark{write_instances = [NewWI|lists:keydelete(I,2,WIs)]}),
ok
end,
mnesia:transaction(F),
wait_events();
{From,{key,read,Key,TimeStamp,InstanceId}} ->
%% A process is just about to do a read
%% using mnesia:read/1 and so we note this down
Fun = fun()->
case qlc:e(qlc:q([X || X <- mnesia:table(benchmark),X#benchmark.id == {From,Key}])) of
[] ->
ok = mnesia:write(#benchmark{
id = {From,Key},
read_instances = [
#read_instance{
instance_id = InstanceId,
before_read_time = TimeStamp
}]
}),
ok;
[Here] ->
RIs = Here#benchmark.read_instances,
NewInstance = #read_instance{
instance_id = InstanceId,
before_read_time = TimeStamp
},
ok = mnesia:write(Here#benchmark{read_instances = [NewInstance|RIs]}),
ok
end
end,
mnesia:transaction(Fun),
wait_events();
{mnesia_system_event,{mnesia_user,{read_success,#key_value{key = Key},From,I}}} ->
%% A process has successfully made a read. So we look it up and
%% get timeStamp difference, and finish bench marking that read
ReadTimeStamp = timestamp(),
F = fun()->
[Here] = mnesia:read({benchmark,{From,Key}}),
RIs = Here#benchmark.read_instances,
{_,ReadInstance} = lists:keysearch(I,2,RIs),
BeforeTmStmp = ReadInstance#read_instance.before_read_time,
NewRI = ReadInstance#read_instance{
after_read_time = ReadTimeStamp,
read_time = time_diff(ReadTimeStamp,BeforeTmStmp)
},
ok = mnesia:write(Here#benchmark{read_instances = [NewRI|lists:keydelete(I,2,RIs)]}),
ok
end,
mnesia:transaction(F),
wait_events();
        _ -> wait_events()
    end.
time_diff(After,Before)->
    %% both stamps come from erlang:now(); timer:now_diff/2
    %% returns the difference in microseconds
    timer:now_diff(After,Before).
Okay! That was huge :) So we are done with the subscriber. We need to put the code all together and run the necessary tests.
install()->
    mnesia:stop(),
mnesia:delete_schema([node()]),
mnesia:create_schema([node()]),
mnesia:start(),
{atomic,ok} = mnesia:create_table(key_value,[
{attributes,record_info(fields,key_value)},
{disc_copies,[node()]}
]),
{atomic,ok} = mnesia:create_table(benchmark,[
{attributes,record_info(fields,benchmark)},
{disc_copies,[node()]}
]),
mnesia:stop(),
ok.
start()->
mnesia:start(),
ok = mnesia:wait_for_tables([key_value,benchmark],timer:seconds(120)),
%% boot up our subscriber
register(mnesia_subscriber,spawn(?MODULE,subscriber,[])),
start_write_jobs(),
start_key_picker(),
start_read_jobs(),
ok.
Now, with proper analysis of the benchmark table records, you will get a record of average read times, average write times, and so on. You can plot a graph of these times against an increasing number of processes; as the number of processes grows, you will find both read and write times increasing. Take the code, read it and play with it. You may not use all of it, but I am sure you will pick up new concepts from it as others send in their solutions. Using mnesia events is the best way to test mnesia reads and writes without blocking the processes doing the actual writing or reading. In the example above, the reading and writing processes are not under any control; in fact, they will run forever until you terminate the VM.
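As one possible analysis step, here is a hedged sketch of computing the average write CALL-to-EVENT time over the benchmark table; it assumes write_time was stored as an integer number of microseconds (e.g. via timer:now_diff/2) and skips instances whose event has not yet arrived:

```erlang
%% Sketch: average write CALL-to-EVENT time (microseconds) over
%% all completed write instances in the benchmark table.
average_write_time() ->
    F = fun() ->
            mnesia:foldl(
              fun(#benchmark{write_instances = WIs}, {Sum, Count}) ->
                      Done = [T || #write_instance{write_time = T} <- WIs,
                                   T =/= undefined],
                      {Sum + lists:sum(Done), Count + length(Done)}
              end, {0, 0}, benchmark)
        end,
    case mnesia:transaction(F) of
        {atomic, {_, 0}} -> no_samples;
        {atomic, {Sum, Count}} -> Sum / Count
    end.
```

The same fold, over read_instances, gives the average read time.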
As a consequence, the concepts behind mnesia can only be compared with Ericsson's NDB database, found here: http://ww.dolphinics.no/papers/abstract/ericsson.html, but not with existing RDBMSs, document-oriented databases, etc. Those are my thoughts :) let us wait for what others have to say.....