Udo*_*o G
5
replication
distributed-filesystems
I'm trying to build a setup of two servers in which everything is redundant:
- the database (MySQL master-master in active/passive mode)
- the file system (distributed/replicated)
- our application software (kept in sync using the distributed file system)
Most of the time, one of the two servers will be the "main" server and the other will replicate all its data and will also be used to distribute workload (Gearman). If the main server fails, everything is switched to the "standby" server, which becomes the "active" server and continues the work.
To reduce the risk of both servers failing completely, they are geographically separated in two distant data centers (same country / direct connections).
I read a lot about distributed file systems, but still have no clue which solution is suitable for just two nodes...
Some more requirements for the distributed file system:
- must be POSIX compliant
- must replicate everything (all data must be available on both servers all the time) in both directions (all data can be changed anywhere)
- current stats for the existing data that will have to be replicated:
  - about 30 GB of data, growing steadily for 3 years
  - about 3 million files in 7,500 directories
  - average file size approx. 5-10 KB; there are a few big files of around 10-50 MB
  - files are mostly added periodically throughout the day and moved to another directory once processed (similar to a file-based mail server)
  - once a day, a few thousand files (received the day before) are archived into a number of TAR archives and left there "forever"
- when adding files, the data is first written to a temporary file whose name starts with a dot "." and then renamed when complete. Existing files are changed only rarely.
- the system should deal well with unexpected connection losses, reboots of a server, etc.
- no problem if the replication lags 1-2 seconds, but it should always be in a consistent state
- as said, the distributed file system will consist of only two nodes, but it would be a big bonus if I could add additional nodes/servers should I need more computing power in the future
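To make the write pattern from the list above concrete, here is a minimal sketch of the "write to a dot-file, then rename" step. It is written in Python purely for illustration; the function name `atomic_write` and the `.tmp` suffix are hypothetical and not taken from our actual application. On a local POSIX file system the rename is atomic; whether a given distributed file system preserves that is exactly one of the things I need to find out.

```python
import os
import tempfile


def atomic_write(directory: str, name: str, data: bytes) -> str:
    """Write data to a hidden temp file, then rename it to its final name.

    Hypothetical helper illustrating the pattern described in the question.
    """
    final_path = os.path.join(directory, name)
    # Create the temporary file in the same directory (name starts with "."),
    # so the later rename never crosses a file system boundary.
    fd, tmp_path = tempfile.mkstemp(prefix=".", suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the contents to disk before renaming
        # Readers see either no file or the complete file, never a partial one.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A consumer that ignores names starting with "." would therefore only ever pick up complete files, as long as the replication layer also applies the rename as a single step.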
Update/more details:
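- I only need redundancy in the sense of "the files are stored on both servers and synced immediately". I do not need the file system to read data from the other server when a file is accessed and the local hard disk has failed. If the local hard disk fails, the whole server is considered "broken" and should therefore stop working.
Which file system is suitable for this scenario?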