We copied a 150 MB CSV file into Flume's spooling directory. While it was being loaded into HDFS, the file was split into much smaller files of roughly 80 KB each. Is there a way to load the file with Flume without it being split into smaller files? The small files generate extra metadata in the NameNode, so we need to avoid them.
My flume-ng configuration looks like this:
# Initialize agent's source, channel and sink
agent.sources = TwitterExampleDir
agent.channels …
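The splitting is usually driven by the HDFS sink's roll settings, which by default roll the file every 30 seconds, every 1024 bytes, or every 10 events, whichever comes first. Below is a minimal sketch of the relevant sink section; the sink name HDFS, the channel wiring, and the hdfs.path are assumptions, since the configuration above is truncated:

agent.sinks = HDFS
agent.sinks.HDFS.type = hdfs
# hypothetical target path; adjust to your cluster
agent.sinks.HDFS.hdfs.path = hdfs://namenode/flume/events
agent.sinks.HDFS.hdfs.fileType = DataStream
# 0 disables each roll trigger, so one spooled file is not broken up
agent.sinks.HDFS.hdfs.rollInterval = 0
agent.sinks.HDFS.hdfs.rollSize = 0
agent.sinks.HDFS.hdfs.rollCount = 0

With all three triggers disabled, the open file is only closed when the agent shuts down, so it can be worth setting hdfs.idleTimeout as well so that files which stop receiving events are eventually closed.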
I have a list of words:

count = 100
list = ['apple', 'orange', 'mango']
For the count above, is it possible, using a random function, to pick apple 40% of the time, orange 30% of the time, and mango 30% of the time?
For example:
for count=100: apple 40 times, orange 30 times, and mango 30 times.
The selection must happen randomly.
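There are two usual ways to read this, sketched below assuming Python 3.6+ for random.choices (the names words, weights, picks, and exact are mine): a probabilistic draw that honours the 40/30/30 weights on every pick but only approximates the totals, or exact counts that are then shuffled so the order is random.

import random

count = 100
words = ['apple', 'orange', 'mango']
weights = [40, 30, 30]  # desired percentages; with count=100 they equal the counts

# Probabilistic: totals over 100 draws are close to 40/30/30 but vary run to run
picks = random.choices(words, weights=weights, k=count)

# Exact: build 40 apples, 30 oranges, 30 mangoes, then randomize the order
exact = [w for w, n in zip(words, weights) for _ in range(n)]
random.shuffle(exact)

print(picks.count('apple'), picks.count('orange'), picks.count('mango'))
print(exact.count('apple'), exact.count('orange'), exact.count('mango'))  # always 40 30 30

If the requirement is literally "40 times apple, 30 times orange, 30 times mango" in a random order, the shuffle version is the one that guarantees it.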