Is there a "seqFileDir" option for "clusterdump" in the latest Apache Mahout library?

Ani*_*sak 6 hadoop cluster-analysis amazon-ec2 k-means mahout

I am trying to run "clusterdump" on the output of the Mahout k-means clustering example (the synthetic_control example), but I am getting the following error:

> ~/MAHOUT/trunk/bin/mahout clusterdump --seqFileDir clusters-10-final --pointsDir clusteredPoints --output a1.txt

MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/usr/lib/hadoop/conf/
MAHOUT-JOB: /home/<username>/MAHOUT/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar

12/06/21 22:43:18 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively

12/06/21 22:43:25 ERROR common.AbstractJob: Unexpected --seqFileDir while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
.....

So I am guessing that clusterdump has no "seqFileDir" option, yet all the online tutorials (e.g. https://cwiki.apache.org/MAHOUT/cluster-dumper.html) reference it. Can you tell me the remedy, or what I am missing?

Ale*_*Ott 2

Have you tried specifying it as the --input option?

  • I was working on this problem, and magically I found the solution! Thanks for the suggestion to use --input in place of the --seqFileDir option. What I was doing wrong was not realizing that clusterdump (with HADOOP_HOME set) reads from HDFS and writes its output to the local filesystem. Anyway, everything works well now! (2 upvotes)
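
Putting the answer and the comment together, the invocation from the question can be rewritten as below. This is a sketch assuming the same paths as in the question; in this setup clusters-10-final and clusteredPoints are directories on HDFS, while the --output file lands on the local filesystem:

```shell
# Same invocation as in the question, but with --input replacing the
# removed --seqFileDir option. With HADOOP_HOME set, clusterdump reads
# the input and points directories from HDFS and writes a1.txt locally.
~/MAHOUT/trunk/bin/mahout clusterdump \
    --input clusters-10-final \
    --pointsDir clusteredPoints \
    --output a1.txt
```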