I am new to Hadoop and am trying to run an example program from a book. I am running into this error:

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text

Please help me resolve it. The code is below:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class HadoopJob extends Configured implements Tool {

    public static class MapperClass extends MapReduceBase
            implements Mapper<Text, Text, Text, Text> {

        @Override
        public void map(Text key, Text value, OutputCollector<Text, Text> output,
                        Reporter reporter) throws IOException {
            // invert the key/value pair read from the input
            output.collect(value, key);
        }
    }

    // … (the rest of the class, including the run() driver, is truncated in the original)
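This error usually means the driver never told the job that the map output key is Text: with the default TextInputFormat the map input key is a LongWritable byte offset, and the map output key class also defaults to LongWritable. A minimal sketch of the missing pieces, to go inside HadoopJob (the tab-separated key/value input and the argument handling are assumptions):

    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), HadoopJob.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(MapperClass.class);
        // read each line as (Text key, Text value) instead of (LongWritable, Text)
        conf.setInputFormat(KeyValueTextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        // without these, the map output key class defaults to LongWritable,
        // which is exactly the type mismatch reported above
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new HadoopJob(), args));
    }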
I am using a Mapper to load a huge amount of data consisting of execution times and the large queries associated with them. I only need to find the 1000 most expensive queries, so I emit the execution time as the key from my mapper. I use a single reducer and want it to write just 1000 records and then stop processing.

I could keep a global counter and do: if (count < 1000) { context.write(key, value); }

But that would still load all the billions of records and then simply stop writing them.

I want the reducer to stop after spitting out 1000 records, avoiding the seek and read time for the remaining records. Is this possible?
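For what it's worth, Hadoop gives a reducer no supported way to abort its input early; you can only stop emitting, as in this sketch (the new-API types and a descending sort comparator are assumptions):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer: emits at most 1000 records, then ignores the rest.
// The framework still feeds it every remaining group, so the real win comes
// from shrinking the data earlier, e.g. having each mapper keep only its
// local top 1000 and emit them from cleanup().
public class Top1000Reducer extends Reducer<LongWritable, Text, LongWritable, Text> {
    private int emitted = 0;

    @Override
    protected void reduce(LongWritable execTime, Iterable<Text> queries, Context context)
            throws IOException, InterruptedException {
        for (Text query : queries) {
            if (emitted >= 1000) {
                return; // stop emitting; input is still consumed by the framework
            }
            context.write(execTime, query);
            emitted++;
        }
    }
}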
What I would like to do in Pig is something very common in SQL. I have a date field of the form yyyy-mm-dd hh:mm:ss, and another field containing an integer that represents a number of hours. Is there a way to easily add the integer to the datetime field so that we get the result we would expect from clock math?

Example: the date is 2013-06-01 23:12:12. Then I add 2 hours. I … (question truncated)
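Newer Pig versions (0.11+) can do this with the ToDate/AddDuration built-ins; on older versions a small Java EvalFunc handles the clock math. A hedged sketch (the UDF name and argument layout are made up):

import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: ADD_HOURS(datetime_chararray, hours_int) -> chararray
public class ADD_HOURS extends EvalFunc<String> {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() != 2 || input.get(0) == null || input.get(1) == null) {
            return null;
        }
        try {
            SimpleDateFormat fmt = new SimpleDateFormat(PATTERN);
            Date parsed = fmt.parse((String) input.get(0));
            Calendar cal = Calendar.getInstance();
            cal.setTime(parsed);
            // Calendar does the clock math, rolling the date past midnight:
            // 2013-06-01 23:12:12 + 2 hours -> 2013-06-02 01:12:12
            cal.add(Calendar.HOUR_OF_DAY, (Integer) input.get(1));
            return fmt.format(cal.getTime());
        } catch (ParseException e) {
            throw new IOException("Unparseable datetime: " + input.get(0), e);
        }
    }
}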
I am not clear on the difference between partitioning and bucketing in Hive; I would really appreciate it if you could provide some details with examples.
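For anyone comparing the two, a minimal illustration in HiveQL (table and column names are made up):

-- Partitioning: each distinct dt value becomes its own HDFS subdirectory,
-- so queries that filter on dt scan only the matching directories.
CREATE TABLE logs (msg STRING)
PARTITIONED BY (dt STRING);

-- Bucketing: rows are hashed on user_id into a fixed number of files,
-- which helps sampling and map-side joins on user_id.
CREATE TABLE users (user_id INT, name STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS;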
If I have a table whose rows have duplicate ids, I can find them with Hive using the query below:
create table dupe as select * from table1 group by id having count(*) > 1;
Can we achieve the same thing with Pig? If so, could someone help me?
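A hedged Pig Latin equivalent (the load path, schema, and output location are assumptions):

recs  = LOAD 'table1' AS (id:chararray, val:chararray);
grpd  = GROUP recs BY id;
-- keep only the groups that contain more than one row
dupes = FILTER grpd BY COUNT(recs) > 1;
out   = FOREACH dupes GENERATE group AS id, COUNT(recs) AS cnt;
STORE out INTO 'dupe';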
In a Hadoop cluster (1.x) where the NameNode and the JobTracker are not the same server, do conf/masters and conf/slaves need to be specified on both the NameNode and the JobTracker, or only on the NameNode? I can't seem to find a straight answer to this in the documentation.
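As background, in 1.x these two files are read only by the start/stop shell scripts on whichever node you run them from (start-dfs.sh is typically run on the NameNode, start-mapred.sh on the JobTracker), so each such node needs its own copy. A sketch with made-up hostnames:

# conf/masters -- despite the name, this lists SecondaryNameNode hosts
secondary.example.com

# conf/slaves -- hosts on which DataNodes / TaskTrackers are started
slave1.example.com
slave2.example.com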
I am using Apache Hadoop 2.2.0 in a macOS development environment. When I try to run the Hadoop minicluster as described in the Apache documentation:
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar minicluster
I get the error:

java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/server/MiniYARNCluster
    at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:170)
    at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:129)
    at ...
Any idea how to fix this?
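One commonly suggested cause is that MiniYARNCluster lives in the hadoop-yarn-server-tests tests jar, which is not on the default classpath; putting it there before launching may help (the jar path below is an assumption and can differ by distribution):

# run from the Hadoop 2.2.0 installation directory
export HADOOP_CLASSPATH=./share/hadoop/yarn/test/hadoop-yarn-server-tests-2.2.0-tests.jar
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar minicluster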
Here is my setup (the details are missing from the original copy). But it shows an error message like this:
Current value: http://master ip:50070/webhdfs/v1
Filesystem root '/' should be owned by 'hdfs'
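This looks like Hue's HDFS-browser sanity check, which expects the HDFS root directory to be owned by the hdfs user; a commonly suggested fix (assuming an hdfs superuser account exists on your cluster) is:

# run as the HDFS superuser
hadoop fs -chown hdfs:hdfs /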
We are currently interested in evaluating Datameer and have some questions. Could any Datameer users answer these:
Since Datameer works off HDFS, is query speed similar to Hive's? How does its query speed compare with a columnar database?

Since Hadoop is known for high latency, is Datameer advisable for real-time querying?

Thanks,
Ravi
I get the following error when trying to compile my Pig UDF with Maven (or with my IDE, IntelliJ):
cannot access org.apache.hadoop.io.WritableComparable
class file for org.apache.hadoop.io.WritableComparable not found
So I thought I would add a dependency on hadoop-core to my POM file, but still nothing changed, even though I checked and the WritableComparable class is in the jar.

My UDF class looks like this:
import java.io.IOException;
import java.util.Iterator;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class INCREMENTAL_UPDATE extends EvalFunc<DataBag> {
    TupleFactory tupleFactory = TupleFactory.getInstance();
    BagFactory bagFactory = BagFactory.getInstance();

    public DataBag exec(Tuple input) throws IOException {
        // note: the original tested input.size() != 0 here, which returns
        // null for every non-empty input; == 0 is almost certainly intended
        if (null == input || input.size() == 0) {
            return null;
        }
        try {
            DataBag inputbag = (DataBag) input.get(0);
            Iterator<Tuple> it = inputbag.iterator();
            DataBag outputbag = bagFactory.newDefaultBag();
            Tuple previousTuple = null;
            while (it.hasNext()) {
                Tuple currentTuple = it.next();
                Tuple outputTuple = tupleFactory.newTuple();
                for (int … (code truncated in the original)
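A common culprit for this compile error is a missing or mis-scoped hadoop-core on the compile classpath, since Pig's EvalFunc hierarchy references WritableComparable. A hedged POM fragment (versions are examples; match them to your cluster):

<dependencies>
  <dependency>
    <groupId>org.apache.pig</groupId>
    <artifactId>pig</artifactId>
    <version>0.11.1</version>
  </dependency>
  <!-- hadoop-core must be resolvable at compile time, not only at runtime -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
  </dependency>
</dependencies>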
apache-pig ×3
java ×2
hadoop-yarn ×1
hbase ×1
hive ×1
hue ×1
mapper ×1
mapreduce ×1
maven ×1
partitioning ×1
reducers ×1