I have just installed Hadoop and HBase from Cloudera (CDH3), but when I go to http://localhost:60010 the page just sits there loading forever.
I can reach the regionserver at http://localhost:60030 just fine... Looking at the HBase master log, I can see the following.
It looks like a problem with the -ROOT- region.
All of this is installed on a 1 TB ext4 partition running Ubuntu 11.04 (Natty). There is no cluster or other boxes involved.
Any help would be great!
11/05/15 19:58:27 WARN master.AssignmentManager: Failed assignment of -ROOT-,,0.70236052 to serverName=localhost,60020,1305452402149, load=(requests=0, regions=0, usedHeap=24, maxHeap=995), trying to assign elsewhere instead; retry=0
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to /127.0.0.1:60020 after attempts=1
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:355)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:957)
at org.apache.hadoop.hbase.master.ServerManager.getServerConnection(ServerManager.java:606)
at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:541)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:901)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:730)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:710)
at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1605)
at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883) …

I know there are plenty of posts about this exception, but I have not been able to solve the problem. I suspect I have to edit the classpath to fix it. I am trying to run a program called DistMap on a Hadoop infrastructure. This is the error I get.
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.util.PlatformName. Program will exit.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FsShell
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FsShell
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.fs.FsShell. Program will exit.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.util.PlatformName
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306) …

I am writing a Hadoop program and I really do not want to work with deprecated classes. Nowhere online can I find an updated example that uses the org.apache.hadoop.conf.Configuration class instead of the org.apache.hadoop.mapred.JobConf class.
public static void main(String[] args) throws Exception {
JobConf conf = new JobConf(Test.class);
conf.setJobName("TESST");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(Map.class);
conf.setCombinerClass(Reduce.class);
conf.setReducerClass(Reduce.class);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
JobClient.runJob(conf);
}
This is what my main() looks like. Could anyone please show me the updated equivalent?
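For reference, here is a minimal sketch of the same driver rewritten against the newer org.apache.hadoop.mapreduce API (Job instead of JobConf). The Map and Reduce inner classes are only word-count placeholders so the sketch compiles on its own; with the new API your mapper and reducer extend org.apache.hadoop.mapreduce.Mapper and Reducer instead of implementing the old mapred interfaces.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class Test {

    // Placeholder mapper (word count); substitute your own map() logic.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Placeholder reducer/combiner (sums the counts); substitute your own reduce() logic.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "TESST");   // on Hadoop 2.x prefer Job.getInstance(conf, "TESST")
        job.setJarByClass(Test.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class);
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}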
I have installed Hadoop and HBase CDH3u2. In Hadoop I have a file at the path /home/file.txt. It contains data like this:
one,1
two,2
three,3
I want to import this file into HBase: the first field should be parsed as a String and the second as an integer, and the result should then be pushed into HBase. Please help me do this.

Thanks in advance....
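As a starting point, here is a minimal sketch using the plain HBase Java client: it reads the file from HDFS line by line, parses the two fields, and writes one Put per row. The table name (mytable), column family (cf) and qualifier (value) are assumptions, and the table must already exist (for example via create 'mytable', 'cf' in the HBase shell).

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SimpleImport {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Open /home/file.txt from HDFS (adjust the path or FileSystem if the file is local).
        FileSystem fs = FileSystem.get(conf);
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path("/home/file.txt"))));

        HTable table = new HTable(conf, "mytable");   // assumed table with family 'cf'
        String line;
        while ((line = reader.readLine()) != null) {
            String[] parts = line.split(",");
            String rowKey = parts[0];                          // first field kept as a String row key
            int number = Integer.parseInt(parts[1].trim());    // second field parsed as an int
            Put put = new Put(Bytes.toBytes(rowKey));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("value"), Bytes.toBytes(number));
            table.put(put);
        }
        table.close();
        reader.close();
    }
}

For larger files, the importtsv tool or a MapReduce job emitting Puts would be the usual route, but the loop above is the simplest way to see the parsing and the HBase write working end to end.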
I am a bit stuck fixing a broken table (on HBase 0.92.1-cdh4.0.0, Hadoop 2.0.0-cdh4.0.0).
A region in transition never completes:
Region State
bf2025f4bc154914b5942af4e72ea063 counter_traces,1329773878.35_766a0b4df75e4381a686fbc07db9e333,1339425291230.bf2025f4bc154914b5942af4e72ea063. state=OFFLINE, ts=Tue Jun 12 11:43:53 CEST 2012 (0s ago), server=null
When I run sudo -u hbase hbase hbck -repair, I get:
Number of empty REGIONINFO_QUALIFIER rows in .META.: 0
ERROR: Region { meta => counter_traces,1329773878.35_766a0b4df75e4381a686fbc07db9e333,1339425291230.bf2025f4bc154914b5942af4e72ea063., hdfs => hdfs://hbase001:8020/hbase/counter_traces/bf2025f4bc154914b5942af4e72ea063, deployed => } not deployed on any region server.
Trying to fix unassigned region...
12/06/12 11:44:40 INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME => 'counter_traces,1329773878.35_766a0b4df75e4381a686fbc07db9e333,1339425291230.bf2025f4bc154914b5942af4e72ea063.', STARTKEY => '1329773878.35_766a0b4df75e4381a686fbc07db9e333', ENDKEY => '1329793347.58_163865765c0a11e184ab003048f0e77e', …

I am trying to list my directory in HDFS with the following:
ubuntu@ubuntu:~$ hadoop fs -ls hdfs://127.0.0.1:50075/
ls: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException:
Protocol message end-group tag did not match expected tag.;
Host Details : local host is: "ubuntu/127.0.0.1"; destination host is: "ubuntu":50075;
Here is my /etc/hosts file:
127.0.0.1 ubuntu localhost
#127.0.1.1 ubuntu
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
How do I use hdfs:// correctly to list my directory?
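For context, an hdfs:// URI has to point at the NameNode RPC port (whatever fs.default.name / fs.defaultFS is set to, commonly 8020 on CDH), while 50075 is the DataNode web port, which speaks HTTP rather than the HDFS RPC protocol. Below is a minimal sketch of listing a directory through the FileSystem Java API, assuming the NameNode is listening on 127.0.0.1:8020; the same URI works on the command line, e.g. hadoop fs -ls hdfs://127.0.0.1:8020/.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHdfsDir {
    public static void main(String[] args) throws Exception {
        // 8020 is only the usual CDH NameNode RPC port; match it to fs.default.name.
        FileSystem fs = FileSystem.get(URI.create("hdfs://127.0.0.1:8020/"), new Configuration());
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}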
I am using Cloudera 4.3 on Ubuntu 12.04.
I am getting this error:
Installation failed. Failed to receive heartbeat from agent.
when I install Cloudera on a single node. Here is what is in my /etc/hosts file:
127.0.0.1 localhost
192.168.2.131 ubuntu
Here is what is in my /etc/hostname file:
ubuntu
Here is the error in my /var/log/cloudera-scm-agent file:
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
[13/Jun/2014 12:31:58 +0000] 15366 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include
[13/Jun/2014 12:31:58 +0000] 15366 …

I have been experimenting and googling for hours, with no luck.
I have a Spark Streaming application that runs fine in a local Spark cluster. Now I need to deploy it on Cloudera 5.4.4. I need to be able to start it, have it run continuously in the background, and be able to stop it.
I tried this:
$ spark-submit --master yarn-cluster --class MyMain my.jar myArgs
But it just keeps printing these lines endlessly:
15/07/28 17:58:18 INFO Client: Application report for application_1438092860895_0012 (state: RUNNING)
15/07/28 17:58:19 INFO Client: Application report for application_1438092860895_0012 (state: RUNNING)
Question number 1: since it is a streaming application, it needs to run continuously. So how do I run it in a "background" mode? All the examples I have found of submitting Spark jobs on YARN seem to assume the application will do some work and then terminate, so you would want to run it in the foreground. That is not the case for streaming.
Next... at this point the application does not seem to be working correctly. I figured it might be a bug or misconfiguration on my side, so I tried to check the logs to see what is going on:
$ yarn logs -applicationId application_1438092860895_012
But it tells me:
/tmp/logs/hdfs/logs/application_1438092860895_0012 does not have any log files.
So, question number 2: if the application is running, why does it have no log files?
So eventually I had to kill it:
$ yarn application -kill application_1438092860895_012
Which raises question number 3: assuming I can eventually get the application started and running in the background, is "yarn application -kill" the preferred way to stop it?
I have a general question about Apache Spark:
We have some Spark Streaming scripts that consume Kafka messages. The problem: they fail randomly, without any specific error...
Some of the scripts do nothing when I run them manually, while one of them shows the following message:
ERROR SparkUI: Failed to bind SparkUI java.net.BindException: Address already in use: Service 'SparkUI' failed after 16 retries!
So I would like to know whether there is a specific way to run these scripts in parallel?
They are all in the same jar, and I run them with Supervisor. Spark is installed via Cloudera Manager 5.4 on YARN.
Here is how I launch a script:
sudo -u spark spark-submit --class org.soprism.kafka.connector.reader.TwitterPostsMessageWriter /home/soprism/sparkmigration/data-migration-assembly-1.0.jar --master yarn-cluster --deploy-mode client
Thanks for your help!
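For what it is worth, "Address already in use: Service 'SparkUI' failed after 16 retries" usually just means that several drivers on the same host are competing for the default UI port (4040) and the ports right above it. Below is a minimal, hypothetical sketch of giving each job its own UI port, or letting Spark probe more ports, via SparkConf; the port and retry values are only examples, and the same settings can also be passed to spark-submit with --conf spark.ui.port=... and --conf spark.port.maxRetries=....

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class UiPortExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("kafka-connector-example")    // hypothetical app name
                .set("spark.ui.port", "4050")              // give this driver its own UI port...
                .set("spark.port.maxRetries", "32");       // ...or let Spark try more ports before giving up
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build the streaming context / Kafka consumers here ...
        sc.stop();
    }
}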
Update: I changed the command and now run it as follows (it no longer shows that particular message):
root@ns6512097:~# sudo -u spark spark-submit --class org.soprism.kafka.connector.reader.TwitterPostsMessageWriter --master yarn --deploy-mode client /home/soprism/sparkmigration/data-migration-assembly-1.0.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/avro-tools-1.7.6-cdh5.4.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/09/28 16:14:21 INFO Remoting: Starting remoting
15/09/28 16:14:21 INFO Remoting: Remoting started; listening …

I am trying to deploy Livy Server on Amazon EMR. First, I built the Livy master branch:
mvn clean package -Pscala-2.11 -Pspark-2.0
Then I uploaded it to the EMR cluster master node. I set the following configuration:
livy-env.sh
SPARK_HOME=/usr/lib/spark
HADOOP_CONF_DIR=/etc/hadoop/conf
livy.conf
livy.spark.master = yarn
livy.spark.deployMode = cluster
When I start Livy, it hangs indefinitely while connecting to the YARN ResourceManager (XX.XX.XXX.XX is the IP address):
16/10/28 17:56:23 INFO RMProxy: Connecting to ResourceManager at /XX.XX.XXX.XX:8032
However, when I netcat port 8032, the connection succeeds:
nc -zv XX.XX.XXX.XX 8032
Connection to XX.XX.XXX.XX 8032 port [tcp/pro-ed] succeeded!
I think I may have missed a step. Does anyone know what that step might be?