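Can anyone tell me the difference between the Apache HBase database and Bigtable? Or are they the same? Which one supports relations, if either? If they are both "big searchers", what is the difference?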
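I am using a ZooKeeper ensemble for HBase. ZooKeeper runs on 3 machines, and HBase is also in fully distributed mode. I have Nutch 2.x. When I start Nutch to crawl some data, it writes the following error to the Nutch log file.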
ERROR zookeeper.ClientCnxnSocketNIO - Unable to open socket to localhost/0:0:0:0:0:0:0:1:2181
2015-01-23 16:34:21,956 WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.SocketException: Network is unreachable
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:457)
at sun.nio.ch.Net.connect(Net.java:449)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:647)
at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:266)
at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:276)
at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:958)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:993)
2015-01-23 16:34:22,063 WARN zookeeper.RecoverableZooKeeper - Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
When I run the following command against each of the three ZooKeeper instances, it replies imok:
echo ruok | nc 1.1.1.1 2181
What is wrong here? My HBase version is 0.94.14, ZooKeeper is 3.4.5, Solr is 4.10.3 (used for indexing), and Nutch is 2.2.3.
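The log above shows the client trying localhost:2181 even though the ensemble answers `ruok` on its real addresses, so one thing that may be worth checking is whether the hbase-site.xml that Nutch actually loads names the ensemble in `hbase.zookeeper.quorum`. Below is a minimal sketch of such a check using only the JDK. The class `HbaseSiteCheck` and its methods are hypothetical helpers, not part of HBase or Nutch, and the sketch assumes the usual Hadoop-style `<configuration><property>` layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not an HBase or Nutch class): inspects Hadoop-style
// configuration XML to see which ZooKeeper quorum a client would be given.
final class HbaseSiteCheck {

    // Returns the value of the named property, or null if it is absent.
    static String readProperty(String xml, String propertyName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                if (propertyName.equals(textOf(property, "name"))) {
                    return textOf(property, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration XML", e);
        }
    }

    // Trimmed text of the first child element with the given tag, or null.
    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // True when the quorum is missing or lists only the local machine.
    // Assumption: an unset quorum falls back to localhost on the client.
    static boolean pointsAtLocalhost(String quorum) {
        if (quorum == null || quorum.trim().isEmpty()) {
            return true;
        }
        for (String host : quorum.split(",")) {
            String h = host.trim();
            if (!h.equals("localhost") && !h.equals("127.0.0.1")) {
                return false;
            }
        }
        return true;
    }
}
```

If `pointsAtLocalhost(readProperty(xml, "hbase.zookeeper.quorum"))` comes back true for the file on Nutch's classpath, that would be consistent with the `localhost/0:0:0:0:0:0:0:1:2181` line in the log. It does not prove that this is the cause.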
I have the following simple code:
import org.apache.hadoop.hbase.client.ConnectionFactory
import org.apache.hadoop.hbase.HBaseConfiguration
val hbaseconfLog = HBaseConfiguration.create()
val connectionLog = ConnectionFactory.createConnection(hbaseconfLog)
I run it in spark-shell and get the following error:
14:23:42 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected
error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:30)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
There are actually many of these errors, and a few of the following appear occasionally:
14:23:46 WARN client.ZooKeeperRegistry: Can't retrieve clusterId from
Zookeeper org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
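On Cloudera's VM I can fix this by simply restarting hbase-master, regionserver, and thrift, but at my company I am not allowed to do that. I also fixed it once by copying hbase-site.xml into the Spark conf directory, but I cannot do that here either. Is there a way to set the path to this specific file through spark-shell parameters?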
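I am building a Hadoop (0.20.1) MapReduce job that uses HBase (0.20.1) as both the data source and the data sink. I would like to write the job in Python, which requires hadoop-0.20.1-streaming.jar to stream data to and from my Python script. This works fine when the data source/sink are HDFS files.
Does Hadoop support streaming to/from HBase for MapReduce?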
I am new to OpenTSDB. I somehow managed to install and configure OpenTSDB, but I don't know how to put data into it from a client. Can anyone help me?
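For the OpenTSDB question, one simple route is the telnet-style `put` command, a single text line of the form `put <metric> <timestamp> <value> <tagk=tagv> ...` written to the TSD's TCP port. Here is a minimal sketch under some assumptions: the class `TsdbPutLine`, the metric name, the tag, and the host and port passed to `send` are all illustrative, and 4242 is only the commonly used default port.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not an OpenTSDB class): builds and sends one
// telnet-style "put" line.
final class TsdbPutLine {

    // Formats "put <metric> <timestamp> <value> <tagk=tagv>...\n".
    // Tags are sorted by key so the output is deterministic.
    static String format(String metric, long epochSeconds, double value,
                         Map<String, String> tags) {
        if (tags.isEmpty()) {
            throw new IllegalArgumentException("at least one tag is required");
        }
        StringBuilder line = new StringBuilder("put ")
                .append(metric).append(' ')
                .append(epochSeconds).append(' ')
                .append(value);
        for (Map.Entry<String, String> tag : new TreeMap<>(tags).entrySet()) {
            line.append(' ').append(tag.getKey()).append('=').append(tag.getValue());
        }
        return line.append('\n').toString();
    }

    // Writes one formatted line to the TSD. Host and port are assumptions,
    // for example ("localhost", 4242) on a default local install.
    static void send(String host, int port, String putLine) throws IOException {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            out.write(putLine.getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }
}
```

A call such as `TsdbPutLine.format("sys.cpu.user", 1356998400L, 42.5, tags)` with a single `host=webserver01` tag yields `put sys.cpu.user 1356998400 42.5 host=webserver01` followed by a newline. Depending on the TSD configuration, the metric may need to exist (or auto-creation be enabled) before the data point is accepted.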
In our infinite wisdom, we decided to key our rows with a tab in the middle:
item_id <tab> location
For example:
000001 http://www.url.com/page
Using the HBase shell, we cannot run a get command because the tab character cannot be entered properly on the input line. We tried
get 'tableName', '000001\thttp://www.url.com/page'
without success. What should we do?
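A likely explanation for the failed `get` is quoting: the HBase shell is JRuby-based, and Ruby only expands escapes such as `\t` inside double-quoted strings, so `'000001\thttp://www.url.com/page'` contains a backslash followed by `t` rather than a tab. Passing the key in double quotes, as in `get 'tableName', "000001\thttp://www.url.com/page"`, is probably what is needed. The sketch below shows the byte-level difference between the two keys in plain Java. The class `TabRowKey` is a hypothetical illustration, not HBase code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: compares the row key that was stored (real tab)
// with the key a single-quoted shell literal would produce (backslash + 't').
final class TabRowKey {

    // Row key with a real tab character (one byte, 0x09) between the parts.
    static byte[] withRealTab(String itemId, String location) {
        return (itemId + "\t" + location).getBytes(StandardCharsets.UTF_8);
    }

    // Row key as a single-quoted Ruby literal holds it: the two characters
    // '\' and 't' instead of a tab, so it is one byte longer.
    static byte[] withLiteralBackslashT(String itemId, String location) {
        return (itemId + "\\t" + location).getBytes(StandardCharsets.UTF_8);
    }
}
```

Because the two byte arrays differ, a lookup with the single-quoted form addresses a different row key and finds nothing.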
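I am running some tests on a large-scale problem using a table mapper and reducer. After a certain point, when the job is about 80% complete, my reducers start failing. From the syslog, the problem appears to be that one of my ZooKeeper clients tries to connect to localhost instead of the other ZooKeeper nodes in the quorum.
Strangely, it seems to connect to the other nodes just fine while mapping is in progress; it is the reduce phase that has trouble. Below are selected portions of the syslog that may help determine what is going on.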
2014-06-27 09:44:01,599 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hdev02:5181,hdev01:5181,hdev03:5181 sessionTimeout=10000 watcher=hconnection-0x4aee260b, quorum=hdev02:5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-27 09:44:01,612 INFO [main] org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x4aee260b connecting to ZooKeeper ensemble=hdev02:5181,hdev01:5181,hdev03:5181
2014-06-27 09:44:01,614 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server hdev02/172.17.43.36:5181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2014-06-27 09:44:01,615 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Socket connection established to hdev02/172.17.43.36:5181, initiating session
2014-06-27 09:44:01,617 INFO [main-SendThread(hdev02:5181)] org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed …
I am running a test HBase Java program through an Oozie java action and get the following error:
Failing Oozie Launcher, Main class [HbaseTest], main() threw exception, org/apache/hadoop/hbase/HBaseConfiguration
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at HbaseTest.main(HbaseTest.java:28)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:495)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 14 more
The program runs correctly from the command line:
java -cp `hbase classpath` HbaseTest
Is there a way to pass the output of 'hbase classpath' to the Oozie java action? I don't want to copy the HBase jars into the workflow's lib directory, since that would be a maintenance overhead.
Here is the java action from workflow.xml:
<java>
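<job-tracker>${jobTracker}</job-tracker> …
I am writing a batch job that puts a batch of objects into HBase through HTableInterface. There are two API methods: HTableInterface.put(List) and HTableInterface.put(Put).
I would like to know whether, for the same number of Put objects, the batch put is faster than putting them one at a time.
Another question: I put a very large Put object, and that caused the job to fail. The size of a Put object seems to be limited. How large can it be?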
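On the batching question, a common pattern is to pass moderately sized lists to `put(List)` rather than one enormous list or many single calls. The sketch below shows only the chunking step in plain Java. `PutBatcher` and the batch size are hypothetical, and the elements stand in for `Put` objects. Whether this is actually faster depends on the client version and settings such as auto-flush and the write buffer, so it is worth measuring. On the size limit, the client-side cell size check is, as far as I recall, controlled by `hbase.client.keyvalue.maxsize`; treat that as an assumption to verify against your version's documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an HBase class): splits a list of pending writes
// into fixed-size batches, each of which could be handed to put(List).
final class PutBatcher {

    // Returns consecutive sublists of at most batchSize elements, in order.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each returned batch would then be submitted with one `put(List)` call, which keeps any single request bounded in size.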
I am using the Java HBase API to get values from HBase. Here is my code.
public class GetViewFromHbaseBolt extends BaseBasicBolt {
private HTable table;
private String zkQuorum;
private String zkClientPort;
private String tableName;
public GetViewFromHbaseBolt(String table, String zkQuorum,
String zkClientPort) {
this.tableName = table;
this.zkQuorum = zkQuorum;
this.zkClientPort = zkClientPort;
}
@Override
public void prepare(Map config, TopologyContext context) {
try {
table = getHTable();
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void execute(Tuple tuple, BasicOutputCollector collector) {
try {
if (tuple.size() > 0) {
Long dtmid = tuple.getLong(0);
byte[] …