
configuration for remote HBase cluster #96

Closed
mrauter opened this issue Dec 16, 2016 · 3 comments

mrauter commented Dec 16, 2016

Maybe a stupid question, but how can I configure the HBase connector for a ZooKeeper/HBase cluster that is not running on localhost?

[2016-12-16 14:03:42,449] INFO Process identifier=hconnection-0x25a855e1 connecting to ZooKeeper ensemble=localhost:2181 (org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:120)

[2016-12-16 14:03:59,391] ERROR ZooKeeper exists failed after 4 attempts (org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:277)

I couldn't find the setting for this.

stheppi commented Dec 16, 2016

@mrauter HBaseConfiguration.create() picks up the configuration from hbase-site.xml (it typically sits in /etc/hbase/conf, and you need to add that location to the connector classpath). This is the best practice for connecting to HBase, as opposed to calling #set on the configuration with the required settings. Your hbase-site.xml should have all the values pointing at the remote HBase cluster.
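To illustrate what HBaseConfiguration.create() relies on: hbase-site.xml uses Hadoop's name/value property XML format, and the quorum address is read from the hbase.zookeeper.quorum property. A minimal stdlib-only sketch of that lookup (the class name, helper method, and 192.168.1.101 address are illustrative, not part of the connector):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Sketch: a stand-in for the part of Hadoop's Configuration loader that
// HBaseConfiguration.create() uses to read hbase-site.xml off the classpath.
public class HbaseSiteSketch {
    static final String HBASE_SITE =
        "<configuration>" +
        "  <property>" +
        "    <name>hbase.zookeeper.quorum</name>" +
        "    <value>192.168.1.101</value>" +  // example value, not a real cluster
        "  </property>" +
        "</configuration>";

    // Return the <value> of the <property> whose <name> matches, or null.
    static String lookup(String xml, String wanted) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (name.equals(wanted)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        // A missing property means HBase falls back to its built-in default
        // (localhost for the ZooKeeper quorum).
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(lookup(HBASE_SITE, "hbase.zookeeper.quorum"));
    }
}
```

If this lookup comes back empty in the real client, HBase silently uses its defaults, which is exactly the localhost:2181 symptom in the log above.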

stheppi closed this as completed Dec 16, 2016
@artiship commented:

@stheppi

1. hbase-site.xml:
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.1.101</value>
    <description>The directory shared by region servers.</description>
  </property>
</configuration>
2. Add hbase-site.xml to the CLASSPATH:
export CLASSPATH=$CLASSPATH:/opt/hbase-site.xml
3. Restart Confluent Connect:
bin/confluent stop connect
bin/confluent start connect
4. Start the HBase connector, following the example at http://docs.datamountaineer.com/en/latest/hbase.html:
bin/connect-cli create hbase-sink < conf/hbase-sink.properties

#Connector name=`hbase-sink`
name=person-hbase-test
connector.class=com.datamountaineer.streamreactor.connect.hbase.HbaseSinkConnector
tasks.max=1
topics=hbase-topic
connect.hbase.column.family=d
connect.hbase.kcql=INSERT INTO person SELECT * FROM hbase-topic PK firstName, lastName
#task ids: 0
5. But the log shows that HBase still connects to localhost rather than the 192.168.1.101 configured in hbase-site.xml:
bin/confluent log connect

[2017-11-17 16:13:08,541] INFO Process identifier=hconnection-0x4107c3fd connecting to ZooKeeper ensemble=localhost:2181 (org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:120)
6. And then the task throws an exception:
[2017-11-17 16:44:55,196] INFO Setting connector hbase-sink-test state to STARTED (org.apache.kafka.connect.runtime.Worker:541)
[2017-11-17 16:44:55,203] INFO SinkConnectorConfig values: 
        connector.class = com.datamountaineer.streamreactor.connect.hbase.HbaseSinkConnector
        key.converter = null
        name = hbase-sink-test
        tasks.max = 1
        topics = [hbase-topic]
        transforms = null
        value.converter = null
 (org.apache.kafka.connect.runtime.SinkConnectorConfig:223)
[2017-11-17 16:44:55,203] INFO EnrichedConnectorConfig values: 
        connector.class = com.datamountaineer.streamreactor.connect.hbase.HbaseSinkConnector
        key.converter = null
        name = hbase-sink-test
        tasks.max = 1
        topics = [hbase-topic]
        transforms = null
        value.converter = null
 (org.apache.kafka.connect.runtime.ConnectorConfig$EnrichedConnectorConfig:223)
[2017-11-17 16:44:55,204] INFO Setting task configurations for 1 workers. (com.datamountaineer.streamreactor.connect.hbase.HbaseSinkConnector:52)
[2017-11-17 16:44:58,273] INFO 127.0.0.1 - - [17/Nov/2017:08:44:58 +0000] "GET /connectors/hbase-sink-test/status HTTP/1.1" 200 161  5 (org.apache.kafka.connect.runtime.rest.RestServer:60)
[2017-11-17 16:48:04,619] ERROR Failed to get region location  (org.apache.hadoop.hbase.client.AsyncProcess:420)
org.apache.hadoop.hbase.client.NoServerForRegionException: Unable to find region for John\x0ASmith in person after 35 tries.
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1329)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1199)
        at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:410)
        at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:359)
        at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
        at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:190)
        at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1498)
        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1094)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter$$anonfun$insert$1$$anonfun$1.apply$mcV$sp(HbaseWriter.scala:104)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter$$anonfun$insert$1$$anonfun$1.apply(HbaseWriter.scala:104)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter$$anonfun$insert$1$$anonfun$1.apply(HbaseWriter.scala:104)
        at scala.util.Try$.apply(Try.scala:192)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter$$anonfun$insert$1.apply(HbaseWriter.scala:104)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter$$anonfun$insert$1.apply(HbaseWriter.scala:75)
        at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter.insert(HbaseWriter.scala:75)
        at com.datamountaineer.streamreactor.connect.hbase.writers.HbaseWriter.write(HbaseWriter.scala:64)
        at com.datamountaineer.streamreactor.connect.hbase.HbaseSinkTask$$anonfun$put$2.apply(HbaseSinkTask.scala:83)
        at com.datamountaineer.streamreactor.connect.hbase.HbaseSinkTask$$anonfun$put$2.apply(HbaseSinkTask.scala:83)
        at scala.Option.foreach(Option.scala:257)
        at com.datamountaineer.streamreactor.connect.hbase.HbaseSinkTask.put(HbaseSinkTask.scala:83)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:435)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
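One likely culprit in the steps above is step 2: a JVM classpath entry must be a directory or a jar, so pointing CLASSPATH at /opt/hbase-site.xml (the file itself) leaves the resource invisible to the class loader, and HBase falls back to its default localhost:2181 quorum. The entry should be the directory containing the file. A stdlib sketch of the difference (class name and temp paths are illustrative):

```java
import java.io.File;
import java.io.FileWriter;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

// Sketch: classpath entries that do not end in "/" are treated as JAR files,
// so an entry that is a plain XML file yields no resources. An entry that is
// the containing directory makes hbase-site.xml visible as a resource.
public class ClasspathEntryDemo {
    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("hbase-conf").toFile();
        File site = new File(dir, "hbase-site.xml");
        try (FileWriter w = new FileWriter(site)) {
            w.write("<configuration/>");
        }

        // Entry is the file itself (mirrors `export CLASSPATH=...:/opt/hbase-site.xml`).
        try (URLClassLoader fileEntry =
                 new URLClassLoader(new URL[]{ site.toURI().toURL() }, null)) {
            System.out.println("file entry finds it: "
                + (fileEntry.getResource("hbase-site.xml") != null));
        }

        // Entry is the directory containing the file (what HBase expects).
        try (URLClassLoader dirEntry =
                 new URLClassLoader(new URL[]{ dir.toURI().toURL() }, null)) {
            System.out.println("dir entry finds it: "
                + (dirEntry.getResource("hbase-site.xml") != null));
        }
    }
}
```

With a directory entry (e.g. `export CLASSPATH=$CLASSPATH:/etc/hbase/conf`), HBaseConfiguration.create() can find hbase-site.xml and the quorum setting takes effect.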


@stheppi Can the HBase sink support specifying the HBase connection in the connector properties rather than in hbase-site.xml?

lanbotdeployer pushed a commit that referenced this issue Sep 12, 2024
* Update google-cloud-core, ... to 2.43.0

* Update logback-classic, logback-core to 1.5.8

---------

Co-authored-by: Scala Steward <[email protected]>