pyaccumulo Tutorial
svn co https://svn.apache.org/repos/asf/accumulo/branches/1.5 accumulo-1.5-src
cd accumulo-1.5-src
mvn package -P assemble -DskipTests
tar -C ../ -xvf assemble/target/apache-accumulo-1.5.0-SNAPSHOT-dist.tar.gz
cd ../apache-accumulo-1.5.0-SNAPSHOT
Now either copy over your existing Accumulo configs (accumulo-site.xml and accumulo-env.sh) or edit those two files manually. See the "CONFIGURATION" section of http://accumulo.apache.org/1.4/user_manual/Administration.html#Installation for more info.
Now edit proxy/proxy.properties and make sure the following settings have these values:
org.apache.accumulo.proxy.ProxyServer.useMockInstance=false
org.apache.accumulo.proxy.ProxyServer.useMiniAccumulo=false
org.apache.accumulo.proxy.ProxyServer.protocolFactory=org.apache.thrift.protocol.TCompactProtocol$Factory
org.apache.accumulo.proxy.ProxyServer.port=50096
org.apache.accumulo.proxy.ProxyServer.instancename=test
org.apache.accumulo.proxy.ProxyServer.zookeepers=localhost:2181
Make sure that the instance name and zookeepers match your setup.
This assumes you already have Accumulo running. If not, see this section of the manual for more info: http://accumulo.apache.org/1.4/user_manual/Administration.html#Running
Now, from the apache-accumulo-1.5.0-SNAPSHOT directory (created by unpacking the tarball), start the proxy:
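To double-check the file after editing, something like the following sketch could help. It parses simple key=value lines and reports any of the fixed settings listed above that differ. The names parse_properties, check_proxy_properties, and EXPECTED are made up for this illustration and are not part of Accumulo; instancename and zookeepers are left out because they depend on your setup.

```python
# Sketch only: helper names here are hypothetical, not part of Accumulo.
# EXPECTED holds the setup-independent values from the list above.
EXPECTED = {
    "org.apache.accumulo.proxy.ProxyServer.useMockInstance": "false",
    "org.apache.accumulo.proxy.ProxyServer.useMiniAccumulo": "false",
    "org.apache.accumulo.proxy.ProxyServer.protocolFactory":
        "org.apache.thrift.protocol.TCompactProtocol$Factory",
    "org.apache.accumulo.proxy.ProxyServer.port": "50096",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_proxy_properties(text, expected=EXPECTED):
    """Return {key: (expected, actual)} for every setting that differs."""
    props = parse_properties(text)
    return dict(
        (key, (want, props.get(key)))
        for key, want in expected.items()
        if props.get(key) != want
    )

# Usage, run from the apache-accumulo-1.5.0-SNAPSHOT directory:
#   with open("proxy/proxy.properties") as f:
#       print(check_proxy_properties(f.read()))
```

An empty dict means all four settings match; otherwise each entry shows the expected value next to what the file contains (None if the key is missing).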
./bin/accumulo proxy -p proxy/proxy.properties
At this point you should be able to use the pyaccumulo lib to access this proxy server.
git clone git@github.com:accumulo/pyaccumulo.git
cd pyaccumulo
sudo pip install -r requirements.txt
export PYTHONPATH="."
vi settings.py
Make sure these settings match your setup (user/password/etc).
HOST = "localhost"
PORT = 50096
USER = 'root'
PASSWORD = 'secret'
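For reference, here is a rough sketch of how these four values could be handed to the client. The keyword names (host, port, user, password) are the ones that appear in the Accumulo(...) call in examples/analytics.py; the helper connection_kwargs itself is hypothetical and only exists for this illustration.

```python
def connection_kwargs(settings):
    """Map the names in settings.py to keyword arguments for Accumulo(...).

    Hypothetical helper for illustration; it is not part of pyaccumulo.
    """
    port = int(settings.PORT)  # accept "50096" as well as 50096
    if not 0 < port < 65536:
        raise ValueError("PORT out of range: %r" % (settings.PORT,))
    return {
        "host": settings.HOST,
        "port": port,
        "user": settings.USER,
        "password": settings.PASSWORD,
    }

# Usage (needs pyaccumulo on PYTHONPATH and a running proxy):
#   import settings
#   from pyaccumulo import Accumulo
#   conn = Accumulo(**connection_kwargs(settings))
```

The port check is there only to catch typos early; the connection itself will still fail if the proxy is not listening on that port.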
Now run the following command. Note: this will create a table called analytics and write to it. If you already have a table named analytics, you probably want to edit examples/analytics.py before running this command.
python examples/analytics.py
You should see output like this:
Cell(row='row', cf='count', cq='cq', cv='', ts=1363777816377, val='1000')
Cell(row='row', cf='histo', cq='cq', cv='', ts=1363777816377, val='1000,2000,3000,4000,5000,6000,7000,8000,9000')
Cell(row='row', cf='max', cq='cq', cv='', ts=1363777816377, val='999')
Cell(row='row', cf='min', cq='cq', cv='', ts=1363777816377, val='0')
Cell(row='row', cf='sum', cq='cq', cv='', ts=1363777816377, val='499500')
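As a sanity check on those numbers: the count, max, min, and sum values are consistent with the example having written the integers 0 through 999. That is an inference from the output, not something stated here, so treat the sketch below as an assumption. The Cell namedtuple is a stand-in that mirrors the printed fields (the real type comes from pyaccumulo), and the histo cell is left out because its format is not explained above.

```python
from collections import namedtuple

# Stand-in mirroring the printed fields; the real Cell type comes from pyaccumulo.
Cell = namedtuple("Cell", "row cf cq cv ts val")

cells = [
    Cell("row", "count", "cq", "", 1363777816377, "1000"),
    Cell("row", "max", "cq", "", 1363777816377, "999"),
    Cell("row", "min", "cq", "", 1363777816377, "0"),
    Cell("row", "sum", "cq", "", 1363777816377, "499500"),
]

# Index the numeric values by column family.
stats = dict((c.cf, int(c.val)) for c in cells)

# Assumption: the input was the integers 0..999.
data = range(1000)
expected = {
    "count": len(data),
    "max": max(data),
    "min": min(data),
    "sum": sum(data),
}

print(stats == expected)  # prints True
```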
If you see the following instead, then either your proxy server isn't running or the settings in settings.py are wrong.
Traceback (most recent call last):
  File "examples/analytics.py", line 21, in <module>
    conn = Accumulo(host=settings.HOST, port=settings.PORT, user=settings.USER, password=settings.PASSWORD)
  File "/Users/jtrost/workspace/pyaccumulo/pyaccumulo/__init__.py", line 138, in __init__
    self.transport.open()
  File "/Library/Python/2.7/site-packages/thrift/transport/TTransport.py", line 261, in open
    return self.__trans.open()
  File "/Library/Python/2.7/site-packages/thrift/transport/TSocket.py", line 99, in open
    message=message)
thrift.transport.TTransport.TTransportException: Could not connect to localhost:50096
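To tell those two causes apart, one option is a plain TCP check against the host and port from settings.py, using only the standard library. This is a minimal sketch; proxy_reachable is a made-up helper name, and a successful connection only shows that something is listening there, not that it is the Accumulo proxy or that the user and password are right.

```python
import socket


def proxy_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or timed out.
        return False
    sock.close()
    return True

# Usage, from the pyaccumulo directory:
#   import settings
#   print(proxy_reachable(settings.HOST, settings.PORT))
```

If this returns False, check that ./bin/accumulo proxy is still running and that PORT matches the port in proxy/proxy.properties.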