Creates a Hive table from a config file; practice using Scala and Hadoop APIs.
Text2Hive

NOTE: This is meant as a learning exercise in Scala, sbt, and the Hadoop APIs.

Creates a Hive table and copies a folder to HDFS based on the XML config file(s) passed as arguments. Meant to replicate how Hue lets you upload a text file and creates a Hive table from it.

Steps to run:

  1. sbt assembly
  2. java -jar Text2Hive-assembly-0.1.jar settings.xml

Example XML Config file

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <coreXml>/etc/hadoop/conf/core-site.xml</coreXml>
  <hdfsXml>/etc/hadoop/conf/hdfs-site.xml</hdfsXml>
  <hiveXml>/etc/hive/conf/hive-site.xml</hiveXml>
  <userKeytab>/home/neil/neil.keytab</userKeytab>
  <src>/home/neil/movie_metadatav</src>
  <dest>/user/neil/movie_metadata</dest>
  <tableType>internal</tableType>
  <isHeaders>yes</isHeaders>
  <delimiter>,</delimiter>
  <quoteChar>"</quoteChar>
  <escapeChar>\</escapeChar>
  <dbTable>default.movie_metadata</dbTable>
  <thriftServer>localhost:10011/default</thriftServer>
</config>
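The config above could be read with the JDK's built-in DOM parser (javax.xml), so no extra dependency is needed. This is a hypothetical sketch: the `Config` case class, the `ConfigReader` name, and the subset of fields shown are illustrative assumptions, not the project's actual code, which may parse the file differently.

```scala
import javax.xml.parsers.DocumentBuilderFactory
import java.io.ByteArrayInputStream

// Illustrative subset of the settings; the real project reads more fields
// (coreXml, hdfsXml, hiveXml, userKeytab, quoteChar, escapeChar, thriftServer).
case class Config(src: String, dest: String, tableType: String,
                  isHeaders: Boolean, delimiter: String, dbTable: String)

object ConfigReader {
  def parse(xml: String): Config = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
    // Text content of the first element with the given tag name
    def text(tag: String): String =
      doc.getElementsByTagName(tag).item(0).getTextContent
    Config(
      src       = text("src"),
      dest      = text("dest"),
      tableType = text("tableType"),
      isHeaders = text("isHeaders").trim.equalsIgnoreCase("yes"),
      delimiter = text("delimiter"),
      dbTable   = text("dbTable")
    )
  }
}
```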

A couple of notes:

  • src folder is on the local Linux FS; dest folder is on HDFS
  • If tableType is internal, then dest (HDFS) is ignored and the folder is instead moved to the Hive warehouse directory
  • dbTable requires both the database and table name
  • If isHeaders is yes, the first line of each file is assumed to be the header; otherwise column names are generated
  • If Kerberos is required, please specify userKeytab
  • Files in the src folder are all expected to have the same format
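The notes above (internal vs. external tables, header handling, CSV quoting) could translate into DDL generation along these lines. This is a minimal sketch, assuming Hive's standard OpenCSVSerde and the `skip.header.line.count` table property; whether this project actually uses either, and the `DdlBuilder` name itself, are assumptions.

```scala
object DdlBuilder {
  // Builds a CREATE TABLE statement from config values. All columns are
  // typed STRING, as OpenCSVSerde treats every column as a string anyway.
  def createTable(dbTable: String, columns: Seq[String], tableType: String,
                  dest: String, delimiter: String, quoteChar: String,
                  escapeChar: String, hasHeaders: Boolean): String = {
    val external = if (tableType.equalsIgnoreCase("external")) "EXTERNAL " else ""
    val cols = columns.map(c => s"`$c` STRING").mkString(", ")
    // Internal tables live in the warehouse directory, so dest is only
    // used when the table is external.
    val location =
      if (external.nonEmpty) s" LOCATION '$dest'" else ""
    // Skip the header row at query time when the files carry one.
    val skipHeader =
      if (hasHeaders) """ TBLPROPERTIES ("skip.header.line.count"="1")""" else ""
    s"CREATE ${external}TABLE $dbTable ($cols) " +
      "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' " +
      s"""WITH SERDEPROPERTIES ("separatorChar"="$delimiter", "quoteChar"="$quoteChar", "escapeChar"="$escapeChar")""" +
      location + skipHeader
  }
}
```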
