python - Luigi target for non-existent table


I'm trying to set up a simple table existence test for a Luigi task using luigi.hive.HiveTableTarget.

I create a simple table in Hive to make sure it is there:

create table test_table (a int); 

Next I set up the target with Luigi:

>>> from luigi.hive import HiveTableTarget
>>> target = HiveTableTarget(table='test_table')
>>> target.exists()
True

Great. Next I try it with a table I know doesn't exist, to make sure it returns False.

>>> target = HiveTableTarget(table='test_table_not_here')
>>> target.exists()

And this raises an exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 344, in exists
    return self.client.table_exists(self.table, self.database)
  File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 117, in table_exists
    stdout = run_hive_cmd('use {0}; describe {1}'.format(database, table))
  File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 62, in run_hive_cmd
    return run_hive(['-e', hivecmd], check_return_code)
  File "/usr/lib/python2.6/site-packages/luigi/hive.py", line 56, in run_hive
    stdout, stderr)
luigi.hive.HiveCommandError: ('hive command: hive -e use default; describe test_table_not_here failed error code: 17', '', '\nLogging initialized using configuration in jar:file:/opt/cloudera/parcels/cdh-5.2.0-1.cdh5.2.0.p0.36/jars/hive-common-0.13.1-cdh5.2.0.jar!/hive-log4j.properties\nOK\nTime taken: 0.822 seconds\nFAILED: SemanticException [Error 10001]: Table not found test_table_not_here\n')

(Edited formatting for clarity.)

I don't understand the last line of the exception. Of course the table is not found; that is the whole point of an existence check. Is this the expected behavior, or do I have a configuration issue I need to work out?

Okay, it looks like this may have been a bug in the latest tagged release (1.0.19) that is fixed on the master branch. The code responsible is:

stdout = run_hive_cmd('use {0}; describe {1}'.format(database, table))
return not "does not exist" in stdout

which has been changed in master to:

stdout = run_hive_cmd('use {0}; show tables "{1}";'.format(database, table))
return stdout and table in stdout

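The latter works fine, whereas the former throws a HiveCommandError: `describe` on a missing table makes hive exit with a non-zero code, so the exception is raised before the "does not exist" string check is ever reached.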
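To make the difference between the two versions concrete, here is a toy sketch that does not touch Hive or Luigi at all. `fake_run_hive_cmd` and `FakeHiveCommandError` are hypothetical stand-ins I made up for illustration; the only behavior they model is what the traceback above shows (a non-zero exit code turns into an exception), plus the assumption that `show tables` with no match simply prints nothing.

```python
class FakeHiveCommandError(RuntimeError):
    """Hypothetical stand-in for luigi.hive.HiveCommandError."""


def fake_run_hive_cmd(hivecmd, existing_tables=("test_table",)):
    """Toy stand-in for run_hive_cmd (an assumption, not Luigi's code).

    Models one thing only: a failing hive command raises instead of
    returning its stdout.
    """
    # 'use default; describe t' -> second statement is 'describe t'
    statement = hivecmd.split(";")[1].strip()
    if statement.startswith("describe "):
        table = statement[len("describe "):]
        if table not in existing_tables:
            # Hive exits non-zero here (code 17 in the traceback above).
            raise FakeHiveCommandError("describe failed, error code 17")
        return "a\tint\n"  # made-up output for an existing table
    if statement.startswith("show tables "):
        table = statement[len("show tables "):].strip('"')
        # Assumed: no match is not an error, just empty output.
        return table + "\n" if table in existing_tables else ""
    raise ValueError("unsupported statement: " + statement)


def exists_via_describe(table, database="default"):
    # Shape of the 1.0.19 check: the string test is never reached
    # for a missing table, because the command runner raises first.
    stdout = fake_run_hive_cmd("use {0}; describe {1}".format(database, table))
    return "does not exist" not in stdout


def exists_via_show_tables(table, database="default"):
    # Shape of the check on master: empty output means "not there".
    cmd = 'use {0}; show tables "{1}";'.format(database, table)
    stdout = fake_run_hive_cmd(cmd)
    return bool(stdout) and table in stdout
```

With these stand-ins, `exists_via_show_tables('test_table_not_here')` returns False, while `exists_via_describe('test_table_not_here')` raises, which mirrors what happens in the question.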

If you want a solution without having to update to the master branch, you can create your own target class with minimal effort:

from luigi.hive import HiveTableTarget, run_hive_cmd

class MyHiveTarget(HiveTableTarget):
    def exists(self):
        # "show tables" succeeds even when nothing matches, so no exception is raised
        stdout = run_hive_cmd('use {0}; show tables "{1}";'.format(self.database, self.table))
        return self.table in stdout

This will produce the desired output.
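One thing to be aware of: `self.table in stdout` is a substring test on the whole output. That is probably fine here, since `show tables` is given the exact table name, but if you want a stricter check you could compare whole lines instead. A minimal sketch, assuming the command prints one table name per line; `table_listed` is a hypothetical helper name, not part of Luigi:

```python
def table_listed(stdout, table):
    """Return True if `table` appears as a whole line of `stdout`.

    Assumes one table name per line. The comparison is lowercased
    because Hive table names are case-insensitive.
    """
    wanted = table.strip().lower()
    listed = [line.strip().lower() for line in stdout.splitlines()]
    return wanted in listed
```

Inside `exists` you could then use `return table_listed(stdout, self.table)` in place of the substring test.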

