java - HBase export to flat file
I'm pretty new to Hadoop...
I have a bunch of data in an HBase table that I need to export (with a minor transformation) out to a single flat file. To do this, I am building a MapReduce job that scans the table and maps the data to Text type, using TextOutputFormat.
Something like this:
    TableMapReduceUtil.initTableMapperJob(
        "tableName",      // input table
        scan,             // Scan instance to control CF and attribute selection
        MyMapper.class,   // mapper class
        Text.class,       // mapper output key
        Text.class,       // mapper output value
        job);
    job.setNumReduceTasks(1);
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path("/tmp/mydirectory"));
And my mapper:
    private static class MyMapper extends TableMapper<Text, Text> {

        public void map(ImmutableBytesWritable row, Result result, Context context)
                throws IOException, InterruptedException {
            String json = new String(result.getValue("cf".getBytes(), "qualifier".getBytes()));
            StringBuilder line = new StringBuilder();
            //...builds the line
            Text k = new Text("filename-20141205.txt");
            Text lineText = new Text(line.toString());
            context.write(k, lineText);
        }
    }
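The elided line-building step could look something like the following plain-JDK sketch. The field names `id` and `name`, and the tab-separated layout, are hypothetical; the real transformation is whatever your export format needs:

```java
public class LineBuilder {

    // Hypothetical transformation: pull two fields out of the stored
    // JSON value and join them into one tab-separated output line.
    static String buildLine(String json) {
        String id = extract(json, "id");
        String name = extract(json, "name");
        return id + "\t" + name;
    }

    // Naive extraction, assumes a flat {"key":"value",...} JSON string;
    // a real job would use a JSON library instead.
    static String extract(String json, String field) {
        String marker = "\"" + field + "\":\"";
        int start = json.indexOf(marker);
        if (start < 0) {
            return "";
        }
        start += marker.length();
        int end = json.indexOf('"', start);
        return json.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(buildLine("{\"id\":\"42\",\"name\":\"foo\"}"));
    }
}
```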
However, all I get out is a single file with the keys and data in a part-r-00000 file. I think I need a reducer to finish the job, but I'm not sure what that looks like.
Would an identity reducer work? Is there a better way to go about this problem, other than TextOutputFormat?
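For reference, an identity reducer against the mapreduce API would just pass each pair through unchanged. This is only a sketch; note that Hadoop's base Reducer class already does exactly this, so it rarely needs to be written by hand:

```java
// Identity reducer: emits every (key, line) pair as-is.
// The stock org.apache.hadoop.mapreduce.Reducer behaves identically,
// so job.setReducerClass(Reducer.class) gives the same result.
private static class MyReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            context.write(key, value);
        }
    }
}
```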
This worked:
    private static class MyOutputFormat<K, V> extends TextOutputFormat<K, V> {

        @Override
        public Path getDefaultWorkFile(TaskAttemptContext context, String extension)
                throws IOException {
            FileOutputCommitter committer = (FileOutputCommitter) getOutputCommitter(context);
            return new Path(committer.getWorkPath(), "my-file-name.txt");
        }
    }
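Wiring the subclass in is then a small change to the job setup above. This is a configuration sketch, assuming the same single-reducer setup from the question; the overridden getDefaultWorkFile makes the task write my-file-name.txt in place of part-r-00000:

```java
// Swap the stock TextOutputFormat for the subclass above.
job.setOutputFormatClass(MyOutputFormat.class);
job.setNumReduceTasks(1);  // one reduce task -> one output file
FileOutputFormat.setOutputPath(job, new Path("/tmp/mydirectory"));
```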