java - How to replace the special characters from an input string in a MapReduce program
I am able to replace special characters in a normal Java program.
This is the Java code:
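
    // The class name is missing from the original post; Main is a placeholder.
    public class Main {
        public static void main(String[] args) {
            String s = "this785($^#')\"";
            System.out.println(s);
            // Remove everything that is neither a word character nor whitespace.
            s = s.replaceAll("[^\\w\\s]", "");
            System.out.println(s);
        }
    }

But when I try the same thing in a MapReduce program, it does not work: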

    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String s = value.toString().replaceAll("\\w+\\s+", "");
            String[] words = s.split(" ");
            for (String a : words) {
                output.collect(new Text(a), new IntWritable(1));
            }
        }
    }

Sample input to the MapReduce program:

    "this@#$ is$# word$%^ (count)"
    "this@#$ is$# word$%^ (count)"

Output of the MapReduce program:

    "this@#$    2
    (count)"    2
    is$#        2
    word$%^     2

What am I doing wrong? Please help me out!
Your regex has changed from [^\\w\\s] to \\w+\\s+.
This regex means: match one or more word characters (letters a-z/A-Z, digits, or underscore) followed by one or more spaces, tabs, newlines, etc., and replace the match with an empty string. In your string:

    "this@#$ is$# word$%^ (count)"

this case is never satisfied, hence the output. Every space is preceded by $, # or ^, not by an alphanumeric character, so nothing is replaced.
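
You can check the difference between the two patterns outside Hadoop. The following is a minimal sketch, not the asker's actual job: the class name RegexCompare and the helpers stripSpecial, mapperPattern and tokens are hypothetical names introduced here for illustration. It applies both patterns to the sample line and shows one way the cleaned line could be split into words before emitting (word, 1) pairs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it does not depend on Hadoop.
public class RegexCompare {

    // Pattern from the plain Java program: removes every character that is
    // neither a word character nor whitespace.
    static String stripSpecial(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // Pattern from the mapper: removes a run of word characters only when it
    // is immediately followed by whitespace.
    static String mapperPattern(String s) {
        return s.replaceAll("\\w+\\s+", "");
    }

    // Hypothetical helper: clean the line first, then split on runs of
    // whitespace and skip empty tokens.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        for (String word : stripSpecial(line).trim().split("\\s+")) {
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "\"this@#$ is$# word$%^ (count)\"";
        System.out.println(mapperPattern(line)); // the line comes back unchanged
        System.out.println(stripSpecial(line));  // this is word count
        System.out.println(tokens(line));        // [this, is, word, count]
    }
}
```

In the mapper, the same idea would mean calling replaceAll("[^\\w\\s]", "") on value.toString() before the split, assuming you want exactly the behavior of your plain Java program.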