Java: calculating the difference between unique characters in strings
Let's say we have two strings and need to calculate the difference between their unique characters. It's simple:
    String s1 = "abcd";
    String s2 = "aaaacccbbf"; // answer: 1

The answer is 1, because there is no "f" in s1.
But what about characters like மா or 漢字, or other non-ASCII characters? If I loop through the strings, a single character such as கு is counted 2-3 times as separate characters, which gives me the wrong answer:
    String s1 = "ab";
    String s2 = "aaaகுb"; // answer: 2 (wrong!)

The code I tried:
    import java.util.Scanner;

    public class Main {
        public static void main(String[] args) {
            Scanner sc = new Scanner(System.in);
            String s1 = sc.nextLine();
            String s2 = sc.nextLine();
            sc.close();

            String missingCharacters = "";

            // Iterates over UTF-16 chars, so a letter plus its combining mark counts twice
            for (char c : s2.toCharArray()) {
                if (!missingCharacters.contains(c + "") && !s1.contains(c + ""))
                    missingCharacters += c;
            }

            System.out.println(missingCharacters.length());
        }
    }
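Your symbol கு is a compound form in Tamil script. It consists of two Unicode code points: the letter க (U+0B95) followed by the vowel sign U+0BC1 (phonetically க் + உ). If you plan to work with Tamil script, you have to find such characters with a pattern: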
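    // Needs java.util.Set, java.util.TreeSet, java.util.regex.Matcher, java.util.regex.Pattern
    String s1 = "ab";
    String s2 = "aaaகுb";

    // One letter followed by any number of combining marks
    Pattern pattern = Pattern.compile("\\p{L}\\p{M}*");

    // Collect every distinct letter-plus-marks unit found in s2
    Matcher matcher = pattern.matcher(s2);
    Set<String> missingCharacters = new TreeSet<>();
    while (matcher.find()) {
        missingCharacters.add(matcher.group());
    }

    // Remove every unit that also occurs in s1
    matcher = pattern.matcher(s1);
    while (matcher.find()) {
        missingCharacters.remove(matcher.group());
    }

    System.out.println(missingCharacters.size());

Regex source: "How to match a single Unicode grapheme"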
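For reference, here is a minimal self-contained sketch of the same idea wrapped in a reusable method. The class name `GraphemeDiff` and the helpers `graphemes` and `countMissing` are made up for illustration and are not part of the original answer. The Tamil character is written with the escapes `\u0B95\u0BC1` so the file compiles regardless of source encoding. Note that the pattern only starts a match at a letter (`\p{L}`), so digits, punctuation and spaces are ignored by this sketch.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GraphemeDiff {
    // One letter followed by any number of combining marks (same regex as above).
    private static final Pattern GRAPHEME = Pattern.compile("\\p{L}\\p{M}*");

    // Hypothetical helper: the distinct letter-plus-marks units found in s.
    static Set<String> graphemes(String s) {
        Set<String> result = new LinkedHashSet<>();
        Matcher m = GRAPHEME.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Hypothetical helper: how many distinct units of s2 do not occur in s1.
    static int countMissing(String s1, String s2) {
        Set<String> missing = graphemes(s2);
        missing.removeAll(graphemes(s1));
        return missing.size();
    }

    public static void main(String[] args) {
        // The compound character is two UTF-16 chars, which is why a char loop miscounts it.
        System.out.println("\u0B95\u0BC1".length());                      // prints 2
        System.out.println(countMissing("abcd", "aaaacccbbf"));           // prints 1
        System.out.println(countMissing("ab", "aaa\u0B95\u0BC1b"));       // prints 1
    }
}
```

Using sets for both strings keeps the comparison to a single `removeAll` call instead of two matcher loops, at the cost of building a second set.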