Java: calculating the difference between unique characters in strings


Let's say I have 2 strings and need to calculate the difference between their unique characters. It's simple:

    String s1 = "abcd";
    String s2 = "aaaacccbbf"; // answer: 1

The answer is 1, because there is no "f" in the s1 variable.

But what about characters like மா or 漢字, or any other non-ASCII character? If I loop through the strings, a single character such as கு is counted 2-3 times as separate characters, giving me the wrong answer:

    String s1 = "ab";
    String s2 = "aaaகுb"; // answer: 2 (wrong!)

The code I tried:

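    import java.util.Scanner;

    // The class name is missing in the original post; Main is a placeholder.
    class Main {
        public static void main(String[] args) {
            Scanner sc = new Scanner(System.in);
            String s1 = sc.nextLine();
            String s2 = sc.nextLine();
            sc.close();

            // Collect, once each, every char of s2 that does not occur in s1.
            String missingCharacters = "";
            for (char c : s2.toCharArray()) {
                if (!missingCharacters.contains(c + "") && !s1.contains(c + ""))
                    missingCharacters += c;
            }

            System.out.println(missingCharacters.length());
        }
    }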
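A quick way to see where the extra count comes from is to ask Java how it measures கு. The sketch below is only an illustration (the class and method names are made up for this example): it reports the number of UTF-16 code units, which is what `toCharArray()` iterates over, and the number of Unicode code points.

```java
class CharUnitsDemo {
    // Number of UTF-16 code units; toCharArray() yields one char per unit.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ku = "\u0B95\u0BC1"; // கு written with escapes
        System.out.println(utf16Units(ku)); // 2
        System.out.println(codePoints(ku)); // 2
    }
}
```

Both counts are 2, so switching from chars to code points would not help here: what the reader sees as one character is a base letter plus a combining mark, and the two have to be grouped before counting.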

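Your symbol கு is a compound form in Tamil script that consists of 2 Unicode code points: the consonant க (U+0B95) followed by the vowel sign U+0BC1 (phonetically க் + உ). If you plan to work with Tamil script, you have to find all such combined characters with a pattern: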

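    // Needs java.util.Set, java.util.TreeSet,
    // java.util.regex.Matcher and java.util.regex.Pattern.
    String s1 = "ab";
    String s2 = "aaaகுb";

    // One letter followed by any number of combining marks.
    Pattern pattern = Pattern.compile("\\p{L}\\p{M}*");

    // Collect every distinct letter-plus-marks unit of s2 ...
    Matcher matcher = pattern.matcher(s2);
    Set<String> missingCharacters = new TreeSet<>();
    while (matcher.find()) {
        missingCharacters.add(matcher.group());
    }

    // ... then drop the ones that also occur in s1.
    matcher = pattern.matcher(s1);
    while (matcher.find()) {
        missingCharacters.remove(matcher.group());
    }

    System.out.println(missingCharacters.size());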

Regex source: How to match a single Unicode grapheme
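For reuse, the same idea can be wrapped in a method. This is a minimal sketch, not a definitive implementation: the names `MissingGraphemes`, `graphemes` and `countMissing` are invented for this example, and it relies on the same `\p{L}\p{M}*` pattern, so anything that is not a letter (digits, punctuation, spaces) is ignored.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MissingGraphemes {
    // One letter followed by any number of combining marks.
    private static final Pattern GRAPHEME = Pattern.compile("\\p{L}\\p{M}*");

    // Returns the distinct letter-plus-marks units found in s.
    static Set<String> graphemes(String s) {
        Set<String> result = new HashSet<>();
        Matcher m = GRAPHEME.matcher(s);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    // Counts the distinct units that occur in s2 but not in s1.
    static int countMissing(String s1, String s2) {
        Set<String> missing = graphemes(s2);
        missing.removeAll(graphemes(s1));
        return missing.size();
    }

    public static void main(String[] args) {
        System.out.println(countMissing("abcd", "aaaacccbbf"));     // 1
        System.out.println(countMissing("ab", "aaa\u0B95\u0BC1b")); // 1 (கு counted once)
    }
}
```

The Tamil string is written with `\u` escapes so the result does not depend on the source file encoding.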

