I'm trying to "play around" with some REST APIs and Java code.
As I mainly work with German text, I have already managed to get the Apache HTTP Client to work with UTF-8 encoding, so that umlauts are handled correctly.
Still, I can't get my regex to match my words correctly.
I am trying to find words/word combinations like "Büro_Licht" in a string like ..."type":"Büro_Licht"....
Using the regex ".*?type\":\"(\\w+).*?" returns "B" for me, as it doesn't recognize the "ü" as a word character. That makes sense, as \w is documented as [a-zA-Z_0-9]. For strings with no special characters, I do get the full word, e.g. "Office_Light".
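To make the behavior easy to reproduce, here is a minimal sketch. The class name `WordCharDemo`, the helper `firstGroup`, and the sample JSON fragments are made up for illustration; only the regex is the one from above. The "ü" is written as `\u00fc` so the result does not depend on the source file encoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCharDemo {
    // Hypothetical helper: returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String regex = ".*?type\":\"(\\w+).*?";

        // Assumed sample inputs, modeled on the fragment quoted above.
        String german = "{\"type\":\"B\u00fcro_Licht\"}";
        String english = "{\"type\":\"Office_Light\"}";

        // Default \w is ASCII-only, so the match stops in front of the umlaut.
        System.out.println(firstGroup(regex, german));  // B
        System.out.println(firstGroup(regex, english)); // Office_Light
    }
}
```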
So I tried another hint mentioned here in a nearly identical question (which I could not comment on, because I lack the reputation points).
Using the regex ".*?type\":\"(\\p{L}+).*?" returns "Büro" for me. But here again it cuts off at the underscore, for a reason I don't understand.
Is there a nice way to combine both expressions to get the "full" word including underscores and special characters?
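For illustration, this is roughly the kind of combination I have in mind. It is only a sketch under assumptions, not a confirmed solution: the class name `UnicodeWordDemo`, the helper `extract`, and the sample input are hypothetical. It shows two candidates: an explicit character class that unions Unicode letters (`\p{L}`), Unicode numbers (`\p{N}`) and the underscore, and the `Pattern.UNICODE_CHARACTER_CLASS` flag (available since Java 7), which makes `\w` itself follow Unicode rules.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeWordDemo {
    // Candidate 1: explicit class of Unicode letters, Unicode numbers, and underscore.
    static final Pattern EXPLICIT =
            Pattern.compile("type\":\"([\\p{L}\\p{N}_]+)");

    // Candidate 2: keep \w, but ask the engine to use Unicode definitions for it.
    static final Pattern UNICODE_W =
            Pattern.compile("type\":\"(\\w+)", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: returns capture group 1 of the first match, or null if nothing matches.
    static String extract(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input; the umlaut is escaped to avoid source-encoding issues.
        String json = "{\"type\":\"B\u00fcro_Licht\"}";

        System.out.println(extract(EXPLICIT, json));  // Büro_Licht
        System.out.println(extract(UNICODE_W, json)); // Büro_Licht
    }
}
```

The leading and trailing `.*?` are dropped here because `find()` already searches anywhere in the input; with `matches()` they would still be needed.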