1

I'm having trouble parsing tweets that are represented as escaped Unicode. Some of them turn out to be foreign-language strings, e.g. \u064a\u0633\u0639\u062f\u0646\u064a

Ivaylo Strandjev

2 Answers

1

Use org.apache.commons.lang.StringEscapeUtils from Apache Commons Lang:

import org.apache.commons.lang.StringEscapeUtils;

String s = "\\u0048\\u0065\\u006C\\u006C\\u006F";
System.out.println(StringEscapeUtils.unescapeJava(s)); // prints "Hello"
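
If adding Commons Lang as a dependency is not an option, the same decoding can probably be done with the plain JDK. The following is only a rough sketch: the class name UnicodeUnescape and the method unescape are made up for illustration, and it only handles \uXXXX sequences. Unlike unescapeJava, it does not process other escapes such as \n, and it does not treat an escaped backslash in front of a u specially.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class UnicodeUnescape {
    // A literal backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Convert the four hex digits into a single UTF-16 code unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '$' or '\' characters.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The escaped string from the question; it should decode to Arabic text.
        System.out.println(unescape("\\u064a\\u0633\\u0639\\u062f\\u0646\\u064a"));
    }
}
```

Each escape is decoded to one UTF-16 code unit, so a surrogate pair written as two consecutive escapes should end up as one supplementary character in the result. Whether the decoded text displays correctly on the console depends on the console's encoding.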

P.S. Oops, I didn't refresh the page before posting this answer; the comments above convey the same thing.

Judking
0

You can try str = org.apache.commons.lang.StringEscapeUtils.unescapeJava(str); from Apache Commons Lang.

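See the Javadoc (note that this link documents the lang3 package, org.apache.commons.lang3.StringEscapeUtils): http://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/StringEscapeUtils.html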

Lakshmi