2

When I use reader.readLine() (reader is a java.io.RandomAccessFile), the returned string is always 80 characters long: the main string is followed by Unicode space characters as padding. Is there a way to remove those unwanted characters? String.trim() does not work on them.

  • 1
    The question is a bit too narrow. For example, I searched Stack Overflow for "[java] internationalization trim" and "[java] unicode trim" and did *not* find this question. You really want a trim() function that is I18N/Unicode-aware; if the question were phrased that way, more people would be able to find the answer below. – djb Oct 09 '15 at 13:49

5 Answers

7

You can use StringUtils.strip from Commons Lang. It is Unicode-aware.

Thilo
  • 250,062
  • 96
  • 490
  • 643
  • Well, it uses Character.isWhitespace, which does not treat non-breaking spaces such as \u00A0 as whitespace. It should also use Character.isSpaceChar to be fully Unicode-aware. – Michal Bernhard Aug 12 '16 at 11:12
3

You can write a custom method in Java to remove the Unicode space characters, using the Character.isWhitespace(char) and Character.isSpaceChar(char) methods, for your specific purpose.
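A minimal sketch of such a custom method, in case it helps. The class name UnicodeTrim and the helper isUnicodeSpace are made up for illustration; the only JDK calls involved are Character.isWhitespace, Character.isSpaceChar and the code point accessors on String. It treats a character as trimmable if either predicate accepts it, since isWhitespace covers tabs and line breaks while isSpaceChar covers the non-breaking spaces.

```java
final class UnicodeTrim {

    // Hypothetical helper: true if either JDK predicate considers the
    // code point a space. isWhitespace covers tab and line breaks,
    // isSpaceChar covers the non-breaking space separators.
    static boolean isUnicodeSpace(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    // Removes leading and trailing Unicode spaces; interior spaces are kept.
    static String trim(String s) {
        int start = 0;
        int end = s.length();
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!isUnicodeSpace(cp)) {
                break;
            }
            start += Character.charCount(cp);
        }
        while (end > start) {
            int cp = s.codePointBefore(end);
            if (!isUnicodeSpace(cp)) {
                break;
            }
            end -= Character.charCount(cp);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String padded = "\t main string \u00a0\u2007\u3000";
        System.out.println("[" + trim(padded) + "]"); // prints "[main string]"
    }
}
```

For the 80-character lines in the question, this would be called as UnicodeTrim.trim(reader.readLine()), assuming the padding really consists of characters that one of the two predicates recognizes.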

The Spring framework has a StringUtils class with a trimWhitespace(String) method which, judging by its source code, appears to be based on Character.isWhitespace(char).

AllTooSir
  • 47,910
  • 16
  • 124
  • 159
0

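Use Google Guava. Current releases expose the matcher through the CharMatcher.whitespace() factory method; older releases had it as the constant CharMatcher.WHITESPACE:

CharMatcher.whitespace().trimFrom(source);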

Or try this: https://gist.github.com/michalbcz/4861a2b8ed73bb73764e909b87664cb2

Michal Bernhard
  • 3,814
  • 4
  • 26
  • 38
0

If you do not want to pull in a big library, just use:

str.replaceAll("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "");

Testing

    // Strips leading and trailing whitespace, including Unicode separators (\p{Z}).
    public static String trim(String str) {
        return str.replaceAll("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "");
    }

    public static void main(String[] args) {
        System.out.println(trim("\t tes ting \u00a0").length()); // 8
        System.out.println(trim("\t testing \u00a0").length());  // 7
        System.out.println(trim("tes ting \u00a0").length());    // 8
        System.out.println(trim("\t tes ting").length());        // 8
    }
lehanh
  • 150
  • 2
  • 12
-2

It would have been faster to just search Stack Overflow for this question, because there are multiple questions on that topic there. Anyway, try this:

st.replaceAll("\\s","")
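A hedged note on what that one-liner does, as a small sketch. Without extra flags, \s in java.util.regex only matches ASCII whitespace, and with no anchors the replacement also removes spaces inside the text. The class and method names below are made up for illustration; Pattern.UNICODE_CHARACTER_CLASS is the standard JDK flag that makes \s follow the Unicode White_Space property.

```java
import java.util.regex.Pattern;

final class WhitespaceRegexDemo {

    // Unicode-aware whitespace at either end of the input only.
    private static final Pattern EDGE_SPACE =
            Pattern.compile("^\\s+|\\s+$", Pattern.UNICODE_CHARACTER_CLASS);

    // Same call as the one-liner above: removes every ASCII whitespace
    // character, including those between words.
    static String stripAllAscii(String s) {
        return s.replaceAll("\\s", "");
    }

    // Trims only the ends, and also recognizes non-breaking spaces.
    static String trimUnicode(String s) {
        return EDGE_SPACE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String padded = "\t tes ting \u00a0";
        System.out.println(stripAllAscii(padded).length()); // 8: "testing" plus the no-break space
        System.out.println(trimUnicode(padded).length());   // 8: "tes ting"
    }
}
```

So for the padding described in the question, the anchored and flagged variant is probably closer to what is wanted than the plain \s replacement.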

check this one here: link

Community
  • 1
  • 1
bofredo
  • 2,338
  • 6
  • 31
  • 49