12

What regex would match any ASCII character in Java?

I've already tried:

^[\\p{ASCII}]*$

but found that it didn't match lots of things that I wanted (like spaces, parentheses, etc.). I'm hoping to avoid explicitly listing all 128 ASCII characters in a format like:

^[a-zA-Z0-9!@#$%^*(),.<>~`[]{}\\/+=-\\s]*$
David
  • Downvote as this question doesn't indicate if you just require a single character (in the body) or multiple characters (in the title). – Maarten Bodewes Apr 16 '16 at 19:31

5 Answers

32

The first try was almost correct:

"^\\p{ASCII}*$"
Oleg Pavliv
  • Although I would use `"^\\p{ASCII}+$"` so as to not match the empty string, but that might be philosophical... :) – David Jan 19 '16 at 09:05
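As a quick check of the pattern above, here is a minimal sketch (the class name `AsciiCheck` and the helper `isAscii` are made up for illustration). It suggests that `\p{ASCII}` does cover spaces and parentheses, since both fall in the range U+0000 to U+007F:

```java
import java.util.regex.Pattern;

class AsciiCheck {
    // Compiled once; \p{ASCII} is the POSIX class covering U+0000 through U+007F.
    private static final Pattern ASCII_ONLY = Pattern.compile("^\\p{ASCII}*$");

    // Hypothetical helper: true if every character of s is ASCII.
    public static boolean isAscii(String s) {
        return ASCII_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAscii("foo (bar) baz!")); // true: spaces and parentheses are ASCII
        System.out.println(isAscii("caf\u00e9"));      // false: contains U+00E9
        System.out.println(isAscii(""));               // true: * also accepts the empty string
    }
}
```

Swapping `*` for `+`, as the comment suggests, would make the last call return false.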
9

I have never used \\p{ASCII}, but I have used ^[\\u0000-\\u007F]*$

Bala R
  • Should there really be two backslashes before the `u`? That is, isn't `^[\u0000-\u007F]*$` correct? – Nic Cottrell Apr 14 '15 at 12:43
  • I tried it, and a single backslash works as well. Normally you need the double backslash because the backslash is itself an escape character. By the way, I had problems with a String because it contained characters from extended ASCII, and `\\p{ASCII}` covers only standard ASCII. For extended ASCII you can use `^[\\u0000-\\u00FE]*$` (`FE` instead of `7F`) – Pascal Schneider Feb 05 '16 at 09:02
  • Why FE, and not FF? – Ingo Schalk-Schupp Sep 02 '19 at 16:43
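To make the range discussion concrete, here is a small sketch (the class name `AsciiRanges` is made up). It assumes "extended ASCII" means Latin-1, i.e. U+0000 through U+00FF; other 8-bit code pages would need a different approach. Under that assumption an upper bound of `FE` leaves out U+00FF, so `FF` looks like the more natural bound:

```java
import java.util.regex.Pattern;

class AsciiRanges {
    // Double backslash: the regex engine, not the compiler, resolves the escapes.
    static final Pattern ASCII = Pattern.compile("^[\\u0000-\\u007F]*$");
    // Assumption: "extended ASCII" means Latin-1, U+0000 through U+00FF.
    static final Pattern LATIN1 = Pattern.compile("^[\\u0000-\\u00FF]*$");
    // The range from the comment above, which stops one code point short.
    static final Pattern UP_TO_FE = Pattern.compile("^[\\u0000-\\u00FE]*$");

    public static void main(String[] args) {
        String cafe = "caf\u00e9"; // ends with U+00E9
        String y = "\u00ff";       // U+00FF on its own
        System.out.println(ASCII.matcher(cafe).matches());  // false
        System.out.println(LATIN1.matcher(cafe).matches()); // true
        System.out.println(UP_TO_FE.matcher(y).matches());  // false
        System.out.println(LATIN1.matcher(y).matches());    // true
    }
}
```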
1

If you only want the printable ASCII characters, you can use ^[ -~]*$, i.e. all characters from space to tilde, inclusive.

https://en.wikipedia.org/wiki/ASCII#ASCII_printable_code_chart
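A minimal Java sketch of the printable-only check (the class name `PrintableAscii` and the helper `isPrintableAscii` are made up for illustration). Unlike `\p{ASCII}`, this range rejects control characters such as tab:

```java
import java.util.regex.Pattern;

class PrintableAscii {
    // Space (0x20) through tilde (0x7E) is the printable ASCII range.
    private static final Pattern PRINTABLE = Pattern.compile("^[ -~]*$");

    // Hypothetical helper: true if s contains only printable ASCII characters.
    public static boolean isPrintableAscii(String s) {
        return PRINTABLE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPrintableAscii("Hello, (world) ~!")); // true
        System.out.println(isPrintableAscii("tab\there"));         // false: tab is a control character
    }
}
```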

Raniz
0

I think the question is about extracting the ASCII characters from a raw string that contains both ASCII and special (non-ASCII) characters.

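import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Returns the trailing run of ASCII characters in raw (possibly empty).
public String getOnlyASCII(String raw) {
    // Anchored at the end of input, so find() locates the last ASCII-only run.
    Pattern asciiPattern = Pattern.compile("\\p{ASCII}*$");
    Matcher matcher = asciiPattern.matcher(raw);
    String asciiString = null;
    if (matcher.find()) {
        asciiString = matcher.group();
    }
    return asciiString;
}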

The above method returns the trailing run of ASCII characters in the string, dropping any non-ASCII characters before it. Thanks to @Oleg Pavliv for the pattern.

For example:

raw = ��+919986774157

asciiString = +919986774157
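Note that the end-anchored pattern keeps only the ASCII run at the end of the input. If the goal is to drop non-ASCII characters wherever they occur, one alternative (a different technique from the method above, sketched with a made-up class `AsciiStrip` and helper `stripNonAscii`) is to replace everything outside `\p{ASCII}` with the empty string:

```java
class AsciiStrip {
    // Hypothetical helper: removes every non-ASCII character, wherever it occurs.
    public static String stripNonAscii(String raw) {
        return raw.replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // U+FFFD stands in for the undecodable characters in the example above.
        System.out.println(stripNonAscii("\uFFFD\uFFFD+919986774157")); // +919986774157
        System.out.println(stripNonAscii("na\u00efve caf\u00e9"));      // nave caf
    }
}
```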

arulraj.net
0

For JavaScript it'll be /^[\x00-\x7F]*$/.test('blah')

Flexo
catamphetamine