30

I am using the JS Animated Contact Form with this validation regex:

rx:{".name":{rx:/^[a-zA-Z'][a-zA-Z-' ]+[a-zA-Z']?$/,target:'input'}, other fields...

I just found out that I can't enter a name like "Müller"; the regex rejects it. What do I have to change to also allow umlauts?

ajtrichards
user1555112

6 Answers

45

You should use Unicode escape sequences for these characters in your regex, e.g. \u00fc for ü. For German, I found the following table:

Character   Unicode
------------------------------
Ä, ä        \u00c4, \u00e4
Ö, ö        \u00d6, \u00f6
Ü, ü        \u00dc, \u00fc
ß           \u00df

(source http://javawiki.sowas.com/doku.php?id=java:unicode)
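As a rough sketch of how those escapes could be dropped into the pattern from the question: the names `de` and `deNameRx` below are made up for illustration and are not part of the form plugin, and the hyphen has been moved to the end of the middle class purely for readability.

```javascript
// German umlauts and sharp s from the table above (hypothetical helper string).
const de = "\u00c4\u00e4\u00d6\u00f6\u00dc\u00fc\u00df";

// The question's pattern with the German characters added to each class.
const deNameRx = new RegExp(`^[a-zA-Z${de}'][a-zA-Z${de}' -]+[a-zA-Z${de}']?$`);

console.log(deNameRx.test("Müller")); // true
console.log(deNameRx.test("Weiß"));   // true
console.log(deNameRx.test("René"));   // false: é is not in the table
```

This only covers German; names from other languages would need further characters or a wider range.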

IProblemFactory
24

Try using this:

/^[\u00C0-\u017Fa-zA-Z'][\u00C0-\u017Fa-zA-Z-' ]+[\u00C0-\u017Fa-zA-Z']?$/

I have added the Unicode range \u00C0-\u017F to the start of each of the character classes (the square-bracket groups).

Given that /^[\u00C0-\u017FA-Za-z]+$/.test("aeiouçéüß") returns true, I expect it should work.

Credit to https://stackoverflow.com/a/11550799/940252.
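A quick check of that pattern, as a sketch only (the variable name `rangeRx` is an assumption for illustration). It also shows two things worth knowing: the pattern needs at least two characters, and the range \u00C0-\u017F contains the non-letters × (U+00D7) and ÷ (U+00F7).

```javascript
const rangeRx = /^[\u00C0-\u017Fa-zA-Z'][\u00C0-\u017Fa-zA-Z-' ]+[\u00C0-\u017Fa-zA-Z']?$/;

console.log(rangeRx.test("Müller"));   // true
console.log(rangeRx.test("Jean-Luc")); // true
console.log(rangeRx.test("ü"));        // false: first class plus "+" needs two characters
console.log(rangeRx.test("a\u00D7b")); // true: the multiplication sign is inside the range
```

If single-letter input or the stray symbols matter, the quantifier and the range would have to be adjusted.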

Josh Harrison
  • `[\u00C0-\u017Fa-zA-Z']?$/` is kind of redundant, what are you trying to do? –  Feb 25 '14 at 17:17
  • I'm not sure as I'm not terribly hot on regex and the OP didn't specify the pattern they're hoping to match. I just worked with their original code. If you can clean it up please do! :) – Josh Harrison Feb 25 '14 at 17:21
  • I would venture to change that space to something else to capture all non-word characters like hyphens. Here's a test: https://regex101.com/r/zH5uV0/4 – Mike Kormendy Jul 24 '16 at 14:01
  • 2
    `/^[\u00C0-\u017Fa-zA-Z'][\u00C0-\u017Fa-zA-Z-' ]+[\u00C0-\u017Fa-zA-Z']?$/.test("ü") -> false` – Zane Hitchcox Aug 18 '19 at 04:13
6

I came up with a combination of different ranges:

[A-Za-zÀ-ž\u0370-\u03FF\u0400-\u04FF]

But I see that it misses some letters from @SambitD's proposal; see https://rubular.com/r/2g00QJK4rBS8Y4
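A small sketch of that character class in use, assuming it is anchored and repeated to test a whole word (the name `lettersRx` is made up for illustration, and À-ž is written as \u00C0-\u017E to avoid encoding surprises):

```javascript
// Latin letters including À-ž, Greek (U+0370-U+03FF) and Cyrillic (U+0400-U+04FF).
const lettersRx = /^[A-Za-z\u00C0-\u017E\u0370-\u03FF\u0400-\u04FF]+$/;

console.log(lettersRx.test("Müller")); // true
console.log(lettersRx.test("Αθήνα"));  // true
console.log(lettersRx.test("Москва")); // true
console.log(lettersRx.test("\u1E0C")); // false: Ḍ (U+1E0C) lies outside all three ranges
```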

Tsunamis
3

I used

A-Za-z-ÁÀȦÂÄǞǍĂĀÃÅǺǼǢĆĊĈČĎḌḐḒÉÈĖÊËĚĔĒẼE̊ẸǴĠĜǦĞG̃ĢĤḤáàȧâäǟǎăāãåǻǽǣćċĉčďḍḑḓéèėêëěĕēẽe̊ẹǵġĝǧğg̃ģĥḥÍÌİÎÏǏĬĪĨỊĴĶǨĹĻĽĿḼM̂M̄ʼNŃN̂ṄN̈ŇN̄ÑŅṊÓÒȮȰÔÖȪǑŎŌÕȬŐỌǾƠíìiîïǐĭīĩịĵķǩĺļľŀḽm̂m̄ʼnńn̂ṅn̈ňn̄ñņṋóòôȯȱöȫǒŏōõȭőọǿơP̄ŔŘŖŚŜṠŠȘṢŤȚṬṰÚÙÛÜǓŬŪŨŰŮỤẂẀŴẄÝỲŶŸȲỸŹŻŽẒǮp̄ŕřŗśŝṡšşṣťțṭṱúùûüǔŭūũűůụẃẁŵẅýỳŷÿȳỹźżžẓǯßœŒçÇ

which covers almost all the characters used in European languages. Source of truth

isambitd
  • 7
    No sane programmer would list all characters, when there are shorthand character classes and ranges. Please, don't do that. – user1438038 Dec 17 '19 at 14:21
1

In JS, you can use the u flag on regular expressions to enable a special "meta sequence", namely \p{...} (lowercase p; the uppercase \P{...} is its negation). \p is a Unicode-aware property escape that has a special Letter category. This category matches German, Swedish and other Scandinavian letters, Cyrillic characters, etc.

In short, use this:

/\p{Letter}/u

Props to this article by Till Sanders.
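Applied to the name pattern from the question, this might look like the sketch below. The name `unicodeNameRx` is an assumption for illustration, and `\p{L}` is used as the short alias of `\p{Letter}`. Note that a decomposed "ü" (u followed by the combining mark U+0308) is not a single letter, so normalizing the input to NFC first is one way to handle it.

```javascript
// The question's pattern with every a-zA-Z replaced by \p{L}; requires the u flag.
const unicodeNameRx = /^[\p{L}'][\p{L}' -]+[\p{L}']?$/u;

console.log(unicodeNameRx.test("Müller"));  // true
console.log(unicodeNameRx.test("Иван"));    // true
console.log(unicodeNameRx.test("Müller2")); // false: digits are not letters

// Decomposed input: the combining diaeresis is a mark, not a letter.
const decomposed = "Mu\u0308ller";
console.log(unicodeNameRx.test(decomposed));                  // false
console.log(unicodeNameRx.test(decomposed.normalize("NFC"))); // true
```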

fredrikekelund
0

The problem with the \uXXXX approach is that it is not supported by all regex flavours. For example, Visual C++ does not support it. There, you would need to enumerate the actual letters.

I recommend using a tool like https://www.regexbuddy.com/ that knows as many flavours as possible.