12

I need a well-tested regular expression (.NET style preferred), or some other simple bit of code, that will parse a USA/CA phone number into its component parts, so that:

  • 3035551234122
  • 1-303-555-1234x122
  • (303)555-1234-122
  • 1 (303) 555 -1234-122

etc...

all parse into:

  • AreaCode: 303
  • Exchange: 555
  • Suffix: 1234
  • Extension: 122
AliciaBytes
Tristan Havelick

6 Answers

21

None of the answers given so far was robust enough for me, so I continued looking for something better, and I found it:

Google's library for dealing with phone numbers

I hope it is also useful for you.

cheffe
mmoossen
  • +1 for a good resource, -0.1 for not being .NET (which was only a preference). Sadly, we're doing integer math and it gets truncated to 0. – Mir Dec 12 '12 at 23:46
  • 2
    Can you show how this library was able to parse a phone number into its constituent parts? As the OP mentioned, into Area Code, Exchange, and Suffix? – TSmith Feb 11 '14 at 18:47
3

This is the one I use:

^(?:(?:[\+]?(?<CountryCode>[\d]{1,3}(?:[ ]+|[\-.])))?[(]?(?<AreaCode>[\d]{3})[\-/)]?(?:[ ]+)?)?(?<Number>[a-zA-Z2-9][a-zA-Z0-9 \-.]{6,})(?:(?:[ ]+|[xX]|(?i:ext[\.]?)){1,2}(?<Ext>[\d]{1,5}))?$

I got it from RegexLib, I believe.

Philip Rieck
  • 4
    That's horrible. My eyes are bleeding. – Paul Nathan Oct 22 '08 at 21:38
  • JavaScript doesn't have named groups, and it wasn't capturing the extension until I put a ? after the {6,} range. Wound up with: `/^(?:(?:[\+]?(\d{1,3}(?:\s+|[\-\.])))?[\(]?(\d{3})[\-\/)]?(?:\s+)?)?([a-zA-Z2-9][a-zA-Z0-9 \-\.]{6,}?)(?:(?:\s+|[xX]|(?:[Ee]xt[\.]?)){1,2}(\d{1,5}))?$/` – Jeff Lowery Aug 19 '14 at 20:03
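
To illustrate why the extra `?` in the comment above matters, here is a rough sketch that ports the commenter's pattern to Python (my assumption; the thread itself uses .NET and JavaScript) and compares it against the same pattern with a greedy `{6,}`. The names `PREFIX`, `EXT`, `GREEDY`, `LAZY` and the sample input are made up for this illustration.

```python
import re

# Pattern from the comment above, split into pieces so that only the
# quantifier on the "number" group differs between the two variants.
PREFIX = r"^(?:(?:[\+]?(\d{1,3}(?:\s+|[\-\.])))?[\(]?(\d{3})[\-\/)]?(?:\s+)?)?"
EXT = r"(?:(?:\s+|[xX]|(?:[Ee]xt[\.]?)){1,2}(\d{1,5}))?$"

# Greedy: the number group is free to swallow "x122", because letters
# are allowed inside it and the extension group is optional.
GREEDY = re.compile(PREFIX + r"([a-zA-Z2-9][a-zA-Z0-9 \-\.]{6,})" + EXT)

# Lazy: the number group stops as early as it can, leaving the
# extension for the final group.
LAZY = re.compile(PREFIX + r"([a-zA-Z2-9][a-zA-Z0-9 \-\.]{6,}?)" + EXT)

sample = "303-555-1234x122"  # hypothetical input
greedy = GREEDY.match(sample)
lazy = LAZY.match(sample)

# Group 2 = area code, group 3 = number, group 4 = extension.
print("greedy:", greedy.group(2), greedy.group(3), greedy.group(4))
print("lazy:  ", lazy.group(2), lazy.group(3), lazy.group(4))
```

With the greedy quantifier the extension ends up inside the number group and group 4 stays empty; with the lazy one the extension is captured separately. Note that this pattern returns exchange and suffix together as one "number" group, so it would still need a second split to get the four parts the question asks for.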
1

Strip out anything that's not a digit first. Then all your examples reduce to:

/^1?(\d{3})(\d{3})(\d{4})(\d*)$/

To support all country codes is a little more complicated, but the same general rule applies.
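
As a minimal sketch of the strip-then-match idea, here it is in Python (my choice of language, not the answerer's). The function name `parse_phone` and the dictionary keys are assumptions for illustration; the pattern is the one given above.

```python
import re

# Pattern from the answer above; it expects digits only.
NANP = re.compile(r"^1?(\d{3})(\d{3})(\d{4})(\d*)$")


def parse_phone(raw):
    """Return the four components as a dict, or None if the digits don't fit."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    match = NANP.match(digits)
    if match is None:
        return None
    area, exchange, suffix, extension = match.groups()
    return {
        "AreaCode": area,
        "Exchange": exchange,
        "Suffix": suffix,
        "Extension": extension,  # empty string when there is no extension
    }


# The four examples from the question.
for raw in [
    "3035551234122",
    "1-303-555-1234x122",
    "(303)555-1234-122",
    "1 (303) 555 -1234-122",
]:
    print(raw, "->", parse_phone(raw))
```

All four examples from the question come out as 303 / 555 / 1234 / 122. Because separators are discarded before matching, this approach cannot tell an extension from extra trailing digits, which may or may not matter for your data.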

Peter Stone
1

This regex works exactly as you want with your examples:

using System.Text.RegularExpressions;

Regex regexObj = new Regex(@"\(?(?<AreaCode>[0-9]{3})\)?[-. ]?(?<Exchange>[0-9]{3})[-. ]*?(?<Suffix>[0-9]{4})[-. x]?(?<Extension>[0-9]{3})");
Match matchResult = regexObj.Match("1 (303) 555 -1234-122");

// Now you have the results in the named groups
string areaCode  = matchResult.Groups["AreaCode"].Value;   // "303"
string exchange  = matchResult.Groups["Exchange"].Value;   // "555"
string suffix    = matchResult.Groups["Suffix"].Value;     // "1234"
string extension = matchResult.Groups["Extension"].Value;  // "122"
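
To check the "works with your examples" claim, here is a rough Python port of the same pattern (an assumption on my part; Python spells named groups `(?P<Name>...)` where .NET uses `(?<Name>...)`). The helper name `parse` is hypothetical.

```python
import re

# Same pattern as the C# answer, with Python's named-group syntax.
PHONE = re.compile(
    r"\(?(?P<AreaCode>[0-9]{3})\)?[-. ]?(?P<Exchange>[0-9]{3})[-. ]*?"
    r"(?P<Suffix>[0-9]{4})[-. x]?(?P<Extension>[0-9]{3})"
)


def parse(raw):
    """Return the named groups as a dict, or None when nothing matches."""
    match = PHONE.search(raw)  # unanchored, so a leading "1 " is skipped
    return match.groupdict() if match else None


# The four examples from the question.
for raw in [
    "3035551234122",
    "1-303-555-1234x122",
    "(303)555-1234-122",
    "1 (303) 555 -1234-122",
]:
    print(raw, "->", parse(raw))

# A number without an extension does not match at all.
print(parse("303-555-1234"))
```

The four examples all parse to 303 / 555 / 1234 / 122. One thing to be aware of: the extension group is mandatory and exactly three digits long, so a plain ten-digit number yields no match unless the pattern is relaxed.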
Christian C. Salvadó
  • Taking the +1 prefix into consideration: `new Regex(@"1?\(?(?<AreaCode>[0-9]{3})\)?[-. ]?(?<Exchange>[0-9]{3})[-. ]*?(?<Suffix>[0-9]{4})[-. x]?(?<Extension>[0-9]{3})");` e.g. `+13035551234122` – Jaider Feb 21 '18 at 18:38
1

Here is a well-written library used with GeoIP for instance:

http://highway.to/geoip/numberparser.inc

ridoy
Ruslan Abuzant
0

Here's a method that is easier on the eyes, provided by the Z Directory (vettrasoft.com) and geared towards American phone numbers:

string_o s2, s1 = "888/872.7676";
z_fix_phone_number (s1, s2);
cout << s2.print();      // prints "+1 (888) 872-7676"
phone_number_o pho = s2;
pho.store_save();

The last line stores the number in the database table "phone_number", with column values country_code = "1", area_code = "888", exchange = "872", etc.

gorth