3

Let's say I have the following string:

getPasswordLastChangedDatetime

How can I split it at the capital letters so that I get:

get
Password
Last
Changed
Datetime
ryanzec
  • possible duplicate of this: http://stackoverflow.com/questions/5020906/python-convert-camel-case-to-space-delimited-using-regex-and-taking-acronyms-int – Marek Sebera Jul 04 '11 at 14:48

7 Answers

6

If you only care about ASCII characters:

$parts = preg_split("/(?=[A-Z])/", $str);


The (?=...) construct is called a lookahead: it asserts that the pattern inside it matches at the current position without consuming any characters.
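To make the lookahead concrete, here is a minimal sketch applied to the string from the question. The variable names are illustrative, and the `PREG_SPLIT_NO_EMPTY` flag is an addition that is not part of the answer; it is there only to discard any empty pieces.

```php
<?php
// The lookahead (?=[A-Z]) matches the empty position in front of each ASCII
// capital, so preg_split() cuts there without consuming the letter itself.
$str = 'getPasswordLastChangedDatetime';

// PREG_SPLIT_NO_EMPTY (an addition, not in the answer) drops empty pieces.
$parts = preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts); // get, Password, Last, Changed, Datetime
```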

This works if the parts only contain a capital character at the beginning. It gets more complicated if you have things like getHTMLString. This could be matched by:

$parts = preg_split("/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/", $str);
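As a rough illustration of how the two alternatives cooperate, the sketch below runs the same pattern on getHTMLString and on the string from the question. The variable names and the `PREG_SPLIT_NO_EMPTY` flag are additions for the example, not part of the answer.

```php
<?php
// First alternative: the boundary between a lowercase and an uppercase letter.
// Second alternative: the position before a capital that starts a new word,
// which is what separates the acronym "HTML" from "String".
$pattern = '/((?<=[a-z])(?=[A-Z])|(?=[A-Z][a-z]))/';

$acronym = preg_split($pattern, 'getHTMLString', -1, PREG_SPLIT_NO_EMPTY);
print_r($acronym); // get, HTML, String

$plain = preg_split($pattern, 'getPasswordLastChangedDatetime', -1, PREG_SPLIT_NO_EMPTY);
print_r($plain); // get, Password, Last, Changed, Datetime
```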


Felix Kling
1

Asked this a little too soon, found this:

preg_replace('/(?!^)[[:upper:]]/',' \0',$test);
ryanzec
  • 2
    I would add a **+**: `preg_replace('/(?!^)[[:upper:]]+/',' \0',$test);` to be able to get a good match on "getMyAPI". (That is, if you want the API as one word). – johnhaggkvist Jul 04 '11 at 15:00
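The difference the `+` makes can be seen in a small sketch using the "getMyAPI" string from the comment above. The variable names `$perLetter` and `$perRun` are made up for the example.

```php
<?php
$test = 'getMyAPI';

// A space goes in front of every capital that is not at the start of the string.
$perLetter = preg_replace('/(?!^)[[:upper:]]/', ' \0', $test);

// With the + quantifier, a run of capitals is matched as a single unit.
$perRun = preg_replace('/(?!^)[[:upper:]]+/', ' \0', $test);

echo $perLetter, "\n"; // get My A P I
echo $perRun, "\n";    // get My API
```

Both calls return a single string; if an array is needed, something like `explode(' ', $perRun)` should turn it into the separate words.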
1

For instance:

(?:^|\p{Lu})\P{Lu}*
Artefacto
  • More information can be found here: http://www.regular-expressions.info/unicode.html – Felix Kling Jul 04 '11 at 14:56
  • What do you propose to do with this regex? If you use `preg_split` or `preg_replace`, the characters it matches will be deleted; if you use `preg_match_all`, everything *except* those characters will be deleted. – Alan Moore Jul 04 '11 at 15:58
  • @Alan ? `preg_match_all` deletes nothing, it finds matches. You could e.g. do `preg_match_all('/(?:^|\p{Lu})\P{Lu}*/iu', 'getPasswordLastChangedDatetime', $result)`; then `$result[0]` will have the strings the OP wants. – Artefacto Jul 04 '11 at 16:02
  • You're right, I was reading the regex wrong. And I should have said it would *ignore* the other characters, leaving them out of the results (but of course, it doesn't ignore anything). But now I'm curious about that `i` modifier: it doesn't seem to have any effect in my tests--which is good, since case is the whole point. – Alan Moore Jul 04 '11 at 16:33
  • @Alan I put the `i` accidentally (force of habit). In any case, since the expression is looking up the properties of the characters directly, it has no effect. – Artefacto Jul 04 '11 at 16:36
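Putting the comments together, a minimal sketch of the `preg_match_all` approach might look like the following. The `i` modifier is left out, since it was described above as accidental; the accented sample string is an assumption added to show why the Unicode properties matter, and its `\u{...}` escape requires PHP 7 or later.

```php
<?php
// \p{Lu} is any Unicode uppercase letter and \P{Lu} is anything else;
// the u modifier makes PCRE treat the pattern and subject as UTF-8.
$pattern = '/(?:^|\p{Lu})\P{Lu}*/u';

preg_match_all($pattern, 'getPasswordLastChangedDatetime', $ascii);
print_r($ascii[0]); // get, Password, Last, Changed, Datetime

// "\u{00C9}" is É, a capital letter that [A-Z] would not recognize.
preg_match_all($pattern, "get\u{00C9}tatCivil", $accented);
print_r($accented[0]); // get, État, Civil
```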
0

No need for an overcomplicated solution. This does it:

preg_replace('/([A-Z])/',"\n".'$1',$string);

This doesn't take care of acronyms, of course.
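As a quick check of the acronym caveat, here is a small sketch of what this replacement produces for getHTMLString (a sample input borrowed from another answer on this page, not from this one):

```php
<?php
$string = 'getHTMLString';

// Every ASCII capital gets a newline in front of it, so each letter of
// the acronym ends up on a line of its own.
$result = preg_replace('/([A-Z])/', "\n" . '$1', $string);

echo $result; // get, H, T, M, L, String on separate lines
```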

dynamic
0
preg_split('@(?=[A-Z])@', 'asAs')
azat
  • 2
    FYI, there are no quantifiers and no dots in this regex, so the `U` (ungreedy quantifiers) and `s` (dot matches all) modifiers aren't doing any good. – Alan Moore Jul 04 '11 at 15:47
0

Use this: [a-z]+|[A-Z][a-z]* or \p{Ll}+|\p{Lu}\p{Ll}*
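The answer gives only the patterns, so the sketch below assumes they are meant to be used with `preg_match_all`, collecting the matches rather than splitting. The second input, `getUser2Name`, is a made-up sample showing that characters matched by neither alternative are simply skipped.

```php
<?php
// Either a run of lowercase letters, or one capital followed by
// any number of lowercase letters.
$pattern = '/[a-z]+|[A-Z][a-z]*/';

preg_match_all($pattern, 'getPasswordLastChangedDatetime', $words);
print_r($words[0]); // get, Password, Last, Changed, Datetime

// The digit matches neither alternative, so it does not appear in the result.
preg_match_all($pattern, 'getUser2Name', $skipped);
print_r($skipped[0]); // get, User, Name
```

For the `\p{Ll}+|\p{Lu}\p{Ll}*` variant, the `u` modifier would presumably be needed as well, so that non-ASCII letters are read as UTF-8.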

Kirill Polishchuk
0
preg_split("/(?<=[a-z])(?=[A-Z])/", $password);
Bob Vale