3

I would like to come up with a script to program a custom formula for google sheet. The idea is to split a string composed of multiple words. The formula should recognize the words where there is a capital letter and separate them. The result would be a string where the words are separated by ",".

To clarify this is an example of the string:

Nursing StudentStudentNurseNursing School

Desired Result:
Nursing Student,Student,Nurse,Nursing School

I have tried to use a formula in Google Sheet:

=split(regexreplace(A1,"[A-Z][^A-Z]*","$0"&char(9)),char(9))

However, it generates 6 cells with the below strings:

Nursing Student Student Nurse Nursing School

Can anybody help me or give me some hint?

pnuts
  • 56,678
  • 9
  • 81
  • 133
Stefano
  • 159
  • 5
  • 16

2 Answers2

5
=REGEXREPLACE(A1,"(\B)([A-Z])",",$2")
  • \B not a word Border.
  • [A-Z] Upper case letter.
  • If \B is followed by a upper case letter, replace the \B with ,
TheMaster
  • 37,620
  • 6
  • 43
  • 68
3

If you plan to insert a comma in between a lowercase letter and an uppercase letter, you may use either of:

=REGEXREPLACE(A1,"([a-z])([A-Z])","$1,$2")
=REGEXREPLACE(A1,"([[:lower:]])([[:upper:]])","$1,$2")

where

  • ([a-z]) / ([[:lower:]]) - Capturing group 1 (later referred to with $1 from the replacement pattern): any lowercase ASCII letter
  • ([A-Z]) / ([[:upper:]]) - Capturing group 2 (later referred to with $2 from the replacement pattern): any uppercase ASCII letter

enter image description here

Note that another suggestion, based on a non-word boundary \B, that can be written as =REGEXREPLACE(A1,"\B[A-Z]",",$0"), will also match an uppercase letter after _ and any digit, so it might overfire if you do not expect that behavior.

Wiktor Stribiżew
  • 561,645
  • 34
  • 376
  • 476