35

I use this #(\s|^)([a-z0-9-_]+)#i to capitalize the first letter of every word. I also want it to capitalize a letter that comes after a special character such as a dash (-).

Now it shows:

This Is A Test For-stackoverflow

And I want this:

This Is A Test For-Stackoverflow

Any suggestions/samples for me?

I'm not a pro, so please keep it simple enough for me to understand.

quietmint
Simmer

8 Answers

33

+1 for word boundaries, and here is a comparable JavaScript solution. It accounts for possessives as well:

var re = /(\b[a-z](?!\s))/g;
var s = "fort collins, croton-on-hudson, harper's ferry, coeur d'alene, o'fallon"; 
s = s.replace(re, function(x){return x.toUpperCase();});
console.log(s); // "Fort Collins, Croton-On-Hudson, Harper's Ferry, Coeur D'Alene, O'Fallon"
NotNedLudd
  • toUpperCase is capitalizing the whole word. Here is the solution: s.replace(re, function(x){return x.charAt(0).toUpperCase() + x.slice(1);}); – Polopollo May 09 '16 at 20:26
  • 2
    @Polopollo, in this case the regex is only returning one letter if it matches but globally. So there is no need for that extra coding and it should work as is. – adam-beck Apr 26 '17 at 19:51
  • This will not work as OP has asked since a single character would not get capitalized. Just for anybody who comes to this question like I did. – adam-beck Apr 26 '17 at 19:51
  • 1
    I fear this doesn't work: word boundaries include things like '. So `don't` becomes `Don'T` – Anderas Apr 13 '18 at 05:28
  • @Anderas that's what the negative lookahead is for: `(?!\s)` checks if it's not a character before whitespace. On the other hand, this fails when a word like `don't` is followed by a non-whitespace, non-alphanumeric character like a comma, period or exclamation mark. It would be better to use a word boundary in the lookahead: `/(\b[a-z](?!\b))/g;` – Guido Bouman May 03 '18 at 12:22
  • @GuidoBouman: Your suggested regex fails for Coeur D'Alene and O'Fallon though. – davemyron May 23 '19 at 00:56
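The trade-off discussed in these comments can be checked with a small sketch. The helper name `capitalizeMatches` is made up for illustration; the two patterns are the one from the answer and the one suggested in the comments (without the outer capture group, which does not change what matches).

```javascript
// Two lookahead variants discussed in the comments.
const notBeforeWhitespace = /\b[a-z](?!\s)/g; // pattern from the answer
const notBeforeBoundary = /\b[a-z](?!\b)/g;   // pattern suggested in the comments

// Hypothetical helper: upper-case every match of the given pattern.
function capitalizeMatches(text, pattern) {
  return text.replace(pattern, (letter) => letter.toUpperCase());
}

console.log(capitalizeMatches("don't, stop", notBeforeWhitespace));    // "Don'T, Stop"
console.log(capitalizeMatches("don't, stop", notBeforeBoundary));      // "Don't, Stop"
console.log(capitalizeMatches("o'fallon", notBeforeWhitespace));       // "O'Fallon"
console.log(capitalizeMatches("o'fallon", notBeforeBoundary));         // "o'Fallon"
console.log(capitalizeMatches("this is a test", notBeforeWhitespace)); // "This Is a Test"
```

Neither variant capitalizes a single-letter word followed by a space, and each handles only one of the two apostrophe cases, so the choice depends on the input you expect.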
19

A simple solution is to use word boundaries:

#\b[a-z0-9-_]+#i

Alternatively, you can match for just a few characters:

#([\s\-_]|^)([a-z0-9-_]+)#i
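One thing to watch: the character class [a-z0-9-_] itself contains the dash, so for-stackoverflow is consumed as a single match. If your replacement upper-cases only the first character of each match, the letter after the dash may stay in lower case. A minimal JavaScript sketch that avoids this by capturing the separator and just the one letter after it (the function name `capitalizeAfterSeparators` is an assumption, not part of the answer):

```javascript
// Hypothetical helper: capture the separator (or start of string) and only the
// single lowercase letter that follows it, then upper-case just that letter.
function capitalizeAfterSeparators(text) {
  return text.replace(/(^|[\s\-_])([a-z])/g, (match, separator, letter) =>
    separator + letter.toUpperCase());
}

console.log(capitalizeAfterSeparators("this is a test for-stackoverflow"));
// "This Is A Test For-Stackoverflow"
console.log(capitalizeAfterSeparators("snake_case name"));
// "Snake_Case Name"
```

Listing the separators explicitly means you decide which characters start a new word, at the cost of having to extend the class for each new separator.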
Kobi
  • Thank you! Works like a charm! – Simmer Jun 06 '11 at 11:56
  • 2
    @Tim - I took artistic freedom and didn't change the way the OP matches letters - It's *possible* Simmer wants the letter as output, to change its color or whatnot. Also, I didn't give it that much thought, I only had 4 minutes `:P` – Kobi Jun 06 '11 at 14:35
  • 1
    Can someone please add a jsfiddle example? It would be helpful. – Pravin W Jun 09 '16 at 10:33
  • 1
    Which language's regex is this for? – JohnK Jun 22 '17 at 15:32
  • @JohnK - Both of these are simple enough and should work in all languages. `#` is a separator here, so your language may need `"\\b[a-z0-9-_]+"` and an `IgnoreCase` flag. – Kobi Jun 22 '17 at 15:44
7

Actually, you don't need to match the full string; just match the first lowercase letter of each word, like this:

'~\b([a-z])~'
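As a rough JavaScript sketch of the same pattern (the function name `capitalizeWords` is hypothetical), note that \b treats digits and the underscore as word characters, so a letter that follows one of those is not at a word boundary:

```javascript
// Sketch of the \b([a-z]) idea; capitalizeWords is a made-up name.
function capitalizeWords(text) {
  return text.replace(/\b([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(capitalizeWords("this is a test for-stackoverflow")); // "This Is A Test For-Stackoverflow"
console.log(capitalizeWords("snake_case")); // "Snake_case": "_" counts as a word character
console.log(capitalizeWords("2nd place"));  // "2nd Place": digits are word characters too
```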
anubhava
6

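If you want to do this with the regex replacement alone, you can use the \u escape in the replacement string, where the tool supports it (for example Perl or GNU sed).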

To transform this string:

This Is A Test For-stackoverflow

into

This Is A Test For-Stackoverflow

Use (.+)-(.+) to capture the text before and after the "-", then use this as the replacement:

$1-\u$2

In bash, with GNU sed:

echo "This Is A Test For-stackoverflow" | sed 's/\(.\)-\(.\)/\1-\u\2/'
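JavaScript replacement strings do not support \u, so the same idea needs a callback there. A minimal sketch, with `capitalizeAfterDash` as a hypothetical name; unlike the sed command above, it uses the g flag and so handles every dash, not only the first:

```javascript
// JavaScript has no \u in replacement strings, so a callback does the upper-casing.
// capitalizeAfterDash is a made-up name for this sketch.
function capitalizeAfterDash(text) {
  return text.replace(/-(.)/g, (match, nextChar) => "-" + nextChar.toUpperCase());
}

console.log(capitalizeAfterDash("This Is A Test For-stackoverflow"));
// "This Is A Test For-Stackoverflow"
```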

Jaime Roman
4

For JavaScript, here’s a solution that works across different languages and alphabets:

const originalString = "this is a test for-stackoverflow"
const processedString = originalString.replace(/(?:^|\s|[-"'([{])+\S/g, (c) => c.toUpperCase())

It matches any non-whitespace character \S that is preceded by the start of the string ^, whitespace \s, or any of the characters -"'([{, and replaces it with its uppercase variant.
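A short usage sketch, wrapping the pattern above in a hypothetical function named `titleCaseAnyScript`. Because the apostrophe is in the list of preceding characters, contractions are affected too, as the last line shows:

```javascript
// titleCaseAnyScript is a made-up wrapper around the pattern from this answer.
const titleCaseAnyScript = (text) =>
  text.replace(/(?:^|\s|[-"'([{])+\S/g, (c) => c.toUpperCase());

console.log(titleCaseAnyScript("this is a test for-stackoverflow")); // "This Is A Test For-Stackoverflow"
console.log(titleCaseAnyScript("привет-мир")); // "Привет-Мир"
console.log(titleCaseAnyScript("élan vital")); // "Élan Vital"
console.log(titleCaseAnyScript("don't stop")); // "Don'T Stop"
```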

Michael Schmid
2

My solution using JavaScript:

function capitalize(str) {
  // Match runs of three or more lowercase or accented letters that start at a word boundary.
  var reg = /\b([a-zÁ-ú]{3,})/g;
  return str.replace(reg, (w) => w.charAt(0).toUpperCase() + w.slice(1));
}

With ES6+ JavaScript:

const capitalize = str => 
    str.replace(/\b([a-zÁ-ú]{3,})/g, (w) => w.charAt(0).toUpperCase() + w.slice(1));



/<expression-here>/g
  1. [a-zÁ-ú] matches the lowercase letters a-z plus the characters in the range Á-ú (U+00C1 to U+00FA), which covers most accented Latin letters in both cases. ex: sábado de Janeiro às 19h. sexta-feira de janeiro às 21 e horas
  2. [a-zÁ-ú]{3,} requires at least three such letters in a row, which skips short words such as "de" and "e".
    ex: sábado de Janeiro às 19h. sexta-feira de janeiro às 21 e horas
  3. \b([a-zÁ-ú]{3,}) finally, the word boundary \b makes the match start at the beginning of a word. The parentheses group the expression.
    ex: sábado de Janeiro às 19h. sexta-feira de janeiro às 21 e horas

After matching, I capitalize only the first letter of each matched word w:

w.charAt(0).toUpperCase() + w.slice(1); // output -> Output

Joining the two:

str.replace(/\b(([a-zÁ-ú]){3,})/g, (w) => w.charAt(0).toUpperCase() + w.slice(1));

result:
Sábado de Janeiro às 19h. Sexta-Feira de Janeiro às 21 e Horas
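One caveat: in JavaScript, \b is defined in terms of ASCII word characters, so a word that starts with an accented letter, such as "água", has no boundary before the "á" and comes out as "áGua". A possible variant, assuming an engine with ES2018 lookbehind and Unicode property escapes (the name `capitalizeUnicode` is made up for this sketch):

```javascript
// Hypothetical variant: use Unicode letter classes instead of \b and [a-zÁ-ú].
// (?<!\p{L}) means "not preceded by a letter"; \p{Ll} is a lowercase letter.
// Requires ES2018 lookbehind and \p{...} support (the u flag is mandatory).
const capitalizeUnicode = (str) =>
  str.replace(/(?<!\p{L})\p{Ll}\p{L}{2,}/gu, (w) => w.charAt(0).toUpperCase() + w.slice(1));

console.log(capitalizeUnicode("água de março"));
// "Água de Março"
console.log(capitalizeUnicode("sábado de Janeiro às 19h. sexta-feira de janeiro às 21 e horas"));
// "Sábado de Janeiro às 19h. Sexta-Feira de Janeiro às 21 e Horas"
```

It keeps the three-letter minimum from the original, so short words such as "de", "às" and "e" are still left alone.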

1

Here's my Python solution

>>> import re
>>> the_string = 'this is a test for stack-overflow'
>>> re.sub(r'(((?<=\s)|^|-)[a-z])', lambda x: x.group().upper(), the_string)
'This Is A Test For Stack-Overflow'

Read about the "positive lookbehind" here: https://www.regular-expressions.info/lookaround.html

nmz787
-1

This will make

R.E.A.C De Boeremeakers

from

r.e.a.c de boeremeakers

(?<=\A|[ .])(?<up>[a-z])(?=[a-z. ])

using

    Dim matches As MatchCollection = Regex.Matches(inputText, "(?<=\A|[ .])(?<up>[a-z])(?=[a-z. ])")
    Dim outputText As New StringBuilder
    If matches(0).Index > 0 Then outputText.Append(inputText.Substring(0, matches(0).Index))
    Dim index As Integer = matches(0).Index + matches(0).Length
    For Each Match As Match In matches
        Try
            outputText.Append(UCase(Match.Value))
            outputText.Append(inputText.Substring(Match.Index + 1, Match.NextMatch.Index - Match.Index - 1))
        Catch ex As Exception
            outputText.Append(inputText.Substring(Match.Index + 1, inputText.Length - Match.Index - 1))
        End Try
    Next
Sedecimdies