How can I split a string based on a string and have the resulting array contain the separators as well?

Example:

If my string = "Hello how are you.Are you fine?.How old are you?"

And I want to split on the string "you", then the result I want is an array with the items { "Hello how are", "you", ".Are", "you", "fine?.How old are", "you", "?" }.

How can I get a result like this? I tried String.Split and

string[] substrings = Regex.Split(source, stringSeparators);

But both return an array that does not contain the occurrences of "you".

Also, I want to split only on the whole word you. I don't want to split if you is a part of some other words. For example, in the case Hello you are so young, I want the result as { "Hello", "you", "so young" }. I don't want to split the word young to { "you", "ng" }.

ErikE
Sebastian
  • possible duplicate of [C# split string but keep split chars / separators](http://stackoverflow.com/questions/521146/c-sharp-split-string-but-keep-split-chars-separators) – Ondrej Janacek Dec 03 '14 at 06:28
  • @Ondrej It's not a duplicate. I want to split on whole words only. See the edit. – Sebastian Dec 03 '14 at 06:35

4 Answers

4

You can put the separator into a capturing group; the captured text then becomes part of the result array:

string[] substrings = System.Text.RegularExpressions.Regex.Split(source, "(you)");

The output would be:

"Hello how are ", 
"you" ,
".Are ",
"you",
" fine?.How old are ",
"you",
"?"

Update regarding your additional question: use word boundaries around the keyword:

string[] substrings = System.Text.RegularExpressions.Regex.Split(source, "\\b(you)\\b");
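
As a minimal sketch of the word-boundary approach, the snippet below wraps the call in a helper. The class name `WordSplitter` and the method `SplitOnWord` are hypothetical names chosen for illustration, and the sketch assumes the keyword starts and ends with a word character (otherwise `\b` will not behave as expected). `Regex.Escape` is used so that a keyword containing regex metacharacters is treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSplitter
{
    // Hypothetical helper: splits on a whole word and keeps the word in the result.
    public static string[] SplitOnWord(string source, string word, bool ignoreCase)
    {
        // The capturing group makes Regex.Split return the separator as well;
        // \b on both sides restricts matches to whole words.
        string pattern = @"\b(" + Regex.Escape(word) + @")\b";
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.Split(source, pattern, options);
    }

    public static void Main()
    {
        // "young" is left intact; "You" is matched only because ignoreCase is true.
        foreach (string part in SplitOnWord("Hello You are so young", "you", true))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

The pieces keep their surrounding whitespace, so call `Trim()` on each element if that matters.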
dognose
  • Also, I want to split only on the whole word "you". I don't want to split if "you" is part of some other word. For example, for "Hello you are so young" I want the result "Hello", "you", "so young". I don't want to split the word "young" into "you" and "ng". – Sebastian Dec 03 '14 at 06:36
  • That is, I want to split on "you", " you ", " you", or "you " (without spaces, with spaces on both sides, with a space on the left or right only). – Sebastian Dec 03 '14 at 06:37
  • `string[] substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "(you)"); int s = substrings.Length; substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "\b(you)\b"); s = substrings.Length;` I tried the suggestion, but only the first call gives a result. The second one is not returning anything. – Sebastian Dec 03 '14 at 07:04
  • @JMat My bad, you of course need to escape the `\b` in the pattern; see the edit. – dognose Dec 03 '14 at 07:17
  • Yes, it's perfect now. Just one more case to handle: case sensitivity. How can I achieve that? Right now, if "You" is present in the string it is skipped, since the Y is capitalized. – Sebastian Dec 03 '14 at 07:37
  • @JMat Without `RegexOptions.IgnoreCase` as the third parameter, the split is case-sensitive; pass that option to match "You" as well. – dognose Dec 03 '14 at 08:32
1
\b(you)\b

Split by this and you have your result.
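
One pitfall when typing this pattern into C# source: in a regular string literal, `\b` is the backspace character, so the regex engine never sees a word boundary. The sketch below (the class name `BoundaryEscapeDemo` is made up for illustration) compares the unescaped literal with a verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryEscapeDemo
{
    public static void Main()
    {
        string source = "Hello you are so young";

        // Regular literal: "\b" is a backspace character (U+0008), so the
        // pattern looks for backspace + "you" + backspace and never matches.
        string[] wrong = Regex.Split(source, "\b(you)\b");

        // Verbatim literal: the backslashes reach the regex engine unchanged,
        // so \b is interpreted as a word boundary.
        string[] right = Regex.Split(source, @"\b(you)\b");

        Console.WriteLine(wrong.Length);  // 1 (the whole input, unsplit)
        Console.WriteLine(right.Length);  // 3 ("Hello ", "you", " are so young")
    }
}
```

Writing `"\\b(you)\\b"` is equivalent to the verbatim form.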

vks
  • `string[] substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "(you)"); int s = substrings.Length; substrings = System.Text.RegularExpressions.Regex.Split("Hello how are you.Are you fine?.How old are you?.Hello how old are you?.Hello you are so young", "\b(you)\b"); s = substrings.Length;` I tried the suggestion, but only the first call gives a result. The second one is not returning anything. – Sebastian Dec 03 '14 at 07:04
0

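Use a regex replace:

(you) with |$1| (.NET uses $1, not \1, in the replacement string)

Now you will have a string like this:

Hello how are |you|.Are |you| fine?.How old are |you|?

Now you can simply split on |, provided the source string does not already contain a | of its own.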
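
A rough sketch of this replace-then-split idea in C# follows. The class and method names (`MarkerSplit`, `SplitViaMarker`) are hypothetical, and the code assumes the input never contains the `|` marker itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MarkerSplit
{
    // Hypothetical helper: wraps each match in '|' markers, then splits on '|'.
    // Assumes the source string contains no '|' characters of its own.
    public static string[] SplitViaMarker(string source)
    {
        // $1 refers to the first capturing group in a .NET replacement string.
        string marked = Regex.Replace(source, "(you)", "|$1|");
        return marked.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string source = "Hello how are you.Are you fine?.How old are you?";
        foreach (string part in SplitViaMarker(source))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

`StringSplitOptions.RemoveEmptyEntries` drops the empty pieces that appear when a match sits at the very start or end of the string.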

Hope that helps

aelor
0

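string[] substrings = System.Text.RegularExpressions.Regex.Split(source, @"\s* you\s*");

This should work (note the verbatim string, since "\s" is not a valid escape in a regular C# string literal), although the separators themselves are not kept. Below is the output.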

"Hello how are"

".Are"

"fine?.How old are"

"?"
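
If the separators should be kept as well, one possible variation (a sketch, not part of the answer above) is to capture the keyword while leaving the optional whitespace outside the group. The class name `TrimmedSplit` and method `SplitKeepingWord` are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimmedSplit
{
    // Hypothetical helper: the whitespace around the word is consumed by the
    // separator, but only the captured word itself is returned in the array.
    public static string[] SplitKeepingWord(string source)
    {
        return Regex.Split(source, @"\s*\b(you)\b\s*");
    }

    public static void Main()
    {
        string source = "Hello how are you.Are you fine?.How old are you?";
        foreach (string part in SplitKeepingWord(source))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

For the sample string this yields the seven items listed in the question. If the input starts or ends with the keyword, the array will contain an empty string at that end.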

praveen.upadhyay