How can I match whitespace in sed? In my data I want to match every run of 3 or more consecutive whitespace characters (tabs or spaces) and replace it with 2 spaces. How can this be done?
6 Answers
The character class \s will match the whitespace characters <tab> and <space>.
For example:
$ sed -e "s/\s\{3,\}/  /g" inputFile
will substitute every sequence of at least 3 whitespace characters with two spaces.
REMARK:
For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension. See the POSIX specifications for sed and BREs.
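A minimal sketch of the POSIX-style variant described in the remark, assuming a POSIX shell. The function name `squeeze_ws` and the sample input are made up for illustration:

```shell
# Hypothetical helper: collapse every run of 3+ whitespace characters into
# two spaces, using the POSIX class [[:space:]] and a BRE interval \{3,\}
# instead of the GNU-only \s.
squeeze_ws() {
    sed 's/[[:space:]]\{3,\}/  /g'
}

# Made-up input: four spaces, then a tab plus two spaces, then a single space.
printf 'a    b\t  c d\n' | squeeze_ws
# prints: a  b  c d
```

The single space between `c` and `d` is left alone because the interval requires at least three characters.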
This works on MacOS 10.8:
sed -E "s/[[:space:]]+/ /g"
Not generally, GNU sed won't have -E. From the BSD sed man page: "The -E, -a and -i options are non-standard FreeBSD extensions and may not be available on other operating systems." – Brad Koch Mar 18 '14 at 21:19
Why do you need the -E flag, for the + operator? Most expressions would probably be fine with * instead, then this would work on other platforms. – Samuel Mar 21 '15 at 00:05
@Samuel If you use *, the regex will match zero or more spaces, and you will get a space between every character, and a space at each end of each line. If you don't have the -E flag, then you want sed "s/[[:space:]]\+/ /g" to match one or more spaces. – jbo5112 Jan 20 '16 at 20:49
@BradKoch The fact that -E is non-standard does not imply GNU sed does not have that option. Your linked document states the availability of the -E option for GNU sed as well. – xuhdev Mar 06 '18 at 22:08
@xuhdev You're correct, GNU sed added support for -E in version 4.3, released in 2017. Older versions will still fail with -E. – Brad Koch Mar 06 '18 at 22:18
@BradKoch OK, I think I know what is confusing. Older versions already support -E but it is not documented. It was documented later since it seems that -E is coming to the POSIX standard. See https://unix.stackexchange.com/a/310454/38242 – xuhdev Mar 06 '18 at 22:25
For curious readers: GNU sed has had -r since as long as I can remember (prior to 2004 switch to git). -E was added as an undocumented alias to -r in Aug 2006 (rev 3a8e165). They documented -E in Oct 2013 (rev 8b65e079, prior to v4.1; they didn't git tag prior releases). All v4.3 added w/re to -E was examples in the HTML documentation. Regardless, any GNU sed running in 2010 shouldn't have had any problems with -E, but it was undocumented at the time... git://git.sv.gnu.org/sed – bobpaul Mar 01 '19 at 19:17
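For the original question (3 or more whitespace characters replaced by two spaces), the -E form might look like the sketch below. It assumes a sed that accepts -E, which per the comments above covers BSD/macOS sed and reasonably recent GNU sed. The function name `squeeze_ws_ere` is made up for illustration:

```shell
# -E selects extended regular expressions, so the interval {3,} is written
# without backslashes (the BRE equivalent is \{3,\}).
squeeze_ws_ere() {
    sed -E 's/[[:space:]]{3,}/  /g'
}

# Made-up input: three spaces between x and y, two spaces between y and z.
printf 'x   y  z\n' | squeeze_ws_ere
# prints: x  y  z
```

Only the three-space run is rewritten; the two-space run is too short to match.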
sed 's/[ \t]*/"space or tab"/'
Is this guaranteed to work on any version of sed on any system? If not, it might be worth mentioning where this does work in a similar fashion as the other answers, just so we know the limitations and where this might not have the intended result. – Mokubai Jul 22 '14 at 20:34
This RE is what I use to match whitespace. It is simpler than character classes just to match tab or space. It uses only the most basic conventions of regular expressions, so it should work anywhere with a functional implementation of regular expressions. – Nate Oct 18 '14 at 04:50
On Mac 10.9.5 this matches spaces and 't'. I used Michael Douma's above to match whitespace chars (it also works with -e). – Alien Life Form Jul 31 '15 at 18:32
Doesn't work sensibly on my SUSE system. It matches the first place on the line where there are zero or more spaces, which is before the first character. I doubt that is the intended function, and it certainly wasn't the requested use case. I believe you want to change the '*' to '+' (or '{3,}' per the question) and maybe put a g at the end of the sed command to match all occurrences of the pattern. Replacing [ \t] with [[:space:]] may also be desirable, in case there is something else for whitespace in the line. – jbo5112 Jan 20 '16 at 20:59
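The zero-width match described in the last comment can be reproduced with a small sketch. The variable names and inputs are made up for illustration, and the bracket expression holds only a space so the result does not depend on how a given sed treats \t:

```shell
# '*' permits zero repetitions, so the first match is the empty string at the
# very start of the line and the replacement is inserted there.
zero_or_more=$(printf 'a b\n' | sed 's/[ ]*/_/')
printf '%s\n' "$zero_or_more"    # _a b

# Writing the class twice ("one, then zero or more") forces at least one
# character, so the actual run of spaces is matched. Portable BRE, no \+ needed.
one_or_more=$(printf 'a   b\n' | sed 's/[ ][ ]*/_/')
printf '%s\n' "$one_or_more"     # a_b
```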
Some older versions of sed may not recognize \s as a white space matching token. In that case you can match a sequence of one or more spaces and tabs with '[XZ][XZ]*' where X is a space and Z is a tab.
So for the particular need here, with an older sed, you could do:
$ sed 's/[XZ][XZ][XZ][XZ]*/  /g' inputfile
where X is a tab and Z is a space.
– Marnix A. van Ammers Apr 12 '10 at 15:08
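One way to get the literal tab into the expression without typing it is to let printf produce it. This is a sketch assuming a POSIX shell; the names `tab` and `squeeze_blanks` are made up for illustration:

```shell
# Store a literal tab in a shell variable so the bracket expression contains a
# real space and a real tab (the "X" and "Z" of the answer above).
tab=$(printf '\t')

# Three mandatory [space-or-tab] characters followed by zero or more of them:
# a run of 3 or more, replaced by two spaces.
squeeze_blanks() {
    sed "s/[ $tab][ $tab][ $tab][ $tab]*/  /g"
}

# Made-up input: space, tab, space between a and b; a single tab between b and c.
printf 'a \t b\tc\n' | squeeze_blanks
```

The run between `a` and `b` becomes two spaces, while the single tab before `c` is kept because it is shorter than three characters.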
None of the above worked for me. Yet I found the simplest answer ever by using awk, which here prints only the lines that contain more than one whitespace-separated field:
user@~[]$ cat /tmp/file
/nospace/in/here
/this/one space
/well/seems we have spaces
user@~[]$ cat /tmp/file |awk 'NF>1'
/this/one space
/well/seems we have spaces
user@~[]$
I don't know if this helps, but this is what I did:
MacBook-Pro-van-User:training user$ cat sed.txt
My name is Bob
MacBook-Pro-van-User:training user$ sed s/"My name is Bob"/"My Lastname is Montoya"/g sed.txt
My Lastname is Montoya
I just added double quotes around the pattern and the replacement in the command.
Welcome to Super User! Before answering an old question having an accepted answer (look for green ✓) as well as other answers ensure your answer adds something new or is otherwise helpful in relation to them. Here is a guide on [answer]. There is also a site [tour] and a [help]. – help-info.de Oct 06 '22 at 17:15
... sed I had to use [[:space:]] because \s did not work for me. Perhaps \s is a GNU sed extension? – Jared Beck Jun 17 '13 at 23:24
... -e stopped it working, but -r made it work (Mint 16). I.e. changing from sed -e -r to sed -r was what I needed to do. However I was using [[:space:]] by this point, as I couldn't get \s to work. – Darren Cook Aug 16 '14 at 17:58
... [:space:] character class, \s will not only match <tab> and <space>, but also the <newline> character (try sed 'N;s/\s/x/' <<<$'aaa\nbbb' in bash). – Witiko Sep 11 '16 at 18:08
... [[:space:]] one could use [[:blank:]] which does match horizontal tabs and spaces only (but no newlines, vertical tabs etc.). – stefanct Oct 13 '17 at 13:10
... \s in the destination part (i.e. the replace-with part) of the regular expression? I want to avoid using keyboard spaces and/or tabs there, as well. – NYCeyes Jul 09 '21 at 20:40
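The difference between [[:space:]] and [[:blank:]] raised in these comments can be checked with a short sketch. It assumes a POSIX shell and uses printf instead of the bash-only <<< here-string; the variable names are made up for illustration:

```shell
# N appends the next input line, so the pattern space holds "aaa<newline>bbb".
# [[:space:]] matches that embedded newline; [[:blank:]] (space and tab only)
# does not, so the second command leaves the two lines untouched.
with_space=$(printf 'aaa\nbbb\n' | sed 'N;s/[[:space:]]/x/')
with_blank=$(printf 'aaa\nbbb\n' | sed 'N;s/[[:blank:]]/x/')

printf '%s\n' "$with_space"   # aaaxbbb
printf '%s\n' "$with_blank"   # aaa and bbb, still on separate lines
```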