
I would like to replace each leading whitespace character with a tab, so that every line in a file starts with as many tabs as it had spaces, using grep or sed. Each line has a few spaces followed by a dash and some text.

 -Line 1  
  -Line 2  
   -Line 3

Finding them is not a problem, but I don't see how to replace these characters using the backreferences. Something like:

sed 's/^([\s]+)(-.*)/\1\2/' file.txt

How can I solve this? Or is it even possible?

ganzpopp

2 Answers


Depending on your tab width, you might want to replace blocks of, for example, 4 or 8 spaces with tabs:

sed 's/ \{4\}/\t/g' infile

or

sed 's/ \{8\}/\t/g' infile

This turns a file that looks like

$ cat infile
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
       7 spaces
        8 spaces
         9 spaces
          10 spaces
           11 spaces

into this (replacing tabs with ^I so we can see them):

$ sed 's/ \{4\}/\t/g' infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
^I4 spaces
^I 5 spaces
^I  6 spaces
^I   7 spaces
^I^I8 spaces
^I^I 9 spaces
^I^I  10 spaces
^I^I   11 spaces

or this

$ sed 's/ \{8\}/\t/g' infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
       7 spaces
^I8 spaces
^I 9 spaces
^I  10 spaces
^I   11 spaces

The tab width can be parameterized (note the double quotes, which let the shell expand $tw):

$ tw=7
$ sed "s/ \{$tw\}/\t/g" infile | cat -T
no space
 1 space
  2 spaces
   3 spaces
    4 spaces
     5 spaces
      6 spaces
^I7 spaces
^I 8 spaces
^I  9 spaces
^I   10 spaces
^I    11 spaces
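
Because the width is interpolated straight into the sed script, it can help to check it first. The following is only a sketch, assuming a POSIX shell; the function name spaces_to_tabs is made up for illustration, and the tab is produced with printf because not every sed understands \t.

```shell
# Hypothetical helper (not part of the answer): replace every run of WIDTH
# spaces with a tab, anywhere in the line.
# Usage: spaces_to_tabs WIDTH < infile
spaces_to_tabs() {
    tw=$1
    # Refuse anything that is not a plain positive integer, so that stray
    # characters never end up inside the sed expression.
    case $tw in
        ''|*[!0-9]*) echo "usage: spaces_to_tabs WIDTH" >&2; return 2 ;;
    esac
    [ "$tw" -gt 0 ] || { echo "WIDTH must be greater than 0" >&2; return 2; }
    tab=$(printf '\t')            # literal tab character
    sed "s/ \{${tw}\}/${tab}/g"   # double quotes so ${tw} and ${tab} expand
}
```

For example, spaces_to_tabs 7 < infile | cat -T should give the same output as the tw=7 command above.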

Note that this can also be done easily in vim; see this question.

Spaces only at start of line

The commands above replace any group of four or eight spaces with a tab. If you want to only replace spaces at the start of a line, say for a file like this:

$ cat infile 
    4 spaces    word
     5 spaces    word
      6 spaces    word
       7 spaces    word
        8 spaces    word 
         9 spaces    word

you can use

$ sed ':a;s/^\(\t*\) \{4\}/\1\t/;/^\t* \{4\}/ba' infile | cat -T
^I4 spaces    word
^I 5 spaces    word
^I  6 spaces    word
^I   7 spaces    word
^I^I8 spaces    word 
^I^I 9 spaces    word

What this does:

# Label to branch to
:a

# Replace any leading tabs followed by four spaces
# with the same tabs plus one more tab
s/^\(\t*\) \{4\}/\1\t/

# If there are still four spaces after leading tabs, branch to a
/^\t* \{4\}/ba
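
The loop above hard-codes a width of four. A minimal sketch that takes the width as an argument, assuming a POSIX shell: the function name leading_blocks_to_tabs and the default of 4 are assumptions made for illustration, the tab comes from printf so the script does not rely on sed understanding \t, and the loop is closed with ta (branch if a substitution was made) instead of the separate address test.

```shell
# Hypothetical helper: turn each block of WIDTH leading spaces into one tab,
# leaving spaces later in the line untouched.
# Usage: leading_blocks_to_tabs WIDTH < infile
leading_blocks_to_tabs() {
    tw=${1:-4}             # block width; defaults to 4 (an assumption)
    tab=$(printf '\t')     # literal tab character
    # Keep the tabs produced so far (\1), replace the next WIDTH spaces with
    # one more tab, and repeat for as long as a substitution was made.
    sed -e ':a' -e "s/^\(${tab}*\) \{${tw}\}/\1${tab}/" -e 'ta'
}
```

Run as leading_blocks_to_tabs 4 < infile | cat -T, this should reproduce the output shown above; spaces between words are left alone because the pattern is anchored at the start of the line.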

Update

Turns out the question was actually about replacing spaces at the start of the line with a tab each.

For this input

0 spaces
 1 space
  2 spaces
   3 spaces

the following sed command works:

$ sed ':a;s/^\(\t*\) /\1\t/;ta' infile | cat -A
0 spaces$
^I1 space$
^I^I2 spaces$
^I^I^I3 spaces$

Explained:

:a                # Label to branch to
s/^\(\t*\) /\1\t/ # Capture tabs at start of line, replace next space with tab
ta                # Branches to :a if there was a substitution
Benjamin W.

Keep it simple and just use awk:

$ awk '{s=$0; sub(/[^ ].*/,"",s); gsub(/ /,"\t",s); sub(/^ +/,s)} 1' file
        -Line 1
                -Line 2
                        -Line 3
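
For readers less used to awk, the one-liner can be read as the commented sketch below. The awk statements are the same as above; only the wrapper function name leading_spaces_to_tabs_awk is made up for illustration.

```shell
# Hypothetical wrapper around the same awk program, one statement per line.
# Usage: leading_spaces_to_tabs_awk < file
leading_spaces_to_tabs_awk() {
    awk '{
        s = $0                 # work on a copy of the current line
        sub(/[^ ].*/, "", s)   # cut from the first non-space on; s is now just the leading spaces
        gsub(/ /, "\t", s)     # turn each of those spaces into a tab
        sub(/^ +/, s)          # replace the leading spaces of the line with those tabs
    } 1'                       # the pattern 1 is always true, so every line is printed
}
```

Lines without leading spaces pass through unchanged, because the last sub() finds nothing to replace.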
Ed Morton