
I would like to use sed to replace line breaks preceding a specific character with a single space:

Example:

<link rel="colorSchemeMapping
"
href="marinedrugs-790193-original.fld/colorschememapping.xml">

Should be:

<link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml">

I'm aware of the following command, which replaces every line break with a space:

':a;N;$!ba;s/\n/ /g'

but I would like to make a double quote next to the line break mandatory for the replacement to apply.

Wiktor Stribiżew
Milos Cuculovic

2 Answers


I suggest replacing each newline + " + newline sequence with " and a space, and any other newline with a space. Either of these two equivalent commands does that:

sed -i -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g' file
sed -i ':a;N;$!ba;s/\n"\n/" /g; s/\n/ /g' file

or

sed -e ':a;N;$!ba' -e 's/\n"\n/" /g' -e 's/\n/ /g' file > newfile

LINE ENDING NOTE: If your endings are CRLF, you need to replace \n with \r\n in the above patterns.
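As a rough sketch of the CRLF case, assuming GNU sed (which recognizes \r in patterns): the variant below makes the carriage return optional with \r? so the same command handles both LF and CRLF input. This is a variation on the commands above, not the exact form given in the answer. The function name join_lines and the shortened sample input are made up for illustration. The final s/\r$// removes the carriage return left at the very end, since sed strips only the trailing \n.

```shell
# Hypothetical helper; assumes GNU sed (-E and \r in regexes).
join_lines() {
  # Slurp all lines, collapse (CR)LF + quote + (CR)LF into '" ',
  # turn any other (CR)LF into a space, then drop the final stray CR.
  sed -E ':a;N;$!ba;s/\r?\n(")\r?\n|\r?\n/\1 /g;s/\r$//'
}

# Shortened sample input with CRLF line endings.
printf '<link rel="a\r\n"\r\nhref="b">\r\n' | join_lines
# prints: <link rel="a" href="b">
```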

Note that -E enables POSIX ERE syntax, which avoids excessive backslashes in the pattern. The regex means:

  • \n(")\n - a newline, then " is captured into Group 1 and then a newline
  • | - or
  • \n - a newline.

The replacement is the Group 1 value (" if it was matched, otherwise empty) followed by a space.
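To see both branches of the alternation at work, here is a small sketch on a made-up four-line input (GNU sed assumed). The first two newlines surround a lone ", so Group 1 is set; the last newline matches the plain \n branch, where Group 1 is empty.

```shell
# Made-up input: a, a lone quote, b, c (one per line).
printf 'a\n"\nb\nc\n' | sed -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g'
# prints: a" b c
```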

See the online sed demo:

s='<link rel="colorSchemeMapping
"
href="marinedrugs-790193-original.fld/colorschememapping.xml">'
sed -E ':a;N;$!ba;s/\n(")\n|\n/\1 /g' <<< "$s"
# => <link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml">
Wiktor Stribiżew

Since you're using GNU sed anyway:

$ sed -z 's/\n"\n/" /g' file
<link rel="colorSchemeMapping" href="marinedrugs-790193-original.fld/colorschememapping.xml">
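A quick way to check what this command leaves alone, assuming GNU sed (for -z) and a made-up input with extra lines x and y around a shortened link tag: only the newlines around the lone " are rewritten, and all other line breaks stay in place.

```shell
# Made-up input: the broken tag sits between two unrelated lines.
printf 'x\n<link rel="a\n"\nhref="b">\ny\n' | sed -z 's/\n"\n/" /g'
# prints three lines:
# x
# <link rel="a" href="b">
# y
```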

If you find yourself using constructs other than s, g, and p (with -n) in sed then you're using the wrong tool and should instead be using awk or similar. All other sed constructs became obsolete 40+ years ago when awk was invented.
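For comparison, one possible awk version of the same join might look like the sketch below. It is not code from this answer; the function name join_quote_lines and the sample input are invented for illustration. It buffers one line at a time instead of reading the whole file, and it should work in any POSIX awk.

```shell
# Hypothetical helper: attach a line consisting only of a double quote
# to the previous line, then join the following line with a space.
join_quote_lines() {
  awk '
    NR > 1 && $0 == "\"" { buf = buf "\""; glue = 1; next }  # lone quote: append to buffered line
    glue                 { buf = buf " " $0; glue = 0; next } # line after the quote: join with a space
    NR > 1               { print buf }                        # flush the previous line
                         { buf = $0 }                         # buffer the current line
    END                  { if (NR) print buf }                # flush the last line
  '
}

printf 'x\n<link rel="a\n"\nhref="b">\ny\n' | join_quote_lines
# prints three lines:
# x
# <link rel="a" href="b">
# y
```

Unlike the sed -z command, this sketch also attaches a lone " on the very last line to the line before it, so the two are not identical in every edge case.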

Ed Morton