
Basic sed question:

This sed substitution works fine to replace a string of Ns with a Z:

cat test1 | sed -E "s/N{10}/Z/g"

But the reverse replaces a Z with the literal string "N{10}"

cat test2 | sed -e "s/Z/N{10}/g"

and returns something like this: AAAA**N{10}**AAAA

How can I replace "Z" with a string of 10 Ns? I know I can type NNNNNNNNNN into the sed command, but I'm trying to understand the syntax. For some reason N{10} works in the pattern to be replaced but not in the replacement. I've tried to do this in a variety of ways, but can't find anything that works.

Any advice will be appreciated.

HatLess
Zhou Sun
  • See https://unix.stackexchange.com/a/658489/279389 – Wiktor Stribiżew May 29 '22 at 21:11
  • `I know I can key in NNNNNNNNNN into the sed command, but I'm trying to understand the syntax` There is no syntax, you have to type 10 Ns. – KamilCuk May 29 '22 at 21:15
  • Thank you for taking the time to post that useful information. But, I'm still a bit confused as how to set up the command. Can I impose on you once more: what command can I use to replace "Z" with "NNNNNNNNNN" (or 100 Ns) using this method? – Zhou Sun May 29 '22 at 21:38

5 Answers


Perl can do it:

perl -pe 's/Z/"N" x 10/ge' file

With N x 10, you explicitly ask Perl to repeat N 10 times; the e flag makes Perl evaluate the replacement as code rather than treat it as a string.

Gilles Quenot
  • Thank you! I appreciate this very much! – Zhou Sun May 29 '22 at 21:45
  • Marked as correct answer. :) – Zhou Sun May 29 '22 at 21:46
  • Note: `N x 10` is short for `"N" x 10`. Omitting the quotes is, shall we say, unconventional. We usually ask Perl to not let us avoid these quotes, normally. – ikegami May 30 '22 at 07:49
  • You can avoid the eval flag if you use a temporary variable: `$x = "N" x 10; s/Z/$x/g;` Also, I think the intent here is probably to use the global `/g` modifier and replace all occurrences, like in the original, not just the first. – TLP May 30 '22 at 08:10

but I'm trying to understand the syntax

From https://pubs.opengroup.org/onlinepubs/009604499/utilities/sed.html:

s/BRE/replacement/flags

Substitute the replacement string for instances of the BRE in the pattern space. [...]

The replacement string shall be scanned from beginning to end. An ampersand ( '&' ) appearing in the replacement shall be replaced by the string matching the BRE. The special meaning of '&' in this context can be suppressed by preceding it by a backslash. The characters "\n", where n is a digit, shall be replaced by the text matched by the corresponding backreference expression. The special meaning of "\n" where n is a digit in this context, can be suppressed by preceding it by a backslash. For each other backslash ( '\' ) encountered, the following character shall lose its special meaning (if any). The meaning of a '\' immediately followed by any character other than '&', '\', a digit, or the delimiter character used for this command, is unspecified.

Generally, &, \1 through \9, and \\ are "special" in the replacement. Some sed implementations (GNU sed, for example) also accept \n in the replacement to mean a newline, although POSIX does not specify it. See also https://www.gnu.org/software/sed/manual/sed.html#The-_0022s_0022-Command .

BRE is:

The sed utility shall support the BREs described in the Base Definitions volume of IEEE Std 1003.1-2001, Section 9.3, Basic Regular Expressions, with the following additions: [...]

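In a regular expression, {10} (written \{10\} in a BRE) means that the preceding element must match exactly 10 times. The BRE and the replacement follow very different rules: a regular expression is used to match strings, not to generate them, so in the replacement N{10} is just literal text.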
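A quick way to see the difference is to run a few substitutions on one line. This is a minimal sketch; the input `AAAAZAAAA` and the variable names are made up for illustration.

```shell
# In the replacement, N{10} is plain text: nothing expands it.
literal=$(printf 'AAAAZAAAA\n' | sed 's/Z/N{10}/g')
echo "$literal"      # AAAAN{10}AAAA

# & stands for the whole text matched by the BRE.
amp=$(printf 'AAAAZAAAA\n' | sed 's/Z/[&]/g')
echo "$amp"          # AAAA[Z]AAAA

# \1 stands for the text matched by the first \( ... \) group.
backref=$(printf 'AAAAZAAAA\n' | sed 's/\(Z\)A/\1\1/')
echo "$backref"      # AAAAZZAAA
```

Only & and the numbered backreferences copy text into the replacement; sed has no repetition operator on that side.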

I can recommend https://regexcrossword.com/ as a fun way to learn regex.

KamilCuk

Using sed

$ sed "s/Z/echo $(printf -- 'N%.0s' {1..10})/e" input_file
NNNNNNNNNN
HatLess

Using sed plus bash:

$ echo 'fooNxyzNbar' | sed "s/N/$(printf 'Z%.0s' {1..10})/g"
fooZZZZZZZZZZxyzZZZZZZZZZZbar
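How the printf part works: `{1..10}` is bash brace expansion, so printf receives ten arguments, and `%.0s` prints each one truncated to zero characters, leaving only the literal character from the format, once per argument. Brace expansion does not accept a variable as the count, so for a count held in a variable something like the sketch below may be handier. `repeat_char` is a hypothetical helper name, not a shell or sed built-in, and the input line is made up.

```shell
# repeat_char is a hypothetical helper, not a shell or sed built-in.
# Usage: repeat_char CHAR COUNT
repeat_char() {
    # Pad an empty string to COUNT spaces, then turn every space into CHAR.
    printf "%${2}s" '' | tr ' ' "$1"
}

ns=$(repeat_char N 10)
result=$(printf 'AAAAZAAAA\n' | sed "s/Z/$ns/g")
echo "$result"   # AAAANNNNNNNNNNAAAA
```

The generated string is pasted into the sed command by the shell, so a replacement containing `/`, `&` or `\` would need escaping first.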
Ed Morton
  • Interesting approach :) – HatLess May 29 '22 at 23:31
  • @HatLess yeah, I thought about just commenting on yours but since there's 3 differences (I have `g`, you have `echo` and `e`) and I didn't know the reasons for your `echo` and `e` decided to just post my own. – Ed Morton May 30 '22 at 12:14
  • The magic is in the `printf`. But, yes, I agree, my brain was in gear one yesterday (plus was in a rush) so just gave any solution rather than nothing. – HatLess May 30 '22 at 12:33
  • If you update your answer to be the same as mine I'll delete mine. – Ed Morton May 30 '22 at 12:36
  • Please don't, I quite like yours (and mine :)) – HatLess May 30 '22 at 12:38

This might work for you (GNU sed):

sed -E ':a;/Z/{G;s/\n/&/10;Ta;s/Z([^\n]*)(\n.*)/\2\1/;y/\n/N/;ta}' file

If a line contains the desired character, newlines are appended to the end of that line, one per loop iteration, until the required number (here 10) is reached.

Then the newlines are substituted for the match and, lastly, translated into the required character.

This is repeated for any further matches on the line.

N.B. The t command resets the internal substitution flag, which is necessary for the T command to work on the next pass.
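To watch the loop handle one, several, and no matches, the same command can be run on a few sample lines. This is a sketch that assumes GNU sed, as the answer does; the input lines are made up.

```shell
# Requires GNU sed (T is a GNU extension). Sample input is illustrative only.
out=$(printf 'AAAAZAAAA\nxZyZ\nno match\n' |
    sed -E ':a;/Z/{G;s/\n/&/10;Ta;s/Z([^\n]*)(\n.*)/\2\1/;y/\n/N/;ta}')
printf '%s\n' "$out"
# AAAANNNNNNNNNNAAAA
# xNNNNNNNNNNyNNNNNNNNNN
# no match
```

Changing the 10 in `s/\n/&/10` changes how many newlines are collected, and therefore how many Ns replace each Z.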

potong