2

I need to remove lines beginning with '#' from a text file, but ignore the first line because it is the header. How can I make grep skip the first line and remove any line beginning with # from the rest of the file?

cat sample.txt
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,xyz

cat sample.txt | grep -v "^\s*[#\;]\|^\s*$" > "out.txt"

But this removes the header too!

Cyrus
Aprilian8
  • Possible duplicate of [Omitting the first line from any Linux command output](https://stackoverflow.com/q/7318497/608639), [Print a file skipping first X lines in Bash](https://stackoverflow.com/q/604864/608639), etc. – jww Apr 21 '19 at 05:30
  • I don't think it's the same. I need to write the header to the output file too – Aprilian8 Apr 21 '19 at 05:36

6 Answers

6

With sed:

sed '2,${/^#/d}' sample.txt

From the second row (2) to the last row ($): search (/.../) for rows beginning (^) with # and delete (d) them. sed's default action is to print the current row, so every other row passes through unchanged.

Output:

#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
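
If the file may also contain indented comments, `;` comments, or blank lines (as the grep pattern in the question suggests), the same address range can carry more than one delete command. A minimal sketch, assuming a hypothetical helper name `strip_comments_keep_header` that reads stdin and writes stdout:

```shell
# strip_comments_keep_header is a hypothetical name, not part of the answer.
# Line 1 is outside the 2,$ range, so it is always printed unchanged.
strip_comments_keep_header() {
  sed -e '2,${' \
      -e '/^[[:space:]]*[#;]/d' \
      -e '/^[[:space:]]*$/d' \
      -e '}'
}
```

Usage would look something like `strip_comments_keep_header < sample.txt > out.txt`. Splitting the script across several `-e` options is only a portability choice; the commands are the same as in a one-line script.
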
Cyrus
1

Try a combination of head and grep like so:

head -n 1 sample.txt > out.txt && grep -v "^#" sample.txt >> out.txt

Result

#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz

Alternate method

grep "^#" sample.txt | head -n 1 > out.txt && grep -v "^#" sample.txt >> out.txt

That is: grep the lines beginning with #, keep only the first one, and write it to a file. Then grep all lines not starting with # and append those lines to the same output file.
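
Both commands above rely on the header itself starting with #. If it did not, the first command would write the header twice, because the second grep reads the whole file again. A hedged sketch that filters only the lines after the header, assuming a hypothetical helper `filter_after_header SRC DST`:

```shell
# filter_after_header is a hypothetical name used for illustration.
# $1 = input file, $2 = output file.
filter_after_header() {
  src=$1
  dst=$2
  # Copy the header line as-is, whatever it starts with.
  head -n 1 "$src" > "$dst" &&
    # Filter only line 2 onward, so the header is never seen by grep.
    tail -n +2 "$src" | grep -v '^#' >> "$dst"
}
```

For the file in the question this would be called as `filter_after_header sample.txt out.txt`.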

zedfoxus
1

This works in any awk and prints each line if its line number is 1 or if it doesn't start with #:

$ awk 'NR==1 || !/^#/' file
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz
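
The question mentions ignoring the "first lines", so the same condition can be generalized to a header of more than one line. A small sketch, assuming a hypothetical helper `keep_header_lines N` that reads stdin (N defaults to 1):

```shell
# keep_header_lines is a hypothetical name; n is the number of header
# lines to pass through untouched before comment filtering starts.
keep_header_lines() {
  awk -v n="${1:-1}" 'NR <= n || !/^#/'
}
```

For example, `keep_header_lines 2 < sample.txt` would keep the first two lines and drop every later line that starts with #.
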
Ed Morton
1

This might work for you (GNU sed):

sed '1b;/^#/d' file

Ignore the first line and delete any other lines that start with #.

potong
1

Applying an arbitrary command to all but the first line - a "header" - of a file or stream of tabular data is such a common task for me that I define a helper utility called body for it:

As a shell function (put this in your ~/.bashrc or equivalent):

body() {
  IFS= read -r header
  printf '%s\n' "$header"
  "$@"
}

Now:

$ cat sample.txt | body grep -v '^#'
#"EVENT",VERSION, NAME
1,2,xyz
1,2,abc
1,2,asd
1,2,ert
1,2,xyz
1,2,abc
1,2,xyz

Credit: adapted from Command line tools for doing data science, where it's one of many handy data tools you can put on your shell's PATH. I wish many of these could be canonicalized as standard UNIX tools.
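
Because body simply runs whatever command follows it on the rest of stdin, it should not be tied to grep. A short sketch with sort, chosen only as an illustration (the function is repeated so the snippet runs on its own):

```shell
# Same function as above, repeated to keep this snippet self-contained.
body() {
  IFS= read -r header        # consume the first line of stdin
  printf '%s\n' "$header"    # print it unchanged
  "$@"                       # run the given command on the remaining lines
}

# Sort the data rows while the header stays on top.
printf 'NAME\nxyz\nabc\nasd\n' | body sort
```

The header line is printed first and the remaining three rows come out sorted after it.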

raven-rock
0

Tried on GNU sed:

sed '0,/^#/n;/^#/d' sample.txt
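
The `0,/^#/` address form is a GNU extension, so this may not run on other sed implementations. If the goal is "keep the first line that starts with # and drop the later ones", one possible alternative is a small awk filter. This is a sketch under that assumption, with a hypothetical helper name `keep_first_comment` that reads stdin:

```shell
# keep_first_comment is a hypothetical name used for illustration.
# Non-comment lines always print; a comment line prints only while
# the counter "seen" is still zero, i.e. the first time one appears.
keep_first_comment() {
  awk '!/^#/ || !seen++'
}
```

Unlike the line-number based answers, this identifies the header by content rather than by position, so it would also keep a # header that is not on line 1.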