I'm trying to read a file line by line, apply some string manipulation to each line, and write the output to a file:

cat fileName | awk '{...}' >> fileOut

For each line, I want to print everything from some fixed index X onward (the same X for every line, excluding the terminating newline), then " : ", then the first column (which I could also extract by substring if needed). I have found examples that assign column values to variables, set them to zero, assign substrings to variables (with or without an explicit end index), and combine these with print or printf, but every example uses either substr or column references, never both.

Whenever I substitute one for the other in those examples, the content of the first column seems to replace the content of the substring. I have tried many ways around this, so I will show only the most recent attempts.

Say a line of input is "1234 abcd efgh IJKL mnop" and I want to print everything from index 10, then " : ", then column 1. My commands look like:

cat fileName | awk '{printf(“%s : %s/n”,substr($0,10),$1)}' >> fileOut
cat fileName | awk '{A=substr($0,10);B=$1;printf(“%s : %s/n”,A,B)}' >> fileOut
cat fileName | awk '{print substr($0,10)” : “$1}' >> fileOut

However, in every case so far, the output starts with " : ", followed by the contents of $1, followed by the substr result with the same number of characters missing from its front, e.g.

" : 1234L mnop", when I expect "efgh IJKL mnop : 1234"

Why does using a column overwrite the return of substr?

Jonathan Leffler
skenvy
    After fixing your `/n` syntax error and "smart" quotes, `echo '1234 abcd efgh IJKL mnop' | awk '{printf("%s : %s\n",substr($0,10),$1)}'` produces `efgh IJKL mnop : 1234` for me (with a space at the start, SO won't let me put that in). – Thomas Jun 19 '18 at 14:59
  • Why are you using `cat` just to stream a single file? – Toby Speight Jun 19 '18 at 15:00
  • The question was liberally sprinkled with 66 (`“`) and 99 (`”`) smart quotes instead of ASCII double quotes (`"`), leaving me to wonder whether Windows is involved, and part of the trouble is that you've got CRLF line endings in the file, and the 'end of each line' ends with a CR that moves the cursor back to the start of the line. I also observe that `cat file | awk` is a UUoC ([Useless Use of `cat`](https://stackoverflow.com/questions/11710552/useless-use-of-cat/11710888#11710888)) — `awk` is perfectly capable of reading files. Amongst other possibilities, try `od -c fileName | sed 3q`. – Jonathan Leffler Jun 19 '18 at 15:31
  • I feel emboldened that my first foray into the wide world of SO only received one down-vote! @JonathanLeffler this appears to have been the case; Thomas' replacement produced the right outcome. FYA, this is on the Windows Linux Subsystem using the Ubuntu app. I considered the CR, so I also attempted it terminating the substr at the 1st index in $0 of the CR, which had no effect. Oddly this question was written on an iPad, which must have also used smart quotes, as I've now replaced them with 0x22, and it's fine now! Thanks for the UUoC ref, although I enjoy the aesthetic syntax for now. – skenvy Jun 20 '18 at 02:07
  • Thanks @Thomas ; for completeness, as this is on the Windows Linux Subsystem, the command that formats is `cat fileIn | awk '{printf("%s : %s\r\n",substr($0,10),$1)}' >> fileOut`; surprised (although not entirely) that it came down to quotes. – skenvy Jun 20 '18 at 02:21
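
The carriage-return explanation from the comments can be sketched as follows. This is a minimal sketch, assuming the input really does have Windows CRLF line endings; the sample line is the one from the question, and the variable names `line`, `raw`, and `fixed` are illustrative only. It strips a trailing CR from each record with `sub(/\r$/, "")` before formatting, so the CR cannot end up in the middle of the output line and send the cursor back to column 0:

```shell
# Sample record from the question (hypothetical test data).
line='1234 abcd efgh IJKL mnop'

# Unmodified command on CRLF input: the CR stays attached to the end of $0,
# so it lands between the substring and " : " in the output.
raw=$(printf '%s\r\n' "$line" | awk '{printf("%s : %s\n", substr($0,10), $1)}')

# Same command, but with the trailing CR removed from each record first.
fixed=$(printf '%s\r\n' "$line" | awk '{sub(/\r$/, ""); printf("%s : %s\n", substr($0,10), $1)}')

# od -c makes the stray \r visible instead of letting the terminal act on it.
printf '%s\n' "$raw" | od -c | sed 3q
printf '%s\n' "$fixed"
```

The last line should print ` efgh IJKL mnop : 1234`, with the leading space that Thomas mentions, because index 10 of the sample line is the space before `efgh`. Reading the file directly (`awk '...' fileName`) works the same way and avoids the extra `cat`.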
