I have a file exported from some system, and when I open it in vi there's a green <92> where an ' ought to be.

It messed up my bash script: after I grep'd the file, all that was left was a line that reads Binary file (standard input) matches.

When I run file against it, it reports the file as Non-ISO extended-ASCII text, with CRLF line terminators.

What is that green <92> and why is it binary and messing up grep?

leeand00

1 Answer

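This is a Windows-1252 (also called CP1252) quote character, which means the file is not being read with the correct encoding. In that encoding, byte 0x92 is the right single quotation mark, Unicode U+2019; its counterpart, byte 0x91, is the left single quotation mark, U+2018.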
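As a quick check of that mapping, here is a minimal Python sketch. The sample bytes are made up for illustration and are not taken from the asker's file. It also shows that the same byte is not valid UTF-8, which is a likely reason grep reports the file as binary when the locale is UTF-8.

```python
# Hypothetical sample: the word "don't" with a CP1252 curly apostrophe.
raw = b"don\x92t"

# Decoded as Windows-1252, byte 0x92 becomes U+2019.
text = raw.decode("cp1252")
print(repr(text))

# Decoded as UTF-8, a lone 0x92 byte is an encoding error.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)
```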

In ISO-8859-1 and most related encodings, the bytes between 0x80 and 0x9F are treated as control codes (the "C1 controls") rather than printable characters. Microsoft wanted to add a few more characters to ISO-8859-1 (probably to give a "better experience" in MS Word and similar programs, where those characters are used a lot), so Windows-1252 puts printable characters in that range. At the time, word processors and screen rendering could not handle a font with more than 256 characters, so having all the characters crammed into one encoding was useful.
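The difference can be seen by decoding the same byte both ways and asking Python's unicodedata module for the Unicode category of the result. This is only an illustrative sketch; Python's "latin-1" codec is used here as a stand-in for ISO-8859-1.

```python
import unicodedata

byte = b"\x92"

# ISO-8859-1 (Python's "latin-1") maps the byte straight to U+0092.
as_latin1 = byte.decode("latin-1")
# Windows-1252 maps the same byte to U+2019.
as_cp1252 = byte.decode("cp1252")

# "Cc" is the control-character category; "Pf" is final-quote punctuation.
latin1_category = unicodedata.category(as_latin1)
cp1252_category = unicodedata.category(as_cp1252)
print(latin1_category, cp1252_category)
```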

On my end, I often have to change those into code point 0x27 ('), i.e. the apostrophe character.
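One way to do that replacement, sketched in Python under the assumption that the whole input really is Windows-1252. The function name straighten_quotes and the sample bytes are hypothetical, chosen just for this example.

```python
# Map both curly single quotes to the plain apostrophe (code point 0x27).
QUOTE_MAP = str.maketrans({"\u2018": "'", "\u2019": "'"})

def straighten_quotes(raw: bytes) -> str:
    """Decode Windows-1252 bytes and replace curly single quotes with '."""
    # Note: Python's cp1252 codec raises UnicodeDecodeError on the few
    # bytes that Windows-1252 leaves undefined (for example 0x81).
    return raw.decode("cp1252").translate(QUOTE_MAP)

# Hypothetical sample input containing bytes 0x91 and 0x92.
cleaned = straighten_quotes(b"it\x92s \x91quoted\x92")
print(cleaned)
```

The result is plain ASCII here, so it can be written back out in any ASCII-compatible encoding, UTF-8 included.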

Alexis Wilke