
I’ve only found ways to do that by reading the file’s content line by line, so I’m genuinely curious whether there’s a way to read the file’s content character by character and save all the characters in an array.

1 Answer

  • In bash you can index or 'slice' a string using a character index: echo "${foo:2:3}" prints three characters, starting at (and including) index 2, i.e. the third character (strings are zero-indexed, like arrays).
  • This offers some of the functionality of arrays, but not all.
  • You can use arithmetic expressions for the index, e.g. ${foo:2+i:3} (see the sketch right after this list).
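
For illustration, a minimal sketch of substring expansion with a literal index and an arithmetic one (the names foo and i are just placeholders):

foo='abcdefgh'
echo "${foo:2:3}"      # prints "cde": 3 characters starting at index 2
i=1
echo "${foo:2+i:3}"    # prints "def": arithmetic is allowed in the offset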

You can make a real array of characters by using the character limit flags for read:

# -d '' sets the read delimiter to NUL, so newlines are stored like any other character
while IFS= read -d '' -rn1; do
    chars+=("$REPLY")
done < file

For bash >= v4.1 (-N is not available before 4.1):

# -N1 reads exactly one character; the delimiter has no special meaning
while IFS= read -rN1; do
    chars+=("$REPLY")
done < file
  • -N reads the requested number of characters and treats the delimiter as an ordinary character; -n stops early at the delimiter (a newline by default) and does not store it.
  • Both loops above copy every character, including newlines (and trailing newlines).
  • You can omit the delimiter (i.e. newline characters) from your array by using the second version with -n instead of -N (which also means it isn't restricted to v4.1 or higher); see the sketch after this list.
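
A minimal sketch of that -n variant. The guard is my own addition: when read hits the newline delimiter before reading a character, $REPLY is empty, so empty replies are skipped rather than stored as empty array elements:

while IFS= read -rn1; do
    [[ -n $REPLY ]] && chars+=("$REPLY")
done < file

Either way, you can inspect the resulting array with declare -p chars.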
dan
  • Note that under at least some versions of bash, this won't handle multibyte characters (e.g. non-ASCII characters in UTF-8 encoding) very well. – Gordon Davisson Feb 12 '22 at 06:26
  • @GordonDavisson A good point to raise. Current bash handles UTF-8 fine AFAIK; I'm unsure when this was implemented. Substring expansion (string 'slice') and `read -N`/`-n` are based on characters, not bytes. In the array example, a single UTF-8 character would be copied to a single array element, which could be more than 1 byte long. Similarly for a substring, `${str:0:1}` may be more than 1 byte if `str` contains UTF-8. Tested on bash `5.1`. I also did some rough tests using https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html, and everything seemed to work as expected. – dan Feb 12 '22 at 13:05
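
For anyone who wants a quick sanity check of the multibyte behaviour discussed in the comments, a rough sketch (assuming a UTF-8 locale; the string is arbitrary):

str='héllo'
echo "${str:1:1}"                   # prints "é" on a multibyte-aware bash: one character, not one byte
printf '%s' "${str:1:1}" | wc -c    # 2, the byte length of "é" in UTF-8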