I’ve only found ways to do this by reading the file’s contents line by line, so I’m genuinely curious whether there’s a way to read the file’s contents character by character and save them all in an array.
- In bash you can index or 'splice' a string using a character index: `echo "${foo:2:3}"` prints three characters, starting at (and including) character three (strings are zero-indexed, like arrays).
- This offers some of the functionality of arrays, but not all.
- You can use arithmetic expressions for the index, e.g. `${foo:2+i:3}`.
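For concreteness, a minimal sketch of both forms (the variable names `foo` and `i` are just placeholders, not from the answer):

```shell
#!/usr/bin/env bash
# Demo of substring expansion; foo and i are placeholder names.
foo="abcdefgh"
echo "${foo:2:3}"     # prints "cde": three characters starting at index 2
i=1
echo "${foo:2+i:3}"   # prints "def": the index expression is evaluated arithmetically
```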
You can make a real array of characters by using the character limit flags for read:
while IFS= read -d '' -rn1; do
chars+=("$REPLY")
done < file
For bash >= v4.1 (-N is not available before 4.1):
while IFS= read -rN1; do
chars+=("$REPLY")
done < file
- `-N` reads all characters; `-n` reads all characters except the delimiter (newline by default).
- Both of these will copy all ASCII characters, including newlines (and trailing newlines).
- You can omit the delimiter (i.e. newline characters) from your array by using the second version with `-n` instead of `-N` (which also means it's not restricted to v4.1 or higher).
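As a rough usage sketch of the first loop (the sample file name and contents are my own, not from the answer), a three-byte file yields three array elements, trailing newline included:

```shell
#!/usr/bin/env bash
# Hypothetical demo: build the chars array from a small sample file.
# 'sample.txt' is an assumed file name.
printf 'hi\n' > sample.txt

chars=()
while IFS= read -d '' -rn1; do
  chars+=("$REPLY")
done < sample.txt

echo "${#chars[@]}"          # prints 3: 'h', 'i', and the trailing newline
printf '%q\n' "${chars[2]}"  # prints $'\n'
```

Setting the delimiter to the empty string (`-d ''`) makes `read` use NUL as the delimiter, so the newline is kept as an ordinary character rather than being consumed.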
dan
- Note that under at least some versions of bash, this won't handle multibyte characters (e.g. non-ASCII characters in UTF-8 encoding) very well. – Gordon Davisson Feb 12 '22 at 06:26
- @GordonDavisson A good point to raise. Current bash handles UTF-8 fine AFAIK; I'm unsure when this was implemented. Substring expansion (string 'splice') and `read -N`/`-n` are based on characters, not bytes. In the array example, a single UTF-8 character would be copied to a single array element, which could be more than 1 byte long. Similarly for a substring, `${str:0:1}` may be more than 1 byte if `str` contains UTF-8. Tested on bash `5.1`. I also did some rough tests using https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html, and everything seemed to work as expected. – dan Feb 12 '22 at 13:05