I’ve only found ways to do this by reading the file’s contents line by line, so I’m genuinely curious whether there’s a way to read the file’s contents character by character and save them all in an array.
- In bash you can index or 'splice' a string using a character index: `echo "${foo:2:3}"` prints three characters, starting at (and including) character three (strings are zero-indexed, like arrays).
- This offers some of the functionality of arrays, but not all.
- You can use arithmetic expressions for the index, e.g. `${foo:2+i:3}`.
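For concreteness, a minimal sketch of both forms (the variable names `foo` and `i` are just placeholders, not from the answer):

```shell
#!/usr/bin/env bash
# Demo of substring expansion; foo and i are placeholder names.
foo="abcdefgh"
echo "${foo:2:3}"     # prints "cde": three characters starting at index 2
i=1
echo "${foo:2+i:3}"   # prints "def": the index expression is evaluated arithmetically
```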
You can make a real array of characters by using the character limit flags for read:
while IFS= read -d '' -rn1; do
chars+=("$REPLY")
done < file
For bash >= v4.1 (-N is not available before 4.1):
while IFS= read -rN1; do
chars+=("$REPLY")
done < file
- `-N` reads all characters; `-n` reads all characters except the delimiter (newline by default).
- Both of these will copy all ASCII characters, including newlines (and trailing newlines).
- You can omit the delimiter (i.e. newline characters) from your array by using the second version with `-n` instead of `-N` (which also means it's not restricted to v4.1 or higher).
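As a rough usage sketch of the first loop (the sample file name and contents are my own, not from the answer), a three-byte file yields three array elements, trailing newline included:

```shell
#!/usr/bin/env bash
# Hypothetical demo: build the chars array from a small sample file.
# 'sample.txt' is an assumed file name.
printf 'hi\n' > sample.txt

chars=()
while IFS= read -d '' -rn1; do
  chars+=("$REPLY")
done < sample.txt

echo "${#chars[@]}"          # prints 3: 'h', 'i', and the trailing newline
printf '%q\n' "${chars[2]}"  # prints $'\n'
```

Setting the delimiter to the empty string (`-d ''`) makes `read` use NUL as the delimiter, so the newline is kept as an ordinary character rather than being consumed.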
dan
- Note that under at least some versions of bash, this won't handle multibyte characters (e.g. non-ASCII characters in UTF-8 encoding) very well. – Gordon Davisson Feb 12 '22 at 06:26
- @GordonDavisson A good point to raise. Current bash handles UTF-8 fine AFAIK; I'm unsure when this was implemented. Substring expansion (string 'splice') and `read -N`/`-n` are based on characters, not bytes. In the array example, a single UTF-8 character would be copied to a single array element, which could be more than 1 byte long. Similarly for a substring, `${str:0:1}` may be more than 1 byte if `str` contains UTF-8. Tested on bash `5.1`. I also did some rough tests using https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html, and everything seemed to work as expected. – dan Feb 12 '22 at 13:05