4

How can I count how many characters appear on each line of a file, minus those from a specific list? Here is an example file:

你好吗?
我很好,你呢?
我也很好。

I want to exclude any occurrences of "?", "," and "。" from the count. The output would look like this:

3
5
4
gniourf_gniourf
Village

5 Answers

3

A pure bash solution:

while IFS= read -r l; do
    l=${l//[?,。]/}
    echo "${#l}"
done < file
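The loop above can be wrapped in a small function so the excluded characters are passed as an argument. This is only a sketch: the name count_excluding is hypothetical, and it assumes the characters are safe to place inside a bracket expression (characters such as "]", "^" or "-" would need extra care). Counts are per character only when bash runs in a UTF-8 locale.

```shell
#!/usr/bin/env bash
# count_excluding CHARS (hypothetical helper, not part of the original answer)
# Reads lines from stdin and prints, for each line, its length after
# deleting every character listed in CHARS.
count_excluding() {
    local chars=$1 line
    # The "|| [[ -n $line ]]" part also handles a last line with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line//[$chars]/}      # delete all excluded characters
        printf '%s\n' "${#line}"     # length of what is left
    done
}

# Usage with the characters from the question:
#   count_excluding '?,。' < file
```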
gniourf_gniourf
2

Try

sed 's/[,。?]//g' file | perl -C -nle 'print length'

The sed part removes unwanted characters, and the perl part counts the remaining characters.

Hari Menon
2

One way is to remove those characters from the stream and then use wc -m. Here is an example that uses perl to remove the characters:

perl -pe 's/\?|,|。//g' file.txt |
  while IFS= read -r line; do
    # '%s' keeps any % in the line from being read as a format directive
    printf '%s' "$line" | wc -m
  done
jordanm
2

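Or, more simply, if a single total for the whole file is enough (wc -m also counts the newline characters):

tr -d '?,。' < file | wc -m

Note that a tr that works on bytes rather than characters (GNU tr, for example) can mangle multibyte input such as 。.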
thom
1

A simple solution, similar to the sed-based one, but using awk:

sed 's/[?,。]//g' file | awk '{ print length($0) }'
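As a variation on the pipeline above, the deletion can be done inside awk with gsub, so that sed is not needed. This is a sketch, not a tested replacement: the wrapper name count_with_awk is hypothetical, and whether length counts characters or bytes depends on the awk implementation and the locale (gawk in a UTF-8 locale counts characters; some other awks count bytes).

```shell
# count_with_awk [FILE...] (hypothetical wrapper name)
# Deletes "?", "," and "。" from each line, then prints the length
# of what remains. Reads stdin when no file is given.
count_with_awk() {
    awk '{ gsub(/[?,。]/, ""); print length($0) }' "$@"
}

# Usage:
#   count_with_awk file
```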
Radu Rădeanu