10

I feel there must be a better way to count the occurrences of a character in a string than writing a sub in Perl or a shell script in Linux.

#!/usr/bin/perl -w
use strict;
return 1 unless $0 eq __FILE__;
main() if $0 eq __FILE__;
sub main{
    my $str = "ru8xysyyyyyyysss6s5s";
    my $char = "y";
    my $count = count_occurrence($str, $char);
    print "count<$count> of <$char> in <$str>\n";
}
sub count_occurrence{
    my ($str, $char) = @_;
    my $len = length($str);
    $str =~ s/\Q$char\E//g;    # \Q...\E matches $char literally, even if it is a regex metacharacter
    my $len_new = length($str);
    my $count = $len - $len_new;
    return $count;
}
Robert
Gang

4 Answers

15

If the character is constant, the following is best:

my $count = $str =~ tr/y//;

If the character is variable, I'd use the following:

my $count = length( $str =~ s/[^\Q$char\E]//rg );

I'd only use the following if I wanted compatibility with versions of Perl older than 5.14 (as it is slower and uses more memory):

my $count = () = $str =~ /\Q$char/g;

The following uses no additional memory, but might be a bit slow:

my $count = 0;
++$count while $str =~ /\Q$char/g;
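
As a quick sanity check, the four approaches above can be run side by side. This is only a sketch using the sample string from the question; the variable names `$by_tr`, `$by_subst`, `$by_list` and `$by_loop` are made up for illustration, and the `use 5.014` line is there because the `/r` flag is assumed to be available.

```perl
use strict;
use warnings;
use 5.014;    # s///r requires Perl 5.14 or later

my $str  = "ru8xysyyyyyyysss6s5s";    # sample string from the question
my $char = "y";

# Constant character: tr/// with an empty replacement only counts.
my $by_tr = $str =~ tr/y//;

# Variable character: delete everything else from a copy, measure what is left.
my $by_subst = length( $str =~ s/[^\Q$char\E]//rg );

# Pre-5.14 idiom: count the elements of the list of matches.
my $by_list = () = $str =~ /\Q$char/g;

# No list of matches is built: step through the matches one at a time.
my $by_loop = 0;
++$by_loop while $str =~ /\Q$char/g;

print "$by_tr $by_subst $by_list $by_loop\n";    # prints "8 8 8 8"
```

Note that the `tr///` and `s/[^...]//rg` variants only make sense for a single character, while the two regex-match variants would also count longer substrings.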
ThisSuitIsBlackNot
ikegami
14

Counting the occurrences of a character in a string takes one line in Perl (compared with your 4 lines). There is no need for a sub, although there is nothing wrong with encapsulating functionality in one. From perlfaq4, "How can I count the number of occurrences of a substring within a string?":

use warnings;
use strict;

my $str = "ru8xysyyyyyyysss6s5s";
my $char = "y";
my $count = () = $str =~ /\Q$char/g;
print "count<$count> of <$char> in <$str>\n";
ikegami
toolic
4

In a beautiful* Bash/Coreutils/Grep one-liner:

$ str=ru8xysyyyyyyysss6s5s
$ char=y
$ fold -w 1 <<< "$str" | grep -c "$char"
8

Or maybe

$ grep -o "$char" <<< "$str" | wc -l
8

The first one works only if the substring is just one character long; the second one works only if the substrings are non-overlapping.
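
The same non-overlapping caveat applies to the Perl `() = ... /g` idiom from the other answers. One way to count overlapping occurrences, sketched here in Perl, is a zero-width lookahead, which lets the regex engine advance one character at a time. The sample values `aaaa` and `aa` are made up for illustration.

```perl
use strict;
use warnings;

my $str = "aaaa";    # illustrative input, not from the question
my $sub = "aa";

# A plain global match consumes each hit, so hits cannot overlap.
my $non_overlapping = () = $str =~ /\Q$sub\E/g;

# A lookahead matches the empty string wherever $sub starts,
# so occurrences that overlap are each counted.
my $overlapping = () = $str =~ /(?=\Q$sub\E)/g;

print "non-overlapping: $non_overlapping, overlapping: $overlapping\n";
# prints "non-overlapping: 2, overlapping: 3"
```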

* Not really.

Benjamin W.
  • Both work well; I added them to my toolbox. First time I've heard of the fold command, appreciated! – Gang Dec 23 '15 at 15:23
  • @gliang: It's a bit like the `split //` of Bash; I only discovered it recently as well. Glad you find them useful! – Benjamin W. Dec 23 '15 at 15:25
2

toolic has given a correct answer, but you might consider not hardcoding your values to make the program reusable.

use strict;
use warnings;

die "Usage: $0 <search> [<text>|<file>]\n" if @ARGV < 1;
my $search = shift;                    # the string you are looking for
my $str;                               # the input string
if (@ARGV && -e $ARGV[0] || !@ARGV) {  # if str is file, or there is no str
    local $/;                          # slurp input
    $str = <>;                         # use diamond operator
} else {                               # else just use the string
    $str = shift;
}
my $count = () = $str =~ /\Q$search\E/gms;
print "Found $count of '$search' in '$str'\n";

This will allow you to use the program to count the occurrences of a character, or a string, inside a string, a file, or standard input. For example:

count.pl needles haystack.txt
some_process | count.pl foo
count.pl x xyzzy
Community
TLP
  • What is the gms for? g for all occurrences, m for match, s for ? – Gang Dec 23 '15 at 15:34
  • @gliang: "Treat string as single line", i.e., `.` matches newlines, which it normally doesn't; see http://perldoc.perl.org/perlre.html#Modifiers – Benjamin W. Dec 23 '15 at 15:51
  • Benjamin, thanks, it is not what I thought it was. I will read the documentation again. – Gang Dec 23 '15 at 16:08
  • @gliang No, `m` is multiline, meaning `^` and `$` (and equivalents) also match at the start and end of each line, and `s`, like Benjamin said, makes `.` also match newline. They are not strictly needed in your example, and `\Q` excludes the use of the `.` metacharacter anyway. But since we do not know what your input will be, this is a safer way. – TLP Dec 23 '15 at 17:01
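
The effect of the two modifiers discussed in these comments can be seen in a small sketch. The two-line sample text and the variable names are made up for illustration; they are not from the answer.

```perl
use strict;
use warnings;

my $text = "foo\nbar\n";    # illustrative two-line input

# Without /m, ^ and $ anchor only to the whole string, so no single
# line can match; with /m each line is anchored separately.
my $anchored_plain = () = $text =~ /^\w+$/g;
my $anchored_m     = () = $text =~ /^\w+$/gm;

# Without /s, "." refuses to match the newline between foo and bar.
my $dot_plain = $text =~ /foo.bar/  ? 1 : 0;
my $dot_s     = $text =~ /foo.bar/s ? 1 : 0;

print "plain: $anchored_plain, /m: $anchored_m\n";    # prints "plain: 0, /m: 2"
print "plain: $dot_plain, /s: $dot_s\n";              # prints "plain: 0, /s: 1"
```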