1

I am using isdigit() function in c++, but i found it's slow, so i implemented my own is_digit(), see my code below:

#include<iostream>
#include<cctype>
#include<ctime>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    time_t t1=clock(),t2,t3;
    for(int i=0;i<1e9;i++)
        is_digit(c);
    t2=clock();
    for(int i=0;i<1e9;i++)
        isdigit(c);
    t3=clock();
    cout<<"is_digit:"<<(t2-t1)/CLOCKS_PER_SEC<<"\nisdigit:"<<(t3-t2)/CLOCKS_PER_SEC<<endl;

    return 0;
}

After running, is_digit() took only 1 second(1161ms), but isdigit() took 4 seconds(3674ms), I know that isdigit is implemented by bit operation, Shouldn't isdigit() be faster than is_digit()?


update1

I use MS VS2010 with default option, release version, how do i do to make isdigit() faster than is_digit() in VS?

update2

Thanks to all of you. When in release mode in VS, project will be optimized for speed default(-O2).

All in release mode.

VS2010: is_digit:1182(ms) isdigit:3724(ms)

VS2013: is_digit:0(ms) isdigit:3806(ms)

Codeblocks with g++(4.7.1) with -O3: is_digit:1275(ms) isdigit:1331(ms)

So here is the conclusion:

is_digit() is faster than isdigit() in VS but slower than isdigit() in g++.

And isdigit() in g++ is faster than isdigit() in VS.

So "VS sucks" in performance?

user1024
  • 902
  • 3
  • 12
  • 26
  • 1
    How did you compile the code? With `g++ -O0` the libc version is faster for me. – vanza Feb 10 '15 at 06:19
  • 8
    How do you compile? Also, [`std::isdigit`](http://en.cppreference.com/w/cpp/locale/isdigit) considers `locale`s – Ivan Aksamentov - Drop Feb 10 '15 at 06:19
  • It might be that isdigit() gets "function called" and your is_digit() gets inlined. – BitTickler Feb 10 '15 at 06:19
  • 1
    As vanza mentioned above, what are your compiler settings? Also, what happens when you make it a non-inline function? – Cloud Feb 10 '15 at 06:19
  • 2
    Findings for gcc: With -O0, your version is twice as slow as the libc version. With -O3, both calls are optimized out, because the results are not used. – Markus Mayr Feb 10 '15 at 06:24
  • on my bsd vm with clang and libc++ both are "4" with -O0 and with -O3 they are optimized away :) – BitTickler Feb 10 '15 at 06:27
  • I can't imagine those loops would even exist in a fully optimized binary. Also, `isdigit` does more work than your version does. – Ed S. Feb 10 '15 at 06:30
  • For me; with -O3 (g++) is_digit:1 isdigit:1 with -O0 is_digit:3 isdigit:1 – Doonyx Feb 10 '15 at 06:35
  • Well I compile my code with MS VS2010 with default option. BTY what do -O3 and -O0 mean? – user1024 Feb 10 '15 at 06:46
  • @Dogbert when I make it a non-inline function, `is_digit()` still faster than `isdigit()`? How can i add -O3 like option in VS? – user1024 Feb 10 '15 at 07:07
  • Most probably you are at debug mode and i does a lot of unneccasary checks and asserts because of that.But again 4 sec is too much :D – oknsnl Feb 10 '15 at 07:09
  • @oknsnl I am at release mode, but after i set -O2 flag in VS like -O3 in g++, `is_digit()` still beats `isdigit()`, and `isdigit()` took more time in VS than in g++. – user1024 Feb 10 '15 at 07:20
  • 3
    Comparing non-optimized builds is pointless. And you must use the result of the function or it'll be optimized out and the time needed to run the loop is 0. Besides, to measure performance correctly, use [`QueryPerformanceCounter`](http://stackoverflow.com/questions/1739259/how-to-use-queryperformancecounter) https://msdn.microsoft.com/en-us/library/windows/desktop/ms644904%28v=vs.85%29.aspx – phuclv Feb 10 '15 at 08:17
  • Minor note: `isdigit` receives an `int`, **not** `char` – phuclv Feb 10 '15 at 08:42

3 Answers3

2

Have a look on this code (works with g++) with -O3

#include<iostream>
#include<cctype>
#include<ctime>
#include <time.h>
#include <sys/time.h>
using namespace std;
static inline bool is_digit(char c)
{
    return c>='0'&&c<='9';
}
int main()
{
    char c='8';
    struct timeval tvSt, tvEn;
    time_t t1=clock(),t2,t3;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        is_digit(c);
    gettimeofday(&tvEn, 0);
    cout << "is_digit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;
    gettimeofday(&tvSt, 0);
    for(int i=0;i<1e9;i++)
        isdigit(c);
    gettimeofday(&tvEn, 0);
    cout << "isdigit:" << (tvEn.tv_sec - tvSt.tv_sec)*1000000 + (tvEn.tv_usec - tvSt.tv_usec) << " us"<< endl;

    return 0;
}

Results:

is_digit:1610771 us
isdigit:1055976 us

So, C++ implementation beats yours.
Normally, when you measure performance, it's not a good idea to do it with seconds. At lease consider microseconds level.

I'm not sure about VS. Please find out microsecond level clock and measure.

PS. Please refer https://msdn.microsoft.com/en-us/library/19z1t1wy.aspx for VS optimizations

Doonyx
  • 576
  • 2
  • 11
  • What the `-O3` option means? By the way, how can i do to make `isdigit()` faster than `is_digit()` in VS? – user1024 Feb 10 '15 at 06:54
  • @user1024 -O3 is an optimization flag used in g++. It allows a higher level of compile time optimization. There should be some optimization flags in VS as well. I don't know much about VS. However, if you your simple code beats VS implementation, that should be another reason to say "VS sucks" in terms of performance. I'm not sure. – Doonyx Feb 10 '15 at 07:00
  • @user1024 maybe https://msdn.microsoft.com/en-us/library/19z1t1wy.aspx explains them? – Doonyx Feb 10 '15 at 07:05
  • I see, but after i set -O2 flag in VS (like -O3 in g++), `is_digit()` still beats `isdigit()`, and `isdigit()` took more time in VS than in g++, "VS sucks" may be true. What's strange is that after i set -O2 flag in VS, 'is_digit()` took 0 us, is that because char c always be '8'? – user1024 Feb 10 '15 at 07:22
  • Not convinced this is measuring anything. Both loops are optimized out entirely by any decent optimizer. And [I'm extremely doubtful that you compiled your code with `-O3`](http://coliru.stacked-crooked.com/a/728485a5b4d7dba3). – T.C. Feb 10 '15 at 07:30
  • @T.C. I have no clue why you doubt using -O3? I have no idea how -O2 comes from your link. But I compiled in it my Linux box with -O3. You need a screen capture to prove this. On the other hand, if you are "Not convinced", please present your facts rather just saying "Not convinced". – Doonyx Feb 10 '15 at 07:36
  • @user1024 Can you measure the time in microsecond rather seconds? – Doonyx Feb 10 '15 at 07:38
  • Sure, [`-O3`](http://coliru.stacked-crooked.com/a/9bca6df1ec46d1f6), doesn't change a thing. Which version of GCC you are using? – T.C. Feb 10 '15 at 07:41
  • @T.C. Moreover, how could you say which optimization flags which I used by just looking at the code? – Doonyx Feb 10 '15 at 07:42
  • @T.C. gcc version 4.7.2 – Doonyx Feb 10 '15 at 07:44
  • 3
    Because I expected the optimizer to take out the loop entirely, and not waste millions of microseconds running the loop. Turns out that before GCC 4.8 [it optimizes away the function call but not the loop itself when your loop condition does a comparison with the floating point number `1e9`](http://goo.gl/Iw85mk), but does take out the loop too [when you use `1000000000` instead](http://goo.gl/l37WFA). Regardless, the `is_digit`/`isdigit` calls are still optimized away, so you are still not measuring anything meaningful. – T.C. Feb 10 '15 at 07:54
  • 1
    @SargiEran Thanks to you. `isdigit()` is faster than `is_digit()` in g++. See my update. I am going to use g++. – user1024 Feb 10 '15 at 08:19
  • it's not strange at all. Any optimized compilers will eliminate those non-sense loops away and the result should be 0. It's not that VS is stupid but because your code is a very bad performance checker – phuclv Feb 10 '15 at 08:29
2

In clang/llvm [the compiler of my choice], isdigit and is_digit will turn into exactly the same code, as it has optimisation for that specific library call to translate it into ((unsigned)(c-48) < 10u).

The return c>='0' && c <='9'; is also turned into c-48 > 10 by the optimisation (as a generic if x >= N && x <= M -> x-N > (M-N) conversion that the compiler does).

So, in theory, both the loops SHOULD turn into the same code (at least with a compiler that has this type of optimisation for isdigit - whether MSVC does or not, I can't say, as the source code is not available to the general public). I know that gcc has similar code to optimise library calls, but I don't have gcc source on my machine at present, and I can't be bothered to look it up [in my experience, it'll be a bit more difficult to read than the llvm code, anyways].

Code in llvm:

Value *LibCallSimplifier::optimizeIsDigit(CallInst *CI, IRBuilder<> &B) {
  Function *Callee = CI->getCalledFunction();
  FunctionType *FT = Callee->getFunctionType();
  // We require integer(i32)
  if (FT->getNumParams() != 1 || !FT->getReturnType()->isIntegerTy() ||
      !FT->getParamType(0)->isIntegerTy(32))
    return nullptr;

  // isdigit(c) -> (c-'0') <u 10
  Value *Op = CI->getArgOperand(0);
  Op = B.CreateSub(Op, B.getInt32('0'), "isdigittmp");
  Op = B.CreateICmpULT(Op, B.getInt32(10), "isdigit");
  return B.CreateZExt(Op, CI->getType());
}

For those not familiar with LLVM code: It first checks that the function call has the correct number of parameters and parameter types. If that fails, it returns NULL to indicate "I can't optimise this". Otherwise, it builds the chain of operations to do the if (c - '0' > 10) using unsigned comparison to cope with "negative" values [which in unsigned are huge values].

It would goes wrong if you do this:

bool isdigit(int x)
{
   return image_contains_finger(imagefiles[x]); 
}

[But then replacing library functions with your own version that does something will most likely have interesting effects in general!]

Mats Petersson
  • 123,518
  • 13
  • 127
  • 216
1

Your function is_digit can be implemented faster by:

#define ISDIGIT(X) (((uint32_t)X - '0') < 10u)

where you save one comparison. I think, that this is the normal approch in gcc, but in Microsoft Visual Studio i guess you had a localized version of isdigit() (which therefor takes a long time in checking locales).

PatrickF
  • 416
  • 2
  • 11