I would like to know how products like Quantify manage to measure the time spent in functions/methods without modifying the code. Does anyone know?
Do you have a web page describing how to start writing your own tool?
A non-intrusive profiler can compile the code into an executable form that is run by the profiler itself. This format need not match the actual execution format required by the OS. This is similar to Java's Virtual Machine.
The profiler uses a fundamental unit, such as clock cycles, to measure performance. After determining the number of cycles, the sum can be multiplied by a constant to give an approximate time. The value is approximate because the program isn't running directly on a processor but on a "virtual" processor.
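As a rough illustration of the cycles-to-time conversion, here is a minimal sketch. The constant is taken to be the clock period (one over the clock frequency); the function name and the 48 MHz figure are my own assumptions for the example, not anything a particular profiler uses.

```python
def cycles_to_seconds(cycle_count: int, clock_hz: float) -> float:
    """Convert a counted number of cycles to approximate seconds.

    Multiplying by the clock period (1 / clock_hz) is the "constant"
    in question; the result is only as good as the cycle model.
    """
    return cycle_count * (1.0 / clock_hz)


# Hypothetical numbers: 1,200,000 cycles counted on a virtual 48 MHz core.
elapsed = cycles_to_seconds(1_200_000, 48_000_000)
print(f"{elapsed * 1e3:.1f} ms")  # prints "25.0 ms"
```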
Other profilers modify the code to call "begin measurement" and "end measurement" where the profiling needs to occur (usually at the beginning and end of functions).
JTAG debuggers and other emulators call the measurement functions when specific addresses are reached.
From an embedded systems perspective, the most accurate performance measurement technique is to find an unused pin or test point, send a "begin" pulse to the pin and later an "end" pulse, and use an oscilloscope to measure the exact time difference. Advanced oscilloscopes can provide histograms of this time difference.
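To make the "begin measurement" / "end measurement" idea concrete, here is a small sketch that brackets every call of a function with two timer reads. It uses a Python decorator as a stand-in for what an instrumenting profiler would inject at build or load time; the names `instrumented`, `total_seconds`, `call_counts` and `busy_work` are invented for the example.

```python
import functools
import time
from collections import defaultdict

# Accumulated wall-clock seconds and call counts, keyed by function name.
total_seconds = defaultdict(float)
call_counts = defaultdict(int)


def instrumented(func):
    """Wrap func so every call is bracketed by begin/end measurements."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        begin = time.perf_counter()      # "begin measurement"
        try:
            return func(*args, **kwargs)
        finally:
            end = time.perf_counter()    # "end measurement"
            total_seconds[func.__qualname__] += end - begin
            call_counts[func.__qualname__] += 1
    return wrapper


@instrumented
def busy_work(n):
    """Stand-in for a real function being profiled."""
    return sum(i * i for i in range(n))


for _ in range(3):
    busy_work(100_000)

for name, seconds in total_seconds.items():
    print(f"{name}: {call_counts[name]} calls, {seconds:.6f} s total")
```

Note that the two timer reads themselves cost time, which is one reason instrumented measurements of very short functions tend to be inflated.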
You are asking a very big question. However, from what I have seen, profilers do modify the code in many cases. EQUATEC, for example, creates instrumented copies of your executables and libraries. Others create caches and copies of the code when the profiler is run. So they are not necessarily writing any instrumentation into the code you are working with, but they are instrumenting copies of the code or the IL.
My guess is that the CPU is put into "single step mode."
I can only guess. Usually profilers "instrument" your code, either at the build stage or at execution. They can put their measurement calls at the start and end of each of your functions, and do many other things.
I don't know about Quantify, but one frequent technique is stochastic sampling: interrupt every 100 microseconds or so and save the current instruction pointer. Then work out from the symbol table which function it was in, and total those up.
Most profilers do a lot more, and will also instrument the code in some ways, in order to provide additional information.
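The sampling idea can be tried out in a few lines. This is only a sketch under some assumptions: instead of a timer interrupt saving the raw instruction pointer, a background thread polls `sys._current_frames()` (a CPython-specific call) and records the name of the function the profiled thread is currently in. The 1 ms interval and the helper names (`profile`, `spin`) are my own choices.

```python
import collections
import sys
import threading
import time


def sample_function_names(target_thread_id, interval_s, stop_event, counts):
    """Periodically record which function the target thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval_s)


def profile(func, interval_s=0.001):
    """Run func() while a background thread samples the calling thread."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_function_names,
        args=(threading.get_ident(), interval_s, stop_event, counts),
        daemon=True,
    )
    sampler.start()
    try:
        func()
    finally:
        stop_event.set()
        sampler.join()
    return counts


def spin(duration_s=0.2):
    """Busy loop standing in for real work."""
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        pass


counts = profile(spin)
total = sum(counts.values())
for name, hits in counts.most_common():
    print(f"{name}: {100.0 * hits / total:.1f}% of {total} samples")
```

The profiled code is not modified at all; the price is that the result is statistical, so short-lived functions may never show up.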
The profiler may have the following attributes:
It's helpful to keep this in mind:
That sounds obvious, right? But notice:
So not just any old sampler or instrumenter will serve this purpose.
IMHO, what works best is something that collects samples of the stack, not just the program counter, and samples at random wall clock time, not just CPU time, and reports not just by function, but by line of code, the percent of samples containing that line. Zoom is such a profiler.
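To show what "percent of samples containing that line" means in practice, here is a minimal sketch of the bookkeeping. The stack samples are made-up data; each sample is a list of (function, line) pairs from the outermost to the innermost frame, and a line is counted at most once per sample so that recursion does not inflate it. This is not how Zoom or any specific profiler is implemented.

```python
from collections import Counter


def line_percentages(stack_samples):
    """Percent of samples in which each (function, line) appears at least once."""
    if not stack_samples:
        return {}
    hits = Counter()
    for stack in stack_samples:
        # Use a set so a line repeated by recursion counts once per sample.
        hits.update(set(stack))
    n = len(stack_samples)
    return {site: 100.0 * count / n for site, count in hits.items()}


# Hypothetical stack samples taken at random wall-clock times.
samples = [
    [("main", 10), ("load", 42), ("parse", 7)],
    [("main", 10), ("load", 42), ("read", 3)],
    [("main", 12), ("render", 88)],
    [("main", 10), ("load", 42), ("parse", 7)],
]

report = line_percentages(samples)
for site, pct in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{site[0]}:{site[1]}  {pct:.0f}%")
```

Because the whole stack is kept, a call site high up in the stack (here line 42 of `load`) gets credit for everything underneath it, which a program-counter-only sampler would miss.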
In other words a) Don't take a large number of small samples and mush them together into numbers. b) Do take a small number of large samples and understand what they are telling you.
As an extreme example, if as few as three stack samples are taken and a particular statement (or instruction) appears somewhere on two of them, what does that statement cost? Well, it's conceivable it's just a coincidence - a false positive, but on average, what removing it will save is (2+1)/(3+2) = 60% of overall run time. In terms of "bang for buck", it's hard to beat that.
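The (2+1)/(3+2) figure has the form (hits + 1)/(samples + 2), which is the average you get if you assume beforehand that every cost between 0% and 100% is equally likely (the rule of succession). A tiny sketch of that arithmetic, with a function name of my own choosing:

```python
from fractions import Fraction


def expected_cost_fraction(hits: int, samples: int) -> Fraction:
    """Average estimate of the fraction of run time spent under a statement.

    Assumes a uniform prior over the true fraction, giving
    (hits + 1) / (samples + 2).
    """
    if not 0 <= hits <= samples:
        raise ValueError("hits must be between 0 and samples")
    return Fraction(hits + 1, samples + 2)


print(expected_cost_fraction(2, 3))  # prints 3/5, i.e. 60%
```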
Here's a more detailed summary of the issues.
Another technique useful in the embedded domain, similar to that described by Thomas Matthews, is to hook up a frequency generator to a pin that will generate an NMI (i.e. a non-maskable interrupt). Then sample the program counter that was stored on the stack as part of the interrupt frame. This will give a very accurate statistical view of your application with minimal changes to the code.
Of course this is only applicable in specialised cases.
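Once the program-counter samples have been dumped from the target, they still have to be attributed to functions on the host. A minimal sketch of that step, assuming a symbol table of function start addresses sorted by address; the addresses, function names and sample values below are all invented for illustration.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x0800_0000, "reset_handler"),
    (0x0800_0100, "main"),
    (0x0800_0400, "adc_read"),
    (0x0800_0480, "fir_filter"),
]
STARTS = [addr for addr, _ in SYMBOLS]


def function_at(pc):
    """Return the function whose start address is the greatest one <= pc."""
    index = bisect.bisect_right(STARTS, pc) - 1
    return SYMBOLS[index][1] if index >= 0 else "<unknown>"


def histogram(pcs):
    """Count how many sampled program counters fall in each function."""
    return Counter(function_at(pc) for pc in pcs)


# Hypothetical program-counter values captured by the NMI handler.
pcs = [0x0800_0490, 0x0800_04A2, 0x0800_0410, 0x0800_0120, 0x0800_04F0]
for name, hits in histogram(pcs).most_common():
    print(f"{name}: {hits}/{len(pcs)} samples")
```

Without function sizes, any address past the last symbol is attributed to the last function, so a real tool would also want end addresses from the map file.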