In-place profiling in c++11: Part 1: Timing
This tutorial shows how to implement a tiny robust benchmarking tool in C++11. This tool might be helpful for CS students as easy benchmark for algorithmic tasks and for real projects as a template for build-in profiler of mission critical methods.
- Getting started
- Timer class implementation
- Simple implementation of Logger
- Future improvements
C++11 standard provides two types of real-time clocks : old-fascinate std::clock_t and brand new std::chrono interface. I chose std::chrono::high_resolution_clock for this example.
The idea is to create a single object controlling the timer in the scope of benchmarked portion of code. For example, if we’ll create an object that starts timers at the beginning of the function it is destroyed automatically at the end of the function. So we can place the stats logging facilities into the destructor of our class.
Let’s start with the code. As a first step we’ll declare namespace BENCH11 containing benchmarking class Timer and logging class Logger.
The Timer class inteface:
As you can see, the Timer class does not have any public methods besides the constructor and the destructor. It has fields to keep starting values of the timers and encapsulates a Logger object. The constructor has a set of arguments to identify the place in the code where it was called.
Now we’re going to defile the interface of the Logger class:
Here are just one public method logMessage() and a public constructor.
Now it’s a time for a black magic of defines. Let’s say we do not need benchmarking in a release build and we do not want to pass all these boring arguments to the constructor.
So let’s make a couple of #defines :
If the macro BENCH11 was defined the macro BENCH11_TIMER is creating an object of BENCH11::Timer class with appropriate source code references. If the macro BENCH11 was not defined (in a release build for example) the macro BENCH11_TIMER is empty. With these defines it’s enough to place BENCH11_TIMER macro to the very beginning of the profiled method or scope. See a usage example below:
Here are two nested benchmarks inside the worker() method.
Timer class implementation
Implementation of Timer is pretty straightforward. In the constructor we store the current count of a system timer and initialize the Logger. In the destructor we calculate the difference between the moments of construction and destruction and pass a message to the Logger. I’d place these routines right into the declaration section.
Simple implementation of Logger
Our simple logger will write messages to standard error stream. There are two important thinks to do. The first is a tagging messages with the location in the source code. And the second issue we have to handle is I/O delays. We’ll do output asynchronously using C++ threads. I’d place the implementation right into the declaration again:
Here we have a new field string TAG and two helper methods. TAG is filled in in the constructor and is utilized as a prefix to the message by the logMessage() method. logMessage() calls the static helper method logMessageAsync(). This method creates a detached thread for the static worker method logMessageSync() which performs real I/O operations.
This example might be build using C++11 or higher standard. Successful linkage require threading support. For gcc compiler command line will look like that:
There are a great variety of possible improvements according to the usage purposes. As far as the presented realization relays on the wall but not on the processor’s it is not suitable for benchmarking methods with I/O calls and might give wrong results if there were high CPU load from the background processes. Adding std::clock() timer will help to handle these issues.
There are a lot of ways to implement more reliable Logger. For example, it could write to a file or network socket. Also the primitive concurrency model implemented in this example might mix up output from different Logger instances. Therefore it would be better to implement a single output thread with a message queue.
Despite of the facts above this example is a good starting point.
You can find the full updated sources of this example and much more at my GitHub page.
Stay tuned for the Part 2: Memory and Leaks