Identifying Device Performance Issues
Introduction
This document details the process required to use valgrind to identify performance issues on a device.
Process
The process for using valgrind is as follows:
- Build Qtopia and Qt/E for the desktop in release mode, adding the options -g and -O2 to the RELEASE settings in the specification file. The specification file(s) to modify depend on the Qtopia release being used:
- Qtopia Version 2.x modify tmake.conf and qmake.conf
- Qtopia Version 4.x modify qmake.conf
- Start qvfb
- Set the following environment variables to start the Qtopia program:
- QPEDIR
- PATH
- LD_LIBRARY_PATH
- Run using:
callgrind --dump-every-bb=500000 --lowlat-syscalls=yes worldtime -qws
- Press Ctrl-C to exit the application once it has started.
- View the results by running:
kcachegrind &
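The environment setup and callgrind invocation from the steps above could be scripted roughly as follows. This is a minimal sketch, not part of Qtopia: the helper names (build_qtopia_env, callgrind_command), the install prefix /opt/Qtopia, and the assumption that binaries live in $QPEDIR/bin and libraries in $QPEDIR/lib are all illustrative and should be adjusted to the actual build.

```python
import os


def _prepend(entry, existing):
    """Put entry in front of an existing search path, if there is one."""
    return entry if not existing else entry + os.pathsep + existing


def build_qtopia_env(qpedir, base_env=None):
    """Return a copy of the environment with QPEDIR, PATH and LD_LIBRARY_PATH set.

    Assumption: binaries are in <qpedir>/bin and libraries in <qpedir>/lib.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["QPEDIR"] = qpedir
    env["PATH"] = _prepend(os.path.join(qpedir, "bin"), env.get("PATH", ""))
    env["LD_LIBRARY_PATH"] = _prepend(
        os.path.join(qpedir, "lib"), env.get("LD_LIBRARY_PATH", "")
    )
    return env


def callgrind_command(program):
    """Build the callgrind command line using the options from the Run step."""
    return [
        "callgrind",
        "--dump-every-bb=500000",
        "--lowlat-syscalls=yes",
        program,
        "-qws",
    ]


if __name__ == "__main__":
    # Hypothetical install prefix; replace with the real QPEDIR.
    env = build_qtopia_env("/opt/Qtopia")
    # Only print the command here; nothing is launched.
    print(" ".join(callgrind_command("worldtime")))
```

To actually start the program, the two results can be passed to subprocess.run(callgrind_command("worldtime"), env=env) once qvfb is running.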
Checking Memory Usage
valgrind can provide memory usage information as follows:
- To print total memory usage to standard output, run:
valgrind --tool=addrcheck --leak-check=yes qpe
- For a breakdown of memory usage in HTML, together with a graph in PostScript format, run:
valgrind --tool=massif --format=html qpe
- To check for pthread race conditions and print the results to standard output, run:
valgrind --tool=helgrind qpe
Things to Note
- Ensure that View->Cycle Detection is off in kcachegrind, otherwise data will not display correctly.
- When estimating performance on the device, use the following as a guide:
- A 200MHz ARM device executes approximately 10,000,000 instructions every second.
For example, the language program takes 6 seconds to start on the device, and callgrind reports a total of 60,000,000 instructions for that startup.
- Run callgrind on the desktop, identify the problem and resolve it.
- Run callgrind again. Suppose the total number of instructions has been reduced to 40,000,000, a saving of 20,000,000.
- The estimated improvement on the 200MHz device is then:
20,000,000 instructions / 10,000,000 instructions per second = 2 seconds
Note: How was the estimate of 10M instructions/sec determined?
- It is known from callgrind that 60M instructions are executed on startup.
- It is known that the time taken to start was 6 seconds.
Therefore, 60M / 6s = 10M instructions/second.
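The arithmetic behind the guide figures can be written out as a short calculation. The sketch below uses only the numbers from the example above; the function names are illustrative and not part of valgrind or Qtopia.

```python
def instructions_per_second(total_instructions, startup_seconds):
    """Derive the device's instruction rate from one measured startup."""
    return total_instructions / startup_seconds


def estimated_saving_seconds(before, after, rate):
    """Estimate the time saved on the device from a drop in instruction count."""
    return (before - after) / rate


# Figures from the example: 60M instructions and a 6 second startup.
rate = instructions_per_second(60_000_000, 6.0)

# After the fix, callgrind reports 40M instructions instead of 60M.
saving = estimated_saving_seconds(60_000_000, 40_000_000, rate)

print(f"rate: {rate:,.0f} instructions/second")
print(f"estimated saving: {saving} seconds")
```

The rate is only as good as the single startup measurement it was derived from, so it is best treated as a rough conversion factor.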
It is difficult to obtain accurate values because other factors affect the results, so when measuring startup times on the device, repeat the measurement a number of times and take the average value.
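Averaging repeated measurements can be done with a few lines of Python. This is a minimal sketch: the sample timings are made-up values for illustration, not measurements from a real device, and average_startup_time is a hypothetical helper name.

```python
from statistics import mean


def average_startup_time(samples):
    """Return the mean of several startup times measured in seconds."""
    if not samples:
        raise ValueError("at least one measurement is required")
    return mean(samples)


# Hypothetical stopwatch readings from repeated startups on the device.
timings = [6.25, 5.75, 6.0]
average = average_startup_time(timings)

print(f"average startup time over {len(timings)} runs: {average:.2f}s")
```

The averaged time, rather than a single reading, is then the value to divide the callgrind instruction count by.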