Analysis with Time-Based Profiling

After you create a new project (or open an existing one), CodeAnalyst is ready to collect and analyze data. The currently selected profile configuration is shown in the toolbar. You may choose a different profile configuration from this list without affecting the other session settings. This is a fast way to profile the program from a different perspective while keeping the run control options the same. This section uses the Time-based profile configuration.

  1. Click the Start button in the toolbar to start data collection. You may also select Profile > Start from the Profile menu.

CodeAnalyst starts data collection and launches the application program that is specified in the session settings. Session status is displayed in the status bar in the lower left corner of the CodeAnalyst window. Session progress is displayed in the lower right corner.

After data collection is complete, CodeAnalyst processes the data and displays results in three tabbed panels—System Data, System Graph and Processes. A new session appears under “TBP Sessions” in the sessions area at the left-hand side of the CodeAnalyst window.

The System Data table displays a module-by-module breakdown of timer samples. This breakdown shows the distribution of execution time across modules that were active while CodeAnalyst was collecting data. CodeAnalyst is a system-wide profiling tool and it collects data on all active software components. System-wide profiling assists the analysis of multi-process applications, drivers, operating system modules, libraries, and other software components. Each sample represents 1 millisecond of execution time.
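Because each timer sample stands for 1 millisecond, sample counts convert directly into execution time and percentage share. The following is a minimal sketch of that arithmetic, not CodeAnalyst code; the module names and sample counts are hypothetical values chosen for illustration.

```python
MS_PER_SAMPLE = 1  # each timer sample represents 1 millisecond, as stated above


def module_breakdown(samples):
    """Return (module, samples, milliseconds, percent) rows, busiest module first."""
    total = sum(samples.values())
    rows = []
    for module, count in sorted(samples.items(), key=lambda kv: kv[1], reverse=True):
        percent = 100.0 * count / total if total else 0.0
        rows.append((module, count, count * MS_PER_SAMPLE, percent))
    return rows


# Hypothetical per-module sample counts (not real measurements).
example_samples = {"classic": 6000, "idle": 3500, "kernel": 500}

for module, count, ms, percent in module_breakdown(example_samples):
    print(f"{module:<10} {count:>6} samples {ms:>6} ms {percent:5.1f}%")
```

Sorting by sample count mirrors how the table is typically read: the modules at the top account for the largest share of execution time.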

The System Graph shows the module-by-module breakdown in a bar chart. A color key is displayed below the graph. Scroll down to see the color key, if necessary. The modules that consume the most execution time are the best candidates for optimization.

The System Tasks tab displays a task-by-task breakdown of timer samples. The system-idle process is clearly identified. The measurements for this example were taken on a dual-core system with an AMD Family 0xF processor (K8). The example program, classic, is single-threaded and fully utilizes the equivalent of only one core.
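The "equivalent of one core" observation can be checked with simple arithmetic on the sample counts. The sketch below assumes that samples are taken at the same rate on every core, idle time included, so that a process's share of all samples scales with the number of cores it kept busy. The function name and the sample counts are hypothetical.

```python
def core_equivalents(process_samples, total_samples, num_cores):
    """Estimate how many cores' worth of execution time a process consumed.

    Assumes every core contributes samples at the same rate (idle included).
    """
    if total_samples == 0:
        return 0.0
    return num_cores * process_samples / total_samples


# Hypothetical 10-second run on a dual-core system: about 20000 samples in total,
# of which the single-threaded program accounts for about half.
print(core_equivalents(10000, 20000, 2))  # prints 1.0
```

Under this assumption, a single-threaded program on a dual-core system cannot account for much more than half of all samples, however busy it is.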

Changing the View of Performance Data

The CodeAnalyst GUI offers one or more views of performance data. A view shows one or more kinds of performance data or computed performance measurements that are calculated from the performance data. Select the current view from the drop-down list that appears directly above the result tabs. The “Timer-based profile” view is offered for TBP data.

  1. Click the Manage button. A dialog box appears in which you can change aspects of the currently selected view. (The tutorial section on event-based profiling revisits View Management.)

One aspect of a view is the separation and aggregation of performance data. An aggregated sample count is the sum total of the samples taken across all CPUs (or processes or threads). Aggregated data is shown by default. Data may be separated by CPU, process, or thread using the check boxes in the view management dialog box.
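The difference between aggregated and separated counts can be illustrated with a small sketch. It does not reflect how CodeAnalyst stores its data; the (cpu, module) sample records and the function names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical raw timer samples as (cpu, module) pairs.
raw_samples = [(0, "kernel"), (1, "classic"), (1, "classic"), (0, "classic"), (1, "kernel")]


def aggregate(samples):
    """Aggregated view: one count per module, summed across all CPUs."""
    counts = defaultdict(int)
    for _cpu, module in samples:
        counts[module] += 1
    return dict(counts)


def separate_by_cpu(samples):
    """Separated view: one count per module per CPU."""
    counts = defaultdict(lambda: defaultdict(int))
    for cpu, module in samples:
        counts[module][cpu] += 1
    return {module: dict(per_cpu) for module, per_cpu in counts.items()}


print(aggregate(raw_samples))
print(separate_by_cpu(raw_samples))
```

For every module, the per-CPU counts in the separated view add up to the aggregated count, so separation changes only the level of detail and not the totals.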

  1. Select the Separate CPUs check box to enable the separation of samples by CPU.

When the Separate CPUs option is enabled for a view, CodeAnalyst displays sample data broken out for each core. The following screen shot shows sample data for each module by individual core. The application program, classic, executed on core 1 (C1).

  1. Double-click on a module in the System Data table to drill down into the data for the module.
  2. When viewing the System Graph, double-click on a bar in the bar chart to drill down into the corresponding module.
  3. When viewing the System Tasks, double-click on a process to drill down.

A new tab is created displaying a function-by-function breakdown of timer samples. The distribution of timer samples across all of the functions in the module is shown. The functions with the most timer samples take the most time to execute and are the best places to look for optimization opportunities.
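One way to decide where to start is to rank functions by sample count and track the cumulative share of samples they cover. This is a sketch of that idea rather than a CodeAnalyst feature; the function names and counts are hypothetical.

```python
def top_functions(function_samples, limit=3):
    """Return (function, samples, cumulative_percent) for the busiest functions."""
    total = sum(function_samples.values())
    if total == 0:
        return []
    ranked = sorted(function_samples.items(), key=lambda kv: kv[1], reverse=True)
    result = []
    running = 0
    for name, count in ranked[:limit]:
        running += count
        result.append((name, count, round(100.0 * running / total, 1)))
    return result


# Hypothetical per-function timer sample counts for one module.
example = {"multiply": 50, "init": 30, "print_result": 15, "main": 5}
for name, count, cumulative in top_functions(example, limit=2):
    print(f"{name:<14} {count:>4} samples, {cumulative}% cumulative")
```

If a handful of functions covers most of the samples, optimization effort spent elsewhere in the module is unlikely to pay off.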

  1. Double-click on a function to drill down into the data for that function. CodeAnalyst displays a new tab containing the source for the function along with the number of timer samples taken for each line of code. A code density chart is drawn at the top of the source panel. The number of samples for each CPU is broken out since the Separate CPUs option is still enabled.

Some users may choose to hide the code density chart in order to devote more screen area within a source panel to code.

  1. To hide the code density chart, select Windows > Show Density Charts from the Windows menu.

  1. Click on an expand (+) square to expand and display a region of source or assembly code.
  2. Click on a collapse (-) square to hide a region of source or assembly code.

Timer samples for each source line and assembly instruction are shown. The source panel shows the program regions that are taking the most execution time. These hot-spots are the best candidates for optimization.
