Performance Analysis Tools. The performance analysis tools have two objectives: (1) to help the programmer choose which parts of the application to accelerate with the FPGA fabric, and (2) to quantify the performance and power consumption of the application to determine whether the current implementation satisfies its requirements. Figure 17b shows the interfaces involved in gathering the performance and power profiles.

The key component is the Power Measurement Utility (PMU), which we introduced in Section 5.3. The PMU samples power consumption by directly accessing the compute platform’s power lines. In addition, it samples the Program Counter (PC) values of all CPUs over the JTAG interface (see Section 3.5.5). The PMU correlates the PC samples with the power consumption samples to non-intrusively produce highly detailed performance and power profiles.

To map the power and performance profiles to the program structure, we use Clang, the front-end for C-family languages in the LLVM [25] compiler infrastructure, to generate an LLVM Intermediate Representation (IR) for the application (see Figure 17a). We use the LLVM IR to insert additional debug information into the CPU binary so that we can later identify which PCs occur within each basic block. This enables us to pinpoint the source code construct (e.g., a function or a for loop) that causes power and performance issues. Although the LLVM IR is not formally a standard, its specification is freely available.

JTAG is also used to program the FPGA fabric. We leverage this capability to perform automated Design Space Exploration (DSE) of the HLS configuration space. To enable this, the PMU supports passing commands issued by a Xilinx JTAG programmer through to the compute platform. JTAG can also be used to connect to the compute platform directly when the PMU is not used.

We use the Universal Serial Bus (USB) standard to communicate between the host computer and the PMU. The main use of this channel is to stream the power and PC samples to the host as they are collected. Within the TULIPP project, we have defined the format of this data stream. This format could be standardised to ensure interoperability between different PMU-like devices. Finally, the compute platform uses USB to send terminal output to the host so that it can be shown to the developer.
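
To make the attribution step concrete, the following is a minimal sketch of how correlated PC and power samples could be mapped onto basic blocks on the host side. It is not the PMU's actual data format or the project's implementation. The function name `build_profile`, the tuple layouts, the single-CPU sample stream, and the fixed sampling period are all assumptions made for illustration. The basic-block address ranges stand in for the debug information that the text describes inserting via the LLVM IR.

```python
from bisect import bisect_right
from collections import defaultdict


def build_profile(blocks, samples, period_s):
    """Attribute (pc, power) samples to basic blocks.

    blocks:   list of (start_pc, end_pc, name) with end_pc exclusive and
              ranges non-overlapping (hypothetical layout).
    samples:  iterable of (pc, power_w) pairs, one per sampling tick
              (hypothetical layout, single CPU assumed).
    period_s: assumed fixed time between samples, in seconds.

    Returns {name: {"samples": n, "time_s": t, "energy_j": e}}.
    """
    blocks = sorted(blocks)
    starts = [start for start, _end, _name in blocks]
    profile = defaultdict(
        lambda: {"samples": 0, "time_s": 0.0, "energy_j": 0.0}
    )

    for pc, power_w in samples:
        # Find the last block whose start address is <= pc.
        i = bisect_right(starts, pc) - 1
        if i >= 0 and pc < blocks[i][1]:
            name = blocks[i][2]
        else:
            # PC falls outside every known block (e.g. library code).
            name = "<unknown>"

        entry = profile[name]
        entry["samples"] += 1
        entry["time_s"] += period_s
        # Energy estimate: instantaneous power times the sample period.
        entry["energy_j"] += power_w * period_s

    return dict(profile)


if __name__ == "__main__":
    # Made-up addresses and power values, for illustration only.
    blocks = [
        (0x1000, 0x1010, "main.entry"),
        (0x1010, 0x1040, "filter.loop"),
    ]
    samples = [(0x1004, 2.0), (0x1010, 3.0), (0x103C, 5.0), (0x2000, 1.0)]
    for name, stats in build_profile(blocks, samples, period_s=0.5).items():
        print(name, stats)
```

Under these assumptions, the sample count per block approximates where time is spent, and the accumulated energy indicates which blocks dominate power consumption, which are the two signals the text uses to decide what to accelerate.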
