Using Performance Bounds to Guide Code Compilation and Processor Design

Title: Using Performance Bounds to Guide Code Compilation and Processor Design
Author: Zhou, Huiyang
Advisors: Thomas M. Conte, Committee Chair
Gregory T. Byrd, Committee Member
Eric Rotenberg, Committee Member
S. Purushothaman Iyer, Committee Member
Abstract: Performance bounds represent the best achievable performance that can be delivered by target microarchitectures on specified workloads. Accurate performance bounds provide an efficient way to evaluate the performance potential of either code optimizations or architectural innovations. We advocate using performance bounds to guide code compilation. In this dissertation, we introduce a novel bound-guided approach to systematically regulate code-size-related instruction-level parallelism (ILP) optimizations, including tail duplication, loop unrolling, and if-conversion. Our approach is based on the notion of code size efficiency, which is defined as the ratio of ILP improvement to static code size increase. With this notion, we (1) develop a general approach to selectively perform optimizations to maximize the ILP improvement while minimizing the cost in code size, (2) define the optimal tradeoff between ILP improvement and code size overhead, and (3) develop a heuristic to achieve this optimal tradeoff. We extend our performance bounds as well as code size efficiency to perform code-size-aware compilation for real-time applications. We propose profile-independent performance bounds to reveal the criticality of each path in a task. Code optimizations can then focus on the critical paths (even at the cost of non-critical ones) to reduce the worst-case execution time, thereby improving the overall schedulability of the real-time system. For memory-intensive applications featuring heavy pointer chasing, we develop an analytical model based on performance bounds to evaluate memory latency hiding techniques. We model the performance potential of these techniques and use the analytical results to motivate an architectural innovation, called recovery-free value prediction, to enhance memory-level parallelism (MLP). The experimental results show that our proposed technique improves MLP significantly and achieves impressive speedups.
Date: 2003-07-10
Degree: PhD
Discipline: Computer Engineering
URI: http://www.lib.ncsu.edu/resolver/1840.16/4026
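
Note on the abstract: code size efficiency is defined there as the ratio of ILP improvement to static code size increase. The sketch below only illustrates that ratio and one possible way to rank candidate optimizations by it. The `Candidate` type, its fields, the units, the sample numbers, and the greedy selection under a size budget are all assumptions made for illustration; they are not taken from the dissertation, and the heuristic it develops may differ.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate optimization (not from the source)."""
    name: str
    ilp_gain: float      # assumed: estimated ILP improvement, e.g. from a bound
    size_increase: int   # assumed: static code size increase, in instructions


def code_size_efficiency(c: Candidate) -> float:
    """Ratio of ILP improvement to static code size increase."""
    if c.size_increase <= 0:
        # Assumption: an optimization that adds no code is treated as
        # infinitely efficient so it sorts first.
        return float("inf")
    return c.ilp_gain / c.size_increase


def select_by_efficiency(candidates, size_budget):
    """Greedy sketch: apply the most efficient candidates that fit the budget."""
    chosen = []
    remaining = size_budget
    for c in sorted(candidates, key=code_size_efficiency, reverse=True):
        if c.ilp_gain > 0 and c.size_increase <= remaining:
            chosen.append(c)
            remaining -= c.size_increase
    return chosen


# Made-up numbers, purely illustrative.
candidates = [
    Candidate("loop_unrolling", 0.30, 40),
    Candidate("tail_duplication", 0.10, 5),
    Candidate("if_conversion", 0.05, 20),
]
selected = select_by_efficiency(candidates, size_budget=50)
print([c.name for c in selected])
```

A greedy ranking is used here only because it is the simplest way to show the ratio in use; it does not claim to reach the optimal tradeoff the abstract describes.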


Files in this item

File: etd.pdf
Size: 460.5Kb
Format: PDF