Length Adaptive Processors: A Solution for the Energy/Performance Dilemma in Embedded Systems

Abstract

Embedded handheld devices are the predominant computing platform today. These devices are required to perform complex tasks yet run on batteries. Some architects use ASICs to combat this energy-performance dilemma. Although ASICs solve this problem efficiently, they can cause code-compatibility problems for future generations. Thus, a general-purpose solution is necessary. Furthermore, no single processor configuration provides the best energy-performance solution over a diverse set of applications, or even throughout the life of a single application. As a result, the processor needs to adapt to the specific workload behavior. Code generation and code compatibility are the biggest challenges in such adaptable processors. At the same time, embedded systems have a fixed energy source, such as a 1-volt battery. Thus, the energy consumption of these devices must be predicted with utmost accuracy; a gross miscalculation can make the system cumbersome for the user. In this work, we provide a new paradigm of embedded processors called Dynamic Length-Adaptive Processors, which combine the flexibility of a general-purpose processor with the specialization of an ASIC. We create such a processor, called the Clustered Length-Adaptive Word Processor (CLAW), that is able to dynamically modify its issue width with an overhead of one VLIW instruction. This processor is designed in Verilog, synthesized, DRC-checked, and placed and routed. Its energy and performance values are reported using industrial-strength transistor-level analysis tools to dispel several myths about factors that were thought to dominate in embedded systems. To compile benchmarks for the CLAW processor, we provide the necessary software tools that help produce code optimized for performance improvement and energy reduction, and we discuss some of the code-generation procedures and challenges.
Second, we study the code-generation patterns of the compiler by sampling a representative application, and we design an ISA opcode configuration that helps minimize the energy needed to decode instructions with no performance loss. We discover that a well-designed opcode configuration reduces energy not only in the decoder but also in other units, such as the fetch and exception units. Moreover, a sizable energy reduction can be achieved across a diverse set of applications. Next, we try to reduce the energy consumption and power dissipation of register reads and register writes by using popular common-value register-sharing techniques that are used to enhance performance. We provide a power model for these structures based on the value localities of the application. Finally, we perform a case study using the IEEE 802.11n PHY Transmitter and Decoder and identify its energy-hungry units. We then apply our techniques and show that CLAW is a solution for such hybrid, complex algorithms, providing high performance while reducing total energy.

Keywords

Energy Reduction Low-Power Embedded Processors Len

Degree

PhD

Discipline

Computer Engineering
