Hardware Security
Professor Mark Karpovsky, Associate Professor Alexander Taubin
Cryptographic algorithms are designed so that, by
observing only the inputs and outputs of the algorithm, it
is computationally infeasible to break the cipher or, equivalently,
to determine the secret key used in encryption and decryption.
Thus, the algorithm itself does not leak enough useful information
during its operation to compromise its security. However, when
a physical implementation of the algorithm is considered, additional
information such as power consumption, behavior under
internal faults, and timing of the circuit implementing the
algorithm can provide enough information to compromise the
security of the system. This type of data can now be readily
gathered, since cryptographic hardware (smart cards, SIM cards,
USB tokens) is accessible to anyone. Attacks that exploit this
implementation-specific information are known as Side
Channel Attacks (SCA). In contrast to traditional cryptanalysis,
examples show that a very small amount of side-channel information
is enough to completely break a cryptosystem. The
goal of the project is to develop methods and designs to make
such attacks infeasible.
Project components: application of Robust Codes to protect
encryption hardware against Differential Fault Analysis (DFA) and
natural soft errors; and analysis and design of balanced dual-rail
asynchronous gates as a countermeasure against Simple
Power Analysis (SPA) and Differential Power Analysis (DPA)
of encryption hardware.
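As a rough illustration of why dual-rail encoding is attractive against power analysis, the sketch below uses a common simplified assumption: that observable power tracks the number of 1s (the Hamming weight) on a set of wires. The function names are hypothetical, and the toy model ignores the circuit-level load balancing that real balanced gates must provide.

```python
from itertools import product

def dual_rail_encode(bits):
    """Encode each bit b on two wires as the pair (b, 1 - b)."""
    encoded = []
    for b in bits:
        encoded.extend((b, 1 - b))
    return encoded

def hamming_weight(bits):
    """Number of wires carrying a 1 (the toy stand-in for power)."""
    return sum(bits)

def weight_profile(width, encode=None):
    """Set of Hamming weights seen over all data words of the given width."""
    weights = set()
    for bits in product((0, 1), repeat=width):
        word = encode(bits) if encode else bits
        weights.add(hamming_weight(word))
    return weights

# Single-rail: the weight depends on the data, so it can leak information.
single_rail = weight_profile(4)
# Dual-rail: every data word has the same weight, equal to the bit width.
dual_rail = weight_profile(4, dual_rail_encode)
```

Under this model the single-rail bus shows every weight from 0 to 4, while the dual-rail bus always shows weight 4, so the weight alone tells an observer nothing about the data.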
Right: A robust, DFA-resistant FPGA implementation of AES
Weaver: Asynchronous Electronic Design Automation
Associate Professor Alexander Taubin
Process variations and signal integrity stretch the timing margins
in static timing analysis to the point where they become too conservative
and result in significant overdesign. Asynchronous design is a natural
alternative; however, the electronics industry is reluctant to adopt it
because there is no conventional design flow based on commercial-quality
asynchronous EDA tools.
Our EDA flow (Weaver) starts from traditional synthesizable
or gate-level HDL and separates logic optimization from timing
by introducing local clocks at the post-synthesis design
stage. It is based on commercial EDA tools, handles circuits
of the same complexity as contemporary synchronous RTL flows,
and permits the finest degree of pipelining (gate-level,
with dynamic CMOS gates). Automated pipelining provides a tremendous
increase in circuit performance. In synchronous designs, automated pipelining
is difficult to implement because it changes the number and position
of registers, and eventually results in a completely new circuit,
which cannot be plugged into the same context as a non-pipelined one.
For asynchronous circuits this issue does not exist, because
local clocks are internal to each module and input-output behavior remains
the same regardless of pipelining granularity. Robustness of asynchronous
circuits to delay variations allows them to run at very low voltage
levels, even slightly below the transistor threshold. Hence,
automated asynchronous circuit design combines high performance with
low power and short time to market.
For ASIC technology the key advantage of the proposed approach
is the high tolerance of the derived implementation to delay
variations. This eases the timing convergence problem and completely
eliminates any timing margins, thus reducing both design and
production cost. We
are developing the Weaver design flow for integrated circuits at
65 nm and beyond. The flow accepts synthesizable HDL specifications
and produces fabricatable mask layout in GDSII format. The flow uses
standard, commercial-quality, state-of-the-art synchronous
tools. It can use specially designed dynamic CMOS libraries for
maximum performance, minimum power consumption, and hardware
security. The key differences from the conventional methodology are
a novel mechanism for synchronizing design components locally, based
on handshaking, and a novel mechanism for detecting the stabilization
of combinational logic outputs, exploiting true, data-dependent path
delays. A dynamic standard cell library (including low-leakage aspects)
is also under development.
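One way to see why reacting to true, data-dependent delays can beat a worst-case timing budget is the textbook ripple-carry adder, whose settling time for a given pair of operands depends on the longest run of bit positions a carry ripples through. The sketch below is an illustrative software model with hypothetical function names; it is not the Weaver completion-detection mechanism.

```python
def longest_carry_chain(a, b, width):
    """Longest run of consecutive bit positions driven by one carry
    when adding a and b in a ripple-carry adder of the given width."""
    longest = 0
    current = 0
    carry = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        if x & y:
            # Generate: a new carry starts here, whatever came before.
            carry = 1
            current = 1
        elif (x ^ y) and carry:
            # Propagate: the incoming carry ripples one position further.
            current += 1
        else:
            # Kill, or nothing to propagate: the chain ends.
            carry = 0
            current = 0
        longest = max(longest, current)
    return longest

def average_chain(width):
    """Mean longest carry chain over all operand pairs of this width."""
    total = 0
    count = 0
    for a in range(1 << width):
        for b in range(1 << width):
            total += longest_carry_chain(a, b, width)
            count += 1
    return total / count

worst_case = 8                 # a fixed clock must budget for the full width
typical = average_chain(8)     # what completion detection would see on average
```

In this model a fixed clock period has to cover the full-width chain, whereas a circuit that detects when its outputs have stabilized only waits for the chain that actually occurs, which is shorter on average.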
Practical Data Synchronization: Minimizing Communication
Ari Trachtenberg
The problem of data synchronization is inherent to systems
that require consistency among distributed information. With
the increasing communication between mobile or fixed computing
devices, efficient data synchronization is becoming
an essential technology. We
believe that current trends toward pervasive computing point
to communication becoming the primary bottleneck for synchronization.
For example, some Personal Digital Assistants (PDAs) can take
as long as 20 minutes to synchronize with a desktop PC in certain
practical scenarios.
Our work suggests a practical means for performing data
synchronization over a limited communication channel. Our
research has the following broad goals: developing the
theoretical basis for the use
of error-control codes in data synchronization; establishing practical
data synchronization techniques through implementation
and experimentation; and integrating this research with
an educational plan involving graduate students, undergraduates,
and even high school students through properly abstracted
and supervised projects.
Right: A comparison of wholesale data transfer and
our algorithm CPIsync. The latter requires less
communication for synchronization in most cases, and the
graph shows that it also takes less time to complete
in some settings.
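CPIsync is built on characteristic polynomials of sets. The sketch below shows only a toy special case, assuming one host holds exactly one element that the other lacks: the ratio of the two characteristic polynomials at a single agreed point then reveals that element, so one field value is sent instead of the whole data set. The prime modulus, the evaluation point, and the function names are assumptions chosen for illustration; the general algorithm, which handles many differences in both directions, is not reproduced here.

```python
P = 2_147_483_647  # prime modulus (2**31 - 1); elements are assumed smaller

def char_poly_eval(items, z, p=P):
    """Evaluate the characteristic polynomial prod(z - x) of a set at z, mod p."""
    result = 1
    for x in items:
        result = result * (z - x) % p
    return result

def find_single_missing(local_set, remote_eval, z, p=P):
    """Recover the one element the remote host has and the local host lacks.

    remote_eval is the remote characteristic polynomial evaluated at z.
    Assumes remote = local + {x} and that z is not an element of either set.
    """
    local_eval = char_poly_eval(local_set, z, p)
    # chi_remote(z) / chi_local(z) = (z - x) under the stated assumption.
    ratio = remote_eval * pow(local_eval, -1, p) % p
    return (z - ratio) % p

desktop = {3, 17, 42, 1001}
pda = {3, 17, 1001}
z = P - 1                                   # agreed evaluation point, not a data element
message = char_poly_eval(desktop, z)        # the only value sent over the channel
recovered = find_single_missing(pda, message, z)
```

Here the PDA recovers the missing element from a single transmitted number, whereas wholesale transfer would resend all four elements. The modular inverse via three-argument pow requires Python 3.8 or later.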
Tools and Methodologies for Heterogeneous MPSoC Embedded Systems
Assistant Professor Wei Qin
Heterogeneous multi-processor systems-on-chip
(MPSoCs) are complex systems consisting of processors, interconnection
networks, memories, and peripherals. Development of such
systems involves numerous design languages, electronic
design automation (EDA) tools, software development tool-chains,
operating systems, and system verification tools. This
research aims to advance the languages and tools, and to
develop systematic methodologies to craft MPSoCs for several
application domains, such as multimedia and wireless communication.
Our research currently involves the following components: architecture modeling
and synthesis of MPSoC platforms, software mapping and synthesis, fast prototyping
of MPSoCs, dynamic-compiled simulation of processors and systems, and functional
verification of processors and systems. Our research focuses on improving both
the productivity of the development process and the efficiency of the final
design.

Above: A diagram showing our recent dynamic-compiled
simulation technique. The technique translates pages of
target binary code into shared libraries on the fly and
offers a 5- to 10-fold speedup over interpretation.
It is implemented entirely in C and is thus highly portable.
It can also be conveniently retargeted to different processor
architectures through an architecture description language
(ADL) that we designed.
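The general idea of translating target code once and then reusing the translation, rather than decoding every instruction on every execution, can be sketched in a few lines. The toy three-instruction ISA, the function names, and the use of Python source compiled with exec in place of C shared libraries are all assumptions made for illustration; the sketch makes no claim about the speedup reported above.

```python
# A "page" of target code in a hypothetical toy ISA: (opcode, dest, operand).
PAGE = [
    ("li", "r0", 5),       # r0 = 5
    ("li", "r1", 7),       # r1 = 7
    ("add", "r0", "r1"),   # r0 += r1
    ("addi", "r0", 1),     # r0 += 1
]

def interpret(page, regs):
    """Baseline: decode and dispatch every instruction on every run."""
    for op, dst, src in page:
        if op == "li":
            regs[dst] = src
        elif op == "add":
            regs[dst] += regs[src]
        elif op == "addi":
            regs[dst] += src
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs

_translation_cache = {}

def translate(page_id, page):
    """Translate a page to host code once, then reuse the cached result."""
    if page_id not in _translation_cache:
        lines = ["def run(regs):"]
        for op, dst, src in page:
            if op == "li":
                lines.append(f"    regs[{dst!r}] = {src!r}")
            elif op == "add":
                lines.append(f"    regs[{dst!r}] += regs[{src!r}]")
            elif op == "addi":
                lines.append(f"    regs[{dst!r}] += {src!r}")
            else:
                raise ValueError(f"unknown opcode {op}")
        lines.append("    return regs")
        namespace = {}
        exec("\n".join(lines), namespace)   # compile the generated host code
        _translation_cache[page_id] = namespace["run"]
    return _translation_cache[page_id]

interpreted = interpret(PAGE, {"r0": 0, "r1": 0})
translated = translate(0, PAGE)({"r0": 0, "r1": 0})
```

Both paths leave the registers in the same state; the translated path pays the decoding cost only on the first visit to a page, which is where a dynamic-compiled simulator gains over a pure interpreter.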