As a part of the CEDA community, we need to spread the word about the excitement of EDA – especially to university students!
CEDA Chapters (and universities or persons intending to establish a CEDA chapter) can request some of our best-known luminaries for a CEDA Distinguished Lecture at their location. CEDA will cover the travel costs of the lecturer.
Our Continuing Distinguished Lecturers
Majority-based Logic Synthesis
Technologies and Platforms for Cyberphysical Systems
3-Dimensional Nano-Devices: Models and Design Tools
Customizable Computing at Datacenter Scale
In the past decade, CDSC has been exploring customizable computing, which emphasizes extensive use of customized accelerators on programmable fabrics for much greater performance and energy efficiency. With Intel’s $17B acquisition of Altera in 2015 and Amazon’s introduction of FPGAs in its AWS public cloud in 2017, customizable computing is moving from advanced research into mainstream computing.
Although the performance and energy efficiency benefits have been clearly demonstrated, a significant challenge is the efficient design and implementation of various accelerators on FPGAs, which remains a barrier to many software programmers. In this talk, I shall discuss our effort on developing an automated compilation flow from high-level programming languages to FPGAs. I shall start with a quick review of our early work on high-level synthesis. Then, I shall present our recent effort on source-code-level transformation and optimization for customizable computing, including support of high-level domain-specific languages (DSLs) for deep learning (with Caffe or TensorFlow), image processing (with Halide), and big-data processing (with Spark), and support of automated compilation to customized microarchitecture templates, such as systolic arrays, stencils, and CPP (composable parallel and pipelined).
High-Level Synthesis and Beyond
Automatic Customizable Computing: From DSLs to FPGAs for Deep Learning and Beyond
In SOCC’2006, my group presented an invited paper on xPilot – the high-level synthesis (HLS) tool developed at UCLA for automatic synthesis of behavior-level C/C++ specifications into highly optimized RTL code. In the same year, the startup company AutoESL was formed to commercialize our research on HLS – an effort that many EDA companies had attempted without success for over two decades. Nevertheless, the AutoESL tool (renamed Vivado HLS after the Xilinx acquisition in 2011) has become probably the most successful and widely used HLS tool for FPGAs. As we come to the end of Moore’s Law scaling, HLS plays a critical role in enabling customizable computing, which emphasizes extensive use of customized accelerators on programmable fabrics for much greater performance and energy efficiency. With Intel’s $17B acquisition of Altera in 2015 and Amazon’s introduction of FPGAs in its AWS public cloud in 2017, customizable computing is moving from advanced research into mainstream computing.
In this talk, I shall first review the progress we have made on HLS. Then, I shall discuss our effort on developing an automated compilation flow from high-level programming languages to FPGAs for customizable computing. I shall present our recent effort on source-code-level transformation and optimization, including support of high-level domain-specific languages (DSLs) for deep learning (with Caffe or TensorFlow), image processing (with Halide), and big-data processing (with Spark), and exploration of microarchitecture templates, such as systolic arrays, stencils, and CPP (composable parallel and pipelined), for efficient support of these application domains.
DL Program Additions in 2018:
The Future Of Low Power Circuits And Embedded Intelligence: Emerging Devices And New Design Paradigms
Technology variability and global environment variations impose many constraints on today’s circuits and architectures, making it difficult to reach high energy efficiency. After a brief overview of adaptive circuits for low-power multi-processors and IoT architectures, the talk will detail new technology opportunities for more flexibility and adaptivity. Digital and mixed-signal architectures using 3D technologies will be presented in the scope of multi-processor activity, as well as imagers and neuro-inspired circuits. The integration of non-volatile memories will also be shown in the perspective of new architectures for computing. Finally, embedded learning will be addressed to solve power challenges at the edge and in end-devices, and some new design approaches will be discussed.
Auto-adaptive Digital Circuits – Application to Low-power Multicores and Ultra-low-power Wireless Sensor Nodes
Today’s sources of variation significantly affect circuits’ energy efficiency: this talk will present innovative technological, circuit, and architectural techniques for efficient automatic performance regulation. Given the numerous sources of variation encountered by today’s integrated systems, it becomes very challenging to implement highly energy-efficient circuits. Whether the variations are in the process, in the application needs, or in the environmental characteristics, the common solution is adaptation. This talk explores automatic adaptation techniques at the architectural level, applied to MPSoCs as well as autonomous Wireless Sensor Nodes.
FDSOI Circuit Design for High Energy Efficiency: Wide Operating Range and ULP Applications – A 7-year Experience
With the increasing complexity of today’s MPSoC applications, extremely high performance has become the main requirement. However, high performance does not only mean high speed but also low power. Most of the time, ultra-low-power architectures cannot reach high speed and, conversely, at high speed a lot of power is consumed. Designing Ultra Wide Voltage Range (UWVR) systems in the nanometer regime is a way to achieve high energy efficiency, but it introduces many challenges due to the amplified parasitic effects driven by the scaling of bulk MOSFETs, making circuits more sensitive to manufacturing process fluctuations and less energy efficient. How can the trade-off between leakage, variability, and speed be improved at low voltage? The trend is to use thin-film devices. Undoped thin-film planar FDSOI devices are investigated in this presentation as an alternative to bulk devices at the 28nm node and beyond. This talk will highlight the development of a UWVR multi-VT design platform in FDSOI planar technology on Ultra Thin Body and Box (UTBB) for the 28nm node. The use of efficient Body Biasing (BB) provides extremely effective performance tuning for high energy efficiency. We will also explore FDSOI benefits for new ULP applications and IoT perspectives.
How to Design Asynchronous Circuits? Design/System and Flow Overview - Application to Multicores and End-devices
Asynchronous circuits have characteristics that differ significantly from those of synchronous circuits in terms of their power and robustness to variations. This talk will show how it is possible to exploit these characteristics to design ultra-low-power and robust circuits in the scope of the Internet-of-Everything (IoE) and of Globally Asynchronous and Locally Synchronous architectures. More specifically, the aims of the talk are to give the fundamentals of asynchronous circuit design and to detail design methodologies with practical low-power asynchronous circuit examples. At the end of the talk, attendees should be able to judge the usefulness of an asynchronous circuit compared to a synchronous one according to their application needs.
Low-power Cyber-physical Systems
Low-power cyber-physical systems: This talk introduces cross-layer, holistic optimization of both the cyber and physical worlds of CPS, such as fuel cell systems, energy storage systems, electric vehicles, and drones.
Design Automation of Electric Vehicles
Design automation of electric vehicles: This talk introduces systematic power modeling and runtime and design-time power optimization of electric vehicles for given driving missions.
Energy Harvesting and Storage for CPS and IoT
Energy harvesting for CPS and Internet of Things: This talk introduces various energy harvesting sources, such as solar and thermal energy, for embedded systems and the Internet of Things. This talk also covers electrical energy storage issues and demonstrates how system-level techniques can overcome the inherent limitations of energy harvesting devices.
System-level Low-power Embedded Design
System-level low-power embedded design: This tutorial covers power modeling, estimation, and optimization of embedded systems, from power sources to power consumers.
System-level low-power embedded design
- Source of power consumption (dynamic power and leakage power)
- Power consumption of microprocessors
- Power consumption of memory and I/O
- Power conversion and loss
- Dynamic power management and dynamic voltage scaling
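The first and last topics above rest on the classic CMOS power model, which can be sketched numerically. All parameter values below are hypothetical, chosen purely for illustration; the point is that dynamic voltage and frequency scaling attacks the quadratic voltage term.

```python
# Sketch of the classic CMOS power model (hypothetical parameters).
# Dynamic power:  P_dyn  = alpha * C * V^2 * f
# Leakage power:  P_leak = V * I_leak

def dynamic_power(alpha, c_farads, v_volts, f_hertz):
    """Switching power: activity factor * capacitance * V^2 * frequency."""
    return alpha * c_farads * v_volts**2 * f_hertz

def leakage_power(v_volts, i_leak_amps):
    """Static power from subthreshold and gate leakage currents."""
    return v_volts * i_leak_amps

# Halving both V and f cuts dynamic power by a factor of 8 (0.5^2 * 0.5).
p_full = dynamic_power(0.2, 1e-9, 1.0, 1e9)    # nominal operating point
p_dvs  = dynamic_power(0.2, 1e-9, 0.5, 0.5e9)  # scaled operating point
print(p_full / p_dvs)
```

The cubic payoff of joint voltage/frequency scaling is why dynamic voltage scaling, covered in the last bullet, is the workhorse of system-level low-power design.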
Introduction to Electric Vehicles
Introduction to electric vehicles: This tutorial introduces general knowledge of electric vehicles, including the electric powertrain, steering and braking systems, cooling systems, wheels, and suspensions.
Introduction to electric vehicles
- Electric motors and control
- Battery chemistry and battery management
- Drivetrain of electric vehicles
- Wheels and suspensions
- Steering and braking systems
- HVAC for electric vehicles
- Charging electric vehicles
- Power converters and low-voltage architectures
- Electric vehicle conversion
Power supply for Embedded Systems and Internet of Things
Power supply for embedded systems and Internet of Things: This tutorial introduces various power sources and conversion methods for embedded systems and the Internet of Things. It will cover battery power sources and energy harvesting power sources.
Power supply for embedded systems and Internet of Things
- Voltage regulators (LDO and DC-DC converters)
- Battery and battery chemistry
- Photovoltaic cells and maximum power point tracking
- Thermoelectric generators
- Piezoelectric generators
- Power conversion loss of low-power electronics
- Power supply optimization for Internet of Things
- Power supply for dynamic power management and dynamic voltage scaling
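The photovoltaic topic above centers on maximum power point tracking (MPPT). A minimal sketch of the common perturb-and-observe algorithm follows; the toy panel model and all numbers are illustrative assumptions (a parabola peaking at 0.6 × V_oc, a rough rule of thumb for silicon cells), not material from the tutorial.

```python
# Perturb-and-observe (P&O) MPPT sketch with a hypothetical PV panel model.

def panel_power(v, v_oc=20.0, p_max=50.0):
    """Toy P-V curve: parabola peaking at 0.6 * v_oc (illustrative only)."""
    v_mpp = 0.6 * v_oc
    return max(0.0, p_max * (1 - ((v - v_mpp) / v_mpp) ** 2))

def mppt_po(v=5.0, step=0.1, iters=500):
    """Climb the P-V curve: keep perturbing in the direction that raised power."""
    p_prev = panel_power(v)
    direction = 1.0
    for _ in range(iters):
        v += direction * step
        p = panel_power(v)
        if p < p_prev:           # power dropped: reverse the perturbation
            direction = -direction
        p_prev = p
    return v

v_op = mppt_po()  # settles oscillating within one step of the MPP near 12 V
```

P&O needs no panel model at runtime, only power measurements, which is why it is a common first choice for low-power IoT harvesters; its drawback is the steady-state oscillation around the MPP visible in the sketch.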
Running Sparse and Low-Precision Neural Networks: When Algorithm Meets Hardware
The fast growth of the computation cost associated with training and testing of deep neural networks (DNNs) has inspired various acceleration techniques. Reducing topological complexity and simplifying the data representation of neural networks are two approaches popularly adopted in the deep learning community: many connections in DNNs can be pruned, and the precision of synaptic weights can be reduced, respectively, incurring no or minimal impact on inference accuracy. However, the practical impacts of hardware design are often ignored in these algorithm-level techniques, such as the increase in random accesses to the memory hierarchy and the constraints of memory capacity. On the other hand, limited understanding of the computational needs at the algorithm level may lead to unrealistic assumptions during hardware design. In this talk, we will discuss this mismatch and show how we can solve it through an interactive design practice across both the software and hardware levels.
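The two algorithm-level techniques the abstract names can be sketched in a few lines. This is a generic illustration, not the speaker's method; the sparsity level and bit-width are arbitrary choices.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(weights.size * sparsity)
    threshold = np.sort(np.abs(weights), axis=None)[k]
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_linear(weights, bits=8):
    """Symmetric linear quantization onto a signed fixed-point grid."""
    scale = np.abs(weights).max() / (2 ** (bits - 1) - 1)
    q = np.round(weights / scale).astype(np.int8)
    return q, scale  # dequantize with q * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_linear(prune_by_magnitude(w, sparsity=0.5), bits=8)
# The hardware caveat from the abstract: the zeros save work only if the
# memory layout (e.g. a sparse format) avoids random accesses to fetch them.
```

As the abstract argues, the payoff of these transformations depends on the memory hierarchy underneath them, which is exactly the software/hardware mismatch the talk addresses.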
Applications of Emerging Non-volatile Memory Technologies in Next-generation Storage and Computing Systems
The goal of this seminar is to give an overview of the working mechanisms of several emerging nonvolatile memory technologies, e.g., spintronic memory, phase change memory, resistive memory, and ferroelectric memory, and their applications in next-generation storage and computing systems. The presenter will first introduce the electrical properties of these memory technologies that distinguish them from mainstream memory technologies. Next, the presenter will discuss some typical circuit designs targeting the concerned applications, e.g., TCAM, on-chip cache, standalone memory, and storage class memory. The presenter will then illustrate the new requirements of next-generation storage and computing systems and their solutions, which leverage the unique properties of emerging memory technologies, such as fast access time, nonvolatility, high integration density, and good CMOS process compatibility. At the end of this talk, the presenter will list some common circuit design challenges of these emerging memory technologies, e.g., high write cost and asymmetric programming performance for 1s and 0s, and the approaches that can alleviate these challenges at the circuit design and computer architecture levels.
Deep Learning Acceleration on Mobile Platforms
Although Deep Neural Networks (DNNs) are ubiquitously utilized in many applications, it is generally difficult to deploy DNNs on resource-constrained devices, e.g., mobile platforms. In practical use, both the testing (inference) phase and the sophisticated training (learning) phase are required, calling for efficient testing and training methods with higher accuracy and shorter convergence time. In this tutorial, we first introduce DNNs from a historical perspective and then present some representative techniques to reduce the computation cost of DNNs, including network pruning, model compression, and low-precision design. Finally, we will show some examples of performing and optimizing the training and testing of DNNs on distributed mobile systems.
• Introduction to neural networks
- History, structure, algorithms, and software
Yesterday, Today, and Tomorrow of Emerging Non-volatile Memory: A Holistic and Historical View of Technologies, Designs, and Applications
This tutorial will provide attendees with a holistic and historical view of the technology evolution of emerging nonvolatile memory in devices, circuit design, and architectural applications. The presenter will first briefly describe the basic physics of several important emerging nonvolatile memory technologies – magnetic memory, resistive memory, phase change memory, and ferroelectric memory – as well as some common device engineering tradeoffs. After that, the presenter will introduce several typical emerging memory cell designs that maximize the advantages of these emerging storage devices, e.g., non-volatility, multi-level cells, and small footprint, and alleviate device drawbacks such as high programming current/voltage. The focus will be on the circuit design development that followed advances in device engineering over the last two decades, from a device-circuit co-design perspective. The presenter will then go through the computer architectures that have built their memory hierarchy with emerging non-volatile memory technologies for various concerns of power, performance, and reliability. Again, some solutions that mitigate the drawbacks of emerging memories and significantly enhance computing systems’ efficiency and robustness will be discussed in detail. Finally, the presenter will discuss prospects for new applications of emerging nonvolatile memory technologies in future computing systems, such as neuromorphic computing.
- Device physics and basic working mechanisms
- Device engineering for various applications
- Memory cell design tradeoffs between area, power, performance, and reliability
- Special memory cell designs to overcome the device drawbacks and multi-level cells
- Memory structures for fast access time and high endurance
- On-chip memory hierarchy: performance-driven designs
- Off-chip memory hierarchy: density-driven designs
- Applications in persistency retaining and neuromorphic computing
A Cross-Layer Perspective for Energy Efficient Processing - From Beyond-CMOS Devices to Deep Learning
As Moore’s Law based device scaling and accompanying performance scaling trends are slowing down, there is increasing interest in new technologies and computational models for fast and more energy-efficient information processing. Meanwhile, there is growing evidence that, with respect to traditional Boolean circuits and von Neumann processors, it will be challenging for beyond-CMOS devices to compete with the CMOS technology. Nevertheless, some beyond-CMOS devices demonstrate other unique characteristics such as ambipolarity, negative differential resistance, hysteresis, and oscillatory behavior. Exploiting such unique characteristics, especially in the context of alternative circuit and architectural paradigms, has the potential to offer orders of magnitude improvement in terms of power, performance, and capability.
In order to take full advantage of beyond-CMOS devices, however, it is no longer sufficient to develop algorithms, architectures, and circuits independent of one another. Cross-layer efforts spanning from devices to circuits to architectures to algorithms are indispensable. This talk will examine energy-efficient neural network accelerators for embedded applications in this context. Several deep neural network accelerator designs based on cross-layer efforts spanning from alternative device technologies, circuit styles and architectures will be highlighted. A comprehensive application-level benchmarking study for the MNIST dataset will be presented. The discussions will demonstrate that cross-layer efforts indeed can lead to orders of magnitude gain towards achieving extreme scale energy-efficient processing.
Exploiting Ferroelectric FETs: From Logic-in-Memory to Neural Networks and Beyond
The inevitable slowdown of the CMOS scaling trend has fueled an explosion of research endeavors in finding a CMOS replacement. However, recent studies suggest that many of the emerging devices being investigated, if used as simple drop-in replacement for MOSFETs, may only achieve speedups that mirror historical trends in the best case. The consensus from the community is that cross-layer efforts are essential in combating the CMOS scaling challenge with emerging devices. This talk presents such an effort centered around a particular emerging device, ferroelectric FETs (FeFETs).
An FeFET is made by integrating a ferroelectric material layer in the gate stack of a MOSFET. It is a non-volatile device that can behave as both a transistor and a storage element. This unique property of FeFETs enables area-efficient and low-power fine-grained logic-in-memory, which is desirable for many data analytics and machine learning applications. This presentation will elaborate on novel circuits based on FeFETs that accomplish basic logic-in-memory operations and ternary content addressable memory (TCAM), as well as FeFET-based crossbars for binary Convolutional Neural Networks. Comparisons of these FeFET-based circuits with other alternative technologies will be discussed.
Network Resource Management in Wireless Networked Control Systems
Wireless networked control systems (WNCSs) are fundamental to many Internet-of-Things (IoT) applications and must work under real-time constraints in order to ensure timely collection of environmental data and proper delivery of control decisions. The Quality of Service (QoS) offered by a WNCS is thus often measured by how well it satisfies the end-to-end deadlines of the real-time tasks executed in the WNCS. Network resource management in WNCSs plays a critical role in achieving the desired QoS. Unexpected internal and external disturbances that may appear in WNCSs concurrently make resource management inherently challenging. The explosive growth of IoT applications especially in terms of their scale and complexity further exacerbate the level of difficulty in network resource management.
In this talk, I first give a general introduction to WNCSs and the challenges that they present to network resource management. In particular, I will discuss the complications due to external disturbances and the need for dynamic data-link layer scheduling. I then highlight our recent work that aims at tackling this challenge. Our work balances the scheduling effort between a gateway (or access points) and the rest of the nodes in a network. It paves the way towards decentralized network resource management in order to achieve scalability. Experimental implementation on a wireless testbed further validates the applicability of our proposed techniques. I will end the talk by outlining our ongoing effort in this exciting and growing area of research.
Is it Logic or Memory? - Blurring the Gap
The performance gap between logic and memory has long been a challenge for system architects. However, the emergence of new technologies such as 2D cross-point memories, Ferroelectric FETs, and monolithic 3D integration promises to blur the gap between memory and logic. This talk will introduce the fundamentals of these devices and the associated circuits that enable computational logic to be embedded within memory. Finally, the relevance of such in-memory compute fabrics for emerging workloads in machine learning and internet-of-things applications will be discussed.
Beyond Von Neumann Systems
Computer systems have made rapid progress over the past sixty years. However, there has been very little change in the Von Neumann style principle for designing system architecture. While this approach has worked extremely well for number crunching and data manipulation applications, a drastically different architectural model and computational approach may be required for allowing machines to reach the cognitive abilities of the human brain for perceptual tasks. This talk will introduce brain-inspired system architectures that employ brain-inspired algorithms on traditional computational fabrics, machine learning accelerators and neuromorphic architectures that mimic functional fabrics in the brain. Next, the talk will introduce systems that exploit the intrinsic physics of a dynamically coupled oscillator system to solve computationally complex tasks efficiently. The seminar will provide new insights on the interplay between new device technologies and computational models.
Non-Volatile Processors – Enabling a new generation of battery-less Internet of Things
A new class of embedded processors – called as non-volatile processors - that operate reliably with unreliable, scavenged power will be introduced. These processors integrate non-volatile memory along with computational structures to provide an instant sleep/wake-up feature. This talk will explore the design space of device technologies, power management circuits and processor architectures involved in non-volatile processor driven systems. Next, compiler and software-level design interactions with a focus on emerging IoT applications will be introduced. This seminar will prepare the attendees for the new wave of self-powered IoT applications that are emerging.
Spintronics: From Devices to Circuits to Systems
Spintronics technology provides an exciting platform for implementing computational structures, and recent work has demonstrated the potential for leveraging its nonvolatility properties to build energy-efficient systems. This talk presents a view of the state of the art in this field, as well as a view of cutting-edge research directions. We will present results from our collaborative efforts involving physicists, material scientists, circuit designers, and architects, which have led to the development of novel device structures, circuits, and memory arrays. Together, these help construct viable pathways for building spin-based structures for computation, memory, and in-memory computation, including for AI applications.
Spin-based memories are nonvolatile and are conventionally based on arrays of magnetic tunneling junctions (MTJs). The talk will first show the current state of technology for building spin-based memories, and then present directions for next-generation improvements in spintronic memory technologies. We will then present spin-based structures that have also been shown to be highly efficient for logic applications in specific scenarios, such as those that require nonvolatility or are used for error resilient applications. Finally, we will show methods for building spin-based compute-in-memory structures that are greatly advantageous for data-intensive applications, and demonstrate the efficiencies that can be achieved by this model for a neuromorphic application.
Reliability, Error-resilience, and Approximation in Integrated Systems
As CMOS technology matures, the problem of building fully reliable circuits has become more challenging, as a variety of mechanisms that perturb system performance have come into play. These range from "one-time" drifts due to process variations, which shift circuit performance, to "run-time" shifts caused by aging mechanisms, which cause degradations and/or failures in devices and interconnects. Developing design mechanisms that model and overcome these shifts requires an understanding from the device level, circuit level, system level, and application level.
At the device level, methods that comprehend performance shifts due to statistical as well as deterministic variations due to process and aging are key. At the circuit level, these must be factored into statistical and intelligent corner-based performance analysis approaches, as well as mechanisms that enable post-silicon compensation. At the system level, compensation and redundancy schemes must be utilized to ensure that the system operates at the desired performance, within a specified power budget.
An equally important consideration arises from application-level requirements. While some mission-critical applications, or segments of applications, require absolute accuracy, many emerging applications (e.g., signal filtering, image/video operations, neuromorphic applications) show a good deal of error-resilience, implying that absolute accuracy is not essential. In these scenarios, it is possible to selectively ignore "accidental" errors due to process and aging, or even introduce deliberate errors in hardware to build approximate systems that provide just enough accuracy for the application. Using a case study approach, it will be shown how application-level considerations can be used to build approximate systems that optimize power and performance within a specified error budget.
EDA has truly changed the world. Without EDA, today's extremely complex chips containing billions of transistors would not be possible. We would not have the Internet or smartphones, and cars would not have nearly the capabilities they have today. We would not dare to think of driverless cars! Our overall industrial productivity would be much lower.
EDA keeps evolving and reinventing itself. The discipline has always adapted to new challenges. In recent years, EDA has branched out from its core to also address biochips, security, and the smart grid – to name just a few.
EDA is exciting, as it is strategically located at the intersection of micro- and nano-electronics, computer science, and mathematics. Yet, many people do not appreciate the importance or the excitement and beauty of EDA. Specifically, not enough young and promising students grasp what kind of intellectual fulfillment and broad career perspectives EDA has to offer them.
Request a Distinguished Lecture for your event by completing the DL Request Form.
- DL requests are reviewed by the CEDA DL Committee.
- Please return your completed form to Tsung-Yi Ho, CEDA's DL Program Manager.
- Request early, and be flexible with your dates. Our lecturers are much sought after, and they travel frequently, so they usually appreciate the opportunity to combine a Distinguished Lecture with other travel. This works best if sufficient lead-time is given.