Keynote Speeches

 

Day 1 (Dec. 10)

Logic Emulation in the MegaLUT Era - Moore’s Law Beats Rent’s Rule

                                                                                        by Mike Butts, Synopsys

 

 

 

Abstract

Throughout the twenty-five-year history of logic emulation, its architectures have been governed by Rent’s Rule. This empirical observation, first used to build 1960s mainframes, predicts the average number of cut nets that result when a digital module is arbitrarily partitioned into multiple parts, such as the FPGAs of a logic emulator.

A fundamental advantage of emulation is that, unlike most devices, FPGAs always grow in capacity according to Moore’s Law, just as the designs to be emulated have grown. Unfortunately, packaging technology advances at a far slower pace, leaving emulators short on the pins demanded by Rent’s Rule. Many cut nets are now sent through each package pin, which costs speed, power, and area.

At today’s system-on-chip level of design, the number of system-level modules is growing, while their sizes remain constant. In the meantime, FPGAs have grown from a handful of logic lookup tables (LUTs) at the beginning to over a million LUTs today. At this scale, an entire system-level module such as an advanced 64-bit CPU can fit inside a single FPGA. Fewer module-internal nets need be cut, so Rent’s Rule constraints are relaxing. Fewer and higher-level cut nets mean that logic emulation with megaLUT FPGAs is becoming faster, cooler, smaller, cheaper, and more reliable. FPGAs’ Moore’s Law scaling is escaping from Rent’s Rule.
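As a rough illustration of the pin pressure described in the abstract, Rent’s Rule is commonly written as T = t * g^p, where g is the number of logic blocks in a partition, t is the average number of terminals per block, and p is the Rent exponent. The sketch below is not taken from the talk: the function name and the values t = 4 and p = 0.6 are assumptions chosen only for illustration.

```python
def rent_terminals(blocks: int, t: float = 4.0, p: float = 0.6) -> float:
    """Estimate external terminals for a partition of `blocks` logic blocks.

    Uses the common form of Rent's Rule, T = t * g**p.
    The defaults for t and p are illustrative assumptions, not measured values.
    """
    return t * blocks ** p


if __name__ == "__main__":
    # Compare an early, small FPGA partition with a megaLUT-scale one.
    for luts in (1_000, 1_000_000):
        print(f"{luts:>9} LUTs -> about {rent_terminals(luts):,.0f} terminals")

    # Capacity grows 1000x here, but the estimated pin demand grows far less.
    growth = rent_terminals(1_000_000) / rent_terminals(1_000)
    print(f"pin demand grows about {growth:.0f}x for 1000x more LUTs")
```

Under these assumed parameters the pin demand still grows with capacity, only more slowly, which is why packaging that advances far more slowly than Moore’s Law leaves emulators short on pins. The relief the abstract describes comes from cutting along system-level module boundaries, where fewer nets cross, rather than from arbitrary partitioning.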

 

Speaker's bio

  • MIKE BUTTS is Senior Member Technical Staff of the Verification Group at Synopsys. He has a rich history of innovation in reconfigurable hardware and hardware-based verification.
  • Mike co-invented hardware logic emulation, which has developed into an essential tool for validating and modeling large silicon projects. Mike architected and designed a number of reconfigurable FPGA and crossbar chips and system products in over twenty years in the electronic design automation industry, at Mentor Graphics, Quickturn, Cadence, where he was a Cadence Fellow, and now at Synopsys.
  • Mike spent ten years on the silicon side, co-founding FPGA start-up Tabula, leading a revolutionary many-core parallel processor technology as Chief Architect at Ambric, and participating as co-architect on Nvidia’s Denver 64-bit ARM CPU. Mike's roots are in advanced computer architecture, which he practiced at Data General, Kurzweil Computer and Floating Point Systems. The pioneering VLIW CPU he architected in an earlier startup became Mentor Graphics' Compute Engine accelerator.
  • Mike has 51 US patents and others worldwide, and ten peer-reviewed publications.  He's a long-time member of the steering and program committees of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), and a long-time program committee member of the ACM FPGA Monterey conference.  Mike earned his BS and MS degrees in Electrical Engineering and Computer Science from M.I.T.

   



 

Day 2 (Dec. 11)

Automating Customized Computing

                                                                                        by Prof. Jason Cong, UCLA

 

Abstract

Customized computing has been of interest to the research community for over three decades. The interest has intensified in recent years as power and energy have become a significant limiting factor for the computing industry. For example, the energy consumed by the datacenters of some large internet service providers is well over 10^9 kilowatt-hours. FPGA-based acceleration has shown 10-1000X performance/energy efficiency over general-purpose processors in many applications. However, programming FPGAs as computing devices is still a significant challenge. Most accelerators are designed using manual RTL coding. Recent progress in high-level synthesis (HLS) has improved programming productivity considerably: one can quickly implement functional blocks written in high-level programming languages such as C or C++ instead of RTL. But in using an HLS tool for accelerated computing, the programmer still faces many design decisions, such as the implementation choices for each module and the communication schemes between modules, and has to implement additional logic for data management, such as memory partitioning, data prefetching, and reuse. Extensive source code rewriting is often required to achieve high-performance acceleration using the existing HLS tools.

In this talk, I shall present the ongoing work at UCLA to enable further automation for customized computing. One effort is on automated compilation that combines source-code-level transformation for HLS with efficient generation of parameterized architecture templates. I shall highlight our progress on loop restructuring and code generation, memory partitioning, data prefetching and reuse, and combined module selection, duplication, and scheduling with communication optimization. These techniques allow the programmer to easily compile computation kernels to FPGAs for acceleration. Another direction is to develop efficient runtime support for scheduling and transparent resource management for the integration of FPGAs for datacenter-scale acceleration, which is becoming a reality (for example, Microsoft recently used over 1,600 servers with FPGAs for accelerating their search engine and reported very encouraging results). Our runtime system provides scheduling and resource management support at multiple levels, including the server node level, job level, and datacenter level, so that programmers can make use of existing programming interfaces, such as MapReduce or Hadoop, for large-scale distributed computation.

 

Speaker's bio

  • JASON CONG received his B.S. degree in computer science from Peking University in 1985, and his M.S. and Ph.D. degrees in computer science from the University of Illinois at Urbana-Champaign in 1987 and 1990, respectively.  Currently, he is a Chancellor’s Professor in the Computer Science Department of the University of California, Los Angeles, the Director of the Center for Domain-Specific Computing (funded by an NSF Expeditions in Computing Award), and co-director of the VLSI CAD Laboratory.  He also served as the department chair from 2005 to 2008.
  • Dr. Cong’s research interests include computer-aided design of VLSI circuits and systems, design and synthesis of system-on-a-chip, programmable systems, novel computer architectures, nano-systems, and highly scalable algorithms.  He has published over 400 research papers and led over 50 research projects in these areas.  Dr. Cong has received many awards and recognitions, including 10 Best Paper Awards and the 2011 ACM/IEEE A. Richard Newton Technical Impact Award in Electronic Design Automation.  He was elected an IEEE Fellow in 2000 and an ACM Fellow in 2008.  He is the recipient of the 2010 IEEE Circuits and Systems (CAS) Society Technical Achievement Award "For seminal contributions to electronic design automation, especially in FPGA synthesis, VLSI interconnect optimization, and physical design automation."
  • Dr. Cong has served on the Technical Advisory Board of a number of EDA and silicon IP companies, including Atrenta, eASIC, Get2Chip, and Magma Design Automation. He was the founder and the president of Aplus Design Technologies, Inc., until its acquisition by Magma Design Automation in 2003 (now part of Synopsys). He was a co-founder and the chief technology advisor of AutoESL Design Technologies until its acquisition by Xilinx (2006-2010). He was a co-founder and the chief scientist of Neptune Design Automation until its acquisition by Xilinx (2011-2013).  Currently, Dr. Cong is a distinguished visiting professor at Peking University (PKU) and co-director of UCLA/PKU Joint Research Institute in Science and Engineering.
  • Dr. Cong has graduated 31 PhD students.  A number of them are now faculty members at major research universities, including Cornell, Georgia Tech, Peking University, Purdue, SUNY Binghamton, UCLA, UIUC, and UT Austin. Four of them were co-founders, together with Dr. Cong, of two startups that originated at UCLA – Aplus Design Technologies (acquired by Magma in 2003, now part of Synopsys) and AutoESL Design Technologies (acquired by Xilinx in 2011). Others are in key R&D or management positions in various information technology companies, such as Arista, Bloomberg, Broadcom, Cadence, Google, IBM, Intel, Micron, Synopsys, and Xilinx.

    



 

Day 3 (Dec. 12)

Doing FPGA in a Former Software Company

                                                                                        by Feng-hsiung Hsu, Microsoft Research Asia

 

Abstract

Microsoft has gone through massive changes in the last few years. First, it was the dominant software company. Then, it became a “Devices and Services” company, and now it is “Mobile First, Cloud First”. Of course, deep down in the bones, it is still a software company. In this talk, I will give a personal account of how FPGA acceleration gradually gained traction inside Microsoft, the difficulties and lessons learned in gaining acceptance, FPGAs’ apparently imminent deployment inside Microsoft data centers, and finally what may be needed in FPGA programming tools for wider acceptance inside a company like Microsoft.

 

Speaker's bio

  • FENG-HSIUNG HSU is the research manager of the Hardware Computing Group at Microsoft Research Asia. Prior to Microsoft, he worked at IBM’s T. J. Watson Research Center, Compaq’s Western Research Lab, and HP’s Research Lab. He received his Ph.D. in Computer Science from Carnegie Mellon University in 1989 and his B.S. in Electrical Engineering from National Taiwan University in 1980. He is sometimes known by his nickname “CB”, which stands for “Crazy Bird”.
  • CB’s research interests include VLSI design, special purpose algorithms, machine learning, device physics, optics, FPGA systems, computer architecture, mobile systems, 3D imaging systems, human-computer interface, and “whatever makes sense”. Recently, he has been known to dabble in keyboard design, among other things.
  • CB received ACM’s Grace Murray Hopper Award for his work at Carnegie Mellon on Deep Thought, the first chess machine to play chess at Grandmaster level. To the best of his knowledge, Deep Thought was also the first chess machine to use FPGAs (as part of the evaluation function). In 1997, CB won the Fredkin Prize, along with Murray Scott Campbell and Arthur Joseph Hoane, for Deep Blue’s defeat of the World Chess Champion (Garry Kasparov) in a set match. CB served as the chip designer and system architect for Deep Blue. CB is the author of the book “Behind Deep Blue: Building the Computer that Defeated the World Chess Champion”.