[Weekly Review] 2020/05/18-24


This week, I finished the TPT examination. Next week, I'll continue revising the FYP slides and the FYP modelling.

Hardware-Software Co-Design Reappears

(Also see this.)

The initial idea behind co-design was that a single language could be used to describe hardware and software. With a single description, it would be possible to optimize the implementation, partitioning off pieces of functionality that would go into accelerators, pieces that would be implemented in custom hardware and pieces that would run as software on the processor—all at the touch of a button.

This way of working failed to materialize for several reasons.

  • Primarily, the systems at that time were largely single-threaded applications designed to run on a single processor with a fairly simple system architecture. In order to do analysis, the environment and its implementation had to be deterministic, and any partitioning had to be coarse-grained enough so that the communications costs could be amortized. Under these restrictions, most partitioning was somewhat intuitive and thus automation did not provide enough value.
  • Given how conservative most project teams are, the likelihood of a new methodology being adopted is inversely proportional to the number of changes it requires.

What survived

Still, several technologies did come out of the co-design efforts, including virtual prototypes, co-verification, high-level synthesis (HLS) and software synthesis, although this last technology was not developed within the traditional EDA companies.

Virtual prototypes

Virtual platforms are an abstraction of the hardware implementation. “I would claim that virtual platforms with software executing on a virtual model of the hardware are a half step towards co-design,” says Schirrmeister. “Virtual platforms have gained some traction because the number of changes is confined to one group, hardware, while the software guys are happy to use it because it is early and fast.”

Co-verification

“Co-verification products gave developers the ability to run and debug hardware and software together for the first time,” says Klein. “We thought that was going to be the beginning of the end of the great divide between the hardware and software teams. It didn’t have that impact on the industry. While folks who used it really liked it, it usually did not cause the company to adopt a more cooperative development methodology — despite the clear benefits.”

In the software industry, some companies use model-based development methodologies where the software is automatically generated from verified models.

In the hardware world, SystemC was created as an attempt to bridge the two worlds. “The architect may write a single description, perhaps in SystemC,” says Tom Anderson, technical marketing consultant for OneSpin Solutions. “But it is modified extensively during the design process to produce RTL code suitable for synthesis and C/C++ code to run on the embedded processors in the SoC.”
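
As a rough, hedged illustration (not taken from the article), the sketch below shows what such a single SystemC description of a small block might look like; the same behavioral C++ could later be refined toward synthesizable RTL or compiled as software for the SoC's processors. All module and signal names are illustrative.

```cpp
// Minimal SystemC sketch (illustrative names): one behavioral description
// that could either be refined toward RTL for synthesis or compiled as C++
// to run on the SoC's embedded processor.
#include <systemc.h>

SC_MODULE(Accumulator) {
    sc_in<bool> clk;
    sc_in<int>  data_in;
    sc_out<int> sum_out;

    int sum = 0;

    void step() {                    // one accumulate per rising clock edge
        sum += data_in.read();
        sum_out.write(sum);
    }

    SC_CTOR(Accumulator) {
        SC_METHOD(step);
        sensitive << clk.pos();
    }
};

int sc_main(int, char*[]) {
    sc_clock clk("clk", 10, SC_NS);  // 10 ns clock period
    sc_signal<int> din, dout;

    Accumulator acc("acc");
    acc.clk(clk);
    acc.data_in(din);
    acc.sum_out(dout);

    din.write(3);
    sc_start(100, SC_NS);            // simulate a handful of cycles
    return 0;
}
```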

HLS

“High-level synthesis (HLS) does abstract things, but you cannot just download some software and run it through HLS to get hardware,” says Schirrmeister. “You have to write the input in a way that it is predictable, and you know that it is going to be mapped into hardware.”

  • The tool performs automatic and guided code refactoring to make the code synthesizable by HLS compilers.
  • The code is then analyzed to find parallelism, and HLS pragmas are inserted automatically to guide the compiler on how to implement the functions in hardware (a sketch of such pragmas follows this list).
  • This approach can significantly reduce the barriers to using HLS and optimize the performance of the IP.
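
To make the pragma step concrete, here is a hedged sketch of C++ written for an HLS flow. The pragma spelling follows Vitis/Vivado HLS conventions; other HLS compilers accept equivalent hints through different directives, and the function itself is only an example.

```cpp
// Hedged sketch: C++ restructured for HLS rather than for a CPU.
// Pragma syntax follows Vitis/Vivado HLS conventions; names are illustrative.
#include <cstdint>

constexpr int N = 64;

// Sum-of-products kernel, written so an HLS compiler can map it to hardware.
void fir_like(const int32_t in[N], const int32_t coeff[N], int32_t &out) {
    int32_t acc = 0;
    for (int i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1   // request one multiply-accumulate per clock cycle
        acc += in[i] * coeff[i];
    }
    out = acc;
}
```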

Change through necessity

Two things could cause a breakthrough — necessity and/or opportunity.

“The pressure is there, be it from extremely efficient artificial intelligence processors for L4/L5 autonomous driving, from ultra-low power designs for distributed sensor networks, or from ultra-low latency communications for the tactile Internet,” says Fraunhofer’s Jancke. “Maybe the breakthrough is indeed accelerated if open-source hardware architectures, like RISC-V, take the step from academia to industry, and complementary open-source design tools are coming of age.”

Industry pressures are changing, too. “Turnaround times are beginning to drive this more and more,” says Achronix’s Nijssen. “Systems are reaching the scale where partitioning could be done by hand, and you could probably do a better job, but you don’t have the time or the people to do it. So you pay a price in terms of efficiency and let the tools give you a result much faster.”

Technology may force change, as well. “We are seeing the breaking of the existing methodology,” says Mentor’s Klein. “In the past, algorithms developed as software would be deployed on embedded systems as software running on the embedded processors. If they did not run fast enough, the answer was simply to get a faster processor or more of them, and processor vendors obliged with ever-bigger, faster cores in larger clusters. What we see now, especially in the inferencing space on convolutional neural networks, is that they simply cannot run anywhere near fast enough with software. These algorithms must be implemented, at least partially, in hardware. This is new territory for a lot of developers.”

Today, the software already exists, and the world needs better platforms to run it on. We have moved to a completely different place: teams start from the software and try to build a hardware fabric that executes it efficiently in parallel. This has created a resurgence in architecture.

New opportunities

Several new technologies are creating opportunities, either because they change the status quo or because they advance what we can do with existing technologies.

The RISC-V ISA

RISC-V provides an open specification from which implementations can be generated.

This represents an interesting co-design opportunity. “Moving functionality back and forth between hardware and software is not trivial and is done infrequently,” says OneSpin’s Anderson. “RISC-V may change this situation a bit, since the specification allows custom instructions to be added to the processor. The decision whether to implement a new function in hardware potentially can be made later in the project. This puts pressure on the verification process, since the new features must be verified and the baseline functionality must be proven to be unbroken. Any RISC-V verification solution must be flexible enough to accommodate such extensions.”
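
As a hedged illustration of what adding a custom instruction looks like from the software side, the sketch below (assuming a RISC-V GCC/Clang toolchain) uses the GNU assembler's .insn directive to emit an R-type instruction in the custom-0 opcode space. The encoding and the multiply semantics are invented for illustration, and a core without the extension would raise an illegal-instruction exception on it.

```cpp
// Hedged sketch: calling a hypothetical custom RISC-V instruction from C++.
// The custom-0 opcode (0x0b), funct3/funct7 values, and the "multiply"
// semantics are assumptions for illustration only.
#include <cstdint>

static inline int64_t custom_mul(int64_t a, int64_t b) {
    int64_t rd;
    // .insn r <opcode>, <funct3>, <funct7>, rd, rs1, rs2
    asm volatile(".insn r 0x0b, 0x0, 0x00, %0, %1, %2"
                 : "=r"(rd)
                 : "r"(a), "r"(b));
    return rd;
}

// The same function could fall back to plain software if the decision is made
// late in the project to keep this operation out of hardware.
int64_t dot3(const int64_t x[3], const int64_t y[3]) {
    int64_t acc = 0;
    for (int i = 0; i < 3; ++i)
        acc += custom_mul(x[i], y[i]);
    return acc;
}
```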

The openness of the RISC-V architecture makes it a fertile ground for innovative approaches to ISA optimizations.

There are multiple dimensions to that possibility.

  • First, in the individual instructions;
  • second, in the fabrics and the way in which they are laid out with different processors;
  • third, in new memory hierarchies. Traditionally, you had your L1, L2 and L3 cache and that seemed to work well. But with the new fabrics that is no longer the right structure and people need to innovate with them.

A different space

When you start to question one architectural choice, others also come into the picture. “You will see demand for much more fine-grained co-processing,” predicts Nijssen. “That also brings up questions of cache coherency, because the more fine-grained it becomes, the more control you need to manage the memory resources so that you do not have any conflicts or inconsistencies in how the memory gets transferred.”

“What we found in the RISC-V world is that people are not building sophisticated out-of-order pipelines, they are not speculatively executing—they are building 64-bit in-order relatively simple pipelines and not a lot of superscalar,” says Davidmann. “We can do quite good timing analysis on this. We can do performance analysis using estimation about the way software is running on a single processor, and then allow you to add instructions and put timing for those in and help people tune the way in which the algorithm runs on the hardware. This is very different from the past, when people expected cycle-accurate timing.”
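
A toy version of that kind of estimation, not any vendor's tool and with made-up counts and per-instruction costs, might look like this:

```cpp
// Hedged sketch: instruction-level performance estimation for a simple
// in-order core. Multiply dynamic instruction counts by estimated cycle
// costs instead of running a cycle-accurate model. All numbers are invented.
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <utility>

int main() {
    // instruction class -> {dynamic count, estimated cycles per instruction}
    std::map<std::string, std::pair<uint64_t, double>> profile = {
        {"alu",        {8'000'000, 1.0}},
        {"load/store", {3'000'000, 2.0}},
        {"mul",        {1'000'000, 3.0}},
        {"custom_op",  {  500'000, 1.0}},  // candidate added instruction
    };

    double cycles = 0;
    for (const auto& entry : profile) {
        const auto& [count, cpi] = entry.second;
        cycles += static_cast<double>(count) * cpi;
    }

    std::printf("estimated cycles: %.0f\n", cycles);
    std::printf("estimated time at 1 GHz: %.3f ms\n", cycles / 1e9 * 1e3);
    return 0;
}
```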

ESL & DSA

Domain-Specific Computing

The steps for a DSA:

  • HW architecture design
    • instructions
    • micro-architectures
    • read/load protocol
    • IO
  • A toolchain to support SW development and debugging (a sketch of this layer follows the list)
    • A domain-specific language (DSL) to describe the applications and algorithms
    • compilers
    • OS
    • drivers, runtime and libraries
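
As a hedged sketch of the toolchain half of that list, the example below shows a tiny embedded DSL whose operations are lowered by a toy compiler into commands for a hypothetical accelerator; every instruction name and interface in it is invented.

```cpp
// Hedged sketch: the software stack of a DSA in miniature. A small embedded
// DSL records domain operations, and a toy "compiler" lowers them to commands
// for a hypothetical accelerator. Everything here is illustrative.
#include <cstdio>
#include <string>
#include <vector>

struct Op { std::string name; int dst, a, b; };

// Embedded DSL: the application is described as a sequence of domain ops.
struct Program {
    std::vector<Op> ops;
    int next_reg = 0;
    int matmul(int a, int b) { ops.push_back({"MATMUL", ++next_reg, a, b}); return next_reg; }
    int relu(int a)          { ops.push_back({"RELU",   ++next_reg, a, -1}); return next_reg; }
};

// Compiler/runtime boundary: lower each DSL op to an accelerator command that
// a driver would eventually hand to the hardware.
void compile_and_emit(const Program& p) {
    for (const auto& op : p.ops)
        std::printf("ACC_ISSUE %-6s dst=r%d src=r%d,r%d\n",
                    op.name.c_str(), op.dst, op.a, op.b);
}

int main() {
    Program p;
    int x = p.matmul(/*input=*/1, /*weights=*/2);  // two layers of a tiny model
    p.relu(x);
    compile_and_emit(p);
    return 0;
}
```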

ESL

ESL is commonly defined as “the utilization of appropriate abstractions in order to increase comprehension about a system, and to enhance the probability of a successful implementation of functionality in a cost-effective manner.”

Gajski-Kuhn Y-chart: three design domains (behavioral, structural, physical), each spanning several abstraction levels (system, algorithmic, register-transfer, logic, circuit).

The problem is that ESL tool vendors all claim the same set of advantages for their users. These may include:

  • System-level design
  • HW/SW co-design
  • Architecture exploration
  • Virtual prototyping
  • Co-simulation/co-verification
  • HLS