Communication Scheduling and Buslet-based Design: New Paradigms for High Level Synthesis

2015-10-21T00:00:00Z (GMT) by Enzo Tartaglione
Current nanoscale designs are highly interconnect-dominated, with interconnects taking about 70% of the chip area. Interconnects also consume a significant part of the dynamic power and are responsible for about 60% of signal delays. It is thus important to be able to synthesize designs with much lower interconnect complexity than is possible with current high-level synthesis (HLS) tools and algorithms. Toward that end, we have developed the following new paradigms for the scheduling, binding and general architecture synthesis problems of HLS:

• Flexibly-structured buslets that connect a few neighboring functional units (FUs) instead of dedicated interconnects between pairs of FUs, thereby sharing interconnects among a number of FU pairs that need to communicate.

• Communication scheduling (followed by standard operation scheduling that respects the communication schedules), in which communications between FUs are scheduled at appropriate times to minimize the number of buslets needed, subject to buslet cardinality constraints (for the purpose of upper-bounding signal delay).

• Buslet binding techniques that aim to respect both the buslet cardinality constraint and a constraint on the maximum fan-in and fan-out of the functional units. These techniques range from simple but effective approaches such as chronological binding (CB) to more sophisticated ones, such as lookahead approaches and simultaneous binding of iso-scheduled communications (communications scheduled in the same clock cycle). In the same direction, a mechanism for detecting similar solutions was developed in order to improve the final quality of the result. Finally, a force-directed approach to the binding problem (FDB) was also used. All these techniques were implemented and compared in terms of both performance and complexity.

• Buslet power modeling. A number of configurations with multiple tri-state buffers for interconnecting FUs through a buslet were implemented, aiming to minimize the total power consumed when using buslets. These range from techniques using minimum spanning trees, to more sophisticated structures with constraints on the maximum graph distance between connected FUs, to hierarchical partitioning.

Using the aforementioned techniques, we obtain a significant wirelength (WL) reduction, ranging between 35% and 71%, compared to conventional designs with dedicated interconnects between communicating FU pairs. The total chip area, including total FU area, is also lower in our designs than in conventional designs. Power, on the other hand, increases with buslet size, but sublinearly. Empirical results show that we are able to limit the increase in power consumed by buslets, compared to dedicated-interconnect designs, to a logarithmic function of the maximum buslet cardinality.
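To make the idea of chronological binding (CB) more concrete, the following is a minimal sketch of one way such a binding could work. It is not the algorithm from the thesis. It assumes that communications have already been scheduled, that each is given as a hypothetical `(cycle, src_fu, dst_fu)` tuple, and that a buslet can carry one communication per clock cycle. The names `Buslet`, `chronological_binding` and `max_cardinality` are illustrative, and the fan-in and fan-out constraints are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Buslet:
    """Hypothetical buslet record used only for this sketch."""
    fus: set = field(default_factory=set)          # FUs attached to the buslet
    busy_cycles: set = field(default_factory=set)  # cycles in which it carries a transfer


def chronological_binding(comms, max_cardinality):
    """Greedy first-fit binding of scheduled communications to buslets.

    comms: iterable of distinct (cycle, src_fu, dst_fu) tuples (assumed format).
    max_cardinality: maximum number of distinct FUs attached to one buslet.
    Returns (buslets, binding), where binding maps each communication
    to the index of the buslet that carries it.
    """
    if max_cardinality < 2:
        raise ValueError("a buslet must be able to connect at least two FUs")

    buslets = []
    binding = {}
    # Process communications in chronological (clock cycle) order.
    for comm in sorted(comms):
        cycle, src, dst = comm
        for idx, buslet in enumerate(buslets):
            if cycle in buslet.busy_cycles:
                continue  # assumption: one transfer per buslet per cycle
            if len(buslet.fus | {src, dst}) > max_cardinality:
                continue  # attaching these FUs would exceed the cardinality bound
            break  # first buslet that fits
        else:
            # No existing buslet fits, so open a new one.
            buslet = Buslet()
            buslets.append(buslet)
            idx = len(buslets) - 1
        buslet.fus.update((src, dst))
        buslet.busy_cycles.add(cycle)
        binding[comm] = idx
    return buslets, binding


# Four scheduled communications among four FUs (made-up example data).
example = [(1, "A", "B"), (1, "C", "D"), (2, "B", "C"), (3, "A", "C")]
buslets, binding = chronological_binding(example, max_cardinality=3)
print(len(buslets), binding)
```

In this made-up example, dedicated interconnects would need one link for each of the four communicating FU pairs, whereas the sketch binds the four communications to two buslets when up to three FUs may share a buslet. Tightening `max_cardinality` to 2 removes all sharing and falls back to one buslet per pair, which illustrates the trade-off between interconnect count and buslet size described above.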