Q. If you are going to do something many times, is it always more efficient to invest heavily in planning so as to get rapid execution?
A1. Following that approach led Intel to the Itanium VLIW processor. But has that sunk?
A2. Classical HLS has saved a lot of energy through static schedules. (Have you tried compressing MPEG on your laptop and then wondered how your mobile manages it so easily?)
Even without data-dependent control flow, variable-latency operations are incompatible with a completely static schedule. Keeping a large system in global synchronisation is bound to miss opportunities that are available locally, but overly fine-grained dynamic scheduling carries a lot of management overhead.
The Kiwi HLS flow uses classical HLS on each thread in turn to generate a static schedule for that thread, but the threads interact dynamically, e.g. using FIFO queues between components.
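As a software analogy only (this is not Kiwi output, and all names here are hypothetical), the idea of locally fixed schedules coupled by a FIFO can be sketched with two Python threads. Each thread repeats the same fixed sequence of steps, and the bounded queue supplies the dynamic coupling: a full queue stalls the producer and an empty one stalls the consumer.

```python
import queue
import threading

def producer(out_fifo, n):
    # Fixed local schedule: compute, then enqueue, every iteration.
    for i in range(n):
        x = i * i
        out_fifo.put(x)        # blocks while the FIFO is full (back-pressure)
    out_fifo.put(None)         # end-of-stream marker (an assumption of this sketch)

def consumer(in_fifo, results):
    # Fixed local schedule: dequeue, then compute, every iteration.
    while True:
        x = in_fifo.get()      # blocks while the FIFO is empty
        if x is None:
            break
        results.append(x + 1)

def run_pipeline(n, depth=2):
    fifo = queue.Queue(maxsize=depth)   # bounded FIFO between the two components
    results = []
    t1 = threading.Thread(target=producer, args=(fifo, n))
    t2 = threading.Thread(target=consumer, args=(fifo, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results
```

Neither thread needs to know the other's timing; only the FIFO depth affects how far the two may drift apart.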
Combined with a server farm, we get localised static schedules and global dynamics.
A static computational graph takes less management than approaches that dynamically map work to processing nodes.
The work-stealing scheduling approach is widely used for dynamic scheduling of a workload that is already statically divided into parallel tasks whose execution times may vary dynamically. (See Wikipedia.)
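A minimal sketch of work stealing, written as a unit-time simulation rather than real threads so that the behaviour is repeatable. The victim-selection rule (steal from the longest queue) and the function name are assumptions of this sketch; real schedulers often pick victims at random.

```python
from collections import deque

def work_stealing(task_lists):
    """Simulate work stealing in unit time steps.

    task_lists[w] holds the (positive) task durations statically assigned
    to worker w. Returns (makespan, number_of_steals).
    """
    queues = [deque(ts) for ts in task_lists]
    remaining = [0] * len(queues)   # time left on each worker's current task
    time = 0
    steals = 0
    while any(queues) or any(remaining):
        for w in range(len(queues)):
            if remaining[w] == 0:
                if queues[w]:
                    # Owner takes work from its own end of its queue.
                    remaining[w] = queues[w].pop()
                else:
                    # Idle worker steals from the opposite end of the longest queue.
                    victim = max(range(len(queues)), key=lambda v: len(queues[v]))
                    if queues[victim]:
                        remaining[w] = queues[victim].popleft()
                        steals += 1
        for w in range(len(queues)):
            if remaining[w] > 0:
                remaining[w] -= 1
        time += 1
    return time, steals
```

With the static division `[[3, 3, 3, 3], [1]]`, worker 0 alone would need 12 time units; in this simulation the second worker steals twice and the makespan falls to 7.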
A hardware server can be shared (contended for) by multiple clients. For example, Bluespec's rich library contains a Completion Buffer and other flexible structures for easy creation of pools of servers for dynamic load sharing.
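The role of a completion buffer can be illustrated with a small software model. This is not Bluespec code or the Bluespec library API; the class and method names are chosen for this sketch. A client reserves a token in issue order, servers in the pool may complete in any order, and results are drained strictly in issue order.

```python
class CompletionBuffer:
    """Software model: reserve in order, complete in any order, drain in order."""

    def __init__(self, size):
        self.size = size
        self.results = {}     # token -> result, present only once completed
        self.next_token = 0   # next token to hand out
        self.head = 0         # oldest token not yet drained

    def reserve(self):
        # Returns a token, or None if the buffer is full and the client must stall.
        if self.next_token - self.head >= self.size:
            return None
        token = self.next_token
        self.next_token += 1
        return token

    def complete(self, token, result):
        # Called by whichever server finishes; order is unconstrained.
        self.results[token] = result

    def drain(self):
        # Release results in issue order, stopping at the first one still in flight.
        out = []
        while self.head in self.results:
            out.append(self.results.pop(self.head))
            self.head += 1
        return out
```

For example, with a two-entry buffer, completing the second request first yields nothing from `drain()` until the first request also completes, at which point both results emerge in their original order.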
Alternatively, current research (Ali Zaidi) discards the input control flow and compiles locally to dataflow hardware: the VSFG-S approach. KiwiC should have a plugin for this soon...
|38: (C) 2012-18, DJ Greaves, University of Cambridge, Computer Laboratory.|