Parallelism: The key to high performance.
Transistors are abundant, and deploying a lot of hardware is not in itself a problem (we discussed the Power Wall elsewhere).
Three forms of parallel speed-up are well-known for classical imperative parallel programming:
- Task-Level Parallelism: partition the input data over nodes and run the same program on each node, with no inter-node communication (aka embarrassingly parallel).
- Programmer-defined, Thread-Level Parallelism: the programmer uses constructs such as POSIX pthreads or the C# Parallel.For method to explicitly mark local regions of concurrent activity, which typically communicate through shared variables.
- Instruction-Level Parallelism: the imperative program (or a local region of it) is converted to dataflow form, in which all ALU operations can potentially run in parallel, but operands remain prerequisites of their results, and load/store operations on a given mutable object must respect program order.
A major (yet sadly less popular) alternative to thread-level parallelism is programmer-defined channel-based communication, which bans mutable shared variables (examples: Erlang, Occam, Handel-C, Kahn process networks).