Load Sharing

There are two aspects of load sharing: when to migrate processes to idle nodes and how to find which nodes are idle. The former is simple to decide; if the computational capacity of the node is exceeded, then the load sharing algorithm attempts to migrate processes elsewhere. The latter is rather more involved, and there are various techniques:

Centralised approaches: A central node collects load statistics and details of processes and decides which should migrate to where at what time. As with all centralised approaches this is subject to three major drawbacks: the state information is always out of date, it is a potential bottleneck, and it is a single point of failure. This is not as critical as for some other distributed systems tasks, since load sharing is not a correctness criterion (so long as we have no timeliness constraints), merely a method of performance enhancement. In addition, techniques have been proposed for monitoring the state of servers and recovering them to other nodes in event of failure [#LamANAFI##1#].
Distributed approaches: Workstations exchange information and cooperate in deciding which processes to move. Since there is true concurrency in distributed systems, there is a need for synchronisation between different nodes. In view of the additional communications overhead and suboptimality this is less efficient than the centralised approach. However, it is rather more robust.

In the distributed approach, various different techniques exist for finding idle processors. Amongst these are [#GoscinskiDOSTL##1#]:

A logical hierarchy of processors, as in MICROS [#WittieMADOS##1#].
A logical ring of processors.
No particular structure, as in Worms [#ShochTWPEE##1#]
The Condor system [#LitzkowCAHOI##1#]