Department of Computer Science and Technology

Technical reports

Exploiting tightly-coupled cores

Daniel Bates

January 2014, 162 pages

This technical report is based on a dissertation submitted July 2013 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Robinson College.

DOI: 10.48456/tr-846


As we move steadily through the multicore era, and the number of processing cores on each chip continues to rise, parallel computation becomes increasingly important. However, parallelising an application is often difficult because of dependencies between different regions of code which require cores to communicate. Communication is usually slow compared to computation, and so restricts the opportunities for profitable parallelisation. In this work, I explore the opportunities provided when communication between cores has a very low latency and low energy cost. I observe that there are many different ways in which multiple cores can be used to execute a program, allowing more parallelism to be exploited in more situations, and also providing energy savings in some cases. Individual cores can be made very simple and efficient because they do not need to exploit parallelism internally. The communication patterns between cores can be updated frequently to reflect the parallelism available at the time, allowing better utilisation than specialised hardware which is used infrequently.

In this dissertation I introduce Loki: a homogeneous, tiled architecture made up of many simple, tightly-coupled cores. I demonstrate the benefits in both performance and energy consumption which can be achieved with this arrangement and observe that it is also likely to have lower design and validation costs and be easier to optimise. I then determine exactly where the performance bottlenecks of the design are, and where the energy is consumed, and look into some more-advanced optimisations which can make parallelism even more profitable.

Full text

PDF (1.5 MB)

BibTeX record

@TechReport{UCAM-CL-TR-846,
  author =      {Bates, Daniel},
  title =       {{Exploiting tightly-coupled cores}},
  year =        2014,
  month =       jan,
  url =         {},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-846},
  number =      {UCAM-CL-TR-846}
}