Computer Laboratory

ACS Project Suggestions from Simon Moore

Background

The research I am conducting in the Computer Architecture Group focuses on prototyping multiprocessor systems on FPGAs in order to do architectural evaluation on non-trivial benchmarks. We are currently completing a 64-bit MIT style processor with multithreaded extensions. We are also working on a project with Prof Steve Furber in Manchester looking at million processor systems.

Note that these projects are aimed at those who might like to continue to undertake research into computer architecture for their Ph.D.


Multithreaded ARM style processor in Bluespec

Research question: in order to undertake research on multiprocessors whilst using little hardware, can we time multiplex an ARM processor pipeline so that it emulates N processors running at 1/Nth of the speed (e.g. where N might be from 8 to 1024)?

Suggested approach: following on from the Advanced Computer Design course, design an ARM style processor in Bluespec SystemVerilog. To simplify the design, instructions should be executed sequentially, i.e. each context has only one instruction in flight down the pipeline at any one time. But to improve performance a number (e.g. 128) of contexts can be scheduled in a round-robin manner. Provided there are more contexts than pipeline stages the pipeline will remain full. The ARMv3 instruction set variant is to be targeted since it is believed to be out of patent (or nearly out of patent) but still has good compiler support. Initially the simpler single-cycle instructions should be implemented, with more complex instructions (e.g. load and store multiple) added as a possible extension.
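The claim that the pipeline stays full when there are more contexts than stages can be checked with a small behavioural model. The sketch below is written in Python rather than Bluespec and is only an illustration: the names simulate and utilisation are hypothetical, every instruction is assumed to take exactly one cycle per stage, and a context is assumed to become eligible to issue again in the cycle after its instruction leaves the last stage.

```python
from collections import deque


def simulate(num_contexts, num_stages, cycles):
    """Return how many instructions each context issued in `cycles` cycles."""
    # pipeline[i] is the context occupying stage i, or None for a bubble.
    pipeline = deque([None] * num_stages, maxlen=num_stages)
    ready = deque(range(num_contexts))  # contexts with no instruction in flight
    issued = [0] * num_contexts

    for _ in range(cycles):
        # Issue from the context that has waited longest, else insert a bubble.
        ctx = ready.popleft() if ready else None
        if ctx is not None:
            issued[ctx] += 1

        # Advance the pipeline: the entry in the last stage retires.
        retiring = pipeline[-1]
        pipeline.appendleft(ctx)  # maxlen drops the retired entry
        if retiring is not None:
            ready.append(retiring)  # assumption: eligible again next cycle

    return issued


def utilisation(num_contexts, num_stages, cycles):
    """Fraction of cycles in which some context issued an instruction."""
    return sum(simulate(num_contexts, num_stages, cycles)) / cycles
```

Under these assumptions, a 5 stage pipeline with 8 contexts issues an instruction every cycle and each context receives an equal share of the issue slots, whereas with only 5 contexts the model inserts one bubble every six cycles. A real design may differ, e.g. if results are forwarded so that a context can re-enter the pipeline earlier.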

Possible extensions include adding a timing model to the ARM emulation so that it can report cycle times which reflect the time a conventional ARM processor pipeline would take.


Operating system emulation for FPGA based experimental processors

Research question: can we emulate an operating system (OS) for novel processors prototyped on FPGA so that we do not need to port an OS for every variant of the processor?

Further motivation: One of the problems with undertaking research on novel computer architectures is that you don't really want to implement a full operating system on a novel processor, and yet many benchmarks will only run with operating system support. However, it is often the case that OS support is only required at the beginning and end of the benchmark and is not of particular interest in the performance critical middle phase of execution. So provided some mechanism can supply OS support, it doesn't matter if it is slow.

Approach: This project aims to provide operating system stubs to allow many benchmarks to run. This is analogous to the often used computer architecture simulation technique of providing operating system stubs which "magically" provide operating system support. One approach is to replace the base C library (libc) with one which just has stubs, which in turn cause the simulator to handle the OS calls via the OS that it sits on top of. We need something analogous for our FPGA prototypes.

Proposed infrastructure: For this project a prototype system will be implemented on FPGA using a dual-NIOS (soft core from Altera) system (an example of which can be provided by the proposer of this project). The processors can easily be provided with a set of FIFO communication channels for efficient communication. One processor (the "client") will be designated as the processor being "simulated" with a view to replacing this with something more exotic/experimental (e.g. the Mamba multithreaded processor we are using for research) in the future. The other processor, the "server", will provide operating system support and will locally run the MicroC/OS-II RTOS provided by Altera for the NIOS. The client processor will run a newlib (cut down libc) stub library (written by this project) which turns operating system calls into remote procedure calls to the "server" processor. The server processor will then handle the calls locally or will, optionally, talk to a host PC to fulfil the request (e.g. to allow remote file access).
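The client/server split described above can be sketched as a behavioural model. This is a minimal illustration in Python, not target C code for the NIOS: the class names, the SYS_WRITE call number and the (call, fd, payload) message layout are all hypothetical, the FIFOs are modelled as in-memory queues, and file descriptor 1 is modelled as an in-memory stream owned by the server.

```python
from collections import deque
import io

SYS_WRITE = 1  # hypothetical call number; the real encoding is a design choice


class Channel:
    """Models a pair of hardware FIFOs between the two processors."""

    def __init__(self):
        self.requests = deque()  # client -> server
        self.replies = deque()   # server -> client


class Server:
    """Stands in for the "server" processor, which owns the OS resources."""

    def __init__(self, channel):
        self.channel = channel
        self.files = {1: io.BytesIO()}  # assumption: fd 1 is an in-memory stream

    def serve_one(self):
        call, fd, payload = self.channel.requests.popleft()
        stream = self.files.get(fd)
        if call == SYS_WRITE and stream is not None:
            result = stream.write(payload)  # number of bytes written
        else:
            result = -1  # unsupported call or unknown descriptor
        self.channel.replies.append(result)


class ClientStubs:
    """Stands in for the stub library linked into the client's benchmark."""

    def __init__(self, channel, server):
        self.channel = channel
        self.server = server

    def write(self, fd, data):
        # Marshal the call into the request FIFO.
        self.channel.requests.append((SYS_WRITE, fd, bytes(data)))
        # In hardware the server runs concurrently and the client would
        # block on the reply FIFO; here the server is stepped explicitly.
        self.server.serve_one()
        return self.channel.replies.popleft()
```

A benchmark calling write(1, b"hello\n") through the stub would receive 6 as its return value, with the bytes ending up in the server's stream. Forwarding a request to a host PC would, in this model, simply replace the body of the server's handler.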

Possible extensions include adding profiling so that statistics on the OS calls can be obtained, and adding timing information, e.g. the "simulation"/"client" processor can be delayed by a fixed number of clock cycles to allow it to be "billed" for time spent in the OS.
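The profiling and billing extension could be modelled along the following lines. This is a hedged sketch in Python with hypothetical names (OsCallAccounting, record); it assumes a single fixed charge per OS call, as suggested above, and leaves open how the stall would actually be applied to the client processor.

```python
from collections import Counter


class OsCallAccounting:
    """Counts OS calls and bills the client a fixed number of cycles each."""

    def __init__(self, cycles_per_call):
        self.cycles_per_call = cycles_per_call  # assumed fixed charge
        self.calls = Counter()                  # per-call statistics
        self.billed_cycles = 0                  # total time charged so far

    def record(self, call_name):
        """Log one OS call and return the number of cycles to stall the client."""
        self.calls[call_name] += 1
        self.billed_cycles += self.cycles_per_call
        return self.cycles_per_call
```

For example, with a charge of 200 cycles per call, two writes and one open would be billed 600 cycles in total, and the calls counter would report which calls dominate.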