Project suggestions from David Greaves.

Part II Project Suggestion(s)

Random Instruction Sequence Generator

The OpenRISC processor is an open source family of CPU cores and SoCs with a GNU C compiler and GNU toolchain. Currently there are various RTL models in Verilog and a fast instruction set simulator (ISS) written in C. The two are tested using both hand-crafted and compiler-generated sequences of instructions. However, testing with random sequences of instructions has not been done, despite this being a known good means of finding bugs.

In this project you will generate random sequences of valid instructions. You will take one of the simulators for OpenRISC, such as the SystemC simulator from Greaves+Pusovnik, which contains both the RTL model and the fast ISS, and run the sequences on it. You will measure and predict the 'fault coverage' you have achieved. You will also find one or two real bugs in the OpenRISC implementation, a useful contribution since this core is now being used in real projects (e.g. on the International Space Station).
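As a starting point, here is a minimal sketch of a seeded generator for well-formed instruction words. It covers only four immediate-format ALU instructions, and the opcode values and field layout are assumptions that should be checked against the OpenRISC 1000 architecture manual. The names `Op`, `encode` and `random_sequence` are made up for this illustration. Register r0 is never used as a destination, on the assumption that software treats it as zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Illustrative subset of OR1K immediate-format ALU instructions.
// Opcode values are assumptions: verify them against the architecture manual.
enum class Op : std::uint32_t {
    AddI = 0x27,  // l.addi rD, rA, imm16
    AndI = 0x29,  // l.andi rD, rA, imm16
    OrI  = 0x2a,  // l.ori  rD, rA, imm16
    XorI = 0x2b,  // l.xori rD, rA, imm16
};

// Pack one word as opcode[31:26] rD[25:21] rA[20:16] imm[15:0].
std::uint32_t encode(Op op, unsigned rd, unsigned ra, std::uint16_t imm) {
    return (static_cast<std::uint32_t>(op) << 26) | ((rd & 31u) << 21) |
           ((ra & 31u) << 16) | imm;
}

// Generate n random instruction words. The same seed always reproduces
// the same sequence, so a failing test can be replayed.
std::vector<std::uint32_t> random_sequence(std::size_t n, std::uint32_t seed) {
    static const Op ops[] = {Op::AddI, Op::AndI, Op::OrI, Op::XorI};
    std::mt19937 rng(seed);
    std::uniform_int_distribution<unsigned> pick_op(0, 3);
    std::uniform_int_distribution<unsigned> pick_rd(1, 31);  // never r0
    std::uniform_int_distribution<unsigned> pick_ra(0, 31);
    std::uniform_int_distribution<unsigned> pick_imm(0, 0xffffu);

    std::vector<std::uint32_t> words;
    words.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        // Draw the fields in separate statements so the order is fixed.
        const Op op = ops[pick_op(rng)];
        const unsigned rd = pick_rd(rng);
        const unsigned ra = pick_ra(rng);
        const std::uint16_t imm = static_cast<std::uint16_t>(pick_imm(rng));
        words.push_back(encode(op, rd, ra, imm));
    }
    return words;
}
```

The same word list would be loaded into both the RTL model and the ISS, and the register files compared afterwards. A real generator also needs loads, stores and branches with constrained targets, which is where most of the work lies.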

CPU Energy Use Logging

The OpenRISC processor is an open source family of CPU cores and SoCs with a GNU C compiler and GNU toolchain. It is available in Verilog RTL and other forms. The Verilog can be converted to C++ using a free program called Verilator, and the resulting C++ uses assignment macros for each update. (Verilator is an open source tool similar to the commercial tool from Carbon Design Systems, which already has some energy use logging.)

Energy use in processors depends greatly on the number of bits that change value at each clock event. The first step of the project is to replace the macros with assignment functions that log the number of bits that have changed. The resulting numbers can be logged by the TLM POWER3 library.

The aim is then to understand how different application programs cause different patterns of energy use in the different parts of the processor. An interesting aspect for exploration is how frequently the bit-level activity needs to be observed to get an accurate measurement of energy use, and whether results of similar accuracy can be obtained from suitable annotations to the high-level instruction set simulator (ISS) for the OpenRISC.
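The bit-change logging described above amounts to taking the Hamming distance between the old and new value of each signal. A minimal sketch follows, assuming 32-bit signals. `ToggleCounter` and `assign` are hypothetical names, and the actual macro names and signal types in Verilator's output are not reproduced here.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical per-module activity counter; one instance might be kept
// for each part of the processor whose energy use is of interest.
struct ToggleCounter {
    std::uint64_t toggles = 0;  // total number of bit flips seen so far

    // Assign `value` to `reg`, first adding the number of bits that
    // differ between the old and new contents (XOR, then popcount).
    void assign(std::uint32_t& reg, std::uint32_t value) {
        toggles += std::bitset<32>(reg ^ value).count();
        reg = value;
    }
};
```

For example, moving a register from 0x0F to 0xF0 flips eight bits, while rewriting the same value adds nothing. The counter could be sampled every clock event or only every N cycles, which is one way to explore how often the activity really needs to be observed.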

Parallel SystemC Implementation

The free SystemC simulation library provides C++ threads to component models. However, all of these threads run on a single CPU core of the hosting workstation, which is no longer ideal given the prevalence of multicore CPUs. Coding styles used in C++ tend to assume non-reentrant, non-preemptive scheduling.

The project is to take the free simulation kernel and make it use multiple processor-level threads (using, say, POSIX pthreads) and then to look at the problems in user-level models that may arise from assumptions about the threading model.
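To illustrate the kind of problem meant here: a model that updates shared state with a plain read-modify-write is safe under cooperative scheduling but loses updates once its processes run on real threads. The sketch below shows one possible repair using a lock. It uses `std::thread` and `std::mutex` from the C++ standard library in place of raw pthreads calls, and `BusStats` and `run_initiators` are hypothetical names, not part of SystemC.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared model state, e.g. a bus transaction counter that
// several initiator processes update.
struct BusStats {
    long transactions = 0;
    std::mutex lock;

    // Without the lock, concurrent callers could interleave the
    // read-modify-write of ++transactions and lose counts.
    void record() {
        std::lock_guard<std::mutex> guard(lock);
        ++transactions;
    }
};

// Run n_threads initiators, each recording per_thread transactions,
// and return the final count.
long run_initiators(unsigned n_threads, long per_thread) {
    BusStats stats;
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < n_threads; ++i) {
        pool.emplace_back([&stats, per_thread] {
            for (long k = 0; k < per_thread; ++k) stats.record();
        });
    }
    for (std::thread& t : pool) t.join();
    return stats.transactions;
}
```

With the lock in place, `run_initiators(4, 10000)` returns 40000. Counting how many such repairs an existing code base needs is one way to approach the evaluation.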

Evaluation can be in terms of how much speedup is achieved per additional core and what percentage of some existing SystemC code bases needed modification for truly parallel execution.

Parallel Verilator Implementation

This one is perhaps too complex for Part II and should be an ACS PROJECT.

The OpenRISC processor is an open source family of CPU cores and SoCs with a GNU C compiler and GNU toolchain. It is available in Verilog RTL and other forms. The Verilog can be converted to C++ using a free program called Verilator. However, Verilator generates models that only exploit one CPU core of today's multicore workstations.

The project is to see what style of cooperation between POSIX pthreads can support the fine-grain parallelism needed to make these models go faster. If the resulting hardware model 'clocks' at tens of kilohertz, the inter-thread communication will typically need to be an order of magnitude or two faster, meaning that spinning on shared variables is the best approach. The project will examine the metrics reported by 'oprofile' and similar tools and try to find an analytical explanation for any speedup gained by using multiple threads.
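As a rough illustration of spinning on shared variables, the sketch below synchronises two threads with a spin barrier, each thread owning one half of a tiny two-register design. It uses `std::thread` and `std::atomic` from the C++ standard library in place of raw pthreads calls. The design, the partitioning and the names `SpinBarrier`, `State` and `simulate` are all made up for this example and say nothing about how Verilator structures its output.

```cpp
#include <atomic>
#include <thread>

// Barrier that busy-waits on a shared generation counter rather than
// blocking in the kernel. Default (sequentially consistent) atomics are
// used for simplicity.
class SpinBarrier {
    const unsigned n_;
    std::atomic<unsigned> waiting_{0};
    std::atomic<unsigned> generation_{0};
public:
    explicit SpinBarrier(unsigned n) : n_(n) {}
    void wait() {
        const unsigned gen = generation_.load();
        if (waiting_.fetch_add(1) + 1 == n_) {
            waiting_.store(0);         // last arrival resets the count
            generation_.fetch_add(1);  // and releases the spinners
        } else {
            while (generation_.load() == gen) { /* spin */ }
        }
    }
};

struct State { unsigned a = 0, b = 0; };

// Toy design: on every clock edge, a <= b + 1 and b <= a.
// Thread 0 owns register a, thread 1 owns register b.
State simulate(unsigned cycles) {
    State cur;
    unsigned next_a = 0, next_b = 0;
    SpinBarrier barrier(2);
    std::thread t0([&] {
        for (unsigned c = 0; c < cycles; ++c) {
            next_a = cur.b + 1;  // evaluate using current values only
            barrier.wait();
            cur.a = next_a;      // commit once both have evaluated
            barrier.wait();
        }
    });
    std::thread t1([&] {
        for (unsigned c = 0; c < cycles; ++c) {
            next_b = cur.a;
            barrier.wait();
            cur.b = next_b;
            barrier.wait();
        }
    });
    t0.join();
    t1.join();
    return cur;
}
```

Two barrier crossings per simulated clock is the communication cost that has to stay well below the model's cycle time. Pure spinning only pays off when every thread has a core to itself, which is one of the effects the profiling should reveal.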

There is already some discussion on the Verilator IRC about this project. A fair amount of work would be involved in restructuring the output from Verilator to run on multiple cores - finding good static partitions and hoping they make a good dynamic partition or else using profile-directed feedback to refine the partitioning.

Algorithmic Energy on Multicore

Multicore computers pass cache update messages between the cores to maintain an accurate view of main memory. There is a view that main memory is now a cheap resource and that algorithms that write to each heap location only once are feasible, especially on multicore systems where evicting modified cache lines consumes more energy than using fresh memory that will never change ownership between cores.

In the past...

Previous Years' Suggestions.

ACS Project Suggestion(s)

Originator DJ Greaves

Parallel Verilator Implementation

See above.

Scheduler for Toy Bluespec RTL Compiler

There is a locally-written, toy Bluespec Verilog compiler on this LINK.

A basic parser has just been added but there are many details missing compared with the compiler from Bluespec Inc. The most important and interesting thing is the rule scheduler. The toy version currently just puts the rules in the priority order found in the source file.
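To make the source-order policy concrete, here is a small sketch of one scheduling step. It is written in C++ rather than the toy compiler's own language, and it is not taken from the toy compiler or from the Bluespec Inc compiler. `Rule` and `schedule_cycle` are hypothetical names, and the notion of conflict is simplified to two rules writing the same register. A real scheduler would also have to consider the ordering of reads and writes.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical rule description: a guard plus the registers it writes.
struct Rule {
    std::string name;
    std::function<bool()> guard;
    std::set<std::string> writes;
};

// Choose the rules that fire in one clock cycle. Rules are visited in
// source-file order, so an earlier rule has priority over a later one
// that wants to write the same register.
std::vector<std::string> schedule_cycle(const std::vector<Rule>& rules) {
    std::set<std::string> claimed;   // registers already written this cycle
    std::vector<std::string> fired;
    for (const Rule& r : rules) {
        if (!r.guard()) continue;    // guard false: rule cannot fire
        bool conflict = false;
        for (const std::string& reg : r.writes) {
            if (claimed.count(reg) != 0) { conflict = true; break; }
        }
        if (conflict) continue;      // blocked by a higher-priority rule
        claimed.insert(r.writes.begin(), r.writes.end());
        fired.push_back(r.name);
    }
    return fired;
}
```

With rules `inc` and `reset` both writing `count` and both enabled, only `inc` fires because it appears first. Designs where that fixed choice is wrong, or starves a rule, are the kind of example the project could collect.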

The toy compiler is written in F#.

The project would be to consider several basic design problems that can benefit from a good scheduler or that cannot be scheduled using the standard approach. Several small examples and one larger example would make a good basis. The next step is to make sure the toy compiler can compile these designs to some extent (one or two basic Bluespec features might need to be added to the compiler). Finally, explore the performance of the designs as scheduled in new and different ways.