P35: Ex 4 17/18: Mini-Project and Structured Research Essay.

The deadline for all P35 work is the first day of Easter Term.

Notes:

Please ensure you have completed the earlier exercises. Feel free to reuse text or results from earlier exercises for the Mini-Project (4a).

Collaboration is not allowed for the Research Essay. For the Mini-Project it is allowed only for parts borrowed from the term-time work, or with express permission, which will be granted only if the nature of the collaboration enables individual contributions to be clearly distinguished.

Your audience is the External Examiner, the Second Assessor and readers of Design and Reuse or Electronics Times. It is therefore worthwhile explaining material that would perhaps be well known to others directly involved in this module.

Please feel free to contact DJG as much as you like for assistance and advice with Exercise 4 A/B over the Easter Vac.

---------------------------------------------------------------------

Exercise 4a (accounts for 30 credit points):

Ex4 Part A: (Mini-Project): Construct an interesting argument based on practical work you have conducted using design tools for FPGA and System-on-Chip embedded software, accelerators and virtual platforms.

This will most likely contain your own personal, deep evaluation of the group mini-project (Exercise 3a/b) conducted during term, but you need to make perfectly clear what your own contribution to the work is, and any measurements must be your own work. You may also further extend the work from Exercise 3, or you may describe other practical work.

Example arguments for 4a are:

1. Accelerating our project application saves energy because ...
2. Using a virtual platform was a good idea because ...
3. Having determined that the performance looked good using a simple FPGA experiment, we can explore rack-scale and/or custom silicon performance using ...
4. Building and/or performing architectural exploration for accelerators of this sort exposes shortcomings in current toolchains/architectures ...

But you may expound on any sensible and interesting result from your practical work.

Write a report in a style suitable for publication in Electronics Times or Design and Reuse (or similar). You should aim to write at least 2000 words, but full credit is available if information is instead conveyed in diagrams and figures. All of the words must be your own work, but diagrams from any source may be included if credited properly. Your argument itself does not have to be original: basing your report on an existing D&R or ET article is acceptable.

Most importantly: think carefully about your report structure. Cite relevant prior work or alternative solutions. It is generally easiest to use a provocative title that poses a question, then expand on the question in the introduction and answer it at the end.

Feel free to ask DJG for further pointers on specific topics.

---------------------------------------------------------------------

Exercise 4b: Structured Research Essay

Task: See companion sheet.

---------------------------------------------------------------------

Points arising:

Following the pattern from previous years, I have put a little additional information at the bottom of the exercise 4a sheet regarding power estimation and any other feedback arising from emails. This is under 'Points Arising'.

1. There is some further (draft) information on energy and area modelling here: http://www.cl.cam.ac.uk/~djg11/vlsi-and-fpga-estimation-notes.txt

2. I have a replacement folder for ksubs3 that can perform DMA master operations on the DRAM and that does not contain the large amount of unnecessary Xilinx AXI3-to-AXI4 conversion logic found in the older ksubs3. It is linked on the toolinfo page and the ksubs page.

3. If you put a lot of results in your 3b report, reproduce them in 4a, which is where the results belong.

4. Clearly, programmed I/O (PIO) is not a good interface technique for an accelerator. Although you have been encouraged to keep things simple in your practical work, it can be a good idea to discuss or cite what can potentially be achieved on the Zynq platform. Better performance arises from using either the DMA units or one or two of the 64-bit high-performance AXI buses that have direct access to the DRAM. There is also a large SRAM on the Zynq chip, and you might try, or at least consider, the performance obtained by using it as a staging buffer.

5. Note the ARM9 is slightly superscalar, executing two Thumb instructions in one clock cycle if they are both 16 bit and aligned in a single 32-bit word. I don't recall whether the prazor model sets the IPC to an average of about 1.3 in Thumb mode or else measures the Thumb alignment and gets it exact.
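
The pairing rule in point 5 can be tried out with a toy calculation. The sketch below assumes a much simplified issue model taken only from the sentence above: two Thumb instructions complete in one cycle when both are 16 bits wide and sit in the same aligned 32-bit word, and every other instruction takes one cycle on its own. The function name `thumb_ipc` and its input format (a list of instruction widths in bytes plus a start address) are hypothetical, chosen for illustration. This is not how the prazor model works, and it ignores stalls, branches and data dependencies.

```python
def thumb_ipc(sizes, start_addr=0):
    """Estimate IPC for a straight-line run of Thumb instructions.

    sizes      -- instruction widths in bytes (2 or 4), in program order
    start_addr -- byte address of the first instruction (halfword aligned)

    Assumed toy rule: two instructions share a cycle only if both are
    16-bit and together fill one aligned 32-bit word.
    """
    if start_addr % 2 != 0:
        raise ValueError("Thumb instructions must be halfword aligned")

    addr = start_addr
    cycles = 0
    i = 0
    while i < len(sizes):
        size = sizes[i]
        if size not in (2, 4):
            raise ValueError("Thumb instructions are 2 or 4 bytes wide")

        can_pair = (
            size == 2
            and addr % 4 == 0            # first half of an aligned word
            and i + 1 < len(sizes)
            and sizes[i + 1] == 2        # second half is also 16-bit
        )
        if can_pair:
            addr += 4
            i += 2
        else:
            addr += size
            i += 1
        cycles += 1

    return len(sizes) / cycles if cycles else 0.0


if __name__ == "__main__":
    run = [2, 2, 2, 2]                    # four 16-bit instructions
    print(thumb_ipc(run, start_addr=0))   # word-aligned start
    print(thumb_ipc(run, start_addr=2))   # start on the odd halfword
```

Under this toy rule, four 16-bit instructions starting on a word boundary pair up perfectly (IPC 2.0), while the same run starting on the odd halfword needs three cycles (IPC 4/3). Measuring alignment in this way is the "exact" alternative to assuming a fixed average IPC.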