//
// SoC P35 Exercise 2 15/16 .. and 16/17 and 17/18.
// 


In this exercise, we shall create two further versions of PERIPHERAL_DEVICE, each coded as a blocking transactional modelling (TLM) target.

You may skip the TLM-1 or TOY-ESL part of this exercise entirely if you wish.  But doing the simple version first is arguably a less steep
introduction to TLM coding, since the TLM 2.0 libraries are then not needed. These libraries use quite complex C++ code, for which the gcc
error messages can be tricky to understand.

// Please do the following 
  
     1. Create another version of PERIPHERAL_DEVICE which is coded as a blocking transactional modelling target.
        Use either the TLM-1 style coding or the generic payload from the TOY-ESL classes.

     2. Create a test for the TLM-1 version of PERIPHERAL_DEVICE.  This can be as trivial as you like, such
        as just a couple of reads or writes to its registers.  Or you may wish to port the test from Exercise 1.
        The test should be separate (in a different file or class) from the PERIPHERAL_DEVICE model.
        Your test may be a standalone test wrapper or it may be a program running on a processor model
        such as the nominal processor from the toy classes (or even on the Prazor Zynq Model if you are speeding onwards).

     3. Somehow compare the performance of your answers to Exercises 1 and 2 (RTL versus TLM). For instance,
        you might make the testbench loop 10^7 times and run it under the unix 'time' command.
        Your answer will typically be in terms of time to simulate each I/O operation.  Include a discussion
        of how many interactions there are with the SystemC event kernel per I/O instruction.

        Note: You don't have to run precisely the same testbench in both cases, but you will then have to
        estimate some correction factor to get a comparable performance figure if they are widely different.

        Note: Make sure you use the same level of C++ optimisation for both (such as -O2).

        Note: Take a note of the bogomips of your workstation or laptop given in /proc/cpuinfo.

     4. Now write a TLM-2 style version of PERIPHERAL_DEVICE.  It should have a b_transport method that
        is registered as a callback to a TLM 2.0 simple_target_socket.   Full credit is available for
        a valid C++ design that does not necessarily do anything.  We will get experience connecting such
        a model to the Prazor Zynq model in a week or so's time.
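
The blocking-target idea behind parts 1 and 2 can be sketched in plain C++, without the SystemC,
TOY-ESL or TLM 2.0 libraries.  This is only an illustration under stated assumptions: the names
ToyPayload, BlockingTarget, PeripheralTlm, b_access and write_then_read are hypothetical, and the
register map (four 32-bit registers at byte offsets 0, 4, 8 and 12) is invented rather than taken
from Exercise 1.  The point it shows is that an I/O operation becomes one ordinary method call
that returns when the operation is complete.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical payload; NOT the TOY-ESL or TLM 2.0 generic payload.
enum class Cmd { Read, Write };

struct ToyPayload {
    Cmd           cmd;
    std::uint32_t addr;  // byte address within the device
    std::uint32_t data;  // write data on entry, read data on return
    bool          ok;    // response status, set by the target
};

// Interface that an initiator calls; the call blocks until the access is done.
struct BlockingTarget {
    virtual void b_access(ToyPayload &p) = 0;
    virtual ~BlockingTarget() = default;
};

// Assumed register map: four 32-bit registers at offsets 0, 4, 8, 12.
class PeripheralTlm : public BlockingTarget {
    std::uint32_t regs_[4] = {0, 0, 0, 0};
public:
    void b_access(ToyPayload &p) override {
        const std::uint32_t idx = p.addr >> 2;
        p.ok = ((p.addr & 3u) == 0) && (idx < 4);  // aligned and in range
        if (!p.ok) return;
        if (p.cmd == Cmd::Write) regs_[idx] = p.data;
        else                     p.data = regs_[idx];
    }
};

// Test code kept separate from the model: it sees only the interface.
std::uint32_t write_then_read(BlockingTarget &t, std::uint32_t addr,
                              std::uint32_t value) {
    ToyPayload w{Cmd::Write, addr, value, false};
    t.b_access(w);
    ToyPayload r{Cmd::Read, addr, 0, false};
    t.b_access(r);
    return r.ok ? r.data : 0xFFFFFFFFu;  // all-ones marks a failed access
}

// A trivial test: a couple of writes and reads.
void run_test() {
    PeripheralTlm dev;
    assert(write_then_read(dev, 4, 0x1234u) == 0x1234u);
    assert(write_then_read(dev, 12, 7u) == 7u);
}
```

Calling run_test() from main() exercises the model.  In a real answer the interface would come
from the TLM-1 or TOY-ESL classes, and for part 4 the same body would sit inside b_transport.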

Optional extensions (no formal credit):

     (5.) Optional. Create a Transactor that, when assembled/composed with your answer from Exercise 1, gives something
        that offers the same TLM interface as your answer to part 1 of this exercise (Exercise 2). In fact, you might wish to do
        this step first.  The Transactor should itself be a SystemC module.

     (6.) Optional.  Run two complete copies of your answer above under one SystemC controller with each having its own
        SystemC thread.  Compare the performance as you adjust your timing quantum: performance should be perhaps double for
        a large quantum compared with a small one that forces cycle accurate behaviour.
        

     (7.) If you are already able to build your own copy of Prazor, modify the platform file in Prazor to instantiate your peripheral.
        The file(s) you would need to change is/are something like (to be cross-checked for 2018)
          /usr/groups/han/clteach/btlm/current/vhls/src/platform/arm/zynq/parallella/zynq{.cpp,.h}
        Look at the way an existing I/O device is wired in, such as the UART.  You can see it is connected 
        to the I/O bus with the following line
             BUSMUX64_BIND(busmux0, UARTS[uu]->port0, start, UART_SPACING)
        and PERIPHERAL_DEVICE can be connected to busmux0 with an additional, similar such call.
        Note that Prazor uses an extended generic payload, not the default one provided in SystemC.
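
For optional part 5, the transactor idea can be sketched in plain C++ as follows.  Everything here
is an assumption made for illustration: PinDevice is a hypothetical stand-in for a pin-level model
(it is not the Exercise 1 design), its sel/rwbar/addr/wdata/rdata pins and two-cycle protocol are
invented, and a real transactor would be a SystemC module that waits on clock edges rather than
calling clock_edge() directly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pin-level device: samples its input pins on each clock edge.
struct PinDevice {
    bool          sel   = false;  // device selected this cycle
    bool          rwbar = true;   // true = read, false = write
    std::uint32_t addr  = 0;
    std::uint32_t wdata = 0;
    std::uint32_t rdata = 0;      // output pin, valid after a read cycle
    std::uint32_t regs[4] = {0, 0, 0, 0};

    void clock_edge() {
        if (!sel) return;
        const std::uint32_t idx = (addr >> 2) & 3u;
        if (rwbar) rdata = regs[idx];
        else       regs[idx] = wdata;
    }
};

// Transactor: offers blocking read/write calls and drives the pins itself.
class Transactor {
    PinDevice    &dut_;
    std::uint64_t cycles_ = 0;
    void tick() { dut_.clock_edge(); ++cycles_; }
public:
    explicit Transactor(PinDevice &d) : dut_(d) {}

    void write(std::uint32_t addr, std::uint32_t data) {
        dut_.sel = true;  dut_.rwbar = false;
        dut_.addr = addr; dut_.wdata = data;
        tick();                    // access cycle
        dut_.sel = false;
        tick();                    // idle cycle (assumed protocol turnaround)
    }

    std::uint32_t read(std::uint32_t addr) {
        dut_.sel = true;  dut_.rwbar = true;
        dut_.addr = addr;
        tick();                    // access cycle
        dut_.sel = false;
        tick();                    // idle cycle
        return dut_.rdata;
    }

    std::uint64_t cycles() const { return cycles_; }
};

// Small demonstration: one write and one read cost four clock edges in total.
void transactor_demo() {
    PinDevice  dut;
    Transactor t(dut);
    t.write(8, 77u);
    assert(t.read(8) == 77u);
    assert(t.cycles() == 4);
}
```

In this sketch each transaction costs two clock edges, whereas the pure TLM call costs none.  That
difference is the kind of per-operation count that the discussion in part 3 asks for.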


As before, include your source code and example outputs in your report. 

Answer the following questions in one or two sentences each.

Questions: 

 Q1.  Does your TLM model use the clock? If not, why not?

 Q2.  There is no Q2, but make sure you compare the performance as mentioned above.


END

------------------------------------------

Further notes and Q&A

You do not need to include any timing or power annotations in either TLM model. You just measure the performance of the simulation.
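
One possible way to turn a measured run time into a per-operation figure is sketched below in
plain C++.  It is an alternative to the unix 'time' command, not a required method, and the
helper names ns_per_op and speedup are hypothetical.  The operation passed in stands for one
I/O access on whichever model is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Mean wall-clock nanoseconds per call of op(i), averaged over n calls.
template <typename Op>
double ns_per_op(std::uint64_t n, Op op) {
    assert(n > 0);
    const auto t0 = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < n; ++i) op(i);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count()
           / static_cast<double>(n);
}

// Ratio of the two per-operation costs: how many times faster TLM is than RTL.
double speedup(double rtl_ns_per_op, double tlm_ns_per_op) {
    assert(tlm_ns_per_op > 0.0);
    return rtl_ns_per_op / tlm_ns_per_op;
}
```

For example, ns_per_op(10000000, [&](std::uint64_t i) { /* one register write */ }) gives the
cost of one access.  If the two testbenches differ, divide by the number of I/O operations each
actually performs before taking the ratio.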

Note: You cannot easily convey the interrupt signal using only the blocking TLM coding style.  It can stay as a shared-variable net model, or else you can use another TLM
socket in the reverse direction.  (The non-blocking TLM coding style provides a reverse channel intrinsically, but in this task we are using blocking.)

Q. I need the peripheral to set the interrupt after some amount of
time so as to simulate the delay of calculation. That would require
registering an event to happen in the future so the transaction can
return without blocking.

A. The most obvious way to model this form of behaviour is to use an
additional sc_method that sets the interrupt wire and which is
sensitive to an sc_event that you trigger from the blocking TLM access
point.  You will generally need to compute the trigger time based on
the sum of the current kernel time and the local thread's
loosely-timed delay. We can discuss this in one of the sessions.
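
The trigger-time arithmetic in the answer above can be illustrated with a toy event queue in
plain C++.  This is a sketch under assumptions and not SystemC: ToyKernel, notify_at, run_until,
Peripheral, b_start and the 500-unit compute latency are all hypothetical.  In SystemC itself the
queue would be the event kernel, the callback would be the extra method sensitive to the event,
and note that sc_event::notify takes a time relative to the current kernel time.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Time = std::uint64_t;  // simulated time in arbitrary units

// Toy stand-in for an event kernel: runs callbacks in time order.
struct ToyKernel {
    struct Ev { Time when; std::function<void()> fn; };
    struct Later {
        bool operator()(const Ev &a, const Ev &b) const { return a.when > b.when; }
    };
    Time now = 0;
    std::priority_queue<Ev, std::vector<Ev>, Later> q;

    void notify_at(Time when, std::function<void()> fn) {
        q.push(Ev{when, std::move(fn)});
    }
    void run_until(Time limit) {
        while (!q.empty() && q.top().when <= limit) {
            Ev e = q.top();
            q.pop();
            now = e.when;
            e.fn();
        }
        now = limit;
    }
};

constexpr Time kComputeLatency = 500;  // assumed calculation delay

class Peripheral {
    ToyKernel &k_;
public:
    bool irq = false;
    Time irq_time = 0;
    explicit Peripheral(ToyKernel &k) : k_(k) {}

    // Blocking access point: returns at once and only schedules the interrupt.
    // local_delay is how far the calling thread has run ahead of kernel time.
    void b_start(Time local_delay) {
        const Time trigger = k_.now + local_delay + kComputeLatency;
        k_.notify_at(trigger, [this] { irq = true; irq_time = k_.now; });
    }
};

// Kernel time 100, caller 40 ahead, latency 500: interrupt fires at 640.
void irq_demo() {
    ToyKernel k;
    k.now = 100;
    Peripheral p(k);
    p.b_start(40);
    k.run_until(600);
    assert(!p.irq);
    k.run_until(700);
    assert(p.irq && p.irq_time == 640);
}
```

The transaction itself never waits.  Only the scheduled callback sets the interrupt, and it does
so at the caller's effective local time plus the calculation delay.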

END