HPR L/S (aka Orangepath) Facilities

HPR L/S (aka Orangepath) is a library and framework designed for synthesis and simulation of a broad class of computer systems, protocols and interfaces in hardware and software forms.

The HPR L/S library provides facilities for a number of experimental compilers. This part of the manual describes the core features, not all of which will be used in every flow.

FILES AND DIRECTORIES

When an Orangepath tool is run, it creates a directory, the obj directory, in the current directory. The obj directory holds the temporary files used during compilation.

The .plt files are plot files that can be viewed using diogif, either on an X display or converted to .gif files.

The h2logs file contains a log of the most recent compilation. Log files are placed in a folder named with the early arg -log-dir-name.


Espresso

The traditional Unix espresso tool is not needed for the FSharp implementation of HPR L/S, since this has its own internal implementation.

The Moscow ML implementation of the Orangepath tool required Espresso to be installed in /usr/local, or else the ESPRESSO environment variable to point to the binary. If the variable is set to the ASCII string NULL then the optimiser is not used.

The -no-espresso flag can also be used to disable call outs to this optimiser. Internal code may be used instead.
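The lookup rules above can be sketched as follows. This is only an illustration of the described behaviour, not the tool's own code: the function name resolve_espresso is hypothetical, and the exact default path under /usr/local is an assumption, since the text only says the tool is installed "in /usr/local".

```python
def resolve_espresso(env, no_espresso=False,
                     default_path="/usr/local/bin/espresso"):
    """Return the path of the external espresso binary, or None when the
    external optimiser should not be called.

    env is a mapping of environment variables.
    default_path is an ASSUMED location; the manual gives only /usr/local.
    """
    if no_espresso:            # -no-espresso disables call-outs
        return None
    value = env.get("ESPRESSO")
    if value == "NULL":        # the ASCII string NULL disables the optimiser
        return None
    if value:                  # explicit path to the binary
        return value
    return default_path        # fall back to the /usr/local installation
```

For example, resolve_espresso({"ESPRESSO": "NULL"}) returns None, so a caller would fall back to internal code.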


Cone Refine

The cone refine optimiser deletes parts of the design that have no observable output. It can be disabled using the flag -cone-refine=disable.

It may also be programmed to retain other named features of interest.
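As a rough illustration of the idea, the sketch below trims a design by walking backwards from the observable outputs and from any extra nets named for retention. The dictionary representation and the function name cone_refine are assumptions made for this example; the real optimiser works on HPR VM structures.

```python
def cone_refine(drivers, observable, keep=()):
    """Return the set of nets retained after trimming.

    drivers maps each net name to the nets its driving logic reads.
    A net survives if an observable output, or a net listed in keep,
    depends on it directly or transitively.
    """
    retained = set()
    worklist = list(observable) + list(keep)
    while worklist:
        net = worklist.pop()
        if net in retained:
            continue
        retained.add(net)
        worklist.extend(drivers.get(net, ()))
    return retained


design = {
    "out": ["a", "b"],
    "a": ["in1"],
    "b": ["in2"],
    "dbg": ["c"],   # feeds nothing observable
    "c": ["in3"],
}
trimmed = cone_refine(design, observable=["out"])
kept = cone_refine(design, observable=["out"], keep=["dbg"])
```

Here trimmed omits dbg, c and in3, whereas kept retains them because dbg was named as a feature of interest.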


HPR Command Line Flags

The very first args to an HPR/Orangepath tool are the early args that enable the recipe file to be selected and the logging level and location to be set.

The first argument to an HPR/Orangepath tool, such as h2comp or KiwiC, is a source file name. Everything else that follows is an option. Options are now described in turn.

The HPR/LS logger makes an object directory and writes log files to it.

Flag -verboselevel=n turns on diversion of log file content to be mirrored on the standard output. 0 is the default and 10 makes everything also come out on the console. Console writes are flushed after each line and this is also a means of viewing the final part of a log that has not been flushed owing to stdio buffering.

Flag -verbose turns on a level of console reporting. Certain lines that are written to the obj/log files appear also on the console.

Flag -verbose2 turns on a further level of console reporting. Certain lines that are written to the obj/log files appear also on the console.

Flag -recipe fn.xml sets the file name for the recipe that will be followed.

Flag -loglevel n sets the logging level with 100 being the maximum n that results in the most output.

Flag -give-backtrace prevents interceptions of HPR backtraces and will therefore give a less processed, raw error output from mono.

The developer mode flag, -devx, enables internal messages from the toolchain that are for the benefit of developers of the tool. Setting the environment variable HPRLS_DEVX=1 performs the same action.

NOTE: Many of the command line flags listed here have a different command line syntax when using the FSharp version of KiwiC. This manual is still being updated. To get their effect one must currently either make manual edits to the recipe xml file (e.g. kiwici00.rcp) or else simply list them on the command line using the form -flagname value.

If the special name -GLOBALS is specified as a root, then the outermost scope of the assembly, covering items such as the globals found in the C language, is scanned for variable declarations.

Flag -preserve-sequencer structures output code with an explicit case or switch statement for each finite-state machine.

Synthcontrol -bevelab-repack-pc=disable creates sequencer encodings where the PC ranges directly over the h2 line numbers: easier for cross-referencing when debugging. Otherwise it defaults to a packed binary or unary coding depending on -bevelab-onehot-pc.
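The three sequencer encodings can be pictured with the small sketch below. It is an illustration only: the function encode_pc and its arguments are hypothetical, and numbering the states in line-number order is an assumption.

```python
def encode_pc(state_lines, repack=True, onehot=False):
    """Map each sequencer state, identified by its h2 line number,
    to a PC value.

    repack=False models -bevelab-repack-pc=disable: the PC is the
    line number itself. Otherwise states are numbered densely, either
    as packed binary or as a one-hot (unary) code.
    """
    lines = sorted(set(state_lines))
    if not repack:
        return {ln: ln for ln in lines}
    if onehot:
        return {ln: 1 << i for i, ln in enumerate(lines)}
    return {ln: i for i, ln in enumerate(lines)}
```

With states at lines 10, 42 and 57, the unpacked PC takes the values 10, 42 and 57, the packed binary PC takes 0, 1 and 2, and the one-hot PC takes 1, 2 and 4.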

Option -array-scalarise all converts all arrays to register files. Other forms allow names to be specifically listed. See § [*].

 -vnl-resets=none
 -vnl-resets=synchronous
 -vnl-resets=asynchronous

or change this XML line in the file /distro/lib/recipes/KiwiC00.rcp

 <defaultsetting> resets none </defaultsetting>

When doing RTL simulation of the KiwiC-generated RTL output, one can sometimes encounter a 'lock up' where the design makes no further progress. Tracing the 'pc' variable in the output code will reveal it is stuck when trying to make a conditional branch whose predicate evaluates to don't-care owing to uninitialised registers or disconnected inputs.

HPR (KiwiC) (by default) does not generate initialisation code to set static variables to their default values (zero for integers and floats and false for booleans). The same goes for RAM contents.

For RAM contents, with KiwiC, the user code must contain an explicit clear operation in a C# loop.

To overcome the problem with uninitialised registers, we can potentially use -vnl-resets=synchronous or -vnl-resets=asynchronous. This will make the RTL simulate properly and overcomes most lockup problems. But we get additional wiring in the output that can repeat the FPGA's own hardwired or global reset mechanisms.

Clearly the design can be synthesised separately with and without resets. But to avoid the duplication of effort, hence with a common RTL file (one synthesis run only), one must take one of the following five routes, where the first two use a KiwiC compile with the default -vnl-resets=none.

  1. use an RTL simulator option that makes all registers start as zero instead of X,

  2. add a set of additional initial statements to the generated RTL that are ignored for FPGA synthesis (HPR vnl could generate these automatically but does not at the moment),

  3. request a reset input to the generated sub-system (using -vnl-resets=synchronous) but tie this off to the inactive state at the FPGA instantiation of that subsystem and expect the FPGA tools to strip it out as redundant logic so that it does not consume FPGA resource.

  4. trust the FPGA tools to detect a synchronous reset net as such (by boolean dividing FPGA D-input expressions by it) and map it to the FPGA hardwired reset mechanisms so that it does not consume FPGA resource.

  5. use -vnl-resets=asynchronous and trust the FPGA tools to map this to the hardware global reset net.

Note, the vnl output stage always generates subsystems with a reset input but this is (mostly) ignored under the default option of -vnl-resets=none.

See § [*].

 "-subexps=off"

The subexps flag turns off sub-expression commoning-up in the backend.

 -vnl-rootmodname name

Use the -vnl-rootmodname flag to set the output module name in Verilog RTL output files.

 -vnl-roundtrip name= [ enable | disable ]

Converts generated Verilog back to internal VM form for further processing.

When enabled, generated RTL will be converted back again before (for example) being simulated with diosim. When disabled, the input to the verilog generate (vnl) recipe stage will be passed on unchanged and a typical recipe will then simulate that directly.

 "-ifshare=on"
 "-ifshare=none"
 "-ifshare=simple"

The default ifshare operation is that guards are tally counted and the most frequently used guard expressions are placed outermost in a nested tree of if statements.

The ifshare flag controls if-block generation in output code. If set to 'none' then every statement has its own 'if' statement around it. If it is set to 'simple' then minimal processing is performed. The default setting is 'on'.
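The default tally-based nesting can be sketched as below. This is a simplified model, not the backend's implementation: it assumes each statement carries a set of guard expressions that must all hold, that the statements are independent so they may be reordered, and that ties are broken by guard name. The name nest_guards and the tuple layout are hypothetical.

```python
from collections import Counter


def nest_guards(items):
    """Build a nested if-tree from (guards, statement) pairs.

    guards is a set of guard expressions that must all hold.
    The result is a list whose elements are either statement strings
    or ("if", guard, body) tuples, where body is another such list.
    """
    tree = [stmt for guards, stmt in items if not guards]
    guarded = [(set(g), s) for g, s in items if g]
    while guarded:
        tally = Counter(g for guards, _ in guarded for g in guards)
        # The most frequently used guard goes outermost (ties: by name).
        best = min(tally, key=lambda g: (-tally[g], g))
        inside = [(gs - {best}, s) for gs, s in guarded if best in gs]
        guarded = [(gs, s) for gs, s in guarded if best not in gs]
        tree.append(("if", best, nest_guards(inside)))
    return tree


example = nest_guards([
    ({"g1", "g2"}, "x=1"),
    ({"g1"}, "y=2"),
    ({"g1", "g3"}, "z=3"),
    ({"g2"}, "w=4"),
])
```

In this example g1 guards three statements, so it becomes the outermost 'if', with g2 and g3 nested inside it; the remaining g2-only statement gets its own top-level 'if'.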

 "-dpath=on"
 "-dpath=none"
 "-dpath=simple"

When dpath=on, with the preserve sequencer option for a thread, a separate 'datapath' engine is split out per thread and shared over all data operations by that thread.

Synthcontrol cone-refine-keep=a,b,c accepts a comma-separated list of identifier names as an argument and instructs the cone-refine optimiser/trimmer to retain logic that supports those nets.

-xtor mode specifies the generation of TLM transactors and bus monitors. The mode may be initiator, target or monitor.

-render-root rootname specifies the root facet for output from the current run. If not specified, the root facet is used. This has effect for interface synthesis where the root module is not actually what is wanted as the output from the current run.

-ubudget n specifies a budget number of basic blocks to loop unwind when generating RTL style outputs.

The -finish={true false} flag controls what happens when the main thread exits. Supplying this flag causes generated output code to exit to the simulation environment rather than hanging forever. When running under a simulator such as Modelsim, or when generating SystemC, it is helpful to exit the simulation but certain design compiler and FPGA tools will not accept input code that finishes since there is no gate-level equivalent (no self-destruct gate).

Other output formats

The -smv flag causes the tool to generate a nuSMV output file.

The -ucode flag causes generation of UIA microprocessor code for the design.

-vnl fn.v specifies to generate a Verilog model and write it to file fn.v.

-gatelib NAME requests that the Verilog output is in gate netlist format instead of RTL. This takes precedence over -vnl, which causes RTL output. The identifier NAME specifies the cell library and is currently ignored: a default CAMHDL cell library is used.

General Command Line Flags

The -version flag gives the tool version and help string.

The -help flag gives the tool version and help string.

The -opentrace flag sets the opentrace level: this alters the debugging output but most debugging is in the h2log file anyway.

The -rwtrace flag sets the rwtrace level, rather like the -opentrace option.


SoC Render

The SoC Render compiler takes a set of HPR VMs and generates a hierarchical netlist to wire up their ports using pre-defined rules that are based on the concept of domains of connection. It will instantiate as many protocol adaptors, bus switches and arbiters as are needed. The resulting structure is typically rendered as RTL. In the future it can invoke Greaves/Nam glue logic synthesis or other generators and then instantiate the glue in the netlist.

The resulting system can then be emitted without the actual instances using other recipe stages, such as SystemC, RTL or IP-XACT. These output files will typically be combined with the instantiated components in external tools, such as FPGA logic synthesis.

The resulting system can also be passed on to the Diosim simulator for execution within Orangepath, for auditing tools to run, or for any other purpose.

Its internal data structure, prior to rendering the output, is in a form that can be output as an IP-XACT spirit:design document. Hence a subsequent part of the external tool chain then knows how to assemble the SoC.

A future facility to read in and obey IP-XACT spirit:design documents could easily be added, but there are plenty of third-party tools offering that service.

SoC Render supports:

  1. Creating inter-module wiring structures with tie-off of unused ports.
  2. Working both at the TLM level and structural net list level.
  3. Glue logic insertion: adaptors from the library are instantiated automatically using rules based on interface type differences.
  4. Allocation of AXI tag numbers.
  5. Custom glue logic from the Greaves/Nam cross-product technique can also be rendered.
  6. Outputs are rendered in Verilog, IP-XACT, SystemC TLM, SystemC behavioural and SystemC RTL-styles depending on the subsequent recipe stage the output is passed to.
  7. Server farm mode supporting dynamic dispatch will be added during 2017.

The SoC Render rule engine understands the following types of component:

Every block is accompanied by non-functional meta-info that gives an area, latency, throughput and energy cost using IP-XACT extensions.

Every external block port and port on a primary IP block must also be manually given a so-called domain name. Connections that cross domain names are not synthesised by the standing rules. There will generally be at least one domain name for each connection between separately-compiled modules in an incremental compilation. Also, there will be domains associated with each disjoint memory map/space and one for the debug/directing logic.

The system synthesis is guided by a goal function, which is a scalar metric that factors area, delay and energy according to weights that the user can adjust as desired.
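One plausible shape for such a goal function is a weighted sum, sketched below. The weighted-sum form, the function name goal and the default weights are all assumptions for illustration; the text states only that the metric is scalar and user-weighted.

```python
def goal(area, delay, energy, w_area=1.0, w_delay=1.0, w_energy=1.0):
    """Scalar cost of a candidate design: lower is better.

    The weighted-sum form is an ASSUMPTION made for this sketch.
    """
    return w_area * area + w_delay * delay + w_energy * energy


# Choosing between two hypothetical candidates, with delay weighted heavily.
candidates = {
    "tree": {"area": 12.0, "delay": 2.0, "energy": 3.0},
    "chain": {"area": 10.0, "delay": 5.0, "energy": 3.0},
}
best = min(candidates, key=lambda n: goal(w_delay=5.0, **candidates[n]))
```

With delay weighted at 5, the tree-shaped candidate scores 25 against 38 for the chain, so it is preferred despite its larger area.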

The automatic generation axioms are:

  1. The number of primary IP blocks and external ports is set in the initial configuration, together with their instance names. Their plurality may not be adjusted by SoC Render.

  2. The plurality of all other components may be freely adjusted by SoC Render, but it may not replicate state-bearing components (unless they have mirror rules defined in the future).

  3. All initiating ports must be connected to a matching target port with a one-to-one direct connection.

  4. The resulting design should give a low value for the goal function.

    This will tend to minimise the number of additionally instantiated components and typically causes them to be wired in tree-like structures to minimise latency.

Per domain metric functions and upper bounds

Algorithm: for each domain name, while there is an unconnected initiator, create a connection for it to a suitable serving resource. If the serving resource is an external port that is currently disconnected, a direct connection can be made. But if the external port is already bound, an additional bus switch will be instantiated or the arity of an existing one will be increased.

If the serving resource would be an instance of a replicatable IP block, ...

If the serving resource would be an instance of a mirrorable IP block, ...
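The external-port case of the algorithm can be sketched as follows for a single domain. It is a minimal model under stated assumptions: one external port serves the whole domain, the switch instance name "switch0" is hypothetical, and the replicatable and mirrorable cases are not modelled.

```python
def connect_domain(initiators, port):
    """Connect every initiator in one domain to a single external port.

    Returns a list of (source, destination) links. The first initiator
    is wired directly. When a second one arrives, a bus switch is
    instantiated in front of the port; later initiators only increase
    the arity of that switch.
    """
    links = []
    switch = None
    for init in initiators:
        if not links:
            links.append((init, port))     # port is free: direct connection
        elif switch is None:
            switch = "switch0"             # port already bound: add a switch
            first, _ = links.pop()
            links += [(first, switch), (init, switch), (switch, port)]
        else:
            links.append((init, switch))   # grow the existing switch
    return links


wiring = connect_domain(["cpu", "dma", "gpu"], "mem")
arity = sum(1 for _, dst in wiring if dst == "switch0")
```

A single initiator yields one direct link, while three initiators yield one switch of arity three feeding the port.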

Diosim Simulator

The HPR L/S library provides a built-in simulator called diosim. It is intended to be able to execute any mixture of intermediate codes since all have executable semantics.

Diosim is invoked by the recipe. Typically a recipe may invoke it on the same intermediate form that is being rendered as RTL or SystemC etc.

Simulation Control Command Line Flags

As well as providing simulation output in VCD and console form, diosim can collect statistics and help with profile generation. However, it is fairly slow and it is best to collect profiles from faster execution engines, such as via Verilator.

The statistics that diosim can collect range from net-level switching activity to higher-level statistics like imperative DIC instructions executed, RTL sequential and combinational assignment counts.

Only the two Verilog output forms, RTL and gatelevel, support conversion back into HPR machine form for post generation simulation.

-sim n specifies to simulate the system using the builtin HPR event-driven simulator for n cycles. The output is written to t.plt for viewing. The -traces flag provides a list of net patterns to trace in the simulator.

The -title title flag names the diosim plot title.

The -sim-rtl flag causes diosim to simulate the results of the generator processor (e.g. compilation to FSM) rather than the input form.

The -sim-gates flag causes diosim to simulate the results of compilation to gates (-gatelib is used) rather than the input form.

The -diosim-techno=enable flag causes print statements from the simulator to include ANSI colour escape codes for various highlighting options.

The -plot plotfile flag causes plot file output of the diosim simulation to a named plot file.

The plot file can be viewed under X-windows and/or converted to a gif using the diogif program.

The Orangepath system contains its own simulator called diosim. Since the target is output from the compiler as portable code to be fed into third-party C and Verilog compilers, it is not strictly necessary to use the Orangepath simulator. However, the simulator provides a self-contained means of evaluating a generated target without using external tools.

The simulator accepts a hierarchical set of VM2 machines and simulates them and their interactions.

The simulator will verify all safety assertion rules that contain no temporal logic operators. Other safety and all liveness assertions are ignored.

Non-deterministic choices are made on the basis of a PRBS that the user may seed.

The PRBS is also used for synthetic input generation from plant machines or external inputs. PRBS values used for external inputs are checked against plant safety assertions and rejected if they would violate.
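The seeded generation and rejection of unsafe inputs can be sketched as below. The 16-bit LFSR and its taps are a common textbook choice swapped in for illustration; the actual PRBS used by diosim is not described here. The names Prbs16 and next_input are hypothetical.

```python
class Prbs16:
    """16-bit Fibonacci LFSR (taps 16, 14, 13, 11), used as a PRBS."""

    def __init__(self, seed=0xACE1):
        # An all-zero state would never change, so substitute 1.
        self.state = (seed & 0xFFFF) or 1

    def next_bit(self):
        s = self.state
        out = s & 1
        fb = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (fb << 15)
        return out

    def next_value(self, width):
        v = 0
        for _ in range(width):
            v = (v << 1) | self.next_bit()
        return v


def next_input(prbs, width, is_safe, max_tries=1000):
    """Draw PRBS values until one satisfies the plant safety predicate."""
    for _ in range(max_tries):
        candidate = prbs.next_value(width)
        if is_safe(candidate):
            return candidate
    raise RuntimeError("no candidate satisfied the plant assertion")
```

Two generators built from the same seed yield the same sequence, which is what makes a simulation run repeatable.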

Output is a log and plot file. The plot file is currently in diogif plot format, but a VCD format should be added.

Detailed logging can be found in the obj/log files. If a program prints the string 'diosim:traceon' or 'diosim:traceoff' the level of logging is changed.

If a program prints 'diosim:exit' then diosim will exit as though the builtin function hpr_exit() were called.
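The effect of these control strings can be modelled with the short sketch below. It treats the logging level as a simple on/off switch and matches the strings anywhere in a printed line; both are assumptions, as is the function name scan_output.

```python
def scan_output(lines):
    """Process printed lines in order.

    Returns (seen, exited), where seen is a list of (line, tracing)
    pairs for the lines handled before any exit request.
    """
    tracing = False
    seen = []
    for line in lines:
        if "diosim:traceon" in line:
            tracing = True
        elif "diosim:traceoff" in line:
            tracing = False
        elif "diosim:exit" in line:
            return seen, True      # stop, as though hpr_exit() were called
        seen.append((line, tracing))
    return seen, False


seen, exited = scan_output([
    "hello", "diosim:traceon", "step 1",
    "diosim:traceoff", "step 2", "diosim:exit", "never reached",
])
```

In this run the lines between the traceon and traceoff markers are handled with tracing enabled, and nothing after the exit request is processed.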

KiwiC using C++ instead of C#

Visual Basic, Visual C++ and gcc4cil will generate dotnet portable assemblies from C++ code.

Using the gcc4cil compiler you should find a binary called "cil32-gcc" in the <path_to_cross_compiler>/bin directory. To create a CIL file use this compiler with the -S option.

Getting gcc4cil.

   1. Get Gcc4Cil from the svn-repository that is mentioned on the
 Gcc4Cil website (http://www.mono-project.com/Gcc4cil) 
 "svn co  svn://gcc.gnu.org/svn/gcc/branches/st/cli"

   2. As Gcc4Cil wants to compile files for the Mono-platform, you
 need the Mono-project installed on your system. The easiest way to
 install it is to use "Linux installer for x86" that can be found
 under http://www.mono-project.com/Downloads . Installation
 instructions are available under
 http://www.mono-project.com/InstallerInstructions .

   3. It may be possible that you need to install the portable .NET
 project. During the manual compilation of gcc4cil I got errors, that
 made me install this project. However I could not find a line in the
 automatic generated Makefile that has a reference to the p.net path
 in my home-dir. If you get the impression that you need it, you can
 find it here: http://www.gnu.org/software/dotgnu/pnet-install.html

   4. Because I did not know that there was an automatic script for this, I did a
      <path_to_gcc4cil>/configure using the following options
      --prefix=<where it should be installed to>
      --with-mono=<install_dir_of_mono>
      --with-gmp=<install_dir_of_glib>

      I then did a make bootstrap-lean and installed the following libraries because
      of compile errors:
      - bison-2.3.tar.gz*
      - glib-2.12.9.tar.gz
      - pkg-config-0.22.tar.gz

      I think it is likely that you may want to skip this step, as
 this step DOES NOT generate a compiler for CIL but for boring x86
 code (which I learned after I did this). However, I set up paths to the
 installed libraries in this step, so I mention it. I do not know for
 sure if all those paths are needed in the end. As it works for me
 now, I won't remove them:

      setenv HOST_MONOLIB "/home/petero/mono-1.2.5.1/lib"
      setenv HOST_MONOINC "/home/petero/mono-1.2.5.1/include/mono-1.0:/home/petero/mono-1.2.5.1/include/mono-1.0/mono:/home/petero/mono-1.2.5.1/include/mono-1.0/mono/cil:/home/petero/mono-1.2.5.1/include/mono-1.0/mono/jit:/home/petero/mono-1.2.5.1/include/mono-1.0/mono/metadata"
      setenv CIL_AS "/home/petero/p.net/lib:/home/petero/p.net/bin"

   5. In the directory where you put the gcc4cil source code, you can
 find a shell script called "cil32-crosstool.sh". Execute this and the
 crosscompiler for C-to-CIL compilation hopefully now gets compiled.

Nov 2016 note: The main gcc4cil problem was a lack of any sort of linker, as I recall. I do not recall why a linker was critical since KiwiC and dotnet are both happy to accept multiple dll files. Perhaps there was a related problem with .h files. I don't know whether gcc4cil maintenance is now abandoned.

Of course Visual C++ produces dotnet code that should work pretty much as well as the recent Visual Basic demo. I don't know how much Visual C++ resembles standard C++ or whether it can only be compiled on Windows.

All of the HPR recipe stages except for the first, kiwife, are independent of dotnet. The intermediate HPR VM forms between recipe stages are all supposed to be serialisable to disk: you use recipe files that start and end with a load and save of VM code. But that facility has not been used recently. It might become important again to help overcome long monolithic compile times.



David Greaves 2017-04-29