Kôika: A Core Language for Rule-Based Hardware Design

This is the home of Kôika, an experimental hardware design language inspired by BlueSpec SystemVerilog. Kôika programs are built from rules, small bits of hardware that operate concurrently to compute state updates but provide the illusion of serializable (atomic) updates. Kôika has simple, precise semantics that give you strong guarantees about the behavior of your designs.

Our distribution includes an executable reference implementation of the language written using the Coq proof assistant, machine-checked proofs ensuring that Kôika's semantics are compatible with one-rule-at-a-time execution, and a formally-verified compiler that generates circuits guaranteed to correctly implement your designs.

Kôika programs are typically written inside of Coq using an embedded DSL (this lets you leverage Coq's powerful metaprogramming features and modular abstractions), though we also have a more limited standalone front-end that accepts programs in serialized (s-expressions) form. For simulation, debugging, and testing purposes, Kôika programs can be compiled into readable, cycle-accurate C++ models, and for synthesis the Kôika compiler targets a minimal subset of synthesizable Verilog supported by all major downstream tools.

Kôika is a research prototype: the circuits that our compiler generates typically have reasonable-to-good performance and area usage, but our RTL optimizer is very simple and can miss obvious improvements. Because our simulator can take advantage of high-level information, Kôika designs typically run reasonably fast in C++ simulation.

Our largest example at the moment is a simple RISCV (RV32I) 4-stage pipelined core.

Kôika is currently developed as a joint research effort at MIT, involving members of CSG (the Computation Structure Group) and PLV (the Programming Languages & Verification group). Our latest draft is a good place to get details about the research that powers it. The name “Kôika” (甲イカ) is Japanese for “cuttlefish”; we chose it because cuttlefishes have blue blood (a tribute to the name Bluespec), and because their eight arms are equipped with independent neurons that allow them operate semi-independently towards a shared purpose, much like rules in Kôika designs.

This README provides practical information to get started with Kôika. For details about Kôika's semantics, its compilation algorithm, and the work it draws inspiration from, please read our PLDI 2020 paper: The Essence of Bluespec, A Core Language for Rule-Based Hardware Design.

Getting started

Installing dependencies and building from source

OCaml 4.14 opam 2.0 or later, GNU make.
Coq 8.18:
```
opam install coq=8.18
```
Dune 3.14 or later:
```
opam upgrade dune
```

Some OCaml packages:

opam install base core core_unix stdio parsexp hashcons zarith

To run the tests of our RISCV core, a RISCV compilation toolchain.
To run C++ simulations: a recent C++ compiler (clang or gcc), libboost-dev, and optionally clang-format.
To compile this README, you need sphinx and its tool rst2html5.

You can compile the full distribution, including examples, tests, and proofs by running dune build @runtest in the top-level directory of this repo. Generated files are placed in _build.

For reproducibility, here is one set of versions known to work:

OCaml 4.14.2 with opam install base=v0.16.2 coq=8.18 core=v0.16.2 dune=3.15.1 hashcons=1.4 parsexp=v0.16.0 stdio=v0.16.0 zarith=1.13
Additionally, we provide a flake.nix file that allows you to download, install and use a working set of programs by just calling nix develop (provided, that you have Nix installed. See also the Nix Package Manager).

Browsing examples

The examples/ directory of this repo demonstrates many of Kôika's features. The examples can be compiled using dune build @examples/runtest, and browsed in either CoqIDE or Proof General. There is basic Emacs support for .lv files (the “Lispy Verilog” alternative frontend for Kôika) in etc/lv-mode.el.

See browsing the sources below for information about the repository's organization.

Compiling your own programs

Our compiler (cuttlec) supports multiple targets:

verilog (an RTL implementation of your design expressed in the synthesizable subset of Verilog 2000)
hpp (a cuttlesim model of your design, i.e. a cycle-accurate C++ implementation of it, useful for debugging)
cpp (a driver for the cuttlesim model of your design)
verilator (a simple C++ driver to simulate the Verilog using Verilator)
makefile (an auto-generated Makefile including convenient targets to debug, profile, trace, or visualize the outputs of your design)
dot (a basic representation of the RTL generated from your design)

For quick experimentation, you can just drop your files in examples/ and run:

dune build @examples/genrules (only needs to be done once for every new file) followed by
dune build @examples/<your-file.v>.d/runtest.

Programs written in the Coq EDSL

On the Coq side, after implementing your design, use the following to define a “package” (see the examples/ directory for more information, or read the syntax section below):

Definition package :=
  Interop.Backends.register
    {| ip_koika := …;
       ip_sim := …;
       ip_verilog := … |}.
Extraction "xyz.ml" package.

Compile your Coq sources using coqc or dune to generate xyz.ml, then compile that file using cuttlec xyz.ml -T ….

Among other things, a package contains instances of the Show typeclass used to print register names. These instances are typically derived automatically, but customizing them makes it possible to control the names given to signals in the generated Verilog and C++ code (for example, instead of x0, x1, …, x31, we use zero, ra, sp, gp, etc. in the RISCV core).

Programs written in serialized syntax (“Lispy Verilog”)

Use cuttlec your-program.lv -T verilog, or any other output option as described by cuttlec --help.

Technical overview

Details about Kôika's design and implementation can be found in our research paper.

Execution model

Kôika programs are made of rules, orchestrated by a scheduler. Each rule is a program that runs once per cycle, as long as it does not conflict with other rules. When conflicts arise (for example, when two rules attempt to write to the same register), the priority order specified by the scheduler determines which rule gets to fire (i.e. execute). Concretely, a rule might look like this (this is a rule that takes one step towards computing the GCD of the numbers in registers gcd_a and gcd_b):

Definition gcd_compute := {{
  let a := read0(gcd_a) in
  let b := read0(gcd_b) in
  if a != |16`d0| then
    if a < b then
      write0(gcd_b, a);
      write0(gcd_a, b)
    else
      write0(gcd_a, a - b)
  else
    fail
}}

The semantics of Kôika guarantee that each rule executes atomically, and that generated circuits behave one-rule-at-a-time — that is, even when multiple rules fire in the same cycle, the updates that they compute are as if only one rule had run per cycle (previous work used this property to define the language; in contrast, our semantics are more precise, and this atomicity property is proven in coq/OneRuleAtATime.v).

As an example, consider a simple two-stage pipeline with two single-element input FIFOs and one single-element output FIFO:

We implement these FIFOs using three single-bit registers (…_empty) indicating whether each FIFO is empty, and three data registers (…_data) holding the contents of these FIFOs. We have three rules: two to dequeue from the input FIFOs into a middle FIFO (deq0 and deq1), and one to dequeue from the middle FIFO and write a result (the input plus 412) into an output FIFO (process). The code looks like this (guard(condition) is short for if !condition then fail):

(* This is a compact way to define deq0, deq1, and process: *)
Definition rules (rl: rule_name_t) :=
  match rl with
  | deq0 =>
    {{ guard(!read0(in0_empty) && read0(fifo_empty));
       write0(fifo_data, read0(in0_data));
       write0(fifo_empty, Ob~0);
       write0(in0_empty, Ob~1) }}
  | deq1 =>
    {{ guard(!read0(in1_empty) && read0(fifo_empty));
       write0(fifo_data, read0(in1_data));
       write0(fifo_empty, Ob~0);
       write0(in1_empty, Ob~1) }}
  | process =>
    {{ guard(!read1(fifo_empty) && read0(out_empty));
       write0(out_data, read1(fifo_data) + |32`d412|);
       write1(fifo_empty, Ob~1);
       write0(out_empty, Ob~0) }}
  end.

A conflict arises when both inputs are available; what should happen in this case? The ambiguity is resolved by the scheduler:

Definition pipeline : scheduler :=
  deq0 |> deq1 |> process |> done.

This sequence indicates that deq0 has priority, so in_data0 is processed first. When both inputs are available and the middle FIFO is empty, when deq1 attempts to run, it will dynamically fail when trying to write into fifo_data.

This example includes a simple form of backpressure: if the middle FIFO is full, the first two rules will not run; if the output FIFO is full, the last rule will not run. This is made explicit by the guard statements (those would be hidden inside the implementation of the dequeue and enqueue methods of the FIFO in a larger example, as demonstrated below).

Looking carefully, you'll notice that reads and writes are annotated with 0s and 1s. These are forwarding specifications, or “ports”. Values written at port 0 are visible in the same cycle at port 1, and values written at port 1 overwrite values written at port 0. Hence, this example defines a bypassing FIFO: values written by deq0 and deq1 are processed by process in the same cycle as they are written, assuming that there is space in the output FIFO. If we had used a read0 instead, we would have had a pipelined FIFO.

In this example, starting with the following values:

in0_empty  ⇒ false
in0_data   ⇒ 42
in1_empty  ⇒ false
in1_data   ⇒ 73
fifo_empty ⇒ true
fifo_data  ⇒ 0
out_empty  ⇒ true
out_data   ⇒ 0

we get the following output:

in0_empty  ⇒ true
in0_data   ⇒ 42
in1_empty  ⇒ false
in1_data   ⇒ 73
fifo_empty ⇒ true
fifo_data  ⇒ 42
out_empty  ⇒ false
out_data   ⇒ 454

Syntax

Kôika programs are written using an embedded DSL inside of the Coq proof assistant. After compiling the distribution, begin your file with Require Import Koika.Frontend.

Preamble and types

Start by defining the following types:

reg_t: An enumerated type describing the state of your machine. For example,

Inductive reg_t :=
(* These bypassing FIFOs are used to communicate with the memory *)
| to_memory (state: MemReqFIFO.reg_t)
| from_memory (state: MemRespFIFO.reg_t)
(* These FIFOs are used to connect pipeline stages *)
| d2e (state: fromDecodeFIFO.reg_t)
| e2w (state: fromExecuteFIFO.reg_t)
(* The register file and the scoreboard track and record reads and writes *)
| register_file (state: Rf.reg_t)
| scoreboard (state: Scoreboard.reg_t)
(* These are plain registers, not module instances *)
| pc
| epoch.

ext_fn_t: An enumerated type describing custom combinational primitives (custom IP) that your program should have access to (custom sequential IP is implemented using external rules, which are currently a work in progress; see examples/rv/RVCore.v for a concrete example). Use empty_ext_fn_t if you don't use external IP in your design. For example,
```
Inductive ext_fn_t :=
| custom_adder (size: nat).
```

Then, declare the types of the data held in each part of your state and the signatures of your external (combinational) IP (we usually name these functions R and Sigma). (In addition to bitsets, registers can contain structures, enums, or arrays of values; examples of these are given below.)

Definition R (reg: reg_t) :=
  match reg with
  (* The type of the other modules is opaque; it's defined by the Rf module *)
  | to_memory st => MemReqFIFO.R st
  | register_file st => Rf.R st
  …
  (* Our own state is described explicitly: *)
  | pc => bits_t 32
  | epoch => bits_t 1
  end.

Definition Sigma (fn: ext_fn_t): ExternalSignature :=
  match fn with
  | custom_adder sz => {$ bits_t sz ~> bits_t sz ~> bits_t sz $}
  end.

As needed, you can define your own custom types; here are a few examples:

Definition proto :=
  {| enum_name := "protocol";
     enum_members :=
       vect_of_list ["ICMP"; "IGMP"; "TCP"; "UDP"];
     enum_bitpatterns :=
       vect_of_list [Ob~0~0~0~0~0~0~0~1; Ob~0~0~0~0~0~0~1~0;
                     Ob~0~0~0~0~0~1~1~0; Ob~0~0~0~1~0~0~0~1] |}.

Definition flag :=
  {| enum_name := "flag";
     enum_members := vect_of_list ["set"; "unset"];
     enum_bitpatterns := vect_of_list [Ob~1; Ob~0] |}.

Definition ipv4_address :=
  {| array_len := 4;
     array_type := bits_t 8 |}.

Definition ipv4_header :=
  {| struct_name := "ipv4_header";
     struct_fields :=
       [("version", bits_t 4);
        ("ihl", bits_t 4);
        ("dscp", bits_t 6);
        ("ecn", bits_t 2);
        ("len", bits_t 16);
        ("id", bits_t 16);
        ("reserved", enum_t flag);
        ("df", enum_t flag);
        ("mf", enum_t flag);
        ("fragment_offset", bits_t 13);
        ("ttl", bits_t 8);
        ("protocol", enum_t proto);
        ("checksum", bits_t 16);
        ("src", array_t ipv4_address);
        ("dst", array_t ipv4_address)] |}.

Definition result (a: type) :=
  {| struct_name := "result";
     struct_fields := [("valid", bits_t 1); ("value", a)] |}.

Definition response := result (struct_t ipv4_header).

Rules

The main part of your program is rules. You have access to the following syntax (there is no distinction between expressions and statements; statements are just expressions returning unit):

pass: Do nothing
fail: Abort the current rule, reverting all state changes
let var := val in body: Let bindings
set var := val: Assignments
stmt1; stmt2: Sequence
if val then val1 else val2: Conditional
match val with | pattern => body… return default: body: Switches (case analysis)
read0(reg), read1(reg), write0(reg), write1(reg): Read or write a register at port 0 or 1
pack(val), unpack(type, val): Pack a value (go from struct, enum, or arrays to bits) or unpack a bitset
get(struct, field), subst(struct, field, value): Get a field of a struct value, or replace a field in a struct value (without mutating the original one)
getbits(struct, field), substbits(struct, field, value): Like get and subst, but on packed bitsets
!x, x && y, x || y, x ^ y: Logical operators (not, and, or, xor)
x + y, x - y, x << y, x >> y, x >>> y, zeroExtend(x, width), sext(x, width): Arithmetic operators (plus, minus, logical shits, arithmetic shift right, left zero-extension, sign extension)
x < y, x <s y, x > y, x >s y, x <= y, x <s= y, x >= y, x >s= y, x == y, x != y: Comparison operators, signed and unsigned
x ++ y, x[y], x[y :+ z]: Bitset operators (concat, select, indexed part-select)
instance.(method)(arg, …): Call a method of a module
function(args…): Call an internal function
extcall function(args…): Call an external function (combinational IP)
Ob~1~0~1~0, |4`d10|: Bitset constants (here, the number 10 on 4 bits)
struct name { field_n := val_n;… }: Struct constants
enum name { member }: Enum constants
#val: Lift a Coq value (for example a Coq definition)

For example, the following rule decreases the ttl field of an ICMP packet:

Definition _decr_icmp_ttl := {{
  let hdr := unpack(struct_t ipv4_header, read0(input)) in
  let valid := Ob~1 in
  match get(hdr, protocol) with
  | enum proto { ICMP } =>
    let t := get(hdr, ttl) in
    if t == |8`d0| then set valid := Ob~0
    else set hdr := subst(hdr, ttl, t - |8`d1|)
  return default: fail
  end;
  write0(output, pack(struct response { valid := valid; value := hdr }))
}}.

This rule fetches the next instruction in our RV32I core:

Definition fetch := {{
  let pc := read1(pc) in
  write1(pc, pc + |32`d4|);
  toIMem.(MemReq.enq)(struct mem_req {
       byte_en := |4`d0|; (* Load *)
       addr := pc;
       data := |32`d0|
     });
  f2d.(fromFetch.enq)(struct fetch_bookkeeping {
       pc := pc;
       ppc := pc + |32`d4|;
       epoch := read1(epoch)
    })
}}.

Rules are written in an untyped surface language; to typecheck a rule, use tc_action R Sigma rule_body, or use tc_rules as shown below.

Schedulers

A scheduler defines a priority order on rules: in each cycle rules appear to execute sequentially, and later rules that conflict with earlier ones do not execute (of course, all this is about semantics; the circuits generated by the compiler are (almost entirely) parallel).

A scheduler refers to rules by name, so you need three things:

A rule name type:

Inductive rule_name_t :=
  start | step_compute | get_result.

A scheduler definition:

Definition scheduler :=
  start |> step_compute |> get_result |> done.

A mapping from rule names to (typechecked) rules:

Definition rules :=
  tc_rules R Sigma
    (fun rl =>
     match rl with
     | start => {{ … rule body … }}
     | step_compute => gcd_compute
     | get_result => {{ … rule body … }}
     end).

Formal semantics

The semantics of Kôika programs are given by a reference interpreter written in Coq. The results computed by this interpreter are the specification of the meaning of each program.

The reference interpreter takes three inputs:

A program, using the syntax described above

The initial value of each state element, r

Definition r (reg: reg_t): R reg :=
  match reg with
  | to_memory st => MemReqFIFO.r st
  | register_file st => Rf.r st
  …
  | pc => Bits.zero
  | epoch => Bits.zero
  end.

A Coq model of the external IP that you use, if any:

Definition sigma (fn: ext_fn_t): Sig_denote (Sigma fn) :=
  match fn with
  | custom_adder sz => fun (bs1 bs2: bits sz) => Bits.plus bs1 bs2
  end.

Then you can run your code:

Definition cr := ContextEnv.(create) r.

(* This computes a log of reads and writes *)
Definition event_log :=
  tc_compute (interp_scheduler cr sigma rules scheduler).

(* This computes the new value of each register *)
Definition interp_result :=
  tc_compute (commit_update cr event_log).

This interp_scheduler function implements the executable reference semantics of Kôika; it can be used to prove properties about programs, to guarantee that program transformation are correct, or to verify a compiler.

Compiler verification

In addition to the reference interpreter, we have a verified compiler that targets RTL. “Verified”, in this context, means that we have a machine-checked proof that the circuits produced by the compiler compute the exact same results as the original programs they were compiled from (the theorem is compiler_correct in coq/CircuitCorrectness.v).

For instance, in the following example, our theorem guarantees that circuits_result matches interp_result above:

Definition is_external (r: rule_name_t) :=
  false.

Definition circuits :=
  compile_scheduler rules is_external collatz.

Definition circuits_result :=
  tc_compute (interp_circuits empty_sigma circuits (lower_r (ContextEnv.(create) r))).

C++ Simulation

For simulation, debugging, and testing purposes, we have a separate compiler, cuttlesim, which generates C++ models from Kôika designs. The models are reasonably readable, suitable for debugging with GDB or LLDB, and typically run significantly faster than RTL simulation. Here is a concrete example, generated from examples/gcd_machine.v:

bool rule_step_compute() noexcept {
  {
    bits<16> a;
    READ0(step_compute, gcd_a, &a);
    {
      bits<16> b;
      READ0(step_compute, gcd_b, &b);
      if ((a != 16'0_b)) {
        if ((a < b)) {
          WRITE0(step_compute, gcd_b, a);
          WRITE0(step_compute, gcd_a, b);
        } else {
          WRITE0(step_compute, gcd_a, (a - b));
        }
      } else {
        FAIL_FAST(step_compute);
      }
    }
  }

  COMMIT(step_compute);
  return true;
}

The Makefile generated by cuttlec contains multiple useful targets that can be used in connection with cuttlesim; for example, coverage statistics (using gcov) can be used to get a detailed picture of which rules of a design tend to fail, and for what reasons, which makes it easy to diagnose e.g. back-pressure due to incorrect pipelining setups. Additionally, cuttlesim models can be used to generate value change dumps that can be visualized with GTKWave.

We wrote a paper about cuttlesim.

Compilation

The usual compilation process for programs defined using our Coq EDSL in as follows:

Write you program as shown above.
Write a package, gathering all pieces of your program together; packages are documented in coq/Interop.v.
Export that package using extraction to OCaml.
Compile this package to Verilog, C++, etc. using cuttlec; this invokes the verified compiler to circuits and a thin unverified layer to produce RTL, or separate (unverified) code to produce C++ models and graphs.

Additional topics

RTL Simulation, tracing, profiling, etc.

Running the cuttlec with the -t all flag generates all supported output formats, and a Makefile with a number of useful targets, including the following (replace collatz with the name of your design):

make obj_dir/Vcollatz

Compile the generated RTL with Verilator.
make gdb

Compile the C++ model of your design in debug mode, then run it under GDB.
make collatz.hpp.gcov

Generate coverage statistics for the C++ model of your design (this shows which rules firer, how often then fire, and why they fail when they do).
make NCYCLES=25 gtkwave.verilator

Compile the generated RTL with Verilator in --trace mode, then a VCD trace over 25 cycles and open it in GTKWave.

Use make help in the generated directory to learn more.

Function definitions

It is often convenient to define reusable combinational functions separately, as in this example:

Definition alu32: UInternalFunction reg_t empty_ext_fn_t := {{
  fun (funct3: bits_t 3) (inst_30: bits_t 1)
      (a: bits_t 32) (b: bits_t 32): bits_t 32 =>
    let shamt := b[Ob~0~0~0~0~0 :+ 5] in
    match funct3 with
    | #funct3_ADD  => if (inst_30 == Ob~1) then a - b else a + b
    | #funct3_SLL  => a << shamt
    | #funct3_SLT  => zeroExtend(a <s b, 32)
    | #funct3_SLTU => zeroExtend(a < b, 32)
    | #funct3_XOR  => a ^ b
    | #funct3_SRL  => if (inst_30 == Ob~1) then a >>> shamt else a >> shamt
    | #funct3_OR   => a || b
    | #funct3_AND  => a && b
    return default: |32`d0|
    end
}}.

That function would be called by writing alu32(fn3, i30, a, b).

Modularity

Function definitions are best for stateless (combinational) programs. For stateful code fragments, Kôika has a limited form of method calls.

The following (excerpted from examples/conflicts_modular.v) defines a Queue32 module implementing a bypassing FIFO, with methods to dequeue at port 0 and 1 and a method to enqueue at port 0.

Module Import Queue32.
  Inductive reg_t := empty | data.

  Definition R reg :=
    match reg with
    | empty => bits_t 1
    | data => bits_t 32
    end.

  Definition dequeue0: UInternalFunction reg_t empty_ext_fn_t :=
    {{ fun dequeue0 () : bits_t 32 =>
         guard(!read0(empty));
         write0(empty, Ob~1);
         read0(data) }}.

  Definition enqueue0: UInternalFunction reg_t empty_ext_fn_t :=
    {{ fun enqueue0 (val: bits_t 32) : unit_t =>
         guard(read0(empty));
         write0(empty, Ob~0);
         write0(data, val) }}.

  Definition dequeue1: UInternalFunction reg_t empty_ext_fn_t :=
    {{ fun dequeue1 () : bits_t 32 =>
         guard(!read1(empty));
         write1(empty, Ob~1);
         read1(data) }}.
End Queue32.

Our earlier example of conflicts can then be written thus:

Inductive reg_t :=
| in0: Queue32.reg_t -> reg_t
| in1: Queue32.reg_t -> reg_t
| fifo: Queue32.reg_t -> reg_t
| out: Queue32.reg_t -> reg_t.

Inductive rule_name_t := deq0 | deq1 | process.

Definition R (reg: reg_t) : type :=
  match reg with
  | in0 st => Queue32.R st
  | in1 st => Queue32.R st
  | fifo st => Queue32.R st
  | out st => Queue32.R st
  end.

Definition urules (rl: rule_name_t) :=
  match rl with
  | deq0 =>
    {{ fifo.(enqueue0)(in0.(dequeue0)()) }}
  | deq1 =>
    {{ fifo.(enqueue0)(in1.(dequeue0)()) }}
  | process =>
    {{ out.(enqueue0)(fifo.(dequeue1)() + |32`d412|) }}
  end.

Machine-friendly input

When generating Kôika code from another language, it can be easier to target a format with a simpler syntax than our Coq EDSL. In that case you can use Lispy Verilog, an alternative syntax for Kôika based on s-expressions. See the examples/ and tests/ directories for more information; here is one example; the Coq version of the same program is in examples/collatz.v:

;;; Computing terms of the Collatz sequence (Lispy Verilog version)

(defun times_three ((v (bits 16))) (bits 16)
  (+ (<< v 1'1) v))

(module collatz
  (register r0 16'19)

  (rule divide
    (let ((v (read.0 r0))
          (odd (sel v 4'0)))
      (when (not odd)
        (write.0 r0 (lsr v 1'1)))))

  (rule multiply
    (let ((v (read.1 r0))
          (odd (sel v 4'0)))
      (when odd
        (write.1 r0 (+ (times_three v) 16'1)))))

  (scheduler main
    (sequence divide multiply)))

Running on FPGA

The Makefiles that cuttlec generates include targets for generating ECP5 and ICE40 bitstreams. The default ECP5 target is set up for the ULX3S-85k FPGA. The default ICE40 target is set up for the TinyFPGA BX. Both are reasonably affordable FPGAs (but note that right now the RV32i code does not fit on the TinyFPGA BX).

To run the RISCV5 core on the ULX3S on Ubuntu 20:

Download a prebuilt ECP5 toolchain from https://github.com/YosysHQ/fpga-toolchain/releases.
Make sure that the trivial example at https://github.com/ulx3s/blink works.
Run make core in examples/rv to compile the RISCV core (other designs should work too, but you'll need to create a custom wrapper in Verilog to map inputs and outputs to your FPGAs pins.
Run make top_ulx3s.bit in examples/rv/_objects/rv32i.v/ to generate a bitstream. You can prefix this command with MEM_NAME=integ/morse (or any other test program) to load a different memory image into the bitstream.
Run fujprog top_ulx3s.bit to flash the FPGA.
To see the output of putchar(), use a TTY application like tio: tio /dev/ttyUSB0 (the default baud rate is 115200). Alternatively, use tty -F /dev/ttyUSB0 115200 igncr to set up the terminal and then use cat /dev/ttyUSB0.

Browsing the sources

The following list shows the current state of the repo:

coq/

CompilerCorrectness/

(Circuits)

CircuitCorrectness.v: Compiler correctness proof
LoweringCorrectness.v: Proof of correctness for the lowering phase

Correctness.v: End-to-end correctness theorem

(Circuits)

CircuitGeneration.v: Compilation of lowered ASTs into RTL circuits
CircuitOptimization.v: Local optimization of circuits
CircuitProperties.v: Lemmas used in the compiler-correctness proof
CircuitSemantics.v: Interpretation of circuits
CircuitSyntax.v: Syntax of circuits (RTL)

(Frontend)

Desugaring.v: Desugaring of untyped actions
ErrorReporting.v: Typechecking errors and error-reporting functions
Frontend.v: Top-level module imported by Kôika programs
IdentParsing.v: Ltac2-based identifier parsing for prettier notations
Parsing.v: Parser for the Kôika EDSL
Syntax.v: Untyped syntax
SyntaxFunctions.v: Functions on untyped ASTs, including error localization
SyntaxMacros.v: Macros used in untyped programs
TypeInference.v: Type inference and typechecking

(Interop)

Compiler.v: Top-level compilation function and helpers
ExtractionSetup.v: Custom extraction settings (also used by external Kôika programs
Interop.v: Exporting Kôika programs for use with the cuttlec command-line tool

(Language)

CPS.v: Continuation-passing semantics and weakest precondition calculus
CompactLogs.v: Alternative implementation of logs
CompactSemantics.v: Semantics of typed Kôika programs with compact logs
Logs.v: Logs of reads and writes
LoweredSemantics.v: Semantics of Lowered Kôika programs
LoweredSyntax.v: Lowered ASTs (weakly-typed)
Lowering.v: Compilation from typed ASTs to lowered ASTs
Primitives.v: Combinational primitives available in all Kôika programs
TypedSemantics.v: Semantics of typed Kôika programs
TypedSyntax.v: Typed ASTs
Types.v: Types used by Kôika programs

(ORAAT)

OneRuleAtATime.v: Proof of the One-rule-at-a-time theorem
SemanticProperties.v: Properties of the semantics used in the one-rule-at-a-time theorem

(Stdlib)

Std.v: Standard library

(Tools)

LoweredSyntaxFunctions.v: Functions defined on lowered ASTs
Magic.v: Universal axiom to replace the ‘admit’ tactic
TypedSyntaxFunctions.v: Functions defined on typed ASTs
TypedSyntaxProperties.v: Lemmas pertaining to tools on typed syntax

(Utilities)

Common.v: Shared utilities
DeriveShow.v: Automatic derivation of Show instances
Environments.v: Environments used to track variable bindings
EqDec.v: Decidable equality typeclass
FiniteType.v: Finiteness typeclass
IndexUtils.v: Functions on Vect.index elements
Member.v: Dependent type tracking membership into a list
Show.v: Show typeclass (α → string)
Vect.v: Vectors and bitvector library

BitTactics.v: Tactics for proofs about bit vectors
PrimitiveProperties.v: Equations showing how to implement functions on structures and arrays as bitfuns
ProgramTactics.v: Tactics for proving user-defined circuits

etc/

vagrant/

provision.sh: Set up a Vagrant VM for Kôika development

configure: Generate dune files for examples/ and tests/

examples/

cosimulation.v.etc/

blackbox.v: Blackbox verilog module (a simple one-cycle delay) used to demonstrate Cuttlesim+Verilator cosimulation
cosimulation.cpp: Custom Cuttlesim driver that implements a Kôika extfun using a Verilator model

fft.v.etc/

fft.bsv: A Bluespec implementation of the fft.v example

fir.v.etc/

extfuns.hpp: C++ implementation of external functions for the fir example
fir.bsv: A Bluespec implementation of the fir.v example
mod19.v: Verilog implementation of external functions for the fir example

function_call.v.etc/

extfuns.hpp: C++ implementation of external functions for the function_call example
fetch_instr.v: Verilog implementation of external functions for the function_call example

rv/

etc/

nangate45/

synth.sh: Yosys synthesis script for Nangate Open Cell Library (45nm)

sv/

ext_mem.v: Wrapper used to connect the BRAM model with Kôika
memory.v: Verilog model of a BRAM
pll.v: PLL configuration for the TinyFPGA BX board
top.v: Verilog wrapper for the Kôika core (for use in simulation)
top_ice40_uart.v: Verilog wrapper for the Kôika core (for use in FPGA synthesis, with a UART interface)
top_ice40_usb.v: Verilog wrapper for the Kôika core (for use in FPGA synthesis, with a USB interface)
top_uart.v: Verilog wrapper for the Kôika core with a UART interface

cvc64.sh: Simulate the core with CVC
elf.hpp: Support for loading ELF files
iverilog.sh: Simulate the core with Icarus Verilog
rvcore.cuttlesim.cpp: C++ driver for rv32i simulation with Cuttlesim
rvcore.pyverilator.py: Python driver for C++ simulation with Verilator
rvcore.verilator.cpp: C++ driver for rv32 simulation with Verilator
testbench.v: Testbench used with CVC and Icarus Verilog
top_ulx3s.v: Verilog wrapper for the Kôika core (for use in FPGA synthesis, with a UART interface)

Multiplier.v: Implementation of a multiplier module
MultiplierCorrectness.v: Proof of correctness of the multiplier module
RVCore.v: Implementation of our RISC-V core
RVEncoding.v: Encoding-related constants
Scoreboard.v: Implementation of a scoreboard
rv32.v: Definition of a pipelined schedule
rv32e.v: Pipelined instantiation of an RV32E core
rv32i.v: Pipelined instantiation of an RV32I core

uart.v.etc/

top.v: UART testbench
uart.verilator.cpp: Verilator driver for the UART testbench

collatz.lv: Computing terms of the Collatz sequence (Lispy Verilog version)
collatz.v: Computing terms of the Collatz sequence (Coq version)
combinational_proof_tutorial.v: Tutorial: Verifying a small combinational circuit
conflicts.v: Understanding conflicts and forwarding
conflicts_modular.v: Understanding conflicts and forwarding, with modules
cosimulation.v: Using black-box Verilog models (combining Cuttlesim and Verilator)
datatypes.v: Using structures, enums, and arrays
external_rule.v: Calling external (verilog) modules from Kôika
fft.v: Computing an FFT
fir.v: Computing a FIR (Coq version)
function_call.v: Calling external functions
gcd_machine.v: Computing GCDs
method_call.v: Calling methods of internal modules
pipeline.v: Building simple pipelines
pipeline_tutorial.v: Tutorial: Simple arithmetic pipeline
save_restore.v: Save and restore simulation state
uart.v: UART transmitter
vector.v: Representing vectors of registers using Coq inductives

ocaml/

backends/

resources/

cuttlesim.cpp: Default driver for Kôika programs compiled to C++ using Cuttlesim
cuttlesim.hpp: Preamble shared by all Kôika programs compiled to C++
verilator.cpp: Default driver for Kôika programs compiled to C++ using Verilator
verilator.hpp: Preamble shared by all Kôika programs compiled to C++ using Verilator

coq.ml: Coq backend (from Lispy Verilog sources)
cpp.ml: C++ backend
dot.ml: Graphviz backend
gen.ml: Embed resources/* into resources.ml at build time
makefile.ml: Makefile backend (to make it easier to generate traces, statistics, models, etc.)
rtl.ml: Generic RTL backend
verilator.ml: Verilator backend exporting a simple C++ driver
verilog.ml: Verilog backend

common/

common.ml: Shared utilities

cuttlebone/

(Interop)

Extraction.v: Extraction to OCaml (compiler and utilities)

cuttlebone.ml: OCaml wrappers around functionality provided by the library extracted from Coq

frontends/

coq.ml: Simple frontend to compile and load OCaml files extracted from Coq
lv.ml: Lispy Verilog frontend

cuttlec.ml: Command line interface to the compilers
interop.ml: Functions to use if compiling Kôika programs straight from Coq, without going through cuttlec
koika.ml: Top-level library definition
registry.ml: Stub used to load Kôika programs extracted from Coq into cuttlec

tests/

trivial_state_machine.etc/

stm.v: Cleaned-up state machine example

arrays.lv: Unit tests for array functions
bigint.lv: Computations with large bitvectors (the simulator uses boost for >64 bits)
comparisons.lv: Unit tests for comparison operators
cross_cycle.v: Cross-cycle optimization in Cuttlesim models
datatypes.lv: Simple uses of structs and enums
double_write.v: Double-write detection and prevention
errors.1.lv: Syntax and typing errors in LV
errors.v: Syntax and typing errors in Coq
extcall.v: External functions
internal_functions.v: Intfun tests
large_trace.lv: Make sure that snapshots in large traces don't copy data
large_writeset.v: Make sure that the large writeset heuristics in the scheduler don't break things
mul.lv: Computations involving multiplication
muxelim.v: Sanity check for mux-elimination optimization
name_mangling.lv: Unit tests for name mangling
read1_write1_check.v: Detect and reject programs that call read1 after write1 in simulation
register_file_bypassing.v: Ensure that area is reasonable when bypasses don't need extra tracking
shadowing.lv: Unit tests for name shadowing
shifts.v: Regression test for signed shifts
signed.lv: Computations involving sign bits
struct_init.v: Structure initialization
switches.v: Test various forms of switches
taint_analysis.lv: Unit tests to ensure that impure functions are not optimized out
tmp_var.lv: Unit test for “tmp” variable
trivial_state_machine.v: Trivial state machine
unpack.v: Structure unpacking

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.rst

README.rst

Kôika: A Core Language for Rule-Based Hardware Design

Getting started

Installing dependencies and building from source

Browsing examples

Compiling your own programs

Programs written in the Coq EDSL

Programs written in serialized syntax (“Lispy Verilog”)

Technical overview

Execution model

Syntax

Preamble and types

Rules

Schedulers

Formal semantics

Compiler verification

C++ Simulation

Compilation

Additional topics

RTL Simulation, tracing, profiling, etc.

Function definitions

Modularity

Machine-friendly input

Running on FPGA

Browsing the sources

Files

README.rst

Latest commit

History

README.rst

File metadata and controls

Kôika: A Core Language for Rule-Based Hardware Design

Getting started

Installing dependencies and building from source

Browsing examples

Compiling your own programs

Programs written in the Coq EDSL

Programs written in serialized syntax (“Lispy Verilog”)

Technical overview

Execution model

Syntax

Preamble and types

Rules

Schedulers

Formal semantics

Compiler verification

C++ Simulation

Compilation

Additional topics

RTL Simulation, tracing, profiling, etc.

Function definitions

Modularity

Machine-friendly input

Running on FPGA

Browsing the sources