Change stack and register models to bytes instead of values #245

gergo- · 2018-08-23T15:06:31Z

This pull request changes CompCert's stack and register model from a
value-based one to a more low-level one based on bytes. That is, from the
LTL level onwards, data structures like Locmap and Pregmap are no longer
mappings from location names to values, but memory-like blocks of bytes.
Their get and set operations decode bytes to values and vice versa
on-the-fly. (However, register-to-register copies actually copy raw bytes.)
To abstract similarities between the mreg and preg representation
levels, a new Regfile module (functor) is introduced.

As enoding and decoding of values is type-sensitive, lemmas about these
operations enforce well-typedness. In particular, the gss lemma only
guarantees that a value written to a location can be read back unchanged if
the value actually fits into the location. In other words, 64-bit values can
no longer be stored into 32-bit registers and read back.

Large parts of these patches enforce this well-typedness at all intermediate
language levels on all supported architectures.

This work is meant as a basis for further extensions for modeling
subregister aliasing correctly in the future.

The BR_splitlong constructor used to be recursive, meaning that a long result could in theory be split into an arbitrary tree of atomic parts. But we only ever split longs into exactly two ints, so this generality was not needed. This simplification will help with the LTL typing proof.

If LTLtyping finds that the program after register allocation is well-typed, then execution preserves well-typedness of the state. In particular, this typing property ensures that Locmap accesses are well-typed: All register writes are of values with a type compatible with the register's type.

Locmap.set now uniformly uses `Val.load_result` to model stores to registers and to stack slots equivalently.

Instead of mapping locations to values directly, map them to lists of encoded bytes (`list memval`). Locmap accesses must now pass through encoding and decoding steps. The new `ls @ l` notation wraps read access (i.e., `Locmap.get`) to locset `ls` at location `l`. As a side effect, as read accesses `ls @ l` involve a decoding step involving the type of the location `l`, the result read always has the type of `l`. The notion of a well-typed locset, used in LTL and Linear semantics, becomes trivial: All locsets are now well-typed. This simplifies some proofs in Lineartyping and LTLtyping. A future patch can simplify things further.

As the `wt_locset` predicate is now true for all locsets, we no longer need to track it in preservation proofs.

Introduce a new module `Registerfile` that models the register file as a block of bytes. Register accesses map register names to addresses in this block and load/store them using the machinery for memory loads/stores. The registers' addresses in the register file are computed from their `IndexedMreg` indices. When computing addresses, these indices are interpreted as machine words. The indices of 64-bit registers must thus be spaced at least 2 apart. This patch also adjusts the indices for all architectures.

Replace the `ty` type field in stack slots with a `q` quantity field. The idea is to make typing more coarse-grained, similarly to registers. This solves some problems related to copying data between strictly typed slots and more liberally typed registers.

A fragment block is a memory-like block of fragments of values. It provides for reading and writing values. There are also lower-level functions for reading and writing lists of "bytes" (memvals). The blocks are conceptually infinite and indexed by "offsets" expressed in words (32-bit blocks). Accesses touch a 32- or 64-bit quantity. This limits the number of cases of overlapping accesses to take into account. Using quantities also maps cleanly to encoding values using Many32/Many64 chunks. This way, values are encoded using the projection/injection functions that can produce Fragments or Undefs, but never Byte memvals. This avoids some kinds of type punning that can be modeled by CompCert for general memory accesses (encoding a value as Bytes, then decoding a subsequence of those Bytes to a different value). Fragment blocks are meant to be the single unified basis for modeling the register file at Mach and Asm levels, as well as stack frames. In particular, the byte copy functions provide for spilling 64-bit registers without having to know if the fragments stored within encode a single 64-bit value or two 32-bit values.

The Regfile module now uses FragmentBlock for its representation, simplifying the proofs in the module itself. The stack is now modeled using a new Stack module based on FragmentBlock as well.

This patch turns the Regfile module into a functor. Before, only mreg registers were based on Regfile, but with this patch preg uses Regfile too.

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute PPC64 instructions in 32-bit mode is an error.

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute x86_64 instructions in 32-bit mode is an error.

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute 64-bit instructions in 32-bit mode is an error.

gergo- added 15 commits August 23, 2018 15:50

Add LTLtyping and enforce well-typed LTL in Allocation

3c5e96e

Adapt LTL typing proof for all architectures

02ed73f

Unify register and stack cases in Locmap.set

7c3c74e

Locmap.set now uniformly uses `Val.load_result` to model stores to registers and to stack slots equivalently.

Simplify properties and proofs using wt_locset

fb34616

As the `wt_locset` predicate is now true for all locsets, we no longer need to track it in preservation proofs.

Change Locations (register file and stack) to FragmentBlock

9eb9c9f

The Regfile module now uses FragmentBlock for its representation, simplifying the proofs in the module itself. The stack is now modeled using a new Stack module based on FragmentBlock as well.

Change Regfile to a functor, base mreg and ARM's preg on it

3b59dae

This patch turns the Regfile module into a functor. Before, only mreg registers were based on Regfile, but with this patch preg uses Regfile too.

Port PowerPC's pregs to the Registerfile representation

b0d3279

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute PPC64 instructions in 32-bit mode is an error.

Port x86's pregs to the Registerfile representation

675e82b

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute x86_64 instructions in 32-bit mode is an error.

Port RISC-V's pregs to the Registerfile representation

8ea0903

This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute 64-bit instructions in 32-bit mode is an error.

xavierleroy force-pushed the master branch from aa95f89 to 0aeff47 Compare November 7, 2020 14:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change stack and register models to bytes instead of values #245

Change stack and register models to bytes instead of values #245

gergo- commented Aug 23, 2018

Change stack and register models to bytes instead of values #245

Are you sure you want to change the base?

Change stack and register models to bytes instead of values #245

Conversation

gergo- commented Aug 23, 2018