-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change stack and register models to bytes instead of values #245
Open
gergo-
wants to merge
15
commits into
AbsInt:master
Choose a base branch
from
gergo-:locmap-encoding
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The BR_splitlong constructor used to be recursive, meaning that a long result could in theory be split into an arbitrary tree of atomic parts. But we only ever split longs into exactly two ints, so this generality was not needed. This simplification will help with the LTL typing proof.
If LTLtyping finds that the program after register allocation is well-typed, then execution preserves well-typedness of the state. In particular, this typing property ensures that Locmap accesses are well-typed: All register writes are of values with a type compatible with the register's type.
Locmap.set now uniformly uses `Val.load_result` to model stores to registers and to stack slots equivalently.
Instead of mapping locations to values directly, map them to lists of encoded bytes (`list memval`). Locmap accesses must now pass through encoding and decoding steps. The new `ls @ l` notation wraps read access (i.e., `Locmap.get`) to locset `ls` at location `l`. As a side effect, as read accesses `ls @ l` involve a decoding step involving the type of the location `l`, the result read always has the type of `l`. The notion of a well-typed locset, used in LTL and Linear semantics, becomes trivial: All locsets are now well-typed. This simplifies some proofs in Lineartyping and LTLtyping. A future patch can simplify things further.
As the `wt_locset` predicate is now true for all locsets, we no longer need to track it in preservation proofs.
Introduce a new module `Registerfile` that models the register file as a block of bytes. Register accesses map register names to addresses in this block and load/store them using the machinery for memory loads/stores. The registers' addresses in the register file are computed from their `IndexedMreg` indices. When computing addresses, these indices are interpreted as machine words. The indices of 64-bit registers must thus be spaced at least 2 apart. This patch also adjusts the indices for all architectures.
Replace the `ty` type field in stack slots with a `q` quantity field. The idea is to make typing more coarse-grained, similarly to registers. This solves some problems related to copying data between strictly typed slots and more liberally typed registers.
A fragment block is a memory-like block of fragments of values. It provides for reading and writing values. There are also lower-level functions for reading and writing lists of "bytes" (memvals). The blocks are conceptually infinite and indexed by "offsets" expressed in words (32-bit blocks). Accesses touch a 32- or 64-bit quantity. This limits the number of cases of overlapping accesses to take into account. Using quantities also maps cleanly to encoding values using Many32/Many64 chunks. This way, values are encoded using the projection/injection functions that can produce Fragments or Undefs, but never Byte memvals. This avoids some kinds of type punning that can be modeled by CompCert for general memory accesses (encoding a value as Bytes, then decoding a subsequence of those Bytes to a different value). Fragment blocks are meant to be the single unified basis for modeling the register file at Mach and Asm levels, as well as stack frames. In particular, the byte copy functions provide for spilling 64-bit registers without having to know if the fragments stored within encode a single 64-bit value or two 32-bit values.
The Regfile module now uses FragmentBlock for its representation, simplifying the proofs in the module itself. The stack is now modeled using a new Stack module based on FragmentBlock as well.
This patch turns the Regfile module into a functor. Before, only mreg registers were based on Regfile, but with this patch preg uses Regfile too.
This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute PPC64 instructions in 32-bit mode is an error.
This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute x86_64 instructions in 32-bit mode is an error.
This makes all preg accesses typed, so a large number of proofs must be adjusted. Some other details change as well due to typing; for example, trying to execute 64-bit instructions in 32-bit mode is an error.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request changes CompCert's stack and register model from a
value-based one to a more low-level one based on bytes. That is, from the
LTL level onwards, data structures like
Locmap
andPregmap
are no longermappings from location names to values, but memory-like blocks of bytes.
Their
get
andset
operations decode bytes to values and vice versaon-the-fly. (However, register-to-register copies actually copy raw bytes.)
To abstract similarities between the
mreg
andpreg
representationlevels, a new
Regfile
module (functor) is introduced.As enoding and decoding of values is type-sensitive, lemmas about these
operations enforce well-typedness. In particular, the
gss
lemma onlyguarantees that a value written to a location can be read back unchanged if
the value actually fits into the location. In other words, 64-bit values can
no longer be stored into 32-bit registers and read back.
Large parts of these patches enforce this well-typedness at all intermediate
language levels on all supported architectures.
This work is meant as a basis for further extensions for modeling
subregister aliasing correctly in the future.