Improve conceptual integrity of SVM nomenclature #14582

Merged
merged 11 commits into from
Mar 7, 2022
70 changes: 35 additions & 35 deletions doc/compiler/aot/SymbolValidationManager.md
@@ -1,5 +1,5 @@
<!--
Copyright (c) 2018, 2021 IBM Corp. and others
Copyright (c) 2018, 2022 IBM Corp. and others

This program and the accompanying materials are made available under
the terms of the Eclipse Public License 2.0 which accompanies this
@@ -24,16 +24,16 @@ SPDX-License-Identifier: EPL-2.0 OR Apache-2.0 OR GPL-2.0 WITH Classpath-excepti

In order for the OpenJ9 Compiler to compile Ahead Of Time (AOT),
or more accurately, Relocatable, code, it needs to be able to store
information for the AOT load infrastructure to validate all the
assumptions made during the compile run, since the AOT load is
information for the AOT load infrastructure to validate all the
assumptions made during the compile run, since the AOT load is
performed in a different JVM environment than the one the compilation
was performed in. For these reasons, the Compiler and the Shared Class
Cache (SCC) have infrastructure to facilitate these validations.

Without the Symbol Validation Manager, validation involves creating a
relocation record to validate an Arbitrary Class, a Class
from a Constant Pool, an Instance Field, or a Static Field. However,
when getting a Class By Name, the validation stored would be an
when getting a Class By Name, the validation stored would be an
Arbitrary Class Validation, which is technically incorrect when the
Name is retrieved from the signature of a method (since the answer
depends on what the Class of that method "sees"). In order to deal
@@ -46,7 +46,7 @@ and unambiguous.

The Symbol Validation Manager (SVM) is very similar to a symbol table
that a linker might use. The idea is to store information about where
a particular symbol (e.g. `J9Class`, `J9Method`, etc.) came from. For example,
a particular value (e.g. `J9Class`, `J9Method`, etc.) came from. For example,
if the compiler got the `J9Class` of `java/lang/String` by name from the
parameter list of method `C.foo(Ljava/lang/String;)V`, then the SVM will
store a record stating that the compiler got `J9Class` `java/lang/String`
@@ -58,18 +58,18 @@ guaranteed to be compatible in the current environment.

However, since `J9Class`es are artifacts of the current environment, how
does the SVM enable a new environment to validate it? The core idea is to
associate unique IDs to each new symbol the compiler gets a hold of. Thus,
when a symbol is acquired by asking a query using a different symbol, the
validation record that gets generated refers to these two symbols by their
IDs, which are environment agnostic. In the load run, the AOT load
infrastructure uses these IDs to build up its table of symbols that are
associate unique IDs to each new value the compiler gets a hold of. Thus,
when a value is acquired by asking a query using a different value, the
validation record that gets generated refers to these two values by their
IDs (symbols), which are environment agnostic. In the load run, the AOT load
infrastructure uses these IDs to build up its table of values that are
valid in _its_ environment.
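
The ID assignment described above can be sketched as follows. This is a minimal illustration, not the SVM's actual implementation: the class name `CompileRunIDTable`, the method `getOrAssignID`, and the choice to start numbering at 1 are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical compile-run table (not the real SVM class): the first time a
// value is seen it receives the next free ID; later requests return that ID.
class CompileRunIDTable
   {
public:
   uint16_t getOrAssignID(const void *value)
      {
      assert(value != nullptr); // only real values get IDs in this sketch

      auto it = _valueToID.find(value);
      if (it != _valueToID.end())
         return it->second;     // value already known: reuse its ID

      uint16_t id = _nextID++;  // new value: hand out a fresh ID
      _valueToID.emplace(value, id);
      return id;
      }

private:
   uint16_t _nextID = 1;        // assumption: numbering starts at 1
   std::unordered_map<const void *, uint16_t> _valueToID;
   };
```

Because a validation record stores only IDs, nothing in it depends on the addresses that the values happen to have in the compiling JVM.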

To give a concrete example:

The compiler decides to compile `A.foo()`. As part of the compilation, it
gets class `B` from `A`'s constant pool, gets class `C` from `A`'s
constant pool, and finally gets `B` by name as seen by `C`. Thus, the
gets class `B` from `A`'s constant pool, gets class `C` from `A`'s
constant pool, and finally gets `B` by name as seen by `C`. Thus, the
validation records that get generated would be:

1. Root Class Record for `A`; `A` gets ID 1
@@ -89,7 +89,7 @@ following manner:

Thus the information written in the validation records is entirely
environment agnostic. In the load run, the AOT infrastructure will
go through these validations and build up the symbols that are valid
go through these validations and build up the values that are valid
in _its_ environment:

1. Map ID=1 to the class of the method being compiled
@@ -102,13 +102,13 @@ and map it to ID=4
5. Get the class by name as seen by the class associated with ID=4
and verify that it is the same class associated with ID=3

If any of the validations fail, the load is aborted. Thus the general
If any of the validations fail, the load is aborted. Thus the general
approach in the load run is:

1. Find the symbol as specified by the validation record
2. Map the ID from the validation record to the symbol
3. If the map already contains a symbol for that ID, validate that
the symbol in the map is the same as the symbol currently being
1. Find the value as specified by the validation record
2. Map the ID (symbol) from the validation record to the value
3. If the map already contains a value for that ID, validate that
the value in the map is the same as the value currently being
validated; fail otherwise
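
The three steps above can be sketched as a small map keyed by ID. This is a hedged illustration only: `LoadRunValueTable` and `validate` are hypothetical names, and the value is assumed to have already been looked up in the loading JVM (step 1), with `nullptr` standing for "not found".

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical load-run table (not the real SVM class) mapping IDs read from
// validation records to the values found in the loading JVM.
class LoadRunValueTable
   {
public:
   // Returns false when the validation fails and the load should be aborted.
   bool validate(uint16_t id, const void *value)
      {
      if (value == nullptr)
         return false; // step 1 failed: the value could not be found

      // Step 2: try to map the ID to the value. emplace leaves an existing
      // entry untouched and reports whether an insertion took place.
      auto result = _idToValue.emplace(id, value);
      if (result.second)
         return true;  // first time this ID is seen

      // Step 3: the ID is already mapped, so both values must agree.
      return result.first->second == value;
      }

private:
   std::unordered_map<uint16_t, const void *> _idToValue;
   };
```

In the example from this document, the fifth record would call `validate(3, ...)` with the class found by name as seen by the class mapped to ID=4, and the call succeeds only if that is the class already mapped to ID=3.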


@@ -118,7 +118,7 @@ validated; fail otherwise

Class Chains are a core component of AOT validation; they ensure that
the shape of a class is the same as that during the compile run. For
technical details about Class Chains, see
technical details about Class Chains, see
[AOT Class Chains](https://github.com/eclipse-openj9/openj9/blob/master/doc/compiler/aot/AOTClassChains.md).
Thus, for every validation record involving classes, a Class Chain
Validation Record is also created (unless a record for that class
@@ -161,9 +161,9 @@ unambiguous.
Because Primitive Classes (and their associated array classes) are always
guaranteed to exist, and because the Root Class and method from root class
have already been validated prior to even attempting to perform the AOT
load, these symbols don't need to be validation records stored in
load, these values don't need to be validation records stored in
the SCC. Therefore, the SVM, in its constructor, initializes the maps
with these IDs and the associated symbols.
with these IDs and the associated values.

### Well-known Classes

@@ -207,38 +207,38 @@ in the same way, to match the IDs assigned during compilation.

## Benefits

1. Makes explicit the provenance of every symbol acquired by the compiler
1. Makes explicit the provenance of every value acquired by the compiler
2. Enables code paths previously disabled under AOT
3. Allows relocations of classes/methods associated with cpIndex=-1
4. Facilitates the enablement of Known Object Table under AOT
5. Facilitates the enablement of JSR292 under AOT

## When to create a new Validation Record

Any time a new front end query is created, or an existing query is
modified, the appropriate validation record should be created, or
modified, respectively<sup>1</sup>. "Front end query" means, a query that the
Any time a new front end query is created, or an existing query is
modified, the appropriate validation record should be created, or
modified, respectively<sup>1</sup>. "Front end query" means, a query that the
compiler (the "back end") makes of the runtime (the "front end") in order
to get information about the environment such that execution of the
compiled method will be functionally correct. The SVM uses the
validation record to redo the query in order to perform the validation.
compiled method will be functionally correct. The SVM uses the
validation record to redo the query in order to perform the validation.
Therefore, the validation record needs to contain all the information
necessary to redo the query in a different JVM instance. Each unique
front end query will have its own associated validation record; this is
a fundamental aspect of the SVM.

Note the validation is required where the query result is used in a way
that could affect **functional correctness** - queries used only for
heuristic purposes do not necessarily _need_ to be validated since,
by nature, they do not affect the correctness of the program.
All omissions on the basis of heuristic usage should be documented as
such in the code to make the omission clearly intentional and the basis
Note the validation is required where the query result is used in a way
that could affect **functional correctness** - queries used only for
heuristic purposes do not necessarily _need_ to be validated since,
by nature, they do not affect the correctness of the program.
All omissions on the basis of heuristic usage should be documented as
such in the code to make the omission clearly intentional and the basis
for that omission clear. The `enterHeuristicRegion`/`exitHeuristicRegion`
APIs are used to facilitate making frontend queries without generating
validation records.
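
One way to use an enter/exit pair safely is a scope guard, sketched below. Only the names `enterHeuristicRegion` and `exitHeuristicRegion` come from this document. The `HeuristicRegionTracker` and `HeuristicRegionGuard` classes, the nesting counter, and `shouldCreateValidationRecord` are assumptions made for illustration and may not match how the SVM tracks heuristic regions.

```cpp
#include <cassert>

// Hypothetical stand-in for the part of the SVM that tracks heuristic
// regions. The nesting counter is an assumption of this sketch.
class HeuristicRegionTracker
   {
public:
   void enterHeuristicRegion() { ++_depth; }
   void exitHeuristicRegion()
      {
      assert(_depth > 0); // every exit must match an earlier enter
      --_depth;
      }

   // Validation records are only wanted outside heuristic regions.
   bool shouldCreateValidationRecord() const { return _depth == 0; }

private:
   int _depth = 0;
   };

// Hypothetical scope guard: enters the region on construction and leaves it
// on destruction, so an early return cannot skip the exit call.
class HeuristicRegionGuard
   {
public:
   explicit HeuristicRegionGuard(HeuristicRegionTracker &tracker)
      : _tracker(tracker)
      {
      _tracker.enterHeuristicRegion();
      }
   ~HeuristicRegionGuard() { _tracker.exitHeuristicRegion(); }

   HeuristicRegionGuard(const HeuristicRegionGuard &) = delete;
   HeuristicRegionGuard &operator=(const HeuristicRegionGuard &) = delete;

private:
   HeuristicRegionTracker &_tracker;
   };
```

Placing the guard in the narrowest scope that contains the heuristic-only queries, next to a comment stating why validation is skipped, keeps the omission visibly intentional.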

<hr/>

1. We can have "compound" queries that simply combine other
existing queries for convenience (e.g. `getMethodFromName`). These
1. We can have "compound" queries that simply combine other
existing queries for convenience (e.g. `getMethodFromName`). These
queries don't necessarily need their own validation records.
4 changes: 2 additions & 2 deletions runtime/compiler/aarch64/codegen/J9AheadOfTimeCompile.cpp
@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (c) 2019, 2021 IBM Corp. and others
* Copyright (c) 2019, 2022 IBM Corp. and others
*
* This program and the accompanying materials are made available under
* the terms of the Eclipse Public License 2.0 which accompanies this
@@ -109,7 +109,7 @@ J9::ARM64::AheadOfTimeCompile::initializePlatformSpecificAOTRelocationHeader(TR:
TR_RelocationRecordDiscontiguousSymbolFromManager *dsfmRecord = reinterpret_cast<TR_RelocationRecordDiscontiguousSymbolFromManager *>(reloRecord);

uint8_t *symbol = (uint8_t *)relocation->getTargetAddress();
uint16_t symbolID = self()->comp()->getSymbolValidationManager()->getIDFromSymbol(static_cast<void *>(symbol));
uint16_t symbolID = self()->comp()->getSymbolValidationManager()->getSymbolIDFromValue(static_cast<void *>(symbol));

uint16_t symbolType = (uint16_t)(uintptr_t)relocation->getTargetAddress2();
