Merge rust-lang#27
27: Return from direct recursion disambiguation r=ptersilie a=vext01



Co-authored-by: Edd Barrett <[email protected]>
bors[bot] and vext01 authored Mar 16, 2022
2 parents 5bbd767 + ce869db commit 3375c6f
Showing 2 changed files with 108 additions and 12 deletions.
2 changes: 2 additions & 0 deletions llvm/lib/Target/X86/X86FastISel.cpp
@@ -2590,6 +2590,8 @@ bool X86FastISel::TryEmitSmallMemcpy(X86AddressMode DestAM,

// Add an annotation to an intrinsic instruction, specifying whether the
// intrinsic has been inlined or not.
//
// This is only necessary for intrinsics which may emit machine code.
void annotateIntrinsic(const IntrinsicInst *II, bool Inlined) {
IntrinsicInst *CI = const_cast<IntrinsicInst *>(II);
LLVMContext& C = CI->getContext();
118 changes: 106 additions & 12 deletions llvm/lib/Transforms/Yk/BlockDisambiguate.cpp
@@ -1,7 +1,13 @@
//===- BlockDisambiguate.cpp - Unambiguous block mapping for yk ----===//
//
// This pass ensures that yk is able to unambiguously map machine blocks back
// to LLVM IR blocks.
// to LLVM IR blocks. Specifically it does two separate, but related, things:
//
// - Inserts blocks to disambiguate intra-function branching.
// - Inserts blocks to disambiguate returning from direct recursion.
//
// Intra-function branch disambiguation
// ------------------------------------
//
// In the JIT runtime, the mapping stage converts the *machine* basic blocks of
// a trace back to high-level basic blocks (the ones in LLVM IR). A problem
@@ -108,18 +114,74 @@
// The former unambiguously expresses that `bbA` was executed twice. The latter
// unambiguously expresses that `bbA` was executed only once.
//
// Return from direct recursion disambiguation
// -------------------------------------------
//
// A similar case that requires disambiguation is where a function recursively
// calls itself (i.e. the function is "directly recursive") immediately before
// returning.
//
// When a series of recursive calls unwinds, we get repeated entries for the
// high-level block containing the return statement. But since the return
// block may include other instructions which may themselves lower to multiple
// machine basic blocks, we need a way to differentiate recursive returning
// from non-recursive returning when we see repeated entries for a block
// containing a return statement.
//
// In other words, given the return block for a function `myself()` shown in
// Fig 3a and the high-level trace `[bbRet, bbRet, bbRet, bbRet]`, how many
// times have we returned from recursive calls? We can't say because
// `<instructions>` may generate multiple machine basic blocks that all map
// back to `bbRet` [1].
//
// Our solution is to insert a (high-level) padding block between the return
// statement and the instructions preceding it.
//
// ┌────────────────┐            ┌────────────────┐
// │bbRet:          │            │bbRet1:         │
// │  <instructions>│            │  <instructions>│
// │  call @myself()│            │  call @myself()│
// │  ret           │            │  br %bbRet2    │
// └────────────────┘            └────────────────┘
//
// (Fig 3a, above)                        ▼
// Return block before             ┌────────────┐
// transformation.                 │bbRet2:     │
//                                 │  br %bbRet3│
//                                 └────────────┘
//
// (Fig 3b, right)                        ▼
// Return block after transformation  ┌───────┐
// with padding block.                │bbRet3:│
//                                    │  ret  │
//                                    └───────┘
//
// After transformation, repeated entries for a block can only occur in the
// mapped trace if `<instructions>` becomes multiple machine blocks during
// code-gen (e.g. `[bbRet1, bbRet1, bbRet1, bbRet1]`), whereas returning from
// direct recursion will cause sequences of `bbRet2, bbRet3` to appear in the
// mapped trace.
//
// As with intra-function branch disambiguation, the mapper is then free to
// collapse repeated entries for the same block when constructing the mapped
// trace.
//
// Discussion
// ----------
//
// The pass runs after high-level IR optimisations (and requires some backend
// optimisations disabled) to ensure that LLVM doesn't undo our work, by
// folding the machine block for `bbB` back into its predecessor in `bbA`.
//
// Alternative approaches that we dismissed, and why:
//
// - Consider branches back to the entry machine block of a high-level block
// as a re-execution of the high-level block. Even assuming that we can
// identify the entry machine block for a high-level block, this is flawed.
// As can be seen in the example above, both internal and non-internal
// control flow can branch back to the entry block. Additionally, there may
// not be a unique entry machine basic block.
// - For intra-function branches, consider branches back to the entry machine
// block of a high-level block as a re-execution of the high-level block.
// Even assuming that we can identify the entry machine block for a
// high-level block, this is flawed. As can be seen in the example above,
// both internal and non-internal control flow can branch back to the entry
// block. Additionally, there may not be a unique entry machine basic block.
//
// - Mark (in the machine IR) which branches are exits to the high-level IR
// block and encode this in the basic block map somehow. This is more
@@ -132,7 +194,8 @@
// likely that some LLVM IR constructs require internal control flow for
// correct semantics.
//
// Footnotes:
// Footnotes
// ---------
//
// [0]: For some targets, a single high-level LLVM IR instruction can even
// lower to a machine-IR-level loop, for example `cmpxchg` on some ARM
@@ -142,6 +205,10 @@
// a potentially unbounded number of machine blocks can be executed
// within the confines of a single high-level basic block.
//
// [1]: Further, those machine blocks may branch to the same address that a
// return from direct recursion would land at, adding another layer of
// ambiguity.
//
//===----------------------------------------------------------------------===//

#include "llvm/Transforms/Yk/BlockDisambiguate.h"
@@ -178,8 +245,9 @@ class YkBlockDisambiguate : public ModulePass {
}

private:
BasicBlock *makeDisambiguationBB(LLVMContext &Context, BasicBlock *BB,
std::vector<BasicBlock *> &NewBBs) {
// Create a block for intra-function branch disambiguation.
BasicBlock *makeBranchDisambiguationBB(LLVMContext &Context, BasicBlock *BB,
std::vector<BasicBlock *> &NewBBs) {
BasicBlock *DBB = BasicBlock::Create(Context, "");
NewBBs.push_back(DBB);
IRBuilder<> Builder(DBB);
@@ -202,7 +270,7 @@ class YkBlockDisambiguate : public ModulePass {
SuccIdx++) {
BasicBlock *SuccBB = BI->getSuccessor(SuccIdx);
if (SuccBB == &BB) {
BasicBlock *DBB = makeDisambiguationBB(Context, &BB, NewBBs);
BasicBlock *DBB = makeBranchDisambiguationBB(Context, &BB, NewBBs);
BI->setSuccessor(SuccIdx, DBB);
BB.replacePhiUsesWith(&BB, DBB);
}
@@ -213,11 +281,37 @@ class YkBlockDisambiguate : public ModulePass {
SuccIdx++) {
BasicBlock *SuccBB = SI->getSuccessor(SuccIdx);
if (SuccBB == &BB) {
BasicBlock *DBB = makeDisambiguationBB(Context, &BB, NewBBs);
BasicBlock *DBB = makeBranchDisambiguationBB(Context, &BB, NewBBs);
SI->setSuccessor(SuccIdx, DBB);
BB.replacePhiUsesWith(&BB, DBB);
}
}
} else if (isa<ReturnInst>(TI)) {
// Apply return from direct recursion disambiguation.
//
// YKFIXME: We do this even if the function is not directly recursive.
// If we can prove that it is not, then we can skip this step.

// Make the New Return Block (NRBB) and the Padding Block (PBB).
BasicBlock *NRBB = BasicBlock::Create(Context, "");
BasicBlock *PBB = BasicBlock::Create(Context, "");

// Make the original return block branch to the padding block.
IRBuilder<> Builder(Context);
Builder.SetInsertPoint(TI);
Builder.CreateBr(PBB);

// Make the padding block branch to the new return block.
Builder.SetInsertPoint(PBB);
Builder.CreateBr(NRBB);

// Move the original return instruction into the new return block.
Builder.SetInsertPoint(NRBB);
TI->removeFromParent();
Builder.Insert(TI);

NewBBs.push_back(NRBB);
NewBBs.push_back(PBB);
}
}
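
The header comment in `BlockDisambiguate.cpp` says that, once the padding block is in place, the mapper is free to collapse repeated entries for the same block. The following is a minimal standalone sketch of that collapsing step, not yk's actual mapper. The helper names `collapseRepeats` and `countBlock` are hypothetical, and a trace is modelled simply as a list of block names, which is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of yk or LLVM): collapse runs of identical
// adjacent block names into a single entry. After the pass has run, such
// runs can only come from one high-level block lowering to several machine
// blocks, so dropping them loses no information.
std::vector<std::string>
collapseRepeats(const std::vector<std::string> &Trace) {
  std::vector<std::string> Out;
  for (const std::string &BB : Trace) {
    if (Out.empty() || Out.back() != BB)
      Out.push_back(BB);
  }
  return Out;
}

// Hypothetical helper: count how often the block `Name` occurs in a trace.
// Applied to a collapsed trace with the name of the padding block, this
// gives the number of times the return path was taken.
std::size_t countBlock(const std::vector<std::string> &Trace,
                       const std::string &Name) {
  return static_cast<std::size_t>(
      std::count(Trace.begin(), Trace.end(), Name));
}
```

For example, collapsing `[bbRet1, bbRet1, bbRet2, bbRet3, bbRet1, bbRet2, bbRet3]` yields six entries in which the padding block `bbRet2` occurs twice, whereas the pre-transformation trace `[bbRet, bbRet, bbRet, bbRet]` collapses to a single entry that says nothing about how many returns happened.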
