[FuncSpec] Update function specialization to handle phi-chains #71442
Conversation
When using the LLVM flang compiler with alias analysis (AA) enabled, SPEC2017:548.exchange2_r was running significantly slower than without AA. This was caused by the GVN pass replacing many of the loads in the pre-AA code with phi-nodes that form a long chain of dependencies, which function specialization was unable to follow.

This adds a function to follow phi-nodes when they form a strongly connected component, with some limitations to avoid spending ages analysing phi-nodes. The minimum latency savings threshold also had to be lowered - fewer load instructions means less saving. Some more prints were added to help debug the isProfitable decision.

No significant change in compile time or generated code-size.

Co-authored-by: Alexandros Lamprineas <[email protected]>
@llvm/pr-subscribers-function-specialization @llvm/pr-subscribers-llvm-transforms

Author: Mats Petersson (Leporacanthicus)

Changes

When using the LLVM flang compiler with alias analysis (AA) enabled, SPEC2017:548.exchange2_r was running significantly slower than without AA. This was caused by the GVN pass replacing many of the loads in the pre-AA code with phi-nodes that form a long chain of dependencies, which function specialization was unable to follow. This adds a function to follow phi-nodes when they form a strongly connected component, with some limitations to avoid spending ages analysing phi-nodes. The minimum latency savings threshold also had to be lowered - fewer load instructions means less saving. Some more prints were added to help debug the isProfitable decision. No significant change in compile time or generated code-size.

Full diff: https://github.com/llvm/llvm-project/pull/71442.diff

3 Files Affected:
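To make the motivation concrete, here is a minimal, hypothetical IR sketch (function and block names invented; the realistic reproducer is the test added in the diff below) of the kind of phi chain that GVN leaves behind. Every incoming value ultimately traces back to the argument %n, so once %n is specialized to a constant the whole group folds, but only if the cost model is willing to walk through the intermediate phis instead of giving up at the first unknown incoming value.

```llvm
; Hypothetical, simplified example: a group of phis whose values all trace
; back to the argument %n. With %n specialized to a constant, every phi in
; the group folds to that constant, but proving this requires following
; phi-to-phi edges.
define internal i64 @chain(i64 %n, i1 %c) {
entry:
  br label %a

a:                        ; preds = %entry, %b
  %phi.a = phi i64 [ %n, %entry ], [ %phi.b, %b ]
  br i1 %c, label %b, label %exit

b:                        ; preds = %a
  %phi.b = phi i64 [ %phi.a, %a ]
  br label %a

exit:                     ; preds = %a
  ret i64 %phi.a
}
```

The new `funcspec-max-discovery-depth` option and the raised `funcspec-max-incoming-phi-values` limit in the diff bound how much of such a group the analysis is willing to explore.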
diff --git a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
index 50f9aae73dc53e2..f35543cb8411b35 100644
--- a/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
+++ b/llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
@@ -183,6 +183,8 @@ class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
DenseSet<BasicBlock *> DeadBlocks;
// PHI nodes we have visited before.
DenseSet<Instruction *> VisitedPHIs;
+ // PHI nodes forming a strongly connected component.
+ DenseSet<PHINode *> StronglyConnectedPHIs;
// PHI nodes we have visited once without successfully constant folding them.
// Once the InstCostVisitor has processed all the specialization arguments,
// it should be possible to determine whether those PHIs can be folded
@@ -217,6 +219,8 @@ class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
Cost estimateSwitchInst(SwitchInst &I);
Cost estimateBranchInst(BranchInst &I);
+ void discoverStronglyConnectedComponent(PHINode *PN, unsigned Depth);
+
Constant *visitInstruction(Instruction &I) { return nullptr; }
Constant *visitPHINode(PHINode &I);
Constant *visitFreezeInst(FreezeInst &I);
diff --git a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
index b75ca7761a60b62..23e665a1901b5e1 100644
--- a/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
@@ -39,10 +39,15 @@ static cl::opt<unsigned> MaxClones(
"The maximum number of clones allowed for a single function "
"specialization"));
+static cl::opt<unsigned> MaxDiscoveryDepth(
+ "funcspec-max-discovery-depth", cl::init(10), cl::Hidden,
+ cl::desc("The maximum recursion depth allowed when searching for strongly "
+ "connected phis"));
+
static cl::opt<unsigned> MaxIncomingPhiValues(
- "funcspec-max-incoming-phi-values", cl::init(4), cl::Hidden, cl::desc(
- "The maximum number of incoming values a PHI node can have to be "
- "considered during the specialization bonus estimation"));
+ "funcspec-max-incoming-phi-values", cl::init(8), cl::Hidden,
+ cl::desc("The maximum number of incoming values a PHI node can have to be "
+ "considered during the specialization bonus estimation"));
static cl::opt<unsigned> MaxBlockPredecessors(
"funcspec-max-block-predecessors", cl::init(2), cl::Hidden, cl::desc(
@@ -64,9 +69,9 @@ static cl::opt<unsigned> MinCodeSizeSavings(
"much percent of the original function size"));
static cl::opt<unsigned> MinLatencySavings(
- "funcspec-min-latency-savings", cl::init(70), cl::Hidden, cl::desc(
- "Reject specializations whose latency savings are less than this"
- "much percent of the original function size"));
+ "funcspec-min-latency-savings", cl::init(45), cl::Hidden,
+ cl::desc("Reject specializations whose latency savings are less than this"
+ "much percent of the original function size"));
static cl::opt<unsigned> MinInliningBonus(
"funcspec-min-inlining-bonus", cl::init(300), cl::Hidden, cl::desc(
@@ -262,30 +267,86 @@ Cost InstCostVisitor::estimateBranchInst(BranchInst &I) {
return estimateBasicBlocks(WorkList);
}
+void InstCostVisitor::discoverStronglyConnectedComponent(PHINode *PN,
+ unsigned Depth) {
+ if (Depth > MaxDiscoveryDepth)
+ return;
+
+ if (PN->getNumIncomingValues() > MaxIncomingPhiValues)
+ return;
+
+ if (!StronglyConnectedPHIs.insert(PN).second)
+ return;
+
+ for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
+ Value *V = PN->getIncomingValue(I);
+ if (auto *Phi = dyn_cast<PHINode>(V)) {
+ if (Phi == PN || DeadBlocks.contains(PN->getIncomingBlock(I)))
+ continue;
+ discoverStronglyConnectedComponent(Phi, Depth + 1);
+ }
+ }
+}
+
Constant *InstCostVisitor::visitPHINode(PHINode &I) {
if (I.getNumIncomingValues() > MaxIncomingPhiValues)
return nullptr;
bool Inserted = VisitedPHIs.insert(&I).second;
Constant *Const = nullptr;
+ SmallVector<PHINode *, 8> UnknownIncomingValues;
- for (unsigned Idx = 0, E = I.getNumIncomingValues(); Idx != E; ++Idx) {
- Value *V = I.getIncomingValue(Idx);
- if (auto *Inst = dyn_cast<Instruction>(V))
- if (Inst == &I || DeadBlocks.contains(I.getIncomingBlock(Idx)))
- continue;
- Constant *C = findConstantFor(V, KnownConstants);
- if (!C) {
- if (Inserted)
- PendingPHIs.push_back(&I);
- return nullptr;
+ auto CanConstantFoldPhi = [&](PHINode *PN) -> bool {
+ UnknownIncomingValues.clear();
+
+ for (unsigned I = 0, E = PN->getNumIncomingValues(); I != E; ++I) {
+ Value *V = PN->getIncomingValue(I);
+
+ // Disregard self-references and dead incoming values.
+ if (auto *Inst = dyn_cast<Instruction>(V))
+ if (Inst == PN || DeadBlocks.contains(PN->getIncomingBlock(I)))
+ continue;
+
+ if (Constant *C = findConstantFor(V, KnownConstants)) {
+ if (!Const)
+ Const = C;
+ // Not all incoming values are the same constant. Bail immediately.
+ else if (C != Const)
+ return false;
+ } else if (auto *Phi = dyn_cast<PHINode>(V)) {
+ // It's not a strongly connected phi. Collect it and bail at the end.
+ if (!StronglyConnectedPHIs.contains(Phi))
+ UnknownIncomingValues.push_back(Phi);
+ } else {
+ // We can't reason about anything else.
+ return false;
+ }
+ }
+ return UnknownIncomingValues.empty();
+ };
+
+ if (CanConstantFoldPhi(&I))
+ return Const;
+
+ if (Inserted) {
+ // First time we are seeing this phi. We'll retry later, after all
+ // the constant arguments have been propagated. Bail for now.
+ PendingPHIs.push_back(&I);
+ return nullptr;
+ }
+
+ for (PHINode *Phi : UnknownIncomingValues)
+ discoverStronglyConnectedComponent(Phi, 1);
+
+ bool CannotConstantFoldPhi = false;
+ for (PHINode *Phi : StronglyConnectedPHIs) {
+ if (!CanConstantFoldPhi(Phi)) {
+ CannotConstantFoldPhi = true;
+ break;
}
- if (!Const)
- Const = C;
- else if (C != Const)
- return nullptr;
}
- return Const;
+ StronglyConnectedPHIs.clear();
+ return CannotConstantFoldPhi ? nullptr : Const;
}
Constant *InstCostVisitor::visitFreezeInst(FreezeInst &I) {
@@ -809,20 +870,40 @@ bool FunctionSpecializer::findSpecializations(Function *F, unsigned FuncSize,
auto IsProfitable = [](Bonus &B, unsigned Score, unsigned FuncSize,
unsigned FuncGrowth) -> bool {
// No check required.
- if (ForceSpecialization)
+ if (ForceSpecialization) {
+ LLVM_DEBUG(dbgs() << "Force is on\n");
return true;
+ }
// Minimum inlining bonus.
- if (Score > MinInliningBonus * FuncSize / 100)
+ if (Score > MinInliningBonus * FuncSize / 100) {
+ LLVM_DEBUG(dbgs()
+ << "FnSpecialization: Min inliningbous: Score = " << Score
+ << " > " << MinInliningBonus * FuncSize / 100 << "\n");
return true;
+ }
// Minimum codesize savings.
- if (B.CodeSize < MinCodeSizeSavings * FuncSize / 100)
+ if (B.CodeSize < MinCodeSizeSavings * FuncSize / 100) {
+ LLVM_DEBUG(dbgs()
+ << "FnSpecialization: Min CodeSize Saving: CodeSize = "
+ << B.CodeSize << " > "
+ << MinCodeSizeSavings * FuncSize / 100 << "\n");
return false;
+ }
// Minimum latency savings.
- if (B.Latency < MinLatencySavings * FuncSize / 100)
+ if (B.Latency < MinLatencySavings * FuncSize / 100) {
+ LLVM_DEBUG(dbgs()
+ << "FnSpecialization: Min Latency Saving: Latency = "
+ << B.Latency << " > " << MinLatencySavings * FuncSize / 100
+ << "\n");
return false;
+ }
// Maximum codesize growth.
- if (FuncGrowth / FuncSize > MaxCodeSizeGrowth)
+ if (FuncGrowth / FuncSize > MaxCodeSizeGrowth) {
+ LLVM_DEBUG(dbgs() << "FnSpecialization: Max Func Growth: CodeSize = "
+ << FuncGrowth / FuncSize << " > "
+ << MaxCodeSizeGrowth << "\n");
return false;
+ }
return true;
};
diff --git a/llvm/test/Transforms/FunctionSpecialization/discover-strongly-connected-phis.ll b/llvm/test/Transforms/FunctionSpecialization/discover-strongly-connected-phis.ll
new file mode 100644
index 000000000000000..3463ddb6f066de8
--- /dev/null
+++ b/llvm/test/Transforms/FunctionSpecialization/discover-strongly-connected-phis.ll
@@ -0,0 +1,87 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+;
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -S < %s | FileCheck %s --check-prefix=FUNCSPEC
+; RUN: opt -passes="ipsccp<func-spec>" -funcspec-min-function-size=20 -funcspec-for-literal-constant -funcspec-max-discovery-depth=5 -S < %s | FileCheck %s --check-prefix=NOFUNCSPEC
+
+define i64 @bar(i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10) {
+; FUNCSPEC-LABEL: define i64 @bar(
+; FUNCSPEC-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i1 [[C5:%.*]], i1 [[C6:%.*]], i1 [[C7:%.*]], i1 [[C8:%.*]], i1 [[C9:%.*]], i1 [[C10:%.*]]) {
+; FUNCSPEC-NEXT: entry:
+; FUNCSPEC-NEXT: [[F1:%.*]] = call i64 @foo.specialized.1(i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0:![0-9]+]]
+; FUNCSPEC-NEXT: [[F2:%.*]] = call i64 @foo.specialized.2(i64 4, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG1:![0-9]+]]
+; FUNCSPEC-NEXT: [[ADD:%.*]] = add nuw nsw i64 [[F1]], [[F2]]
+; FUNCSPEC-NEXT: ret i64 [[ADD]]
+;
+; NOFUNCSPEC-LABEL: define i64 @bar(
+; NOFUNCSPEC-SAME: i1 [[C1:%.*]], i1 [[C2:%.*]], i1 [[C3:%.*]], i1 [[C4:%.*]], i1 [[C5:%.*]], i1 [[C6:%.*]], i1 [[C7:%.*]], i1 [[C8:%.*]], i1 [[C9:%.*]], i1 [[C10:%.*]]) {
+; NOFUNCSPEC-NEXT: entry:
+; NOFUNCSPEC-NEXT: [[F1:%.*]] = call i64 @foo(i64 3, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0:![0-9]+]]
+; NOFUNCSPEC-NEXT: [[F2:%.*]] = call i64 @foo(i64 4, i1 [[C1]], i1 [[C2]], i1 [[C3]], i1 [[C4]], i1 [[C5]], i1 [[C6]], i1 [[C7]], i1 [[C8]], i1 [[C9]], i1 [[C10]]), !range [[RNG0]]
+; NOFUNCSPEC-NEXT: [[ADD:%.*]] = add nuw nsw i64 [[F1]], [[F2]]
+; NOFUNCSPEC-NEXT: ret i64 [[ADD]]
+;
+entry:
+ %f1 = call i64 @foo(i64 3, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10)
+ %f2 = call i64 @foo(i64 4, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10)
+ %add = add i64 %f1, %f2
+ ret i64 %add
+}
+
+define internal i64 @foo(i64 %n, i1 %c1, i1 %c2, i1 %c3, i1 %c4, i1 %c5, i1 %c6, i1 %c7, i1 %c8, i1 %c9, i1 %c10) {
+entry:
+ br i1 %c1, label %l1, label %l9
+
+l1:
+ %phi1 = phi i64 [ %n, %entry ], [ %phi2, %l2 ]
+ %add = add i64 %phi1, 1
+ %div = sdiv i64 %add, 2
+ br i1 %c2, label %l1_5, label %exit
+
+l1_5:
+ br i1 %c3, label %l1_75, label %l6
+
+l1_75:
+ br i1 %c4, label %l2, label %l3
+
+l2:
+ %phi2 = phi i64 [ %phi1, %l1_75 ], [ %phi3, %l3 ]
+ br label %l1
+
+l3:
+ %phi3 = phi i64 [ %phi1, %l1_75 ], [ %phi4, %l4 ]
+ br label %l2
+
+l4:
+ %phi4 = phi i64 [ %phi5, %l5 ], [ %phi6, %l6 ]
+ br i1 %c5, label %l3, label %l6
+
+l5:
+ %phi5 = phi i64 [ %phi6, %l6_5 ], [ %phi7, %l7 ]
+ br label %l4
+
+l6:
+ %phi6 = phi i64 [ %phi4, %l4 ], [ %phi1, %l1_5 ]
+ br i1 %c6, label %l4, label %l6_5
+
+l6_5:
+ br i1 %c7, label %l5, label %l8
+
+l7:
+ %phi7 = phi i64 [ %phi9, %l9 ], [ %phi8, %l8 ]
+ br i1 %c8, label %l5, label %l8
+
+l8:
+ %phi8 = phi i64 [ %phi6, %l6_5 ], [ %phi7, %l7 ]
+ br i1 %c9, label %l7, label %l9
+
+l9:
+ %phi9 = phi i64 [ %n, %entry ], [ %phi8, %l8 ]
+ %sub = sub i64 %phi9, 1
+ %mul = mul i64 %sub, 2
+ br i1 %c10, label %l7, label %exit
+
+exit:
+ %res = phi i64 [ %div, %l1 ], [ %mul, %l9]
+ ret i64 %res
+}
+
Thanks for this patch. A few minor comments inline.
I've added some inline comments about the debug output, otherwise it looks fine. I would like @momchil-velikov to have a say since I have co-authored this patch. I checked the compile-time impact on the llvm-test-suite, which didn't raise any concerns (same number of specializations happening: 2 specializations in ClamAV for the O3 pipeline and 2 in SPASS for the LTO pipeline; geomean was +0.028% for the O3 pipeline and +0.012% for the full LTO pipeline on x86).
✅ With the latest revision this PR passed the C/C++ code formatter.
So the algorithm works like this: […] (As a side note, that does not mean the set forms a strongly connected component in the SSA graph; it could even be acyclic.) Note that the initial PHI node should also be added to the set, to account for the possibility that some of the incoming values […]
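To illustrate that side note, here is a small hypothetical IR sketch (names invented) in which the phis reachable from %p0 do not form a cycle at all, yet the discovery still collects the whole group; with %n specialized, every member folds to the same constant:

```llvm
; Hypothetical sketch: %p0 depends on %p1 and %p2, but there is no cycle,
; so {%p0, %p1, %p2} is not a strongly connected component in the SSA graph.
; The discovery nevertheless gathers the whole group before deciding whether
; it folds to a single constant.
define internal i64 @acyclic(i64 %n, i1 %c) {
entry:
  br i1 %c, label %left, label %right

left:                     ; preds = %entry
  %p1 = phi i64 [ %n, %entry ]
  br label %join

right:                    ; preds = %entry
  %p2 = phi i64 [ %n, %entry ]
  br label %join

join:                     ; preds = %left, %right
  %p0 = phi i64 [ %p1, %left ], [ %p2, %right ]
  ret i64 %p0
}
```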
What would be the result of an input like: […]

i.e. we visit the PHI node […]. As far as I can tell, the first time around we will see […]. Next time we again return with false from […].
I agree, the patch does not handle this case correctly as is.
I will post a patch soon. I think this is solved in the new code, but I just realized a small problem.
NOTE: We need to rewrite the overall commit message, as it is no longer close to accurate.
Closing this PR; the new PR is at #72903. Thanks for the feedback here.
Then let's try to close it. Feel free to reopen it if I have misunderstood anything.
When using the LLVM flang compiler with alias analysis (AA) enabled, SPEC2017:548.exchange2_r was running significantly slower than without AA. This was caused by the GVN pass replacing many of the loads in the pre-AA code with phi-nodes that form a long chain of dependencies, which function specialization was unable to follow.

This adds a function to discover phi-nodes in a transitive set, with some limitations to avoid spending ages analysing phi-nodes. The minimum latency savings threshold also had to be lowered - fewer load instructions means less saving. Some more prints were added to help debug the isProfitable decision.

No significant change in compile time or generated code-size.

(A previous attempt to fix this was abandoned: #71442)

---------

Co-authored-by: Alexandros Lamprineas <[email protected]>
When using the LLVM flang compiler with alias analysis (AA) enabled, SPEC2017:548.exchange2_r was running significantly slower than without AA.
This was caused by the GVN pass replacing many of the loads in the pre-AA code with phi-nodes that form a long chain of dependencies, which the function specialization was unable to follow.
This adds a function to follow phi-nodes when they are a strongly connected component, with some limitations to avoid spending ages analysing phi-nodes.
The minimum latency savings threshold also had to be lowered - fewer load instructions means less saving.
Some more prints were added to help debug the isProfitable decision.
No significant change in compile time or generated code-size.