Ninja build #7
base: master
Commits on Jul 15, 2020
Commit f4821a9
Commits on Jul 16, 2020
Revert "[InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms" and subsequent patches
This reverts most of the following patches due to reports of miscompiles. I've left the added test cases with comments updated to be FIXMEs.
1cf6f210a2e [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison.
469da663f2d [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison
122b0640fc9 [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison
ac0af12ed2f [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison
9b1e95329af [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms
(cherry picked from commit 00f3579aea6e3d4a4b7464c3db47294f71cef9e4)
Commit 0ea431c
[InstCombine] update datalayout in test file; NFC
We need to specify legal integer widths to trigger PR46712, so add those here. This doesn't appear to affect any existing tests, and it's not clear why a datalayout would not include any legal integer widths. While here, change some variable names that include 'tmp' to avoid warnings from the auto-generating script for CHECK lines. (cherry picked from commit efc30e591bb5a6e869fd8e084bd310ae516b0fae)
Commit c3cb455
[InstCombine] prevent infinite looping in or-icmp fold (PR46712)
I'm not sure if the test is truly minimal, but we need to induce a situation where a value becomes a constant but is not immediately folded before getting to the 'or' transform. (cherry picked from commit d8b268680d0858aaf30cb1a278b64b11361bc780)
Commit 4af794b
Commits on Jul 17, 2020
Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753.
An SROA change soon may obviate some of these problems. This reverts commit 8d09f20798ac180b1749276bff364682ce0196ab. (cherry picked from commit 7bfaa40086359ed7e41c862ab0a65e0bb1be0aeb)
Commit 5e978f5
[X86] Add test case for PR46455.
(cherry picked from commit 9adf7461f721170419058684a8d3f9228d641d59)
Commit de6acde
[X86] Move integer hadd/hsub formation into a helper function shared by combineAdd and combineSub.
There was a lot of duplicate code here for checking the VT and subtarget. Moving it into a helper avoids that. It also fixes a bug where combineAdd reused Op0/Op1 after a call to isHorizontalBinOp may have changed them. The new helper function has its own local copies of Op0/Op1 that aren't shared by other code. Fixes PR46455. Reviewed By: spatel, bkramer Differential Revision: https://reviews.llvm.org/D83971 (cherry picked from commit 5408024fa87e0b23b169fec07913bd4357acdbc4)
Commit 3ad8be5
Add -flang flag to the test-release.sh script
The flag is off by default. (cherry picked from commit 033ef8420cec57187fffac1f06322f73aa945c4c)
Commit 4fef486
[docs] Add Deprecated section to ReleaseNotes
This was brought up in https://reviews.llvm.org/D83915. We would like to remove some feature in PowerPC. We did send an RFC before, but we think it might be a better idea to indicate the planned removal in the Release Notes for version 11 and the actual removal in those for version 12. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D83968
Jinsong Ji committed Jul 17, 2020
Commit 490a4c3
Remove TwoAddressInstructionPass::sink3AddrInstruction.
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-conversion. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic in 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa5724feaa89a9538dcab97e018110b0e4bc3. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708 (cherry picked from commit 60433c63acb71935111304d71e41b7ee982398f8)
Commit 491e87b
Commits on Jul 18, 2020
[RelocationResolver] Support R_PPC_REL32 & R_PPC64_REL{32,64}
This suppresses `failed to compute relocation: R_PPC_REL32, Invalid data was encountered while parsing the file` and its 64-bit variants when running llvm-dwarfdump on a PowerPC object file with .eh_frame. Unfortunately, it is difficult to test the computation: DWARFDataExtractor::getEncodedPointer does not use the relocated value, and even if it did, we would need to teach llvm-dwarfdump --eh-frame to do some of the linker's job to report a reasonable address. (cherry picked from commit b922004ea29d54534c4f09b9cfa655bf5f3360f0)
Commit 1923b84
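For orientation, a PC-relative relocation such as R_PPC_REL32 is conventionally computed as S + A - P (symbol value, plus addend, minus the address of the place being relocated), truncated to the relocation width. The sketch below only illustrates that arithmetic; the function names and arguments are hypothetical and this is not the RelocationResolver code itself.

```python
def resolve_rel32(symbol_value: int, addend: int, place: int) -> int:
    """Return (S + A - P) truncated to 32 bits, as for a REL32-style relocation."""
    return (symbol_value + addend - place) & 0xFFFFFFFF


def resolve_rel64(symbol_value: int, addend: int, place: int) -> int:
    """Return (S + A - P) truncated to 64 bits, as for a REL64-style relocation."""
    return (symbol_value + addend - place) & 0xFFFFFFFFFFFFFFFF


# A target that lies before the relocated place wraps around as a two's-complement value.
backward = resolve_rel32(symbol_value=0x1000, addend=8, place=0x2000)
forward = resolve_rel64(symbol_value=0x3000, addend=0, place=0x1000)
```

A negative displacement shows up as a large unsigned value, which is why the result has to be masked to the relocation width.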
[RelocationResolver] Support R_AARCH64_PREL32
Code from D83800 by Yichao Yu (cherry picked from commit 3073a3aa1ef1ce8c9cac9b97a8e5905dd8779e16)
Commit 4157b3a
Commits on Jul 20, 2020
[InstCombine] Fix replace select with Phis when branch has the same labels
```
define i32 @test(i1 %cond) {
entry:
  br i1 %cond, label %exit, label %exit
exit:
  %result = select i1 %cond, i32 123, i32 456
  ret i32 %result
}
```
In this test, after applying the transformation that replaces select with Phis, the result will be:
```
define i32 @test(i1 %cond) {
entry:
  br i1 %cond, label %exit, label %exit
exit:
  %result = i32 phi [123, %exit], [123, %exit]
  ret i32 %result
}
```
That is, the select is transformed into an invalid Phi, which will then be reduced to 123, and the second value will be lost. It is worth noting that this problem arises only if the select comes before the branch in the InstCombine worklist. Otherwise, InstCombine will replace the branch condition with false and the transformation will not be applied. The fix is to check the target labels in the branch condition for equality. Patch By: Kirill Polushin Differential Revision: https://reviews.llvm.org/D84003 Reviewed By: mkazantsev (cherry picked from commit c98988107868db41c12b9d782fae25dea2a81c87)
Commit 4c5291b
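The failure mode described in this commit can be modeled in a few lines. This is a toy model, not InstCombine code: a phi chooses its value purely from the predecessor block, so when both branch targets are the same block the two edges are indistinguishable. All names below are hypothetical.

```python
def eval_select(cond: bool) -> int:
    """Reference semantics of: select i1 %cond, i32 123, i32 456."""
    return 123 if cond else 456


def eval_phi(incoming, came_from: str) -> int:
    """A phi picks the first incoming value whose predecessor matches."""
    for pred, value in incoming:
        if pred == came_from:
            return value
    raise KeyError(came_from)


def can_replace_select_with_phi(true_dest: str, false_dest: str) -> bool:
    """Model of the fix: refuse the transform when both targets are equal."""
    return true_dest != false_dest


# Both edges of `br i1 %cond, label %exit, label %exit` come from `entry`,
# so the phi cannot tell the true edge from the false edge.
naive_phi = [("entry", 123), ("entry", 456)]
phi_result_when_false = eval_phi(naive_phi, "entry")
```

With `%cond` false the select yields 456, while the naive phi yields 123, which is the lost second value the commit message mentions.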
[InstCombine][Test] Test for fix of replacing select with Phis when branch has the same labels
An additional test that checks the correct handling of identical branch labels in the dominator when trying to replace select with a phi node. Patch By: Kirill Polushin Differential Revision: https://reviews.llvm.org/D84006 Reviewed By: mkazantsev (cherry picked from commit df6e185e8f895686510117301e568e5043909b66)
Commit 129228c
[RISCV] Add support for -mcpu option.
Summary:
1. gcc uses the `-march` and `-mtune` flags to choose the arch and pipeline model, but clang does not have a `-mtune` flag, so we use `-mcpu` to choose both.
2. Add the SiFive e31 and u54 CPUs, which have a default march and pipeline model.
3. Specifying `-mcpu` with rocket-rv[32|64] selects the pipeline model only and uses the driver's arch-choosing logic to get the default arch.
Reviewers: lenary, asb, evandro, HsiangKai Reviewed By: lenary, asb, evandro Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D71124 (cherry picked from commit 294d1eae75bf8867821a4491f0d67445227f8470)
Commit b333356
[X86] Teach assembler parser to accept lsl and lar with a 64 or 32 source register when the destination is a 64 register.
Previously we only accepted a 32-bit source with a 64-bit dest. Accepting 64-bit as well is more consistent with gas behavior. I think maybe we should accept a 16-bit register as well, but I'm not sure. (cherry picked from commit 3c2a56a857227b6bc39285747269f02cd7a9dbe5)
Commit 033a7c5
[X86] Allow lsl/lar to be parsed with a GR16, GR32, or GR64 as source register.
This matches GNU assembler behavior. Operand size is determined only from the destination register. (cherry picked from commit 71b49aa438b22b02230fff30e8874ff756336e6d)
Commit 48eff08
[ms] [llvm-ml] Remove unused function
Summary: Remove unused function Reviewed By: lbenes Differential Revision: https://reviews.llvm.org/D83898 (cherry picked from commit 47a3b85a97136fca4a388646cbaec10b71414b60)
Commit 589287d
Commits on Jul 21, 2020
[ConstantFolding] check applicability of AllOnes constant creation first
The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870 (cherry picked from commit 8b354cc8db413f596c95b4f3240fabaa3e2c931e)
Commit 014e600
[LLVMgold.so] -plugin-opt=save-temps: save combined module to .lto.o instead of .o
This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1
.o is a bad choice for save-temps output because it is easy to overwrite the bitcode file (*.o):
```
# Use bfd for the example, -fuse-ld=gold is similar.
clang -flto -c a.c  # generate bitcode file a.o
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps  # overwrite a.o
# The user repeats the command but gets surprised, because a.o is now a combined module.
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps
```
Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D84132 (cherry picked from commit 55fa315b0352b63454206600d6803fafacb42d5e)
Commit 272e73d
[LLVMgold.so][test] Fix tests after D84132/55fa315b0352
(cherry picked from commit aa830e9768303ff8d27c015759294c4ce704d50c)
Commit e24a844
Commits on Jul 22, 2020
[PowerPC] Precommit test case for PR46759. NFC.
(cherry picked from commit 817767abeec8343b20de83f8b1b2c8c20bbbe00a)
Commit 6395f30
[PowerPC] Fix wrong codegen when stack pointer has to realign in prologue
The current PowerPC backend generates a wrong code sequence if the stack pointer has to be realigned when -fstack-clash-protection is enabled. When probing in the prologue, the backend should generate a subtraction instruction rather than a `stux` instruction to realign the stack pointer. This patch is part of the fix for https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84218 (cherry picked from commit 8912252252c87d8ef6623ecf9fdde444560ee4b9)
Commit 0514e05
[PowerPC] Fix wrong codegen when stack pointer has to realign performing dynalloc
The current PowerPC backend generates a wrong code sequence if the stack pointer has to be realigned when `-fstack-clash-protection` is enabled. When probing dynamic stack allocation, the current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly; however, the code following `PREPARE_PROBED_ALLOCA` still uses the value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate the loop trip count and the residual number of bytes. This patch is part of the fix for https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152 (cherry picked from commit c3f9697f1f227296818fbaf1a770a29842ea454c)
Commit d71db2c
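The arithmetic behind this dynalloc fix can be sketched as follows, assuming a downward-growing stack, a power-of-two `max_align`, and a fixed probe size. The helper names and the concrete numbers are hypothetical; the point is only that the trip count and residual must be derived from the realigned size, not from the requested one.

```python
def actual_neg_size(stack_ptr: int, neg_size: int, max_align: int) -> int:
    """Negative size actually allocated once the new SP is rounded down to max_align."""
    final_stack_ptr = (stack_ptr + neg_size) & ~(max_align - 1)
    return final_stack_ptr - stack_ptr


def probe_plan(neg_size: int, probe_size: int):
    """Return (loop trip count, residual bytes) for probing -neg_size bytes."""
    total = -neg_size
    return total // probe_size, total % probe_size


stack_ptr, requested, max_align, probe_size = 16384, -5000, 64, 4096
realigned = actual_neg_size(stack_ptr, requested, max_align)

wrong_plan = probe_plan(requested, probe_size)   # derived from NegSizeReg
right_plan = probe_plan(realigned, probe_size)   # derived from ActualNegSize
```

In this example the trip counts agree but the residuals differ, so a plan built from the unaligned size would leave part of the allocation unprobed.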
Commits on Jul 23, 2020
[SCEV] Remove premature assert. PR46786
This assert was added to verify the assumption that a GEP's SCEV will be of pointer type, based on the fact that it should be a SCEVAddExpr with (at least) the last operand being a pointer. Two notes:
- GEP's SCEV does not have to be a SCEVAddExpr after all simplifications;
- In the current state, GEP's SCEV does not have to have at least one pointer operand (all of them can become int during the transforms). However, we might want to be at a point where it is true.
We are currently removing this assert and will try to enumerate the cases where the "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri (cherry picked from commit b96114c1e1fc4448ea966bce013706359aee3fa9)
Commit 19ae7d2
Drop the npm run line from llvm/test/Analysis/ScalarEvolution/pr46786.ll since it's failing.
Commit 62e5233
[InstCombine] Add test for PR46680 (NFC)
(cherry picked from commit 13ae440de4a408cf9d1a448def09769ecbecfdf7)
Commit 8d7d00c
[InstCombine] Fix store merge worklist management (PR46680)
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the deferred worklist mechanism, so that processing of newly added instructions is prioritized. There's one side-effect of the worklist order change which could be classified as a regression. An add op gets pushed through a select that at the time is not a umax. We could add a reverse transform that tries to push adds in the reverse direction to restore a min/max, but that seems like a sure way of getting infinite loops... Seems like something that should best wait on min/max intrinsics. Differential Revision: https://reviews.llvm.org/D84109 (cherry picked from commit d12ec0f752e7f2c7f7252539da2d124264ec33f7)
Commit c505dd4
[X86][AVX] getTargetShuffleMask - don't decode VBROADCAST(EXTRACT_SUBVECTOR(X,0)) patterns.
getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal. (cherry picked from commit 5b5dc2442ac7a574a3b7d17c15ebeeb9eb3bec26)
Commit dea959b
Commits on Jul 27, 2020
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm instructions
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the base subset (zbb subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79870 (cherry picked from commit e2692f0ee7f338fea4fc918669643315cefc7678)
Commit 718a1e2
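As a rough illustration of the kind of generic pattern that gets matched to a single instruction here, the sketch below models three operations that, to my understanding, belong to the Zbb subset of the 0.92 draft (`andn`, `rol`, `ror`) on 32-bit values. The Python helpers are hypothetical stand-ins for the semantics only and say nothing about how the actual selection patterns are written.

```python
XLEN = 32
MASK = (1 << XLEN) - 1


def andn(a: int, b: int) -> int:
    """and-with-complement: the generic pattern a & ~b."""
    return a & ~b & MASK


def rol(a: int, shamt: int) -> int:
    """Rotate left: the generic pattern (a << s) | (a >> (XLEN - s))."""
    s = shamt & (XLEN - 1)
    return ((a << s) | (a >> (XLEN - s))) & MASK


def ror(a: int, shamt: int) -> int:
    """Rotate right: the generic pattern (a >> s) | (a << (XLEN - s))."""
    s = shamt & (XLEN - 1)
    return ((a >> s) | (a << (XLEN - s))) & MASK
```

Each helper is the shift-and-mask expression a compiler would otherwise emit as two or three separate instructions.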
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbp asm instructions
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the permutation subset (zbp subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79871 (cherry picked from commit 31b52b4345e36b169a2b6a89eac44651f59889dd)
Commit f9fcdd5
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbbp asm instructions
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions belonging to both the permutation and the base subsets of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79873 (cherry picked from commit 6144f0a1e52e7f5439a67267ca65f2d72c21aaa6)
Commit f0b84fb
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbs asm instructions
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the single-bit subset (zbs subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79874 (cherry picked from commit d4be33374c07ea9a9362892876aa76b227298181)
Commit b0acb1f
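For the single-bit subset, the matched patterns are the familiar shift-by-variable idioms. The sketch below models them on 32-bit values, using the instruction names as I understand them from the 0.92 draft (`sbset`, `sbclr`, `sbinv`, `sbext`); the Python functions themselves are hypothetical and only describe the semantics.

```python
XLEN = 32
MASK = (1 << XLEN) - 1


def sbset(a: int, index: int) -> int:
    """Set one bit: the pattern a | (1 << (index & (XLEN - 1)))."""
    return (a | (1 << (index & (XLEN - 1)))) & MASK


def sbclr(a: int, index: int) -> int:
    """Clear one bit: the pattern a & ~(1 << (index & (XLEN - 1)))."""
    return a & ~(1 << (index & (XLEN - 1))) & MASK


def sbinv(a: int, index: int) -> int:
    """Invert one bit: the pattern a ^ (1 << (index & (XLEN - 1)))."""
    return (a ^ (1 << (index & (XLEN - 1)))) & MASK


def sbext(a: int, index: int) -> int:
    """Extract one bit: the pattern (a >> (index & (XLEN - 1))) & 1."""
    return (a >> (index & (XLEN - 1))) & 1
```

Note that only the low bits of the index are used, so an index of 37 addresses bit 5 on a 32-bit target in this model.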
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbt asm instructions
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the ternary subset (zbt subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79875 (cherry picked from commit c9c955ada8e65205312f2bc41b46eefa0e98b36c)
Commit ab5d26c
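The ternary subset covers three-operand selects. A minimal sketch of two of them, `cmix` and `cmov` as I understand them from the 0.92 draft, is shown below; the argument order and helper names are simplifications chosen for readability, not the instruction encoding.

```python
XLEN = 32
MASK = (1 << XLEN) - 1


def cmix(selector: int, if_set: int, if_clear: int) -> int:
    """Bitwise mix: take bits from if_set where selector is 1, else from if_clear."""
    return ((if_set & selector) | (if_clear & ~selector)) & MASK


def cmov(condition: int, if_nonzero: int, if_zero: int) -> int:
    """Conditional move: pick a whole operand based on condition != 0."""
    return (if_nonzero if condition != 0 else if_zero) & MASK
```

`cmix` corresponds to the generic `(a & m) | (b & ~m)` pattern, while `cmov` corresponds to a select on a scalar condition.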
[MC] [COFF] Make sure that weak external symbols are undefined symbols
For comdats (e.g. caused by -ffunction-sections), Section is already set here; make sure it's null, for the weak external symbol to be undefined. This fixes PR46779. Differential Revision: https://reviews.llvm.org/D84507 (cherry picked from commit 9e81d8bbf19d72fca3d87b7334c613d1aa2a5795)
Commit 3df565a
[llvm-lib] Support adding short import library objects with llvm-lib
This fixes PR 42837. Differential Revision: https://reviews.llvm.org/D84465 (cherry picked from commit 4d09ed953b5b8c70d9ca0aeaed8f26a237b612c6)
Commit 48558c1
[LegalizeTypes] Teach DAGTypeLegalizer::GenWidenVectorLoads to pad with undef if needed when concatenating smaller loads to match a larger load
In the included test case, the align 16 allowed the v23f32 load to be handled as load v16f32, load v4f32, and load v4f32 (one element not used). These loads all need to be concatenated together into a final vector. In this case we tried to concatenate the two v4f32 loads to match the type of the v16f32 load so we could do a second concat_vectors, but those loads alone only add up to v8f32. So we need two v4f32 undefs to pad it. It appears we've tried to hack around a similar issue in this code before by adding undef padding to loads in one of the earlier loops in this function: originally in r147964, by padding all loads narrower than previous loads to the same size, and later modified in r293088 to pad only the last load. This patch removes that earlier code and just handles it on demand where we know we need it. Fixes PR46820 Differential Revision: https://reviews.llvm.org/D84463 (cherry picked from commit 8131e190647ac2b5b085b48a6e3b48c1d7520a66)
Commit 3e4949e
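The padding arithmetic from this commit message (two v4f32 loads need two v4f32 undefs to reach v16f32) can be checked with a small model. Plain Python lists stand in for vectors and `None` stands in for undef; the function name is hypothetical and this is not the legalizer code.

```python
UNDEF = None  # stand-in for an undef lane


def concat_with_undef_padding(parts, target_len):
    """Concatenate equally sized parts and pad with undef vectors up to target_len.

    Returns the padded lanes and the number of undef vectors that were appended.
    """
    width = len(parts[0])
    lanes = [lane for part in parts for lane in part]
    missing = target_len - len(lanes)
    assert missing >= 0 and missing % width == 0
    lanes.extend([UNDEF] * missing)
    return lanes, missing // width


# v23f32 split as v16f32 + v4f32 + v4f32 (last element unused).
small_loads = [[16, 17, 18, 19], [20, 21, 22, UNDEF]]
padded, undef_vectors = concat_with_undef_padding(small_loads, target_len=16)
```

After padding, the small half has the same length as the v16f32 load, so the two halves can be concatenated in a second step.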
[PowerPC] Fix computation of offset for load-and-splat for permuted loads
Unfortunately this is another regression from my canonicalization patch (1fed131660b2). The patch contained two implicit assumptions:
1. That we would have a permuted load only if we are loading a partial vector
2. That a partial vector load would necessarily be as wide as the splat
However, assumption 2 is not correct since it is possible to do a wider load and only splat half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is. (cherry picked from commit 7d076e19e31a2a32e357cbdcf0183f88fe1fb0fb)
Commit a5f9c69
[PowerPC][NFC] Fix an assert that cannot trip from 7d076e19e31a
I mixed up the precedence of operators in the assert and thought I had it right since there was no compiler warning. This just adds the parentheses in the expression as needed. (cherry picked from commit cdead4f89c0eecf11f50092bc088e3a9c6511825)
Commit 0310968
[JumpThreading] ProcessBranchOnXOR(): bailout if any pred ends in indirect branch (PR46857)
SplitBlockPredecessors() can not split blocks that have such terminators, and in two other places we already ensure that we don't end up calling SplitBlockPredecessors() on such blocks. Do so in one more place. Fixes https://bugs.llvm.org/show_bug.cgi?id=46857 (cherry picked from commit 1da9834557cd4302a5183b8228ce063e69f82602)
Commit 5158667
[BasicAA] Add additional negative phi tests. NFC
(cherry picked from commit 30fa57662760e1489cf70cb411c55fbe9fc189fe)
Commit afe00fe
[BasicAA] Fix -basicaa-recphi for geps with negative offsets
As shown in D82998, the basic-aa-recphi option can cause miscompiles for geps with negative constants. The option checks for recursive phis that recurse through a constant gep. If it finds one, it performs aliasing calculations using the other phi operands with an unknown size, to specify that an unknown number of elements after the initial value are potentially accessed. This works fine except where the constant is negative, as the size is still considered to be positive. So this patch expands the check to make sure that the constant is also positive. Differential Revision: https://reviews.llvm.org/D83576 (cherry picked from commit 311fafd2c90aed5b3fed9566503eebe629f1e979)
Commit 95c5899
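Why the sign of the constant matters can be seen in a toy address model. The sketch assumes a recursive phi that starts at `base` and advances by a constant `step` on every iteration; the function names are hypothetical and this is not the BasicAA implementation.

```python
def addresses_visited(base: int, step: int, iterations: int):
    """Addresses a recursive phi takes when it advances by a constant gep offset."""
    return [base + k * step for k in range(iterations)]


def covered_by_unknown_size_from(base: int, addresses) -> bool:
    """An 'unknown size starting at base' location only covers addresses >= base."""
    return all(address >= base for address in addresses)


def recphi_reasoning_is_safe(step: int) -> bool:
    """Model of the expanded check: only accept positive constant offsets."""
    return step > 0


forward_walk = addresses_visited(base=100, step=4, iterations=5)
backward_walk = addresses_visited(base=100, step=-4, iterations=5)
```

A backward walk touches addresses below `base`, which an "unknown number of elements after the initial value" does not describe, so the reasoning has to be rejected for negative offsets.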
Commit ca9c579
Commits on Jul 28, 2020
[X86] Detect if EFLAGs is live across XBEGIN pseudo instruction. Add it as livein to the basic blocks created when expanding the pseudo
XBEGIN causes several basic blocks to be inserted. If flags are live across it, we need to make eflags live in the new basic blocks to avoid machine verifier errors. Fixes PR46827 Reviewed By: ivanbaev Differential Revision: https://reviews.llvm.org/D84479 (cherry picked from commit 647e861e080382593648b234668ad2f5a376ac5e)
Commit 73a82b1
[X86][SSE] Add additional (f)add(shuffle(x,y),shuffle(x,y)) tests for D83789
(cherry picked from commit bfc4294ef61d5cf69fffe6b64287a323c003d90f)
Commit 4a1983e
[X86][SSE] Attempt to match OP(SHUFFLE(X,Y),SHUFFLE(X,Y)) -> SHUFFLE(HOP(X,Y))
An initial backend patch towards fixing the various poor HADD combines (PR34724, PR41813, PR45747 etc.). This extends isHorizontalBinOp to check if we have per-element horizontal ops (odd+even element pairs), but not in the expected serial order - in which case we build a "post shuffle mask" that we can apply to the HOP result, assuming we have fast-hops/optsize etc. The next step will be to extend the SHUFFLE(HOP(X,Y)) combines as suggested on PR41813 - accepting more post-shuffle masks even on slow-hop targets if we can fold it into another shuffle. Differential Revision: https://reviews.llvm.org/D83789 (cherry picked from commit 182111777b4ec215eeebe8ab5cc2a324e2f055ff)
Commit 0719918
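The "post shuffle mask" idea from this commit can be illustrated with four-lane lists. The model below assumes a horizontal add that sums adjacent pairs of `x` and then of `y`; the masks and helper names are hypothetical examples, not taken from the patch.

```python
def shuffle2(x, y, mask):
    """Two-input shuffle: indices 0..3 select from x, 4..7 select from y."""
    source = x + y
    return [source[i] for i in mask]


def shuffle1(v, mask):
    """Single-input shuffle."""
    return [v[i] for i in mask]


def add(a, b):
    """Lane-wise add."""
    return [p + q for p, q in zip(a, b)]


def hadd(x, y):
    """Horizontal add: adjacent pairs of x, then adjacent pairs of y."""
    return [x[0] + x[1], x[2] + x[3], y[0] + y[1], y[2] + y[3]]


x, y = [1, 2, 3, 4], [10, 20, 30, 40]

# The pairs are complete (even element + odd element) but not in serial order.
original = add(shuffle2(x, y, [2, 0, 6, 4]), shuffle2(x, y, [3, 1, 7, 5]))

# Equivalent form: one horizontal op followed by a post shuffle.
rewritten = shuffle1(hadd(x, y), [1, 0, 3, 2])
```

Both forms compute the same lanes, so the two shuffles and the add can be replaced by one horizontal add plus a single post shuffle.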
Commits on Jul 29, 2020
[InstCombine] avoid crashing on vector constant expression (PR46872)
(cherry picked from commit f75cf240d6ed528e1ce7770bbe09b417338b40ef)
Commit 14afc00
[AMDGPU] Don't combine memory intrs to v3i16
v3i16 and v3f16 currently cannot be legalized and lowered so they should not be emitted by inst combining. Moved the check down to still allow extracting 1 or 2 elements via the dmask. Fixes image intrinsics being combined to return v3x16. Differential Revision: https://reviews.llvm.org/D84223 (cherry picked from commit 2c659082bda6319732118e746fe025d8d5f9bfac)
Commit a104e45
Commits on Jul 31, 2020
Add flang to export.sh so it gets source tarballs in releases
(cherry picked from commit 9853786ce39b9510eeb2688baaef7a364d58e113)
Commit 9162532
[AArch64][SVE] Add support for trunc to <vscale x N x i1>.
This isn't a natively supported operation, so convert it to a mask+compare. In addition to the operation itself, fix up some surrounding stuff to make the testcase work: we need concat_vectors on i1 vectors, we need legalization of i1 vector truncates, and we need to fix up all the relevant uses of getVectorNumElements(). Differential Revision: https://reviews.llvm.org/D83811 (cherry picked from commit b8f765a1e17f8d212ab1cd8f630d35adc7495556)
Commit 48fbb59
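The "mask+compare" conversion mentioned for the trunc to `<vscale x N x i1>` can be sketched per lane: truncating an integer to i1 keeps only its lowest bit, which is the same as masking with 1 and comparing against zero. A Python list stands in for the scalable vector, and the function name is hypothetical.

```python
def trunc_to_i1(lanes):
    """Per-lane trunc to i1 expressed as (lane & 1) != 0."""
    return [(lane & 1) != 0 for lane in lanes]


predicate = trunc_to_i1([0, 1, 2, 3, 255, 256])
```

Only the parity of each lane survives, regardless of how wide the source element type is.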
[AArch64][SVE] Fix PCS for functions taking/returning scalable types.
The default calling convention needs to save/restore the SVE callee saves according to the SVE PCS when the function takes or returns scalable types, even when the `aarch64_sve_vector_pcs` CC is not specified for the function. Reviewers: efriedma, paulwalker-arm, david-arm, rengolin Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84041 (cherry picked from commit 9bacf1588583014538a0217add18f370acb95788)
Commit bd7f2f5
[AArch64][SVE] Correctly allocate scavenging slot in presence of SVE.
This patch addresses two issues:
* Forces the availability of the base-pointer (x19) when the frame has both scalable vectors and variable-length arrays. Otherwise it will be expensive to access non-SVE locals.
* In the presence of SVE stack objects, it allocates the emergency scavenging slot close to the SP, so that it can be accessed from the SP, or from the BP if available. If accessed from the frame-pointer, it would otherwise need an extra register to access the scavenging slot because of mixed scalable/non-scalable addressing modes.
Reviewers: efriedma, ostannard, cameron.mcinally, rengolin, david-arm Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70174 (cherry picked from commit bef56f7fe2382ed1476aa67a55626b364635b44e)
Commit 7c1a857
[AArch64][SVE] Teach copyPhysReg to copy ZPR2/3/4.
It's sort of tricky to hit this in practice, but not impossible. I have a synthetic C testcase if anyone is interested. The implementation is identical to the equivalent NEON register copies. Differential Revision: https://reviews.llvm.org/D84373 (cherry picked from commit 993c1a3219a8ae69f1d700183bf174d75f3815d4)
Commit 8c995d4
[AArch64][SVE] Don't support fixedStack for SVE objects.
Fixed stack objects are preallocated and defined to be allocated before any of the regular stack objects. These are normally used to model stack arguments. The AAPCS does not support passing SVE registers on the stack by value (only by reference). The current layout also doesn't place them before all stack objects, but rather before all SVE objects. Removing this simplifies the code that emits the allocation/deallocation around callee-saved registers (D84042). This patch also removes all uses of fixedStack from framelayout-sve.mir, where this was used purely for testing purposes. Reviewers: paulwalker-arm, efriedma, rengolin Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84538 (cherry picked from commit 54492a5843a34684ce21ae201dd8ca3e509288fd)
Commit 03811e1
[AArch64][SVE] Don't align the last SVE callee save.
Instead of aligning the last callee-saved-register slot to the stack alignment (16 bytes), just align the SVE callee-saved block. This also simplifies the code that allocates space for the callee-saves. This change is needed to make sure the offset to which the callee-saved register is spilled, corresponds to the offset used for e.g. unwind call frame instructions. Reviewers: efriedma, paulwalker-arm, david-arm, rengolin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84042 (cherry picked from commit 26b4ef3694973ea2fa656d3d3a7f67f16f135654)
Commit 4b9a803
[AArch64][SVE] Fix epilogue for SVE when the stack is realigned.
While deallocating the stackframe, the offset used to reload the callee-saved registers was not pointing to the SVE callee-saves, but rather to the whole SVE area.

    +--------------+
    | GRP callee   |
    | saves        |
    +--------------+ <- FP
    | SVE callee   |
    | saves        |
    +--------------+ <- Should restore SVE callee saves from here
    | SVE Spills   |
    | and Locals   |
    +--------------+ <- instead of from here.
    |              |
    :              :
    |              |
    +--------------+ <- SP

Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D84539 (cherry picked from commit cda2eb3ad2bbe923e74d6eb083af196a0622d800)
Commit b14a3de
[SVE] Don't use LocalStackAllocation for SVE objects
I have introduced a new TargetFrameLowering query function: isStackIdSafeForLocalArea that queries whether or not it is safe for objects of a given stack id to be bundled into the local area. The default behaviour is to always bundle regardless of the stack id, however for AArch64 this is overridden so that it's only safe for fixed-size stack objects. There is future work here to extend this algorithm for multiple local areas so that SVE stack objects can be bundled together and accessed from their own virtual base-pointer. Differential Revision: https://reviews.llvm.org/D83859 (cherry picked from commit 14bc85e0ebb6c00c1672158ab6a692bfbb11e1cc)
Commit df55a6e
[CodeGen] Remove calls to getVectorNumElements in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR
In DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR I have replaced calls to getVectorNumElements with getVectorMinNumElements, since this code path works for both fixed and scalable vector types. For scalable vectors the index will be multiplied by VSCALE. Fixes warnings in this test: sve-sext-zext.ll Differential revision: https://reviews.llvm.org/D83198 (cherry picked from commit 5d84eafc6b86a42e261af8d753c3a823e0e7c67e)
Commit ebd7adf
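To make the index scaling mentioned in the commit above concrete, here is a minimal standalone sketch. It does not use LLVM's real types: `ElementCountSketch`, `runtimeNumElements`, and `runtimeIndex` are hypothetical names invented for this illustration, and `VScale` stands for the runtime vscale value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a vector element count: a known minimum plus a
// flag saying whether the real count is that minimum multiplied by vscale.
struct ElementCountSketch {
  std::uint64_t MinNumElements;
  bool Scalable;
};

// Number of elements at run time. For fixed-width vectors this is the
// minimum itself; for scalable vectors it is the minimum times vscale.
std::uint64_t runtimeNumElements(ElementCountSketch EC, std::uint64_t VScale) {
  assert(VScale >= 1 && "vscale is assumed to be a positive runtime value");
  return EC.Scalable ? EC.MinNumElements * VScale : EC.MinNumElements;
}

// Run-time element index of a subvector that starts at MinIndex, expressed
// in "minimum element" units. Scalable vectors scale the index by vscale.
std::uint64_t runtimeIndex(ElementCountSketch EC, std::uint64_t MinIndex,
                           std::uint64_t VScale) {
  assert(MinIndex <= EC.MinNumElements && "index must lie within the vector");
  return EC.Scalable ? MinIndex * VScale : MinIndex;
}
```

Code that only needs the compile-time minimum works the same way for both kinds of vector, which is presumably why switching to the minimum element count suits a shared code path.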
[SVE] Add checks for no warnings in CodeGen/AArch64/sve-sext-zext.ll
Previous patches fixed up all the warnings in this test: llvm/test/CodeGen/AArch64/sve-sext-zext.ll and this change simply checks that no new warnings are added in future. Differential revision: https://reviews.llvm.org/D83205 (cherry picked from commit f43b5c7a76ab83dcc80e6769d41d5c4b761312b1)
Commit 72e8f44
[SVE][CodeGen] Add simple integer add tests for SVE tuple types
I have added tests to CodeGen/AArch64/sve-intrinsics-int-arith.ll for doing simple integer add operations on tuple types. Since these tests introduced new warnings due to incorrect use of getVectorNumElements() I have also fixed up these warnings in the same patch. These fixes are:
1. In narrowExtractedVectorBinOp I have changed the code to bail out early for scalable vector types, since we've not yet hit a case that proves the optimisations are profitable for scalable vectors.
2. In DAGTypeLegalizer::WidenVecRes_CONCAT_VECTORS I have replaced calls to getVectorNumElements with getVectorMinNumElements in cases that work with scalable vectors. For the other cases I have added asserts that the vector is not scalable because we should not be using shuffle vectors and build vectors in such cases.
Differential revision: https://reviews.llvm.org/D84016 (cherry picked from commit 207877175944656bd9b52d36f391a092854572be)
Commit efb915b
[SVE] Don't consider scalable vector types in SLPVectorizerPass::vectorizeChainsInBlock
In vectorizeChainsInBlock we try to collect chains of PHI nodes that have the same element type, but the code is relying upon the implicit conversion from TypeSize -> uint64_t. For now, I have modified the code to ignore PHI nodes with scalable types. Differential Revision: https://reviews.llvm.org/D83542 (cherry picked from commit 9ad7c980bb47edd7db8f8db828b487cc7dfc9921)
Commit 4f27636
[SVE][CodeGen] At -O0 fallback to DAG ISel when translating alloca with scalable types
When building code at -O0 we weren't falling back to DAG ISel correctly when encountering alloca instructions with scalable vector types. This is because the alloca has no operands that are scalable. I've fixed this by adding a check in AArch64ISelLowering::fallBackToDAGISel for alloca instructions with scalable types. Differential Revision: https://reviews.llvm.org/D84746 (cherry picked from commit 23ad660b5d34930b2b5362f1bba63daee78f6aa4)
Commit 0525645
[llvm][sve] Reg + Imm addressing mode for ld1ro.
Reviewers: kmclaughlin, efriedma, sdesmalen Subscribers: tschuett, hiraditya, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D83357 (cherry picked from commit 809600d6642773f71245f76995dab355effc73af)
Commit a232ab7
[NFC][AArch64] Replace some template methods/invocations...
...with the non-template version, as the template version might increase the size of the compiler build. Methods affected:
1. `findAddrModeSVELoadStore`
2. `SelectPredicatedStore`
Also, remove the `const` qualifier from the `unsigned` parameters of the methods to conform with other similar methods in the class. (cherry picked from commit dbeb184b7f54db2d3ef20ac153b1c77f81cf0b99)
Commit c3a85d6
[llvm][CodeGen] Addressing modes for SVE ldN.
Reviewers: c-rhodes, efriedma, sdesmalen Subscribers: huihuiz, tschuett, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77251 (cherry picked from commit adb28e0fb2b0e97ea9dce422c09b36979cf7cd2f)
Commit d7924b4
[analyzer] Fix out-of-tree only clang build by not relying on private header
It turned out that D78704 included a private LLVM header, which is excluded from the LLVM install target. I'm substituting that `#include` with the public one by moving the necessary `#define` into that. There was a discussion about this at D78704 and on the cfe-dev mailing list. I'm also placing a note to remind others of this pitfall. Reviewed By: mgorny Differential Revision: https://reviews.llvm.org/D84929 (cherry picked from commit 63d3aeb529a7b0fb95c2092ca38ad21c1f5cfd74)
Commit 2cd4771
Commits on Aug 3, 2020
Align store conditional address
In cases where the alignment of the datatype is smaller than expected by the instruction, the address is aligned. The aligned address is used for the load, but wasn't used for the store conditional, which resulted in a run-time alignment exception. (cherry picked from commit 7b114446c320de542c50c4c02f566e5d18adee33)
Commit 585524e
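As a rough illustration of the alignment issue described in the commit above, the sketch below computes the aligned word address, bit shift, and mask for a narrow value. The commit does not name the target or the word size, so the 32-bit word, the little-endian layout, and the names `SubwordAccess` and `computeSubwordAccess` are all assumptions made for this example. The point is that the load and the store-conditional must both use the same aligned address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of where a narrow value sits inside the naturally
// aligned 32-bit word that contains it (little-endian layout assumed).
struct SubwordAccess {
  std::uintptr_t AlignedAddr; // address for BOTH the load and the store-conditional
  unsigned ShiftBits;         // bit offset of the narrow value within the word
  std::uint32_t Mask;         // bits of the word occupied by the narrow value
};

SubwordAccess computeSubwordAccess(std::uintptr_t Addr, unsigned SizeInBytes) {
  assert((SizeInBytes == 1 || SizeInBytes == 2) && "only sub-word sizes");
  assert(Addr % SizeInBytes == 0 && "value must be naturally aligned");
  SubwordAccess A;
  // Round the address down to the containing 4-byte word.
  A.AlignedAddr = Addr & ~static_cast<std::uintptr_t>(3);
  // The low two address bits select the byte lane inside that word.
  A.ShiftBits = static_cast<unsigned>(Addr & 3) * 8;
  const std::uint32_t Narrow = (SizeInBytes == 1) ? 0xFFu : 0xFFFFu;
  A.Mask = Narrow << A.ShiftBits;
  return A;
}
```

Using `Addr` for the store-conditional while the load used `AlignedAddr` is the kind of mismatch that would raise an alignment exception at run time on a target that requires word-aligned accesses.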
[LAA] Avoid adding pointers to the checks if they are not needed.
Currently we skip alias sets with only reads or a single write and no reads, but still add the pointers to the list of pointers in RtCheck. This can lead to cases where we try to access a pointer that does not exist when grouping checks. In most cases, the way we access PositionMap masked that, as the value would default to index 0. But in the example in PR46854 it causes a crash. This patch updates the logic to avoid adding pointers for alias sets that do not need any checks. It makes things slightly more verbose, by first checking the numbers of reads/writes and bailing out early if we don't need checks for the alias set. I think this makes the logic a bit simpler to follow. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D84608 (cherry picked from commit 2062b3707c1ef698deaa9abc571b937fdd077168)
Commit c96add5
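The "value would default to index 0" behaviour mentioned in the commit above is typical of map types whose `operator[]` default-constructs missing entries. The sketch below shows the effect with `std::unordered_map` as a stand-in. The helper names and the string keys are hypothetical, and this is not the LoopAccessAnalysis code itself.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using PositionMapSketch = std::unordered_map<std::string, unsigned>;

// operator[] silently inserts a value-initialized entry (0) for a missing
// key, so a lookup of a pointer that was never registered looks like index 0.
unsigned lookupWithDefault(PositionMapSketch &Map, const std::string &Ptr) {
  return Map[Ptr];
}

// A checked lookup reports the missing key instead of inventing an index.
bool lookupChecked(const PositionMapSketch &Map, const std::string &Ptr,
                   unsigned &Index) {
  auto It = Map.find(Ptr);
  if (It == Map.end())
    return false;
  Index = It->second;
  return true;
}
```

A masked lookup like this only stays harmless while index 0 happens to be valid for the caller, which matches the commit's observation that most cases were hidden and one crashed.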
[CMake] Pass bugreport URL to standalone clang build
BUG_REPORT_URL is currently used both in LLVM and in Clang but declared only in the latter. This means that it's missing in standalone clang builds and the driver ends up outputting: PLEASE submit a bug report to and include [...] (note the missing URL) To fix this, include LLVM_PACKAGE_BUGREPORT in LLVMConfig.cmake (similarly to how we pass PACKAGE_VERSION) and use it to fill BUG_REPORT_URL when building clang standalone. Differential Revision: https://reviews.llvm.org/D84987 (cherry picked from commit 21c165de2a1bcca9dceb452f637d9e8959fba113)
Commit 2fc661f
AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst
Summary: This is in response to the review of https://reviews.llvm.org/D84873: The expensive check should be reordered last Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D84890 (cherry picked from commit 243376cdc7b719d443f42c8c4667e5d96af53dcc)
Commit 5aeae17
Commits on Aug 5, 2020
Commit 78f3018
[llvm] Add RISCVTargetParser.def to the module map
This fixes the modules build. (cherry picked from commit 1b3c25e7b61f44b80788f8758f0d7f0b013135b5)
Commit 1dec893
RuntimeDyldELF: report_fatal_error instead of asserting for unimplemented relocations (PR46816)
This fixes the ExecutionEngine/MCJIT/stubs-sm-pic.ll test in no-asserts builds which is set to XFAIL on some platforms like 32-bit x86. More importantly, we probably don't want to silently error in these cases. Differential revision: https://reviews.llvm.org/D84390 (cherry picked from commit 6a3b07a4bf14be32569550f2e9814d8797d27d31)
Commit d8c7836
[llvm-rc] Allow string table values split into multiple string literals
This can easily be the product of combining strings with macros in resource files. This fixes mstorsjo/llvm-mingw#140. As string literals within llvm-rc are handled as StringRefs, each referencing an uninterpreted slice of the input file, with actual interpretation of the input string (codepage handling, unescaping etc) done only right before writing them out to disk, it's hard to concatenate them other than just bundling them up in a vector, without rearchitecting a large part of llvm-rc. This matches how the same already is supported in VersionInfoValue, with a std::vector<IntOrString> Values. MS rc.exe only supports concatenated string literals in version info values (already supported), string tables (implemented in this patch) and user data resources (easily implemented in a separate patch, but hasn't been requested by any end user yet), while GNU windres supports string immediates split into multiple strings anywhere (e.g. `100 ICON "myicon" ".ico"`). Not sure if concatenation in other statements actually is used in the wild though, in resource files normally built by GNU windres. Differential Revision: https://reviews.llvm.org/D85183 (cherry picked from commit b989fcbae6f179ad887d19ceef83ace1c00b87cc)
Commit 872454e
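The "bundle the pieces in a vector, interpret them only when writing out" approach described in the commit above can be sketched as follows. This is an illustration under assumptions, not llvm-rc code: `std::string_view` stands in for LLVM's StringRef, `joinLiterals` is a hypothetical name, and the real tool would also apply codepage handling and unescaping to each piece at this point.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Each piece is a non-owning slice of the input file. The pieces are kept
// separate until output time, when they are concatenated in order.
std::string joinLiterals(const std::vector<std::string_view> &Pieces) {
  assert(!Pieces.empty() && "a string value has at least one literal");
  std::string Out;
  for (std::string_view Piece : Pieces)
    Out.append(Piece);
  return Out;
}
```

Keeping the slices separate avoids having to build a new contiguous buffer during parsing, at the cost of a small loop when the value is finally written.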
[PowerPC] fixupIsDeadOrKill start and end in different block fixing
In fixupIsDeadOrKill, we assume StartMI and EndMI exist in the same basic block, so we added an assertion to that function. This assumption is wrong before RA, as before RA the true definition may exist in another block through copy-like instructions. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D83365 (cherry picked from commit 36f9fe2d3493717dbc6866d96b2e989839ce1a4c)
Commit 888e055
Commits on Aug 6, 2020
[AArch64] [Windows] Error out on unsupported symbol locations
These might occur in seemingly generic assembly. Previously when targeting COFF, they were silently ignored, which certainly won't give the right result. Instead clearly error out, to make it clear that the assembly needs to be adjusted for this target. Also change a preexisting report_fatal_error into a proper error message, pointing out the offending source instruction. This isn't strictly an internal error, as it can be triggered by user input. Differential Revision: https://reviews.llvm.org/D85242 (cherry picked from commit f5e6fbac24f198d075a7c4bc0879426e79040bcf)
Commit ece79ac
Commits on Aug 7, 2020
[GlobalISel][InlineAsm] Fix matching input constraint to physreg
Add given input and mark it as tied. Doesn't create additional copy compared to matching input constraint to virtual register. Differential Revision: https://reviews.llvm.org/D85122 (cherry picked from commit d893278bba01b0e1209e8b8accbdd5cfa75a0932)
Commit 9a16e54
Commits on Aug 17, 2020
Commit 4e760d6
[AArch64][SVE] Fix CFA calculation in presence of SVE objects.
The CFA is calculated as (SP/FP + offset), but when there are SVE objects on the stack the SP offset is partly scalable and should instead be expressed as the DWARF expression: SP + offset + scalable_offset * VG, where VG is the Vector Granule register, containing the number of 64-bit 'granules' in a scalable vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84043 (cherry picked from commit fd6584a22043b254a323635c142b28ce80ae5b5b)
Commit 58d78be
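The CFA expression from the commit above can be evaluated by hand with a small sketch. The function names and the sample values are hypothetical, and this models only the arithmetic, not the DWARF encoding that the patch emits. It assumes the SVE vector length is a multiple of 128 bits, so VG (the number of 64-bit granules) is the vector length in bits divided by 64.

```cpp
#include <cassert>
#include <cstdint>

// VG = number of 64-bit granules in one scalable vector.
std::uint64_t vectorGranules(unsigned VectorLengthBits) {
  assert(VectorLengthBits >= 128 && VectorLengthBits % 128 == 0 &&
         "SVE vector lengths are multiples of 128 bits");
  return VectorLengthBits / 64;
}

// CFA = SP + offset + scalable_offset * VG. The fixed part is known at
// compile time; the scalable part depends on the runtime vector length.
std::uint64_t computeCFA(std::uint64_t SP, std::int64_t FixedOffset,
                         std::int64_t ScalableOffset, std::uint64_t VG) {
  const std::int64_t Total =
      FixedOffset + ScalableOffset * static_cast<std::int64_t>(VG);
  return SP + static_cast<std::uint64_t>(Total);
}
```

With the same frame layout, a 128-bit machine (VG = 2) and a 256-bit machine (VG = 4) get different CFA values from the same expression, which is why a constant offset alone cannot describe the frame.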
[AArch64][SVE] Add missing unwind info for SVE registers.
This patch adds a CFI entry for each SVE callee saved register that needs unwind info at an offset from the CFA. The offset is a DWARF expression because the offset is partly scalable. The CFI entries only cover a subset of the SVE callee-saves and only encode the lower 64 bits, thus implementing the lowest common denominator ABI. Existing unwinders may support VG but only restore the lower 64 bits. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84044 (cherry picked from commit bb3344c7d8c2703c910dd481ada43ecaf11536a6)
Commit b70d8f0
[AArch64][SVE] Disable tail calls if callee does not preserve SVE regs.
This fixes an issue triggered by the following code, where emitEpilogue got confused when trying to restore the SVE registers after the call, whereas the call to non_sve() is implemented as a TCReturn:

    int non_sve();
    int sve(svint32_t x) { return non_sve(); }

Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D84869 (cherry picked from commit f2916636f83dfeb4808a16045db0025783743471)
Commit 08c4f4d
[SVE][CodeGen] Fix bug with store of unpacked FP scalable vectors
Fixed an incorrect pattern in lib/Target/AArch64/AArch64SVEInstrInfo.td for storing out <vscale x 2 x f32> unpacked scalable vectors. Added a couple of tests to test/CodeGen/AArch64/sve-st1-addressing-mode-reg-imm.ll Differential Revision: https://reviews.llvm.org/D85441 (cherry picked from commit 0905d9f31ead399d054c5d2a2c353e690f5c8daa)
Commit e312733
Fix -Wconstant-conversion warning with explicit cast
Introduced by fd6584a22043b254a323635c142b28ce80ae5b5b Following similar use of casts in AsmParser.cpp, for instance - ideally this type would use unsigned chars as they're more representative of raw data and don't get confused around implementation defined choices of char's signedness, but this is what it is & the signed/unsigned conversions are (so far as I understand) safe/bit preserving in this usage and what's intended, given the API design here. (cherry picked from commit e31cfc4cd3e393300002e9c519787c96e3b67bab)
Commit 4fc1aa9
Commits on Aug 18, 2020
[SVE] Fix bug in SVEIntrinsicOpts::optimizePTest
The code wasn't taking into account that the two operands passed to ptest could be identical and was trying to erase them twice. Differential Revision: https://reviews.llvm.org/D85892 (cherry picked from commit 6c7957c9901714b7ad0a8d2743a8c431b57fd0c9)
Commit de2e9a1
[PowerPC] Make StartMI ignore COPY like instructions.
Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D85659 (cherry picked from commit 4d52ebb9b9c72b656c1ccb6a1424841f246cd791)
Commit f0d66a0
[InstCombine] Sanitize undef vector constant to 1 in X*(2^C) with X << C (PR47133)
While x*undef is undef, shift-by-undef is poison, which we must avoid introducing. Also log2(iN undef) is *NOT* iN undef, because log2(iN undef) u< N. See https://bugs.llvm.org/show_bug.cgi?id=47133 (cherry picked from commit 12d93a27e7b78d58dd00817cb737f273d2dba8ae)
Commit 7b99daf
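For readers less familiar with the fold in the commit above, the sketch below shows the scalar version of rewriting a multiply by a power of two as a left shift, and why the shift amount must stay below the bit width. The function names are hypothetical, and plain C++ integers cannot model undef or poison, so this only illustrates the arithmetic. Treating an undefined lane as the constant 1 gives a shift amount of log2(1) = 0, which is always in range.

```cpp
#include <cassert>
#include <cstdint>

// log2 of a value that is known to be a power of two.
unsigned log2PowerOfTwo(std::uint32_t V) {
  assert(V != 0 && (V & (V - 1)) == 0 && "expects a power of two");
  unsigned Log = 0;
  while ((V >>= 1) != 0)
    ++Log;
  return Log;
}

// X * 2^C computed as X << C. The result wraps modulo 2^32 exactly like the
// unsigned multiply does.
std::uint32_t mulAsShift(std::uint32_t X, std::uint32_t PowerOfTwo) {
  const unsigned C = log2PowerOfTwo(PowerOfTwo);
  assert(C < 32 && "shift amount must be smaller than the bit width");
  return X << C;
}
```

Because the logarithm of any 32-bit power of two is at most 31, the shift amount produced this way is always valid, whereas an arbitrary (undefined) shift amount would not be.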
[X86] Optimize getImpliedDisabledFeatures & getImpliedEnabledFeatures after D83273
Previously the time complexity was O(|number of paths from the root to an implied feature| * CPU_FEATURE_MAX) where CPU_FEATURE_MAX is 92. The number of paths can be large (theoretically exponential). For an inline asm statement, there is a code path `clang::Parser::ParseAsmStatement -> clang::Sema::ActOnGCCAsmStmt -> ASTContext::getFunctionFeatureMap` leading to potentially many calls of getImpliedEnabledFeatures (41 for my -march=native case). We should improve the performance a bit in case the number of inline asm statements is large (Linux kernel builds). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D85257 (cherry picked from commit 0c7af8c83bd1acb0ca78f35ddde29b6fde4363a0)
Commit 5c79597
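One common way to avoid the path explosion described in the commit above is to walk the implication graph with a visited set, so each feature is expanded at most once. The sketch below shows that general technique. It is not taken from the patch: the eight-feature graph, `FeatureSet`, and `impliedClosure` are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t NumFeatures = 8; // hypothetical; the commit cites 92
using FeatureSet = std::bitset<NumFeatures>;

// Returns every feature reachable from Root through "implies" edges.
// DirectlyImplies[F] holds the features that F implies directly.
FeatureSet impliedClosure(const std::vector<FeatureSet> &DirectlyImplies,
                          std::size_t Root) {
  assert(DirectlyImplies.size() == NumFeatures && Root < NumFeatures);
  FeatureSet Visited;
  std::vector<std::size_t> Worklist{Root};
  while (!Worklist.empty()) {
    const std::size_t F = Worklist.back();
    Worklist.pop_back();
    for (std::size_t I = 0; I < NumFeatures; ++I) {
      // Only enqueue a feature the first time it is reached, so shared
      // descendants are not re-expanded once per path.
      if (DirectlyImplies[F][I] && !Visited[I]) {
        Visited.set(I);
        Worklist.push_back(I);
      }
    }
  }
  return Visited;
}
```

In a diamond-shaped graph (0 implies 1 and 2, both of which imply 3), a path-based walk reaches feature 3 twice, while this version expands it once. The cost is bounded by the number of features squared rather than by the number of paths.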
[X86] Add test case for PR47000. NFC
(cherry picked from commit 13796d14238baabff972e15ceddb4ae61b1584b8)
Commit c2ceb9b
[X86] Disable copy elision in LowerMemArgument for scalarized vectors when the loc VT is a different size than the original element.
For example a v4f16 argument is scalarized to 4 i32 values. So the values are spread out instead of being packed tightly like in the original vector. Fixes PR47000. (cherry picked from commit 08b2d0a963dbbf54317a137d69f430b347d1bfae)
Commit f19cefc
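The "spread out instead of packed" layout in the commit above comes down to byte offsets. The sketch below compares the two layouts for the v4f16 example, assuming 2-byte elements and 4-byte argument slots. The helper names are hypothetical, and this is only a model of the layout mismatch, not of LowerMemArgument itself.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of element Index when elements are packed back to back.
std::size_t packedOffset(std::size_t Index, std::size_t ElemBytes) {
  return Index * ElemBytes;
}

// Byte offset of element Index when each element occupies its own slot of
// LocBytes bytes (the scalarized argument layout).
std::size_t scalarizedOffset(std::size_t Index, std::size_t LocBytes) {
  return Index * LocBytes;
}

// The argument memory can stand in for the packed vector only if every
// element sits at the same offset in both layouts.
bool layoutsMatch(std::size_t NumElems, std::size_t ElemBytes,
                  std::size_t LocBytes) {
  assert(NumElems > 0 && ElemBytes > 0 && LocBytes >= ElemBytes);
  for (std::size_t I = 0; I < NumElems; ++I)
    if (packedOffset(I, ElemBytes) != scalarizedOffset(I, LocBytes))
      return false;
  return true;
}
```

For four 2-byte elements in 4-byte slots the last element sits at offset 12 rather than 6, so reusing the argument memory directly as the vector would read the wrong bytes.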
[release][docs] Update contributions to LLVM 11 for SVE.
Differential Revision: https://reviews.llvm.org/D85977
Francesco Petrogalli committed Aug 18, 2020
Commit 5e39908
Commits on Aug 19, 2020
[globalopt] Change so that emitting fragments doesn't use the type size of DIVariables
When turning on -debug-info-kind=constructor we ran into a "fragment covers entire variable" error during thinlto. The fragment is currently always emitted if there is no type size, but sometimes the variable has a forward declared struct type which doesn't have a size. This changes the code to get the type size from the GlobalVariable instead. Differential Revision: https://reviews.llvm.org/D85572 (cherry picked from commit 54b6cca0f28484395ae43bcda4c9f929bc51cfe3)
Commit fc50dce
Commits on Aug 20, 2020
[RISCV] Indirect branch generation in position independent code
This fixes the "Unable to insert indirect branch" fatal error sometimes seen when generating position-independent code. Patch by msizanoen1 Reviewed By: jrtc27 Differential Revision: https://reviews.llvm.org/D84833 (cherry picked from commit 5f9ecc5d857fa5d95f6ea36153be19db40576f8a)
Commit 71c87ee
[release][docs] Note on lazy binding and SVE.
Francesco Petrogalli committed Aug 20, 2020
Commit edf75ab
[release][docs] Move SVE release notes to AArch64 section.
Francesco Petrogalli committed Aug 20, 2020
Commit 1fefa51
Commits on Aug 24, 2020
[PowerPC] Fix a typo for InstAlias of mfsprg
D77531 has a typo for mfsprg; it should be mtsprg. This patch fixes the typo. (cherry picked from commit 95e18b2d9d5f93c209ea81df79c2e18ef77de506)
Commit ed779a8
Commits on Aug 25, 2020
Reuse OMPIRBuilder `struct ident_t` handling in Clang
Replace the `ident_t` handling in Clang with the methods offered by the OMPIRBuilder. This cuts down on the clang code as well as the differences between the two, making further transitions easier. Tests have changed but there should not be a real functional change. The most interesting difference is probably that we stop generating local ident_t allocations for now and just use globals. Given that this happens only with debug info, the location part of the `ident_t` is probably bigger than the test anyway. As the location part is already a global, we can avoid the allocation, memcpy, and store in favor of a constant global that is slightly bigger. This can be revisited if there are complications. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D80735
Commit fdbb91a
[DAGCombine] Remove dead node when it is created by getNegatedExpression
We hit the compile-time issue reported in https://bugs.llvm.org/show_bug.cgi?id=46877, and the cause is the same as in D77319. So we need to remove the dead node we created, to avoid increasing the problem size of DAGCombiner. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D86183 (cherry picked from commit 960cbc53ca170c8c605bf83fa63b49ab27a56f65)
Commit e7d24f4
Commits on Aug 26, 2020
Commit c77d5eb
[release][SVE] Move notes for SVE ACLE to the release notes of clang.
Francesco Petrogalli committed Aug 26, 2020
Commit 88141c7
Commit a48e037
[MC][SVE] Fix data operand for instruction alias of `st1d`.
The version of `st1d` that operates with vector plus immediate addressing mode uses the alias `st1d { <Zn>.d }, <Pg>, [<Za>.d]` for rendering `st1d { <Zn>.d }, <Pg>, [<Za>.d, #0]`. The disassembler was generating `<Zn>.s` instead of `<Zn>.d`. Differential Revision: https://reviews.llvm.org/D86633
Francesco Petrogalli committed Aug 26, 2020
Commit 5f5352f
Commits on Aug 28, 2020
Commit f4bf210
[PowerPC] PPCBoolRetToInt: Don't translate Constant's operands
When collecting `i1` values via `findAllDefs`, ignore Constant's operands, since Constant's operands might not be `i1`. Fixes https://bugs.llvm.org/show_bug.cgi?id=46923 which causes ICE
```
llvm-project/llvm/lib/IR/Constants.cpp:1924: static llvm::Constant *llvm::ConstantExpr::getZExt(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->getScalarSizeInBits() < Ty->getScalarSizeInBits()&& "SrcTy must be smaller than DestTy for ZExt!"' failed.
```
Differential Revision: https://reviews.llvm.org/D85007 (cherry picked from commit cbea17568f4301582c1d5d43990f089ca6cff522)
Commit a98bee6
[CodeGen] Properly propagating Calling Convention information when lowering vector arguments
When joining the legal parts of vector arguments into their original value during the lowering of Formal Arguments in SelectionDAGBuilder, the Calling Convention information was not being propagated for the handling of each individual part. The same did not happen when lowering calls, causing a mismatch. This patch fixes the issue by properly propagating the Calling Convention details. This fixes Bugzilla #47001. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86715 (cherry picked from commit 3d943bcd223e5b97179840c2f5885fe341e51747)
Commit a073304
[AArch64][SVE] Fix calculation restore point for SVE callee saves.
This fixes an issue where the restore point of callee-saves in the function epilogues was incorrectly calculated when the basic block consisted of only a RET instruction. This caused dealloc instructions to be inserted in between the block of callee-save restore instructions, rather than before it. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D86099 (cherry picked from commit 5f47d4456d192eaea8c56a2b4648023c8743c927)
Commit daf466c
[AArch64][SVE] Add missing debug info for ACLE types.
This patch adds type information for SVE ACLE vector types, by describing them as vectors, with a lower bound of 0, and an upper bound described by a DWARF expression using the AArch64 Vector Granule register (VG), which contains the runtime multiple of 64bit granules in an SVE vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86101 (cherry picked from commit 4e9b66de3f046c1e97b34c938b0920fa6401f40c)
Commit 43b4ea8
[SSP] Restore setting the visibility of __guard_local to hidden for better code generation.
Patch by: Philip Guenther (cherry picked from commit d870e363263835bec96c83f51b20e64722cad742)
Commit d812075
Commits on Aug 31, 2020
-
Commit 62d099a
[DAGCombine] Don't delete the node if it has uses immediately
This is the follow-up patch for https://reviews.llvm.org/D86183, which missed the case where NegX == NegY: the node we delete still has a use after we create the new node.
```
if (NegX && (CostX <= CostY)) {
  Cost = std::min(CostX, CostZ);
  RemoveDeadNode(NegY);
  return DAG.getNode(Opcode, DL, VT, NegX, Y, NegZ, Flags); // <-- NegY is used here if NegY == NegX.
}
```
Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D86689 (cherry picked from commit deb4b2580715810ecd5cb7eefa5ffbe65e5eedc8)
Commit 9e52bd7
[cmake] Don't build with -O3 -fPIC on Solaris/sparcv9
Tests on Solaris/sparcv9 currently show about 250 failures when building with gcc, most of them like the following:
  FAIL: LLVM-Unit :: Support/./SupportTests/TaskQueueTest.UnOrderedFutures (4269 of 67884)
  ******************** TEST 'LLVM-Unit :: Support/./SupportTests/TaskQueueTest.UnOrderedFutures' FAILED ********************
  Note: Google Test filter = TaskQueueTest.UnOrderedFutures
  [==========] Running 1 test from 1 test case.
  [----------] Global test environment set-up.
  [----------] 1 test from TaskQueueTest
  [ RUN ] TaskQueueTest.UnOrderedFutures
  0 SupportTests 0x0000000100753b20 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 32
  1 SupportTests 0x0000000100752974 llvm::sys::RunSignalHandlers() + 68
  2 SupportTests 0x0000000100752b18 SignalHandler(int) + 372
  3 libc.so.1 0xffffffff7eedc800 __sighndlr + 12
  4 libc.so.1 0xffffffff7eecf23c call_user_handler + 852
  5 libc.so.1 0xffffffff7eecf594 sigacthandler + 84
  6 SupportTests 0x00000001006f8cb8 std::thread::_State_impl<std::thread::_Invoker<std::tuple<llvm::ThreadPool::ThreadPool(llvm::ThreadPoolStrategy)::'lambda'()> > >::_M_run() + 512
  7 libstdc++.so.6.0.28 0xfffffffc628117cc execute_native_thread_routine + 16
  8 libc.so.1 0xffffffff7eedc6a0 _lwp_start + 0
Since it's effectively impossible to debug such a `SEGV` in a `Release` build, I tried a `Debug` build instead, only to find that the failures had gone away. Further investigation revealed that most of the issue centers around `llvm/lib/Support/ThreadPool.cpp`. That file is built with `-O3 -fPIC` in a `Release` build. The failure vanishes if
- compiling without `-fPIC`
- compiling with `-O -fPIC`
- linking with GNU `ld` instead of Solaris `ld`
It has meanwhile been determined that `gcc` doesn't correctly heed some TLS code sequences. To make things worse, Solaris `ld` doesn't properly validate its assumptions against the input, generating wrong code. `gld` like `gcc` is more liberal here and correctly deals with the code it gets fed from `gcc`.
There's PR target/96607: GCC feeds SPARC/Solaris linker with unrecognized TLS sequences <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96607> now.
An attempt to build with `-DLLVM_ENABLE_PIC=Off` initially failed since neither `libRemarks.so` (D85626 <https://reviews.llvm.org/D85626>) nor `LLVMPolly.so` (D85627 <https://reviews.llvm.org/D85627>) heed that option. Even with that fixed, a few codegen failures remain.
Next I tried to build just `ThreadPool.cpp` with `-O -fPIC`. While that fixed the vast majority of the failures, 16 `LLVM :: CodeGen/X86` failures remained. Given that that solution was both incomplete and fragile, I went for building the whole tree with `-O -fPIC` for `Release` and `RelWithDebInfo` builds.
As detailed in Bug 47304, 2-stage builds also show large numbers of failures when building with `-O3` or `-O2`, which are likewise worked around by building with `-O` until they are sufficiently analyzed and fixed. This way, all failures relative to a `Debug` build go away.
Tested on `sparcv9-sun-solaris2.11`. Differential Revision: https://reviews.llvm.org/D85630 (cherry picked from commit 15c66b10114d239c96282cf8fc5330186178974b)
Commit 9f6b532
[InstSimplify] Protect against more poison in SimplifyWithOpReplaced (PR47322)
Replace the check for poison-producing instructions in SimplifyWithOpReplaced() with the generic helper canCreatePoison() that properly handles poisonous shifts and thus avoids the problem from PR47322. This additionally fixes a bug in IIQ.UseInstrInfo=false mode, which previously could have caused this code to ignore poison flags. Setting UseInstrInfo=false should reduce the possible optimizations, not increase them. This is not a full solution to the problem, as poison could be introduced more indirectly. This is just a minimal, easy to backport fix. Differential Revision: https://reviews.llvm.org/D86834 (cherry picked from commit a5be86fde5de2c253aa19704bf4e4854f1936f8c)
Commit 42e283c
Commits on Sep 1, 2020
-
[PowerPC] Set v1i128 to expand for SETCC to avoid crash
Summary: PPC only supports instruction selection of ISD::SETCC for v16i8, v8i16, v4i32, v2i64, v4f32 and v2f64. It does not support v1i128, so ISD::SETCC on v1i128 crashes. This patch sets v1i128 to expand to avoid the crash. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D84238 (cherry picked from commit 802c043078ad653aca131648a130b59f041df0b5)
Commit 7664478
Commits on Sep 5, 2020
-
Commit b00850c
Commits on Sep 7, 2020
-
Eliminate the sizing template parameter N from CoalescingBitVector
Since the parameter is not used anywhere, and the default size of 16 apparently causes PR47359, remove it. This ensures that IntervalMap will automatically determine the optimal size, using its NodeSizer struct. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D87044 (cherry picked from commit f26fc568402f84a94557cbe86e7aac8319d61387)
Commit 0abe177
[WebAssembly] Fix incorrect assumption of simple value types
Fixes PR47375, in which an assertion was triggering because WebAssemblyTargetLowering::isVectorLoadExtDesirable was improperly assuming the use of simple value types. Differential Revision: https://reviews.llvm.org/D87110 (cherry picked from commit caee15a0ed52471bd329d01dc253ec9be3936c6d)
Commit 7da9b1d
[PowerPC] Do not legalize vector FDIV without VSX
Quite a while ago, we legalized these nodes as we added custom handling for reciprocal estimates in the back end. We have since moved to target-independent combines but neglected to turn off legalization. As a result, we can now get selection failures on non-VSX subtargets as evidenced in the listed PR. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47373 (cherry picked from commit 27714075848e7f05a297317ad28ad2570d8e5a43)
Commit 20da396
[PowerPC] Fix broken kill flag after MI peephole
The test case in https://bugs.llvm.org/show_bug.cgi?id=47373 exposed two bugs in the PPC back end. The first one was fixed in commit 27714075848e7f05a297317ad28ad2570d8e5a43 but the test case had to be added without -verify-machineinstrs due to the second bug. This commit fixes the use-after-kill that is left behind by the PPC MI peephole optimization. (cherry picked from commit 69289cc10ffd1de4d3bf05d33948e6b21b6e68db)
Commit 054e5f0
[MachineCopyPropagation] In isNopCopy, check the destination registers match in addition to the source registers.
Previously, if the source registers matched, we asserted that the destination registers matched. But GPR <-> mask register copies on X86 can violate this since we use the same K-registers for multiple sizes. Fixes this ISPC issue ispc/ispc#1851 Differential Revision: https://reviews.llvm.org/D86507 (cherry picked from commit 4783e2c9c603ed6aeacc76bb1177056a9d307bd1)
Commit 29f8bec
Commits on Sep 8, 2020
-
Commit 2a92db2
Provide anchor for compiler extensions
This patch is cherry-picked from 04b0a4e22e3b4549f9d241f8a9f37eebecb62a31, and amended to prevent an undefined reference to `llvm::EnableABIBreakingChecks' (cherry picked from commit 38778e1087b2825e91b07ce4570c70815b49dcdc)
Commit df4269f
[X86] SSE4_A should only imply SSE3 not SSSE3 in the frontend.
SSE4_1 and SSE4_2 do imply SSSE3. So I guess I got confused when switching the code to being table based in D83273. Fixes PR47464 (cherry picked from commit e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac)
Commit 6a6cc0b
Commits on Sep 9, 2020
-
[PowerPC] Set setMaxAtomicSizeInBitsSupported appropriately for 32-bit PowerPC in PPCTargetLowering
Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D86165 (cherry picked from commit 88b368a1c47bca536f03041f7464235b94ea98a1)
Commit f540d30
Commits on Sep 11, 2020
-
[DebugInfo] Fixing CodeView assert related to lowerBound field of DISubrange.
This is to fix CodeView build failure https://bugs.llvm.org/show_bug.cgi?id=47287 after the DISubrange upgrade D80197. The assert condition is now removed and Count is calculated in case LowerBound is absent or zero and Count or UpperBound is constant. If Count is unknown it is later handled as VLA (currently Count is set to zero). Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87406 (cherry picked from commit e45b0708ae81ace27de53f12b32a80601cb12bf3)
Commit 85e75eb
Commits on Sep 14, 2020
-
[AMDGPU] Fix for folding v2.16 literals.
It was found that some packed immediate operands (e.g. `<half 1.0, half 2.0>`) were incorrectly processed, so one of the two packed values was lost. Introduced a new function to check whether an immediate 32-bit operand can be folded. Converted the condition on the current op_sel flags value to fall-through. Fixes: SWDEV-247595 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D87158 (cherry picked from commit d03c4034dc80c944ec4a5833ba8f87d60183f866)
Commit f527c84
Commits on Sep 15, 2020
-
Reduce code duplication in simplifySelectWithICmpCond (NFC)
Canonicalize icmp ne to icmp eq and implement all the folds only once.
Commit c5c1bd4
Commit 8dd4fad
Fix incorrect SimplifyWithOpReplaced transform (PR47322)
This is a followup to D86834, which partially fixed this issue in InstSimplify. However, InstCombine repeats the same transform while dropping poison flags -- which does not cover cases where poison is introduced in some other way. The fix here is a bit more comprehensive, because things are quite entangled, and it's hard to only partially address it without regressing optimization. There are really two changes here:
* Export the SimplifyWithOpReplaced API from InstSimplify, with an added AllowRefinement flag. For replacements inside the TrueVal we don't actually care whether refinement occurs or not, the replacement is always legal. This part of the transform is now done in InstSimplify only. (It should be noted that the current AllowRefinement check is not sufficient -- that's an issue we need to address separately.)
* Change the InstCombine fold to work by temporarily dropping poison generating flags, running the fold and then restoring the flags if it didn't work out. This will ensure that the InstCombine fold is correct as long as the InstSimplify fold is correct.
Differential Revision: https://reviews.llvm.org/D87445
Commit 6745ba4
[SelectionDAG] Remove unused FP constant in getNegatedExpression
960cbc53 immediately removes nodes that won't be used to avoid compilation time explosion. This patch adds the removal to constants to fix PR47517. Reviewed By: RKSimon, steven.zhang Differential Revision: https://reviews.llvm.org/D87614 (cherry picked from commit 2508ef014e8b01006de4e5ee6fd451d1f68d550f)
Commit 9491343
[FastISel] Bail out of selectGetElementPtr for vector GEPs.
The code that decomposes the GEP into ADD/MUL doesn't work properly for vector GEPs. It can create bad COPY instructions or possibly assert. For now just bail out to SelectionDAG. Fixes PR45906 (cherry picked from commit 4208ea3e19f8e3e8cd35e6f5a6c43f4aa066c6ec)
Commit 7b6b353
Revert "Double check that passes correctly set their Modified status"
This check fires during self-host.
> The approach is simple: if a pass reports that it's not modifying a
> Function/Module, compute a loose hash of that Function/Module and compare it
> with the original one. If we report no change but there's a hash change, then we
> have an error.
>
> This approach misses a lot of change but it's not super intrusive and can
> detect most of the simple mistakes.
>
> Differential Revision: https://reviews.llvm.org/D80916
This reverts commit 3667d87a33d3c8d4072a41fd84bb880c59347dc0.
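For readers unfamiliar with the reverted check, the idea quoted above can be sketched roughly as follows. This is an illustration only, not the LLVM implementation: `ToyFunction`, `looseHash`, `statusIsConsistent` and the three example passes are hypothetical names, and a list of strings stands in for a Function/Module.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a Function/Module: a list of instruction strings.
using ToyFunction = std::vector<std::string>;

// Loose hash: folds the instruction count and each instruction's hash together.
// Like any hash it can collide, so some real changes would go unnoticed.
inline std::size_t looseHash(const ToyFunction &F) {
  std::size_t H = F.size();
  for (const std::string &I : F)
    H = H * 31 + std::hash<std::string>{}(I);
  return H;
}

// Runs Pass on F. Returns false only when the pass reported "not modified"
// although the hash of F changed, which is the error case described above.
inline bool statusIsConsistent(ToyFunction &F,
                               const std::function<bool(ToyFunction &)> &Pass) {
  const std::size_t Before = looseHash(F);
  const bool ReportedChange = Pass(F);
  return ReportedChange || looseHash(F) == Before;
}

// Example passes (hypothetical): each returns its self-reported Modified status.
inline bool honestPass(ToyFunction &F) { F.push_back("nop"); return true; }
inline bool lyingPass(ToyFunction &F) { F.push_back("nop"); return false; }
inline bool idlePass(ToyFunction &) { return false; }
```

With these definitions, `statusIsConsistent` accepts `honestPass` and `idlePass` and flags `lyingPass`, which changes its input while reporting no change.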
Commit d3f2114
Revert "[SelectionDAG] Remove unused FP constant in getNegatedExpression"
2508ef01 doesn't totally fix the issue, since we did not handle the case where the unused temporary negated result is the same as the result, which was found by address sanitizer. (cherry picked from commit e1669843f2aaf1e4929afdd8f125c14536d27664)
Commit 4daf36a
[Docs] Add/update release notes for D71913 (LTO WPD changes)
This adds documentation for the options added / changed by D71913, which enabled aggressive WPD under LTO. The lld release notes already mentioned it, but I expanded the note. Differential Revision: https://reviews.llvm.org/D86958
Commit 73b4967
Revert "RegAllocFast: Record internal state based on register units"
This seems to have caused incorrect register allocation in some cases, breaking tests in the Zig standard library (PR47278). As discussed on the bug, revert back to green for now.
> Record internal state based on register units. This is often more
> efficient as there are typically fewer register units to update
> compared to iterating over all the aliases of a register.
>
> Original patch by Matthias Braun, but I've been rebasing and fixing it
> for almost 2 years and fixed a few bugs causing intermediate failures
> to make this patch independent of the changes in
> https://reviews.llvm.org/D52010.
This reverts commit 66251f7e1de79a7c1620659b7f58352b8c8e892e, and follow-ups 931a68f26b9a3de853807ffad7b2cd0a2dd30922 and 0671a4c5087d40450603d9d26cf239f1a8b1367e. It also adjusts some test expectations. (cherry picked from commit a21387c65470417c58021f8d3194a4510bb64f46)
Commit 068754a
Commit 88fac78
Commits on Sep 16, 2020
-
Commit e9634e0
Commits on Sep 17, 2020
-
[X86][ELF] Prefer lowering MC_GlobalAddress operands to .Lfoo$local for STV_DEFAULT only
This patch restricts the behaviour of referencing via .Lfoo$local local aliases, introduced in https://reviews.llvm.org/D73230, to STV_DEFAULT globals only. Hidden symbols via -fvisibility=hidden (https://gcc.gnu.org/wiki/Visibility) are an important scenario.
Benefits:
- Improves the size of object files by using fewer STT_SECTION symbols.
- The code reads a bit better (it was not obvious to me without going back to the code reviews why the canBenefitFromLocalAlias function currently doesn't consider visibility).
- There is also a side benefit in restoring the effectiveness of the --wrap linker option and making the behavior of --wrap consistent between LTO and normal builds for references within a translation-unit. Note: this --wrap behavior (which is specific to LLD) should not be considered reliable. See comments on https://reviews.llvm.org/D73230 for more.
Differential Revision: https://reviews.llvm.org/D85782 (cherry picked from commit 4cb016cd2d8467c572b2e5c5d34f376ee79e4ac1)
Commit d8484b5
[SelectionDAG] Check any use of negation result before removal
2508ef01 fixed a bug about constant removal in negation, but a sanitizer check showed there was still an issue with it, so it was reverted. Temporary nodes are removed if they turn out to be useless in negation. Before the removal, they are checked for uses by any other node, so the removal was moved after getNode. However, in rare cases the node to be removed is the same as the result of getNode. We missed that case, which this patch fixes. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87614 (cherry picked from commit a2fb5446be960ad164060b3c05fc268f7f72d67a)
Commit b157b64
Commits on Sep 22, 2020
-
[llvm] Add contains(KeyType) -> bool methods to SmallPtrSet
Matches C++20 API addition. Differential Revision: https://reviews.llvm.org/D83449 (cherry picked from commit a0385bd7acd6e1d16224b4257f4cb50e59f1d75e)
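As a rough illustration of what a `contains(KeyType) -> bool` method buys over the older `count()` idiom, here is a minimal sketch. `TinyPtrSet` is a hypothetical toy class backed by `std::set`; it is not `llvm::SmallPtrSet` and says nothing about how the real container is implemented.

```cpp
#include <cstddef>
#include <set>

// Hypothetical miniature pointer set used only to illustrate the API shape.
template <typename PtrT> class TinyPtrSet {
  std::set<PtrT> Storage;

public:
  // Returns true if the pointer was newly inserted.
  bool insert(PtrT P) { return Storage.insert(P).second; }

  // Older idiom: returns 0 or 1, usually compared against zero by callers.
  std::size_t count(PtrT P) const { return Storage.count(P); }

  // bool-returning membership test in the style of C++20 std::set::contains.
  bool contains(PtrT P) const { return Storage.find(P) != Storage.end(); }
};
```

A caller can then write `if (Visited.contains(Ptr))` rather than `if (Visited.count(Ptr))`, which states the intent directly and avoids an implicit integer-to-bool conversion.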
Commit a654ae5
PR47468: Fix findPHICopyInsertPoint, so that copies aren't incorrectly inserted after an INLINEASM_BR.
findPHICopyInsertPoint special cases placement in a block with a callbr or invoke in it. In that case, we must ensure that the copy is placed before the INLINEASM_BR or call instruction, if the register is defined prior to that instruction, because it may jump out of the block. Previously, the code placed it immediately after the last def _or use_. This is wrong, if the use is the instruction which may jump. We could correctly place it immediately after the last def (ignoring uses), but that is non-optimal for register pressure. Instead, place the copy after the last def, or before the call/inlineasm_br, whichever is later. Differential Revision: https://reviews.llvm.org/D87865 (cherry picked from commit f7a53d82c0902147909f28a9295a9d00b4b27d38)
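The placement rule in the last sentence can be expressed as a small index computation. This is only a sketch of the stated rule, not the code from the patch: `copyInsertIndex` is a hypothetical helper, instructions are modelled as positions in a block, and it assumes the block contains both a def of the register and exactly one call/INLINEASM_BR that may jump.

```cpp
#include <algorithm>
#include <cstddef>

// Returns the index the copy should occupy, i.e. the copy is inserted
// immediately before the instruction currently at that index.
//   LastDefIdx: position of the last def of the source register in the block.
//   JumpIdx:    position of the call/INLINEASM_BR that may leave the block.
// "After the last def" is LastDefIdx + 1; "before the jump" is JumpIdx.
// The rule picks whichever of the two positions is later.
inline std::size_t copyInsertIndex(std::size_t LastDefIdx, std::size_t JumpIdx) {
  return std::max(LastDefIdx + 1, JumpIdx);
}
```

When the def precedes the jumping instruction, the copy lands directly before that instruction, which keeps the live range short without placing the copy after a possible jump.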
Commit 5f31397
[CodeGen] Fixing inconsistent ABI mangling of values in SelectionDAGBuilder
SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type conversions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and in spite of, the regular Calling Convention Lowering. The issue could be observed in a scenario such as:
```
%arg1 = load half, half* %const, align 2
%arg2 = call fastcc half @someFunc()
call fastcc void @otherFunc(half %arg1, half %arg2)
; Here, %arg2 was incorrectly mangled twice, as the CallConv data from
; the call to @someFunc() was taken into consideration for the check
; when getting the value for processing the call to @otherFunc(...),
; after the proper conversion had taken place when lowering the return
; value of the first call.
```
This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contained in the Calling Convention Lowering. This fixes Bugzilla #47454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87844 (cherry picked from commit 53d238a961d14eae46f6f2b296ce48026c7bd0a1)
Commit 8fcafdf
Commit ca00b88
Commits on Sep 24, 2020
-
Commit 6dd12f3
Commits on Sep 25, 2020
-
AArch64/GlobalISel: Reduced patch for bug 47619
This is the relevant portions of an assert fixed by b98f902f1877c3d679f77645a267edc89ffcd5d6.
Commit 2d82761
Commits on Sep 28, 2020
-
AArch64/GlobalISel: Narrow stack passed argument access size
This fixes a verifier error in the testcase from bug 47619. The stack passed s3 value was widened to 4 bytes, producing a 4-byte memory access with a < 1 byte result type. We need to either widen the result type or narrow the access size. This copies the code directly from the AMDGPU handling, which narrows the load size. I don't like that every target has to handle this, but this is currently broken on the 11 release branch and this is the simplest fix. This reverts commit 42bfa7c63b85e76fe16521d1671afcafaf8f64ed. (cherry picked from commit 6cb0d23f2ea6fb25106b0380797ccbc2141d71e1)
Commit 8cd42ca
[CodeGen] Do not call `emitGlobalConstantLargeInt` for constant requires 8 bytes to store
This is a fix for PR47630. The regression is caused by D78011. After that change the code starts to call `emitGlobalConstantLargeInt` even for constants which require eight bytes to store. Differential revision: https://reviews.llvm.org/D88261 (cherry picked from commit c6c5629f2fb4ddabd376fbe7c218733283e91d09)
Commit 172c27d
C API: functions to get mask of a ShuffleVector
This commit fixes a regression (from LLVM 10 to LLVM 11 RC3) in the LLVM C API. Previously, commit 1ee6ec2bf removed the mask operand from the ShuffleVector instruction, storing the mask data separately in the instruction instead; this reduced the number of operands of ShuffleVector from 3 to 2. AFAICT, this change unintentionally caused a regression in the LLVM C API. Specifically, it is no longer possible to get the mask of a ShuffleVector instruction through the C API. This patch introduces new functions which together allow a C API user to get the mask of a ShuffleVector instruction, restoring the functionality which was previously available through LLVMGetOperand(). This patch also adds tests for this change to the llvm-c-test executable, which involved adding support for InsertElement, ExtractElement, and ShuffleVector itself (as well as constant vectors) to echo.cpp. Previously, vector operations weren't tested at all in echo.ll. I also fixed some typos in comments and help-text nearby these changes, which I happened to spot while developing this patch. Since the typo fixes are technically unrelated other than being in the same files, I'm happy to take them out if you'd rather they not be included in the patch. Differential Revision: https://reviews.llvm.org/D88190 (cherry picked from commit 51cad041e0cb26597c7ccc0fbfaa349b8fffbcda)
Commit 810086a
[LLVM-C] Turn a ShuffleVector Constant Into a Getter.
It is not a good idea to expose raw constants in the LLVM C API. Replace this with an explicit getter. Differential Revision: https://reviews.llvm.org/D88367 (cherry picked from commit 55f727306e727ea9f013d09c9b8aa70dbce6a1bd)
Commit 8a3d6aa
Fix mysterious failure of SupportTests FileCheckTest.Binop
The test would fail in no-asserts release builds using MSVC for 64-bit Windows:
  Unexpected error message:
  TestBuffer:1:1: error: implicit format conflict between 'FOO' (%u) and '18\0' (%x), need an explicit format specifier
  Error message(s) not found:
  {implicit format conflict between 'FOO' (%u) and 'BAZ' (%x), need an explicit format specifier}
It seems a string from a previous test case is finding its way into the latter one. This doesn't reproduce on master anymore after 998709b7d, so let's just hack around it here for the branch.
Commit 2e0afe6
Commits on Sep 29, 2020
-
[LLVM 11] Add SystemZ changes to release notes
Differential Revision: https://reviews.llvm.org/D88479
Commit 5603084
Commits on Sep 30, 2020
-
[GlobalISel] Fix multiply with overflow intrinsics legalization generating invalid MIR.
During lowering of G_UMULO and friends, the previous code moved the builder's insertion point to be after the legalizing instruction. When that happened, if there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder would try to find that constant during the buildConstant(zero) call, and since it dominates itself would return the iterator unchanged, even though the def of the constant was *after* the current insertion point. This resulted in the compare being generated *before* the constant which it was using. There's no need to modify the insertion point before building the mul-hi or constant. Delaying moving the insert point ensures those are built/CSEd before the G_ICMP is built. Fixes PR47679 Differential Revision: https://reviews.llvm.org/D88514 (cherry picked from commit 1d54e75cf26a4c60b66659d5d9c62f4bb9452b03)
Commit b8f4c23
[APFloat] prevent NaN morphing into Inf on conversion (PR43907)
We shift the significand right on a truncation, but that needs to be made NaN-safe: always set at least 1 bit in the significand. https://llvm.org/PR43907 See D88238 for the likely follow-up (but needs some plumbing fixes before it can proceed). Differential Revision: https://reviews.llvm.org/D87835 (cherry picked from commit e34bd1e0b03d20a506ada156d87e1b3a96d82fa2)
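To see why a right shift of the significand can turn a NaN into an infinity, consider narrowing an IEEE-754 binary64 NaN to binary32 at the bit level. The sketch below is a standalone illustration and does not use `APFloat`; the function names are hypothetical, and the choice of which bit to set in the fixed version (here the top fraction bit) is an assumption, since the message only says that at least one bit must be set.

```cpp
#include <cstdint>

// binary64 has 52 fraction bits and binary32 has 23, so narrowing drops 29.
constexpr int kDroppedBits = 52 - 23;

// binary32 classification helpers working on raw bit patterns.
inline bool isNaN32(std::uint32_t Bits) {
  return (Bits & 0x7F800000u) == 0x7F800000u && (Bits & 0x007FFFFFu) != 0;
}
inline bool isInf32(std::uint32_t Bits) {
  return (Bits & 0x7FFFFFFFu) == 0x7F800000u;
}

// Naive narrowing of a binary64 NaN: keep the sign, use an all-ones exponent,
// and shift the fraction right. A payload living only in the low 29 bits is
// shifted out entirely, leaving the bit pattern of an infinity.
inline std::uint32_t narrowNaNNaive(std::uint64_t Bits) {
  const std::uint32_t Sign = static_cast<std::uint32_t>(Bits >> 63) << 31;
  const std::uint64_t Frac = Bits & ((std::uint64_t{1} << 52) - 1);
  const std::uint32_t NarrowFrac =
      static_cast<std::uint32_t>(Frac >> kDroppedBits);
  return Sign | 0x7F800000u | NarrowFrac;
}

// NaN-safe narrowing: if the shift cleared the whole fraction, set one bit so
// the result is still a NaN. Which bit to set is this sketch's assumption.
inline std::uint32_t narrowNaNSafe(std::uint64_t Bits) {
  std::uint32_t Narrow = narrowNaNNaive(Bits);
  if ((Narrow & 0x007FFFFFu) == 0)
    Narrow |= 0x00400000u;
  return Narrow;
}
```

For the input `0x7FF0000000000001` (a NaN whose only payload bit is the lowest one), the naive version yields the binary32 infinity pattern while the safe version yields a NaN.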
Commit eaf2635
Commits on Oct 1, 2020
-
Commit f137979
Fix indentation for PowerPC ReleaseNotes
Ahsan Saghir committed Oct 1, 2020
Commit 6228b2b
Commits on Oct 5, 2020
-
ReleaseNotes: mention the machine outliner for ARM
As suggested by Yvan.
Commit 67f791e
Commits on Oct 6, 2020
-
[SelectionDAG] Don't remove unused negated constant immediately
This partially reverts a2fb5446 (actually, 2508ef01), which removed a negated FP constant immediately if it had no uses. However, as discussed in bug 47517, there are cases where NegX is folded into a constant from other places while NegY is removed by that line of code, and NegX is equal to NegY. In these cases, NegX is deleted before it is used and a crash happens. So revert the code and add the necessary test case. (cherry picked from commit b326d4ff946d2061a566a3fcce9f33b484759fe0)
Commit ce10659
Commits on Oct 7, 2020
-
[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR
Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay. See: ClangBuiltLinux/linux#1125 Differential Revision: https://reviews.llvm.org/D88823 (cherry picked from commit d2c61d2bf9bd1efad49acba2f2751112522686aa)
Commit 1739071
Commits on Nov 1, 2020
-
X86: Fix/workaround Small Code Model for JIT
Force RIP-relative jump tables and global values.
Force RIP-relative all-zeros / all-ones constants.
These were causing crashes due to the use of absolute addressing.
Commit 391548e
-
MCJIT: don't finalize modules on symbol lookup (workaround)
This is extremely slow yet unnecessary with manual finalization. In LLVM 6 this wasn't a problem.
Commit 5a08b9c
-
Disable GDBRegistrationListener
It makes emitting objects extremely slow. GDB doesn't work properly with it anyway, and often crashes because it cannot read the format.
Commit c9f0684
-
Commit 9eac890
-
Commit e9d729d
Commits on Nov 2, 2020
-
X86: LowerShift: new algorithm for vector-vector shifts
Emit pair of shifts of double size if possible
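One possible reading of "pair of shifts of double size" is that per-element shifts on narrow lanes are built from two shifts on lanes twice as wide, one for the even elements and one for the odd elements. The sketch below models that idea for two 16-bit elements packed in one 32-bit lane. It is an assumption about the technique, not the algorithm from the commit; the function name is hypothetical and shift amounts are assumed to be in the range 0 to 15.

```python
MASK_LO = 0x0000FFFF
MASK_HI = 0xFFFF0000


def shl_u16_pair_via_u32(lane32: int, amt_lo: int, amt_hi: int) -> int:
    """Shift two packed 16-bit elements left by independent amounts.

    Each element is isolated, shifted with a 32-bit shift, and masked back
    to its own half so bits cannot leak into the neighbouring element.
    """
    # First double-size shift: handles the low (even) element.
    lo = ((lane32 & MASK_LO) << amt_lo) & MASK_LO
    # Second double-size shift: handles the high (odd) element.
    hi = ((lane32 & MASK_HI) << amt_hi) & MASK_HI
    # Blend the two partial results.
    return hi | lo
```

Masking before the second shift matters: shifting the unmasked lane would move bits of the low element into the high half.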
Commit 999da05
-
Commit 883d6df
-
X86: expand detectAVGPattern()
Allow all integer widths in the pattern, and allow ashr.
Handle signed and mixed cases, allowing the truncation to be replaced.
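For orientation, the basic unsigned form of the average pattern can be modelled on scalars. This sketch only covers the classic 8-bit unsigned case, not the signed or mixed cases mentioned above; the function names are hypothetical. It checks that the widened expression a matcher would look for agrees with a rounding-up average computed entirely in 8 bits, which is the behaviour of an instruction such as PAVGB.

```python
def avg_pattern_widened(a: int, b: int) -> int:
    """Widened form: trunc((zext(a) + zext(b) + 1) >> 1) for 8-bit inputs."""
    wide = a + b + 1           # computed in a wider type, so it cannot wrap
    return (wide >> 1) & 0xFF  # truncate back to 8 bits


def avg_round_up_u8(a: int, b: int) -> int:
    """Rounding-up unsigned average using only 8-bit intermediate values."""
    return (a | b) - ((a ^ b) >> 1)
```

Because the two forms agree for every pair of 8-bit inputs, the widening, the addition, the shift, and the truncation can collapse into one narrow average operation.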
Commit b7587d5
-
X86: add pattern for X86ISD::VSRAV
Detect clamping ashr shift amount to max legal value
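Why the clamp can be dropped is easy to check on a scalar model. The sketch below assumes 32-bit lanes and the documented behaviour of the AVX2 variable arithmetic shift (VPSRAVD), where a count above 31 fills the lane with the sign bit; the function names are hypothetical and the shift amount is treated as unsigned.

```python
BITS = 32


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << BITS) if x & 0x80000000 else x


def clamped_ashr(x: int, amt: int) -> int:
    """Source-level pattern: ashr x, umin(amt, 31)."""
    return to_signed32(x) >> min(amt, BITS - 1)


def vpsravd_lane(x: int, amt: int) -> int:
    """Model of one VPSRAVD lane: counts above 31 yield all sign bits."""
    x = to_signed32(x)
    if amt > BITS - 1:
        return -1 if x < 0 else 0
    return x >> amt
```

Shifting arithmetically by 31 already leaves only copies of the sign bit, so the instruction's out-of-range behaviour matches the clamped form and the explicit minimum is redundant.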
Commit 3223efa
-
X86: add patterns for X86ISD::VSHLV and X86ISD::VSRLV
Replace VSELECT instructions that zero the result when the SHL/SRL shift amount exceeds the legal range.
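The equivalence behind this pattern can be sketched on scalars. The model assumes 32-bit lanes and the documented behaviour of the AVX2 variable logical shifts (VPSLLVD and VPSRLVD), which produce zero for counts above 31; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def guarded_shl(x: int, amt: int) -> int:
    """Source-level pattern: select(amt < 32, shl(x, amt), 0)."""
    return ((x & MASK32) << amt) & MASK32 if amt < 32 else 0


def guarded_srl(x: int, amt: int) -> int:
    """Source-level pattern: select(amt < 32, lshr(x, amt), 0)."""
    return (x & MASK32) >> amt if amt < 32 else 0


def vpsllvd_lane(x: int, amt: int) -> int:
    """Model of one VPSLLVD lane: all bits are shifted out for counts >= 32."""
    # min() only bounds the Python integer size; it does not change the result.
    return ((x & MASK32) << min(amt, 32)) & MASK32


def vpsrlvd_lane(x: int, amt: int) -> int:
    """Model of one VPSRLVD lane: all bits are shifted out for counts >= 32."""
    return (x & MASK32) >> min(amt, 32)
```

Since the instruction already zeroes the lane for oversized counts, the compare and select that guard the shift carry no extra information and a single variable shift is enough.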
Commit 9c0f762
-
X86: avoid vector-scalar shifts if splat amount is directly a vector ADD/SUB/AND op.
Prefer vector-vector shifts if available (AVX2+). This improves the code generated for rotate and funnel shifts; otherwise a shuffle plus a slower vector-scalar shift would be generated.
Commit ec657b9
-
Commit 8c02f52
Commits on Nov 3, 2020
-
The test directory severely inflates the size of the AUR clone, and we're not even using the tests
Commit 662c2b4
Commits on Nov 6, 2020
-
Treat Zen3 as Zen2 until upstream adds Zen3 support
Commit d74d689
Commits on Dec 3, 2020
-
Commit cb7748d
Commits on Jan 9, 2021
-
Commit 716bb29
Commits on Mar 28, 2021
-
Commit 5d8643e
Commits on Apr 18, 2021
-
Commit 004a051