Sycl web #14302
Commits on Jun 22, 2024
-
[CodeGen][NewPM] Extract MachineFunctionProperties modification part to an RAII class (#94854)
Modifying MachineFunctionProperties in PassModel makes `PassT P; P.run(...);` not work properly. This is a necessary compromise.
Commit 8e9c6bf
-
[clang-format] Don't count template template parameter as declaration (#95025)
In ContinuationIndenter::mustBreak, a break is required between a template declaration and the function/class declaration it applies to, if the template declaration spans multiple lines. However, this also includes template template parameters, which can cause extra erroneous line breaks in some declarations. This patch makes template template parameters not be counted as template declarations.
Fixes llvm/llvm-project#93793
Fixes llvm/llvm-project#48746
Commit 4a7bf42
-
[NewPM] Add deduction guide to `MFPropsModifier` to suppress warning (#96384)
Buildbot `clang-ppc64le-rhel` failed with:
```sh
error: 'MFPropsModifier' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
note: add a deduction guide to suppress this warning
```
after #94854. This PR adds a deduction guide explicitly to suppress the warning.
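A minimal sketch of the kind of fix this warning asks for, assuming a C++17 compiler. `PropsGuard` and `DummyPass` are hypothetical stand-ins, not the real `MFPropsModifier` or any LLVM pass; the point is only that a user-provided deduction guide marks class template argument deduction as intended.

```cpp
#include <type_traits>

// Hypothetical RAII-style helper; a stand-in, not the real MFPropsModifier.
template <typename PassT>
class PropsGuard {
public:
  explicit PropsGuard(PassT &P) : Pass(P) {}
  PassT &pass() const { return Pass; }

private:
  PassT &Pass;
};

// User-provided deduction guide. Its presence tells the compiler that
// deducing PassT from the constructor argument is intentional.
template <typename PassT>
PropsGuard(PassT &) -> PropsGuard<PassT>;

// Hypothetical pass type used only to exercise the deduction.
struct DummyPass {};

inline bool deducesExpectedType() {
  DummyPass P;
  PropsGuard Guard(P); // Template argument is deduced as DummyPass.
  return std::is_same_v<decltype(Guard), PropsGuard<DummyPass>>;
}
```

Without the guide the deduction still works through the implicit guides; the explicit one only documents the intent, which is what `-Wctad-maybe-unsupported` checks for.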
Commit 4145ad2
-
Revert "[clang-format] Don't count template template parameter as declaration" (#96388)
Reverts llvm/llvm-project#95025; many bots are broken.
Commit 34d44eb
-
[lldb] Resolve executables more aggressively on the host
When unifying the ResolveExecutable implementations in #96256, I missed that RemoteAwarePlatform was able to resolve executables more aggressively. The host platform can rely on the current working directory to make relative paths absolute and resolve things like home directories. This should fix command-target-create-resolve-exe.test.
Commit c3fe1c4
-
[clang][utils] Remove ClangDataFormat.py for now (#96385)
This formatter doesn't currently provide much value. It only formats `SourceLocation` and `QualType`. The only formatting it does for `QualType` is call `getAsString()` on it. The main motivator for the removal however is that the formatter implementation can be very slow (since it uses the expression evaluator in non-trivial ways). Not infrequently do we get reports about LLDB being slow when debugging Clang, and it turns out the user was loading `ClangDataFormat.py` in their `.lldbinit` by default. We should eventually develop proper formatters for Clang data-types, but these are currently not ready. So this patch removes them in the meantime to avoid users shooting themselves in the foot, and giving the wrong impression of these being reference implementations.
Commit 0fccae9
-
Commit 485d7ea
-
[InstCombine] (uitofp bool X) * Y --> X ? Y : 0 (#96216)
Fold `mul (uitofp i1 X), Y` to `select i1 X, Y, 0.0` when the `mul` is `nnan` and `nsz`. Proof: https://alive2.llvm.org/ce/z/_stiPm
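To see why both flags matter, here is a small sketch that models the two forms with plain C++ `float` arithmetic, assuming IEEE-754 behavior and no fast-math compiler options. The function names are illustrative only and are not LLVM APIs.

```cpp
#include <cmath>

// Multiply form: convert the bool to float (0.0 or 1.0), then multiply.
inline float mulForm(bool X, float Y) { return static_cast<float>(X) * Y; }

// Select form: pick Y when X is true, otherwise +0.0.
inline float selectForm(bool X, float Y) { return X ? Y : 0.0f; }

// Strict comparison: NaN matches only NaN, and the sign of zero matters.
inline bool sameValue(float A, float B) {
  if (std::isnan(A) || std::isnan(B))
    return std::isnan(A) && std::isnan(B);
  return A == B && std::signbit(A) == std::signbit(B);
}
```

For ordinary inputs the two forms agree. With `X == false` and a negative `Y`, the multiply form gives `-0.0` while the select form gives `+0.0`, which is the difference `nsz` permits. With `X == false` and an infinite `Y`, the multiply form gives NaN, which `nnan` rules out.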
Commit a4ca225
-
[clang][Interp] Fix CFStringMakeConstantString etc. evaluation
We're ultimately expected to return an APValue simply pointing to the CallExpr, not any useful value. Do that by creating a global variable for the call.
Commit 170c194
-
[MC] AttemptToFoldSymbolOffsetDifference: remove MCDummyFragment chec…
Commit 8fa4fe1
-
[ARM64EC] Fix thunks for vector args (#96003)
The checks when building a thunk to decide if an arg needed to be cast to/from an integer or redirected via a pointer didn't match how arg types were changed in `canonicalizeThunkType`. This caused LLVM to ICE when using vector types as args due to incorrect types in a call instruction. Instead of duplicating these checks, we should check if the arg type differs between x64 and AArch64 and then cast or redirect as appropriate.
Commit 2c9c22c
-
[clang-format] Don't count template template parameter as declaration…
Commit 6621505
-
[Support] Avoid warning about possibly uninitialized variable in format_provider (#95704)
The original implementation of HelperFunctions::consumeHexStyle always sets Style when it returns true, but this is difficult for a compiler to understand since it requires seeing that Str starts with either an "x" or an "X" when starts_with_insensitive("x") returns true. In particular, g++ 12 warns that HS may be used uninitialized in the format_provider::format caller. Change HelperFunctions::consumeHexStyle to return an optional HexPrintStyle and to make more explicit the fact that Str necessarily starts with an "X" when all other cases do not apply. This helps both the compiler and the human reader of the code.
Co-authored-by: Sven Verdoolaege
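The pattern described above, returning an optional value instead of a `bool` plus an out-parameter, can be sketched as follows. This is a simplified illustration using `std::string_view`; `HexStyle`, `consumeFront`, and `parseHexStyle` are hypothetical names and do not reproduce the LLVM implementation.

```cpp
#include <optional>
#include <string_view>

// Hypothetical style enum, loosely modeled on the idea of a hex print style.
enum class HexStyle { Upper, Lower, PrefixUpper, PrefixLower };

// Removes Prefix from the front of Str when present; reports whether it did.
inline bool consumeFront(std::string_view &Str, std::string_view Prefix) {
  if (Str.substr(0, Prefix.size()) != Prefix)
    return false;
  Str.remove_prefix(Prefix.size());
  return true;
}

// Every successful path returns a value, so the caller never reads a
// variable that some path might have left unset.
inline std::optional<HexStyle> parseHexStyle(std::string_view &Str) {
  if (consumeFront(Str, "x-"))
    return HexStyle::Lower;
  if (consumeFront(Str, "X-"))
    return HexStyle::Upper;
  if (consumeFront(Str, "x+") || consumeFront(Str, "x"))
    return HexStyle::PrefixLower;
  if (consumeFront(Str, "X+") || consumeFront(Str, "X"))
    return HexStyle::PrefixUpper;
  return std::nullopt;
}
```

A caller can then write `if (auto Style = parseHexStyle(Str))` and use `*Style` only inside the branch, which both the compiler and a reader can follow without reasoning about the string contents.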
Commit ca5ba2e
-
Commit fc23564
-
[MC] Allocate MCFragment with a bump allocator
#95197 and 7500646 eliminated all raw `new MCXXXFragment`. We can now place fragments in a bump allocator. In addition, remove the dead `Kind == FragmentType(~0)` condition. ~CodeViewContext may call `StrTabFragment->destroy()` and needs to be reset before `FragmentAllocator.Reset()`. Tested by llvm/test/MC/COFF/cv-compiler-info.ll using asan.
Pull Request: llvm/llvm-project#96402
Commit 8cb6e58
-
[MC] Move computeBundlePadding closer to its only caller. NFC
There is only one caller after #95188.
Commit c9f6a5e
-
Commit f5b93ae
-
Commit 3ba7599
-
[Serialization] Change input file content hash from size_t to uint64_t
https://reviews.llvm.org/D67249 added content hash (see -fvalidate-ast-input-files-content) using llvm::hash_code (size_t). The hash value is 32-bit on 32-bit systems, which was unintentional. Fix #96379: #96136 switched the hash function to xxh3_64bit but did not update the ContentHash type, leading to mismatch between ASTReader and ASTWriter.
Commit f3005d5
Commits on Jun 23, 2024
-
[HLSL][clang] Add elementwise builtins for trig intrinsics (#95999)
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This is part 3 of 4 PRs. It sets the ground work for using the intrinsics in HLSL. Add HLSL frontend apis for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh` llvm/llvm-project#70079 llvm/llvm-project#70080 llvm/llvm-project#70081 llvm/llvm-project#70083 llvm/llvm-project#70084 llvm/llvm-project#95966
Commit f73ac21
-
Commit 05ba5c0
-
Commit 61c4d7b
-
[InstCombine] Improve coverage of `foldSelectValueEquivalence` for non-constants
If f(Y) simplifies to Y, replace with Y. This requires Y to be non-undef. Closes #94719
Commit b37a4b9
-
[MC] Change Subsection parameters from const MCExpr * to uint32_t
Follow-up to 05ba5c0. uint32_t is preferred over const MCExpr * in the section stack uses because it should only be evaluated once. Change the parameter type to match.
Commit 95f983f
-
Commit 6ec1ddf
-
Commit e7622ab
-
[mlir][NVVM] Disallow results on kernel functions (#96399)
Functions that have the `nvvm.kernel` attribute should have 0 results.
Commit 346c4a8
-
[mlir][GPUToNVVM] Fix memref function args/results (#96392)
The `gpu.func` op lowering accounts for memref arguments/results (both "normal" and bare-pointer supported), but the `gpu.return` op lowering did not. The lowering produced invalid IR that did not verify. This commit uses the same lowering strategy as for `func.return` in the `gpu.return` lowering. (The C++ implementation is copied. We may want to share some code between `func` and `gpu` lowerings in the future.)
Commit 3f33d2f
-
AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (#95592)
Define subtarget features for atomic fmin/fmax support. The flat/global support is a real mess. We had float/double support at the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them. gfx11 removed the f64 versions again. gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.
Commit a440a96
-
Commit 414c741
-
AMDGPU: Remove ds atomic fadd intrinsics (#95396)
These have been replaced with atomicrmw fadd.
Commit 70c8b9c
-
Commit 3ae6755
-
[InstCombine] fold `(a == c && b != c) || (a != c && b == c)` to `(a == c) == (b != c)` (#94915)
Resolves llvm/llvm-project#92966
Alive proof: https://alive2.llvm.org/ce/z/bLAQBS
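The equivalence behind this fold can be spot-checked with a brute-force sketch in plain C++. It is an illustration over a small integer range, not a substitute for the Alive proof; the function names are made up for the example.

```cpp
// Original pattern: exactly one of a and b equals c.
inline bool originalForm(int A, int B, int C) {
  return (A == C && B != C) || (A != C && B == C);
}

// Folded pattern: comparing the two predicates directly.
inline bool foldedForm(int A, int B, int C) {
  return (A == C) == (B != C);
}

// Exhaustively compares both forms for all triples in [-2, 2].
inline bool formsAgreeOnSmallRange() {
  for (int A = -2; A <= 2; ++A)
    for (int B = -2; B <= 2; ++B)
      for (int C = -2; C <= 2; ++C)
        if (originalForm(A, B, C) != foldedForm(A, B, C))
          return false;
  return true;
}
```

Writing p for `a == c` and q for `b == c`, both forms reduce to "p differs from q", which is why the single comparison is enough.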
Commit f7fc72e
-
[OpenMP][LLVM] Clone `omp.private` op in the parent module (#96024)
Fixes a crash uncovered by test 0007_0019.f90 in the Fujitsu test suite. Previously, in the `PrivCB`, we cloned the `omp.private` op without inserting it in the parent module of the original op. This causes issues whenever there is an op that needs to look up the parent module (e.g. for symbol lookup). This PR fixes the issue by cloning in the parent module instead of creating an orphaned op.
Commit f372bb4
-
[RISCV] Relax RISCVInsertVSETVLI output VL peeking to cover registers (#96200)
If the AVL in a VSETVLIInfo is the output VL of a vsetvli with the same VLMAX, we treat it as the AVL of said vsetvli. This allows us to remove a true dependency as well as treating VSETVLIInfos as equal in more places and avoid toggles. We do this in two places, needVSETVLI and computeInfoForInstr. However we don't do this in computeInfoForInstr's vsetvli equivalent, getInfoForVSETVLI. We also have a restriction only in computeInfoForInstr that the AVL can't be a register as we want to avoid extending live ranges.
This patch does two interlinked things:
1) It adds this AVL "peeking" to getInfoForVSETVLI.
2) It relaxes the constraint that the AVL can't be a register in computeInfoForInstr, since it removes a use of the output VL which can actually reduce register pressure. E.g. see the diff in @vector_init_vsetvli_N and @test6.
Now that getInfoForVSETVLI and computeInfoForInstr are consistent, we can remove the check in needVSETVLI. We also need to update how we update LiveIntervals in insertVSETVLI, as we can now end up needing to extend the LiveRange of the AVL across blocks.
Commit eb76bc3
-
Commit 8990763
-
Commit c19028f
-
Commit 48cf6b6
-
[mlir][memref] Improve `memref.subview` type inference (#96421)
The `memref.subview` result type inference (`SubViewOp::inferResultType`) sometimes used to produce a dynamic offset when a static offset is possible. When a dynamic value (stride, size, etc.) is multiplied with zero, the result is always a "static 0". Based on this, the result type inference implementation can be improved to produce more static type information in memref types.
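The "dynamic times zero is a static 0" observation can be illustrated with a small sketch that models a possibly-dynamic value as `std::optional<int64_t>`. This is an assumption made for the example; `Dim`, `mulDim`, and `addDim` are hypothetical helpers and not MLIR APIs.

```cpp
#include <cstdint>
#include <optional>

// A statically known value, or std::nullopt when the value is dynamic.
using Dim = std::optional<int64_t>;

// Product of two values: a static zero on either side makes the result a
// static zero even if the other operand is dynamic.
inline Dim mulDim(Dim A, Dim B) {
  if ((A && *A == 0) || (B && *B == 0))
    return 0;
  if (A && B)
    return *A * *B;
  return std::nullopt;
}

// Sum of two values: static only when both operands are static.
inline Dim addDim(Dim A, Dim B) {
  if (A && B)
    return *A + *B;
  return std::nullopt;
}
```

With these helpers, an offset computed as `addDim(3, mulDim(std::nullopt, 0))` stays static (3), whereas a naive rule that treats any dynamic operand as contagious would report it as dynamic.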
Commit 6dc8de7
-
[MC] Ensure all new sections have a MCDataFragment
MCAssembler::layout ensures that every section has at least one fragment, which simplifies MCAsmLayout::getSectionAddressSize (see e73353c from 2010). It's better to ensure the condition is satisfied at create time (COFF, GOFF, Mach-O) to simplify more fragment processing.
Commit 21fac2d
-
Commit 05d167f
-
[VPlan] Restructure code for BranchOnCond codegen. (NFCI)
Reorder code to exit early if the BranchOnCond isn't in an exiting block. This delays retrieving the parent region, which may not be present. Split off from llvm/llvm-project#92651.
Commit ab9c2b1
-
[VPlan] Rename Preheader -> Entry in createInitialVPlan (NFCI).
Split off from llvm/llvm-project#92651 as suggested.
Commit 31a94bd
-
[VPlan] Rewrite cloneSESE to use 2 depth-first passes (NFCI).
Rewrite cloneSESE to perform 2 depth-first passes with the first one cloning blocks and the second one updating the predecessors and successors. This is needed to preserve the correct predecessor/successor ordering with llvm/llvm-project#92651 and has been split off as suggested.
Commit ef1773a
-
[MC,COFF] Register .llvm.call-graph-profile in finalizeImpl
All sections should have been created before MCAssembler::layout so that every section has an ordinal. Registering the section in WinCOFFObjectWriter::executePostLayoutBinding is a hack.
Commit de0d482
-
[libc++][NFC] Replace _NOEXCEPT and _LIBCPP_CONSTEXPR macros with the keywords in C++11 code (#96387)
Commit 1f98ac0
-
Commit b704868
-
[AArch64] Fix argument passing in reserved registers for preserve_nonecc (#96259)
These registers include:
- X19, used by LLVM as the base pointer
- X15 on Windows, where it is used for stack allocation. It can still be used on Linux/Darwin.
- Adjust FrameLowering scratch register code to not assume X9 is available if the calling convention is preserve_nonecc. The code will then pick an unused register as scratch, and allow X9 to continue being used for argument passing.
Commit f05fa6e
-
[MC] Avoid some registerSection calls
changeSection is preferred to call the changeSectionImpl hook, which registers the section symbol.
Commit abbf3bc
-
[RISCV] Mark all registers marked isConstant as reserved (#96002)
This makes use of the information from TableGen instead of duplicating it in the code.
Commit 6082a50
-
[MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl
Follow-up to https://reviews.llvm.org/D128958
* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
Commit 9d63506
-
[NC,MIPS] Avoid some registerSection calls
Similar to abbf3bc. switchSection calls registerSection internally.
Commit cde799f
-
[ProfileData] Use std::array for ValueProfData (#96440)
This patch uses std::array for ValueProfData. Aside from reducing the line count and code duplication, the use of std::array here makes it easier to add a new type of value profiling without touching as many places.
Commit 329e6b4
-
[MC,COFF] Remove unneeded BeginSymName
When `BeginSymName` is not null, `createTempSymbol` is called but the created symbol is not attached to a fragment. This is used as a hack to get some DWARF tests to work. In the future, we should repurpose `BeginSymbol` as the section symbol in ELF.
Commit a3cf14a
Commits on Jun 24, 2024
-
[MC] Remove unneeded setBeginSymbol. NFC
getELFSection ensures that the section symbol exists.
Commit 905c58f
-
[BPI] Use BasicBlock::isEHPad() to check exception handling block. (#95771)
There is no need to iterate over all predecessors of the current block to check whether it is the invoke unwind destination of any predecessor. We can directly call `BasicBlock::isEHPad()` to check if the current block is an exception handling block.
Commit 8f49dab
-
[Serialization] Register Special types before register decls
We only register top-level types and decls in ASTWriter, and we register the sub types and decls while writing types and decls. As a result, the ID for types in the sub level can differ if the decl-writing process changes the order of the to-be-emitted type queues. This is not ideal since it causes unnecessary changes, especially in the no-transitive-changes model. This patch mitigates the issue by registering special types before registering decls. This makes sure that the special types in the 2nd top level can be registered earlier than the decls. But it might still be problematic if there are more levels in the special types. Luckily we just don't have such special types.
Commit 1ecc5ae
-
[Serialization] Revert specialization for DenseMapInfo<GlobalDeclID>::getHashValue
The FIXME says to revert this when the underlying issue got fixed. The underlying issue has now been fixed in llvm/llvm-project#95734, so it should be fine to revert that one now.
Commit 4061354
-
Revert "[MC] Move ELFWriter::createMemtagRelocs to AArch64ELFStreamer::finishImpl"
This reverts commit 9d63506. There is a heap-use-after-free.
Commit 5997e7d
-
[InstSimplify] Provide information about the range of possible values that `ucmp`/`scmp` can return (#96410)
This makes it possible to fold dumb comparisons like `ucmp(x, y) == 7`.
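As an illustration of why such a comparison folds, here is a sketch that models a three-way unsigned comparison in C++, assuming the usual result convention of -1, 0, or 1. The `ucmp` function below is a hypothetical model, not the LLVM intrinsic itself.

```cpp
#include <cstdint>

// Model of a three-way unsigned comparison: -1 if X < Y, 0 if equal, 1 if X > Y.
constexpr int ucmp(uint32_t X, uint32_t Y) {
  return X < Y ? -1 : (X > Y ? 1 : 0);
}

// The only values the model can produce lie in the closed range [-1, 1].
constexpr bool inResultRange(int V) { return V >= -1 && V <= 1; }

// 7 lies outside the result range, so `ucmp(x, y) == 7` is false for any x, y.
static_assert(!inResultRange(7), "7 is never a possible result");
static_assert(ucmp(1, 2) == -1 && ucmp(2, 2) == 0 && ucmp(3, 2) == 1,
              "model covers all three outcomes");
```

Once the result range is known, a comparison against any constant outside it can be replaced by a constant `false` (or `true` for `!=`) without looking at the operands.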
Commit ffec315
-
[MC] Move ELFWriter::createMemtagRelocs to AArch64TargetELFStreamer::finish
Follow-up to https://reviews.llvm.org/D128958
* Move target-specific code away from the generic ELFWriter.
* All sections should have been created before MCAssembler::layout.
* Remove one `registerSection` use, which should be considered private to MCAssembler.
Commit fec1b6f
-
[ARM] Move ARMELFStreamer::finishImpl to ARMTargetELFStreamer::finish. NFC
ELFStreamer::finishImpl is not intended to be further overridden.
Commit efdb91e
-
Commit a9ac319
-
[mlir][intrange] Fix inference of zero-trip loop bound (#96429)
When lower bound and exclusive upper bound of a loop are the same, and the zero-trip loop is not canonicalized away before the analysis, this leads to a meaningless range for the induction variable being inferred. This patch adds a check to make sure that the inferred range for the IV is meaningful before updating the analysis state. Fix llvm/llvm-project#94423
Commit b78883f
-
[NFC] [Serialization] Refactor getLocalDeclID to 'LocalDeclID::get'
I just realized that the name `getLocalDeclID` looks like a member function of ASTReader, which reads poorly. So I decided to refactor it into a static member function of LocalDeclID.
Commit 79b0966
-
[mlir][Conversion] `FuncToLLVM`: Simplify bare-pointer handling (#96393)
Before this commit, there used to be a workaround in the `func.func`/`gpu.func` op lowering when the bare-pointer calling convention is enabled. This workaround "patched up" the argument materializations for memref arguments. This can be done directly in the argument materialization functions (as the TODOs in the code base indicate). This commit effectively reverts to the old implementation (a664c14) and adds additional checks to make sure that bare pointers are used only for function entry block arguments.
Commit 9e8ccf6
-
[AMDGPU][SplitModule] Allow non-kernels to be treated as roots (#95902)
I initially assumed only kernels could be roots, but that is wrong. A function with no callers also needs to be a root to ensure it is correctly handled. They're very rare because we usually internalize everything, and internal functions with no callers would be deleted. When they are present, we need to also consider their dependencies and act accordingly. Previously, we could put a function "by default" in P0, but it could call another function with internal linkage defined in another module which was of course incorrect. Fixes SWDEV-467695
Commit 1c025fb
-
[lldb] Merge CompilerContextKind::{Class,Struct} (#96145)
Our DWARF parsing code treats structures and classes as interchangeable. CompilerContextKind is used when looking up DIEs for types. This makes sure they're always treated the same way. See also [#95905#discussion_r1645686628](llvm/llvm-project#95905 (comment)).
Commit 599ca71
-
[clang][analyzer] Add notes to PointerSubChecker (#95899)
Notes are added to indicate the array declarations of the arrays in a found invalid pointer subtraction.
Commit c43d5f5
-
Revert "[AMDGPU]Optimize SGPR spills (#93668)"
This reverts commit 4b9112e. A separate issue (#96353) has been opened to keep track of it.
Commit c2fc7f7
-
[LV] Add test showing cost is computed when there are no vector plans.
Add test showing unnecessary cost computations, as no vector VPlans are generated.
Commit f0c674f
-
[X86][Driver] Enable feature cf for -mapxf
This is a follow-up to #78901 after validation.
Commit 45a7af7
-
[mlir][vector] Fix FlattenGather for scalable vectors (#96074)
This pattern flattens vector.gather ops by unrolling the outermost dimension for rank > 2 vectors. There are two issues with this pattern for scalable vectors:
1. The unrolling doesn't take vscale into account. A constraint is added to disable this pattern for vectors with leading scalable dims.
2. The scalable dims are dropped when creating the new gather. Fixed by propagating the flags.
Depends on #96049.
Commit 9931ee6
-
[VPlan] Don't compute costs if there are no vector VPlans.
In some cases, no vector VPlans can be constructed due to failing VPlan legality checks (e.g. unable to perform sinking for first order recurrences or plans being incompatible with EVL). There's no need to compute costs in those cases, so check directly if there are no vector plans.
Commit abf5969
-
[IR] Lazily initialize the class to pass name mapping (NFC) (#96321)
Initializing this map is somewhat expensive (especially for O0), so we currently only do it if certain flags are used. I would like to make use of it for crash dumps (#96078), where we don't know in advance whether it will be needed or not. This patch changes the initialization to a lazy approach, where a callback is registered that does the actual initialization. The callbacks will be run the first time the pass name is requested. This way there is no compile-time impact if the mapping is not used.
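The lazy approach described above can be sketched as a map that stores registration callbacks and runs them only on the first lookup. This is a generic illustration using the standard library; `LazyNameMap` and its members are hypothetical and do not mirror the LLVM class involved.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class-name to pass-name map that is populated on first use.
class LazyNameMap {
public:
  using Map = std::map<std::string, std::string>;

  // Registering a callback is cheap: nothing is inserted into the map yet.
  void registerCallback(std::function<void(Map &)> Fn) {
    Callbacks.push_back(std::move(Fn));
  }

  // The first lookup runs all pending callbacks; later lookups skip them.
  std::string lookup(const std::string &ClassName) {
    for (auto &Fn : Callbacks)
      Fn(Names);
    Callbacks.clear();
    auto It = Names.find(ClassName);
    return It == Names.end() ? std::string() : It->second;
  }

private:
  Map Names;
  std::vector<std::function<void(Map &)>> Callbacks;
};
```

If `lookup` is never called, the callbacks never run, so the cost of building the map is paid only by users who actually need a name.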
Commit 957dc43
-
[SPIR-V]: Add SPIR-V extension: SPV_KHR_cooperative_matrix (#96091)
This PR adds SPIR-V extension SPV_KHR_cooperative_matrix that "adds a new set of types known as "cooperative matrix" types, where the storage for and computations performed on the matrix are spread across a set of invocations such as a subgroup" (see https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_cooperative_matrix.asciidoc). This PR also fixes llvm/llvm-project#96170; a new test case is attached (llvm/test/CodeGen/SPIRV/transcoding/OpPtrCastToGeneric.ll).
Commit 57f7937
-
[NFC] [Modules] Extract the logic to decide whether the module units belong to the same module
This patch extracts the logic that decides whether module units belong to the same module into a member function of ASTContext. This is helpful for refactoring the implementation in the future.
Commit 790f931
-
[Passes] Try to fix build on windows
Some passes reference *this (inside decltype) which fails with MSVC. Fix this by not explicitly specifying the captures (otherwise we would get an unused lambda capture warning for cases where this is *not* used).
Commit e7137f2
Commit 6eaf204
[clang][Interp] Fix variable initialization in inactive regions
When the EvalEmitter is inactive, it will simply not evaluate any of the operations we emit via emit*. However, it will still allocate variables. So the variables will be allocated, but we won't evaluate their initializer, so later when we see the variable again, it is uninitialized. Stop creating variables in that case.
Commit 33676ba
[flang] harden TypeAndShape for assumed-ranks (#96234)
SIZEOF and C_SIZEOF were broken for assumed-ranks because `TypeAndShape::MeasureSizeInBytes` treated them as scalars: the `TypeAndShape::shape_` member was the same for scalars and assumed-ranks. The easy fix would have been to add special handling in `MeasureSizeInBytes` for assumed-ranks using the TypeAndShape attributes, but I think this solution would leave `TypeAndShape::shape_` manipulation fragile to future developers. Hence, I went for the solution that turns shape_ into a `std::optional<Shape>`.
Commit 73cf014
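The idea behind making `shape_` a `std::optional<Shape>` can be illustrated with a small standalone sketch. The types here are simplified assumptions (flang's real `Shape` holds extent expressions and `MeasureSizeInBytes` returns an expression, not an integer); `TypeAndShapeSketch` and `measureSizeInBytes` are hypothetical names.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Simplified assumption: one constant extent per dimension.
using Shape = std::vector<std::size_t>;

struct TypeAndShapeSketch {
  std::size_t elementBytes;
  // nullopt means assumed-rank (rank unknown); an empty Shape means scalar.
  std::optional<Shape> shape;
};

// Returns the total size, or nullopt when the rank is not known.
std::optional<std::size_t> measureSizeInBytes(const TypeAndShapeSketch &t) {
  if (!t.shape)
    return std::nullopt; // assumed-rank can no longer be mistaken for a scalar
  std::size_t elements = 1;
  for (std::size_t extent : *t.shape)
    elements *= extent;
  return elements * t.elementBytes;
}
```

With a plain `Shape`, the scalar and assumed-rank cases would both be an empty vector; the optional forces every user of the member to handle the assumed-rank case explicitly.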
Revert "[IR] Lazily initialize the class to pass name mapping (NFC) (…
Commit e5a41f0
Merge from 'main' to 'sycl-web' (91 commits)
CONFLICT (content): Merge conflict in llvm/lib/IR/Verifier.cpp
Commit faf72a2
[CMake][libclc] Improve dependencies to avoid build errors (#95018)
With the Makefile generator and particularly high build parallelism some intermediate dependencies may be generated redundantly and concurrently, leading to build failures. To fix this, arrange for libclc's add_custom_commands to depend on targets in addition to files.
This follows CMake documentation's[^1] guidance on add_custom_command:
> Do not list the output in more than one independent target that may
> build in parallel or the instances of the rule may conflict. Instead,
> use the add_custom_target() command to drive the command and make the
> other targets depend on that one.
Eliminating the redundant commands also improves build times.
[^1]: https://cmake.org/cmake/help/v3.29/command/add_custom_command.html
Commit 6479604
Commit 090e0c4
[C++20] [Modules] Avoid comparing primary module name to decide isInSameModule all the time
Previously, we decided whether two module units are in the same module by comparing the names of their primary module interfaces. Always comparing the strings is inefficient, so it is better to avoid the expensive string operations where possible. This patch introduces a `llvm::StringMap` that maps a primary module name to a Module* and a `llvm::DenseMap<Module*, Module*>` that maps a Module* to a representative Module*. The representative Module* is one of the module units belonging to a given module, and module units that have the same representative Module* belong to the same module. The representative Module* is chosen by the first module lookup for a given primary module name, so later module units with the same primary module name get the same representative module. As a result, the primary module name is hashed only once for every module.
Commit 2232881
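The two-map scheme from the commit message above can be sketched with standard containers. This is an illustration only: `Module` is reduced to a name, `SameModuleCache` is a hypothetical class, and `std::unordered_map` stands in for `llvm::StringMap` and `llvm::DenseMap`.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical, heavily simplified module unit.
struct Module {
  std::string PrimaryName;
};

class SameModuleCache {
public:
  bool isInSameModule(const Module *A, const Module *B) {
    if (A == B)
      return true;
    // After the first query, this is a pointer comparison plus pointer-keyed lookups.
    return representative(A) == representative(B);
  }

private:
  const Module *representative(const Module *M) {
    auto Cached = Representative.find(M);
    if (Cached != Representative.end())
      return Cached->second;
    // First time M is seen: the only place its primary name is hashed.
    // The first unit looked up under a name becomes that name's representative.
    const Module *Rep = ByPrimaryName.try_emplace(M->PrimaryName, M).first->second;
    Representative.emplace(M, Rep);
    return Rep;
  }

  std::unordered_map<std::string, const Module *> ByPrimaryName;
  std::unordered_map<const Module *, const Module *> Representative;
};
```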
Prefer to check .empty() instead of .size() == 0
Commit 3b6462c
[Docs][Clang] Missing DR status for C++23-era papers in cxx_status.html (#68846)
List the following C++23-era WG21 papers as Defect Reports in cxx_status.html as per WG21 meeting minutes.
- [P1949R7](https://wg21.link/p1949r7) (C++ Identifier Syntax using Unicode Standard Annex 31)
- [P2156R1](https://wg21.link/p2156r1) (Allow Duplicate Attributes)
- [P2036R3](https://wg21.link/p2036r3) (Change scope of lambda _trailing-return-type_)
- [P2468R2](https://wg21.link/p2468r2) (The Equality Operator You Are Looking For)
- [P2327R1](https://wg21.link/p2327r1) (De-deprecating `volatile` compound operations)
- [P2493R0](https://wg21.link/p2493r0) (Missing feature test macros for C++20 core papers)
- [P2513R3](https://wg21.link/p2513r3) (`char8_t` Compatibility and Portability Fix)
- [P2460R2](https://wg21.link/p2460r2) (Relax requirements on `wchar_t` to match existing practices)
- [P2579R0](https://wg21.link/p2579r0) (Mitigation strategies for [P2036](https://wg21.link/p2036) "Changing scope for lambda _trailing-return-type_")
Commit 2151ba0
[mlir][vector] Support n-D vectors in i8 to i4 trunci emulation (#94946)
Previously, this only supported 1-D vectors via vector.shuffle; with the new vector.deinterleave, this can be updated to support n-D vectors.
Commit 137a745
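As a rough scalar model of what an i8 to i4 truncation emulation computes, the sketch below splits the input into even- and odd-indexed elements (the role a deinterleave plays), keeps the low four bits of each, and packs each pair into one byte. The nibble order (even element in the low nibble) and the function name `truncI8ToPackedI4` are assumptions for illustration, and the sketch models only one innermost dimension, not the MLIR rewrite itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs 2N i8 values into N bytes, each holding two truncated i4 values.
std::vector<std::uint8_t> truncI8ToPackedI4(const std::vector<std::int8_t> &In) {
  std::vector<std::uint8_t> Out;
  Out.reserve(In.size() / 2);
  for (std::size_t I = 0; I + 1 < In.size(); I += 2) {
    // Even-indexed elements supply low nibbles, odd-indexed ones high nibbles.
    unsigned Low = static_cast<std::uint8_t>(In[I]) & 0x0Fu;
    unsigned High = static_cast<std::uint8_t>(In[I + 1]) & 0x0Fu;
    Out.push_back(static_cast<std::uint8_t>(Low | (High << 4)));
  }
  return Out;
}
```

Because the same per-pair arithmetic applies along the innermost dimension regardless of the outer dimensions, an operation that deinterleaves n-D vectors is enough to lift the 1-D restriction.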
[OpenMP][LLVM] Fix access to reduction args of `omp.parallel`. (#96426)
Fix for Fujitsu test suite test: 0275_0032.f90. The MLIR to LLVM translation logic assumed that reduction arguments to an `omp.parallel` op are always the last set of arguments to the op. However, this is a wrong assumption since private args come afterward.
Commit b0bc2f6
[mlir][gpu] Add py binding for AsyncTokenType (#96466)
The PR adds py binding for `AsyncTokenType`
Commit f8ff909
[SourceManager] Expose max usage of source location space as a Statistic (#96292)
We have been running into source location exhaustion recently and want to use the statistics to monitor the usage in various files to be able to anticipate where the next problem will happen. I picked `Statistic` because it can be written into a structured JSON file and is easier to consume by further automation. This commit does not change any existing per-source-manager metrics exposed via `SourceManager::PrintStats()`. This does create some redundancy, but I also expect it to be non-controversial because it aligns with the intended use of `Statistic`.
Commit dfbfb6c
[AArch64] Consider streaming mode in TTI interfaces for vectorization. (#96305)
At the moment, vectorization is only enabled in streaming(-compatible) mode when enabled through an option. But the interfaces should check more than just 'hasSVE()', because a function with +sme in streaming mode should also vectorize with the option enabled. Additionally, a streaming-compatible function should only be able to use fixed-length autovec if SVE is available, otherwise the vector code will be scalarised by the backend.
Commit 738533c
[mlir][ArmSME] Disallow streaming mode for gathers/scatters (#96209)
Ideally, this would be based on target information (but we don't really have that), so this currently errs on the side of caution. If possible, gathers/scatters should be lowered to regular vector loads/stores before invoking enable-arm-streaming.
Commit 1b64ed0
[X86] Rename clz.ll -> ctlz.ll to match the intrinsic naming
I'll be splitting the ctlz/cttz tests into separate test files shortly
Commit fd5a177
Commit 145f36c
Remove reference to removed method. (#96315)
Methods were removed in dc37dc8.
Commit 53e577a
Commit 5a997c1
[Clang] Introduce `nonblocking`/`nonallocating` attributes (#84983)
Introduce `nonblocking` and `nonallocating` attributes. RFC is here: https://discourse.llvm.org/t/rfc-nolock-and-noalloc-attributes/76837 This PR introduces the attributes, with some changes in Sema to deal with them as extensions to function (proto)types. There are some basic type checks, most importantly, a warning when trying to spoof the attribute (implicitly convert a function without the attribute to one that has it). A second, follow-on pull request will introduce new caller/callee verification.
---------
Co-authored-by: Doug Wyatt <[email protected]>
Co-authored-by: Shafik Yaghmour <[email protected]>
Co-authored-by: Aaron Ballman <[email protected]>
Co-authored-by: Sirraide <[email protected]>
Commit f03cb00
[RISCV] Pretty print AVL register in VSETVLIInfo::dump. NFC
Currently the AVLReg is printed raw like {AVLReg=2147483668, ...}, this changes it to {AVLReg=%20, ...} which should be easier to read.
Commit a66900b
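The raw value in the message above decodes as expected: LLVM marks virtual registers by setting the top bit of the 32-bit register number, so 2147483668 is 2^31 + 20, i.e. virtual register 20. The sketch below reproduces that arithmetic in standalone C++; `formatReg` is a hypothetical helper (the real code goes through LLVM's own register printing), and printing non-virtual registers as plain numbers is a simplification.

```cpp
#include <cstdint>
#include <string>

// Top bit set marks a virtual register.
constexpr std::uint32_t VirtualRegFlag = 1u << 31;

bool isVirtualRegister(std::uint32_t Reg) {
  return (Reg & VirtualRegFlag) != 0;
}

// Hypothetical formatter: "%N" for virtual registers, the raw number otherwise.
std::string formatReg(std::uint32_t Reg) {
  if (isVirtualRegister(Reg))
    return "%" + std::to_string(Reg & ~VirtualRegFlag);
  return std::to_string(Reg);
}
```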
[clang][AArch64][FMV] Stop emitting alias to ifunc. (#96221)
Long story short, the interaction of two optimizations happening in GlobalOpt results in a crash. For more details look at the issue llvm/llvm-project#96197. I will be fixing this in GlobalOpt but it is a conservative solution since it won't allow us to optimize resolvers which return a pointer to a function whose definition is in another TU when compiling without LTO:
```
__attribute__((target_version("simd"))) void bar(void);
__attribute__((target_version("default"))) void bar(void);
int foo() { bar(); }
```
fixes: #96197
Commit 3d80792
[clang] [MinGW] Set a predefined __GXX_TYPEINFO_EQUALITY_INLINE=0 for MinGW targets (#96062)
libstdc++ requires this define to match what is predefined in GCC for the ABI of this platform; GCC hardcodes this define for all mingw configurations in gcc/config/i386/cygming.h. (It also defines __GXX_MERGED_TYPEINFO_NAMES=0, but that happens to match the defaults in libstdc++ headers, so there's no similar need to define it in Clang.) This fixes a Clang/libstdc++ interop issue discussed at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110572.
Commit 4e6c8f1
Commit 6b41de3
[C++20] [Modules] Diagnose redeclarations from different modules
[basic.link]/p10:
> If two declarations of an entity are attached to different modules,
> the program is ill-formed
But we only implemented the check for ODR. In this patch, we tried to diagnose the redeclarations from different modules.
Commit cc4ec6d
Commit 17e51d5
[AMDGPU] Set total VGPRs to 1536 for gfx12 (#96272)
- Use Feature1_5xVGPRs
Commit 689c5c4
Merge from 'sycl' to 'sycl-web' (4 commits)
iclsrc committed Jun 24, 2024
Commit f8cc33b
[lldb/DWARF] Optimize DIEToType handling (#96308)
- move type insertion from individual parse methods into ParseTypeFromDWARF
- optimize sentinel (TYPE_IS_BEING_PARSED) insertion to avoid double map lookup
- as this requires the map to not have nullptr values, I've replaced all `operator[]` queries with calls to `lookup`.
Commit 41a4db1
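The "avoid double map lookup" point from the list above can be shown with a single insert-or-find call: the same operation that would insert the sentinel also reports whether an entry already existed. The sketch below is a simplified assumption, not LLDB code: `TypeCache`, `parseType`, and the integer DIE offset key are hypothetical, and `std::unordered_map` stands in for the real map.

```cpp
#include <memory>
#include <unordered_map>
#include <vector>

struct Type {
  int DieOffset;
};

class TypeCache {
public:
  // Non-inserting query; unlike operator[], it never creates a nullptr entry.
  Type *lookup(int DieOffset) const {
    auto It = DieToType.find(DieOffset);
    return It == DieToType.end() ? nullptr : It->second;
  }

  Type *parseType(int DieOffset) {
    // One lookup both finds an existing entry and inserts the sentinel.
    auto [It, Inserted] = DieToType.try_emplace(DieOffset, beingParsed());
    if (!Inserted)
      return It->second == beingParsed() ? nullptr : It->second;
    ++ParseCount;
    // Stand-in for the real parsing work, which could add further map entries.
    Owned.push_back(std::make_unique<Type>(Type{DieOffset}));
    DieToType.at(DieOffset) = Owned.back().get();
    return Owned.back().get();
  }

  int parseCount() const { return ParseCount; }

private:
  static Type *beingParsed() {
    static Type Sentinel{-1};
    return &Sentinel;
  }

  std::unordered_map<int, Type *> DieToType;
  std::vector<std::unique_ptr<Type>> Owned;
  int ParseCount = 0;
};
```

The scheme only works if "no entry" and "entry with a null value" cannot be confused, which is why queries go through a non-inserting `lookup` instead of `operator[]`.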
Commit 9298e40
Commit 284fbf9
[Sema] Fix -Wunused-variable in SemaType.cpp (NFC)
/llvm-project/clang/lib/Sema/SemaType.cpp:7625:8: error: unused variable 'Success' [-Werror,-Wunused-variable]
  bool Success = FX.insert(NewEC, Errs);
       ^
1 error generated.
Commit efab4a3
[RISCV] Remove experimental from Ztso. (#96465)
Ztso 1.0 was ratified in January 2023. Documentation: https://github.com/riscv/riscv-isa-manual/blob/main/src/ztso-st-ext.adoc
Commit 9cd6ef4
[lldb] Fix TestDAP_runInTerminal for #96256
change the expected error msg.
Commit c053ec9
[PPC][InlineASM] Mark the 'a' constraint as unsupported (#96109)
'a' is an input/output constraint for restraining assembly variables to an indexed or indirect address operand. It previously was marked as supported but would throw an assertion for unknown constraint type in the back-end when this test case was compiled. This change marks it as unsupported until we can add full support for address operands constraining to the compiler code generation.
Commit b8979c6
[IndVars] Make pushIVUsers() a member function (NFC)
Make it easier to access additional state from it.
Commit 6e3725d
[clang] Skip auto-init on scalar vars that have a non-constant Init and no self-ref (#94642)
In that scalar case, the Init should initialize the auto var before use. The Init might use uninitialized memory from other sources (e.g., heap) but auto-init did not help us in that case because the auto-init would have been overwritten by the Init before use. For non-scalars e.g., classes, the Init expr might be a ctor call that leaves uninitialized members, so we leave the auto-init there. The motivation is to have less IR for the optimizer to later remove, which may not be until a fairly late pass (DSE) or may not get optimized in lower optimization levels like O1 (no DSE) or sometimes due to derefinement. This is ~10% less left-over auto-init in O1 in a few examples checked.
Commit 0cf1e66
Reapply [IR] Lazily initialize the class to pass name mapping (NFC) (#96321) (#96462)
On MSVC the `this` uses inside `decltype` require a lambda capture. On clang they result in an unused capture warning instead. Add the capture and suppress the warning with `(void)this`.
-----
Initializing this map is somewhat expensive (especially for O0), so we currently only do it if certain flags are used. I would like to make use of it for crash dumps (#96078), where we don't know in advance whether it will be needed or not. This patch changes the initialization to a lazy approach, where a callback is registered that does the actual initialization. The callbacks will be run the first time the pass name is requested. This way there is no compile-time impact if the mapping is not used.
Commit 5cd0ba3
[clang][ThreadSafety] Check trylock function success and return types (#95290)
With this change, Clang will generate errors when trylock functions have improper return types. Today, it silently fails to apply the trylock attribute to these functions, which may incorrectly lead users to believe they have correctly acquired locks before accessing guarded data. As a side effect of explicitly checking the success argument type, I seem to have fixed a false negative in the analysis that could occur when a trylock's success argument is an enumerator. I've added a regression test to warn-thread-safety-analysis.cpp named `TrylockSuccessEnumFalseNegative`. This change also improves the documentation with descriptions of the subtle gotchas that arise from the analysis interpreting the success arg as a boolean. Issue #92408
Commit c1bde0a
[C11] Claim we do not conform to WG14 N1285 yet
This also updates the status for C11 to be Partial, and because C17 is C11 plus DR resolutions, that makes C17 also Partial.
Commit 3ff680a
Commit ae1c564
[VectorUtils] Use poison instead of undef in findScalarElement()
Out-of-range extractelement returns poison, and so do poison elements in the shufflevector mask.
Commit 605e184
Commit 9b8c3c6
[C11] Remove WG14 N1353 from the list of papers to track
Only the first proposed changes in the paper were adopted, and that wording was changing "operations" into "operators", which is purely an editorial change.
Commit 29e0f04
Commit 69d0746
[C11] Remove WG14 N1382 from the list of papers to track
This paper proposes only changes to a footnote that had problematic implications for ABI; the changes were purely editorial.
Commit 6ecb9fd
[analyzer] Add an ownership change visitor to StreamChecker (#94957)
This is very similar to https://reviews.llvm.org/D105553; in fact, I barely made any changes from MallocChecker's ownership visitor to this one. The new visitor emits a diagnostic note for functions where a change in stream ownership was expected (for example, the function had an fclose() call), but the ownership remained unchanged. This is similar to messages regarding ordinary values ("Returning without writing to x").
Commit fc4b09d
Commit b644726
Commit 824113f
Commit 6258b5f
[clang][ThreadSafety] Fix code block syntax in ThreadSafetyAnalysis.rst (#96494)
Without a newline, documentation was failing to build with this error:
Warning, treated as error: /home/runner/work/llvm-project/llvm-project/clang-build/tools/clang/docs/ThreadSafetyAnalysis.rst:466:Error in "code-block" directive: maximum 1 argument(s) allowed, 10 supplied.
Issue #92408
Commit 3402620
Commit df9f479
[C23] Claim conformance to WG14 N3033
Clang has implemented __VA_OPT__ since Clang 12.
Commit b012ab0
[IR] Generate poison for all-poison scalable shufflevector mask
Ultimately doesn't matter because the bitcode reader interprets undef and poison interchangeably in this context.
Commit db9e9ea
[lldb][API] Add Find(Ranges)InMemory() to Process SB API (#95007)
Test Plan: llvm-lit llvm-project/lldb/test/API/python_api/find_in_memory/TestFindInMemory.py llvm-project/lldb/test/API/python_api/find_in_memory/TestFindRangesInMemory.py
Reviewers: clayborg
Tasks: lldb
Commit 10bd5ad
Reland "[mlir][spirv] Add a generic convert-to-spirv pass" (#96359)
This PR relands #95942, which was reverted in #96332 due to link failures. It fixes the issue by updating CMake dependencies. The bazel support, originally introduced in #96334, is also included in this PR. --------- Co-authored-by: Keith Smiley <[email protected]>
Commit 13c1fec
[C23] Remove WG14 N2660 from the list of papers we track
This paper was a clarification paper that made no normative changes to the wording, so we can lean on the C99 status for this.
Commit 3e36dfa
[SPIR-V]: Improve pattern matching to recognize a composite constant to be a constant (#96286)
This PR is to fix llvm/llvm-project#96285 by:
* improve pattern matching to recognize an aggregate constant to be a constant
* do not emit Bitcast for an aggregate type
Commit b0efde6
Revert "[RISCV] Remove experimental from Ztso. (#96465)"
This reverts commit 9cd6ef4. See discussion on review thread.
Commit f985a88
[NFC][CGSCC] Remove RCWorklist from CGSCCUpdateResult (#95448)
After #94815, this is only used within ModuleToPostOrderCGSCCPassAdaptor::run(), so keep it local to that function.
Commit b312cbf
[flang] Silence errors on C_LOC/C_FUNLOC in specification expressions (#96108)
Transformational functions from the intrinsic module ISO_C_BINDING are allowed in specification expressions, so tweak some general checks that would otherwise trigger error messages about inadmissible targets, dummy procedures in specification expressions, and pure procedures with impure dummy procedures.
Commit 3602efa
[flang] Better error reporting for MOD/MODULO/NEAREST (#96114)
When the second argument to these intrinsic functions is a scalar constant zero, emit a warning (if enabled) even if the first argument is not a constant.
Commit 317277e
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.
Commit 6481dc5
Add a unit test for SBBreakpoint::SetCallback (#96001)
This commit adds a unit test for SBBreakpoint::SetCallback as it wasn't being tested before.
Commit 347206f
[libc][arm] add malloc/free/aligned_alloc to entrypoints (#96516)
Necessary for arm32 cross full build.
Commit 9eba835
[flang] Add/fix some semantic checks for assumed-rank (#96194)
Catch some cases where assumed rank dummy arguments are not allowed.
Commit 9ab292d
Add support for using foreign type units in .debug_names. (#87740)
This patch adds support for the new foreign type unit support in .debug_names. Features include:
- don't manually index foreign TUs if we have info for them
- only use the type unit entries that match the .dwo files when we have a .dwp file
- fix type unit lookups for .dwo files
- fix crashers that happen due to PeekDIEName() using wrong offsets where an entry had DW_IDX_comp_unit and DW_IDX_type_unit entries and when we had no type unit support, it would cause us to think it was a normal DIE in .debug_info from the main executable.
---------
Co-authored-by: paperchalice <[email protected]>
Commit 3b5b814
Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc5.
Commit d75f9dd
[RISCV] Add back some test cases I inadvertently deleted. NFC
These tests were accidentally removed in a7a1195. I only meant to remove bfloat tests, but I accidentally removed f32 and f64 as well.
Commit 7601ae1
[LLD] [COFF] Don't crash on an empty -entry: argument (#96058)
We can't pass an empty string to addUndefined(). This fixes the crash that was encountered in llvm/llvm-project#93309 (turning the crash into a properly handled error; making it do the right thing is handled in llvm/llvm-project#96055).
Commit cb248f8
[flang][debug] Handle allocatable strings. (#95906)
The allocatable strings also use DIStringType but provide dwarf expressions to find the location and length of the string. With this change in place, the debugging of the allocatable strings looks like this:
    character(len=:), allocatable :: first
    character(len=:), allocatable :: second
    character(len=:), allocatable :: third
    first = 'Mount'
    second = 'Everest'
    third = first // " " // second
    print *, third
    (gdb) p third
    $1 = ""
    (gdb) n
    18 print *, third
    (gdb) p third
    $2 = 'Mount Everest'
    (gdb) ptype third
    type = character (13)
Commit 3a57925
[ELF] Postpone more linker script errors
Since `assignAddresses` is executed more than once, error reporting during `assignAddresses` would be duplicated. Generalize #66854 to cover more errors. Note: address-related errors exposed in one invocation might not be errors in another invocation. Pull Request: llvm/llvm-project#96361
Commit ee4c12f
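The postponement described in the linker script commit above can be pictured as follows: each address-assignment pass records its errors instead of printing them, discarding whatever the previous pass recorded, and only the errors of the final pass are reported. This is a sketch under assumptions, not lld's code: `LinkerScriptSketch`, its single overflow check, and `takeErrors` are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

class LinkerScriptSketch {
public:
  // May run several times; every run starts from a clean error list, so an
  // error seen only in an earlier pass is dropped and nothing is duplicated.
  void assignAddresses(std::uint64_t SectionEnd, std::uint64_t RegionEnd) {
    Recorded.clear();
    if (SectionEnd > RegionEnd)
      Recorded.push_back("section overflows its memory region");
  }

  // Called once after the last pass; returns the errors and empties the list.
  std::vector<std::string> takeErrors() { return std::exchange(Recorded, {}); }

private:
  std::vector<std::string> Recorded;
};
```

Clearing at the start of each pass reflects the note in the message that an address-related error exposed in one invocation might not be an error in another.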
Commit fc066ca
[flang][preprocessing] Mix preprocessing directives with free form line continuation (#96244)
Allow preprocessing directives to appear between a source line and its continuation, including conditional compilation directives (#if, #ifdef, &c.). Fixes llvm/llvm-project#95476.
Commit 5d15f60
[clang][Interp] Fix classifying __builtin_addressof() argument
It's an lvalue, so we need to use the classify() taking an expression.
Commit e6ec366
-
[flang][runtime] Better handling of "fort.N" opening errors (#96347)
When a data transfer statement references a unit number that hasn't been explicitly OPENed, the runtime I/O support library opens a local "fort.N" file where N is the unit number. If that name exists in the current working directory but is not a readable or writable file (as appropriate), the runtime needs to catch the error at the point of the READ or WRITE statement rather than leaving an open unit in the unit map without a valid file descriptor.
Commit eac925f
-
[flang][runtime] Interoperable POINTER deallocation validation (#96100)
Extend the runtime validation of deallocated pointers so that it also works when pointers are allocated &/or deallocated outside Fortran. Previously, bogus runtime errors would be reported for pointers allocated via CFI_allocate() and deallocated in Fortran, and CFI_deallocate() did not check that it was deallocating a whole contiguous pointer that was allocated as such.
Commit 514c1ec
-
[LLDB][Minidump] Add 64b support to LLDB's minidump file builder. (#95312)
Currently, LLDB does not support taking a minidump over the 4.2 GB limit imposed by uint32. In fact, it currently writes the RVAs and the headers to the end of the file, which can become corrupted because the header offset only supports a 32b offset.
This change reorganizes how the file structure is laid out. LLDB will precalculate the number of directories required and preallocate space at the top of the file to fill in later. Additionally, thread stacks require a 32b offset, so we provision empty descriptors and keep track of them to clean up once we write the 32b memory list.
For [MemoryList64](https://learn.microsoft.com/en-us/windows/win32/api/minidumpapiset/ns-minidumpapiset-minidump_memory64_list), the RVA to the start of the section itself will remain in a 32b addressable space. We achieve this by predetermining the space the memory regions will take, and only writing up to 4.2 GB of data with some buffer to allow all the MemoryDescriptor64s to also still be 32b addressable.
I did not add any explicit tests to this PR because allocating 4.2 GB+ to test is very expensive. However, we have 32b automation tests, and I validated this in several ways, including with a 5 GB+ array/object, and would be willing to add this as a test case.
Commit a27164c
-
[RISCV][GISEL] IRTranslator for Scalable Vector Store (#86699)
Support IR translation for scalable vector store
Commit 43d207a
-
[mlir][linalg] Implement patterns for reducing rank of named linalg contraction ops (#95710)
This patch introduces pattern rewrites for reducing the rank of named linalg contraction ops with unit spatial dim(s) to other named contraction ops. For example `linalg.batch_matmul` with batch size 1 -> `linalg.matmul` and `linalg.matmul` with unit LHS spatial dim -> `linalg.vecmat`, etc. These patterns don't support reducing the rank along the reduction dimension, as those don't convert to other named contraction ops.
Commit 431213c
-
Add flag to opt out of wasm-opt (#95208)
This PR fixes #55781 by adding the `--no-wasm-opt` and `--wasm-opt` flags in clang to disable/enable the `wasm-opt` optimizations. The default is to enable `wasm-opt` as before in order to not break existing workflows. I think that adding a warning when no flag or the `--wasm-opt` flag is given but `wasm-opt` wasn't found in the path may be relevant here. It would let people using `wasm-opt` know whether it has been applied to their produced binary. The only downside I see is that people already using the toolchain with the `-O` and `-Werror` flags but without `wasm-opt` in the path will see their toolchain break (with an easy fix: either add `--no-wasm-opt` or add `wasm-opt` to the path). I haven't implemented this here because I haven't figured out how to add such a warning, and I don't know if this warning should be added here or in another PR. CC @sunfishcode, who proposed in the associated issue to review this patch.
Commit 962d7ac
-
[llvm-readobj][ELF] Implement JSON output for --dynamic-table (#95976)
When printing JSON output with --dynamic-table I noticed that the output is invalid JSON. This patch overrides the printDynamicTable() function in the JSONELFDumper to return a list of dictionaries for the DynamicSection value.
Before the output was:
```
{
  "FileSummary": {
    "File": "bin/llvm-readelf",
    "Format": "elf64-x86-64",
    "Arch": "x86_64",
    "AddressSize": "64bit",
    "LoadName": "<Not found>"
  }DynamicSection [ (35 entries)
  Tag                Type    Name/Value
  0x000000000000001D RUNPATH Library runpath: [$ORIGIN/../lib:]
  0x0000000000000001 NEEDED  Shared library: [libm.so.6]
  0x0000000000000001 NEEDED  Shared library: [libz.so.1]
  0x0000000000000001 NEEDED  Shared library: [libzstd.so.1]
```
Now the output looks like:
```
  "DynamicSection": [
    {
      "Tag": 29,
      "Type": "RUNPATH",
      "Value": 6322,
      "Path": [
        "$ORIGIN/../lib"
      ]
    },
    {
      "Tag": 1,
      "Type": "NEEDED",
      "Value": 6109,
      "Library": "libm.so.6"
    },
```
Commit 0ab8198
-
Commit 58cb0e6
-
Commit 0534953
-
[lldb] Fix failing TestFind(Ranges)InMemory.py tests. (#96511)
This is to unblock #95007. Will investigate why the assertion is failing on some arch.
Commit 33a9c57
-
Commit 2f69e9a
-
Commit b7b337f
-
Update Clang extension criteria (#96532)
This updates Clang's extension criteria to explicitly mention impacts on other projects within the monorepo. These changes were discussed in the following RFC: https://discourse.llvm.org/t/rfc-require-discussion-of-impact-to-monorepo-stakeholders-when-adding-new-clang-extensions/79613
Commit d6a3bd1
-
[C23] Claim we do not conform to N2819
This paper clarified the lifetime of compound literal objects in odd scopes, such as use at function prototype scope. We do not currently implement this paper, as the new test demonstrates.
Commit 2ae0905
-
[AArch64] Check for streaming mode in HasSME* features. (#96302)
This also fixes up some asserts in copyPhysReg, loadRegFromStackSlot and storeRegToStackSlot.
Commit 62baf21
-
[Clang][SveEmitter] Split up TargetGuard into SVE and SME component. (#96482)
One reason to want to split this up is to simplify the code added in #93802, where it checks the SME streaming-mode requirements for a builtin by checking for the absence of SVE. If the target guards are separate, we can generate a table and make the Sema code to verify the runtime mode simpler.
Another reason is to avoid an issue with a check in SveEmitter.cpp where it ensures that the 'VerifyRuntimeMode' is set correctly for functions that have both SVE and SME target guards:

    if (!Def->isFlagSet(VerifyRuntimeMode) &&
        Def->getGuard().contains("sve") &&
        Def->getGuard().contains("sme"))
      llvm_unreachable("Missing VerifyRuntimeMode flag");

However, if we ever add a new feature with "sme" in the name, even though it is unrelated to FEAT_SME, then this code no longer works.
Note that the arm_sve.td and arm_sme.td files could do with a bit of restructuring after this but it seems better to follow that up in an NFC patch.
Commit 09c0337
-
[mlir][linalg][Transform] Fix use-after-free in `SplitOp::apply` (#96390)
Detected with ASAN. `Operation::getLoc()` was called after erasing the operation. Reverts 48cf6b6, which attempted to fix the use-after-free. (But the use-after-free is still there when the `hasFailed` branch is taken.)
Commit f2d3d82
-
[OpenMP] Add num_threads clause list format and strict modifier support (#85466)
Add support to the runtime for 6.0 spec features that allow num_threads clause to take a list, and also make use of the strict modifier. Provides new compiler interface functions for these features.
Commit d30b082
-
[NFC][MLInliner] Rename LastSCC -> CurSCC (#96546)
The passed SCC is the current SCC we're working on.
Commit 0555afd
-
[bazel] Export distributable lldb files (#96549)
If you're building and vendoring lldb, you might need to also vendor these files.
Commit b1a93db
-
[clang][OpenMP] Fix teams nesting of region check (#94806)
The static verifier flagged dead code in the check since the loop will only execute once and never reach the iterator increment. The loop needs to iterate twice to correctly diagnose when a statement is after the teams. Since there are two iterations again, reset the iterator to the first teams directive when the double teams case is seen so the diagnostic can report both locations.
Commit b097018
-
[mlgo] Support composite AOT-ed models (#96276)
This applies to the AOT case where we embed models in the compiler. The change adds support for multiple models for the same agent, and allows the user to select one via a command line flag. "agent" refers to e.g. the inline advisor or the register allocator eviction advisor.
To avoid build setup complexity, the support is delegated to the saved model. Since saved models define computational graphs, we can generate a composite model (this happens prior to building and embedding it in LLVM and is not shown in this change) that exposes an extra feature with a predefined name: `_model_selector`. The model, then, delegates internally to contained models based on that feature value.
Model selection is expected to happen at model instantiation; there is no current scenario for switching them afterwards.
If the model doesn't expose such a feature but the user passes one, we report an error. If the model exposes such a feature but the user doesn't pass one, we also report an error. Invalid model selector values are expected to be handled by the saved model.
Internally, the model uses a pair of uint64 values - the high and low of the MD5 hash of the name.
A tool composing models would, then, need to:
- expose the extra feature, `_model_selector`, shape (2,), uint64 data type
- test its value (`tf.cond` or `tf.case` in Tensorflow) against the MD5 hash, in the [high, low] order, of contained models based on a user-specified name (which the user will then use as flag value to the compiler)
Agents just need to add a flag to capture the name of a model and pass it to `ReleaseModeModelRunner` at construction. This can be passed in all cases without checking: in the case where the model is not composite and we pass an empty name, everything works as before.
This change also factors out the string flags we pass to the `ReleaseModeModelRunner` for better maintainability (we risk confusing parameters that are strings otherwise).
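As a rough illustration of the selector described above, the following Python sketch derives a `[high, low]` pair of 64-bit values from the MD5 hash of a model name and uses it to pick one of several contained models. The function names `model_selector` and `select_model` are hypothetical, and the way the 16 digest bytes are split (first eight bytes as the low word, last eight as the high word, both little-endian) is an assumption made for the sketch, not something stated in the commit message.

```python
import hashlib


def model_selector(name: str) -> list[int]:
    """Return [high, low], two unsigned 64-bit values taken from MD5(name).

    Assumption: the first 8 digest bytes form the low word and the last 8
    the high word, both read as little-endian integers.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()  # always 16 bytes
    low = int.from_bytes(digest[:8], "little")
    high = int.from_bytes(digest[8:], "little")
    return [high, low]


def select_model(models: dict, selector: list[int]):
    """Hypothetical composite-model dispatch: compare the selector feature
    against the hash of every contained model's name."""
    for name, model in models.items():
        if model_selector(name) == selector:
            return model
    raise ValueError("no contained model matches the selector")
```

A composing tool would perform the comparison inside the saved model's graph rather than in Python; the sketch only shows the shape of the data (two uint64 values in `[high, low]` order) and the name-based lookup.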
Commit 313b1a8
-
[clang][OpenMP] Fix error handling of the adjust_args clause (#94696)
Static verifier noticed the current code has logically dead code parsing the clause where IsComma is assigned. Fix this and improve the error message received when a bad adjust-op is specified. This will now be handled like 'map' where a nice diagnostic is given with the correct values, then parsing continues on the next clause reducing unhelpful diagnostics.
Commit 5413a2b
-
[AMDGPU] Fix negative immediate offset for unbuffered smem loads (#89165)
For unbuffered smem loads, it is illegal for the immediate offset to be negative if the resulting IOFFSET + (SGPR[Offset] or M0 or zero) is negative. New PR of llvm/llvm-project#79553.
Commit 3aef525
-
[libc++] Build with -fsized-deallocation (#96217)
This patch makes libc++ build with -fsized-deallocation. That flag is enabled by default in recent versions of Clang, so this patch will make libc++ forward-compatible with ToT Clang.
Commit d2864d1
-
[libc][startup] create header for ElfW and use in startup (#96510)
This is necessary for 32b platforms such as ARM and i386. Link: #94128
Commit bea7f3d
-
[bazel] Switch mach_gen to apple_genrule (#96551)
mig is a tool vendored with Xcode. Using apple_genrule makes sure that the bazel selected version of Xcode is preferred, and that the action is invalidated when that version changes.
Commit dd8d978
-
Commit a030c8b
-
[NVPTX] Basic support for "grid_constant" (#96125)
- Adds a helper function for checking whether an argument is a [grid_constant](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html#supported-properties).
- Adds support for cvta.param using changes from llvm/llvm-project#95289
- Supports escaped grid_constant pointers conservatively, by casting all uses to the generic address space with cvta.param.
Commit 687d6fb
-
LAA: strip unnecessary getUniqueCastUse (#92119)
733b8b2 ([LAA] Simplify identification of speculatable strides [nfc]) refactored getStrideFromPointer() to compute directly on SCEVs, and return an SCEV expression instead of a Value. However, it left behind a call to getUniqueCastUse(), which is completely unnecessary. Remove this, showing a positive test update, and simplify the surrounding program logic.
Commit 5ae5069
-
[libc][math] Implement double precision sin correctly rounded to all rounding modes. (#95736)
- Algorithm:
  - Step 1 - Range reduction: for a double precision input `x`, return `k` and `u` such that
    - k is an integer
    - u = x - k * pi / 128, and |u| < pi/256
  - Step 2 - Calculate `sin(u)` and `cos(u)` in double-double using Taylor polynomials with errors < 2^-70 with FMA or < 2^-66 w/o FMA.
  - Step 3 - Calculate `sin(x) = sin(k*pi/128) * cos(u) + cos(k*pi/128) * sin(u)` using look-up table for `sin(k*pi/128)` and `cos(k*pi/128)`.
  - Step 4 - Use Ziv's rounding test to decide if the result is correctly rounded.
  - Step 4' - If the Ziv's rounding test failed, redo step 1-3 using 128-bit precision.
- Currently, without FMA instructions, the large range reduction only works correctly for the default rounding mode (FE_TONEAREST).
- Provide `LIBC_MATH` flag so that users can set `LIBC_MATH = LIBC_MATH_SKIP_ACCURATE_PASS` to build the `sin` function without step 4 and 4'.
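Steps 1 to 3 of the algorithm above can be sketched in plain double precision as follows. This is only an illustration of the range reduction and the angle-addition identity: it uses ordinary floats instead of double-double arithmetic, builds its look-up table with `math.sin`/`math.cos`, and omits Ziv's rounding test and the 128-bit fallback, so it is not correctly rounded. The name `sin_sketch` and the table layout are assumptions made for the example.

```python
import math

# Look-up table of (sin(k*pi/128), cos(k*pi/128)) for one full period, k in [0, 256).
_TABLE = [(math.sin(k * math.pi / 128), math.cos(k * math.pi / 128))
          for k in range(256)]


def sin_sketch(x: float) -> float:
    # Step 1: range reduction, x = k*pi/128 + u with |u| <= pi/256 (up to rounding).
    k = round(x * 128 / math.pi)
    u = x - k * (math.pi / 128)

    # Step 2: short Taylor polynomials for sin(u) and cos(u), in Horner form.
    u2 = u * u
    sin_u = u * (1 - u2 / 6 * (1 - u2 / 20 * (1 - u2 / 42)))
    cos_u = 1 - u2 / 2 * (1 - u2 / 12 * (1 - u2 / 30 * (1 - u2 / 56)))

    # Step 3: sin(x) = sin(k*pi/128) * cos(u) + cos(k*pi/128) * sin(u).
    sin_k, cos_k = _TABLE[k % 256]
    return sin_k * cos_u + cos_k * sin_u
```

For moderate inputs the result agrees with `math.sin` to roughly the precision of a double. For very large `x`, the naive reduction in step 1 loses accuracy, which is the problem the "large range reduction" mentioned in the commit message exists to solve.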
Commit 16903ac
-
Revert commits that add `TestFind(Ranges)InMemory.py` (#96560)
Reverting to unblock macOS buildbots which are currently failing on these tests. https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/6377/
Commit a32b719
-
Commit 75ac887
-
[llvm][ProfDataUtils] Provide getNumBranchWeights API (#90146)
As suggested in https://github.com/llvm/llvm-project/pull/86609/files#r1556689262 an API for getting the number of branch weights directly from the MD node would be useful in a variety of checks, and keeps the logic within ProfDataUtils.
Commit a3a44bf
-
[BOLT] Hash-based function matching (#95821)
Using the hashes of binary and profiled functions to recover functions with changed names. Test Plan: added hashing-based-function-matching.test.
Commit 5e097c7
-
[clang][docs] '#pragma clang section' is supported on Mach-O. NFC
This was added back in 7f6e331, but I forgot to update the docs that referenced it.
Commit b3c668b
-
Commit 32e4906
-
[lldb][API] Add Find(Ranges)InMemory() to Process SB API (#96569)
This is a second attempt to land #95007
Test Plan:
llvm-lit llvm-project/lldb/test/API/python_api/find_in_memory/TestFindInMemory.py llvm-project/lldb/test/API/python_api/find_in_memory/TestFindRangesInMemory.py
Reviewers: clayborg
Tasks: lldb
Commit 0d4da0d
-
[libc] Disable freelist test on NVPTX temporarily
Summary: This test fails due to alignment issues, it's likely that it's misaligned on other targets too and they just don't crash on it. @PiJoules maybe we should run this with ubsan?
Commit dc27ff1
-
[LLDB][Minidump] Change expected directories to the correct type; size_t (#96564)
In #95312 I incorrectly set `m_expected_directories` to uint; this broke the Windows build and is the incorrect type. `size_t` is more accurate because this value only ever represents the expected upper bound of the directory vector.
Commit 361543e
-
Revert "[Flang][Driver] Add -print-resource-dir command line flag to emit Flang's resource directory" (#96557)
Reverts llvm/llvm-project#90886
These changes broke linking to compiler-rt on Windows
Commit a2d340b
Commits on Jun 25, 2024
-
[flang] Allow derf as alternate spelling for erf (#95784)
This patch adds derf as an alternate spelling for the erf intrinsic. This spelling is supported by multiple other compilers and used by WRF.
Commit 954b692
-
[X86] Add sub-feature zu (zero upper) for APX
This is a follow-up patch for #74199
Commit 8ad32ce
-
[LoongArch][test] Remove the FIXME in psabi-restricted-scheduling.ll which has been addressed by #76555
Commit 7ea63b9
-
[NVPTX] Make nvptx mma instructions convergent. (#96521)
We are running into the NVPTX backend generating wrong code for an input:
```
%0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
if laneid == 0:
  ret
else:
  store %0
```
The backend reorders the instruction (as an effect of the `MachineSink` pass) to
```
if laneid == 0:
  ret
else:
  %0 = llvm.nvvm.mma.m?n?k?.row.col.??? (...)
  store %0
```
This is incorrect because `mma` is a warp instruction which needs all threads to sync before performing the operation instead of being guarded by a specific thread id. It should be treated like the shuffle instruction `shfl` in terms of warp level sync, and `shfl` is marked as `isConvergent = true`. Apply `isConvergent = true` to `mma` instructions.
Commit b0e9b00
-
[X86] Fix test Clang::CodeGen/builtin-cpu-supports.c failure
The test failed after llvm/llvm-project@8ad32ce. In https://github.com/gcc-mirror/gcc/blob/master/gcc/common/config/i386/i386-cpuinfo.h, FEATURE_AVX512CD = 23 and FEATURE_AVX512VBMI = 26, so we should only add 2 features between them. New features should be inserted at the end.
Commit 4e0a0ea
-
Merge from 'main' to 'sycl-web' (125 commits)
CONFLICT (content): Merge conflict in libclc/CMakeLists.txt
Commit dd0b7cb
-
Merge from 'sycl' to 'sycl-web' (7 commits)
iclsrc committed Jun 25, 2024
Commit 164e362
-
[clang-format] Add option to remove leading blank lines (#91221)
The options regarding which blank lines are kept are also aggregated. The new option is `KeepEmptyLines`.
Commit 9267f8f
-
Adjust MSVC version range for ARM64 build performance regression (#90731)
This is a follow-up to #65215. The mentioned regression was fixed in MSVC 19.39 (VS 17.9.0), so it makes sense not to apply the fix for that (and newer) compiler versions. Same as the original change, this patch is narrowly scoped to not affect any other compiler.
Commit 437366b
-
[libc++] Remove Windows-specific configuration from libcxx/test/CMakeLists.txt (#96330)
This is essentially a revert of 9853e9b which tried removing duplication in the Windows config files by moving it to the CMake. However, we want to decouple the CMake and the test suite as much as possible, so encoding additional (non-official) Lit parameters in the CMake only as a code reuse mechanism is not an approach we want to take.
Commit c393121
-
[clang-tidy] Fix assert in performance-unnecessary-copy-init. (#96506)
`GetDirectCallee` can be null. Fixes #96498.
Commit 8348d72
-
Merge from 'main' to 'sycl-web' (17 commits)
CONFLICT (content): Merge conflict in clang/lib/CodeGen/CodeGenModule.cpp
Commit 1d9029f
-
Commit bd488c1
-
[CodeGen][NewPM] Port machine post dominator tree analysis to new pass manager (#96378)
Follows #95879.
Commit 8599629
-
[MC] Remove setUseAssemblerInfoForParsing(false) workarounds
This reverts commit 245491a ("[MC] Disable MCAssembler based constant folding for DwarfDebug") and cb09b5f ("[MC] Disable MCAssembler based constant folding for compact unwind and emitJumpTableEntry"). Checking the relative order of FA and FB is now faster due to de19f7b ("[MC] Replace fragment ilist with singly-linked lists").
Commit 62d44fb
-
[mlir][Transforms] Dialect conversion: Simplify handling of dropped arguments (#96207)
This commit simplifies the handling of dropped arguments and updates some dialect conversion documentation that is outdated.
When converting a block signature, a `BlockTypeConversionRewrite` object and potentially multiple `ReplaceBlockArgRewrite` are created. During the "commit" phase, uses of the old block arguments are replaced with the new block arguments, but the old implementation was written in an inconsistent way: some block arguments were replaced in `BlockTypeConversionRewrite::commit` and some were replaced in `ReplaceBlockArgRewrite::commit`. The new `BlockTypeConversionRewrite::commit` implementation is much simpler and no longer modifies any IR; that is done only in `ReplaceBlockArgRewrite` now. The `ConvertedArgInfo` data structure is no longer needed.
To that end, materializations of dropped arguments are now built in `applySignatureConversion` instead of `materializeLiveConversions`; the latter function no longer has to deal with dropped arguments.
Other minor improvements:
- Improve variable name: `origOutputType` -> `origArgType`. Add an assertion to check that this field is only used for argument materializations.
- Add more comments to `applySignatureConversion`.
Note: Error messages around failed materializations for dropped basic block arguments changed slightly. That is because those materializations are now built in `legalizeUnresolvedMaterialization` instead of `legalizeConvertedArgumentTypes`.
This commit is in preparation of decoupling argument/source/target materializations from the dialect conversion.
Commit f1e0657
-
[RISCV][GISel] Fix test case order in fp-arith.mir. NFC
The fadd_f64 test was in the middle of some f32 tests.
Commit 41f8e6d
-
[clangd] Fix the build broken (NFC)
/llvm-project/clang-tools-extra/clangd/Format.cpp:284:11: error: no member named 'KeepEmptyLinesAtTheStartOfBlocks' in 'clang::format::FormatStyle'
    Style.KeepEmptyLinesAtTheStartOfBlocks = true;
    ~~~~~ ^
1 error generated.
Commit 4c91b49
-
[VectorCombine] Add free concats to shuffleToIdentity. (#94954)
This is another relatively small adjustment to shuffleToIdentity, which has had a few knock-on effects that required a few more changes. It attempts to detect free concats, that will be legalized to multiple vector operations. For example if the lanes are '[a[0], a[1], b[0], b[1]]' and a and b are v2f64 under aarch64.
In order to do this:
- isFreeConcat detects whether the input has piece-wise identities from multiple inputs that can become a concat.
- A tree of concat shuffles is created to concatenate the input values into a single vector. This is a little different to most other inputs as they are created from multiple values that are being combined together, and we cannot rely on the Lane0 insert location always being valid.
- The insert location is changed to the original location instead of updating per item, which ensures it is valid due to the order that we visit and create items.
Commit efa8463
-
[SmallPtrSet] Add remove_if() method (#96468)
Add remove_if() method, similar to the one already present on SetVector. It is intended to replace the following pattern:

    for (Foo *Ptr : Set)
      if (Pred(Ptr))
        Set.erase(Ptr);

With:

    Set.remove_if(Pred);

This pattern is commonly used for set intersection, where `Pred` is something like `!OtherSet.contains(Ptr)`.
The implementation provided here is a bit more efficient than the naive loop, because it does not require looking up the bucket during the erase() operation again.
However, my actual motivation for this is to have a way to perform this operation without relying on the current `std::set`-style guarantee that erase() does not invalidate iterators. I'd like to stop making use of tombstones in the small regime, which will make insertion operations a good bit more efficient. However, this will invalidate iterators during erase().
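The motivation carries over to other containers whose iterators do not survive erasure. As a loose analogy in Python (not the LLVM implementation), the built-in `set` does not allow its size to change while it is being iterated, so the "erase inside the loop" pattern has to be wrapped in a helper that owns the traversal. The helper names `remove_if` and `intersect_in_place` below are hypothetical.

```python
def remove_if(items: set, pred) -> bool:
    """Remove every element for which pred(element) is true.

    The doomed elements are collected first, so the set is never mutated
    while it is being iterated. Returns True if anything was removed.
    """
    doomed = [x for x in items if pred(x)]
    items.difference_update(doomed)
    return bool(doomed)


def intersect_in_place(items: set, other: set) -> None:
    """Set intersection expressed through remove_if, mirroring the
    `!OtherSet.contains(Ptr)` predicate mentioned in the commit message."""
    remove_if(items, lambda x: x not in other)
```

Callers only supply a predicate, so the container is free to change how it erases elements internally without breaking them, which is the property the commit message is after.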
Commit f019581
-
[C++20] [Modules] [Serialization] Don't reuse type ID and identifier ID from imported modules
To support the no-transitive-change model for named modules, we can't reuse type IDs and identifier IDs from imported modules arbitrarily. The theory behind the no-transitive-change model is that a user of a named module can only access indirectly imported decls via the directly imported module, so it is possible to control what matters to users when writing the module. That would be unsafe if users could reuse the type IDs and identifier IDs from the indirectly imported modules without going through the directly imported modules. So in this patch, we don't reuse the type ID and identifier ID in the AST writer to avoid the problematic case.
Commit fa20184
-
[clang][Interp] Fix returning primitive non-blockpointers
We can't deref() them, so return false here.
Commit 8153773
-
[DomTree] Avoid duplicate hash lookups in runDFS() (NFCI) (#96460)
runDFS() currently performs three hash table lookups. One in the main loop, one when checking whether a successor has already been visited and another when adding parent and reverse children to the successor. We can avoid the two additional lookups by making the parent number part of the stack, and then making the parent / reverse children update part of the main loop. The main loop already has a check for already visited nodes, so we don't have to check this in advance -- we can simply push the node to the worklist and skip it later.
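The restructuring can be sketched as follows. This is a simplified illustration, not the DomTree implementation: the `NodeInfo` fields, the integer node IDs, the adjacency-list graph and the `std::unordered_map` standing in for the node-info table are all assumptions made for this example. Each worklist entry carries the parent's DFS number, so the loop body performs a single lookup in the info table per entry and handles the "already visited" case there.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-node record; DFSNum == 0 means "not visited yet".
struct NodeInfo {
  unsigned DFSNum = 0;
  unsigned Parent = 0;                   // DFS number of the DFS-tree parent
  std::vector<unsigned> ReverseChildren; // DFS numbers of visited predecessors
};

// Succs[N] lists the successors of node N. Returns the info table.
std::unordered_map<int, NodeInfo>
runDFS(const std::vector<std::vector<int>> &Succs, int Root) {
  assert(Root >= 0 && static_cast<std::size_t>(Root) < Succs.size());
  std::unordered_map<int, NodeInfo> Info;
  // The parent number travels with the node on the stack.
  std::vector<std::pair<int, unsigned>> Worklist;
  Worklist.push_back({Root, 0u});
  unsigned LastNum = 0;
  while (!Worklist.empty()) {
    const auto [Node, ParentNum] = Worklist.back();
    Worklist.pop_back();
    NodeInfo &NI = Info[Node]; // the only info-table lookup per entry
    if (ParentNum != 0)
      NI.ReverseChildren.push_back(ParentNum);
    if (NI.DFSNum != 0)
      continue; // already visited: only the reverse edge was recorded
    NI.DFSNum = ++LastNum;
    NI.Parent = ParentNum;
    // Successors are pushed unconditionally; visited ones are skipped later.
    for (int Succ : Succs[Node])
      Worklist.push_back({Succ, NI.DFSNum});
  }
  return Info;
}
```

The trade-off is that already visited successors are pushed to the worklist and discarded later, in exchange for not probing the table before each push.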
Commit 174f80c
-
mlir-config.h is included but not listed in dependencies
Commit 01fb529
-
[TailDup][MachineSSAUpdater] Let RewriteUse insert a COPY when needed (#95553)
When running early-tailduplication we've seen problems with machine verifier errors due to register class mismatches after doing the machine SSA updates. Typical scenario is that there is a PHI node and another instruction that is using the same vreg:
    %othervreg:otherclass = PHI %vreg:origclass, %bb
    MInstr %vreg:origclass
but then after TailDuplicator::tailDuplicateAndUpdate we get
    %othervreg:otherclass = PHI %vreg:origclass, %bb, ...
    MInstr %othervreg:otherclass
Such rewrites are only valid if 'otherclass' is equal to (or a subclass of) 'origclass'.
The solution here is based on adding a COPY instruction to make sure we satisfy constraints given by 'MInstr' in the example. So if 'otherclass' isn't equal to (or a subclass of) 'origclass' we insert a copy after the PHI like this:
    %othervreg:otherclass = PHI %vreg:origclass, %bb, ...
    %newvreg:origclass = COPY %othervreg:otherclass
    MInstr %newvreg:origclass
A special case is when it is possible to constrain the register class instead of inserting a COPY. We currently prefer to constrain the register class instead of inserting a COPY, even if it is a bit unclear whether that is always better (considering register pressure for the constrained class etc.).
Fixes: llvm/llvm-project#62712
Commit 7f1a744
-
[NFC] [GWP-ASan] Rename Check() to check() (#96605)
Change this function to be LLVM-style in name.
Commit 11e12bd
-
Commit d782119
-
[RISCV] Add scheduling model for Syntacore SCR3 (#95427)
Syntacore SCR3 is a microcontroller-class processor core. Overview: https://syntacore.com/products/scr3
Co-authored-by: Dmitrii Petrov
Commit 2d84e0f
-
[lldb/DWARF] Remove parsing recursion when searching for definition DIEs (#96484)
If ParseStructureLikeDIE (or ParseEnum) encountered a declaration DIE, it would call FindDefinitionTypeForDIE. This returned a fully formed type, which it achieved by recursing back into ParseStructureLikeDIE with the definition DIE.
This obscured the control flow and caused us to repeat some work (e.g. the UniqueDWARFASTTypeMap lookup), but it mostly worked until we tried to delay the definition search in #90663. After this patch, the two ParseStructureLikeDIE calls were no longer recursive, but rather the second call happened as a part of the CompleteType() call. This opened the door to inconsistencies, as the second ParseStructureLikeDIE call was not aware it was called to process a definition DIE for an existing type.
To make that possible, this patch removes the recursive type resolution from this function, and leaves just the "find definition DIE" functionality. After finding the definition DIE, we just go back to the original ParseStructureLikeDIE call, and have it finish the parsing process with the new DIE.
While this patch is motivated by the work on delaying the definition searching, I believe it is also useful on its own.
Commit 8395f9c
-
[SPIR-V]: Fix creation of constants of array types in SPIRV Backend (#96514)
This PR fixes llvm/llvm-project#96513. Array-type constants were created incorrectly: instead of creating [1, 1, 1] or [1, 1, 1, 1, 1, ....] constants, the same [1] constant was always created, substituting the original composite constants. This in turn meant that only one of the constants could exist in the code without emitting invalid code; the second constant would eventually be rewritten to the first, because the key used to address both was an array of a single element (like [1]). This PR fixes the issue and purges from the code an unneeded copy/pasted clone of the function that creates an array constant.
Commit f6aa508
-
[AArch64][SVE] optimisation for SVE load intrinsics with no active lanes (#95269)
This patch extends #73964 and adds optimisation of SVE load intrinsics when the predicate is zero.
Commit 0bd9c49
-
[flang][debug] Support pointer type. (#96153)
The handling of `PointerType` is similar to `HeapType`. The only difference is that the allocated flag is generated for `HeapType` and the associated flag for `PointerType`.
The tests for pointer to allocatable strings are disabled for now. I will enable them once #95906 is merged.
The debugging in GDB looks like this:
    integer, pointer :: par2(:)
    integer, target, allocatable :: ar2(:)
    integer, target :: sc
    integer, pointer :: psc
    allocate(ar2(4))
    par2 => ar2
    psc => sc

    19 par2 => ar2
    (gdb) p par2
    $3 = <not associated>
    (gdb) n
    20 do i=1,5
    (gdb) p par2
    $4 = (0, 0, 0, 0)
    (gdb) ptype par2
    type = integer (4)
    (gdb) p sc
    $5 = 3
    (gdb) p psc
    $6 = (PTR TO -> ( integer )) 0x7fffffffda24
    (gdb) p *psc
    $7 = 3
Commit 919b1ec
-
[AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (#89217)
This patch is intended to be the first of a series whose end goal is to adapt the atomic optimizer pass to support i64 and f64 operations (along with removing all unnecessary bitcasts). This legalizes 64-bit readlane, writelane and readfirstlane ops pre-ISel.
Co-authored-by: vikramRH
Commit 5feb32b
-
[GlobalISel] Add build methods for FP environment intrinsics (#96607)
This change adds methods such as buildGetFPEnv for opcodes that manipulate floating-point state.
Commit f9795f3
-
[libc++] Use __is_nothrow_destructible (#95766)
This changes the behaviour in C++03 mode because we'll now use the builtin on Clang, but I don't think that's much of a problem.
Commit 16d02cd
-
[SetOperations] clang-format header (NFC)
This header used three-space indentation in a number of places. Reformat it completely.
Commit 29f4a05
-
This FIXME has already been addressed in #89358
Commit f09b024
-
[VPlan] Iterate over VPlans to get VFs to compute cost for (NFCI).
Instead of iterating over all VFs when computing costs, simply iterate over the VFs available in the created VPlans. Split off from llvm/llvm-project#92555. This also prepares for basing the check of whether any vector instructions will be generated on VPlan, to unblock recommitting llvm/llvm-project#92555.
Commit 9d45077
-
Commit eeb0884
-
[LV] Make create-induction-resume.ll more robust by adding store.
Without the store, the vector loop body is empty. Add a store to avoid that, while not impacting the induction resume values that are created.
Commit a2e9157
-
Commit 37c736e
-
[Xtensa] Lower GlobalAddress/BlockAddress/JumpTable (#95256)
This patch implements lowering of GlobalAddress, BlockAddress, JumpTable and BR_JT. The patch also adds legal support for the BR_CC operation on the i32 type.
Commit cc8fdd6
-
[SCCP] Generate test checks (NFC)
Some of these are just old, while others previously did not use UTC due to missing features that have since been implemented (such as signature matching).
Commit 4acc8ee
-
Commit 16bb8c1
-
[Reassociate] Use poison instead of undef for dummy operands (NFCI)
These will be replaced later.
Commit 35eef9f
-
[NFC][lld][ELF] Remove unused `sec` param of `ObjFile<ELFT>::getRelocTarget` (#96500)
Commit 65f9601
-
[LoongArch] Ensure PseudoLA* can be hoisted (#94723)
Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_{LD,GD} does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved.
Commit bfad875
-
Commit 9952e00
-
Commit 68efc50
-
[clang][Driver] Add HIPAMD Driver support for AMDGCN flavoured SPIR-V (#95061)
This patch augments the HIPAMD driver to allow it to target AMDGCN flavoured SPIR-V compilation. It's mostly straightforward, as we re-use some of the existing SPIRV infra, however there are a few notable additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on using `generic` (this is already fairly overloaded) or simply using `spirv` or `spirv64` (we'll want to use these to denote unflavoured SPIRV, once we bring up that capability)
- initially it won't be possible to mix SPIR-V and concrete AMDGPU targets, as it would require some relatively intrusive surgery in the HIPAMD Toolchain and the Driver to deal with two triples (`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them available at JIT time, we rely on embedding the command line via `-fembed-bitcode=marker`, which the bitcode writer had previously not implemented for SPIRV; we only allow it conditionally for AMDGCN flavoured SPIRV, and it is handled correctly by the Translator (it ends up as a string literal)
Once the SPIRV BE is no longer experimental we'll switch to using that rather than the translator. There's some additional work that'll come via a separate PR around correctly piping through AMDGCN's implementation of `printf`; for now we merely handle its flags correctly.
Commit 9acb533
-
Merge from 'main' to 'sycl-web' (15 commits)
CONFLICT (content): Merge conflict in llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
Commit 3f7b832
Commits on Jun 26, 2024
-
Merge from 'sycl' to 'sycl-web' (10 commits)
CONFLICT (content): Merge conflict in clang/test/Driver/sycl-linker-wrapper-image.cpp
Commit fd7622a
-
Merge from 'sycl' to 'sycl-web' (1 commits)
iclsrc committed Jun 26, 2024
Commit 2f481f2
-
Merge from 'main' to 'sycl-web' (125 commits)
CONFLICT (content): Merge conflict in clang/lib/Driver/Driver.cpp
CONFLICT (content): Merge conflict in clang/lib/Driver/ToolChains/HIPAMD.cpp
iclsrc committed Jun 26, 2024
Commit 074e55c
-
Fix tests after cbf6e93 (#14294)
Test needs update after cbf6e93 2024-05-28 [clang codegen] Delete unnecessary GEP cleanup code. (#90303). Change made by @premanandrao
Commit 658b9a4