Sycl web #14302
…to an RAII class (#94854) Modifying MachineFunctionProperties in PassModel makes `PassT P; P.run(...);` not work properly. This is a necessary compromise.
… (#95025) In ContinuationIndenter::mustBreak, a break is required between a template declaration and the function/class declaration it applies to, if the template declaration spans multiple lines. However, this also includes template template parameters, which can cause extra erroneous line breaks in some declarations. This patch makes template template parameters not be counted as template declarations. Fixes llvm/llvm-project#93793 Fixes llvm/llvm-project#48746
…(#96384) Buildbot `clang-ppc64le-rhel` failed with:
```sh
error: 'MFPropsModifier' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
note: add a deduction guide to suppress this warning
```
after #94854. This PR adds an explicit deduction guide to suppress the warning.
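The general shape of such a fix can be sketched as follows. This is a minimal illustration, not the actual patch: `ScopedModifier`, `DemoPass`, and `DemoFunction` are hypothetical stand-ins, and the real `MFPropsModifier` has different members and constructor parameters.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for MFPropsModifier: a wrapper that stores
// references to a pass and to the function the pass runs on.
template <typename PassT, typename FuncT>
class ScopedModifier {
public:
  ScopedModifier(PassT &P, FuncT &F) : Pass(P), Func(F) {}
  PassT &pass() { return Pass; }
  FuncT &func() { return Func; }

private:
  PassT &Pass;
  FuncT &Func;
};

// Explicit deduction guide: states that deducing the template arguments
// from the constructor arguments is intended, which is what
// -Wctad-maybe-unsupported asks for.
template <typename PassT, typename FuncT>
ScopedModifier(PassT &, FuncT &) -> ScopedModifier<PassT, FuncT>;

// Hypothetical types used only for the demonstration below.
struct DemoPass { int Runs = 0; };
struct DemoFunction { int Id = 7; };

void demoDeduction() {
  DemoPass P;
  DemoFunction F;
  ScopedModifier M(P, F); // template arguments are deduced
  static_assert(
      std::is_same_v<decltype(M), ScopedModifier<DemoPass, DemoFunction>>);
  M.pass().Runs++;
  assert(P.Runs == 1);
}
```

The guide does not change which types are deduced here; it only records that deduction is a supported use of the class. The sketch needs C++17 or later.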
…laration" (#96388) Reverts llvm/llvm-project#95025 ; many bots are broken
When unifying the ResolveExecutable implementations in #96256, I missed that RemoteAwarePlatform was able to resolve executables more aggressively. The host platform can rely on the current working directory to make relative paths absolute and resolve things like home directories. This should fix command-target-create-resolve-exe.test.
This formatter doesn't currently provide much value. It only formats `SourceLocation` and `QualType`. The only formatting it does for `QualType` is call `getAsString()` on it. The main motivator for the removal however is that the formatter implementation can be very slow (since it uses the expression evaluator in non-trivial ways). Not infrequently do we get reports about LLDB being slow when debugging Clang, and it turns out the user was loading `ClangDataFormat.py` in their `.lldbinit` by default. We should eventually develop proper formatters for Clang data-types, but these are currently not ready. So this patch removes them in the meantime to avoid users shooting themselves in the foot, and giving the wrong impression of these being reference implementations.
Fold `mul (uitofp i1 X), Y` to `select i1 X, Y, 0.0` when the `mul` is `nnan` and `nsz` Proof: https://alive2.llvm.org/ce/z/_stiPm
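The reason both flags are needed can be illustrated outside of LLVM IR. The sketch below models the two forms as plain C++ functions on `double` (the names `mulForm` and `selectForm` are hypothetical) and assumes IEEE-754 arithmetic with no fast-math compiler options.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Original form: convert the boolean to floating point, then multiply.
double mulForm(bool X, double Y) { return static_cast<double>(X) * Y; }

// Folded form: pick Y when X is true, otherwise +0.0.
double selectForm(bool X, double Y) { return X ? Y : 0.0; }

void demoFold() {
  // Finite, positive inputs: both forms agree.
  assert(mulForm(true, 2.5) == selectForm(true, 2.5));
  assert(mulForm(false, 2.5) == selectForm(false, 2.5));

  // Why nsz is needed: 0.0 * -2.5 is -0.0, but the select yields +0.0.
  assert(std::signbit(mulForm(false, -2.5)));
  assert(!std::signbit(selectForm(false, -2.5)));

  // Why nnan is needed: 0.0 * infinity is NaN, but the select yields 0.0.
  const double Inf = std::numeric_limits<double>::infinity();
  assert(std::isnan(mulForm(false, Inf)));
  assert(selectForm(false, Inf) == 0.0);
}
```

The two forms differ only in the sign of a zero result and in cases where the multiply produces NaN, which are exactly the cases that `nsz` and `nnan` allow the optimizer to ignore.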
We're ultimately expected to return an APValue simply pointing to the CallExpr, not any useful value. Do that by creating a global variable for the call.
The checks when building a thunk to decide if an arg needed to be cast to/from an integer or redirected via a pointer didn't match how arg types were changed in `canonicalizeThunkType`; this caused LLVM to ICE when using vector types as args due to incorrect types in a call instruction. Instead of duplicating these checks, we should check if the arg type differs between x64 and AArch64 and then cast or redirect as appropriate.
…at_provider (#95704) The original implementation of HelperFunctions::consumeHexStyle always sets Style when it returns true, but this is difficult for a compiler to understand since it requires seeing that Str starts with either an "x" or an "X" when starts_with_insensitive("x") returns true. In particular, g++ 12 warns that HS may be used uninitialized in the format_provider::format caller. Change HelperFunctions::consumeHexStyle to return an optional HexPrintStyle and to make the fact that Str necessarily starts with an "X" when all other cases do not apply more explicit. This helps both the compiler and the human reader of the code. Co-authored-by: Sven Verdoolaege <[email protected]>
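The pattern described here, returning an optional value instead of a `bool` plus an out-parameter, can be sketched with the standard library alone. This is not the LLVM code: `HexStyle`, `consumeFront`, and the mapping of the `x-`, `X-`, `x+`, and `X+` prefixes to styles are assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-in for HexPrintStyle.
enum class HexStyle { Lower, Upper, PrefixLower, PrefixUpper };

// Removes Prefix from the front of Str if present; reports whether it did.
bool consumeFront(std::string_view &Str, std::string_view Prefix) {
  if (Str.substr(0, Prefix.size()) != Prefix)
    return false;
  Str.remove_prefix(Prefix.size());
  return true;
}

// Returns the parsed style, or std::nullopt if Str is not a hex spec.
// Every path that yields a value assigns it explicitly, so the caller
// never reads an uninitialized style.
std::optional<HexStyle> consumeHexStyle(std::string_view &Str) {
  if (Str.empty() || (Str.front() != 'x' && Str.front() != 'X'))
    return std::nullopt;
  if (consumeFront(Str, "x-"))
    return HexStyle::Lower;
  if (consumeFront(Str, "X-"))
    return HexStyle::Upper;
  if (consumeFront(Str, "x+") || consumeFront(Str, "x"))
    return HexStyle::PrefixLower;
  // Only an upper-case "X" can remain at this point.
  if (!consumeFront(Str, "X+"))
    consumeFront(Str, "X");
  return HexStyle::PrefixUpper;
}

void demoHexStyle() {
  std::string_view Spec = "x-4";
  std::optional<HexStyle> Style = consumeHexStyle(Spec);
  assert(Style && *Style == HexStyle::Lower);
  assert(Spec == "4"); // the style prefix has been consumed
}
```

Because the final `return` is unconditional, the compiler does not have to prove that one of the prefix checks succeeded.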
#95197 and 7500646 eliminated all raw `new MCXXXFragment`. We can now place fragments in a bump allocator. In addition, remove the dead `Kind == FragmentType(~0)` condition. ~CodeViewContext may call `StrTabFragment->destroy()` and needs to be reset before `FragmentAllocator.Reset()`. Tested by llvm/test/MC/COFF/cv-compiler-info.ll using asan. Pull Request: llvm/llvm-project#96402
There is only one caller after #95188.
https://reviews.llvm.org/D67249 added content hash (see -fvalidate-ast-input-files-content) using llvm::hash_code (size_t). The hash value is 32-bit on 32-bit systems, which was unintentional. Fix #96379: #96136 switched the hash function to xxh3_64bit but did not update the ContentHash type, leading to mismatch between ASTReader and ASTWriter.
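The kind of mismatch described above can be modelled in a few lines. The sketch below is not the ASTReader/ASTWriter code; `NarrowHash`, `WideHash`, and the helper functions are hypothetical, and `NarrowHash` merely models what `size_t` amounts to on a 32-bit host.

```cpp
#include <cassert>
#include <cstdint>

using NarrowHash = std::uint32_t; // models size_t on a 32-bit host
using WideHash = std::uint64_t;   // fixed width, independent of the host

// Storing a 64-bit hash in a narrow field silently drops the upper 32 bits.
NarrowHash storeNarrow(WideHash H) { return static_cast<NarrowHash>(H); }

// Validation compares a freshly computed hash with the stored one.
bool validateNarrow(WideHash Computed, NarrowHash Stored) {
  return Computed == Stored;
}
bool validateWide(WideHash Computed, WideHash Stored) {
  return Computed == Stored;
}

void demoHashWidth() {
  const WideHash H = 0x1234567890ABCDEFULL;
  // The narrow field keeps only the low half of the hash...
  assert(storeNarrow(H) == 0x90ABCDEFu);
  // ...so identical content no longer validates.
  assert(!validateNarrow(H, storeNarrow(H)));
  // A fixed 64-bit field round-trips the hash unchanged.
  assert(validateWide(H, H));
}
```

Using a fixed-width type on both the writing and the reading side keeps the stored value independent of the host's pointer size.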
This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 This is part 3 of 4 PRs. It sets the ground work for using the intrinsics in HLSL. Add HLSL frontend apis for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh` llvm/llvm-project#70079 llvm/llvm-project#70080 llvm/llvm-project#70081 llvm/llvm-project#70083 llvm/llvm-project#70084 llvm/llvm-project#95966
…n-constants If f(Y) simplifies to Y, replace with Y. This requires Y to be non-undef. Closes #94719
Follow-up to 05ba5c0. uint32_t is preferred over const MCExpr * in the section stack uses because it should only be evaluated once. Change the parameter type to match.
Functions that have the `nvvm.kernel` attribute should have 0 results.
The `gpu.func` op lowering accounts for memref arguments/results (both "normal" and bare-pointer supported), but the `gpu.return` op lowering did not. The lowering produced invalid IR that did not verify. This commit uses the same lowering strategy as for `func.return` in the `gpu.return` lowering. (The C++ implementation is copied. We may want to share some code between `func` and `gpu` lowerings in the future.)
Define subtarget features for atomic fmin/fmax support. The flat/global support is a real mess. We had float/double support at the beginning in gfx6 and gfx7. gfx8 removed these. gfx10 reintroduced them. gfx11 removed the f64 versions again. gfx9 partially reintroduced them, in gfx90a and gfx940 but only for f64.
…IEs (#96484) If ParseStructureLikeDIE (or ParseEnum) encountered a declaration DIE, it would call FindDefinitionTypeForDIE. This returned a fully formed type, which it achieved by recursing back into ParseStructureLikeDIE with the definition DIE. This obscured the control flow and caused us to repeat some work (e.g. the UniqueDWARFASTTypeMap lookup), but it mostly worked until we tried to delay the definition search in #90663. After this patch, the two ParseStructureLikeDIE calls were no longer recursive, but rather the second call happened as a part of the CompleteType() call. This opened the door to inconsistencies, as the second ParseStructureLikeDIE call was not aware it was called to process a definition DIE for an existing type. To make that possible, this patch removes the recursive type resolution from this function, and leaves just the "find definition DIE" functionality. After finding the definition DIE, we just go back to the original ParseStructureLikeDIE call, and have it finish the parsing process with the new DIE. While this patch is motivated by the work on delaying the definition searching, I believe it is also useful on its own.
…#96514) This PR fixes llvm/llvm-project#96513. The way of creation of array type constant was incorrect: instead of creating [1, 1, 1] or [1, 1, 1, 1, 1, ....] constants, the same [1] constant was always created, substituting original composite constants. This in its turn led to a situation when only one of constants might exist in the code without emitting invalid code, the second constant would be eventually rewritten to the first constant, because a key to address both was an array of a single element (like [1]). This PR fixes the issue and purges from the code unneeded copy/pasted clone of the function that creates an array constant.
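The collapse described above can be modelled with a small cache. This is only an illustration of the keying problem, not the SPIR-V backend code: `ArrayConst`, `ElementKeyedCache`, and `ShapeKeyedCache` are hypothetical, and the arrays are restricted to repeated copies of one element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using ArrayConst = std::vector<std::uint32_t>;

// Problematic scheme: the key is only the element value, so every array
// built from that element resolves to whichever one was created first.
class ElementKeyedCache {
  std::map<std::uint32_t, ArrayConst> Cache;

public:
  const ArrayConst &get(std::uint32_t Elem, std::size_t Count) {
    return Cache.try_emplace(Elem, Count, Elem).first->second;
  }
};

// Fixed scheme: the key also covers the element count, so arrays of
// different lengths stay distinct.
class ShapeKeyedCache {
  std::map<std::pair<std::uint32_t, std::size_t>, ArrayConst> Cache;

public:
  const ArrayConst &get(std::uint32_t Elem, std::size_t Count) {
    return Cache.try_emplace(std::make_pair(Elem, Count), Count, Elem)
        .first->second;
  }
};

void demoArrayKeys() {
  ElementKeyedCache Bad;
  assert(Bad.get(1, 3).size() == 3);
  assert(Bad.get(1, 5).size() == 3); // second constant rewritten to the first

  ShapeKeyedCache Good;
  assert(Good.get(1, 3).size() == 3);
  assert(Good.get(1, 5).size() == 5); // both constants coexist
}
```

With the element-only key, the second request returns the first array, which mirrors the situation where only one of the constants could exist in the emitted code.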
…nes (#95269) This patch extends #73964 and adds optimisation of load SVE intrinsics when predicate is zero.
The handling of `PointerType` is similar to `HeapType`. The only difference is that allocated flag is generated for `HeapType` and associated flag for `PointerType`. The tests for pointer to allocatable strings are disabled for now. I will enable them once #95906 is merged.

The debugging in GDB looks like this:

integer, pointer :: par2(:)
integer, target, allocatable :: ar2(:)
integer, target :: sc
integer, pointer :: psc
allocate(ar2(4))
par2 => ar2
psc => sc

19 par2 => ar2
(gdb) p par2
$3 = <not associated>
(gdb) n
20 do i=1,5
(gdb) p par2
$4 = (0, 0, 0, 0)
(gdb) ptype par2
type = integer (4)
(gdb) p sc
$5 = 3
(gdb) p psc
$6 = (PTR TO -> ( integer )) 0x7fffffffda24
(gdb) p *psc
$7 = 3
…ing for generic types (#89217) This patch is intended to be the first of a series with end goal to adapt atomic optimizer pass to support i64 and f64 operations (along with removing all unnecessary bitcasts). This legalizes 64 bit readlane, writelane and readfirstlane ops pre-ISel --------- Co-authored-by: vikramRH <[email protected]>
This change adds methods like buildGetFPEnv and similar for opcodes that represent manipulation of the floating-point state.
This changes the behaviour in C++03 mode because we'll now use the builtin on Clang, but I don't think that's much of a problem.
This header used three-space indentation in a number of places. Reformat it completely.
This FIXME has already been addressed in #89358
Instead of iterating over all VFs when computing costs, simply iterate over the VFs available in the created VPlans. Split off from llvm/llvm-project#92555. This also prepares for moving the check if any vector instructions will be generated to be based on VPlan, to unblock recommitting llvm/llvm-project#92555.
Without the store, the vector loop body is empty. Add a store to avoid that, while not impacting the induction resume values that are created.
This patch implements lowering of the GlobalAddress, BlockAddress, JumpTable and BR_JT. The patch also adds legal support for the BR_CC operation for the i32 type.
Some of these are just old, while others previously did not use UTC due to missing features that have since been implemented (such as signature matching).
These will be replaced later.
…Target` (#96500)
Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_{LD,GD} does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved.
… (#95061) This patch augments the HIPAMD driver to allow it to target AMDGCN flavoured SPIR-V compilation. It's mostly straightforward, as we re-use some of the existing SPIRV infra; however, there are a few notable additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on using `generic` (this is already fairly overloaded) or simply using `spirv` or `spirv64` (we'll want to use these to denote unflavoured SPIRV, once we bring up that capability)
- initially it won't be possible to mix-in SPIR-V and concrete AMDGPU targets, as it would require some relatively intrusive surgery in the HIPAMD Toolchain and the Driver to deal with two triples (`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them available at JIT time, we rely on embedding the command line via `-fembed-bitcode=marker`, which the bitcode writer had previously not implemented for SPIRV; we only allow it conditionally for AMDGCN flavoured SPIRV, and it is handled correctly by the Translator (it ends up as a string literal)

Once the SPIRV BE is no longer experimental we'll switch to using that rather than the translator. There's some additional work that'll come via a separate PR around correctly piping through AMDGCN's implementation of `printf`; for now we merely handle its flags correctly.
CONFLICT (content): Merge conflict in llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
CONFLICT (content): Merge conflict in clang/test/Driver/sycl-linker-wrapper-image.cpp
CONFLICT (content): Merge conflict in clang/lib/Driver/Driver.cpp
CONFLICT (content): Merge conflict in clang/lib/Driver/ToolChains/HIPAMD.cpp
Test needs update after cbf6e93 2024-05-28 [clang codegen] Delete unnecessary GEP cleanup code. (#90303). Change made by @premanandrao
Did get my changes here, so created a draft.