Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add patch to work around potential crash in PyTorch 1.13.1 caused by GCC 12 compiler bug #19133

Merged

Conversation

Flamefire
Copy link
Contributor

(created using eb --new-pr)

@Flamefire
Copy link
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusml24 - Linux RHEL 7.6, POWER, 8335-GTX, 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/d686d0b796a708a8482b19faf30a2c85 for a full test report.

@Flamefire
Copy link
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusml3 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/c202a9df71c546b64fd335694235db98 for a full test report.

@Flamefire
Copy link
Contributor Author

Test report by @Flamefire
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
taurusi8012 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/Flamefire/66ce4d13d0735089299b083e2a775795 for a full test report.

@boegel boegel changed the title Workaround potential crash in PyTorch 1.13.1 caused by GCC 12 compiler bug add patch to work around potential crash in PyTorch 1.13.1 caused by GCC 12 compiler bug Nov 6, 2023
@boegel boegel added the bug fix label Nov 6, 2023
@boegel boegel added this to the next release (4.9.0?) milestone Nov 6, 2023
@Flamefire
Copy link
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusi8027 - Linux CentOS Linux 7.9.2009, x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 470.57.02, Python 2.7.5
See https://gist.github.com/Flamefire/1a88ed913b9562e7ad3827ca3ab82ed4 for a full test report.

@Flamefire
Copy link
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
taurusml3 - Linux RHEL 7.6, POWER, 8335-GTX (power9le), 6 x NVIDIA Tesla V100-SXM2-32GB, 440.64.00, Python 2.7.5
See https://gist.github.com/Flamefire/112fd2d6f307f635bce38a8bafb10da0 for a full test report.

@boegel
Copy link
Member

boegel commented Nov 8, 2023

@boegelbot please test @ generoso
CORE_CNT=16

@boegelbot
Copy link
Collaborator

@boegel: Request for testing this PR well received on login1

PR test command 'EB_PR=19133 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_19133 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 12123

Test results coming soon (I hope)...

- notification for comment with ID 1801518613 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Copy link
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
cnx1 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/dedb19e8c2a01fecb2f9e2bded94d1cf for a full test report.

@boegel
Copy link
Member

boegel commented Nov 8, 2023

Test report by @boegel
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
node3145.skitty.os - Linux RHEL 8.8, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz (skylake_avx512), Python 3.6.8
See https://gist.github.com/boegel/2ae5d3fbc91fedc0cba52879988ab51d for a full test report.

Copy link
Member

@boegel boegel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@boegel
Copy link
Member

boegel commented Nov 9, 2023

Going in, thanks @Flamefire!

@boegel boegel merged commit 08a35ed into easybuilders:develop Nov 9, 2023
9 checks passed
@Flamefire Flamefire deleted the 20231101175633_new_pr_PyTorch1131 branch November 10, 2023 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants