{ai}[foss/2022a] DGL v1.1.3 w/ CUDA 11.7.0 #20092

Open
wants to merge 4 commits into
base: develop

sassy-crick
Collaborator

@sassy-crick sassy-crick commented Mar 12, 2024

…DGL-1.1.3_use_externals_instead_of_submodules.patch
@sassy-crick
Collaborator Author

@boegelbot please test @ generoso

Contributor

@Micket Micket left a comment

Missing checksums

@boegelbot
Collaborator

@sassy-crick: Request for testing this PR well received on login1

PR test command 'EB_PR=20092 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_20092 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13247

Test results coming soon (I hope)...

- notification for comment with ID 2034644413 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@sassy-crick
Collaborator Author

These are git commits; can we do reproducible checksums here already? I thought that was a feature for EB 5.0, no?

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
cns1 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/f2b590af38dbb665c07effd24c6584ea for a full test report.

@Micket
Contributor

Micket commented Apr 3, 2024

These are git commits; can we do reproducible checksums here already?

No

I thought that was a feature for EB 5.0, no?

Yes.

Missing checksums should be replaced with None; see the previous easyconfig.
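
To make the suggestion concrete, here is a minimal sketch of what a `None` checksum entry could look like in an easyconfig. The METIS entry is copied from the lines under review; the second source, its `git_config` values, and the checksum placeholder string are hypothetical and only illustrate the pattern.

```python
# Hypothetical easyconfig fragment; easyconfigs are written in Python syntax.
sources = [
    {
        # taken from the lines under review in this PR
        'source_urls': ['https://github.com/KarypisLab/METIS/archive'],
        'download_filename': 'v5.2.1.tar.gz',
        'filename': 'metis-5.2.1.tar.gz',
    },
    {
        # hypothetical source cloned at a pinned commit (all values are placeholders)
        'filename': 'example-20240101.tar.gz',
        'git_config': {
            'url': 'https://github.com/example-org',
            'repo_name': 'example',
            'commit': '0123456789abcdef0123456789abcdef01234567',
        },
    },
]

# One entry per source, in the same order as `sources`.
checksums = [
    # placeholder only; a real easyconfig carries the actual SHA256 here
    {'metis-5.2.1.tar.gz': '<sha256 of the downloaded tarball>'},
    # tarball created from a git clone is not reproducible yet, so no checksum
    None,
]
```

Keeping an explicit `None` entry keeps the `checksums` list aligned with `sources`, so the checksums of the remaining sources still line up.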

Comment on lines +26 to +29
'source_urls': ['https://github.com/KarypisLab/METIS/archive'],
'download_filename': 'v5.2.1.tar.gz',
'filename': 'metis-5.2.1.tar.gz',
'extract_cmd': "tar -C %(namelower)s-%(version)s/third_party/METIS --strip-components=1 -xf %s",
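
As a rough illustration of what the `extract_cmd` above expands to, the sketch below resolves the named templates first and the positional `%s` second. It assumes the easyconfig's name and version are DGL and 1.1.3 (from the PR title); `resolve_templates` is a hypothetical helper that only mimics this two-step substitution and is not EasyBuild's own code, and the tarball path is a placeholder.

```python
import re


def resolve_templates(value, templates):
    """Fill in %(name)s templates while leaving a bare %s untouched."""
    # Double every '%' that does not start a named template or an escaped '%%',
    # so the positional %s survives the first round of formatting.
    escaped = re.sub(r'%(?![%(])', '%%', value)
    return escaped % templates


extract_cmd = "tar -C %(namelower)s-%(version)s/third_party/METIS --strip-components=1 -xf %s"

# Step 1: named templates (values assumed from the PR title).
cmd = resolve_templates(extract_cmd, {'namelower': 'dgl', 'version': '1.1.3'})

# Step 2: the remaining %s takes the path of the downloaded tarball (placeholder path).
final_cmd = cmd % '/tmp/metis-5.2.1.tar.gz'
print(final_cmd)
```

The effect is that the METIS tarball is unpacked straight into the `third_party/METIS` directory of the already extracted DGL sources, with its top-level directory stripped.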
Contributor

Hm, as with the rest of the third-party components: we already have easyconfigs for METIS, nanoflann and such, and I don't think anything is stopping us from adding CCCL and the rest as well. So I'm not sure why these were kept in as sources. @akesandgren, any comment?

Collaborator Author

Yes, I agree, we already have easyconfigs for them. However, I could not get the build working with external builds. Also, it appears that upstream pins them to specific commits. So in the end I decided to fall back to this approach, but I am happy to switch to the existing easyconfigs if somebody can show me how to do that without unpicking everything.

Contributor

@akesandgren akesandgren Apr 5, 2024

The METIS they use is patched in some incompatible way.
The EC I made does have nanoflann as a dependency.

And since they patch METIS, I didn't even consider using an external GKlib-METIS.

Collaborator Author

I could try and see if nanoflann still works as a dependency. To be honest, at some point I decided to just get it working reproducibly instead of spending too much time flogging what looked like a dead horse to me.

@sassy-crick
Collaborator Author

@boegelbot please test @ generoso

@boegelbot
Collaborator

@sassy-crick: Request for testing this PR well received on login1

PR test command 'EB_PR=20092 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_20092 --ntasks=4 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 13261

Test results coming soon (I hope)...

- notification for comment with ID 2040697202 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
cns2 - Linux Rocky Linux 8.9, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/a406de596c2ae3c584ffbdda7f1d4683 for a full test report.

@boegel
Member

boegel commented May 22, 2024

We should check for overlap/differences with the DGL in #18359...

@pavelToman
Contributor

@boegelbot please test @ jsc-zen3-a100

@pavelToman
Contributor

DGL-1.1.3-GCC-12.3.0-CUDA-12.1.1 seems ok: #20768 (comment)

@boegelbot
Collaborator

@pavelToman: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20092 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20092 --ntasks=8 --partition=jsczen3g --gres=gpu:1 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 4739

Test results coming soon (I hope)...

- notification for comment with ID 2304776224 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen3g1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.4, x86_64, AMD EPYC-Milan Processor (zen3), 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.42.06, Python 3.9.18
See https://gist.github.com/boegelbot/1acfa8618b604e89822e0943320cdcba for a full test report.
