spark: init 3.2.1 and test on aarch64-linux #160075
@thoughtpolice @offlinehacker @kamilchm @illustris Would you mind taking a look? Briefly: I factored out the
Thanks! The diff looks good. I'll try running some workloads on aarch64 in a few hours.
My thought was that I'd rather have a separate PR that changes which version spark3 aliases. Does that make sense? I'm not familiar with the best practices for bumping to a new minor release. EDIT: I think I'd prefer to keep it at Spark 3.1.2, since there's more tooling built around that version than the newer one. For example, AWS Glue and AWS EMR both use 3.1.2. I'd hate for an end user to develop locally against a newer version and then find out at deploy time that they used a new feature or that the runtime behavior is different. EDIT2: Although I am not a maintainer, so I defer to your judgement!
I'm not sure the versions of AWS managed services should be a consideration for something that can also be deployed as a self-managed service. 3.1.x is also a stable release, and it is still available in nixpkgs for anyone to use, but having spark/spark3 default to the latest stable release makes more sense. This change would also need an entry in the release notes. The error messages you're seeing on running spark-shell are probably because the
I ran NixOS tests with the updated package on x86_64. Everything looks good. It should work the same on aarch64, as there is no native code in spark. Finally, this isn't a big deal, but could you change the package name from
That's a good point! I updated it so spark3 points to the latest.
Good catch! I forgot to do this for my hadoop PR... any ideas on the best way to get those changes logged? Since
You're right, I think my shell was funky. I'm unable to reproduce it now.
Yep! I changed it so it matches the versioning used for hadoop and took your recommendation to drop the point release from the name. Thank you so much for the awesome feedback!
Yes, it's definitely worth mentioning the addition of aarch64 support to hadoop and spark. You could just add that to this PR.
b2a742c to ee1ff07
Just amended my commit to include the notes about hadoop and R. Thank you again for all of your help and feedback @illustris!
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/tweag-nix-dev-update-26/18252/1
Motivation for this change
There is a new version of Spark available (3.2.1).
Additionally, assuming #158613 gets merged, aarch64-linux will be a supported platform, so we should test on it.
Things done
Added Spark 3.2.1, set it as the default for spark, added a spark3 alias, and enabled testing on aarch64-linux.
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)
spark-shell was able to successfully launch
nixos/doc/manual/md-to-db.sh to update generated release notes
On aarch64-linux (using the set of changes in #158613) and x86_64-linux I see the following when opening the spark2 shell:
I also saw this behavior on x86_64-linux:
but not on aarch64-darwin:
The shell still seems to work, though, so I'm not sure if this is a problem. I can't tell whether this is just because the OpenJDK version differs across these platforms, or if it's something else.