
Allow spark dependency to be configured dynamically #1326

Merged
merged 6 commits into from
Sep 6, 2024

Conversation

amahussein
Collaborator

@amahussein amahussein commented Sep 3, 2024

Signed-off-by: Ahmed Hussein [email protected]

Fixes #1316

Allow user-tools to pick the Spark dependencies based on a runtime env_var. The value follows the same format as buildver in the Scala pom file.
Currently 333 and 350 (the default) are supported.
If the user specifies an invalid value, a warning message is printed and the process then fails when running the java command.
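For example, a user could select the Spark 3.3.3 dependency set for a single run by exporting the variable before invoking the tool (the wrapper command shown in the comment is illustrative, not taken from this PR):

```shell
# Select the Spark 3.3.3 dependency set via the runtime env_var.
export RAPIDS_USER_TOOLS_SPARK_DEP_VERSION=333

# Then run the wrapper as usual, e.g. (illustrative command line):
# spark_rapids qualification --platform onprem ...
```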

Changes

  • Add a dependency key to the platform config-file
  • A platform can define its own default dependency versions using the activeBuildVer key
  • Add a default RUNTIME_BUILDVER in __init__.py to allow upgrading the Spark release during official releases
  • Read an env_var RAPIDS_USER_TOOLS_SPARK_DEP_VERSION to pick the correct dependency.
  • If the env_var is set, it overrides any other value defined in the platform config. Otherwise, the user would have no way to override a pre-configured platform.
  • Currently, only 333 and 350 are supported. The default is 350.
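The resolution order the bullets describe (env_var wins over the platform's activeBuildVer, which wins over the packaged default) could be sketched as below. The function name and the dict-based platform config are illustrative, not the actual user-tools implementation:

```python
import os

# Packaged default, analogous to RUNTIME_BUILDVER in __init__.py.
DEFAULT_BUILDVER = '350'

def resolve_spark_dep_version(platform_conf: dict) -> str:
    """Pick the Spark dependency buildver.

    Precedence: RAPIDS_USER_TOOLS_SPARK_DEP_VERSION env_var, then the
    platform's activeBuildVer key, then the packaged default.
    """
    env_val = os.environ.get('RAPIDS_USER_TOOLS_SPARK_DEP_VERSION')
    if env_val:
        return env_val
    return platform_conf.get('activeBuildVer', DEFAULT_BUILDVER)
```

With no env_var set, a platform config of `{'activeBuildVer': '333'}` resolves to `333`; setting the env_var overrides that, which is what makes a pre-configured platform overridable by the user.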

Docs changes

  • Will file an internal issue to update the documentation highlighting the usage of RAPIDS_USER_TOOLS_SPARK_DEP_VERSION

Possible followups

  • Fail early if the dependencies are not defined for the specified key
  • Add more dependency entries to cover more Spark releases (Spark 4.x, etc.)
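The first follow-up (failing early on an undefined key instead of failing later in the java command) could be a simple check at resolution time. This is a hypothetical sketch with placeholder dependency data, not the repository's code:

```python
# Hypothetical fail-early check: reject a buildver that has no dependency
# entry before launching the java command. The dependency lists here are
# placeholders; the real config maps buildver keys to artifact definitions.
SUPPORTED_DEPS = {'333': [...], '350': [...]}

def validate_buildver(buildver: str) -> str:
    """Return the buildver unchanged, or raise if no dependencies exist for it."""
    if buildver not in SUPPORTED_DEPS:
        raise ValueError(
            f'No Spark dependencies defined for buildver {buildver!r}; '
            f'supported values: {sorted(SUPPORTED_DEPS)}')
    return buildver
```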

@amahussein amahussein added feature request New feature or request user_tools Scope the wrapper module running CSP, QualX, and reports (python) labels Sep 3, 2024
@amahussein amahussein self-assigned this Sep 3, 2024
parthosa
parthosa previously approved these changes Sep 4, 2024
Collaborator

@parthosa parthosa left a comment


Thanks @amahussein. Tested the changes. LGTM.

Signed-off-by: Ahmed Hussein <[email protected]>
Collaborator

@parthosa parthosa left a comment


Thanks @amahussein. LGTM

Collaborator

@tgravescs tgravescs left a comment


lgtm. Thanks!

Can we also make sure to document the env variables?

@amahussein amahussein merged commit 4747d14 into NVIDIA:dev Sep 6, 2024
14 checks passed
@amahussein amahussein deleted the rapids-tools-1316 branch September 6, 2024 18:49
amahussein added a commit to amahussein/spark-rapids-tools that referenced this pull request Sep 24, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Followup on NVIDIA#1326 to set the default Spark version to 3.4.2 for onPrem
to avoid the bug described in NVIDIA#1316 without requiring any action on
the customer side.
amahussein added a commit that referenced this pull request Sep 24, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Followup on #1326 to set the default Spark version to 3.4.2 for onPrem
to avoid the bug described in #1316 without requiring any action on
the customer side.
Development

Successfully merging this pull request may close these issues.

[BUG] Event parsing error: String length (...) exceeds the maximum length (20000000)