
[CELEBORN-1641] Pack the output spark client jar with the major.minor version #2881

Open
wants to merge 7 commits into main

Conversation

@zaynt4606 (Contributor) commented Nov 5, 2024

What changes were proposed in this pull request?

Pack the output Spark client jar with the Spark major.minor version, as described in the title.

Why are the changes needed?

To show, in the client jar name, the Spark version the client was built against, as is already done for the Flink client.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Verified by packaging with both sbt and Maven (screenshots of the resulting jar names were attached to the original PR).

@zaynt4606 changed the title from "add spark version" to "[CELEBORN-1641]Pack the output spark client jar with the mid version" on Nov 6, 2024
@zaynt4606 changed the title from "[CELEBORN-1641]Pack the output spark client jar with the mid version" to "[CELEBORN-1641]Pack the output spark client jar with the major.minor version" on Nov 6, 2024
@zaynt4606 zaynt4606 marked this pull request as ready for review November 8, 2024 02:44
@@ -67,7 +67,7 @@
   <maven.version>3.9.9</maven.version>
 
   <flink.version>1.14.6</flink.version>
-  <spark.version>3.3.4</spark.version>
+  <spark.version>3.4.4</spark.version>
@zaynt4606 (Contributor, Author) commented:

The IDEA Maven plugin prioritizes the outer properties in the artifactId.
Therefore, we need to update the version in the pom file when using the IDEA Maven plugin.
This update does not impact the compilation process.

A Member replied:

The spark version format should be majorVersion.minorVersion.bugfixVersion

How about using spark.binary.version property with majorVersion.minorVersion?
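
For illustration, here is a minimal sketch of how such a property could be wired into a Maven pom. The spark.binary.version name comes from the suggestion above; the hand-maintained value, the scala.binary.version property, and the finalName pattern are illustrative assumptions rather than Celeborn's actual build configuration.

  <properties>
    <!-- full Spark version, used to resolve Spark dependencies -->
    <spark.version>3.4.4</spark.version>
    <!-- assumed: major.minor only, kept in sync with spark.version by hand -->
    <spark.binary.version>3.4</spark.binary.version>
    <!-- assumed Scala binary version for the example jar name below -->
    <scala.binary.version>2.12</scala.binary.version>
  </properties>

  <build>
    <!-- hypothetical jar name, e.g. celeborn-client-spark-3.4_2.12-<project.version>.jar -->
    <finalName>celeborn-client-spark-${spark.binary.version}_${scala.binary.version}-${project.version}</finalName>
  </build>

A name built from major.minor avoids implying that the jar only works with one exact patch release.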

@zaynt4606 (Contributor, Author) replied:
I am fine with it~

@zaynt4606 also commented on build/make-distribution.sh (review thread now outdated and resolved).
@turboFei changed the title from "[CELEBORN-1641]Pack the output spark client jar with the major.minor version" to "[CELEBORN-1641] Pack the output spark client jar with the major.minor version" on Nov 8, 2024
@pan3793 (Member) left a review comment:

why

@zaynt4606 (Contributor, Author) replied:

why

To display the Spark version the client was compiled against, as is already done for client-flink.


pan3793 commented Nov 8, 2024

Then we would ship dozens of Spark client jars in the final release binary tgz, even though they are virtually the same jar?

I don't know how frequently the Flink API changes, but the Spark side API we use is pretty stable. To record the build env info, you can follow CELEBORN-1658's approach.
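
For reference, one generic Maven pattern for recording build environment info is resource filtering of a properties template at packaging time; this is only an illustrative sketch, and not necessarily the approach CELEBORN-1658 actually takes.

  <!-- pom.xml: substitute ${...} placeholders in bundled resources when the jar is built -->
  <build>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
        <!-- a hypothetical template, e.g. celeborn-build-info.properties, could contain:
               build.celeborn.version=${project.version}
               build.spark.version=${spark.version}
             and the placeholders are filled in during packaging -->
        <filtering>true</filtering>
      </resource>
    </resources>
  </build>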

@zaynt4606 (Contributor, Author) replied:

Then we would ship dozens of Spark client jars in the final release binary tgz, even though they are virtually the same jar?

I don't know how frequently the Flink API changes, but the Spark side API we use is pretty stable. To record the build env info, you can follow CELEBORN-1658's approach.

There will still be only one Spark client jar; we just want to show the Spark version info in the jar name.

Our own Spark distribution needs this information, and I think it would be nice for the community to see it in the client jar name as well. In fact, the jars are all the same.


pan3793 commented Nov 8, 2024

From a user perspective, I would think that abc-spark-3.4.3-xyz.jar is only applicable to Spark 3.4.3.

@zaynt4606 (Contributor, Author) replied:

From a user perspective, I would think that abc-spark-3.4.3-xyz.jar is only applicable to Spark 3.4.3.

Yes, this pull request needs to explain this in the documentation. And there is a trade-off between providing this information and ensuring that it remains easy for users to understand. I'd like to know what the community thinks about this.


pan3793 commented Nov 8, 2024

And there is a trade-off between providing this information and ensuring that it remains easy for users to understand.

The jar name should reflect the compatible runtime Spark versions rather than the exact Spark version at build time.

In short, I'm -1 on this change, and I have already provided an alternative solution:

To record the build env info, you can follow CELEBORN-1658's approach.


RexXiong commented Nov 8, 2024

And there is a trade-off between providing this information and ensuring that it remains easy for users to understand.

The jar name should reflect the compatible runtime Spark versions rather than the exact Spark version at build time.

In short, I'm -1 on this change, and I have already provided an alternative solution:

To record the build env info, you can follow CELEBORN-1658's approach.

If that's the case, can we remove all the spark-3.x profiles in the pom and only keep one? These profiles are very confusing.


pan3793 commented Nov 8, 2024

can we remove all the spark-3.x profiles in the pom and only keep one?

I don't oppose that, as long as you build an integration test pipeline that covers all supported Spark versions before removing them. For example, the released Kyuubi Spark/Flink engine jar does not contain the Spark/Flink engine version, and it guarantees that the packaged jar can run across all supported Spark/Flink versions via https://github.com/apache/kyuubi/blob/b4838b40e6a9074697918ccecd3dfe71cc52442d/.github/workflows/master.yml#L63-L82


RexXiong commented Nov 8, 2024

can we remove all the spark-3.x profiles in the pom and only keep one?

I don't oppose that, as long as you build an integration test pipeline that covers all supported Spark versions before removing them. For example, the released Kyuubi Spark/Flink engine jar does not contain the Spark/Flink engine version, and it guarantees that the packaged jar can run across all supported Spark/Flink versions via https://github.com/apache/kyuubi/blob/b4838b40e6a9074697918ccecd3dfe71cc52442d/.github/workflows/master.yml#L63-L82

Thanks. It seems we need to use the to-be-released version to verify all supported Spark versions, at least during the testing process.
