
[CELEBORN-1641] Pack the output spark client jar with the major.minor version #2881

Open
wants to merge 7 commits into main

Conversation

@zaynt4606 (Contributor) commented Nov 5, 2024

What changes were proposed in this pull request?

Pack the output Spark client jar with the Spark major.minor version, as described in the title.

Why are the changes needed?

To show, in the client jar name, the Spark version the client was built against, as is already done for the Flink client.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Verified by packaging with both sbt and Maven (screenshots of the resulting jar names were attached to the original PR).

@zaynt4606 changed the title from "add spark version" to "[CELEBORN-1641]Pack the output spark client jar with the mid version" on Nov 6, 2024
@zaynt4606 changed the title from "[CELEBORN-1641]Pack the output spark client jar with the mid version" to "[CELEBORN-1641]Pack the output spark client jar with the major.minor version" on Nov 6, 2024
@zaynt4606 zaynt4606 marked this pull request as ready for review November 8, 2024 02:44
@@ -67,7 +67,7 @@
   <maven.version>3.9.9</maven.version>
 
   <flink.version>1.14.6</flink.version>
-  <spark.version>3.3.4</spark.version>
+  <spark.version>3.4.4</spark.version>
@zaynt4606 (Contributor, Author) commented:

The IDEA Maven plugin prioritizes the outer properties in the artifactId.
Therefore, we need to update the version in the pom file when using the IDEA Maven plugin.
This update does not impact the compilation process.

A Member replied:

The spark version format should be majorVersion.minorVersion.bugfixVersion

How about using spark.binary.version property with majorVersion.minorVersion?
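
For illustration, here is a minimal sketch of how such a property could be wired into a Maven pom. The spark.binary.version name comes from the suggestion above; the hand-maintained value, the scala.binary.version property, and the finalName pattern are illustrative assumptions rather than Celeborn's actual build configuration.

  <properties>
    <!-- full Spark version, used to resolve Spark dependencies -->
    <spark.version>3.4.4</spark.version>
    <!-- assumed: major.minor only, kept in sync with spark.version by hand -->
    <spark.binary.version>3.4</spark.binary.version>
    <!-- assumed Scala binary version for the example jar name below -->
    <scala.binary.version>2.12</scala.binary.version>
  </properties>

  <build>
    <!-- hypothetical jar name, e.g. celeborn-client-spark-3.4_2.12-<project.version>.jar -->
    <finalName>celeborn-client-spark-${spark.binary.version}_${scala.binary.version}-${project.version}</finalName>
  </build>

A name built from major.minor avoids implying that the jar only works with one exact patch release.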

@zaynt4606 (Contributor, Author) replied:
I am fine with it~

@zaynt4606 also commented on build/make-distribution.sh (review thread now outdated and resolved).
@turboFei changed the title from "[CELEBORN-1641]Pack the output spark client jar with the major.minor version" to "[CELEBORN-1641] Pack the output spark client jar with the major.minor version" on Nov 8, 2024
@pan3793 (Member) left a review comment:

why

@zaynt4606 (Contributor, Author) replied:

why

To display the Spark version the client was compiled against, as is already done for client-flink.


pan3793 commented Nov 8, 2024

Then we would ship dozens of Spark client jars in the final release binary tgz, even though they are virtually the same jar?

I don't know how frequently the Flink API changes, but the Spark side API we use is pretty stable. To record the build env info, you can follow CELEBORN-1658's approach.
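
For reference, one generic Maven pattern for recording build environment info is resource filtering of a properties template at packaging time; this is only an illustrative sketch, and not necessarily the approach CELEBORN-1658 actually takes.

  <!-- pom.xml: substitute ${...} placeholders in bundled resources when the jar is built -->
  <build>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
        <!-- a hypothetical template, e.g. celeborn-build-info.properties, could contain:
               build.celeborn.version=${project.version}
               build.spark.version=${spark.version}
             and the placeholders are filled in during packaging -->
        <filtering>true</filtering>
      </resource>
    </resources>
  </build>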

@zaynt4606 (Contributor, Author) replied:

Then we would ship dozens of Spark client jars in the final release binary tgz, even though they are virtually the same jar?

I don't know how frequently the Flink API changes, but the Spark side API we use is pretty stable. To record the build env info, you can follow CELEBORN-1658's approach.

There will still be only one Spark client jar; we just want to show the Spark version info in the jar name.

Our own Spark distribution needs this information, and I think it would be nice for the community to see it in the client jar name as well. In fact, the jars are all the same.


pan3793 commented Nov 8, 2024

From a user perspective, I would think that abc-spark-3.4.3-xyz.jar is only applicable to Spark 3.4.3.

@zaynt4606 (Contributor, Author) replied:

From a user perspective, I would think that abc-spark-3.4.3-xyz.jar is only applicable to Spark 3.4.3.

Yes, this pull request needs to explain this in the documentation. And there is a trade-off between providing this information and ensuring that it remains easy for users to understand. I'd like to know what the community thinks about this.


pan3793 commented Nov 8, 2024

And there is a trade-off between providing this information and ensuring that it remains easy for users to understand.

The jar name should reflect the compatible runtime Spark versions rather than the exact Spark version at build time.

In short, I'm -1 on this change, and I have already provided an alternative solution:

To record the build env info, you can follow CELEBORN-1658's approach.


RexXiong commented Nov 8, 2024

And there is a trade-off between providing this information and ensuring that it remains easy for users to understand.

The jar name should reflect the compatible runtime Spark versions rather than the exact Spark version at build time.

In short, I'm -1 on this change, and I have already provided an alternative solution:

To record the build env info, you can follow CELEBORN-1658's approach.

If that's the case, can we remove all the spark-3.x profiles in the pom and only keep one? These profiles are very confusing.


pan3793 commented Nov 8, 2024

can we remove all the spark-3.x profiles in the pom and only keep one?

I don't oppose that, as long as you build an integration test pipeline that covers all supported Spark versions before removing them. For example, the released Kyuubi Spark/Flink engine jar does not contain the Spark/Flink engine version, and it guarantees that the packaged jar can run across all supported Spark/Flink versions via https://github.com/apache/kyuubi/blob/b4838b40e6a9074697918ccecd3dfe71cc52442d/.github/workflows/master.yml#L63-L82


RexXiong commented Nov 8, 2024

can we remove all the spark-3.x profiles in the pom and only keep one?

I don't oppose that, as long as you build an integration test pipeline that covers all supported Spark versions before removing them. For example, the released Kyuubi Spark/Flink engine jar does not contain the Spark/Flink engine version, and it guarantees that the packaged jar can run across all supported Spark/Flink versions via https://github.com/apache/kyuubi/blob/b4838b40e6a9074697918ccecd3dfe71cc52442d/.github/workflows/master.yml#L63-L82

Thanks. It seems we need to use the to-be-released version to verify all supported Spark versions, at least during the testing process.
