Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-3666] Extract interfaces for EdgeRDD and VertexRDD #2530

Closed
wants to merge 8 commits into from

Conversation

ankurdave
Copy link
Contributor

This discourages users from calling the VertexRDD and EdgeRDD constructor and makes it easier for future changes to ensure backward compatibility.

This breaks binary compatibility with the error

    java.lang.IncompatibleClassChangeError: Found interface
    org.apache.spark.graphx.VertexRDD, but class was expected

The solution is to do a clean rebuild (sbt/sbt clean assembly).
@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have started for PR 2530 at commit 620e603.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have finished for PR 2530 at commit 620e603.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait EdgeRDD[@specialized ED, VD] extends RDD[Edge[ED]]
    • trait VertexRDD[@specialized VD] extends RDD[(VertexId, VD)]
    • class EdgeRDDImpl[@specialized ED: ClassTag, VD: ClassTag](
    • class VertexRDDImpl[@specialized VD](

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20777/

}, preservesPartitioning = true))
}

trait EdgeRDD[@specialized ED, VD] extends RDD[Edge[ED]] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should probably remove the specialized and test performance. maybe we can do that in a a separate pr

@rxin
Copy link
Contributor

rxin commented Sep 27, 2014

Looks good. I didn't look closely but I'm assuming you didn't change any impl logic. Let me know if that is not true.

@SparkQA
Copy link

SparkQA commented Oct 9, 2014

QA tests have started for PR 2530 at commit 9ba4ec4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Oct 9, 2014

QA tests have finished for PR 2530 at commit 9ba4ec4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait EdgeRDD[@specialized ED, VD] extends RDD[Edge[ED]]
    • trait VertexRDD[@specialized VD] extends RDD[(VertexId, VD)]

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21554/Test PASSed.

@ankurdave
Copy link
Contributor Author

@rxin Right, this just contains refactoring and doc changes. I marked the Impl constructors package-private since they are accessed by the corresponding interface object (EdgeRDD.fromEdgePartitions and VertexRDD.apply). I'll evaluate specialization separately. PTAL

@rxin
Copy link
Contributor

rxin commented Oct 28, 2014

This LGTM.

One question related to this - should we use abstract class instead of trait? It would be better for maintaining binary compatibility across versions

@SparkQA
Copy link

SparkQA commented Nov 1, 2014

QA tests have started for PR 2530 at commit 931b587.

  • This patch does not merge cleanly!

It makes no difference in benchmarks.
@SparkQA
Copy link

SparkQA commented Nov 1, 2014

QA tests have started for PR 2530 at commit cbe15f2.

  • This patch does not merge cleanly!

Fixes a merge conflict created by ee29ef3.

Conflicts:
	graphx/src/main/scala/org/apache/spark/graphx/EdgeRDD.scala
@SparkQA
Copy link

SparkQA commented Nov 1, 2014

QA tests have finished for PR 2530 at commit 931b587.

  • This patch fails unit tests.
  • This patch does not merge cleanly!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22704/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22716 has started for PR 2530 at commit 24201d4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

QA tests have finished for PR 2530 at commit cbe15f2.

  • This patch fails unit tests.
  • This patch does not merge cleanly!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22714/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22716 has finished for PR 2530 at commit 24201d4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22716/
Test FAILed.

@rxin
Copy link
Contributor

rxin commented Nov 2, 2014

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22741 has started for PR 2530 at commit 24201d4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22741 has finished for PR 2530 at commit 24201d4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class EdgeRDD[ED, VD](
    • abstract class VertexRDD[VD](

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22741/
Test FAILed.

@rxin
Copy link
Contributor

rxin commented Nov 2, 2014

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22754 has started for PR 2530 at commit 24201d4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 2, 2014

Test build #22754 has finished for PR 2530 at commit 24201d4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class EdgeRDD[ED, VD](
    • abstract class VertexRDD[VD](

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22754/
Test FAILed.

@SparkQA
Copy link

SparkQA commented Nov 11, 2014

Test build #516 has started for PR 2530 at commit 24201d4.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 11, 2014

Test build #516 has finished for PR 2530 at commit 24201d4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class EdgeRDD[ED, VD](
    • abstract class VertexRDD[VD](

@rxin
Copy link
Contributor

rxin commented Nov 11, 2014

I filed a JIRA for Mima misreporting: https://issues.apache.org/jira/browse/SPARK-4335

cc @pwendell @ScrapCodes

@rxin
Copy link
Contributor

rxin commented Nov 11, 2014

Also @ankurdave this needs to be updated now...

Conflicts:
	graphx/src/main/scala/org/apache/spark/graphx/VertexRDD.scala
@SparkQA
Copy link

SparkQA commented Nov 11, 2014

Test build #23197 has started for PR 2530 at commit 1472390.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 11, 2014

Test build #23197 has finished for PR 2530 at commit 1472390.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23197/
Test FAILed.

@pwendell
Copy link
Contributor

I took a look and I think MIMA might be reporting a semi-legitimate issue - though this is a tricky situation. I think the issue is - if someone extended the earlier version of EdgeRDD, their class would have inherited the concrete getPartitions call that was present in the earlier version. Now EdgeRDD doesn't have this getPartitions function (it has been moved to the concrete version). I think in general it's fine because this is defined in the parent RDD class, but maybe MIMA doesn't check whether the function is defined further up in the inheritance chain.

In terms of compatiblity for callers, this is a protected method so it really doesn't matter.

I'd go ahead and just add ignores for this and be done with it. Fixing this in MIMA might be diffiuclt

@SparkQA
Copy link

SparkQA commented Nov 12, 2014

Test build #23254 has started for PR 2530 at commit d681f45.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 12, 2014

Test build #23254 has finished for PR 2530 at commit d681f45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • abstract class EdgeRDD[ED, VD](
    • abstract class VertexRDD[VD](

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23254/
Test PASSed.

@ankurdave
Copy link
Contributor Author

@rxin MIMA passed! PTAL.

@rxin
Copy link
Contributor

rxin commented Nov 12, 2014

Merging in master & branch-1.2. Thanks!

@asfgit asfgit closed this in a5ef581 Nov 12, 2014
asfgit pushed a commit that referenced this pull request Nov 12, 2014
This discourages users from calling the VertexRDD and EdgeRDD constructor and makes it easier for future changes to ensure backward compatibility.

Author: Ankur Dave <[email protected]>

Closes #2530 from ankurdave/SPARK-3666 and squashes the following commits:

d681f45 [Ankur Dave] Define getPartitions and compute in abstract class for MIMA
1472390 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into SPARK-3666
24201d4 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into SPARK-3666
cbe15f2 [Ankur Dave] Remove specialized annotation from VertexRDD and EdgeRDD
931b587 [Ankur Dave] Use abstract class instead of trait for binary compatibility
9ba4ec4 [Ankur Dave] Mark (Vertex|Edge)RDDImpl constructors package-private
620e603 [Ankur Dave] Extract VertexRDD interface and move implementation to VertexRDDImpl
55b6398 [Ankur Dave] Extract EdgeRDD interface and move implementation to EdgeRDDImpl

(cherry picked from commit a5ef581)
Signed-off-by: Reynold Xin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants