[SPARK-18592][ML] Move DT/RF/GBT Param setter methods to subclasses #16017

Closed
wants to merge 3 commits into from

Conversation

yanboliang
Contributor

What changes were proposed in this pull request?

Mainly two changes:

  • Move DT/RF/GBT Param setter methods to subclasses.
  • Deprecate corresponding setter methods in the model classes.

See discussion here #15913 (comment).

How was this patch tested?

Existing tests.

@yanboliang
Contributor Author

cc @jkbradley

@SparkQA

SparkQA commented Nov 26, 2016

Test build #69187 has finished for PR 16017 at commit 39cbf42.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* [[org.apache.spark.SparkContext]].
* Must be >= 1.
* (default = 10)
* @group setParam
Member

Just curious: repeating this is kind of tedious. I thought it inherits the doc from the superclass, does it not?

Member

Good point: we can inherit docs.

Contributor Author

@yanboliang yanboliang Nov 28, 2016

This change was suggested at #15913 (comment), since Param setter methods in traits used to have the wrong type in Java. We would like to remove the setter methods from the traits, and it also does not make sense to have them in the Model classes. Instead, we can put the setter methods in each subclass and then deprecate them in the Model classes.

So we will remove the setter methods from all traits (in release 2.2), and then we cannot inherit docs from the traits. BTW, the current change is consistent with other ML algorithms that inherit traits. I'd like to know whether I understand the problem correctly. Thanks. @jkbradley
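
The layout described above can be sketched without any Spark dependency. This is only an illustration of the intended end state, not the PR's actual code: TreeParamsSketch, TreeClassifierSketch, and TreeModelSketch are hypothetical stand-ins for the shared trait, an Estimator, and a Model. The setter is declared in the concrete Estimator class, so its `this.type` return type is tied to that class, and its doc comment has to be written there because there is no trait setter to inherit it from.

```scala
// Hypothetical stand-ins for illustration; these are not Spark classes.
trait TreeParamsSketch {
  // Shared state and getter stay in the trait.
  protected var maxDepthValue: Int = 5
  def getMaxDepth: Int = maxDepthValue
}

class TreeModelSketch(depth: Int) extends TreeParamsSketch {
  // The Model only exposes the getter; it has no setter at all.
  maxDepthValue = depth
}

class TreeClassifierSketch extends TreeParamsSketch {
  /** Maximum depth of the tree. Documented here, not in the trait. */
  def setMaxDepth(value: Int): this.type = {
    maxDepthValue = value
    this
  }

  def fit(): TreeModelSketch = new TreeModelSketch(maxDepthValue)
}

object SetterSketch {
  def main(args: Array[String]): Unit = {
    // Setters chain on the concrete Estimator type.
    val model = new TreeClassifierSketch().setMaxDepth(3).fit()
    println(model.getMaxDepth)
  }
}
```

The trade-off is the one raised in this thread: each subclass repeats the setter and its documentation, in exchange for a setter that is typed on the concrete class and absent from the Models.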

Member

Right, that would make the setter appear in models. Your way is better.

* @deprecated This method is deprecated and will be removed in 2.2.0.
* @group setParam
*/
@deprecated("This method is deprecated and will be removed in 2.2.0.", "2.1.0")
Member

Does this appear as deprecated in the Estimator classes? If so, then can you update the comment to make it clear it is only deprecated for Models?

Member

Actually, if this is a problem, we could mark the setters in Models as deprecated and leave the treeParams alone since the treeParams traits are private.

Contributor Author

@yanboliang yanboliang Nov 29, 2016

This will not appear in the Estimator classes, only in the Models. This can be confirmed in the following ways:
The IDE flags the methods as deprecated only for Models:
[image]
We can also confirm it in the Scala API docs:
DecisionTreeClassifier
[image]
DecisionTreeClassificationModel
[image]
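
A plausible reason for this behaviour is that Scala attaches a deprecation warning to the member a call statically resolves to. The sketch below uses hypothetical names (SharedParamsSketch, EstimatorSketch, and ModelSketch are not Spark classes) and assumes the shared trait carries the @deprecated setter while only the Estimator overrides it without the annotation.

```scala
// Hypothetical stand-ins for illustration; these are not Spark classes.
trait SharedParamsSketch {
  protected var seedValue: Long = 0L
  def getSeed: Long = seedValue

  @deprecated("This method is deprecated and will be removed in 2.2.0.", "2.1.0")
  def setSeed(value: Long): this.type = { seedValue = value; this }
}

class EstimatorSketch extends SharedParamsSketch {
  // Overridden without @deprecated: calls on an Estimator resolve to this member.
  override def setSeed(value: Long): this.type = { seedValue = value; this }
}

// The Model inherits the deprecated trait setter unchanged.
class ModelSketch extends SharedParamsSketch

object DeprecationSketch {
  def main(args: Array[String]): Unit = {
    // Expected: no deprecation warning for the Estimator call.
    val estimator = new EstimatorSketch().setSeed(42L)
    // Expected: a deprecation warning here when compiled with -deprecation.
    val model = new ModelSketch().setSeed(7L)
    println(s"${estimator.getSeed} ${model.getSeed}")
  }
}
```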

Member

Great, thank you for checking!

@jkbradley
Member

LGTM pending a check on whether the deprecations affect Estimators in addition to Models
Thanks for the follow-up!

@@ -134,27 +150,31 @@ private[ml] trait DecisionTreeParams extends PredictorParams
/** @group setParam */
def setSeed(value: Long): this.type = set(seed, value)
Member

Are you intentionally keeping setSeed in Models?

Contributor Author

Oops, I missed it and will add it. Thanks.

@SparkQA

SparkQA commented Nov 29, 2016

Test build #69306 has started for PR 16017 at commit 30f5096.

@yanboliang
Contributor Author

Jenkins, test this please.

@SparkQA

SparkQA commented Nov 29, 2016

Test build #69313 has finished for PR 16017 at commit 30f5096.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

LGTM
Merging with master and branch-2.1
Thanks!

asfgit pushed a commit that referenced this pull request Nov 29, 2016
## What changes were proposed in this pull request?
Mainly two changes:
* Move DT/RF/GBT Param setter methods to subclasses.
* Deprecate corresponding setter methods in the model classes.

See discussion here #15913 (comment).

## How was this patch tested?
Existing tests.

Author: Yanbo Liang <[email protected]>

Closes #16017 from yanboliang/spark-18592.

(cherry picked from commit 95f7985)
Signed-off-by: Joseph K. Bradley <[email protected]>
@asfgit asfgit closed this in 95f7985 Nov 29, 2016
@yanboliang yanboliang deleted the spark-18592 branch November 30, 2016 02:44
* E.g. 10 means that the cache will get checkpointed every 10 iterations.
* This is only used if cacheNodeIds is true and if the checkpoint directory is set in
* [[org.apache.spark.SparkContext]].
* Must be >= 1.
Member

@HyukjinKwon HyukjinKwon Dec 2, 2016

Hi all, this might be too trivial, but I just want to let you know that we should probably write < and > in another form, such as {@literal <} or {@literal >}. Please refer to #16013 (comment).
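
As a sketch of that suggestion, a doc comment can wrap comparison operators in Javadoc's {@literal ...} inline tag so that Javadoc does not read them as HTML. The class name below is hypothetical, the setter is a simplified stand-in, and the exact wording adopted in Spark may differ.

```scala
// Hypothetical class for illustration; this is not Spark's implementation.
class CheckpointParamsSketch {
  private var checkpointInterval: Int = 10

  /**
   * Specifies how often to checkpoint the cached node IDs.
   * E.g. 10 means that the cache will get checkpointed every 10 iterations.
   * Must be {@literal >=} 1.
   * (default = 10)
   */
  def setCheckpointInterval(value: Int): this.type = {
    // Enforce the documented lower bound.
    require(value >= 1, s"checkpointInterval must be >= 1 but got $value")
    checkpointInterval = value
    this
  }

  def getCheckpointInterval: Int = checkpointInterval
}
```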

robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 2, 2016
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (

// Override parameter setters from parent trait for Java API compatibility.

/** @group setParam */
Member

@HyukjinKwon HyukjinKwon Dec 8, 2016

I am sorry, but please allow me to leave another trivial comment about documentation. I haven't checked the built documentation, but it seems we should use a multi-line comment in this case (see #13855).

Contributor Author

I checked the generated documentation and found it works well. Could you let me know what you found and how to reproduce it? Thanks.

Member

@HyukjinKwon HyukjinKwon Dec 8, 2016

Oh, I should have built it and checked it closely myself first. Thank you for correcting me. I just assumed that a single-line comment breaks the rendering.

Member

Btw, I created http://issues.apache.org/jira/browse/SPARK-18692 for monitoring Java 8 unidoc in the future.

Member

I support that idea too! Thanks for your comment.

uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017