-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-18592][ML] Move DT/RF/GBT Param setter methods to subclasses #16017
Conversation
cc @jkbradley |
Test build #69187 has finished for PR 16017 at commit
|
* [[org.apache.spark.SparkContext]]. | ||
* Must be >= 1. | ||
* (default = 10) | ||
* @group setParam |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, repeating this is kinda tedious - I thought it inherits the doc from the super, does it not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point: we can inherit docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cause of this change was suggested at #15913 (comment) , since Param setter methods in traits used to have the wrong type in Java. We would like to remove the setter method from the trait and it also does not make sense to have it in the Model classes. We could put the setter method in each subclass and then deprecate the method in the Model classes.
So we will remove the setter methods from all traits(in release 2.2), then we can not inherit docs from the traits. BTW, the current change is consistent with other ML algorithms which inherit traits. I'd like to know whether i understand the problem correctly. Thanks. @jkbradley
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that would make the setter appear in models. Your way is better.
* @deprecated This method is deprecated and will be removed in 2.2.0. | ||
* @group setParam | ||
*/ | ||
@deprecated("This method is deprecated and will be removed in 2.2.0.", "2.1.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this appear as deprecated in the Estimator classes? If so, then can you update the comment to make it clear it is only deprecated for Models?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if this is a problem, we could mark the setters in Models as deprecated and leave the treeParams alone since the treeParams traits are private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thank you for checking!
LGTM pending a check on whether the deprecations affect Estimators in addition to Models |
@@ -134,27 +150,31 @@ private[ml] trait DecisionTreeParams extends PredictorParams | |||
/** @group setParam */ | |||
def setSeed(value: Long): this.type = set(seed, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you intentionally keeping setSeed in Models?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I missed it and will add it. Thanks.
Test build #69306 has started for PR 16017 at commit |
Jenkins, test this please. |
Test build #69313 has finished for PR 16017 at commit
|
LGTM |
## What changes were proposed in this pull request? Mainly two changes: * Move DT/RF/GBT Param setter methods to subclasses. * Deprecate corresponding setter methods in the model classes. See discussion here #15913 (comment). ## How was this patch tested? Existing tests. Author: Yanbo Liang <[email protected]> Closes #16017 from yanboliang/spark-18592. (cherry picked from commit 95f7985) Signed-off-by: Joseph K. Bradley <[email protected]>
* E.g. 10 means that the cache will get checkpointed every 10 iterations. | ||
* This is only used if cacheNodeIds is true and if the checkpoint directory is set in | ||
* [[org.apache.spark.SparkContext]]. | ||
* Must be >= 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi all, this might be too trivial but I just want to let you know we probably should write other ones for <
or >
such as {@literal <}
or {@literal >}
. Please refer #16013 (comment).
## What changes were proposed in this pull request? Mainly two changes: * Move DT/RF/GBT Param setter methods to subclasses. * Deprecate corresponding setter methods in the model classes. See discussion here apache#15913 (comment). ## How was this patch tested? Existing tests. Author: Yanbo Liang <[email protected]> Closes apache#16017 from yanboliang/spark-18592.
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") ( | |||
|
|||
// Override parameter setters from parent trait for Java API compatibility. | |||
|
|||
/** @group setParam */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry but please allow me to leave another trivial comment about documentation. I haven't checked the built documentation but it seems we should use multiple-lines in this case (see #13855).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the generated documentation and found it works well. Could you let me know what you found and how to reproduce it? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I should have built and checked it first closely by myself. Thank you for correcting me. I just assumed that single line comment breaks other rendering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, I created http://issues.apache.org/jira/browse/SPARK-18692 for monitoring Java 8 unidoc in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support that idea too! Thanks for your comment.
## What changes were proposed in this pull request? Mainly two changes: * Move DT/RF/GBT Param setter methods to subclasses. * Deprecate corresponding setter methods in the model classes. See discussion here apache#15913 (comment). ## How was this patch tested? Existing tests. Author: Yanbo Liang <[email protected]> Closes apache#16017 from yanboliang/spark-18592.
What changes were proposed in this pull request?
Mainly two changes:
See discussion here #15913 (comment).
How was this patch tested?
Existing tests.