[pyspark] add parameters in the ctor of all estimators. #9202
@WeichenXu123 @trivialfis could you help review this PR?
LGTM
force_repartition: bool = False,
repartition_random_shuffle: bool = False,
enable_sparse_data_optim: bool = False,
**xgboost_parameters: Dict[str, Any],
Please use the convention of **kwargs.
Done. Thx
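A minimal sketch of the convention requested above, using a hypothetical `SketchEstimator` class rather than the actual xgboost.spark code. Under Python's typing rules, the annotation on `**kwargs` describes each value, so `**kwargs: Any` already means the function receives a `Dict[str, Any]`; annotating it as `Dict[str, Any]` would instead claim that every value is itself a dict.

```python
from typing import Any, Dict


class SketchEstimator:
    """Hypothetical stand-in for an estimator; not the xgboost.spark class."""

    def __init__(
        self,
        *,
        force_repartition: bool = False,
        repartition_random_shuffle: bool = False,
        enable_sparse_data_optim: bool = False,
        **kwargs: Any,  # each extra keyword value may be of any type
    ) -> None:
        self.force_repartition = force_repartition
        self.repartition_random_shuffle = repartition_random_shuffle
        self.enable_sparse_data_optim = enable_sparse_data_optim
        # Inside the function body, kwargs is an ordinary Dict[str, Any].
        self.extra_params: Dict[str, Any] = dict(kwargs)


# Illustrative call: max_depth and eta are collected into extra_params.
est = SketchEstimator(force_repartition=True, max_depth=6, eta=0.1)
```

The named flags stay visible in the signature, while any other keyword is passed through untouched.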
Xgboost DMatrix object will be constructed from sparse matrix instead of
dense matrix.

xgboost_parameters:
ditto.
Done
force_repartition: bool = False,
repartition_random_shuffle: bool = False,
enable_sparse_data_optim: bool = False,
**xgboost_parameters: Dict[str, Any],
ditto.
Done
To specify the base margins of the training and validation
dataset, set :py:attr:`xgboost.spark.SparkXGBClassifier.base_margin_col` parameter
instead of setting `base_margin` and `base_margin_eval_set` in the
`xgboost.XGBClassifier` fit method. Note: this isn't available for distributed
Suggested change:
- `xgboost.XGBClassifier` fit method. Note: this isn't available for distributed
+ :py:class:`xgboost.XGBClassifier` fit method. Note: this isn't available for distributed
What do you mean by isn't available for distributed training?
Done. removed it.
base_margin_col:
To specify the base margins of the training and validation
dataset, set :py:attr:`xgboost.spark.SparkXGBClassifier.base_margin_col` parameter
instead of setting `base_margin` and `base_margin_eval_set` in the
`xgboost.XGBClassifier` fit method. Note: this isn't available for distributed
training.
qid_col"
Suggested change:
- qid_col"
+ qid_col:
QID is only available for ranking.
Good catch.
Co-authored-by: Jiaming Yuan <[email protected]>
You can build the documentation with Sphinx. Install the required packages listed in doc/requirements.txt using pip (or conda if you prefer), run make html under the doc directory, and watch out for warnings during the build. You can then view the generated HTML files.
This PR adds parameters to the estimators' __init__ function and adds more documentation.
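As a rough illustration of why explicit constructor parameters help, the sketch below compares two hypothetical classes (neither is the real xgboost.spark estimator). It assumes, for the sake of the comparison, a constructor that accepts only catch-all keyword arguments versus one that names its parameters; the helper `ctor_param_names` is also made up for this example.

```python
import inspect
from typing import Any, List, Optional


class OpaqueEstimator:
    """Hypothetical: every setting arrives through **kwargs."""

    def __init__(self, **kwargs: Any) -> None:
        self.params = dict(kwargs)


class ExplicitEstimator:
    """Hypothetical: settings named in the signature, extras via **kwargs."""

    def __init__(
        self,
        *,
        base_margin_col: Optional[str] = None,
        force_repartition: bool = False,
        **kwargs: Any,
    ) -> None:
        self.params = dict(
            base_margin_col=base_margin_col,
            force_repartition=force_repartition,
            **kwargs,
        )


def ctor_param_names(cls: type) -> List[str]:
    """Return the constructor's parameter names, excluding self."""
    sig = inspect.signature(cls.__init__)
    return [name for name in sig.parameters if name != "self"]


opaque_names = ctor_param_names(OpaqueEstimator)
explicit_names = ctor_param_names(ExplicitEstimator)
```

With named parameters, tools that read the signature (IDEs, `help()`, Sphinx autodoc) can list and document each setting, whereas the catch-all form exposes only `kwargs`.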