Standardize binary operations between int and str columns #1828
Just FYI: You can run |
6b9a8f5 to 024240f
@itholic Thank you for reminding me!
Otherwise, looks fine to me.
LGTM.
Thanks! Merging.
@ueshin Thank you for merging!
Nice 👍
Notes
They are commented out and will be removed before the PR merge (after PR review).
`+`, `-`, `*`, `/`, `//`, `%` are concerned.
Proposal
Make the behavior of binary operations (`+`, `-`, `*`, `/`, `//`, `%`) between int and str columns consistent with the corresponding pandas behavior.
Standardize binary operations as follows:
- `+`: raise TypeError between an int column and a str column (or string literal)
- `*`: act as Spark SQL `repeat` between an int column (or int literal) and a str column; raise TypeError if a string literal is involved
- `-`, `/`, `//`, `%` (modulo): raise TypeError if a str column (or string literal) is involved
Add `def repeat(col, n):` in databricks/koalas/spark/functions.py
The `repeat` defined in the Scala API only accepts an integer literal as its second parameter.
But the internal StringRepeat accepts `IntegerType` as its second parameter.
In order to pass int columns as the second parameter, we take advantage of `callUDF`.
Test
databricks/koalas/tests/test_dataframe.py
Resolves #1819
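For reviewers who want a quick mental model of the rules listed in the proposal, here is a rough pure-Python sketch. It is not the Koalas implementation and does not touch Spark. It assumes columns can be modeled as non-empty Python lists and literals as scalars. `binary_op`, `_kind`, and `_repeat` are hypothetical names, and Python's `str * int` stands in for Spark SQL `repeat`. Combinations the proposal does not list raise `NotImplementedError` rather than guessing.

```python
from typing import List, Union

# Assumption: a list models a column, a scalar models a literal.
Operand = Union[List[int], List[str], int, str]


def _kind(x: Operand) -> str:
    """Classify an operand as an int/str column or an int/str literal."""
    if isinstance(x, list):
        elem = "str" if all(isinstance(v, str) for v in x) else "int"
        return f"{elem} column"
    return "str literal" if isinstance(x, str) else "int literal"


def _repeat(left: Operand, right: Operand) -> List[str]:
    """Element-wise repeat; the count may be an int column or an int literal."""
    strs, counts = (left, right) if _kind(left) == "str column" else (right, left)
    if isinstance(counts, int):
        counts = [counts] * len(strs)  # broadcast the literal over the column
    return [s * n for s, n in zip(strs, counts)]


def binary_op(op: str, left: Operand, right: Operand) -> List[str]:
    """Apply the proposal's rules for operations that mix int and str operands."""
    kinds = {_kind(left), _kind(right)}
    has_str = "str column" in kinds or "str literal" in kinds

    # -, /, //, %: any str column or string literal is an error.
    if op in {"-", "/", "//", "%"} and has_str:
        raise TypeError(f"{op!r} is not supported when a str operand is involved")

    # +: int column with a str column or string literal is an error.
    if op == "+" and "int column" in kinds and has_str:
        raise TypeError("'+' is not supported between int column and str operand")

    if op == "*" and has_str:
        # *: string literals are rejected.
        if "str literal" in kinds:
            raise TypeError("'*' is not supported with a string literal")
        # *: str column with an int column or int literal repeats each string.
        if kinds in ({"str column", "int column"}, {"str column", "int literal"}):
            return _repeat(left, right)

    raise NotImplementedError("combination not covered by the proposal's rules")
```

For example, `binary_op("*", ["a", "b"], [1, 2])` repeats each string by the matching count, while `binary_op("+", [1, 2], ["a", "b"])` raises `TypeError`.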