[SPARK-48490][CORE][FOLLOWUP] Properly process escape sequences #47050

Closed
gengliangwang

What changes were proposed in this pull request?

Even with the fix in #46824, escape sequences (\r, \n, \t, etc.) are still not handled properly. For example, when we use log"\n", the StringContext passes \n through as a literal backslash \ followed by n instead of a newline character. As a result, the bytes of log"\n".message are [92, 110] instead of [10].

This PR fixes the issue by calling StringContext.processEscapes in LogStringContext.
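The behavior described above can be reproduced outside Spark with a small sketch. The `rawPart` interpolator and the `bytesOf` helper below are hypothetical names made up for illustration; they are not part of Spark's `LogStringContext`. The only library call relied on is `StringContext.processEscapes` from the Scala standard library.

```scala
object EscapeDemo {
  // Hypothetical interpolator (illustration only): returns the first literal
  // part exactly as the compiler hands it to a custom interpolator.
  implicit class RawParts(val sc: StringContext) extends AnyVal {
    def rawPart(args: Any*): String = sc.parts.head
  }

  // Hypothetical helper: the UTF-8 bytes of a string as a list of ints.
  def bytesOf(s: String): List[Int] = s.getBytes("UTF-8").map(_.toInt).toList

  def main(args: Array[String]): Unit = {
    val raw = rawPart"\n"
    // Custom interpolators receive the escape unprocessed: backslash, then 'n'.
    println(bytesOf(raw)) // List(92, 110)
    // processEscapes turns the two characters into a single newline.
    println(bytesOf(StringContext.processEscapes(raw))) // List(10)
  }
}
```

The built-in `s` interpolator processes escapes itself, which is why the problem only shows up in custom interpolators such as `log`.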

Why are the changes needed?

To ensure that escape sequences are properly processed in Spark logs.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a new unit test.

Was this patch authored or co-authored using generative AI tooling?

No

gengliangwang commented Jun 21, 2024

cc @panbingkun

@panbingkun

+1, LGTM.

@panbingkun

Wait, there may still be some minor issues. Let me verify them in UT.

@HyukjinKwon left a comment

LGTM pending tests

@panbingkun left a comment

 sb.append(processedParts.next()) 

->

 sb.append(StringContext.processEscapes(processedParts.next()))
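For context, here is a minimal sketch of how the suggested change might look inside an interpolator loop. Everything except `StringContext.processEscapes` is an assumption: the `sketch` interpolator and its `StringBuilder` loop are stand-ins written for this comment, not the actual `LogStringContext` code. The idea is that each literal part is escape-processed as it is appended, while interpolated values are appended untouched.

```scala
object InterpolatorSketch {
  // Hypothetical interpolator (illustration only, not Spark's implementation).
  implicit class SketchContext(val sc: StringContext) extends AnyVal {
    def sketch(args: Any*): String = {
      val parts = sc.parts.iterator
      val values = args.iterator
      // A StringContext always has one more part than it has arguments,
      // so the first part can be consumed unconditionally.
      val sb = new StringBuilder(StringContext.processEscapes(parts.next()))
      while (parts.hasNext) {
        sb.append(values.next()) // interpolated value, appended as-is
        sb.append(StringContext.processEscapes(parts.next())) // literal part, escapes processed
      }
      sb.toString
    }
  }

  def main(args: Array[String]): Unit = {
    val n = 3
    // The \t and \n in the literal parts become a real tab and newline.
    print(sketch"rows:\t$n\n")
  }
}
```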

@yaooqinn yaooqinn closed this in fdabe08 Jun 21, 2024
@panbingkun

@HyukjinKwon Can we merge this?
After this PR is merged, I will test whether R Windows has been fixed.

@panbingkun

Thanks @yaooqinn

gengliangwang pushed a commit that referenced this pull request Jun 21, 2024
…s scene

### What changes were proposed in this pull request?
This PR is a follow-up to #47050.

### Why are the changes needed?
Add unit tests for the Windows path scenario.
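
The tests added by the follow-up are not reproduced here. As a rough illustration of why Windows paths are worth covering, the sketch below assumes an interpolator that applies `StringContext.processEscapes` only to literal parts; the `demo` interpolator and the sample path are hypothetical. A path passed as an argument keeps its backslashes, even where it contains sequences such as `\n` that would otherwise look like escapes.

```scala
object WindowsPathSketch {
  // Hypothetical interpolator (illustration only): escapes are processed in
  // literal parts, interpolated arguments are left untouched.
  implicit class PathDemoContext(val sc: StringContext) extends AnyVal {
    def demo(args: Any*): String = {
      val processed = sc.parts.map(StringContext.processEscapes)
      val sb = new StringBuilder(processed.head)
      processed.tail.zip(args).foreach { case (part, arg) =>
        sb.append(arg).append(part)
      }
      sb.toString
    }
  }

  def main(args: Array[String]): Unit = {
    // Ordinary string literal: the compiler turns each \\ into one backslash,
    // so the value contains the character sequences \s and \n.
    val dir = "C:\\Users\\spark\\new_table"
    // Passed as an argument, the path is not escape-processed again.
    val viaArg = demo"Path: $dir"
    // Written inside the literal part, each backslash has to be doubled.
    val viaLiteral = demo"Path: C:\\Users\\spark\\new_table"
    assert(viaArg == viaLiteral)
    println(viaArg) // Path: C:\Users\spark\new_table
  }
}
```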

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #47051 from panbingkun/SPARK-48490_FOLLOWUP_TESTS.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
### What changes were proposed in this pull request?

Even with the fix in apache#46824, escape sequences (`\r`, `\n`, `\t`, etc.) are still not handled properly. For example, when we use `log"\n"`, the StringContext passes `\n` through as a literal backslash `\` followed by `n` instead of a newline character. As a result, the bytes of `log"\n".message` are `[92, 110]` instead of `[10]`.

This PR fixes the issue by calling `StringContext.processEscapes` in `LogStringContext`.

### Why are the changes needed?

To ensure that escape sequences are properly processed in Spark logs.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

New UT

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#47050 from gengliangwang/fixEscape.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024