Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java_toolchain ignores the encoding setting #2926

Closed
juhalindfors opened this issue May 2, 2017 · 8 comments
Closed

java_toolchain ignores the encoding setting #2926

juhalindfors opened this issue May 2, 2017 · 8 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Java Issues for Java rules type: bug

Comments

@juhalindfors
Copy link

The encoding setting in java_toolchain appears to be ignored. For reference, see the stack overflow discussion here: http://stackoverflow.com/questions/43435772/bazel-java-library-character-encoding-via-javacopts-not-working

The problem seems to stem from the introduction of ClassloaderMaskingFileManager in commit 3c5e55f. It hard-codes UTF-8 in the file manager configuration provided to the SDK JavacTool implementation. The way the JDK base file manager is implemented prevents the encoding setting from taking place if the charset has been configured to anything other than a null reference.

A simple fix is to change the constructor call in CassloaderMaskingFileManager to pass a null reference as a charset argument. I've created a unit test that demonstrates this fix and would prevent future regression on the issue but am still in the process of trying to understand the gerrit submission process.

  • Bazel version (output of bazel info release): release 0.4.5
juhalindfors added a commit to juhalindfors/bazel-patches that referenced this issue May 2, 2017
Issue bazelbuild#2926

Adds a unit test passing an '-encoding' argument from Bazel Javac wrapper API into JDK
JavacTool implementation. The test should prevent regression on issue bazelbuild#2926 where
encoding argument was ignored by the file manager implementation passed as a reference
to the compiler.

Test uses a temporary source file with invalid UTF-8 encoding (the source is valid ISO-8859-1
byte encoding) and checks the compiler result on incorrect and correct encoding setting.
@juhalindfors
Copy link
Author

Proposed fix: juhalindfors@81b1e9a

Related unit test: juhalindfors@8e94ccc

Note that the unit test currently depends on the patch set in https://bazel-review.googlesource.com/#/c/10530/ associated with issue #2897

@damienmg damienmg added category: rules > java P2 We'll consider working on this in future. (Assignee optional) type: bug labels May 4, 2017
@cushon
Copy link
Contributor

cushon commented May 4, 2017

That isn't the only place that assumes Java source files are UTF-8:

I'm not sure how much value java_toolchain.encoding has in general, but if Bazel is going to support it, it should work everywhere.

We'll also need to make sure it isn't overridable for Blaze.

@GinFungYJF
Copy link
Contributor

Any update on this? Would it be fixed in the 0.5.0 release?

juhalindfors added a commit to juhalindfors/bazel-patches that referenced this issue May 27, 2017
Issue bazelbuild#2926

Adds a unit test passing an '-encoding' argument from Bazel Javac wrapper API into JDK
JavacTool implementation. The test should prevent regression on issue bazelbuild#2926 where
encoding argument was ignored by the file manager implementation passed as a reference
to the compiler.

Test uses a temporary source file with invalid UTF-8 encoding (the source is valid ISO-8859-1
byte encoding) and checks the compiler result on incorrect and correct encoding setting.

Change-Id: I7e364fcfa46973b4e9c30314eb2e8f7fa267fc75
juhalindfors added a commit to juhalindfors/bazel-patches that referenced this issue May 28, 2017
Issue bazelbuild#2926

Adds a unit test passing an '-encoding' argument from Bazel Javac wrapper API into JDK
JavacTool implementation. The test should prevent regression on issue bazelbuild#2926 where
encoding argument was ignored by the file manager implementation passed as a reference
to the compiler.

Test uses a temporary source file with invalid UTF-8 encoding (the source is valid ISO-8859-1
byte encoding) and checks the compiler result on incorrect and correct encoding setting.

Change-Id: I7e364fcfa46973b4e9c30314eb2e8f7fa267fc75
@juhalindfors
Copy link
Author

@GinFungYJF

I think I found how the original proposed patch could potentially cause an issue with build reproducibility but I still need to verify this. If correct, it should be feasibile to modify the patch according to the review comments to ensure one fix does not cause another issue. You can follow the review process here: https://bazel-review.googlesource.com/c/10751/, https://bazel-review.googlesource.com/c/10750

juhalindfors added a commit to juhalindfors/bazel-patches that referenced this issue Jun 1, 2017
'-encoding' javac argument is present.

The base file manager class only checks for encoding
value if charset has been left to a null value[1]. This patch
checks the presence of '-encoding' option in javac's arguments
and if present, sets the file manager's charset to null,
allowing '-encoding' to take precedence.

Issue: bazelbuild#2926
Change-Id: I96ab9433b053b94efeba08039a3bf6ea65c02e47
@ddwolf
Copy link

ddwolf commented Oct 25, 2017

is this problem fixed?

@ittaiz
Copy link
Member

ittaiz commented Jun 6, 2018

I think the current state of having a field which is ignored is really bad.
It sounds to me like we should either drop it and clearly document the encoding for javac is UTF-8 without being able to override or fix this issue.

bazel-io pushed a commit that referenced this issue Jun 8, 2018
UTF-8 is the only supported encoding for Java sources.

See #2926.

PiperOrigin-RevId: 199803902
@ittaiz
Copy link
Member

ittaiz commented Jul 16, 2018

@cushon I really think that some blatant documentation is needed about this

werkt pushed a commit to werkt/bazel that referenced this issue Aug 2, 2018
UTF-8 is the only supported encoding for Java sources.

See bazelbuild#2926.

PiperOrigin-RevId: 199803902
werkt pushed a commit to werkt/bazel that referenced this issue Oct 19, 2018
UTF-8 is the only supported encoding for Java sources.

See bazelbuild#2926.

PiperOrigin-RevId: 199803902
@meisterT meisterT added team-Rules-Java Issues for Java rules and removed category: rules > java labels May 12, 2020
@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P2 We'll consider working on this in future. (Assignee optional) labels Nov 20, 2020
@sgowroji sgowroji added the stale Issues or PRs that are stale (no activity for 30 days) label Feb 3, 2023
@sgowroji
Copy link
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Java Issues for Java rules type: bug
Projects
None yet
Development

No branches or pull requests

9 participants