Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shade Netty transport #2485

Merged
merged 1 commit into from
Dec 8, 2017
Merged

Shade Netty transport #2485

merged 1 commit into from
Dec 8, 2017

Conversation

ejona86
Copy link
Member

@ejona86 ejona86 commented Dec 5, 2016

The entire transport is shaded because we wouldn't want to collide with
the unshaded form of grpc-netty.

@jhspaybar
Copy link
Contributor

So this means that gRPC netty will no longer have a chance to conflict with other Netty versions I'm using? If so, +1, we were thinking of doing this ourselves when we imported and used gRPC anyways. The http2 changes between 4.1.5.Final and 4.1.6.Final were frustrating and it happens often enough to cause issues in a big organization.

How will this play with shared things like worker groups and whatnot? If I have normal Netty plus grpc Netty, can I use the same worker group between them, or will I likely be stuck with 2 worker groups due to the relocation of classes?

@ejona86
Copy link
Member Author

ejona86 commented Dec 8, 2016

Yes, this would remove or reduce the risk of conflicting Netty versions (depending on whether we continue to shade all of Netty, or just a risky subportion).

The two Nettys currently share no code, so you'd have to use different EventLoops. "Normally" we could just shade the HTTP/2 codec. However, the HTTP/2 codec uses code from internal Netty classes that have no API stability. The internal classes then need all of Netty to be shaded. It may be possible to shade just some of the internal classes and the HTTP/2 codec, but it seems a bit dangerous.

@jhspaybar
Copy link
Contributor

Okay, that was the outcome we had evaluated as well internally. We feel the risk of breakage due to dependency conflicts is greater than the cost of not being able to use a single shared event loop for all netty's. We are in support of a complete shading of all netty code, not just http2.

@nevi-me
Copy link

nevi-me commented Jan 9, 2017

I tried using gRPC with Apache Spark, and it was working well while developing on local Windows machine (running with IntelliJ), but when I try running my fatJar on Apache Spark, I get Netty related issues.

Is this something that shading would be able to resolve?

@ejona86
Copy link
Member Author

ejona86 commented Jan 9, 2017

I get Netty related issues

That is too vague to tell you anything. If it was Netty version incompatibilities, then yes, shading would likely resolve the problem.

@zackangelo
Copy link

Netty conflicts have bitten us pretty hard internally during our gRPC pilot. A shaded dependency would be very welcome on our team.

@ejona86
Copy link
Member Author

ejona86 commented Jan 24, 2017

Jenkins, retest this please.

@nevi-me
Copy link

nevi-me commented Jan 24, 2017

That is too vague to tell you anything. If it was Netty version incompatibilities, then yes, shading would likely resolve the problem.

I missed this, I tried reproducing the exact error, but dealing with other nonsense now 😞

Essentially, Spark was complaining about a clash with class names, I know I'm still being vague, but hopefully I'll be able to get someone in my team to reproduce the error in the coming days.

@ejona86
Copy link
Member Author

ejona86 commented Jan 27, 2017

@nevi-me, if it is similar to googleapis/google-cloud-java#1522 then it would be resolved. The root error there is a NoSuchMethodError, which is caused by version incompatibility.

@jhspaybar
Copy link
Contributor

This is becoming a pretty significant source of pain for us. Is this something we think would be included in the 1.2 release(or even a 1.1.x release if possible)? I'd like to determine if this is for sure going to happen so it then becomes a discussion of when, or if we should be shading this ourselves internally if the decision is no. Thanks.

@garrettjonesgoogle
Copy link
Contributor

@ejona86 1.0 -> 1.1 is a minor version change, but a deleted method is a breaking change, which is inconsistent it being a minor version change.

@kuroneko25
Copy link

+1 on including this in 1.1.x release. This is currently preventing us from upgrading from 1.0 to 1.1.

@ejona86
Copy link
Member Author

ejona86 commented Mar 10, 2017

@ejona86 1.0 -> 1.1 is a minor version change, but a deleted method is a breaking change, which is inconsistent it being a minor version change.

I assume you meant 4.0 → 4.1. The failure in logic is Netty is not using your semantics for versioning. Netty has lots of interfaces and it's my understanding they reserve the right to change them in minor versions. So upgrading minor versions isn't too hard, but is not 100% API compatible.

@zackangelo
Copy link

@ejona86 are y'all still considering this? and if so, any idea when it'll land?

@jhspaybar
Copy link
Contributor

@ejona86 Can you provide any updates on this? It would be helpful for a decision one way or the other. If gRPC will shade, that is our preference at Netflix, but if gRPC won't shade, we're likely to shade anyways when we import. Right now the limbo state is worst for us, because we don't want to shade and then have you do it anyways :)

@nevi-me
Copy link

nevi-me commented May 3, 2017

@jhspaybar sorry for the amateur question, how would one shade the current version using Gradle?

@jhspaybar
Copy link
Contributor

@nevi-me This thread has an accompanying commit which shows how.

@nevi-me
Copy link

nevi-me commented May 3, 2017

Silly me, this is a PR haha. Thanks!

@ejona86
Copy link
Member Author

ejona86 commented May 3, 2017

@jhspaybar, as far as decisions go, having a version of grpc-netty that has Netty shaded is a "yes". We'll still have the unshaded one as well (and the two will use different package names). But I need to have time to finish the work. That will be "soon" (weeks).

@vkedia
Copy link

vkedia commented Jun 8, 2017

@ejona86 Any updates on this?

@ejona86
Copy link
Member Author

ejona86 commented Jul 8, 2017

In netty/netty-tcnative#272, netty-tcnative began supporting shading. This needs to be updated to include tcnative, at which point it should be a fully self-contained one-stop-shop.

@ejona86
Copy link
Member Author

ejona86 commented Jul 12, 2017

I created netty/netty#6963 for discussing a out-of-the-box alternative to io.netty.packagePrefix.

@ejona86
Copy link
Member Author

ejona86 commented Jul 21, 2017

Netty can now (in the next release) auto-detect the packagePrefix. That means this PR is very close.

Unfortunately, the shadow plugin is a bit strange when publishing. I need to investigate more how to upload to Maven Central (sanely) and how to install locally (sanely), when using the shadow plugin.

@ejona86
Copy link
Member Author

ejona86 commented Aug 29, 2017

I'll need to rebase this now that we've updated Netty. The new Netty has minor tweaks to the file names that we need to shade. I hope the test catches that it is broken.

@ejona86
Copy link
Member Author

ejona86 commented Aug 29, 2017

@carl-mastrangelo, @zhangkun83, please review. I imagine Carl is probably more comfortable with the Netty side and Kun is probably more comfortable with the Gradle side.

// We have to be careful with these replacements as they must not match any
// string in NativeLibraryLoader, else they cause corruption. Note that
// this includes concatenation of string literals and constants.
relocate 'META-INF/native/libnetty', 'META-INF/native/libio-grpc-netty-shaded-netty'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the version of master this is on, they must be dashes. This is what I as referring to with "I'll need to rebase this now that we've updated Netty. The new Netty has minor tweaks to the file names that we need to shade."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rebased on top of master. The tests correctly failed, and swapping to underscores lets them now pass.

/** Test TLS since that verifies everything, including tcnative, is functional. */
@Test
public void tls() throws Exception {
Server server = ServerBuilder.forPort(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are testing TLS, can you for the SslProvider to be SslProvider.OPENSSL?

Also, this test case should probably be split. It is testing two things: If the ServerLoader worked, and if tcnative was shaded properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Done.

} finally {
server.shutdownNow();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Await termination?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ejona86
Copy link
Member Author

ejona86 commented Sep 14, 2017

@carl-mastrangelo, PTAL

Copy link
Contributor

@carl-mastrangelo carl-mastrangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

public void tls() throws Exception {
Server server = ServerBuilder.forPort(0)
public void serviceLoaderFindsNetty() throws Exception {
assertTrue(ServerBuilder.forPort(0) instanceof NettyServerBuilder);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: if you use Truth.assertThat(ServerBuilder.forPort(0)).isInstanceOf(NettyServerBuilder.class), a test failure will tell you what the instance actually was.

Copy link
Member Author

@ejona86 ejona86 Sep 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've swapped the tests to use Truth exclusively.

@ejona86 ejona86 force-pushed the shade-netty branch 2 times, most recently from 9a18305 to 4f0e2f6 Compare September 15, 2017 16:24
@ejona86
Copy link
Member Author

ejona86 commented Sep 15, 2017

retest this please

@ejona86
Copy link
Member Author

ejona86 commented Sep 15, 2017

The dll isn't being loaded on Windows, and it's a PITA to figure out why because java.util.logging is broken in unit tests. I'll need to set up a full dev environment on Windows to figure out what's wrong.

@ejona86 ejona86 added the kokoro:force-run Add this label to a PR to tell Kokoro to re-run all tests. Not generally necessary label Dec 8, 2017
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to a PR to tell Kokoro to re-run all tests. Not generally necessary label Dec 8, 2017
The entire transport is shaded because we wouldn't want to collide with
the unshaded form of grpc-netty.
@ejona86
Copy link
Member Author

ejona86 commented Dec 8, 2017

Soo... I set up an environment, and then I rebased, and saw everything working. I'm assuming an update to Netty fixed the problem.

@ejona86 ejona86 merged commit 83d710b into grpc:master Dec 8, 2017
@ejona86 ejona86 deleted the shade-netty branch December 8, 2017 22:02
@ejona86
Copy link
Member Author

ejona86 commented Dec 8, 2017

It was almost less than a year 😄

@garrettjonesgoogle
Copy link
Contributor

lol that's an interesting way to characterize "more than a year" :-P

@mastergberry
Copy link

Hey @ejona86 we are trying to use GRPC 1.10 in our program and are having some netty conflictions. Wasn't this issue/change in 1.9 supposed to make it so GRPC used their own netty and didn't conflict with other libraries that used netty too?

https://github.com/grpc/grpc-java/blob/master/SECURITY.md#troubleshooting

We were going over this and wanted to know if it was completely up to date...because GRPC is using the wrong netty version when trying to write a response in our current application and this is causing issues.

@ejona86
Copy link
Member Author

ejona86 commented Mar 19, 2018

@mastergberry, as mentioned in the in the v1.9.0 release notes and my proposed changes to SECURITY.md, you have to swap to grpc-netty-shaded to get the pre-shaded Netty. If you use grpc-netty then you will see the original behavior.

@mastergberry
Copy link

My apologies. Thanks for the nudge in the right direction.

@lock lock bot locked as resolved and limited conversation to collaborators Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.