-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail startup (and tests) on jar hell #11932
Conversation
On my machine, the scan takes 18-20ms on average: so the cost is negligible, it just scans the entry names from the zip... |
LGTM |
Fail startup (and tests) on jar hell
While I admire the effort and understand the drive behind this, this causes Elasticsearch hell for us on several projects due to issues that are beyond our control. This renders using the ES jar impossible for even the minimal use of TransportClient, and blocks us from fully moving to ES 2.x. Here is a concrete example, one of many I'm afraid. We are using Elasticsearch from a Storm bolt, and are using Maven as our build system. storm-core has a dependency in kryo, which in turn has some issues with shaded dependencies (EsotericSoftware/kryo#189). This may or may not have been fixed there, but the latest storm-core still uses the affected version (http://mvnrepository.com/artifact/org.apache.storm/storm-core/0.10.0). As a result, we are seeing the following:
Due to:
And all we need the ES jar for is for TransportClient, not going to run a full blown server / cluster there. And tests (Unit and ITs). So after fixing all the Jar Hell issues that are indeed under our control (> 1 week worth of work), we are now stuck with Jar Hell issues that are beyond our control. It appears you yourself experienced the same:
I'm aware of @rmuir's stand on the topic, but this practically renders the ES Java API useless in any large scale project, exactly because the current Jar Hell check bites way too deep. It at least needs to ignore mismatches in shaded dependencies or the like in certain scenarios, which I agree needs to be well defined. Please allow customisations of exclusions, or disabling Jar Hell altogether At Your Own Risk(TM) at least while running ES as a library (as opposed to as a server instance). |
Can we please avoid such aggressivity and only discuss facts and technical arguments. |
Running ES either embedded, or as a java client does not run the jarhell check. The issue here is about tests. But testing transport client with an in memory ES is not a good test! If you want to test your use of transport client, run a real integration test, where ES is started in a separate process. That is the only real test, anything else is a fantasyland which does not actually confirm that the system works as expected. And regarding unit tests, you are free to not use the ES test framework. It is intended to test ES components, where jarhell is not allowed. |
Please keep the discussions here civil and technical! Don't attack the people raising ideas or asking questions; answer with technical counter arguments instead. |
@synhershko thanks for bringing this up, I think lots of users might run into problems with our strict checks and I this we should clarify why we are doing the checks and who they are supposed to work. First lemme clarify your statement:
as @rjernst already mentioned this is not true. The JarHell checks are run doing Bootstrap which is run in 2 ways:
if you run ES embedded or as a transport client this will not be executed. Hence you statement is not correct and will cause a lot of confusion. If you use the ES test framework you will run into the JarHell checks and that is unfortunate. I think we might be able to help here and maybe move the checks under a special flag only for tests where you can disable the JarHell check for testing in @synhershko please lemme know if that would help you along the lines... |
Thanks @s1monw. I stand corrected with regards to the usage of TransportClient. The blocker for us was building unit tests and integration tests with the excellent ES test suite. We have a rather complex system and ES is being used in various parts of it. We want to be able to both unit- and integration- test those components, and creating a test-local cluster is the only way we could do this efficiently, mocking an ES cluster with test data. Maintaining a test cluster is not an option in that case, and we are fully aware of the consequences of running in a non-clean environment. For us they are negligible, given our use cases. We spent about 2 weeks cleaning up jar hell issues, and ended up complaining about it because we hit walls due to misconfigurations of dependencies which are out of our reach. Since then we were able to bypass Jar Hell with a nice and elegant hack, without forking. Your suggestion is exactly what we were after, and I think I explicitly stated that above. I never wanted to discard strictness for service / server mode, only for test scope in clients due to various circumstances which are beyond our control. Quick googling on this will show we are not the only ones with this kind of issues. I hope this clarifies things further, and thank you for chiming in. |
@synhershko See the client qa test for an example of how to create a real integration test. This would be much better than using ESIntegTestCase, as it will test against a real ES node, not an in memory one that could be mucked up by jarhell. What does ESTestCase get you that you need for your own unit tests, that you would not get by using either junit directly, or LuceneTestCase? |
@rjernst like I said, maintaining an external cluster for tests is an overkill for us. A test-scope cluster is exactly what we need, and if we get to a point where we see failures that we can attribute to jar hell or the like we will revisit this decision. I've been doing this for a while and it hasn't happened yet. ESSmokeClientTestCase gives me nothing that I can't achieve with ESIntegTestCase with less headaches. LuceneTestCase is irrelevant for our purposes. |
@synhershko either way we might even go and make that mandatory in upcoming version since it is the right way to test the integration of elasticsearch. Now that in 3.0 you can reset cluster settings there is no real need for Test scope in 99% of the cases so in upcoming version this might be the way to go. For instance all our REST tests run against such a cluster, they are way more reliable than all other tests. |
@s1monw I'd really appreciate if you wouldn't go that way. Having test-scoped, in-memory clusters is perfect for various tests which make light use of Elasticsearch and therefore aren't exposed (or don't care) about the various failures you are afraid of. This is exactly what we are using it for, and would really like to see this functionality stay as it greatly simplifies and speeds-up our testing infrastructure. Not many databases / data-stores provide volatile versions of themselves for testing purposes, and those which do are much easier to work with in dev environments. |
We have been gradually moving away from this for a reason. It doesn't really test integration, it is really just a complicated unit test. It is much better to have real unit tests (which are actually light weight, ESIntegTestCase is so far from light weight it is not funny), and test integration with a real node. The difference in time between an in memory cluster for tests, and actually spinning up an ES node (which the infrastructure provides in both maven for 2.x and gradle for master), isn't that much, and the latter actually checks it works against elasticsearch. |
This change moves the JarHell check under the test.security.manager guard such that users that have to opt out of JarHell for testing can do so by disabling security. It will all users to use our test framework but when doing so the parameter `-Dtests.security.manager=false` will signal how serious it is to opt out of this check Relates to elastic#11932
@synhershko here is a workaround for tests #16042 |
Looks good, thanks @s1monw . I did some homework wrt to the tests framework the past week, I'll post about it later today or tomorrow. |
Took longer than than expected, but here is my take on testing with Elasticsearch. There aren't many technologies which provide a good testing framework bundled with the tech, and even fewer data store technologies which provide in-memory testing capabilities. But those which do provide this, are by far easier to get things right with. Worth noting that the advice given in the discussion here is different than what's advertised in the docs. With regard to the ESSmokeClientTestCase class, it is already implemented as ExternalTestCluster in the aforementioned Java Testing Framework. And then there is also the InternalTestCluster implementation which spins an in-memory, test-scoped cluster for a test to run with. As a long time user and an Elasticsearch consultant I've seen pretty much every use case you can imagine, and at the end of the day when it comes to testing there are those systems with those Units that can only be really tested when they are paired with Elasticsearch. And for the cases where you are actually running on the JVM, you are particularly lucky. Think about the Storm use case, where somewhere in the topology there are bolts that issue a query (we are not using es-hadoop). The only real way of testing those bolts and their logic is to have a contained Elasticsearch cluster, with contained index, issue the query based on the inputs, and compare the results. This is a Unit Test per se, one that can only be conducted on a clean index with a known set of documents, which is presumably populated at the beginning of each test method / class. Now, sure, one could spin his own cluster somewhere and run those tests each against it's own index. But as everyone who had to deal with CI and deployments knows - infrastructure is a bitch. We all want to avoid work there. Every test will have to run on at least 2 different environments (dev local, and CI, and maybe via production probes etc) and requiring an Elasticsearch cluster just for testing on each is an overkill. Ideally, each test should be able to run it's own ES cluster if needed, at least as a fallback, so at least the local dev's experience is flawless. This is uber important in large teams, where many technologies are in use. There's also questions of security and costs involved. This isn't about testing ES plugins or cluster configurations, and I couldn't care less about performance or the like. Most of the time it's also not about integration tests. It is just about having (even a minimal) version of Elasticsearch available, with the least headache possible, to perform what we call Unit Tests, but ones that mean nothing without Elasticsearch playing along for basic functionality. For those cases, shard allocation, performance etc aren't a concern. To facilitate that, we ended up doing the following:
Wherever we need to test cluster configurations, real integration etc, we obviously use real cluster deployments. This isn't the case in many of our tests, and exactly why we could use such support from the official ES project. As it's already there, it might be worth knowing that there are people who find it useful before deciding to remove it or bluntly declaring that if you use it you are doing something wrong. We will work on extracting this from our code base and open-sourcing it soon in case someone else finds it useful. |
Note its imperfect, there is currently some intentional "hell" done by ES:
But I removed all the jar hell in the test classpath and so on.
Closes #11926