Secure HDFS fixture startup fails on JDK 8u262 to 8u271 #61050

Closed
jaymode opened this issue Aug 12, 2020 · 4 comments
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs jvm bug Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@jaymode
Member

jaymode commented Aug 12, 2020

Build scan: https://gradle-enterprise.elastic.co/s/r7hg55pyb7g6i
https://gradle-enterprise.elastic.co/s/2vx2ssb2jkzl4
https://gradle-enterprise.elastic.co/s/7jdmamwoandv6

On 7.x, when the runtime Java home is a JDK 8 build between u262 and u271, the secureHDFS fixture will fail to start up due to a NullPointerException when logging in via Kerberos.

   java.io.IOException: Login failure for hdfs/[email protected] from keytab /dev/shm/elastic+elasticsearch+7.x+matrix-java-periodic/ES_RUNTIME_JAVA/zulu8/nodes/general-purpose/test/fixtures/krb5kdc-fixture/testfixtures_shared/shared/hdfs/keytabs/hdfs_hdfs.build.elastic.co.keytab: javax.security.auth.login.LoginException: java.lang.NullPointerException
    	at sun.security.krb5.KrbKdcRep.check(KrbKdcRep.java:137)
    	at sun.security.krb5.KrbAsRep.decrypt(KrbAsRep.java:159)
    	at sun.security.krb5.KrbAsRep.decryptUsingKeyTab(KrbAsRep.java:121)
    	at sun.security.krb5.KrbAsReqBuilder.resolve(KrbAsReqBuilder.java:310)
    	at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:498)
    	at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:780)
    	at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:618)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1057)
    	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:286)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1081)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:376)
    	at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:233)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1027)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:830)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:485)
    	at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
    	at hdfs.MiniHDFS.main(MiniHDFS.java:119)
    
    	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1066)
    	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:286)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1081)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:376)
    	at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:233)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1027)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:830)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:485)
    	at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
    	at hdfs.MiniHDFS.main(MiniHDFS.java:119)
    Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException
    	at sun.security.krb5.KrbKdcRep.check(KrbKdcRep.java:137)
    	at sun.security.krb5.KrbAsRep.decrypt(KrbAsRep.java:159)
    	at sun.security.krb5.KrbAsRep.decryptUsingKeyTab(KrbAsRep.java:121)
    	at sun.security.krb5.KrbAsReqBuilder.resolve(KrbAsReqBuilder.java:310)
    	at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:498)
    	at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:780)
    	at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:618)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1057)
    	at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:286)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1081)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:376)
    	at org.apache.hadoop.hdfs.DFSTestUtil.formatNameNode(DFSTestUtil.java:233)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:1027)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:830)
    	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:485)
    	at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:444)
    	at hdfs.MiniHDFS.main(MiniHDFS.java:119)
    
    	at javax.security.auth.login.LoginContext.invoke(LoginContext.java:856)
    	at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    	at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    	at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1057)
    	... 9 more
    2020-08-12 16:23:39,761 INFO  [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1789)) - Shutting down the Mini HDFS Cluster

The root cause is the same as #56507, which was caused by https://bugs.openjdk.java.net/browse/JDK-8246193.

@jaymode jaymode added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test-failure Triaged test failures from CI jvm bug labels Aug 12, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Aug 12, 2020
@ywelsch
Contributor

ywelsch commented Aug 13, 2020

I don't see how we can disable these Gradle tasks for these specific JDK versions (there does not seem to be any prior art for this). Any thoughts on this @elastic/es-core-infra?

@jaymode
Member Author

jaymode commented Aug 13, 2020

This was briefly discussed with @rjernst and @mark-vieira. The outcome was that we may need to disable this entirely until CI has a fixed JDK8 build, which in turn depends on a fixed JDK8 being released. As of now, the latest release for both Zulu and AdoptOpenJDK is 8u265, which still has this bug.
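
For illustration only, the affected range named in the issue title could be detected from the `java.version` string as sketched below. This is not something the project did; the thread notes there was no prior art for disabling tasks per JDK version. The class and method names are hypothetical, and treating both u262 and u271 as inclusive bounds is an assumption based on the title.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Elasticsearch build.
public final class AffectedJdkCheck {
    // Matches legacy JDK 8 version strings such as "1.8.0_265" or "1.8.0_265-b01".
    private static final Pattern JDK8_VERSION =
        Pattern.compile("^1\\.8\\.0_(\\d{1,4})(?:-.*)?$");

    // Assumption: both bounds are inclusive, following the issue title.
    private static final int FIRST_AFFECTED_UPDATE = 262;
    private static final int LAST_AFFECTED_UPDATE = 271;

    private AffectedJdkCheck() {}

    /** Returns true if the version string is a JDK 8 update within the assumed affected range. */
    public static boolean isAffected(String javaVersion) {
        Matcher m = JDK8_VERSION.matcher(javaVersion);
        if (!m.matches()) {
            return false; // not a JDK 8 version string in the legacy format
        }
        int update = Integer.parseInt(m.group(1));
        return update >= FIRST_AFFECTED_UPDATE && update <= LAST_AFFECTED_UPDATE;
    }

    public static void main(String[] args) {
        // Reports on the JVM that runs this class.
        String version = System.getProperty("java.version");
        System.out.println(version + " affected: " + isAffected(version));
    }
}
```

A build could call `AffectedJdkCheck.isAffected(...)` with the runtime JDK's version string and skip the fixture when it returns true.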

jaymode added a commit that referenced this issue Aug 13, 2020
This commit changes the value for client name canonicalization to true
in the krb5.conf template file. This is done to work around
JDK-8246193, which has made it into some builds of JDK8.

Closes #61050
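
As a rough sketch of what the commit message describes, the setting might look like the fragment below. The template itself is not shown in this thread, so this assumes it uses the standard krb5.conf `canonicalize` option in the `[libdefaults]` section; the real file contains other settings as well.

```ini
[libdefaults]
    # Assumption: standard krb5.conf option that requests canonicalization
    # of the client principal name; set to true as the workaround.
    canonicalize = true
```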
jaymode added a commit that referenced this issue Aug 14, 2020
This commit changes the value for client name canonicalization to true
in the krb5.conf template file. This is done to work around
JDK-8246193, which has made it into some builds of JDK8.

Closes #61050
Backport of #61119
@jaymode
Member Author

jaymode commented Aug 14, 2020

I found a fix that allows the tests to pass even on the JVMs with this bug and merged it in #61119. I also backported it to the other active branches that run against JDK8. The 7.x Java matrix CI job succeeded after this workaround was merged.
