-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KV CryptographyClient.getKeys 401s with Token not yet valid #26624
Comments
@vcolin7 could you please assist @garantir-km with this github issue? |
Checking back on this. Anybody know what the issue could have been? |
Hi @garantir-km, sorry for the late response. I searched around for a bit and found a few instances of similar issues such as this one, this one and this one. The common factor in all these was that the time on the machines was ahead of the time on the servers, which caused the token that was generated to not be valid yet. However, those issues were using software other than the Azure SDK (such as the Azure CLI). I ran some tests using the latest version of our Azure Key Vault Keys and Azure Identity libraries and could not reproduce this error after changing the time on my local machine, so I don't think being out of sync could have been the issue. It looks like when using a Maybe it's possible that for a brief window the auth server was not in sync with the time on the Key Vault? I am not an expert in Azure infrastructure, so I will have to bring this up with some folks internally to find answers. I am inclined to say that the issue probably does not lie with the Azure SDK, but some other component. Just in case, would you mind sharing what version of |
Hi @vcolin7, thank you for the details. We did our own quick analysis of the code and came to the same conclusion that the timestamp of our server does not get used during token generation. Like you, we suspect that Azure's authorization server is generating the timestamp and there was possibly some time sync issue between the Key Vault / Managed HSM server that processes the request and the authorization server that generated the token. Interestingly, the same token would sometimes work and sometimes fail. Our server uses that same token in a multi-threaded environment and most requests succeeded. Unless the SDK is smart enough to regenerate the token when this error occurs, I believe all those requests used the same token. The versions of the SDK we are using are: We look forward to hearing what your internal folks have to say about this issue. |
Talked to other folks working on the Key Vault Azure SDK and they also think it most likely was some slight de-sync between servers. Without having the full logs or at least the request IDs I don't think there's much that can be done to figure out the exact details of what happened. Maybe this is something you should route to your Microsoft support contact to have a support engineer look into it further. I will be closing this as it is not an issue on the Azure SDK end, but the service's. |
Describe the bug
Everything has been working fine in our environment and then all of a sudden at 2022-01-20 07:46:05 (UTC) we started getting 401 errors when making Key Vault requests (key retrieval and sign operations). The error message was always "Token not yet valid (Activity ID: XXXX-YYY-ZZZ)". The error showed up intermittently but pretty frequently for about 12 hours. We made no changes to our code and we have been successfully using this code in this environment (Azure Managed HSM) for a few weeks now. The computers where we ran the tests all had NTP configured and as soon as we detected the error we checked the date/time and everything looked fine.
As I write this right now, I am not currently able to reproduce the issue anymore, even though we have made no changes to the code, runtime, or to any system clocks.. However, we, along with our customers, are concerned for the stability of this in production and would love an explanation so that we can sleep comfortably and know what to do if this happens again in the future.
Exception or Stack Trace
com.azure.core.exception.HttpResponseException: Status code 401, "{"error":{"code":"Unauthorized","message":"Token not yet valid (Activity ID: XXXX-YYY-ZZZZZ)"}}"
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.azure.core.http.rest.RestProxy.instantiateUnexpectedException(RestProxy.java:389)
at com.azure.core.http.rest.RestProxy.lambda$ensureExpectedStatus$7(RestProxy.java:444)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:125)
at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.signalCached(MonoCacheTime.java:337)
at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.onNext(MonoCacheTime.java:354)
at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2398)
at reactor.core.publisher.MonoCacheTime$CoordinatorSubscriber.onSubscribe(MonoCacheTime.java:293)
at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:192)
at reactor.core.publisher.MonoFlatMap.subscribeOrReturn(MonoFlatMap.java:53)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:57)
at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
at reactor.core.publisher.MonoCacheTime.subscribeOrReturn(MonoCacheTime.java:143)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:57)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2398)
at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.set(Operators.java:2194)
at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.onSubscribe(Operators.java:2068)
at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:157)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:120)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:151)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:249)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1816)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:249)
at reactor.core.publisher.SerializedSubscriber.onNext(SerializedSubscriber.java:99)
at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onNext(FluxRetryWhen.java:174)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
at reactor.core.publisher.Operators$MonoInnerProducerBase.complete(Operators.java:2664)
at reactor.core.publisher.MonoSingle$SingleSubscriber.onComplete(MonoSingle.java:180)
at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onComplete(MonoFlatMapMany.java:260)
at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onComplete(FluxMapFuseable.java:150)
at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:145)
at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onComplete(FluxMapFuseable.java:150)
at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1817)
at reactor.core.publisher.MonoCollect$CollectSubscriber.onComplete(MonoCollect.java:159)
at reactor.core.publisher.FluxHandle$HandleSubscriber.onComplete(FluxHandle.java:213)
at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:269)
at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:400)
at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:419)
at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:473)
at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:684)
at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:93)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1372)
at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1246)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1286)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Suppressed: java.lang.Exception: #block terminated with an error
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
at reactor.core.publisher.Mono.block(Mono.java:1706)
at com.azure.security.keyvault.keys.cryptography.CryptographyClient.getKeyWithResponse(CryptographyClient.java:91)
at com.azure.security.keyvault.keys.cryptography.CryptographyClient.getKey(CryptographyClient.java:69)
at ...
To Reproduce
Any Key Vault call via the Java SDK would produce the result for us. Our Managed HSM is deployed in Canada and the servers that we used to test are located in California and Virginia.
Code Snippet
Expected behavior
Cryptographic operation succeeds as it normally does. In the example given here, that means the key is retrieved successfully.
Setup (please complete the following information):
Additional context
We are using a Managed HSM in Canada. Our servers that run our code are located in California and Virginia.
Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report
The text was updated successfully, but these errors were encountered: