Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RequestId to request and response and use it in EventSource #5600

Merged
merged 8 commits into from
Mar 27, 2019

Conversation

pakrym
Copy link
Contributor

@pakrym pakrym commented Mar 26, 2019

Fixes: #5599

RequestId automatically flows from request to response in transport.

Change EventSource to explicitly log parts of the request/response without relying on ToString implementations.

Add logging of request/response content if verbose level is enabled.

Log error content at Information level.

const int LOG_DELAY = 5;
const int LOG_ERROR_RESPONSE = 6;
// Maximum event size is 32K but we need to account for other data
private const int MaxEventPayloadSize = 31000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you pick up these const values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrongly. I mixed up EventLog (that has 32K limit) and EventSource (~10K limit).

Last time I wrote EventSource that produces a lot of logs I experimented with payload sized and it seems like even though https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.eventing.eventprovider.writeevent?view=netframework-4.7.2 claims the limit is 64K messages with payloads larger than 10K would often get lost.

@@ -97,6 +97,7 @@ sealed class PipelineRequest : HttpPipelineRequest
public PipelineRequest()
{
_requestMessage = new HttpRequestMessage();
CorrelationId = Guid.NewGuid().ToString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be specific about how to write the guid into a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correlation id is a transparent value, we can decide on the format we like more but it doesn't matter much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this a little bit more and the commonly used format seems to be the default guid.ToString() format client-request-id: 9C4D50EE-2D56-4CD3-8152-34347DC9F2B0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for the format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a means to pass an existing CorrelationId to the request? The pattern that I've typically used/seen is to accept a correlation identifier and if one isn't populated, then generate one for this operation that bubbles down into any related calls. It could be useful if we're working against a service that requires a sequence of related operations or potentially exposing logging to SDK consumers at some point.

@pakrym
Copy link
Contributor Author

pakrym commented Mar 26, 2019

@maririos
Copy link
Member

This property was called ClientRequestID

You mean the CorrelationId?
The App Config team uses x-ms-correlation-id for correlation and ClientId for ClientRequest.
See https://github.com/Azure/azure-sdk-for-net-lab/issues/120#issuecomment-462993401

await responseContent.CopyToAsync(memoryStream);

// Rewind the stream
responseContent.Position = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check CanSeek before setting the position since this takes a base Stream instead of something specialized? Perhaps, it may be worth considering changing the type to something more specialized that we're expecting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we can reasonably support non-seekable response streams given our overall requirements.

Should we check CanSeek before setting the position since this takes a base Stream instead of something specialized

If the stream is not seekable and we've read the body we won't be able to deserialize the response at the end of the pipeline.

Another solution to this is wrapping the response stream into a logging stream so data can be logged while it's being read but it's way more complicated.

@@ -97,6 +97,7 @@ sealed class PipelineRequest : HttpPipelineRequest
public PipelineRequest()
{
_requestMessage = new HttpRequestMessage();
CorrelationId = Guid.NewGuid().ToString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a means to pass an existing CorrelationId to the request? The pattern that I've typically used/seen is to accept a correlation identifier and if one isn't populated, then generate one for this operation that bubbles down into any related calls. It could be useful if we're working against a service that requires a sequence of related operations or potentially exposing logging to SDK consumers at some point.

@jsquire
Copy link
Member

jsquire commented Mar 27, 2019

This property was called ClientRequestID (https://docs.microsoft.com/en-us/dotnet/api/microsoft.windowsazure.storage.operationcontext.clientrequestid?redirectedfrom=MSDN&view=azure-dotnet#Microsoft_WindowsAzure_Storage_OperationContext_ClientRequestID) in other SDKs so we might want to keep the name. Header name is also x-ms-client-request-id (https://docs.microsoft.com/en-us/rest/api/storageservices/Storage-Analytics-Log-Format?redirectedfrom=MSDN)

Personally, my vote goes to CorrelationId. In my experiences, this is the more standard nomenclature across a services-based application and, from the perspective of the consuming application, it is a correlation between each logical operation that may span multiple calls to multiple services. I can understand why an Azure service may prefer to reference as a client request identifier but, in my opinion, the SDK should reflect a consumer perspective.

@pakrym
Copy link
Contributor Author

pakrym commented Mar 27, 2019

@jsquire there is another feature we might want to use "correlation id" for - correlating multiple requests as part of a single logical operation.

@pakrym
Copy link
Contributor Author

pakrym commented Mar 27, 2019

Is there a means to pass an existing CorrelationId to the request? The pattern that I've typically used/seen is to accept a correlation identifier and if one isn't populated, then generate one for this operation that bubbles down into any related calls. It could be useful if we're working against a service that requires a sequence of related operations or potentially exposing logging to SDK consumers at some point.

It can be set on the request in client SDK or via the pipeline policy.

@jsquire
Copy link
Member

jsquire commented Mar 27, 2019

@jsquire there is another feature we might want to use "correlation id" for - correlating multiple requests as part of a single logical operation.

That's what I'm advocating for; I'm likely missing something, but I don't see how that would be a separate scenario rather than an extension of this one.

@pakrym
Copy link
Contributor Author

pakrym commented Mar 27, 2019

That's what I'm advocating for; I'm likely missing something, but I don't see how that would be a separate scenario rather than an extension of this one.

There are two scenarios here:

  • Correlating request/response/log messages related to a single pipeline invocation
  • Correlating multiple pipeline invocations between themselves

And I don't think a single id can be used for both.

I'm trying to solve the former in this PR the later is a larger work item that requires more design.

@pakrym pakrym changed the title Add CorrelationId to request and response and use it in EventSource Add RequestId to request and response and use it in EventSource Mar 27, 2019
@@ -29,6 +29,8 @@ public virtual void AddHeader(string name, string value)

public abstract IEnumerable<HttpHeader> Headers { get; }

public abstract string RequestId { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be typed as Guid? It would allow us to lazy allocate string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might end up allocating the same string multiple times when using it. RequestId is used multiple times in 99.9% of the pipelines so this string would always be allocated it's better to cache it.

@pakrym pakrym merged commit 1028dd7 into Azure:master Mar 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants