Add integration tests for the shipper output #23

rdner · 2022-04-13T13:50:34Z

An event batch is published from an input to the shipper gRPC server
An event batch is not dropped when the gRPC server is not available but starts later
An event batch is not dropped when the gRPC server returns the ResourceExhausted code

rdner · 2022-05-09T12:59:34Z

Looking at the interface again, it occurred to me that I never asked one of the questions I had:

Lines 13 to 21 in f2aa770

    
    // Publishes an event via the Elastic agent shipper.
    //
    // Blocks until all processing steps complete and data is written to the queue. Returns a
    // RESOURCE_EXHAUSTED gRPC status code if the queue is full.
    //
    // Inputs may execute multiple concurrent Produce requests for independent data streams.
    // The order in which concurrent requests complete is not guaranteed. Use sequential requests to
    // control ordering.
    rpc PublishEvents(PublishRequest) returns (PublishReply);

Is it fair to assume that, regardless of the returned error, I must always check the list of results and mark the listed events as "accepted to the queue"?

I'll give you an example:

I sent 50 events to the gRPC server
It responds with an error with the gRPC code ResourceExhausted
However, it also returns a list of 20 event results

The way I understand it now, I should treat these 20 events as accepted to the queue and I must retry the remaining 30 events.

Can you please both confirm that this is the right behaviour that we're going to implement on the server?

Also, is this the case with the ResourceExhausted code only, or must I always look up the results and retry the unaccepted events regardless of the error?
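To make the 50/20/30 example concrete, here is a minimal Go sketch of how a client could split a batch after a partial result. It is only an illustration, not the shipper client code: `Event` and `splitBatch` are hypothetical names, and it assumes the server accepts events in order, so the accepted events form a prefix of the batch.

```go
package main

import "fmt"

// Event is a hypothetical placeholder for whatever the input publishes.
type Event struct{ ID int }

// splitBatch divides a batch into the events reported as accepted and the
// events that still need a retry. Assumption: the server accepts a prefix of
// the batch, so a count is enough to identify the accepted events.
func splitBatch(batch []Event, accepted int) (done, retry []Event) {
	// Clamp the count so a bad reply cannot cause an out-of-range slice.
	if accepted < 0 {
		accepted = 0
	}
	if accepted > len(batch) {
		accepted = len(batch)
	}
	return batch[:accepted], batch[accepted:]
}

func main() {
	batch := make([]Event, 50)
	for i := range batch {
		batch[i] = Event{ID: i}
	}

	// The server returned ResourceExhausted together with 20 event results.
	done, retry := splitBatch(batch, 20)
	fmt.Println(len(done), len(retry)) // prints "20 30"
}
```

If the results turn out to identify events individually rather than as a prefix, the split would need to match on identifiers instead of slicing.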

cmacknz · 2022-05-09T13:15:07Z

> The way I understand it now, I should treat these 20 events as accepted to the queue and I must retry the remaining 30 events.

Yes, I think it is reasonable for us to handle partially published batches. If we start implementing it and decide it is too difficult or complex we can re-evaluate that.

> Also, is this the case with the ResourceExhausted code only, or must I always look up the results and retry the unaccepted events regardless of the error?

ResourceExhausted is the most obvious case where some events could succeed but others may fail (because the queue is full). For now you can assume it is the only error with this behaviour.
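The rule agreed on here could be sketched in Go roughly as follows. This is a hedged illustration under stated assumptions, not the code from the shipper output: the error values stand in for gRPC status codes, `publishFunc`, `remaining`, and `publishAll` are hypothetical names, accepted events are assumed to be a prefix of the batch, a nil error is assumed to mean the whole batch was accepted, and backoff between attempts is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for errors carrying gRPC status codes.
var (
	errQueueFull   = errors.New("resource exhausted: queue is full")
	errUnavailable = errors.New("unavailable: server is not running")
)

// publishFunc sends a batch and reports how many leading events were accepted.
type publishFunc func(batch []int) (accepted int, err error)

// remaining returns the events that must be retried after one attempt.
func remaining(batch []int, accepted int, err error) []int {
	switch {
	case err == nil:
		// Assumption: success means the whole batch reached the queue.
		return nil
	case errors.Is(err, errQueueFull):
		// Only this error is treated as a partial result.
		if accepted < 0 {
			accepted = 0
		}
		if accepted > len(batch) {
			accepted = len(batch)
		}
		return batch[accepted:]
	default:
		// Any other error: assume nothing was accepted, retry everything.
		return batch
	}
}

// publishAll retries until the batch is empty or maxAttempts is reached.
func publishAll(batch []int, publish publishFunc, maxAttempts int) (attempts int, left []int) {
	left = batch
	for attempts < maxAttempts && len(left) > 0 {
		accepted, err := publish(left)
		attempts++
		left = remaining(left, accepted, err)
	}
	return attempts, left
}

func main() {
	calls := 0
	// Fake server: unavailable first, then a full queue, then healthy.
	fake := func(batch []int) (int, error) {
		calls++
		switch calls {
		case 1:
			return 0, errUnavailable
		case 2:
			return 20, errQueueFull
		default:
			return len(batch), nil
		}
	}

	attempts, left := publishAll(make([]int, 50), fake, 5)
	fmt.Println(attempts, len(left)) // prints "3 0"
}
```

The fake server walks through the test cases listed at the top of this issue: the batch survives an unavailable server and a ResourceExhausted reply, and is fully published on the third attempt.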

rdner mentioned this issue Apr 13, 2022

[Meta][Feature] Enable filebeat and metricbeat to publish data to the shipper #8

Closed

4 tasks

rdner changed the title ~~Add integration tests that cover the following test cases:~~ Add integration tests for the shipper output Apr 13, 2022

rdner self-assigned this Apr 13, 2022

rdner added the v8.3.0 and Team:Elastic-Agent-Data-Plane labels Apr 13, 2022

rdner mentioned this issue May 9, 2022

Add tests for shipper output and add support for partial results elastic/beats#31558

Merged

5 tasks

rdner closed this as completed in elastic/beats#31558 May 11, 2022

cmacknz mentioned this issue Aug 22, 2022

[Meta] Shipper 8.5 - Experimental integration with Filebeat and Metricbeat #15

Closed

29 tasks
