-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SIEM][Detection Engine][Lists] Adds the ability to change the timeout limits from 10 seconds for loads for imports #73103
[SIEM][Detection Engine][Lists] Adds the ability to change the timeout limits from 10 seconds for loads for imports #73103
Conversation
…loads for imports
Pinging @elastic/siem (Team:SIEM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allowing plugin to have per route configuration of timeout probably makes sense in some edge cases such as uploads or long processing, so I don't have any functional objection.
What concerns me with the implementation is that adding these three 'raw' hapi option might be a little too 'low level'. Maybe adding some sugar to let consumers configure their timeout with a single option could be possible? (see comment)
…the docs to reflect this change
} | ||
); | ||
registerRouter(router); | ||
await expect(server.start()).rejects.toThrowError('Invalid routeConfig options'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The message doesn't contain an invalid option name
- you cover this in http config test https://github.com/elastic/kibana/pull/73103/files#diff-a030b4d8ada02fb446eccfaa965fb895, so probably it's fine not to test this logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test and error is actually me testing and exposing the errors that Hapi server is giving back to us here. I don't do a throw on the zero and neither does the Kibana Schema for this particular test. This test exercises the route as if the route was trying to send down a socket timeout of zero with no Kibana Schema validation and the error that comes back is from Hapi Server which does not tell you which invalid route config option you have. Hapi just does a generic throw which is, Invalid routeConfig options
and is not really under my control unless I begin catching and abstracting away Hapi exceptions (which I didn't see anywhere else at this moment us doing).
This boundary test I found useful as it gave me confidence that if someone tries to create a route without any schema validation and pushes down a zero then at least Hapi will throw an error and that route will not be considered valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd avoid testing Hapi internals then. It's easy to have false positives on the next Hapi update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will remove it.
const router = createRouter('/'); | ||
|
||
router.post( | ||
{ path: '/a', validate: false, options: { timeout: 300000 } }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tests and the others around it test that if the timeout is set it will work (sanity test). I didn't think adding an incredibly long lived test to ensure they can exceed the 2 minutes would be ok so I instead opted for a simple sanity check.
But really exceeding the full 2 minutes but not exceeding 5 minutes would be a typical positive unit test that it is possible to exceed the default 2 minutes and operates as we would want it. However, during integration tests I thought I probably shouldn't exceed 2 minutes for the test so instead I just opted for a test that shows when it is set on a regular endpoint call it will operate normally and not cause any issues such as unexpected timeouts.
); | ||
router.delete({ path: '/b', validate: false }, (context, req, res) => res.ok({})); | ||
await server.start(); | ||
expect(supertest(innerServer.listener).delete('/a')).rejects.toThrow('socket hang up'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to add a test for 408 Request timeout
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 408 Request timeout only happens when it is a PUT/POST with a form/payload:
https://hapi.dev/api/?v=19.2.0#-routeoptionspayloadtimeout
And then you have to simulate a connection that starts but then stalls or does it so slowly that the backend decides to then disconnect you because you're taking too long. I looked around at supertest
yesterday for a quite a while hoping it had an easy way to explicitly test the condition but it does not look like there is a way for me to be able to start an upload and then stall it.
For the other endpoints such as GET, DELETE which do not take a payload when the socket does a disconnect Hapi server will do a hangup on them on a timeout and not give back a 408 Request timeout. It doesn't advertise that is its behavior but that is the behavior as the tests are showing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can set a different switch which is the timeout.server here:
https://hapi.dev/api/?v=19.2.0#-routeoptionstimeoutserver
And then increase the socket timeout to be slightly higher than that and instead of a socket hangup we would get a:
Service Unavailable (503) error response.
For GET, DELETE endpoints and a 408 for when its payload based with PUT/POST, but it seemed like the socket hangup was more preferred from the example you sent me in the tests. Just let me know if you want me to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay if you merge it as is and create an issue for the platform team to add such a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticket I created for this as requested:
#73557
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
💚 Build SucceededBuild metrics
History
To update your PR or re-run it, just comment with: |
…t limits from 10 seconds for loads for imports (elastic#73103) ## Summary By default the upload time limit for payloads is 10 seconds. This is really too short and we were getting internal QA bug reports that uploads are timing out on large value list importing. This PR adds the plumbing and unit tests to make the timeout configurable for routes. * Adds a single timeout option for routes and then normalizes that through Hapi for the socket, payload, and server timeouts. * Adds unit tests which test the various options * Adds integration tests which test the various options * Adds some NOTES about where there are odd behaviors/bugs within Hapi around validations and the timeouts * Adds a configurable 5 minute timeout to the large value lists route **Manual testing of the feature** You can manually test this by adding a configurable option to your chrome network throttle like so below where you throttle upload by some configurable amount. I chose to use 300 kbs/s upload <img width="556" alt="Screen Shot 2020-07-23 at 11 26 01 AM" src="https://user-images.githubusercontent.com/1151048/88318015-5ab3f700-ccd7-11ea-9d9b-7e3649ec65de.png"> And then run an import of large value lists using a large enough file that it will exceed 5 minutes: ![screen-shot-upload](https://user-images.githubusercontent.com/1151048/88318584-28ef6000-ccd8-11ea-90a1-8ca4aafabcb4.png) After 5 minutes you should see this message within your server side messages if you have configured your kibana.dev.yml to allow for these messages: ```ts server respons [10:52:31.377] [access:lists-all] POST /api/lists/items/_import?type=keyword 408 318292ms - 9.0B ``` Note that it should show you that it is trying to return a `408` after `318292ms` the timeout period. Sometimes you will get the 408 in the browser and sometimes the browser actually will not respect the 408 and continue staying in a pending state forever. This seems to be browser side issue and not a client/user land issue. If you get the browser message it will be this error toaster ![timeout-message](https://user-images.githubusercontent.com/1151048/88318760-74a20980-ccd8-11ea-9b7b-0d27f8eb6bce.png) ### Checklist - [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/master/packages/kbn-i18n/README.md) - [x] [Documentation](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#writing-documentation) was added for features that require explanation or tutorials - [x] [Unit or functional tests](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#cross-browser-compatibility) were updated or added to match the most common scenarios ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#release-notes-process)
…t limits from 10 seconds for loads for imports (elastic#73103) ## Summary By default the upload time limit for payloads is 10 seconds. This is really too short and we were getting internal QA bug reports that uploads are timing out on large value list importing. This PR adds the plumbing and unit tests to make the timeout configurable for routes. * Adds a single timeout option for routes and then normalizes that through Hapi for the socket, payload, and server timeouts. * Adds unit tests which test the various options * Adds integration tests which test the various options * Adds some NOTES about where there are odd behaviors/bugs within Hapi around validations and the timeouts * Adds a configurable 5 minute timeout to the large value lists route **Manual testing of the feature** You can manually test this by adding a configurable option to your chrome network throttle like so below where you throttle upload by some configurable amount. I chose to use 300 kbs/s upload <img width="556" alt="Screen Shot 2020-07-23 at 11 26 01 AM" src="https://user-images.githubusercontent.com/1151048/88318015-5ab3f700-ccd7-11ea-9d9b-7e3649ec65de.png"> And then run an import of large value lists using a large enough file that it will exceed 5 minutes: ![screen-shot-upload](https://user-images.githubusercontent.com/1151048/88318584-28ef6000-ccd8-11ea-90a1-8ca4aafabcb4.png) After 5 minutes you should see this message within your server side messages if you have configured your kibana.dev.yml to allow for these messages: ```ts server respons [10:52:31.377] [access:lists-all] POST /api/lists/items/_import?type=keyword 408 318292ms - 9.0B ``` Note that it should show you that it is trying to return a `408` after `318292ms` the timeout period. Sometimes you will get the 408 in the browser and sometimes the browser actually will not respect the 408 and continue staying in a pending state forever. This seems to be browser side issue and not a client/user land issue. If you get the browser message it will be this error toaster ![timeout-message](https://user-images.githubusercontent.com/1151048/88318760-74a20980-ccd8-11ea-9b7b-0d27f8eb6bce.png) ### Checklist - [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/master/packages/kbn-i18n/README.md) - [x] [Documentation](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#writing-documentation) was added for features that require explanation or tutorials - [x] [Unit or functional tests](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#cross-browser-compatibility) were updated or added to match the most common scenarios ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#release-notes-process)
…t limits from 10 seconds for loads for imports (#73103) (#73581) ## Summary By default the upload time limit for payloads is 10 seconds. This is really too short and we were getting internal QA bug reports that uploads are timing out on large value list importing. This PR adds the plumbing and unit tests to make the timeout configurable for routes. * Adds a single timeout option for routes and then normalizes that through Hapi for the socket, payload, and server timeouts. * Adds unit tests which test the various options * Adds integration tests which test the various options * Adds some NOTES about where there are odd behaviors/bugs within Hapi around validations and the timeouts * Adds a configurable 5 minute timeout to the large value lists route **Manual testing of the feature** You can manually test this by adding a configurable option to your chrome network throttle like so below where you throttle upload by some configurable amount. I chose to use 300 kbs/s upload <img width="556" alt="Screen Shot 2020-07-23 at 11 26 01 AM" src="https://user-images.githubusercontent.com/1151048/88318015-5ab3f700-ccd7-11ea-9d9b-7e3649ec65de.png"> And then run an import of large value lists using a large enough file that it will exceed 5 minutes: ![screen-shot-upload](https://user-images.githubusercontent.com/1151048/88318584-28ef6000-ccd8-11ea-90a1-8ca4aafabcb4.png) After 5 minutes you should see this message within your server side messages if you have configured your kibana.dev.yml to allow for these messages: ```ts server respons [10:52:31.377] [access:lists-all] POST /api/lists/items/_import?type=keyword 408 318292ms - 9.0B ``` Note that it should show you that it is trying to return a `408` after `318292ms` the timeout period. Sometimes you will get the 408 in the browser and sometimes the browser actually will not respect the 408 and continue staying in a pending state forever. This seems to be browser side issue and not a client/user land issue. If you get the browser message it will be this error toaster ![timeout-message](https://user-images.githubusercontent.com/1151048/88318760-74a20980-ccd8-11ea-9b7b-0d27f8eb6bce.png) ### Checklist - [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/master/packages/kbn-i18n/README.md) - [x] [Documentation](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#writing-documentation) was added for features that require explanation or tutorials - [x] [Unit or functional tests](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#cross-browser-compatibility) were updated or added to match the most common scenarios ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#release-notes-process)
…t limits from 10 seconds for loads for imports (#73103) (#73582) ## Summary By default the upload time limit for payloads is 10 seconds. This is really too short and we were getting internal QA bug reports that uploads are timing out on large value list importing. This PR adds the plumbing and unit tests to make the timeout configurable for routes. * Adds a single timeout option for routes and then normalizes that through Hapi for the socket, payload, and server timeouts. * Adds unit tests which test the various options * Adds integration tests which test the various options * Adds some NOTES about where there are odd behaviors/bugs within Hapi around validations and the timeouts * Adds a configurable 5 minute timeout to the large value lists route **Manual testing of the feature** You can manually test this by adding a configurable option to your chrome network throttle like so below where you throttle upload by some configurable amount. I chose to use 300 kbs/s upload <img width="556" alt="Screen Shot 2020-07-23 at 11 26 01 AM" src="https://user-images.githubusercontent.com/1151048/88318015-5ab3f700-ccd7-11ea-9d9b-7e3649ec65de.png"> And then run an import of large value lists using a large enough file that it will exceed 5 minutes: ![screen-shot-upload](https://user-images.githubusercontent.com/1151048/88318584-28ef6000-ccd8-11ea-90a1-8ca4aafabcb4.png) After 5 minutes you should see this message within your server side messages if you have configured your kibana.dev.yml to allow for these messages: ```ts server respons [10:52:31.377] [access:lists-all] POST /api/lists/items/_import?type=keyword 408 318292ms - 9.0B ``` Note that it should show you that it is trying to return a `408` after `318292ms` the timeout period. Sometimes you will get the 408 in the browser and sometimes the browser actually will not respect the 408 and continue staying in a pending state forever. This seems to be browser side issue and not a client/user land issue. If you get the browser message it will be this error toaster ![timeout-message](https://user-images.githubusercontent.com/1151048/88318760-74a20980-ccd8-11ea-9b7b-0d27f8eb6bce.png) ### Checklist - [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/master/packages/kbn-i18n/README.md) - [x] [Documentation](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#writing-documentation) was added for features that require explanation or tutorials - [x] [Unit or functional tests](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#cross-browser-compatibility) were updated or added to match the most common scenarios ### For maintainers - [x] This was checked for breaking API changes and was [labeled appropriately](https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md#release-notes-process)
Pinging @elastic/security-solution (Team: SecuritySolution) |
Summary
By default the upload time limit for payloads is 10 seconds. This is really too short and we were getting internal QA bug reports that uploads are timing out on large value list importing. This PR adds the plumbing and unit tests to make the timeout configurable for routes.
Manual testing of the feature
You can manually test this by adding a configurable option to your chrome network throttle like so below where you throttle upload by some configurable amount. I chose to use 300 kbs/s upload
And then run an import of large value lists using a large enough file that it will exceed 5 minutes:
After 5 minutes you should see this message within your server side messages if you have configured your kibana.dev.yml to allow for these messages:
Note that it should show you that it is trying to return a
408
after318292ms
the timeout period. Sometimes you will get the 408 in the browser and sometimes the browser actually will not respect the 408 and continue staying in a pending state forever. This seems to be browser side issue and not a client/user land issue. If you get the browser message it will be this error toasterChecklist
For maintainers