Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OnWeek][Discover] Allow to fetch more documents on Discover page #157241

Closed
wants to merge 67 commits into from

Conversation

jughosta
Copy link
Contributor

@jughosta jughosta commented May 10, 2023

Per docs https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html
Screenshot 2023-05-10 at 10 25 20

Summary

  1. This PR improves search_after pagination for date_nanos time fields. sort value will be returned from ES as a string instead of a rounded and incorrect timestamp. This change allows to also simplify logic on Surrounding document page.

Before:
Screenshot 2023-05-08 at 17 36 19

After:
Screenshot 2023-05-08 at 17 37 13

  1. Also in this PR we now allow users to load more documents within the same time range. Once the button is pressed, it will load next portion of documents (same "sampleSize" value will be used). Currently, we limit max total loaded documents to 10000.

"Load more" demo:
Aug-07-2023 16-23-28

If refresh interval is on, the button becomes disabled:
Aug-07-2023 16-24-58

Date nanos demo:
Aug-07-2023 16-34-59

100x Flaky test runner https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2801

@jughosta jughosta added Feature:Discover Discover Application backport:skip This commit does not require backporting Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) labels May 10, 2023
@jughosta jughosta self-assigned this May 10, 2023
@jughosta jughosta changed the title [Discover] Improve date_nanos support for "search_after" pagination [OnWeek][Discover] Allow to fetch more documents on Discover page May 11, 2023
@jughosta jughosta changed the title [OnWeek][Discover] Allow to fetch more documents on Discover page [OnWeek][Discover] POC Allow to fetch more documents on Discover page May 11, 2023
}
return sortPair as EsQuerySortValue;
});

// TODO: do we need to have a tie breaker like this?
Copy link
Member

@kertal kertal May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a tie breaker like mentioned in this Issue ? #47934 and implemented here: #32426

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kertal Yes, in this implementation I am using a tie breaker as it's done in context app: based on Advanced Setting context:tieBreakerFields. I think it makes sense to add it but this change can be considered as a breaking change (also for CSV). So I would need your advice on whether to add it now or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the breaking change. Is it required or optional for the implementation? In terms of breaking change, it don't think it would break existing functionality inside Discover, but, yes, mainly it's about CSV, it could lead to a different sort order when exporting and the tie breaker is not implemented.

Would it be an option to exclude the tiebreaker for this PR and implement it in a follow up / later, with also CSV included?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kertal Current implementation of CSV relies on Discover search source sort param and reuses it's value. That means that if a tie breaker is not used for Discover then it's not used for CSV generation at the moment. @tsullivan please correct if my assumption is wrong. It's a good question if an explicit tie breaker is required or not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to make sure, since we continued the discussion "offline"
@jughosta mentioned

CSV export is done with search_after and pit so I guess according to docs it is running with _shard_doc by default

So if we introduce a tiebreaker for Discover like _doc that we use for incremental data fetching, this would not influence CSV generation, on the contrary, it would make the data fetching more similar, since we'd also use a tie breaker. Can you confirm @tsullivan @rshen91 ? Many thx!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick update here: added _doc as a tie breaker for Discover only so far. It would not affect CSV at this point.

Comment on lines 156 to 160
if (recordRawType === RecordRawType.PLAIN) {
// not supported yet
// TODO?
return Promise.resolve();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think for the initial version this is required, also I assume the logic would be very different, right?

@@ -147,7 +147,7 @@ export function searchSourceFromLocatorFactory(services: LocatorServicesDeps) {

// Inject sort
if (savedSearch.sort) {
const sort = getSortForSearchSource(savedSearch.sort as Array<[string, string]>, index);
const sort = getSortForSearchSource(savedSearch.sort as Array<[string, string]>, index, null); // TODO: is it correct that it was using a `desc` as default sort here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think now it's correct, thx for catching!

Copy link
Member

@kertal kertal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be a great little improvement, breaking out of the usual 500 limit is very useful 🥳
One thing that that would be neat, currently, when loading more data, also histogram data is loaded. But in this case, just loading the document should be sufficient.

@kertal
Copy link
Member

kertal commented Jun 22, 2023

Adding some notes of our planning:

  • We need to figure out if the maximum of 3000 is a reasonable number (when testing)
  • In case auto-refresh is on, the load more link should be disabled (with a tooltip to inform the user)

…agination

# Conflicts:
#	src/plugins/discover/public/application/context/hooks/use_context_app_fetch.tsx
#	src/plugins/discover/public/application/context/services/context.ts
#	src/plugins/discover/public/application/main/components/layout/discover_documents.tsx
#	src/plugins/discover/public/application/main/services/discover_data_state_container.ts
#	src/plugins/discover/public/components/discover_grid/discover_grid.tsx
PhilippeOberti and others added 12 commits August 14, 2023 10:18
…vestigation order and expand Investigation by default (elastic#163684)
## Summary

Some simple dev UX improvements to the swap_references data views api - 

```
POST /api/data_views/swap_references/_preview
{
     "fromId" : "abcd-efg",
     "toId" : "xyz-123"
}

returns 
{
  result: [{ id: "123", type: "visualization" }],
}
```


```
POST /api/data_views/swap_references
{
     "fromId" : "abcd-efg",
     "toId" : "xyz-123",
     "delete" : true // optional, removes data view which is no longer referenced
}

returns 
{
  result: [{ id: "123", type: "visualization" }],
  deleteStatus: {
    remainingRefs: 0,
    deletePerformed: true
}
```

Additional params - 

```
fromType: string - specify the saved object type. Default is `index-pattern` for data view
forId: string | string[] - limit the affected saved objects to one or more by id
forType: string - limit the affected saved objects by type
```

Improves upon elastic#157665

Docs will be created in follow up PR
…ewer cypress test (elastic#162839)

## Summary

Ref: elastic#162569

The test was trying to load the exceptions tab before the rule details
page loaded. Now we wait for the rule tab to load before continuing.
Something I was unaware of was that `cy.url()` will [automatically
retry](https://docs.cypress.io/api/commands/url#Assertions) until all
chained assertions have passed, which I think can be an easy way to fix
future flake issues where cypress tries to click on an element before
the new page loads.
…amper protection only available if security defend integration present (elastic#162196)

## Summary
UI
- [x] When there is no elastic defend integration present, the agent
tamper protection (`is_protected`) switch and instruction link are
disabled and there is an info tooltip explaining why the switch is
disabled

API
- [x] Requires the elastic defend integration to be present, in order to
set `is_protected` to true. Will allow the user to create the agent
policy and not throw an error, but will keep `is_protected` as false and
log a warning in the kibana server. In the next release, the response
will be modified to send back a 201 with the relevant messaging.
- [x] Sets `is_protected` to false when a user deletes the elastic
defend package policy

## Screenshots

### No Elastic Defend integration installed
<img width="970" alt="image"
src="https://github.com/elastic/kibana/assets/56409205/910be766-1a1e-4580-9ace-306089b4626d">
….10.0 (elastic#163627)

## Summary

Closes elastic#157456

Secret storage requires that fleet servers are 8.10.0 or above. 

This PR adds a backend check that all fleet servers are above 8.10.0
before enabling secrets storage. Once all fleet servers are above that
version, secrets are permanently enabled.

the fleet server check checks all agents in policies that contain the
fleet server package.

A flag on the`ingest_manager_settings` saved. object
`secret_storage_requirements_met` is used to make a note that the check
has previously passed, meaning we don't have to keep querying the agents
and policies.

Test scenarios (all covered by integration tests) : 

- given a deployment with no fleet servers connected, on creating a
package policy with secret variables, the values should be stored in
plain text not as a secret reference
- given a deployment with at least one fleet server that is below
8.10.0, on creating a package policy with secret variables, the values
should be stored in plain text not as a secret reference
- given a deployment where all fleet servers are 8.10.0 or above,
secrets should be stored as secret references and in the secrets index
- if a package policy was created before secrets were enabled, and since
its creation the fleet server versions pass the check, when updating
that policy, all secrets should move to being secret references.

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
Co-authored-by: Julia Bardi <[email protected]>
@jughosta jughosta requested review from a team as code owners August 14, 2023 08:57
@jughosta
Copy link
Contributor Author

jughosta commented Aug 14, 2023

Sorry all for the ping.

Reopened the PR in #163784

@jughosta jughosta closed this Aug 14, 2023
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
discover 627 629 +2
telemetry 30 31 +1
total +3

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
securitySolution 128 129 +1

Any counts in public APIs

Total count of every any typed public API. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats any for more detailed information.

id before after diff
securitySolution 2 3 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
controls 199.8KB 199.3KB -545.0B
discover 557.4KB 562.4KB +5.0KB
fleet 1022.3KB 1022.7KB +446.0B
securitySolution 15.6MB 15.6MB -13.4KB
total -8.5KB

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
securitySolution 32 33 +1

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
fleet 138.7KB 138.8KB +74.0B
securitySolution 58.8KB 58.9KB +122.0B
securitySolutionServerless 25.4KB 25.4KB +26.0B
telemetry 19.2KB 19.2KB -15.0B
total +207.0B
Unknown metric groups

API count

id before after diff
securitySolution 194 195 +1

References to deprecated APIs

id before after diff
discover 20 19 -1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @jughosta

jughosta added a commit that referenced this pull request Aug 17, 2023
…63784)

> [!WARNING]
> Sorry, I had to recreate the PR
#157241
> Please submit your review again.

- Closes #155019

Per docs
https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html
<img width="851" alt="Screenshot 2023-05-10 at 10 25 20"
src="https://github.com/elastic/kibana/assets/1415710/b4b9fef4-7dd8-40ed-8244-343889fc4367">


## Summary

1. This PR improves `search_after` pagination for `date_nanos` time
fields. `sort` value will be returned from ES as a string instead of a
rounded and incorrect timestamp. This change allows to also simplify
logic on Surrounding document page.

Before:
<img width="400" alt="Screenshot 2023-05-08 at 17 36 19"
src="https://github.com/elastic/kibana/assets/1415710/fd9f45c4-5dc2-4103-83b9-8810e3a6e0df">

After:
<img width="400" alt="Screenshot 2023-05-08 at 17 37 13"
src="https://github.com/elastic/kibana/assets/1415710/fe9090c0-2116-4f77-9a57-a96ae6b00365">

2. Also in this PR we now allow users to load more documents within the
same time range. Once the button is pressed, it will load next portion
of documents (same "sampleSize" value will be used). Currently, we limit
max total loaded documents to 10000.

"Load more" demo:
![Aug-07-2023
16-23-28](https://github.com/elastic/kibana/assets/1415710/53af9809-75cb-4b8a-8e99-d8f6d76b4981)

If refresh interval is on, the button becomes disabled:
![Aug-07-2023
16-24-58](https://github.com/elastic/kibana/assets/1415710/85db6144-98eb-40b5-ac88-80ea728bcd6b)

Date nanos demo:
![Aug-07-2023
16-34-59](https://github.com/elastic/kibana/assets/1415710/dc9fe0b1-e419-4c76-9fc6-79907b134e58)


100x Flaky test runner
https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/2801

---------

Co-authored-by: kibanamachine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Discover Discover Application release_note:enhancement Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Discover] Allow to load documents beyond the limit of discover:sampleSize