Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rename advanced setting ml:fileDataVisualizerMaxFileSize to fileUpload:maxFileSize and increase max geojson upload size to 1GB #92620

Merged
merged 32 commits into from
Mar 3, 2021

Conversation

nreese
Copy link
Contributor

@nreese nreese commented Feb 24, 2021

fixes #46376, #60710, #92504

This PR moves advanced setting ml:fileDataVisualizerMaxFileSize to the file_upload plugin and renames the setting to fileUp\load:maxFileSize. There is a migration script to migrate ml:fileDataVisualizerMaxFileSize to fileUpload:maxFileSize. The setting is now displayed in the general section.

This PR updates geojson upload from a hard coded 50MB to using fileUpload:maxFileSize. This means that geojson upload size will increase from 50MB to 100MB and upto 1GB. Because of this increased file size, geojson importing had to be re-worked so that the entire file is never held in memory at a single time. Chunks of the file are read and indexed in Elasticsearch. The biggest change for users is that they will no longer get to preview the entire file. Instead, they get to preview only the first 10000 features or 3MB, whichever is less.

Screen Shot 2021-02-24 at 7 16 16 AM

@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-gis (Team:Geo)

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@nreese nreese changed the title rename advanced setting ml:fileDataVisualizerMaxFileSize to fileUppload:maxFileSize and increase max geojson upload size to 1GB rename advanced setting ml:fileDataVisualizerMaxFileSize to fileUpload:maxFileSize and increase max geojson upload size to 1GB Feb 24, 2021
@nreese
Copy link
Contributor Author

nreese commented Feb 24, 2021

@elasticmachine merge upstream

@nreese
Copy link
Contributor Author

nreese commented Feb 24, 2021

#92504 is resolved by changes made to this PR.

@nreese nreese requested a review from a team as a code owner February 24, 2021 16:36
Copy link
Member

@jgowdyelastic jgowdyelastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for adding the icon and tooltip for the preview-layer in the LayerTOC

created #93261 and #93254 for separate follow-up

@kindsun
Copy link
Contributor

kindsun commented Mar 2, 2021

@nreese Overall working really smoothly. Great to see GeoJSON Upload handling large files! A few questions/comments:

Does this PR need a more recent merged version of master or is the number doubling still an issue? Note the docCount vs. the found number. Again, might just need an updated master.

Screenshot_20210302_123603

Looks like we're handling files with mixed types differently than before (albeit one of them always failed before). Looks like it defaults incorrectly to geo point now? I've attached the file used below for testing. In a perfect world we should get a point, a zig-zag line and a square. Or if the square really is invalid, then just the first two.

Old left, new right:

Screenshot_20210302_124227

aGeoJsonFile.zip

We were previously lowercasing the index & index pattern name to save users the hassle of having to do it when ingesting a file with mixed casing in the name. Might be worth doing a toLowerCase() on this name again:

Screenshot_20210302_124310

@nreese
Copy link
Contributor Author

nreese commented Mar 2, 2021

Does this PR need a more recent merged version of master or is the number doubling still an issue? Note the docCount vs. the found number. Again, might just need an update master.

Needs a more recent version of master. That has been fixed but is not in this branch yet.

Looks like we're handling files with mixed types differently then before (albeit one of them always failed before). Looks like it defaults incorrectly to geo point now? I've attached the file used below for testing. In a perfect world we should get a point, a zig-zag line and a square. Or if the square really is invalid, then just the first two.

Have you tried indexing the file as geo_shape?

@kindsun
Copy link
Contributor

kindsun commented Mar 2, 2021

Have you tried indexing the file as geo_shape?

Sorry, I should've been more clear. I'm just referring to the default type-detection settings which determine if the file is geo point or geo shape upfront. The picture above shows how each are ingested by default without the user overriding any settings.

@nreese nreese requested a review from kindsun March 2, 2021 20:14
@nreese
Copy link
Contributor Author

nreese commented Mar 2, 2021

Sorry, I should've been more clear. I'm just referring to the default type-detection settings which determine if the file is geo point or geo shape upfront. The picture above shows how each are ingested by default without the user overriding any settings.

ok, I have updated the logic for auto selecting geo field type. If a file contains both points and shapes then geo_shape is selected. If a file contains just points then geo_point is selected. With this logic in place, the file loads the same as in previous versions, without the user having to configure geo field type.

@kindsun
Copy link
Contributor

kindsun commented Mar 2, 2021

I'm unable to upload the attached file. The progress indicator stays at 0 until it eventually times out.

image

Following the timeout, it progresses to the results screen showing a bad gateway error but also allowing the user to still Add layer.

image

Then it ends up in a strange state where there are no source details but there is a layer.

image

ne_10m_urban_areas_landscan.zip

FWIW, I tested the file in mapshaper.org and it does appear to be valid but with a lot of intersections. Regardless though, we might want to handle it with something other than a 502 if possible and ensure the user can't add it as a layer.

@nreese
Copy link
Contributor Author

nreese commented Mar 2, 2021

Discussed offline with @aaronjcaldwell about ne_10m_urban_areas_landscan.zip. This file is causing loaders.js problems. Loaders can not cleanly iterator over file without problems. Fixing loaders.js should be done outside of this PR.

There was a problem identified where when in an error state, the wizard incorrectly allowed users to advance to adding the layer. This issue has been resolved.

Copy link
Contributor

@kindsun kindsun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing feedback. This is a long overdue improvement to Maps' file upload capabilities. Also great to have alignment with ML so we can hopefully all benefit from each other's work from here forward. Very excited to see where we can take it from here! 🍾

LGTM w/ green CI!

  • code review
  • tested locally in chrome

@nreese nreese merged commit bd9b349 into elastic:master Mar 3, 2021
nreese added a commit to nreese/kibana that referenced this pull request Mar 3, 2021
…d:maxFileSize and increase max geojson upload size to 1GB (elastic#92620)

* rename ml:fileDataVisualizerMaxFileSize to fileUppload:maxFileSize

* add saved object migration

* file preview

* importing status

* remove console statement

* import complete view

* fix geojson_importer test

* tslint

* i18n fixes

* cleanup

* update documenation for advanced setting rename

* advanced settings usage_collection

* remove ml:fileDataVisualizerMaxFileSize from schemas and types

* add copy buttons for import response and fix geojson upload functional tests

* tslint

* remove clipboard-read check

* return early if env does not support reading from clipboard

* fix reporting tests

* review feedback

* update GeoJsonFileSource to support showing results trimmed icon and tooltip

* add fileUpload to useMlKibana context and replace dependencyCache with useMlKibana

* tslint

* review feedback

* lower case file name

* default to selecting geo_shape when file contains both points and shapes

* fix wizard onError callback to not advance to next step

Co-authored-by: Kibana Machine <[email protected]>
nreese added a commit that referenced this pull request Mar 3, 2021
…d:maxFileSize and increase max geojson upload size to 1GB (#92620) (#93358)

* rename ml:fileDataVisualizerMaxFileSize to fileUppload:maxFileSize

* add saved object migration

* file preview

* importing status

* remove console statement

* import complete view

* fix geojson_importer test

* tslint

* i18n fixes

* cleanup

* update documenation for advanced setting rename

* advanced settings usage_collection

* remove ml:fileDataVisualizerMaxFileSize from schemas and types

* add copy buttons for import response and fix geojson upload functional tests

* tslint

* remove clipboard-read check

* return early if env does not support reading from clipboard

* fix reporting tests

* review feedback

* update GeoJsonFileSource to support showing results trimmed icon and tooltip

* add fileUpload to useMlKibana context and replace dependencyCache with useMlKibana

* tslint

* review feedback

* lower case file name

* default to selecting geo_shape when file contains both points and shapes

* fix wizard onError callback to not advance to next step

Co-authored-by: Kibana Machine <[email protected]>

Co-authored-by: Kibana Machine <[email protected]>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Mar 3, 2021
* master: (45 commits)
  Add outcome of node scripts/build_api_docs (elastic#93399)
  [Lens] fix long field name on field stats panel doesn't wrap (elastic#93279)
  [Bug] Fix filter creation for numeric scripted fields in Discover (elastic#93224)
  [uptime] Fix anomaly alert edit (elastic#93025)
  Consolidate @babel/* packages and use latest compatible version (elastic#93264)
  [Search Embeddable] Add highlighting when searching (elastic#93178)
  [APM] Add missing bottom border to header (elastic#93179)
  [CI] No longer collect APM span stack traces (elastic#93263)
  [XY Chart] Fix "No data to display" error when using IP range aggregation to split series (elastic#93024)
  update generated public api docs
  API DOCS Step 3/3 (elastic#92929)
  chore(NA): look for bazel packages on npm_module folder during distributable build (elastic#93262)
  rename advanced setting ml:fileDataVisualizerMaxFileSize to fileUpload:maxFileSize and increase max geojson upload size to 1GB (elastic#92620)
  [kbn/optimizer] allow customizing the limits path from the script (elastic#93153)
  [Alerting][Docs] Adding template for documenting alert and action types (elastic#92830)
  [jenkins] convert baseline capture job to use tasks (elastic#93288)
  removing the linked issue in comments from PR (elastic#93303)
  chore(NA): do not include fs within a storybook build (elastic#93294)
  [Maps] Update Map extent queries to use bounding box logic for both point and shape queries (elastic#93156)
  Add searchDuration to EQL and Threshold rules (elastic#93149)
  ...
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
fileUpload 197 199 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
fileUpload 826.0KB 825.0KB -1.0KB
maps 2.7MB 2.7MB +724.0B
ml 6.4MB 6.4MB -452.0B
triggersActionsUi 1.6MB 1.5MB -23.9KB
total -24.6KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
fileUpload 11.5KB 13.7KB +2.2KB
triggersActionsUi 104.0KB 104.1KB +82.0B
total +2.3KB
Unknown metric groups

async chunk count

id before after diff
triggersActionsUi 41 42 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Maps][File upload] Tracking issue for GeoJSON Upload Optimizations