DOC: Proposal to remove 'Help' pages and gather and update content on 'Large Datasets' #2738
Comments
Thanks for these two suggestions and all your help with the docs lately! 🚀 I agree with you that there is duplicated info in the help tag. I am leaning towards still keeping a visible Help indicator in some form, because it makes it easier for new readers to discover the help, which I think is important. It doesn't have to be in the navbar, but I don't have a better idea currently myself. I could possibly see moving it under "Getting started", but not sure... I do agree that it is not ideal to have the hidden "More" section. We could probably reorganize this since we are also planning to add a Resource page (although that might be merged with ecosystems #2415). I am very much in favor of adding a page for working with large data and moving things out of the help there (and just having a link on the help page). However, I don't think we should recommend the json transformer. I remember Jake mentioning that in a recent comment that I can't find, but I found this older one that has the same message. In general I think the Related issues:
Hi @joelostblom, VegaFusion's support for the Vega specs that Vega-Lite generates is fairly complete. Transforms that are not supported are left in the Vega spec that the Vega renderer handles, so it falls back gracefully in these situations. The biggest ecosystem limitation is that VegaFusion currently depends on a custom Jupyter Widget to render the resulting Vega specs and communicate with the VegaFusion runtime. Part of my motivation for writing vl-convert is that I'd like to add support to VegaFusion for pre-evaluating and optimizing transforms on the server so that the resulting Vega specs can be rendered by regular Vega mimetype renderers. I'll certainly be interested in your feedback when you have a chance to try it out!
Thank you Joel for the feedback! I started the implementation in #2755.

@jonmmease I find the developments around VegaFusion very exciting, and vl-convert is a huge upgrade for Altair in terms of usability, so thank you very much for putting in all this effort! Looking forward to what comes next, especially the combination of vl-convert and VegaFusion that you described.
@binste FYI, I'm working on a VegaFusion PR over in vega/vegafusion#195 that implements the workflow described above.
Wow, that looks very, very exciting, thank you @jonmmease for putting in the work to make this happen! Seems like a much more complete altair_transform. Great screencast and PR documentation; they clearly explain what's happening and the differences from the existing widget-based approach. I really like that the data is fully inlined and that it produces standalone Vega specs. It might also solve the need for #2586. Once you release a new version, I'll definitely try this out, and maybe even use it as a default when working with Altair if it does not have any downsides. Closing this issue now, but if we all feel comfortable with it once the new version of VegaFusion is out, I think we should also promote VegaFusion, and specifically this approach, much more on the new "Large Datasets" page in the documentation.
Yeah, I think it does. In addition to pre-applying data transforms, VegaFusion has fairly complete support for projection pushdown (removing unused columns as early as possible in the data pipeline). So even for non-aggregated charts (e.g. scatter plots) it can reduce the bundle size by trimming out unused columns.
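To make the projection pushdown idea concrete, here is a minimal sketch in plain Python. It is not VegaFusion's implementation: the `prune_columns` helper, the sample rows, and the list of "used" fields are all hypothetical, and only illustrate how dropping columns that no encoding references shrinks the data that ends up in the bundle.

```python
import json


def prune_columns(rows, used_fields):
    """Keep only the fields that the chart actually references (hypothetical helper)."""
    return [{k: row[k] for k in used_fields if k in row} for row in rows]


# Hypothetical dataset: 1,000 rows with two plotted columns and two unused ones.
rows = [
    {"x": i, "y": i * 2, "note": "unused text " * 5, "id": f"row-{i}"}
    for i in range(1000)
]

# Fields referenced by the x and y encodings of a hypothetical scatter plot.
used = ["x", "y"]
pruned = prune_columns(rows, used)

# Compare the serialized size before and after trimming unused columns.
full_size = len(json.dumps(rows))
pruned_size = len(json.dumps(pruned))
print(f"full: {full_size} bytes, pruned: {pruned_size} bytes")
```

The row count is unchanged, so this helps even for non-aggregated charts; only the width of each record shrinks.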
Wow, this is so helpful! Thanks for implementing that functionality and for pinging us here!
I have two suggestions for how we could improve the documentation even further. I'm happy to create a PR for both of them, but I first wanted to get some input on whether I'm on the right track and whether this is of general interest.
Remove the "Help" pages
These pages are currently nested under "More" -> "Help" in the top navigation bar. Their content is either duplicated or could be reorganised, which would lead to a better structure in my opinion:
Gather content on "Large Datasets" in one place
There is already some content in the documentation on how to deal with larger datasets. There have also been some new developments in third-party packages such as vl-convert and vegafusion, which open up new possibilities that are not yet documented. I think it would make sense to gather this information in one place, especially as I don't think it is obvious and there are many options by now with various tradeoffs. I'd suggest doing this in a new page called "Large Datasets" under "Advanced Usage" and linking to it from Specifying Data.
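As a rough illustration of one such tradeoff, the sketch below compares embedding raw values with embedding pre-aggregated bins. It uses only the Python standard library; the `histogram` helper, the bin width, and the random sample data are assumptions made for this example and are not part of Altair, vl-convert, or vegafusion.

```python
import json
import math
import random
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin so that only one record per bin needs to be charted."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return [{"bin_start": b, "count": c} for b, c in sorted(counts.items())]


# Hypothetical raw data: 50,000 normally distributed values.
random.seed(0)
values = [random.gauss(0, 1) for _ in range(50_000)]

# Aggregate up front instead of shipping every row to the renderer.
binned = histogram(values, bin_width=0.5)

raw_size = len(json.dumps([{"value": v} for v in values]))
binned_size = len(json.dumps(binned))
print(f"{len(values)} raw rows ({raw_size} bytes) -> {len(binned)} bins ({binned_size} bytes)")
```

The downside is that the individual rows are no longer available to the chart, which is the kind of pro and con the new page could spell out for each option.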
Existing pages from which some content could be consolidated:
So the new page could start with some explanations on why large datasets can be challenging and then discuss the following recommendations (in this order?) with their pros and cons: