Data visualization with OpenAltimetry API #144
…visualization module
Hi @icetianli, good job on this, looks like you've picked up some dask and datashader skills too! Just a quick suggestion on first user experience, which is to add a progress bar while downloading. Also one minor suggestion to reduce the bounding box size for the test area.
I'll be happy to take a closer look later and can help with adding more tests. What will be helpful is if you can run black on your code for your next commits (cc @JessicaS11, can you remind us how the black pre-commit hook from #96 works?)
This looks great, @icetianli! I've got a few overarching comments for some next steps before we merge this PR:
- As @weiji14 suggested, it would be great to run black on this to format it. If you install pre-commit (pip install pre-commit) in your local working environment, black should be run automatically when you make a commit, reformatting the files to give the library consistent spacing and use of " vs ' throughout. Let me know if you have any issues with this.
- We should expand the docstrings to include input parameters and outputs.
- To get the new functionality to show up in our documentation, we have to explicitly add it to the docs files.
- Some examples to showcase this amazing functionality! This could be an addition to an existing how-to notebook or part of a standalone simple visualization, or maybe even both. My guess is you have either a script or notebook that you've been using this code in, which will make a great starting point for sharing it.
I'm happy to take on some of this work - it looks like a lot, but it's a critical piece of our efforts to make sure that your code is used by the broader community!
…ation, enhance interactive visualization based on cycle number
Thanks @JessicaS11 and @weiji14 for the advice! Sorry it took me so long to update! I have now improved the default spatial extent plot using geoviews, and I made a separate pull request for it so it can be merged independently. For OpenAltimetry data visualization, I deprecated the functionality from my last commit that only requested the latest cycle, because I now think the most valuable function of the OA API is to provide a low-cost way of playing with ICESat-2 data, e.g. comparing geolocations and elevations between different ground tracks and cycles for a large region, without going through the trouble of downloading and plotting data separately. Now the script can show elevation distributions from different cycles, see notebook. In terms of future development, I think it can add a
The improvements include:
I have only tested the functions on the Antarctic Peninsula, where the data density is relatively low. For a large region with a long time span, an OA request may take several minutes, but I think this can be improved with better parallel processing, or by setting a limit on spatial extent and date range and warning users that processing may take a long time.
Hi @icetianli, welcome back, hope your research is going well! Just had a quick scan of the changes here and it looks like you've made some really nice improvements, especially with that ICESat-2 cycle slider! I'll give this a full review tomorrow as it's getting late in my timezone. In the meantime, could you merge in the changes from the development branch into this branch using
This looks awesome! Thanks for all these updates, @icetianli, and the example is great. I haven't had a chance to do a line-by-line review, but I'm thinking it would make sense to better integrate the capabilities from #176 into this new module (we can always merge that PR into development, then use this one to migrate it all to the visualization module). Then, we can call visualize directly from the query object with options for the different methods of visualization. We'll also need to add in some checks for valid datasets specifically for this step, because as I remember OA doesn't host all datasets, so we'll need to let users know if they can't quick-view the dataset they're requesting. We can talk in more detail about some of this at Tuesday's icepyx meeting!
Just a few more comments for now, mostly just minor typos and code style suggestions. I'll still need to walk myself through the logic a bit more and play around with your 'examples/ICESat-2_Data_Visualization_Example.ipynb' notebook (which should be listed under the README.md file by the way under https://github.com/icesat2py/icepyx/tree/v0.3.2#examples-jupyter-notebooks).
I'm thinking it would make sense to better integrate the capabilities from #176 into this new module (we can always merge that PR into development, then use this one to migrate it all to the visualization module). Then, we can call visualize directly from the query object with options for the different methods of visualization.
Agree. So the workflow would be something like: Merge #176 into 'development' -> Merge 'development' into this PR #144 -> Migrate the 'visualize_spatial_extent' to 'visualization.py'.
We'll also need to add in some checks for valid datasets specifically for this step, because as I remember OA doesn't host all datasets, so we'll need to let users know if they can't quick-view the dataset they're requesting. We can talk in more detail about some of this at Tuesday's icepyx meeting!
Yep, something like an allowlist to check if the product is in ['ATL06','ATL07','ATL08','ATL10','ATL12','ATL13'] should work. I'll try and listen in for the meeting tomorrow but probably won't talk much cause it'll be 6am for me 😆
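For illustration, the allowlist idea above could be sketched roughly as follows. The helper name check_oa_product and the constant OA_SUPPORTED_PRODUCTS are hypothetical and are not taken from icepyx; only the product list comes from the comment above.

```python
# Products assumed to be quick-viewable through OpenAltimetry
# (list copied from the review comment; not verified against the OA API).
OA_SUPPORTED_PRODUCTS = ("ATL06", "ATL07", "ATL08", "ATL10", "ATL12", "ATL13")


def check_oa_product(product: str) -> str:
    """Return the normalized product name, or raise if OA cannot serve it.

    Hypothetical helper for illustration only; not part of the icepyx API.
    """
    normalized = product.strip().upper()
    if normalized not in OA_SUPPORTED_PRODUCTS:
        supported = ", ".join(OA_SUPPORTED_PRODUCTS)
        raise ValueError(
            f"{normalized} is not available for quick-view from OpenAltimetry; "
            f"supported products are: {supported}"
        )
    return normalized
```

Raising early, before any request is sent, would let users know right away that the dataset they asked for cannot be quick-viewed.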
Co-authored-by: Wei Ji <[email protected]>
- Add product check
- Increase number of concurrent multithreading max_worker to reduce API request time
- Use backoff to deal with specific exceptions raised during API requests
- Set an elevation sampling rate of 1/20 and use the latest cycle number to optimize memory and speed up data processing
- Add tests for all supported products
- Add an interactive map to visualize elevation spatial distribution and along-track elevation for individual RGTs
- Update visualization notebook
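As a rough sketch of the two request-related items in this commit message (more concurrent workers, and retrying on specific exceptions), the pattern might look like the following. This uses only the standard library: the hand-written with_retries loop stands in for the backoff package named in the commit, and fetch_one, fetch_all, and the default exception types are hypothetical names chosen for illustration, not the code in this PR.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retries(func, exceptions, max_tries=3, base_delay=0.1):
    """Call func(), retrying with exponentially growing delays on the given exceptions."""
    for attempt in range(max_tries):
        try:
            return func()
        except exceptions:
            if attempt == max_tries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2**attempt)


def fetch_all(fetch_one, params_list, max_workers=8,
              retry_on=(ConnectionError, TimeoutError)):
    """Run fetch_one(params) for every entry concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda p: with_retries(lambda: fetch_one(p), retry_on), params_list)
        )
```

Only the exception types listed in retry_on are retried; anything else propagates immediately, which keeps genuine bugs from being hidden behind repeated requests.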
Just restarting the review process again 😄 Sorry if it looks like a lot (got a bit carried away), but hopefully you can just batch commit them, or I can push the commits to this branch directly if you're ok with it. Feel free to disagree or discuss about any of the recommended changes too, you've worked on this code a lot more than I have and tbh, it's been a while since I worked on ATL06!
To continue with the rest of the review though, could you please:
- Merge in the latest changes from the 'development' branch into this 'openaltimetry' branch (click on the 'Resolve conflicts' button in the GitHub UI and remove the extra blank line).
- Clear the output of the first cell (the one with 'from icepyx.core.visualization import Visualize') in your ICESat-2_Data_Visualization_Example.ipynb notebook and re-commit it to GitHub. This will result in a smaller diff (the bokeh svg icon is very large for some reason), and hopefully we can comment on and review the *.ipynb file more easily later.
P.S. I'm doing some work at #196 to enable showing jupyter notebooks directly as a 'Tutorial' on the icepyx ReadTheDocs page. If that work gets merged in, we'll have to move your ICESat-2_Data_Visualization_Example.ipynb to a new home under doc/, and users will be able to see your walkthrough tutorial even better! Here's a sneak peek of what it would look like:
Keep up the good work 🙌
Co-authored-by: Wei Ji <[email protected]>
This is looking so great - thanks for all your hard work, @icetianli! I managed to break your tests, which I will work to fix pending our decision about how we initialize the Visualization object. I'd also like to try and add a few more tests to target some of the other functions you've added. Also, since I made the PR a mess by running black on some of the files, over the next couple of days I will put in a separate PR to do the reformatting, and once we merge it into development we can bring it in here to clean things up again. Sorry about that - I didn't realize how many modules had fallen behind on formatting!
@JessicaS11, could you describe your idea a bit more on how the Visualization object should be initialized? I read your note on #144 (comment) and saw commit b862eca (which looks good by the way), so is this pretty much done? Or do you mean having an API like
Just on this, you can also revert commits 53b55c4 and 8c1cd05 to remove unrelated changes and reduce the git diff on this PR. Following that, do cherry-pick (using -n/--no-commit) on 8c1cd05 to get only the relevant changes (see https://stackoverflow.com/questions/5717026/how-to-git-cherry-pick-only-changes-to-certain-files).
git revert 53b55c42a97e681dc5ba954acee6e418a525e350
git revert 8c1cd052b77e1a8570efdcd739db3d107d14469d
git cherry-pick -n 8c1cd052b77e1a8570efdcd739db3d107d14469d
# cherry-pick -n stages every change, so unstage them before adding only the relevant files
git reset
git add doc/source/user_guide/documentation/query.rst examples/examples.rst icepyx/core/visualization.py
git commit -m "add docstrings and visualization to docs; reformat with black"
This change looks good to me @JessicaS11, it's nice to be able to pass
Good points, @icetianli and @weiji14. After some more thought, I think that it makes the most sense to encourage users to access the visualizations as methods on the Query object ( Thanks for asking all these great clarifying questions! Let me know if more changes are still needed.
Travis is failing on the ATL07
And the error traceback is:
It looks like it's coming from the specification of error type @icetianli added to line 347 of
Ah yes! It's another error caused by an empty dataframe by
Thanks for adding these additions @JessicaS11! They look great, agree it's better to call different visualization methods directly from
Thanks @icetianli for taking this on, and @weiji14 for all your review help. I'm going to go ahead and approve this PR, but I won't merge it in case either of you have any last edits/changes/fixes since I was the last one to push a commit. Feel free to merge (and if you're feeling really ambitious open a dev-->main PR, though we'll need to add a changelog before we merge that).
Thanks @JessicaS11 and @weiji14 for all the great advice! I am happy to proceed with merging this PR. Is there any additional change you would like to add, @weiji14?
Nope, this PR looks good to me so feel free to squash and merge this in! Well done on seeing this through to the end, I hope you learned some new things from it, and sorry that it took as long as a paper review!
Following the original PR and the discussion with @JessicaS11 and @friedrichknuth, further improvements have been made to visualization based on data requests from the OpenAltimetry API:
- Generalization of the code for different datasets available from the OA API (https://openaltimetry.org/data/swagger-ui/). It now supports products ['ATL06','ATL07','ATL08','ATL10','ATL12','ATL13']. ATL03 is not supported because the OA API only supports ATL03 data requests for a single date, with a spatial limit of 1 degree in either latitude or longitude.
- Create a visualization module to host all methods related to OA requests and data visualization. Three methods of visualization are available: 1) an interactive plot using holoviews and datashader; 2) a traditional matplotlib plot for a geodataframe; 3) the default spatial bounding box plot. This is still at the exploration stage and can be enhanced later.
- Solve the issue mentioned in PR on a 5-degree spatial limit in the latitude or longitude dimension set by the OpenAltimetry API. Data can now be requested from the OA API for any bounding box.
- Only request ICESat-2 data in the latest cycle from OpenAltimetry, which greatly reduces the overall request time. However, for a larger region with high data density, the OA request will still take several minutes, presumably related to how parameter lists are generated in the generate_OA_paras function.
- Tests added under icepyx/icepyx/test/test_OpenAltimetry.py
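The list above does not say how the 5-degree limit is worked around. One plausible approach, shown here only as a sketch, is to tile a large bounding box into sub-boxes that each stay within the limit and request them separately. The function name split_bbox, its (min_lon, min_lat, max_lon, max_lat) argument order, and the max_span default are assumptions made for illustration and are not taken from generate_OA_paras.

```python
import math


def split_bbox(bbox, max_span=5.0):
    """Split (min_lon, min_lat, max_lon, max_lat) into tiles at most max_span degrees per side.

    Hypothetical helper; boxes crossing the antimeridian are not handled.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bbox must satisfy min_lon < max_lon and min_lat < max_lat")

    def edges(start, stop):
        # Use equal-width steps so that no tile ends up as a thin sliver.
        n = max(1, math.ceil((stop - start) / max_span))
        step = (stop - start) / n
        return [start + i * step for i in range(n)] + [stop]

    lon_edges = edges(min_lon, max_lon)
    lat_edges = edges(min_lat, max_lat)
    return [
        (lon_edges[i], lat_edges[j], lon_edges[i + 1], lat_edges[j + 1])
        for i in range(len(lon_edges) - 1)
        for j in range(len(lat_edges) - 1)
    ]
```

For example, a box spanning 12 degrees of longitude and 8 degrees of latitude would be split into 3 x 2 = 6 tiles of 4 x 4 degrees, each of which could then be passed to a separate request.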
Binder button to test this branch:
closes #92