Reconsider how editing environments works #886
@kcpevey mentioned to me that this may be a foot gun for shared environments:
I somewhat agree, but ultimately if a shared env is changed by someone else, activating it after the change will cause the same issue. @kcpevey also questions the UX for user awareness:
Here I would mention that auto-reloading is not enabled by default, and users who enable it know what they are doing. Also, rebuilding an env should not take 20 minutes (but it does). I do, however, agree that a notification that an environment build has completed should be shown when conda-store is used with JupyterLab; this is tracked in a separate issue.
As an FYI, historically,
But this does tie into another discussion I had at PyCon about packaging: we actually have multiple target audiences (devs, end users, etc.) for environment management, and we are using the same tools for all of them.
Is this actually a valid use case? How reliably does it work? For pure Python packages, maybe. To me it seems that if you change the underlying environment, all bets are off on whether your Python objects are even valid if an install changed something under the hood. It seems like a better option would be to make sure you serialize your results.
Yes, in IPython, installation/updating of packages via magics such as %pip install is a supported workflow.
Very well in my experience. And Databricks considers it a valid use case too; they are contributing enhancements in ipython/ipython#14500.
This is my call to make as an experienced user. I can tell whether I will need to restart the kernel, and I often know exactly which changes will be made. It is not for updating numpy from 1.x to 2.x; it is for grabbing patches for very specific bugfixes.
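For concreteness, here is a minimal sketch of that workflow as notebook cells; the package name and version are hypothetical, and this only works cleanly for simple, pure-Python fixes:

```python
# Cell 1: grab a patched release of one package without touching anything else
# (package name/version are made up for illustration)
%pip install --no-deps mypackage==1.2.3.post1

# Cell 2: reload the already-imported module so the fix is picked up in-session;
# compiled extensions or deep package hierarchies usually still need a restart
import importlib

import mypackage

importlib.reload(mypackage)
print(mypackage.__version__)
```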
No, I disagree here. It sounds like blaming the user, but in fact even with the best serialization and caching there are operations that always take time, like loading large files or training small/medium models. I don't think that using conda-store should be incompatible with data scientists analysing big data or training baseline/statistical models in notebooks. But maybe I misunderstand the target audience of conda-store.
On the other hand, the current conda-store approach leads to broken notebooks for data scientists who are not used to working with conda-store:
Why not give advanced users a choice on whether to update in place or not? If the old environment is copied as an archival build, that has zero risk, right? Thinking about it, what I really miss is:
Part of the delay is that even after the environment is built, we need to wait up to a minute for it to be refreshed. So as a user I keep restarting the environment until it clicks. If my new/edited dependency is used lower in the notebook, I can waste many minutes there. For the interactive use case we need to somehow rewrite conda-store/conda-store/Dockerfile (line 34 at bde7fe4).
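To make the "until it clicks" loop concrete, here is a rough sketch assuming the simplified layout where the environment name is a symlink that conda-store re-points to the newest build once it completes (the path below is hypothetical):

```python
import os
import time

# Hypothetical path of the named environment; under this assumption, conda-store
# re-points the symlink to the newest build directory when a build completes.
ENV_LINK = "/home/user/.conda-store/envs/myenv"

seen = os.readlink(ENV_LINK)  # the build currently being served
print("current build:", seen)

# Poll until the symlink flips to a new build -- the manual version of what a
# user does today by repeatedly restarting and checking the environment.
while os.readlink(ENV_LINK) == seen:
    time.sleep(5)

print("new build is live:", os.readlink(ENV_LINK))
```

A notification hook around that flip is essentially what the JupyterLab integration mentioned above would provide.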
Part of the goal of the 2024
So it's a combination of slow iteration and not enough feedback to the user. It sounds like there are downsides to both symlinking and update-in-place. However, I feel like I don't have the full context, especially with regard to:
In
This really sounds like a problem that could be fixed by notifying the user. We could also give them the option to stay on the old build or bump their own build to the latest version, although if we do opt for hot-reloading by eliminating the symlinking mechanism, would users who stick with the old build need to be the ones who reload (to target the old build)?
I agree with your summary. Just one more thing:
How can a user achieve the closest possible thing to "add a new package to the environment without updating or changing the dependencies of anything else unless necessary", like a plain conda install into an existing environment? As a user I now have a fear of adding anything to an environment (but I have to!). What is the safe path?
With conda-store you can't do this at the moment because the environment gets re-solved when a new specification is submitted. This comes from previous experiences with incremental updates:
I have trouble understanding how we'd be able to reproduce the environment you'd end up with if there were an option to add a new package without changing the dependencies of anything else unless necessary; isn't this what happens when you install a package directly into an existing environment?
But maybe I'm missing something, or there's another way to do this? About user-facing messaging: are we currently passing messages through to the UI?
Well, by having a lock file or a fully pinned specification. The conda-store approach might be fine, but if the only sane way to add a new package is to have everything pinned, then I think it should have a button to populate pins for all packages in the spec from the currently installed versions.
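As a sketch of what such a "populate pins" helper could do, assuming it can shell out to conda list --json against the current build (the prefix path is hypothetical; pip-installed packages and channel pins are ignored here):

```python
import json
import subprocess


def populate_pins(prefix: str) -> list[str]:
    """Return exact name==version pins for every conda package installed in `prefix`."""
    result = subprocess.run(
        ["conda", "list", "--json", "--prefix", prefix],
        check=True, capture_output=True, text=True,
    )
    packages = json.loads(result.stdout)
    # `conda list --json` also reports pip-installed packages (channel "pypi");
    # keep only the conda-managed ones for the spec.
    return sorted(
        f"{pkg['name']}=={pkg['version']}"
        for pkg in packages
        if pkg.get("channel") != "pypi"
    )


if __name__ == "__main__":
    # Hypothetical build prefix managed by conda-store.
    for pin in populate_pins("/opt/conda-store/envs/myenv"):
        print(f"  - {pin}")
```

A "populate pins" button in the UI could run the equivalent of this and write the result back into the specification before the new package is added, so the subsequent re-solve has almost no freedom to move other packages.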
Context
Currently, editing an environment triggers a full new build in a separate prefix, and the environment name is then re-pointed (via a symlink) to the latest build, with the old build kept as an archival build.
This means that autoreloading does not work. For example, when using with Jupyter/IPython, a running kernel keeps importing from the old build's prefix, so packages added or updated in the new build are invisible until the kernel is restarted.
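As an illustration (the module name is hypothetical), the usual autoreload setup silently keeps serving code from the old build:

```python
# In a running Jupyter/IPython kernel started from the environment:
%load_ext autoreload
%autoreload 2

import mymodule          # hypothetical package installed from the environment
mymodule.do_work()

# The specification is now edited and conda-store finishes a new build. Because
# that build lives in a different prefix, nothing on this kernel's sys.path has
# changed on disk, so autoreload has nothing to pick up.
mymodule.do_work()       # still the old behaviour until the kernel is restarted
```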
If instead it worked like an in-place update of the existing prefix, a running kernel (with autoreload enabled) could pick up the changes without a restart.
Value and/or benefit
Many minutes to hours in productivity gained (or rather not lost) for the use case of interactive environment creation by a senior data analyst.
Anything else?
No response