[core] Can't use DBFS as a filesystem in distributed #1045
Comments
Hey. Is it possible for you to use remote storage like S3? DBFS is weird in the sense that you access it differently from Spark than from pandas, and in the training stage we read the partitions with pandas, so even if you manage to use it, it'll break there.
Ok, I understand. I'm on Azure, so I'll try with ADLS.
If you have experience with DBFS we could also give it a shot. I got stuck trying to define a path that could be written by Spark and then retrieved by fsspec such that pandas would understand it.
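For reference, a minimal sketch of the two path spellings involved. It assumes the usual Databricks convention that Spark addresses DBFS as `dbfs:/...` while local file APIs (and therefore pandas) see the same data under the `/dbfs/...` mount. The helper names are hypothetical, and whether the mount is available on every executor depends on the cluster configuration.

```python
def spark_to_local_dbfs(path: str) -> str:
    """Map a Spark-style 'dbfs:/...' URI to the '/dbfs/...' local mount path.

    Assumption: the cluster exposes DBFS under the /dbfs mount point.
    """
    prefix = "dbfs:/"
    if not path.startswith(prefix):
        raise ValueError(f"not a dbfs URI: {path!r}")
    # Drop the scheme and any extra leading slashes (e.g. 'dbfs:///tmp').
    return "/dbfs/" + path[len(prefix):].lstrip("/")


def local_to_spark_dbfs(path: str) -> str:
    """Map a '/dbfs/...' local mount path back to a 'dbfs:/...' URI."""
    prefix = "/dbfs/"
    if not path.startswith(prefix):
        raise ValueError(f"not a /dbfs path: {path!r}")
    return "dbfs:/" + path[len(prefix):]


spark_path = "dbfs:/tmp/partitions"
local_path = spark_to_local_dbfs(spark_path)
print(local_path)                       # path a pandas reader would use
print(local_to_spark_dbfs(local_path))  # path a Spark writer would use
```

This only translates strings; it does not address the fsspec side of the problem described above.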
Oh, by the way, we recently fixed a bug in the distributed implementation, and the fix hasn't been released yet. Until it is, you'll see only one executor training while the others sit idle. We'll make a release soon.
I think I will try. Being able to use DBFS might be very convenient for Databricks users.
We just released 1.7.3 with the distributed fix.
Ok, it works with ADLS!
It should be. Note that what will be distributed is the training of each model, so the search will be sequential. If you want to distribute the search instead, you can try setting up Ray on Databricks. Once you've done that, Ray should be able to distribute the trials on the cluster using the regular interface (no Spark dataframes).
What happened + What you expected to happen
I'm trying to run the https://nixtlaverse.nixtla.io/neuralforecast/examples/distributed/distributed_neuralforecast.html sample on Databricks, using DBFS as the storage for partitions. My first issue was that I can't pass the additional arguments DBFS needs (instance and token), which I've worked around by:
The second issue came from fsspec.ls, which in the case of DBFS returns a list of dicts:
And so I get the error:
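A minimal sketch of how a caller could tolerate both return shapes. In fsspec, `ls` returns plain path strings with `detail=False` and dicts carrying a `name` key with `detail=True`. The helper name `ls_names` and the sample entries below are hypothetical, and this is not taken from the library's own code.

```python
from typing import Iterable, List, Union


def ls_names(entries: Iterable[Union[str, dict]]) -> List[str]:
    """Return plain path strings from an fsspec-style `ls` result.

    Accepts either a list of paths or a list of detail dicts
    (each expected to carry a "name" key), or a mix of both.
    """
    names = []
    for entry in entries:
        if isinstance(entry, dict):
            names.append(entry["name"])
        else:
            names.append(str(entry))
    return names


# Hypothetical listing in the detail-dict shape described above.
detailed = [
    {"name": "/tmp/parts/part-0.parquet", "type": "file", "size": 10},
    {"name": "/tmp/parts/part-1.parquet", "type": "file", "size": 12},
]
parquet_files = [p for p in ls_names(detailed) if p.endswith(".parquet")]
print(parquet_files)
```

Where the filesystem implementation honors it, passing `detail=False` to `ls` avoids the need for such a helper.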
Versions / Dependencies
Reproduction script
Issue Severity
High: It blocks me from completing my task.