-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Allow dandi organize to force usage of session_id (NWB Assets) #1000
Comments
Yet to check in detail, but I think it is at large duplicate of #69 |
@yarikoptic Specifically the last comment by @satra in that thread (#69 (comment), the part about 'partial access' to data) kind of touches on a part of why I'm requesting this... But the issue would still exist for the very first session of each subject I upload, which acts as if the subject ID alone is sufficient to be 'unique', but really it needs the full session ID added on for that. But I won't have two files present at time of organize to force the automated usage of the session ID, so a manual control for that would be nice. |
@AlmightyYakob and I (with @CodyCBakerPhD's help) discovered that this may be leading to bad behavior in general: apparently, repeated upload of a disambiguated filename of the kind Cody is talking about here led to multiple assets being stored in the dandiset at the exact same pathname. This resulted in two assets in the dandiset only one of which was visible in the file browser. We have manually renamed the later version of the duplicate pathname and communicated with Cory about it, and we expect that under the new AssetPath work in the archive that this shouldn't happen again, but it may instead result in an error about the target file already existing in the dandiset. We may find out soon as Cody's CI runs and uploads files to that folder again. |
@waxlamp I am not certain on what is this and what would be the bad behavior in
since
sounds just as a bug in dandi-archive backend to not replace an asset with the same pathname upon an upload of a new asset. In other words -- it is native to upload different version for the same path. Could you please elaborate a little more and what in particular would you like to be changed in |
i agree with @yarikoptic . this should never happen according to our design. if it's happening, something broke. path is a unique key in the asset space. there can be only one. perhaps there was a regression at some point. |
the place where this gets tricky is partial uploads and perhaps where @CodyCBakerPhD is running into an issue.
ping @jwodder and @yarikoptic about the partial disambiguation step. also see #69 and #47 |
I don't know specifically what caused multiple assets to be coexisting in the database with identical pathnames. Perhaps it was a dandi-archive bug, or perhaps it was intended behavior, with a future GC capability intended to remove orphaned versions. This came up with @AlmightyYakob was applying the new AssetPath checks and Cody's dandiset was the one place it came up. If the duplicate asset path doesn't appear again during regular use of Cody's CI, then I believe we can chalk it up to something that was buggy and no longer is; if we run into difficulties, we can decide at that point what the proper behavior of the archive ought to be, and adjust the logic to account for that. That is to say, no action needed at the moment nor any bugs to address, and we're just waiting on more data from the CI outcomes. |
I stay with opinion that it is at large a duplicate of #69 . |
Hey all, I'd like to be able to use
dandi organize
to force usage of both subject ID and session ID when generating a unique filename.E.g., I want to get
sub-subj1_ses-foobar_ecephys.nwb
but all I get is after standarddandi organize
issub-subj1_ecephys.nwb
This came about because I am iteratively uploading only a single NWBFile at a time to a dandiset that is being downloaded lazily (
dandi download ... --download dandiset.yaml
) in order to reduce the amount of time my very large files spend on the processing server.The text was updated successfully, but these errors were encountered: