Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asdftool edit #873

Merged
merged 48 commits into from
Oct 9, 2020
Merged

Asdftool edit #873

merged 48 commits into from
Oct 9, 2020

Conversation

kmacdonald-stsci
Copy link
Contributor

@kmacdonald-stsci kmacdonald-stsci commented Sep 4, 2020

I have added an 'edit' subcommand to the asdftool. It separates the ASDF markings and YAML text into a separate file to be edited, using

asdftool -e -f <ASDF file> -o <YAML file name>

Once editing the YAML is complete, use

asdftool -s -f <edited YAML file> -o <ASDF to save YAML to>

There may be a gotcha when saving the edited YAML back to ASDF file. If the edited YAML contains for characters than the number of characters before the binary blocks in the ASDF file, an error message is returned.

Copy link
Contributor

@perrygreenfield perrygreenfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, I have a vague sense that some of this code is similar to what is in AsdfFile and there is going to be unnecessary duplication. On the other hand, what is in AsdfFile doesn't quite meet the needs of this and I think it would need a little reworking to be useful for this context. I wouldn't expect that you would have to do that reworking. Perhaps some future refactoring?

asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
Copy link
Contributor

@eslavich eslavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start, in general I think we should take more opportunities for code sharing between edit.py and asdf.py/block.py.

Also, just a heads up, we're going to need to figure out some pytest tests for this, but that should wait until we're all agreed on the implementation.

asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/asdf.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
bindex = [start]
k = 0
while len(asdf_blocks) - k > min_header :
hsz = struct.unpack(">H",asdf_blocks[k+bmlen:k+bmlen+2])[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a header struct-like thing in the Block class that we can reuse here.

return False


def validate_asdf_file ( fd ) :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function on the whole seems like it can be shared with asdf.py.

asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
Copy link
Contributor

@perrygreenfield perrygreenfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the request to bypass validation on opening should hold up acceptance.

asdf/asdf.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
Copy link
Contributor

@eslavich eslavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The streaming block case is a serious problem, and IMO we should still change that padding character. Can you also look back over the comments from previous rounds of review? GitHub marks them as "outdated" when the file changes but some of them haven't actually been resolved.

asdf/commands/tests/test_edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
Copy link
Contributor

@jdavies-st jdavies-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Just a couple specific comments below.

And a general comment - it looks like there are a lot of print() statements here when handling errors, and this is the case with a bunch of the other asdftool subcommands as well. It might be better to use the logging module I think, but perhaps that ship has sailed. =)

asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/tests/test_edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Outdated Show resolved Hide resolved
asdf/commands/edit.py Show resolved Hide resolved
@eslavich
Copy link
Contributor

eslavich commented Oct 7, 2020

@kmacdonald-stsci I discovered a bug in testing the tool, kind of a silly case but I think we should handle it:

  • Create an ASDF file with no binary blocks:
import asdf

with asdf.AsdfFile() as af:
  af["foo"] = "bar"
  af.write_to("test.asdf")
  • Use asdftool edit -e to split the file
  • In all cases of editing the YAML (equal, shorter, or longer file size) asdftool edit -s raises the following exception:
Traceback (most recent call last):
  File "/Users/eslavich/anaconda3/envs/asdftool-edit/bin/asdftool", line 33, in <module>
    sys.exit(load_entry_point('asdf', 'console_scripts', 'asdftool')())
  File "/Users/eslavich/repos/asdf/asdf/commands/main.py", line 87, in main
    sys.exit(main_from_args(args))
  File "/Users/eslavich/repos/asdf/asdf/commands/main.py", line 70, in main_from_args
    result = args.func(args)
  File "/Users/eslavich/repos/asdf/asdf/commands/edit.py", line 91, in run
    return edit(args)
  File "/Users/eslavich/repos/asdf/asdf/commands/edit.py", line 649, in edit
    return save_func(args.fname, args.oname)
  File "/Users/eslavich/repos/asdf/asdf/commands/edit.py", line 633, in save_func
    loc = find_first_block(oname)
  File "/Users/eslavich/repos/asdf/asdf/commands/edit.py", line 432, in find_first_block
    reader.read()  # Read to the beginning of the first binary block.
  File "/Users/eslavich/repos/asdf/asdf/generic_io.py", line 224, in read
    raise ValueError("{0} not found".format(self._delimiter_name))
ValueError: First binary block not found

ASDF files without blocks will become more common after we change the ndarray auto-inline threshold to 100, which I think I heard is coming soon.

…ining no binary blocks. Added tests for these conditions.
Copy link
Contributor

@eslavich eslavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this turned out great! Thanks for bearing with all the review. I suspect we'll want the next version (that launches an editor automatically) to be able to work with files without binary blocks, but the current solution is fine for now.

@eslavich eslavich merged commit f329d1e into asdf-format:master Oct 9, 2020
@eslavich eslavich mentioned this pull request Oct 9, 2020
@kmacdonald-stsci kmacdonald-stsci deleted the asdftool-edit branch December 17, 2020 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants