Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS paper review - Software paper #144

Closed
9 tasks done
PeterKraus opened this issue Jul 7, 2022 · 4 comments
Closed
9 tasks done

JOSS paper review - Software paper #144

PeterKraus opened this issue Jul 7, 2022 · 4 comments

Comments

@PeterKraus
Copy link

PeterKraus commented Jul 7, 2022

Hello @sgbaird,
In this issue, I will collate points that relate to the software paper itself. Link to review is: openjournals/joss-reviews#4528

  • the reference to the software repo seems unnecessary here (and throughout the paper). The link to the repo is right there in the front-matter!

    diffusion models. Using `xtal2png` [@xtal2png] to encode/decode crystal structures via grayscale PNG images (see

  • the summary is not clear about what is being encoded into what. My understanding is that xtal2png can take a coordinate file and create a fingerprint / QR-code-like representation of that coordinate file. If that's the case, it should be more explicit in the summary.

    e.g. \autoref{fig:64-bit}) is akin to making/reading a QR code for crystal structures.

  • streamlined results? Is the point here that it feeds well into image-based pipelines?

    This allows you, as a materials informatics practitioner, to get streamlined results for

  • The references in this section on domain transfer make it hard to read. Please rewrite - perhaps put the citations behind the dates.

    Another example of state-of-the-art algorithm domain transfer is refactoring image-processing models for crystal structure applications, with
    introduction [@kipfSemisupervisedClassificationGraph2016], domain transfer (preprint)
    [@xieCrystalGraphConvolutional2017], and peer-reviewed domain transferred
    [@xieCrystalGraphConvolutional2018] publication dates of Sep 2016, Oct 2017, and Apr

  • Here, rather than to-from, I'd say between coordinate file and png representation (or something similar):

    is a Python package that allows you to encode/decode a crystal structure to/from a
    grayscale PNG image for direct use with image-based machine learning models. Let's take

  • I am not sure I understand the relevance of Pallette. Please elaborate.

  • It would be helpful to discuss the technical limitations somewhere. Perhaps it's in the docs, but in my view it could be commented on in the paper:

    the representation are given in \autoref{fig:example-and-legend}. Due to the encoding of
    numerical values as grayscale PNG images (allowable values are integers between 0 and
    255), a small round-off error is present during a single round of encoding and decoding.
    An example comparing an original vs. decoded structure is given in

  • First, is there a reason 64x64 pixel image has to be used? I understand it's possible to encode at most 52 atoms within the unit cell. Just by reading the paper, it's not obvious why the cell parameters (lattice constants, angles, volume, point group) occupy rows - seems like a waste of space. If that information was moved into the upper-left zero sector, there would be more space for the pairwise distance matrix.

  • Second, the issue of 255 possible values is a limitation of the grayscale requirement. If this was lifted, the precision could go up dramatically.

  • Third, I'm sure there's a good reason for it, but what's the distance matrix for? It can be calculated from the fractional coordinates. If all we want is to do is to represent the CIF file, there's no reason to include it.

  • I would rewrite this section, as a "potential" time saving is only potential.

    xtal2png/reports/paper.md

    Lines 105 to 106 in cc644f3

    dataset types, potentially saving days or weeks during the process of obtaining
    preliminary results on a newly released model.

  • There is no mention of any other software that provides similar functionality. If there truly is not any encoder/decoder for crystal structure to image data, perhaps a short section discussing existing software for molecules and/or textual representation could be included.

@sgbaird
Copy link
Member

sgbaird commented Jul 7, 2022

@PeterKraus thanks for your comments! These are great. I will address these soon.

sgbaird added a commit that referenced this issue Jul 7, 2022
#144 (comment)

> the reference to the software repo seems unnecessary here (and throughout the paper). The link to the repo is right there in the front-matter!
sgbaird added a commit that referenced this issue Jul 7, 2022
@sgbaird
Copy link
Member

sgbaird commented Jul 8, 2022

  • the reference to the software repo seems unnecessary here (and throughout the paper). The link to the repo is right there in the front-matter!

These have been removed.

  • the summary is not clear about what is being encoded into what. My understanding is that xtal2png can take a coordinate file and create a fingerprint / QR-code-like representation of that coordinate file. If that's the case, it should be more explicit in the summary.
  • streamlined results? Is the point here that it feeds well into image-based pipelines?

Clarified both the encoding and meaning of streamlined as follows:

If this is still off, lmk and I can take another stab.

  • Here, rather than to-from, I'd say between coordinate file and png representation (or something similar):

Used "between" instead of to/from

  • I am not sure I understand the relevance of Pallette. Please elaborate.

  • It would be helpful to discuss the technical limitations somewhere. Perhaps it's in the docs, but in my view it could be commented on in the paper:

Per your later comment in #146, I added a section to the docs and addressed the 3 points you brought up. Again, lmk if I'm missing something here.

  • I would rewrite this section, as a "potential" time saving is only potential.

Rewritten:

  • There is no mention of any other software that provides similar functionality. If there truly is not any encoder/decoder for crystal structure to image data, perhaps a short section discussing existing software for molecules and/or textual representation could be included.

I'm not aware of an encoder/decoder for crystal structure directly to an image file format. Added the following:

@PeterKraus
Copy link
Author

PeterKraus commented Jul 11, 2022

Thanks, looks great. Please leave the issue open, I'll give it another read tomorrow.

@PeterKraus
Copy link
Author

PeterKraus commented Jul 15, 2022

A couple more comments/suggestions:

  • line 23: "in June 2017" would be more readable.
  • line 30: I would write "for crystal structure applications, which was first introduced in a preprint (Kipf & Welling, 2016) and published in a peer-reviewed journal over a year later (Xie & Grossman 2017, 2018).
  • line 33: I would spell out the need here: you want to help shorten this gap, with a focus on...
  • line 38: instead of "tasks with corresponding applications", extend to "tasks, which are features with corresponding applications"
  • line 42: "those results using the default instructions on the repository" perhaps "comparable results using the default parameters"
  • line 61: merge paragraph with previous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants