Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalise the complete list of trait statuses and transitions between them #7

Closed
tskir opened this issue May 19, 2020 · 13 comments · Fixed by #13
Closed

Finalise the complete list of trait statuses and transitions between them #7

tskir opened this issue May 19, 2020 · 13 comments · Fixed by #13

Comments

@tskir
Copy link
Member

tskir commented May 19, 2020

The Google doc included a simplified example of this. Now we need to discuss & finalise this. We need to make sure that the trait status list/transitions fully take into account:

  • Terms becoming obsolete
  • Terms becoming deleted from the ontology
  • Terms having none/partial/full reviews
@joj0s
Copy link
Collaborator

joj0s commented May 20, 2020

Right, so based on my current understanding of a trait's possible life cycle and some of your clarifications, I have given it some more thought and here is my suggestion:
The different possible status values for a trait mapping are: unmapped, under_review, mapped, obsolete, deleted, past.

The life cycle is as follows:

  • A trait is first inserted into the database, in the traits table. It has no entry in the mappings table so it is considered unmapped and will show up as such when looking it up in the app.
  • Any maintainer can then assign an ontology term to it. It is then made to be under_review
  • Assuming a minimum required number of reviews (e.g. 3), it keeps this status until it reaches that number. Upon a 3rd review, it is then considered mapped
  • It is regularly being checked against the Ontology Lookup Service. If the assigned term is found to be obsolete, then it assumes the obsolete status. If the term no longer exists, it assumes the deleted status
  • If someone assigns a new term to the trait, the current mapping now assumes the past status and a new entry in the mappings table is inserted, repeating the above cycle.

Two notes here. The past status is something that I haven't added in the google doc model I provided so if you agree with the above let me know so that I can fix that. Also this suggestion would raise a follow-up discussion regarding the possible public API endpoints and responses so that would be the next step I think.

I also assume that a maintainer can freely change the mapping to a trait even if he is not the one who issued its previous mapping

@tskir
Copy link
Member Author

tskir commented May 21, 2020

@joj0s This sounds good for the most part, just some comments/clarifications:

1. I'm not sure how and when the past status will be applied. Remember that here we are talking about the statuses of traits, not mappings. A trait can only have one status, regardless on how many different mappings it had historically.

2. Another thing which we need to track, which is currently not reflected in the list of statuses, is import and creation of new terms. This can include quite a few steps. For example, here is a reasonable lifecycle of a trait:

  • Added in Clinvar, imported into the database as an unmapped term → unmapped
  • Mapped to an existing term from EFO and put into review → under_review
  • Rejected by a reviewer with some comments → something like "changes_requested"?
  • Mapped by a curator to an existing term from MONDO, which is however not in EFO, and put back into review → under_review again probably?
  • Confirmed by N reviewers → we cannot use "mapped" here because the trait is not yet finalised, so something like "awaiting_efo_import"?
  • Submitted by a curator (in a batch with other traits) for EFO import via a GitHub PR → something like "submitted_for_efo_import"?
  • Then the term remains in this status until the periodic OLS query will determine that the term is now, indeed, in EFO, which means the submission in the PR has been successfully processed. In this case, it will finally go into mapped.

A similar approach will have to be done for creating EFO terms, probably with their corresponding statuses.

  1. I think it would be very useful to draw a diagram, and include it in the documentation in this repo. I suggest that you use Mermaid for this purpose, as it's much easier than to manually draw it in Google Slides or Inkscape. Here's a live editor where you can create the diagram, then export its source code and the final PNG.

@tskir
Copy link
Member Author

tskir commented May 21, 2020

Also, regarding the statuses, I think it may be more useful (this needs to be evaluated) to have not just one big "status" column with 10+ possibilities, but rather separate simpler ones. For example:

  • Each trait can be either mapped or unmapped
  • The trait to ontology mapping can be either not reviewed at all; or reviewed by N people. The result of the curation can be either approval, or disapprovals, or mixed (mixed essentially means disapproval, as the trait needs to be discussed further)
  • The ontology term which the mapping points to can be either current, obsolete, or deleted. It can either be in EFO, or in other ontology from which it could be imported into EFO, or it might be created from scratch and require creation of a new EFO term.

This might not be comprehensive, but the idea is that we could use those smaller, simpler "statuses" to keep track of individual trait/mapping/term properties, and then assemble the final "status" of the trait as a computed property.

@joj0s
Copy link
Collaborator

joj0s commented May 25, 2020

I worked on the model more and separated the statuses into more distinct ones. In particular, the model I am suggesting includes statuses for the reviewing process of a trait (whether it has been reviewed, if it is under review or if it has been successfully reviewed) and for the ontology which is mapped to a trait (if it is current, obsolete, deleted or awaiting import). I would appreciate your opinion on whether we need more individual statuses.

Then, the master status of a trait can be calculated based on those values. I would like your input however on what these different master statuses will be

This is the link for the google doc which describes my current model proposal, with the status suggestions as well some details about them https://docs.google.com/document/d/1iunypcRGtlxPc_uwhjKnMv8tD67dHOEPaOQMxHSL96U/edit?usp=sharing

@tskir
Copy link
Member Author

tskir commented May 27, 2020

I have left a number of comments & suggestions in the data model document. The summary:

  • Adding/modifying/removing certain fields
  • Renaming ontologies to ontology_terms
  • Clarifying master statuses, mapping statuses, and ontology term statuses
  • Clarifying which entity the comments apply to
  • Discussing the process and status tracking for imported and new terms — this might require more discussion & final clarifications once you address all other comments

@joj0s
Copy link
Collaborator

joj0s commented May 28, 2020

I have resolved most of the comments and added some follow-up ones myself. I would also like to discuss the master statuses specifically. Are we ok with these 4 suggested ones? I myself am not sure whether we need the "new_term_required" one (I'm leaning towards no). And also would we need more?

@tskir
Copy link
Member Author

tskir commented May 28, 2020

I've added a section + table to the document linked above regarding:

  • Approximate complete workflows for new/imported/created terms from start to finish
  • Considerations for the possible values of a master status and how it can be computed

Please let me know what you think of this, any suggestions are welcome!

@joj0s
Copy link
Collaborator

joj0s commented May 28, 2020

The sections you added are extremely helpful and make a ton of sense! Based on them I have updated the data model. The changes I made are these:

  • Updated the number of possible master statuses in the traits table. These are “current”, “unmapped”, “awaiting_review”, “needs_import”, “needs_creation”, “awaiting_import” or “awaiting_creation”. Basically everything that is listed in the table, plus an "unmapped" status for every trait to begin with.
  • Renamed the ontologies table to ontology_terms. Also renamed the "term" field to "curie" field to be less ambiguous.
  • Added cross_refs and description field to ontology_terms table for newly created terms.

Let me know if you think that any further changes need to be made. I can also create a diagram as you suggested once the model is finalized

@joj0s
Copy link
Collaborator

joj0s commented May 28, 2020

I think we should also probably include "obsolete" and "deleted" as master statuses

@tskir
Copy link
Member Author

tskir commented Jun 1, 2020

Regarding the changes — everything's great, I did a third iteration of data model review in #8, and also added some comments about status-related things in the doc.

I can also create a diagram as you suggested once the model is finalized

Yes, it probably makes sense. You can do it as part of this ticket or you may create a new ticket for this purpose if you feel it would be better.

I think we should also probably include "obsolete" and "deleted" as master statuses

Yes, I agree. Since we want to just make the master status mirror the ontology term status (except for unmapped and awaiting_review, as I described in the document), the list will automatically include "obsolete" and "deleted", because those are allowed values for ontology terms.

@joj0s
Copy link
Collaborator

joj0s commented Jun 2, 2020

I am currently working on the mermaid diagram. I think a state diagram (https://mermaid-js.github.io/mermaid/#/stateDiagram) seems most appropriate, what do you think? The only other option I see would be a flowchart but it doesn't seem as detailed

@joj0s
Copy link
Collaborator

joj0s commented Jun 2, 2020

Here's what I've got using mermaid, tell me your thoughts

@tskir
Copy link
Member Author

tskir commented Jun 3, 2020

The diagram is both exactly correct in the status change workflow, and very nice looking. I don't have anything at all to correct in it. Please submit it (both the source code and the resulting PNG) as a PR.

You can either submit separate PRs for this issue and for #8 (documenting data model), or a single joint PR, both approaches are fine with me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants