Finalise the complete list of trait statuses and transitions between them #7

tskir · 2020-05-19T22:37:48Z

The Google doc included a simplified example of this. Now we need to discuss & finalise this. We need to make sure that the trait status list/transitions fully take into account:

Terms becoming obsolete
Terms becoming deleted from the ontology
Terms having none/partial/full reviews

joj0s · 2020-05-20T12:00:53Z

Right, so based on my current understanding of a trait's possible life cycle and some of your clarifications, I have given it some more thought and here is my suggestion:
The different possible status values for a trait mapping are: unmapped, under_review, mapped, obsolete, deleted, past.

The life cycle is as follows:

A trait is first inserted into the database, in the traits table. It has no entry in the mappings table so it is considered unmapped and will show up as such when looking it up in the app.
Any maintainer can then assign an ontology term to it. It is then made to be under_review
Assuming a minimum required number of reviews (e.g. 3), it keeps this status until it reaches that number. Upon a 3rd review, it is then considered mapped
It is regularly being checked against the Ontology Lookup Service. If the assigned term is found to be obsolete, then it assumes the obsolete status. If the term no longer exists, it assumes the deleted status
If someone assigns a new term to the trait, the current mapping now assumes the past status and a new entry in the mappings table is inserted, repeating the above cycle.

Two notes here. The past status is something that I haven't added in the google doc model I provided so if you agree with the above let me know so that I can fix that. Also this suggestion would raise a follow-up discussion regarding the possible public API endpoints and responses so that would be the next step I think.

I also assume that a maintainer can freely change the mapping to a trait even if he is not the one who issued its previous mapping

tskir · 2020-05-21T16:43:38Z

@joj0s This sounds good for the most part, just some comments/clarifications:

1. I'm not sure how and when the past status will be applied. Remember that here we are talking about the statuses of traits, not mappings. A trait can only have one status, regardless on how many different mappings it had historically.

2. Another thing which we need to track, which is currently not reflected in the list of statuses, is import and creation of new terms. This can include quite a few steps. For example, here is a reasonable lifecycle of a trait:

Added in Clinvar, imported into the database as an unmapped term → unmapped
Mapped to an existing term from EFO and put into review → under_review
Rejected by a reviewer with some comments → something like "changes_requested"?
Mapped by a curator to an existing term from MONDO, which is however not in EFO, and put back into review → under_review again probably?
Confirmed by N reviewers → we cannot use "mapped" here because the trait is not yet finalised, so something like "awaiting_efo_import"?
Submitted by a curator (in a batch with other traits) for EFO import via a GitHub PR → something like "submitted_for_efo_import"?
Then the term remains in this status until the periodic OLS query will determine that the term is now, indeed, in EFO, which means the submission in the PR has been successfully processed. In this case, it will finally go into mapped.

A similar approach will have to be done for creating EFO terms, probably with their corresponding statuses.

I think it would be very useful to draw a diagram, and include it in the documentation in this repo. I suggest that you use Mermaid for this purpose, as it's much easier than to manually draw it in Google Slides or Inkscape. Here's a live editor where you can create the diagram, then export its source code and the final PNG.

tskir · 2020-05-21T16:50:50Z

Also, regarding the statuses, I think it may be more useful (this needs to be evaluated) to have not just one big "status" column with 10+ possibilities, but rather separate simpler ones. For example:

Each trait can be either mapped or unmapped
The trait to ontology mapping can be either not reviewed at all; or reviewed by N people. The result of the curation can be either approval, or disapprovals, or mixed (mixed essentially means disapproval, as the trait needs to be discussed further)
The ontology term which the mapping points to can be either current, obsolete, or deleted. It can either be in EFO, or in other ontology from which it could be imported into EFO, or it might be created from scratch and require creation of a new EFO term.

This might not be comprehensive, but the idea is that we could use those smaller, simpler "statuses" to keep track of individual trait/mapping/term properties, and then assemble the final "status" of the trait as a computed property.

joj0s · 2020-05-25T15:48:18Z

I worked on the model more and separated the statuses into more distinct ones. In particular, the model I am suggesting includes statuses for the reviewing process of a trait (whether it has been reviewed, if it is under review or if it has been successfully reviewed) and for the ontology which is mapped to a trait (if it is current, obsolete, deleted or awaiting import). I would appreciate your opinion on whether we need more individual statuses.

Then, the master status of a trait can be calculated based on those values. I would like your input however on what these different master statuses will be

This is the link for the google doc which describes my current model proposal, with the status suggestions as well some details about them https://docs.google.com/document/d/1iunypcRGtlxPc_uwhjKnMv8tD67dHOEPaOQMxHSL96U/edit?usp=sharing

tskir · 2020-05-27T16:01:36Z

I have left a number of comments & suggestions in the data model document. The summary:

Adding/modifying/removing certain fields
Renaming ontologies to ontology_terms
Clarifying master statuses, mapping statuses, and ontology term statuses
Clarifying which entity the comments apply to
Discussing the process and status tracking for imported and new terms — this might require more discussion & final clarifications once you address all other comments

joj0s · 2020-05-28T08:54:19Z

I have resolved most of the comments and added some follow-up ones myself. I would also like to discuss the master statuses specifically. Are we ok with these 4 suggested ones? I myself am not sure whether we need the "new_term_required" one (I'm leaning towards no). And also would we need more?

tskir · 2020-05-28T11:09:00Z

I've added a section + table to the document linked above regarding:

Approximate complete workflows for new/imported/created terms from start to finish
Considerations for the possible values of a master status and how it can be computed

Please let me know what you think of this, any suggestions are welcome!

joj0s · 2020-05-28T11:36:56Z

The sections you added are extremely helpful and make a ton of sense! Based on them I have updated the data model. The changes I made are these:

Updated the number of possible master statuses in the traits table. These are “current”, “unmapped”, “awaiting_review”, “needs_import”, “needs_creation”, “awaiting_import” or “awaiting_creation”. Basically everything that is listed in the table, plus an "unmapped" status for every trait to begin with.
Renamed the ontologies table to ontology_terms. Also renamed the "term" field to "curie" field to be less ambiguous.
Added cross_refs and description field to ontology_terms table for newly created terms.

Let me know if you think that any further changes need to be made. I can also create a diagram as you suggested once the model is finalized

joj0s · 2020-05-28T15:49:10Z

I think we should also probably include "obsolete" and "deleted" as master statuses

tskir · 2020-06-01T12:24:51Z

Regarding the changes — everything's great, I did a third iteration of data model review in #8, and also added some comments about status-related things in the doc.

I can also create a diagram as you suggested once the model is finalized

Yes, it probably makes sense. You can do it as part of this ticket or you may create a new ticket for this purpose if you feel it would be better.

I think we should also probably include "obsolete" and "deleted" as master statuses

Yes, I agree. Since we want to just make the master status mirror the ontology term status (except for unmapped and awaiting_review, as I described in the document), the list will automatically include "obsolete" and "deleted", because those are allowed values for ontology terms.

joj0s · 2020-06-02T10:29:10Z

I am currently working on the mermaid diagram. I think a state diagram (https://mermaid-js.github.io/mermaid/#/stateDiagram) seems most appropriate, what do you think? The only other option I see would be a flowchart but it doesn't seem as detailed

joj0s · 2020-06-02T15:18:54Z

Here's what I've got using mermaid, tell me your thoughts

tskir · 2020-06-03T15:04:18Z

The diagram is both exactly correct in the status change workflow, and very nice looking. I don't have anything at all to correct in it. Please submit it (both the source code and the resulting PNG) as a PR.

You can either submit separate PRs for this issue and for #8 (documenting data model), or a single joint PR, both approaches are fine with me.

tskir mentioned this issue May 19, 2020

Determine a starting UI prototype #6

Closed

tskir mentioned this issue May 21, 2020

Describe and discuss the data model #8

Closed

tskir mentioned this issue May 27, 2020

Rolling minutes for weekly calls #9

Open

joj0s mentioned this issue Jun 4, 2020

Create data documentation md file #13

Merged

tskir closed this as completed in #13 Jun 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Finalise the complete list of trait statuses and transitions between them #7

Finalise the complete list of trait statuses and transitions between them #7

tskir commented May 19, 2020

joj0s commented May 20, 2020 •

edited

Loading

tskir commented May 21, 2020

tskir commented May 21, 2020

joj0s commented May 25, 2020 •

edited

Loading

tskir commented May 27, 2020

joj0s commented May 28, 2020

tskir commented May 28, 2020

joj0s commented May 28, 2020 •

edited

Loading

joj0s commented May 28, 2020

tskir commented Jun 1, 2020 •

edited

Loading

joj0s commented Jun 2, 2020 •

edited

Loading

joj0s commented Jun 2, 2020 •

edited

Loading

tskir commented Jun 3, 2020

Finalise the complete list of trait statuses and transitions between them #7

Finalise the complete list of trait statuses and transitions between them #7

Comments

tskir commented May 19, 2020

joj0s commented May 20, 2020 • edited Loading

tskir commented May 21, 2020

tskir commented May 21, 2020

joj0s commented May 25, 2020 • edited Loading

tskir commented May 27, 2020

joj0s commented May 28, 2020

tskir commented May 28, 2020

joj0s commented May 28, 2020 • edited Loading

joj0s commented May 28, 2020

tskir commented Jun 1, 2020 • edited Loading

joj0s commented Jun 2, 2020 • edited Loading

joj0s commented Jun 2, 2020 • edited Loading

tskir commented Jun 3, 2020

joj0s commented May 20, 2020 •

edited

Loading

joj0s commented May 25, 2020 •

edited

Loading

joj0s commented May 28, 2020 •

edited

Loading

tskir commented Jun 1, 2020 •

edited

Loading

joj0s commented Jun 2, 2020 •

edited

Loading

joj0s commented Jun 2, 2020 •

edited

Loading