Modules meta.yml ontology #3032

Merged
18 commits merged on Sep 20, 2024


mirpedrol
Member

@mirpedrol mirpedrol commented Jun 21, 2024

Follow-up to #3028.
Split into a separate PR to make it easier to review once #3028 is merged.
I closed #3028, so this is the only PR to review now.

Continuation of #2789

This PR adds an automated way of generating a correctly formatted meta.yml for modules.

Example of inputs formatting:

input:
  - - meta:
        type: map
        description: |
          Groovy Map containing sample information
          e.g. [ id:'test', single_end:false ]
    - scaffold:
        type: file
        description: Fasta file containing scaffold
        pattern: "*.{fasta,fa}"
  - - fasta:
        type: file
        description: FASTA reference file
        pattern: "*.{fasta,fa}"
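
As a rough illustration of how this nested layout can be consumed, the sketch below walks the inputs example once it has been parsed into Python lists and dicts. The `describe_inputs` helper is hypothetical and not part of nf-core/tools; the `inputs` literal simply mirrors the YAML in this description, as a YAML parser would return it.

```python
# Parsed form of the inputs example: a list of channels, where each channel
# is a list of single-key mappings (element name -> element properties).
inputs = [
    [
        {
            "meta": {
                "type": "map",
                "description": (
                    "Groovy Map containing sample information\n"
                    "e.g. [ id:'test', single_end:false ]\n"
                ),
            }
        },
        {
            "scaffold": {
                "type": "file",
                "description": "Fasta file containing scaffold",
                "pattern": "*.{fasta,fa}",
            }
        },
    ],
    [
        {
            "fasta": {
                "type": "file",
                "description": "FASTA reference file",
                "pattern": "*.{fasta,fa}",
            }
        },
    ],
]


def describe_inputs(channels):
    """Return (channel_index, element_name, element_type) for every element.

    Hypothetical helper, for illustration only.
    """
    rows = []
    for index, channel in enumerate(channels):
        for element in channel:
            for name, props in element.items():
                rows.append((index, name, props.get("type")))
    return rows


for row in describe_inputs(inputs):
    print(row)
```

The outer list corresponds to the input channels of the module, and the inner list to the elements of each channel, so the channel grouping is carried by the data itself.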

Note that the structure proposed in nf-core/modules#4983 (comment) is not feasible if we want to automate the creation of this file, because comments are discarded when a YAML file is read with Python.
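
The point about comments can be checked with a small round trip. This is a minimal sketch that assumes the third-party PyYAML package is installed; the `# channel N` comments are made up for the demonstration and are not the layout from the linked proposal.

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)

# Hypothetical meta.yml fragment that uses comments to label channels.
text = """
input:
  # channel 1
  - - meta:
        type: map
  # channel 2
  - - fasta:
        type: file
"""

# Loading keeps only the data; the comments never reach the Python objects.
data = yaml.safe_load(text)

# Writing the data back therefore cannot restore them.
round_tripped = yaml.safe_dump(data, sort_keys=False)
print(round_tripped)
```

Any information that should survive an automated rewrite of meta.yml has to live in the data structure, which is what the nested list layout does.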

Example of outputs formatting:

output:
  - versions:
    - "versions.yml":
        type: file
        description: File containing software versions
        pattern: "versions.yml"
  - bam:
    - meta:
        type: map
        description: Groovy Map containing sample information
  
    - "*.bam":
        type: file
        description: Sorted BAM/CRAM/SAM file
        pattern: "*.{bam,cram,sam}"
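
For the outputs, each top-level entry maps a channel name to the list of its elements. The sketch below collects the file patterns per channel from the parsed form of the outputs example; `file_patterns` is a hypothetical helper written for this illustration, not a function from nf-core/tools.

```python
# Parsed form of the outputs example: a list of single-key mappings
# (channel name -> list of single-key element mappings).
outputs = [
    {
        "versions": [
            {
                "versions.yml": {
                    "type": "file",
                    "description": "File containing software versions",
                    "pattern": "versions.yml",
                }
            }
        ]
    },
    {
        "bam": [
            {
                "meta": {
                    "type": "map",
                    "description": "Groovy Map containing sample information",
                }
            },
            {
                "*.bam": {
                    "type": "file",
                    "description": "Sorted BAM/CRAM/SAM file",
                    "pattern": "*.{bam,cram,sam}",
                }
            },
        ]
    },
]


def file_patterns(output_channels):
    """Map each output channel name to the patterns of its file elements.

    Hypothetical helper, for illustration only.
    """
    patterns = {}
    for channel in output_channels:
        for channel_name, elements in channel.items():
            patterns[channel_name] = [
                props["pattern"]
                for element in elements
                for props in element.values()
                if props.get("type") == "file" and "pattern" in props
            ]
    return patterns


print(file_patterns(outputs))
```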

In this PR we also add an option, --fix, to fix existing files automatically.
This option was originally named --update-meta-yml and was changed to --fix as suggested in #2789 (comment).

Pytests were initially missing for this functionality; a test has now been added for the command nf-core modules lint --fix.
⚠️ This test will fail until the JSON schema is updated (nf-core/modules#5837).

Together with this PR, there are other actions which must happen at the same time:

This PR also adds a tool identifier to the modules meta.yml, obtained by querying bio.tools for the tool's bio.tools ID.
It also adds EDAM ontologies for file inputs and outputs to the meta.yml template.
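
A lookup of this kind could be sketched as follows. The endpoint and the `list`, `name` and `biotoolsCURIE` field names are assumptions based on the public bio.tools API, and none of the helper names are taken from nf-core/tools; the sample response is hand-written so the example runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed bio.tools search endpoint (not confirmed by this PR).
BIOTOOLS_API = "https://bio.tools/api/t/"


def biotools_url(tool_name):
    """Build the search URL for a tool name."""
    return f"{BIOTOOLS_API}?{urlencode({'q': tool_name, 'format': 'json'})}"


def pick_biotools_id(response, tool_name):
    """Return the CURIE of the first entry whose name matches, else None.

    The match is case-insensitive; the field names are assumptions.
    """
    for entry in response.get("list", []):
        if entry.get("name", "").lower() == tool_name.lower():
            return entry.get("biotoolsCURIE")
    return None


def fetch_biotools_id(tool_name, timeout=10):
    """Query bio.tools over the network and pick the matching ID."""
    with urlopen(biotools_url(tool_name), timeout=timeout) as handle:
        return pick_biotools_id(json.load(handle), tool_name)


# Offline demonstration with a hand-written, illustrative response.
sample = {"list": [{"name": "SAMtools", "biotoolsCURIE": "biotools:samtools"}]}
print(biotools_url("samtools"))
print(pick_biotools_id(sample, "samtools"))
```

Keeping the response parsing separate from the network call makes the matching logic testable offline, and returning `None` on no match leaves the identifier to be filled in manually.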

Note that ontologies are not filled in automatically, even though they can sometimes be obtained from bio.tools.
Currently, inputs and outputs are not obtained automatically when first creating the module. We should consider whether automating this is required.
One option is to update the ontologies when updating the meta.yml with --update-meta-yml.

POC in modules: nf-core/modules#5867

Ontologies can be added manually, and we have linting for them, but we leave implementing the tooling for a later stage: #3027

@mirpedrol mirpedrol mentioned this pull request Jun 21, 2024
4 tasks
@mirpedrol mirpedrol changed the title Modules yml ontology Modules meta.yml ontology Jun 21, 2024
@mirpedrol mirpedrol force-pushed the modules-yml-ontology branch 3 times, most recently from afd2edc to de6b069 Compare June 21, 2024 13:58
@maxulysse
Member

some conflicts 😨

Member

@ewels ewels left a comment

Reviewed and tested over zoom, LGTM after final check about schema ✅

@mirpedrol
Member Author

Confirmed that we are using the path to the cloned remote to obtain the JSON schema, not the local repo
(see https://github.com/nf-core/tools/blob/dev/nf_core/modules/lint/meta_yml.py#L70)

@mirpedrol
Member Author

Merging this to continue with the bulk modules update.
The test is failing because we use the JSON schema from the master branch of modules, while we are updating everything in the batch_update_staging branch of modules. We should be ready to merge batch_update_staging into master once subworkflows are updated too.

@mirpedrol mirpedrol merged commit 282e8fe into nf-core:dev Sep 20, 2024
82 of 83 checks passed
@mirpedrol mirpedrol deleted the modules-yml-ontology branch September 20, 2024 08:21
3 participants