-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Correctly handle non str index list for bwa-mem2/mem #3101
Conversation
As reported in snakemake#522, currently this wrapper has issues if `idx` is a sequence is it incorrectly checked `input.index` to be as tring instead of `input.idx`. This is fixed now. While the discussin in snakemake#522 indicates, that there is an underlying issue with the concept of `input.idx` being as sequence of indices, this may need rework later. Until all the details are decided, having a correct version should be still valuable.
I think that all input files should be specified as If so, can't we just have something lile: if not isinstance(snakemake.input.idx, list):
RAISE ERROR
index = path.splitext(snakemake.input.idx[0])[0] |
I agree that this should be more specific, but would think that expecting any list may still be a bit vague. I lack a bit the domain knowledge but would think, this should be all outputs of So wouldnt the even better solution be to:
|
I am not super familiar with |
Reading again through the discussion #522 I have 3 points:
Should this be added as well as part of this PR, eg as an explicit
CC @lczech @dlaehnemann ' |
Regarding your questions:
|
By the way, many thanks for taking this up and for putting in the effort of fixing this properly! |
Sorry for being a bit delayed in impelementing the "proper" solution: I got a strong flu last week and now battling catching up to dooming deadlines. After Wednesday I would have time to do this - if this is more urgent and someone else has time to finish this please go ahead! |
These are typically built by the bwa-mem2/index tool.
Now all indices required need to be declared.
The previous solution using os.path.splitext does fail with the index "a.fasta.bwt.2bit.64"
Warning Rate limit exceeded@votti has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 1 minutes and 15 seconds before requesting another review. How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe recent updates enhance the documentation and functionality of the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Wrapper
participant IndexFile
User->>Wrapper: Provide index files
Wrapper->>IndexFile: Validate presence of required files
IndexFile-->>Wrapper: All required files present
Wrapper-->>User: Execute BWA-MEM2 algorithm
sequenceDiagram
participant User
participant Wrapper
participant IndexFile
User->>Wrapper: Provide index files
Wrapper->>IndexFile: Validate presence of required files
IndexFile-->>Wrapper: Missing files detected
Wrapper-->>User: Raise ValueError with missing files
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
I have now finished implementing this and also updated the tests and documentation. What I found was that previously implemented approach of using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Files selected for processing (3)
- bio/bwa-mem2/mem/meta.yaml (1 hunks)
- bio/bwa-mem2/mem/test/Snakefile (2 hunks)
- bio/bwa-mem2/mem/wrapper.py (2 hunks)
Additional context used
Path-based instructions (1)
bio/bwa-mem2/mem/wrapper.py (1)
Pattern
**/*.py
: Do not suggest to add trailing commas.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theself
argument of methods.
Ruff
bio/bwa-mem2/mem/wrapper.py
11-11:
os.path
imported but unusedRemove unused import:
os.path
(F401)
47-47: Missing return type annotation for private function
all_same
(ANN202)
47-47: Missing type annotation for function argument
x
(ANN001)
57-57: Undefined name
snakemake
(F821)
60-63: Avoid specifying long messages outside the exception class
(TRY003)
62-62: Trailing comma missing
Add trailing comma
(COM812)
65-65: Undefined name
snakemake
(F821)
68-71: Avoid specifying long messages outside the exception class
(TRY003)
70-70: Trailing comma missing
Add trailing comma
(COM812)
Additional comments not posted (8)
bio/bwa-mem2/mem/meta.yaml (1)
11-14
: Documentation improvement approved.The updated description for the
idx
parameter provides clear guidance on the required index files, enhancing user understanding.bio/bwa-mem2/mem/test/Snakefile (2)
4-5
: Index file specification update approved.The addition of the ".0123" file extension ensures that all required index files are specified, aligning with the updated requirements.
23-24
: Index file specification update approved.The addition of the ".0123" file extension ensures that all required index files are specified, aligning with the updated requirements.
bio/bwa-mem2/mem/wrapper.py (5)
16-17
: Introduction ofREQUIRED_IDX
approved.Defining the required index file extensions as a constant improves code maintainability and clarity.
35-53
: Introduction ofget_common_prefix
approved.The function efficiently extracts the longest common prefix, enhancing the robustness of index handling.
Tools
Ruff
47-47: Missing return type annotation for private function
all_same
(ANN202)
47-47: Missing type annotation for function argument
x
(ANN001)
65-65
: Verify the usage ofsnakemake
.The variable
snakemake
is used but not defined in the provided context. Ensure it is correctly imported or passed to this script.Tools
Ruff
65-65: Undefined name
snakemake
(F821)
68-71
: Consider using exception classes for long messages.To improve readability, consider defining a custom exception class for long error messages.
Tools
Ruff
68-71: Avoid specifying long messages outside the exception class
(TRY003)
70-70: Trailing comma missing
Add trailing comma
(COM812)
57-57
: Verify the usage ofsnakemake
.The variable
snakemake
is used but not defined in the provided context. Ensure it is correctly imported or passed to this script.Tools
Ruff
57-57: Undefined name
snakemake
(F821)
bio/bwa-mem2/mem/wrapper.py
Outdated
raise ValueError( | ||
"Could not determine common prefix of index files.\n" | ||
"Please make sure that the index files are named correctly." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using exception classes for long messages.
To improve readability, consider defining a custom exception class for long error messages.
class IndexPrefixError(ValueError):
def __init__(self):
super().__init__(
"Could not determine common prefix of index files.\n"
"Please make sure that the index files are named correctly."
)
# Usage
raise IndexPrefixError()
Tools
Ruff
60-63: Avoid specifying long messages outside the exception class
(TRY003)
62-62: Trailing comma missing
Add trailing comma
(COM812)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgvieira How do you think about this one - should I go for the custom exception class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a bit overkill, but up to you (this coderabitai thing is quite new).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Johannes turned this coderabbitai
application on. It's very chatty, but it does have some useful suggestions -- but also a lot of suggestions you can ignore. And you can chat back at it, telling it not to review certain things or why a certain suggestion is not useful. It then at least claims to learn from it... 😅
You also added a log file. Is that needed? |
@log file and bam file: this were added by accident, I removed them from the corresponding commit. |
🤖 I have created a release \*beep\* \*boop\* --- ## [4.1.0](https://www.github.com/snakemake/snakemake-wrappers/compare/v4.0.0...v4.1.0) (2024-08-21) ### Features * Allow outputting of rendered yte config in Datavzrd wrapper ([#3123](https://www.github.com/snakemake/snakemake-wrappers/issues/3123)) ([e74be78](https://www.github.com/snakemake/snakemake-wrappers/commit/e74be7843299478a276febfacb52825a9762500c)) ### Bug Fixes * Correctly handle non str index list for bwa-mem2/mem ([#3101](https://www.github.com/snakemake/snakemake-wrappers/issues/3101)) ([6f46508](https://www.github.com/snakemake/snakemake-wrappers/commit/6f46508aa99663930e70c82114205929aee2dfcb)) ### Performance Improvements * autobump bio/gatk3/baserecalibrator ([#3119](https://www.github.com/snakemake/snakemake-wrappers/issues/3119)) ([6c2bdfd](https://www.github.com/snakemake/snakemake-wrappers/commit/6c2bdfdca79c7037448bc601a4aeb3d578ea0980)) * autobump bio/gatk3/realignertargetcreator ([#3118](https://www.github.com/snakemake/snakemake-wrappers/issues/3118)) ([ba219e7](https://www.github.com/snakemake/snakemake-wrappers/commit/ba219e7698eb078fe22c703a9662c8cb6deee0eb)) * autobump bio/genomescope ([#3117](https://www.github.com/snakemake/snakemake-wrappers/issues/3117)) ([6292229](https://www.github.com/snakemake/snakemake-wrappers/commit/62922298404397c7b23dc5b8fb01bc33e445e97f)) * autobump bio/igv-reports ([#3120](https://www.github.com/snakemake/snakemake-wrappers/issues/3120)) ([fe33c2c](https://www.github.com/snakemake/snakemake-wrappers/commit/fe33c2c27328212c528e94f995fd0bc26d7fde39)) * autobump bio/trinity ([#3121](https://www.github.com/snakemake/snakemake-wrappers/issues/3121)) ([62b0231](https://www.github.com/snakemake/snakemake-wrappers/commit/62b0231c99920200e3fa6794f87578869cc35bd5)) * Update Datavzrd wrapper ([#3122](https://www.github.com/snakemake/snakemake-wrappers/issues/3122)) ([5101461](https://www.github.com/snakemake/snakemake-wrappers/commit/51014613939bfc5e48e0b2ae619e283cf36bfd88)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
As reported in #522, currently the
bwa2-mem
wrapper has issues ifidx
is a sequence is it incorrectly checkedinput.index
to be as string instead ofinput.idx
.This bug was silent, as usually
snakemake.input.index
would always be a available asfunctools.partial(<function Namedlist._used_attribute at 0x73ac4dbc0c20>, _name='index')
or similar. Thus the predicate that it was not astr
was always true.Now with
snakemake=8.16.0
this changed and the bug becomes an error running the already existing tests.While the discussion in #522 indicates, that there is an underlying issue with the concept of
input.idx
being as sequence of indices, this may need rework later. Until all the details are decided, having a correct working version should be still valuable. With the current state any workflow using this wrapper usingsnakemake=8.16.0
is broken.QC
snakemake-wrappers
.While the contributions guidelines are more extensive, please particularly ensure that:
test.py
was updated to call any added or updated example rules in aSnakefile
input:
andoutput:
file paths in the rules can be chosen arbitrarilyinput:
oroutput:
)tempfile.gettempdir()
points tometa.yaml
contains a link to the documentation of the respective tool or command underurl:
Summary by CodeRabbit
New Features
Bug Fixes
Documentation