adding BBMap qchist #1021
Nice! Ping @boulund as this is BBMap code.
A welcome addition, thanks for filing a PR to add this functionality! I haven't had a chance to test it yet, but it looks like the qchist finding and parsing code that has been added is handled outside of the generic parsing and plotting code here: https://github.com/ewels/MultiQC/blob/b035f081fac0baed23b3e78d48b0dfb3084738c7/multiqc/modules/bbmap/bbmap.py#L38-L72 I know that section of the code is rather... opaque... Sorry for that! It'll most certainly work, but if more additions like this come in, this module will become a hairy mess to maintain (more than it already is...). At least the new code that is added is very clear 👍 . If you think it looks OK @ewels, then by all means merge the PR.
Hmmm 🤔.... Cloned MultiQC into the folder, added (the spaces below should be tabs)
to Lo and behold! We got a plot in the output report. The correct data is indeed plotted, but the plot is ugly and it's giving the description for |
I'm confused - |
@ewels, yes they are all different datasets. BBMap offers a very wide range of histogram outputs. Here are some of them, as described in the BBMap CLI help:
Some of these I had no idea what to do with when we wrote the module to begin with, and opted for an approach where we wrote a fairly generic histogram parser with generic histogram plotting so we could cover as many as possible, and then added a few specific variants for the non-histogram data (a handful or so). We were hoping that end-users would request or make their own PRs for additional stuff as they realized they wanted them ... 😓 |
In the last commit I have changed the search pattern removing the filename (not useful) and moving from |
Nice! Cool that using the |
Hi @massiddamt, When I try running your code on the bbmap test data I get a new plot, but it's empty: Weirdly, hovering over the plot shows data tooltips: Looking at the exported data from the report, it seems as though it's a line graph but with only one point for each line. Is this what you also see? Do you have any example data that you have been running with? Phil |
Haven't done a full code review yet, will take a proper look when I get the module to produce a plot as expected..
Like I hinted at in my previous post above I don't think we need any code changes to enable qchist plotting. It seems it was just an issue with the file contents search pattern that made it match another output file from BBMap.
Adding the line above (preferably with the |
Yes, but you also said at the end:
😅 ... Haha. Indeed I did, and didn't even read the entire post I quoted :D If we're planning to go that route and keep the new code then it needs all of those items fixed that you just noted in your review @ewels. If we go the other route we need to change a single line of code and check that the plot looks ok. I'd argue that the latter is easier, lower effort, and quicker. There is no real risk of adding more complexity to the module by keeping the existing code, since nothing really changes. But of course, I haven't tested it properly, so not sure if it'll work. |
Sure! I’m all for that approach if it’s easy. Fancy opening a PR..? 😉 |
3e19e54 to 9f00c19
* Fix broken plot series with 0 counts on log axis
* Rewrite general stats code to avoid double-parsing
* Remove superfluous data file save
* Remove bad UserWarning stop if no qchist files found
* Remove double-logging
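For context on the first item: a log-scaled axis cannot place a value of 0, so a series that contains zero counts can render as broken or empty. The commit message does not say how this was fixed; the snippet below is only a minimal sketch of one possible approach (dropping zero-count points before plotting), and `drop_zero_counts` is a hypothetical helper name, not a MultiQC function.

```python
def drop_zero_counts(series):
    """Return only the (x, y) points with y > 0.

    Hypothetical helper, not part of MultiQC: log(0) is undefined,
    so points with a zero count are removed before the series is
    handed to a plot with a log-scaled y axis.
    """
    return [(x, y) for x, y in series if y > 0]


# Illustrative data only: Phred score -> base count
raw_series = [(2, 0), (10, 15), (20, 0), (30, 120), (40, 900)]
log_safe_series = drop_zero_counts(raw_series)
print(log_safe_series)
```

Dropping the points means the line is drawn straight across the gaps; replacing zeros with a null value instead would leave visible breaks, which may or may not be preferable.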
ok @massiddamt / @boulund - I figured this PR had languished long enough and I wanted to get it off the books 😅 I went over the code and did a bit of refactoring. I tried to make some bits (eg. general stats) use the generalised BBMap code a bit more, and sorted out some of the issues with the plotting and functionality blocking other BBMap sections. I'm pretty happy with this now and I think it works fine in my testing. I'll merge, but if you guys get a chance to test it out, that'd be fab! Thanks again for your contribution @massiddamt and discussion @boulund 🎉
Wow! Great work, thanks! I'll try to give it a spin next time an opportunity opens up. |
CHANGELOG.md has been updated
With this update it is possible to handle BBMap qchist reports.
A plot is generated with the number of counts at each Phred Score value.
A table is included in the general stats section with the percentage of bases with Q >= 30.
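As a rough illustration of the general stats value, here is a minimal sketch of how the percentage of bases with Q >= 30 could be computed from a qchist file. It assumes a tab-separated layout with a `#`-prefixed header and the quality score and count in the first two columns; the function names and the example data are made up for illustration and are not taken from the MultiQC code.

```python
def parse_qchist(text):
    """Parse qchist-style text into {quality: count}.

    Assumed layout (not confirmed by this PR): tab-separated rows,
    header lines starting with '#', quality in column 1, count in column 2.
    """
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        counts[int(fields[0])] = int(fields[1])
    return counts


def pct_at_or_above(counts, threshold=30):
    """Percentage of all bases whose quality is >= threshold."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # avoid division by zero on empty input
    high = sum(c for q, c in counts.items() if q >= threshold)
    return 100.0 * high / total


# Made-up example data in the assumed layout
example = (
    "#Quality\tcount1\tfraction1\n"
    "2\t100\t0.10000\n"
    "20\t200\t0.20000\n"
    "30\t300\t0.30000\n"
    "40\t400\t0.40000\n"
)
counts = parse_qchist(example)
q30_pct = pct_at_or_above(counts, 30)
print(f"{q30_pct:.1f}% of bases have Q >= 30")
```

The same `counts` mapping is what a per-Phred-score plot would draw, with the quality on the x axis and the count on the y axis.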