Annotate big mutation file w/o failure #216
I used a 20G heap (and would probably have needed a lot more) just to load a ~1.5G file (pog570_bcgsc_2020/data_mutations.txt). We shouldn't touch a single character of a line until we really need it, because reading that file with a buffered reader takes only 3-5 seconds.
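To illustrate the "don't touch a line until we need it" idea, here is a minimal sketch rather than the annotator's actual code. It streams lines through a `BufferedReader` and only pulls out a single tab-separated column on demand. The class name `LazyMutationReader` and both method names are hypothetical, and treating lines that start with `#` as comments is an assumption about the file layout.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper, not part of the existing code base.
class LazyMutationReader {

    /**
     * Streams the input one line at a time and hands each non-empty,
     * non-comment line to the handler without parsing it.
     * Returns the number of lines passed to the handler (header included).
     */
    static long forEachLine(Reader source, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Assumption: lines starting with '#' are comments/metadata.
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                handler.accept(line);
                count++;
            }
        }
        return count;
    }

    /**
     * Extracts a single tab-separated column (0-based) without splitting
     * the whole line. Returns an empty string if the column does not exist.
     */
    static String column(String line, int index) {
        int start = 0;
        for (int i = 0; i < index; i++) {
            int tab = line.indexOf('\t', start);
            if (tab < 0) {
                return "";
            }
            start = tab + 1;
        }
        int end = line.indexOf('\t', start);
        return end < 0 ? line.substring(start) : line.substring(start, end);
    }
}
```

With this shape, only the current line is reachable at any time, so memory use should stay roughly constant regardless of file size, as long as the handler does not retain the lines it receives.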
The runtime side of this problem is solved by #227.
@inodb @rmadupuri @sheridancbio For some reason we use a giant Map<String, VariantAnnotation> gnResponseVariantKeyMap. Let me show you what's going on, step by step:
These steps suggest that there should be fewer OriginalVariantQuery objects than genomicLocations and that, for some reason, we should use the last inserted OriginalVariantQuery. That seems unnecessary to me. These steps should be converted into this:
Now the garbage collector can start cleaning up unused POST response data. If these steps can't be changed, using a smaller version of VariantAnnotation might help.
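As a rough illustration of why dropping the global map helps the garbage collector, the sketch below processes keys in fixed-size batches and keeps each response map local to its batch. It is not the steps referred to above and not the project's actual code: `BatchedAnnotation`, `annotateInBatches`, the `fetch` function (standing in for the POST request), and the `sink` callback are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch: per-batch response maps instead of one giant map.
class BatchedAnnotation {

    /**
     * Sends keys to `fetch` in batches of at most `batchSize`, forwards each
     * (key, annotation) pair to `sink`, and returns the number of batches sent.
     * `fetch` stands in for the POST call; it is an assumption, not a real API.
     */
    static <A> int annotateInBatches(Iterator<String> keys,
                                     int batchSize,
                                     Function<List<String>, Map<String, A>> fetch,
                                     BiConsumer<String, A> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<String> batch = new ArrayList<>(batchSize);
        while (keys.hasNext()) {
            batch.add(keys.next());
            if (batch.size() == batchSize || !keys.hasNext()) {
                // The response map lives only inside this block.
                Map<String, A> response = fetch.apply(batch);
                for (String key : batch) {
                    // The value is null if the response has no entry for the key.
                    sink.accept(key, response.get(key));
                }
                // Start a fresh list; the old batch and its response
                // are no longer referenced and can be collected.
                batch = new ArrayList<>(batchSize);
                batches++;
            }
        }
        return batches;
    }
}
```

Peak memory in this sketch is bounded by one batch of responses rather than by the whole file, provided the sink writes results out instead of accumulating them.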
We can use this file to test:
https://github.com/cBioPortal/datahub/blob/master/public/difg_glass_2019/data_mutations.txt