Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance increase in alignment #279

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

andreasbuhr
Copy link

Some changes which result in a performance improvement of about 1.5%

I know it makes the code less readable. Only merge if you like it.

bwt.c Show resolved Hide resolved
@andreasbuhr
Copy link
Author

Hi @BrettDong,

thank you so much for your review. Yes, I do know that (uint64_t)p[0]<<32 | p[1] is not equivalent to *((uint64_t*)p). But you when taking into account the function __occ_aux, things are different. The function __occ_aux counts how many two-bit tuples bits are equal to c in the 64bit integer. The parameter c can only be 0, 1, 2, or 3. The number of two-bit tuples equal to c does not change when reordering the bytes. So __occ_aux((uint64_t)p[0]<<32 | p[1], c) is in fact equivalent to __occ_aux(*((uint64_t*)p), c). And it is significantly faster. So I'd still suggest to merge this pull request.

best regards,
Andreas

@BrettDong
Copy link

Thank you a lot for your detailed explanation! After I incorporated these three lines of changes my application seem to get stuck in an infinite loop inside BWT but if I restore these three lines the problem immediately disappears. Therefore I thought the problem arose from the equivalence got broken. Now the culprit must lie somewhere else.

@andreasbuhr
Copy link
Author

Can you provide instructions how to reproduce this? Like what exact command are you running on what exact input files?

@BrettDong
Copy link

Seems the reason of the problem I encountered is the compiler doing aggressive inline and

y = ((c&2)? y : ~y) >> 1 & ((c&1)? y : ~y) & 0x5555555555555555ull;

this line overwrites content in p array.

@andreasbuhr
Copy link
Author

The function takes y by value, so it should not change the array. Do you think it is a bug in your compiler?

@BrettDong
Copy link

I think I was encountering a compiler bug. So sorry for my impulsive comment above. Your changes are indeed mathematically equivalent and should not break the behaviour of the program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants