Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uleb128 decoding #3

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

IsmailShaheen
Copy link

First of all, thank you for writing this blog it's been a tremendus help. Now I've tried to run this code on an x86_64 Linux 18.04 machine and I didn't have any problem until v08. in v08 the personality function tries to access a weird memory location. On further inspection it turns out that the larger machine code produced by the 64bit architecture produces relative offset values (the ones that are stored in the LSDA call site) greater than 127 which requires 2 bytes for ULEB128 encoding and this is where it goes downhill. I know that you have mentioned ignoring this issue in your blog, that's why I have tried implementing a decoding function which works (for an extent) but for some odd reasons the start and len of the first entry are ignored and the readings shifts accordingly. Now, I am still working on it, but I would appreciate your help if you have any idea why this might happen.

PS. I am working with a group on a small university project and two of my colleagues have already contacted you previously, in case this issue seemed familiar.

@IsmailShaheen IsmailShaheen changed the title first draft for decoding function uleb128 decoding Feb 11, 2020
@IsmailShaheen
Copy link
Author

The call site is now being parsed correctly with a proper uleb128 decoding function, but still the abi can't access catch_ti->name() in mycppabi.cpp:334

@nicolasbrailo
Copy link
Owner

It's amazing that you're contributing a fix for this, thanks a lot! I'm sure we'll manage to figure why it isn't working.

If you traced the problem to uleb decoding, I guess my initial assumption is that any uleb's stored would be small enough to need no decoding at all. If that's the case, adding a proper decoder is a great first step. Have you verified if the types defined for the LSDA actually match those that are defined by a real libcpp? For example: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_personality.cc#L49 - while somewhat less didactic than my own ABI version, is much more likely to be correct!

@IsmailShaheen
Copy link
Author

Well, you were right from the beginning about the types, they match the ones in the libsupc. The problem was the decoding and some pointer arithmetic. The entire call site now is being accessed correctly as stated in the previous commit along with the type table this time. The problem is the catch_type_info variable. It contains the last 4 bytes in the LSDA as intended (more specifically the value represented by .long DW.ref._ZTI14Fake_Exception-. from the assembly code. The only problem is what does it represent? It doesn't appear to be an address to something as far as I can tell from the objdump of app. It would be helpful if you have more info about how the types are stored in the type table and how to retrieve them from the assembly code as accessing the name() method is what causes a segmentation fault.

@IsmailShaheen
Copy link
Author

v08 now successfully works. After tracing the personality routine in libsupc, it turns out that type table entries are pc relative addresses (relative to the entry itself). After implementing a simplified version of the decoding function, the new address now points to a valid type_info object and the name() method can be invoked safely. I will try propagating the changes to later version to see what happens.

@nicolasbrailo
Copy link
Owner

This is amazing work, thanks @IsmailShaheen. I doubt I'll be able to test your changes any time soon, but if you update it so that successive entries also work I'll be more than happy to merge the PR.

  • I'd like to include a note to the fixed version in the original blog post. Who should I credit this to? (Also, would you like to write up a short explanation about the problem to include in the post?)
  • I'm really really curious: what is your group using this project for?

Thanks again for the fix!

@IsmailShaheen
Copy link
Author

Sure thing, I will be updating later entries next week and I will make sure it has the "bare minimum" as you intended. As for the fixed version I have been working on it with @amroadel.
I am not much of a writer but I would be more than happy to write something you could edit for your blog. We will also document our findings on the except table and I can share it with you once it's done.
Regarding the project, we are basically building our own specific exception handling framework for a modified standard c library which is part of a larger project: building a unix-based specialized micro-kernel for a distributed architecture. I know you want more info but it's an ongoing research project in my university and I don't think I am entitled to give you details but I can share it with you once we are done after my professor's permission. Your blog was a great first step to understanding how the current exception handling framework is implemented so thank you again.

@IsmailShaheen
Copy link
Author

Well, that's it. v12 now finally works on an 86_64bit architecture. We will be doing some cleanup and commenting but you can review the changes now if you want. I have also written something for the problem and I would love to share our findings with you as well, if you could send me your contact info.

@nicolasbrailo
Copy link
Owner

@IsmailShaheen this is awesome work, thank you so much! Would love to followup on this with your group. If you'd like we can get in contact by email and maybe even plan a video-call if you think it'd help. I believe I'm already in contact with different people from your group via Linked in, email and even Facebook, but feel free to reach out to my gmail account, nicolasbrailo, anytime.

Thanks again!

@iamkroot
Copy link

iamkroot commented Feb 7, 2022

Thanks so much @IsmailShaheen for creating this PR! I was stuck on exactly the same problem of invalid type info ptr.
I was able to get it working by using your get_ttype_entry function to parse the info.

Also, thanks @nicolasbrailo for creating the excellent tutorial series! It would be great if this PR got merged, so that at least the main code repo would be correct (even if the blogposts aren't fully updated). Would've saved me about a day's worth of cursing at gdb :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants