uleb128 decoding #3

IsmailShaheen · 2020-02-11T22:23:37Z

First of all, thank you for writing this blog; it's been a tremendous help. I've tried to run this code on an x86_64 Linux 18.04 machine and didn't have any problems until v08. In v08 the personality function tries to access a weird memory location. On further inspection, it turns out that the larger machine code produced for the 64-bit architecture yields relative offset values (the ones stored in the LSDA call site) greater than 127, which require 2 bytes in ULEB128 encoding, and this is where it goes downhill. I know you mentioned ignoring this issue in your blog, which is why I have tried implementing a decoding function. It works (to an extent), but for some odd reason the start and len of the first entry are ignored and the readings shift accordingly. I am still working on it, but I would appreciate your help if you have any idea why this might happen.

PS. I am working with a group on a small university project, and two of my colleagues have already contacted you previously, in case this issue seems familiar.

IsmailShaheen · 2020-02-12T01:38:28Z

The call site is now being parsed correctly with a proper uleb128 decoding function, but the ABI still can't access catch_ti->name() in mycppabi.cpp:334
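For readers following the thread, here is a minimal sketch of what a ULEB128 decoder can look like. It is not the code from this PR: the name `decode_uleb128` and its cursor-style signature are assumptions made for illustration. Each byte contributes its low 7 bits, least significant group first, and a set high bit means another byte follows, which is why any value above 127 takes at least two bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the PR's actual function: decodes one ULEB128
// value starting at *cursor and advances *cursor past the bytes consumed.
inline std::uint64_t decode_uleb128(const std::uint8_t** cursor) {
    assert(cursor != nullptr && *cursor != nullptr);
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
        byte = *(*cursor)++;
        // The low 7 bits carry payload; groups arrive least significant first.
        if (shift < 64) {
            result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        }
        shift += 7;
    } while (byte & 0x80);  // A set high bit means "more bytes follow".
    return result;
}
```

Advancing the cursor inside the decoder matters for the symptom described above: if the reader always steps one byte per field, every field after the first multi-byte value is read from the wrong position.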

nicolasbrailo · 2020-02-12T08:05:27Z

It's amazing that you're contributing a fix for this, thanks a lot! I'm sure we'll manage to figure out why it isn't working.

If you traced the problem to uleb decoding, I guess my initial assumption was that any stored uleb values would be small enough to need no decoding at all. If that's the case, adding a proper decoder is a great first step. Have you verified whether the types defined for the LSDA actually match those defined by a real libcpp? For example: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/libsupc%2B%2B/eh_personality.cc#L49 - while somewhat less didactic than my own ABI version, it is much more likely to be correct!

IsmailShaheen · 2020-02-15T22:08:08Z

Well, you were right from the beginning about the types: they match the ones in libsupc++. The problem was the decoding and some pointer arithmetic. The entire call site is now being accessed correctly, as stated in the previous commit, along with the type table this time. The problem is the catch_type_info variable. It contains the last 4 bytes in the LSDA as intended (more specifically, the value represented by .long DW.ref._ZTI14Fake_Exception-. in the assembly code). The only problem is: what does it represent? It doesn't appear to be the address of anything, as far as I can tell from the objdump of app. It would be helpful if you have more info about how the types are stored in the type table and how to retrieve them from the assembly code, as accessing the name() method is what causes a segmentation fault.

IsmailShaheen · 2020-02-17T14:53:03Z

v08 now works successfully. After tracing the personality routine in libsupc++, it turns out that type table entries are PC-relative addresses (relative to the entry itself). After implementing a simplified version of the decoding function, the new address now points to a valid type_info object and the name() method can be invoked safely. I will try propagating the changes to later versions to see what happens.
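A rough sketch of the PC-relative lookup described above, for anyone hitting the same segfault. This is not the PR's get_ttype_entry: the name `resolve_pcrel_sdata4`, the fixed 4-byte signed entry size, and the `indirect` flag are assumptions for illustration. The real entry size and indirection depend on the type table encoding byte stored in the LSDA header, and a `DW.ref.` symbol like the one in the `.long` directive quoted earlier suggests the computed address holds a pointer to the type_info rather than the type_info itself.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the PR's actual function: resolves one type table
// entry stored as a signed 4-byte offset relative to the entry's own address.
// If `indirect` is true, the computed address is treated as a slot holding
// the final pointer (assumed here to match the DW.ref.* indirection).
inline const void* resolve_pcrel_sdata4(const std::uint8_t* entry, bool indirect) {
    assert(entry != nullptr);
    std::int32_t offset = 0;
    std::memcpy(&offset, entry, sizeof offset);  // entry may be unaligned
    if (offset == 0) {
        return nullptr;  // a zero entry is left as null, not relocated
    }
    const std::uint8_t* target = entry + offset;  // relative to the entry itself
    if (indirect) {
        const void* pointee = nullptr;
        std::memcpy(&pointee, target, sizeof pointee);
        return pointee;
    }
    return target;
}
```

Treating the raw 4 bytes as an absolute address is what yields a pointer into nowhere; adding the entry's own address first (and following the indirection, when the encoding asks for it) is what makes the result usable as a type_info pointer.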

nicolasbrailo · 2020-02-19T18:30:24Z

This is amazing work, thanks @IsmailShaheen. I doubt I'll be able to test your changes any time soon, but if you update it so that successive entries also work I'll be more than happy to merge the PR.

I'd like to include a note about the fixed version in the original blog post. Who should I credit this to? (Also, would you like to write up a short explanation of the problem to include in the post?)
I'm really really curious: what is your group using this project for?

Thanks again for the fix!

IsmailShaheen · 2020-02-21T22:18:03Z

Sure thing, I will be updating later entries next week and I will make sure it has the "bare minimum" as you intended. As for the fixed version, I have been working on it with @amroadel.
I am not much of a writer, but I would be more than happy to write something you could edit for your blog. We will also document our findings on the except table, and I can share them with you once that's done.
Regarding the project, we are basically building our own specific exception handling framework for a modified standard C library, which is part of a larger project: building a Unix-based specialized micro-kernel for a distributed architecture. I know you want more info, but it's an ongoing research project at my university and I don't think I am entitled to give you details. I can share it with you once we are done, with my professor's permission. Your blog was a great first step to understanding how the current exception handling framework is implemented, so thank you again.

IsmailShaheen · 2020-02-29T01:35:32Z

Well, that's it. v12 now finally works on an x86_64 architecture. We will be doing some cleanup and commenting, but you can review the changes now if you want. I have also written something about the problem and I would love to share our findings with you as well, if you could send me your contact info.

nicolasbrailo · 2020-02-29T11:25:28Z

@IsmailShaheen this is awesome work, thank you so much! Would love to follow up on this with your group. If you'd like, we can get in contact by email and maybe even plan a video call if you think it'd help. I believe I'm already in contact with different people from your group via LinkedIn, email and even Facebook, but feel free to reach out to my gmail account, nicolasbrailo, anytime.

Thanks again!

iamkroot · 2022-02-07T12:29:37Z

Thanks so much @IsmailShaheen for creating this PR! I was stuck on exactly the same problem of an invalid type info ptr.
I was able to get it working by using your get_ttype_entry function to parse the info.

Also, thanks @nicolasbrailo for creating the excellent tutorial series! It would be great if this PR got merged, so that at least the main code repo would be correct (even if the blog posts aren't fully updated). Would've saved me about a day's worth of cursing at gdb :)

first draft for decoding function

980b266

IsmailShaheen changed the title ~~first draft for decoding function~~ uleb128 decoding Feb 11, 2020

LSDA call site is read correctly with uleb128 decoding

1c73a37

table type is accessed correctly

d0c04ce

abi_v08 fixed!

3141127

IsmailShaheen and others added 10 commits February 22, 2020 02:21

simplified get_ttype_entry method

4aa611d

dec_uleb128 changed for compatibility

f265e96

explaining read_uleb128

ba1dc84

Merge branch 'ULEB128_Decoding' of https://github.com/IsmailShaheen/c…

aa04207

…pp_exception_handling_abi into ULEB128_Decoding

bug fixed in v09 __cxa_throw

cd0f3a5

Merge branch 'ULEB128_Decoding' of https://github.com/IsmailShaheen/c…

1c9f2aa

…pp_exception_handling_abi into ULEB128_Decoding

v09 works

10d26e2

v10 works

73fb238

v11 works

6dcebc1

v12 works

ea02768

catch all fixed

347e1fa

IsmailShaheen and others added 3 commits March 3, 2020 15:29

commenting

eab0e8e

ready for review

ab84f28

typos

0de7d23
