You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While doing type translation and type casting C expressions I ran into a lot of trouble with different semantics of operations between Z3, LLVM IR and C. For example, C allows numeric costants to have types int, long and long long in their signed and unsigned versions. LLVM IR routinely contains numeric constants of i1 and i8 types, which would naturally map to C types like char. Another example would be the conflation of C pointers and integers into bitvector sorts when Z3 is involved. It's impossible to tell if a 64 bits wide Z3_BV_SORT is a char* or long long.
The issue becomes even more complex when typing of expressions is involved. LLVM IR has every instruction (a value) explicitly typed and this type can differ from the what the result of an equivalent C expression would be.
My proposal would be to only directly translate variable and constant types between representations (Z3, IR, C). Expression types would inferred using the type semantics of the given representation without referring to the expression types of any other representations. However the result types of expressions should correspond between the representations.
For example if an IR and i8 %a, 1, where %a is an i32, yields an i8. The equivalent C expression must yield an i8 equivalent type, namely unsigned char. So the equivalent C expression would be (unsigned char)(a & 1U).
The correspondence check can be implemented using gtest / gflag CHECK() macros.
The text was updated successfully, but these errors were encountered:
I think that you should pre-define a bunch of "built-in" types / typedefs, e.g. int8_t, uint8_t, etc. When deciding on the signedness of a variable, I'd inspect the graph of all like-typed llvm uses, uses-of-uses, etc. and try to find the most "popular" interpretation, based on observed operations, then use that as the high-level type, and where an oppositely-signed operation is done, I'd inject casts and stuff. Not sure if this helps :-P
Also, I'd be wary of assuming pointers are always 64 bits in with, be sure to check the datalayout.
While doing type translation and type casting C expressions I ran into a lot of trouble with different semantics of operations between Z3, LLVM IR and C. For example, C allows numeric costants to have types
int
,long
andlong long
in their signed and unsigned versions. LLVM IR routinely contains numeric constants ofi1
andi8
types, which would naturally map to C types likechar
. Another example would be the conflation of C pointers and integers into bitvector sorts when Z3 is involved. It's impossible to tell if a 64 bits wideZ3_BV_SORT
is achar*
orlong long
.The issue becomes even more complex when typing of expressions is involved. LLVM IR has every instruction (a value) explicitly typed and this type can differ from the what the result of an equivalent C expression would be.
My proposal would be to only directly translate variable and constant types between representations (Z3, IR, C). Expression types would inferred using the type semantics of the given representation without referring to the expression types of any other representations. However the result types of expressions should correspond between the representations.
For example if an IR
and i8 %a, 1
, where%a
is ani32
, yields ani8
. The equivalent C expression must yield ani8
equivalent type, namelyunsigned char
. So the equivalent C expression would be(unsigned char)(a & 1U)
.The correspondence check can be implemented using gtest / gflag
CHECK()
macros.The text was updated successfully, but these errors were encountered: