make literal collection more precise #247
Comments
No concerns.
Why can it increase the chances of matching the underlying structure of the program? I think we should ignore the exact type in the program. This means that if we have, say, int16(42), we should consider that we actually have all of int64(42), int32(42), int16(42) and int8(42). It means there is little point in storing more than one version of 42 in the file. What am I missing?
I don't understand what this means. Can you expand or give an example?
Sounds good to me. This significantly increases the number of literals, but that's ok.
I think we should not put them all into the file. There is no point. We should just apply transformations at runtime as if they all are there.
I mean that the set of transformations we apply to a literal should not depend on the spelled type of the literal. So int8(42), int64(42) and "42" should be transformed the same way.
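To make the "apply transformations at runtime" idea concrete, here is a rough sketch of expanding one stored value into every width that can hold it, plus its decimal string form. The function name `variants` and the choice of little-endian output are assumptions for illustration, not go-fuzz code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strconv"
)

// variants is a hypothetical helper: it expands one stored value into the
// spellings a mutator might try, ignoring the type the literal had in the
// source program.
func variants(v int64) [][]byte {
	var out [][]byte
	for _, width := range []int{1, 2, 4, 8} {
		// Skip widths whose signed range cannot hold v.
		if width < 8 {
			limit := int64(1) << (8*width - 1)
			if v < -limit || v >= limit {
				continue
			}
		}
		// Two's complement, little-endian, truncated to the target width.
		buf := make([]byte, 8)
		binary.LittleEndian.PutUint64(buf, uint64(v))
		out = append(out, buf[:width])
	}
	// The decimal text form is treated as one more variant of the same value.
	return append(out, []byte(strconv.FormatInt(v, 10)))
}

func main() {
	for _, b := range variants(42) {
		fmt.Printf("%q\n", b)
	}
}
```

With this shape, the file only needs a single 42, and the width variants are derived when the fuzzer runs.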
I am not sure if this literal collection is a good idea at all. |
I have work in progress improving literal collection. This issue is to discuss design decisions in advance of sending PRs.
The current design converts int literals to strings during go-fuzz-build. I'd like to change that, so that the metadata json contains strings and ints, and do the int-to-string conversion lazily on the go-fuzz side. This gives us flexibility about encodings (little-endian, big-endian, varint, ascii, hex) without having to decode and re-encode. Step one would be no behavioral changes but simply moving the conversion. Thoughts or concerns?
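As a rough illustration of what lazy conversion on the go-fuzz side could look like, the sketch below takes an integer as it might be read from the metadata and produces the encodings listed above on demand. The function name `encodings` and the map keys are made up for this example; only the `encoding/binary` and `strconv` calls are real standard-library APIs.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strconv"
)

// encodings is a hypothetical helper that returns several byte-level
// spellings of one integer literal. The set of encodings is illustrative.
func encodings(v uint64) map[string][]byte {
	le := make([]byte, 8)
	binary.LittleEndian.PutUint64(le, v)

	be := make([]byte, 8)
	binary.BigEndian.PutUint64(be, v)

	vi := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(vi, v)

	return map[string][]byte{
		"le64":   le,
		"be64":   be,
		"varint": vi[:n],
		"ascii":  []byte(strconv.FormatUint(v, 10)),
		"hex":    []byte(strconv.FormatUint(v, 16)),
	}
}

func main() {
	for name, b := range encodings(300) {
		fmt.Printf("%-6s %q\n", name, b)
	}
}
```

Because the integer stays an integer until this point, adding another encoding later would not require decoding a string first.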
The current design encodes ints in the smallest number of bytes possible. Thus a uint64 with value 1 gets encoded as a uint8. Now that we use go/packages, we have type information available, so we could encode that 1 as a uint64. Is that preferable? It might mean having multiple 1s of various widths, but it might also increase the chance of matching the underlying structure of the program. It would also mean having to track more precise type in the metadata.
That's a start. I may add questions as I work on the PRs.
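For reference, the difference between the two width choices can be sketched as follows. The helper names `minimalWidth` and `encodeLE` are invented for this example, and little-endian is an arbitrary pick; this is not the existing go-fuzz-build code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// minimalWidth returns the fewest bytes (1, 2, 4, or 8) that can hold v.
// Hypothetical helper mirroring the "smallest number of bytes" behavior.
func minimalWidth(v uint64) int {
	switch {
	case v <= 0xff:
		return 1
	case v <= 0xffff:
		return 2
	case v <= 0xffffffff:
		return 4
	}
	return 8
}

// encodeLE writes v little-endian, truncated to width bytes.
func encodeLE(v uint64, width int) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, v)
	return buf[:width]
}

func main() {
	const v = 1 // imagine this literal has type uint64 in the fuzzed program
	fmt.Printf("minimal:  % x\n", encodeLE(v, minimalWidth(v)))
	fmt.Printf("declared: % x\n", encodeLE(v, 8))
}
```

The declared-width form is what type information would make possible; the minimal form is what you get without it.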
cc @dvyukov