Add utility function for converting an address to checksummed string #5067
🦋 Changeset detected. Latest commit: 8f11b32. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
There's some re-used logic here from |
I've been trying to come up with a way of reusing the logic of toHexString, and I noticed some old behaviors we've kept in the library, like providing the expected length of the hex string (which we already know from Math.log256(value) + 1).
Perhaps this PR leads to a broader discussion about how to implement this, but I'd say the simplest option is to rewrite toHexString in the following way:
function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
    if (length < Math.log256(value) + 1) {
        revert StringsInsufficientHexLength(value, length);
    }
    bytes memory buffer = new bytes(2 * length + 2);
    buffer[0] = "0";
    buffer[1] = "x";
    _setHexString(buffer, 2, value);
    return string(buffer);
}

function _setHexString(bytes memory buffer, uint256 offset, uint256 value) internal pure {
    // Count down with `i > offset` and write to `i - 1`: with `i >= offset` the
    // final `--i` would underflow (and revert) whenever offset is 0.
    for (uint256 i = buffer.length; i > offset; --i) {
        buffer[i - 1] = HEX_DIGITS[value & 0xf];
        value >>= 4;
    }
}
This way, we can reuse _setHexString in the new function. Similar to:
function toChecksumHexString(address addr) internal pure returns (string memory) {
    bytes memory lowercase = new bytes(40);
    _setHexString(lowercase, 0, uint160(addr));
    bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));
    ...
}
What do you think?
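To make the proposal easier to experiment with outside the EVM, here is a minimal JavaScript model of the two helpers (JavaScript being the language of the test suite). It is a sketch, not the library's code: `byteLength` is a hypothetical stand-in for `Math.log256(value) + 1`, values are BigInts, and the loop counts down with `i > offset` so that an offset of 0 also works.

```javascript
// JavaScript model of the proposed Solidity helpers (illustrative only).
const HEX_DIGITS = '0123456789abcdef';

// Hypothetical stand-in for Math.log256(value) + 1: bytes needed to hold value.
function byteLength(value) {
  let bytes = 1;
  while (value >= 256n) {
    value >>= 8n;
    bytes += 1;
  }
  return bytes;
}

// Fill buffer[offset .. end] from right to left with the low nibbles of value.
function setHexString(buffer, offset, value) {
  for (let i = buffer.length; i > offset; --i) {
    buffer[i - 1] = HEX_DIGITS[Number(value & 0xfn)];
    value >>= 4n;
  }
}

function toHexString(value, length) {
  // Up-front check, as in the proposal: reject before doing any work.
  if (length < byteLength(value)) {
    throw new RangeError(`insufficient hex length ${length} for value ${value}`);
  }
  const buffer = new Array(2 * length + 2);
  buffer[0] = '0';
  buffer[1] = 'x';
  setHexString(buffer, 2, value);
  return buffer.join('');
}

console.log(toHexString(255n, 1)); // 0xff
console.log(toHexString(1n, 2)); // 0x0001
```

Because `setHexString` takes the offset as a parameter, the same helper can fill a prefixed buffer (offset 2) or a bare 40-character buffer (offset 0).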
@ernestognw Yes, I was thinking something like that as well -- should we make |
There's a Rust implementation of this checksum algorithm in Foundry (as seen in cast), so it should be relatively trivial to open a PR requesting that it be exposed through VM.sol, as was done with Base64.
With that, we could fuzz the implementation, which would be extremely valuable.
It's not required for this PR, but it's something to consider given that the changes we're making are somewhat related.
totally -- fuzzing should be used in some of the other utils as well.
Agreed, feel free to open PRs adding fuzzing or Halmos FV to whichever utils you think would benefit.
Co-authored-by: Ernesto García <[email protected]>
test/utils/Strings.test.js
Outdated
describe('toChecksumHexString address', function () {
  it('converts a random address', async function () {
    const addr = '0xa9036907dccae6a1e0033479b12e837e5cf5a02f';
    expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
  });

  it('converts an address with leading zeros', async function () {
    const addr = '0x0000e0ca771e21bd00057f54a68c30d400000000';
    expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
  });
});
There's a small caveat in the documentation of getAddress:
  "If %%address%% contains both upper-case and lower-case, it is assumed to already be a checksum address and its checksum is validated, and if the address fails its expected checksum an error is thrown."
None of these tests use mixed-case letters, so I'd recommend adding .toLowerCase(), according to the same docs:
  "If you wish the checksum of %%address%% to be ignore, it should be converted to lower-case (i.e. .toLowercase()) before being passed in."
Even better, let's rewrite these tests:
const addresses = [...]
describe('toChecksumHexString address', function () {
  for (const addr of addresses) {
    it(`converts ${addr}`, async function () {
      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr.toLowerCase())).to.equal(
        ethers.getAddress(addr),
      );
    });
  }
});
I'm pushing a commit
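As a rough illustration of the condition described in the quoted documentation, the sketch below checks whether an address string mixes upper- and lower-case hex letters. `hasMixedCase` is a hypothetical helper written for this note, not an ethers function, and it only models the stated trigger for checksum validation, not the validation itself.

```javascript
// Hypothetical helper (not part of ethers): true when the hex part of an
// address contains both lower-case and upper-case letters, which is the case
// in which the quoted docs say the checksum is validated.
function hasMixedCase(address) {
  const hex = address.slice(2); // drop the "0x" prefix
  return /[a-f]/.test(hex) && /[A-F]/.test(hex);
}

// The two fixtures from the original tests are entirely lower-case.
const fixtures = [
  '0xa9036907dccae6a1e0033479b12e837e5cf5a02f',
  '0x0000e0ca771e21bd00057f54a68c30d400000000',
];

for (const addr of fixtures) {
  console.log(addr, hasMixedCase(addr)); // false for both
}

// Lower-casing any fixture guarantees the mixed-case branch is never taken,
// whatever casing the fixture list uses.
console.log(hasMixedCase('0xAbCdEf0000000000000000000000000000000000'.toLowerCase())); // false
```

Calling `.toLowerCase()` on the contract input, as in the rewritten tests, keeps the input free of checksum casing even if the fixture list later gains checksummed entries.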
LGTM, we'll need an approval from @Amxx imo
contracts/utils/Strings.sol
Outdated
@@ -63,17 +64,15 @@ library Strings {
     * @dev Converts a `uint256` to its ASCII `string` hexadecimal representation with fixed length.
     */
    function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
        uint256 localValue = value;
        if (length < Math.log256(value) + 1) {
Computing the log is quite expensive. Checking at the end is cheaper when everything works fine, which is the case we should optimize for. Any reason not to keep it?
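The check-at-the-end approach can be modelled in JavaScript as follows. This is a sketch under stated assumptions rather than the library's code: `toHexStringCheckAtEnd` is a hypothetical name, values are BigInts, and a `RangeError` stands in for the Solidity revert. It shows the logic only; it cannot show the gas difference the comment is about.

```javascript
// JavaScript model of a fixed-length hex conversion that validates the
// length after the loop instead of computing a logarithm up front.
const HEX_DIGITS = '0123456789abcdef';

function toHexStringCheckAtEnd(value, length) {
  let localValue = value;
  const buffer = new Array(2 * length + 2);
  buffer[0] = '0';
  buffer[1] = 'x';
  // Write nibbles from the last position down to index 2.
  for (let i = 2 * length + 1; i > 1; --i) {
    buffer[i] = HEX_DIGITS[Number(localValue & 0xfn)];
    localValue >>= 4n;
  }
  // Any bits still left did not fit into `length` bytes.
  if (localValue !== 0n) {
    throw new RangeError(`insufficient hex length ${length} for value ${value}`);
  }
  return buffer.join('');
}

console.log(toHexStringCheckAtEnd(0xabcn, 2)); // 0x0abc
```

On the success path the only extra work is one comparison against zero after the loop; the cost of the failed conversion is paid only when the call is going to fail anyway.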
contracts/utils/Strings.sol
Outdated
bytes memory lowercase = new bytes(40);
uint160 addrValue = uint160(addr);
_unsafeSetHexString(lowercase, 0, addrValue);
bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));
You often do abi.encodePacked(x) when x is already a bytes (here and on line 107).
Used like this, abi.encodePacked is the identity function: the output exactly matches the input, so it is not necessary. What does happen, however, is that memory is allocated for the copy, and there is also a copy loop (or an mcopy if we are lucky).
Either way, this increases costs and leaks memory, so we should avoid it!
contracts/utils/Strings.sol
Outdated
function toChecksumHexString(address addr) internal pure returns (string memory) {
    bytes memory lowercase = new bytes(40);
    uint160 addrValue = uint160(addr);
    _unsafeSetHexString(lowercase, 0, addrValue);
    bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));

    bytes memory buffer = new bytes(42);
    buffer[0] = "0";
    buffer[1] = "x";
    uint160 hashValue = uint160(bytes20(hashedAddr));
    for (uint256 i = 41; i > 1; --i) {
        uint8 digit = uint8(addrValue & 0xf);
        buffer[i] = hashValue & 0xf > 7 ? HEX_DIGITS_UPPERCASE[digit] : HEX_DIGITS[digit];
        addrValue >>= 4;
        hashValue >>= 4;
    }
    return string(abi.encodePacked(buffer));
}
This entire function should be done "in place". Allocating two buffers is a waste.
Redesigned toChecksumHexString to avoid double allocation.
- removed the need for _unsafeSetHexString
- removed the need for HEX_DIGITS_UPPERCASE
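The redesigned function body is not shown in this thread, so the following is only a guess at how an in-place version can avoid both a second buffer and an upper-case lookup table: write the lower-case hex once, then flip individual letters to upper case by toggling one ASCII bit. The sketch is a JavaScript model; `applyChecksumCase` is a hypothetical name, and the hash is passed in as a hex string (a stand-in for keccak256 of the 40 hex characters) to keep the example dependency-free.

```javascript
// Flip hex letters to upper case in place, guided by the nibbles of a hash.
// buffer:  Uint8Array holding 42 ASCII bytes ("0x" + 40 lower-case hex digits)
// hashHex: 64-character hex string standing in for the keccak256 digest
function applyChecksumCase(buffer, hashHex) {
  for (let i = 2; i < 42; ++i) {
    const nibble = parseInt(hashHex[i - 2], 16);
    // ASCII 'a'..'f' are 97..102; digits '0'..'9' are 48..57 and must not change.
    if (nibble > 7 && buffer[i] > 96) {
      buffer[i] ^= 0x20; // toggling bit 5 maps 'a'..'f' to 'A'..'F'
    }
  }
  return buffer;
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();
const lower = '0xa9036907dccae6a1e0033479b12e837e5cf5a02f';

// Synthetic hashes (not real digests) that exercise both branches.
const allHigh = 'f'.repeat(64);
const allLow = '0'.repeat(64);

console.log(decoder.decode(applyChecksumCase(encoder.encode(lower), allHigh)));
// 0xA9036907DCCAE6A1E0033479B12E837E5CF5A02F
console.log(decoder.decode(applyChecksumCase(encoder.encode(lower), allLow)));
// 0xa9036907dccae6a1e0033479b12e837e5cf5a02f
```

The `buffer[i] > 96` guard is what makes a second alphabet unnecessary: digits are left alone, and letters need only a single XOR.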
Right, this is much cleaner. Thanks!
great changes, thxs!
// hash the hex part of buffer (skip length + 2 bytes, length 40)
uint256 hashValue;
assembly ("memory-safe") {
note: this is safe because we know buffer is 42 bytes long.
LGTM again
Closes #4633