Add utility function for converting an address to checksummed string #5067

cairoeth · 2024-06-03T18:14:21Z

Closes #4633

PR Checklist

Tests
Documentation
Changeset entry (run npx changeset add)

changeset-bot · 2024-06-03T18:14:25Z

🦋 Changeset detected

Latest commit: 8f11b32

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
openzeppelin-solidity	Minor


cairoeth · 2024-06-03T18:17:23Z

There's some reused logic here from toHexString, but the main difference is that the 0x prefix is omitted. toHexString adds 0x to the buffer by default, so reusing it here would require removing the prefix from the returned string, which would be extra computation just for the sake of reuse.

ernestognw

I've been trying to come up with a way of reusing the logic of toHexString and I noticed some old behaviors we've kept in the library like providing the expected length of the hex string (which we already know by Math.log256(value) + 1).

Perhaps this PR leads to a broader discussion about how to implement this, but I'd say the simplest option is to rewrite toHexString in the following way:

function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
    if (length < Math.log256(value) + 1) {
        revert StringsInsufficientHexLength(value, length);
    }

    bytes memory buffer = new bytes(2 * length + 2);
    buffer[0] = "0";
    buffer[1] = "x";
    _setHexString(buffer, 2, value);

    return string(buffer);
}

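function _setHexString(bytes memory buffer, uint256 offset, uint256 value) internal pure {
    // Count down with `i > offset` and write to `i - 1`: with `i >= offset` and offset == 0,
    // the unsigned counter would underflow and revert instead of terminating.
    for (uint256 i = buffer.length; i > offset; --i) {
        buffer[i - 1] = HEX_DIGITS[value & 0xf];
        value >>= 4;
    }
}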

This way, we can reuse _setHexString in the new function. Similar to:

function toChecksumHexString(address addr) internal pure returns (string memory) {
    bytes memory lowercase = new bytes(40);
    _setHexString(lowercase, 0, uint160(addr));
    bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));
    ...
}

What do you think?
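For readers following along outside Solidity, the proposal above can be modelled in JavaScript with BigInt. This is only a rough sketch: `byteLength` is a hypothetical stand-in for `Math.log256(value) + 1`, a `RangeError` stands in for the `StringsInsufficientHexLength` revert, and the loop counts down with `i > offset` so that it also terminates when the offset is 0.

```javascript
// Illustrative JavaScript model of the proposed helpers, not the library code.
const HEX_DIGITS = "0123456789abcdef";

// Hypothetical stand-in for Math.log256(value) + 1: bytes needed to hold value.
// Zero is treated as needing one byte.
function byteLength(value) {
  let n = 1;
  while (value >= 256n) {
    value >>= 8n;
    n++;
  }
  return n;
}

// Fill buffer[offset .. buffer.length - 1] with the hex digits of value,
// writing the least significant nibble into the last slot.
function setHexString(buffer, offset, value) {
  for (let i = buffer.length; i > offset; --i) {
    buffer[i - 1] = HEX_DIGITS[Number(value & 0xfn)];
    value >>= 4n;
  }
}

// Fixed-length hex string with a 0x prefix; the length check happens up front.
function toHexString(value, length) {
  if (length < byteLength(value)) {
    throw new RangeError(`insufficient hex length ${length} for value ${value}`);
  }
  const buffer = new Array(2 * length + 2);
  buffer[0] = "0";
  buffer[1] = "x";
  setHexString(buffer, 2, value);
  return buffer.join("");
}
```

Under these assumptions, `toHexString(0x1234n, 2)` yields `"0x1234"`, and calling `setHexString` with offset 0 on a 40-slot buffer gives the unprefixed digits that the checksum function needs.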

cairoeth · 2024-06-03T21:11:39Z

@ernestognw Yes, I was thinking of something like that as well -- should we make _setHexString private, since it's not designed to be used outside the library (mainly because it doesn't perform any checks on value after the operation)?

contracts/utils/Strings.sol

ernestognw · 2024-06-03T22:39:54Z

test/utils/Strings.test.js

There's a Rust implementation of this checksum algorithm in Foundry (as seen in cast), so it should be relatively trivial to open a PR requesting that it be exposed through VM.sol, as was done with Base64.

With that, we can fuzz the implementation, which would be extremely valuable. It's not required for this PR, but it's something to consider given that the changes we're making are somewhat relevant.

Totally -- fuzzing should be used in some of the other utils as well.

Agreed, feel free to open PRs adding fuzzing or Halmos FV to the utils where you think it makes sense.

Co-authored-by: Ernesto García <[email protected]>

ernestognw · 2024-06-03T23:02:32Z

test/utils/Strings.test.js

+  describe('toChecksumHexString address', function () {
+    it('converts a random address', async function () {
+      const addr = '0xa9036907dccae6a1e0033479b12e837e5cf5a02f';
+      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
+    });
+
+    it('converts an address with leading zeros', async function () {
+      const addr = '0x0000e0ca771e21bd00057f54a68c30d400000000';
+      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr)).to.equal(ethers.getAddress(addr));
+    });
+  });


There's a small caveat in the documentation of getAddress:

If address contains both upper-case and lower-case, it is assumed to already be a checksum address and its checksum is validated, and if the address fails its expected checksum an error is thrown.

None of these tests use mixed-case letters, so I'd recommend adding .toLowerCase(), according to the same docs:

If you wish the checksum of address to be ignored, it should be converted to lower-case (i.e. .toLowerCase()) before being passed in.

Even better, let's rewrite these tests:

const addresses = [...]

describe('toChecksumHexString address', function () {
  for (const addr of addresses) {
    it(`converts ${addr}`, async function () {
      expect(await this.mock.getFunction('$toChecksumHexString(address)')(addr.toLowerCase())).to.equal(
        ethers.getAddress(addr),
      );
    });
  }
});

I'm pushing a commit

ernestognw

LGTM, we'll need an approval from @Amxx imo

Amxx · 2024-06-04T09:29:21Z

contracts/utils/Strings.sol

@@ -63,17 +64,15 @@ library Strings {
     * @dev Converts a `uint256` to its ASCII `string` hexadecimal representation with fixed length.
     */
    function toHexString(uint256 value, uint256 length) internal pure returns (string memory) {
-        uint256 localValue = value;
+        if (length < Math.log256(value) + 1) {


Computing the log is quite expensive. Checking at the end is cheaper when everything works fine, which is the case we should optimize for. Any reason not to keep the check at the end?
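To illustrate the alternative being argued for, here is a minimal JavaScript sketch of the "check at the end" approach. It is a model under assumptions rather than the library code: `toHexStringCheckLast` is a hypothetical name and a `RangeError` stands in for the revert. The idea is that once the loop has consumed `2 * length` nibbles, any bits left in the value prove the requested length was too small, so no logarithm is needed.

```javascript
// Illustrative model: validate the length after writing the digits, not before.
const HEX_DIGITS = "0123456789abcdef";

function toHexStringCheckLast(value, length) {
  let remaining = value;
  const buffer = new Array(2 * length + 2);
  buffer[0] = "0";
  buffer[1] = "x";
  // Write 2 * length hex digits from the last slot back to index 2.
  for (let i = 2 * length + 1; i > 1; --i) {
    buffer[i] = HEX_DIGITS[Number(remaining & 0xfn)];
    remaining >>= 4n;
  }
  // Leftover bits mean `length` bytes could not hold the value.
  if (remaining !== 0n) {
    throw new RangeError(`insufficient hex length ${length} for value ${value}`);
  }
  return buffer.join("");
}
```

The trade-off in this sketch is that the failing path does the full loop before rejecting, while the successful path pays only one comparison.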

contracts/utils/Strings.sol

Amxx · 2024-06-04T09:58:21Z

contracts/utils/Strings.sol

+        bytes memory lowercase = new bytes(40);
+        uint160 addrValue = uint160(addr);
+        _unsafeSetHexString(lowercase, 0, addrValue);
+        bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));


You often call abi.encodePacked(x) when x is already a bytes value (here and on line 107).

Used like this, abi.encodePacked is the identity function: the output exactly matches the input, so it is not necessary. What does happen, however, is that memory is allocated for the copy, and there is also a copy loop (or an mcopy if we are lucky).

Anyway, this increases costs and leaks memory, so we should avoid it!

Amxx · 2024-06-04T09:58:49Z

contracts/utils/Strings.sol

+    function toChecksumHexString(address addr) internal pure returns (string memory) {
+        bytes memory lowercase = new bytes(40);
+        uint160 addrValue = uint160(addr);
+        _unsafeSetHexString(lowercase, 0, addrValue);
+        bytes32 hashedAddr = keccak256(abi.encodePacked(lowercase));
+
+        bytes memory buffer = new bytes(42);
+        buffer[0] = "0";
+        buffer[1] = "x";
+        uint160 hashValue = uint160(bytes20(hashedAddr));
+        for (uint256 i = 41; i > 1; --i) {
+            uint8 digit = uint8(addrValue & 0xf);
+            buffer[i] = hashValue & 0xf > 7 ? HEX_DIGITS_UPPERCASE[digit] : HEX_DIGITS[digit];
+            addrValue >>= 4;
+            hashValue >>= 4;
+        }
+        return string(abi.encodePacked(buffer));
+    }


This entire function should be done "in place". Allocating two buffers is a waste.

Amxx · 2024-06-04T10:25:13Z

contracts/utils/Strings.sol

Redesigned toChecksumHexString to avoid double allocation.

Removed the need for _unsafeSetHexString.

Removed the need for HEX_DIGITS_UPPERCASE.

Right, this is much cleaner. Thanks!

Great changes, thanks!

Amxx · 2024-06-04T10:28:54Z

contracts/utils/Strings.sol

+
+        // hash the hex part of buffer (skip length + 2 bytes, length 40)
+        uint256 hashValue;
+        assembly ("memory-safe") {


note: this is safe because we know buffer is 42 bytes long.
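The in-place idea can be sketched in JavaScript as follows. This is an illustration under assumptions, not the merged Solidity: `checksumInPlace` is a hypothetical name, and the hash is injected as `hashHex`, a caller-supplied function that must return the Keccak-256 digest of the 40 ASCII hex bytes as a hex string without a prefix. The sketch shows only the casing step, where a letter is flipped to upper case by XOR with 0x20 whenever the matching hash nibble is greater than 7.

```javascript
// Illustrative model of in-place checksum casing on a single 42-byte buffer.
// buffer: Uint8Array holding the ASCII bytes of "0x" + 40 lowercase hex digits.
// hashHex: injected hash (assumed to be Keccak-256 in real use) that takes the
//          40 hex bytes and returns at least 40 hex characters, no 0x prefix.
function checksumInPlace(buffer, hashHex) {
  if (buffer.length !== 42) {
    throw new RangeError("expected 42 bytes: 0x plus 40 hex digits");
  }
  // Hash only the hex part, skipping the two prefix bytes.
  const digest = hashHex(buffer.subarray(2, 42));
  for (let i = 2; i < 42; ++i) {
    const hashNibble = parseInt(digest[i - 2], 16);
    // ASCII '0'-'9' is 48-57 and 'a'-'f' is 97-102, so only letters exceed 96.
    if (hashNibble > 7 && buffer[i] > 96) {
      buffer[i] ^= 0x20; // 'a'-'f' becomes 'A'-'F'
    }
  }
  return buffer;
}
```

No second buffer is allocated: the lowercase digits are written once and then mutated where needed. Note that the built-in SHA3-256 in common crypto libraries is not the same function as Keccak-256, so a real caller would need to supply a genuine Keccak-256 implementation.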

ernestognw

LGTM again

gitpoap-bot · 2024-06-04T19:43:18Z

Congrats, your important contribution to this open-source project has earned you a GitPOAP!

GitPOAP: 2024 OpenZeppelin Contracts Contributor:

Head to gitpoap.io & connect your GitHub account to mint!

cairoeth added 2 commits June 3, 2024 11:04

add implementation and tests

215452f

add changeset

3ef22ec

cairoeth added this to the 5.1 milestone Jun 3, 2024

cairoeth requested review from Amxx and ernestognw June 3, 2024 18:14

ernestognw reviewed Jun 3, 2024

use shared logic with _setHexString

db76d54

cairoeth requested a review from ernestognw June 3, 2024 21:44

ernestognw reviewed Jun 3, 2024

cairoeth and others added 4 commits June 3, 2024 15:46

add addrValue to simplify

e05ab88

Co-authored-by: Ernesto García <[email protected]>

use addrValue

b0967a8

Co-authored-by: Ernesto García <[email protected]>

Apply PR recommendations

719978b

rename to HEX_DIGITS_UPPERCASE

f2ce027

ernestognw reviewed Jun 3, 2024

Improve tests

ac713f0

ernestognw previously approved these changes Jun 3, 2024

Amxx reviewed Jun 4, 2024

checksum in place to avoid double allocation + simplification

8f11b32

Amxx dismissed ernestognw’s stale review via 8f11b32 June 4, 2024 10:24

Amxx reviewed Jun 4, 2024

Amxx requested a review from ernestognw June 4, 2024 16:22

ernestognw approved these changes Jun 4, 2024

Amxx approved these changes Jun 4, 2024

Amxx merged commit 337bfd5 into OpenZeppelin:master Jun 4, 2024
21 checks passed

RomulousApollo mentioned this pull request Nov 8, 2024

[Snyk] Upgrade @openzeppelin/contracts from 5.0.0 to 5.1.0 RomulousApollo/v3-core#4

Open

RomulousApollo mentioned this pull request Nov 8, 2024

[Snyk] Upgrade @openzeppelin/contracts-upgradeable from 5.0.0 to 5.1.0 RomulousApollo/v3-core#5

Open

This was referenced Nov 9, 2024

[Snyk] Upgrade @openzeppelin/contracts from 5.0.0 to 5.1.0 doperiddle/stl-contracts#5

Merged

[Snyk] Upgrade @openzeppelin/contracts-upgradeable from 5.0.0 to 5.1.0 doperiddle/stl-contracts#7

Closed
