Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String to bytes32 ? padZeros adds padding to front instead of back. #207

Closed
kyriediculous opened this issue Jun 19, 2018 · 16 comments
Closed
Labels
discussion Questions, feedback and general information.

Comments

@kyriediculous
Copy link

So this kind of resembles the last issue I posted.
How can I go from a string to solidity bytes32 value?

I'm trying utils.hexlify(utils.padZeros(utils.toUtf8Bytes('Test'), 32)) , but it seems to add the in front rather than at the end.
AssertionError: Names should be equal: expected '0x5465737400000000000000000000000000000000000000000000000000000000' to equal '0x0000000000000000000000000000000000000000000000000000000054657374'

Should I use utils.keccak256() instead to encode my parameters on both read and write contract calls?

@ricmoo ricmoo self-assigned this Jun 20, 2018
@ricmoo ricmoo added the discussion Questions, feedback and general information. label Jun 20, 2018
@ricmoo ricmoo removed their assignment Jun 20, 2018
@ricmoo
Copy link
Member

ricmoo commented Jun 20, 2018

You should be sure to check out this issue: #66

Basically, the ability for Web3 to treat strings as bytes32 and vice versa is at best a bug, at worst, irresponsible. :)

Stings and bytes32 are completely incompatible, however, Web3 supported special cases to make them sort of behave together, but leads to ambiguous output.

The above issue includes sample code on how to do this, if this is really what you want to do, or have to be backwards compatible with an antiquated Web3 application, but I strongly recommend against it, as it is really relying unsafe operations.

With this method, you string must be less than 32 bytes long (not characters... For example, it must be less than 10 Japanese characters long, or depending on the words and composed form between 10 and 32 characters in German).

This method also requires the application to understand implicit padding has occurred. The library cannot do this, since it cannot know whether it was a string or binary data.

Consider the data "0x303132330000", passed into a library, it won't know if it represents a) a 14 character long literal string, b) the padded string "1234" or c) 4 bytes of data.

The best thing to do, if you truly need strings is to use the Solidity string type, which is a dynamic type, with a length and all that jazz. If you can get by with hashed strings, that is also a great option, but if you only have a small set of strings to choose from, perhaps a simple enum makes sense?

Hopefully that makes sense. I'll leave the ticket open though, so we can discuss it further if there are more questions. :)

@kyriediculous
Copy link
Author

kyriediculous commented Jun 20, 2018

Hey thanks for the answer.

One thing I don't understand is why is it a bug when solidity has the same behaviour?

When you enter a string literal in a contract that is supposed to be a bytes32 value , it will automatically convert the value. Web3 doing this as well thus doesn't really seem like a bug but intended behaviour?

Despite it being a bug I actually found it to be useful behaviour.

I can't use string types because they are still restricted to use as keys in a mapping due to being dynamic. Another reason for never using strings in solidity and encoding/decoding everything client side would be gas cost.

Example:

A proxy contract that gets the address of the library to delegate to from a registry through a string.
The string however is a bytes32 value in reality.

pragma solidity ^0.4.23;
/**
 * @title Proxy
 * @dev Gives the possibility to delegate any call to a foreign implementation.
 */
 import '../registry/contract-registry.sol';
 import '../registry/enabled.sol';

contract PRProxy is Enabled {
  /**
  * @dev Fallback function allowing to perform a delegatecall to the given implementation.
  * This function will return whatever the implementation call returns
  */
  function () payable public isEnabled("controller") {
    ContractRegistry contractRegistry = ContractRegistry(CMC);
    address _impl = contractRegistry.getLibrary("pullrequests"); //THIS LINE <<<<<<<<<<<<
    require(_impl != address(0));

    assembly {
      let ptr := mload(0x40)
      calldatacopy(ptr, 0, calldatasize)
      let result := delegatecall(gas, _impl, ptr, calldatasize, 0, 0)
      let size := returndatasize
      returndatacopy(ptr, 0, size)

      switch result
      case 0 { revert(ptr, size) }
      default { return(ptr, size) }
    }
  }
}

This is the relevant code in the registry contract:

  mapping(bytes32 => address) public libraries;

  function getLibrary(bytes32 _name) external view returns (address) {
    return libraries[_name];
  }

@kyriediculous
Copy link
Author

kyriediculous commented Jun 20, 2018

I took the things you mentioned in the referenced topic into consideration.

Something that isn't clear to me (I'm trying this out next). Is what happends when I send web3.utils.fromAscii("test") as a bytes32 parameter to a contract. Will the EVM automatically add the padding for 32 bytes itself, or throw due to not receiving a bytes32 Length value?

edit: the EVM will add the padding so my question is quite irrelevent and I can just use fromAscii().

There are currently only two types of strings I store: short strings and ipfs hashes

For the short strings I will use this function on the client side:

function stringToBytes32(text) {
    let result = utils.toUtf8Bytes(text)
    if (result.length > 32) { throw new Error('String too long') }
    result = utils.hexlify(result);
    while (result.length < 66) { result += '0'; }
    if (result.length !== 66) { throw new Error("invalid web3 implicit bytes32"); }
    return result;
}

For IPFS Hash conversion I will be using base58

  // Return bytes32 hex string from base58 encoded ipfs hash,
  // stripping leading 2 bytes from 34 byte IPFS hash
  // Assume IPFS defaults: function:0x12=sha2, size:0x20=256 bits
  // E.g. "QmNSUYVKDSvPUnRLKmuxk9diJ6yS96r1TrAXzjTiBcCLAL" -->
  // "0x017dfd85d4f6cb4dcd715a88101f7b1f06cd1e009b2327a0809d01eb9c91f231"
  getBytes32FromIpfsHash(ipfsListing) {
    return (
      "0x" +
      bs58
        .decode(ipfsListing)
        .slice(2)
        .toString("hex")
    )
  }

  // Return base58 encoded ipfs hash from bytes32 hex string,
  // E.g. "0x017dfd85d4f6cb4dcd715a88101f7b1f06cd1e009b2327a0809d01eb9c91f231"
  // --> "QmNSUYVKDSvPUnRLKmuxk9diJ6yS96r1TrAXzjTiBcCLAL"
  getIpfsHashFromBytes32(bytes32Hex) {
    // Add our default ipfs values for first 2 bytes:
    // function:0x12=sha2, size:0x20=256 bits
    // and cut off leading "0x"
    const hashHex = "1220" + bytes32Hex.slice(2)
    const hashBytes = Buffer.from(hashHex, "hex")
    const hashStr = bs58.encode(hashBytes)
    return hashStr
}

One thing I did notice is that web3 has both a padLeft and padRight function.
https://web3js.readthedocs.io/en/1.0/web3-utils.html#padright

@ricmoo
Copy link
Member

ricmoo commented Jun 23, 2018

Ahoy hoy! :)

It's not the EVM that adds the padding, it is the ABI coder, which processes the data before sending it to the blockchain. The final data sent will always be word aligned (32 byte words) with a 4-byte signature. The EVM doesn't actually care about this, but all the current tools enforce this, since the compiled Solidity makes code that runs on the EVM that expects this. The EVM never pads, Solidity (or whatever language you use) builds any runtime code that pads (or masks); the exception being when you read past memory, calldata, et cetera.

So, regarding Solidity performing the same behaviour (which I still argue occasionally over ;)) is not exactly what it does. There are two things to note: a) Solidity can only go in one direction (pass a string into bytes32 and it will "comply", but try passing a bytes32 into a string and the world will end) and b) Solidity is a statically typed language.

a) More on the one-direction of this in Solidity has to do with the fact the types aren't compatible, but it can sort of fake it, by calling the function passing the add(data, 32) (the pointer to the string data, after the string length) and the receiving function will mask the value as needed. Passing a bytes32 into a string though, cannot be done easily and would almost certainly fail to be a valid UTF-8 string, most binary data isn't.

b) This is the biggest difference between a language like JavaScript and Solidity. By being a dynamic language, and one in which strings are often used to represent hexadecimal encoded strings (the alternative makes using the library nearly impossible; using a Binary wrapper object), means that allowing strings in places where binary data is allowed is unsafe. Web3 does this "magic", and during audits I've seen many potential security and usability problems as a result.

// In Solidity, this is exactly 2 bytes, always
0x1234

// In Solidity, this is exactly 6 bytes, always
"0x1234"

// In JavaScript? 2 bytes of data or 6 bytes of string?
"0x1234"

So, for your example, if you are hard-coding strings into the call, "pullrequests", it is not uncommon to hardcode the hash sha3("pullrequests") as a constant. This save gas costs when the key is know (for example at compile time), but still allows for libraries to dynamically lookup values if necessary.

I believe a bytes32 would also be cheaper on gas than a short string, since masking is required in the latter case. Also, if there are only a handful of fixed values, enums might also help.

bytes32 constant pullrequests = 0x5a240ef2e352806693493f8b2318e232cb084a2f64dd84f24d69d7b249731fe6;

function getLibrary(string name) returns (address) {
    return getLibrary(keccak256(name));
}

function getLibrary(bytes32 nameHash) (address) {
    return _libs[nameHash];
}

That said, I don't think that is even necessary, as I think all storage lookups execute a keccak256 anyways, in which case it is not (meaningly) more expensive to use a string or a bytes32. But I'm not sure what the current compiler does.

For IPFS multihash, I absolutely agree, storing the bottom 32 bytes of raw hash is the way to go. I think it is ridiculous they made a 34 byte hash size... I think it would have made far more sense to keep the top byte as a version, and select just the last 31-bytes of the sha2 for the rest... They could always bump it up to a 2-byte + 30-byte in the future (just make the first by outside the range) and everyone would be happy... I mean, 20 bytes is enough for crypto-currency addresses... Le sigh.

Hope this helps. :)

@ricmoo
Copy link
Member

ricmoo commented Jul 9, 2018

Closing this, but if there are still further discussion points, please feel free to re-open. :)

@ricmoo ricmoo closed this as completed Jul 9, 2018
@o0ragman0o
Copy link

I'd like this issue reopened as I don't think proper consideration has been given to the extent by which string to bytes32 coercion is in use. It has always been a feature of Web3 as a reflection of Solidity's value assignment and parameterisation conversions. It has thereby been the accepted practice for the life of Solidity itself so it makes no sense to diverge from it.

The concerns raised against the coercion amount to little more than requiring additional bounds checking in the library which doesn't seem like such a hard task.

Practically, it has made Remix near impossible to use for such parameters (which are fundamental to my project).

@ricmoo
Copy link
Member

ricmoo commented Sep 6, 2018

I have added safe operations to v4 for this, ethers.utils.formatBytes32String and ethers.utils.parseBytes32String.

There are a lot of nuances required to make the coercion safe, but I’ve updated the UTF8 library in the v4 branch to correctly detect (and fail on) overlong sequences, invalid UTF16 surrogate pairs and provide bounds checking on character ranges.

You can install v4 using npm install ethers@next. The documentation is almost ready, at which time v4 will be the primary release.

@RobertoSnap
Copy link

Created a little GUI for this here: https://blockchangers.github.io/solidity-converter-online/
I see there is some questions around this.

@v-stickykeys
Copy link

When I input a 16byte hexcode string into ethers.utils.formatBytes32String I get an error trying to use it as a bytes32 value. When I use ethers.utils.toUtf8Bytes it works as expected. Anyone know if this is an incorrect approach?

@ricmoo
Copy link
Member

ricmoo commented Sep 14, 2020

@valmack

What are you trying to do? A 16 byte value will be 34 bytes long and a bytes32 strong may only be 31 or fewer bytes. You could either store in as a bytes16 or as a string, if you are trying to pass it into a contract.

Make sense?

@v-stickykeys
Copy link

v-stickykeys commented Sep 15, 2020

@ricmoo

What are you trying to do? A 16 byte value will be 34 bytes long and a bytes32 strong may only be 31 or fewer bytes. You could either store in as a bytes16 or as a string, if you are trying to pass it into a contract.

Make sense?

I was generating a 256 bit hash which I thought would be 32 bytes long and 16 bytes in hexcode format. But putting that hexcode string into ethers.utils.formatBytes32String produced a value longer than 32 bytes.

But based on what you are saying 16 bytes hexcode is 34 bytes long. So can I use bytes16 solidity param for a 256 bit hash in hexcode format?

Sorry this is a bit off-topic from the issue but appreciate the help anyway.

@sugalivijaychari
Copy link

My solidity Code:
<
function execute(bytes32 txHash,uint8[] memory sigV, bytes32[] memory sigR, bytes32[] memory sigS)returns(address[] memory){
address[] public recoveredAddr;
for(uint i=0; i<sigV.length; i++){
address recovered = ecrecover(txHash, sigV[i], sigR[i], sigS[i]);
recoveredAddr.push(recovered);
}
return recoveredAddr;
}

My test.js code:
Here txHash is same(calculated in both smart contract and node.js script) and pk is private keys of accounts which we recover by invoking execute function.
<
let sigV=[],sigR=[],sigS=[];
var signature = web3.eth.accounts.sign( txHash , pk );
sigV.push(web3.utils.hexToBytes(signature.v)[0]);
sigR.push(signature.r);
sigS.push(signature.v);
const result = await solidityContract.methods.execute(txHash,sigV, sigR, sigS).send({
from: accounts[0],
gas:3000000
});
console.log(result);

Now, result is different with actual accounts.
By researching, I came to know that console.log(typeof(signature.r)) is string but not bytes32.
console.log(typeof(sigR)) is object but not array.

As signature.r or signature.s have length with 66 which is greater than 32. How can we convert 66 characters length of string to bytes32 and pass to execute function to achieve same result i.e, recovered accounts are same to actual accounts where we sign.

@ricmoo
Copy link
Member

ricmoo commented Mar 25, 2021

@sugalivijaychari In ethers, a hex data string that is 66 bytes long (i.e. 0x prefixed with 64 nibbles, so the 66 characters represents 32 bytes) can be used as a bytes32 in most cases.

My guess is that the sign method you are using in Web3 may not be using EIP-191 prefixing? But I'm not sure. You can experiment with the signDigest method to see if that answer corresponds to the address you are getting? Keep in mind, this function is not generally available in Etheruem as its use in signing messages can result in security vulnerabilities.

Check out these examples to see if that helps get you on track. :)

@sugalivijaychari
Copy link

sugalivijaychari commented Mar 25, 2021

@ricmoo The sign method is working perfectly alright.
I have tested using remix.
In remix - When I give sigR and sigS in ["0xfds...","0xgfhghk..."] this format, I am able to recover deserved addresses.
In remix - When I give sigR and sigS in ['0xfds...','0xgfhghk...'] this format, I am unable to recover deserved addresses.

so, I am converting ['0xkjhjkvhjl...','0xkjhgdff...'] to ["0xfds...","0xgfhghk..."] format to call solidityContract.methods.execute function, but here I am getting error as {TypeError: param.map is not a function
at ABICoder.formatParam (node_modules/web3-eth-abi/lib/index.js:218:22)
} probably, when I checked it is taking my conversion as '["0xfds...","0xgfhghk..."]' format. So, when I tried to give ['0xkjhjkvhjl...','0xkjhgdff...'] this format to solidityContract.methods.execute function - i am unable to recover desired accounts.

I think the array is always considered as object but not array. that too, bytes32 is considered as string. So, here how do I convert
object -
[
'0xceb1aaf5c07b59a1fa4fbc1477e0f7a8f38f542b4f60569c22d055e8bf675ba4',
'0xbf2cf385802b4ff56b09af616ddbff5bfffde42698b386a305e02e6ad5f8873a'
] format to

[ "0xceb1aaf5c07b59a1fa4fbc1477e0f7a8f38f542b4f60569c22d055e8bf675ba4", "0xbf2cf385802b4ff56b09af616ddbff5bfffde42698b386a305e02e6ad5f8873a"] - this format as this format got executed in remix.

Note that conversion doesn't considered as -
'[ "0xceb1aaf5c07b59a1fa4fbc1477e0f7a8f38f542b4f60569c22d055e8bf675ba4", "0xbf2cf385802b4ff56b09af616ddbff5bfffde42698b386a305e02e6ad5f8873a"]' - this format as it gives param.map error as mentioned above.

@sugalivijaychari
Copy link

@ricmoo, please re-open the issue as above mentioned problem is not solved.

@ricmoo
Copy link
Member

ricmoo commented Mar 29, 2021

@sugalivijaychari I don’t think this is the issue you mean though. A Bytes32String is quite unrelated to bytes32 in general.

The methods involved here are strictly for converting between legacy contracts (written pre-2015-ish; some new age contracts do this too, but it is heavily discouraged) which encoded UTF-8 strings as null-terminated and zero-padded bytes32 BytesLike objects and the equivalent UTF-8 code points converted to a string.

I don’t think these are the methods you are looking for…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Questions, feedback and general information.
Projects
None yet
Development

No branches or pull requests

6 participants