Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add string event integration tests #3497

Merged
merged 5 commits into from
Jun 16, 2020
Merged

Add string event integration tests #3497

merged 5 commits into from
Jun 16, 2020

Conversation

cgewecke
Copy link
Collaborator

@cgewecke cgewecke commented May 4, 2020

Description

Adds integration tests for illegal utf-8 string and wide unicode (emoji) event decoding.

Type of change

  • Tests

Checklist:

  • I have selected the correct base branch.
  • I have performed a self-review of my own code.

@cgewecke cgewecke mentioned this pull request May 4, 2020
13 tasks
}

function firesIllegalUtf8StringEvent() public {
emit StringEvent('�������');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try emit StringEvent(string(0x9f90e274657374)).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cag

solc is not letting me do this (have tried a few variants). Keep getting something along these lines:

TypeError: Explicit type conversion not allowed from "int_const 44913823586808692" 
           to "string memory".
        emit StringEvent(string(0x9f90e274657374));
                         ^----------------------^

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I don't usually use bytes literals. Try emit StringEvent(string(hex"9f90e274657374"))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cag Ah that works but it's detecting that it's invalid. Also tried w/ 0.4.26 and it's the same...

TypeError: Explicit type conversion not allowed from "literal_string (contains invalid UTF-8 sequence 
           at position 0)" to "string memory".
        emit StringEvent(string(hex"9f90e274657374"));
                         ^-------------------------^

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to use this kind of test fwiw, we could also try to trigger it using a raw log, as shown here.

Copy link
Collaborator Author

@cgewecke cgewecke May 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also tried using a raw log style test without success (e.g failure)

@cgewecke cgewecke marked this pull request as ready for review May 5, 2020 22:20
@cgewecke cgewecke added 1.x 1.0 related issues Testing Review Needed Maintainer(s) need to review labels May 5, 2020
@cgewecke
Copy link
Collaborator Author

cgewecke commented May 8, 2020

More utf-8 stuff: #3480

@coveralls
Copy link

coveralls commented May 29, 2020

Coverage Status

Coverage remained the same at 89.855% when pulling 7432e40 on add-string-event-tests into 8ff9d6d on 1.x.

@ryanio
Copy link
Collaborator

ryanio commented Jun 2, 2020

really nice to have these extra tests 👍

were you able to try decoding a malformed utf-8 string from a contract to see how web3 would react? @ricmoo mentioned flipping a top bit of any byte:

Basically, take any UTF-8 string and flip any top bit of any byte. If you flip a continuation byte, the code unit will underrun, if you flip a start-byte, the next continuation byte is invalid as a sequence, and if you flip a non-ASCII bit, the next byte won't be the needed continuation byte...

If ether’s default behavior is to error in this case (but also provides a way to recover the string) I wonder if it would be of benefit to the community for web3.js to have a differing default behavior.

Since we have relaxed strictness around some encoding formats, one mentality could be for web3 try to “fail as little as possible” in decoding so if a user is interacting with a smart contract he doesn’t own and someone has stored or emits bad string data, the code tries to continue to run instead of crashing (if not caught). We could introduce a param to enforce strict validation on decoding strings that is default false.

Easier said than done, but just an idea.

@cgewecke
Copy link
Collaborator Author

cgewecke commented Jun 2, 2020

were you able to try decoding a malformed utf-8 string from a contract to see how web3 would react?

I have tried to do this here, but have been unsure about how to malform the strings. The recipe you suggest is great, will see if I can do that, thanks!

@cgewecke
Copy link
Collaborator Author

@ryanio

Took another look at this and remembered that I got the inputs for these from an w3.org test case set for malformed utf-8. (Apologies - I should have documented this.)

The new tests use case 3.1.8 (7 continuation bytes).

Each unexpected continuation byte should be separately signaled as a         
malformed sequence of its own.                                                |
                                                                              
3.1.1  First continuation byte 0x80: "�"                                      
3.1.2  Last  continuation byte 0xbf: "�"                                      
                                                                              
3.1.3  2 continuation bytes: "��"                                             
3.1.4  3 continuation bytes: "���"                                            
3.1.5  4 continuation bytes: "����"                                           
3.1.6  5 continuation bytes: "�����"                                          
3.1.7  6 continuation bytes: "������"                                         
3.1.8  7 continuation bytes: "�������" 

This seems quite similar to what @ricmoo suggests. Not certain how to proceed here.

Am going to update the file with a note on the test case source for now...

@cgewecke
Copy link
Collaborator Author

This needs a rebase, there are irrelevant changes in the diff....

Copy link
Collaborator

@ryanio ryanio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for investigating and adding the additional tests!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.x 1.0 related issues Review Needed Maintainer(s) need to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants