[framework] Implements ascii + utf8 strings all over again #18462

Merged: 9 commits, Jun 29, 2024

Contributor

@damirka damirka commented Jun 28, 2024

Description

This PR rolls #17380 (successfully merged once) yet again.

Test plan

Features tests.

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol: This change updates the ascii module in the following ways:

    Adds new methods to std::ascii:
    - ascii::append(&mut String, String)
    - ascii::is_empty(&String): bool
    - ascii::substring(&String, i, j): String
    - ascii::index_of(&String, &String): u64
    - ascii::to_uppercase(&String): String
    - ascii::to_lowercase(&String): String

    These additions make the ASCII interface more similar to the UTF8 one.

    Renames:
    - string::bytes() to string::as_bytes()
    - string::sub_string() to string::substring()

    Deprecates:
    - string::sub_string in favour of string::substring
    - string::bytes in favour of string::as_bytes

    Additional changes:
    - updates std::type_name to use std::substring
    - removes use statements for implicit imports
    - renames constants from E_INDEX to conventional EIndexOutOfBounds
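
To make the release notes concrete, here is a minimal sketch of how the new `std::ascii` methods might be exercised from a unit test. It assumes the Move 2024 edition (method syntax, `let mut`) and a hypothetical named address `example` declared in the package's `Move.toml`; the module and test names are illustrative and not part of this PR.

```move
#[test_only]
module example::ascii_usage_tests {
    use std::ascii;

    #[test]
    fun new_ascii_methods() {
        let mut s = ascii::string(b"Hello");
        assert!(!s.is_empty(), 0);

        // `append` takes its second argument by value.
        s.append(ascii::string(b", Sui"));
        assert!(s == ascii::string(b"Hello, Sui"), 1);

        // `substring(i, j)` copies the bytes in the half-open range [i, j).
        assert!(s.substring(0, 5) == ascii::string(b"Hello"), 2);

        // `index_of` returns the byte offset of the first match,
        assert!(s.index_of(&ascii::string(b"Sui")) == 7, 3);
        // the length of the string when there is no match,
        assert!(s.index_of(&ascii::string(b"Move")) == s.length(), 4);
        // and 0 when the searched-for string is empty.
        assert!(s.index_of(&ascii::string(b"")) == 0, 5);

        // Case conversion returns a new string and leaves `s` untouched.
        assert!(s.to_uppercase() == ascii::string(b"HELLO, SUI"), 6);
        assert!(s.to_lowercase() == ascii::string(b"hello, sui"), 7);
    }
}
```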

@damirka damirka self-assigned this Jun 28, 2024
Contributor

@tzakian tzakian left a comment

Looks good!

- const EINVALID_ASCII_CHARACTER: u64 = 0x10000;
+ const EInvalidASCIICharacter: u64 = 0x10000;
+ /// An invalid index was encountered when creating a substring.
+ const EInvalidIndex: u64 = 0x10001;
Contributor

ughhhh but fine this is the right thing to do

Contributor

Idk I think we should at some point switch all of the std/sui errors to using clever errors.
It will be a pain, and maybe break someone somewhere, but I feel like it is the right thing to do long term

- i = i + 1;
- };
- option::some(String { bytes })
+ let is_valid = bytes.all!(|byte| is_valid_char(*byte));
Contributor

🔥
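
For readers unfamiliar with the vector macros, the change above swaps a hand-written index loop for `all!`. A rough sketch of the equivalence follows, assuming the Move 2024 edition and a hypothetical `example` named address; the helper functions are illustrative only and are not the code in this PR.

```move
#[test_only]
module example::all_macro_tests {
    use std::ascii;

    /// Manual loop: stop at the first byte that is not valid ASCII.
    fun all_valid_loop(bytes: &vector<u8>): bool {
        let mut i = 0;
        while (i < bytes.length()) {
            if (!ascii::is_valid_char(bytes[i])) return false;
            i = i + 1;
        };
        true
    }

    /// Same check written with the `all!` vector macro.
    fun all_valid_macro(bytes: &vector<u8>): bool {
        bytes.all!(|byte| ascii::is_valid_char(*byte))
    }

    #[test]
    fun loop_and_macro_agree() {
        let ok = b"sui";
        // 0xff is outside the ASCII range accepted by `is_valid_char`.
        let bad = x"73ff";
        assert!(all_valid_loop(&ok) && all_valid_macro(&ok), 0);
        assert!(!all_valid_loop(&bad) && !all_valid_macro(&bad), 1);
    }
}
```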

- i = i + 1;
- };
- true
+ string.bytes.all!(|byte| is_printable_char(*byte))
Contributor

🔥 🔥

/// Insert the `other` string at the `at` index of `string`.
public fun insert(s: &mut String, at: u64, o: String) {
assert!(at <= s.length(), EInvalidIndex);
o.into_bytes().destroy!(|e| s.bytes.insert(e, at));
Contributor

🔥 🔥 🔥

Comment on lines +87 to +88
let mut bytes = vector[];
i.range_do!(j, |i| bytes.push_back(string.bytes[i]));
Contributor

I might prefer vector::tabulate for this one. But this one is good too :)

Comment on lines +129 to +136
let bytes = string.as_bytes().map_ref!(|byte| char_to_uppercase(*byte));
String { bytes }
}

/// Convert a `string` to its lowercase equivalent.
public fun to_lowercase(string: &String): String {
let bytes = string.as_bytes().map_ref!(|byte| char_to_lowercase(*byte));
String { bytes }
Contributor

🔥 🔥 🔥 🔥

Comment on lines +140 to +141
/// Returns the length of the `string` if the `substr` is not found.
/// Returns 0 if the `substr` is empty.
Contributor

I really hate this API, but I agree it is the right thing to do for consistency

Contributor Author

Same here, not a fan.
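
Callers who dislike the length-as-sentinel convention could wrap `index_of` on their side. The sketch below is one possible shape, not something this PR adds: the module `example::ascii_find`, the function `find`, and the `example` named address are all hypothetical, and the Move 2024 edition is assumed.

```move
module example::ascii_find {
    use std::ascii::String;
    // Imported explicitly so the sketch is self-contained; newer compilers
    // may report this as an unnecessary alias because `option` is implicit.
    use std::option::{Self, Option};

    /// Hypothetical helper: `some(offset)` on a match, `none()` otherwise.
    public fun find(s: &String, substr: &String): Option<u64> {
        // An empty needle matches at offset 0, even in an empty string.
        if (substr.is_empty()) return option::some(0);
        let i = s.index_of(substr);
        // `index_of` signals "not found" by returning the string length.
        if (i == s.length()) option::none() else option::some(i)
    }
}
```

The early return for an empty `substr` matters because, for an empty `s`, the "found at 0" result and the "not found" sentinel are both 0.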

Comment on lines 125 to 131
/// [DEPRECATED]
public fun bytes(s: &String): &vector<u8> { s.as_bytes() }

/// [DEPRECATED]
public fun sub_string(s: &String, i: u64, j: u64): String {
s.substring(i, j)
}
Contributor

@damirka #[deprecated] should land soon ™️ , if it hasn't already

Contributor

Literally just landed!

Contributor Author

And.... it's gone

@damirka damirka merged commit f6f2584 into main Jun 29, 2024
47 checks passed
@damirka damirka deleted the ds/ascii-again branch June 29, 2024 12:34
tx-tomcat pushed a commit to tx-tomcat/sui-network that referenced this pull request Jul 29, 2024
…s#18462)

3 participants