Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nominal no_std + alloc support to regex-syntax #477

Closed

Conversation

ZackPierce
Copy link

Introduces feature flags and associated conditional
compilation configuration to alloc use in no_std
builds for environments with access to a heap.

Note that until ucd-util is updated and the relevant
feature flags piped through, this mode will not
be practically usable.

The main discussion of this feature is likely to occur on the primary issue tracking the general no_std with heap feature for regex: #476

Introduces feature flags and associated conditional
compilation configuration to alloc use in no_std
builds for environments with access to a heap.

Note that until ucd-util is updated and the relevant
feature flags piped through, this mode will not
be practically usable.
Copy link

@dhardy dhardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be cleaned up a bit, but it's a big shame alloc doesn't have a prelude and also that it isn't a super-set of core (which is why Borrow must have two different import rules here). Hopefully in the future that will get fixed (or we just get feature flags on the std lib).

# Disable this on-by-default feature and add "alloc" to allow use in no_std builds
std = []
# Required for use in no_std builds, presently nightly-only
alloc = []
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need an alloc feature flag if it will always be required — i.e. unless there are plans for core-only support in the future you could just drop this feature flag and do #![cfg_attr(not(feature = "std"), feature(alloc))].

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to leave the door open for that possibility. We can simplify the cfg statements if those plans don't materialize.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trick we do with Rand: make std depend on alloc. This has the advantage that you can use just #[cfg(feature = "alloc")] to feature-gate modules/items requiring an allocator (though also the disadvantage that sometimes you need to check both: #![cfg_attr(all(feature="alloc", not(feature="std")), feature(alloc))]).

#![cfg_attr(all(feature = "alloc", not(feature = "std")), feature(slice_concat_ext))]

#[cfg(test)]
extern crate std as std_test;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this? If std is required for testing, you can just use regular includes from std. If you want to test where std is not available this will fail. It's possible to run many tests without std anyway, though might take some more work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is that we want to be able to run our tests against the library compiled in either mode as a way of confirming that there is no accidental behavioral differences.

E.G. cargo test --no-default-features --features "alloc"

IIUIC, because we (now) do extern crate core as std in the main library when in no_std mode (to reduce code diffs), we need to import std under some other name for use in the tests so as not to conflict with the core-renamed-as-std crate being used by the library.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't even know that you do need std for most of the testing. Have a look at the Rand lib code; we allow many tests to run with cargo test --no-default-features without re-importing std. But it's possible there are other parts of std you do require.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that we could make a decent chunk of the tests run without access to the standard library.

That said, the main goal is testing the library code. AFAICT that library code can be compiled in no_std mode and exercised regardless of whether the test code itself has access to standard library. As as we're getting the library code exercised already, I'm not clear on what value add there would be to go to the effort of triaging and modifying the test code to be able to run without the standard lib.


#[cfg(feature = "std")]
#[macro_use]
extern crate std as core;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In retrospect, I would recommend not doing what we did in Rand but instead having extern crate core as std; (for non-std). The reasons being (a) it saves changing so many imports and (b) std may use feature flags in the future.

However, I suppose error messages may be fun in no-std mode 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adopted this recommendation in the latest commit to this PR.

@@ -118,6 +133,9 @@ mod parser;
mod unicode;
mod unicode_tables;

#[cfg(all(feature = "alloc", not(feature = "std")))]
use alloc::string::String;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame alloc doesn't have a prelude. Can I suggest making your own (containing the usual members only std isn't used) then doing use ::prelude::*; in all modules?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! I'll see how this shakes out.

@Centril
Copy link

Centril commented Jun 4, 2018

@ZackPierce so one change I'm making to the changes you made in proptest in a PR, https://github.com/Centril/proptest/blob/general-refactor/src/std_facade.rs, is to have one file for all the conditional std/alloc uses, which reexports all the needed paths. Then you just import use std_facade::Vec. This cut down on complexity somewhat.

@ZackPierce
Copy link
Author

Excellent, thank you for adding that!

@ZackPierce
Copy link
Author

ZackPierce commented Jun 5, 2018

I have updated this PR based on suggestions from @dhardy and a helpful example from @Centril . Pretty much all differences between the no_std + alloc and std versions of the build have been centralized and hidden in the new prelude module.

The end result is that there are about half as many code changes to the main, functional portions of the crate compared to the previous iteration.

@Centril
Copy link

Centril commented Jun 22, 2018

Relevant: rust-lang/rfcs#2480

@BurntSushi
Copy link
Member

I'm going to close this out for now. My plan is add support for this at some point. The main thing standing in the way right now is probably the work itself, the MSRV and the new features introduced by #613. Managing the MSRV is probably find, since this will mean enabling compilation where it previously didn't work before, so long as nothing else requires an MSRV bump. (And if it does, we could add back a version sniffing build.rs script, but it would be nice to avoid that complexity.)

And #613 poses an issue because I think Cargo has problems with enabling features in optional dependencies. e.g., if we had a alloc = ["aho-corasick/alloc"] feature, then that will unconditionally bring in the aho-corasick dependency when alloc is enabled, even if it was otherwise disabled. I think this can be worked around by introducing more features, but that's not a great user experience.

@BurntSushi BurntSushi closed this Sep 3, 2019
BurntSushi added a commit that referenced this pull request Aug 27, 2022
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Sep 20, 2022
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Sep 25, 2022
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Nov 5, 2022
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Jan 6, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Jan 13, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Mar 2, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Mar 4, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Mar 5, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Mar 15, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Mar 21, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Apr 15, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Apr 17, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Apr 17, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Apr 17, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
BurntSushi added a commit that referenced this pull request Apr 17, 2023
In effect, this adds support for no_std by depending on only core and
alloc. There is still currently some benefit to enabling std support,
namely, getting the 'std::error::Error' trait impls for the various
error types. (Although, it seems like the 'Error' trait is going to get
moved to 'core' finally.) Otherwise, the only 'std' things we use are in
tests for tweaking stack sizes.

This is the first step in an effort to make 'regex' itself work without
depending on 'std'. 'regex' itself will be more precarious since it uses
things like HashMap and Mutex that we'll need to find a way around.
Getting around HashMap is easy (just use BTreeMap), but figuring out how
to synchronize the threadpool will be interesting.

Ref #476, Ref #477
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants