ranged-based `for` for user-defined types #1885

Merged on Aug 25, 2023 (52 commits)
Changes from 12 commits
Commits
c7b8ccf
Filling out template with PR 1885
Pixep Aug 3, 2022
a355466
Add context, other languages, and basic interfaces proposal
Pixep Aug 6, 2022
ef76d36
Cleanup proposal structure, add context and proposal details
Pixep Aug 6, 2022
83b208c
Fix various typos
Pixep Sep 24, 2022
217ee4e
docs: revert typo fix to generics, separate PR
Pixep Aug 13, 2022
02ee7f8
Drop 'Indexable' interface proposal
Pixep Oct 1, 2022
0471f8d
Add abstract section
Pixep Oct 1, 2022
cf71002
Fix invalid code flags, `()` for interfaces
Pixep Oct 1, 2022
91d9709
Add a basic cursor vs iterator paragraph
Pixep Oct 1, 2022
c90d7ba
Add `Get` method to `Iterator`
Pixep Oct 1, 2022
c9bbad7
Fix typos
Pixep Oct 2, 2022
0152f41
Drop 'TODO' section
Pixep Oct 3, 2022
00ef294
Park `Consumable` interface
Pixep Oct 3, 2022
bfcb08c
Add a list of use cases
Pixep Oct 13, 2022
4fe274f
Merge branch 'trunk' into proposal-for-statement
Pixep Oct 23, 2022
936bedc
p1885: add Iterator `Advance()` method
Pixep Oct 23, 2022
b435c6f
p1885: refer to `Advance` in for-loop logic
Pixep Nov 4, 2022
76f13e4
p1885: add mutable iterators
Pixep Nov 4, 2022
0d593e1
p1885: add support suggestions for main use cases
Pixep Nov 4, 2022
dd6e4b1
p1885: switch from `Optional(T)` to `T` for getters
Pixep Nov 4, 2022
48c95f7
p1885: fix interface name typo
Pixep Nov 4, 2022
bf7330b
Apply suggestions from code review
Pixep Nov 7, 2022
51aabf4
p1885: update from google docs
Pixep Jan 17, 2023
6f2dd82
p1885: drop outdated alternatives considered
Pixep Jan 17, 2023
40716c7
p1885: drop reference to future work from "Details"
Pixep Jan 22, 2023
c714989
p1885: fix superfluous line breaks
Pixep Jan 31, 2023
5c653a7
p1885: wording improvements and precisions from josh11b
Pixep Feb 1, 2023
c897aad
p1885: more changes
Pixep Feb 1, 2023
6d57ec8
p1885: more small changes
Pixep Feb 1, 2023
12e0ff9
p1885: clarified interface example
Pixep Feb 3, 2023
471a7af
p1885: expand on comments
Pixep Feb 3, 2023
3efeb38
p1885: add python `yield` example
Pixep Feb 3, 2023
0108dba
p1885: more updates
Pixep Feb 7, 2023
0c45a70
Fix code snippet typo
Pixep Mar 4, 2023
06bb190
Clarified C++ interoperability goals
Pixep Mar 6, 2023
c8d8bcf
Add carbon to cpp interop section
Pixep Mar 6, 2023
845e0aa
Fix iterator member typo
Pixep Mar 11, 2023
f90fbfd
Clarify sections for C++ interop
Pixep Mar 11, 2023
31f9361
Merge branch 'carbon-language:trunk' into proposal-for-statement
Pixep Apr 1, 2023
f5eea57
Add mitigation for container with large values
Pixep Apr 1, 2023
59637d2
Add Iterate to C++ interop sample implementation
Pixep Apr 2, 2023
63f927c
add copy constructor for cursor iterator
Pixep Apr 2, 2023
cc86c60
Update proposals/p1885.md
Pixep Apr 4, 2023
80a74cd
Improve C++ iterator sample
Pixep Apr 4, 2023
4fad5ef
Precise the impact of caching last value
Pixep Apr 4, 2023
ff102e4
Add suggestions from @josh11b
Pixep Apr 23, 2023
caa8d86
Add suggestion from @geoffromer
Pixep Apr 23, 2023
fe33c52
Fix typo on `Iterate` class name in c++ snippet
Pixep May 18, 2023
7b81a95
Clarified C++ requirements, and implementation example
Pixep May 18, 2023
0e545d8
Link range-based for statement from C++ standard.
Pixep May 22, 2023
8e1af85
Mention ranges consuming their input
Pixep May 22, 2023
4277d37
Force checks to re-run.
chandlerc Aug 25, 2023
139 changes: 134 additions & 5 deletions proposals/p1885.md
@@ -43,6 +43,8 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Alternative views](#alternative-views)
- [Large elements and Optional](#large-elements-and-optional)
- [C++ interoperability](#c-interoperability)
    - [Iterating over C++ types in Carbon](#iterating-over-c-types-in-carbon)
    - [Iterating over Carbon types in C++](#iterating-over-carbon-types-in-c)
- [Inversion of control](#inversion-of-control)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
@@ -250,7 +252,7 @@ interface Iterate {
A naive Carbon implementation of the for loop could be:

```carbon
let cursor: range.(Iterate.CursorType) = range.(Iterate.NewCursor)();
var cursor: range.(Iterate.CursorType) = range.(Iterate.NewCursor)();
var iter: Optional(range.(Iterate.ElementType)) = range.(Iterate.Next)(&cursor);

// A. Possible implementation
```
@@ -582,13 +584,55 @@ for (v: my_dict.(Iterate.ElementType) in my_dict.reversed) {}
Large elements would have a negative impact if the `Optional` can neither
leverage unused bit patterns within the type nor present a view instead of a copy.

Such opportunities for `Optional` exist, and could be addressed in the future.
If this proves difficult, we can let implementers of the container choose
whether to expose values or pointers to values. When electing to expose
pointers to values, an adapter can be provided to transform the range of
pointers into a range of values. This choice allows supporting either iteration
over container r-values (`ElementType = T`), or efficient iteration
(`ElementType = T*`), but not both. While acceptable as an intermediate
solution, a better alternative will be needed in the future.

```carbon
adapter IterateByValue(template T:! type) for T {
  impl as Iterate
      // Speculative syntax. Alternatively, a cursor type parameter
      // would avoid the need for deduction.
      where .CursorType = T.CursorType
      and .ElementType = T.CursorType.(Deref.Result) {
    ...
  }
}

// Used like:
var frames: NetworkFramesContainerByPtr = ...;
for (i: NetworkFrame in frames as IterateByValue(NetworkFramesContainerByPtr))
{
  ...
}
```

Alternatively, we can allow the binding in the `for` loop to be an `addr`
pattern, implemented by calling into a separate interface on the container. This
puts the choice of using values or pointers to values in the hands of the `for`
loop author. While this deviates from the ergonomics of `let` (where the
compiler can choose to copy or alias the value depending on what is more
efficient), this option would provide parity with C++ `for` loops (where the
author chooses between value and reference).
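
For comparison, the C++ side of that parity argument looks as follows. This is a minimal sketch; `SumByValue` and `DoubleInPlace` are hypothetical helper names used only for illustration, and the point is simply that the loop author picks the binding form.

```cpp
#include <vector>

// Binding by value: each `x` is a copy of the element, so the container is
// left unchanged by the loop body.
inline auto SumByValue(const std::vector<int>& values) -> int {
  int sum = 0;
  for (int x : values) {
    sum += x;
  }
  return sum;
}

// Binding by reference: each `x` aliases the element, so writes are visible
// in the container and no per-element copy is made.
inline void DoubleInPlace(std::vector<int>& values) {
  for (int& x : values) {
    x *= 2;
  }
}
```

An `addr` binding in a Carbon `for` loop would play the role of the `int&` form here.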

### C++ interoperability

The limitation of this approach is the need to interoperate with the C++
iterator model. This could be implemented by having the cursor be an iterator
type:
#### Iterating over C++ types in Carbon

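For reference, range-based `for` loops in C++ require:

- `begin()` and `end()` methods or free functions
- the iterator returned by `begin()` to support the pre-increment operator
  `++`, the indirection operator `*`, and inequality comparison `!=` against
  the value returned by `end()`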

Contributor:

Two things:

  • Something to connect these two requirements to make it clear that the type returned by begin() and end() is what has to support these operators.
  • I believe the iterator type must also support inequality comparison (at least). According to https://en.cppreference.com/w/cpp/language/range-for a C++ range-based for loop gets rewritten to include for ( ; __begin != __end; ++__begin).

We may need to be careful about closely matching C++ so we can be as compatible as we can manage, as long as it doesn't impact the usability for ordinary users. In C++, my understanding is as long as the rewrite type checks, the ranged-based for loop is legal, like a template. This means we may need to support begin() and end() returning different types, as long as != is defined on those two types.

Contributor Author:

Absolutely yes, clarified.


See section 8.6.4 of N4860 for more details.
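
To make these requirements concrete, here is a minimal sketch of a type that a C++ range-based `for` accepts. `CountUpTo`, `Iter`, `End`, and `Sum` are hypothetical names, not part of the proposal. `begin()` and `end()` deliberately return different types, which is allowed since C++17, and the iterator only provides `*`, pre-increment `++`, and `!=` against the sentinel.

```cpp
// Hypothetical range yielding 0, 1, ..., limit - 1.
struct CountUpTo {
  int limit;

  // Sentinel type returned by `end()`.
  struct End {};

  struct Iter {
    int current;
    int limit;

    auto operator*() const -> int { return current; }
    auto operator++() -> Iter& {
      ++current;
      return *this;
    }
    // Only `iterator != sentinel` is needed by the loop rewrite.
    auto operator!=(End) const -> bool { return current < limit; }
  };

  auto begin() const -> Iter { return Iter{0, limit}; }
  auto end() const -> End { return End{}; }
};

// Sums the range with a range-based `for`, which the compiler rewrites
// roughly to:
//   auto&& range = r; auto b = range.begin(); auto e = range.end();
//   for (; b != e; ++b) { int v = *b; ... }
inline auto Sum(const CountUpTo& r) -> int {
  int total = 0;
  for (int v : r) {
    total += v;
  }
  return total;
}
```

Because the loop is accepted whenever the rewrite type checks, nothing here requires the full standard library iterator requirements.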

A limitation of the approach proposed in this document is the need for
adaptation to interoperate with the C++ iterator model. This could be
implemented by having the cursor be an iterator type:

- `NewCursor` providing the "start iterator"
- `Next` doing the "increment, bounds check, and returning value"
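
Seen from the C++ side, the mapping in the list above can be sketched as follows. `CursorView` and `SumWithCursor` are hypothetical helpers written for illustration only. The sketch assumes `Next` takes the cursor by mutable reference and returns a `std::optional`, and it performs the bounds check before reading and incrementing, which is one possible ordering.

```cpp
#include <optional>
#include <vector>

// Hypothetical helper expressing the cursor protocol with C++ iterators.
template <typename Container>
class CursorView {
 public:
  using CursorType = typename Container::const_iterator;
  using ElementType = typename Container::value_type;

  explicit CursorView(const Container& container) : container_(&container) {}

  // The "start iterator".
  auto NewCursor() const -> CursorType { return container_->begin(); }

  // Bounds check, read, then increment. Returns an empty optional once the
  // cursor has reached the end of the container.
  auto Next(CursorType& cursor) const -> std::optional<ElementType> {
    if (cursor == container_->end()) {
      return std::nullopt;
    }
    ElementType value = *cursor;
    ++cursor;
    return value;
  }

 private:
  const Container* container_;
};

// Drains a vector through the cursor protocol and returns the sum.
inline auto SumWithCursor(const std::vector<int>& values) -> int {
  CursorView<std::vector<int>> view(values);
  auto cursor = view.NewCursor();
  int total = 0;
  while (auto element = view.Next(cursor)) {
    total += *element;
  }
  return total;
}
```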
@@ -657,6 +701,91 @@ for (i: i32 in v as CppIterate(Cpp.std.vector(i32))) { ... }
These two options could allow interoperability with a compatible C++ type,
either implicitly (template approach) or explicitly (adapter approach).

#### Iterating over Carbon types in C++

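To be usable in a C++ range-based `for` loop, a Carbon type needs to expose
`begin()` and `end()` methods or free functions to C++, in addition to `*`,
`++`, and `!=` operators on the iterator type they return.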

The `Iterate` interface does not expose a way to obtain a cursor or an iterator
to the past-the-end element. An option would be to have `end()` return a
predefined invalid value, to satisfy iterator comparison when `Next()` returns
an empty `Optional`.

Below is a sample implementation to iterate on a Carbon `Iterate`. Note that we
need to copy and cache the last value, to allow for `*it` to be called. This
puts a requirement on `ElementType` to be copyable for this interoperability to
be usable.
Comment on lines +717 to +719
Contributor:

I think this will still work if ElementType is move-only (but see the comment below), it just means that IterateIterator will be move-only in those cases. That means it won't be a valid iterator according to the standard library, but it should still work with range-based for loops.

Contributor Author:

I just looked back at the code:

     if (const auto v = container_->Next(*cursor_); v) {
      value_ = *std::move(v);
    }

Won't Next() require ElementType to be copyable nonetheless, as we are effectively returning a copy of an element, and not moving anything from the container? Even if we avoid an optional-like wrapper, or just move from it, the ElementType still has to be copied at some point (likely within Next()).

Maybe I forgot something between now and then.

Contributor:

In recent versions of C++, if Next returns an ElementType rather than a reference, the code you quote won't actually copy or move the return value; it will use v itself as the storage for the returned temporary. Now, you're right that in this context, the body of Next will very likely need to make a copy, but Next is written in pure Carbon, so that problem isn't caused by caching in the interoperability layer, and in fact it's not caused by the interop layer at all -- the same problem exists in pure Carbon.

Contributor Author:

Agree that this is by design, and not caused by the interop. To expand, I think we could say "move-only" in the C++ domain, but that would be ignoring a very likely "copyable" requirement on the Carbon side / Next. So all in all, "copyable" seems more correct, as the upper bound of the requirements rather than the lower bound.

Contributor:

The problem is that this doc doesn't explicitly talk about copyability on the Carbon side, so by introducing it here as a new topic that's specifically tied to caching in the interop layer, it really makes it sound like native Carbon will not require copyability, but C++ interop will.

I'd suggest revising the "Large elements and Optional" section to explicitly talk about the fact that this design may force Next to make a copy, and hence require ElementType to be copyable. Then down here you can say that the caching requires ElementType to be movable, and note that this won't matter in practice if Carbon requires it to be copyable anyway.


```cpp
#include <cassert>
#include <optional>

namespace Carbon {
// A C++ input iterator for any type that implements the Carbon `Iterate`
// interface.
template <typename Iterate>
class IterateIterator {
 public:
  // Returns an "Invalid" iterator representing `end()`.
  static auto Invalid(const Iterate& container) -> IterateIterator<Iterate> {
    return IterateIterator<Iterate>(container, std::nullopt);
  }

  IterateIterator(const Iterate& container,
                  std::optional<typename Iterate::CursorType> cursor)
      : container_(&container), cursor_(cursor) {
    if (cursor_) {
      next();
    }
  }
  IterateIterator(const IterateIterator& other)
      : container_(other.container_),
        cursor_(other.cursor_),
        value_(other.value_) {}

  auto operator*() const -> const typename Iterate::ElementType& {
    assert(cursor_);
    return value_;
  }
  auto operator++() -> IterateIterator<Iterate>& {
    next();
    return *this;
  }
  auto operator==(const IterateIterator<Iterate>& rhs) const -> bool {
    return container_ == rhs.container_ && cursor_ == rhs.cursor_;
  }
  auto operator!=(const IterateIterator<Iterate>& rhs) const -> bool {
    return !(*this == rhs);
  }

 private:
  // Advances the cursor and caches the element, or marks the iterator as
  // invalid once `Next` returns an empty optional.
  void next() {
    assert(cursor_);
    if (const auto v = container_->Next(*cursor_); v) {
      value_ = *v;
    } else {
      cursor_ = std::nullopt;
    }
  }

  const Iterate* container_;
  std::optional<typename Iterate::CursorType> cursor_;
  typename Iterate::ElementType value_;
};

// A type implementing `Iterate` with CursorType `C`, and
// ElementType `T` could have `begin()` and `end()` methods
// defined as such:
class IterateType ... {
 public:
  auto begin() -> IterateIterator<Iterate<C, T>> {
    return IterateIterator<Iterate<C, T>>(*this, NewCursor());
  }
  auto end() -> IterateIterator<Iterate<C, T>> {
    return IterateIterator<Iterate<C, T>>::Invalid(*this);
  }
};
}  // namespace Carbon

// Usage:
Carbon::MyContainer container;
for (auto s : container) { ... }
```

Contributor:

What is Iterable<...> here? It looks like IterateIterator expects to be parameterized by a container type. Is Iterable some sort of adapter to define the CursorType and ElementType members? I don't see it defined anywhere else in this file.

Contributor Author:

That's a typo, and should be Iterate<C, T>, which is the C++ equivalent to Carbon's type. Fixed.
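
The sample above depends on generated Carbon types that are not shown here. To try the caching idea in isolation, the following self-contained sketch uses a hypothetical stand-in type, `Countdown`, which exposes `CursorType`, `ElementType`, `NewCursor()`, and `Next()`. `CursorIterator`, `Collect`, and the free `begin`/`end` functions are likewise illustrative names, and the sketch assumes `Next` takes the cursor by mutable reference and returns a `std::optional`. Unlike the sample, the cached value is held in a `std::optional`, so `ElementType` does not need to be default constructible.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Carbon type implementing `Iterate`:
// yields start, start - 1, ..., 1.
struct Countdown {
  using CursorType = int;
  using ElementType = int;
  int start;

  auto NewCursor() const -> CursorType { return start; }
  auto Next(CursorType& cursor) const -> std::optional<ElementType> {
    if (cursor <= 0) {
      return std::nullopt;
    }
    return cursor--;
  }
};

// Condensed input iterator caching the last value returned by `Next`.
template <typename Iterate>
class CursorIterator {
 public:
  CursorIterator(const Iterate& container,
                 std::optional<typename Iterate::CursorType> cursor)
      : container_(&container), cursor_(cursor) {
    if (cursor_) {
      Advance();
    }
  }

  auto operator*() const -> const typename Iterate::ElementType& {
    assert(value_);
    return *value_;
  }
  auto operator++() -> CursorIterator& {
    Advance();
    return *this;
  }
  // An exhausted iterator has an empty cursor, like the `end()` iterator.
  auto operator!=(const CursorIterator& rhs) const -> bool {
    return container_ != rhs.container_ || cursor_ != rhs.cursor_;
  }

 private:
  void Advance() {
    assert(cursor_);
    if (auto next = container_->Next(*cursor_); next) {
      value_ = *std::move(next);
    } else {
      cursor_ = std::nullopt;
    }
  }

  const Iterate* container_;
  std::optional<typename Iterate::CursorType> cursor_;
  std::optional<typename Iterate::ElementType> value_;
};

// Free functions found by argument-dependent lookup in range-based `for`.
inline auto begin(const Countdown& c) -> CursorIterator<Countdown> {
  return CursorIterator<Countdown>(c, c.NewCursor());
}
inline auto end(const Countdown& c) -> CursorIterator<Countdown> {
  return CursorIterator<Countdown>(c, std::nullopt);
}

// Collects every element produced by a range-based `for` over `countdown`.
inline auto Collect(const Countdown& countdown) -> std::vector<int> {
  std::vector<int> out;
  for (int v : countdown) {
    out.push_back(v);
  }
  return out;
}
```

With these definitions, `Collect(Countdown{3})` visits 3, 2, 1, and an empty range compares equal to `end()` right after construction.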

### Inversion of control

Using an `Optional` has downsides, discussed in the