Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix copy of structs through the scheduler. #924

Merged
merged 8 commits into from
Sep 30, 2022
Merged

Conversation

blagoev
Copy link
Contributor

@blagoev blagoev commented Sep 26, 2022

We did not copy the native structs for session errors and http requests correctly. This PR fixes that

@@ -33,19 +33,19 @@ RLM_API void realm_dart_http_request_callback(realm_userdata_t userdata, const r
realm_http_request_t request_copy = request; // copy struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to copy the request here - you can std::move the request and create the copy inside the scheduler callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about that. The request is const arg so we should not std::move it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::move-ing doesn't guarantee that a move will occur, it only tells the compiler it may move it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes and accessing the old reference is undefined behavior. Hence I am hesitant to do what you suggest

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not accessing the old reference though, so I'm not sure I see the problem. If you do prefer to eagerly copy here, then we don't need to copy it again on line 45.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is a const arg should not be modified. And by moving it we will invalidate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on line 45 we do need to copy it again since it's not modifiable since its an rvalue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pardon me, on line 45 its not a copy. It's a move so as efficient as it can be.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just make the argument not be const. Top-level const on arguments is ignored when deciding the function's type, so void(foo) and void(const foo) have the same type and can both be assigned to a void(*)(foo) function pointer. The const is only meaningful inside the implementation of the function as it makes the parameter value behave like a const object. In this case, that is probably something you don't want.

std::move-ing doesn't guarantee that a move will occur, it only tells the compiler it may move it.

Sort of. std::move makes its result a (possibly const) rvalue. That then can affect overload resolution specifically to choose overloads that then move from the object. All of this is specifically defined, and no part of it is up to the compiler (except for the poorly named copy-elision optimization where it is allowed to completely skip both copies and moves in some cases).

yes and accessing the old reference is undefined behavior.

Not quite, although it is perhaps a good rule of thumb to pretend this is the case most of the time. A) it isn't the std::move that causes this affect, it is whether a move actually occurs. B) For all standard library types, you are guaranteed that a moved-from object will be left in a "valid but unspecified state". This means that you can call any methods that don't have preconditions. For user-defined types, unless otherwise stated, it is best to assume that you can only destroy the object or assign over it. "Named assignments" such as clear() or reset() are also generally considered acceptable.

(As I said, only using std::move on object you don't touch again is a good safe rule of thumb if you don't want to think too hard. But this thread was already getting into the weeds, so I wanted to make sure there was a precise statement of what the real rules are)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RedBeard0531. Good insights. I was with the impression that const args will help the compiler in someway optimize the function invocation. I will remove the const from the arg there.

CHANGELOG.md Show resolved Hide resolved
@blagoev blagoev self-assigned this Sep 26, 2022
@blagoev
Copy link
Contributor Author

blagoev commented Sep 26, 2022

@nirinchev seems like this change is not good. Will convert to draft until it is ready

@blagoev blagoev marked this pull request as draft September 26, 2022 12:11
@coveralls
Copy link

coveralls commented Sep 26, 2022

Pull Request Test Coverage Report for Build 3158967167

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 87.87%

Totals Coverage Status
Change from base Build 3126503085: 0.2%
Covered Lines: 2195
Relevant Lines: 2498

💛 - Coveralls

@blagoev blagoev marked this pull request as ready for review September 26, 2022 14:11
@blagoev blagoev marked this pull request as draft September 26, 2022 14:12
@blagoev blagoev marked this pull request as ready for review September 26, 2022 14:12

auto ud = reinterpret_cast<realm_dart_userdata_async_t>(userdata);
ud->scheduler->invoke([ud, request_copy = std::move(request_copy), buf = std::move(buf), request_context]() {
(reinterpret_cast<realm_http_request_func_t>(ud->dart_callback)(ud->handle, request_copy, request_context));
realm_http_request_t request = std::move(request_copy);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this copy necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on line 45 we do need to copy it again since it's not modifiable since its an rvalue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words you get this error if you try to modify request_copy.url
error C3490: 'url' cannot be modified because it is being accessed through a const object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw this realm_http_request_t request = std::move(request_copy); is not a copy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a move so as efficient as it can be.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if you add mutable to the lambda, you can get rid of the realm_copy variable.

BTW, if you are OK writing this part in C++, why not just implement the C++ GenericNetworkTransport API directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw the C-API definition of the http request is this one

 /**
 * Callback function used by Core to make a HTTP request.
 *
 * Complete the request by calling realm_http_transport_complete_request(),
 * passing in the request_context pointer here and the received response.
 * Network request are expected to be asynchronous and can be completed on any thread.
 *
 * @param request The request to send.
 * @param request_context Internal state pointer of Core, needed by realm_http_transport_complete_request().
 */
typedef void (*realm_http_request_func_t)(realm_userdata_t userdata, const realm_http_request_t request,
                                          void* request_context);

Shouldn't that be a const realm_http_request_t& request instead. Why do they insist on copying the request object while invoking the function.

This is how they invoke it https://github.com/realm/realm-core/blob/master/src/realm/object-store/c_api/http.cpp#L60-L63.

So why not pass the second argument by reference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RedBeard0531 I did the fix. Could you please, check the PR again. Also this message above :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I spoke with Yavor about this. seems no point on doing that. there will be not much gain

@blagoev
Copy link
Contributor Author

blagoev commented Sep 26, 2022

@nirinchev I fixed the issue and opening this again. I believe the CI fails are not related to this change

CHANGELOG.md Outdated Show resolved Hide resolved
.vscode/settings.json Show resolved Hide resolved
@@ -33,19 +33,19 @@ RLM_API void realm_dart_http_request_callback(realm_userdata_t userdata, const r
realm_http_request_t request_copy = request; // copy struct

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just make the argument not be const. Top-level const on arguments is ignored when deciding the function's type, so void(foo) and void(const foo) have the same type and can both be assigned to a void(*)(foo) function pointer. The const is only meaningful inside the implementation of the function as it makes the parameter value behave like a const object. In this case, that is probably something you don't want.

std::move-ing doesn't guarantee that a move will occur, it only tells the compiler it may move it.

Sort of. std::move makes its result a (possibly const) rvalue. That then can affect overload resolution specifically to choose overloads that then move from the object. All of this is specifically defined, and no part of it is up to the compiler (except for the poorly named copy-elision optimization where it is allowed to completely skip both copies and moves in some cases).

yes and accessing the old reference is undefined behavior.

Not quite, although it is perhaps a good rule of thumb to pretend this is the case most of the time. A) it isn't the std::move that causes this affect, it is whether a move actually occurs. B) For all standard library types, you are guaranteed that a moved-from object will be left in a "valid but unspecified state". This means that you can call any methods that don't have preconditions. For user-defined types, unless otherwise stated, it is best to assume that you can only destroy the object or assign over it. "Named assignments" such as clear() or reset() are also generally considered acceptable.

(As I said, only using std::move on object you don't touch again is a good safe rule of thumb if you don't want to think too hard. But this thread was already getting into the weeds, so I wanted to make sure there was a precise statement of what the real rules are)


auto ud = reinterpret_cast<realm_dart_userdata_async_t>(userdata);
ud->scheduler->invoke([ud, request_copy = std::move(request_copy), buf = std::move(buf), request_context]() {
(reinterpret_cast<realm_http_request_func_t>(ud->dart_callback)(ud->handle, request_copy, request_context));
realm_http_request_t request = std::move(request_copy);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The real answer here is that you want to add mutable to the lambda. Without it, all by-value captures, including request_copy are considered as const inside the body of the lambda. So request is silently making a copy, even though there is a std::move. You probably should also remove the const from the original request parameter, and the request_copy variable and just capture request as ~request = std::move(request)~~ simply request (see next paragraph). And if it the lambda is marked mutable, you can just modify it in place without making another copy. Since the callback will only be called once this is fine.

Oh. I just noticed that realm_http_request_t is the C API type. For C types (or any other "Trivially Copiable" type in C++) there is no difference between move and copy. So all of this discussion of move vs copy is moot, and you might as well just copy since it is syntactically lighter. I'm leaving what I've already written about std::move in the hope that it is informative.

std::move is not a copy its more of a const_cast

To be pedantic, const_cast can only add or remove const and volatile qualifiers. std::move is specifically a static_cast that adds turns an lvalue into an rvalue. It does NOT remove the const. But it also doesn't copy (although the thing you pass the result to may make a copy of the referenced object).

my understanding is that you cannot std::move a const object.

You can std::move a const object, but you probably shouldn't. It produces a const T&& which is a bit of a weird thing. While a type could have a special constructor just for that, in general it will select the copy constructor during overload resolution and make a copy. So the move isn't harmful (in that it doesn't change the behavior of the program, although it may make more template instantiations and potentially slow down compiles or bloat code size), but it is misleading. We could consider making our own realm::move that will cause a compiler error on const objects so that you know to either remove the const or remove the no-op move. The reason std::move allows passing const object is both for consistency, and because it is useful in templates where there are some cases where you want to move from non-const objects but obviously you can't from const ones, but you don't know what you will be passed and want to work with either. But that isn't something we do often.


auto ud = reinterpret_cast<realm_dart_userdata_async_t>(userdata);
ud->scheduler->invoke([ud, request_copy = std::move(request_copy), buf = std::move(buf), request_context]() {
(reinterpret_cast<realm_http_request_func_t>(ud->dart_callback)(ud->handle, request_copy, request_context));
ud->scheduler->invoke([ud, request = std::move(request), buf = std::move(buf), request_context]() mutable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ud->scheduler->invoke([ud, request = std::move(request), buf = std::move(buf), request_context]() mutable {
ud->scheduler->invoke([ud, request, buf = std::move(buf), request_context]() mutable {

As far as I understand, std::move doesn't really do anything for that struct.

Copy link
Contributor Author

@blagoev blagoev Sep 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true for buf as well as I understand it. I kept it for consistency. Otherwise whoever reads it later will wonder why we are not moving request but we are moving buf. Also for c++ I prefer being explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't do anything, then let's remove it from buf as well - no point in pretending we move something.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think std::move(buf) doesn't do anything? buf is not a trivially copiable C-like type. Note that it's members include C++ lifetime aware types. This id different from request which is a plain C-like struct, and is required to be as part of the C API.

TL;DR: I agree with the suggested change.

Edit: that said, after looking at this, it may be even more efficient to do auto buf = std::make_unique<request_copy_buf>, and possibly put all of your captures in the buf. It will be an extra allocation, but then you will have less moves to do. Right now, I think you are moving each member at least twice, and that adds up, maybe to more than a malloc/free pair. Your choice if you want to do this, or avoid the allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to leave that so it is more understandable what the intention is. Otherwise you really need to understand there is no difference in this particular case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I am not maintaining this code, so if the people who will be think this will be more maintainable, I won't stand in your way.


auto ud = reinterpret_cast<realm_dart_userdata_async_t>(userdata);
ud->scheduler->invoke([ud, request_copy = std::move(request_copy), buf = std::move(buf), request_context]() {
(reinterpret_cast<realm_http_request_func_t>(ud->dart_callback)(ud->handle, request_copy, request_context));
ud->scheduler->invoke([ud, request = std::move(request), buf = std::move(buf), request_context]() mutable {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think std::move(buf) doesn't do anything? buf is not a trivially copiable C-like type. Note that it's members include C++ lifetime aware types. This id different from request which is a plain C-like struct, and is required to be as part of the C API.

TL;DR: I agree with the suggested change.

Edit: that said, after looking at this, it may be even more efficient to do auto buf = std::make_unique<request_copy_buf>, and possibly put all of your captures in the buf. It will be an extra allocation, but then you will have less moves to do. Right now, I think you are moving each member at least twice, and that adds up, maybe to more than a malloc/free pair. Your choice if you want to do this, or avoid the allocation.


auto ud = reinterpret_cast<realm_dart_userdata_async_t>(userdata);
ud->scheduler->invoke([ud, session = *session, error = std::move(error), buf = std::move(buf)]() {
ud->scheduler->invoke([ud, session = *session, error = std::move(error), buf = std::move(buf)]() mutable {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the std::move of error is meaningless here as well. In fact all comments carry over.

I am amused that this code is basically identical to the above, but unfortunately, due to the structure of the c api, it is just different enough that I don't think there is a clean way to avoid the code duplication.

RLM_API void realm_dart_http_request_callback(realm_userdata_t userdata, const realm_http_request_t request, void* request_context) {
// the pointers in error are to stack values, we need to make copies and move them into the scheduler invocation
RLM_API void realm_dart_http_request_callback(realm_userdata_t userdata, realm_http_request_t request, void* request_context) {
// the pointers in request are to stack values, we need to make copies and move them into the scheduler invocation
struct request_copy_buf {
std::string url;
std::string body;
std::map<std::string, std::string> headers;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, it is probably better to use a std::deque<std::pair<std::string, std::string>> here. You could also use a std::vector<...> as long as you reserve() the correct size up front, or do two passes, one to push everything, and another to push collect the addresses of the strings. You just need to avoid a growth event after collecting the addresses.

ud->scheduler->invoke([ud, request_copy = std::move(request_copy), buf = std::move(buf), request_context]() {
(reinterpret_cast<realm_http_request_func_t>(ud->dart_callback)(ud->handle, request_copy, request_context));
ud->scheduler->invoke([ud, request = std::move(request), buf = std::move(buf), request_context]() mutable {
request.url = buf.url.c_str();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment stating that you are only updating the addresses due to moves, and that all other fields of the request, such as sizes do not need to be modified.

Alternatively, since you know the sizes of everything, You might as well just recreate the whole request from scratch rather than moving it around. Your call though.

@RedBeard0531
Copy link

All comments are optional suggestions. The code looks correct to me now, at least as far as I can see in a quick review.

@blagoev blagoev merged commit 1dc1dc7 into master Sep 30, 2022
@blagoev blagoev deleted the fix-structs-copy branch September 30, 2022 19:46
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants