rpmsg: add release cb and refcnt in endpoint to fix ept used-after-free #508

Merged: 1 commit, Nov 29, 2023

@yintao707 yintao707 commented Oct 9, 2023

Issue description:

In our case, rpmsg_virtio_rx_callback() is called in an rpmsg thread. Consider this situation: my_cb has already obtained the ept in the rpmsg thread, and at that moment a user service thread calls my_deinit, which executes rpmsg_destroy_ept and releases the ept, while ept->cb has not yet finished executing in the rpmsg thread. The result is a use-after-free of the ept.

Suppose we try to fix this at the user level, as in the following example:

void my_cb(...)
{
  atomic_fetch_add(&g_priv->refcnt, 1); // 2*
  ....
  if (atomic_fetch_sub(&g_priv->refcnt, 1) == 1) {
    rpmsg_destroy_ept(&g_priv->ept);
    free(g_priv);
  }
}

void my_init(void)
{
  g_priv = malloc(sizeof(*g_priv));
  atomic_init(&g_priv->refcnt, 1);
  g_priv->ept.cb = my_cb;
  rpmsg_create_ept(&g_priv->ept);
}

void my_deinit(void)
{
  if (atomic_fetch_sub(&g_priv->refcnt, 1) == 1) { // 1*
    rpmsg_destroy_ept(&g_priv->ept);
    free(g_priv);
  }
}

Let's assume this sequence:

  1. rpmsg_virtio_rx_callback releases the lock here: https://github.com/OpenAMP/open-amp/blob/cd8823876fb6920571abbeecc570a9da0cb546ba/lib/rpmsg/rpmsg_virtio.c#L561C1-L561C36
  2. The OS decides to suspend rpmsg_virtio_rx_callback and switches to my_deinit
  3. my_deinit finishes executing and releases g_priv (1*)
  4. rpmsg_virtio_rx_callback resumes and a use-after-free occurs

The root cause is that OpenAMP calls the cb after releasing the lock, so reference counting in the application layer cannot close this race. To avoid the race condition, we added a refcnt to the endpoint and call the release callback once the ept callback has finished.
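
To make the proposed fix concrete, here is a minimal single-file sketch in C of the idea described above: the lookup and the reference increment happen under the same lock, and a release callback fires only when the last reference is dropped. All names (`demo_ept`, `demo_rx`, `demo_destroy`, `demo_replay_race`) are hypothetical stand-ins rather than OpenAMP APIs, a pthread mutex stands in for `metal_mutex`, and the preemption in step 2 is simulated by a hook so that the interleaving is deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an endpoint; NOT the real struct rpmsg_endpoint. */
struct demo_ept {
	uint32_t refcnt;                           /* protected by demo_lock */
	void (*cb)(struct demo_ept *ept);          /* receive callback */
	void (*release_cb)(struct demo_ept *ept);  /* runs when refcnt reaches 0 */
	void *priv;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_ept *demo_slot;   /* one-entry stand-in for the endpoint list */

static void demo_decref(struct demo_ept *ept)
{
	int last;

	pthread_mutex_lock(&demo_lock);
	last = (--ept->refcnt == 0);
	pthread_mutex_unlock(&demo_lock);
	if (last && ept->release_cb)
		ept->release_cb(ept);   /* the owner frees its memory here */
}

static void demo_create(struct demo_ept *ept)
{
	pthread_mutex_lock(&demo_lock);
	ept->refcnt = 1;            /* reference held by the owner */
	demo_slot = ept;
	pthread_mutex_unlock(&demo_lock);
}

static void demo_destroy(struct demo_ept *ept)
{
	pthread_mutex_lock(&demo_lock);
	if (demo_slot == ept)
		demo_slot = NULL;   /* new lookups can no longer find it */
	pthread_mutex_unlock(&demo_lock);
	demo_decref(ept);           /* drop the owner's reference */
}

/* RX path: the lookup and the incref happen under the same lock. */
static int demo_rx(void (*preempt)(void))
{
	struct demo_ept *ept;

	pthread_mutex_lock(&demo_lock);
	ept = demo_slot;
	if (ept)
		ept->refcnt++;
	pthread_mutex_unlock(&demo_lock);
	if (!ept)
		return -1;
	if (preempt)
		preempt();          /* simulates my_deinit() running in this window */
	ept->cb(ept);
	demo_decref(ept);           /* may trigger release_cb */
	return 0;
}

/* Deterministic replay of the race from the issue description. */
static struct demo_ept demo_e;
static int demo_released, demo_cb_saw_live;

static void demo_on_release(struct demo_ept *ept) { (void)ept; demo_released = 1; }
static void demo_on_msg(struct demo_ept *ept) { (void)ept; demo_cb_saw_live = !demo_released; }
static void demo_deinit(void) { demo_destroy(&demo_e); }

/* Returns 1 if the callback ran before the release and the release happened afterwards. */
int demo_replay_race(void)
{
	demo_released = 0;
	demo_cb_saw_live = 0;
	demo_e.cb = demo_on_msg;
	demo_e.release_cb = demo_on_release;
	demo_create(&demo_e);
	demo_rx(demo_deinit);
	return demo_cb_saw_live && demo_released;
}
```

In this sketch the destroy call that lands in the window only unregisters the endpoint and drops the owner's reference; the memory is handed back through `release_cb` after the callback returns. A later `demo_rx()` finds no endpoint and skips the message.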

@arnopo
Collaborator

arnopo commented Oct 10, 2023

Hello @yintao707

Your issue is not clear to me.
Could you please detail it? What do you mean by "there is a used after free about the ept"?

It seems to me that you can call rpmsg_destroy_ept() at the end of your endpoint callback. The only constraint would be that you don't use the endpoint in your callback after the destroy.

@yintao707
Author

Hi @arnopo,
The problem we currently encounter is as follows:
static void rpmsg_virtio_rx_callback(struct virtqueue *vq)
{
	...
	while (rp_hdr) {
		rp_hdr->reserved = idx;

		/* Get the channel node from the remote device channels list. */
		metal_mutex_acquire(&rdev->lock);
		ept = rpmsg_get_ept_from_addr(rdev, rp_hdr->dst);
		metal_mutex_release(&rdev->lock);

		if (ept) {
			if (ept->dest_addr == RPMSG_ADDR_ANY) {
				/*
				 * First message received from the remote side,
				 * update channel destination address
				 */
				ept->dest_addr = rp_hdr->src;
			}
			status = ept->cb(ept, RPMSG_LOCATE_DATA(rp_hdr),
					 rp_hdr->len, rp_hdr->src, ept->priv);

			RPMSG_ASSERT(status >= 0,
				     "unexpected callback status\r\n");
		}
		...
	}
}
While ept->cb is executing, if another thread of the service releases ept->priv, a use-after-free occurs as the callback continues to execute.
If rpmsg_unregister_endpoint could synchronously wait for ept->cb to finish before returning, the problem would not occur; however, the current code enforces no such restriction. So I added a refcnt to avoid this issue.

static void rpmsg_unregister_endpoint(struct rpmsg_endpoint *ept)
{
	struct rpmsg_device *rdev = ept->rdev;

	metal_mutex_acquire(&rdev->lock);
	if (ept->addr != RPMSG_ADDR_ANY)
		rpmsg_release_address(rdev->bitmap, RPMSG_ADDR_BMP_SIZE,
				      ept->addr);
	metal_list_del(&ept->node);
	ept->rdev = NULL;
	metal_mutex_release(&rdev->lock);
	rpmsg_decref(ept);
}

@arnopo
Collaborator

arnopo commented Oct 10, 2023

Something doesn't seem correct to me (if I understood your use case correctly).

In your endpoint callback, you call rpmsg_destroy_ept(ept) to free the endpoint and then continue to use the ept structure. This is equivalent to using a pointer after freeing it.

Could you share your code with me so that I can better understand the issue you are trying to solve with this PR?

@yintao707
Author

Hi @arnopo, I have updated the description. Could you please take a look again?

@arnopo
Collaborator

arnopo commented Oct 11, 2023

Thank you for clarifying the race condition!
Please correct me if I am wrong, but the issue I see here is that the 'user thread service' releases the endpoint by freeing the ept pointer ("release ept"). I don't think it's a good idea to manage the free of this pointer in the library, as it is allocated and freed in the application, and we cannot ensure all application use cases.

For instance, the rpmsg callback can be called in interrupt context. In such a case, a work item can be created in the callback to process the message in normal context. In such a scenario, your PR would not prevent the ept pointer from being freed between the callback call and the work execution.

Is there any particular reason why you are not using counters or mutexes in your application to prevent race conditions?

@CV-Bowen
Contributor

@arnopo In our case, rpmsg_virtio_rx_callback() is called in a thread.
And even if rpmsg_virtio_rx_callback() is called in interrupt context and creates a work item to process the message in normal context, we can call rpmsg_incref() before creating the work and rpmsg_decref() after the work finishes, so that the endpoint cannot be released between ept->cb() returning and the work completing.
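
The deferred-work case can be sketched as follows, as a rough illustration only. The callback takes a reference before queueing the work, and the worker drops it when it is done, so a destroy that lands in between cannot release the private data. The `work_*` names and the one-slot queue are hypothetical and not part of OpenAMP, and the locking around the counter is deliberately left out so the replay stays single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical endpoint and one-slot work queue; names are illustrative only. */
struct work_ept {
	unsigned int refcnt;                      /* a real port must guard this with a lock */
	void (*release_cb)(struct work_ept *ept);
	int *priv;                                /* service data owned by the application */
};

static struct work_ept *work_pending;  /* endpoint referenced by the queued work */
static int work_result;
static int work_priv_freed;

static void work_incref(struct work_ept *ept) { ept->refcnt++; }

static void work_decref(struct work_ept *ept)
{
	if (--ept->refcnt == 0 && ept->release_cb)
		ept->release_cb(ept);
}

/* "Interrupt context" callback: pin the endpoint, then defer the processing. */
static void work_ept_cb(struct work_ept *ept)
{
	work_incref(ept);
	work_pending = ept;
}

/* "Thread context" worker: use priv, then drop the reference taken above. */
static void work_run(void)
{
	struct work_ept *ept = work_pending;

	if (!ept)
		return;
	work_pending = NULL;
	work_result = work_priv_freed ? -1 : *ept->priv;
	work_decref(ept);
}

static void work_release(struct work_ept *ept)
{
	ept->priv = NULL;      /* stands in for free(ept->priv) */
	work_priv_freed = 1;
}

/* Replays: callback queues work, owner destroys, worker runs afterwards. */
int work_replay(void)
{
	static int payload = 42;
	struct work_ept ept = { 1, work_release, &payload };

	work_priv_freed = 0;
	work_ept_cb(&ept);     /* refcnt 1 -> 2 */
	work_decref(&ept);     /* owner's destroy: 2 -> 1, nothing released yet */
	if (work_priv_freed)
		return -1;
	work_run();            /* reads priv, then 1 -> 0 triggers the release */
	return work_priv_freed ? work_result : -1;
}
```

The worker still reads the payload even though the owner has already dropped its reference, and the release only happens once the worker is finished.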

Actually, Linux did the same thing to avoid the endpoint use-after-free issue; this is the commit:
torvalds/linux@5a081ca
The difference between OpenAMP and Linux is that Linux allocates the endpoint inside the virtio rpmsg bus, whereas in OpenAMP the endpoint is maintained by the user, so I think we should add a callback ept->release_cb() to let the user do this.

@arnopo
Collaborator

arnopo commented Oct 13, 2023

@arnopo In our case, rpmsg_virtio_rx_callback() is called in a thread. And even if rpmsg_virtio_rx_callback() is called in interrupt context and creates a work item to process the message in normal context, we can call rpmsg_incref() before creating the work and rpmsg_decref() after the work finishes, so that the endpoint cannot be released between ept->cb() returning and the work completing.

That's my point. Half of the solution is in the library and the other half is in the application. It seems to me that you are creating an open-amp interface to solve a problem in a complex way, when it could be addressed more safely with a mutex or a counter in the application (for instance, you can face the same issue if you free ept->priv from another thread).

Actually, Linux did the same thing to avoid the endpoint use-after-free issue; this is the commit: torvalds/linux@5a081ca. The difference between OpenAMP and Linux is that Linux allocates the endpoint inside the virtio rpmsg bus, whereas in OpenAMP the endpoint is maintained by the user, so I think we should add a callback ept->release_cb() to let the user do this.

Yes, in Linux this has been implemented because endpoint allocation and freeing are also managed by the virtio rpmsg bus. However, this is not the case for OpenAMP, as a requirement is to support static allocation.

The ept->release_cb() only makes sense with a counter mechanism. Therefore, I don't see a strong reason to complicate the rpmsg API to partially address memory allocation and freeing that should be managed by the application.

@edmooring, @tnmysh: any opinion on this?

@xiaoxiang781216
Collaborator

xiaoxiang781216 commented Oct 13, 2023

@CV-Bowen has tried adding a refcount inside the application, but that cannot fix the problem without leaving a race condition. For example:

void my_cb(...)
{
  atomic_fetch_add(&g_priv->refcnt, 1); // 2*
  ....
  if (atomic_fetch_sub(&g_priv->refcnt, 1) == 1) {
    rpmsg_destroy_ept(&g_priv->ept);
    free(g_priv);
  }
}

void my_init(void)
{
  g_priv = malloc(sizeof(*g_priv));
  atomic_init(&g_priv->refcnt, 1);
  g_priv->ept.cb = my_cb;
  rpmsg_create_ept(&g_priv->ept);
}

void my_deinit(void)
{
  if (atomic_fetch_sub(&g_priv->refcnt, 1) == 1) { // 1*
    rpmsg_destroy_ept(&g_priv->ept);
    free(g_priv);
  }
}

Let's assume this sequence:

  1. rpmsg_virtio_rx_callback releases the lock here:
    https://github.com/OpenAMP/open-amp/pull/508/files#diff-569d5f59c7dc7ac2fee7252d74df8af1ba6e366a30fae7e1b97de627bc0ce472L563
  2. The OS decides to suspend rpmsg_virtio_rx_callback and switches to my_deinit
  3. my_deinit finishes executing and releases g_priv (1*)
  4. rpmsg_virtio_rx_callback resumes and a use-after-free occurs

As you can see, the reference count inside my_cb can only narrow the race window (it is safe after 2*); the gap between item 1 and 2* can only be handled by OpenAMP itself.

The root cause is that OpenAMP calls the cb after releasing the lock. The fix could be either of:

  1. Add a reference count
  2. Set a busy flag before releasing the lock, and wait in rpmsg_destroy_ept if this flag is set.
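
For comparison, the second option (a busy flag plus a wait in destroy) might look roughly like the sketch below. This is an illustration under assumptions, not code from the PR: the `busy_*` names are hypothetical, a pthread mutex and condition variable stand in for whatever primitives libmetal would provide, and a single flag is only sufficient if one RX context invokes the callback at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical endpoint with a "callback in progress" flag; illustrative only. */
struct busy_ept {
	int busy;                           /* 1 while cb is running */
	int registered;                     /* 1 while the RX path may use it */
	void (*cb)(struct busy_ept *ept);
};

static pthread_mutex_t busy_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t busy_idle = PTHREAD_COND_INITIALIZER;

/* RX path: mark the endpoint busy before the lock is dropped. */
static int busy_rx(struct busy_ept *ept)
{
	pthread_mutex_lock(&busy_lock);
	if (!ept->registered) {
		pthread_mutex_unlock(&busy_lock);
		return -1;
	}
	ept->busy = 1;
	pthread_mutex_unlock(&busy_lock);

	ept->cb(ept);

	pthread_mutex_lock(&busy_lock);
	ept->busy = 0;
	pthread_cond_broadcast(&busy_idle);  /* wake a waiting busy_destroy() */
	pthread_mutex_unlock(&busy_lock);
	return 0;
}

/*
 * Destroy: unregister, then wait until no callback is running.
 * Must not be called from inside cb, or it would wait on itself.
 */
static void busy_destroy(struct busy_ept *ept)
{
	pthread_mutex_lock(&busy_lock);
	ept->registered = 0;
	while (ept->busy)
		pthread_cond_wait(&busy_idle, &busy_lock);
	pthread_mutex_unlock(&busy_lock);
}

static int busy_seen_in_cb;
static void busy_on_msg(struct busy_ept *ept) { busy_seen_in_cb = ept->busy; }

/*
 * Single-threaded check of the flag's lifecycle; exercising the blocking
 * path would need a second thread. Returns 0 on success, -1 otherwise.
 */
int busy_replay(void)
{
	struct busy_ept ept = { 0, 1, busy_on_msg };

	busy_seen_in_cb = 0;
	if (busy_rx(&ept) != 0 || !busy_seen_in_cb || ept.busy)
		return -1;
	busy_destroy(&ept);                     /* idle endpoint: returns immediately */
	return busy_rx(&ept) == -1 ? 0 : -1;    /* unregistered endpoints are skipped */
}
```

One trade-off worth noting: with this approach the caller of destroy blocks until the callback returns, and calling destroy from within the callback itself would deadlock, whereas the reference-count approach defers the cleanup to a release callback instead of waiting.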

@arnopo
Collaborator

arnopo commented Oct 13, 2023

Hi @xiaoxiang781216

Thanks! The use case is much simpler to understand with code and the associated explanations.

@CV-Bowen has tried adding a refcount inside the application, but that cannot fix the problem without leaving a race condition. For example:

void my_cb(...)
{

Here I suppose that you cannot test the g_priv pointer; otherwise it would be simple to exit under mutex protection.

As you can see, the reference count inside my_cb can only narrow the race window (it is safe after 2*); the gap between item 1 and 2* can only be handled by OpenAMP itself.

Ok I have understood your issue now. Thanks for pointing it out.

As you mention, the current patch does not protect the window between the release of the mutex and the counter increment:
https://github.com/OpenAMP/open-amp/pull/508/files#diff-569d5f59c7dc7ac2fee7252d74df8af1ba6e366a30fae7e1b97de627bc0ce472R562-L570

Same issue for the unbind: https://github.com/OpenAMP/open-amp/pull/508/files#diff-569d5f59c7dc7ac2fee7252d74df8af1ba6e366a30fae7e1b97de627bc0ce472R642

The root cause is that OpenAMP calls the cb after releasing the lock. The fix could be either of:

  1. Add a reference count
  2. Set a busy flag before releasing the lock, and wait in rpmsg_destroy_ept if this flag is set.

One alternative was to have an rpmsg_sync_destroy_ept(cb). But considering the rpmsg_deinit_vdev function, the release callback you propose seems more flexible.

I will review the patch more closely at the beginning of the week.

@yintao707:

  • Can you send a version that fixes the protection hole between the metal_mutex_release and the callback calls?
  • I would prefer something other than atomic_* functions in the open-amp lib. We regularly face problems with compilers that do not implement the atomic library.

@xiaoxiang781216
Collaborator

@CV-Bowen has tried to add refcount inside application, but it can't fix this problem without any race condition, for example:

void my_cb(...)
{

Here I suppose that you cannot test the g_priv pointer; otherwise it would be simple to exit under mutex protection.

Yes, that works if it is saved in a global variable and protected by a mutex, as in this simple demo. But in the real case, endpoints are created and destroyed dynamically, in a varying number of instances. Even if this simple demo applies the change you suggest, the race condition still exists, since OpenAMP also accesses g_priv->ept's fields before entering my_cb.

@yintao707
Author

Hi @arnopo, updated. Please help review again, thanks.

Collaborator

@arnopo arnopo left a comment

Minor comments, otherwise LGTM.
Please also update the commit message to better explain the race condition issue, and fix the typos.
@yintao707 changed the title from "rpmsg: add release cb and refcnt in end pointto fix ept used-after-free" to "rpmsg: add release cb and refcnt in endpoint to fix ept used-after-free" on Oct 17, 2023
@yintao707
Author

Hi @arnopo, thank you very much for your patient review. I have updated the PR based on your suggestions. Could you please review it again? Thanks.

@xiaoxiang781216
Collaborator

LGTM

@yintao707 force-pushed the refcnt branch 2 times, most recently from dc8b74e to f8d7246 on October 21, 2023 08:35
@yintao707 changed the title from "rpmsg: add release cb and refcnt in endpoint to fix ept used-after-free" to "rpmsg: add incref_cb and decref_cb in endpoint to fix ept used-after-free" on Oct 21, 2023
@yintao707 changed the title from "rpmsg: add incref_cb and decref_cb in endpoint to fix ept used-after-free" to "rpmsg: add incref_cb and decref_cb in ept to fix ept used-after-free" on Oct 21, 2023
@yintao707
Author

Hi @arnopo, I updated this PR because I found that if the reference count is handled only in the OpenAMP layer, the following issue may arise.
Assume this is my rpmsg_service thread. Under normal circumstances, I would increment the reference count in my_init(). If no ept_cb is executing at the time, I would release ept->priv in my_deinit().

rpmsg_service thread

void my_init()
{
    ...
    rpmsg_create_ept(ept);
    //ept->ref = 1;
    ...
}

void my_deinit()
{
    ...
    /* Assuming ept_cb is not currently executing */
    rpmsg_destroy_ept(ept);
    /*   -> calls release_cb(ept):
     *        refcnt drops from 1 to 0
     *        free(ept->priv);
     */
}

But if there is another thread that actively calls rpmsg_destroy_ept through my_stop_remote, and the previous reference count is 1, after rpmsg_destroy_ept is executed, release_cb will release ept->priv. If the rpmsg_service thread continues to access ept->priv, a problem may occur.

other thread

void my_stop_remote()
{
    /* Traverse all registered services. */
    metal_list_for_each(g_rp_service, node)
    {
        rpmsg_xxx_device_destroy();
        /*   -> calls rpmsg_destroy_ept(ept)
         *        -> calls release_cb, which frees ept->priv
         */
    }
}

Therefore, I have changed the original release_cb to incref_cb and decref_cb, allowing the service to control the usage of reference counts. This can mitigate the aforementioned problem.

@xiaoxiang781216 @arnopo, please help review this modification. Thanks.

Collaborator

@xiaoxiang781216 xiaoxiang781216 left a comment

LGTM.

@arnopo
Collaborator

arnopo commented Oct 24, 2023

But if there is another thread that actively calls rpmsg_destroy_ept through my_stop_remote, and the previous reference count is 1, after rpmsg_destroy_ept is executed, release_cb will release ept->priv. If the rpmsg_service thread continues to access ept->priv, a problem may occur.

This is not crystal clear to me: "If the rpmsg_service thread continues to access ept->priv, a problem may occur."
When the release callback is called, this means that no one is using the ept, right?
If you set ept->priv to NULL in the release callback, does that prevent the issue?

Therefore, I have changed the original release_cb to incref_cb and decref_cb, allowing the service to control the usage of reference counts. This can mitigate the aforementioned problem.

Your last version looks more like a hack to me. If the release_cb does not solve the issue, we probably have to consider another approach.

@yintao707
Author

Hi @arnopo, I can solve the problem mentioned above at the application layer, so using release_cb and refcnt may be the better choice. I have updated this PR. Thanks.

@yintao707 changed the title from "rpmsg: add incref_cb and decref_cb in ept to fix ept used-after-free" to "rpmsg: add release cb and refcnt in endpoint to fix ept used-after-free" on Oct 30, 2023
Collaborator

@arnopo arnopo left a comment

2 remaining comments not yet addressed, plus 2 new ones.

@yintao707
Author

Done, thanks!

Collaborator

@arnopo arnopo left a comment

LGTM, thanks!

@@ -515,6 +515,7 @@ static void rpmsg_virtio_rx_callback(struct virtqueue *vq)
/* Get the channel node from the remote device channels list. */
metal_mutex_acquire(&rdev->lock);
ept = rpmsg_get_ept_from_addr(rdev, rp_hdr->dst);
rpmsg_ept_incref(ept);
Collaborator

@yintao707, @arnopo does it make sense to move the rpmsg_ept_incref() call into the rpmsg_get_endpoint API? If the endpoint is retrieved successfully, we increase the refcount. The get_endpoint API may be called elsewhere in the future, so increasing the refcount there would signal that the endpoint is in use.

Collaborator

@arnopo It looks like refcnt tracks whether a callback is in progress, rather than how many times the endpoint is being used by multiple threads via rpmsg_get_endpoint. Is that correct?

If so, can we update the documentation accordingly?

Collaborator

@yintao707, @arnopo does it make sense to move the rpmsg_ept_incref() call into the rpmsg_get_endpoint API? If the endpoint is retrieved successfully, we increase the refcount. The get_endpoint API may be called elsewhere in the future, so increasing the refcount there would signal that the endpoint is in use.

I would prefer not to hide it in rpmsg_get_endpoint, and to address this only if we need to export the rpmsg_get_endpoint API in the future.

@arnopo It looks like refcnt tracks whether a callback is in progress, rather than how many times the endpoint is being used by multiple threads via rpmsg_get_endpoint. Is that correct?

If so, can we update the documentation accordingly?

Is the documentation header for rpmsg_ept_incref not explicit enough for you?
Could you provide more details on which part of the documentation you would like to see updated?

Collaborator

The documentation for ept_incref looks good. But the documentation for the refcnt variable should be updated as well:

/** Reference count of the endpoint */
uint32_t refcnt;

The comment above gives the impression that refcnt counts references to the endpoint object. In fact, refcnt is used to track whether callback execution is in progress. The variable's documentation should be updated accordingly.

Collaborator

I would prefer not to hide it in rpmsg_get_endpoint and address this only if we need to export the rpmsg_get_endpoint API in the future.

Ok sounds good.

Collaborator

@yintao707, could you please address @tnmysh's comment above about the refcnt documentation so that we can merge this?

Author

@arnopo @tnmysh, thank you for your suggestion. I have modified the comment on refcnt. Could you help review whether this change is appropriate?

@GUIDINGLI
Contributor

LGTM

If the rpmsg service frees the ept after rpmsg_virtio_rx_callback has
fetched it from the ept list, there is a use-after-free of the ept.
So add a refcnt to the endpoint and call the rpmsg service release
callback when the ept callback has finished.

Signed-off-by: Bowen Wang <[email protected]>
@arnopo arnopo merged commit b33183f into OpenAMP:main Nov 29, 2023
3 checks passed
@arnopo arnopo added this to the Release V2024.04 milestone Jan 3, 2024