
[Enhance] The best choice of pvc accessMode should be ReadWriteMany instead of ReadWriteOnce #539

Closed
lianneli opened this issue Jun 4, 2024 · 6 comments
Labels
enhancement New feature or request

Comments


lianneli commented Jun 4, 2024

Describe the current behavior

Currently, a ReadWriteOnce PVC is generated whenever a CN pod is created. This design causes the data cache to be rebuilt many times when HPA is used and the CN pod count changes. I think ReadWriteMany is the more appropriate choice in shared-data mode: there is only one storage directory, and the number of CN pods should not affect the data or the data cache.

Describe the enhancement

I would like a variable in values.yaml to set the accessMode to ReadWriteMany.
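
A minimal sketch of what such a knob might look like. The key names below are illustrative assumptions, not the current kube-starrocks chart schema, and an RWX-capable storage class would be required:

```yaml
# Hypothetical values.yaml snippet -- key names are illustrative, not the
# current chart schema.
starrocksCnSpec:
  storageSpec:
    name: cn-cache
    storageSize: 100Gi
    storageClassName: nfs-client   # must support ReadWriteMany (e.g. NFS, CephFS)
    # Proposed: allow overriding the access mode (today it is always ReadWriteOnce).
    accessModes:
      - ReadWriteMany
```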

Additional context

lianneli added the enhancement label Jun 4, 2024
yandongxiao (Collaborator) commented Jun 4, 2024

I'm not quite sure I understand what you mean. Each CN pod has its own independent PVC, and when a pod is deleted, its corresponding PVC is not deleted.

When a pod is recreated, are you saying that the cache data in the volume corresponding to that CN pod becomes completely invalid? And why would ReadWriteMany solve that problem?


lianneli commented Jun 4, 2024

> I'm not quite sure I understand what you mean. Each CN pod has its own independent PVC, and when a pod is deleted, its corresponding PVC is not deleted.
>
> When a pod is recreated, are you saying that the cache data in the volume corresponding to that CN pod becomes completely invalid? And why would ReadWriteMany solve that problem?

Sorry for the confusion.
I drew a picture to describe the issue more clearly.
[image: diagram illustrating the issue]

As the picture above shows, ReadWriteOnce leads to several shortcomings:

  • data and data cache are replicated per CN pod, which wastes storage
  • rebuilding the data cache whenever a CN pod scales up is inefficient
  • CN nodes no longer focus purely on computation; they also have to bind or create PVCs holding data and data cache, which is not much different from what a BE node does

I think ReadWriteMany may be more appropriate for this scenario, e.g. a single shared cache volume as sketched below.
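
For illustration only, this is roughly what a single shared cache volume could look like; the claim name, storage class, and size are assumptions, and the current operator does not generate anything like this:

```yaml
# Hypothetical shared RWX PVC mounted by all CN pods -- illustrative only,
# not something the StarRocks operator creates today.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cn-shared-data-cache
spec:
  accessModes:
    - ReadWriteMany            # needs an RWX-capable storage class (NFS, CephFS, ...)
  storageClassName: nfs-client
  resources:
    requests:
      storage: 500Gi
# Each CN pod would then reference the same claim instead of its own PVC:
#   volumes:
#     - name: data-cache
#       persistentVolumeClaim:
#         claimName: cn-shared-data-cache
```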


yandongxiao commented Jun 4, 2024

There are a few issues that need to be addressed here:

  1. Currently, the Operator creates a StatefulSet with a PVC for each pod, so a separate volume is created per pod. It therefore does not yet support multiple CN pods mounting the same volume.
  2. Different CN nodes should have distinct cache data. There is only one replica in shared-data mode.
  3. It is uncertain whether multiple CN nodes can share the same cache.

May I ask, have you used the ReadWriteMany mode before?
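
To illustrate point 1: a StatefulSet's volumeClaimTemplates stamps out one ReadWriteOnce PVC per replica, so each CN pod gets its own volume. A simplified sketch, with field values that are illustrative rather than the exact manifest the operator generates:

```yaml
# Simplified sketch of the per-pod PVC behavior -- names and sizes are illustrative.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: starrocks-cn
spec:
  serviceName: starrocks-cn
  replicas: 3
  selector:
    matchLabels:
      app: starrocks-cn
  template:
    metadata:
      labels:
        app: starrocks-cn
    spec:
      containers:
        - name: cn
          image: starrocks/cn:latest   # image name illustrative
          volumeMounts:
            - name: cn-cache
              mountPath: /opt/starrocks/cn/storage
  volumeClaimTemplates:
    - metadata:
        name: cn-cache
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
# Kubernetes creates cn-cache-starrocks-cn-0, cn-cache-starrocks-cn-1, ... --
# one PVC per pod, and the PVCs are not deleted when the pods are removed.
```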


lianneli commented Jun 4, 2024

> There are a few issues that need to be addressed here:
>
>   1. Currently, the Operator creates a StatefulSet with a PVC for each pod, so a separate volume is created per pod. It therefore does not yet support multiple CN pods mounting the same volume.
>   2. Different CN nodes should have distinct cache data. There is only one replica in shared-data mode.
>   3. It is uncertain whether multiple CN nodes can share the same cache.
>
> May I ask, have you used the ReadWriteMany mode before?

I have used ReadWriteMany mode through the Spark history server, so I guessed a unified data cache might be more efficient when CN nodes scale up and down. It's just a suggestion.

The data cache key structure may support a unified cache, since it is not tied to a specific CN or BE.
[image: screenshot of the data cache key structure]

yandongxiao (Collaborator) commented Jun 4, 2024

To my knowledge, a unified data cache is not currently supported by StarRocks.


lianneli commented Jun 4, 2024

Got it, thanks a lot.

lianneli closed this as completed Jun 4, 2024