Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compatibility with segment replication #2916

Closed
Tracked by #8211
dreamer-89 opened this issue Jun 29, 2023 · 17 comments
Closed
Tracked by #8211

Compatibility with segment replication #2916

dreamer-89 opened this issue Jun 29, 2023 · 17 comments
Assignees
Labels
enhancement New feature or request v2.10.0 Issues targeting release v2.10.0

Comments

@dreamer-89
Copy link
Member

dreamer-89 commented Jun 29, 2023

Summary

With 2.9.0 release, there are lot of enhancements going in for segment replication[1][2] feature (went GA in 2.7.0), we need to ensure different plugins are compatible with current state of this feature. Previously, we ran tests on plugin repos to verify this compatibility but want plugin owners to be aware of these changes so that required updates (if any) can be made. With 2.10.0 release, remote store feature is going GA which internally uses SEGMENT replication strategy only i.e. it enforces all indices to use SEGMENT replication strategy. So, it is important to validate plugins are compatible with segment replication feature.

What changed

1. Refresh policy behavior

  1. RefreshPolicy.IMMEDIATE will only refresh primary shards but not replica shards immediately. Instead post refresh, primary will start a round of segment replication to update the replica shard copies leading to eventual consistency.
  2. RefreshPolicy.WAIT_UNTIL ensures the indexing operation is searchable in your cluster i.e. RAW (Read after write guarantee). With segment replication, this guarantee is not promised due to delay in replica shared updates from asynchronous background refreshes.

2. Refresh lag on replicas

With segment replication, there is inherent delay in documents to be searchable on replica shard copies. This is due to the fact that replica shard copies over data (segment) files from primary. Thus, compared to document replication, there will be on average increase in amount of time the replica shards are consistent with primaries.

3. System/hidden indices support

With opensearch-project/OpenSearch#8200, system and hidden indices are now supported with SEGMENT replication strategy. We need to ensure there are no bottlenecks which prevents system/hidden indices with segment replication.

Next steps

With segment replication strong reads are not guaranteed. Thus, if the plugin needs strong reads guarantees specially as alternative to change in behavior of refresh policy and lag on replicas (point 1 and 2 above), we need to update search requests to target primary shard only. With opensearch-project/OpenSearch#7375, core now supports primary shards only based search. Please follow documentation for examples and details

Open questions

In case of any questions or issues, please post it in core issue

Reference

[1] Design

[2] Documentation

@dreamer-89 dreamer-89 added enhancement New feature or request untriaged Require the attention of the repository maintainers and may need to be prioritized labels Jun 29, 2023
@dreamer-89
Copy link
Member Author

Request owners to add v2.9.0 label on this issue.

@gaiksaya gaiksaya added the v2.9.0 v2.9.0 label Jul 3, 2023
@stephen-crawford
Copy link
Contributor

[Triage] Duplicate from Dashboards plugin version.

@dreamer-89
Copy link
Member Author

@scrawfor99: Can you please share the duplicate issue that you are referring here ?

@dreamer-89
Copy link
Member Author

@scrawfor99 : Do you mind sharing the duplicate issue you referred above.

@stephen-crawford
Copy link
Contributor

Hi @dreamer-89, so sorry I missed your original tag. I was referring to this issue since Security and Security Dashboards are coupled: opensearch-project/security-dashboards-plugin#1494.

@stephen-crawford
Copy link
Contributor

To clear up the confusion: we will be leaving this issue open as the tracking issue for seg rep compatibility. The Sec Dashboards issue is closed. This will stay open and I will label it 2.10.

@stephen-crawford stephen-crawford added v2.10.0 Issues targeting release v2.10.0 and removed v2.9.0 v2.9.0 labels Aug 22, 2023
@dreamer-89
Copy link
Member Author

Copy pasting @cwperks comment from security-dashboards-plugin issue.


@dreamer-89 Yes, the security repo. The security-dashboards-plugin repo utilizes the .kibana-* indices to support multi-tenancy, but those indices are not system indices the way that .opendistro_security (the security index) is. The security repo requires strong consistency for the security index to ensure that every node reading from the security index sees the freshest data from the index. There was a PR opened in the security repo to set the preference of the mget request that the security plugin performs when reading from the security index, but the PR was closed. See discussion on the PR here: #2903

security issue: #2916

@dreamer-89
Copy link
Member Author

dreamer-89 commented Aug 22, 2023

Copy pasting @cwperks comment from security-dashboards-plugin issue.

@dreamer-89 Yes, the security repo. The security-dashboards-plugin repo utilizes the .kibana-* indices to support multi-tenancy, but those indices are not system indices the way that .opendistro_security (the security index) is. The security repo requires strong consistency for the security index to ensure that every node reading from the security index sees the freshest data from the index. There was a PR opened in the security repo to set the preference of the mget request that the security plugin performs when reading from the security index, but the PR was closed. See discussion on the PR here: #2903

security issue: #2916

Thank you @cwperks for the explanation and linking the PR #2903 where _primary shard based routing was attempted. Just wanted to update you that with opensearch-project/OpenSearch#8536, core now also supports realtime reads for segment replication enabled indices. Thus, if you are using get/mget APIs, you should receive the latest data irrespective of replication strategy. If security utilises get/mget API for data retrieval, I think there is no action needed from your end. Please do verify once that get/mget works as intended with segrep (via tests?)

Please check opensearch-project/OpenSearch#8536 for more details.

@peternied
Copy link
Member

[Triage] @dreamer-89 The 2.10 release is coming up, what work needs to be done to resolve this issue for that release. Doesn't seem like we need to do anything can you confirm and close if that is the case?

@peternied peternied removed the untriaged Require the attention of the repository maintainers and may need to be prioritized label Aug 28, 2023
@dreamer-89
Copy link
Member Author

dreamer-89 commented Aug 28, 2023

Thanks @peternied for the comment. I have already called out ask in the description. Let me know if you have any questions there.

With opensearch-project/OpenSearch#8536, core now supports realtime reads via get/mget APIs for SEGMENT replication enabled indices. If security ONLY relies on get/mget APIs for strong reads, then there is no change needed. If your plugin relies on strong reads via write path i.e. using IMMEDIATE/WAIT_UNTIL refresh policy during indexing operations, then you need to use get/mget APIs or use _primary preference in search requests. The later is not recommended as it results in all request been handled by primary also called out here. I will let plugin owners to validate that they are compatible with SEGMENT replication.

CC @cwperks @scrawfor99

@dreamer-89
Copy link
Member Author

Ping @scrawfor99 @cwperks @peternied @davidlago as we are approaching code freeze date for 2.10 release. Please have a look.

@stephen-crawford
Copy link
Contributor

Hi @dreamer-89, the security plugin requires strong reads. In the past it was shared that seg. rep. was intending to offer support of this by 2.10: #2903 (comment).

@dreamer-89
Copy link
Member Author

dreamer-89 commented Aug 30, 2023

Hi @dreamer-89, the security plugin requires strong reads. In the past it was shared that seg. rep. was intending to offer support of this by 2.10: #2903 (comment).

Thanks @scrawfor99 for the update. Yes, opensearch-project/OpenSearch#8536 is resolved on core which ensures strong reads for get/mget APIs.

Also, please note DOCUMENT replication guarantees RAW on the write path by using IMMEDIATE or WAIT_UNTIL refresh policy, which ensure all replica shards are updated with same indexing operation. With SEGMENT replication this guarantee does not holds true. Thus, if you are using IMMEDIATE, WAIT_UNTIL refresh policy, then for follow up search requests (where you need RAW), you need to use either get/mget APIs or use _primary preference parameter. The former is preferred as later puts extra load on primary shard.

@stephen-crawford
Copy link
Contributor

Hi @dreamer-89, thank you for the follow-up. Sorry but I am not sure I understand what needs to be done... I thought that the changes were going to make it so that strong reads were supported meaning the current behavior of the security plugin could remain unchanged?

From your comment

please note DOCUMENT replication guarantees RAW on the write path by using IMMEDIATE or WAIT_UNTIL refresh policy, which ensure all replica shards are updated with same indexing operation. With SEGMENT replication this guarantee does not holds true. Thus, if you are using IMMEDIATE, WAIT_UNTIL refresh policy, then for follow up search requests (where you need RAW), you need to use either get/mget APIs or use _primary preference parameter. The former is preferred as later puts extra load on primary shard.

It sounds like do need to swap something though?

If you look at this PR: #2903, you can see the previous attempt to resolve the issue. Do we simply need to move forward with that PR again or what is required?

Thank you again.

@mch2
Copy link
Member

mch2 commented Aug 31, 2023

Hi @scrawfor99,

I think all thats required here is to remove the _primary based preference here because it is no longer required on get/mget requests, and the pr you linked is using an mget. This will be realtime with SR internally. Going forward only search requests would require any primary based routing using these parameters.

@mch2
Copy link
Member

mch2 commented Aug 31, 2023

Ah i see that was never merged, I do not think there is anything required here on security side.

@stephen-crawford
Copy link
Contributor

Hi @mch2, thank you for following up. My apologies for the confusion here. I know the closing of the issue and re-opening etc. could have contributed to this so glad we could get everything squared away. I am going to close this issue then to signal Security is all set.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request v2.10.0 Issues targeting release v2.10.0
Projects
None yet
Development

No branches or pull requests

5 participants