[CCR] Auto follow patterns #33007

Closed
10 tasks done
martijnvg opened this issue Aug 21, 2018 · 3 comments
Assignees
Labels
:Distributed/CCR Issues around the Cross Cluster State Replication features Meta

Comments

@martijnvg
Member

martijnvg commented Aug 21, 2018

Tasks

  • Implement auto follow patterns feature as described (minus tasks below). [CCR] Added auto follow patterns feature #33118
  • Add license checks to the put and delete apis. Also add a license check to AutoFollowCoordinator so that it does not try to auto follow if the license has expired. Otherwise we fail when the follow api gets invoked and the errors get messy, which is avoidable. (@jasontedor) (yes) Add license checks for auto-follow implementation #33496
  • Ensure that auto follow patterns work correctly with security. Security headers will need to be stored in AutoFollowMetadata (similar to how headers are stored in the shard follow persistent task). (@martijnvg) (yes)
  • Add a component that purges the UUIDs of already followed leader indices from the AutoFollowMetadata once those leader indices have been removed. (no) [CCR] Clean followed leader index UUIDs in auto follow metadata #36408
  • Consider this situation: the AutoFollowCoordinator follows a leader index, but fails to record in the AutoFollowMetadata that it has followed that leader. To keep the AutoFollowCoordinator from repeatedly trying to follow this index, there should be an additional check that verifies whether a follow index already has that leader index in its custom index metadata. If so, the AutoFollowCoordinator should just update the followedLeaderIndexUUIDs entry in the AutoFollowMetadata instead of first trying to follow that index and then updating the entry. (no) [CCR] AutoFollowCoordinator and follower index already created #36540
  • Add get auto follow patterns api. (@martijnvg) (yes)
  • The component should keep track of statistics such as last_checked_imd_version and followed indices. These stats can then be exposed via an auto-follow stats api (GET /_ccr/auto-follow/_stats). (@martijnvg) (yes) [CCR] Changed AutoFollowCoordinator to keep track of certain statistics #33684
  • The component should fetch leader cluster states via long polling instead of periodic polling as is described here. We could add index_metadata_version parameter to the cluster state api, that would make the cluster state api only return if cluster state’s index metadata version is higher than is specified in this parameter. (no) [CCR] Change AutofollowCoordinator to use wait_for_metadata_version #36264
  • Improve the auto follow stats api to include the time since the auto follow coordinator last fetched cluster state of all of the remote clusters. This is useful to see whether the auto follow coordinator is alive and functional. [CCR] Add time since last auto follow fetch to auto follow stats #36542
  • Fail once if a leader index that matches an auto follow pattern cannot be followed because soft deletes are not enabled. Currently those indices are silently ignored, which is confusing if a user expects a specific leader index to be followed. We need to ensure we fail once and that this failure is reported in the auto follow stats. Currently, if auto following fails, the auto follower retries on each run; for this kind of error that does not make sense, because if an index doesn't have soft deletes enabled then that will never change (it is controlled via a final setting). [CCR] Report error if auto follower tries auto follow a leader index with soft deletes disabled #36886

Description

Auto Following Patterns is a cross cluster replication feature that tracks whether indices are being created in the leader cluster with names that match a specific pattern and, if so, automatically lets the follower cluster follow these newly created indices.

The auto follow patterns are managed via a put auto follow api:

PUT /_ccr/_autofollow/{{remote_cluster}}
{
   "leader_index_pattern": ["logs-*"], 
   "follow_index_pattern": "{{leader_index}}-copy",
   "max_concurrent_read_batches": 2
   ... // other optional parameters
}

The follow index name defaults to the leader index name. In certain cases (e.g. following an index in the same cluster) this is unwanted, and the follow_index_pattern parameter can be used to pick a different name.
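
To make the naming rule concrete, here is a minimal Python sketch of how a follow index name could be derived. It assumes the pattern is a plain string in which the literal placeholder `{{leader_index}}` is substituted; the function name `resolve_follow_index_name` is hypothetical and not taken from the implementation.

```python
LEADER_INDEX_PLACEHOLDER = "{{leader_index}}"


def resolve_follow_index_name(leader_index, follow_index_pattern=None):
    """Return the name the follow index would get for a given leader index.

    Assumption: the placeholder is replaced textually, and a missing
    pattern means the follow index keeps the leader index name.
    """
    if follow_index_pattern is None:
        # Default described above: follow index name equals leader index name.
        return leader_index
    return follow_index_pattern.replace(LEADER_INDEX_PLACEHOLDER, leader_index)
```

With the pattern from the request above, `resolve_follow_index_name("logs-2018.08.21", "{{leader_index}}-copy")` yields `logs-2018.08.21-copy`.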

This api will also support other parameters (max_concurrent_read_batches etc.) that the create_and_follow api supports. These parameters will be used instead of the defaults when the auto follow feature is invoking the create_and_follow api.

Auto follow patterns are removed via the delete auto follow api:

DELETE /_ccr/_autofollow/{{remote_cluster_alias}}

The auto follow patterns are stored as custom metadata in the cluster state.

The follow cluster should have a component that periodically checks the cluster states of the leader clusters (how many depends on the number of remote cluster aliases being followed) to see whether new indices have been created that match the patterns specified in the put autofollow api. If that is the case, this component invokes the create_and_follow api for each matching new index. The follow cluster will use the cluster state api to fetch cluster states from leader clusters. How often this component checks remote clusters for newly created indices depends on a poll interval setting (‘ccr.auto_follow.poll_interval’).

The component needs to keep track of the indices for which it has already invoked the create_and_follow api. The UUIDs of these indices should be saved in the auto follow custom metadata. If a new pattern is added, the component should not follow existing indices matching this pattern, only indices created after the pattern was added. This is achieved by adding the index UUIDs of already existing indices to the autofollow custom metadata (without actually following these indices). The component also needs to keep track of indices in the leader cluster that were auto followed and then removed. These index UUIDs need to be pruned from the custom metadata.
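
The UUID bookkeeping described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: leader indices are modelled as a `name -> uuid` dict, all function names are hypothetical, and Python's `fnmatchcase` stands in for Elasticsearch's own wildcard matching (which may differ in details).

```python
from fnmatch import fnmatchcase


def matches(patterns, index_name):
    # Stand-in for the real pattern matching; only "*" wildcards are assumed.
    return any(fnmatchcase(index_name, pattern) for pattern in patterns)


def uuids_to_mark_on_put(patterns, leader_indices):
    """UUIDs of already existing matching indices, recorded as followed
    (without following them) when a new pattern is added."""
    return {uuid for name, uuid in leader_indices.items() if matches(patterns, name)}


def indices_to_follow(patterns, leader_indices, followed_uuids):
    """Matching leader indices whose UUID has not been recorded yet."""
    return sorted(
        name
        for name, uuid in leader_indices.items()
        if matches(patterns, name) and uuid not in followed_uuids
    )


def prune_followed_uuids(followed_uuids, leader_indices):
    """Drop UUIDs of leader indices that no longer exist."""
    return followed_uuids & set(leader_indices.values())
```

Recording UUIDs rather than names means a leader index that is deleted and recreated under the same name counts as a new index.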

The component can be implemented by a simple task that runs on the elected master node. In the background it schedules a task (ThreadPool#schedule(...)) that checks whether new leader indices need to be followed in remote clusters.

Relates to #30086

@martijnvg martijnvg added Meta :Distributed/CCR Issues around the Cross Cluster State Replication features labels Aug 21, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@martijnvg martijnvg self-assigned this Aug 21, 2018
@martijnvg
Member Author

I've updated the description of this issue to remove the fact that auto follow patterns are stored as dynamic cluster settings and use dedicated apis to manage auto follow patterns instead.

When auto follow patterns are updated, existing leader indices matching these patterns need to be marked as followed (it is expected that only newly created indices should be followed automatically). In order to do this, remote calls need to be made, and there is no opportunity to do this when a settings update consumer gets executed. So it is better to manage auto follow patterns via dedicated apis.

I hoped that dynamic cluster settings would be enough, but this turned out not to be the case.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Aug 24, 2018
Auto Following Patterns is a cross cluster replication feature that
tracks whether indices are being created in the leader cluster with
names that match a specific pattern and, if so, automatically lets
the follower cluster follow these newly created indices.

This change adds an `AutoFollowCoordinator` component that is only active
on the elected master node. Periodically this component checks the
cluster state of remote clusters for new leader indices that
match configured auto follow patterns that have been defined in
`AutoFollowMetadata` custom metadata.

This change also adds two new APIs to manage auto follow patterns. A put
auto follow pattern api:

```
PUT /_ccr/_autofollow/{{remote_cluster}}
{
   "leader_index_pattern": ["logs-*", ...],
   "follow_index_pattern": "{{leader_index}}-copy",
   "max_concurrent_read_batches": 2
   ... // other optional parameters
}
```

and delete auto follow pattern api:

```
DELETE /_ccr/_autofollow/{{remote_cluster_alias}}
```

The auto follow patterns are directly tied to the remote cluster aliases
configured in the follow cluster.

Relates to elastic#33007
martijnvg added a commit that referenced this issue Sep 6, 2018
Relates to #33007


Co-authored-by: Jason Tedor [email protected]
martijnvg added a commit that referenced this issue Sep 6, 2018
Relates to #33007

Co-authored-by: Jason Tedor <[email protected]>
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Sep 7, 2018
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Sep 13, 2018
The following stats are being kept track of:
1) The total number of times that auto following a leader index succeeded.
2) The total number of times that auto following a leader index failed.
3) The total number of times that fetching a remote cluster state failed.
4) The most recent 256 auto follow failures per auto leader index
   (e.g. create_and_follow api call fails) or cluster alias
   (e.g. fetching remote cluster state fails).

Each auto follow run now produces a result that is being used to update
the stats being kept track of in AutoFollowCoordinator.

Relates to elastic#33007
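
As a rough illustration of the bookkeeping described in this commit message, the Python sketch below keeps the three counters plus a bounded map of recent errors. The class and method names are hypothetical, and it assumes one interpretation of point 4: the latest error is kept per leader index or cluster alias, and the oldest entry is evicted once more than 256 keys are tracked.

```python
from collections import OrderedDict

MAX_RECENT_ERRORS = 256  # limit taken from the commit message


class AutoFollowStats:
    def __init__(self, max_recent_errors=MAX_RECENT_ERRORS):
        self.successful_follows = 0
        self.failed_follows = 0
        self.failed_remote_state_requests = 0
        # key: leader index name or cluster alias -> latest error message
        self.recent_errors = OrderedDict()
        self._max_recent_errors = max_recent_errors

    def _record_error(self, key, message):
        # Re-insert so the key becomes the most recent entry.
        self.recent_errors.pop(key, None)
        self.recent_errors[key] = message
        while len(self.recent_errors) > self._max_recent_errors:
            self.recent_errors.popitem(last=False)  # evict the oldest entry

    def follow_succeeded(self):
        self.successful_follows += 1

    def follow_failed(self, leader_index, message):
        self.failed_follows += 1
        self._record_error(leader_index, message)

    def remote_state_failed(self, cluster_alias, message):
        self.failed_remote_state_requests += 1
        self._record_error(cluster_alias, message)
```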
martijnvg added a commit that referenced this issue Sep 18, 2018
…cs (#33684)
Relates to #33007
martijnvg added a commit that referenced this issue Sep 18, 2018
…cs (#33684)
Relates to #33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Sep 18, 2018
GET /_ccr/auto_follow/stats

Returns:

```
{
   "number_of_successful_follow_indices": ...
   "number_of_failed_follow_indices": ...
   "number_of_failed_remote_cluster_state_requests": ...
   "recent_auto_follow_errors": [
      ...
   ]
}
```

Relates to elastic#33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Sep 19, 2018
martijnvg added a commit that referenced this issue Sep 20, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Sep 20, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Sep 24, 2018
martijnvg added a commit that referenced this issue Sep 24, 2018
kcm pushed a commit that referenced this issue Oct 30, 2018
Relates to #33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Nov 29, 2018
…te cluster

and replaced the poll interval setting with a hard-coded poll interval.
The hard-coded interval will be removed in a follow-up change to make
use of the cluster state API's wait_for_metadata_version.

Originates from elastic#35895
Relates to elastic#33007
martijnvg added a commit that referenced this issue Dec 5, 2018
…ster (#36031)

and replaced the poll interval setting with a hard-coded poll interval.
The hard-coded interval will be removed in a follow-up change to make
use of the cluster state API's wait_for_metadata_version.

Previously, auto following was bootstrapped from the thread pool scheduler,
but now auto followers for new remote clusters are bootstrapped when
a new cluster state is published.

Originates from #35895
Relates to #33007
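
A minimal sketch of what bootstrapping on cluster state publication could look like: on each new cluster state, the set of remote cluster aliases that have auto follow patterns is compared with the auto followers currently running. Everything here is hypothetical naming, and stopping auto followers for aliases that no longer have patterns is an assumption rather than something stated in the commit message.

```python
def reconcile_auto_followers(running, configured_aliases, start, stop):
    """Align running auto followers with the aliases in the new cluster state.

    running: dict of cluster alias -> auto follower handle (mutated in place)
    configured_aliases: set of aliases that currently have auto follow patterns
    start / stop: callbacks that create and shut down an auto follower
    """
    # Bootstrap an auto follower for every alias that does not have one yet.
    for alias in sorted(configured_aliases - set(running)):
        running[alias] = start(alias)
    # Assumption: auto followers without any remaining pattern are stopped.
    for alias in sorted(set(running) - configured_aliases):
        stop(running.pop(alias))
```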
martijnvg added a commit that referenced this issue Dec 5, 2018
…ster (#36031)
Originates from #35895
Relates to #33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 5, 2018
Changed AutoFollowCoordinator to make use of the wait_for_metadata_version
feature in the cluster state API and removed the hard-coded poll interval.

Originates from elastic#35895
Relates to elastic#33007
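
The difference from fixed-interval polling can be sketched as follows. The `fetch_state` callable is a hypothetical stand-in for a cluster state request that passes `wait_for_metadata_version`; it is assumed to block until the remote metadata version reaches the requested value and to return `None` when the wait times out. The state is modelled as a plain dict with a `metadata_version` key, and the loop is bounded by `max_rounds` only to keep the sketch terminating.

```python
def auto_follow_loop(fetch_state, on_new_state, max_rounds):
    """Long-poll a remote cluster state instead of sleeping between checks."""
    last_seen_version = 0
    for _ in range(max_rounds):
        # Ask only for a state that is newer than the one already processed.
        state = fetch_state(last_seen_version + 1)
        if state is None:
            # The wait timed out without a metadata change: ask again.
            continue
        last_seen_version = state["metadata_version"]
        on_new_state(state)
    return last_seen_version
```

Because each request names the next expected version, no run is triggered while the remote metadata is unchanged.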
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 8, 2018
The auto follow coordinator keeps track of the UUIDs of indices that it has followed. The index UUID strings need to be cleaned up in the case that these indices are removed in the remote cluster.

Relates to elastic#33007
martijnvg added a commit that referenced this issue Dec 12, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Dec 12, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Dec 12, 2018
…36264)
Originates from #35895
Relates to #33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 12, 2018
The AutoFollowCoordinator should be resilient to the fact that the follower
index has already been created and in that case it should only update
the auto follow metadata with the fact that the follower index was created.

Relates to elastic#33007
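
The decision described here can be sketched as a small function. The data shapes are assumptions made for illustration: `follower_indices` maps a follower index name to the leader index UUID recorded in its custom metadata, and the returned action names are hypothetical.

```python
def plan_auto_follow(leader_uuid, follower_indices, followed_uuids):
    """Decide what to do for one matching leader index."""
    if leader_uuid in followed_uuids:
        # Already recorded in the auto follow metadata: nothing to do.
        return "skip"
    if leader_uuid in follower_indices.values():
        # A follower index exists but the metadata update was lost:
        # only record the UUID instead of trying to follow again.
        return "record_only"
    return "follow_then_record"
```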
martijnvg added a commit that referenced this issue Dec 12, 2018
…36264)
Originates from #35895
Relates to #33007
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 12, 2018
For each remote cluster, the auto follow coordinator starts an auto
follower that checks the remote cluster state and determines whether an
index needs to be auto followed. The time since the last auto follow is
reported per remote cluster and gives insight into whether the auto follow
process is alive.

Relates to elastic#33007
Originates from elastic#35895
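
A minimal sketch of the per-cluster liveness measurement, assuming a monotonic clock and reporting in milliseconds. The class name, method names, and the shape of the result are hypothetical, not the fields of the actual stats response.

```python
import time


class AutoFollowerClock:
    def __init__(self, now=time.monotonic):
        self._now = now          # injectable clock, in seconds
        self._last_fetch = {}    # cluster alias -> time of last fetch

    def record_fetch(self, cluster_alias):
        self._last_fetch[cluster_alias] = self._now()

    def millis_since_last_fetch(self):
        """Elapsed milliseconds since the last fetch, per remote cluster."""
        now = self._now()
        return {
            alias: int((now - fetched_at) * 1000)
            for alias, fetched_at in self._last_fetch.items()
        }
```

A value that keeps growing for one alias suggests that the auto follower for that remote cluster has stalled.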
martijnvg added a commit that referenced this issue Dec 17, 2018
)
Relates to #33007
Originates from #35895
martijnvg added a commit that referenced this issue Dec 17, 2018
)
Relates to #33007
Originates from #35895
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 20, 2018
…with soft deletes disabled

Currently if a leader index with soft deletes disabled matches an auto follow pattern, this index is silently ignored.
This commit changes this behaviour to mark these indices as auto followed and report an error, which is
visible in auto follow stats. Marking the index as auto followed is important, because otherwise the auto follower
will continuously try to auto follow and fail.

Relates to elastic#33007
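
The fail-once behaviour can be sketched like this. The candidate shape (`name`, `uuid`, `soft_deletes_enabled`), the error message, and the `follow` callback are assumptions made for the example.

```python
def auto_follow_run(candidates, followed_uuids, errors, follow):
    """One auto follow run over leader indices that match a pattern."""
    for index in candidates:
        if index["uuid"] in followed_uuids:
            continue  # handled in an earlier run
        if not index["soft_deletes_enabled"]:
            # Report the problem once, then mark the index as auto followed
            # so that later runs do not retry an error that cannot go away.
            errors.append((index["name"], "soft deletes are not enabled"))
            followed_uuids.add(index["uuid"])
            continue
        follow(index["name"])
        followed_uuids.add(index["uuid"])
```

Running this twice over the same candidates records the error and calls `follow` only during the first run.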
martijnvg added a commit that referenced this issue Dec 20, 2018
…with soft deletes disabled (#36886)
Relates to #33007
martijnvg added a commit that referenced this issue Dec 20, 2018
…with soft deletes disabled (#36886)
Relates to #33007
martijnvg added a commit that referenced this issue Dec 20, 2018
…with soft deletes disabled (#36886)
Relates to #33007
martijnvg added a commit that referenced this issue Dec 24, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Dec 24, 2018
Relates to #33007
martijnvg added a commit that referenced this issue Dec 24, 2018
Relates to #33007
@martijnvg
Member Author

All tasks have been implemented 🎉


2 participants