Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactoring informer logic for type safety and to refine the role of stores #3008

Closed
wants to merge 5 commits into from

Conversation

shawkins
Copy link
Contributor

Description

While looking at changes for #2993, #2995, and another bug that has not been logged here. It appears there are some clarifications that could be made to improve the readability and the safety of the informer code. The central change is that Store should be more type specific and have a narrower interface - I realize this differs from the go implementation and this does change the public interface of SharedIndexInformer.getIndexer because the Store will no longer have replace, resync, or isPopulated. If it's a problem to make such a change it can of course be refined (previous code, all no-ops, or throw not implemented exceptions). In general it seems problematic that SharedIndexInformer.getIndexer returns as broad of an interface as it does - for example in DefaultSharedIndexInformer.addIndexers you can't add an indexer once running, but if you call SharedIndexInformer.getIndexer.addIndexers you can.

If you'd prefer another interface such as ResyncableStore vs direct usage of the DeltaFIFO in ResyncRunnable, Reflector, and ReflectorWatcher - that can be done as well, it just didn't seem needed given the single implementation of DeltaFIFO. I do see several implementations of Store types in go.

This also moves the responsibility for the lastResourceVersion to the DeltaFIFO - having it updatable outside the DeltaFIFO lock is dangerous. With the code pre #2812 it appears to allow for timing issues in threads updating the DeltaFIFO/lastResourceVersion:

  1. resyncExecutor thread performs scheduled relist - sees state s1, sets the last resourceVersion to s1
  2. watch thread via ReflectorWatcher - sees an update to state s2, calls DeltaFIFO.update and sets last resourceVersion to s2
  3. resyncExecutor thread continues - calls DeltaFIFO.replace with state s1, adds a syncronization event to state s1 after the update event

The event generated from 2 will be an update showing s1 as the old state and s2 as the new
The event generated from 3 will be an update showing s2 as the old state and s1 as the new

A reconnect won't fix that as the last resourceVersion is s2.

In theory though on the next relist there will be a synchronization and event to s2 (or whatever the latest state is). However with the pre #2812 code any uncaught exception from relist will inhibit the job from running again.

Granted this should no longer happen after #2812 because the scheduled relist has been removed, but it is still something that should be guarded against.

@rohanKanojia or others let me know if you are open to these types of changes, I can refine this PR as needed.

Type of change

  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I Added CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

also moves the responsibility for the lastResourceVersion to the
DeltaFIFO - having it updatable outside the DeltaFIFO lock is dangerous

this does not include the changes for fabric8io#2993 nor fabric8io#2995
@centos-ci
Copy link

Can one of the admins verify this patch?

@shawkins shawkins changed the title refactoring for type safety and to refine the role of stores refactoring informer logic for type safety and to refine the role of stores Apr 17, 2021
@sonarcloud
Copy link

sonarcloud bot commented Apr 19, 2021

@shawkins shawkins marked this pull request as ready for review April 21, 2021 11:59
@shawkins
Copy link
Contributor Author

I'll rebase this after 5.3.1 - it appears that quite a few changes will make it into that release. And some of this is already being addressed by things like #3031

@manusa manusa added this to the 5.4.0 milestone Apr 23, 2021
@manusa
Copy link
Member

manusa commented Apr 26, 2021

I'll rebase this after 5.3.1

Hi @shawkins, 5.3.1 was just released

@shawkins
Copy link
Contributor Author

@manusa thank you. I think the way forward will be to split this into several prs. I've started with #3048 - that incorporates some the changes from here.

After that, can we look at #3029

Then I will include the changes that move the last resource version responsibility onto the store.

Finally I'll open a pr that refines the Store interface.

@shawkins shawkins mentioned this pull request Apr 27, 2021
8 tasks
@shawkins
Copy link
Contributor Author

shawkins commented May 4, 2021

I'll now replace this pr with one that refines the store interface. This will be for after the deltafifo removal #3061 and ideally after a cleanup of the threading concerns for informers: shawkins#2

@shawkins shawkins closed this May 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants