Leader Election issue #434 #206

Invictus17 · 2020-07-31T23:11:00Z

Leader Election issue #434
kubernetes-client/python#434

k8s-ci-robot · 2020-07-31T23:11:07Z

Welcome @Invictus17!

It looks like this is your first PR to kubernetes-client/python-base 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-client/python-base has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

Invictus17 · 2020-07-31T23:12:46Z

/assign @mbohlool

Invictus17 · 2020-07-31T23:14:16Z

Hi @mbohlool @roycaihw , can you please review my PR?

yliaog · 2020-08-02T00:13:18Z

thanks for the PR.

could you please keep the file naming style consistent with the existing code? e.g. it would be better to rename LeaderElection to leaderelection

Invictus17 · 2020-08-02T02:31:14Z

Sure. I've pushed the changes.

roycaihw · 2020-08-03T16:18:35Z

/assign

palnabarun · 2020-08-03T16:33:03Z

/assign

leaderelection/electionconfig.py

roycaihw · 2020-08-06T22:42:10Z

leaderelection/electionconfig.py

+import sys
+
+
+class createConfig:


let's call it LeaderElectionConfig or Config

roycaihw · 2020-08-06T22:43:18Z

leaderelection/electionconfig.py

+
+class createConfig:
+    # Validate config, exit if an error is detected
+    def __init__(self, lock, leaseDuration, renewDeadline, retryPeriod, onStartedLeading, onStoppedLeading):


s/leaseDuration/lease_duration

roycaihw · 2020-08-06T22:44:12Z

leaderelection/electionconfig.py

+        if retryPeriod < 1:
+            sys.exit("retryPeriod must be greater than zero")
+
+        self.leaseDuration = leaseDuration


same again for code style

leaderelection/leaderelection.py

roycaihw · 2020-08-06T23:16:01Z

leaderelection/threadingwithexception.py

@@ -0,0 +1,31 @@
+import threading
+import ctypes


I'm not very excited if we have to add C compatibility to achieve this

I can look into a different way to kill threads.

You should be able to use stop flags and let the client check for it.

Yes, that's similar to the latest update, which uses traces (this commit is outdated now), but I also want to look into https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor as suggested by @yliaog in her review.
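A minimal sketch of the stop-flag idea mentioned above, assuming a `threading.Event` serves as the flag. The `worker` function and the timings are hypothetical and not taken from the PR:

```python
import threading
import time


def worker(stop_event, results):
    # Do small units of work and check the flag between them.
    while not stop_event.is_set():
        results.append("tick")
        # wait() doubles as a sleep that ends early once the flag is set.
        stop_event.wait(0.01)
    results.append("stopped")


stop_event = threading.Event()
results = []
thread = threading.Thread(target=worker, args=(stop_event, results))
thread.start()

time.sleep(0.05)
stop_event.set()  # ask the worker to stop
thread.join()     # wait until it has actually returned
```

The thread is never killed from outside; it exits on its own the next time it checks the flag, so no ctypes or trace hooks are needed.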

roycaihw · 2020-08-06T23:26:49Z

leaderelection/leaderelection.py

+            self.OnStoppedLeadingThread.start()
+
+            # Start to follow
+            self.follow(scheduler)


In your implementation, a program can keep switching between being a leader and a follower. However, in the client-go implementation, when the leader fails to renew the lease, it gives up and doesn't re-join as a follower. Should we keep the client-go behavior and let the user decide what to do when a leader loses the lease?

Sure. I can change the code so that the leader exits if it fails to update the lease.

roycaihw · 2020-08-06T23:28:33Z

leaderelection/leaderelection.py

+    def getLatestLeader(self):
+        getStatus, getResponse = self.electionConfig.lock.Get(name=self.electionConfig.lock.name, namespace=self.electionConfig.lock.namespace)
+        if getStatus:
+            return "leader is " + str(ast.literal_eval(getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey])['holderIdentity'])


why do we need literal_eval?

getResponse.metadata.annotations[self.electionConfig.lock.LeaderElectionRecordAnnotationKey] returns a string.
literal_eval converts that string to a dictionary so that ['holderIdentity'] can be easily accessed.

Would json.loads() work or is it not JSON?

No, it's not JSON.
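The difference can be illustrated with a small sketch, assuming the annotation value was written with `str()` on a dict. The record contents here are made up for illustration:

```python
import ast
import json

record = {"holderIdentity": "candidate-1", "leaseDurationSeconds": 17}

as_repr = str(record)         # Python repr of the dict, single-quoted keys
as_json = json.dumps(record)  # a JSON alternative, double-quoted keys

# The repr form is not valid JSON because JSON requires double quotes.
try:
    json.loads(as_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False

# Each format round-trips with its matching parser.
holder_from_repr = ast.literal_eval(as_repr)["holderIdentity"]
holder_from_json = json.loads(as_json)["holderIdentity"]
```

Writing the record with `json.dumps()` in the first place would let readers use `json.loads()`, and it would also keep the stored value readable by non-Python clients.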

Invictus17 · 2020-08-07T21:44:28Z

Hi @roycaihw , Thanks for your review. Based on your comments I've pushed the suggested changes.

changes:

Coding style - Updated variable and function names.
Threading with traces - This implementation does not depend on ctypes.
Leader exits - The leader does not become a follower if it fails to update lease. It now exits after running onStoppedLeading().

yliaog · 2020-08-09T00:35:50Z

leaderelection/example.py

+from datetime import timedelta
+
+# Authenticate using config file
+config.load_kube_config(config_file=r"D:\Kubernetes open source - Python client\Go example\kubeconfig.txt")


usually the default config file is located at ~/.kube/config

sorry, I forgot to leave that out. Updated now.

yliaog · 2020-08-09T00:38:31Z

leaderelection/example.py

+# , if the default callback function will be used is a callback is not provide
+config = electionconfig.Config(ConfigMapLock(lock_name, lock_namespace, candidate_id), lease_duration=17, renew_deadline=15, retry_period=5, onstarted_leading=example_func, onstopped_leading=None)
+
+leaderelection.LeaderElection(config).run()


usually more than one is run to see how leaderelection is working. a simple README would be helpful.

Go example:
https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/client-go/examples/leader-election

yliaog · 2020-08-09T00:44:41Z

leaderelection/leaderelection.py

+        lock_status, lock_response = self.election_config.lock.get(self.election_config.lock.name, self.election_config.lock.namespace)
+
+        # create a default Election record for this candidate
+        self.leader_election_record = self.leaderelector_record(self.election_config)


why create it here instead of in the 'else:' below? It is not used in the "if lock_status:" branch.

Thank you for pointing it out. I'll update it in the following commit.

yliaog · 2020-08-09T00:49:40Z

leaderelection/leaderelection.py

+
+        # If a lock is already created with that name
+        if lock_status:
+            print(self.election_config.lock.identity, "is a follower")


better to use 'logging' instead of 'print'
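A short sketch of what the switch could look like. The logger name and the `report_role` helper are hypothetical:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("leaderelection")  # hypothetical logger name


def report_role(identity, is_leader):
    role = "leader" if is_leader else "follower"
    # %-style arguments are only formatted if the INFO level is enabled.
    logger.info("%s is a %s", identity, role)
    return role


report_role("candidate-1", False)
```

Unlike `print`, this lets the application that embeds the library choose the level, format, and destination of the messages.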

yliaog · 2020-08-09T19:16:45Z

leaderelection/threadingwithtrace.py

+
+
+# citing source: https://www.geeksforgeeks.org/python-different-ways-to-kill-a-thread/
+class thread_with_trace(threading.Thread):


it would be simpler to use https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor than working with threads directly.

it is added in python 3, but i guess it is ok since python 2 is deprecated anyway, you don't have to support python 2.
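A minimal sketch of the suggested approach, assuming a single-thread pool runs the callback. The `on_started_leading` function here is a hypothetical placeholder:

```python
from concurrent.futures import ThreadPoolExecutor


def on_started_leading():
    # Hypothetical callback; a real one would do the leader's work.
    return "leading"


# A pool with one worker thread, managed by the context manager.
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(on_started_leading)
    # result() blocks until the callback returns or the timeout expires.
    outcome = future.result(timeout=5)
```

Note that `Future.cancel()` cannot interrupt a call that is already running, so an executor simplifies thread management but does not by itself provide a way to kill a running callback.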

yliaog · 2020-08-09T19:34:19Z

leaderelection/leaderelection.py

+            # Make sure thread that runs onstopped_leading callback is stopped
+            if self.onstopped_leadingthread:
+                self.onstopped_leadingthread.stop()
+                self.onstopped_leadingthread.join()


i don't think you need to 'stop', then 'join', i.e., wait for it to stop.

it's better to implement a simpler leaderelection:

right at the start, try to acquire lease to be the leader

if not yet a leader, periodically check if it can acquire the lease to be leader

if it becomes the leader, call the hook on_started_leading

try to maintain the leadership by renewing the lease

if fail to renew the lease, call the hook on_stopped_leading

done with the leaderelection, return

NOTE: the lifecycle of one leaderelection run has two possibilities:
a) it always blocks, waiting to be a leader, but never succeeds
b) it becomes a leader, leads for some time, then stops leading and returns

It is simpler because during one leaderelection run, a candidate can be a leader at most once. If it becomes a leader and then somehow loses the leadership, the whole leaderelection returns. It is left to the caller of the leaderelection to decide what to do after that: the caller may exit completely, or it may choose to run another leaderelection.

Thank you for pointing this out. I should have removed this piece of code when I updated the code for a leader to not follow after losing leadership and exit. Other than that, the leader election logic is identical to what you've described. I'll also make sure that the program lets the user decide whether to exit or run for election again. I'll handle these changes in the next commit.
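The lifecycle described in this thread can be sketched as follows. `InMemoryLock` and `run_election` are hypothetical stand-ins rather than the PR's API, and running the started-leading hook in a daemon thread is an assumption of this sketch:

```python
import threading
import time


class InMemoryLock:
    """Hypothetical stand-in for a resource lock."""

    def __init__(self, renewals_allowed):
        self.holder = None
        self.renewals_left = renewals_allowed

    def try_acquire_or_renew(self, identity):
        if self.holder is None:
            self.holder = identity
            return True
        if self.holder == identity and self.renewals_left > 0:
            self.renewals_left -= 1
            return True
        return False


def run_election(lock, identity, retry_period, on_started, on_stopped):
    # Block until the lease is acquired (this may never happen).
    while not lock.try_acquire_or_renew(identity):
        time.sleep(retry_period)
    # Became leader: start the hook without blocking the renew loop.
    threading.Thread(target=on_started, daemon=True).start()
    # Keep renewing until a renewal fails.
    while lock.try_acquire_or_renew(identity):
        time.sleep(retry_period)
    # Lost the lease: call the hook and return to the caller.
    on_stopped()


events = []
started = threading.Event()


def on_started():
    events.append("started")
    started.set()


def on_stopped():
    events.append("stopped")


run_election(InMemoryLock(renewals_allowed=2), "candidate-1", 0.01,
             on_started, on_stopped)
```

Because `run_election` returns after leadership is lost, the caller decides whether to exit or to call it again.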

yliaog · 2020-08-09T19:38:49Z

leaderelection/leaderelection.py

+    # Point of entry to Leader election
+    def run(self):
+        # Try to create/ acquire a lock
+        self.try_acquire_or_renew()


better to check out the java implementation, https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java

It uses only three threads, or more precisely, three threadpools, but each pool has only a single thread.

yliaog · 2020-08-09T19:40:45Z

leaderelection/resourcelock/configmaplock.py

+from kubernetes.client.api_client import ApiClient
+
+
+class ConfigMapLock:


maybe not now, but unittests need to be added before merging the PR

roycaihw · 2020-08-11T18:51:11Z

leaderelection/leaderelection.py

+        self.leaderfunction_thread = None
+
+        # onstopped_leadingthread contains the thread object for onstopped_leading
+        self.onstopped_leadingthread = None


on_stopped_leading should not be run in a separate thread

Invictus17 · 2020-08-12T21:13:16Z

Hi @roycaihw @yliaog ,
Thank you for your review. I have updated my code based on your comments and discussions. The updates are:

Not killing threads and making the logic simpler, similar to the initial Java client.
OnStoppedLeading() is not run in a sub-thread.
Added a README.
Using logging and minor changes pointed out during review.

roycaihw · 2020-08-15T21:25:37Z

leaderelection/leaderelection.py

+
+        # updatelease_schedulerId variable stores the scheduler object id for the update_lease schedule that is repeated
+        # every retry_period seconds by the leader
+        self.updatelease_schedulerId = None


is this unused?

No it is used:
https://github.com/Invictus17/python-base/blob/master/leaderelection/leaderelection.py#L138
https://github.com/Invictus17/python-base/blob/master/leaderelection/leaderelection.py#L182

roycaihw · 2020-08-15T21:40:38Z

leaderelection/leaderelection.py

+                scheduler_leader = sched.scheduler(time.time, time.sleep)
+                self.lead(scheduler_leader)
+
+    def transition_follower_to_leader(self):


we don't need to distinguish transition_follower_to_leader and lead, same for follow and try_acquire. This complexity would only be useful if we wanted to convert a leader back to a follower

roycaihw · 2020-08-15T21:56:32Z

leaderelection/leaderelection.py

+
+
+            # keep checking for lease updates every retry_period seconds
+            self.followerlease_checkscheduler = scheduler.enter(int(self.election_config.retry_period), 1, self.check_lease_updates, (scheduler,))


previously I thought the scheduler was like the wait.Until method that client-go uses, where the scheduling and the real logic are isolated. I found this implementation (scheduling the real logic recursively) hard to follow. What's the benefit over using a simple while loop + sleep like the java client?

I don't think it has a benefit over the other. Would you suggest using a while + sleep?

yliaog · 2020-08-15T21:53:24Z

leaderelection/README.md

+### Command to run
+```python example.py```
+
+Now kill the existing leader. You will see from the terminal outputs that one of the remaining two processes will be elected as the new leader.


are you assuming 3 in total? (it says the remaining two processes)

Thanks for pointing it out. I'll update it.

yliaog · 2020-08-15T22:07:10Z

leaderelection/electionconfig.py

+
+    # Default callback for when the current candidate if a leader, stops leading
+    def on_stoppedleading_callback(self):
+        print(self.lock.identity, "stopped leading")


use logging?

yliaog · 2020-08-16T21:53:11Z

leaderelection/resourcelock/configmaplock.py

+        :return: 'True, None' if object is created else 'False, error' if failed
+        """
+        body = client.V1ConfigMap(
+            metadata={"name": name, "annotations": {self.leader_electionrecord_annotationkey: str(election_record)}})  # V1ConfigMap | Name is a necessary metadata for a configmap object


not sure why there is a comment at the end of the line above? name is required for any k8s object; it is not specific to configmap

yliaog · 2020-08-17T01:53:03Z

leaderelection/leaderelection.py

+        # If a lock is already created with that name
+        if lock_status:
+            logging.info("{} is a follower".format(self.election_config.lock.identity))
+            scheduler_follower = sched.scheduler(time.time, time.sleep)


there are a total of 4 of these: sched.scheduler(time.time, time.sleep) in the code. would one scheduler be sufficient? would it be better to create just one in the class init?

Yes, one would do, I'll update it. @roycaihw had a different opinion about using the sched module. Do you have similar thoughts about it?
#206 (comment)

Yes, i agree with his comment. i think it's easier to understand to have the program structure like this:

try to acquire lease to be leader in main thread

if get lease, then run lead in a separate thread

keep renewing lease in main thread

if failed renewing lease, then stop leading in main thread

codecov-commenter · 2020-08-24T01:45:06Z

Codecov Report

Merging #206 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #206   +/-   ##
=======================================
  Coverage   92.37%   92.37%           
=======================================
  Files          13       13           
  Lines        1613     1613           
=======================================
  Hits         1490     1490           
  Misses        123      123

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54d188f...9fa58e0. Read the comment docs.

Invictus17 · 2020-08-24T01:56:07Z

Hi @roycaihw @yliaog ,
Thank you for your review. I have updated my code based on your comments and discussions. The updates are:

Switched from using sched module to a while + sleep approach.
Made the code consistent with the other clients.

yliaog · 2020-08-24T18:40:52Z

leaderelection/leaderelection.py

+
+        # If a lock is already created with that name
+        if lock_status:
+            old_election_record = ast.literal_eval(lock_response.metadata.annotations[self.election_config.lock.leader_electionrecord_annotationkey])


lock_response needs to be validated, i.e. it may not have annotations, the annotation may not have the key

also ast.literal_eval could throw an exception, better to catch it
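One way to address both points is sketched below. The `parse_election_record` helper and the `REQUIRED_KEYS` tuple are hypothetical, and the record is assumed to have been stored with `str()` on a dict:

```python
import ast

REQUIRED_KEYS = ("holderIdentity", "leaseDurationSeconds",
                 "acquireTime", "renewTime")


def parse_election_record(annotations, key):
    """Return the record as a dict, or None if missing or malformed."""
    # The object may have no annotations, or lack the expected key.
    if not annotations or key not in annotations:
        return None
    try:
        record = ast.literal_eval(annotations[key])
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        # literal_eval raises these on malformed input.
        return None
    if not isinstance(record, dict):
        return None
    # Treat absent or empty fields as an invalid record.
    if any(record.get(k) in (None, "") for k in REQUIRED_KEYS):
        return None
    return record
```

A caller can then treat `None` uniformly as "no usable record" and fall through to creating or overwriting the lock.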

yliaog · 2020-08-24T19:17:10Z

leaderelection/leaderelection.py

+
+
+            # If This candidate is not the leader and lease duration is yet to finish
+            if str(self.election_config.lock.identity) != self.observed_record['holderIdentity'] and self.observed_time_milliseconds + self.election_config.lease_duration*1000 > int(time.time()*1000):


self.observed_time_milliseconds is from this candidate, but it should be from the election record, no?

also self.observed_time_milliseconds is updated to the current time at line 117, so it will always be the current time, and the lease duration will never expire

> self.observed_time_milliseconds is from this candidate, but it should be from the election record, no?

No, the Java and Go clients also keep a local record of 'observed_time_milliseconds' and do not refer to the election record:
https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L269-L315

> also self.observed_time_milliseconds is updated to the current time at line 117, so it will always be the current time, and the lease duration will never expire

It is only updated if a follower identifies that the lock object has been updated by a leader:

    if old_election_record != self.observed_record:
        # Update self.observed_time_milliseconds & self.observed_record

If a leader fails to update the lock, self.observed_time_milliseconds is not updated to the current time, and after a period of 'leaseDuration' a follower will try to update the lock.
https://github.com/kubernetes-client/java/blob/master/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L267-L279

ok, the "> int(time.time()*1000):" in the above line is incorrect. time.time() would return the current time, what should be used instead is the time at the start of the function try_acquire_or_renew (https://github.com/kubernetes-client/java/blob/6a2a60ad2ad75fb127874797ea910b90e4a80651/extended/src/main/java/io/kubernetes/client/extended/leaderelection/LeaderElector.java#L236)

yliaog · 2020-08-24T19:21:53Z

leaderelection/leaderelection.py

+            # If this candidate is the Leader
+            if str(self.election_config.lock.identity) == self.observed_record['holderIdentity']:
+                # Leader sets acquireTime
+                leader_election_record['acquireTime'] = self.observed_record['acquireTime']


self.observed_record['acquireTime'] is not updated, it is still the old_election_record's acquireTime

this line is correct I think. It prevents leader_election_record['acquireTime'] from being "now"

ok. i was thinking about setting the 'acquireTime' at the time when the lease is acquired, currently, the 'acquireTime' is set at the line: 104: leader_election_record = self.leaderelector_record(self.election_config)

it probably does not matter though, as there is no wait or loop inside the function try_acquire_or_renew

roycaihw · 2020-08-25T03:56:10Z

leaderelection/leaderelection.py

+            return True
+
+        # A lock is not created with that name, try to create one
+        else:


we should check the api exception. We can create a lock only if it's a 404

roycaihw · 2020-08-25T03:57:35Z

leaderelection/leaderelection.py

+        leader_election_record = self.leaderelector_record(self.election_config)
+
+        # If a lock is already created with that name
+        if lock_status:


reverse the condition to have less indentation in the code

roycaihw · 2020-08-25T04:06:45Z

leaderelection/leaderelection.py

+
+            if old_election_record != self.observed_record:
+                self.observed_record = old_election_record
+                self.observed_time_milliseconds = int(time.time()*1000)


capture the current time (e.g. now = time.time()) at the beginning of this function, and keep using now, instead of calling time.time() multiple times.
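A sketch of that suggestion, assuming the expiry check is factored into a pure function that receives the captured time. The function name and its parameters are hypothetical:

```python
import time


def lease_still_held(identity, observed_holder, observed_time_ms,
                     lease_duration_s, now_ms):
    """True if another candidate holds a lease that has not yet expired."""
    expires_ms = observed_time_ms + lease_duration_s * 1000
    return identity != observed_holder and expires_ms > now_ms


# Read the clock once at the top of the acquire/renew step and pass the
# same value to every check and update that follows.
now_ms = int(time.time() * 1000)
held = lease_still_held("candidate-b", "candidate-a",
                        now_ms - 5000, 17, now_ms)
```

Passing `now_ms` in as an argument also makes the check easy to test with fixed timestamps.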

roycaihw · 2020-08-25T04:16:33Z

leaderelection/leaderelection.py

+            # If this candidate is the Leader
+            if str(self.election_config.lock.identity) == self.observed_record['holderIdentity']:
+                # Leader sets acquireTime
+                leader_election_record['acquireTime'] = self.observed_record['acquireTime']


this line is correct I think. It prevents leader_election_record['acquireTime'] from being "now"

roycaihw · 2020-08-25T04:19:58Z

leaderelection/leaderelection.py

+
+            # Update object with latest election record
+            lock_response.metadata.annotations[self.election_config.lock.leader_electionrecord_annotationkey] = str(leader_election_record)
+            update_status, update_response = self.election_config.lock.update(self.election_config.lock.name, self.election_config.lock.namespace, lock_response)


this is not generalized for different types of locks. Better to make leader_election_record a class or a dict, and pass it directly to the lock methods

the leaderelection logic doesn't need to know about LeaderElectionRecordAnnotationKey. It's an implementation detail for the lock
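A rough sketch of that separation, in which the election logic only sees record objects and the lock decides how to store them. `AnnotationLock` is a hypothetical in-memory stand-in, and serializing with JSON is an assumption of this sketch:

```python
import json


class LeaderElectionRecord:
    """Plain record passed between the election logic and the lock."""

    def __init__(self, holder_identity, lease_duration, acquire_time,
                 renew_time):
        self.holder_identity = holder_identity
        self.lease_duration = lease_duration
        self.acquire_time = acquire_time
        self.renew_time = renew_time


class AnnotationLock:
    """Hypothetical lock that keeps the record in an annotation dict."""

    ANNOTATION_KEY = "control-plane.alpha.kubernetes.io/leader"

    def __init__(self):
        self.annotations = {}  # stands in for the object's metadata

    def update(self, record):
        # Serialization is private to this lock type.
        self.annotations[self.ANNOTATION_KEY] = json.dumps(vars(record))
        return True

    def get(self):
        raw = self.annotations.get(self.ANNOTATION_KEY)
        if raw is None:
            return None
        return LeaderElectionRecord(**json.loads(raw))


lock = AnnotationLock()
lock.update(LeaderElectionRecord("candidate-1", 17, "t0", "t1"))
current = lock.get()
```

A lock backed by a different resource type could expose the same `get` and `update` methods while storing the record elsewhere.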

roycaihw · 2020-09-02T04:53:09Z

leaderelection/leaderelection.py

+
+        # If this candidate is the Leader
+        if str(self.election_config.lock.identity) == self.observed_record['holderIdentity']:
+            # Leader sets acquireTime


nit: "Leader updates renewTime, but keeps acquireTime unchanged" may be more accurate here

roycaihw · 2020-09-02T04:55:12Z

leaderelection/leaderelection.py

+                                                                              election_record=leader_election_record)
+
+            if create_status is False:
+                return False


Please add a log for the create failure

roycaihw · 2020-09-02T05:06:56Z

leaderelection/leaderelection.py

+
+        return self.update_lock(lock_response, leader_election_record)
+
+    def update_lock(self, lock_response, leader_election_record):


lock_response can be a field stored in the lock, since it was returned from lock.get() and never changed before we passed it back to lock.update(). See client-go as an example.

roycaihw · 2020-09-02T05:41:19Z

leaderelection/leaderelection.py

+        # A lock exists with that name
+        # Validate lock_record
+        if lock_record is None:
+            # try to update lock with proper annotation and election record
+            return self.update_lock(lock_response, leader_election_record)
+
+        # check for any key, value errors in the record
+        try:
+            old_election_record = ast.literal_eval(lock_record)
+            if (old_election_record['holderIdentity'] == '' or old_election_record['leaseDurationSeconds'] == ''
+                    or old_election_record['acquireTime'] == '' or old_election_record['renewTime'] == ''):
+                # try to update lock with proper annotation and election record
+                return self.update_lock(lock_response, leader_election_record)
+        except:
+            # try to update lock with proper annotation and election record
+            return self.update_lock(lock_response, leader_election_record)


I think it's better to assume lock_record is a class object, instead of a string here. Each lock type can have its own unmarshalling implementation (e.g. ConfigMap and Endpoints unmarshal their annotations, while Lease unmarshals its spec).

This will also allow the short-circuit logic here to be as simple as client-java.

roycaihw

Thanks for generalizing the leader_election_record. Overall the implementation looks good to me. Please add a test.

leaderelection/resourcelock/configmaplock.py

yliaog · 2020-09-16T23:12:01Z

leaderelection/leaderelection.py


        # A lock is not created with that name, try to create one
        if not lock_status:
-            if ast.literal_eval(lock_response.body)['code'] != 404:
-                logging.info("Error retrieving resource lock {} as {}".format(self.election_config.lock.name, lock_response.reason))
+            if json.loads(old_election_record.body)['code'] != 404:


please avoid magic constant 404, instead use NOT_FOUND
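A sketch of the named-constant version, assuming Python 3 (the `http.HTTPStatus` enum is not available in Python 2) and an error body that is a JSON status object. The `lock_is_missing` helper is hypothetical:

```python
import json
from http import HTTPStatus


def lock_is_missing(error_body):
    """True if a JSON API error body reports 404 Not Found."""
    try:
        code = json.loads(error_body).get("code")
    except (ValueError, TypeError, AttributeError):
        # Not JSON, not a string, or not a JSON object.
        return False
    # HTTPStatus is an IntEnum, so it compares equal to the plain int.
    return code == HTTPStatus.NOT_FOUND
```

Returning `False` for unparseable bodies means the caller only tries to create the lock when the server clearly said the object does not exist.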

yliaog · 2020-09-16T23:22:33Z

leaderelection/leaderelection.py

+            logging.info("{} successfully acquired lease".format(self.election_config.lock.identity))
+
+            # Start leading and call OnStartedLeading()
+            threading.Thread(target=self.election_config.onstarted_leading, daemon=True).start()


please add some documentation to the class to describe the threading behavior, e.g. the thread keeps running until onstarted_leading returns (if it ever returns), it's not safe to run this leader election more than once in a process, etc.

codecov-io · 2020-12-25T00:24:46Z

Codecov Report

Merging #206 (8793d25) into master (54d188f) will decrease coverage by 0.08%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #206      +/-   ##
==========================================
- Coverage   92.37%   92.29%   -0.09%     
==========================================
  Files          13       13              
  Lines        1613     1635      +22     
==========================================
+ Hits         1490     1509      +19     
- Misses        123      126       +3

Impacted Files	Coverage Δ
config/kube_config_test.py	`95.60% <0.00%> (-0.28%)`	⬇️
config/kube_config.py	`83.40% <0.00%> (+0.14%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54d188f...8793d25. Read the comment docs.

Invictus17 · 2021-01-03T17:31:52Z

@yliaog @roycaihw In my latest push I've added the tests but the travis ci report is indicating a few failures. Any idea why some of these tests are failing?
test result: https://travis-ci.org/github/kubernetes-client/python-base/builds/751412124

yliaog · 2021-01-06T01:19:55Z

#222 fixed the configmap test failure

roycaihw · 2021-01-12T23:00:23Z

leaderelection/leaderelection_test.py

+                                         onstopped_leading=on_stopped_leading_A)
+
+        # Enter leader election
+        leaderelection.LeaderElection(config_A).run()


Since run() blocks until the leader election ends, B won't start until A hits renew_count_max. Could you make the two clients run in parallel to make sure the leader election handles concurrency properly?
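One way to run blocking candidates concurrently in a test is sketched below. `fake_run` is a hypothetical stand-in for `LeaderElection(config).run()`, and the barrier only serves to show that both calls are in flight at the same time:

```python
import threading


def run_candidates_in_parallel(targets):
    """Start every blocking run() in its own thread, then wait for all."""
    threads = [threading.Thread(target=target) for target in targets]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


barrier = threading.Barrier(2)
finished = []


def fake_run(name):
    # Passes only if both candidates are running at the same time;
    # a sequential run would time out here.
    barrier.wait(timeout=5)
    finished.append(name)


run_candidates_in_parallel([lambda: fake_run("A"), lambda: fake_run("B")])
```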

roycaihw · 2021-01-12T23:03:05Z

leaderelection/leaderelection_test.py

+                             "try update record",
+                             "update record",
+                             "try update record",
+                             "try update record"])


I had some trouble understanding what this test (test_Leader_election_with_renew_deadline) does. Could you add some comments to explain how it's related to renew_deadline and what the expected behavior is? Thanks

Thanks @roycaihw, sure I'll add some comments. The expected behavior is to check if the leader stops leading if it fails to update the lock within the renew_deadline.

roycaihw · 2021-01-13T22:10:09Z

leaderelection/leaderelection_test.py

+    on update:  zzz s
+    on try update:  3s
+    on update: zzz s 
+    on try update:  4.5s


Before this try, the timeout was set to be 4.5 + renew_deadline = 6.5s. After two failed tries (w/ sleep), the leader exited at 4.5+1.5*2=7.5s.
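The arithmetic can be replayed with a small simulation. It assumes a renew_deadline of 2 s and a retry period of 1.5 s, both inferred from the numbers in the comment above, and a loop that checks the deadline before each attempt:

```python
def renew_timeline(start, renew_deadline, retry_period):
    """Return (deadline, failed attempt times, exit time) in seconds."""
    deadline = start + renew_deadline
    now = start
    attempts = []
    while now < deadline:
        attempts.append(now)  # the renew attempt fails...
        now += retry_period   # ...and the leader sleeps before retrying
    return deadline, attempts, now


deadline, attempts, exit_time = renew_timeline(4.5, 2.0, 1.5)
```

Under these assumptions the deadline is 6.5 s, the failed attempts happen at 4.5 s and 6.0 s, and the leader exits at 7.5 s, one second after the deadline because the last sleep overshoots it.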

roycaihw · 2021-01-13T22:23:28Z

LGTM. Please squash the commits into reasonable parts.

roycaihw · 2021-01-13T22:32:27Z

leaderelection/leaderelection_test.py

+        self.name = name
+        self.namespace = namespace
+        self.identity = str(identity)
+        self.lock = threading.RLock()


is this lock per MockResourceLock? I'd expect it to be shared between all MockResourceLocks in one test

No, it should be shared. I'll update it. Thanks!
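A sketch of one way to share the lock, assuming each test creates a single `RLock` and passes it to every mock. The extra constructor parameter is hypothetical and not the signature used in the PR:

```python
import threading


class MockResourceLock:
    def __init__(self, name, namespace, identity, shared_lock):
        self.name = name
        self.namespace = namespace
        self.identity = str(identity)
        # Same RLock object for every mock in one test, so that
        # concurrent get/create/update calls are serialized.
        self.lock = shared_lock


test_lock = threading.RLock()
mock_a = MockResourceLock("mock", "default", "A", test_lock)
mock_b = MockResourceLock("mock", "default", "B", test_lock)
```

Creating the lock per test rather than as a class attribute keeps separate tests from contending on the same lock.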

changed file naming style consistent with the existing go client code
Update example.py
Changed file and folder names
Rename LeaderElection.py to leaderelection.py
Rename threadingWithException.py to threadingwithexception.py
Rename ConfigMapLock.py to configmaplock.py
LeaderElection to leaderelection
Added boiler plate headers, updated variable and function names consistent with the guidelines, removed the ctypes dependency by using traces to kill threads, changed logic for leader now it gives up and doesn't re-join as a follower if it fails to update lease
added correct boiler plate year
Rename threadingWithTrace.py to threadingwithtrace.py
Update leaderelection.py
Update example.py
Changes based on review - logging, OnStoppedLeading is not killed abruptly, OnStartedLeading is not run in a separate thread, adding README
Update example.py
updated comments
set threads as daemon
Update README.md
Code made consistent with other clients.
Update example.py
Update leaderelection.py
Error & exception handling for the annotation, reduced indentation
Adding serializing functions for serializing & de-serializing locks, leader_election_record as a class
Adding a test
Adding boilerplate header
Rename leaderelectiontest.py to leaderelection_test.py
Updated boiler plates
handling imports for pytest
handling 'HTTP not found' compatibility with python 2 & 3, & handling relative imports
Update leaderelection.py to check tests for tox
assertEquals -> assertEqual
Update leaderelection_test.py
making Threading compatible for Python 2
changing datetime.timestamp for backward compatibility with Python 2.7
Adding comments for test_Leader_election_with_renew_deadline & making candidates run in parallel for test_leader_election
remove redundant daemon = True reassignment
common thread lock for MockResourceLock

roycaihw · 2021-01-15T01:16:48Z

/lgtm
/approve

Thanks!

k8s-ci-robot · 2021-01-15T01:16:53Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Invictus17, roycaihw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [roycaihw]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot requested review from mbohlool and roycaihw July 31, 2020 23:11

k8s-ci-robot added the cncf-cla: yes and size/L labels Jul 31, 2020

k8s-ci-robot assigned mbohlool Jul 31, 2020

k8s-ci-robot assigned roycaihw Aug 3, 2020

k8s-ci-robot assigned palnabarun Aug 3, 2020

roycaihw reviewed Aug 6, 2020

k8s-ci-robot added the size/XL label and removed the size/L label Aug 7, 2020

yliaog reviewed Aug 9, 2020

roycaihw reviewed Aug 11, 2020

k8s-ci-robot added the size/L label and removed the size/XL label Aug 12, 2020

Invictus17 requested review from roycaihw and yliaog August 15, 2020 18:14

roycaihw reviewed Aug 15, 2020

yliaog reviewed Aug 17, 2020

Invictus17 requested a review from roycaihw August 24, 2020 01:56

Invictus17 requested a review from yliaog August 24, 2020 01:56

yliaog reviewed Aug 24, 2020

roycaihw reviewed Aug 25, 2020

Invictus17 requested review from roycaihw and yliaog August 27, 2020 05:10

roycaihw reviewed Sep 2, 2020

Invictus17 requested a review from roycaihw September 14, 2020 05:58

roycaihw reviewed Sep 16, 2020

yliaog reviewed Sep 16, 2020

k8s-ci-robot added the size/XL label and removed the size/L label Dec 25, 2020

Invictus17 requested review from roycaihw and yliaog December 25, 2020 00:18

roycaihw reviewed Jan 12, 2021

roycaihw reviewed Jan 13, 2021

Invictus17 force-pushed the master branch from 1771cd4 to 4d29af1 January 14, 2021 00:49

k8s-ci-robot added the lgtm label Jan 15, 2021

k8s-ci-robot added the approved label Jan 15, 2021

k8s-ci-robot merged commit 4bf72d7 into kubernetes-client:master Jan 15, 2021



		# citing source: https://www.geeksforgeeks.org/python-different-ways-to-kill-a-thread/
		class thread_with_trace(threading.Thread):

		from kubernetes.client.api_client import ApiClient


		class ConfigMapLock:



		# keep checking for lease updates every retry_period seconds
		self.followerlease_checkscheduler = scheduler.enter(int(self.election_config.retry_period), 1, self.check_lease_updates, (scheduler,))
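A minimal sketch of the re-arming pattern in the fragment above, where each check schedules the next one through the standard-library sched module. The helper name run_periodic, the simulated clock, and the repeats limit are assumptions made for illustration and are not part of the PR; the PR's check_lease_updates takes the scheduler as its argument instead.

```python
import sched


def run_periodic(action, period, repeats):
    """Call `action` every `period` ticks, `repeats` times, on a simulated clock."""
    clock = {"now": 0}

    def fake_sleep(delay):
        # Advance the simulated clock instead of blocking the thread.
        clock["now"] += delay

    scheduler = sched.scheduler(lambda: clock["now"], fake_sleep)
    state = {"left": repeats}

    def tick():
        action(clock["now"])
        state["left"] -= 1
        if state["left"] > 0:
            # Re-arm: schedule the next check one period from now.
            scheduler.enter(period, 1, tick)

    scheduler.enter(period, 1, tick)
    scheduler.run()


seen = []
run_periodic(seen.append, period=2, repeats=3)
print(seen)  # [2, 4, 6]
```

With a real clock, the scheduler would be built as sched.scheduler(time.time, time.sleep), and period would play the role of retry_period.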



		# If this candidate is not the leader and the lease duration is yet to finish
		if str(self.election_config.lock.identity) != self.observed_record['holderIdentity'] and self.observed_time_milliseconds + self.election_config.lease_duration * 1000 > int(time.time() * 1000):


		return self.update_lock(lock_response, leader_election_record)

		def update_lock(self, lock_response, leader_election_record):
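The lease condition quoted in the fragments above can be read as a small predicate. The following is a sketch only: the function name lease_held_by_other and its flat parameters are hypothetical stand-ins for the attributes the PR reads from self and self.election_config, and the now_ms parameter is added here so the check can be exercised without the wall clock.

```python
import time


def lease_held_by_other(identity, observed_record, observed_time_ms,
                        lease_duration_s, now_ms=None):
    """Return True when another candidate holds a lease that has not expired."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    held_by_other = str(identity) != observed_record["holderIdentity"]
    # The lease duration is in seconds; convert to milliseconds before
    # comparing against the millisecond clock.
    lease_still_valid = observed_time_ms + lease_duration_s * 1000 > now_ms
    return held_by_other and lease_still_valid


record = {"holderIdentity": "candidate-a"}
# Lease observed at t=10000 ms with a 15 s duration expires at t=25000 ms.
print(lease_held_by_other("candidate-b", record, 10_000, 15, now_ms=20_000))  # True
print(lease_held_by_other("candidate-b", record, 10_000, 15, now_ms=30_000))  # False
print(lease_held_by_other("candidate-a", record, 10_000, 15, now_ms=20_000))  # False
```

A follower would keep waiting while this returns True and would attempt to acquire or renew the lock otherwise.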
