Feature/mvp queue #6

awm33 · 2016-11-06T21:52:23Z

Motivation

Queue implementation for the Cognoma MVP.

API changes

Some fields have been removed for MVP or are not needed for this queue backing type.

Implementation Notes

It uses a PostgreSQL task table with SKIP LOCKED row locking and should handle a decent amount of concurrency, even more than the number of workers we have budgeted for (2 to 3).
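As a rough illustration of the SKIP LOCKED pattern described above (not the query from this PR), a worker can claim one queued row in a single statement. The table name `task` and the `'in_progress'` status value are assumptions made for this sketch; `status`, `worker_id`, and `locked_at` are the columns shown in the diff. The `%(worker_id)s` placeholder assumes a psycopg2-style cursor.

```python
# Sketch only: the table name "task" and the 'in_progress' status are assumptions.
CLAIM_TASK_SQL = """
UPDATE task
SET status = 'in_progress', worker_id = %(worker_id)s, locked_at = now()
WHERE id = (
    SELECT id
    FROM task
    WHERE status = 'queued'
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED  -- rows locked by other workers are skipped, not waited on
)
RETURNING id, worker_id, locked_at
"""


def claim_next_task(cursor, worker_id):
    """Claim one queued task for worker_id; return its row, or None if nothing is available."""
    cursor.execute(CLAIM_TASK_SQL, {"worker_id": worker_id})
    return cursor.fetchone()
```

Because the inner SELECT skips rows that another transaction has already locked, concurrent workers each receive a different task instead of blocking on the same one.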

Functional Tests

Testing against the API using curl and the beginnings of the ml-workers code.

awm33 · 2016-11-06T22:02:06Z

@dhimmel @cgreene @dcgoss @stephenshank Here is our task service MVP. Please take a look, even if it's just to help me catch Python issues.

cgreene

A few questions. Looks good to me once those are cleared up!

cgreene · 2016-11-07T12:43:19Z

.gitignore

@@ -81,6 +81,7 @@ celerybeat-schedule
 # virtualenv
 venv/


Do we want to edit the readme, contributors, or create a pull request template to suggest that if someone is storing their virtualenv in the project, they choose one of these locations? This way someone doesn't accidentally contribute a PR that includes their virtual environment.

cgreene · 2016-11-07T12:45:29Z

api/models.py

+        self.base = base
+
+    def __call__(self, value):
+        print('in validator')


Don't quite get how this would be used yet, but should this be a call to a logger instead of print?
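For reference, a minimal sketch of what swapping the print for a logger could look like. The class name `PrefixValidator` and its prefix check are hypothetical, since the diff only shows `self.base` and `__call__`; only the logging call is the point here.

```python
import logging

logger = logging.getLogger(__name__)


class PrefixValidator:
    """Hypothetical validator: accept only values that start with `base`."""

    def __init__(self, base):
        self.base = base

    def __call__(self, value):
        # Debug-level log instead of print(); silent unless logging is configured.
        logger.debug("validating %r against base %r", value, self.base)
        if not str(value).startswith(self.base):
            raise ValueError("{!r} does not start with {!r}".format(value, self.base))
```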

cgreene · 2016-11-07T12:47:00Z

api/models.py

-    received_at = models.DateTimeField(null=True)
+    status = models.CharField(choices=STATUS_CHOICES, max_length=17, default='queued')
+    worker_id = models.CharField(null=True, max_length=255)
+    locked_at = models.DateTimeField(null=True)
    priority = models.CharField(choices=PRIORITY_CHOICES, max_length=8, default="normal")
    unique = models.CharField(null=True, max_length=255)


I don't, at this moment, understand what unique is for.

It maintains idempotency when scheduling a task in a distributed environment. For example, two clients might try to schedule the same task at the same time, or a client might believe a failure has occurred (a 502 error, for instance) and retry even though the server did successfully schedule the task.
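A small sketch of the idea, assuming the deduplication is enforced by a database uniqueness constraint. It uses an in-memory SQLite table in place of the PostgreSQL-backed Django model so that it runs standalone; the table layout and the `schedule` helper are made up for illustration, and the column is called `uniq` only because `unique` is an SQL keyword.

```python
import sqlite3


def make_db():
    """Create an in-memory table with a UNIQUE (but nullable) idempotency key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task (id INTEGER PRIMARY KEY, name TEXT NOT NULL, uniq TEXT UNIQUE)"
    )
    return conn


def schedule(conn, name, unique=None):
    """Insert a task; on a duplicate key, return the existing task instead.

    Returns (task_id, created).
    """
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO task (name, uniq) VALUES (?, ?)", (name, unique)
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        row = conn.execute("SELECT id FROM task WHERE uniq = ?", (unique,)).fetchone()
        return row[0], False


conn = make_db()
first = schedule(conn, "classifier-search", unique="request-123")
retry = schedule(conn, "classifier-search", unique="request-123")  # e.g. after a 502
```

Here the retry resolves to the task that was already scheduled rather than creating a second one, while tasks scheduled without a key are never deduplicated.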

cgreene · 2016-11-07T12:49:31Z

api/queue.py

+
+from api.models import TaskDef, Task
+
+get_task_sql = """


Is there something that raw SQL is giving you that isn't available via the django queryset api?

I could load some Postgres-specific Django DSL, but I find it harder to read than SQL for a complex query like this. We're using a SQL backend; anyone programming in the backend should understand SQL.

cgreene · 2016-11-07T12:52:13Z

api/views.py

+        except Task.DoesNotExist:
+            raise NotFound('Task not found')
+
+        task.status = 'dequeued'


Is this one of the options for status? I missed where it came from.

@awm33 cleared this up. Ignore.

Yes, I commented above

cgreene · 2016-11-07T12:53:08Z

api/models.py

@@ -6,8 +6,6 @@
 from django.contrib.postgres import fields as postgresfields

 STATUS_CHOICES = (
-    ("pending_queue", "Pending Queue"),
-    ("scheduled", "Scheduled"),
    ("queued", "Queued"),


Later dequeued is used. Didn't see it defined here.

It's in the original code, look beyond the diff

@awm33 - aha!

cgreene · 2016-11-07T12:54:18Z

docker-compose.yml

+    image: postgres
+  task:
+    build: .
+    command: bash -c "python manage.py migrate && python manage.py runserver 0.0.0.0:8001"


@Ahmed had some thoughts about alternative deployment strategies. Tagging him here so that he's aware. Altering deployment is probably a second PR.

Do you mean for cognoma/core-service#39? I think we should use an application server here as well, but Docker Compose is used for the development environment, not production.

@awm33 : yea - as in that issue. We may want to have both APIs hosted the same way. Particularly if a number of workers are going to access this one.

awm33 · 2016-11-07T13:44:57Z

@cgreene I forgot to search for print before creating the PR :). I'll remove those. I also need to update the README and docs.

cgreene · 2016-11-07T13:47:50Z

@awm33 : Sounds good. Busy day so I probably won't have another chance to look at it. Approving based on those incoming changes addressing print & docs.

dhimmel

Please take a look, even if it's just to help me catch Python issues.

As you wish. I made several non-functional (extremely trivial) comments. (:

dhimmel · 2016-11-07T22:18:02Z

api/auth.py

+import jwt
+
+class CognomaAuthentication(authentication.BaseAuthentication):
+    def authenticate(self, request):


Style: blank line following class definition. PEP8 I think advocates for two blank lines. Personally, I think 1 is sufficient.

That convention doesn't seem to be followed by any of the Django Rest Framework docs

dhimmel · 2016-11-07T22:20:52Z

api/queue.py

+"""
+
+def dictfetchall(cursor):
+    "Return all rows from a cursor as a dict"


Use triple quotes for docstring.

Should it be: Return all rows from a cursor as a list of dicts
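For context, a helper with this name is commonly written as below, following the pattern in the Django documentation on raw SQL. The body of the function in this PR is not shown in the diff, so this is a sketch rather than the PR's code; the SQLite cursor is used only to make the example runnable.

```python
import sqlite3


def dictfetchall(cursor):
    """Return all rows from a cursor as a list of dicts."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Any DB-API cursor works; SQLite stands in for PostgreSQL here.
cursor = sqlite3.connect(":memory:").cursor()
cursor.execute("SELECT 1 AS id, 'queued' AS status")
rows = dictfetchall(cursor)
```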

dhimmel · 2016-11-07T22:26:15Z

api/serializers.py

+        try:
+            return TaskDef.objects.create(**validated_data)
+        except IntegrityError:
+            raise exceptions.ValidationError({'name': '"' + validated_data['name'] + '" already taken.'})


I prefer readability of:

raise exceptions.ValidationError('"{name}" already taken.'.format(**validated_data))

@dhimmel ValidationError requires a dict or list of dicts. The idea being that each key is the field with the error, then a pretty formatter could interpret it on the other end.

Sorry I meant:

raise exceptions.ValidationError({ 'name': '"{name}" already taken.'.format(**validated_data) })

Just using the formatter to clean up the value creation.

Wouldn't that dump all the keys from validated data? I just want name

Oh no, it would pass it but the string is only referring to {name}

Unless validated_data has many elements, I don't think this is a concern. But if validated_data is a large dictionary, then you could always use:

'"{}" already taken.'.format(validated_data['name'])
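To make the comparison concrete, the three spellings discussed in this thread build the same message. The contents of `validated_data` below are made up for illustration.

```python
# Hypothetical payload; extra keys are ignored by '{name}'.format(**...)
validated_data = {"name": "classifier-search", "title": "Classifier search"}

concatenated = '"' + validated_data['name'] + '" already taken.'
by_keyword = '"{name}" already taken.'.format(**validated_data)
by_position = '"{}" already taken.'.format(validated_data['name'])

# The dict shape used for the ValidationError in the diff: field name -> message.
error_detail = {'name': by_keyword}
```

Unpacking with `**validated_data` passes every key as a keyword argument, but `str.format` only substitutes the fields the template names, so unused keys are harmless.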

dhimmel · 2016-11-07T22:28:09Z

api/test/test_tasks.py

+                                                           'previous',
+                                                           'results'])
+        self.assertEqual(len(list_response.data['results']), 2)
+        self.assertEqual(list(list_response.data['results'][0].keys()), task_keys)


.keys() is not essential here, but okay for explicitness. In other words, taking a list of a dict gives a list of its keys.
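A quick illustration of that equivalence, using a made-up result dict:

```python
result = {"id": 1, "status": "queued"}  # hypothetical serialized task

implicit_keys = list(result)          # iterating a dict yields its keys
explicit_keys = list(result.keys())   # same list, spelled out
```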

dhimmel · 2016-11-07T22:28:41Z

api/test/test_tasks_queue.py

+
+        self.task_number = 0
+
+        for x in range(0,10):


Remove 0?

dhimmel · 2016-11-07T22:32:52Z

api/views.py

+        else:
+            timeout = 600
+
+        if timeout < 0 or timeout > 86400:


I recently learned python can do:

if not 0 < timeout < 86400:

Up to you which you think is clearer.

Wow. I just learned this too!

I was worried that this was taking advantage of the fact that 0 < timeout was first evaluating to True which was < 86400.

I played around with it a bit though, and it seems legit:

    >>> if not -10 < -5 < 0:
    ...     print("asdf")
    ...
    >>> if not (-10 < -5) < 0:
    ...     print("asdf")
    ...
    asdf
    >>>

Cool!
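A short sketch of how the chained form relates to the check in the diff. Python evaluates `a < b < c` as `a < b and b < c`, so it is not comparing a boolean with the upper bound. One caveat: strict `<` bounds also reject the endpoints 0 and 86400, which the `or` version accepts, so inclusive bounds are needed to match it exactly. The function names are made up for this example.

```python
def out_of_range_original(timeout):
    # The check as written in the diff.
    return timeout < 0 or timeout > 86400


def out_of_range_chained(timeout):
    # Chained comparison with inclusive bounds: same result as the original.
    return not 0 <= timeout <= 86400


def out_of_range_strict(timeout):
    # Strict bounds: additionally rejects exactly 0 and exactly 86400.
    return not 0 < timeout < 86400
```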

awm33 · 2016-11-08T02:46:02Z

Just realized I'm not updating the status to "failed_retrying", "failed", or "complete" based on updates from the worker. I need to add that logic and some tests for it.

I'm sure once it's being used by ml-workers, there will be more feedback / changes.

awm33 · 2016-11-13T20:44:54Z

@cgreene @dhimmel I just pushed up a small change to handle updating the status based on 'complete', 'failed_retrying', or 'failed' states, along with some tests for that logic. If there's no feedback, or no feedback that requires changes, within the next day or so, I'm going to merge it.

dhimmel · 2016-11-13T21:07:14Z

I just pushed up a small change to handle updating the status based on 'complete', 'failed_retrying', or 'failed' states, along with some tests for that logic. If there's no feedback, or no feedback that requires changes, within the next day or so, I'm going to merge it.

Sounds good.

cgreene

Commit LGTM 👍

awm33 added 16 commits August 15, 2016 22:15

Switch to generic views

2a25ef1

Start testing basic API abilites

22c7fdc

Queue task

0650565

More tests, refactor. Add CircleCI

479ccd8

Basic queue functionality working

ae5d84f

Docker

bab55f8

Add auth

c37b017

Clean up queue SQL, query params, and tests

fe29d8b

Implement touch, release, and dequeue

8a01e06

Remove priority levels

750b495

Fixing CI tests - trying atomic block

56d3e80

Testing first failed POST

89a4f6e

Add 201s

9d7872a

Forgot migration...

4591c72

Test queue order

6c909db

Add more queue tests

12473ef

cgreene reviewed Nov 7, 2016

cgreene approved these changes Nov 7, 2016

dhimmel reviewed Nov 7, 2016

awm33 added 3 commits November 7, 2016 21:31

Small refactors based on feedback

6e5700b

Edit docstring

8bb0bb7

Update README and docs

0dff87d

Add fail and complete logic

3f862d6

cgreene approved these changes Nov 13, 2016

awm33 merged commit e07358e into master Nov 13, 2016

awm33 deleted the feature/mvp-queue branch November 13, 2016 23:51
