Stream JSON responses #164

dracos · 2015-01-28T18:05:00Z

We start with an iterable QuerySet, so this keeps it iterable all the way through: add_codes becomes iterable, output_json is passed an iterdict [1] rather than a dict and uses a StreamingHttpResponse by default, and the callback middleware now works with both streaming and non-streaming responses. This hopefully reduces memory usage for one particular request from about 2GB to 200MB. Similarly, for output_html we keep everything as an iterable, doing something horrible with a template in order to put our iteration in the middle.
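
To picture the callback middleware change, here is a minimal sketch of wrapping a body iterator in a JSONP-style header and footer without joining it into one string. The function name `wrap_in_callback` and the exact header format are assumptions for illustration, not the code in this branch.

```python
from itertools import chain


def wrap_in_callback(callback, body_chunks):
    """Return an iterator of header, body chunks, and footer (all bytes).

    `callback` is the (hypothetical) JSONP callback name; `body_chunks`
    is any iterable of byte strings, streamed or not.
    """
    header = (callback + "(").encode("utf-8")
    footer = b")"
    # chain() is lazy, so body_chunks is only consumed as the caller iterates.
    return chain([header], body_chunks, [footer])
```

The same iterator works whether the body is a single chunk or a long generator, which is the property the middleware needs.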

[1] An iterdict is a dict subclass that does nothing but provide an iterable for items/iteritems, so that the json package will think it's a dictionary, iterate over it, and output an iterator of encoded chunks. Otherwise we'd need to load everything into a dictionary in memory just to iterate it all back out...
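
As a rough illustration of footnote [1], here is a minimal Python 3 sketch. The names `IterDict` and `stream_json` and the `__bool__` override are assumptions for illustration, not the code in this branch. It also relies on an implementation detail of CPython's pure-Python encoder, which calls `items()` on dict subclasses when used through `iterencode()`.

```python
import json


class IterDict(dict):
    """A dict subclass whose items() hands back an arbitrary iterable."""

    def __init__(self, pairs):
        super().__init__()  # the real dict storage stays empty
        self._pairs = pairs

    def __bool__(self):
        # The encoder emits '{}' for falsy dicts, so pretend to be non-empty.
        return True

    def items(self):
        return self._pairs


def stream_json(pairs):
    """Yield JSON text chunks for an iterable of (key, value) pairs."""
    return json.JSONEncoder().iterencode(IterDict(pairs))
```

Each chunk from `stream_json()` could then be passed on to a streaming response. `json.dumps()` may take a different code path, so the sketch only applies to `iterencode()`.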

dracos · 2015-01-29T08:51:39Z

Note that:

* Django templates are not streamable – https://code.djangoproject.com/ticket/13910 – worked around in 0466f06;
* although streaming responses have been around since Django 1.5, it seems no-one has tried putting Unicode in them and gzipping them – https://code.djangoproject.com/ticket/24240 – worked around in ad7fd5f;
* or indeed tried gzipping them at all – https://code.djangoproject.com/ticket/24242 – worked around in ce76f68.
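
For the two gzip-related points, the following is a minimal sketch of encoding text chunks to bytes and compressing them as a stream. It assumes the size problem arises when the compressor is flushed after every small chunk, which this thread does not state explicitly, and it is not the patched middleware itself. The function name `gzip_text_stream` is hypothetical.

```python
import zlib


def gzip_text_stream(chunks, level=6):
    """Yield gzip-compressed bytes for an iterable of str or bytes chunks."""
    # wbits = 16 + MAX_WBITS asks zlib to write a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    for chunk in chunks:
        if isinstance(chunk, str):
            # Compressors work on bytes, so encode text first.
            chunk = chunk.encode("utf-8")
        data = compressor.compress(chunk)
        if data:
            # Only emit when zlib has produced output; no forced flush per
            # chunk, so small chunks can share compression context.
            yield data
    # A single flush at the end writes the remaining data and the trailer.
    yield compressor.flush()
```

Letting the compressor decide when to emit output trades a little latency for a much better compression ratio on many small chunks.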

davea · 2015-01-30T09:47:57Z

mapit/shortcuts.py

 def output_html(request, title, areas, **kwargs):
     kwargs['json_url'] = request.get_full_path().replace('.html', '')
     kwargs['title'] = title
-    kwargs['areas'] = sorted_areas(areas)
     kwargs['indent_areas'] = kwargs.get('indent_areas', False)


davea · Jan 30, 2015

Wasn't this the line you decided to chop?

dracos · Jan 30, 2015

Yeah, but then I need to pass it to the template_area iterator, so I thought I might as well leave it for that.

dracos · Jan 30, 2015

But it doesn't need to be in kwargs, so I've changed it to the below.

davea · 2015-01-30T11:40:30Z

Aside from those little inline comments, 👍

dracos added the Reviewing label Jan 28, 2015

dracos force-pushed the stream-responses branch 2 times, most recently from 9ebcbdd to 0466f06 Compare January 29, 2015 15:32

davea reviewed Jan 30, 2015
dracos added 9 commits January 30, 2015 15:26

Test mapit_gb app on Travis.

7194cfb

Perform area sort in the database rather than code

084bc8a

Set all_codes on Area directly, not via code_list.

2905990

Add streaming handling to callback middleware.

a9fdf0b

Create an iterator of the callback header, content, and footer. Both HttpResponse and StreamingHttpResponse handle this fine.

Use StreamingHttpResponse by default for JSON.

9c20a09

Enable areas to be an iterator throughout request.

ae4e58b

We subclass dict so that we can create an iterator that json will use.

Make sure JSON output are byte strings.

c3ff8fb

This works around https://code.djangoproject.com/ticket/24240 which otherwise dies when trying to cope with a StreamingHttpResponse containing unicode strings.

Use local patched copy of gzip middleware.

ca4ca1b

This works around https://code.djangoproject.com/ticket/24242 so that gzipped streamed responses aren't larger than the original content.

Stream data sent through output_html().

b19f2f9

It does this in quite a horrible way, by putting a flag in the template, and then inserting the iterable of areas in place of that flag. Also make sure we cache template loading. Thanks to Dave Arter for the defaultiter implementation.
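
The flag-in-the-template approach from the last commit can be sketched roughly as follows. The placeholder string and the function name `stream_around_flag` are hypothetical, and this is a sketch of the general idea rather than the code in b19f2f9.

```python
from itertools import chain

PLACEHOLDER = "<!-- AREAS -->"  # hypothetical flag rendered by the template


def stream_around_flag(rendered, items, flag=PLACEHOLDER):
    """Yield the rendered template with `items` streamed in place of `flag`.

    `rendered` is the template output as one string; `items` is an iterable
    of already-rendered fragments (for example one string per area).
    """
    before, found, after = rendered.partition(flag)
    if not found:
        raise ValueError("flag not found in rendered template")
    # Only the small template shell is held in memory; items stay lazy.
    return chain([before], items, [after])
```

The template shell is rendered once, and the large part of the page is produced piece by piece as the response is consumed.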

dracos force-pushed the stream-responses branch from babcbaa to b19f2f9 Compare January 30, 2015 16:24

dracos merged commit b19f2f9 into master Jan 30, 2015

dracos deleted the stream-responses branch December 22, 2015 12:02

theseanything mentioned this pull request Jan 29, 2021

Improve caching of postcode endpoints alphagov/mapit#122

Closed
