ENH: make to_json & to_csv transformers have deterministic filenames #862

jakevdp · 2018-05-17T18:19:02Z

Addresses #857

update functions
add tests

The idea here is that when someone uses the json or csv data transformer, the dataframes are stored on disk with a filename that is determined from the hash of the data contents. This prevents the proliferation of temporary files when doing a lot of plotting.
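For readers who want to see the idea in isolation, here is a minimal sketch of content-hashed filenames. It is not the code from this PR: `hashed_filename`, `write_json_once`, the `data` prefix, the choice of MD5, and the use of plain record lists instead of a DataFrame are all assumptions made for illustration.

```python
import hashlib
import json
import os
import tempfile


def hashed_filename(data_string, prefix="data", extension="json"):
    """Hypothetical helper: derive a filename from a hash of the serialized data."""
    digest = hashlib.md5(data_string.encode("utf-8")).hexdigest()
    return "{}-{}.{}".format(prefix, digest, extension)


def write_json_once(records, directory):
    """Hypothetical helper: write records to a file named after their contents."""
    # sort_keys keeps the serialization (and therefore the hash) stable.
    data_string = json.dumps(records, sort_keys=True)
    path = os.path.join(directory, hashed_filename(data_string))
    # Identical contents map to the same path, so nothing new is written.
    if not os.path.exists(path):
        with open(path, "w") as f:
            f.write(data_string)
    return path


records = [{"x": 1, "y": 4}, {"x": 2, "y": 5}]
with tempfile.TemporaryDirectory() as tmp:
    first = write_json_once(records, tmp)
    second = write_json_once(records, tmp)  # same contents, same file
    records[0]["x"] = 10                    # mutate the data
    third = write_json_once(records, tmp)   # new contents, new file
    n_files = len(os.listdir(tmp))
```

Plotting the same data repeatedly reuses one file, and only a change in the contents produces a second one.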

ellisonbg

A few minor things, but this is really fantastic!

ellisonbg · 2018-05-20T15:43:33Z

altair/utils/data.py

+    """
+    Write the data model to a .json file and return a url based data model.
+    """
+    data_json = _data_to_json_string(data)


I tested the logic locally and it works as expected. In particular I made multiple plots of the same dataset (one file) and then mutated the data frame (second file). Very nice!

ellisonbg · 2018-05-20T15:47:03Z

altair/utils/data.py

+def _data_to_csv_string(data):
+    """return a CSV string representation of the input data"""
+    check_data_type(data)
+    if isinstance(data, pd.DataFrame):


Just thinking - shouldn't the csv transformer also work with dict/values as the json transformer does? I know the original code didn't work this way, but I don't see any reason to not add the same logic here.

Sounds good – I'll add that.

ellisonbg · 2018-05-20T15:47:17Z

altair/utils/data.py

+        data = sanitize_dataframe(data)
+        return data.to_csv(index=False)
+    else:
+        raise NotImplementedError('to_csv only works with Pandas DataFrame objects.')


And this error message would need to change...

ellisonbg · 2018-05-20T15:48:10Z

altair/utils/tests/test_data.py

@@ -63,3 +66,39 @@ def test_type_error():
    for f in (sample, limit_rows, to_values):
        with pytest.raises(TypeError):
            pipe(0, f)
+
+
+def test_to_json():


Maybe also a simple test that covers the dict/values path?

Yep, good idea.

jakevdp · 2018-05-21T03:20:08Z

I added support for dict input in to_csv, and also added tests of dict input for to_json and to_csv.
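As a rough illustration of what the dict path amounts to, the sketch below turns a `{'values': [...]}` dict into a CSV string. It is an assumption-laden stand-in, not the code added in these commits: `values_dict_to_csv_string` is a hypothetical name, and it uses the standard-library `csv` module rather than pandas so that it runs on its own.

```python
import csv
import io


def values_dict_to_csv_string(data):
    """Hypothetical helper: CSV string from a {'values': [row, ...]} dict."""
    if not isinstance(data, dict):
        raise TypeError("expected a dict with a 'values' key")
    if "values" not in data:
        raise KeyError("values expected in data dict, but not present")
    rows = data["values"]
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = values_dict_to_csv_string(
    {"values": [{"x": 1, "y": 4}, {"x": 2, "y": 5}]}
)
```

The resulting string can then be hashed and written to disk in the same way as the JSON output.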

FlorianGD · 2018-05-22T15:36:17Z

Great, thanks!

jakevdp added 2 commits May 17, 2018 11:17

ENH: make to_json & to_csv transformers have deterministic filenames

4076f04

TST: add tests of to_json and to_csv data transformers

d140406

jakevdp requested a review from ellisonbg May 17, 2018 18:59

jakevdp added this to the 2.1 milestone May 18, 2018

ellisonbg requested changes May 20, 2018

jakevdp added 2 commits May 20, 2018 20:18

ENH: add dict support in to_csv data_transformer

912b476

TST: add tests of dict input to data_transformers

c0d324b

ellisonbg approved these changes May 22, 2018

ellisonbg merged commit c70990b into vega:master May 22, 2018

jakevdp deleted the tojson branch May 22, 2018 01:41

jakevdp mentioned this pull request May 22, 2018

Name and location of json files created by alt.data_transformers.enable('json') #857

Closed
