BoxPlot - multiple issues #2222

Closed
3 tasks done
nkconnor opened this issue Feb 22, 2017 · 5 comments · Fixed by #11199
Labels
.pinned Draws attention

Comments

@nkconnor

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if any
  • I have reproduced the issue with at least the latest released version of superset
  • I have checked the issue tracker for the same issue and I haven't found one similar

Superset version

0.15.4

Expected results

Box plot displays using provided options.

Actual results

Box plot throws a KeyError on query when one or more group-by terms are set.

Steps to reproduce

Add a datasource

SELECT *,
        ROUND(RAND(1), 2) AS `measurement` 
FROM (
    SELECT
        NOW()-INTERVAL 1 HOUR AS `date`
    ) t1
    CROSS JOIN (
        SELECT "US" AS `attribute_1` UNION SELECT "CA" UNION SELECT "DE" UNION SELECT "AU"
    ) t2
     CROSS JOIN (
        SELECT "M" AS `attribute_2` UNION SELECT "F" UNION SELECT "C" UNION SELECT "N\A" 
    ) t3
;
  1. Enter visualization for that data source
  2. Select Box Plot
  3. Add measurement to metrics; add attribute_1 and attribute_2, together or alone, as group-by terms
  4. Hit Query

Stack trace

2017-02-22 16:27:06,176:ERROR:root:u'attribute_1'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/superset/views.py", line 1406, in explore_json
    payload = viz_obj.get_payload()
  File "/usr/local/lib/python2.7/dist-packages/superset/viz.py", line 360, in get_payload
    data = self.get_data()
  File "/usr/local/lib/python2.7/dist-packages/superset/viz.py", line 885, in get_data
    df = self.get_df()
  File "/usr/local/lib/python2.7/dist-packages/superset/viz.py", line 858, in get_df
    df = df.groupby(form_data.get('groupby')).agg(aggregate)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 3778, in groupby
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 1427, in groupby
    return klass(obj, by, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 354, in __init__
    mutated=self.mutated)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2383, in _get_grouper
    in_axis, name, gpr = True, gpr, obj[gpr]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1350, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3290, in get
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/indexes/base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)
  File "pandas/index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)
  File "pandas/hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)
  File "pandas/hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)
KeyError: u'attribute_1'
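
The traceback ends in the pandas column lookup, so the name passed to groupby was not a column of the DataFrame the query returned. Below is a minimal sketch of a guard that would surface this before grouping. The helper name and the sample column lists are hypothetical, not Superset code, and the actual columns returned by the query are not known from this report.

```python
def missing_groupby_columns(result_columns, groupby):
    """Return the requested group-by names that are absent from the result columns."""
    present = set(result_columns)
    return [name for name in groupby if name not in present]


# Hypothetical result: only a timestamp and the metric came back from the query.
result_columns = ["__timestamp", "measurement"]
groupby = ["attribute_1", "attribute_2"]

missing = missing_groupby_columns(result_columns, groupby)
if missing:
    # A clearer message than a bare KeyError raised deep inside pandas.
    print("group-by columns missing from query result:", missing)
```

A check like this only reports the mismatch; it does not explain why the columns were dropped from the query.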

Additionally, the Box Plot chart type requires a date column. This shouldn't be necessary.

@prcastro

I'm having the same issue. Box plot requires a datetime column (which shouldn't be necessary), so I can't plot what I need.

@AxelMathei
Contributor

The datetime issue was already reported in #786, and commit 9cdd289 fixed it, but the is_timeseries boolean is set to True again.

@jpambrun

jpambrun commented Dec 4, 2018

Is there any way to move this forward? I am new to superset and I want to create a box plot of patient weight with respect to sex.

I am forced to aggregate weight, and the query looks like this:

SELECT patientsex AS patientsex,
       studydate AS __timestamp,
       AVG(patientweight) AS "AVG(patientweight)"
FROM images
GROUP BY patientsex,
         studydate
ORDER BY "AVG(patientweight)" DESC

The studydate AS __timestamp and GROUP BY studydate parts are not something I want, and I can't change them from the UI. I assume this is caused by the is_timeseries=False flag?

I don't quite understand the use case for time-series-based box plots; is this really the intended behaviour?
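
For comparison, here is a rough sketch of what a box plot without a time dimension needs: the raw values grouped by category only, then a five-number summary per group. The function name and the sample rows are made up for illustration; this is not how Superset computes it.

```python
from collections import defaultdict
from statistics import quantiles


def box_stats_by_group(rows, group_key, value_key):
    """Group raw values by one column and compute min, quartiles and max per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    stats = {}
    for name, values in groups.items():
        # Each group should contain at least two values for quantiles().
        q1, median, q3 = quantiles(values, n=4, method="inclusive")
        stats[name] = {
            "min": min(values),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(values),
        }
    return stats


# Hypothetical rows: one row per image, no date column involved.
rows = [
    {"patientsex": "M", "patientweight": w} for w in (60, 70, 80, 90, 100)
] + [
    {"patientsex": "F", "patientweight": w} for w in (55, 60, 65, 70, 75)
]

stats = box_stats_by_group(rows, "patientsex", "patientweight")
for sex, summary in stats.items():
    print(sex, summary)
```

Note that the summary is computed from unaggregated weights; averaging per study date first, as the generated query does, would change the distribution being plotted.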

@stale

stale bot commented Apr 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 10, 2019
@muraiki

muraiki commented Apr 11, 2019

Please don't close this issue, as it prevents boxplots from being used.

@stale stale bot removed the inactive Inactive for >= 30 days label Apr 11, 2019
@mistercrunch mistercrunch added the .pinned Draws attention label Apr 12, 2019
9 participants