Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

189 ordinal scale domain item ordering #1

Open
wants to merge 42 commits into
base: master
Choose a base branch
from

Conversation

monfera
Copy link
Owner

@monfera monfera commented Mar 29, 2016

CR: plotly#189

Initial, minimal changeset for demonstrating the categorymode = 'array' option.

Test:

Plotly.newPlot('embedded-graph', [
            {x: ['a','b',    'd','e','f'    ], y: [100,110,    130,140,150    ]},
            {x: ['a',    'c',    'e','f','g'], y: [101,    121,    141,151,161]}],
            {xaxis: {type: 'category', categorymode: 'array', categorylist: ['a','b','c','d','e','f','g']}}
        );

should output
image

// progressively here as we insert?
// it is assumed that this function is being invoked in the
// already sorted category order; otherwise there would be
// a disconnect between the array and the index returned

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera can you elaborate on this?

What else have you tried?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I tried things like this:

  1. I've first tried the change inside the suggested place (https://github.com/plotly/plotly.js/blob/bbb8205903cdd34ae019d5cd6a2c197e9a2550e7/src/plots/cartesian/set_convert.js#L179-L194): instead of pushing into the array, I inserted it to ensure orderedness (insertion sort). More accurately, for simplicity, I just sorted upon each insertion to try before doing too much work (paid attention to the fact that [].sort mutates the array).
  2. Then I also quickly tried doing the sort only after the loop that invokes ax.d2c but the issue is, by that point we already have the indices, and a sort renders them obsolete.

Then I spent a bit of time better understanding concepts, the data flow and the various ways it can be invoked (e.g. plotting multiple lines; overplotting onto an already existing plot etc). Learnt that currently, the categorical X axis ordering is totally driven by the order in which the point tuples arrive and made sample plots. Which was the point at which I gave up on this point for modification and went for the lower hanging fruit of array. The other options will, I think, be more intrusive to the data flow.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard maybe an example is helpful: The first vector has X cat. values ['a', 'c', 'd']. This is already sorted. Then a subsequent series is plotted in the same chart, with values for ['a', 'b', 'c', 'd']. By this point, the function has returned indices such as 0, 1, 2 for ['a', 'c', 'd'], respectively. Then in the next series comes a new point 'b' correctly inserted at index 1 but the previously returned indices will become invalid and the plot will come out nonsensical (more precisely, having checked the plot, the way it came out was consistent with this mechanics).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I also considered doing a sorting a lot earlier (more upstream), but it would have issues: 1) the code should possibly reorder only after it's established that it's a categorical axis (it can be user specified or determined heuristically); 2) by that point there are lots of places where _fullData etc. are persisted on objects in the trace order; 3) I believe it would be wrong to change the trace order just for this CR because of the snail trail example and the general possibility that the user does not want to have plotly change the user input order (which is currently implicitly taken as the trace order).

So my current thought is that we need to do the axis tick sorting downstream, nearer the point which is responsible for the d3 data binding order for the axis ticks, and we have to accommodate for the possibility that new lines introduce new points that have to be inserted in lines that have already been added previously.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera Thanks for this very detailed overview.

You've got me convinced, ax._categories can't be sorted from inside the d2c routine.

We need to fill in ax._categories earlier. I think within makeCalcdata makes the most sense at the moment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filling in ax._categories within the default step would definitely work (albeit with a slight performance hit).

We already need to loop all traces to check for box plot irregularities here, so maybe you could fill in ax._categories within the same loop.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I'll look into it, thanks! Unrelated: I learnt that date axes take ISO strings "2016-03-29" as values. When I expressly indicate axis type = 'category' it behaves as categories and the axis ticks get rendered as plaintext "2016-03-29" rather than nicely formatted date. So I'm not yet grounded in the approach for coercion which was step #2 in your original comment.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I'll look into it more closely today, but on an initial look, the suggested place isn't as trivial (to me) and I'd put it slightly after this point, because

  • setAutoType bails if ax.type!=='-' and at the only place it's called from, there is an identical (redundant) check
  • the loop in question depends on isBoxWithoutPositionCoords returning true
  • the inside of the loop (now) is solely dedicated to building up the boxPositions
  • if axis type 'category' isn't explicitly specified, it won't be known until the subsequent call to autoType() at the end

Therefore I'm planning to put the logic after having returned from setAutoType; the earliest point seems to be just before the call to setConvert.

This way, we wouldn't reuse the suggested loop, but the loop over the traces probably doesn't have a measurable performance impact anyway (I'm assuming there usually aren't thousands or 10ks of trace lines).

I'll do some more tests to ensure I'm not overlooking something; just wanted to share current thinking.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anywhere, in plots/cartesian/axis_defaults.js is fine at this stage. We'll fine tune the location for performance once the functionality is in a working state.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Thanks for the quick note, I'll go ahead.

Plotly.plot(gd, [{x: ['c','a','e','b','d'], y: [15,11,12,13,14]}], { xaxis: {
type: 'category',
categorymode: 'trace'
// Wouldn't it be preferred to supply a function and plotly would have several functions like this?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice 😏 , but only for JS users.

All plotly.js attributes must be JSON serializable so that folks using our python and R libraries can have access to the same features.

…if there's no category encountered for a specific categorylist, it should yield null rather than being skipped over)
…ecked; no assumption about trace order (unlike my first cut of the test cases)
expect(gd.calcdata[0][1]).toEqual(jasmine.objectContaining({x: 0, y: 11}));
expect(gd.calcdata[0][2]).toEqual(jasmine.objectContaining({x: 4, y: 12}));
expect(gd.calcdata[0][3]).toEqual(jasmine.objectContaining({x: 1, y: 13}));
expect(gd.calcdata[0][4]).toEqual(jasmine.objectContaining({x: 3, y: 14}));
});
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Just an FYI, since I started this CR with the test cases, my conception on where order would be present became outdated. As now I understand that gd.calcdata will keep the trace order, I'm updating test cases such that the original trace order is expected, and explicit checks on the x/y tuples are in place. It looks good to me but please tell me if you have an alternative suggestion.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done.

var domTickTexts = Array.prototype.slice.call(document.querySelectorAll('g.xtick'))
.map(function(e) {return e.__data__.text;});

expect(domTickTexts).toEqual(['b', 'x', 'a', 'd', 'z', 'e', 'c']); // y, q and k has no data points
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard a few questions regarding this type of test:

  1. I've used library-free querying of the DOM; I think it's good but let me know if I should import d3 for this purpose, or you could point out another test file as a pattern to follow.
  2. I haven't yet bumped into axes/ticks tests of this nature, where we query the DOM, which is downstream of the 'viewModel' testing this file mostly has but upstream of image based testing. Do we need such DOM based testing, and are there preexisting DOM tests I should instead use that ensure e.g. proper category or tracing order?
  3. If we need such tests, should I add similar ones for the other test cases in this file?

var createGraph = require('../assets/create_graph_div');
var destroyGraph = require('../assets/destroy_graph_div');
var createGraphDiv = require('../assets/create_graph_div');
var destroyGraphDiv = require('../assets/destroy_graph_div');

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I appended 'Div' to these, to be in sync with how it's usually called.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done.


}
}
};
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard Is this a reasonable cut at the coercion?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice job.

// flattenUniqueSort :: String -> Function -> [[String]] -> [String]
function flattenUniqueSort(axisLetter, sortFunction, data) {
return flattenUnique(axisLetter, data).sort(sortFunction);
return categoryArray;
}

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard I quite rewrote the ordering logic this evening on initial suspicion that image based cases may fail in presence of a large number of points, because of possible slowness due to the O(N) complexity. I switched to bisection. While it didn't make a difference, better scalability is good anyway. As it changed a good bit, it's useful if you know about it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good 👍

monfera pushed a commit that referenced this pull request Aug 11, 2016
Legend item wrap with layout.legend.orientation = h
monfera added a commit that referenced this pull request Feb 5, 2017
monfera added a commit that referenced this pull request Feb 12, 2017
monfera added a commit that referenced this pull request Oct 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants