
Default to grouped-distribution scale? (instead of equal buckets) #7

Open
daguar opened this issue Jan 17, 2014 · 5 comments

Comments

@daguar (Owner) commented Jan 17, 2014

@lyzidiamond raised this issue. Right now the data is scaled very simply: given the min and max, break the range into N equal-width buckets and put each value in a bucket (the quantize scale in D3).

This is analytically problematic for values that are close to the bucket boundaries, since otherwise-similar values may land in different buckets. @lyzidiamond mentioned that ArcGIS defaults to a grouped distribution, so it might be worth using that instead.
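To make the boundary problem concrete, here's a minimal sketch in plain JavaScript (no D3, made-up numbers) of equal-interval bucketing splitting two nearly identical values:

```javascript
// Equal-interval ("quantize") bucketing: split [min, max] into n equal ranges.
function quantizeBucket(value, min, max, n) {
  const i = Math.floor(((value - min) / (max - min)) * n);
  return Math.min(i, n - 1); // clamp the max value into the last bucket
}

// Two nearly identical values straddling a boundary land in different buckets.
const min = 0, max = 10, buckets = 5; // boundaries at 2, 4, 6, 8
console.log(quantizeBucket(3.99, min, max, buckets)); // 1
console.log(quantizeBucket(4.01, min, max, buckets)); // 2
```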

Other options:

  • Warn user when a quantize scale has many boundary problems
  • Make scaling approach a configurable parameter

cc @itsthomson, in case you have an opinion

@itsthomson

Is your goal to have each bucket equally sized? If so, +1 to grouping by quantiles; it's more robust against any kind of distribution.

@daguar (Owner, Author) commented Jan 17, 2014

@itsthomson I should clarify. Right now, the interval size is equal across buckets (for example, a 5-bucket scale over 0-10 gives 0-2, >2-4, etc.)

So you're saying a quantile scale (equal number of data points per bucket) is more robust for an arbitrary distro, yes?
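For contrast with equal intervals, here's a minimal sketch (plain JavaScript, illustrative data) of quantile bucketing, which puts roughly the same number of data points in each bucket regardless of how skewed the values are:

```javascript
// Quantile bucketing: sort the data and split it into n groups of
// (roughly) equal size, so each bucket holds the same share of points.
function quantileThresholds(values, n) {
  const sorted = [...values].sort((a, b) => a - b);
  const thresholds = [];
  for (let k = 1; k < n; k++) {
    thresholds.push(sorted[Math.floor((k * sorted.length) / n)]);
  }
  return thresholds;
}

// A value v falls in bucket i where thresholds[i-1] <= v < thresholds[i].
function quantileBucket(value, thresholds) {
  let i = 0;
  while (i < thresholds.length && value >= thresholds[i]) i++;
  return i;
}

// A skewed dataset: equal intervals would cram most of these points into
// one bucket, but quantiles spread them evenly (2 points per bucket here).
const data = [1, 1, 2, 2, 3, 3, 4, 5, 50, 100];
const t = quantileThresholds(data, 5); // [2, 3, 4, 50]
console.log(quantileBucket(1, t));   // 0
console.log(quantileBucket(100, t)); // 4
```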

@daguar (Owner, Author) commented Jan 17, 2014

Also, the high-level goal is really "most clearly shows gradients of an arbitrary distribution."

If there's an intermediate metric to instrument the distro (variance?), I'd even be into using that, and then conditioning the scaling on it.
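One hypothetical way to condition the scale choice on such a metric: measure the skewness of the data and fall back to quantiles when the distribution is far from symmetric. Everything here is an assumption for illustration, including the 1.0 cutoff, which is arbitrary:

```javascript
// Sample skewness via the method of moments: m3 / m2^(3/2).
function skewness(values) {
  const n = values.length;
  const mean = values.reduce((s, v) => s + v, 0) / n;
  const m2 = values.reduce((s, v) => s + (v - mean) ** 2, 0) / n;
  const m3 = values.reduce((s, v) => s + (v - mean) ** 3, 0) / n;
  return m3 / Math.pow(m2, 1.5);
}

// Hypothetical policy: strongly skewed data gets a quantile scale,
// otherwise keep the simpler equal-interval quantize scale.
function chooseScale(values) {
  return Math.abs(skewness(values)) > 1.0 ? "quantile" : "quantize";
}

console.log(chooseScale([1, 2, 3, 4, 5]));         // "quantize" (symmetric)
console.log(chooseScale([1, 1, 1, 2, 2, 3, 100])); // "quantile" (heavy right tail)
```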

@Mr0grog commented Jan 18, 2014

This is potentially a really naive question, but would just using gradients instead of buckets be the simple solution here?

@lyzidiamond

Maybe I wasn't understanding how it was being grouped before. I believed that with 5 buckets and, say, 50 states, the lowest 10 FEATURES would go in the first group, the next lowest 10 FEATURES in the second, and so on. If the values are instead broken into equal intervals as you describe (say, 50 values ranging from 0 to 20 and 5 buckets: the first bucket is 0-4 with a variable number of features, the next is 5-8 with a variable number of features, etc.), it's not as big a deal as I had stated previously.
