Skip to content
This repository has been archived by the owner on Sep 1, 2020. It is now read-only.

Cartesian product ordering #37

Open
marcusps opened this issue Feb 6, 2015 · 4 comments
Open

Cartesian product ordering #37

marcusps opened this issue Feb 6, 2015 · 4 comments

Comments

@marcusps
Copy link

marcusps commented Feb 6, 2015

Although I suppose it is only a matter of convention, it strikes me as unnatural to have product return a sequence that is not in lexicographical order — or rather, the results is lexicographically ordered but from right to left instead of left to right. As the documentation illustrates, the snipped

for p in product(1:3,1:2)
    @show p
end

results in

p = (1,1)
p = (2,1)
p = (3,1)
p = (1,2)
p = (2,2)
p = (3,2)

Other implementations of a cartesian product (in Mathematica and Python) seem to return sequences in left-to-right lexicographical order. The result of the snippet above would then be

p = (1,1)
p = (1,2)
p = (2,1)
p = (2,2)
p = (3,1)
p = (3,2)

Is there a deeper motivation for the right-to-left choice that is currently implemented?

@simonster
Copy link
Member

Mathematically the Cartesian product is a set, so I'm not sure there's a convention for how it should be ordered. Julia's arrays are column-major, so traversing them according in the order returned by the product iterator is equivalent to linear indexing. Mathematica and Python are both row-major, so the other order above is equivalent to linear indexing in those languages. Another way to motivate this is that collect(product(1:3, 1:2)) returns the same thing as:

julia> vec([(i, j) for i = 1:3, j = 1:2])
6-element Array{(Int64,Int64),1}:
 (1,1)
 (2,1)
 (3,1)
 (1,2)
 (2,2)
 (3,2)

Because Julia is column-major, vec first goes down the columns and then down the rows, rather than the other way around.

If there's a use case for a row-major Cartesian product iterator, we could add it to the package, but IMO the current ordering is reasonable and should be the default.

@marcusps
Copy link
Author

marcusps commented Feb 7, 2015

Sets may not have an ordering convention, but the enumeration of the elements of a set (or iteration over them) is quite often done in (left-to-right) lexicographical order. Although I see your motivation for bringing up row-major vs column-major conventions, I don't think that is why Python choose to enumerate Cartesian products in the left-to-right order, as iterators usually try to avoid instantiating the corresponding arrays explicitly (I don't think Mathematica has iterators, but I could be wrong)

Although we are talking about Cartesian products, the issue came about when I was writing some code dealing with Kronecker products. I was trying to project vectors and operators into particular subspaces defined in terms of properties of the tuples of indices of the matrices that were ⊗ together, which required me to enumerate the set of indices in the same order that they appear in a Kronecker product. Regardless of row- or column-major ordering, when one enumerates the indices along a dimension of A⊗B, what one obtains in the Cartesian product of the index sets of A and B but enumerated lexicographically in left-to-right order (at least that is how it is described in the linear algebra texts I am familiar with, such as Horn and Johnson).

For that reason I would say that the left-to-right lexicographical enumeration is important to have in the library (it will have many applications), and I would even hazard to argue that it should be the default, but I can't claim to know that this is indeed the most prevalent ordering in mathematical applications.

Lexicographical ordering could also be motivated by using nested loops instead of list comprehensions, and there no matrix element ordering convention plays a role, although I don't think that nesting (nor list vectorization) has much to do with how iterators are implemented.

@timholy
Copy link
Member

timholy commented Feb 7, 2015

I agree with @simonster; even if you don't instantiate the array, the returned product can be used for indexing, and there are huge cache-friendliness advantages to traversing an array in the same order as it is laid out in memory.

But there's no reason not to have both. A keyword argument to product seems like a reasonable choice to me. Since this matters to you @marcusps, can you please tackle this?

@marcusps
Copy link
Author

marcusps commented Feb 7, 2015

Sure thing.
On Feb 7, 2015 7:24 AM, "Tim Holy" [email protected] wrote:

I agree with @simonster https://github.com/simonster; even if you don't
instantiate the array, the returned product can be used for indexing, and
there are huge cache-friendliness advantages to traversing an array in the
same order as it is laid out in memory.

But there's no reason not to have both. A keyword argument to product
seems like a reasonable choice to me. Since this matters to you @marcusps
https://github.com/marcusps, can you please tackle this?


Reply to this email directly or view it on GitHub
#37 (comment)
.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants