[f(x) | x=y] not the same as map(f, y) #670

ksvanhorn · 2012-04-05T04:57:32Z

typeof(map(x -> x^2, [1,2,3])) = Array{Int64,1}

typeof([ x^2 | x=[1,2,3] ]) = Array{Any,1}

It seems odd that the comprehension loses the type information, while map retains it.

pao · 2012-04-05T12:26:25Z

Looks like another head of the hydra that is issue 524. (Edited to delink; see below.)

StefanKarpinski · 2012-04-05T22:26:06Z

No, this is actually a completely different issue and one of the deepest and most vexing problems with Julia's approach to types. The problem is that in general we have semantics that only depend on run-time types, not on anything that the compiler can infer about types. However, we also store arrays with concrete element types like Vector{Float64} inline for efficiency and memory compatibility with C/Fortran, so that you can do things like call LAPACK on a float array.

The problem comes in when you want to map or comprehend and need to know what T should be in the resulting Vector{T} object. In general, what should the result type be for

map(f, v)

or

[ f(x) | x=v ]

?

In a statically typed language, the compiler has to be able to completely determine the type behavior of f, so it knows what the storage type of the resulting array ought to be. In a traditional dynamic language, like Python or Ruby, arrays can hold any kind of object because they're actually arrays of pointers to boxed objects, so there's nothing to worry about. In Julia, however, let's say we do this:

map(x->2x+1, [0:9])

The expression [0:9] produces a vector of type Vector{Int64} and we'd really like for the result to be of type Vector{Int64} too. But how do you figure that out? Here are a few options:

1. Compute all the values and figure out the tightest type they all fit in; in this case it would be Int64.
2. Compute a single value and assume that everything else will have the same type.
3. Use dynamic type inference to determine an upper bound on the element type.
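
As a rough illustration, the three options above might be sketched as follows. This uses present-day Julia syntax rather than the 2012 syntax in this thread, and the names map_tightest, map_first and map_inferred are hypothetical helpers, not anything in Base. The third sketch leans on Base.promote_op, an unexported function whose result is documented as a guess based on type inference; treating it as a stand-in for the inference step described here is an assumption.

```julia
# Option 1: compute every value, then narrow to the tightest common type.
function map_tightest(f, v)
    boxed = Any[f(x) for x in v]                 # intermediate boxed storage
    T = mapreduce(typeof, typejoin, boxed; init=Union{})
    return convert(Vector{T}, boxed)             # second pass into tight storage
end

# Option 2: compute one value and assume the rest have the same type.
# Assumes v is a 1-based Vector.
function map_first(f, v)
    isempty(v) && return Any[]
    y1 = f(v[1])
    out = Vector{typeof(y1)}(undef, length(v))
    out[1] = y1
    for i in 2:length(v)
        out[i] = f(v[i])    # throws if a later value cannot be converted
    end
    return out
end

# Option 3: ask type inference for an upper bound on the element type.
function map_inferred(f, v)
    T = Base.promote_op(f, eltype(v))            # internal; result may change
    out = Vector{T}(undef, length(v))
    for i in eachindex(v)
        out[i] = f(v[i])
    end
    return out
end

v = collect(0:9)
f = x -> 2x + 1
r1 = map_tightest(f, v)
r2 = map_first(f, v)
r3 = map_inferred(f, v)
```

All three return the same values for this f; they differ in how the element type is chosen and in what happens when f returns values of more than one type.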

The first option sucks because you either end up storing the computed values in some inefficient intermediate location, or computing them all twice! The second option is what we're currently doing for map. I hate this option — it makes my skin crawl and my teeth ache. It must die. It's just wrong, wrong, wrong, wrong. The third option is what we're doing for comprehensions, but that ability isn't exposed to user code yet, so it can't be used in map — it's built into how comprehensions work.

The third option is the best, but it still sucks. Here's why: it makes program semantics depend on type inference — which is, in every other circumstance, just a heuristic that speeds programs up without changing their behavior. Suppose I have a situation where w = [ f(x) | x=v ] is inferred to have element type Union(Int64,ASCIIString). In some later code, you insert an ASCIIString value into w and it works just fine. Now, some time in the future, you upgrade to the latest and greatest Julia version, in which type inference has gotten sharper, which is generally a good thing. However, the type inference now determines that the element type of w actually has to be Int64 — f never actually returns an ASCIIString. Now your code breaks: when you try to insert a string into w, it doesn't work because the type of w is Vector{Int64}.
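
A small sketch of that breakage, in present-day Julia syntax (Union{Int64,String} is used here in place of the thread's Union(Int64,ASCIIString)). The two arrays are built by hand and simply stand in for what a looser and a sharper inference would have produced; the names w_loose and w_tight are hypothetical.

```julia
# Stand-in for the result when inference only proves Union{Int64,String}.
w_loose = Union{Int64,String}[1, 2, 3]

# Stand-in for the same result once inference proves Int64.
w_tight = Int64[1, 2, 3]

# Inserting a string works with the looser element type...
push!(w_loose, "four")

# ...and fails with the tighter one, although the user code is unchanged.
failed = try
    push!(w_tight, "four")
    false
catch
    true
end

println("loose eltype: ", eltype(w_loose), ", push! failed on tight array: ", failed)
```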

Jeff and I have talked about this a lot, especially over the past few days while we were at Lang.NEXT. I think we have worked out a way to do this that will be practical and make sure that programs that work will continue working even when the type inference is improved. Basically, the conservative thing to do here is to throw an error if the type inference can't determine a tight bound on the element type — and have the same behavior for both map and comprehensions. If you use a really complicated mapping function that type inference can't infer a tight bound for, then you will have to explicitly put a type annotation in, indicating the type you want your array to be of. That's ok. If the type inference can't figure out a tight element type, that's probably a good thing to do anyway.
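
One way to picture the proposed behavior, as a sketch only: strict_map below is a hypothetical helper, "tight bound" is approximated by isconcretetype, and Base.promote_op (unexported, inference-based) is assumed as the source of the inferred type. None of this is the actual design Jeff and Stefan settled on. The syntax is present-day Julia.

```julia
# Hypothetical: refuse to guess when inference cannot give a concrete type.
function strict_map(f, v)
    T = Base.promote_op(f, eltype(v))
    isconcretetype(T) ||
        error("cannot infer a concrete element type; annotate the result explicitly")
    return T[f(x) for x in v]
end

v = collect(0:9)

# Simple mapping function: the element type is inferred.
ok = strict_map(x -> 2x + 1, v)

# Mapping function that returns either an Int64 or a String.
h = x -> x > 4 ? "big" : x

rejected = try
    strict_map(h, v)
    false
catch
    true
end

# The explicit annotation states the intended element type, so the
# result no longer depends on what inference can prove.
annotated = Union{Int64,String}[h(x) for x in v]
```

With the annotation in place, a later improvement in inference cannot change the type of the array.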

pao · 2012-04-05T23:09:27Z

I should really use more question marks. Thanks for the detailed comment. I've edited my first comment to delink.

JeffBezanson · 2012-04-09T16:45:54Z

duplicate of #210.

StefanKarpinski mentioned this issue Apr 8, 2012

function types #210

Closed

JeffBezanson closed this as completed Apr 9, 2012

WestleyArgentum mentioned this issue Apr 25, 2013

map(Function, AbstractArray) and siblings use type of first output to determine type of all output #2938

Closed

simonster mentioned this issue Apr 29, 2014

RFC: Provide a way to get the inferred return type of a function #6692

Closed

tkelman mentioned this issue Dec 31, 2014

Why don't array comprehensions desugar into map? #9515

Closed

LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Oct 11, 2021

Base.hash for test statistics (JuliaLang#670)

f32d0f1

* Base.hash for test statistics * woops, simple equality
