Add missing and Missing (JuliaLang#24653)
Add basic support with special methods for operators and standard math functions.
Adapt ==(::AbstractArray, ::AbstractArray), all() and any() to support
three-valued logic. This requires defining missing and Missing early in the
bootstrap process, but other missing-related code is included relatively late
to be able to add methods to Base functions defined in various places.
Add new manual section about missing values.
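
As a rough illustration of the behavior this commit message describes, the sketch below assumes a Julia release that includes the change (0.7 or later). The variable names are invented for the example and do not appear in the commit.

```julia
# Arithmetic and standard math functions propagate `missing` instead of throwing.
total = 1 + missing        # missing
root  = sqrt(missing)      # missing

# `==` follows three-valued logic: comparing with an unknown value is unknown.
cmp = (missing == 1)       # missing, not false

# Promotion with Missing yields a Union that keeps the concrete type.
elt = promote_type(Int, Missing)
```

Because `cmp` is `missing` rather than a `Bool`, code that needs a definite answer would use `ismissing` or `isequal` instead of `==`.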
nalimilan authored and evetion committed Dec 12, 2017
1 parent 6018887 commit c070ae1
Showing 20 changed files with 850 additions and 53 deletions.
5 changes: 5 additions & 0 deletions NEWS.md
@@ -31,6 +31,10 @@ New language features
`@generated` and normal implementations of part of a function. Surrounding code
will be common to both versions ([#23168]).

* The `missing` singleton object (of type `Missing`) has been added to represent
missing values ([#24653]). It propagates through standard operators and mathematical functions,
and implements three-valued logic, similar to SQL's `NULL` and R's `NA`.

Language changes
----------------

@@ -1700,3 +1704,4 @@ Command-line option changes
[#24320]: https://github.com/JuliaLang/julia/issues/24320
[#24396]: https://github.com/JuliaLang/julia/issues/24396
[#24413]: https://github.com/JuliaLang/julia/issues/24413
[#24653]: https://github.com/JuliaLang/julia/issues/24653
8 changes: 6 additions & 2 deletions base/abstractarray.jl
@@ -1572,12 +1572,16 @@ function (==)(A::AbstractArray, B::AbstractArray)
     if isa(A,AbstractRange) != isa(B,AbstractRange)
         return false
     end
+    anymissing = false
     for (a, b) in zip(A, B)
-        if !(a == b)
+        eq = (a == b)
+        if ismissing(eq)
+            anymissing = true
+        elseif !eq
             return false
         end
     end
-    return true
+    return anymissing ? missing : true
 end

# sub2ind and ind2sub
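
The effect of the `==` change above can be sketched as follows, assuming a Julia release that includes it. The arrays and result names are illustrative only.

```julia
a = [1, 2, missing]

# No definite mismatch, but one element comparison is unknown.
r1 = (a == [1, 2, missing])        # missing

# A definite mismatch (2 vs 3) settles the result regardless of missing.
r2 = (a == [1, 3, missing])        # false

# Arrays without missing values behave as before.
r3 = ([1, 2] == [1, 2])            # true

# isequal treats missing as equal to itself and always returns a Bool.
r4 = isequal(a, [1, 2, missing])   # true
```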
23 changes: 23 additions & 0 deletions base/essentials.jl
@@ -731,3 +731,26 @@ This function simply returns its argument by default, since the elements
of a general iterator are normally considered its "values".
"""
values(itr) = itr

"""
Missing
A type with no fields whose singleton instance [`missing`](@ref) is used
to represent missing values.
"""
struct Missing end

"""
missing
The singleton instance of type [`Missing`](@ref) representing a missing value.
"""
const missing = Missing()

"""
ismissing(x)
Indicate whether `x` is [`missing`](@ref).
"""
ismissing(::Any) = false
ismissing(::Missing) = true
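
A short usage sketch for the definitions above, assuming a Julia release that includes them. `values_seen`, `flags`, and `present` are made-up names.

```julia
values_seen = [1, missing, 3]

# ismissing dispatches on the argument type, so it is false for everything
# except the `missing` singleton.
flags = ismissing.(values_seen)

# Negating the predicate gives a simple way to drop missing entries.
present = filter(!ismissing, values_seen)

# `nothing` is a different singleton and is not considered missing.
nothing_is_missing = ismissing(nothing)
```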
6 changes: 6 additions & 0 deletions base/exports.jl
@@ -76,6 +76,7 @@ export
Irrational,
Matrix,
MergeSort,
Missing,
NTuple,
Nullable,
ObjectIdDict,
@@ -149,6 +150,7 @@ export
EOFError,
InvalidStateException,
KeyError,
MissingException,
NullException,
ParseError,
SystemError,
@@ -881,6 +883,10 @@ export
isready,
fetch,

# missing values
ismissing,
missing,

# time
sleep,
time,
118 changes: 118 additions & 0 deletions base/missing.jl
@@ -0,0 +1,118 @@
# This file is a part of Julia. License is MIT: https://julialang.org/license

# Missing, missing and ismissing are defined in essentials.jl

show(io::IO, x::Missing) = print(io, "missing")

"""
MissingException(msg)
Exception thrown when a [`missing`](@ref) value is encountered in a situation
where it is not supported. The error message, in the `msg` field,
may provide more specific details.
"""
struct MissingException <: Exception
msg::AbstractString
end

showerror(io::IO, ex::MissingException) =
print(io, "MissingException: ", ex.msg)

promote_rule(::Type{Missing}, ::Type{T}) where {T} = Union{T, Missing}
promote_rule(::Type{Union{S,Missing}}, ::Type{T}) where {T,S} = Union{promote_type(T, S), Missing}
promote_rule(::Type{Any}, ::Type{T}) where {T} = Any
promote_rule(::Type{Any}, ::Type{Missing}) = Any
promote_rule(::Type{Missing}, ::Type{Any}) = Any
promote_rule(::Type{Missing}, ::Type{Missing}) = Missing

convert(::Type{Union{T, Missing}}, x) where {T} = convert(T, x)
# To print a more appropriate message than "T not defined"
convert(::Type{Missing}, x) = throw(MethodError(convert, (Missing, x)))
convert(::Type{Missing}, ::Missing) = missing

# Comparison operators
==(::Missing, ::Missing) = missing
==(::Missing, ::Any) = missing
==(::Any, ::Missing) = missing
# To fix ambiguity
==(::Missing, ::WeakRef) = missing
==(::WeakRef, ::Missing) = missing
isequal(::Missing, ::Missing) = true
isequal(::Missing, ::Any) = false
isequal(::Any, ::Missing) = false
<(::Missing, ::Missing) = missing
<(::Missing, ::Any) = missing
<(::Any, ::Missing) = missing
isless(::Missing, ::Missing) = false
isless(::Missing, ::Any) = false
isless(::Any, ::Missing) = true

# Unary operators/functions
for f in (:(!), :(+), :(-), :(identity), :(zero), :(one), :(oneunit),
:(abs), :(abs2), :(sign),
:(acos), :(acosh), :(asin), :(asinh), :(atan), :(atanh),
:(sin), :(sinh), :(cos), :(cosh), :(tan), :(tanh),
:(exp), :(exp2), :(expm1), :(log), :(log10), :(log1p),
:(log2), :(exponent), :(sqrt), :(gamma), :(lgamma),
:(iseven), :(ispow2), :(isfinite), :(isinf), :(isodd),
:(isinteger), :(isreal), :(isnan), :(isempty),
:(iszero), :(transpose), :(float))
@eval Math.$(f)(::Missing) = missing
end

for f in (:(Base.zero), :(Base.one), :(Base.oneunit))
@eval function $(f)(::Type{Union{T, Missing}}) where T
T === Any && throw(MethodError($f, (Any,))) # To prevent StackOverflowError
$f(T)
end
end

# Binary operators/functions
for f in (:(+), :(-), :(*), :(/), :(^),
:(div), :(mod), :(fld), :(rem), :(min), :(max))
@eval begin
# Scalar with missing
($f)(::Missing, ::Missing) = missing
($f)(d::Missing, x::Number) = missing
($f)(d::Number, x::Missing) = missing
end
end

# Rounding and related functions
for f in (:(ceil), :(floor), :(round), :(trunc))
@eval begin
($f)(::Missing, digits::Integer=0, base::Integer=0) = missing
($f)(::Type{>:Missing}, ::Missing) = missing
($f)(::Type{T}, ::Missing) where {T} =
throw(MissingException("cannot convert a missing value to type $T"))
end
end

# to avoid ambiguity warnings
(^)(::Missing, ::Integer) = missing

# Bit operators
(&)(::Missing, ::Missing) = missing
(&)(a::Missing, b::Bool) = ifelse(b, missing, false)
(&)(b::Bool, a::Missing) = ifelse(b, missing, false)
(&)(::Missing, ::Integer) = missing
(&)(::Integer, ::Missing) = missing
(|)(::Missing, ::Missing) = missing
(|)(a::Missing, b::Bool) = ifelse(b, true, missing)
(|)(b::Bool, a::Missing) = ifelse(b, true, missing)
(|)(::Missing, ::Integer) = missing
(|)(::Integer, ::Missing) = missing
xor(::Missing, ::Missing) = missing
xor(a::Missing, b::Bool) = missing
xor(b::Bool, a::Missing) = missing
xor(::Missing, ::Integer) = missing
xor(::Integer, ::Missing) = missing

*(d::Missing, x::AbstractString) = missing
*(d::AbstractString, x::Missing) = missing

function float(A::AbstractArray{Union{T, Missing}}) where {T}
U = typeof(float(zero(T)))
convert(AbstractArray{Union{U, Missing}}, A)
end
float(A::AbstractArray{Missing}) = A
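
The operator methods in this file can be exercised with a sketch like the one below. It assumes a Julia release that includes them, and the result names are hypothetical.

```julia
# Three-valued logic: the result is known only when the Bool operand
# alone decides it.
and_false = false & missing      # false
and_true  = true & missing       # missing
or_true   = true | missing       # true
or_false  = false | missing      # missing
xor_any   = xor(true, missing)   # missing

# == and < propagate missing, while isequal and isless always return a Bool,
# with missing ordered after every other value.
eq  = isequal(missing, missing)  # true
lt  = isless(1, missing)         # true
srt = sort([3, missing, 1])      # missing sorts last

# Rounding to a type that cannot hold missing raises MissingException.
err = try
    round(Int, missing)
catch e
    e
end
```

The split between `==`/`<` and `isequal`/`isless` is what lets sorting and hashing work on data containing `missing` while ordinary comparisons still report that the answer is unknown.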
30 changes: 26 additions & 4 deletions base/reduce.jl
@@ -597,6 +597,9 @@ Determine whether predicate `p` returns `true` for any elements of `itr`, return
`true` as soon as the first item in `itr` for which `p` returns `true` is encountered
(short-circuiting).
If the input contains [`missing`](@ref) values, return `missing` if all non-missing
values are `false` (or equivalently, if the input contains no `true` value).
```jldoctest
julia> any(i->(4<=i<=6), [3,5,7])
true
@@ -610,10 +613,16 @@ true
```
"""
 function any(f, itr)
+    anymissing = false
     for x in itr
-        f(x) && return true
+        v = f(x)
+        if ismissing(v)
+            anymissing = true
+        elseif v
+            return true
+        end
     end
-    return false
+    return anymissing ? missing : false
 end

"""
@@ -623,6 +632,9 @@ Determine whether predicate `p` returns `true` for all elements of `itr`, return
`false` as soon as the first item in `itr` for which `p` returns `false` is encountered
(short-circuiting).
If the input contains [`missing`](@ref) values, return `missing` if all non-missing
values are `true` (or equivalently, if the input contains no `false` value).
```jldoctest
julia> all(i->(4<=i<=6), [4,5,6])
true
@@ -635,12 +647,22 @@ false
```
"""
 function all(f, itr)
+    anymissing = false
     for x in itr
-        f(x) || return false
+        v = f(x)
+        if ismissing(v)
+            anymissing = true
+        # this syntax allows throwing a TypeError for non-Bool, for consistency with any
+        elseif v
+            continue
+        else
+            return false
+        end
     end
-    return true
+    return anymissing ? missing : true
 end
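
The updated `any` and `all` can be illustrated with the following sketch, assuming a Julia release that includes this change. The result names are invented for the example.

```julia
# A definite answer short-circuits even when missing values are present.
any_known = any([false, true, missing])    # true
all_known = all([true, false, missing])    # false

# Otherwise the unknown element makes the overall result unknown.
any_unknown = any([false, missing])        # missing
all_unknown = all([true, missing])         # missing

# The predicate form behaves the same way, since `missing < 0` is missing.
any_negative = any(x -> x < 0, [1, missing])
```

Since the result may be `missing`, using it directly as an `if` condition raises a `TypeError`. Callers that need a `Bool` can test the result with `ismissing` or compare it with `=== true`.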


## in & contains

"""
2 changes: 1 addition & 1 deletion base/reducedim.jl
@@ -112,7 +112,7 @@ function reducedim_init(f, op::typeof(*), A::AbstractArray, region)
end
 function _reducedim_init(f, op, fv, fop, A, region)
     T = promote_union(eltype(A))
-    if applicable(zero, T)
+    if T !== Any && applicable(zero, T)
         x = f(zero(T))
         z = op(fv(x), fv(x))
         Tr = typeof(z) == typeof(x) && !isbits(T) ? T : typeof(z)
3 changes: 3 additions & 0 deletions base/sysimg.jl
@@ -399,6 +399,9 @@ const × = cross
# statistics
include("statistics.jl")

# missing values
include("missing.jl")

# libgit2 support
include("libgit2/libgit2.jl")

1 change: 1 addition & 0 deletions doc/make.jl
@@ -67,6 +67,7 @@ const PAGES = [
"manual/metaprogramming.md",
"manual/arrays.md",
"manual/linear-algebra.md",
"manual/missing.md",
"manual/networking-and-streams.md",
"manual/parallel-computing.md",
"manual/dates.md",
1 change: 1 addition & 0 deletions doc/src/index.md
@@ -22,6 +22,7 @@
* [Metaprogramming](@ref)
* [Multi-dimensional Arrays](@ref man-multi-dim-arrays)
* [Linear Algebra](@ref)
* [Missing Values](@ref missing)
* [Networking and Streams](@ref)
* [Parallel Computing](@ref)
* [Date and DateTime](@ref)
7 changes: 4 additions & 3 deletions doc/src/manual/faq.md
@@ -617,16 +617,17 @@ all/many future usages of the other functions in module Foo that depend on calli

Unlike many languages (for example, C and Java), Julia does not have a "null" value. When a reference
(variable, object field, or array element) is uninitialized, accessing it will immediately throw
-an error. This situation can be detected using the `isdefined` function.
+an error. This situation can be detected using the [`isdefined`](@ref) or [`isassigned`](@ref)
+functions.

Some functions are used only for their side effects, and do not need to return a value. In these
cases, the convention is to return the value `nothing`, which is just a singleton object of type
`Void`. This is an ordinary type with no fields; there is nothing special about it except for
this convention, and that the REPL does not print anything for it. Some language constructs that
would not otherwise have a value also yield `nothing`, for example `if false; end`.

-For situations where a value exists only sometimes (for example, missing statistical data), it
-is best to use the `Nullable{T}` type, which allows specifying the type of a missing value.
+To represent missing data in the statistical sense (`NA` in R or `NULL` in SQL), use the
+[`missing`](@ref) object. See the [Missing Values](@ref missing) section for more details.

The empty tuple (`()`) is another form of nothingness. But, it should not really be thought of
as nothing but rather a tuple of zero values.
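
The distinction the FAQ text draws between `nothing` and `missing` might be sketched like this, assuming a Julia release that includes this commit. The names are illustrative.

```julia
# `nothing` does not propagate: arithmetic on it is a MethodError.
nothing_fails = try
    nothing + 1
    false
catch e
    e isa MethodError
end

# `missing` propagates through the same operation.
missing_result = missing + 1

# The two are distinct singletons of distinct types.
distinct = (nothing !== missing)
```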
1 change: 1 addition & 0 deletions doc/src/manual/index.md
@@ -20,6 +20,7 @@
* [Metaprogramming](@ref)
* [Multi-dimensional Arrays](@ref man-multi-dim-arrays)
* [Linear algebra](@ref)
* [Missing Values](@ref missing)
* [Networking and Streams](@ref)
* [Parallel Computing](@ref)
* [Date and DateTime](@ref)