Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AbstractReMat, make reterms and feterms fields #380

Merged
merged 3 commits into from
Sep 28, 2020
Merged

Conversation

dmbates
Copy link
Collaborator

@dmbates dmbates commented Sep 18, 2020

  • add an abstract type AbstractReMat which, I think, addresses ReMat type that holds "raw" matrix #77 (@kleinschmidt please confirm if this is what you want)
  • add reterms and feterms as fields in a LinearMixedModel instead of properties
    • these fields are just vectors of pointers so the cost is negligible
    • evaluating these vectors from the allterms field was surprisingly expensive when done in a hot loop

@dmbates dmbates marked this pull request as draft September 18, 2020 17:40
@codecov
Copy link

codecov bot commented Sep 18, 2020

Codecov Report

Merging #380 into master will increase coverage by 0.12%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #380      +/-   ##
==========================================
+ Coverage   94.90%   95.03%   +0.12%     
==========================================
  Files          23       23              
  Lines        1591     1612      +21     
==========================================
+ Hits         1510     1532      +22     
+ Misses         81       80       -1     
Impacted Files Coverage Δ
src/MixedModels.jl 100.00% <ø> (ø)
src/generalizedlinearmixedmodel.jl 83.10% <100.00%> (+0.77%) ⬆️
src/linearmixedmodel.jl 98.89% <100.00%> (-0.02%) ⬇️
src/mixedmodel.jl 80.95% <100.00%> (ø)
src/remat.jl 95.37% <100.00%> (+0.11%) ⬆️
src/randomeffectsterm.jl 94.11% <0.00%> (ø)
src/utilities.jl 98.95% <0.00%> (+0.20%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 99935f3...1958d0f. Read the comment docs.

@dmbates
Copy link
Collaborator Author

dmbates commented Sep 18, 2020

I am perplexed by the CI failure on nightly but I can reproduce it locally. I doubt that I will be able to get into the innards of the compiler but I may be able to create a reproducible example to allow others to isolate it.

@dmbates
Copy link
Collaborator Author

dmbates commented Sep 18, 2020

The failure on nightly can be reproduced by

using MixedModels, StatsModels
LMM = LinearMixedModel
d3 = MixedModels.dataset(:d3)
sch = schema(d3)
vf1 = modelcols(apply_schema(@formula(y ~ 1 + u + (1+u|g)), sch, LMM), d3)[2][2]

I may need to call in the cavalry (@kleinschmidt, @palday) on this one.

The first reference to our code in the traceback is

_ranef_refs at /home/bates/.julia/dev/MixedModels/src/randomeffectsterm.jl:103

Do you think we should try to use the DataAPI generics instead of messing with the CategoricalArrays innards?

@dmbates
Copy link
Collaborator Author

dmbates commented Sep 18, 2020

So I have good news, I think. The problem with nightly can be triggered without calling any code in MixedModels or StatsModels. Just

using StatsBase
using MixedModels: dataset
describe(dataset(:d3))

seems to be sufficient. Although I am using MixedModels to get the data set, it is Feather in the background creating the data frame. The segfault backtrace is

julia> versioninfo()
Julia Version 1.6.0-DEV.969
Commit 81f79d5f57 (2020-09-18 21:35 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)

julia> describe(dataset(:d3))

signal (11): Segmentation fault
in expression starting at REPL[4]:1
_ZNK12_GLOBAL__N_116X86MCCodeEmitter17encodeInstructionERKN4llvm6MCInstERNS1_11raw_ostreamERNS1_15SmallVectorImplINS1_7MCFixupEEERKNS1_15MCSubtargetInfoE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13MCELFStreamer14EmitInstToDataERKNS_6MCInstERKNS_15MCSubtargetInfoE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm16MCObjectStreamer19EmitInstructionImplERKNS_6MCInstERKNS_15MCSubtargetInfoE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm16MCObjectStreamer15EmitInstructionERKNS_6MCInstERKNS_15MCSubtargetInfoE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13X86AsmPrinter23EmitAndCountInstructionERNS_6MCInstE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13X86AsmPrinter15EmitInstructionEPKNS_12MachineInstrE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm10AsmPrinter16EmitFunctionBodyEv at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13X86AsmPrinter20runOnMachineFunctionERNS_15MachineFunctionE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /home/bates/git/julia/usr/bin/../lib/libLLVM-9jl.so (unknown line)
operator() at /home/bates/git/julia/src/jitlayers.cpp:536
addModule at /home/bates/git/julia/usr/include/llvm/ExecutionEngine/Orc/IRCompileLayer.h:93 [inlined]
addModule at /home/bates/git/julia/src/jitlayers.cpp:658
jl_add_to_ee at /home/bates/git/julia/src/jitlayers.cpp:952 [inlined]
jl_add_to_ee at /home/bates/git/julia/src/jitlayers.cpp:996
jl_add_to_ee at /home/bates/git/julia/src/jitlayers.cpp:981
jl_add_to_ee at /home/bates/git/julia/src/jitlayers.cpp:981
jl_add_to_ee at /home/bates/git/julia/src/jitlayers.cpp:1018 [inlined]
_jl_compile_codeinst at /home/bates/git/julia/src/jitlayers.cpp:132
jl_generate_fptr at /home/bates/git/julia/src/jitlayers.cpp:313
jl_compile_method_internal at /home/bates/git/julia/src/gf.c:1895
jl_compile_method_internal at /home/bates/git/julia/src/gf.c:1849 [inlined]
_jl_invoke at /home/bates/git/julia/src/gf.c:2155 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
#59 at /home/bates/.julia/packages/DataFrames/cdZCk/src/abstractdataframe/abstractdataframe.jl:584
unknown function (ip: 0x7f6c1e04d951)
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
iterate at ./generator.jl:47 [inlined]
_collect at ./array.jl:699
collect_similar at ./array.jl:628
unknown function (ip: 0x7f6c1e04c5bc)
map at ./abstractarray.jl:2233 [inlined]
_describe at /home/bates/.julia/packages/DataFrames/cdZCk/src/abstractdataframe/abstractdataframe.jl:578
#describe#53 at /home/bates/.julia/packages/DataFrames/cdZCk/src/abstractdataframe/abstractdataframe.jl:540
unknown function (ip: 0x7f6c1e04b4ce)
describe at /home/bates/.julia/packages/DataFrames/cdZCk/src/abstractdataframe/abstractdataframe.jl:540
unknown function (ip: 0x7f6c1e047967)
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
jl_apply at /home/bates/git/julia/src/julia.h:1682 [inlined]
do_call at /home/bates/git/julia/src/interpreter.c:115
eval_value at /home/bates/git/julia/src/interpreter.c:204
eval_stmt_value at /home/bates/git/julia/src/interpreter.c:155 [inlined]
eval_body at /home/bates/git/julia/src/interpreter.c:557
jl_interpret_toplevel_thunk at /home/bates/git/julia/src/interpreter.c:669
jl_toplevel_eval_flex at /home/bates/git/julia/src/toplevel.c:837
jl_toplevel_eval_flex at /home/bates/git/julia/src/toplevel.c:785
jl_toplevel_eval_in at /home/bates/git/julia/src/toplevel.c:880
eval at ./boot.jl:345
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
eval_user_input at /home/bates/git/julia/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:139
repl_backend_loop at /home/bates/git/julia/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:200
start_repl_backend at /home/bates/git/julia/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:185
#run_repl#42 at /home/bates/git/julia/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:316
run_repl at /home/bates/git/julia/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:304
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
#848 at ./client.jl:387
jfptr_YY.848_32282 at /home/bates/git/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
jl_apply at /home/bates/git/julia/src/julia.h:1682 [inlined]
do_apply at /home/bates/git/julia/src/builtins.c:672
jl_f__apply_latest at /home/bates/git/julia/src/builtins.c:722
#invokelatest#2 at ./essentials.jl:709 [inlined]
invokelatest at ./essentials.jl:708 [inlined]
run_main_repl at ./client.jl:372
exec_options at ./client.jl:302
_start at ./client.jl:485
jfptr__start_32201 at /home/bates/git/julia/usr/lib/julia/sys.so (unknown line)
_jl_invoke at /home/bates/git/julia/src/gf.c:2163 [inlined]
jl_apply_generic at /home/bates/git/julia/src/gf.c:2345
unknown function (ip: 0x5590c50cc9d9)
unknown function (ip: 0x5590c50cc5a6)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x5590c50cc64d)
Allocations: 21210617 (Pool: 21206691; Big: 3926); GC: 17
Segmentation fault (core dumped)

@dmbates
Copy link
Collaborator Author

dmbates commented Sep 19, 2020

Probably the same problem as reported in JuliaLang/julia#37640

@palday
Copy link
Member

palday commented Sep 19, 2020

I'll review later -- but since the failure is unrelated and on the optional Future tests, I would count this as an overall CI pass.

@@ -355,6 +355,8 @@ function GeneralizedLinearMixedModel(
LMM = LinearMixedModel(
LMM.formula,
LMM.allterms,
LMM.reterms,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it wouldn't be easier to just modify the already constructed LMM. There's not much overhead here since we're side-stepping the constructor and not doing any extra allocations, but we could replace the weights and response fields in the existing struct (I just checked and LinearMixedModel is not immutable.)

@kleinschmidt
Copy link
Member

As far as #77 goes it's a good start; we'd also need to widen the signatures on some of the methods that take ReMats now and define what the interface for AbstractReMat looks like but as long as the abstract type is in place I don't think any of those changes would be blocking.

@palday
Copy link
Member

palday commented Sep 28, 2020

@dmbates I think we're good to go the release version with the expanded type hierarchy. As we see actual use cases of AbstractReMat, we can think about how to expand the method signatures for various things and what the type contract / interface for AbstractReMat should look like.

In other words: I'm for merging this in, but keeping "more AbstractReMat work" in our todo list.

(This is just repeating what @kleinschmidt said with different words.)

@dmbates
Copy link
Collaborator Author

dmbates commented Sep 28, 2020

I will review places where feterms and reterms provide for a cleaner interface than the current allterms and push changes. Then I need to work out how to change this PR from a Draft to a "for real" PR.

@dmbates dmbates linked an issue Sep 28, 2020 that may be closed by this pull request
@dmbates dmbates marked this pull request as ready for review September 28, 2020 19:07
@dmbates
Copy link
Collaborator Author

dmbates commented Sep 28, 2020

@palday Please squash and merge if this looks okay.

@palday palday merged commit fd07f92 into master Sep 28, 2020
@palday palday deleted the refeallterms branch September 28, 2020 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ReMat type that holds "raw" matrix
3 participants