
KNL planning #391

Closed
andreasnoack opened this issue Nov 3, 2016 · 33 comments

@andreasnoack
Collaborator

andreasnoack commented Nov 3, 2016

UPDATED: I'll try to maintain KNL builds. You can use

/global/cscratch1/sd/noack/julia/julia      # built with gcc 4.8.5
/global/cscratch1/sd/noack/juliaintel/julia # built with icc 17.0.0 20160721

Some adjustments are necessary before Celeste can run on KNL. My thought was that this issue could serve as a source for the relevant KNL information, i.e. updated build information for Julia and a list of issues related to KNL runs.

The main issue is that we'll need LLVM's development version to run on KNL. To use LLVM-svn, as the development version is called, we'll need the development version of Julia. Hence, Celeste will have two moving targets, which might cause some headaches because LLVM changes might break Julia. To make this easier to manage, I suggest that we keep commit SHAs for a pair of Julia and LLVM-svn that work together. I'll try to keep them updated.

So the build info right now is:

Build instructions

In Make.user put

override LLVM_VER=svn
override USE_INTEL_MKL=1

and execute make -C deps get-llvm. Then check out

Julia df62c9922c320bff0f6d32bdf3faf87337925e2f
LLVM  584a5d174f2d0776a9390b12799abe8262ffc2ba
Updated 6 March 2017
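
Taken together, the steps above amount to something like the following shell session. This is just a sketch: the directory layout (deps/srccache/llvm-svn, as mentioned later in this thread) is assumed, and the SHAs are the ones listed above.

```shell
# Run from the root of a Julia source clone.
# Write the required Make.user settings:
cat > Make.user <<'EOF'
override LLVM_VER=svn
override USE_INTEL_MKL=1
EOF
cat Make.user

# Then (not executed here):
#   make -C deps get-llvm
#   git checkout df62c9922c320bff0f6d32bdf3faf87337925e2f       # Julia
#   (cd deps/srccache/llvm-svn && \
#    git checkout 584a5d174f2d0776a9390b12799abe8262ffc2ba)     # LLVM
#   make
```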

Intel specific

If building with the Intel compilers, it is also necessary to apply this patch to LLVM.
https://reviews.llvm.org/D27610

Right now, Julia's master branch is not much different from Julia 0.5, but that might change. If it does, we'll need to decide what to do. We might need fixes from master, so the best approach might be to keep Celeste up to date with master, but that could be demanding. Alternatively, we could branch off and only cherry-pick the fixes we need, but I suggest we take that discussion only when it becomes relevant.

Finally, there are some performance considerations for KNL. To do well on KNL we'd need threading and vectorization to work well. Some of the code base might not vectorize in the form it's written now (I don't know yet) but we should be aware that some refactoring for vectorization might be needed.

@andreasnoack andreasnoack self-assigned this Nov 3, 2016
@andreasnoack
Collaborator Author

andreasnoack commented Nov 3, 2016

Celeste is running on KNL. See screenshot below. It seems to work fine except that much of the code is single threaded so the runtime is pretty slow for benchmark_infer.
[Screenshot: Celeste running on a KNL node, 3 Nov 2016]

@andreasnoack
Collaborator Author

@kpamnany I've updated the SHAs. Yichao's patch to LLVM has been merged, so the latest svn version works with Julia master.

@Keno
Collaborator

Keno commented Nov 28, 2016

I've ported rr to Cori/KNL (rr-debugger/rr#1904). For any segfaults, etc. capturing such a problem in rr and keeping the trace would allow me to easily diagnose. I suspect it won't work with MPI out of the box, but looking at that is on my todo list for the end of next week.

@andreasnoack
Collaborator Author

I've updated the SHAs to versions with Keno's fixes included. I've also rebuilt /global/cscratch1/sd/noack/julia/julia such that it includes the fixes. I'd recommend that you wait about 24 hours before trying anything with this binary, since many packages need new tags after some changes to Base's macro handling.

@andreasnoack
Collaborator Author

@kpamnany I've now managed to build with icc. See the top post for a link to the binary.

@andreasnoack
Collaborator Author

andreasnoack commented Dec 16, 2016

On KNL, the memory allocation required by Julia's threads and LAPACK (eigenvalue problem in Newton's method) through OMP can collide such that Julia crashes. We should probably set OMP_NUM_THREADS=1 when running on KNL.
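
In a job script or interactive shell, that looks like the following. The JULIA_NUM_THREADS value below is purely illustrative (thread counts depend on the node and the run), not a recommendation from this thread.

```shell
# Keep OpenMP (used by MKL/LAPACK) single-threaded so it doesn't
# collide with Julia's own threads:
export OMP_NUM_THREADS=1
# Julia threading is controlled separately (value here is illustrative):
export JULIA_NUM_THREADS=68
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS JULIA_NUM_THREADS=$JULIA_NUM_THREADS"
```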

@jeff-regier
Owner

I always set OMP_NUM_THREADS=1 in the slurm scripts: https://github.com/jeff-regier/Celeste.jl/blob/master/nersc/infer.sl#L16 . I think @kpamnany does this too for runs on the supercomputer.

@Keno
Collaborator

Keno commented Dec 18, 2016

Linking JuliaLang/julia#19640

@andreasnoack
Collaborator Author

I've updated the top comment with the extra info for building with the Intel compilers.

@kpamnany
Collaborator

kpamnany commented Jan 4, 2017

Building with Intel compilers, I get:

    JULIA usr/lib/julia/inference.ji
essentials.jl
ctypes.jl
generator.jl
reflection.jl
options.jl
promotion.jl
tuple.jl
range.jl
expr.jl
error.jl
bool.jl
number.jl
int.jl
A method error occurred before the base MethodError type was defined. Aborting...
Core.Inference.#convert() world 1007
(UInt128, 0)
while loading int.jl, in expression starting on line 415
rec_backtrace at /scratch/kpamnany/julia.intel/src/stackwalk.c:84
...

Any ideas?

@andreasnoack
Collaborator Author

This must be a version of Julia with JuliaLang/julia#17057. I don't know why it wouldn't work on KNL but we are still adjusting to that change. If you want something running now, I'd try to build 374c3d6b3c3e15c00f7d3df8b5cb7c8a763aa746 instead of latest master.

@andreasnoack
Collaborator Author

I've updated and tagged most packages to work and not show warnings on Julia master. The main exception is StaticArrays where my PR is still under review. Therefore you should use the master branch of my fork https://github.com/andreasnoack/StaticArrays.jl. Hopefully, we'll be able to get it reviewed and merged today but I don't have commit access to that repo.

A minor problem is DataFrames and DataArrays. I don't think they are critical in the computations, but they still throw a lot of warnings, and it is not likely that DataArrays will be fixed. We should check that we don't call any code that uses DataArrays or DataFrames when we benchmark, because the deprecation warnings slow things down a lot.

Finally, the Intel build in the juliaintel directory segfaults in the benchmark_infer.jl tests but the gcc build doesn't. This only happens on KNL and I only realized this yesterday after all the packages had been fixed. @Keno is looking into this.

@hsseung

hsseung commented Feb 21, 2017

Regarding the build info of 26 January 2017: the Julia checkout returns the error message "fatal: reference is not a tree:". The LLVM checkout works fine.

@andreasnoack
Collaborator Author

I've just updated the commit hashes to the versions we use now. Could you try them instead?

@hsseung

hsseung commented Feb 21, 2017

Thanks! This time I have the opposite problem. Julia checks out fine but LLVM does not.

@andreasnoack
Collaborator Author

Hm. Just checked and it looks right. It is this commit llvm-mirror/llvm@72258b4

@hsseung

hsseung commented Feb 21, 2017

I tried again with a virgin git clone of the repo, but the second checkout doesn't work. Am I doing something wrong?

$ make -C deps get-llvm

[output omitted]

$ git checkout 3181500e361991b25a0ed8d63a821eb3c7a2e4bf
Note: checking out '3181500e361991b25a0ed8d63a821eb3c7a2e4bf'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

git checkout -b new_branch_name

HEAD is now at 3181500... Fix missing root across safe-point
$ git checkout 72258b42b0805337ea2d0d042454d2a8c173fcf4
fatal: reference is not a tree: 72258b42b0805337ea2d0d042454d2a8c173fcf4

@andreasnoack
Collaborator Author

Maybe there is an issue with the make target. Try to cd into deps/srccache/llvm-svn and do a git fetch origin. If that doesn't work, try checking the status of the LLVM repo.
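
For anyone hitting the same error: "fatal: reference is not a tree" just means the clone has never fetched that commit object, and a fetch fixes it. A throwaway demonstration (purely illustrative repos under /tmp, not the real LLVM checkout):

```shell
set -e
rm -rf /tmp/knl-origin /tmp/knl-clone
git init -q /tmp/knl-origin
git -C /tmp/knl-origin -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "commit 1"
git clone -q /tmp/knl-origin /tmp/knl-clone
# Origin moves ahead after the clone was made:
git -C /tmp/knl-origin -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "commit 2"
NEW=$(git -C /tmp/knl-origin rev-parse HEAD)
# The clone does not yet have the new commit object:
git -C /tmp/knl-clone checkout -q "$NEW" 2>/dev/null \
    && echo "unexpected success" || echo "checkout fails before fetch"
# After fetching, the checkout works:
git -C /tmp/knl-clone fetch -q origin
git -C /tmp/knl-clone checkout -q "$NEW" 2>/dev/null \
    && echo "checkout succeeds after fetch"
```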

@hsseung

hsseung commented Feb 22, 2017

The LLVM repo was two commits behind, so I did a git pull and was then able to check out. Compilation worked. Thanks!

peakflops(10000) is almost 1.5 teraflops, about 20x faster than my MacBook Pro. This is encouraging! Performance is presumably limited by MKL for KNL? Theoretical max is 6 teraflops for my KNL, I believe.

Transcendental functions are 4x slower though...

@kpamnany
Collaborator

You're now getting the same performance with Julia on KNL that we are. Andreas and Keno have been working to improve things (see the issue Keno linked above), but it's going to be a moving target for some time.

@andreasnoack
Collaborator Author

I'd expect MKL's gemms to be close to what you can possibly get. Isn't the 6 teraflops for Float32s? Julia's peakflops measures Float64 performance.

If you are not already doing it, you should try to make vectorized calls to VML for transcendental functions. I see a 50x difference between using Julia's exp and VML's exp for vectors of Float32s.

@hsseung

hsseung commented Feb 22, 2017

Ah, you're right. The analog of peakflops for Float32 yields 3.6 teraflops, which is getting close to the theoretical max.

I'm puzzled by your 50x number. I tried

julia> a=rand(Float32, 100000000);
julia> @time exp.(a);

VML.jl gave me a 4x speedup on my MacBook but only a few percent on KNL. Only a single thread is used on both machines.

@andreasnoack
Collaborator Author

Did you change the path to the library as described in https://github.com/JuliaMath/VML.jl#using-vmljl? It should point to the avx512_mic version. I guess that should give you 4x because the default is just AVX. Also, I timed the non-allocating version, i.e. AVX.exp!(y,x).
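
For reference, the allocating vs non-allocating shapes look like this in plain Julia using only Base exp; the idea is to substitute the VML call (the AVX.exp! mentioned above) for the in-place broadcast when timing, so allocation doesn't dominate the measurement:

```julia
x = rand(Float32, 10_000_000)
y = similar(x)

# Allocating version: creates a fresh output array on every call.
y1 = exp.(x)

# Non-allocating version: reuses the preallocated buffer. This is the
# shape to compare against a vendor in-place exp such as AVX.exp!(y, x).
y .= exp.(x)

@assert y == y1
```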

@hsseung

hsseung commented Apr 23, 2017

Oops forgot to say that this did solve my problem. Thanks!

@hayatoikoma

Is there any status update on this topic after the release of Julia v0.6? I am trying to build Julia v0.6 on KNL with Intel's compiler, but I still haven't been able to build it. It would be great if you could update the information provided above.

@andreasnoack
Collaborator Author

Julia 0.6 doesn't support KNL out of the box. You can use Julia 0.6 on KNL, but you'll need a more recent LLVM. I think building Julia with the Intel compilers works fine, but I gave up on building LLVM with the Intel compilers; GCC worked fine. Notice that you can use MKL without building Julia with the Intel compilers.

@hsseung

hsseung commented Jul 12, 2017

I did not yet build 0.6 for KNL, but I did manage an MKL build for a regular Intel CPU. As Andreas says, LLVM does not build properly with Intel's icc; I think I got an error message about linking to libirc.a. Building with gcc worked, but I had to manually set the environment variables MKLROOT and LD_LIBRARY_PATH. I couldn't use the Intel script compilervars.sh because it evidently sets some incorrect variables for this build. I found the right setting of LD_LIBRARY_PATH by attempting the build a few times and examining the error messages.

@hayatoikoma

Thank you, @andreasnoack and @hsseung.

I have managed to compile Julia and LLVM with GCC. I used the commits suggested in Andreas's first comment. However, I haven't been able to compile the released v0.6 even with the GCC compiler.

@andreasnoack
Collaborator Author

However, I haven't been able to compile the released v0.6 even with the GCC compiler.

Which version of LLVM are you using? The problem is that LLVM's API changes over time so Julia has to be adjusted a bit every time the LLVM version changes.

@hayatoikoma

I have checked out the commit you suggested in the first comment for LLVM and checked out release-0.6 of Julia. Which version would you suggest for LLVM?

@hayatoikoma

After trying out different versions of LLVM, I was able to compile release-0.6 of Julia with release-40 of LLVM! Thank you for your help!

@aramirezreyes

aramirezreyes commented Nov 27, 2018

Hi. I am using Julia on Cori, currently on Haswell (downloaded binaries for Julia 1.0.2). Is there a way to get your compile script for Julia on KNL?

@andreasnoack
Collaborator Author

Have you tried just running Julia 1.0.2 on a KNL node? We have upgraded LLVM in the meantime, so I believe Julia should just run on KNL now. You might want to build a custom system image for KNL (see PackageCompiler.jl), and you most likely would like to use MKL (see what I'm doing in https://github.com/JuliaComputing/MKL.jl). The current trick is to adjust the paths in build_h.jl to point to libmkl_rt instead of openblas64_ and then build a native system image with PackageCompiler.
