Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recover some lost build flags for openblas #21688

Merged
merged 1 commit into from
May 4, 2017

Conversation

fredrikekre
Copy link
Member

@fredrikekre fredrikekre commented May 3, 2017

... that went lost in #20806.

Fix #21624

This should fix the following regressions from 0.5 (found in #20993):

| ["linalg","factorization",("eig","Matrix",1024)] | 1.94 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eig","Matrix",256)] | 2.29 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eig","SymTridiagonal",1024)] | 1.59 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eig","SymTridiagonal",256)] | 1.61 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eigfact","Matrix",1024)] | 1.95 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eigfact","Matrix",256)] | 2.28 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eigfact","SymTridiagonal",1024)] | 1.59 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("eigfact","SymTridiagonal",256)] | 1.61 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("lu","Tridiagonal",1024)] | 1.70 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("schur","Matrix",1024)] | 2.10 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("schur","Matrix",256)] | 2.36 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("schurfact","Matrix",1024)] | 2.09 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("schurfact","Matrix",256)] | 2.36 (45%) ❌ | 1.00 (1%) |
| ["linalg","factorization",("svd","Bidiagonal",1024)] | 1.46 (45%) ❌ | 1.00 (1%) |

First trip to makefile-land, but at least it works out locally.

Found this from the following. On 0.5.1:
make -C build/openblas-12ab1804b6ebcd38b26960d65d254314d8bc33d6/ CC=gcc -m64 FC=gfortran -m64 RANLIB=ranlib FFLAGS= -O2 -fPIC TARGET= BINARY=64 USE_THREAD=1 GEMM_MULTITHREADING_THRESHOLD=50 NUM_THREADS=16 NO_AFFINITY=1 DYNAMIC_ARCH=1 INTERFACE64=1 SYMBOLSUFFIX=64_ LIBPREFIX=libopenblas64_ MAKE_NB_JOBS=0

and master:
make -C scratch/openblas-85636ff1a015d04d3a8f960bc644b85ee5157135/ CC=gcc -m64 FC=gfortran -m64 RANLIB=ranlib FFLAGS= -O2 -fPIC TARGET= BINARY=64 USE_THREAD=1 GEMM_MULTITHREADING_THRESHOLD=50 NUM_THREADS=16 NO_AFFINITY=1 DYNAMIC_ARCH=1 INTERFACE64=1 SYMBOLSUFFIX=64_ LIBPREFIX=libopenblas64_ CFLAGS= FFLAGS= MAKE_NB_JOBS=0

Note the FFLAGS= in the end of the master build command overriding FFLAGS= -O2 -fPIC in the beginning of it.

@KristofferC
Copy link
Sponsor Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@KristofferC
Copy link
Sponsor Member

cc @maleadt

@martinholters
Copy link
Member

Nice detective work tracking that down!

@maleadt
Copy link
Member

maleadt commented May 3, 2017

Ouch, sorry for that! Nice work tracking it down though 🥇

@KristofferC KristofferC added backport pending 0.6 performance Must go faster building Build system, or building Julia or its dependencies labels May 3, 2017
@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

deps/blas.mk Outdated
@@ -46,7 +46,7 @@ $(BUILDDIR)/$(OPENBLAS_SRC_DIR)/build-compiled: | $(BUILDDIR)/objconv/build-comp
endif
endif

OPENBLAS_FFLAGS :=
OPENBLAS_FFLAGS := $(FFLAGS) $(JFFLAGS)
OPENBLAS_CFLAGS :=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably do the same for cflags

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC they are only appended conditionally though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the line you changed below is windows only too...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but thats just recovering the status pre #20806 . If anything, this patch adds JFFLAGS on all systems (which was not the case pre #20806 where on windows the set FFLAGS here was overwritten here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. Why wouldn't the opt flag have been set at https://github.com/xianyi/OpenBLAS/blob/v0.2.19/Makefile.system#L992-L1005 then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe save the VERBOSE log of a build with vs without this change to separate files and diff them? there might be an upstream bug here with flags not getting sent everywhere they should

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this request perhaps be moved to an issue. Is it necessary to solve before merging this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the track record of the last few times anything was changed about the way we build blas and related libraries, we should identify the real cause of the problem here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if there's a problem with FFLAGS, then it's likely that being inconsistent and not doing the same thing with CFLAGS is still an issue. This is setting CFLAGS to empty everywhere except 32 bit windows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but thats just recovering the status pre #20806

Not with CFLAGS - looking at https://github.com/JuliaLang/julia/blob/80b2da8137c71d8d53ce70a6b439fbc06fc05192~/deps/blas.mk CFLAGS is only ever appended to, never set to empty.

@fredrikekre
Copy link
Member Author

Here is the requested verbose log diff: https://gist.github.com/fredrikekre/200dbaf4d5c9a334db881e5f835d9e1b generated after make -C deps clean-openblas with:

make VERBOSE=true |& tee master
make VERBOSE=true |& tee PR
diff master PR > diff

@tkelman
Copy link
Contributor

tkelman commented May 4, 2017

Ah, looks like the difference is only in the parts of lapack that openblas is building straight from netlib sources, not any of the custom optimized lapack routines that openblas has their own versions of. For the source straight from netlib, lapack doesn't use openblas's Makefile.system so it misses the usual COMMON_OPT setting. Anything from cblas would also have the same problem but with CFLAGS instead of FFLAGS, and we do use a few cblas functions.

@fredrikekre
Copy link
Member Author

So this?

diff --git a/deps/blas.mk b/deps/blas.mk
index d473dd4..1044047 100644
--- a/deps/blas.mk
+++ b/deps/blas.mk
@@ -4,7 +4,7 @@ OPENBLAS_GIT_URL := git://github.com/xianyi/OpenBLAS.git
 OPENBLAS_TAR_URL = https://api.github.com/repos/xianyi/OpenBLAS/tarball/$1
 $(eval $(call git-external,openblas,OPENBLAS,,,$(BUILDDIR)))
 
-OPENBLAS_BUILD_OPTS := CC="$(CC)" FC="$(FC)" RANLIB="$(RANLIB)" FFLAGS="$(FFLAGS) $(JFFLAGS)" TARGET=$(OPENBLAS_TARGET_ARCH) BINARY=$(BINARY)
+OPENBLAS_BUILD_OPTS := CC="$(CC)" FC="$(FC)" RANLIB="$(RANLIB)" TARGET=$(OPENBLAS_TARGET_ARCH) BINARY=$(BINARY)
 
 # Thread support
 ifeq ($(OPENBLAS_USE_THREAD), 1)
@@ -46,8 +46,8 @@ $(BUILDDIR)/$(OPENBLAS_SRC_DIR)/build-compiled: | $(BUILDDIR)/objconv/build-comp
 endif
 endif
 
-OPENBLAS_FFLAGS :=
-OPENBLAS_CFLAGS :=
+OPENBLAS_FFLAGS := $(FFLAGS) $(JFFLAGS)
+OPENBLAS_CFLAGS := $(CFLAGS)
 
 # Decide whether to build for 32-bit or 64-bit arch
 ifneq ($(BUILD_OS),$(OS))
@@ -56,9 +56,9 @@ endif
 ifeq ($(OS),WINNT)
 ifneq ($(ARCH),x86_64)
 ifneq ($(USECLANG),1)
-OPENBLAS_CFLAGS += $(CFLAGS) -mincoming-stack-boundary=2
+OPENBLAS_CFLAGS += -mincoming-stack-boundary=2
 endif
-OPENBLAS_FFLAGS += $(FFLAGS) -mincoming-stack-boundary=2
+OPENBLAS_FFLAGS += -mincoming-stack-boundary=2
 endif
 endif

@tkelman
Copy link
Contributor

tkelman commented May 4, 2017

Yes. Guess what we'd really want to look at is diff of the flags before #20806 vs after this PR. Or diff of before #20806 vs after #20842, to make sure this restores everything that was lost.

@tkelman
Copy link
Contributor

tkelman commented May 4, 2017

Actually now that I think back and remember why I only partially reverted #20806 (leaving the part that caused this issue) instead of completely reverting it, is that I thought having separate OPENBLAS_CFLAGS and OPENBLAS_FFLAGS makefile variables that you could set independently might be useful in some situations. It still could, but if you have to remember the rest of the content of CFLAGS and FFLAGS to initialize it equivalently, maybe it would be better to initialize them to

OPENBLAS_FFLAGS := $(JFFLAGS)
OPENBLAS_CFLAGS := -O2

...

OPENBLAS_BUILD_OPTS += CFLAGS="$(CFLAGS) $(OPENBLAS_CFLAGS)"
OPENBLAS_BUILD_OPTS += FFLAGS="$(FFLAGS) $(OPENBLAS_FFLAGS)"

@fredrikekre
Copy link
Member Author

Okay, I updated. Someone wanna trigger nanosoldier again?

@KristofferC
Copy link
Sponsor Member

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@tkelman tkelman merged commit 9beece6 into JuliaLang:master May 4, 2017
@fredrikekre fredrikekre deleted the fe/openblasflags branch May 4, 2017 22:48
tkelman pushed a commit that referenced this pull request May 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
building Build system, or building Julia or its dependencies performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Factorization twice as slow on 0.6 than 0.5
6 participants