Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.5.0] Enable multithreading in lib build by default #2584

Merged
merged 2 commits into from
May 7, 2021

Conversation

senhuang42
Copy link
Contributor

@senhuang42 senhuang42 commented Apr 20, 2021

With 1.5.0, the default lib build now will be multithreaded.

  • Force enable multithreading on both dynamic and static libraries by appending -mt to the target, e.g. make lib-mt.
  • Force disable multithreading on both dynamic and static libraries by appending -nomt to the target, e.g. make lib-nomt.
  • By default, as mentioned before, dynamic library is multithreaded, and static library is single-threaded, e.g. make lib.

Makefile Outdated
@@ -151,7 +151,7 @@ clean:
ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD DragonFly NetBSD MSYS_NT Haiku))

HOST_OS = POSIX
CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON -DCMAKE_BUILD_TYPE=Release
CMAKE_PARAMS = -DZSTD_BUILD_CONTRIB:BOOL=ON -DZSTD_BUILD_STATIC:BOOL=ON -DZSTD_BUILD_TESTS:BOOL=ON -DZSTD_ZLIB_SUPPORT:BOOL=ON -DZSTD_LZMA_SUPPORT:BOOL=ON -DZSTD_MULTITHREAD_SUPPORT:BOOL=ON -DCMAKE_BUILD_TYPE=Release
Copy link
Contributor

@Cyan4973 Cyan4973 Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean -DZSTD_MULTITHREAD ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like CMake uses a different var for multithreading:

option(ZSTD_MULTITHREAD_SUPPORT "MULTITHREADING SUPPORT" ON)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok ! I was confused because -DNAME feels similar to gcc method to add preprocessor macro NAME on command line,
but apparently it is the very same convention used by cmake to define its own variables,
hence the confusion.
Anyway, in such a case, the cmake build script can indeed use its own naming scheme.

The other thing that feels a bit unexpected here
is to define CMAKE_PARAMS so far away from the only place which uses it.
It looks to me that this variable definition should be brought down closer to cmakebuild,
which is the only recipe which uses it.
(pay attention to the ifneq (,$(filter MSYS%,$(shell uname))) just above which also redefines CMAKE_PARAMS).

lib/Makefile Outdated
@@ -292,16 +295,20 @@ libzstd : $(LIBZSTD)

.PHONY: lib
lib : libzstd.a libzstd
@if [ $(ZSTD_NO_MULTITHREAD) = 1 ]; then \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure of this syntax ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this works? I'm not too familiar with what's acceptable in a Makefile, though I suppose another way to emit this message would be to use the Makefile syntax, and do something like:

ZSTD_MULTITHREAD_MSG = multithreading disabled
ifneq ($(ZSTD_NO_MULTITHREAD), 1)
  ZSTD_MULTITHREAD_MSG = multithreading enabled
endif

lib : libzstd.a libzstd
	@echo $(ZSTD_MULTITHREAD_MSG)

Is this better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this construction would be a bit more familiar.

# make does not consider implicit pattern rule for .PHONY target

%-mt : CPPFLAGS += -DZSTD_MULTITHREAD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor : modified scope

The purpose of %-mt rule is to convert any base rule (lib, libzstd, libzstd.a) into a multithreaded variant. Heck, even make all could become make all-mt.

By removing these lines, we make invoking these targets an error.
I get it that they become redundant,
but maybe there are users who are using them right now
and it would be great that their build script still work.

Suggestion : keep it, as a simple redirector, so that invoking them still works.
%-mt : %
Maybe update the message to indicate that -mt suffix is no longer necessary, and that the target will be removed in a future version (v1.6.0?).

@Cyan4973
Copy link
Contributor

We may have one remaining issue
for applications which link to the static library libzstd.a.

Unless I'm mistaken, it's not possible to embed the link -pthread into the archive libzstd.a.
(in contrast, the dynamic library libzstd.so.1 can be generated with an embedded link to -pthread).

A consequence is that an existing program linking to libzstd.a and expecting the usual "single-threaded" variant will fail the link stage, unless it adds -pthread to its build script.
This is in contrast with another program linking to the dynamic libzstd.so.1, which would still run and link exactly the same way, and wouldn't need to be aware that libzstd.do.1 is now multithreaded.

To move this PR forward, we will have to decide something regarding the static library.
Either :

  • Keep it that way, users linking to the static library libzstd.a must add -pthread to their build script. Unmodified build scripts will fail.
  • Make libzstd.a single-threaded by default, while libzstd.so.1 would become multi-threaded. This way, existing build scripts would remain unaffected.
  • Scrap the PR, keep everything single-threaded by default, as it is today.

While the second option may feel a bit "weird", due to creating different scopes for the static and dynamic libraries, I also note that it would make libzstd.a similar to "integrate zstd by source code" scenario, which is still "single-threaded" by default (one has to add ZSTD_MULTITHREAD macro to make it multi-threaded).

For discussion.

@senhuang42
Copy link
Contributor Author

Option 2 to generate a single-threaded libzstd.a with muli-threaded libzstd.so.1 is an interesting option... but I think it does start to make zstd + mt somewhat confusing with all the possible combinations?

If we go with option 2 we'll have:

  • ZSTD_NO_MULTITHREAD - new addition to disable multithreading by default. Has no effect on libzstd.a.
  • ZSTD_MULTITHREAD - existing macro to enable multithreading, which would get enabled by default in the dylib build, but not the libzstd.a build
    • Which would need to be directly specified when building libzstd.a to get multithreading
  • libzstd-nomt - another build target for with single-thread mode and without zstdmt_compress.c

@Cyan4973
Copy link
Contributor

Yeah, that's a bit confusing.
It points to a need for better documentation.

What seems to be logical would be :

  • make lib-mt : both libraries (dynamic and static) are produced with multithreading enabled
  • make lib-nomt : both libraries are produced with multithreading disabled
  • make lib : default mode, dynamic library is multithreaded, static library is single-threaded.
    we will also have to justify why we made this choice for default
    (impact on existing build scripts).
  • In all cases : integration at source code level remains single-threaded by default (multi-thread requires explicit setting of build macro ZSTD_MULTITHREAD).

@senhuang42 senhuang42 force-pushed the multithread_default_build branch 2 times, most recently from 2e14b4b to d9df6b4 Compare April 28, 2021 20:46
lib/Makefile Outdated
@@ -166,6 +169,14 @@ endif
endif
CPPFLAGS += -DZSTD_LEGACY_SUPPORT=$(ZSTD_LEGACY_SUPPORT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a slight imbalance in the way flags are passed down the chain
between the static of dynamic libraries.
The static library depends on modifications of default flags implicitly updated (or not),
while the dynamic library entirely resets the flag variables during the final recipe.
While it works, I'm afraid such imbalance could impact maintenance (i.e. future modifications).

So here is my recommendation.

  • Have a variable CPPFLAGS_STATICLIB the same way you already have CPPFLAGS_DYNLIB.
    • Initialize it to "empty"
    • Prefer to set it using = rather than :=. It's a nice convention for the reviewer to pay attention that this value will be updated later on.
  • Initialize CPPFLAGS_DYNLIB and LDFLAGS_DYNLIB to their delta vs baseline (not the full line), aka:
    • CPPFLAGS_DYNLIB = -DZSTD_MULTITHREAD
    • LDFLAGS_DYNLIB = -pthread
    • Alternatively, use CPPFLAGS_MT to declare starting values as constant (almost the same as currently, but only the delta to baseline, not the full concatenation). In this case, use := to indicate they are supposed to remain unmodified.
  • In the recipe for the dynamic library :
    • $(LIBZSTD): CPPFLAGS += $(CPPFLAGS_DYNLIB) ( note the +=)
    • $(LIBZSTD): LDFLAGS += $(LDFLAGS_DYNLIB)
  • In the recipe for the static library :
    • $(ZSTD_STATLIB): CPPFLAGS += $(CPPFLAGS_STATICLIB)
  • Now, in %-mt and %-nomt, set these 3 complementary variables to their appropriate values.

The outcome will be similar. But the chain of flag modifications will be symmetric between the static and the dynamic libraries, which makes it easier to follow and update in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion makes a lot of sense, and is definitely easier to reason about.

lib/Makefile Outdated Show resolved Hide resolved
@senhuang42 senhuang42 force-pushed the multithread_default_build branch 2 times, most recently from 068f745 to 23ff609 Compare April 29, 2021 14:21
lib/Makefile Outdated
@echo compiling multi-threaded static library $(LIBVER)
else
@echo compiling single-threaded static library $(LIBVER)
endif
Copy link
Contributor Author

@senhuang42 senhuang42 Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bizzare behavior I don't understand here: When I compile with BUILD_DIR set, somehow the expression $(filter -DZSTD_MULTITHREAD,$(CPPFLAGS)) evaluates to empty string (this doesn't happen normally). This causes us to output compiling single-threaded even in mt builds.

The confusing thing is that even when I also @echo $(CPPFLAGS) in the same line, it clearly shows that -DZSTD_MULTITHREAD is in CPPFLAGS, and even @echo $(filter -DZSTD_MULTITHREAD,$(CPPFLAGS)) evaluates to -DZSTD_MULTITHREAD. Is there something weird going on with the SET_CACHE_DIRECTORY thing?

Edit:
Hmm, I suppose one explanation is that ifneq (,$(filter -DZSTD_MULTITHREAD,$(CPPFLAGS))) gets evaluated before any of this begins? So then $(CPPFLAGS) would not contain -DZSTD_MULTITHREAD at the beginning of execution. Is there a proper way to make the if check a lazy one?

Edit2:
Changed this to the shell if, seems to work now with BUILD_DIR set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it was a question of parsing order,
with the initial formulation being part of stage 1.
Since BUILD_DIR skips reloading the Makefile,
all updates, which are part of stage 2, are ignored for stage 1
(without BUILD_DIR, Makefile is reloaded, and variables are passed to following process,
hence "stage 2" changes become "stage 1" for the following process).

The new formulation is part of stage 3,
thus applies any change achieved during previous stages.

@@ -8,6 +8,9 @@
# You may select, at your option, one of the above-listed licenses.
# ################################################################

# Note: by default, the static library is built single-threaded and dynamic library is built
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worthwhile to indicate something similar
in lib/README.md.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! In the process of updating the readme now.

@senhuang42 senhuang42 force-pushed the multithread_default_build branch 3 times, most recently from 31897fd to fc81460 Compare May 6, 2021 20:37
lib/README.md Outdated
For convenience, we provide a build target to generate multi and single threaded libraries:
- Force enable multithreading on both dynamic and static libraries by appending `-mt` to the target, e.g. `make lib-mt`.
- Force disable multithreading on both dynamic and static libraries by appending `-nomt` to the target, e.g. `make lib-nomt`.
- And as mentioned before, dynamic library is multithreaded, and static library is single-threaded, e.g. `make lib`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "By default, as mentioned before, ..."

@Cyan4973
Copy link
Contributor

Cyan4973 commented May 6, 2021

minor : PR description :
I guess ZSTD_NO_MULTITHREAD is no longer within the picture

@senhuang42 senhuang42 merged commit 91465e2 into facebook:dev May 7, 2021
@ghost ghost mentioned this pull request May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants