Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Linux shared library modules (*.so files) #2445

Merged
merged 4 commits into from
Jan 29, 2019

Conversation

Anthony-Mai
Copy link
Contributor

exports all symbols. This creates large module files as the export
symbol table contains too many entries. The correct approach is
to export nothing by default. Anything that needs to be exported
must be explicitly specified. This is done by the following steps:

In the Makefile, add "-fvisibility=hidden" flag. You can search
for "-fPIC" to find the appropriate place to add the flag. This
hides symbols by default if not explicitly specified otherwise.

To declare of any symbol to be exported, add this attribute:
  __attribute__((visibility("default")))
The attribute string can be added using a macro definition. It
should be added right before the return type for functions, or
right after the 'class' or 'struct' keyword for class/struct.

For more info on shared module export symbol visibility read:
https://gcc.gnu.org/wiki/Visibility

Thanks for contributing to TVM! Please refer to guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers.

@Anthony-Mai Anthony-Mai changed the title Default compilation of Linux shared library modules (*.so files) Optimize Linux shared library modules (*.so files) Jan 16, 2019
@Anthony-Mai
Copy link
Contributor Author

Anthony-Mai commented Jan 16, 2019

The automatic build failed. Somebody please help me to understand why it failed. It was building on my local machine:
http://ci.tvm.ai:8080/blue/rest/organizations/jenkins/pipelines/tvm/branches/PR-2445/runs/1/nodes/4/steps/21/log/?start=0

It looks to me there are only warnings, not errors. And they are all related to Doxygen unable to parse the code correctly. It should not effect the builds in anyway. However I do not know how to fix warnings.

@Anthony-Mai
Copy link
Contributor Author

It appears to me the code does not have a problem, but the Jenkins script checking syntax cannot understand attribute((visibility("default"))) placed in front of a function protocol declaration. The syntax is valid according to:
https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html#Attribute-Syntax
Please fix the Jenkins script to suppress the warnings as I have no access to it and don't know how.
I still want this pull request taken. It reduces the compiled *.so file by at least 16% which is not trivial.

@Anthony-Mai
Copy link
Contributor Author

/workspace/nnvm/include/nnvm/c_api.h:372: warning: argument 'msg' of command @param is not found in the argument list of attribute((visibility("default"))) const
/workspace/nnvm/include/nnvm/c_api.h:372: warning: argument 'out_size' of command @param is not found in the argument list of attribute((visibility("default"))) const
...
/workspace/nnvm/include/nnvm/c_api.h:372: warning: argument 'dst' of command @param is not found in the argument list of attribute((visibility("default"))) const

It appears that the warnings occurs near the end of parsing the header file. The Doxygen parser tries to match the parameters it sees in the comment with the actual parameters in the function protocol:

/*!

  • \brief Set the last error message needed by C API
  • \param msg The error message to set.
    /
    NNVM_DLL void NNAPISetLastError(const char
    msg);

It expands NNVM_DLL into attribute((visibility("default"))), and thinks it is the function protocol, and tries to find within it the parameter msg mentioned in the comment, but could not find it. The code has no problem. But the doxygen parser could not understand the attribute((visibility("default"))) is just an attribute tag. I do not know how to suppress that warning. Some one please help.

@Anthony-Mai
Copy link
Contributor Author

This still does not work. Now the error is unresolved symbols:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/3/pipeline

This is probably because with the new way how the macro is defined:

#elif GNUC >= 4
#define NNVM_DLL attribute((visibility("default")))
#else
#define NNVM_DLL
#endif

It some how result in NNVM_DLL to be defined to nothing, thus the symbol is not exported. Strangely, as NNVM_DLL is defined to nothing, it does not trigger the Doxygen parser warning any more.

I am going to try something else to make this working: Define NNVM_DLL in the way of original pull request, without the GNUC version check. Get rid of the visibility specifier NNVM-DLL from the header file where the Doxygen comments sit, but add it to the cc file where the function actually sits.

@tqchen
Copy link
Member

tqchen commented Jan 18, 2019

Thanks for the contribution. I think we should still find ways to put these DLL marks in the header file, as they are the user-facing places that recognizes which ones need to be exported

@Anthony-Mai
Copy link
Contributor Author

tqchen:

As much as I would initially prefer the visibility specifier showing up in the header file as you did, I could not resolve the conflict with the Doxygen parser getting confused. After I give it more thought I find there is no justification to put it in header files. A header file is the place where provider and user agree on something, like "This is how this particular function call protocol should look like". A visibility specifier is NOT about what the function should look like (what parameter it takes, what is the return value, etc. etc.) but something that tells the compiler to make something in the current translation unit available by making it visible. The Microsoft way of defining either dll_import or dll_export in the header file, depending on who includes that header file, is awful. Now different parties no longer agrees on what the header file looks like: It's a "One Macro, but up to different interpretation what the macro is", defeating what a header file should do.

Thus it makes more sense to have the visibility specifier not where function protocol is define, but rather where the definition of the function is, i.e., where an actual function sits. like "Here is the actual function and I want to export it." For the same reason, it is absurd to add the visibility specifier to a virtual function, or even a pure virtual function in a header file. "I want to export a pure virtual function which actually does not exist (???)".

Do you now agree that the place where a function is implemented is the logically right place to specify that it should be made visible and exported, not the place a function is declared?

@FrozenGene
Copy link
Member

As far as I know, the common way is put function attribute (include visibility) into function declaration, not function implementation. I agree that the header file is end users care about. Like the doc(https://gcc.gnu.org/wiki/Visibility) say:
In your header files, wherever you want an interface or API made public outside the current DSO, place attribute ((visibility ("default"))) in struct, class and function declarations you wish to make public (it's easier if you define a macro as this). You don't need to specify it in the definition.

@Anthony-Mai
Copy link
Contributor Author

The changes finally builds with every symbols resolved, in patch #13. However now there are unit test failures all related to one spot:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/13/pipeline/36

File "/workspace/vta/python/vta/testing/util.py", line 35, in run
assert simulator.enabled()
AssertionError

The code in util.py says:
else: # Make sure simulation library exists # If this fails, build vta on host (make) # with TARGET="sim" in the json.config file. assert simulator.enabled() run_func(env, rpc.LocalSession())
Can some one help to resolve this problem?

@Anthony-Mai
Copy link
Contributor Author

FrozenGene:
I would have preferred to leave the visibility attribute in the header files as commonly done. But it interference with how Doxygen parser works. The Doxygen parse extracts parameters from the comments above the function declaration, then try to match the parameters in the subsequent function protocol below it. It expands the TVM_DLL macro into attribute((visibility("default"))) and thought that must be the corresponding function protocol, and it is puzzled that the parameters cannot be found. This is a problem I cannot solve and so I was forced to move the visibility attribute to the implementation files instead.

Unless some one has a better solution I propose we take it as it is, with the visibility attribute stay with the implementation. This visibility change is very valuable in reducing module size so should be taken.

@FrozenGene
Copy link
Member

Yes. I fully understand its value in reducing module size. However, I think place attribute in the funcion definition because of Doxygen issue is not the right way. I think the right way should be Doxygen preprocessing setting. In fact, many users has similar problem, for example. https://sites.google.com/site/opensourceconstriubtions/ettl-martin-1/tutorials/doxygen-is-not-able-to-parse-the-gcc-keywords-__attribute__-x-how-to-solve-this . You could also check this document: http://www.doxygen.nl/manual/preprocessing.html I think these two links should be related with your Doxygen problem.

@Anthony-Mai
Copy link
Contributor Author

FrozenGene:
Well I do not have access to the Doxygen parser code at ci.tvm.ai thus cannot do anything about it. If you guys can do something about it go ahead do it. I cannot do anything from the source code end to fix the issue. Please help if you can.

@tqchen
Copy link
Member

tqchen commented Jan 18, 2019

The doxygen code is here https://github.com/dmlc/tvm/blob/master/docs/Doxyfile we can reproduce locally via make doc

@Anthony-Mai
Copy link
Contributor Author

tqchen:
Thanks for the tip. Yes I see the docs/Doxyfile file, and modified accordingly. I run doxygen docs/Doxyfile from the tvm main folder, and everything works, fine. I fixed one more exported symbols. Now let's see how things are going: http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/17/pipeline/

@tqchen
Copy link
Member

tqchen commented Jan 19, 2019

Please rebase against current master and squash the commits, ideally we only have to change the def of TVM_DLL

@Anthony-Mai
Copy link
Contributor Author

I am trying to squash multiple git commit into one and move the visibility specifier back to the head file. Let's see how it goes: http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/27/pipeline

@Anthony-Mai
Copy link
Contributor Author

TQChen:
I finally get all tests passed:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/31/pipeline/
As you can see it certainly takes more than just updating the def of TVM_DLL. The point is, once everything default to hidden, all needed exports must now be explicitly specified as visibility "default". Where the specifiers were previously missing it was OK, but not any more. I end up having to modify 18 files but most are trivial.
A few things worth noting:
in include/tvm/ir_operator.h I removed two definitions which were duplicated in the file
in nnvm/include/nnvm/node.h I moved some inline functions into places where they are declared, instead of keep the declarations and definitions separate. This is needed to resolve some compiler warnings.
Same for nnvm/include/nnvm.op.h Any inline function should be directly defined at the places of declaration. This resolved compiler warnings and is a good coding style.

Please review again.

@Anthony-Mai
Copy link
Contributor Author

@tqchen and @FrozenGene could you review again? The CR is now in a good shape. Passed all tests. And my local tests all works fine. This can reduce module sizes quite a bit and speed up load time, too.

@FrozenGene
Copy link
Member

I have no other comments. @tqchen maybe could give some comments.

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see quite a bit un-necessary changes

include/tvm/relay/pass.h Outdated Show resolved Hide resolved
include/tvm/ir_pass.h Outdated Show resolved Hide resolved
include/tvm/ir_operator.h Show resolved Hide resolved
nnvm/include/nnvm/node.h Outdated Show resolved Hide resolved
nnvm/include/nnvm/op.h Outdated Show resolved Hide resolved
@tqchen tqchen added the status: need update need update based on feedbacks label Jan 25, 2019
@Anthony-Mai
Copy link
Contributor Author

@tqchen I updated my changes to sync up with the master branch, and fixed a trivial alignment. You questioned why I moved a few inline functions into their places. These are necessary fixes to get rid of the compiler and lint check warnings, and those are the reasons why the original code had to have the "NOLINT" comment. Basically if you want a function to be inline, then do not separate declaration and implementation. Keep them in the same place at the declaration. If you insist to keep declaration and implementation separate, then the function probably should not be inline in the first place.

Please review again.

@Anthony-Mai
Copy link
Contributor Author

Patch build 33 failed:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/33/pipeline
because of this recent change to tests/cpp/ir_simplify_test.cc:
859bda8#diff-c4fa8c4df028d3de52cadfd2d4cc9b99
The change now refers to a symbol in HalideIR which now needs to be exported:
https://github.com/dmlc/HalideIR/blob/master/src/ir/IR.cpp#L652
A keyword "EXPORT" needs to be added at line 652. It was already added on line 653 and 654.
I will submit a separate pull request on that one.

@Anthony-Mai
Copy link
Contributor Author

@tqchen Please approve separate pull request: dmlc/HalideIR#45
The need of that change is evident once you use objdump to exam the compiled libtvm.so:
$ objdump -TC libtvm.so | grep -n ".text" | grep "HalideIR" | grep -w "ExprNode"
1303:00000000006f5ea0 g DF .text 0000000000000012 Base HalideIR::Internal::ExprNodeHalideIR::Internal::Min::accept(HalideIR::Internal::IRVisitor*, HalideIR::Expr const&) const
1768:00000000006f5ec0 g DF .text 0000000000000012 Base HalideIR::Internal::ExprNodeHalideIR::Internal::Max::accept(HalideIR::Internal::IRVisitor*, HalideIR::Expr const&) const

Those only two entries corresponds to the two lines, 653 and 654 in src/ir/IR.cpp which already had the EXPORT keyword. Line 652 was not exported since it did not have the "EXPORT":
dmlc/HalideIR@bec3aff

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please still restore the inline functions

include/tvm/ir_pass.h Outdated Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented Jan 25, 2019

OK, some comments wrt to the inline function declaration vs implementation. I agree that it is a subjective thing, and the reason that inline keyword exists to enable such separation.

The main reason to keep declaration and implementation apart is to make the interface more clear without mixing implementation into it.

While I agree this is an subjective choice of code style, it has nothing to do with the compiler warning or lint, so let us keep the way as it is

@Anthony-Mai
Copy link
Contributor Author

Change updated and synced with master again. Waiting for the checks:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/35/pipeline/
@tqchen and @FrozenGene Please review again.

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One final comment on alignment, once it is fixed, we can merge it in

vta/include/vta/runtime.h Outdated Show resolved Hide resolved
@Anthony-Mai
Copy link
Contributor Author

@tqchen Your final comment was addressed. The check comes back all green:
http://ci.tvm.ai:8080/blue/organizations/jenkins/tvm/detail/PR-2445/36/pipeline/
Please review again.

@tqchen
Copy link
Member

tqchen commented Jan 28, 2019

@Anthony-Mai somehow the commits seems to be messed up, this is likely due to the fact that you did a merge instead of rebase.

Please try some of the tips in https://docs.tvm.ai/contribute/git_howto.html, you can start to apply your patches from a clean state.

Default compilation of Linux shared library modules (*.so files)
exports all symbols. This creates large module files as the export
symbol table contains too many entries. The correct approach is
to export nothing by default. Anything that needs to be exported
must be explicitly specified. This is done by the following steps:

    In the Makefile, add "-fvisibility=hidden" flag. You can search
    for "-fPIC" to find the appropriate place to add the flag. This
    hides symbols by default if not explicitly specified otherwise.

    To declare of any symbol to be exported, add this attribute:
      __attribute__((visibility("default")))
    The attribute string can be added using a macro definition. It
    should be added right before the return type for functions, or
    right after the 'class' or 'struct' keyword for class/struct.

    To supress Doxygen parser warnings, modify docs/Doxyfile and
    add to PRE_DEFINED: TVM_DLL= NNVM_DLL= __attribute__(x)=

For more info on shared module export symbol visibility read:
    https://gcc.gnu.org/wiki/Visibility

Update submodule HalideIR to 7a3287d3883fdeac3aba2a7f3865c7ab78e1925c
and dlpack to 5c792cef3aee54ad8b7000111c9dc1797f327b59.

Explicitly export __gnu_f2h_ieee() which is needed in a unit test.

Move the visibility specifier to header files.
@Anthony-Mai
Copy link
Contributor Author

@tqchen Thanks for the tip https://docs.tvm.ai/contribute/git_howto.html That was exactly what messed up. I used git pull to rebase to upstream instead of the correct way of rebase upstream/master. I used your method and it is much cleaner. Now it shows only the files I actually modified, and the rebasing went much smoother. I also fixed the remaining alignment of runtime.h. Please review again.

@tqchen tqchen merged commit 75f91c4 into apache:master Jan 29, 2019
@tqchen
Copy link
Member

tqchen commented Jan 29, 2019

Thanks, @Anthony-Mai, @FrozenGene ! this is now merged

@tqchen tqchen added status: accepted and removed status: need update need update based on feedbacks labels Jan 29, 2019
@Anthony-Mai
Copy link
Contributor Author

Anthony-Mai commented Jan 29, 2019

Thanks @FrozenGene and @tqchen for spending time to review. And thanks for teaching me the correct way of rebase with upstream master. Cheers!

To give an idea how much making the default symbol visibility as hidden would save. here is the a comparison of before and after module sizes:
(This is before) tvm0/build$ ls -l *.so
-rwxr-xr-x 1 amai amai 3956896 Jan 25 17:56 libnnvm_compiler.so
-rwxr-xr-x 1 amai amai 971416 Jan 25 17:52 libtvm_runtime.so
-rwxr-xr-x 1 amai amai 14760168 Jan 25 17:55 libtvm.so
-rwxr-xr-x 1 amai amai 1114488 Jan 25 17:55 libtvm_topi.so
-rwxr-xr-x 1 amai amai 188320 Jan 25 17:52 libvta.so
(This is after) tvm/build$ ls -l *.so
-rwxr-xr-x 1 root root 3512344 Jan 29 09:48 libnnvm_compiler.so
-rwxr-xr-x 1 root root 891904 Jan 29 09:45 libtvm_runtime.so
-rwxr-xr-x 1 root root 12848360 Jan 29 09:47 libtvm.so
-rwxr-xr-x 1 root root 965104 Jan 29 09:47 libtvm_topi.so
-rwxr-xr-x 1 root root 175672 Jan 29 09:45 libvta.so

merrymercy pushed a commit to merrymercy/tvm that referenced this pull request Feb 18, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
tqchen pushed a commit to tqchen/tvm that referenced this pull request Mar 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants